CN111508033A - Camera parameter determination method, image processing method, storage medium, and electronic apparatus - Google Patents


Info

Publication number: CN111508033A
Application number: CN202010313129.3A
Authority: CN (China)
Prior art keywords: target, angle, camera, picture, parameters
Legal status: Pending
Other languages: Chinese (zh)
Inventor: 肖泽东
Current Assignee: Shenzhen Yayue Technology Co., Ltd.
Original Assignee: Tencent Technology (Shenzhen) Co., Ltd.
Application filed by Tencent Technology (Shenzhen) Co., Ltd.
Priority: CN202010313129.3A
Publication of CN111508033A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00: Image analysis
    • G06T 7/80: Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods


Abstract

The invention discloses a camera parameter determination method, an image processing method, a storage medium and an electronic device. The method comprises: inputting a target picture into a target recognition model, wherein the target recognition model is a neural network model trained on a plurality of sample pictures and a picture parameter label matched with each of the sample pictures, and is used for recognizing the view angle parameters of the camera that shot a picture, the view angle parameters comprising: roll angle, pitch angle and field angle; acquiring the recognition result output by the target recognition model, wherein the recognition result indicates the view angle parameters of the target camera that shot the target picture; and determining the focal length of the target camera according to the field angle of the target camera. The method can determine the camera parameters corresponding to a designated image even when the image contains no orthogonal parallel lines.

Description

Camera parameter determination method, image processing method, storage medium, and electronic apparatus
Technical Field
The present invention relates to the field of computers, and in particular, to a camera parameter determination method, an image processing method, a storage medium, and an electronic apparatus.
Background
At present, camera parameters may be determined using a calibration object: a mapping between the world coordinate system and the image coordinate system is established from the world coordinates of a plurality of points on the calibration object and the image coordinates of those same points, and the mapping is solved to obtain the camera parameters.
In practice, this calibration-object method requires the camera whose parameters are to be determined to actually capture the calibration object, so it cannot recover the camera parameters of an already-captured image. The alternative vanishing-point method requires at least two groups of orthogonal parallel lines to be identified on the image; a vanishing point is determined from each group, and the camera parameters are determined from the vanishing-point coordinates. However, many images contain no orthogonal parallel lines, and for them the vanishing-point method cannot be used. Consequently, for a given image without orthogonal parallel lines, the corresponding camera parameters cannot be determined.
In view of the above problems, no effective solution has been proposed.
Disclosure of Invention
The embodiments of the invention provide a camera parameter determination method, an image processing method, a storage medium and an electronic device, so as to at least determine the camera parameters corresponding to a designated image that contains no orthogonal parallel lines.
According to an aspect of the embodiments of the present invention, there is provided a camera parameter determination method, including: inputting a target picture into a target recognition model, wherein the target recognition model is a neural network model trained on a plurality of sample pictures and a picture parameter label matched with each of the sample pictures, and is used for recognizing the view angle parameters of the camera that shot a sample picture, the view angle parameters including: roll angle, pitch angle and field angle; acquiring the recognition result output by the target recognition model, wherein the recognition result indicates the view angle parameters of the target camera that shot the target picture; and determining the focal length of the target camera according to the field angle of the target camera.
According to another aspect of the embodiments of the present invention, there is also provided a camera parameter determination apparatus, including: an input unit, configured to input a target picture into a target recognition model, where the target recognition model is a neural network model trained on a plurality of sample pictures and a picture parameter label matched with each of the sample pictures, and is configured to recognize the view angle parameters of the camera that shot a sample picture, the view angle parameters including: roll angle, pitch angle and field angle; a first obtaining unit, configured to obtain the recognition result output by the target recognition model, where the recognition result indicates the view angle parameters of the target camera that shot the target picture; and a determining unit, configured to determine the focal length of the target camera according to the field angle of the target camera.
According to another aspect of the embodiments of the present invention, there is provided an image processing method, including: determining camera parameters corresponding to the target picture according to the camera parameter determination method; acquiring target object parameters and a position to be implanted; and implanting the target object into the target picture based on the camera parameters, the target object parameters and the position to be implanted.
According to another aspect of the embodiments of the present invention, there is provided a picture processing apparatus, including: an input unit, configured to input a target picture into a target recognition model, where the target recognition model is a neural network model trained on a plurality of sample pictures and a picture parameter label matched with each of the sample pictures, and is configured to recognize the view angle parameters of the camera that shot a sample picture, the view angle parameters including: roll angle, pitch angle and field angle; a first obtaining unit, configured to obtain the recognition result output by the target recognition model, where the recognition result indicates the view angle parameters of the target camera that shot the target picture; a determining unit, configured to determine the focal length of the target camera according to the field angle of the target camera; an acquisition unit, configured to acquire the parameters of a target object to be implanted and a position to be implanted; and an implantation unit, configured to implant the target object into the target picture based on the camera parameters, the target object parameters and the position to be implanted.
According to still another aspect of the embodiments of the present invention, there is also provided a storage medium having a computer program stored therein, wherein the computer program is configured to execute the above camera parameter determination method or the above picture processing method when running.
According to another aspect of the embodiments of the present invention, there is also provided an electronic apparatus, including a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor executes the camera parameter determination method or the picture processing method through the computer program.
In the embodiment of the present invention, a target picture is input into a target recognition model, where the target recognition model is a neural network model trained on a plurality of sample pictures and a picture parameter label matched with each of the sample pictures, and is used to recognize the view angle parameters of the camera that shot a sample picture, the view angle parameters including: roll angle, pitch angle and field angle; the recognition result output by the target recognition model is acquired, where the recognition result indicates the view angle parameters of the target camera that shot the target picture; and the focal length of the target camera is determined according to the field angle of the target camera. In this process, the trained target recognition model determines camera parameters such as the roll angle, pitch angle and focal length of the target camera from the picture features alone, without depending on a calibration object or on orthogonal parallel lines in the target picture, so the camera parameters corresponding to a designated image without orthogonal parallel lines can be determined.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiments of the invention and together with the description serve to explain the invention, without constituting an improper limitation of the invention. In the drawings:
FIG. 1 is a schematic diagram of a network environment for an alternative camera parameter determination method according to an embodiment of the invention;
FIG. 2 is a schematic diagram of an application of an alternative advertisement placement according to an embodiment of the present invention;
FIG. 3 is a flow chart of an alternative camera parameter determination method according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of an alternative camera parameter determination according to an embodiment of the invention;
FIG. 5 is a schematic diagram of an alternative training of recognition models, according to an embodiment of the invention;
FIG. 6 is a schematic diagram of an alternative generation of a sample picture according to an embodiment of the invention;
FIG. 7 is a graphical illustration of training results for an alternative error function in accordance with embodiments of the present invention;
FIG. 8 is a schematic diagram of an alternative selection of first and second coordinates in accordance with an embodiment of the invention;
FIG. 9 is a flow chart of an alternative picture processing method according to an embodiment of the invention;
FIG. 10 is a schematic diagram of an alternative camera parameter determination apparatus according to an embodiment of the present invention;
FIG. 11 is a block diagram of an alternative image processing apparatus according to an embodiment of the present invention;
FIG. 12 is a schematic diagram of an alternative electronic device according to an embodiment of the invention;
FIG. 13 is a schematic structural diagram of an alternative electronic device according to an embodiment of the invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in other sequences than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover non-exclusive inclusions, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
According to an aspect of the embodiments of the present invention, a camera parameter determination method is provided. Optionally, as an optional implementation, the method may be, but is not limited to being, applied in a camera parameter determination system in a network environment as shown in fig. 1, where the system includes user equipment 102, a network 110 and a server 112. The user equipment 102 includes a human-computer interaction screen 104, a processor 106 and a memory 108. The human-computer interaction screen 104 is configured to display human-computer interaction information, for example prompt information prompting the user to input a target picture, or, after an input instruction for the target picture is received, the target picture itself; the embodiment of the present invention does not limit this. The processor 106 is configured to obtain the target picture input by the user and input it into a trained target recognition model, where the target recognition model may be a neural network model trained on a plurality of sample pictures and a picture parameter label matched with each of the sample pictures; after the target picture is input, the model outputs the view angle parameters of the target camera that shot it, such as the roll angle, pitch angle and field angle. The memory 108 is used to store the recognition result output by the target recognition model.

Alternatively, the target recognition model may be built on the user device 102 or on the server 112. In the case that the model is built on the server 112, the server 112 may train the recognition model. Specifically, the server 112 includes a database 114 and a processing engine 116. The database 114 stores the plurality of sample pictures used for training, each with a correspondingly stored picture parameter label. The processing engine 116 inputs each stored sample picture and its picture parameter label into the recognition model to be trained and obtains a training recognition result, namely the roll angle, pitch angle and field angle computed for the sample picture; the roll angle, pitch angle and field angle in the picture parameter label, together with the training recognition result, are substituted into an error function for training to obtain the view angle weights, yielding the target recognition model with trained view angle weights. This process implements training of the target recognition model on the server 112.

After the target recognition model is obtained, the user device 102 may send the target picture to the server 112 through the network 110, and the server 112 may return the recognition result output by the model to the user device 102 through the network 110. Specifically, the following steps may be performed:
s101, the user equipment 102 sends the target picture to the network 110;
s102, the network 110 sends the target picture to the server 112;
s103, the server 112 inputs the target picture into a target recognition model, where the target recognition model is a neural network model obtained by training a plurality of sample pictures and picture parameter tags respectively matched with each sample picture in the plurality of sample pictures, the target recognition model is used to identify view parameters of a camera that takes the sample pictures, and the view parameters include: roll angle, pitch angle and field angle;
s104, the server 112 returns the recognition result output by the target recognition model to the network 110, wherein the recognition result is used for indicating the view angle parameter of the target camera for shooting the target picture;
s105, the network 110 returns the identification result to the user equipment 102;
s106, the user device 102 obtains the recognition result, and determines the focal length of the target camera according to the field angle of the target camera.
In the embodiment of the invention, the target recognition model is a neural network model trained on a large number of sample pictures: the sample pictures and the picture parameter label matched with each sample picture are input into the recognition model to be trained, and an error function is constructed to solve for the view angle weights, producing a target recognition model with trained view angle weights. The user equipment 102 may acquire a target picture whose camera parameters need to be recognized and input it into the target recognition model to obtain the recognition result, thereby determining the view angle parameters of the target camera that shot the target picture. Specifically, in an advertisement implantation scene, an advertisement page needs to be implanted into a picture or into a section of video; for the video case, the video can be split into a plurality of video frame pictures, so whether the advertisement is implanted into a picture or a video, the problem finally to be solved is implanting the advertisement page into a picture. For the advertisement page to blend into the picture naturally, the view angle parameters of the target camera that shot the target picture need to be acquired, so that the advertisement page to be implanted can be rendered with those parameters and implanted naturally. Specifically, the picture may be input into the target recognition model through the user equipment 102, and the recognition result output by the model may be obtained, so as to determine from it the view angle parameters of the target camera that shot the target picture.
Referring to fig. 2, fig. 2 is an application schematic diagram of an optional advertisement implantation disclosed in the embodiment of the present invention. As shown in fig. 2, a commodity 201 to be implanted needs to be placed naturally into the first picture on the left. That picture may be taken as the target picture and, to obtain the parameters of the target camera that shot it, input into the target recognition model, which outputs the field angle, pitch angle and roll angle of the target camera. The focal length of the target camera can then be obtained from its field angle through a conversion formula, and the translation distance of the target camera can be obtained by labeling two coordinate points on the target picture, thereby yielding the intrinsic and extrinsic parameters of the target camera. The intrinsic parameters of the target camera comprise the camera focal length and the image center point coordinates; the extrinsic parameters comprise the rotation angles and the translation distance, where the rotation angles comprise the pitch angle, the roll angle and the yaw angle. The abscissa of the image center point may be taken as the image width divided by 2, the ordinate as the image height divided by 2, and the yaw angle may be taken as 0.

The camera parameters determined for the target picture can be used to render the picture corresponding to the commodity to be implanted in fig. 2, so that the commodity is mapped to a suitable position in the target picture, for example onto the table, achieving natural advertisement implantation. Alternatively, the camera parameter determination method may also be applied to an augmented reality scene, where a virtual object often needs to be superimposed on a picture of the real scene. A picture corresponding to the real scene may be acquired and input into the target recognition model as the target picture, the corresponding camera parameters may be acquired, and the virtual object may be rendered using those parameters, thereby superimposing the virtual object on the real scene.
Optionally, in this embodiment, the user equipment may be, but is not limited to, a mobile phone, a tablet computer, a notebook computer, a PC or other computer equipment that supports running an application client. The server and the user equipment may exchange data through a network, which may include, but is not limited to, a wireless network or a wired network. The wireless network includes Bluetooth, WIFI and other networks enabling wireless communication; the wired network may include, but is not limited to, wide area networks, metropolitan area networks and local area networks. The above is merely an example, and this embodiment is not limited thereto.
Optionally, as an optional implementation manner, as shown in fig. 3, the camera parameter determining method includes:
s301, inputting a target picture into a target recognition model, wherein the target recognition model is a neural network model obtained after training by using a plurality of sample pictures and picture parameter tags respectively matched with each sample picture in the plurality of sample pictures, the target recognition model is used for recognizing the view angle parameters of a camera for shooting the sample pictures, and the view angle parameters comprise: roll angle, pitch angle and field angle;
s302, acquiring a recognition result output by the target recognition model, wherein the recognition result is used for indicating a view angle parameter of a target camera for shooting a target picture;
and S303, determining the focal length of the target camera according to the field angle of the target camera.
In the embodiment of the present invention, the target picture is a picture for which the parameters of the camera that shot it need to be recognized. The target recognition model is a neural network model trained on a large number of sample pictures and the picture parameter label corresponding to each sample picture, and may include, but is not limited to, a ResNet or DenseNet convolutional neural network model. The target recognition model recognizes the view angle parameters of the target camera that shot the target picture: the target picture is input into the model, and the model outputs the view angle parameters, namely the roll angle, the pitch angle and the field angle. The field angle describes the field of view of the target camera; the roll angle and pitch angle describe the rotation of an object from the world coordinate system to the camera coordinate system, the roll angle being the rotation around the x-axis of the camera coordinate system and the pitch angle the rotation around the y-axis. After the model outputs the recognition result, the focal length of the camera can be determined from the field angle in the result. In this process the neural network model recognizes the image features and directly yields the camera parameters corresponding to the image; no calibration object is needed and no vanishing points are required in the target image, so the camera parameters corresponding to a designated image without orthogonal parallel lines can be recognized.
Referring to fig. 4, fig. 4 is a schematic diagram of an optional determination of camera parameters according to an embodiment of the present invention. As shown in fig. 4, when camera parameters are determined with the method described in the embodiment of the present invention, the target picture is input into the target recognition model, and the model outputs the camera parameters of the target camera that shot the target picture. Specifically, the target recognition model may be a convolutional neural network model. After the target picture is input, the model performs convolution operations on it with convolution kernels, which are matrices for recognizing specified image features; the picture itself is represented as a matrix, and convolving it with a kernel is a matrix multiplication whose result identifies the picture features of the target picture. The picture features may be represented as a multidimensional vector matrix, and a weighted sum over the vectors of this matrix yields a feature value; this converts a matrix with a large computation cost into a feature value with a small one, reducing the amount of computation and improving efficiency. Because the target recognition model has been trained on a large number of sample pictures, the trained view angle weights and the feature value can be used to compute the view angle parameters, i.e. the pitch angle, roll angle and field angle of the target camera can be computed directly. Further, the focal length of the camera can be computed from the field angle by the conversion formula between field angle and focal length, and the translation distance of the camera can be obtained by labeling and solving coordinate points on the target picture, so that the target recognition model can be used to output camera parameters such as the pitch angle, roll angle, focal length and translation distance.
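To make this pipeline concrete, the following is a minimal inference sketch; the model interface, the input size and the helper names are illustrative assumptions rather than the patent's reference implementation:

```python
# Minimal inference sketch; names and input size are assumptions.
import math
import torch
from PIL import Image
from torchvision import transforms

def determine_camera_parameters(model, picture_path):
    """Input a target picture into the target recognition model and
    derive the focal length from the predicted field angle."""
    img = Image.open(picture_path).convert("RGB")
    preprocess = transforms.Compose([
        transforms.Resize((224, 224)),  # assumed input size
        transforms.ToTensor(),
    ])
    batch = preprocess(img).unsqueeze(0)

    model.eval()
    with torch.no_grad():
        # View angle parameters; radians are assumed here.
        roll, pitch, vfov = model(batch)

    # Conversion formula from the description: vfov = 2*arctan(height/(2f)).
    f = img.height / (2.0 * math.tan(vfov.item() / 2.0))
    return roll.item(), pitch.item(), vfov.item(), f
```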
As an alternative implementation, obtaining the recognition result output by the target recognition model may include the following steps:
s1, identifying the picture characteristics of the target picture by using the target identification model, and acquiring characteristic values corresponding to the picture characteristics;
and S2, inputting the visual angle weight corresponding to the characteristic value and the visual angle parameter into the activation function, and obtaining the recognition result output by the activation function.
In the embodiment of the invention, the neural network model is used to recognize the picture features of the target picture and to acquire the corresponding feature value. Further, the view angle weights in the target recognition model can be acquired, where the view angle weights comprise a first weight corresponding to the field angle, a second weight corresponding to the roll angle and a third weight corresponding to the pitch angle. The product of the feature value and the first weight is substituted into the activation function to obtain the field angle; the product of the feature value and the second weight is substituted into the activation function to obtain the roll angle; and the product of the feature value and the third weight is substituted into the activation function to obtain the pitch angle. The activation function may include, but is not limited to, a ReLU or sigmoid activation function. The recognition result output by the activation function thus includes the field angle, roll angle and pitch angle, i.e. the recognition result indicates the view angle parameters of the target camera that shot the target picture.
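As a sketch of this structure, assuming a PyTorch implementation with a ResNet-18 backbone (the backbone choice, feature dimension and output scaling are assumptions), the three view angle weights can be modeled as linear layers followed by the activation function:

```python
import torch
import torch.nn as nn
import torchvision

class ViewAngleHead(nn.Module):
    """Backbone feature value x plus one weight per view angle parameter
    (first weight: field angle, second: roll, third: pitch), each passed
    through an activation function g as in the description."""
    def __init__(self, feature_dim: int = 512):
        super().__init__()
        backbone = torchvision.models.resnet18(weights=None)
        backbone.fc = nn.Identity()              # expose the feature value x
        self.backbone = backbone
        self.w_vfov = nn.Linear(feature_dim, 1)  # first weight
        self.w_roll = nn.Linear(feature_dim, 1)  # second weight
        self.w_pitch = nn.Linear(feature_dim, 1) # third weight
        self.g = nn.Sigmoid()                    # activation (ReLU also possible)

    def forward(self, picture: torch.Tensor):
        x = self.backbone(picture)        # feature value of the picture features
        # In practice these (0,1) outputs would be rescaled to angle ranges.
        vfov = self.g(self.w_vfov(x))     # field angle
        roll = self.g(self.w_roll(x))     # roll angle
        pitch = self.g(self.w_pitch(x))   # pitch angle
        return roll, pitch, vfov
```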
As an optional implementation, before inputting the target picture into the target recognition model, the following steps may be further performed:
s1, simulating a plurality of sample view angle parameters by using a virtual camera model to generate a plurality of sample pictures, wherein each sample picture has a correspondingly stored picture parameter label, the picture parameter label comprises sample view angle parameters corresponding to the sample picture, and the sample view angle parameters at least comprise a roll angle, a pitch angle and a view angle;
s2, inputting the correspondingly stored sample picture and picture parameter label into the recognition model to be trained, and obtaining a training recognition result output by the recognition model to be trained;
and S3, substituting the roll angle, pitch angle and field angle in the picture parameter label, together with the training recognition result, into the error function for training to obtain the view angle weights.
In the embodiment of the present invention, the virtual camera model is a camera model capable of simulating various view angle parameters to generate sample pictures, that is, it can simulate different roll angles, pitch angles, yaw angles and field angles. Each generated sample picture is stored together with its picture parameter label, namely the roll angle, pitch angle, yaw angle and field angle corresponding to the sample picture. A large number of sample pictures and their stored picture parameter labels can be input into the recognition model to be trained, and the training recognition result output by the model, namely the roll angle, pitch angle and field angle it recognizes for each sample picture, can be obtained. The view angle weights of the model are then trained using the difference between the view angle parameters in the picture parameter label and those in the training recognition result. The view angle weights comprise a first weight corresponding to the field angle, a second weight corresponding to the roll angle and a third weight corresponding to the pitch angle.
Referring to fig. 5, fig. 5 is a schematic diagram of an alternative recognition model training method according to an embodiment of the present invention. As shown in fig. 5, training the recognition model requires inputting sample pictures with picture parameter labels into the recognition model to be trained. The model outputs a feature value, which is substituted into a first error function corresponding to the roll angle, a second error function corresponding to the pitch angle and a third error function corresponding to the field angle. Training the roll angle weight in the first error function yields the second weight, training the pitch angle weight in the second error function yields the third weight, and training the field angle weight in the third error function yields the first weight, thereby obtaining the view angle weights and, in turn, the target recognition model with trained view angle weights.
Referring to fig. 6, fig. 6 is a schematic diagram of optional sample picture generation disclosed in an embodiment of the present invention. As shown in fig. 6, a 360° panoramic image data set may be used when generating sample pictures; the data set includes 360° panoramic image data of five kinds of scenes, and for each panorama, the virtual camera model described in the embodiment of the present invention may simulate different roll angles, pitch angles, yaw angles and field angles to generate different sample pictures. Optionally, the virtual camera model may adopt a perspective projection model, and the different roll angles, pitch angles, yaw angles and field angles may be simulated by setting the parameters of the perspective projection model. For example, in the schematic diagram of fig. 6, a sample picture may be generated by setting the field angle of the perspective projection model to 90°, or the pitch angle to 30°, or the yaw angle to 30°, or the roll angle to 30°.
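A sketch of this sample generation process is given below; render_perspective is a hypothetical stand-in for the perspective projection of the virtual camera model, and the sampling ranges are assumptions:

```python
import random

def generate_samples(panoramas, n_per_panorama=100):
    """Simulate sample view angle parameters with a virtual (perspective
    projection) camera model and store each sample picture together
    with its picture parameter label."""
    samples = []
    for pano in panoramas:
        for _ in range(n_per_panorama):
            label = {
                "roll":  random.uniform(-30.0, 30.0),  # degrees; ranges assumed
                "pitch": random.uniform(-30.0, 30.0),
                "yaw":   random.uniform(0.0, 360.0),
                "vfov":  random.uniform(30.0, 90.0),
            }
            # Hypothetical renderer: crops a perspective view out of the
            # 360-degree panorama at the simulated view angle parameters.
            picture = render_perspective(pano, **label)
            samples.append((picture, label))
    return samples
```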
Referring to fig. 7, fig. 7 is a schematic diagram of training results of an optional error function disclosed in an embodiment of the present invention. In the embodiment of the present invention, the recognition model may be trained with 256 sample pictures per batch over 24 rounds of training; it is understood that these numbers are only examples and do not limit the present invention. As shown in fig. 7, fig. 7 shows the training result of the second error function, corresponding to the pitch angle: the left graph in fig. 7 describes the error rate of the pitch angle computed by the recognition model being trained, and the right graph describes its accuracy rate, where the error rate may be computed by KL divergence and the accuracy rate as the proportion of pitch-angle errors below 5 degrees. Further, the Adam algorithm may be used to optimize the error function; other algorithms, such as stochastic gradient descent, may also be used. The trained recognition model can then be verified against calibration data obtained with a conventional camera calibration method.
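Under those hyperparameters, a training loop could look like the following sketch; the dataset object and learning rate are assumptions, and view_angle_loss is the error function constructed in the next subsection:

```python
import torch
from torch.utils.data import DataLoader

def train(model, dataset, epochs=24, batch_size=256, lr=1e-4):
    """Train the recognition model on (sample picture, picture parameter
    label) pairs; Adam per the description, lr is an assumed value."""
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for pictures, labels in loader:
            roll, pitch, vfov = model(pictures)
            # Error functions as defined below (first + second + third).
            loss = view_angle_loss(roll, pitch, vfov, labels)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```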
As an alternative implementation, substituting the roll angle, pitch angle and field angle in the picture parameter label, together with the training recognition result, into the error function and training the weights to be trained to obtain the view angle weights may include the following steps:
s1, constructing a first error function corresponding to the roll angle, a second error function corresponding to the pitch angle and a third error function corresponding to the field angle, wherein the first error function formula is as follows:
lossroll=||rollgt-g(Wrollx)||
wherein, the rollgtIndicating the roll angle, WrollThe weight of the roll angle is calculated, and x is a characteristic value calculated by the recognition model to be trained;
the second error formula is as follows:
losspitch=||pitchgt-g(Wpitchx)||
wherein, pitchgtRepresenting the pitch angle parameter, WpitchCalculating a pitch angle weight, wherein x is a characteristic value calculated by the identification model to be trained;
the third error function is formulated as follows:
lossvfov=||vfovgt-g(Wvfovx)||
wherein, vfovgtDenotes the field angle parameter, WvfovCalculating the weight of the field angle, wherein x is a characteristic value calculated by the identification model to be trained;
s2, determining a fourth error function as the sum of the first error function, the second error function and the third error function, wherein the fourth error function is expressed as follows:
loss=lossvfov+lossroll+losspitch
therein, lossrollIs a first error function, losspitchIs a second error function, lossvfovIs a third error function;
and S3, training with the first, second, third and fourth error functions to obtain the roll angle weight, the pitch angle weight and the field angle weight, which together constitute the view angle weights.
In the embodiment of the invention, each error function is constructed by substituting the product of the corresponding view angle weight and the feature value into the activation function, subtracting the result from the corresponding view angle parameter in the picture parameter label of the sample picture, and taking the L2 norm of the difference.
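These error functions translate directly into code; a minimal sketch, assuming the model outputs the activated predictions g(W·x) and the labels arrive as tensors:

```python
import torch

def view_angle_loss(roll_pred, pitch_pred, vfov_pred, labels):
    """Fourth error function: sum of the roll, pitch and field angle
    error functions, each an L2 norm between the picture parameter
    label and the activated prediction g(W x)."""
    loss_roll = torch.norm(labels["roll"] - roll_pred.squeeze(-1))    # first
    loss_pitch = torch.norm(labels["pitch"] - pitch_pred.squeeze(-1)) # second
    loss_vfov = torch.norm(labels["vfov"] - vfov_pred.squeeze(-1))    # third
    return loss_vfov + loss_roll + loss_pitch                         # fourth
```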
As an alternative embodiment, determining the focal length of the target camera according to the field angle of the target camera may include the steps of:
substituting the field angle of the target camera into a conversion formula, and calculating to obtain the focal length of the target camera, wherein the conversion formula is as follows:
hfov = 2 · arctan( width / (2f) )

vfov = 2 · arctan( height / (2f) )
where hfov denotes a horizontal angle of view, vfov denotes a vertical angle of view, width denotes a width of the target picture, height denotes a height of the target picture, and f denotes a focal length.
In the embodiment of the present invention, the field angle may be a horizontal field angle or a vertical field angle, and the field angle output by the target recognition model may be either. When the model outputs a horizontal field angle, the error function corresponding to the field angle during training is constructed for the horizontal field angle; when it outputs a vertical field angle, the error function is constructed for the vertical field angle. The horizontal and vertical field angles can be computed from one another, i.e. the vertical field angle can be calculated from the horizontal one and vice versa. With the field angle, image width and image height in the above formulas known, the focal length can be solved.
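A small sketch of the conversion, with angles in radians (the function names are illustrative):

```python
import math

def focal_length_from_vfov(vfov: float, height: int) -> float:
    """vfov = 2*arctan(height/(2f))  =>  f = height / (2*tan(vfov/2))."""
    return height / (2.0 * math.tan(vfov / 2.0))

def hfov_from_vfov(vfov: float, width: int, height: int) -> float:
    """Horizontal and vertical field angles share the same focal length."""
    f = focal_length_from_vfov(vfov, height)
    return 2.0 * math.atan(width / (2.0 * f))
```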
As an optional implementation, after determining the focal length of the target camera according to the field angle of the target camera, the following steps may be further performed:
s1, acquiring a first coordinate and a second coordinate corresponding to the virtual object in the target picture, wherein the first coordinate and the second coordinate are coordinates in a world coordinate system;
s2, substituting the first coordinate and the second coordinate into a camera imaging formula, and solving to obtain a translation distance corresponding to the target picture, wherein the camera imaging formula is as follows:
λ1 · p1 = K(R · P1 + t)

λ2 · p2 = K(R · P2 + t)

where K denotes the camera intrinsic parameters, which at least comprise the focal length and the center point coordinates of the target picture; P1 denotes the first coordinate and P2 the second coordinate; t denotes the translation distance; R denotes the rotation built from the roll angle and the pitch angle; λ1 is the first coefficient to be solved and λ2 the second; and p1 and p2 are the coordinates in the image coordinate system corresponding to the first and second coordinates.

Here

K = [ f 0 u0 ; 0 f v0 ; 0 0 1 ]

where (u0, v0) is the center point coordinate of the target picture, u0 being the width of the target picture divided by 2 and v0 the height of the target picture divided by 2. Subtracting the two equations eliminates t:

λ1 · p1 - λ2 · p2 = K · R · (P1 - P2)

from which λ1 and λ2 are solved, and t is then obtained by back-substitution:

t = λ1 · K⁻¹ · p1 - R · P1
in this embodiment of the present invention, the virtual object may be any object or person object in the target picture, which is not limited in this embodiment of the present invention. The first coordinate and the second coordinate of the selected virtual object may be a bottom midpoint coordinate of the virtual object and a top midpoint coordinate of the virtual object, or may be a bottom lower-right corner coordinate of the virtual object and a top upper-left corner coordinate of the virtual object. Preferably, the manner of obtaining the first coordinate and the second coordinate corresponding to the virtual object in the target picture may specifically be: randomly selecting a first coordinate corresponding to any point on a virtual object in a target picture, adding a preset height value to a vertical coordinate of the first coordinate to obtain a first end point, and subtracting the preset height value from the vertical coordinate of the first coordinate to obtain a second end point; and randomly selecting a second coordinate from a height interval formed by the first end point and the second end point, wherein the ordinate of the second coordinate falls into the height interval. For example, the height interval may be 6 centimeters, that is, the range is 3 centimeters above the ordinate of the first coordinate and 3 centimeters below the ordinate of the first coordinate, and at this time, the first coordinate and the second coordinate have a certain height difference, so that it is more convenient to determine the first coordinate and the second coordinate.
Referring to fig. 8, fig. 8 is a schematic diagram illustrating the selection of optional first and second coordinates disclosed in the embodiment of the present invention. As shown in fig. 8, the first and second coordinates may be taken from the leftmost target picture: in the middle picture of fig. 8, the first coordinate 802 and the second coordinate 801 are obtained, and as the rightmost picture shows, the first coordinate 802 corresponds to the midpoint of the bottom of the virtual object and the second coordinate 801 to the midpoint of its top. The first coordinate 802 and the second coordinate 801 are coordinates of the virtual object in the world coordinate system, for example (0, 0, 0) and (0, h, 0) respectively, where h is the height of the virtual object in the world coordinate system. Substituting the first coordinate 802 and the second coordinate 801 into the camera imaging formula yields the translation distance t.
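The translation solve can be reproduced with numpy; a sketch under the assumptions of the derivation above (P1 = (0, 0, 0), P2 = (0, h, 0), yaw = 0), where the axis conventions of the rotation are an assumption:

```python
import numpy as np

def solve_translation(f, width, height, roll, pitch, p1_px, p2_px, h):
    """Solve lambda1*p1 = K(R*P1 + t) and lambda2*p2 = K(R*P2 + t) for t,
    with P1 = (0,0,0) and P2 = (0,h,0) in the world coordinate system."""
    u0, v0 = width / 2.0, height / 2.0
    K = np.array([[f, 0, u0],
                  [0, f, v0],
                  [0, 0, 1.0]])
    # Rotation from roll (about x) and pitch (about y); yaw taken as 0.
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
    Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    R = Ry @ Rx

    p1 = np.array([p1_px[0], p1_px[1], 1.0])   # homogeneous image coordinates
    p2 = np.array([p2_px[0], p2_px[1], 1.0])
    P1 = np.zeros(3)
    P2 = np.array([0.0, h, 0.0])

    # Subtracting the two imaging equations removes t:
    #   lambda1*p1 - lambda2*p2 = K R (P1 - P2)
    A = np.stack([p1, -p2], axis=1)            # 3x2 system in (lambda1, lambda2)
    b = K @ R @ (P1 - P2)
    (lam1, lam2), *_ = np.linalg.lstsq(A, b, rcond=None)

    # Back-substitute: t = lambda1 * K^-1 p1 - R P1.
    return lam1 * np.linalg.inv(K) @ p1 - R @ P1
```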
Optionally, as shown in fig. 9, fig. 9 is a picture processing method disclosed in the embodiment of the present invention, and may include the following steps:
s901, determining camera parameters corresponding to a target picture;
s902, acquiring parameters of a target object and a position to be implanted;
and S903, implanting the target object into the target picture based on the camera parameters, the target object parameters and the position to be implanted.
In the embodiment of the present invention, the camera parameters corresponding to the target picture may be determined with the camera parameter determination method described above, and may include the pitch angle, roll angle and focal length of the target camera; for example, steps S301 to S303 may be performed to determine them. The target object parameters describe the features of the target object and may include its size, shape and so on; for example, in fig. 2 the target object is the commodity 201 to be implanted, its size may include its height, and its shape may be an irregular cube. The position to be implanted is a position on the target picture: when the target object is implanted, its position in the target picture is the position to be implanted, which may optionally comprise a set of coordinate points in the target picture. Implanting the target object into the target picture based on the camera parameters, the target object parameters and the position to be implanted may specifically be done as follows: determine the coordinates of the target object in the world coordinate system from the target object parameters; acquire the coordinates of the position to be implanted in the image coordinate system; and transform the world coordinates of the target object using the camera parameters and the coordinate system conversion formula, so that the target object falls onto the coordinates of the position to be implanted, thereby implanting the target object into the target picture.
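A sketch of this coordinate transform, where helper names and pinhole conventions are assumptions:

```python
import numpy as np

def project_object_points(points_world, K, R, t, implant_px):
    """Map the target object's world coordinates into the target picture
    so that its preset point lands on the position to be implanted."""
    pts = np.asarray(points_world, dtype=float)  # N x 3 world coordinates
    cam = (R @ pts.T).T + t                      # world -> camera coordinates
    img = (K @ cam.T).T                          # camera -> image plane
    img = img[:, :2] / img[:, 2:3]               # perspective division
    # Shift so the first point (the preset point) coincides with the
    # position to be implanted.
    offset = np.asarray(implant_px, dtype=float) - img[0]
    return img + offset
```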
As an optional implementation, the target object parameter includes a size of the target object, and implanting the target object into the target picture based on the camera parameter, the parameter of the target object, and the to-be-implanted position may include:
Determine a preset point on the target object, align the preset point with the position to be implanted, and, taking the preset point as reference, determine the display mode of the target object in the target picture from the view angle parameters and the size of the target object.
In the embodiment of the present invention, the preset point on the target object is the point used for the coordinate transformation of the target object, and may be, but is not limited to, the center point of the bottom or top of the target object. Optionally, the preset point may be determined according to the shape of the target object: for a cylinder, the bottom or top center point may be used; for a sphere, the center of the sphere; and so on. The size of the target object may include, but is not limited to, height, width and radius. After the preset point is determined, it can be made to coincide with the corresponding point of the position to be implanted; then, with the preset point as the reference of the target object, the display mode of the target object in the target picture is determined from the view angle parameters among the camera parameters and the size of the target object, thereby implanting the target object into the target picture.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required by the invention.
According to another aspect of the embodiments of the present invention, there is also provided a camera parameter determination apparatus for implementing the above-described camera parameter determination method. As shown in fig. 10, the apparatus includes:
the first input unit 1001 is configured to input a target picture into a target recognition model, where the target recognition model is a neural network model obtained after training by using a plurality of sample pictures and picture parameter tags respectively matched with each of the plurality of sample pictures, the target recognition model is configured to recognize view parameters of a camera that captures the sample pictures, and the view parameters include: roll angle, pitch angle and field angle;
a first obtaining unit 1002, configured to obtain a recognition result output by a target recognition model, where the recognition result is used to indicate a viewing angle parameter of a target camera that takes a target picture;
a first determining unit 1003 for determining a focal length of the target camera according to a field angle of the target camera.
As an optional implementation, the first obtaining unit may include:
the first obtaining subunit is used for identifying the picture characteristics of the target picture by using the target identification model and obtaining characteristic values corresponding to the picture characteristics;
and the second acquisition subunit is used for inputting the visual angle weight corresponding to the characteristic value and the visual angle parameter into the activation function to obtain an identification result output by the activation function.
As an optional implementation, the apparatus may further include:
the generating unit is used for simulating a plurality of sample view angle parameters by using a virtual camera model to generate a plurality of sample pictures before inputting the target pictures into the target identification model, each sample picture is provided with a correspondingly stored picture parameter label, the picture parameter labels comprise the sample view angle parameters corresponding to the sample pictures, and the sample view angle parameters at least comprise a roll angle, a pitch angle and a view angle;
the third acquisition unit is used for inputting the correspondingly stored sample picture and the picture parameter label into the identification model to be trained and acquiring a training identification result output by the identification model to be trained;
and the fourth acquisition unit, configured to substitute the roll angle, pitch angle and field angle in the picture parameter label, together with the training recognition result, into the error function for training to obtain the view angle weights.
As an optional implementation, the fourth obtaining unit may include:
the construction subunit is configured to construct a first error function corresponding to the roll angle, a second error function corresponding to the pitch angle, and a third error function corresponding to the field angle, where the first error function formula is as follows:
lossroll=||rollgt-g(Wrollx)||
wherein the content of the first and second substances,rollgtindicating the roll angle, WrollThe weight of the roll angle is calculated, and x is a characteristic value calculated by the recognition model to be trained;
the second error formula is as follows:
losspitch=||pitchgt-g(Wpitchx)||
wherein, pitchgtRepresenting the pitch angle parameter, WpitchCalculating a pitch angle weight, wherein x is a characteristic value calculated by the identification model to be trained;
the third error function is formulated as follows:
lossvfov=||vfovgt-g(Wvfovx)||
wherein, vfovgtDenotes the field angle parameter, WvfovCalculating the weight of the field angle, wherein x is a characteristic value calculated by the identification model to be trained;
a determining subunit, configured to determine a fourth error function as a sum of the first error function, the second error function, and the third error function, where the fourth error function is expressed as follows:
loss=lossvfov+lossroll+losspitch
therein, lossrollIs a first error function, losspitchIs a second error function, lossvfovIs a third error function;
and a training subunit, configured to train with the first error function, the second error function, the third error function and the fourth error function to obtain a roll angle weight, a pitch angle weight and a field angle weight, where the view angle weights comprise the roll angle weight, the pitch angle weight and the field angle weight.
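A minimal sketch of these four error functions, taking ||·|| as the L2 norm (an assumption; the embodiment only writes a norm) and reusing the hypothetical activation from the head sketch above:

```python
import numpy as np

def g(z):
    return np.tanh(z)  # same hypothetical activation as in the head sketch

def total_loss(x, label, W_roll, W_pitch, W_vfov):
    loss_roll = np.linalg.norm(label["roll"] - g(W_roll @ x))    # first error function
    loss_pitch = np.linalg.norm(label["pitch"] - g(W_pitch @ x)) # second error function
    loss_vfov = np.linalg.norm(label["vfov"] - g(W_vfov @ x))    # third error function
    # fourth error function: the sum, used to train all three weights jointly
    return loss_vfov + loss_roll + loss_pitch
```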
As an optional implementation manner, the first determining unit determines the focal length of the target camera from its field angle as follows:
the determining unit is configured to substitute the field angle of the target camera into a conversion formula and to calculate the focal length of the target camera, where the conversion formula is:
hfov = 2 · arctan(width / (2f))

vfov = 2 · arctan(height / (2f))

where hfov denotes the horizontal angle of view, vfov denotes the vertical angle of view, width denotes the width of the target picture, height denotes the height of the target picture, and f denotes the focal length.
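In code form, this conversion recovers the focal length (in pixels) from the predicted vertical field angle and the picture size; a small sketch, with angles in radians:

```python
import math

def focal_from_vfov(vfov, width, height):
    # from vfov = 2*arctan(height / (2f)): solve for f
    f = height / (2.0 * math.tan(vfov / 2.0))
    # horizontal angle of view implied by the same focal length
    hfov = 2.0 * math.atan(width / (2.0 * f))
    return f, hfov

# example: a 1920x1080 picture whose predicted vertical field angle is 60 degrees
f, hfov = focal_from_vfov(math.radians(60.0), 1920, 1080)
```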
As an optional implementation, the apparatus may further include:
a fifth acquisition unit, configured to acquire, after the focal length of the target camera is determined from its field angle, a first coordinate and a second coordinate corresponding to a virtual object in the target picture, where the first coordinate and the second coordinate are coordinates in a world coordinate system;
and a solving unit, configured to substitute the first coordinate and the second coordinate into a camera imaging formula and solve for the translation distance corresponding to the target picture, where the camera imaging formula is:
λ1 · p1 = K(R · P1 + t)

λ2 · p2 = K(R · P2 + t)

where K denotes the camera internal parameters, which include at least the focal length and the coordinates of the central point of the target picture, P1 denotes the first coordinate, P2 denotes the second coordinate, p1 and p2 denote the corresponding image coordinates of those points, t denotes the translation distance, R denotes the rotation determined by the roll angle and the pitch angle, λ1 is the first coefficient to be solved, and λ2 is the second coefficient to be solved.
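The two imaging equations can be rearranged into a single linear system in the unknowns t, λ1 and λ2 (K·t − λi·pi = −K·R·Pi) and solved by least squares; a sketch under the assumption that p1 and p2 are homogeneous pixel coordinates given as 3-vectors:

```python
import numpy as np

def solve_translation(K, R, P1, P2, p1, p2):
    # stack both equations: [K, -p1, 0; K, 0, -p2] [t; λ1; λ2] = [-K R P1; -K R P2]
    A = np.zeros((6, 5))
    A[:3, :3] = K
    A[:3, 3] = -p1
    A[3:, :3] = K
    A[3:, 4] = -p2
    b = np.concatenate([-K @ R @ P1, -K @ R @ P2])
    x, *_ = np.linalg.lstsq(A, b, rcond=None)  # 6 equations, 5 unknowns
    return x[:3], x[3], x[4]                   # t, λ1, λ2
```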
Optionally, an embodiment of the present invention further provides a picture processing apparatus for implementing the picture processing method. As shown in fig. 11, the apparatus includes:
the second input unit 1101 is configured to input a target picture into a target recognition model, where the target recognition model is a neural network model obtained by training with a plurality of sample pictures and a picture parameter tag respectively matched with each of the plurality of sample pictures, the target recognition model is configured to recognize the view angle parameters of the camera that captured a sample picture, and the view angle parameters include: roll angle, pitch angle and field angle;
a second obtaining unit 1102, configured to obtain a recognition result output by the target recognition model, where the recognition result indicates the view angle parameters of the target camera that captured the target picture;
a second determining unit 1103, configured to determine the focal length of the target camera according to the field angle of the target camera;
a target object acquisition unit 1104, configured to acquire parameters of a target object to be implanted and a position to be implanted;
an implanting unit 1105, configured to implant the target object into the target picture based on the camera parameters, the target object parameters and the position to be implanted.
As an optional implementation manner, the target object parameters include the size of the target object, and the implanting unit 1105 is specifically configured to:
determine a preset point on the target object, make the preset point coincide with the position to be implanted, and, taking the preset point as a reference, determine the display manner of the target object in the target picture by combining the view angle parameters with the size of the target object, as sketched below.
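A hedged sketch of that placement step: the object's preset (anchor) point is fixed at the implant position in world coordinates, and its corners are projected with the recovered parameters to decide how the object is drawn. The project() helper and the planar corner layout are illustrative assumptions, not the embodiment's mandated procedure:

```python
import numpy as np

def project(K, R, t, P):
    q = K @ (R @ P + t)
    return q[:2] / q[2]  # perspective divide to pixel coordinates

def implant_footprint(K, R, t, anchor_world, size):
    w, h = size
    # corners of the object relative to its preset point, assumed upright in the x-z plane
    offsets = np.array([[0, 0, 0], [w, 0, 0], [w, 0, h], [0, 0, h]], dtype=float)
    return [project(K, R, t, anchor_world + o) for o in offsets]
```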
According to yet another aspect of the embodiments of the present invention, there is also provided an electronic device for implementing the camera parameter determination method, as shown in fig. 12, the electronic device includes a memory 1202 and a processor 1204, the memory 1202 stores therein a computer program, and the processor 1204 is configured to execute the steps in any one of the method embodiments through the computer program.
Optionally, in this embodiment, the electronic apparatus may be located in at least one network device of a plurality of network devices of a computer network.
Optionally, in this embodiment, the processor may be configured to execute the following steps by a computer program:
S1, inputting a target picture into a target recognition model, wherein the target recognition model is a neural network model obtained by training with a plurality of sample pictures and a picture parameter tag respectively matched with each of the plurality of sample pictures, the target recognition model is used for recognizing the view angle parameters of a camera that captures the sample pictures, and the view angle parameters include: roll angle, pitch angle and field angle;
S2, acquiring a recognition result output by the target recognition model, wherein the recognition result indicates the view angle parameters of the target camera that captured the target picture;
S3, determining the focal length of the target camera according to the field angle of the target camera.
Alternatively, as can be understood by those skilled in the art, the structure shown in fig. 12 is only illustrative, and the electronic device may also be a terminal device such as a smart phone (e.g. an Android phone or an iOS phone), a tablet computer, a palmtop computer or a Mobile Internet Device (MID). Fig. 12 does not limit the structure of the electronic device; for example, the electronic device may include more or fewer components (such as network interfaces) than shown in fig. 12, or have a different configuration from that shown in fig. 12.
The memory 1202 may be used to store software programs and modules, such as the program instructions/modules corresponding to the camera parameter determination method and apparatus in the embodiments of the present invention; the processor 1204 executes various functional applications and data processing by running the software programs and modules stored in the memory 1202, thereby implementing the above camera parameter determination method. The memory 1202 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 1202 may further include memory located remotely from the processor 1204, which may be connected to the terminal over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof. The memory 1202 may be configured, but is not limited, to store information such as operation instructions. As an example, as shown in fig. 12, the memory 1202 may include, but is not limited to, the first input unit 1001, the first acquisition unit 1002 and the first determination unit 1003 of the camera parameter determination apparatus, and may further include other module units of that apparatus, which are not described again in this example.
Optionally, the transmitting device 1206 is configured to receive or send data via a network. Examples of the network may include wired and wireless networks. In one example, the transmitting device 1206 includes a network interface controller (NIC) that can be connected via a network cable to a router or other network device so as to communicate with the internet or a local area network. In another example, the transmitting device 1206 is a radio frequency (RF) module, which is used to communicate with the internet wirelessly.
In addition, the electronic device further includes: a display 1208, configured to display the target picture and the recognition result output by the target recognition model; and a connection bus 1210 for connecting the respective module parts in the above-described electronic apparatus.
According to another aspect of the embodiments of the present invention, there is also provided an electronic device for implementing the above-mentioned picture processing method, as shown in fig. 13, the electronic device includes a memory 1302 and a processor 1304, the memory 1302 stores a computer program, and the processor 1304 is configured to execute the steps in any one of the above-mentioned method embodiments through the computer program.
Optionally, in this embodiment, the electronic apparatus may be located in at least one network device of a plurality of network devices of a computer network.
Optionally, in this embodiment, the processor may be configured to execute the following steps by a computer program:
S1, determining the camera parameters corresponding to the target picture according to the camera parameter determination method;
S2, acquiring target object parameters and a position to be implanted;
S3, implanting the target object into the target picture based on the camera parameters, the target object parameters and the position to be implanted.
Alternatively, as can be understood by those skilled in the art, the structure shown in fig. 13 is only illustrative, and the electronic device may also be a terminal device such as a smart phone (e.g. an Android phone or an iOS phone), a tablet computer, a palmtop computer or a Mobile Internet Device (MID). Fig. 13 does not limit the structure of the electronic device; for example, the electronic device may include more or fewer components (such as network interfaces) than shown in fig. 13, or have a different configuration from that shown in fig. 13.
The memory 1302 may be used to store software programs and modules, such as the program instructions/modules corresponding to the picture processing method and apparatus in the embodiments of the present invention; the processor 1304 executes various functional applications and data processing by running the software programs and modules stored in the memory 1302, thereby implementing the above picture processing method. The memory 1302 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 1302 may further include memory located remotely from the processor 1304, which may be connected to the terminal over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof. The memory 1302 may be configured, but is not limited, to store information such as operation instructions. As an example, as shown in fig. 13, the memory 1302 may include, but is not limited to, the second input unit 1101, the second acquisition unit 1102, the second determination unit 1103, the target object acquisition unit 1104 and the implantation unit 1105 of the picture processing apparatus, and may further include other module units of that apparatus, which are not described again in this example.
Optionally, the transmitting device 1306 is configured to receive or send data via a network. Examples of the network may include wired and wireless networks. In one example, the transmitting device 1306 includes a network interface controller (NIC) that can be connected via a network cable to a router or other network device so as to communicate with the internet or a local area network. In another example, the transmitting device 1306 is a radio frequency (RF) module, which is used to communicate with the internet wirelessly.
In addition, the electronic device further includes: a display 1308 for displaying the target picture and the recognition result output by the target recognition model; and a connection bus 1310 for connecting the respective module parts in the above-described electronic apparatus.
According to a further aspect of an embodiment of the present invention, there is also provided a storage medium having a computer program stored therein, wherein the computer program is arranged to perform the steps of any of the method embodiments described above when executed.
Alternatively, in the present embodiment, the storage medium may be configured to store a computer program for executing the following steps:
S1, inputting a target picture into a target recognition model, wherein the target recognition model is a neural network model obtained by training with a plurality of sample pictures and a picture parameter tag respectively matched with each of the plurality of sample pictures, the target recognition model is used for recognizing the view angle parameters of a camera that captures the sample pictures, and the view angle parameters include: roll angle, pitch angle and field angle;
S2, acquiring a recognition result output by the target recognition model, wherein the recognition result indicates the view angle parameters of the target camera that captured the target picture;
S3, determining the focal length of the target camera according to the field angle of the target camera.
According to a further aspect of an embodiment of the present invention, there is also provided a storage medium having a computer program stored therein, wherein the computer program is arranged to perform the steps of any of the method embodiments described above when executed.
Alternatively, in the present embodiment, the storage medium may be configured to store a computer program for executing the following steps:
S1, determining the camera parameters corresponding to the target picture according to the camera parameter determination method;
S2, acquiring target object parameters and a position to be implanted;
S3, implanting the target object into the target picture based on the camera parameters, the target object parameters and the position to be implanted.
Alternatively, in this embodiment, those skilled in the art will understand that all or part of the steps in the methods of the above embodiments may be implemented by a program instructing the relevant hardware of a terminal device, and the program may be stored in a computer-readable storage medium, which may include: a flash disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, and the like.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
If the integrated units in the above embodiments are implemented in the form of software functional units and sold or used as independent products, they may be stored in the above computer-readable storage medium. Based on such understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, or in whole or in part, may be embodied in the form of a software product stored in a storage medium and including instructions for causing one or more computer devices (which may be personal computers, servers, network devices, and the like) to execute all or part of the steps of the methods according to the embodiments of the present invention.
In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed client may be implemented in other manners. The apparatus embodiments described above are merely illustrative; for example, the division of the units is only a division of logical functions, and other divisions are possible in actual implementation: a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, units or modules, and may be electrical or in other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may also be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The foregoing are only preferred embodiments of the present invention. It should be noted that those skilled in the art can make various improvements and refinements without departing from the principle of the present invention, and these improvements and refinements shall also fall within the protection scope of the present invention.

Claims (12)

1. A method for determining camera parameters, comprising:
inputting a target picture into a target recognition model, wherein the target recognition model is a neural network model obtained by training with a plurality of sample pictures and a picture parameter tag respectively matched with each of the plurality of sample pictures, the target recognition model is used for recognizing view angle parameters of a camera that captures the sample pictures, and the view angle parameters include: roll angle, pitch angle and field angle;
acquiring a recognition result output by the target recognition model, wherein the recognition result is used for indicating a view angle parameter of a target camera for shooting the target picture;
and determining the focal length of the target camera according to the field angle of the target camera.
2. The method of claim 1, wherein the obtaining of the recognition result output by the target recognition model comprises:
identifying picture features of the target picture by using the target recognition model, and acquiring the feature value corresponding to the picture features;
and inputting the feature value, weighted by the view angle weight corresponding to each view angle parameter, into an activation function to obtain the recognition result output by the activation function.
3. The method of claim 1, further comprising, prior to said inputting the target picture into the target recognition model:
simulating a plurality of sample view angle parameters by using a virtual camera model to generate the plurality of sample pictures, wherein each sample picture has a correspondingly stored picture parameter tag, the picture parameter tag comprises the sample view angle parameters corresponding to the sample picture, and the sample view angle parameters at least comprise a roll angle, a pitch angle and a field angle;
inputting the correspondingly stored sample pictures and picture parameter tags into a recognition model to be trained, and obtaining a training recognition result output by the recognition model to be trained;
and substituting the roll angle, the pitch angle and the field angle in the picture parameter tag, together with the training recognition result, into an error function for training to obtain the view angle weights.
4. The method according to claim 3, wherein substituting the roll angle, the pitch angle and the field angle in the picture parameter tag, together with the training recognition result, into an error function for training to obtain the view angle weights comprises:
constructing a first error function corresponding to the roll angle, a second error function corresponding to the pitch angle and a third error function corresponding to the field angle, wherein the first error function formula is as follows:
loss_roll = ||roll_gt − g(W_roll · x)||

wherein roll_gt represents the roll angle, W_roll represents the roll angle weight, and x is the feature value computed by the recognition model to be trained;

the second error function is as follows:

loss_pitch = ||pitch_gt − g(W_pitch · x)||

wherein pitch_gt represents the pitch angle and W_pitch represents the pitch angle weight;

the third error function is as follows:

loss_vfov = ||vfov_gt − g(W_vfov · x)||

wherein vfov_gt represents the field angle and W_vfov represents the field angle weight;

determining the sum of the first error function, the second error function and the third error function as a fourth error function:

loss = loss_vfov + loss_roll + loss_pitch

wherein loss_roll is the first error function, loss_pitch is the second error function, and loss_vfov is the third error function;
training with the first error function, the second error function, the third error function and the fourth error function to obtain the roll angle weight, the pitch angle weight and the field angle weight, wherein the view angle weights comprise the roll angle weight, the pitch angle weight and the field angle weight.
5. The method of claim 1, wherein determining the focal length of the target camera from the field of view of the target camera comprises:
substituting the field angle of the target camera into a conversion formula, and calculating to obtain the focal length of the target camera, wherein the conversion formula is as follows:
hfov = 2 · arctan(width / (2f))

vfov = 2 · arctan(height / (2f))

wherein hfov represents the horizontal angle of view, vfov represents the vertical angle of view, width represents the width of the target picture, height represents the height of the target picture, and f represents the focal length.
6. The method according to any one of claims 1 to 5, further comprising, after determining the focal length of the object camera according to the field angle of the object camera:
acquiring a first coordinate and a second coordinate corresponding to a virtual object in the target picture, wherein the first coordinate and the second coordinate are coordinates in a world coordinate system;
substituting the first coordinate and the second coordinate into a camera imaging formula, and solving to obtain a translation distance corresponding to the target picture, wherein the camera imaging formula is as follows:
λ1 · p1 = K(R · P1 + t)

λ2 · p2 = K(R · P2 + t)

wherein K represents the camera internal parameters, the camera internal parameters comprising at least the focal length and the coordinates of the central point of the target picture, P1 represents the first coordinate, P2 represents the second coordinate, t represents the translation distance, R represents the rotation determined by the roll angle and the pitch angle, λ1 is the first coefficient to be solved, and λ2 is the second coefficient to be solved.
7. A camera parameter determination apparatus, comprising:
the image processing device comprises a first input unit, a second input unit and a third input unit, wherein the first input unit is used for inputting a target picture into a target recognition model, the target recognition model is a neural network model obtained by training a plurality of sample pictures and picture parameter labels respectively matched with each sample picture in the plurality of sample pictures, the target recognition model is used for recognizing visual angle parameters of a camera for shooting the sample pictures, and the visual angle parameters comprise: roll angle, pitch angle and field angle;
a first acquisition unit, configured to acquire a recognition result output by the target recognition model, where the recognition result is used to indicate a viewing angle parameter of a target camera that takes the target picture;
a first determining unit, configured to determine a focal length of the target camera according to a field angle of the target camera.
8. A picture processing method for implanting a target object into a target picture, comprising:
determining camera parameters corresponding to the target picture according to the method of any one of claims 1-6;
acquiring target object parameters and a position to be implanted;
implanting the target object into the target picture based on the camera parameters, the target object parameters, and the position to be implanted.
9. The method according to claim 8, wherein the target object parameters include the size of the target object, and wherein implanting the target object into the target picture based on the camera parameters, the target object parameters and the position to be implanted comprises:
determining a preset point on the target object, making the preset point coincide with the position to be implanted, and, taking the preset point as a reference, determining the display manner of the target object in the target picture by combining the view angle parameters with the size of the target object.
10. An image processing apparatus characterized by comprising:
a second input unit, configured to input a target picture into a target recognition model, wherein the target recognition model is a neural network model obtained by training with a plurality of sample pictures and a picture parameter tag respectively matched with each of the plurality of sample pictures, the target recognition model is configured to recognize view angle parameters of a camera that captures the sample pictures, and the view angle parameters include: roll angle, pitch angle and field angle;
a second obtaining unit, configured to obtain a recognition result output by the target recognition model, where the recognition result is used to indicate a viewing angle parameter of a target camera that takes the target picture;
a second determining unit configured to determine a focal length of the target camera according to a field angle of the target camera;
a target object acquisition unit, configured to acquire parameters of a target object to be implanted and a position to be implanted;
an implantation unit configured to implant the target object into the target picture based on the camera parameters, the target object parameters, and the position to be implanted.
11. A computer-readable storage medium comprising a stored program, wherein the program when executed performs the method of any of claims 1 to 6 or claims 8 to 9.
12. An electronic device comprising a memory and a processor, characterized in that the memory has stored therein a computer program, the processor being arranged to execute the method of any of claims 1 to 6 or claims 8 to 9 by means of the computer program.
CN202010313129.3A 2020-04-20 2020-04-20 Camera parameter determination method, image processing method, storage medium, and electronic apparatus Pending CN111508033A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010313129.3A CN111508033A (en) 2020-04-20 2020-04-20 Camera parameter determination method, image processing method, storage medium, and electronic apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010313129.3A CN111508033A (en) 2020-04-20 2020-04-20 Camera parameter determination method, image processing method, storage medium, and electronic apparatus

Publications (1)

Publication Number Publication Date
CN111508033A true CN111508033A (en) 2020-08-07

Family

ID=71878787

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010313129.3A Pending CN111508033A (en) 2020-04-20 2020-04-20 Camera parameter determination method, image processing method, storage medium, and electronic apparatus

Country Status (1)

Country Link
CN (1) CN111508033A (en)


Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112070896A (en) * 2020-09-07 2020-12-11 哈尔滨工业大学(威海) Portrait automatic slimming method based on 3D modeling
CN112070896B (en) * 2020-09-07 2022-05-03 哈尔滨工业大学(威海) Portrait automatic slimming method based on 3D modeling
CN112132909A (en) * 2020-09-23 2020-12-25 字节跳动有限公司 Parameter acquisition method and device, media data processing method and storage medium
CN112132909B (en) * 2020-09-23 2023-12-22 字节跳动有限公司 Parameter acquisition method and device, media data processing method and storage medium
CN113029128A (en) * 2021-03-25 2021-06-25 浙江商汤科技开发有限公司 Visual navigation method and related device, mobile terminal and storage medium
CN113029128B (en) * 2021-03-25 2023-08-25 浙江商汤科技开发有限公司 Visual navigation method and related device, mobile terminal and storage medium
CN113301248A (en) * 2021-04-13 2021-08-24 中科创达软件股份有限公司 Shooting method, shooting device, electronic equipment and computer storage medium
CN113301248B (en) * 2021-04-13 2022-09-06 中科创达软件股份有限公司 Shooting method and device, electronic equipment and computer storage medium
CN113658265A (en) * 2021-07-16 2021-11-16 北京迈格威科技有限公司 Camera calibration method and device, electronic equipment and storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20221212

Address after: 1402, Floor 14, Block A, Haina Baichuan Headquarters Building, No. 6, Baoxing Road, Haibin Community, Xin'an Street, Bao'an District, Shenzhen, Guangdong 518100

Applicant after: Shenzhen Yayue Technology Co.,Ltd.

Address before: 518000 Tencent Building, No. 1 High-tech Zone, Nanshan District, Shenzhen City, Guangdong Province, 35 Floors

Applicant before: TENCENT TECHNOLOGY (SHENZHEN) Co.,Ltd.