CN112163988B - Infrared image generation method and device, computer equipment and readable storage medium

Info

Publication number
CN112163988B
Authority
CN
China
Prior art keywords: image, style, visible light, loss function, infrared
Prior art date
Legal status
Active
Application number
CN202010825513.1A
Other languages
Chinese (zh)
Other versions
CN112163988A (en)
Inventor
王勇 (Wang Yong)
席有猷 (Xi Youyou)
范梅梅 (Fan Meimei)
王涵 (Wang Han)
Current Assignee
Pla 93114
Original Assignee
Pla 93114
Priority date
Filing date
Publication date
Application filed by Pla 93114
Priority to CN202010825513.1A
Publication of CN112163988A
Application granted
Publication of CN112163988B

Classifications

    • G PHYSICS
        • G06 COMPUTING; CALCULATING OR COUNTING
            • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
                • G06N 3/00 Computing arrangements based on biological models
                    • G06N 3/02 Neural networks
                        • G06N 3/04 Architecture, e.g. interconnection topology
                            • G06N 3/045 Combinations of networks
            • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
                • G06T 3/00 Geometric image transformations in the plane of the image
                    • G06T 3/04 Context-preserving transformations, e.g. by using an importance map
                • G06T 5/00 Image enhancement or restoration
                    • G06T 5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
                • G06T 17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
                • G06T 2207/00 Indexing scheme for image analysis or image enhancement
                    • G06T 2207/10 Image acquisition modality
                        • G06T 2207/10048 Infrared image
    • H ELECTRICITY
        • H04 ELECTRIC COMMUNICATION TECHNIQUE
            • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
                • H04N 5/00 Details of television systems
                    • H04N 5/30 Transforming light or analogous information into electric information
                        • H04N 5/33 Transforming infrared radiation


Abstract

The application provides an infrared image generation method and device, computer equipment, and a readable storage medium, relating to the field of artificial intelligence. The method comprises the following steps: acquiring attribute information of the shooting scene of a visible light image; querying historically collected infrared images according to the attribute information, and taking a matched infrared image as a style image; performing feature extraction on the visible light image with a multilayer convolutional neural network to obtain first reference features; performing feature extraction on the style image with the multilayer convolutional neural network to obtain second reference features; initializing a simulated image; performing multiple rounds of an iterative update process on the simulated image according to the first and second reference features, so as to minimize the difference between the features of the simulated image and the first and second reference features; and taking the simulated image after the multiple rounds of the iterative update process as the target infrared image. The method can reduce the complexity and operational difficulty of the infrared image generation process and improve simulation efficiency.

Description

Infrared image generation method and device, computer equipment and readable storage medium
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to a method and an apparatus for generating an infrared image, a computer device, and a readable storage medium.
Background
In the prior art, when an infrared image is simulated from a visible light image, it is necessary to classify the ground object materials in the visible light image, assign material properties, calculate heat exchange and atmospheric radiative transfer, and simulate the heat exchange of different weather conditions and ground object backgrounds; an infrared thermal database of typical land and sea surface scene materials and typical target entity materials must be established; the dynamic distribution of the surface temperature fields of the target entity and the background under specific weather conditions and at specific moments is obtained by simulating the thermophysical exchange processes of different ground objects and the imaging process of an infrared imaging sensor; and finally the corresponding infrared image is generated.
However, this infrared image simulation method involves knowledge from multiple disciplines, such as remote sensing image processing, infrared thermophysical calculation, and computer three-dimensional modeling; the whole process is complex, highly specialized, difficult to operate, and time-consuming.
Disclosure of Invention
The present application is directed to solving, at least to some extent, one of the technical problems in the related art.
The application provides an infrared image generation method and device, computer equipment and a readable storage medium, so that complexity and operation difficulty of an infrared image generation process are reduced, and simulation efficiency of an infrared image is improved.
An embodiment of a first aspect of the present application provides a method for generating an infrared image, including:
acquiring a visible light image for which an infrared image is to be generated;
acquiring attribute information of the shooting scene of the visible light image;
querying historically collected infrared images according to the attribute information, and taking a matched infrared image as a style image;
performing feature extraction on the visible light image with a multilayer convolutional neural network to obtain first reference features of the visible light image;
performing feature extraction on the style image with the multilayer convolutional neural network to obtain second reference features of the style image;
initializing a simulated image;
performing multiple rounds of an iterative update process on the simulated image according to the first and second reference features, so as to minimize the difference between the features of the simulated image and the first and second reference features; wherein each round of the iterative update process comprises: performing feature extraction on the simulated image, as initialized or as updated in the previous round, with the multilayer convolutional neural network; determining the value of a content loss function according to the difference between the features of the simulated image and the first reference features; determining the value of a style loss function according to the difference between the features of the simulated image and the second reference features; determining the value of a total loss function according to the value of the content loss function and the value of the style loss function; and updating the simulated image with a gradient descent algorithm according to the value of the total loss function;
and taking the simulated image after the multiple rounds of the iterative update process as the target infrared image corresponding to the visible light image.
According to the infrared image generation method of the embodiments of the application, attribute information of the shooting scene of the visible light image is acquired; historically collected infrared images are queried according to the attribute information, and a matched infrared image is taken as a style image; feature extraction is performed on the visible light image with a multilayer convolutional neural network to obtain first reference features; feature extraction is performed on the style image with the multilayer convolutional neural network to obtain second reference features; a simulated image is initialized; multiple rounds of an iterative update process are performed on the simulated image according to the first and second reference features, so as to minimize the difference between the features of the simulated image and the first and second reference features; and the simulated image after the multiple rounds of the iterative update process is taken as the target infrared image. In the application, a multilayer convolutional neural network is constructed from the visible light image and an actually measured infrared image, the conversion from the visible light image to the infrared image is realized through a style transfer algorithm, and the target infrared image is generated automatically. Geometric modeling of the target entity and ground objects, material thermophysical properties, atmospheric radiation modeling, heat exchange calculation, and similar processes need not be considered, so knowledge from multiple disciplines is not required; the complexity and operational difficulty of the infrared image generation process can therefore be reduced, and the simulation efficiency of the infrared image improved.
An embodiment of a second aspect of the present application provides an apparatus for generating an infrared image, including:
the image acquisition module is used for acquiring a visible light image for which an infrared image is to be generated;
the attribute acquisition module is used for acquiring attribute information of the shooting scene of the visible light image;
the query module is used for querying historically collected infrared images according to the attribute information, so as to take a matched infrared image as a style image;
the first extraction module is used for performing feature extraction on the visible light image with a multilayer convolutional neural network to obtain first reference features of the visible light image;
the second extraction module is used for performing feature extraction on the style image with the multilayer convolutional neural network to obtain second reference features of the style image;
the processing module is used for initializing a simulated image;
the iterative update module performs multiple rounds of an iterative update process on the simulated image according to the first and second reference features, so as to minimize the difference between the features of the simulated image and the first and second reference features; wherein each round of the iterative update process comprises: performing feature extraction on the simulated image, as initialized or as updated in the previous round, with the multilayer convolutional neural network; determining the value of a content loss function according to the difference between the features of the simulated image and the first reference features; determining the value of a style loss function according to the difference between the features of the simulated image and the second reference features; determining the value of a total loss function according to the value of the content loss function and the value of the style loss function; and updating the simulated image with a gradient descent algorithm according to the value of the total loss function;
and the generating module is used for taking the simulated image after the multiple rounds of the iterative update process as the target infrared image corresponding to the visible light image.
The infrared image generation device of the embodiments of the application acquires attribute information of the shooting scene of the visible light image; queries historically collected infrared images according to the attribute information, and takes a matched infrared image as a style image; performs feature extraction on the visible light image with a multilayer convolutional neural network to obtain first reference features; performs feature extraction on the style image with the multilayer convolutional neural network to obtain second reference features; initializes a simulated image; performs multiple rounds of an iterative update process on the simulated image according to the first and second reference features, so as to minimize the difference between the features of the simulated image and the first and second reference features; and takes the simulated image after the multiple rounds of the iterative update process as the target infrared image. In the application, a multilayer convolutional neural network is constructed from the visible light image and an actually measured infrared image, the conversion from the visible light image to the infrared image is realized through a style transfer algorithm, and the target infrared image is generated automatically. Geometric modeling of the target entity and ground objects, material thermophysical properties, atmospheric radiation modeling, heat exchange calculation, and similar processes need not be considered, so knowledge from multiple disciplines is not required; the complexity and operational difficulty of the infrared image generation process can therefore be reduced, and the simulation efficiency of the infrared image improved.
An embodiment of a third aspect of the present application provides a computer device, including: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the method for generating an infrared image as set forth in the embodiment of the first aspect of the present application.
An embodiment of a fourth aspect of the present application provides a non-transitory computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the method for generating an infrared image as set forth in the embodiment of the first aspect of the present application.
Additional aspects and advantages of the present application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the present application.
Drawings
The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
fig. 1 is a schematic flowchart of a method for generating an infrared image according to an embodiment of the present application;
fig. 2 is a schematic flowchart of a method for generating an infrared image according to a second embodiment of the present application;
fig. 3 is a schematic structural diagram of an infrared image generation apparatus according to a third embodiment of the present application;
FIG. 4 illustrates a block diagram of an exemplary computer device suitable for implementing embodiments of the present application.
Detailed Description
Reference will now be made in detail to the embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the same or similar elements or elements having the same or similar functions throughout. The embodiments described below with reference to the drawings are exemplary and intended to be used for explaining the present application and should not be construed as limiting the present application.
In the prior art, infrared image simulation methods are mainly implemented in the following two ways:
The first is implemented with an infrared texture mapping method, on the basis of computer three-dimensional modeling of the target entity and the ground object background: infrared images shot in the field are used for texture mapping, which produces the infrared texture-mapped image.
However, shooting in the field is subject to many limitations; for example, it cannot be carried out in private or sensitive places, and even when an infrared image can be captured, it is only an infrared image under one particular weather condition and is not applicable to the background under other conditions. To solve this problem, an infrared imaging model can be used to simulate infrared images of the target entity and the ground object background, and appropriate perturbations applied to generate the corresponding infrared texture images; the whole process must take the thermophysical processes into account and is complex.
In the second way, the infrared background image is generated according to the imaging correlation of the same background across different wave bands, using a continuous conversion principle. For a target entity and a ground scene in the real world, a high-resolution visible light image is easy to obtain; if the physical attributes of the objects in the scene are established in advance, the objects can be classified and solved with an existing infrared imaging model to obtain the distribution of the temperature field, and thus the corresponding infrared ground background image.
That is, the existing infrared image simulation method includes two core modeling processes: three-dimensional solid modeling and heat exchange process modeling.
Three-dimensional solid modeling is the foundation of the whole graphics work: to produce a high-quality three-dimensional scene, a high-quality three-dimensional model must first be generated, and the final rendering of the simulation scene is completed by calling the computer's graphics drawing interface; models with realistic appearance are mainly generated with common graphic modeling tool software such as 3ds Max.
Modeling the heat exchange process involves knowledge of infrared physics and heat transfer. It analyzes the heat exchange among the various parts inside the target entity, including heat conduction, convection, and radiation, and also analyzes the heat exchange between the target entity and its environment: where the target entity is in contact with air, it loses heat through convection or absorbs heat from the environment, and where it is in contact with the ground, heat conduction with the ground occurs. In addition, solar radiation and long-wave atmospheric radiation cause thermal radiation exchange at the surface of the target entity, and the temperature differences over the surface vary with place, season, and time of day. To express these characteristics faithfully, a complete mathematical model of the physical process must be established, the dynamic distribution of the surface temperature fields of the target entity and the background solved according to basic principles such as energy conservation, and the result converted into an infrared image display according to certain rules.
In summary, the current infrared image simulation method is a deep application of computer graphics; it must integrate knowledge from multiple disciplines, such as computer graphics, simulation modeling, infrared physics, heat transfer, and materials, and the whole process is complex, highly specialized, and difficult to operate.
In addition, current infrared image simulation must consider not only typical target entities, background materials, and the corresponding infrared thermophysical properties, including the surface blackness (emissivity), density, thermal conductivity, absorptivity, and specific heat capacity of common materials, but also a relatively accurate atmospheric radiation model. Because thermophysical modeling is performed for different types of materials, the target entity and ground objects to be simulated must be classified by material, and different thermophysical material processing models used selectively. The transmission characteristics of infrared radiation in the atmosphere have an extremely important influence on infrared imaging quality, and the actual process of transmitting infrared radiation through the atmosphere is very complex: it is closely related to the concentrations of the molecular species that cause absorption and scattering, the size and characteristics of suspended particles in the atmosphere, and the temperature and pressure at each point along the transmission path, and it is also related to factors such as atmospheric molecular scattering and absorption, Mie scattering of infrared radiation by aerosols, fluctuation of radiation intensity caused by atmospheric turbulence, light spot jitter, scattering by turbulent eddies, and the atmospheric refractive index.
In addition, because the current infrared image simulation method involves many influencing factors and complex processes, obvious modeling errors arise in every link: geometric modeling of the target entity and ground objects, material classification, assignment of material thermophysical attributes, heat exchange calculation, atmospheric radiation modeling, atmospheric heat exchange calculation, and so on. These errors accumulate layer by layer through the successive operations, so the error of the final simulation result is very large; the simulation fidelity reaches only 50%-70%, which can hardly meet users' requirements.
Therefore, aiming at the problems of the above infrared image simulation methods, the present application provides a method for generating an infrared image.
According to the infrared image generation method of the application, geometric modeling of the target entity and ground objects, material thermophysical properties, atmospheric radiation modeling, heat exchange calculation, and similar processes are not considered, so knowledge from multiple disciplines is not needed, and the complexity and operational difficulty of the infrared image generation process are reduced. Specifically, the application adopts the style transfer technique from the field of image artificial intelligence: a convolutional-layer feature model is constructed by analyzing a large number of visible light image samples containing typical target entities and ground objects together with actually measured infrared sensor image samples, the conversion from the visible light image to the infrared image is realized through a transfer algorithm, and an infrared image containing the required target entity and background is generated automatically. The complexity of the infrared image generation process can thereby be reduced, and the simulation efficiency of the infrared image improved.
A method, an apparatus, a computer device, and a readable storage medium for generating an infrared image according to an embodiment of the present application are described below with reference to the drawings.
Fig. 1 is a schematic flowchart of a method for generating an infrared image according to an embodiment of the present disclosure.
An execution subject of the embodiment of the present application is an infrared image generation apparatus provided in the present application, and the infrared image generation apparatus may be configured in an electronic device or a server networked with the electronic device to control the electronic device.
The electronic device may be any device, apparatus or machine with computing processing capability, for example, the electronic device may be a Personal Computer (PC), a mobile terminal, an intelligent robot, and the like, and the mobile terminal may be a mobile phone, a tablet Computer, a Personal digital assistant, a wearable device, an in-vehicle device, and other hardware devices with various operating systems, touch screens, and/or display screens.
As shown in fig. 1, the method for generating an infrared image includes the following steps:
Step 101, acquiring a visible light image for which an infrared image is to be generated.
In this embodiment of the application, the visible light image may be an image stored locally on the electronic device, such as a previously downloaded or captured image; an image collected by the electronic device; an image browsed online; or a new image obtained after image processing, which is not limited in this application. The visible light image is an image based on visual perception, for example a color image or a Red Green Blue (RGB) image.
In the embodiment of the application, the electronic device can be provided with a visible light camera, such as an RGB camera, for collecting a visible light image. The number of the visible light cameras is not limited, and may be, for example, one or more. The form of the visible light camera disposed in the electronic device is not limited, and for example, the visible light camera may be a camera built in the electronic device, or may be a camera externally disposed on the electronic device, and for example, the visible light camera may be a front camera or a rear camera.
In the embodiment of the application, a user operation can be detected, and the visible light image for which an infrared image is to be generated can be acquired in response to the user operation. Alternatively, image acquisition may be performed continuously or intermittently to obtain the visible light image.
And 102, acquiring the attribute information of the shooting scene for the visible light image.
In this embodiment of the application, the attribute information of the shooting scene may include attributes such as the foreground object in the foreground region and the background object in the background region of the visible light image, and the time, place, and weather of the shooting scene.
In the embodiment of the application, the foreground object and the background object in the visible light image can be identified based on an image recognition algorithm. The time of the shooting scene of the visible light image may be its acquisition time. It should be understood that each image may carry attribute data recording related information about the image, such as the generation time, the acquisition time, the source type (e.g., received, downloaded, or produced with an image processing application), the image size (e.g., in KB), the image format (e.g., jpg, bmp), the image resolution (e.g., pixels per inch), the directory where the image is stored, and so on. The place and weather of the shooting scene may be manually annotated, or may be determined based on an image recognition algorithm, which is not limited in this application.
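As an illustration of reading such attribute data, the following is a minimal Python sketch using the Pillow library; it assumes the capture time is recorded in the image's EXIF attribute data, and the function name is hypothetical:

```python
from PIL import Image
from PIL.ExifTags import TAGS

def read_capture_time(path):
    """Read the capture time recorded in an image's EXIF attribute data, if any."""
    exif = Image.open(path).getexif()
    # Map numeric EXIF tag ids to human-readable names.
    named = {TAGS.get(tag_id, tag_id): value for tag_id, value in exif.items()}
    # EXIF stores timestamps as strings such as "2020:08:17 14:32:05".
    return named.get("DateTime")
```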
And 103, inquiring the historically collected infrared images according to the attribute information to take the matched infrared images as style images.
In the embodiment of the application, the style image is a historically collected, actually measured infrared image; for example, it may be obtained in advance through actual measurement by an infrared sensor. The attribute information of the shooting scene of the style image matches the attribute information of the shooting scene of the visible light image. For example, for each historically collected infrared image, the similarity between the attribute information of its shooting scene and that of the shooting scene of the visible light image may be computed with a similarity algorithm, and the computed similarity taken as the matching degree between the two; when the matching degree is higher than a preset threshold, the attribute information of the two shooting scenes is determined to match. The similarity may be a Euclidean distance similarity, a Manhattan distance similarity, a cosine similarity, or the like, which is not limited in the application. The preset threshold is set in advance and may be, for example, 80%, 85%, or 90%.
In the embodiment of the application, each historically collected infrared image can be queried from stored data and matched against the acquired visible light image: the matching degree between the attribute information of the shooting scene of the historically collected infrared image and that of the visible light image is calculated, and if the matching degree is higher than the preset threshold, the historically collected infrared image can be used as the style image.
In practical application, the attribute information of the shooting scenes of multiple historically collected infrared images may match that of the shooting scene of the visible light image.
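To make the matching step concrete, the following is a minimal sketch, assuming the attribute information of each shooting scene has been encoded as a numeric vector (for example, one-hot encodings of foreground object, background object, season, time of day, and weather); the vector encoding, the `historical` list structure, and the threshold value are assumptions, not fixed by the application:

```python
import math

def attribute_similarity(a, b):
    """Cosine similarity between two attribute vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def match_style_images(visible_attrs, historical, threshold=0.85):
    """Return the historically collected infrared images whose shooting-scene
    attributes match those of the visible light image (similarity > threshold)."""
    return [image for image, attrs in historical
            if attribute_similarity(visible_attrs, attrs) > threshold]
```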
And 104, performing feature extraction on the visible light image by adopting a multilayer convolution neural network to obtain a first reference feature of the visible light image.
In the embodiment of the application, the style of the visible light image can be transferred according to the style image to obtain the target infrared image. The core idea of style transfer is to restore, from the intermediate features of the convolutional layers, an image corresponding to those features. An actually measured infrared image containing a typical ground target entity is input, and the features of each convolutional layer are obtained through calculation by the neural network model; from these convolutional features, a model describing the style of the infrared image is constructed, and a style loss function is defined to describe the style difference between the input infrared image (i.e., the style image) and the generated infrared image (i.e., the target infrared image): the smaller the value of the style loss function, the closer the styles of the two images; the larger the value, the larger the style difference. In this way, the style of the input infrared image can be restored using the style loss function and gradient descent, and from the visible light image an image with the same style as the input infrared image is generated, so the target infrared image can be produced rapidly.
Specifically, a multilayer convolutional neural network is first used to perform feature extraction on the visible light image to obtain the first reference features of the visible light image. The multilayer convolutional neural network may be a VGGNet (Visual Geometry Group Network). For example, the visible light image may be denoted $\vec{p}$; performing feature extraction on $\vec{p}$ with the multilayer convolutional neural network yields the first reference features $P^l_{ij}$, where $P^l_{ij}$ denotes the first reference feature at the ith channel and the jth position of the output of the lth convolutional layer.
It should be noted that, in general, the output features of a convolutional neural network have three dimensions, corresponding to (height, width, channel); here the specific height and width need not be considered separately, and the feature map is flattened so that a single position index $j$ is used.
Step 105, performing feature extraction on the style image with the multilayer convolutional neural network to obtain the second reference features of the style image.
In the embodiment of the application, the style image is denoted $\vec{a}$; performing feature extraction on $\vec{a}$ with the multilayer convolutional neural network yields the second reference features $S^l_{ij}$, where $S^l_{ij}$ denotes the second reference feature at the ith channel and the jth position of the output of the lth convolutional layer.
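As a sketch of steps 104 and 105, the following extracts convolutional features with torchvision's VGG19. The application names VGGNet but fixes neither the variant nor the layers used, so the VGG19 weights and the layer indices below (a common choice in style transfer work) are assumptions; the same function yields the first reference features $P^l$ from the visible light image and the second reference features $S^l$ from the style image:

```python
import torch
import torchvision.models as models

# Assumed layer choices: conv4_2 for content, conv1_1..conv5_1 for style.
CONTENT_LAYERS = {21}
STYLE_LAYERS = {0, 5, 10, 19, 28}

vgg = models.vgg19(weights=models.VGG19_Weights.DEFAULT).features.eval()

def extract_features(image, layers):
    """Run a normalized 1x3xHxW image through VGG19 and collect the outputs of
    the requested layers, flattening height and width into one position axis
    so each layer's features form a (channels N_l, positions M_l) matrix."""
    feats = {}
    x = image
    last = max(layers)
    for idx, layer in enumerate(vgg):
        x = layer(x)
        if idx in layers:
            channels = x.shape[1]
            feats[idx] = x.reshape(channels, -1)  # F^l indexed by (i=channel, j=position)
        if idx >= last:
            break
    return feats
```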
Step 106, initializing the simulated image.
Step 107, performing multiple rounds of the iterative update process on the simulated image according to the first and second reference features, so as to minimize the difference between the features of the simulated image and the first and second reference features.
In the embodiment of the present application, each round of the iterative update process includes: performing feature extraction, with the multilayer convolutional neural network, on the simulated image as initialized or as updated in the previous round. For example, the simulated image is denoted $\vec{x}$; after feature extraction on $\vec{x}$ with the multilayer convolutional neural network, the obtained features of the simulated image are $F^l_{ij}$, where $F^l_{ij}$ denotes the feature of the simulated image at the ith channel and the jth position of the output of the lth convolutional layer. Then, the value of the content loss function is determined according to the difference between the features $F^l_{ij}$ of the simulated image and the first reference features $P^l_{ij}$; for example, the features $F^l_{ij}$ of the simulated image $\vec{x}$ and the first reference features $P^l_{ij}$ of the visible light image $\vec{p}$ may be substituted into the content loss formula to determine the value of the content loss function $L_{content}(\vec{p}, \vec{x})$.

The content loss function is

$$L_{content}(\vec{p}, \vec{x}) = \sum_l w_l \cdot \frac{1}{2} \sum_{i,j} \left( F^l_{ij} - P^l_{ij} \right)^2,$$

where $w_l$ denotes the preset weight of the lth convolutional layer. $w_l$ may be set based on an empirical value, such as 220, or may be 0, which is not limited here. In the present application, the content loss function $L_{content}$ indicates the degree of content difference between the visible light image $\vec{p}$ and the simulated image $\vec{x}$: the smaller $L_{content}$ is, the smaller the content gap between $\vec{p}$ and $\vec{x}$; the larger $L_{content}$ is, the larger the content gap.
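A direct transcription of this content loss formula, reusing the (channels, positions) feature matrices produced by the extraction sketch above; the per-layer weights dictionary is an assumption:

```python
def content_loss(sim_feats, ref_feats, layer_weights):
    """L_content = sum_l w_l * 1/2 * sum_{i,j} (F^l_ij - P^l_ij)^2."""
    loss = 0.0
    for l, w in layer_weights.items():
        F_l = sim_feats[l]  # features of the simulated image at layer l
        P_l = ref_feats[l]  # first reference features of the visible light image
        loss = loss + w * 0.5 * ((F_l - P_l) ** 2).sum()
    return loss
```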
Meanwhile, the value of the style loss function can be determined according to the difference between the features of the simulated image and the second reference features. For example, from the features $F^l_{ij}$ of the simulated image $\vec{x}$, the style matrix $G^l$ corresponding to $\vec{x}$ is determined, where the element in the qth row and rth column of $G^l$ is

$$G^l_{qr} = \sum_k F^l_{qk} F^l_{rk},$$

$F^l_{qk}$ denotes the feature of the simulated image at channel $i = q$ and position $j = k$ of the output of the lth convolutional layer, $F^l_{rk}$ denotes the feature of the simulated image at channel $i = r$ and position $j = k$, and $k$ is a natural number whose range is at most the total number of positions, the total number of positions being the product $M_l$ of the height and width of the lth convolutional layer's output. That is, $G^l$ is the matrix formed from the inner products of the vectorized channel feature maps $F^l_i$; with $N_l$ the number of channels of the convolutional layer, $G^l \in \mathbb{R}^{N_l \times N_l}$.

Furthermore, from the second reference features $S^l_{ij}$ of the style image $\vec{a}$, the style matrix $A^l$ corresponding to $\vec{a}$ is determined, where the element in the qth row and rth column of $A^l$ is

$$A^l_{qr} = \sum_k S^l_{qk} S^l_{rk},$$

$S^l_{qk}$ denotes the second reference feature at channel $i = q$ and position $j = k$ of the output of the lth convolutional layer, and $S^l_{rk}$ denotes the second reference feature at channel $i = r$ and position $j = k$. Finally, the elements $G^l_{qr}$ of the style matrix corresponding to the simulated image $\vec{x}$ and the elements $A^l_{qr}$ of the style matrix corresponding to the style image $\vec{a}$ are substituted into the style loss formula to determine the value of the style loss function $L_{style}(\vec{a}, \vec{x})$.

The style loss function is

$$L_{style}(\vec{a}, \vec{x}) = \sum_l w_l E_l, \qquad E_l = \frac{1}{4 N_l^2 M_l^2} \sum_{q,r} \left( G^l_{qr} - A^l_{qr} \right)^2,$$

where $\frac{1}{4 N_l^2 M_l^2}$ is a normalization term. The normalization term prevents the magnitude of the style loss function from becoming too large compared to the content loss function. Computing with the multilayer style loss function, which is a weighted accumulation of the single-layer style loss functions, improves the accuracy of the output result.
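The style matrices and the multilayer style loss transcribe directly into code; again the feature matrices are the (N_l, M_l) outputs of the extraction sketch above, and the layer weights are assumptions:

```python
def gram_matrix(feats):
    """Style matrix of a (N_l, M_l) feature matrix: G^l_qr = sum_k F^l_qk * F^l_rk."""
    return feats @ feats.t()

def style_loss(sim_feats, style_feats, layer_weights):
    """L_style = sum_l w_l * E_l, with
    E_l = 1/(4 N_l^2 M_l^2) * sum_{q,r} (G^l_qr - A^l_qr)^2."""
    loss = 0.0
    for l, w in layer_weights.items():
        n_l, m_l = sim_feats[l].shape     # channels N_l and positions M_l
        G = gram_matrix(sim_feats[l])     # style matrix of the simulated image
        A = gram_matrix(style_feats[l])   # style matrix of the style image
        loss = loss + w * ((G - A) ** 2).sum() / (4 * n_l ** 2 * m_l ** 2)
    return loss
```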
Finally, the value of the total loss function is determined according to the value of the content loss function and the value of the style loss function, and the simulated image is updated with a gradient descent algorithm according to the value of the total loss function; the total loss function is denoted $L_{total}(\vec{p}, \vec{a}, \vec{x})$. It should be understood that the updated simulated image needs to retain the content of the visible light image while also carrying the style of the style image, so as to conform to the basic characteristics of an infrared image. The total loss function may therefore be

$$L_{total}(\vec{p}, \vec{a}, \vec{x}) = \alpha L_{content}(\vec{p}, \vec{x}) + \beta L_{style}(\vec{a}, \vec{x}),$$

where $\alpha$ and $\beta$ are hyperparameters that balance the two losses. The larger $\beta$ is relative to $\alpha$, the closer the restored image (i.e., the target infrared image) will be to the style of the style image.
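Putting the pieces together, the following stitches the earlier sketches into the whole iterative update process of steps 106 to 108. The application specifies gradient descent but not a particular optimizer, initialization, step count, or values of α and β, so Adam, initialization from the visible light image, and the numbers below are all assumptions:

```python
def generate_infrared(visible, style, alpha=1.0, beta=1e4, steps=500, lr=0.01):
    """Minimize L_total = alpha * L_content + beta * L_style over the simulated image."""
    sim = visible.clone().requires_grad_(True)  # step 106: initialize the simulated image
    # Reference features are computed once and detached from the graph.
    p_feats = {l: f.detach() for l, f in extract_features(visible, CONTENT_LAYERS).items()}
    a_feats = {l: f.detach() for l, f in extract_features(style, STYLE_LAYERS).items()}
    content_w = {l: 1.0 for l in CONTENT_LAYERS}
    style_w = {l: 1.0 / len(STYLE_LAYERS) for l in STYLE_LAYERS}
    optimizer = torch.optim.Adam([sim], lr=lr)  # gradient-based stand-in for plain descent
    for _ in range(steps):                      # step 107: multiple rounds of updates
        optimizer.zero_grad()
        c = content_loss(extract_features(sim, CONTENT_LAYERS), p_feats, content_w)
        s = style_loss(extract_features(sim, STYLE_LAYERS), a_feats, style_w)
        (alpha * c + beta * s).backward()       # L_total drives the update
        optimizer.step()
    return sim.detach()                         # step 108: the target infrared image
```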
In the application, the total loss function combines the content of the visible light image with the style of the style image, so that an existing infrared image whose attributes are similar to those of the visible light image can be used to generate the target infrared image corresponding to the visible light image; the complex infrared physics and atmospheric calculation process need not be considered, and the target infrared image is generated directly, improving simulation efficiency.
Step 108, taking the simulated image after the multiple rounds of the iterative update process as the target infrared image corresponding to the visible light image.
In the embodiment of the application, after the multiple rounds of the iterative update process have been performed on the simulated image so that the difference between the features of the simulated image and the first and second reference features is minimized (that is, the total loss function reaches a preset threshold), the simulated image after the multiple rounds of the iterative update process can be taken as the target infrared image corresponding to the visible light image. In this way, a style transfer algorithm transfers the style of the style image onto the visible light image to obtain the target infrared image. This avoids the calculation processes of prior-art infrared image simulation schemes, such as geometric modeling of the target entity and ground objects, material thermophysical attributes, atmospheric radiation modeling, and heat exchange calculation, and lowers the technical threshold of infrared image simulation: the user only needs to input the visible light image corresponding to the target entity and any infrared image with similar attributes to generate the target infrared image by simulation, which improves simulation efficiency and reduces the complexity of the infrared image generation process.
According to the infrared image generation method of this embodiment, attribute information of the shooting scene of the visible light image is acquired; historically collected infrared images are queried according to the attribute information, and a matched infrared image is taken as the style image; feature extraction is performed on the visible light image with a multilayer convolutional neural network to obtain the first reference features; feature extraction is performed on the style image with the multilayer convolutional neural network to obtain the second reference features; the simulated image is initialized; multiple rounds of the iterative update process are performed on the simulated image according to the first and second reference features, so as to minimize the difference between the features of the simulated image and the first and second reference features; and the simulated image after the multiple rounds of the iterative update process is taken as the target infrared image. In the application, a multilayer convolutional neural network is constructed from the visible light image and an actually measured infrared image, the conversion from the visible light image to the infrared image is realized through a transfer algorithm, and the target infrared image is generated automatically; geometric modeling of the target entity and ground objects, material thermophysical properties, atmospheric radiation modeling, heat exchange calculation, and similar processes need not be considered, so knowledge from multiple disciplines is not required, the complexity and operational difficulty of the infrared image generation process can be reduced, and the simulation efficiency of the infrared image improved.
To explain the first embodiment clearly, this embodiment provides another method for generating an infrared image. Fig. 2 is a schematic flowchart of a method for generating an infrared image according to the second embodiment of the present application.
As shown in fig. 2, the method for generating an infrared image may include the steps of:
step 201, a visible light image of an infrared image to be generated is obtained.
The execution process of step 201 may refer to the execution process of step 101 in the above embodiments, which is not described herein again.
Step 202, performing foreground and background recognition on the visible light image to obtain a foreground and a background.
As a possible implementation manner, region of interest (ROI) identification may be performed on the visible light image, the identified ROI taken as the foreground, and the region of the visible light image other than the foreground taken as the background.
As another possible implementation manner, the electronic device may further include a camera capable of acquiring depth information, such as a Time of Flight (TOF) camera, a structured light camera, or an RGB-D (Depth) camera. When the visible light camera is controlled to collect the visible light image, the depth camera can be controlled to synchronously collect a depth map indicating the depth information corresponding to each pixel unit in the visible light image, so that the foreground and background of the visible light image can be determined from the depth map. Generally, a smaller depth value indicates that the photographed object is closer to the plane of the camera, in which case the object can be determined to be foreground; otherwise it is background. The foreground and background of the visible light image can then be determined according to the depth information of each object in the depth map.
As another possible implementation manner, the electronic device may include a main camera and a sub-camera, and one of the main camera and the sub-camera collects a visible light image, and the other of the main camera and the sub-camera collects a test image, so that depth information of the visible light image may be determined according to the test image and the visible light image, and a foreground and a background may be determined according to the depth information of the visible light image.
Specifically, since there is a certain distance between the main camera and the sub-camera, the two cameras have parallax, so images taken by the different cameras differ. According to the principle of triangulation, the depth information of the same object in the visible light image and the test image, that is, the distance between the object and the plane where the main and sub-cameras are located, can be calculated. After the depth information of the visible light image is obtained, whether each object is foreground or background can be determined according to its depth information, yielding the foreground and background of the visible light image.
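As a minimal sketch of the depth-based split described above, assuming a per-pixel depth map aligned with the visible light image is already available (from a TOF, structured light, or RGB-D camera, or from dual-camera triangulation); the median cut is an assumed heuristic, not part of the application:

```python
import numpy as np

def split_foreground_background(depth_map, threshold=None):
    """Split pixels into foreground/background masks: a smaller depth means the
    object is closer to the camera plane and is treated as foreground."""
    if threshold is None:
        threshold = np.median(depth_map)  # assumed adaptive cut between near and far
    foreground = depth_map < threshold
    return foreground, ~foreground
```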
Step 203, respectively performing object identification on the foreground and the background to obtain a foreground object presented by the foreground and a background object presented by the background.
In the embodiment of the application, after determining the foreground and the background in the visible light image, the object recognition may be performed on the foreground and the background, respectively, based on an object detection algorithm, so as to obtain a foreground object represented by the foreground and a background object represented by the background.
Step 204, taking the time, place, and weather of the shooting scene of the visible light image, together with the foreground object and the background object of the shooting scene, as the attribute information.
The time of the shooting scene of the visible light image can be its acquisition time and can be determined from the attribute data of the visible light image. The place and weather of the shooting scene may be manually annotated, or may be determined based on an image recognition algorithm, which is not limited in this application.
It should be understood that the temperature differences over the surface of the target entity vary from place to place, from season to season, and from time to time within a day; therefore, in order for the simulated image to exhibit these characteristics, in the present application the time of the shooting scene may indicate the season and the time of day.
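A small sketch of deriving these two attributes from the capture time; the Northern-Hemisphere month buckets and the time-of-day boundaries are assumptions:

```python
from datetime import datetime

def scene_time_attributes(capture_time: datetime):
    """Derive the season and time-of-day attributes indicated by the capture time."""
    seasons = {12: "winter", 1: "winter", 2: "winter",
               3: "spring", 4: "spring", 5: "spring",
               6: "summer", 7: "summer", 8: "summer",
               9: "autumn", 10: "autumn", 11: "autumn"}
    hour = capture_time.hour
    period = ("night" if hour < 6 else "morning" if hour < 12
              else "afternoon" if hour < 18 else "evening")
    return seasons[capture_time.month], period
```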
Step 205, querying the historically collected infrared images according to the attribute information, so as to take a matched infrared image as the style image.
In the embodiment of the application, the style image is a historically collected, actually measured infrared image; for example, it may be obtained in advance through actual measurement by an infrared sensor. The attribute information of the shooting scene of the style image matches the attribute information of the shooting scene of the visible light image.
Step 206, performing feature extraction on the visible light image with a multilayer convolutional neural network to obtain the first reference features of the visible light image.
Step 207, performing feature extraction on the style image with the multilayer convolutional neural network to obtain the second reference features of the style image.
Step 208, initializing the simulated image.
Step 209, performing multiple rounds of the iterative update process on the simulated image according to the first and second reference features, so as to minimize the difference between the features of the simulated image and the first and second reference features.
Each round of the iterative update process includes: performing feature extraction, with the multilayer convolutional neural network, on the simulated image as initialized or as updated in the previous round; determining the value of the content loss function according to the difference between the features of the simulated image and the first reference features; determining the value of the style loss function according to the difference between the features of the simulated image and the second reference features; determining the value of the total loss function according to the value of the content loss function and the value of the style loss function; and updating the simulated image with the gradient descent algorithm according to the value of the total loss function.
Step 210, taking the simulated image after the multiple rounds of the iterative update process as the target infrared image corresponding to the visible light image.
In the embodiment of the application, the features of the simulated image can be changed gradually by gradient descent according to the total loss function. After the multiple rounds of the iterative update process have been performed, the total loss function reaches a preset threshold and the difference between the features of the simulated image and the first and second reference features is minimized; at this point, the simulated image after the multiple rounds of the iterative update process can be taken as the target infrared image corresponding to the visible light image.
In the embodiment of the application, a computer image processing method is adopted, drawing on the transfer learning theory of artificial intelligence: convolutional-layer feature models are constructed for a visible light image containing a typical target entity and ground objects and for a style image, obtained through actual measurement by an infrared sensor, whose attributes are similar to those of the visible light image; the conversion from the visible light image to the infrared image is realized through a transfer algorithm, and the required target infrared image is generated automatically.
In order to implement the above embodiments, the present application further provides an infrared image generating device.
Fig. 3 is a schematic structural diagram of an infrared image generation apparatus according to a third embodiment of the present application.
As shown in fig. 3, the infrared image generating apparatus 100 may include: an image acquisition module 110, an attribute acquisition module 120, a query module 130, a first extraction module 140, a second extraction module 150, a processing module 160, an iterative update module 170, and a generation module 180.
The image acquisition module 110 is configured to acquire a visible light image for which an infrared image is to be generated.
The attribute acquisition module 120 is configured to acquire attribute information of the shooting scene of the visible light image.
The query module 130 is configured to query historically collected infrared images according to the attribute information, so as to take a matched infrared image as the style image.
The first extraction module 140 is configured to perform feature extraction on the visible light image with a multilayer convolutional neural network to obtain the first reference features of the visible light image.
The second extraction module 150 is configured to perform feature extraction on the style image with the multilayer convolutional neural network to obtain the second reference features of the style image.
The processing module 160 is configured to initialize the simulated image.
The iterative update module 170 performs multiple rounds of the iterative update process on the simulated image according to the first and second reference features, so as to minimize the difference between the features of the simulated image and the first and second reference features; each round of the iterative update process includes: performing feature extraction, with the multilayer convolutional neural network, on the simulated image as initialized or as updated in the previous round; determining the value of the content loss function according to the difference between the features of the simulated image and the first reference features; determining the value of the style loss function according to the difference between the features of the simulated image and the second reference features; determining the value of the total loss function according to the value of the content loss function and the value of the style loss function; and updating the simulated image with a gradient descent algorithm according to the value of the total loss function.
Further, in a possible implementation manner of the embodiment of the present application, the iterative update module 170 is specifically configured to: substitute the features $F^l$ of the simulated image $\vec{x}$ and the first reference features $P^l$ of the visible light image $\vec{p}$ into the content loss function formula to determine the value of the content loss function $L_{content}(\vec{x}, \vec{p})$.

Wherein $P^l_{ij}$ denotes the first reference feature at the $i$th channel and the $j$th position output by the $l$-th layer of the convolutional neural network, and $F^l_{ij}$ denotes the feature of the simulated image at the $i$th channel and the $j$th position output by the $l$-th layer of the convolutional neural network. The content loss function is

$$L_{content}(\vec{x}, \vec{p}) = \sum_{l} w_l E_l^{content}, \qquad E_l^{content} = \frac{1}{2} \sum_{i,j} \left( F^l_{ij} - P^l_{ij} \right)^2,$$

where $w_l$ denotes the preset weight of the $l$-th layer of the convolutional neural network.
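As a concrete illustration, the content loss above could be computed as follows; the uniform default layer weights $w_l$ are an assumption.

    def content_loss(f_feats, p_feats, weights=None):
        """L_content = sum_l w_l * 1/2 * sum_{i,j} (F^l_ij - P^l_ij)^2."""
        weights = weights or [1.0 / len(f_feats)] * len(f_feats)
        return sum(w * 0.5 * ((F - P) ** 2).sum()
                   for w, F, P in zip(weights, f_feats, p_feats))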
Further, in a possible implementation manner of the embodiment of the present application, the iterative update module 170 is further configured to: determine, from the features $F^l$ of the simulated image $\vec{x}$, the style matrix $G^l$ corresponding to the simulated image, wherein the element in the $q$th row and $r$th column of $G^l$ is

$$G^l_{qr} = \sum_{k} F^l_{qk} F^l_{rk},$$

where $F^l_{qk}$ denotes the feature of the simulated image with channel number $i = q$ and position $j = k$ output by the $l$-th layer of the convolutional neural network, $F^l_{rk}$ denotes the feature with channel number $i = r$ and position $j = k$, and $k$ is a natural number no greater than the total number of positions; determine, from the second reference features $S^l$ of the style image $\vec{a}$ (where $S^l_{ij}$ denotes the second reference feature at the $i$th channel and the $j$th position output by the $l$-th layer), the style matrix $A^l$ corresponding to the style image, wherein the element in the $q$th row and $r$th column of $A^l$ is

$$A^l_{qr} = \sum_{k} S^l_{qk} S^l_{rk},$$

where $S^l_{qk}$ and $S^l_{rk}$ denote the second reference features with channel numbers $i = q$ and $i = r$, respectively, at position $j = k$; and substitute the elements $G^l_{qr}$ of the style matrix corresponding to the simulated image $\vec{x}$ and the elements $A^l_{qr}$ of the style matrix corresponding to the style image $\vec{a}$ into the style loss function formula to determine the value of the style loss function $L_{style}(\vec{x}, \vec{a})$.

Wherein the style loss function is

$$L_{style}(\vec{x}, \vec{a}) = \sum_{l} w_l E_l^{style}, \qquad E_l^{style} = \frac{1}{4 N_l^2 M_l^2} \sum_{q,r} \left( G^l_{qr} - A^l_{qr} \right)^2,$$

where $\frac{1}{4 N_l^2 M_l^2}$ is a normalization term, $N_l$ being the number of channels and $M_l$ the number of positions in the output of the $l$-th layer.
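Likewise, the style matrices and style loss could be computed per the formulas above; the batch-shaped tensors and the uniform default layer weights are assumptions.

    def gram(feat):
        """Style matrix: G^l_qr = sum_k F^l_qk F^l_rk over flattened positions k."""
        n, c, h, w = feat.shape                  # N_l = c channels, M_l = h*w positions
        flat = feat.view(n, c, h * w)
        return flat @ flat.transpose(1, 2)

    def style_loss(f_feats, s_feats, weights=None):
        """L_style = sum_l w_l * (1 / (4 N_l^2 M_l^2)) * sum_{q,r} (G^l_qr - A^l_qr)^2."""
        weights = weights or [1.0 / len(f_feats)] * len(f_feats)
        loss = 0.0
        for w, F, S in zip(weights, f_feats, s_feats):
            n_l, m_l = F.shape[1], F.shape[2] * F.shape[3]
            G, A = gram(F), gram(S)
            loss = loss + w * ((G - A) ** 2).sum() / (4 * n_l ** 2 * m_l ** 2)
        return loss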
The generation module 180 is configured to take the simulated image, after the multiple rounds of the iterative update process have been executed, as the target infrared image corresponding to the visible light image.
Further, in a possible implementation manner of the embodiment of the present application, the attribute acquisition module 120 is specifically configured to: perform foreground and background identification on the visible light image to obtain the foreground and the background; perform object identification on the foreground and the background, respectively, to obtain a foreground object presented by the foreground and a background object presented by the background; and take the time, place, and weather of the shooting scene of the visible light image, together with the foreground object and the background object of the scene, as the attribute information.
Further, in a possible implementation manner of the embodiment of the present application, the time at which the scene is shot is used to indicate the season and the time of day.
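By way of illustration only, the attribute acquisition described above might be organized as follows; segment_foreground and detect_objects are hypothetical helper functions, and the metadata fields are assumptions, since the patent does not prescribe specific recognition models or data formats.

    def obtain_attributes(visible_image, metadata):
        """Collect the scene attributes the patent names: time-derived season and
        time of day, place, weather, and recognized foreground/background objects."""
        fg_mask = segment_foreground(visible_image)              # hypothetical helper
        foreground_objects = detect_objects(visible_image, fg_mask)
        background_objects = detect_objects(visible_image, ~fg_mask)
        return {
            "season": metadata["season"],            # derived from the shooting time
            "time_of_day": metadata["time_of_day"],  # likewise derived from the time
            "place": metadata["place"],
            "weather": metadata["weather"],
            "foreground_objects": foreground_objects,
            "background_objects": background_objects,
        }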
It should be noted that the foregoing explanation on the embodiment of the method for generating an infrared image is also applicable to the apparatus for generating an infrared image of this embodiment, and details are not described here again.
The infrared image generation device of the embodiment of the application acquires attribute information of the shooting scene of the visible light image; queries historically acquired infrared images according to the attribute information, so as to take the matched infrared image as the style image; performs feature extraction on the visible light image with a multilayer convolutional neural network to obtain first reference features; performs feature extraction on the style image with the same network to obtain second reference features; initializes the simulated image; performs multiple rounds of an iterative update process on the simulated image based on the first and second reference features, so as to minimize the difference between the features of the simulated image and the first and second reference features; and takes the simulated image after the multiple rounds of iterative updating as the target infrared image. In this application, a multilayer convolutional neural network is constructed from the visible light image and the actually measured infrared image, the conversion from visible light image to infrared image is realized through a transfer algorithm, and the target infrared image is generated automatically. Geometric modeling of the target entity and ground objects, material thermophysical properties, atmospheric radiation modeling, heat exchange calculation, and the like need not be considered, so knowledge from multiple subject fields is not required; the complexity and operating difficulty of the infrared image generation process can thus be reduced, and the simulation efficiency of infrared images improved.
In order to implement the above embodiments, the present application further provides a computer device, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor; when the processor executes the program, the method for generating an infrared image set forth in the foregoing embodiments of the present application is implemented.
In order to achieve the above embodiments, the present application also proposes a non-transitory computer-readable storage medium on which a computer program is stored, which when executed by a processor implements the method for generating an infrared image as proposed in the foregoing embodiments of the present application.
FIG. 4 illustrates a block diagram of an exemplary computer device suitable for implementing embodiments of the present application. The computer device 12 shown in FIG. 4 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present application.
As shown in FIG. 4, computer device 12 is in the form of a general purpose computing device. The components of computer device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 that couples various system components including the system memory 28 and the processing unit 16.
Bus 18 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, a processor, or a local bus using any of a variety of bus architectures. By way of example, these architectures include, but are not limited to, an Industry Standard Architecture (ISA) bus, a Micro Channel Architecture (MCA) bus, an Enhanced ISA (EISA) bus, a Video Electronics Standards Association (VESA) local bus, and a Peripheral Component Interconnect (PCI) bus.
Computer device 12 typically includes a variety of computer system readable media. Such media may be any available media that is accessible by computer device 12 and includes both volatile and nonvolatile media, removable and non-removable media.
Memory 28 may include computer system readable media in the form of volatile memory, such as random access memory (RAM) 30 and/or cache memory 32. Computer device 12 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 4, commonly referred to as a "hard drive"). Although not shown in FIG. 4, a disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a Compact Disc Read-Only Memory (CD-ROM), a Digital Versatile Disc Read-Only Memory (DVD-ROM), or other optical media) may be provided. In these cases, each drive may be connected to bus 18 by one or more data media interfaces. Memory 28 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the application.
A program/utility 40 having a set (at least one) of program modules 42 may be stored, for example, in memory 28, such program modules 42 including but not limited to an operating system, one or more application programs, other program modules, and program data, each of which or some combination of which may comprise an implementation of a network environment. Program modules 42 generally perform the functions and/or methodologies of the embodiments described herein.
Computer device 12 may also communicate with one or more external devices 14 (e.g., a keyboard, a pointing device, a display 24, etc.), with one or more devices that enable a user to interact with computer device 12, and/or with any devices (e.g., a network card, a modem, etc.) that enable computer device 12 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 22. Moreover, computer device 12 may also communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) via network adapter 20. As shown, network adapter 20 communicates with the other modules of computer device 12 via bus 18. It should be understood that, although not shown in the figures, other hardware and/or software modules may be used in conjunction with computer device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
The processing unit 16 executes various functional applications and data processing, for example, implementing the methods mentioned in the foregoing embodiments, by running a program stored in the system memory 28.
In the description of the present specification, reference to the description of "one embodiment," "some embodiments," "an example," "a specific example," or "some examples" means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, the schematic representations of these terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Moreover, those skilled in the art can combine the various embodiments or examples, and the features of the various embodiments or examples, described in this specification, provided they are not mutually inconsistent.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present application, "plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code that include one or more executable instructions for implementing steps of a custom logic function or process. Alternative implementations are included within the scope of the preferred embodiments of the present application, in which functions may be executed out of the order shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present application.
The logic and/or steps represented in the flowcharts or otherwise described herein, for example, an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, a processor-containing system, or another system that can fetch the instructions from the instruction execution system, apparatus, or device and execute them. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via, for instance, optical scanning of the paper or other medium, then compiled, interpreted, or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps of the methods of the above embodiments may be implemented by instructing the relevant hardware through a program, which may be stored in a computer-readable storage medium; when executed, the program performs one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present application may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disk, or the like.

Although embodiments of the present application have been shown and described above, it should be understood that the above embodiments are exemplary and should not be construed as limiting the present application; variations, modifications, substitutions, and alterations may be made to the above embodiments by those of ordinary skill in the art within the scope of the present application.

Claims (10)

1. A method for generating an infrared image is characterized by comprising the following steps:
acquiring a visible light image for which an infrared image is to be generated;
acquiring attribute information of a shooting scene for the visible light image;
querying historically acquired infrared images according to the attribute information, so as to take the matched infrared image as a style image;
performing feature extraction on the visible light image by adopting a multilayer convolution neural network to obtain a first reference feature of the visible light image;
performing feature extraction on the style image by adopting the multilayer convolution neural network to obtain a second reference feature of the style image;
initializing a simulation image;
performing multiple rounds of an iterative update process on the simulation image based on the first reference feature and the second reference feature, so as to minimize the difference between the features of the simulation image and the first and second reference features; wherein each round of the iterative update process comprises: performing feature extraction, using the multilayer convolutional neural network, on the initialized simulation image or on the simulation image updated in the previous round; determining the value of a content loss function according to the difference between the features of the simulation image and the first reference features; determining the value of a style loss function according to the difference between the features of the simulation image and the second reference features; determining the value of a total loss function according to the value of the content loss function and the value of the style loss function; and updating the simulation image by a gradient descent algorithm according to the value of the total loss function;
and taking the simulation image after the multi-round iterative updating process as a target infrared image corresponding to the visible light image.
2. The method of generating of claim 1, wherein determining a value of a content loss function based on a difference between the feature of the simulated image and the first reference feature comprises:
substituting the features $F^l$ of the simulation image $\vec{x}$ and the first reference features $P^l$ of the visible light image $\vec{p}$ into a content loss function formula to determine the value of the content loss function $L_{content}(\vec{x}, \vec{p})$;

wherein $P^l_{ij}$ represents the first reference feature at the $i$th channel and the $j$th position output by the $l$-th layer of the convolutional neural network; $F^l_{ij}$ represents the feature of the simulation image at the $i$th channel and the $j$th position output by the $l$-th layer of the convolutional neural network; the content loss function is

$$L_{content}(\vec{x}, \vec{p}) = \sum_{l} w_l E_l^{content}, \qquad E_l^{content} = \frac{1}{2} \sum_{i,j} \left( F^l_{ij} - P^l_{ij} \right)^2,$$

where $w_l$ represents the preset weight of the $l$-th layer of the convolutional neural network.
3. The method of generating according to claim 2, wherein determining a value of a style loss function according to a difference between the feature of the simulated image and the second reference feature comprises:
determining, from the features $F^l$ of the simulation image $\vec{x}$, the style matrix $G^l$ corresponding to the simulation image $\vec{x}$; wherein the element in the $q$th row and $r$th column of $G^l$ is

$$G^l_{qr} = \sum_{k} F^l_{qk} F^l_{rk},$$

where $F^l_{qk}$ represents the feature of the simulation image with channel number $i = q$ and position $j = k$ output by the $l$-th layer of the convolutional neural network, $F^l_{rk}$ represents the feature of the simulation image with channel number $i = r$ and position $j = k$ output by the $l$-th layer of the convolutional neural network, and $k$ is a natural number no greater than the total number of positions;

determining, from the second reference features $S^l$ of the style image $\vec{a}$, the style matrix $A^l$ corresponding to the style image $\vec{a}$; wherein $S^l_{ij}$ represents the second reference feature at the $i$th channel and the $j$th position output by the $l$-th layer of the convolutional neural network, and the element in the $q$th row and $r$th column of $A^l$ is

$$A^l_{qr} = \sum_{k} S^l_{qk} S^l_{rk},$$

where $S^l_{qk}$ represents the second reference feature with channel number $i = q$ and position $j = k$ output by the $l$-th layer of the convolutional neural network, and $S^l_{rk}$ represents the second reference feature with channel number $i = r$ and position $j = k$ output by the $l$-th layer of the convolutional neural network;

substituting the elements $G^l_{qr}$ of the style matrix corresponding to the simulation image $\vec{x}$ and the elements $A^l_{qr}$ of the style matrix corresponding to the style image $\vec{a}$ into a style loss function formula to determine the value of the style loss function $L_{style}(\vec{x}, \vec{a})$;

wherein the style loss function is

$$L_{style}(\vec{x}, \vec{a}) = \sum_{l} w_l E_l^{style}, \qquad E_l^{style} = \frac{1}{4 N_l^2 M_l^2} \sum_{q,r} \left( G^l_{qr} - A^l_{qr} \right)^2,$$

where $\frac{1}{4 N_l^2 M_l^2}$ is a normalization term, $N_l$ being the number of channels and $M_l$ the number of positions in the output of the $l$-th layer.
4. The generation method according to any one of claims 1 to 3, wherein the acquiring, for the visible light image, attribute information of a shooting scene includes:
performing foreground and background identification on the visible light image to obtain the foreground and the background;
respectively carrying out object identification on the foreground and the background to obtain a foreground object presented by the foreground and a background object presented by the background;
and taking the time, place, and weather of the shooting scene of the visible light image, together with the foreground object and the background object of the shooting scene, as the attribute information.
5. The generation method according to claim 4, wherein the time at which the scene is shot is used to indicate a season and a time of day.
6. An infrared image generation apparatus, comprising:
the image acquisition module is used for acquiring a visible light image for which an infrared image is to be generated;
the attribute acquisition module is used for acquiring attribute information of a shooting scene for the visible light image;
the query module is used for querying the historically acquired infrared images according to the attribute information so as to take the matched infrared images as style images;
the first extraction module is used for performing feature extraction on the visible light image by adopting a multilayer convolution neural network to obtain a first reference feature of the visible light image;
the second extraction module is used for carrying out feature extraction on the style image by adopting the multilayer convolutional neural network so as to obtain a second reference feature of the style image;
the processing module is used for initializing the simulation image;
an iterative update module, configured to perform multiple rounds of an iterative update process on the simulation image based on the first reference feature and the second reference feature, so as to minimize the difference between the features of the simulation image and the first and second reference features; wherein each round of the iterative update process comprises: performing feature extraction, using the multilayer convolutional neural network, on the initialized simulation image or on the simulation image updated in the previous round; determining the value of a content loss function according to the difference between the features of the simulation image and the first reference features; determining the value of a style loss function according to the difference between the features of the simulation image and the second reference features; determining the value of a total loss function according to the value of the content loss function and the value of the style loss function; and updating the simulation image by a gradient descent algorithm according to the value of the total loss function;
and the generating module is used for taking the simulation image after the multiple rounds of iterative updating processes are executed as the target infrared image corresponding to the visible light image.
7. The generation apparatus according to claim 6, wherein the attribute acquisition module is specifically configured to:
performing foreground and background identification on the visible light image to obtain the foreground and the background;
respectively carrying out object identification on the foreground and the background to obtain a foreground object presented by the foreground and a background object presented by the background;
and taking the time, place, and weather of the shooting scene of the visible light image, together with the foreground object and the background object of the shooting scene, as the attribute information.
8. The generation apparatus according to claim 7, wherein the time at which the scene is shot indicates a season and a time of day.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method of generating an infrared image as claimed in any one of claims 1 to 5 when executing the program.
10. A non-transitory computer-readable storage medium on which a computer program is stored, the program, when being executed by a processor, implementing the method for generating an infrared image according to any one of claims 1 to 5.
CN202010825513.1A 2020-08-17 2020-08-17 Infrared image generation method and device, computer equipment and readable storage medium Active CN112163988B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010825513.1A CN112163988B (en) 2020-08-17 2020-08-17 Infrared image generation method and device, computer equipment and readable storage medium

Publications (2)

Publication Number Publication Date
CN112163988A CN112163988A (en) 2021-01-01
CN112163988B (en) 2022-12-13

Family

ID=73859463

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010825513.1A Active CN112163988B (en) 2020-08-17 2020-08-17 Infrared image generation method and device, computer equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN112163988B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115439573B (en) * 2022-10-27 2023-03-24 北京航空航天大学 Method for generating visible light-to-infrared image based on temperature information coding

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020015470A1 (en) * 2018-07-16 2020-01-23 Oppo广东移动通信有限公司 Image processing method and apparatus, mobile terminal, and computer-readable storage medium
CN111488756A (en) * 2019-01-25 2020-08-04 杭州海康威视数字技术股份有限公司 Face recognition-based living body detection method, electronic device, and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109508580B (en) * 2017-09-15 2022-02-25 阿波罗智能技术(北京)有限公司 Traffic signal lamp identification method and device
US10467820B2 (en) * 2018-01-24 2019-11-05 Google Llc Image style transfer for three-dimensional models


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant