CN111723767A - Image processing method and device and computer storage medium - Google Patents

Image processing method and device and computer storage medium

Info

Publication number
CN111723767A
Authority
CN
China
Prior art keywords
comparison result
foreground
target
picture
processed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010606947.2A
Other languages
Chinese (zh)
Other versions
CN111723767B (en)
Inventor
宋旭鸣
许朝斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Hikvision Digital Technology Co Ltd
Original Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Hikvision Digital Technology Co Ltd filed Critical Hangzhou Hikvision Digital Technology Co Ltd
Priority to CN202010606947.2A priority Critical patent/CN111723767B/en
Publication of CN111723767A publication Critical patent/CN111723767A/en
Application granted granted Critical
Publication of CN111723767B publication Critical patent/CN111723767B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/10 - Terrestrial scenes
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/22 - Matching criteria, e.g. proximity measures
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks


Abstract

The invention provides an image processing method and device and a computer storage medium, relates to the technical field of intelligent video monitoring, and solves the problem of identifying debris present in a fire fighting access. The method comprises: obtaining a picture to be processed; obtaining reference pictures in a current image library; determining at least one comparison result according to the picture to be processed and at least one reference picture in the current image library, where each comparison result comprises at least one foreground object, a foreground object is a region in which the pixels of the picture to be processed differ from those of a reference picture, and each comparison result corresponds to one reference picture; and determining, according to the at least one comparison result, at least one first target foreground contained in the picture to be processed, where a first target foreground is a foreground object that is present in every one of the at least one comparison result.

Description

Image processing method and device and computer storage medium
Technical Field
The invention relates to the technical field of intelligent video monitoring, in particular to an image processing method and device and a computer storage medium.
Background
Traditional detection of debris in a fire fighting access relies mainly on manual safety inspection: a designated worker periodically goes to a specific fire fighting access to check whether it is blocked. This approach is simple, easy to implement, and requires no complex equipment, but it has two drawbacks: first, a blocked fire fighting access cannot be discovered in time, since detection is heavily constrained by the manual inspection period; second, the approach depends greatly on the professional quality and working attitude of the workers, and is therefore highly subjective.
To solve these problems, a common technical means in the prior art is to collect real-time images of the fire fighting access and recognize in them the debris categories on which a detector has been trained; debris of categories that were not trained cannot be recognized.
Disclosure of Invention
The invention provides an image processing method, an image processing device and a computer storage medium, which solve the problem of identifying debris present in a fire fighting access.
In order to achieve the purpose, the invention adopts the following technical scheme:
In a first aspect, an embodiment of the present invention provides an image processing method: when a picture to be processed is obtained, reference pictures in a current image library are obtained, and at least one comparison result can then be determined according to the picture to be processed and at least one reference picture. At least one first target foreground contained in the picture to be processed is then determined according to the at least one comparison result. Each comparison result comprises at least one foreground object; a foreground object is a region in which the pixels of the picture to be processed differ from those of one reference picture; each comparison result corresponds to one reference picture; and a first target foreground is a foreground object that is present in every one of the at least one comparison result.
As can be seen from the above, comparing the picture to be processed with a particular reference picture reveals how the picture to be processed differs from that reference picture, for example by at least one extra foreground object (which may be debris). Likewise, comparing the picture to be processed with every reference picture determines at least one comparison result (each comparison result contains at least one foreground object and corresponds one-to-one with a reference picture). It is easy to understand that if a first target foreground is present in every comparison result, it is an object unique to the picture to be processed, and may specifically be debris. On this principle, the at least one first target foreground determined by the image processing method provided in the present application is an object that exists only in the picture to be processed, so the method can effectively and accurately recognize whatever the first target foreground is.
If the background area is a fire fighting access, objects in the fire fighting access can be effectively identified with the image processing method provided in the present application.
Optionally, this embodiment provides an image processing method applicable to debris detection in a fire fighting access. Combined with the above description, in such a scene the debris present in the picture to be processed can be determined accurately, which solves the problem of identifying debris present in a fire fighting access.
In a possible design manner, the image processing method provided in the embodiment of the present invention further includes: determining at least one preset foreground in a picture to be processed; determining at least one second target foreground according to the at least one first target foreground and the at least one preset foreground; the second target foreground is any one of the at least one first target foreground and does not belong to the at least one preset foreground.
In a possible design manner, the implementation method for determining at least one comparison result according to the picture to be processed and the at least one reference picture includes: and inputting the picture to be processed and at least one reference picture into a pre-trained picture comparison detection model, and determining at least one comparison result.
In a possible design, the training process of the picture comparison detection model includes: acquiring a training sample image and an annotation result of the training sample image, wherein the training sample image comprises a foreground and a background; inputting the training sample image into a deep learning model; determining, based on a target loss function, whether the predicted comparison result of the deep learning model for the training sample image matches the annotation result; and, when the predicted comparison result does not match the annotation result, iteratively updating the network parameters of the deep learning model until the model converges, to obtain the picture comparison detection model.
In a possible design manner, the implementation for determining, according to the at least one comparison result, at least one first target foreground contained in the picture to be processed includes: determining a target frame for each foreground object in each of the at least one comparison result; and performing a first operation on each comparison result to determine the at least one first target foreground. The first operation is: if the target frame of a first foreground object in a first comparison result overlaps one target frame in every comparison result other than the first comparison result, determining that the first foreground object in the first comparison result is a first target foreground; the first comparison result is any one of the at least one comparison result, and the first foreground object is any one foreground object in the first comparison result.
In a possible design manner, the implementation for determining, according to the at least one comparison result, at least one first target foreground contained in the picture to be processed includes: determining a target frame for each foreground object in each of the at least one comparison result; and performing a second operation on each comparison result to determine the at least one first target foreground. The second operation is: if a first target frame overlaps at least two second target frames in a second comparison result, calculating the intersection-over-union (IoU) between the first target frame and each of the at least two second target frames; the first target frame is the target frame of a second foreground object in a third comparison result, the second comparison result and the third comparison result are each any one of the at least one comparison result, the second comparison result is different from the third comparison result, the second foreground object is any one foreground object in the third comparison result, and the IoU is equal to the ratio of the area of the intersection region of the first target frame and a second target frame to the area of their union region; and determining, according to the calculated IoU between the first target frame and each of the at least two second target frames, that a third foreground object in the second comparison result is a first target foreground; the first target frame overlaps the second target frame corresponding to the third foreground object in the second comparison result, the IoU between the first target frame and that second target frame meets a preset condition, and the third foreground object is any one foreground object in the second comparison result.
In a possible design manner, the image processing method provided in the embodiment of the present invention further includes: and when determining that the picture to be processed does not contain the first target foreground according to at least one comparison result, storing the picture to be processed into the current image library.
In a second aspect, the present invention provides an image processing apparatus comprising an acquiring unit, a storage unit and a processing unit. Specifically, the acquiring unit is configured to acquire a picture to be processed; the acquiring unit is further configured to acquire reference pictures in the current image library stored in the storage unit; the processing unit is configured to determine at least one comparison result according to the picture to be processed and the at least one reference picture acquired by the acquiring unit. Each comparison result comprises at least one foreground object, a foreground object is a region in which the pixels of the picture to be processed differ from those of one reference picture, and each comparison result corresponds to one reference picture. The processing unit is further configured to determine, according to the at least one comparison result, at least one first target foreground contained in the picture to be processed. A first target foreground is a foreground object that is present in every one of the at least one comparison result.
In a possible design manner, the processing unit is further configured to determine at least one preset foreground in the to-be-processed picture acquired by the acquiring unit; the processing unit is further configured to determine at least one second target foreground according to the at least one first target foreground and the at least one preset foreground; the second target foreground is any one of the at least one first target foreground and does not belong to the at least one preset foreground.
In a possible design manner, the processing unit is specifically configured to input the picture to be processed acquired by the acquiring unit and at least one reference picture stored in the storage unit into a pre-trained picture comparison detection model, and determine at least one comparison result.
In a possible design, the training process of the picture comparison detection model includes: the acquiring unit is further configured to acquire a training sample image and an annotation result of the training sample image, wherein the training sample image comprises a foreground and a background; the processing unit is further configured to input the training sample image acquired by the acquiring unit into the deep learning model; the processing unit is further configured to determine, based on the target loss function, whether the predicted comparison result of the deep learning model for the training sample image matches the annotation result; and the processing unit is further configured to iteratively update the network parameters of the deep learning model, when the predicted comparison result does not match the annotation result, until the model converges, to obtain the picture comparison detection model.
In a possible design manner, the processing unit is specifically configured to determine a target frame of each foreground object in each comparison result of the at least one comparison result; the processing unit is specifically configured to perform a first operation on each comparison result to determine at least one first target foreground; the first operation is: if the target frame of the first foreground object in the first comparison result is overlapped with one target frame in each comparison result except the first comparison result, determining that the first foreground object in the first comparison result is a first target foreground; the first comparison result is any one of the at least one comparison result, and the first foreground object is any one of the first comparison result.
In a possible design manner, the processing unit is specifically configured to determine a target frame for each foreground object in each of the at least one comparison result; the processing unit is specifically configured to perform a second operation on each comparison result to determine the at least one first target foreground. The second operation is: if a first target frame overlaps at least two second target frames in a second comparison result, calculating the intersection-over-union (IoU) between the first target frame and each of the at least two second target frames; the first target frame is the target frame of a second foreground object in a third comparison result, the second comparison result and the third comparison result are each any one of the at least one comparison result, the second comparison result is different from the third comparison result, the second foreground object is any one foreground object in the third comparison result, and the IoU is equal to the ratio of the area of the intersection region of the first target frame and a second target frame to the area of their union region. The processing unit is specifically configured to determine, according to the calculated IoU between the first target frame and each of the at least two second target frames, that a third foreground object in the second comparison result is a first target foreground; the first target frame overlaps the second target frame corresponding to the third foreground object in the second comparison result, the IoU between the first target frame and that second target frame meets a preset condition, and the third foreground object is any one foreground object in the second comparison result.
In a possible design manner, the processing unit is specifically configured to store the to-be-processed picture in the current image library when it is determined that the to-be-processed picture acquired by the acquiring unit does not include the first target foreground according to the at least one comparison result.
In a third aspect, the present invention provides an image processing apparatus comprising: communication interface, processor, memory, bus; the memory is used for storing computer execution instructions, and the processor is connected with the memory through a bus. When the image processing apparatus is running, the processor executes computer-executable instructions stored by the memory to cause the image processing apparatus to perform the image processing method as provided in the first aspect above.
In a fourth aspect, the invention provides a computer-readable storage medium comprising instructions. When the instructions are run on a computer, the instructions cause the computer to perform the image processing method as provided in the first aspect above.
In a fifth aspect, the present invention provides a computer program product which, when run on a computer, causes the computer to perform the image processing method according to the first aspect.
It should be noted that all or part of the above computer instructions may be stored on the first computer readable storage medium. The first computer readable storage medium may be packaged with a processor of the image processing apparatus, or may be packaged separately from the processor of the image processing apparatus, which is not limited in the present invention.
For the description of the second, third, fourth and fifth aspects of the present invention, reference may be made to the detailed description of the first aspect; in addition, for the beneficial effects described in the second aspect, the third aspect, the fourth aspect and the fifth aspect, reference may be made to beneficial effect analysis of the first aspect, and details are not repeated here.
In the present invention, the names of the above-mentioned image processing apparatuses do not limit the devices or functional modules themselves, and in actual implementation, these devices or functional modules may appear by other names. Insofar as the functions of the respective devices or functional blocks are similar to those of the present invention, they are within the scope of the claims of the present invention and their equivalents.
These and other aspects of the invention will be more readily apparent from the following description.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic diagram of a fire fighting access captured by a camera device according to an embodiment of the present invention, shown without and with foreground objects;
FIG. 2 is a schematic diagram of an image processing system according to an embodiment of the present invention;
fig. 3 is a flowchart illustrating an image processing method according to an embodiment of the present invention;
fig. 4 is a second schematic flowchart of an image processing method according to an embodiment of the present invention;
fig. 5 is a third schematic flowchart of an image processing method according to an embodiment of the present invention;
FIG. 6 is a fourth flowchart illustrating an image processing method according to an embodiment of the present invention;
FIG. 7 is a fifth flowchart illustrating an image processing method according to an embodiment of the present invention;
FIG. 8 is a sixth flowchart illustrating an image processing method according to an embodiment of the present invention;
fig. 9 is a seventh schematic flowchart of an image processing method according to an embodiment of the present invention;
fig. 10 is an eighth schematic flowchart of an image processing method according to an embodiment of the present invention;
FIG. 11 is a ninth flowchart illustrating an image processing method according to an embodiment of the present invention;
fig. 12 is a schematic diagram illustrating the comparison of a reference picture and a picture to be processed in an image processing method according to an embodiment of the present invention;
fig. 13 is a tenth of a flowchart illustrating an image processing method according to an embodiment of the present invention;
fig. 14 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present invention;
fig. 15 is a second schematic structural diagram of an image processing apparatus according to an embodiment of the present invention;
fig. 16 is a schematic structural diagram of a computer program product of an image processing method according to an embodiment of the present invention.
Detailed Description
Embodiments of the present invention will be described below with reference to the accompanying drawings.
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
For the convenience of clearly describing the technical solutions of the embodiments of the present invention, in the embodiments of the present invention, the words "first", "second", and the like are used to distinguish the same items or similar items with basically the same functions and actions, and those skilled in the art can understand that the words "first", "second", and the like do not limit the quantity and execution order.
In order to facilitate understanding of the present invention, the following description explains related terms to which the present invention relates.
Background area: an area of the picture that does not change over a long time scale. As shown in a of fig. 1, the background may be a fire fighting access without any foreground.
Foreground object: an area of the picture that changes over a short time scale. As shown in b of fig. 1, the foreground may be the car 1 and the human body 2.
A deep neural network simulates the neural connection structure of the human brain by building a model; when processing signals such as images, sound and text, it describes data characteristics through multiple layered transformation stages.
Generally, a neural network is composed of multiple network layers, each of which processes its input data and passes the result to the next network layer. Specifically, in each network layer, the processing device (the device storing the neural network) performs operations such as convolution and multiply-accumulate on the input data using the weight values corresponding to that layer. The processing performed is determined by the attributes of the network layer (such as a convolutional layer or a fully connected layer), and the weight values used are determined during training of the neural network. The processing device adjusts the weight values corresponding to the network layers to obtain different data processing results. A Convolutional Neural Network (CNN) model is one kind of deep neural network model.
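For illustration only, the following minimal sketch (assuming PyTorch; the layer sizes and class count are illustrative and not taken from the patent) shows a small CNN in which each network layer processes its input with its own weights and passes the result to the next layer:

```python
import torch
import torch.nn as nn

# A minimal CNN: each network layer processes its input with its own learned
# weights (convolution or multiply-accumulate) and hands the result to the
# next layer. All sizes here are illustrative only.
class TinyCNN(nn.Module):
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),  # convolutional layer
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 56 * 56, num_classes)  # fully connected layer

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)  # layer-by-layer feature extraction
        return self.classifier(x.flatten(1))

# Usage: a batch of one 224x224 RGB image.
logits = TinyCNN()(torch.randn(1, 3, 224, 224))
```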
The following describes a structure of an image processing system to which the image processing method provided by the embodiment of the present invention is applied.
Fig. 2 is a schematic diagram of a structure of an image processing system according to an embodiment of the present invention. As shown in fig. 2, the image processing system may include: at least one camera device (in fig. 2, the camera device is a camera, and the system includes 3 cameras "camera 3, camera 4, and camera 5" for example) and a server 6.
The camera device can be used for acquiring images of a designated area (such as a fire fighting channel) and sending the acquired images to the server. Illustratively, when the designated area is a fire passage and the camera device is a camera, the camera 3 captures an image of the fire passage as shown in fig. 2.
Specifically, in practical applications, the shooting area of a camera is fixed once its installation position is fixed. Illustratively, when the camera is a bullet camera or a small or large dome camera, the shooting angle is unique; when the camera is a speed dome (PTZ) camera, it is equipped with a high-speed stepper-motor pan-tilt, and the pan-tilt can be rotated under control, so the shooting angle is not unique.
In an implementable manner, after a camera acquires a picture of a current scene, it is first determined whether the picture includes an image of a specific area (such as a fire fighting access); and when the camera determines that the specific area is contained in the picture, sending the image comprising the specific area to the server.
In another practical mode, after the camera acquires the picture of the current scene, the picture is directly sent to the server; then, the server determines whether a specific area (such as a fire passage) is contained in the picture, and processes the picture including the specific area when the server determines that the specific area is contained in the picture.
The server 6 may be configured to receive an image of a designated area acquired by at least one camera, and process the image according to the image processing method provided by the embodiment of the present invention, so as to identify a foreground object in the image.
Specifically, the server 6 provided in the embodiment of the present invention may be various computing devices such as a personal computer, a notebook computer, a smart phone, and a tablet computer. The image pickup device may be an apparatus for acquiring an image, such as: cameras, snap shots, video cameras, and the like.
The image processing apparatus in the embodiment of the present invention may be the server 6 shown in fig. 2, or may be a part of the server 6, such as a chip system in the server 6. The chip system is configured to support the server 6 in implementing the functions referred to in the first aspect and any of its possible implementations, for example acquiring an image of a designated area captured by at least one camera. The chip system includes a chip and may also include other discrete devices or circuit structures.
The following describes an image processing method provided by an embodiment of the present invention, taking an image processing apparatus as the server 6 as an example, with reference to the system architecture shown in fig. 2.
As shown in fig. 3, an image processing method according to an embodiment of the present invention includes:
and S11, the server 6 acquires the picture to be processed.
In an implementable manner, the picture to be processed may be an image collected by a camera device and containing a specific area (such as a fire fighting access); if the image captured by the camera device does not include the specific area, the image is not transmitted to the server 6. For example, as shown in fig. 1, when the image captured by the camera device is a in fig. 1, the camera device determines that the image contains a specific area (e.g., a fire fighting access). Therefore, the camera device can transmit the captured image to the server 6, and then the server 6 takes the received image as a picture to be processed.
In another practical way, the picture to be processed may be an image captured by a camera. Illustratively, as shown in fig. 1, when the image captured by the camera is a in fig. 1, the camera sends the captured image to the server 6. Then, when the server 6 determines that a specific area (such as a fire passage) is included in the image, the image is determined to be a picture to be processed.
S12, the server 6 obtains the reference picture in the current image library.
S13, the server 6 determines at least one comparison result according to the to-be-processed picture and the at least one reference picture. One comparison result comprises at least one foreground object, the at least one foreground object is an area with different pixels between the picture to be processed and one reference picture, and one comparison result corresponds to one reference picture.
It should be noted that the background area of the picture to be processed is the same as the background area of each reference picture. A foreground object comprises one pixel or at least two connected pixels. Illustratively, the picture to be processed and each reference picture share the same shooting angle and the same photographed background area.
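For intuition only, the sketch below shows a naive, non-learned way of obtaining the "regions with different pixels" between a picture to be processed and one reference picture; the patent itself obtains comparison results with a trained picture comparison detection model, and the use of OpenCV and the threshold value here are assumptions of this illustration:

```python
import cv2
import numpy as np

def naive_foreground_objects(to_process: np.ndarray, reference: np.ndarray,
                             thresh: int = 30) -> list[tuple[int, int, int, int]]:
    """Return bounding boxes (x, y, w, h) of regions whose pixels differ.

    Both images must share the same background area and shooting angle.
    """
    diff = cv2.absdiff(to_process, reference)            # per-pixel difference
    gray = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)
    _, mask = cv2.threshold(gray, thresh, 255, cv2.THRESH_BINARY)
    # Each foreground object is one pixel or several connected pixels.
    n, labels, stats, _ = cv2.connectedComponentsWithStats(mask)
    return [tuple(stats[i, :4]) for i in range(1, n)]    # skip label 0 (background)
```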
In an implementable manner, the reference picture may be a picture without any foreground object taken in different time periods in the same background area. For example, the reference pictures may be a in fig. 1 and c in fig. 1. The shooting time of a in fig. 1 may be 12 o 'clock at noon, and the shooting time of c in fig. 1 may be 0 o' clock at night.
In another practical manner, the reference picture may be a picture/group of pictures taken by the user in advance in the background area, and/or the reference picture may be a to-be-processed picture that does not include the first target foreground.
S14, the server 6 determines at least one first target foreground included in the to-be-processed picture according to the at least one comparison result. The first target foreground is a foreground object existing in each comparison result of the at least one comparison result.
It can be seen that comparing the picture to be processed with a particular reference picture reveals how the picture to be processed differs from that reference picture, for example by at least one extra foreground object (which may be debris). Likewise, comparing the picture to be processed with every reference picture determines at least one comparison result (each comparison result contains at least one foreground object and corresponds one-to-one with a reference picture). It is easy to understand that if a first target foreground is present in every comparison result, it is an object unique to the picture to be processed, and may specifically be debris. On this principle, the at least one first target foreground determined by the image processing method provided in the present application is an object that exists only in the picture to be processed, so the method can effectively and accurately recognize whatever the first target foreground is.
In a practical manner, referring to fig. 3, as shown in fig. 4, the image processing method according to the embodiment of the present invention further includes S15.
S15, when the server 6 determines that the number of times any first target foreground appears within a preset time period is greater than a first threshold, alarm information is sent out.
In a possible implementation manner, the server 6 may sound an alarm to alert video monitoring staff, or send alarm information to a server of the security system to alert security personnel, so that they can take effective measures to prevent potential dangerous events. In addition, the server 6 can send the abnormality alarm through other communication means (for example, short messages) to staff patrolling on site around the target area, so that they can take effective measures to prevent potential dangerous events.
In another possible implementation manner, a user may select a retention area within the background area corresponding to the reference pictures, and alarm information is sent out when the number of times any first target foreground appears in the retention area within a preset time period (multiple pictures to be processed may be acquired within the preset time period) is greater than a preset threshold.
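A minimal sketch of such an alarm rule, counting how often each target foreground appears within a sliding time window; the window length, the threshold value, and the send_alarm hook are assumptions of the sketch:

```python
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 600   # preset time period (assumed)
FIRST_THRESHOLD = 3    # first threshold (assumed)

appearances: dict[str, deque] = defaultdict(deque)

def record_appearance(foreground_id: str, send_alarm) -> None:
    """Record one appearance of a first target foreground; alarm if it
    appeared more than FIRST_THRESHOLD times within WINDOW_SECONDS."""
    now = time.time()
    times = appearances[foreground_id]
    times.append(now)
    while times and now - times[0] > WINDOW_SECONDS:  # drop stale entries
        times.popleft()
    if len(times) > FIRST_THRESHOLD:
        send_alarm(f"foreground {foreground_id} appeared {len(times)} times")
```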
In one possible implementation manner, referring to fig. 3, as shown in fig. 5, the image processing method provided by the embodiment of the present invention further includes S16 and S17.
S16, the server 6 determines at least one preset foreground in the picture to be processed.
In one implementable manner, the preset foreground may be a foreground object that the user does not need to alarm. Such as: human or animal.
S17, the server 6 determines at least one second target foreground according to the at least one first target foreground and the at least one preset foreground. The second target foreground is any one of the at least one first target foreground and does not belong to the at least one preset foreground.
Illustratively, when the preset foreground is taken as the human body and the picture to be processed is b in fig. 1, it is determined that the preset foreground included in the picture to be processed is the human body 2. By comparing the picture to be processed with a in fig. 1 and c in fig. 1, it can be determined that the at least one first target foreground is the car 1 and the human body 2, respectively. Then, based on at least one first target foreground (such as the car 1 and the human body 2) and at least one preset foreground (such as the human body 2), the second target foreground can be determined to be the car 1.
It can be seen that, since there may be a specific preset foreground in the to-be-processed picture, the preset foreground may be a foreground object that the user does not need to pay attention to. Therefore, the foreground objects identical to the preset foreground in the to-be-processed picture are removed from the at least one first target foreground through identifying the at least one preset foreground in the to-be-processed picture, and therefore a user can more intuitively see the foreground objects existing in the to-be-processed picture.
In one possible implementation manner, with reference to fig. 5, as shown in fig. 6, the image processing method provided by the embodiment of the present invention further includes S18.
S18, when the server 6 determines that the number of times any second target foreground appears within a preset time period is greater than a second threshold, alarm information is sent out.
Specifically, the description of the alarm information refers to the description of S15, and is not repeated here. The value of the second threshold may be the same as or different from the value of the first threshold.
Thus, foreground objects that require no attention are removed, so the generated alarm information is more accurate and user experience is improved.
In one possible implementation manner, as shown in fig. 7 in conjunction with fig. 5, the step S16 described above can be specifically implemented by the step S160 described below.
S160, the server 6 detects the picture to be processed according to a preset target detection algorithm, and determines at least one preset foreground.
It should be noted that the preset target detection algorithm may be any algorithm that can identify a preset foreground, and is not limited herein.
For example, taking the picture to be processed as b in fig. 1 as an example, the picture to be processed is detected by a preset target detection algorithm capable of identifying a human body in the picture, so that it can be determined that the human body 2 exists in b in fig. 1. Therefore, the preset foreground included in the picture to be processed is determined as the human body 2.
It can be seen that, since the preset foreground has been trained into the preset target detection algorithm, the preset foreground in the picture to be processed can be identified by that algorithm. Therefore, the user can select a suitable target detection algorithm according to actual needs and thereby specify the desired preset foreground, which is convenient for the user.
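Putting S16 and S17 together, the sketch below illustrates the filtering step; the detector output format and the same_object matching rule (for example, by intersection-over-union of target frames) are placeholders, since the patent allows any target detection algorithm that recognizes the preset foreground:

```python
def second_target_foregrounds(first_targets, preset_foregrounds,
                              same_object) -> list:
    """S17: keep every first target foreground that does not match any
    preset foreground (e.g. a human body the user need not be alarmed about).

    `same_object(a, b)` decides whether two detections are the same object,
    e.g. by intersection-over-union of their target frames.
    """
    return [t for t in first_targets
            if not any(same_object(t, p) for p in preset_foregrounds)]
```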
In one possible implementation manner, as shown in fig. 8 in conjunction with fig. 3, the step S13 described above can be specifically implemented by the step S130 described below.
S130, the server 6 inputs the picture to be processed and at least one reference picture into a pre-trained picture comparison detection model, and at least one comparison result is determined.
In an implementable manner, the server 6 determines at least one picture pair to be compared. Each picture pair includes the picture to be processed and one reference picture, and each pair corresponds to exactly one reference picture. The server 6 then inputs each of the at least one picture pair into the picture comparison detection model in turn, determining at least one comparison result (one picture pair corresponds to one comparison result).
For example, taking a in fig. 1 and c in fig. 1 as the reference pictures and b in fig. 1 as the picture to be processed, the picture pairs to be compared are the pair consisting of a in fig. 1 and b in fig. 1, and the pair consisting of c in fig. 1 and b in fig. 1. The pair consisting of a in fig. 1 and b in fig. 1 is input into the picture comparison detection model to determine one comparison result, and the pair consisting of c in fig. 1 and b in fig. 1 is then input to determine another comparison result.
In another practical manner, after the server 6 inputs the pictures to be compared into the picture comparison detection model, each reference picture is sequentially input into the picture comparison detection model, so as to determine at least one comparison result (one reference picture corresponds to one comparison result).
For example, taking a in fig. 1 and c in fig. 1 as the reference pictures and b in fig. 1 as the picture to be processed, the server may first input b in fig. 1 into the picture comparison detection model, then input a in fig. 1 to perform the comparison between a in fig. 1 and b in fig. 1 and determine one comparison result, and then input c in fig. 1 to perform the comparison between c in fig. 1 and b in fig. 1 and determine another comparison result.
The description here uses 2 reference pictures as an example; the total number of reference pictures contained in the reference picture library is not limited.
Therefore, inputting the picture to be processed and at least one reference picture into the pre-trained picture comparison detection model makes it more convenient to identify the foreground objects in the picture to be processed, which benefits user experience.
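A sketch of the first pairing scheme of S130; the model(pair) interface is an assumption, standing in for the trained picture comparison detection model obtained in S19-S22:

```python
def compare_with_library(to_process, reference_pictures, model) -> list:
    """Form one picture pair per reference picture and run the picture
    comparison detection model on each pair; one pair yields one
    comparison result (a list of foreground objects)."""
    comparison_results = []
    for reference in reference_pictures:
        pair = (to_process, reference)  # each pair has exactly one reference
        comparison_results.append(model(pair))
    return comparison_results
```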
In one possible implementation, referring to fig. 8, as shown in fig. 9, the training process of the picture comparison detection model includes S19-S22.
S19, the server 6 obtains the training sample image and the labeling result of the training sample image. Wherein the training sample image comprises a foreground and a background.
For example, taking the training sample image as b in fig. 1 as an example, the foreground of the training sample image is the car 1 and the human body 2, and the background of the training sample image is the image shown as a in fig. 1.
S20, the server 6 inputs the training sample image into the deep learning model.
S21, the server 6 determines whether the prediction comparison result of the deep learning model for the training sample image is matched with the annotation result based on the target loss function.
S22, when the server 6 determines that the predicted comparison result does not match the annotation result, the network parameters of the deep learning model are iteratively updated until the model converges, to obtain the picture comparison detection model.
In an implementable manner, the number of training iterations may be set according to the actual requirements of the user, which simplifies obtaining the picture comparison detection model.
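A condensed sketch of the S19-S22 loop, assuming PyTorch; the choice of optimizer, the convergence test, and all hyperparameters are illustrative and not specified by the patent:

```python
import torch

def train_comparison_model(model, sample_pairs, labels, target_loss,
                           epochs: int = 100, lr: float = 1e-3,
                           converge_eps: float = 1e-4):
    """S20-S22: feed training sample images through the deep learning model
    and iteratively update its network parameters until it converges."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    prev_loss = float("inf")
    for _ in range(epochs):
        prediction = model(sample_pairs)        # predicted comparison result
        loss = target_loss(prediction, labels)  # S21: compare with annotation
        if abs(prev_loss - loss.item()) < converge_eps:
            break                               # model has converged
        optimizer.zero_grad()
        loss.backward()                         # S22: update network parameters
        optimizer.step()
        prev_loss = loss.item()
    return model                                # picture comparison detection model
```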
In one possible implementation, as shown in fig. 10 in conjunction with fig. 3, the step S14 described above can be specifically implemented by the steps S140 and S141 described below.
S140, the server 6 determines a target frame of each foreground object in each comparison result of the at least one comparison result.
S141, the server 6 performs a first operation on each comparison result to determine at least one first target foreground. The first operation is: if the target frame of a first foreground object in a first comparison result overlaps one target frame in every comparison result other than the first comparison result, determining that the first foreground object in the first comparison result is a first target foreground; the first comparison result is any one of the at least one comparison result, and the first foreground object is any one foreground object in the first comparison result.
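A sketch of the first operation, with target frames represented as axis-aligned boxes (x1, y1, x2, y2); the overlap test is ordinary rectangle intersection:

```python
Box = tuple[float, float, float, float]  # (x1, y1, x2, y2)

def overlaps(a: Box, b: Box) -> bool:
    """True if two target frames intersect."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def first_operation(results: list[list[Box]]) -> list[Box]:
    """S141: a foreground object in one comparison result is a first target
    foreground if its target frame overlaps some target frame in every
    other comparison result."""
    targets = []
    for i, result in enumerate(results):
        for frame in result:
            if all(any(overlaps(frame, other) for other in results[j])
                   for j in range(len(results)) if j != i):
                targets.append(frame)
    return targets
```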
In one possible implementation, as shown in fig. 11 in conjunction with fig. 3, the above step S14 can be specifically implemented by the following steps S142-S144.
S142, the server 6 determines a target frame of each foreground object in each comparison result of the at least one comparison result.
S143, the server 6 performs a second operation on each comparison result to determine at least one first target foreground. The second operation is: if a first target frame overlaps at least two second target frames in a second comparison result, calculating the intersection-over-union (IoU) between the first target frame and each of the at least two second target frames; the first target frame is the target frame of a second foreground object in a third comparison result, the second comparison result and the third comparison result are each any one of the at least one comparison result, the second comparison result is different from the third comparison result, the second foreground object is any one foreground object in the third comparison result, and the IoU is equal to the ratio of the area of the intersection region of the first target frame and a second target frame to the area of their union region.
S144, the server 6 determines, according to the calculated IoU between the first target frame and each of the at least two second target frames, that a third foreground object in the second comparison result is a first target foreground. The first target frame overlaps the second target frame corresponding to the third foreground object in the second comparison result, the IoU between the first target frame and that second target frame meets a preset condition, and the third foreground object is any one foreground object in the second comparison result.
In an implementable manner, the preset condition is that the IoU is the maximum among those calculated, which indicates that, of all the candidates in the second comparison result, the third foreground object matches the second foreground object best.
In another implementable manner, if the server 6 determines that the first target frame corresponding to a second target foreground in one comparison result has no overlapping second target frame corresponding to a third target foreground in any other comparison result, it deletes the second target foreground.
In another implementable manner, if the server 6 determines that, although second target frames corresponding to third target foregrounds exist in the other comparison results, none of them overlaps the first target frame corresponding to a second target foreground in one comparison result, it deletes the second target foreground.
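The IoU computation behind the second operation, together with the "maximum IoU" preset condition of S144, reduced to picking the best-overlapping frame; boxes are (x1, y1, x2, y2), and this is a sketch rather than the patent's exact matching procedure:

```python
from typing import Optional

Box = tuple[float, float, float, float]

def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two target frames: the area of their
    intersection region divided by the area of their union region."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def best_match(first_frame: Box, second_frames: list[Box]) -> Optional[Box]:
    """S143-S144: among the second target frames that overlap the first
    target frame, pick the one with the maximum IoU."""
    candidates = [f for f in second_frames if iou(first_frame, f) > 0]
    return max(candidates, key=lambda f: iou(first_frame, f), default=None)
```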
For example, suppose the reference picture library contains two reference pictures (a in fig. 12 and c in fig. 12) and the picture to be processed is b in fig. 12. Comparing b in fig. 12 with a in fig. 12, the foreground objects contained in the comparison result are determined to be the car 1-ab and the human body 2-ab. Comparing b in fig. 12 with c in fig. 12, the foreground objects contained in the comparison result are determined to be the car 1-cb, the human body 2-cb and an other object 3-cb (possibly caused by a recognition error). Then a target frame (the minimum bounding box, e.g. a rectangle, of the foreground object) is determined for each foreground object in each comparison result: the target frames of the car 1-ab and the human body 2-ab are shown as d in fig. 12 (only the outer contour of the reference picture is retained here and the content of the corresponding background area is deleted, so that the target frame of each foreground object can be seen), and the target frames of the car 1-cb, the human body 2-cb and the other object 3-cb are shown as e in fig. 12. As can be seen from the above, since there are only 2 reference pictures, 2 comparison results can be determined: one contains the car 1-ab and the human body 2-ab, and the other contains the car 1-cb, the human body 2-cb and the other object 3-cb.
As shown in f in fig. 12, the first target frame of the car 1-ab intersects with the second target frame of the car 1-cb, so that it can be determined that the car 1-ab is the first target foreground.
As shown in f of fig. 12, the first target frame of the human body 2-ab intersects both the second target frame of the human body 2-cb and the second target frame of the other object 3-cb, so it is necessary to calculate a first IoU between the first target frame of the human body 2-ab and the second target frame of the human body 2-cb, and a second IoU between the first target frame of the human body 2-ab and the second target frame of the other object 3-cb. The first IoU is larger than the second IoU, so the human body 2-cb is determined to be a first target foreground, as shown by g in fig. 12.
Since the car 1-ab and the car 1-cb represent the same foreground object, and the human body 2-ab and the human body 2-cb represent the same foreground object, as shown by h in fig. 12, the at least one first target foreground is the car 1 and the human body 2.
It can be seen that when the server compares the picture to be processed with each background picture, at least one comparison result can be determined. The server then screens, from the at least one comparison result, the target foregrounds present in every comparison result, so that at least one first target foreground can be determined. The user can therefore determine the debris present in the picture to be processed from the debris included in the at least one first target foreground, rather than relying on recognition of trained debris categories, which solves the problem of identifying debris present in a fire fighting access.
In one possible implementation manner, with reference to fig. 3, as shown in fig. 13, the image processing method provided by the embodiment of the present invention further includes S23.
S23, when the server 6 determines, according to the at least one comparison result, that the picture to be processed does not contain a first target foreground, the picture to be processed is stored into the current image library.
In an implementable manner, the reference pictures in the image library consist of two parts, one of which is the reference pictures set by the user. Although the reference pictures set by the user carry the most complete prior information, they cannot reflect well the background most similar to the picture to be processed. Therefore, pictures to be processed that contain no first target foreground also need to be stored into the image library in time, so as to ensure the accuracy of the finally identified first target foreground.
In another practical manner, following the first-in-first-out principle, when the server 6 determines according to the at least one comparison result that the picture to be processed does not contain a first target foreground and stores it into the current image library, the stored picture with the earliest storage time needs to be deleted (no deletion or replacement is ever performed on the reference pictures set by the user), so as to ensure the accuracy of the finally identified first target foreground.
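A sketch of this library-update rule: a bounded first-in-first-out buffer for automatically stored pictures, with user-set reference pictures kept apart and never evicted; the capacity value is an assumption:

```python
from collections import deque

class ReferenceLibrary:
    """Current image library: user-set references are permanent; pictures
    stored automatically (those with no first target foreground) are kept
    in a first-in-first-out buffer of bounded size."""

    def __init__(self, user_references: list, capacity: int = 10):
        self.user_references = list(user_references)   # never deleted or replaced
        self.auto_references = deque(maxlen=capacity)  # FIFO eviction

    def store_if_clean(self, to_process, first_targets: list) -> None:
        # S23: only pictures without any first target foreground are stored.
        if not first_targets:
            self.auto_references.append(to_process)

    def all_references(self) -> list:
        return self.user_references + list(self.auto_references)
```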
The scheme provided by the embodiment of the present invention has mainly been introduced from the perspective of the method. To implement the above functions, the image processing apparatus includes hardware structures and/or software modules for performing the respective functions. Those skilled in the art will readily appreciate that the exemplary units and algorithm steps described in connection with the embodiments disclosed herein can be implemented as hardware or as a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the particular application and the design constraints of the technical solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementations should not be considered beyond the scope of the present invention.
The image processing apparatus according to the embodiments of the present invention may be divided into functional modules according to the above method, for example, each functional module may be divided for each function, or two or more functions may be integrated into one processing module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. It should be noted that, the division of the modules in the embodiment of the present invention is schematic, and is only a logic function division, and there may be another division manner in actual implementation.
Fig. 14 is a schematic structural diagram of an image processing apparatus 10 according to an embodiment of the present invention. The image processing apparatus 10 is configured to, when the to-be-processed picture is obtained, obtain a reference picture in a current image library, and determine at least one comparison result according to the to-be-processed picture and at least one reference picture. And then, determining at least one first target foreground contained in the picture to be processed according to at least one comparison result. The image processing apparatus 10 may include an acquisition unit 101, a processing unit 102, and a storage unit 103.
An obtaining unit 101 is configured to obtain a picture to be processed. The obtaining unit 101 is further configured to obtain a reference picture in the current image library. For example, in conjunction with FIG. 3, the fetch unit 101 may be used to perform S11 and S12. In conjunction with fig. 9, the obtaining unit 101 may be configured to execute S19.
The processing unit 102 is configured to determine at least one comparison result according to the picture to be processed and the at least one reference picture acquired by the acquiring unit 101. The processing unit 102 is further configured to determine, according to the at least one comparison result, at least one first target foreground contained in the picture to be processed. For example, in conjunction with fig. 3, the processing unit 102 may be configured to perform S13 and S14. In conjunction with fig. 4, the processing unit 102 may be configured to perform S15. In conjunction with fig. 5, the processing unit 102 may be configured to perform S16 and S17. In conjunction with fig. 6, the processing unit 102 may be configured to perform S18. In conjunction with fig. 7, the processing unit 102 may be configured to perform S160. In conjunction with fig. 8, the processing unit 102 may be configured to perform S130. In conjunction with fig. 9, the processing unit 102 may be configured to perform S19, S20, S21, and S22. In conjunction with fig. 10, the processing unit 102 may be configured to perform S140 and S141. In conjunction with fig. 11, the processing unit 102 may be configured to perform S142, S143, and S144. In conjunction with fig. 13, the processing unit 102 may be configured to perform S23.
The storage unit 103 may be used to store the program code of the image processing apparatus 10, and may also be used to store data generated by the image processing apparatus 10 during operation, such as data in a write request.
For all relevant details of the steps in the above method embodiments, reference may be made to the functional descriptions of the corresponding functional modules; they are not repeated here.
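As a concrete, non-limiting illustration of this module division, the following Python sketch shows one possible decomposition into an obtaining unit, a processing unit, and a storage unit. The class names, the NumPy-based per-pixel difference test, and the fixed threshold of 30 are assumptions made for the example only; they are not prescribed by the embodiments.

    import numpy as np

    class StorageUnit:
        """Stands in for storage unit 103: holds the current image library."""
        def __init__(self):
            self.image_library = []  # reference pictures as H x W x 3 uint8 arrays

    class ObtainingUnit:
        """Stands in for obtaining unit 101: fetches the picture to be
        processed and the reference pictures (cf. S11, S12, S19)."""
        def __init__(self, storage):
            self.storage = storage

        def get_reference_pictures(self):
            return list(self.storage.image_library)

    class ProcessingUnit:
        """Stands in for processing unit 102: determines comparison results
        and the first target foreground (cf. S13, S14)."""
        @staticmethod
        def compare(picture, reference, threshold=30):
            # Simplified "comparison result": a boolean mask of pixels that
            # differ from the reference by more than the threshold.
            diff = np.abs(picture.astype(np.int16) - reference.astype(np.int16))
            return diff.max(axis=-1) > threshold

        def first_target_mask(self, picture, references):
            # A pixel is kept only if it differs from *every* reference,
            # mirroring "a foreground object existing in each comparison result".
            masks = [self.compare(picture, ref) for ref in references]
            return np.logical_and.reduce(masks) if masks else None

In this simplified reading, the intersection over all comparison results is taken at pixel level; claims 5 and 6 below express the same idea at the level of foreground objects and target frames.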
Fig. 15 is a schematic structural diagram of an image processing apparatus 10 according to an embodiment of the present invention, and as shown in fig. 15, the image processing apparatus 10 may include: at least one processor 51, a memory 52, a communication interface 53 and a communication bus 54.
The following specifically describes each component of the image processing apparatus with reference to fig. 15:
The processor 51 is the control center of the image processing apparatus, and may be a single processor or a collective term for a plurality of processing elements. For example, the processor 51 is a Central Processing Unit (CPU), an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention, for example one or more Digital Signal Processors (DSPs) or one or more Field Programmable Gate Arrays (FPGAs).
In a particular implementation, the processor 51 may include one or more CPUs, such as CPU0 and CPU1 shown in fig. 15. Moreover, in one embodiment, the image processing apparatus may include a plurality of processors, such as the processor 51 and the processor 55 shown in fig. 15. Each of these processors may be a single-core processor (single-CPU) or a multi-core processor (multi-CPU). A processor herein may refer to one or more devices, circuits, and/or processing cores for processing data (for example, computer program instructions).
The memory 52 may be a Read-Only Memory (ROM) or another type of static storage device capable of storing static information and instructions, a Random Access Memory (RAM) or another type of dynamic storage device capable of storing information and instructions, an Electrically Erasable Programmable Read-Only Memory (EEPROM), a Compact Disc Read-Only Memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, digital versatile discs, Blu-ray discs, and the like), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. The memory 52 may be self-contained and coupled to the processor 51 via the communication bus 54, or may be integrated with the processor 51.
In a particular implementation, the memory 52 is used for storing data and the software programs implementing the present invention. The processor 51 may perform various functions of the image processing apparatus by running or executing the software programs stored in the memory 52 and calling the data stored in the memory 52.
The communication interface 53 is any apparatus such as a transceiver, and is used for communicating with other devices or communication networks, such as a Radio Access Network (RAN), a Wireless Local Area Network (WLAN), a terminal, or the cloud. The communication interface 53 may include a receiving unit implementing a receiving function and a sending unit implementing a sending function.
The communication bus 54 may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is shown in fig. 15, but this does not mean that there is only one bus or one type of bus.
As an example, in connection with fig. 14, the obtaining unit 101 in the image processing apparatus 10 implements the same functions as the communication interface 53 in fig. 15, the processing unit 102 implements the same functions as the processor 51 in fig. 15, and the storage unit 103 implements the same functions as the memory 52 in fig. 15.
Another embodiment of the present invention further provides a computer-readable storage medium, which stores instructions that, when executed on a computer, cause the computer to perform the method shown in the above method embodiment.
In some embodiments, the disclosed methods may be implemented as computer program instructions encoded on a computer-readable storage medium in a machine-readable format or encoded on other non-transitory media or articles of manufacture.
Fig. 16 schematically illustrates a conceptual partial view of a computer program product provided by an embodiment of the present invention, the computer program product comprising a computer program for executing a computer process on a computing device.
In one embodiment, the computer program product is provided using a signal bearing medium 410. The signal bearing medium 410 may include one or more program instructions that, when executed by one or more processors, may provide the functions, or portions of the functions, described above with respect to fig. 3. Thus, for example, referring to the embodiment shown in fig. 3, one or more features of S11-S13 may be undertaken by one or more instructions associated with the signal bearing medium 410. Further, the program instructions in fig. 16 are likewise described as example instructions.
In some examples, signal bearing medium 410 may include a computer readable medium 411, such as, but not limited to, a hard disk drive, a Compact Disc (CD), a Digital Video Disc (DVD), a digital tape, a memory, a read-only memory (ROM), a Random Access Memory (RAM), or the like.
In some implementations, the signal bearing medium 410 may comprise a computer recordable medium 412, such as, but not limited to, a memory, a read/write (R/W) CD, an R/W DVD, and the like.
In some implementations, the signal bearing medium 410 may include a communication medium 413, such as, but not limited to, a digital and/or analog communication medium (e.g., a fiber optic cable, a waveguide, a wired communications link, a wireless communication link, etc.).
The signal bearing medium 410 may be conveyed by a communication medium 413 in wireless form, such as a wireless communication medium complying with the IEEE 802.11 standard or another transport protocol. The one or more program instructions may be, for example, computer-executable instructions or logic-implementing instructions.
In some examples, a computing device such as the image processing apparatus described with respect to fig. 3 may be configured to provide various operations, functions, or actions in response to one or more program instructions conveyed by the computer-readable medium 411, the computer-recordable medium 412, and/or the communication medium 413.
Through the above description of the embodiments, it will be clear to those skilled in the art that, for convenience and brevity of description, the foregoing division of functional modules is merely used as an example. In practical applications, the above functions may be allocated to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division into modules or units is only a logical function division, and there may be other divisions in actual implementation; for example, a plurality of units or components may be combined or integrated into another device, or some features may be omitted or not executed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may be one physical unit or a plurality of physical units; that is, they may be located in one place or distributed over a plurality of different places. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiments.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a readable storage medium. Based on such understanding, the part of the technical solutions of the embodiments of the present invention that in essence, or in part, contributes to the prior art may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for enabling a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disc.
The above description covers only specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any change or substitution within the technical scope disclosed by the present invention shall be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.

Claims (10)

1. An image processing method, comprising:
acquiring a picture to be processed;
acquiring a reference picture in a current image library;
determining at least one comparison result according to the picture to be processed and at least one reference picture; wherein one comparison result corresponds to one reference picture and comprises at least one foreground object, and a foreground object is an area in which the pixels of the picture to be processed differ from those of that reference picture;
determining at least one first target foreground contained in the picture to be processed according to at least one comparison result; the first target foreground is a foreground object existing in each comparison result of the at least one comparison result.
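As a hedged illustration of the comparison step of claim 1, the sketch below derives one comparison result from one reference picture as bounding boxes around regions of differing pixels. The use of OpenCV, the threshold of 30, the morphological clean-up, and the minimum area are assumptions made for the example; claim 3 instead obtains comparison results from a trained picture comparison detection model.

    import cv2
    import numpy as np

    def comparison_result(picture, reference, thresh=30, min_area=100):
        """One comparison result for one reference picture: the bounding
        boxes of areas whose pixels differ between the two pictures."""
        diff = cv2.absdiff(picture, reference)          # per-pixel difference
        gray = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)
        _, mask = cv2.threshold(gray, thresh, 255, cv2.THRESH_BINARY)
        # Close small holes so one changed object yields one foreground region.
        mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, np.ones((5, 5), np.uint8))
        # OpenCV 4.x return signature: (contours, hierarchy).
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        boxes = []
        for c in contours:
            x, y, w, h = cv2.boundingRect(c)
            if w * h >= min_area:                       # drop pixel-level noise
                boxes.append((x, y, w, h))              # one foreground object
        return boxes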
2. The image processing method according to claim 1, characterized in that the image processing method further comprises:
determining at least one preset foreground in the picture to be processed;
determining at least one second target foreground according to the at least one first target foreground and the at least one preset foreground; wherein a second target foreground is any one of the at least one first target foreground that does not belong to the at least one preset foreground.
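One possible reading of claim 2 in code, where a first target foreground "belongs to" a preset foreground when its target frame is mostly covered by a preset frame; the coverage test and the 0.5 ratio are assumptions, not part of the claim.

    def second_target_foregrounds(first_targets, preset_foregrounds, ratio=0.5):
        """Keep the first target foregrounds that do not belong to any
        preset foreground; boxes are (x, y, w, h) tuples."""
        def covered_fraction(a, b):
            # Fraction of box `a` covered by box `b`.
            ax, ay, aw, ah = a
            bx, by, bw, bh = b
            iw = max(0, min(ax + aw, bx + bw) - max(ax, bx))
            ih = max(0, min(ay + ah, by + bh) - max(ay, by))
            return (iw * ih) / float(aw * ah)
        return [t for t in first_targets
                if all(covered_fraction(t, p) < ratio for p in preset_foregrounds)]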
3. The image processing method according to claim 1, wherein the determining at least one comparison result according to the to-be-processed picture and the at least one reference picture comprises:
inputting the picture to be processed and the at least one reference picture into a pre-trained picture comparison detection model to determine the at least one comparison result.
4. The image processing method according to claim 3, wherein the training process of the picture comparison detection model comprises:
acquiring a training sample image and an annotation result of the training sample image; wherein the training sample image comprises a foreground and a background;
inputting the training sample image into a deep learning model;
determining, based on a target loss function, whether a prediction comparison result of the deep learning model for the training sample image matches the annotation result;
and when the prediction comparison result does not match the annotation result, iteratively and cyclically updating the network parameters of the deep learning model until the model converges, so as to obtain the picture comparison detection model.
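The training process of claim 4 is a conventional supervised loop. The sketch below is one possible realization in PyTorch; the tiny fully convolutional backbone, the six-channel input (picture and reference concatenated), binary cross-entropy as the target loss function, and the loss-plateau convergence test are all assumptions, since the claim fixes none of them.

    import torch
    import torch.nn as nn

    # Placeholder deep learning model: picture and reference stacked into a
    # 6-channel input, a per-pixel foreground probability map as output.
    model = nn.Sequential(
        nn.Conv2d(6, 16, 3, padding=1), nn.ReLU(),
        nn.Conv2d(16, 1, 3, padding=1), nn.Sigmoid(),
    )
    criterion = nn.BCELoss()                     # assumed target loss function
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

    def train(loader, max_epochs=50, tol=1e-3):
        """Iterate until the prediction comparison result matches the
        annotation result, i.e. until the model converges."""
        prev_loss = float("inf")
        for _ in range(max_epochs):
            total = 0.0
            for sample, label in loader:         # training sample image + label
                pred = model(sample)             # prediction comparison result
                loss = criterion(pred, label)    # compare against annotation
                optimizer.zero_grad()
                loss.backward()                  # iteratively update parameters
                optimizer.step()
                total += loss.item()
            if abs(prev_loss - total) < tol:     # simple convergence test
                return model
            prev_loss = total
        return model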
5. The image processing method according to claim 1, wherein the determining, according to the at least one comparison result, at least one first target foreground included in the picture to be processed comprises:
determining a target frame of each foreground object in each comparison result in the at least one comparison result;
performing a first operation on each comparison result to determine the at least one first target foreground; wherein the first operation is: if the target frame of a first foreground object in a first comparison result overlaps with one target frame in each comparison result other than the first comparison result, determining that the first foreground object in the first comparison result is a first target foreground; the first comparison result is any one of the at least one comparison result, and the first foreground object is any one foreground object in the first comparison result.
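The first operation of claim 5 in sketch form, with each comparison result represented as a list of (x, y, w, h) target frames; representing frames as tuples and scanning every result in the role of the "first comparison result" are assumptions made for the example.

    def frames_overlap(a, b):
        """True if two (x, y, w, h) target frames intersect."""
        ax, ay, aw, ah = a
        bx, by, bw, bh = b
        return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

    def first_operation(results):
        """A foreground object is a first target foreground if its target
        frame overlaps some target frame in every other comparison result."""
        targets = []
        for i, result in enumerate(results):
            others = [r for j, r in enumerate(results) if j != i]
            for frame in result:
                if all(any(frames_overlap(frame, other) for other in other_result)
                       for other_result in others):
                    targets.append(frame)
        return targets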
6. The image processing method according to claim 1, wherein the determining, according to the at least one comparison result, at least one first target foreground included in the picture to be processed comprises:
determining a target frame of each foreground object in each comparison result in the at least one comparison result;
performing a second operation on each comparison result to determine the at least one first target foreground; wherein the second operation is: if a first target frame overlaps with at least two second target frames in a second comparison result, respectively calculating the intersection-over-union between the first target frame and each of the at least two second target frames; the first target frame is the target frame of a second foreground object in a third comparison result, the second comparison result and the third comparison result are each any one of the at least one comparison result, the second comparison result is different from the third comparison result, the second foreground object is any one foreground object in the third comparison result, and the intersection-over-union is equal to the ratio of the area of the intersection region of the first target frame and one second target frame to the area of the union region of the first target frame and that second target frame;
determining a third foreground object in the second comparison result as a first target foreground according to the calculated intersection-over-union between the first target frame and each of the at least two second target frames; wherein the first target frame overlaps with the second target frame corresponding to the third foreground object in the second comparison result, the intersection-over-union between the first target frame and that second target frame meets a preset condition, and the third foreground object is any one foreground object in the second comparison result.
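Claim 6 defines intersection-over-union explicitly as the ratio of the intersection area to the union area of two target frames; the following is a direct transcription of that formula, with the "preset condition" read, as an assumption, as a fixed threshold on the ratio.

    def intersection_over_union(a, b):
        """Ratio of the intersection area to the union area of two
        (x, y, w, h) target frames."""
        ax, ay, aw, ah = a
        bx, by, bw, bh = b
        iw = max(0, min(ax + aw, bx + bw) - max(ax, bx))
        ih = max(0, min(ay + ah, by + bh) - max(ay, by))
        inter = iw * ih
        union = aw * ah + bw * bh - inter
        return inter / union if union else 0.0

    def second_operation(first_frame, second_result, iou_thresh=0.5):
        """Among the target frames in the second comparison result that
        overlap the first target frame, keep the foreground objects whose
        intersection-over-union with it meets the preset condition."""
        kept = []
        for frame in second_result:
            score = intersection_over_union(first_frame, frame)
            if score > 0 and score >= iou_thresh:
                kept.append(frame)
        return kept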
7. The image processing method according to claim 1, further comprising:
when it is determined, according to the at least one comparison result, that the picture to be processed does not contain a first target foreground, storing the picture to be processed into the current image library.
8. An image processing apparatus characterized by comprising:
the acquisition unit is used for acquiring a picture to be processed;
the acquisition unit is further used for acquiring a reference picture in the current image library stored by the storage unit;
the processing unit is used for determining at least one comparison result according to the picture to be processed acquired by the acquisition unit and the at least one reference picture acquired by the acquisition unit; wherein one comparison result corresponds to one reference picture and comprises at least one foreground object, and a foreground object is an area in which the pixels of the picture to be processed differ from those of that reference picture;
the processing unit is further used for determining, according to the at least one comparison result, at least one first target foreground included in the picture to be processed; the first target foreground is a foreground object existing in each comparison result of the at least one comparison result.
9. A computer-readable storage medium comprising instructions which, when executed on a computer, cause the computer to perform the image processing method according to any one of claims 1 to 7.
10. An image processing apparatus characterized by comprising: communication interface, processor, memory, bus;
the memory is used for storing computer-executable instructions, and the processor is connected with the memory through the bus;
when the image processing apparatus is running, the processor executes computer-executable instructions stored by the memory to cause the image processing apparatus to perform the image processing method according to any one of claims 1 to 7.
CN202010606947.2A 2020-06-29 2020-06-29 Image processing method, device and computer storage medium Active CN111723767B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010606947.2A CN111723767B (en) 2020-06-29 2020-06-29 Image processing method, device and computer storage medium

Publications (2)

Publication Number Publication Date
CN111723767A true CN111723767A (en) 2020-09-29
CN111723767B CN111723767B (en) 2023-08-08

Family

ID=72570222

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010606947.2A Active CN111723767B (en) 2020-06-29 2020-06-29 Image processing method, device and computer storage medium

Country Status (1)

Country Link
CN (1) CN111723767B (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101216888A (en) * 2008-01-14 2008-07-09 浙江大学 A video foreground extracting method under conditions of view angle variety based on fast image registration
CN103366374A (en) * 2013-07-12 2013-10-23 重庆大学 Fire fighting access obstacle detection method based on image matching
WO2016134666A1 (en) * 2015-02-28 2016-09-01 华为技术有限公司 Method and device for discovering friend through image matching
CN108171938A (en) * 2017-12-08 2018-06-15 广州粤华物业有限公司 Channel blockage monitoring and alarming system
WO2019233341A1 (en) * 2018-06-08 2019-12-12 Oppo广东移动通信有限公司 Image processing method and apparatus, computer readable storage medium, and computer device
CN109002787A (en) * 2018-07-09 2018-12-14 Oppo广东移动通信有限公司 Image processing method and device, storage medium, electronic equipment
CN109241896A (en) * 2018-08-28 2019-01-18 腾讯科技(深圳)有限公司 A kind of channel security detection method, device and electronic equipment

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
YAWEI YU ET AL.: "Foreground Target Extraction in Bounding Box Based on Sub-block Region Growing and Grab Cut" *
王珍珠; 王玉静; 康守强; 魏兆祥; 章璜; 徐安彤: "Laboratory moving target detection system based on a web camera" *
陈岷; 徐伟芳: "Research on high-precision indoor positioning technology based on image matching" *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112668423A (en) * 2020-12-18 2021-04-16 平安科技(深圳)有限公司 Corridor sundry detection method and device, terminal equipment and storage medium
CN112668423B (en) * 2020-12-18 2024-05-28 平安科技(深圳)有限公司 Corridor sundry detection method and device, terminal equipment and storage medium
CN115439787A (en) * 2022-09-07 2022-12-06 长扬科技(北京)股份有限公司 AI visual detection method and device for grain depot, electronic equipment and storage medium
CN115439787B (en) * 2022-09-07 2023-08-04 长扬科技(北京)股份有限公司 AI visual detection method and device for grain depot, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN111723767B (en) 2023-08-08

Similar Documents

Publication Publication Date Title
CN108197658B (en) Image annotation information processing method, device, server and system
US11341626B2 (en) Method and apparatus for outputting information
CN107123142B (en) Pose estimation method and device
CN111160380A (en) Method for generating video analysis model and video analysis system
US10491758B2 (en) Method and system for optimizing image data for data transmission
US9121751B2 (en) Weighing platform with computer-vision tracking
CN112929602B (en) Data monitoring method and device based on image processing and related equipment
CN111723767A (en) Image processing method and device and computer storage medium
CN110532888A (en) A kind of monitoring method, apparatus and system
CN112465871A (en) Method and system for evaluating accuracy of visual tracking algorithm
Dilshad et al. Efficient Deep Learning Framework for Fire Detection in Complex Surveillance Environment.
KR101454644B1 (en) Loitering Detection Using a Pedestrian Tracker
CN110427387A (en) A kind of data consistency detection and device
CN112631333B (en) Target tracking method and device of unmanned aerial vehicle and image processing chip
CN115593375B (en) Vehicle emergency braking method, device, equipment and computer readable medium
CN111523472A (en) Active target counting method and device based on machine vision
KR20200088682A (en) Electronic apparatus and controlling method thereof
US20230095027A1 (en) System and method for reducing surveillance detection errors
CN109101917A (en) Mask method, training method, the apparatus and system identified again for pedestrian
US11880293B1 (en) Continuous tracing and metric collection system
CN113158842A (en) Identification method, system, device and medium
CN117749836B (en) Internet of things terminal monitoring method and system based on artificial intelligence
CN110619734A (en) Information pushing method and device
CN115100689B (en) Object detection method and device, electronic equipment and storage medium
EP4386619A1 (en) Determining the body posture of a lying infant

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant