CN113627413B - Data labeling method, image comparison method and device - Google Patents

Data labeling method, image comparison method and device

Info

Publication number
CN113627413B
CN113627413B CN202110926471.5A CN202110926471A
Authority
CN
China
Prior art keywords
image
area
marked
target
reference image
Legal status
Active
Application number
CN202110926471.5A
Other languages
Chinese (zh)
Other versions
CN113627413A (en)
Inventor
张景
陈杰
陈益伟
刘世杰
Current Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Original Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Application filed by Hangzhou Hikvision Digital Technology Co Ltd filed Critical Hangzhou Hikvision Digital Technology Co Ltd
Priority to CN202110926471.5A
Publication of CN113627413A
Application granted
Publication of CN113627413B
Legal status: Active

Landscapes

  • Image Analysis (AREA)

Abstract

The embodiment of the application discloses a data labeling method, an image comparison method and a device, belonging to the technical field of image processing. In the embodiment of the application, for a target that exists in the reference image but does not exist in the image to be annotated, the target area where the target is located in the reference image is determined, and the target area is then projected into the image to be annotated to obtain an annotation area, so that the area where the difference part is located is accurately marked in the image to be annotated and the annotation information of the image to be annotated can be obtained. The annotation information obtained through this scheme is accurate. A deep learning model trained with the training samples labeled by this scheme can then detect the difference part of an image to be detected compared with the reference image; this improves detection accuracy, and the region in which the detected difference part is located is more accurate.

Description

Data labeling method, image comparison method and device
Technical Field
The embodiment of the application relates to the technical field of image processing, in particular to a data labeling method, an image comparison method and a device.
Background
Currently, deep learning models are widely used in the technical field of image processing. In image comparison applications, a trained deep learning model is used to find the difference part of an image to be detected compared with a reference image. For example, in a video surveillance scene, the parts of the video content that change relative to the background environment are monitored in real time. Before the deep learning model is used, it needs to be trained, and data annotation is an important step in training the deep learning model. In image comparison, data annotation refers to annotating the image to be annotated to obtain the annotation information of the image to be annotated.
The data annotation in the related art can only annotate the target which does not exist in the reference image but exists in the image to be annotated. For targets existing in the reference image but not in the image to be marked, it is difficult to accurately mark the area where the difference part is located in the image to be marked, which is a problem to be solved at present.
Disclosure of Invention
The embodiment of the application provides a data labeling method, an image comparison method, a device and a computer readable storage medium, which can accurately label, in the image to be annotated, the region where the difference part is located for a target that exists in the reference image but does not exist in the image to be annotated. The technical scheme is as follows:
in one aspect, a method for labeling data is provided, the method comprising:
determining a target area in a reference image, wherein the target area is an area where a target existing in the reference image is located, and the target does not exist in an image to be marked;
Projecting the target region into the image to be marked to obtain a marked region in the image to be marked;
and determining the labeling information of the image to be labeled based on the labeling area.
Optionally, the projecting the target area into the image to be annotated to obtain the annotation area in the image to be annotated includes:
Determining the aspect ratio of the reference image and the image to be marked;
And based on the aspect ratio, projecting the target region into the image to be marked to obtain a marked region in the image to be marked.
Optionally, the aspect ratio includes an abscissa ratio and an ordinate ratio;
The step of projecting the target area into the image to be marked based on the aspect ratio to obtain a marked area in the image to be marked comprises the following steps:
Determining the abscissa of the labeling area based on the abscissa ratio and the abscissa of the target area;
Determining the ordinate of the labeling area based on the ordinate ratio and the ordinate of the target area;
and marking the marked area in the image to be marked based on the horizontal coordinate and the vertical coordinate of the marked area.
Optionally, the determining the target area in the reference image includes:
detecting a user annotation operation with respect to the target;
And marking the target area in the reference image based on the detected user marking operation.
Optionally, the determining the target area in the reference image includes:
And marking the target area in the reference image through an image comparison model based on the reference image and the image to be marked.
Optionally, the labeling the target area in the reference image through an image comparison model based on the reference image and the image to be labeled includes:
Inputting the reference image and the image to be annotated into the image comparison model to obtain a reference image which is output by the image comparison model and marked with an initial area, wherein the initial area is the area where the target is located;
Detecting a user adjustment operation with respect to the initial region;
and adjusting the initial area based on the detected user adjustment operation to obtain the target area.
Optionally, if a portion of the target is occluded, the target area covers the entirety of the target, or covers a portion of the target that is not occluded.
In another aspect, there is provided an image comparison method, the method comprising:
Acquiring an image to be detected;
Detecting the image to be detected through a deep learning model to determine a difference part of the image to be detected compared with a reference image, wherein the difference part comprises targets which are not present in the image to be detected and are present in the reference image, and/or targets which are present in the image to be detected and are not present in the reference image;
The deep learning model is obtained through training of an image sample and labeling information of the image sample, the labeling information of the image sample is determined by projecting a target area in a reference image into the image sample, and the target area is an area where a target existing in the reference image but not existing in the image sample is located. That is, the labeling information of the image sample is obtained by the data labeling method.
In another aspect, there is provided a data tagging device, the device comprising:
The first determining module is used for determining a target area in the reference image, wherein the target area is an area where a target existing in the reference image is located, and the target does not exist in the image to be marked;
The projection module is used for projecting the target area into the image to be marked to obtain a marked area in the image to be marked;
And the second determining module is used for determining the labeling information of the image to be labeled based on the labeling area.
Optionally, the projection module includes:
The determining submodule is used for determining the aspect ratio of the reference image and the image to be marked;
And the projection sub-module is used for projecting the target area into the image to be marked based on the aspect ratio to obtain the marked area in the image to be marked.
Optionally, the aspect ratio includes an abscissa ratio and an ordinate ratio;
the projection submodule is specifically configured to:
Determining the abscissa of the labeling area based on the abscissa ratio and the abscissa of the target area;
Determining the ordinate of the labeling area based on the ordinate ratio and the ordinate of the target area;
and marking the marked area in the image to be marked based on the horizontal coordinate and the vertical coordinate of the marked area.
Optionally, the first determining module includes:
a first detection sub-module for detecting a user annotation operation with respect to the target;
And the first labeling sub-module is used for labeling the target area in the reference image based on the detected user labeling operation.
Optionally, the first determining module includes:
And the second labeling sub-module is used for labeling the target area in the reference image through an image comparison model based on the reference image and the image to be labeled.
Optionally, the second labeling sub-module is specifically configured to:
Inputting the reference image and the image to be annotated into the image comparison model to obtain a reference image which is output by the image comparison model and marked with an initial area, wherein the initial area is the area where the target is located;
Detecting a user adjustment operation with respect to the initial region;
and adjusting the initial area based on the detected user adjustment operation to obtain the target area.
Optionally, if a portion of the target is occluded, the target area covers the entirety of the target, or covers a portion of the target that is not occluded.
In another aspect, there is provided an image comparison apparatus, the apparatus including:
The acquisition module is used for acquiring the image to be detected;
the detection module is used for detecting the image to be detected through a deep learning model to determine a difference part of the image to be detected compared with a reference image, wherein the difference part comprises targets which are not present in the image to be detected and are present in the reference image, and/or targets which are present in the image to be detected and are not present in the reference image;
The deep learning model is obtained through training of an image sample and labeling information of the image sample, the labeling information of the image sample is determined by projecting a target area in a reference image into the image sample, and the target area is an area where a target existing in the reference image but not existing in the image sample is located. That is, the labeling information of the image sample is obtained by the data labeling method.
In another aspect, a computer device is provided, where the computer device includes a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory complete communication with each other through the communication bus, where the memory is used to store a computer program, and where the processor is used to execute the program stored on the memory, so as to implement the steps of the data labeling method or the image comparison method described above.
In another aspect, a computer readable storage medium is provided, in which a computer program is stored, the computer program implementing the steps of the data labeling method or the image comparison method described above when being executed by a processor.
In another aspect, a computer program product is provided comprising instructions which, when run on a computer, cause the computer to perform the steps of the data annotation method or the image comparison method described above.
The technical scheme provided by the embodiment of the application at least has the following beneficial effects:
In the embodiment of the application, for a target that exists in the reference image but does not exist in the image to be annotated, the target area where the target is located in the reference image is determined, and the target area is then projected into the image to be annotated to obtain the annotation area in the image to be annotated, so that the area where the difference part is located is accurately marked in the image to be annotated and the annotation information of the image to be annotated can be obtained. The annotation information obtained through this scheme is accurate. A deep learning model trained with the training samples labeled by this scheme can then detect the difference part of an image to be detected compared with the reference image; this improves detection accuracy, and the region in which the detected difference part is located is more accurate.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a data labeling method according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a reference image and an image to be annotated according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a target region marked in a reference image according to an embodiment of the present application;
FIG. 4 is a schematic diagram of projecting a target area onto an image to be annotated according to an embodiment of the present application;
FIG. 5 is a flow chart of an image comparison method provided by an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a data labeling device according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of an image comparing device according to an embodiment of the present application;
fig. 8 is a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the following detailed description of the embodiments of the present application will be given with reference to the accompanying drawings.
First, some application scenarios according to the embodiments of the present application will be described.
Currently, image comparison is required in many scenes. For example, in a video monitoring scene, it is generally required to monitor in real time the parts of a video picture that change relative to the background environment, for example whether there are fewer articles, or more people and vehicles, in the video picture. Video monitoring based on image comparison can be performed in scenes such as supermarkets, communities and banks. Of course, the embodiment of the application is not limited to application in video monitoring scenes, and the scheme can be applied to any scene requiring image comparison. Image comparison typically uses a trained deep learning model to find the changes that occur in the image to be detected compared with the reference image.
The reference image and the image to be detected may be images of the same scene; the reference image may be a reference picture of a certain scene, and the image to be detected may be a real-time picture of that scene. Of course, the embodiments of the present application do not limit which images are reference images and which are images to be detected. In the application process, the reference image can be embedded into the deep learning model, or image features of the reference image are extracted and embedded into the deep learning model; in the video monitoring process, an image acquired in real time is input into the deep learning model as the image to be detected, and the difference part is found by the deep learning model.
Before the deep learning model is used, it needs to be trained, and data annotation is an important step in training the deep learning model. In image comparison, data annotation refers to annotating the image to be annotated to obtain the annotation information of the image to be annotated. The reference image, the image to be annotated and the annotation information of the image to be annotated can subsequently be used as training samples, and the deep learning model is trained with these training samples. In the related art, for a target that exists in the image to be annotated but does not exist in the reference image, the area where the target is located is marked manually in the image to be annotated, so that the annotation information of the image to be annotated is obtained; the target is the difference part. That is, data annotation in the related art is only suitable for annotating a target that exists in the image to be annotated and does not exist in the reference image. The embodiment of the application provides a data labeling method that, for a target which exists in the reference image but does not exist in the image to be annotated, can accurately label in the image to be annotated the region where the difference part is located, thereby providing richer training samples for training the deep learning model. The deep learning model is trained with the training samples obtained by labeling with this scheme to obtain a trained deep learning model. Detecting the difference part of the image to be detected compared with the reference image through the trained deep learning model can improve detection accuracy; that is, the region where the detected difference part is located is more accurate.
It should be noted that, the data labeling method in the embodiment of the present application may provide training samples for the deep learning model, may provide training samples for other types of algorithm models, and may also be applied to other fields, which is not limited in the embodiment of the present application. The embodiment of the application also does not limit the network structure of the deep learning model and the like. In addition, the data labeling method provided by the embodiment of the application can be executed by any computer device, and the computer device can be a mobile phone, a notebook computer, a desktop computer and the like, and the embodiment of the application is not limited to the method.
The service scenario described in the embodiment of the present application is for more clearly describing the technical solution of the embodiment of the present application, and does not constitute a limitation on the technical solution provided by the embodiment of the present application, and as a person of ordinary skill in the art can know that, with the appearance of a new service scenario, the technical solution provided by the embodiment of the present application is applicable to similar technical problems.
The data labeling method provided by the embodiment of the application is explained in detail below.
Fig. 1 is a flowchart of a data labeling method according to an embodiment of the present application. Taking the application of the method to a computer device as an example, please refer to fig. 1, the method includes the following steps.
Step 101: and determining a target area in the reference image, wherein the target area is an area where a target existing in the reference image is located, and the target does not exist in the image to be marked.
As can be seen from the foregoing, in the embodiment of the present application, the target does not exist in the image to be annotated but exists in the reference image; that is, for this difference part between the image to be annotated and the reference image, it is difficult to directly and accurately mark in the image to be annotated the region where the difference part is located. In this scheme, the computer device first determines the target area in the reference image, where the target area is the area where the target existing in the reference image is located.
In the embodiment of the present application, there are various implementations of determining the target area in the reference image by the computer device, and two implementations thereof are described below.
First implementation
The computer device detects a user annotation operation with respect to the target, and annotates the target region in the reference image based on the detected user annotation operation. That is, in the embodiment of the present application, the target region may be manually noted in the reference image.
Illustratively, the reference image and the image to be annotated are displayed on the computer device, and the user operates the computer device to annotate the region of the target in the reference image. Wherein the user can annotate the target area by any shape. For example, the user may select a target region in the reference image through a mouse frame, and the computer device displays a rectangular frame in which the target region is located based on the detected user labeling operation. Optionally, the user may also input the attribute of the target, such as the target category, etc., to the computer device via a keyboard, etc., and the computer device displays the attribute of the target.
Optionally, if multiple targets exist in the reference image, the user may mark the area where each target is located, if a certain target is blocked, the user may mark only the portion where the target is not blocked, or the user may mark the complete portion of the target through observation and experience. That is, if a part of the target is blocked, the target area covers the entire target or covers a part of the target that is not blocked.
Optionally, the computer device may also pre-process the reference image and/or the image to be annotated, prior to annotating the target region in the reference image, which may include one or more of rotation, scaling, cropping, etc., such that the reference image is consistent with the visual angle, image size, and/or visual range of the image to be annotated. Optionally, taking preprocessing an image to be annotated as an example, the computer device performs preprocessing on the image to be annotated under the condition that user preprocessing operation on the image to be annotated is detected, or the computer device automatically performs preprocessing on the image to be annotated based on an image processing technology.
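Illustratively, the preprocessing described above may be implemented with a general-purpose image library. The following Python sketch assumes OpenCV; the function name and the rotation parameter are illustrative only and are not part of this scheme. It rotates and scales the image to be annotated so that its orientation and size match those of the reference image.

    import cv2

    def preprocess_to_match(reference, to_annotate, rotate_deg=0.0):
        # Rotate and resize the image to be annotated so that its orientation
        # and pixel dimensions are consistent with the reference image.
        ref_h, ref_w = reference.shape[:2]
        h, w = to_annotate.shape[:2]
        if rotate_deg:
            # Rotation about the image centre, keeping the original canvas size.
            m = cv2.getRotationMatrix2D((w / 2, h / 2), rotate_deg, 1.0)
            to_annotate = cv2.warpAffine(to_annotate, m, (w, h))
        # Scaling so that both images share the same width and height.
        return cv2.resize(to_annotate, (ref_w, ref_h))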
Second implementation
The computer equipment marks a target area in the reference image through an image comparison model based on the reference image and the image to be marked. That is, the target region in the reference image may be labeled by an existing image comparison model.
The image comparison model can be a model trained based on a deep learning technology, and the network structure of the image comparison model can be a convolutional neural network or other neural networks, so that the application is not limited to the method. The embodiment of the application also does not limit the training mode of the image comparison model, the layer number of the network and the like.
Optionally, the computer device inputs the reference image and the image to be annotated into the image comparison model to obtain the reference image marked with the target area output by the image comparison model. That is, the target region is directly obtained by the image comparison model.
Or the computer equipment inputs the reference image and the image to be annotated into the image comparison model to obtain the reference image which is output by the image comparison model and marked with the initial area, wherein the initial area is the area where the target is located. The computer device detects a user adjustment operation with respect to the initial region, and adjusts the initial region based on the detected user adjustment operation to obtain the target region. That is, the target area is obtained through image comparison and manual adjustment, and the initial area can be corrected through manual adjustment, so that the marked target area is more accurate.
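Illustratively, the flow of obtaining the target area through the image comparison model followed by manual adjustment can be expressed as the following Python sketch; the comparison-model interface and the offset dictionary are hypothetical and only serve to illustrate the two steps described above.

    def refine_initial_region(comparison_model, reference, to_annotate, user_offsets=None):
        # The image comparison model outputs an initial region, here assumed to be
        # an axis-aligned box (x1, y1, x2, y2) around the target that exists in the
        # reference image but not in the image to be annotated.
        x1, y1, x2, y2 = comparison_model(reference, to_annotate)
        # A user adjustment operation, expressed here as coordinate offsets,
        # corrects the initial region into the final target area.
        if user_offsets:
            x1 += user_offsets.get("x1", 0)
            y1 += user_offsets.get("y1", 0)
            x2 += user_offsets.get("x2", 0)
            y2 += user_offsets.get("y2", 0)
        return x1, y1, x2, y2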
The initial region or the target region directly marked by the image comparison model can be marked in various ways. For example, the region where the target is located is marked by a rectangular frame, triangle, hexagon, irregular shape, or the like. The manner in which the image comparison model marks the region where the target is located is related to a training sample of the image comparison model, a model structure and the like, which is not limited by the embodiment of the application.
In the second implementation, if there are multiple targets in the reference image, the region where each target is located may be marked. If a portion of the target is occluded, the target area covers the entire target or covers a portion of the target that is not occluded.
Fig. 2 is a schematic diagram of a reference image and an image to be annotated according to an embodiment of the present application. Fig. 3 is a schematic diagram of marking a target area in the reference image shown in fig. 2 according to an embodiment of the present application. Referring to fig. 2 and 3, an object a and an object B exist in the reference image, and the object a exists in the image to be marked, wherein the object B is a target, and the area where the object B is located is a target area to be marked in the reference image. Taking manual labeling as an example, a user marks a target area in a reference image displayed on a computer device through program software, wherein the target area is an area in a pentagonal dotted line frame in fig. 3.
Step 102: and projecting the target area into the image to be marked to obtain the marked area in the image to be marked.
In the embodiment of the application, after determining the target area in the reference image, the computer equipment projects the target area into the image to be marked to obtain the marked area in the image to be marked.
Optionally, one implementation in which the computer device projects the target area into the image to be annotated to obtain the annotation area in the image to be annotated is as follows: the computer device determines the aspect ratio of the reference image and the image to be marked, and projects the target area into the image to be marked based on the aspect ratio to obtain the marked area in the image to be marked.
Optionally, the aspect ratio includes an abscissa ratio and an ordinate ratio. The computer device determines the abscissa of the labeling area based on the abscissa ratio and the abscissa of the target area. The computer device determines the ordinate of the labeling area based on the ordinate ratio and the ordinate of the target area. The computer device marks the marked area in the image to be marked based on the horizontal coordinate and the vertical coordinate of the marked area.
Illustratively, the abscissa ratio is the pixel width of the reference image divided by the pixel width of the image to be annotated, and the ordinate ratio is the pixel height of the reference image divided by the pixel height of the image to be annotated. If the abscissa ratio and the ordinate ratio of the reference image to the image to be marked are both 1, that is, the reference image and the image to be marked have the same size, the computer device determines the abscissa and the ordinate of the target area as the abscissa and the ordinate of the labeling area, respectively. If the abscissa ratio and the ordinate ratio of the reference image to the image to be marked are 1.5 and 1.2 respectively, that is, the reference image and the image to be marked differ in size, the computer device may divide the abscissa of the target area by the abscissa ratio to obtain the abscissa of the labeling area, and divide the ordinate of the target area by the ordinate ratio to obtain the ordinate of the labeling area.
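Illustratively, this projection can be written as the following Python sketch, in which the target area is assumed to be an axis-aligned box (x1, y1, x2, y2) in reference-image pixel coordinates; the function and variable names are illustrative rather than taken from this scheme.

    def project_region(target_box, ref_size, to_annotate_size):
        # ref_size and to_annotate_size are (pixel width, pixel height) pairs.
        ref_w, ref_h = ref_size
        ann_w, ann_h = to_annotate_size
        x_ratio = ref_w / ann_w      # abscissa ratio
        y_ratio = ref_h / ann_h      # ordinate ratio
        x1, y1, x2, y2 = target_box
        # Dividing the reference-image coordinates by the ratios yields the
        # coordinates of the labeling area in the image to be annotated.
        return (x1 / x_ratio, y1 / y_ratio, x2 / x_ratio, y2 / y_ratio)

    # With ratios of 1.5 and 1.2 as in the example above:
    # project_region((300, 240, 450, 360), (1920, 1200), (1280, 1000))
    # returns (200.0, 200.0, 300.0, 300.0)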
In addition to the above-described projection method, the target region may be projected into the image to be marked by other methods, so as to obtain the marked region. For example, the computer device scales the image to be annotated to the same size as the reference image based on the aspect ratio, and the computer device determines the abscissa and the ordinate of the target region as the abscissa and the ordinate of the annotation region, respectively. Optionally, the computer device rescales the image to be marked with the marked area to the original size, or the computer device may not scale any more, and the sizes of the reference image and the image to be marked are kept the same.
Fig. 4 is a schematic diagram of projecting a target area into an image to be annotated according to an embodiment of the present application. As shown in fig. 4, the reference image and the image to be annotated have the same size, and the area where the target B is located in the reference image is a target area, such as an area within a pentagonal dashed-line frame in the reference image in fig. 4. The computer device automatically projects the target region into the image to be annotated, and the annotation region in the image to be annotated is obtained, such as the region in the pentagonal dashed line frame in the image to be annotated in fig. 4.
Optionally, whether the target area is marked manually, directly or through an image comparison model, the user can also adjust the marked target area in the reference image at any time after the computer equipment projects the target area into the image to be marked. When the computer equipment detects the user adjustment operation about the target area, the labeling area in the image to be labeled is automatically adjusted based on the adjusted target area.
Step 103: and determining the annotation information of the image to be annotated based on the annotation region.
In the embodiment of the application, after the computer equipment obtains the labeling area in the image to be labeled, labeling information of the image to be labeled is determined based on the labeling area. For example, the annotation information may include the abscissa and the ordinate of the annotation region. Optionally, the target region in the reference image may be labeled with the attribute of the target, for example, manually labeling the attribute, or identifying the attribute of the target through the image comparison model. Optionally, the annotation information of the image to be annotated may further include attribute information of the target.
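Illustratively, the annotation information may be organized as a simple record such as the following; the field names are assumptions, since this scheme only requires that the coordinates of the labeling area, and optionally the attribute information of the target, are recorded.

    annotation_info = {
        "image": "to_annotate_0001.jpg",                          # image to be annotated (hypothetical file name)
        "region": {"x1": 200, "y1": 200, "x2": 300, "y2": 300},   # projected labeling area
        "attributes": {"category": "box", "occluded": False},     # optional target attributes
    }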
In this scheme, if the reference image marked with the target area is directly output through the image comparison model, the computer device can automatically obtain the annotation information of the image to be annotated without manual participation. In this case, the computer device may be a server, and the server automatically obtains the annotation information of the image to be annotated based on the reference image and the image to be annotated.
In summary, in the embodiment of the present application, for the target that exists in the reference image but does not exist in the image to be marked, the target area where the target exists in the reference image is determined first, and then the target area is projected into the image to be marked to obtain the marked area in the image to be marked, so that the area where the difference part exists is accurately marked in the image to be marked, and further the marked information of the image to be marked can be obtained, and the obtained marked information is also accurate.
All the above optional technical solutions may be combined according to any choice to form an optional embodiment of the present application, and the embodiments of the present application will not be described in detail.
Next, the image comparison method provided by the embodiment of the present application is explained. It should be noted that the image comparison method in the embodiment of the present application may be applied to a computer device, such as a terminal device or a server, which is not limited in the embodiment of the present application. The device to which the image comparison method is applied has a deep learning model deployed on it, where the deep learning model is obtained through training with image samples and the annotation information of the image samples, and the annotation information of the image samples is obtained through the data labeling method provided by the above embodiment.
Fig. 5 is a flowchart of an image comparison method according to an embodiment of the present application. Taking the application of the method to a computer device as an example, please refer to fig. 5, the method includes the following steps.
Step 501: and acquiring an image to be detected.
In the embodiment of the application, an image to be detected is first obtained. For example, the image to be detected is an image acquired by a camera in real time; the camera sends the acquired image to the computer device, and the computer device takes the received image as the image to be detected. Alternatively, the image to be detected is an image obtained in other ways, which is not limited in the embodiment of the present application.
Step 502: the image to be detected is detected through a deep learning model to determine the difference part of the image to be detected compared with a reference image, the deep learning model is obtained through training of an image sample and marking information of the image sample, and the marking information of the image sample is obtained through a data marking method provided by the embodiment of fig. 1.
In the embodiment of the application, a deep learning model is deployed in the computer equipment, the deep learning model is obtained through training of an image sample and the marking information of the image sample, and the marking information of the image sample is obtained through the data marking method provided by the embodiment. That is, the annotation information of the image sample is determined by projecting the target region in the reference image into the image sample. The target area is an area where a target exists in the reference image but does not exist in the image sample. For example, the computer device projects the target region into the image sample, obtains an annotation region in the image sample, and determines annotation information for the image sample based on the annotation region. It should be noted that, the training samples of the deep learning model may include training samples provided by the data labeling method provided by the above embodiment, and may also include training samples provided by other methods. That is, the data labeling method can provide a richer training sample for training of the deep learning model.
Optionally, the network structure of the deep learning model may be a convolutional neural network or a fully-connected network, or may be another neural network. In addition, the embodiment of the application is not limited to the training method of the deep learning model.
In the embodiment of the application, the image to be detected is detected through the deep learning model, and the difference part of the image to be detected compared with the reference image is determined. The difference part comprises a target that is not present in the image to be detected and is present in the reference image, and/or a target that is present in the image to be detected and is not present in the reference image. That is, the deep learning model can detect which objects the image to be detected contains in addition to those in the reference image, and which objects in the reference image are missing from the image to be detected. Moreover, the region of the image to be detected in which the computer device locates the difference part is more accurate.
Optionally, the computer device marks the region in which the difference portion is located in the image to be detected. For example, in the case where the image to be detected has fewer objects than the reference image, the computer device marks an area where the image to be detected has fewer objects. In the case that the image to be detected has more objects than the reference image, the computer device marks the region of the image to be detected that has more objects. In the case that the image to be detected has fewer objects and more objects than the reference image, the computer device marks the region with fewer objects and the region with more objects in the image to be detected. Optionally, the computer device marks the area with few objects and/or the area with more objects in the image to be detected in a rectangular frame or irregular frame mode.
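Illustratively, marking the regions where the difference parts are located may be done as in the following Python sketch; OpenCV drawing is an assumption, since this scheme only requires that the regions with fewer objects and/or more objects are marked, for example with rectangular frames.

    import cv2

    def draw_difference_boxes(image, missing_boxes, extra_boxes):
        # Regions of objects that exist in the reference image but not in the
        # image to be detected are drawn in one color, and regions of objects
        # that exist only in the image to be detected are drawn in another.
        for (x1, y1, x2, y2) in missing_boxes:
            cv2.rectangle(image, (x1, y1), (x2, y2), (0, 0, 255), 2)
        for (x1, y1, x2, y2) in extra_boxes:
            cv2.rectangle(image, (x1, y1), (x2, y2), (0, 255, 0), 2)
        return image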
Optionally, the reference image is embedded in the deep learning model, or image features of the reference image are embedded in the deep learning model. The computer device inputs the image to be detected into a deep learning model, and the difference part is found out through the deep learning model. The computer device may extract the image features of the reference image through the feature extraction model, and the computer device may obtain the image features of the reference image through other manners, which is not limited in the embodiment of the present application. The feature extraction model may be a trained neural network model or other model, among others. Alternatively, the reference image or image features of the reference image may not be embedded in the deep learning model, and the computer device inputs the image to be detected and the reference image into the deep learning model, and the difference portion is found by the deep learning model.
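Illustratively, the detection flow in which the image features of the reference image are embedded once and each image to be detected is compared against them can be sketched as follows in Python; PyTorch is an assumption, and the model interface and output format are hypothetical placeholders rather than a disclosed network structure.

    import torch

    @torch.no_grad()
    def detect_differences(model, feature_extractor, reference, to_detect):
        # The image features of the reference image are extracted once and reused;
        # `reference` and `to_detect` are assumed to be CHW image tensors.
        ref_feat = feature_extractor(reference.unsqueeze(0))
        # The image to be detected is compared against the embedded reference features.
        pred = model(to_detect.unsqueeze(0), ref_feat)
        # Assumed output: boxes for objects missing from, and extra in, the image to be detected.
        return pred["missing_boxes"], pred["extra_boxes"]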
In summary, the labeling information obtained by labeling the image sample by the data labeling method provided by the embodiment is accurate. Then, the training sample obtained through the labeling of the scheme is used for training a deep learning model, the difference part of the image to be detected compared with the reference image is detected through the trained deep learning model, the detection accuracy can be improved, and the detected difference part is more accurate.
Fig. 6 is a schematic structural diagram of a data labeling apparatus 600 according to an embodiment of the present application, where the data labeling apparatus 600 may be implemented as part or all of a computer device by software, hardware, or a combination of both. Referring to fig. 6, the apparatus 600 includes: a first determination module 601, a projection module 602, and a second determination module 603.
The first determining module 601 is configured to determine a target area in the reference image, where the target area is an area where a target existing in the reference image is located, and the target does not exist in the image to be annotated;
the projection module 602 is configured to project the target area into the image to be annotated, so as to obtain an annotation area in the image to be annotated;
The second determining module 603 is configured to determine labeling information of the image to be labeled based on the labeling area.
Optionally, the projection module 602 includes:
the determining submodule is used for determining the aspect ratio of the reference image and the image to be marked;
and the projection sub-module is used for projecting the target area into the image to be marked based on the aspect ratio to obtain the marked area in the image to be marked.
Optionally, the aspect ratio includes an abscissa ratio and an ordinate ratio;
The projection submodule is specifically used for:
determining the abscissa of the labeling area based on the abscissa ratio and the abscissa of the target area;
determining the ordinate of the labeling area based on the ordinate ratio and the ordinate of the target area;
and marking the marked area in the image to be marked based on the horizontal coordinate and the vertical coordinate of the marked area.
Optionally, the first determining module 601 includes:
a first detection sub-module for detecting a user annotation operation with respect to the target;
and the first labeling sub-module is used for labeling the target area in the reference image based on the detected user labeling operation.
Optionally, the first determining module 601 includes:
The second labeling sub-module is used for labeling the target area in the reference image through the image comparison model based on the reference image and the image to be labeled.
Optionally, the second labeling sub-module is specifically configured to:
Inputting a reference image and an image to be annotated into an image comparison model to obtain a reference image which is output by the image comparison model and marked with an initial area, wherein the initial area is an area where the target is located;
Detecting a user adjustment operation with respect to the initial region;
and adjusting the initial area based on the detected user adjustment operation to obtain a target area.
Alternatively, if a portion of the target is occluded, the target area covers the entirety of the target, or covers the portion of the target that is not occluded.
In summary, in the embodiment of the present application, for the target that exists in the reference image but does not exist in the image to be marked, the target area where the target exists in the reference image is determined first, and then the target area is projected into the image to be marked to obtain the marked area in the image to be marked, so that the area where the difference part exists is accurately marked in the image to be marked, and further the marked information of the image to be marked can be obtained, and the obtained marked information is also accurate.
It should be noted that: in the data labeling device provided in the above embodiment, only the division of the above functional modules is used for illustration, and in practical application, the above functional allocation may be performed by different functional modules according to needs, that is, the internal structure of the device is divided into different functional modules, so as to complete all or part of the functions described above. In addition, the data labeling device and the data labeling method provided in the foregoing embodiments belong to the same concept, and specific implementation processes of the data labeling device and the data labeling method are detailed in the method embodiments and are not repeated herein.
Fig. 7 is a schematic structural diagram of an image comparing apparatus 700 according to an embodiment of the present application, where the image comparing apparatus 700 may be implemented as part or all of a computer device by software, hardware, or a combination of both. Referring to fig. 7, the apparatus 700 includes: an acquisition module 701 and a detection module 702.
An acquisition module 701, configured to acquire an image to be detected;
The detection module 702 is configured to detect an image to be detected through a deep learning model, so as to determine a difference portion of the image to be detected compared with a reference image, where the difference portion includes a target that does not exist in the image to be detected and exists in the reference image, and/or a target that exists in the image to be detected and does not exist in the reference image;
The deep learning model is obtained through training of an image sample and marking information of the image sample, the marking information of the image sample is determined by projecting a target area in a reference image into the image sample, and the target area is an area where a target existing in the reference image but not existing in the image sample is located. That is, the labeling information of the image sample is obtained by the data labeling method provided in the above embodiment.
In the embodiment of the application, the labeling information obtained by labeling the image sample by the data labeling method provided by the embodiment is accurate. Then, the training sample obtained through the labeling of the scheme is used for training the deep learning model, the trained deep learning model is obtained, the difference part of the image to be detected compared with the reference image is detected through the trained deep learning model, the detection accuracy can be improved, and the detected difference part is more accurate.
It should be noted that: in the image comparison device provided in the above embodiment, only the division of the above functional modules is used for illustration when comparing images, and in practical application, the above functional allocation may be performed by different functional modules according to needs, that is, the internal structure of the device is divided into different functional modules, so as to complete all or part of the functions described above. In addition, the image comparison device provided in the above embodiment belongs to the same concept as the image comparison method embodiment, and the specific implementation process is detailed in the method embodiment, which is not described herein again.
Fig. 8 shows a block diagram of a computer device 800 provided in an exemplary embodiment of the application. The computer device 800 may be a smart phone, a tablet computer, a notebook computer or a desktop computer. The computer device 800 may also be referred to as a terminal, user device, portable terminal, laptop terminal, desktop terminal, and the like.
In general, the computer device 800 includes: a processor 801 and a memory 802.
Processor 801 may include one or more processing cores, such as a 4-core processor or an 8-core processor. The processor 801 may be implemented in at least one hardware form of DSP (Digital Signal Processing), FPGA (Field-Programmable Gate Array) and PLA (Programmable Logic Array). The processor 801 may also include a main processor and a coprocessor. The main processor is a processor for processing data in the awake state, also referred to as a CPU (Central Processing Unit); the coprocessor is a low-power processor for processing data in the standby state. In some embodiments, the processor 801 may integrate a GPU (Graphics Processing Unit) for rendering and drawing the content that the display screen needs to display. In some embodiments, the processor 801 may also include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
Memory 802 may include one or more computer-readable storage media, which may be non-transitory. Memory 802 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in memory 802 is used to store at least one instruction for execution by processor 801 to implement the data annotation method or image comparison method provided by the method embodiments of the present application.
In some embodiments, the computer device 800 may optionally further include: a peripheral interface 803, and at least one peripheral. The processor 801, the memory 802, and the peripheral interface 803 may be connected by a bus or signal line. Individual peripheral devices may be connected to the peripheral device interface 803 by buses, signal lines, or a circuit board. Specifically, the peripheral device includes: at least one of radio frequency circuitry 804, a display 805, a camera assembly 806, audio circuitry 807, a positioning assembly 808, and a power supply 809.
Peripheral interface 803 may be used to connect at least one Input/Output (I/O) related peripheral to processor 801 and memory 802. In some embodiments, processor 801, memory 802, and peripheral interface 803 are integrated on the same chip or circuit board; in some other embodiments, either or both of the processor 801, the memory 802, and the peripheral interface 803 may be implemented on separate chips or circuit boards, which is not limited in this embodiment.
The radio frequency circuit 804 is configured to receive and transmit RF (Radio Frequency) signals, also known as electromagnetic signals. The radio frequency circuit 804 communicates with a communication network and other communication devices via electromagnetic signals. The radio frequency circuit 804 converts an electrical signal into an electromagnetic signal for transmission, or converts a received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuit 804 includes: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuit 804 may communicate with other computer devices via at least one wireless communication protocol. The wireless communication protocol includes, but is not limited to: metropolitan area networks, various generations of mobile communication networks (2G, 3G, 4G and 5G), wireless local area networks and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 804 may further include NFC (Near Field Communication) related circuits, which is not limited by the present application.
The display 805 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display 805 is a touch display, the display 805 also has the ability to collect touch signals on or above the surface of the display 805. The touch signal may be input as a control signal to the processor 801 for processing. At this time, the display 805 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, there may be one display 805, disposed on the front panel of the computer device 800; in other embodiments, there may be at least two displays 805, respectively disposed on different surfaces of the computer device 800 or in a folded design; in other embodiments, the display 805 may be a flexible display disposed on a curved surface or a folded surface of the computer device 800. The display 805 may even be arranged in an irregular, non-rectangular pattern, i.e., a shaped screen. The display 805 may be made of LCD (Liquid Crystal Display), OLED (Organic Light-Emitting Diode) or other materials.
The camera assembly 806 is used to capture images or video. Optionally, the camera assembly 806 includes a front camera and a rear camera. Typically, the front camera is disposed on the front panel of the computer device and the rear camera is disposed on the rear surface of the computer device. In some embodiments, there are at least two rear cameras, each of which is any one of a main camera, a depth camera, a wide-angle camera and a telephoto camera, so that the main camera and the depth camera can be fused to realize a background blurring function, or the main camera and the wide-angle camera can be fused to realize panoramic shooting and VR (Virtual Reality) shooting functions or other fusion shooting functions. In some embodiments, the camera assembly 806 may also include a flash. The flash may be a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash refers to a combination of a warm-light flash and a cold-light flash, and can be used for light compensation under different color temperatures.
Audio circuitry 807 may include a microphone and a speaker. The microphone is used for collecting sound waves of users and the environment, converting the sound waves into electric signals, inputting the electric signals to the processor 801 for processing, or inputting the electric signals to the radio frequency circuit 804 for voice communication. For purposes of stereo acquisition or noise reduction, the microphone may be multiple, each disposed at a different location of the computer device 800. The microphone may also be an array microphone or an omni-directional pickup microphone. The speaker is used to convert electrical signals from the processor 801 or the radio frequency circuit 804 into sound waves. The speaker may be a conventional thin film speaker or a piezoelectric ceramic speaker. When the speaker is a piezoelectric ceramic speaker, not only the electric signal can be converted into a sound wave audible to humans, but also the electric signal can be converted into a sound wave inaudible to humans for ranging and other purposes. In some embodiments, audio circuit 807 may also include a headphone jack.
The positioning component 808 is used to locate the current geographic location of the computer device 800 for navigation or LBS (Location Based Service). The positioning component 808 may be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.
The power supply 809 is used to power the various components in the computer device 800. The power supply 809 may be an alternating current, direct current, disposable battery, or rechargeable battery. When the power supply 809 includes a rechargeable battery, the rechargeable battery may support wired or wireless charging. The rechargeable battery may also be used to support fast charge technology.
In some embodiments, the computer device 800 also includes one or more sensors 810. The one or more sensors 810 include, but are not limited to: acceleration sensor 811, gyroscope sensor 812, pressure sensor 813, fingerprint sensor 814, optical sensor 815, and proximity sensor 816.
The acceleration sensor 811 can detect the magnitudes of accelerations on three coordinate axes of the coordinate system established with the computer device 800. For example, the acceleration sensor 811 may be used to detect components of gravitational acceleration in three coordinate axes. The processor 801 may control the display screen 805 to display a user interface in a landscape view or a portrait view based on the gravitational acceleration signal acquired by the acceleration sensor 811. Acceleration sensor 811 may also be used for the acquisition of motion data of a game or user.
The gyro sensor 812 may detect a body direction and a rotation angle of the computer device 800, and the gyro sensor 812 may collect a 3D motion of the user on the computer device 800 in cooperation with the acceleration sensor 811. The processor 801 may implement the following functions based on the data collected by the gyro sensor 812: motion sensing (e.g., changing UI according to a tilting operation by a user), image stabilization at shooting, game control, and inertial navigation.
The pressure sensor 813 may be disposed on a side frame of the computer device 800 and/or beneath the display screen 805. When the pressure sensor 813 is disposed on a side frame of the computer device 800, it can detect the user's grip signal on the computer device 800, and the processor 801 performs left/right-hand recognition or shortcut operations according to the grip signal collected by the pressure sensor 813. When the pressure sensor 813 is disposed beneath the display screen 805, the processor 801 controls the operability controls on the UI according to the user's pressure operation on the display screen 805. The operability controls include at least one of a button control, a scroll bar control, an icon control, and a menu control.
The fingerprint sensor 814 is used to collect a user's fingerprint, and the processor 801 identifies the user's identity based on the fingerprint collected by the fingerprint sensor 814, or the fingerprint sensor 814 itself identifies the user's identity based on the collected fingerprint. Upon recognizing that the user's identity is a trusted identity, the processor 801 authorizes the user to perform relevant sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments, changing settings, and the like. The fingerprint sensor 814 may be disposed on the front, back, or side of the computer device 800. When a physical button or a vendor logo is provided on the computer device 800, the fingerprint sensor 814 may be integrated with the physical button or the vendor logo.
The optical sensor 815 is used to collect the ambient light intensity. In one embodiment, the processor 801 may control the display brightness of the display screen 805 based on the ambient light intensity collected by the optical sensor 815. Specifically, when the ambient light intensity is high, the display brightness of the display screen 805 is turned up; when the ambient light intensity is low, the display brightness of the display screen 805 is turned down. In another embodiment, the processor 801 may also dynamically adjust the shooting parameters of the camera assembly 806 based on the ambient light intensity collected by the optical sensor 815.
The proximity sensor 816, also referred to as a distance sensor, is typically provided on the front panel of the computer device 800. The proximity sensor 816 is used to collect the distance between the user and the front of the computer device 800. In one embodiment, when the proximity sensor 816 detects that the distance between the user and the front of the computer device 800 gradually decreases, the processor 801 controls the display screen 805 to switch from the screen-on state to the screen-off state; when the proximity sensor 816 detects that the distance between the user and the front of the computer device 800 gradually increases, the processor 801 controls the display screen 805 to switch from the screen-off state to the screen-on state.
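As a purely illustrative aside (not part of this application), the screen-state switching described above can be sketched as follows; the 5 cm threshold and the helper name are assumptions chosen only for illustration:

    # Illustrative sketch only: toggle the screen based on the user-to-panel distance,
    # approximating the behavior driven by a proximity sensor such as 816.
    def update_screen_state(distance_cm: float, screen_on: bool, threshold_cm: float = 5.0) -> bool:
        if screen_on and distance_cm < threshold_cm:
            return False   # user is close (e.g., device at the ear): switch the screen off
        if not screen_on and distance_cm >= threshold_cm:
            return True    # user has moved away: switch the screen back on
        return screen_on   # otherwise keep the current state

    state = True
    for d in (20.0, 3.0, 2.5, 15.0):
        state = update_screen_state(d, state)
        print(d, state)   # prints: 20.0 True, 3.0 False, 2.5 False, 15.0 True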
Those skilled in the art will appreciate that the structure shown in FIG. 8 does not constitute a limitation of the computer device 800, which may include more or fewer components than shown, combine certain components, or employ a different arrangement of components.
In some embodiments, there is also provided a computer readable storage medium having stored therein a computer program which, when executed by a processor, implements the steps of the data annotation method or the image comparison method of the above embodiments. For example, the computer readable storage medium may be ROM, RAM, CD-ROM, magnetic tape, floppy disk, optical data storage device, etc.
It is noted that the computer readable storage medium mentioned in the embodiments of the present application may be a non-volatile storage medium, in other words, may be a non-transitory storage medium.
It should be understood that all or part of the steps for implementing the above-described embodiments may be implemented by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. The computer instructions may be stored in the computer-readable storage medium described above.
That is, in some embodiments, there is also provided a computer program product containing instructions which, when run on a computer, cause the computer to perform the steps of the data annotation method or image comparison method described above.
It should be understood that references herein to "at least one" mean one or more, and "a plurality" means two or more. In the description of the embodiments of the present application, unless otherwise indicated, "/" means "or"; for example, A/B may represent A or B. "And/or" herein merely describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may mean: A exists alone, A and B exist together, or B exists alone. In addition, to describe the technical solutions of the embodiments of the present application clearly, the words "first", "second", etc. are used to distinguish between identical or similar items having substantially the same function and effect. Those skilled in the art will appreciate that the words "first", "second", etc. do not limit the quantity or the order of execution, and that items described as "first" and "second" are not necessarily different.
The above embodiments are not intended to limit the present application, and any modifications, equivalent substitutions, improvements, etc. within the spirit and principle of the present application should be included in the scope of the present application.

Claims (8)

1. A data labeling method, the method comprising:
preprocessing a reference image and/or an image to be labeled so that the viewing angle, the image size, and/or the visual range of the reference image and the image to be labeled are consistent, wherein the preprocessing comprises one or more of rotation, scaling, and cropping;
determining a target area in the reference image, wherein the target area is an area where a target that exists in the reference image is located, and the target does not exist in the image to be labeled;
determining an aspect ratio between the reference image and the image to be labeled;
projecting, based on the aspect ratio, the target area into the image to be labeled to obtain a labeling area in the image to be labeled;
when a user adjustment operation on the target area is detected, adjusting the labeling area based on the adjusted target area;
and determining labeling information of the image to be labeled based on the labeling area.
2. The method of claim 1, wherein the aspect ratio comprises an abscissa ratio and an ordinate ratio;
the projecting, based on the aspect ratio, the target area into the image to be labeled to obtain the labeling area in the image to be labeled comprises:
determining an abscissa of the labeling area based on the abscissa ratio and an abscissa of the target area;
determining an ordinate of the labeling area based on the ordinate ratio and an ordinate of the target area;
and marking the labeling area in the image to be labeled based on the abscissa and the ordinate of the labeling area.
3. The method according to claim 1 or 2, wherein the determining a target area in the reference image comprises:
labeling the target area in the reference image through an image comparison model based on the reference image and the image to be labeled.
4. The method according to claim 3, wherein the labeling the target area in the reference image through an image comparison model based on the reference image and the image to be labeled comprises:
inputting the reference image and the image to be labeled into the image comparison model to obtain a reference image that is output by the image comparison model and marked with an initial area, wherein the initial area is the area where the target is located;
detecting a user adjustment operation on the initial area;
and adjusting the initial area based on the detected user adjustment operation to obtain the target area.
5. A method according to claim 1 or 2, wherein if a portion of the target is occluded, the target area covers the whole of the target or covers a portion of the target that is not occluded.
6. An image comparison method, the method comprising:
acquiring an image to be detected;
detecting the image to be detected through a deep learning model to determine a difference part of the image to be detected compared with a reference image, wherein the difference part comprises targets that are not present in the image to be detected but are present in the reference image, and/or targets that are present in the image to be detected but are not present in the reference image;
wherein the deep learning model is obtained by training with an image sample and labeling information of the image sample, the labeling information of the image sample is determined based on a labeling area in the image sample, the labeling area is obtained by projecting a target area in a reference image into the image sample based on an aspect ratio between the reference image and the image sample, and the target area is an area where a target that exists in the reference image but does not exist in the image sample is located; the labeling area can be automatically adjusted when a user adjusts the target area; the reference image and/or the image sample is a preprocessed image, the viewing angle, the image size, and/or the visual range of the preprocessed reference image being consistent with those of the image sample, and the preprocessing comprises one or more of rotation, scaling, and cropping.
7. A data labeling device, the device comprising:
a first determining module, configured to determine a target area in a reference image, wherein the target area is an area where a target that exists in the reference image is located, and the target does not exist in an image to be labeled;
a projection module, configured to project the target area into the image to be labeled to obtain a labeling area in the image to be labeled;
a second determining module, configured to determine labeling information of the image to be labeled based on the labeling area;
the device further comprises a module configured to:
preprocess the reference image and/or the image to be labeled so that the viewing angle, the image size, and/or the visual range of the reference image and the image to be labeled are consistent, wherein the preprocessing comprises one or more of rotation, scaling, and cropping; and, when a user adjustment operation on the target area is detected, adjust the labeling area based on the adjusted target area;
wherein the projection module comprises:
a determining submodule, configured to determine an aspect ratio between the reference image and the image to be labeled;
and a projection submodule, configured to project the target area into the image to be labeled based on the aspect ratio to obtain the labeling area in the image to be labeled.
8. An image comparison device, the device comprising:
an acquisition module, configured to acquire an image to be detected;
a detection module, configured to detect the image to be detected through a deep learning model to determine a difference part of the image to be detected compared with a reference image, wherein the difference part comprises targets that are not present in the image to be detected but are present in the reference image, and/or targets that are present in the image to be detected but are not present in the reference image;
wherein the deep learning model is obtained by training with an image sample and labeling information of the image sample, the labeling information of the image sample is determined based on a labeling area in the image sample, the labeling area is obtained by projecting a target area in a reference image into the image sample based on an aspect ratio between the reference image and the image sample, and the target area is an area where a target that exists in the reference image but does not exist in the image sample is located; the labeling area can be automatically adjusted when a user adjusts the target area; the reference image and/or the image sample is a preprocessed image, the viewing angle, the image size, and/or the visual range of the preprocessed reference image being consistent with those of the image sample, and the preprocessing comprises one or more of rotation, scaling, and cropping.
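Editorial note (illustrative only, not part of the claims): the coordinate projection recited in claims 1 and 2 can be sketched as follows, assuming axis-aligned rectangular areas given as (x1, y1, x2, y2) in pixels; the function and parameter names, the use of per-axis width and height ratios as the abscissa and ordinate ratios, and the rounding choice are assumptions made only for illustration:

    # Illustrative sketch of the projection in claims 1-2 (not the authoritative implementation).
    # The abscissa ratio scales x coordinates and the ordinate ratio scales y coordinates.
    def project_target_area(target_area, ref_size, label_size):
        """Project a rectangular target area from the reference image into the image
        to be labeled, using per-axis size ratios between the two images.

        target_area: (x1, y1, x2, y2) in reference-image pixels
        ref_size:    (width, height) of the reference image
        label_size:  (width, height) of the image to be labeled
        """
        rx = label_size[0] / ref_size[0]   # abscissa (horizontal) ratio
        ry = label_size[1] / ref_size[1]   # ordinate (vertical) ratio
        x1, y1, x2, y2 = target_area
        return (round(x1 * rx), round(y1 * ry), round(x2 * rx), round(y2 * ry))

    # Example: a 1920x1080 reference image and a 1280x720 image to be labeled.
    labeling_area = project_target_area((480, 270, 960, 540), (1920, 1080), (1280, 720))
    print(labeling_area)   # (320, 180, 640, 360)

Under these assumptions, the resulting labeling area, together with a difference tag, could serve as the labeling information of a training sample for the deep learning model referred to in claims 6 and 8, and re-running the same projection after a user adjusts the target area would keep the labeling area synchronized, as claims 1 and 7 describe.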

Priority Applications (1)

Application Number: CN202110926471.5A; Publication: CN113627413B; Priority Date: 2021-08-12; Filing Date: 2021-08-12; Title: Data labeling method, image comparison method and device


Publications (2)

Publication Number Publication Date
CN113627413A 2021-11-09
CN113627413B 2024-06-04

Family

ID=78385209

Family Applications (1)

Application Number: CN202110926471.5A (Active); Publication: CN113627413B; Title: Data labeling method, image comparison method and device

Country Status (1)

CN: CN113627413B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114779975A (en) * 2022-03-31 2022-07-22 北京至简墨奇科技有限公司 Processing method and device for finger and palm print image viewing interface and electronic system
CN114693467A (en) * 2022-04-29 2022-07-01 苏州康索机电有限公司 Production method and system of die casting
CN114821513B (en) * 2022-06-29 2022-09-09 威海凯思信息科技有限公司 Image processing method and device based on multilayer network and electronic equipment

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109255317A (en) * 2018-08-31 2019-01-22 西北工业大学 A kind of Aerial Images difference detecting method based on dual network
WO2019041569A1 (en) * 2017-09-01 2019-03-07 歌尔科技有限公司 Method and apparatus for marking moving target, and unmanned aerial vehicle
CN110147852A (en) * 2019-05-29 2019-08-20 北京达佳互联信息技术有限公司 Method, apparatus, equipment and the storage medium of image recognition
CN110458226A (en) * 2019-08-08 2019-11-15 上海商汤智能科技有限公司 Image labeling method and device, electronic equipment and storage medium
CN111612068A (en) * 2020-05-21 2020-09-01 腾讯科技(深圳)有限公司 Image annotation method and device, computer equipment and storage medium
CN111898535A (en) * 2020-07-30 2020-11-06 杭州海康威视数字技术股份有限公司 Target identification method, device and storage medium
CN112115748A (en) * 2019-06-21 2020-12-22 腾讯科技(深圳)有限公司 Certificate image identification method, certificate image identification device, terminal and storage medium
WO2021051887A1 (en) * 2019-09-20 2021-03-25 初速度(苏州)科技有限公司 Method and device for screening difficult samples
CN113239928A (en) * 2021-05-11 2021-08-10 北京百度网讯科技有限公司 Method, apparatus and program product for image difference detection and model training

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5738259B2 (en) * 2012-10-29 2015-06-17 株式会社沖データ Image reading apparatus and image forming apparatus
US9898676B2 (en) * 2016-01-13 2018-02-20 I-Shou University Method for determining the level of degradation of a road marking


Also Published As

Publication number Publication date
CN113627413A (en) 2021-11-09

Similar Documents

Publication Publication Date Title
CN113627413B (en) Data labeling method, image comparison method and device
CN110807361A (en) Human body recognition method and device, computer equipment and storage medium
CN111127509B (en) Target tracking method, apparatus and computer readable storage medium
CN109886208B (en) Object detection method and device, computer equipment and storage medium
CN111754386B (en) Image area shielding method, device, equipment and storage medium
CN113763228B (en) Image processing method, device, electronic equipment and storage medium
CN110675473B (en) Method, device, electronic equipment and medium for generating GIF dynamic diagram
CN110738185B (en) Form object identification method, form object identification device and storage medium
CN111857793B (en) Training method, device, equipment and storage medium of network model
CN112308103B (en) Method and device for generating training samples
CN111327819A (en) Method, device, electronic equipment and medium for selecting image
CN109754439B (en) Calibration method, calibration device, electronic equipment and medium
CN111586279A (en) Method, device and equipment for determining shooting state and storage medium
CN111931712B (en) Face recognition method, device, snapshot machine and system
WO2020244592A1 (en) Object pick and place detection system, method and apparatus
CN110853124B (en) Method, device, electronic equipment and medium for generating GIF dynamic diagram
CN111754564B (en) Video display method, device, equipment and storage medium
CN112395921B (en) Abnormal behavior detection method, device and system
CN111860064B (en) Video-based target detection method, device, equipment and storage medium
CN111444749B (en) Method and device for identifying road surface guide mark and storage medium
CN111241869B (en) Material checking method and device and computer readable storage medium
CN113592874B (en) Image display method, device and computer equipment
CN112990424B (en) Neural network model training method and device
CN111369684B (en) Target tracking method, device, equipment and storage medium
CN112184802B (en) Calibration frame adjusting method, device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant