CN113076942A - Method, device, chip and computer readable storage medium for detecting preset mark

Info

Publication number: CN113076942A
Application number: CN202010004169.XA
Authority: CN
Inventors: 章子誉
Original assignee: Shanghai Yitu Network Science and Technology Co Ltd
Current assignee: Shanghai Yitu Network Science and Technology Co Ltd
Priority date: 2020-01-03
Filing date: 2020-01-03
Publication date: 2021-07-06

Abstract

The invention provides a preset mark detection method, a preset mark detection device, a chip and a computer-readable storage medium. The preset mark detection method comprises the following steps: detecting, based on a neural network, a first region in the image to be detected that may be the preset mark; obtaining at least one key sub-region in the first region by a key sub-region searching method; extracting feature vectors of the key sub-regions; comparing the feature vectors of the key sub-regions with the feature vectors of the preset mark extracted in advance to obtain the similarity between them; and determining the image where a key sub-region with a similarity larger than the first preset threshold is located as an image containing the preset mark. By obtaining at least one key sub-region in the region that may contain the preset mark through the key sub-region searching method, the method and the device reduce detection errors caused by deformation and improve detection accuracy.

Description

Method, device, chip and computer readable storage medium for detecting preset mark

Technical Field

The present invention relates to the field of face recognition, and in particular, to a method, an apparatus, a chip, and a computer-readable storage medium for detecting a preset flag.

Background

Detecting specific marks in videos has important application value, for example in surveillance videos and in the detection of terrorism-related marks. The traditional approach is manual detection and identification, which incurs high labor cost.

Current target detection and recognition algorithms generally analyze complex features in a multi-scale sliding-window manner, and their operating efficiency is often very low. From a technical point of view, moreover, a particular mark may appear on different carriers, for example on a non-deformable object or on an easily deformable one, and may therefore take on different forms.

In addition, when a detection algorithm is used for the search, the mark often appears on the surfaces of different objects, such as clothes, flags and other flexible materials. There it is easily deformed and cannot be treated as a standardized pattern, so the detection effect is poor.

Disclosure of Invention

In order to solve the problems in the prior art, at least one embodiment of the present invention provides a method, an apparatus, a chip and a computer-readable storage medium for detecting a preset flag, which can reduce detection errors caused by deformation.

In a first aspect, an embodiment of the present invention provides a method for detecting a preset flag, including: detecting a first region in an image to be detected based on a neural network, wherein the first region is a region with the probability of a preset mark exceeding a preset value; obtaining at least one key subregion in the first region by a key subregion searching method; extracting feature vectors of the key sub-regions; comparing the feature vectors of the key subregions with the feature vectors of the preset marks extracted in advance to obtain the similarity of the feature vectors of the key subregions and the feature vectors of the preset marks; and determining the image where the key sub-region with the similarity larger than the first preset threshold is located as the image containing the preset mark.

In some embodiments, a preset flag detection method further comprises: acquiring video information; and dividing the video into a plurality of images to be detected according to a preset rule.

In some embodiments, a preset flag detection method further comprises: expanding the range of the first area to form a second area; and extracting the feature vector of the second region, comparing the feature vector of the second region with the feature vector of the pre-extracted preset mark, acquiring the similarity of the feature vectors of the second region and the feature vector of the pre-extracted preset mark, and removing the second region with the similarity smaller than a second preset threshold value.

In some embodiments, a preset flag detection method further comprises: and removing the key subarea with the quality not meeting the preset requirement.

In some embodiments, a preset flag detection method further comprises: and when the preset mark is determined to be the mark needing reminding and alarming, outputting the image which is determined to contain the preset mark so as to remind and alarm.

In a second aspect, an embodiment of the present invention further provides a preset flag detecting apparatus, including: the detection unit is used for detecting a first region in the image to be detected based on the neural network, wherein the first region comprises a region with the probability of a preset mark exceeding a preset value; a key sub-region acquisition unit, configured to acquire at least one key sub-region in the first region by a key sub-region search method; a feature vector extraction unit, configured to extract a feature vector of the key sub-region; the similarity obtaining unit is used for comparing the feature vectors of the key subregions with the feature vectors of the preset marks extracted in advance and determining the similarity of the feature vectors of the key subregions and the feature vectors of the preset marks; and the determining unit is used for determining that the image where the key sub-area with the similarity larger than the first preset threshold is located is the image containing the preset mark.

In some embodiments, the apparatus for detecting a preset mark further includes a video obtaining unit, configured to obtain video information; and the segmentation unit is used for segmenting the video into a plurality of images to be detected according to a preset rule.

In some embodiments, a preset flag detecting apparatus further includes a false alarm removing unit configured to: expanding the range of the first area to form a second area; and extracting the feature vector of the second region, comparing the feature vector of the second region with the feature vector of the pre-extracted preset mark, acquiring the similarity of the feature vectors of the second region and the pre-extracted feature vector of the preset mark, and removing the second region with the similarity smaller than a second preset threshold value.

In some embodiments, the preset mark detection apparatus further comprises a filtering unit for removing critical sub-regions whose quality does not meet preset requirements.

In some embodiments, the device for detecting a preset mark further includes a reminding unit, configured to output, when it is determined that the preset mark is a mark that needs to be reminded of an alarm, an image that is determined to include the preset mark so as to remind of the alarm.

In a third aspect, an embodiment of the present invention further provides a preset flag detecting apparatus, including: at least one processor; a memory coupled with the at least one processor, the memory storing executable instructions, wherein the executable instructions, when executed by the at least one processor, cause performance of the method of any of the first aspects above.

In a fourth aspect, an embodiment of the present invention further provides a chip, configured to perform the method in the first aspect. Specifically, the chip includes: a processor for calling and running the computer program from the memory so that the device on which the chip is installed is used for executing the method of the first aspect.

In a fifth aspect, the present invention also provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the method according to any one of the above first aspects.

In a sixth aspect, an embodiment of the present invention further provides a computer program product, which includes computer program instructions, and the computer program instructions make a computer execute the method in the first aspect.

Therefore, at least one key sub-region is obtained in a region which may be the preset mark by the key sub-region searching method, so that detection errors caused by deformation are reduced, and the detection accuracy is improved.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required in the description of the embodiments or the prior art are briefly described below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without inventive labor.

FIG. 1 is a flowchart illustrating a method for detecting a preset mark according to an embodiment of the present invention;

FIG. 2 is a flowchart illustrating a method for detecting a preset mark according to another embodiment of the present invention;

FIG. 3 is a schematic diagram of an embodiment of a preset mark detection apparatus according to the present invention;

FIG. 4 is a schematic diagram of another embodiment of a preset mark detection apparatus according to the present invention.

Detailed description of the preferred embodiments

In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.

It is noted that, in this document, relational terms such as "first" and "second," and the like, may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions.

The inventor finds that in the prior art, when faces in a video are de-identified to protect privacy, the posture characteristics of the face, such as expressions of joy, anger or sadness, or the raising or lowering of the head, cannot be preserved, resulting in a loss of value for many commercial applications. The embodiment of the invention provides the following scheme:

Fig. 1 is a flowchart of an embodiment of a method for detecting a preset mark according to the present invention. As shown in Fig. 1, the method for detecting a preset mark according to the first aspect of the present invention includes:

Step 101, detecting a first region in an image to be detected based on a neural network, wherein the first region is a region in which the probability of a preset mark exceeds a preset value. The first region is a region which may be or may include a preset mark; for example, all regions that may be mark objects are detected by a neural-network-based object detection algorithm (e.g., SSD, Fast R-CNN, etc.). This step locks onto the preset mark by delimiting a larger range.

Step 102, obtaining at least one key sub-region in the first region by a key sub-region searching method. Specifically, for example, selective search is used to obtain at least one key sub-region in the image to be detected; or at least one key sub-region is obtained in the image to be detected through the target detection algorithm SSD.

Step 103, extracting the feature vectors of the key sub-regions. Specifically, the feature vectors are extracted by a pre-trained feature extraction network, for example a feature extraction network trained based on an adaptive coding network.

Step 104, comparing the feature vectors of the key sub-regions with the feature vectors of the preset mark extracted in advance to obtain the similarity between them. It is understood that the feature vector of the preset mark may be extracted in advance with the same network as that used to extract the feature vectors of the key sub-regions. In particular, the similarity may be calculated as the distance between the feature vector of the key sub-region and that of the preset mark. The distance may be an L2 distance (Euclidean distance), a cos distance (cosine distance), or the like, which is not limited in the embodiment of the present invention. A smaller distance corresponds to a higher similarity; if the distance is below a certain threshold, the mark is considered similar.

Step 105, determining the image where the key sub-region with the similarity larger than the first preset threshold is located as the image containing the preset mark. That is, the mark is considered similar when the L2 distance, or alternatively the cos distance, is below a certain threshold.
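The comparison in steps 104 and 105 can be illustrated with a minimal Python sketch. It is not the patent's implementation: the function names `l2_distance`, `cosine_distance` and `is_similar`, and the parameter `max_distance`, are hypothetical, and cosine distance is assumed here to mean 1 minus the cosine similarity.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """Cosine distance, assumed here to be 1 - cosine similarity."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (norm_a * norm_b)


def is_similar(region_vec, mark_vec, max_distance, metric=l2_distance):
    """Treat a key sub-region as matching the preset mark when the chosen
    distance is below the (hypothetical) threshold max_distance."""
    return metric(region_vec, mark_vec) < max_distance
```

Because a small distance means a high similarity, "similarity larger than the first preset threshold" and "distance below a certain threshold" describe the same decision seen from two sides.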

According to the embodiment of the invention, a larger range is defined firstly, the preset marker is locked, and then at least one key subregion is obtained in the region possibly containing the preset marker by a key subregion searching method.

Optionally, the method for detecting a preset mark further includes: acquiring video information; and dividing the video into a plurality of images to be detected according to a preset rule. Correspondingly, images may be captured from the video stream either continuously or at certain time intervals for detection. Capturing images continuously avoids missed detections, while capturing images at certain time intervals reduces the number of images to process and thus improves efficiency.
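As a rough illustration of such a preset rule, the sketch below selects which frame indices of a video to treat as images to be detected. The function `sample_frame_indices` and its parameters are hypothetical; the patent does not specify how the rule is expressed.

```python
def sample_frame_indices(total_frames, fps, interval_seconds=None):
    """Return the indices of the frames to use as images to be detected.

    interval_seconds=None keeps every frame (continuous capture);
    otherwise roughly one frame is kept per interval.
    """
    if total_frames < 0 or fps <= 0:
        raise ValueError("total_frames must be >= 0 and fps must be > 0")
    if interval_seconds is None:
        return list(range(total_frames))
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be > 0")
    # Never step by less than one frame, even for very short intervals.
    step = max(1, round(fps * interval_seconds))
    return list(range(0, total_frames, step))
```

With a 25 fps video, a one-second interval keeps one frame in 25, which trades some risk of missed detections for a much smaller workload.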

Optionally, the method for detecting a preset mark further includes: expanding the range of the first region to form a second region; and extracting the feature vector of the second region, comparing the feature vector of the second region with the feature vector of the preset mark extracted in advance to obtain the similarity between them, and removing any second region whose similarity is smaller than a second preset threshold. That is, the region which may be the preset mark is expanded outward, and false alarms are removed by comparing the feature vector of the second region, obtained by expanding the first region, with the feature vector of the preset mark. For example, suppose the preset mark is a cross-shaped mark, and the similarity between the feature vector of a key sub-region in the first region and the feature vector of the preset mark extracted in advance is 90% or more. After the range of the first region is expanded to form the second region, however, the pattern turns out to be shaped like the character "米", whose four additional strokes are small compared with the cross at its center. Comparing only the feature vectors of the key sub-region therefore gives a high similarity, but once the comparison is extended to the second region, the similarity is found to be lower. In this embodiment, removing false alarms in this way further improves the detection accuracy.
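A minimal sketch of the outward expansion and the false-alarm filter might look as follows. The box format `(x1, y1, x2, y2)`, the `scale` factor, and both function names are assumptions made for illustration; the patent does not state how far the first region is expanded.

```python
def expand_region(box, image_width, image_height, scale=1.5):
    """Expand an (x1, y1, x2, y2) box about its center by `scale`,
    clipping the result to the image bounds."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    half_w = (x2 - x1) * scale / 2.0
    half_h = (y2 - y1) * scale / 2.0
    return (
        max(0.0, cx - half_w),
        max(0.0, cy - half_h),
        min(float(image_width), cx + half_w),
        min(float(image_height), cy + half_h),
    )


def remove_false_alarms(second_regions, similarities, second_threshold):
    """Drop every second region whose similarity to the preset mark
    is smaller than the second preset threshold."""
    return [
        region
        for region, similarity in zip(second_regions, similarities)
        if similarity >= second_threshold
    ]
```

In the cross versus "米" example, the expanded box would take in the four extra strokes, so its similarity to the cross would fall below the second threshold and the region would be dropped.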

Optionally, the method for detecting a preset mark further includes: removing key sub-regions whose quality does not meet a preset requirement. For example, a region whose number of pixels is smaller than a certain value, a region whose resolution is lower than a preset value, or a region that does not meet other preset requirements is removed, which is not limited in this application. Specifically, the key sub-regions that do not meet the preset requirement may be filtered out by a pre-trained classification network. In this way, regions of poor quality are filtered out.
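The pixel-count criterion can be sketched with a simple rule-based filter. This is only an illustration: the thresholds `min_pixels` and `min_side` are hypothetical, the shortest side is used as a stand-in for "resolution", and the pre-trained classification network mentioned above is not modeled.

```python
def filter_low_quality(sub_regions, min_pixels=1024, min_side=16):
    """Keep only (x1, y1, x2, y2) key sub-regions that are large enough.

    A region is removed when its pixel count is below min_pixels or when
    its shorter side is below min_side (both thresholds are assumptions).
    """
    kept = []
    for x1, y1, x2, y2 in sub_regions:
        width, height = x2 - x1, y2 - y1
        if width * height >= min_pixels and min(width, height) >= min_side:
            kept.append((x1, y1, x2, y2))
    return kept
```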

Optionally, the method for detecting a preset mark further includes: when the preset mark is determined to be a mark that requires an alert, outputting the image determined to contain the preset mark so as to raise the alert. Specifically, if a mark that requires an alert is among the several most similar marks, the mark in that region of the picture is considered to be a mark that requires an alert.
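One possible reading of this rule is a top-k check over the marks ranked by similarity. In the sketch below, the function `needs_alert`, the dictionary layout, and the default `top_k` are all hypothetical; the patent does not say how many of the most similar marks are examined.

```python
def needs_alert(similarities, alert_marks, top_k=3):
    """Return True if any of the top_k most similar marks requires an alert.

    similarities: dict mapping a mark name to its similarity score with
                  the detected region (higher means more similar).
    alert_marks:  set of mark names that require an alert.
    """
    ranked = sorted(similarities, key=similarities.get, reverse=True)
    return any(name in alert_marks for name in ranked[:top_k])
```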

Fig. 2 is a flowchart of another embodiment of the method for detecting a preset mark according to the present invention. As shown in Fig. 2, the present embodiment provides a method for detecting a preset mark, including:

step 201, acquiring video information;

step 202, dividing a video into a plurality of images to be detected according to a preset rule;

step 203, detecting a first region in the image to be detected based on a neural network, wherein the first region is a region with the probability of a preset mark exceeding a preset value;

step 204, expanding the range of the first area to form a second area;

step 205, extracting the feature vector of the second region, comparing the feature vector of the second region with the feature vector of the pre-extracted preset mark, obtaining the similarity of the feature vectors of the second region and the pre-extracted feature vector of the preset mark, and removing the second region with the similarity smaller than a second preset threshold;

step 206, removing the key sub-regions with quality not meeting the preset requirement;

step 207, extracting feature vectors of the key sub-regions;

step 208, comparing the feature vectors of the key sub-regions with the feature vectors of the pre-extracted preset marks to obtain the similarity of the feature vectors of the key sub-regions and the feature vectors of the pre-extracted preset marks;

step 209, determining an image in which the key sub-region with the similarity greater than a first preset threshold is located as an image containing a preset mark;

step 210, when the preset mark is determined to be the mark needing to be reminded of alarming, outputting an image determined to contain the preset mark so as to remind of alarming.
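The flow of steps 203 to 209 can be outlined as a single loop over the images to be detected. The sketch below is only an outline under stated assumptions: every callable passed in (`detect_first_regions`, `expand`, `search_sub_regions`, `quality_ok`, `embed`, `similarity`) is a hypothetical placeholder for a component the patent describes, and the key sub-region search is assumed to run after the false-alarm check, since the listed steps do not state where it occurs.

```python
def detect_preset_mark(frames, detect_first_regions, expand, search_sub_regions,
                       quality_ok, embed, similarity, mark_vector,
                       first_threshold, second_threshold):
    """Return the indices of the frames judged to contain the preset mark.

    frames is the list of images produced by steps 201-202.
    """
    hits = []
    for index, frame in enumerate(frames):
        for first in detect_first_regions(frame):            # step 203
            second = expand(first, frame)                     # step 204
            second_score = similarity(embed(frame, second), mark_vector)
            if second_score < second_threshold:               # step 205
                continue                                      # false alarm
            # Assumed position of the key sub-region search, then step 206.
            subs = [s for s in search_sub_regions(frame, first)
                    if quality_ok(frame, s)]
            # Steps 207-209: embed each sub-region and compare with the mark.
            if any(similarity(embed(frame, s), mark_vector) > first_threshold
                   for s in subs):
                hits.append(index)
                break                                         # one hit per frame
    return hits
```

The frames returned here are the candidates that step 210 would output when the matched mark is one that requires an alert.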

Fig. 3 is a schematic diagram of an embodiment of a preset mark detection apparatus according to the present invention, as shown in fig. 3, according to a second aspect, the preset mark detection apparatus according to the second aspect of the present invention includes:

the detection unit 301 is configured to detect a first region in an image to be detected based on a neural network, where the first region includes a region in which the probability of a preset mark exceeds a preset value;

a key sub-region obtaining unit 302, configured to obtain at least one key sub-region in the first region by using a key sub-region searching method;

a feature vector extraction unit 303, configured to extract feature vectors of the key sub-regions;

a similarity obtaining unit 304, configured to compare the feature vectors of the key sub-regions with feature vectors of pre-extracted preset marks, and determine similarity between the feature vectors of the key sub-regions and the feature vectors of the pre-extracted preset marks;

the determining unit 305 is configured to determine that an image where the key sub-region with the similarity greater than the first preset threshold is located is an image containing a preset mark.

Optionally, the preset mark detection apparatus of the present invention further includes a video obtaining unit, configured to obtain video information; and the segmentation unit is used for segmenting the video into a plurality of images to be detected according to a preset rule.

Optionally, the preset mark detection apparatus of the present invention further includes a false alarm removal unit, configured to: expanding the range of the first area to form a second area; and extracting the feature vector of the second region, comparing the feature vector of the second region with the feature vector of the pre-extracted preset mark, acquiring the similarity of the feature vectors of the second region and the pre-extracted feature vector of the preset mark, and removing the second region with the similarity smaller than a second preset threshold value.

Optionally, the preset mark detection apparatus of the present invention further includes a filtering unit, configured to remove a critical sub-area whose quality does not meet a preset requirement. For example, a region where the pixel point is smaller than a certain value, or a region where the resolution is lower than a preset value, or a region that does not meet other preset requirements is removed, which is not limited in this application.

Optionally, the preset mark detection device of the present invention further includes a reminding unit, configured to output the image determined to include the preset mark so as to remind an alarm when the preset mark is determined to be a mark that needs to remind an alarm.

Fig. 4 is a schematic diagram of another embodiment of the preset mark detection apparatus according to the present invention, and as shown in fig. 4, the present embodiment provides a preset mark detection apparatus, including:

a video acquisition unit 306 for acquiring video information;

a dividing unit 307, configured to divide the video into multiple images to be detected according to a preset rule;

a false alarm removal unit 308, configured to: expand the range of the first region to form a second region; and extract the feature vector of the second region, compare the feature vector of the second region with the feature vector of the preset mark extracted in advance to obtain the similarity between them, and remove any second region whose similarity is smaller than a second preset threshold;

a filtering unit 309, configured to remove key sub-regions whose quality does not meet a preset requirement;

a key sub-region obtaining unit 302, configured to obtain at least one key sub-region in the first region by using a key sub-region search method;

a reminding unit 310, configured to, when the preset mark is determined to be a mark that requires an alert, output the image determined to contain the preset mark so as to raise the alert.

The specific technical details of the above preset mark detection apparatus are similar to those of the preset mark detection method, and the technical effects achievable in the embodiments of the method can also be achieved in the embodiments of the apparatus; they are not repeated here. Accordingly, the relevant technical details mentioned in the embodiments of the preset mark detection method can also be applied in the embodiments of the preset mark detection apparatus.

The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments can be referred to each other, and each embodiment focuses on the differences from the other embodiments.

In a third aspect, the present invention further provides a preset flag detecting apparatus, including:

at least one processor; a memory coupled to the at least one processor, the memory storing executable instructions, wherein the executable instructions, when executed by the at least one processor, cause the method of the first aspect of the invention to be carried out.

The present embodiment provides a preset mark detection apparatus, including: at least one processor; a memory coupled to the at least one processor. The processor and the memory may be provided separately or may be integrated together.

For example, the memory may include random access memory, flash memory, read only memory, programmable read only memory, non-volatile memory or registers, and the like. The processor may be a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), or the like. The memory may store executable instructions. The processor may execute executable instructions stored in the memory to implement the various processes described herein.

It will be appreciated that the memory in this embodiment can be either volatile memory or non-volatile memory, or can include both volatile and non-volatile memory. The non-volatile memory may be a ROM (read-only memory), a PROM (programmable read-only memory), an EPROM (erasable programmable read-only memory), an EEPROM (electrically erasable programmable read-only memory), or a flash memory. The volatile memory may be a RAM (random access memory), which serves as an external cache. By way of illustration and not limitation, many forms of RAM are available, such as SRAM (static RAM, static random access memory), DRAM (dynamic RAM, dynamic random access memory), SDRAM (synchronous DRAM, synchronous dynamic random access memory), DDR SDRAM (double data rate SDRAM, double data rate synchronous DRAM), ESDRAM (enhanced SDRAM, enhanced synchronous DRAM), SLDRAM (SyncLink DRAM, synchronous link DRAM), and DRRAM (direct Rambus RAM, direct Rambus random access memory). The memory described herein is intended to comprise, without being limited to, these and any other suitable types of memory.

In some embodiments, the memory stores the following elements, upgrade packages, executable units, or data structures, or a subset or an extended set thereof: an operating system and application programs.

The operating system includes various system programs, such as a framework layer, a core library layer, a driver layer, and the like, and is used for implementing various basic services and processing hardware-based tasks. The application programs comprise various application programs and are used for realizing various application services. The program for implementing the method of the embodiment of the present invention may be included in the application program.

In an embodiment of the present invention, the processor is configured to execute the method steps provided in the first aspect by calling a program or an instruction stored in the memory, specifically, a program or an instruction stored in the application program.

Furthermore, in a fifth aspect, the present invention also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method of the first aspect of the present invention.

For example, the machine-readable storage medium may include, but is not limited to, various known and unknown types of non-volatile memory.

Those of skill in the art would understand that the elements and algorithm steps of the examples described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the technical solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.

In the embodiments of the present application, the disclosed system, apparatus and method may be implemented in other ways. For example, the division of the unit is only one logic function division, and there may be another division manner in actual implementation. For example, multiple units or components may be combined or may be integrated into another system. In addition, the coupling between the respective units may be direct coupling or indirect coupling. In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or may exist separately and physically.

It should be understood that, in the various embodiments of the present application, the sequence numbers of the processes do not mean the execution sequence, and the execution sequence of the processes should be determined by the functions and the inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.

The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a machine-readable storage medium. Therefore, the technical solution of the present application may be embodied in the form of a software product, which may be stored in a machine-readable storage medium and may include several instructions to cause an electronic device to perform all or part of the processes of the technical solution described in the embodiments of the present application. The storage medium may include various media that can store program codes, such as ROM, RAM, a removable disk, a hard disk, a magnetic disk, or an optical disk.

The above description is only for the specific embodiments of the present application, and the scope of the present application is not limited thereto. Those skilled in the art can make changes or substitutions within the technical scope disclosed in the present application, and such changes or substitutions should be within the protective scope of the present application.

Claims

1. A method for detecting a preset mark, comprising:

detecting a first region in an image to be detected based on a neural network, wherein the first region is a region with the probability of a preset mark exceeding a preset value;

obtaining at least one key subregion in the first region by a key subregion searching method;

extracting feature vectors of the key sub-regions;

comparing the feature vectors of the key subregions with the feature vectors of the preset marks extracted in advance to obtain the similarity of the feature vectors of the key subregions and the feature vectors of the preset marks;

and determining the image where the key sub-region with the similarity larger than a preset threshold is located as the image containing the preset mark.

2. The method of claim 1, further comprising:

acquiring video information;

and dividing the video into a plurality of images to be detected according to a preset rule.

3. The method of claim 1, further comprising:

expanding the range of the first area to form a second area;

and extracting the feature vector of the second region, comparing the feature vector of the second region with the feature vector of the pre-extracted preset mark, acquiring the similarity of the feature vectors of the second region and the feature vector of the pre-extracted preset mark, and removing the second region with the similarity smaller than a second preset threshold value.

4. The method of claim 1, further comprising:

and removing the key subarea with the quality not meeting the preset requirement.

5. The method of claim 1, further comprising:

and when the preset mark is determined to be the mark needing reminding and alarming, outputting the image which is determined to contain the preset mark to remind alarming.

6. A preset mark detection device, comprising:

the detection unit is used for detecting a first region in the image to be detected based on the neural network, wherein the first region comprises a region with the probability of a preset mark exceeding a preset value;

a key sub-region acquisition unit, configured to acquire at least one key sub-region in the first region by a key sub-region search method;

a feature vector extraction unit, configured to extract a feature vector of the key sub-region;

the similarity obtaining unit is used for comparing the feature vectors of the key subregions with the feature vectors of the preset marks extracted in advance and determining the similarity of the feature vectors of the key subregions and the feature vectors of the preset marks;

and the determining unit is used for determining that the image where the key sub-area with the similarity larger than the first preset threshold is located is the image containing the preset mark.

7. The apparatus of claim 6, further comprising:

the video acquisition unit is used for acquiring video information;

and the segmentation unit is used for segmenting the video into a plurality of images to be detected according to a preset rule.

8. A preset mark detection device comprising:

at least one processor;

a memory coupled with the at least one processor, the memory storing executable instructions, wherein the executable instructions, when executed by the at least one processor, cause the method of any of claims 1-5 to be implemented.

9. A chip, comprising: a processor for calling and running the computer program from the memory so that the device in which the chip is installed performs: the method of any one of claims 1 to 5.

10. A computer-readable storage medium, characterized in that a computer program is stored thereon, which computer program, when being executed by a processor, realizes the steps of the method according to any one of the claims 1 to 5.