CN116206122A - Image recognition method and device, electronic equipment and storage medium

Image recognition method and device, electronic equipment and storage medium

Info

Publication number
CN116206122A
Authority
CN
China
Prior art keywords
image
identified
feature vector
feature
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111449639.4A
Other languages
Chinese (zh)
Inventor
李岩松 (Li Yansong)
宋建军 (Song Jianjun)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xinjiang Goldwind Science and Technology Co Ltd
Original Assignee
Xinjiang Goldwind Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xinjiang Goldwind Science and Technology Co Ltd
Priority to CN202111449639.4A
Publication of CN116206122A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses an image recognition method and device, an electronic device and a storage medium. The image recognition method comprises the following steps: inputting an image to be identified into a feature extraction model to obtain a first feature vector, wherein the first feature vector is used for representing the overall features of the image to be identified; extracting features of n sensitive areas in the image to be identified to obtain a second feature vector, wherein the sensitive areas comprise marked areas, the second feature vector is used for representing features of the sensitive areas, and n is a positive integer; performing fusion processing on the first feature vector and the second feature vector to obtain a feature descriptor of the image to be identified; and determining, according to the feature descriptor, whether the image to be identified duplicates an image in the image library. According to the embodiments of the application, whether a marked image is a duplicate image can be identified accurately.

Description

Image recognition method and device, electronic equipment and storage medium
Technical Field
The application belongs to the technical field of image processing, and particularly relates to an image identification method, an image identification device, electronic equipment and a storage medium.
Background
Current duplicate-picture screening techniques typically generate an image descriptor from the features of each picture and store the descriptors of all existing pictures in a database. When a new picture is input, a descriptor is generated for it by the same algorithm and compared with the descriptors of all the old pictures. If the similarity between the new picture's descriptor and that of every old picture is no greater than a specified threshold, the new picture is considered not to duplicate any old picture. This screening technique, however, is not suited to pictures of complex scenes. For example, pictures of a wind turbine quality inspection site cover many procedures and have complex features, so commonly used image descriptors are prone to misidentification. For marked images, the marks on a picture also affect the image descriptor and interfere with its generation, so pictures that carry different marks but duplicate content fail to be screened out. A duplicate-picture screening technique suitable for marked pictures is therefore needed.
Disclosure of Invention
The embodiments of the application provide an image identification method, device, equipment and storage medium, which can identify more accurately whether a marked image is a duplicate image.
In one aspect, an embodiment of the present application provides a method for identifying an image, including:
inputting the image to be identified into a feature extraction model to obtain a first feature vector, wherein the first feature vector is used for representing the overall features of the image to be identified;
extracting features of n sensitive areas in the image to be identified to obtain a second feature vector, wherein the sensitive areas comprise marked areas, the second feature vector is used for representing features of the sensitive areas, and n is a positive integer;
performing fusion processing on the first feature vector and the second feature vector to obtain a feature descriptor of the image to be identified;
and determining whether the image to be identified and the images in the image library are repeated or not according to the feature descriptors.
Further, the feature extraction model is a neural network model, and the neural network model sequentially comprises a convolution layer, a pooling layer, a residual layer, a full-connection layer and a classification layer;
inputting the image to be identified into a feature extraction model to obtain a first feature vector, wherein the method comprises the following steps:
inputting the image to be identified into a neural network model, and extracting an output result between the full-connection layer and the classification layer;
and generating a first feature vector according to the output result.
Further, generating a first feature vector according to the output result includes:
and performing dimension reduction processing on the output result to obtain a first feature vector.
Further, performing dimension reduction processing on the output result to obtain a first feature vector, including:
and extracting a plurality of principal components in the output result through principal component analysis to obtain a first feature vector.
Further, feature extraction is performed on n sensitive areas in the image to be identified to obtain a second feature vector, including:
extracting pixel values of each pixel point in each channel in n sensitive areas;
and merging the extracted pixel values according to the sequence of the pixel point positions and the channels to obtain a second feature vector.
Further, the n sensitive areas correspond to n corners in the image to be identified, or to n edges in the image to be identified, or to a collection of n total corners and edges in the image to be identified.
Further, the first feature vector is a p×1-dimensional vector, the second feature vector is a q×1-dimensional vector, p and q are positive integers, and the first feature vector and the second feature vector are fused to obtain a feature descriptor of the image to be identified, including:
and combining the first feature vector and the second feature vector, and taking the combined (p+q) multiplied by 1-dimensional vector as a feature descriptor of the image to be identified.
Further, determining whether the image to be identified and the image in the image library are repeated according to the feature descriptor comprises:
calculating the similarity between the feature descriptor of the image to be identified and the feature descriptor of each image in the image library; wherein the feature descriptors of each image in the image library are generated based on the same way as the feature descriptors of the images to be identified;
and determining that the image to be identified is repeated with the image in the image library in response to the similarity meeting the preset condition.
Further, after determining whether the image to be identified and the images in the image library are repeated according to the feature descriptor, the method further comprises:
in response to the image to be identified being a quality inspection image and the image library being a fault image library comprising m images of faulty components, determining that the component in the image to be identified is faulty when the image to be identified is repeated with a target image in the fault image library, wherein m is a positive integer.
In another aspect, an embodiment of the present application provides an apparatus for identifying an image, including:
the input unit is used for inputting the image to be identified into the feature extraction model to obtain a first feature vector, and the first feature vector is used for representing the overall features of the image to be identified;
the extraction unit is used for extracting features of n sensitive areas in the image to be identified to obtain second feature vectors, the sensitive areas comprise marked areas, the second feature vectors are used for representing features of the sensitive areas, and n is a positive integer;
the fusion unit is used for carrying out fusion processing on the first feature vector and the second feature vector to obtain a feature descriptor of the image to be identified;
and the determining unit is used for determining whether the image to be identified and the image in the image library are repeated or not according to the feature descriptor.
Further, the feature extraction model is a neural network model, and the neural network model sequentially comprises a convolution layer, a pooling layer, a residual layer, a full-connection layer and a classification layer;
the input unit includes:
the first extraction subunit is used for inputting the image to be identified into the neural network model and extracting an output result between the full-connection layer and the classification layer;
and the generating subunit is used for generating a first feature vector according to the output result.
Further, the generating subunit is further configured to perform a dimension reduction process on the output result, so as to obtain a first feature vector.
Further, the generating subunit is further configured to extract a plurality of principal components in the output result through principal component analysis, so as to obtain a first feature vector.
Further, the extraction unit includes:
the second extraction subunit is used for extracting the pixel value of each pixel point in each channel in the n sensitive areas;
and the first merging subunit is used for merging the extracted pixel values according to the sequence of the pixel point positions and the channels to obtain a second feature vector.
Further, the n sensitive areas correspond to n corners in the image to be identified, or to n edges in the image to be identified, or to a collection of n total corners and edges in the image to be identified.
Further, the first feature vector is a p×1-dimensional vector, the second feature vector is a q×1-dimensional vector, p and q are both positive integers, and the fusion unit includes:
and the second merging subunit is used for merging the first feature vector and the second feature vector, and taking the merged (p+q) multiplied by 1-dimensional vector as a feature descriptor of the image to be identified.
Further, the determining unit includes:
a calculating subunit, configured to calculate a similarity between a feature descriptor of the image to be identified and a feature descriptor of each image in the image library; wherein the feature descriptors of each image in the image library are generated based on the same way as the feature descriptors of the images to be identified;
and the determining subunit is used for determining that the image to be identified is repeated with the image in the image library in response to the similarity meeting the preset condition.
Further, the apparatus further comprises:
the fault determining unit is used for, after whether the image to be identified and the image in the image library are repeated has been determined according to the feature descriptor, in response to the image to be identified being a quality inspection image and the image library being a fault image library comprising m images of faulty components, determining that the component in the image to be identified is faulty when the image to be identified is repeated with a target image in the fault image library, wherein m is a positive integer.
In still another aspect, an embodiment of the present application provides an electronic device, including: a processor and a memory storing computer program instructions;
the processor, when executing the computer program instructions, implements the image recognition method as provided in the embodiments of the present application.
In yet another aspect, embodiments of the present application provide a storage medium having stored thereon computer program instructions that, when executed by a processor, implement a method for identifying an image as provided by embodiments of the present application.
According to the image identification method, device, equipment and storage medium of the embodiments of the application, the image to be identified is input into a feature extraction model to obtain a first feature vector representing its overall features, and feature extraction is performed on n sensitive areas of the image to obtain a second feature vector representing the features of the marked regions. The two feature vectors are then fused, and the resulting feature descriptor of the image to be identified is used to determine whether it duplicates an image in the image library. This addresses the technical problem in the related art that marked images are identified as duplicates with low accuracy. By extracting the features of the sensitive areas containing the marks, representing them as a feature vector, and fusing that vector with the feature vector representing the overall features of the image, a more accurate feature descriptor of the image is obtained, so that whether a marked image is a duplicate can be identified more accurately.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed in the embodiments are briefly described below; a person skilled in the art may derive other drawings from these drawings without inventive effort.
FIG. 1 is a flow chart of a method for recognizing an image according to an embodiment of the present application;
FIG. 2 is a schematic diagram of an image recognition method according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a sensitive area of an image recognition method according to one embodiment of the present application;
FIG. 4 is a schematic structural view of an image recognition device according to another embodiment of the present application;
fig. 5 is a schematic structural diagram of an electronic device according to another embodiment of the present application.
Detailed Description
Features and exemplary embodiments of various aspects of the present application are described in detail below. To make the objects, technical solutions and advantages of the present application clearer, the application is further described in conjunction with the drawings and specific embodiments. It should be understood that the specific embodiments described herein are intended to illustrate the application, not to limit it. It will be apparent to one skilled in the art that the present application may be practiced without some of these specific details. The following description of the embodiments is intended merely to provide a better understanding of the present application by way of example.
It is noted that relational terms such as first and second are used solely to distinguish one entity or action from another, and do not necessarily require or imply any actual such relationship or order between those entities or actions. Moreover, the terms "comprises", "comprising", or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to it. Without further limitation, an element introduced by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article or apparatus that comprises that element.
In order to solve the problems in the prior art, embodiments of the present application provide an image recognition method, apparatus, device, and storage medium. The following first describes a method for identifying an image provided in an embodiment of the present application.
Fig. 1 is a flow chart of a method for identifying an image according to an embodiment of the present application. As shown in fig. 1, the method may include the following steps 101-104:
and step 101, inputting the image to be identified into a feature extraction model to obtain a first feature vector.
The image to be identified is an image to be compared with the images in the image library.
The feature extraction model is a model for extracting features from an image. It may be an edge extraction algorithm model, a feature point extraction algorithm model, or the like, or a pre-trained neural network model for extracting the overall features of an image, such as a convolutional neural network (CNN) model or a recurrent neural network (RNN) model. Taking CNN as an example, a CNN is a feedforward neural network with a deep structure that involves convolution computation, and is one of the representative algorithms of deep learning. Considering factors such as accuracy, running speed and model size, a ResNet18 model may optionally be used as the feature extraction model.
After the image to be identified is input into the feature extraction model, a first feature vector characterizing its overall features is obtained. The overall features of an image are features drawn from the entire picture; their kinds may include edges, corner points, colors, object shapes and so on, depending on what the feature extraction model extracts. The first feature vector represents the overall features of the image through the values of its different dimensions. In one example, the first feature vector may be a 1200×1 vector comprising 1200 values.
In one example, the feature extraction model is a neural network model comprising, in order, a convolution layer, a pooling layer, a residual layer, a full-connection layer and a classification layer. To obtain the first feature vector, the image to be identified is input into the neural network model and the output result between the full-connection layer and the classification layer is extracted; the first feature vector is then generated from this output result. Since the classification layer merely classifies the output of the full-connection layer into a final class, the vector output by the full-connection layer just before the classification layer can represent the overall features of the image to be identified, and is extracted as the first feature vector.
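To make this step concrete, below is a minimal sketch in Python. The patent does not publish its network or weights, so torchvision's pretrained ResNet18 is an assumed stand-in for the feature extraction model, the 512-dimensional activations feeding its final classification layer stand in for the "output result between the full-connection layer and the classification layer", the 640×640 input size follows the resizing example later in the text, and the function name is illustrative.
```python
# Sketch of step 101: extract the first feature vector from the activations
# that feed the classification layer. torchvision's ResNet18 is an assumed
# stand-in for the patent's unspecified network.
import torch
from torchvision import models, transforms
from PIL import Image

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = torch.nn.Identity()  # drop the classification layer, keep the features
model.eval()

preprocess = transforms.Compose([
    transforms.Resize((640, 640)),  # unified picture size, as in the text
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],   # standard ImageNet stats
                         std=[0.229, 0.224, 0.225]),
])

def extract_first_feature_vector(path: str) -> torch.Tensor:
    """Return the vector characterizing the overall features of one image."""
    x = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        return model(x).squeeze(0)  # shape: (512,)
```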
The structure before the full-connection layer in the neural network model is mainly used for feature extraction. The earlier the layer, the simpler the features it extracts, such as straight lines, arcs, RGB channel values and gray values. As the layers deepen, the extracted features gradually grow more complex: straight lines and arcs combine into shapes such as circles, rectangles and hexagons, and RGB channel values and gray values combine into various colors. Eventually, features of high abstraction, such as nuts, bolts and meters, can be composed, and the classification layer finally assigns the image to one of the predefined classes according to the output of the full-connection layer.
Fig. 2 shows an exemplary schematic diagram of such a neural network model. After the original image is input, the network first extracts simple features such as lines, RGB values and gray values from the pixel values to obtain a first feature map; it then extracts higher-level features such as shapes and colors to obtain a second feature map; and it further extracts still higher-level features, such as the objects and environment contained in the image, to obtain a third feature map. Based on the third feature map, the ten-class classification layer in this example produces the final output, namely the class to which the image to be identified belongs.
In one example, to describe features accurately, the vector directly output by the full-connection layer usually has a high dimension; for instance, the extracted feature vector may have length 512 x 1000 = 512000. An overly long feature vector burdens storage and comparison, and would also greatly dilute the influence of the second feature vector on the feature descriptor. Therefore, when generating the first feature vector from the output result, the output result may be reduced in dimension, for example down to the same dimension as the second feature vector or a similar one.
When reducing the dimension of the output result, a number of principal components may be extracted from it through principal component analysis (PCA) to obtain the first feature vector. Principal component analysis is a multivariate statistical method that examines the correlation among multiple variables and reveals their internal structure through a few principal components: a few principal components are derived from the original variables such that they retain as much of the original information as possible while remaining uncorrelated with each other. Mathematically, the original P indices are linearly combined into new composite indices.
In one example of principal component analysis, the information carried by F1 (the first selected linear combination, i.e., the first composite index) is measured by its variance: the larger Var(F1) is, the more information F1 contains. F1 is therefore chosen as the linear combination with the largest variance among all linear combinations, and is called the first principal component. If the first principal component is insufficient to represent the information of the original P indices, a second linear combination F2 is selected; to reflect the original information effectively, the information already captured by F1 should not reappear in F2, which in mathematical terms requires Cov(F1, F2) = 0. F2 is then called the second principal component, and the third, fourth, ..., P-th principal components are constructed in the same way. Illustratively, a vector of length 512000 output by the full-connection layer may be reduced to a vector of length 1200.
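As a hedged illustration of this dimension reduction, the sketch below uses scikit-learn's PCA. The reducer must first be fitted on raw output vectors collected from many images (the number of retained components cannot exceed the number of training samples); the 1200-component target follows the example above, and the function names are assumptions, not the patent's API.
```python
# Sketch of the PCA dimension reduction: fit on a matrix of raw full-connection
# layer outputs (one row per image), then project each new raw vector onto the
# retained principal components.
import numpy as np
from sklearn.decomposition import PCA

def fit_reducer(raw_outputs: np.ndarray, n_components: int = 1200) -> PCA:
    """raw_outputs: shape (num_images, raw_dim); num_images must be >= n_components."""
    return PCA(n_components=n_components).fit(raw_outputs)

def first_feature_vector(reducer: PCA, raw_vector: np.ndarray) -> np.ndarray:
    """Reduce one raw output vector to the first feature vector."""
    return reducer.transform(raw_vector.reshape(1, -1)).ravel()
```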
And 102, extracting features of n sensitive areas in the image to be identified to obtain a second feature vector.
The sensitive areas are the areas of the image that contain marks, and n is a positive integer. For example, a mark may be a watermark, a date, or the like.
In one example, the region containing the mark may be fixed: when a user uploads an image to the system through a client, the client stamps the uploaded image with a watermark or similar mark, producing the image to be identified. When the marked region is fixed in this way, the sensitive areas can be specified in advance.
The second feature vector is used to characterize the sensitive region. Alternatively, the second feature vector may be calculated by a feature extraction model, and the feature extraction model for extracting the second feature vector may be different from the feature extraction model for extracting the first feature vector.
In some optional embodiments, when extracting features from the n sensitive areas of the image to be identified to obtain the second feature vector, the pixel value of every pixel point in the n sensitive areas may be extracted in each channel; that is, the pixel values of all pixel points in the n sensitive areas are obtained. Since a pixel value generally comprises at least one channel, the extracted pixel values can then be merged in the order of pixel position and channel to obtain the second feature vector.
When merging in the order of pixel positions and channels, the merging order may be specified in advance. For example, the sensitive areas may be numbered in advance (area 1, area 2, and so on), and each channel may have a corresponding identifier, such as R (red channel), G (green channel) and B (blue channel). In one alternative example, the pixel values of each pixel point are first merged into a vector in the order R, G, B; the per-pixel vectors within each area are then merged in turn, for example in order from the edge toward the center; and finally the vectors of all areas are merged in the order of the area numbers to obtain the final second feature vector.
The n sensitive areas may correspond to n corners of the image to be identified, to n edges of the image to be identified, or to a combination of corners and edges totaling n. Alternatively, since the marks may be placed at pre-designated positions in the image, the n sensitive areas can be determined from the regions where the marks are located; that is, the position, number and size of the sensitive areas may be specified in advance for the image to be identified.
For example, the n sensitive areas may correspond to the four corners of the image to be identified. As shown in Fig. 3, 100 pixel points may be sampled inward from each of the four corners (i.e., along the directions from the four corners toward the image center shown in Fig. 3). With three channels per pixel, this yields 100 (pixels) x 3 (channels) x 4 (sensitive areas) = 1200 values in total, which are merged in order to obtain the second feature vector.
In the above example, the pixel points sampled at a corner must not extend past the center of the image, so the sampled length is at most half of the smaller of the image's length and width. To unify the picture size, the image to be identified may first be resized so that its length and width are both 640, in which case the maximum length of the second feature vector is 640 (image side) x 0.5 (half the smaller side) x 3 (channels) x 4 (corners) = 3840. A minimal sketch of this sampling is given below.
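The sketch assumes a 640×640 resized image, with each sensitive area a run of pixels sampled inward from one corner. The text does not fix the exact sampling path, so the diagonal walk below is one plausible reading, and the helper name is hypothetical.
```python
# Sketch of step 102: sample pixel strips inward from the four corners and
# merge their R, G, B values in a fixed pixel/channel order.
import numpy as np
from PIL import Image

def second_feature_vector(path: str, strip_len: int = 100) -> np.ndarray:
    img = np.asarray(Image.open(path).convert("RGB").resize((640, 640)))
    h, w, _ = img.shape
    strip_len = min(strip_len, h // 2, w // 2)    # never cross the image center
    corners = [(0, 0, 1, 1), (0, w - 1, 1, -1),   # (start row, start col, row step, col step)
               (h - 1, 0, -1, 1), (h - 1, w - 1, -1, -1)]
    parts = []
    for r0, c0, dr, dc in corners:                # fixed area order: 1st..4th corner
        for i in range(strip_len):
            parts.append(img[r0 + i * dr, c0 + i * dc])  # R, G, B of one pixel
    return np.concatenate(parts).astype(np.float32)      # length = strip_len * 3 * 4
```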
And 103, carrying out fusion processing on the first feature vector and the second feature vector to obtain a feature descriptor of the image to be identified.
After the first and second feature vectors are obtained, i.e., a vector representing the overall features of the image to be identified and a vector representing the features of its sensitive areas, the two feature vectors are fused into a feature descriptor that represents both the overall features and the sensitive area features at once. The feature descriptor may be a vector. In one example, the fusion may assign different weights to the first and second feature vectors and compute a weighted combination to obtain the feature descriptor.
Alternatively, in another example, the fusion may simply concatenate the first and second feature vectors to obtain the feature descriptor. For example, if the first feature vector is a p×1 vector and the second feature vector is a q×1 vector, where p and q are positive integers, concatenating them yields a (p+q)×1 vector that serves as the feature descriptor of the image to be identified.
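A sketch of the concatenation variant, assuming numpy vectors; the weighted variant would instead scale the two parts before combining them.
```python
# Sketch of step 103: concatenate the p-dimensional first feature vector and
# the q-dimensional second feature vector into a (p+q)-dimensional descriptor.
import numpy as np

def fuse(first_vec: np.ndarray, second_vec: np.ndarray) -> np.ndarray:
    return np.concatenate([first_vec, second_vec])  # shape: (p + q,)
```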
Step 104, determining whether the image to be identified and the images in the image library are repeated or not according to the feature descriptors.
For the images in the image library, the feature descriptor of each image may be generated in the same way as the feature descriptor of the image to be identified. Whether the image to be identified duplicates an image in the library can then be determined by comparing feature descriptors.
For example, the similarity between the feature descriptor of the image to be identified and the feature descriptor of each image in the image library may be calculated, and the image to be identified is determined to duplicate a library image when the similarity meets a preset condition. The similarity may be computed, for example, with the vector cosine similarity formula; the embodiments of the application do not limit the choice of formula.
To speed up the computation, a feature descriptor may be generated and stored in advance for each image in the image library. When computing similarities, the stored descriptor of each image can then be read directly, saving the time and compute of regenerating the descriptors of the library images.
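The following sketch of step 104 uses vector cosine similarity, which the text names as one possible measure. The library of precomputed descriptors is modeled as a dict, and the 0.9 threshold is an illustrative value, not one given by the source.
```python
# Sketch of step 104: compare the query descriptor with each precomputed
# library descriptor and report the first duplicate found.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def find_duplicate(descriptor: np.ndarray, library: dict, threshold: float = 0.9):
    """Return the id of a library image the query duplicates, or None."""
    for image_id, stored in library.items():
        if cosine_similarity(descriptor, stored) >= threshold:
            return image_id
    return None
```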
In one application scenario, the image recognition method provided by the embodiments of the application can be used to diagnose component faults.
Specifically, the image to be identified may be a quality inspection image of a component taken by a quality inspector, and the image library is a fault image library storing m images of faulty components, where m is a positive integer. After determining from the feature descriptor whether the image to be identified duplicates an image in the library, the component in the image to be identified is determined to be faulty when the image is found to duplicate a target image in the fault image library.
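Reusing the hypothetical find_duplicate helper from the sketch above, this application reduces to one duplicate check against the fault image library:
```python
# Sketch of the fault-diagnosis application: a component is flagged as faulty
# when its quality inspection image duplicates any image in the fault library.
def component_is_faulty(inspection_descriptor, fault_library: dict) -> bool:
    return find_duplicate(inspection_descriptor, fault_library) is not None
```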
The image recognition method provided by the embodiments of the present application may be executed by an image recognition device, or by a control module within the image recognition device for executing the method. In the following, the image recognition device provided by the embodiments is described by taking as an example the case where the device itself performs the image recognition method; for details not elaborated here, reference may be made to the description of the image recognition method above.
Fig. 4 is a schematic structural diagram of an image recognition apparatus provided in an embodiment of the present application, which includes an input unit 41, an extraction unit 42, a fusion unit 43, and a determination unit 44.
The input unit 41 is configured to input an image to be identified into the feature extraction model, and obtain a first feature vector, where the first feature vector is used to characterize the overall feature of the image to be identified;
the extracting unit 42 is configured to perform feature extraction on n sensitive areas in the image to be identified, so as to obtain a second feature vector, where the sensitive areas include marked areas, the second feature vector is used to characterize features of the sensitive areas, and n is a positive integer;
the fusion unit 43 is configured to perform fusion processing on the first feature vector and the second feature vector to obtain a feature descriptor of the image to be identified;
the determining unit 44 is configured to determine whether the image to be identified and the images in the image library are repeated according to the feature descriptors.
Optionally, the feature extraction model may be a neural network model, where the neural network model sequentially includes a convolution layer, a pooling layer, a residual layer, a full connection layer, and a classification layer; the input unit 41 may include:
the first extraction subunit is used for inputting the image to be identified into the neural network model and extracting an output result between the full-connection layer and the classification layer;
and the generating subunit is used for generating a first feature vector according to the output result.
Optionally, the generating subunit may be further configured to perform a dimension reduction process on the output result to obtain a first feature vector.
Alternatively, the generating subunit may be further configured to extract a plurality of principal components in the output result by principal component analysis, to obtain the first feature vector.
Alternatively, the extraction unit 42 may include:
the second extraction subunit is used for extracting the pixel value of each pixel point in each channel in the n sensitive areas;
and the first merging subunit is used for merging the extracted pixel values according to the sequence of the pixel point positions and the channels to obtain a second feature vector.
Optionally, the n sensitive areas correspond to n corners in the image to be identified, or to n edges in the image to be identified, or to a collection of n total corners and edges in the image to be identified.
Alternatively, the first feature vector is a p×1-dimensional vector, the second feature vector is a q×1-dimensional vector, p and q are both positive integers, and the fusion unit 43 may include:
and the second merging subunit is used for merging the first feature vector and the second feature vector, and taking the merged (p+q) multiplied by 1-dimensional vector as a feature descriptor of the image to be identified.
Alternatively, the determining unit 44 may include:
a calculating subunit, configured to calculate a similarity between a feature descriptor of the image to be identified and a feature descriptor of each image in the image library; wherein the feature descriptors of each image in the image library are generated based on the same way as the feature descriptors of the images to be identified;
and the determining subunit is used for determining that the image to be identified is repeated with the image in the image library in response to the similarity meeting the preset condition.
Optionally, the apparatus may further include:
the fault determining unit is used for, after whether the image to be identified and the image in the image library are repeated has been determined according to the feature descriptor, in response to the image to be identified being a quality inspection image and the image library being a fault image library comprising m images of faulty components, determining that the component in the image to be identified is faulty when the image to be identified is repeated with a target image in the fault image library, wherein m is a positive integer.
With the image recognition device described above, the image to be identified is input into a feature extraction model to obtain a first feature vector representing its overall features, and feature extraction is performed on n sensitive areas of the image to obtain a second feature vector representing the features of the marked regions. The two feature vectors are then fused, and the resulting feature descriptor is used to determine whether the image to be identified duplicates an image in the image library. This addresses the technical problem in the related art that marked images are identified as duplicates with low accuracy. By extracting the features of the sensitive areas containing the marks, representing them as a feature vector, and fusing that vector with the feature vector representing the overall features of the image, a more accurate feature descriptor is obtained, so that whether a marked image is a duplicate can be identified more accurately.
Fig. 5 shows a schematic hardware structure of an electronic device according to an embodiment of the present application.
The electronic device may comprise a processor 301 and a memory 302 storing computer program instructions.
In particular, the processor 301 may include a Central Processing Unit (CPU), or an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), or may be configured to implement one or more integrated circuits of embodiments of the present application.
Memory 302 may include mass storage for data or instructions. By way of example, and not limitation, memory 302 may comprise a hard disk drive (HDD), a floppy disk drive, flash memory, an optical disk, a magneto-optical disk, magnetic tape, or a universal serial bus (USB) drive, or a combination of two or more of these. Memory 302 may include removable or non-removable (or fixed) media, where appropriate. Memory 302 may be internal or external to the integrated gateway disaster recovery device, where appropriate. In a particular embodiment, the memory 302 is a non-volatile solid-state memory.
The memory may include read-only memory (ROM), random access memory (RAM), magnetic disk storage media devices, optical storage media devices, flash memory devices, and electrical, optical, or other physical/tangible memory storage devices. Thus, in general, the memory includes one or more tangible (non-transitory) computer-readable storage media (e.g., memory devices) encoded with software comprising computer-executable instructions, and when the software is executed (e.g., by one or more processors), it is operable to perform the operations described with reference to the methods provided according to the embodiments of the present application.
The processor 301 implements any of the methods of the above embodiments by reading and executing computer program instructions stored in the memory 302.
In one example, the electronic device may also include a communication interface 303 and a bus 310. As shown in fig. 5, the processor 301, the memory 302, and the communication interface 303 are connected to each other by a bus 310 and perform communication with each other.
The communication interface 303 is mainly used to implement communication between each module, device, unit and/or apparatus in the embodiments of the present application.
Bus 310 includes hardware, software, or both, coupling the components of the electronic device to one another. By way of example, and not limitation, the bus may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a Front Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCI-X) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association local (VLB) bus, or another suitable bus, or a combination of two or more of these. Bus 310 may include one or more buses, where appropriate. Although the embodiments of the present application describe and illustrate a particular bus, the present application contemplates any suitable bus or interconnect.
It should be clear that the present application is not limited to the particular arrangements and processes described above and illustrated in the drawings. For the sake of brevity, a detailed description of known methods is omitted here. In the above embodiments, several specific steps are described and shown as examples. However, the method processes of the present application are not limited to the specific steps described and illustrated, and those skilled in the art can make various changes, modifications, and additions, or change the order between steps, after appreciating the spirit of the present application.
The functional blocks shown in the above-described structural block diagrams may be implemented in hardware, software, firmware, or a combination thereof. When implemented in hardware, it may be, for example, an electronic circuit, an Application Specific Integrated Circuit (ASIC), suitable firmware, a plug-in, a function card, or the like. When implemented in software, the elements of the present application are the programs or code segments used to perform the required tasks. The program or code segments may be stored in a machine readable medium or transmitted over transmission media or communication links by a data signal carried in a carrier wave. A "machine-readable medium" may include any medium that can store or transfer information. Examples of machine-readable media include electronic circuitry, semiconductor memory devices, ROM, flash memory, erasable ROM (EROM), floppy disks, CD-ROMs, optical disks, hard disks, fiber optic media, radio Frequency (RF) links, and the like. The code segments may be downloaded via computer networks such as the internet, intranets, etc.
It should also be noted that the exemplary embodiments mentioned in this application describe some methods or systems based on a series of steps or devices. However, the present application is not limited to the order of the above-described steps, that is, the steps may be performed in the order mentioned in the embodiments, may be different from the order in the embodiments, or several steps may be performed simultaneously.
Aspects of the present application are described above with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, enable the implementation of the functions/acts specified in the flowchart and/or block diagram block or blocks. Such a processor may be, but is not limited to being, a general purpose processor, a special purpose processor, an application specific processor, or a field programmable logic circuit. It will also be understood that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware which performs the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In the foregoing, only the specific embodiments of the present application are described, and it will be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the systems, modules and units described above may refer to the corresponding processes in the foregoing method embodiments, which are not repeated herein. It should be understood that the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive various equivalent modifications or substitutions within the technical scope of the present application, which are intended to be included in the scope of the present application.

Claims (20)

1. A method of recognizing an image, comprising:
inputting an image to be identified into a feature extraction model to obtain a first feature vector, wherein the first feature vector is used for representing the overall features of the image to be identified;
extracting features of n sensitive areas in the image to be identified to obtain a second feature vector, wherein the sensitive areas comprise marked areas, the second feature vector is used for representing features of the sensitive areas, and n is a positive integer;
performing fusion processing on the first feature vector and the second feature vector to obtain a feature descriptor of the image to be identified;
and determining whether the image to be identified and the images in the image library are repeated or not according to the feature descriptor.
2. The method of claim 1, wherein the feature extraction model is a neural network model comprising, in order, a convolutional layer, a pooling layer, a residual layer, a full-connection layer, and a classification layer;
inputting the image to be identified into a feature extraction model to obtain a first feature vector, wherein the method comprises the following steps:
inputting the image to be identified into the neural network model, and extracting an output result between the full-connection layer and the classification layer;
and generating the first feature vector according to the output result.
3. The method of claim 2, wherein generating the first feature vector from the output result comprises:
and performing dimension reduction processing on the output result to obtain the first feature vector.
4. A method according to claim 3, wherein said performing a dimension reduction process on said output result to obtain said first feature vector comprises:
and extracting a plurality of principal components in the output result through principal component analysis to obtain the first feature vector.
5. The method according to claim 1, wherein the feature extracting the n sensitive areas in the image to be identified to obtain a second feature vector includes:
extracting pixel values of each pixel point in each channel in the n sensitive areas;
and merging the extracted pixel values according to the sequence of the pixel point positions and the channels to obtain the second feature vector.
6. The method of claim 5, wherein the n sensitive areas correspond to n corners in the image to be identified, or to n edges in the image to be identified, or to a set of corners and edges in the image to be identified totaling n.
7. The method according to claim 1, wherein the first feature vector is a p×1-dimensional vector, the second feature vector is a q×1-dimensional vector, p and q are both positive integers, and the fusing the first feature vector and the second feature vector to obtain the feature descriptor of the image to be identified includes:
and merging the first feature vector and the second feature vector, and taking the merged (p+q) multiplied by 1-dimensional vector as a feature descriptor of the image to be identified.
8. The method of claim 1, wherein determining whether the image to be identified is duplicate with an image in an image library based on the feature descriptor comprises:
calculating the similarity between the feature descriptor of the image to be identified and the feature descriptor of each image in the image library; wherein the feature descriptor of each image in the image library is generated based on the same way as the feature descriptor of the image to be identified;
and determining that the image to be identified is repeated with the image in the image library in response to the similarity meeting a preset condition.
9. The method of claim 1, wherein after determining whether the image to be identified and an image in an image library are repeated based on the feature descriptor, the method further comprises:
in response to the image to be identified being a quality inspection image and the image library being a fault image library, wherein the fault image library comprises m images of faulty components, determining that the component in the image to be identified is faulty when the image to be identified is repeated with a target image in the fault image library, wherein m is a positive integer.
10. An apparatus for recognizing an image, the apparatus comprising:
the input unit is used for inputting the image to be identified into the feature extraction model to obtain a first feature vector, and the first feature vector is used for representing the overall features of the image to be identified;
the extraction unit is used for extracting features of n sensitive areas in the image to be identified to obtain a second feature vector, wherein the sensitive areas comprise marked areas, the second feature vector is used for representing the features of the sensitive areas, and n is a positive integer;
the fusion unit is used for carrying out fusion processing on the first feature vector and the second feature vector to obtain a feature descriptor of the image to be identified;
and the determining unit is used for determining whether the image to be identified and the image in the image library are repeated or not according to the feature descriptor.
11. The apparatus of claim 10, wherein the feature extraction model is a neural network model comprising, in order, a convolutional layer, a pooling layer, a residual layer, a full-connection layer, and a classification layer;
the input unit includes:
the first extraction subunit is used for inputting the image to be identified into the neural network model and extracting an output result between the full-connection layer and the classification layer;
and the generating subunit is used for generating the first feature vector according to the output result.
12. The apparatus of claim 11, wherein the generating subunit is further configured to perform a dimension reduction process on the output result to obtain the first feature vector.
13. The apparatus of claim 12, wherein the generating subunit is further configured to extract a plurality of principal components in the output result by principal component analysis to obtain the first feature vector.
14. The apparatus of claim 10, wherein the extraction unit comprises:
the second extraction subunit is used for extracting the pixel value of each pixel point in each channel in the n sensitive areas;
and the first merging subunit is used for merging the extracted pixel values according to the sequence of the pixel point positions and the channels to obtain the second feature vector.
15. The apparatus of claim 14, wherein the n sensitive areas correspond to n corners in an image to be identified, or to n edges in the image to be identified, or to a collection of n total corners and edges in the image to be identified.
16. The apparatus of claim 10, wherein the first feature vector is a p x 1-dimensional vector, the second feature vector is a q x 1-dimensional vector, p, q are both positive integers, and the fusion unit comprises:
and the second merging subunit is used for merging the first feature vector and the second feature vector, and taking the merged (p+q) multiplied by 1-dimensional vector as a feature descriptor of the image to be identified.
17. The apparatus according to claim 10, wherein the determining unit comprises:
a calculating subunit, configured to calculate a similarity between the feature descriptor of the image to be identified and the feature descriptor of each image in the image library; wherein the feature descriptor of each image in the image library is generated based on the same way as the feature descriptor of the image to be identified;
and the determining subunit is used for determining that the image to be identified is repeated with the image in the image library in response to the similarity meeting a preset condition.
18. The apparatus of claim 10, wherein the apparatus further comprises:
and the fault determining unit is used for, after whether the image to be identified and the image in the image library are repeated is determined according to the feature descriptor, in response to the image to be identified being a quality inspection image and the image library being a fault image library comprising m images of faulty components, determining that the component in the image to be identified is faulty when the image to be identified is repeated with a target image in the fault image library, wherein m is a positive integer.
19. An electronic device, the electronic device comprising: a processor and a memory storing computer program instructions;
the processor, when executing the computer program instructions, implements the method of identifying images according to any one of claims 1-9.
20. A storage medium having stored thereon computer program instructions which, when executed by a processor, implement the method of identifying an image according to any of claims 1-9.
CN202111449639.4A 2021-11-30 2021-11-30 Image recognition method and device, electronic equipment and storage medium Pending CN116206122A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111449639.4A CN116206122A (en) 2021-11-30 2021-11-30 Image recognition method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111449639.4A CN116206122A (en) 2021-11-30 2021-11-30 Image recognition method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116206122A 2023-06-02

Family

ID=86517879

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111449639.4A Pending CN116206122A (en) 2021-11-30 2021-11-30 Image recognition method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116206122A (en)

Similar Documents

Publication Publication Date Title
Hayder et al. Boundary-aware instance segmentation
Hayder et al. Shape-aware instance segmentation
CN112990282B (en) Classification method and device for fine-granularity small sample images
CN113781391A (en) Image defect detection method and related equipment
CN110852327A (en) Image processing method, image processing device, electronic equipment and storage medium
JP2018142189A (en) Program, distance measuring method, and distance measuring device
CN115375917B (en) Target edge feature extraction method, device, terminal and storage medium
CN111626357B (en) Image identification method based on neural network model
CN111325265B (en) Detection method and device for tampered image
CN112183542A (en) Text image-based recognition method, device, equipment and medium
CN114581654A (en) Mutual inductor based state monitoring method and device
WO2022222036A1 (en) Method and apparatus for determining parking space
CN112348011B (en) Vehicle damage assessment method and device and storage medium
CN113269195A (en) Reading table image character recognition method and device and readable storage medium
CN112287905A (en) Vehicle damage identification method, device, equipment and storage medium
CN112966687A (en) Image segmentation model training method and device and communication equipment
Kumar et al. PCB defect classification using logical combination of segmented copper and non-copper part
CN110610177A (en) Training method of character recognition model, character recognition method and device
CN115345806A (en) Object detection method and device, electronic equipment and readable storage medium
CN116206122A (en) Image recognition method and device, electronic equipment and storage medium
Al-Jaberi et al. Topological Data Analysis for Image Forgery Detection
CN115311652A (en) Object detection method and device, electronic equipment and readable storage medium
CN115601293A (en) Object detection method and device, electronic equipment and readable storage medium
CN113298102B (en) Training method and device for target classification model
CN117173545B (en) License original identification method based on computer graphics

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: No.107 Shanghai Road, Urumqi Economic and Technological Development Zone, Urumqi City, Xinjiang Uygur Autonomous Region

Applicant after: Jinfeng Technology Co.,Ltd.

Address before: No.107 Shanghai Road, Urumqi Economic and Technological Development Zone, Urumqi City, Xinjiang Uygur Autonomous Region

Applicant before: XINJIANG GOLDWIND SCIENCE & TECHNOLOGY Co.,Ltd.