CN112651451A - Image recognition method and device, electronic equipment and storage medium - Google Patents

Image recognition method and device, electronic equipment and storage medium Download PDF

Info

Publication number
CN112651451A
Authority
CN
China
Prior art keywords
image
feature
matrix
dimension
enhanced
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011606881.3A
Other languages
Chinese (zh)
Other versions
CN112651451B (en)
Inventor
宋希彬
周定富
方进
张良俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Baidu USA LLC
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Baidu USA LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd, Baidu USA LLC
Priority to CN202011606881.3A
Publication of CN112651451A
Application granted
Publication of CN112651451B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Medical Informatics (AREA)
  • Image Analysis (AREA)

Abstract

The present disclosure provides an image recognition method and apparatus, an electronic device, and a storage medium, relating to the technical field of image processing, and in particular to artificial intelligence fields such as computer vision and deep learning. The implementation scheme is: acquire an image to be recognized and extract its image features; perform dimension reduction on the image features to obtain dimension-reduced image features; enhance the dimension-reduced image features on the feature extraction channels to obtain first enhanced image features; enhance the dimension-reduced image features on the pixels to obtain second enhanced image features; and acquire the texture type of the image to be recognized based on the first enhanced image features and the second enhanced image features. By enhancing and fusing the image features in these two respects, the method and apparatus strengthen the expression capability of the image features and improve the accuracy of recognizing the texture types of images.

Description

Image recognition method and device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of image processing technology, and in particular, to the field of artificial intelligence techniques such as computer vision and deep learning.
Background
Texture information of images can be predicted with traditional machine learning or deep learning, both of which require additional training data sets. However, the nonlinear expression capability of traditional machine learning is often limited, and deep learning suffers from insufficient image feature extraction, so the prediction accuracy for the texture information of images is low.
Disclosure of Invention
The disclosure provides a method, an apparatus, an electronic device, a storage medium and a computer program product for image recognition.
According to a first aspect of the disclosure, an image recognition method is provided, which includes: acquiring an image to be recognized and extracting image features of the image to be recognized; performing dimension reduction processing on the image features to obtain dimension-reduced image features; enhancing the dimension-reduced image features on the feature extraction channels to obtain first enhanced image features; enhancing the dimension-reduced image features on the pixels to obtain second enhanced image features; and acquiring the texture type of the image to be recognized based on the first enhanced image feature and the second enhanced image feature.
According to a second aspect of the present disclosure, an image recognition apparatus is provided, comprising: a feature extraction module for acquiring an image to be recognized and extracting the image features of the image to be recognized; a dimension reduction module for performing dimension reduction processing on the image features to obtain dimension-reduced image features; a first enhancement module for enhancing the image features on the feature extraction channels to obtain first enhanced image features; a second enhancement module for enhancing the image features on the pixels to obtain second enhanced image features; and a texture recognition module for acquiring the texture type of the image to be recognized based on the first enhanced image feature and the second enhanced image feature.
According to a third aspect of the present disclosure, an electronic device is provided, wherein the electronic device comprises a processor and a memory; the processor reads executable program code stored in the memory and executes a program corresponding to that code to implement the image recognition method set forth in the first aspect above.
According to a fourth aspect of the present disclosure, a computer-readable storage medium is provided, on which a computer program is stored; when executed by a processor, the program implements the image recognition method set forth in the first aspect above.
According to a fifth aspect of the present disclosure, a computer program product is provided; when the instructions in the computer program product are executed by a processor, the image recognition method set forth in the first aspect above is implemented.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
fig. 1 is a schematic flow chart diagram of an image recognition method according to an embodiment of the present disclosure;
FIG. 2 is a schematic flow chart diagram of an image recognition method according to another embodiment of the present disclosure;
FIG. 3 is a schematic flow chart diagram of an image recognition method according to another embodiment of the present disclosure;
FIG. 4 is a schematic flow chart diagram of an image recognition method according to another embodiment of the present disclosure;
FIG. 5 is a schematic flow chart diagram of an image recognition method according to another embodiment of the present disclosure;
FIG. 6 is a block diagram of an image recognition device according to an embodiment of the present disclosure;
FIG. 7 is a block diagram of an image recognition device according to an embodiment of the present disclosure;
fig. 8 is a schematic block diagram of an electronic device of an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Image Processing refers to techniques that analyze an image with a computer to achieve a desired result, and generally means digital image processing. A digital image is a large two-dimensional array of elements called pixels whose values, called gray-scale values, are captured by industrial cameras, video cameras, scanners, and the like. Image processing techniques generally comprise three parts: image compression; enhancement and restoration; and matching, description, and recognition.
Deep Learning (DL) is a research direction in the field of Machine Learning (ML); it was introduced into machine learning to bring the field closer to its original goal, artificial intelligence. Deep learning learns the intrinsic laws and representation hierarchies of sample data, and the information obtained in the learning process is very helpful for interpreting data such as text, images, and sounds. Its ultimate aim is to give machines the same analytical learning ability as humans, so that they can recognize data such as text, images, and sounds. Deep learning is a complex machine learning approach whose results in speech and image recognition far exceed those of the earlier related art.
Computer Vision is the science of how to make machines "see": using cameras and computers in place of human eyes to identify, track, and measure targets, and further processing the captured images so that the result is better suited for human observation or for transmission to an instrument for detection. As a scientific discipline, computer vision studies the theories and techniques needed to build artificial intelligence systems that can acquire "information" from images or multidimensional data. The information referred to here is information, as defined by Shannon, that can be used to help make a "decision". Because perception can be viewed as extracting information from sensory signals, computer vision can also be viewed as the science of how to make an artificial system "perceive" from images or multidimensional data.
Artificial Intelligence (AI) is the discipline that studies how to make computers simulate certain human thinking processes and intelligent behaviors (such as learning, reasoning, thinking, and planning); it involves both hardware-level and software-level technologies. Artificial intelligence technologies generally include computer vision, speech recognition, natural language processing, machine learning/deep learning, big data processing, and knowledge graph techniques.
Fig. 1 is a schematic flowchart of an image recognition method according to an embodiment of the present disclosure. As shown in the figure, the image recognition method comprises the following steps:
s101, obtaining an image to be identified, and extracting image characteristics of the image to be identified.
In the embodiment of the present disclosure, the image to be recognized may be a pre-collected image or an image collected in real time. Optionally, the image is a color image.
After the image to be recognized is acquired, in order to recognize or classify it, image features of the image to be recognized need to be extracted. The image features may include, but are not limited to, color features, texture features, shape features, and spatial relationship features of the image.
Alternatively, the image features of the image to be recognized may be extracted through a deep learning model or a machine learning model, that is, the image to be recognized is input into a trained feature extraction network, and the image features may be extracted based on the feature extraction network.
S102, performing dimension reduction processing on the image features to obtain dimension-reduced image features.
In the embodiment of the present disclosure, the same feature information in an image feature may be described from different dimensions; for example, one piece of feature information may be described along the feature extraction channel, feature length, and feature width dimensions. To reduce the amount of data processing and to enable the subsequent matrix multiplications, the embodiment of the present disclosure may perform dimension reduction processing on the image features to obtain dimension-reduced image features.
S103, enhancing the dimension-reduced image features on the feature extraction channel to obtain first enhanced image features.
In implementation, a plurality of feature extraction channels are required in the feature extraction process to extract the image features of the image to be recognized. To address the problem of insufficient image feature extraction in the related art, the embodiment of the present disclosure may perform feature enhancement on the feature extraction channels, so as to obtain a first enhanced image feature. Optionally, the dimension-reduced image features are convolved by a plurality of convolution networks to obtain enhancement weights for the feature extraction channels, and the first enhanced image feature may be obtained based on these channel-level enhancement weights and the image features.
S104, enhancing the dimension-reduced image features on the pixels to obtain second enhanced image features.
To solve the problem of insufficient image feature extraction in the related art, the embodiment of the present disclosure may likewise perform feature enhancement at the pixel level to obtain a second enhanced image feature. Optionally, the dimension-reduced image features are convolved by a plurality of convolution networks to obtain an enhancement weight for each pixel, and the second enhanced image feature may be obtained based on these pixel-level enhancement weights and the image features.
S105, acquiring the texture type of the image to be recognized based on the first enhanced image feature and the second enhanced image feature.
After the first enhanced image feature and the second enhanced image feature are obtained, the two enhanced image features are fused to obtain a final image feature. Optionally, the first enhanced image feature and the second enhanced image feature are weighted to obtain the final target image feature. The two enhanced image features make the image representation more expressive, which helps improve the accuracy of image recognition.
After the final target image features are obtained, classification and recognition are carried out based on them, and the texture type of the image to be recognized can be obtained. Optionally, the target image features are classified and recognized by a trained texture classification model, which finally outputs the texture type corresponding to the image to be recognized. For example, the texture type may include soil, road surface, leaves, and the like.
According to the image recognition method, an image to be recognized is acquired and its image features are extracted; dimension reduction is performed on the image features to obtain dimension-reduced image features; the dimension-reduced image features are enhanced on the feature extraction channels to obtain first enhanced image features and on the pixels to obtain second enhanced image features; and the texture type of the image to be recognized is acquired based on the first and second enhanced image features. In the disclosure, after the image features are acquired, feature enhancement is performed in these two respects to strengthen the expression capability of the features, so that sufficient image features are available to improve the accuracy of classifying and recognizing the texture types of images.
Fig. 2 is a flowchart illustrating an image recognition method according to another embodiment of the disclosure. As shown in fig. 2, the image recognition method specifically includes the following steps:
s201, acquiring an image to be identified, and extracting image characteristics of the image to be identified.
S202, performing dimension reduction processing on the image features to obtain a first dimension reduction feature matrix and a second dimension reduction feature matrix.
The feature elements in the same row of the first dimension-reduction feature matrix belong to the same feature extraction channel, each column corresponds to one pixel, and the second dimension-reduction feature matrix is the transpose of the first dimension-reduction feature matrix. In the present disclosure, the dimension-reduced image features may comprise the first dimension-reduction feature matrix and the second dimension-reduction feature matrix, which are used to obtain the first enhanced image feature and the second enhanced image feature.
In implementation, the same feature information in the image features can be described from different dimensions; for example, one piece of feature information can be described along the feature extraction channel, feature length, and feature width dimensions. To reduce the amount of data processing and to enable matrix multiplication, the embodiment of the present disclosure may perform dimension reduction on the image features. Optionally, the feature length and feature width dimensions of the image features may be fused to obtain the first dimension-reduction feature matrix and the second dimension-reduction feature matrix, as sketched below.
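As an illustration, a minimal sketch of this fusion step is shown below, assuming a PyTorch-style B×C×H×W tensor layout (the concrete shapes are hypothetical; the patent does not prescribe a framework):

```python
import torch

# Hypothetical sizes: batch B, channels C, feature length H, feature width W.
B, C, H, W = 2, 64, 32, 32
feat = torch.randn(B, C, H, W)  # image features from the feature extraction network

# Fuse the feature length and feature width dimensions into one.
q = feat.view(B, C, H * W)   # first dimension-reduction feature matrix: C x (H*W)
h = q.transpose(1, 2)        # second matrix is its transpose: (H*W) x C
```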
S203, acquiring a first enhanced image feature and a second enhanced image feature based on the first dimension-reduction feature matrix and the second dimension-reduction feature matrix.
The process of acquiring the first enhanced image feature includes: multiplying the first dimension-reduction feature matrix by the second dimension-reduction feature matrix to obtain a first weight matrix corresponding to the feature extraction channels, and then obtaining the first enhanced image feature based on the image features and the first weight matrix. Optionally, a convolution operation is performed on the image features to obtain a first intermediate feature matrix, the first weight matrix is multiplied by the first intermediate feature matrix to obtain a second intermediate feature matrix, and the first and second intermediate feature matrices are added to obtain the first enhanced image feature. In the embodiment of the disclosure, feature enhancement on the feature extraction channels strengthens the feature extraction capability of the channels, so the extracted image features are more expressive, which further improves the accuracy of image recognition.
The process of obtaining the first enhanced image feature is explained with reference to fig. 3. As shown in fig. 3, the channel-level feature enhancement module includes a convolution unit 31, a convolution unit 32, a convolution unit 33, a first matrix multiplication unit 34, a normalization unit 35, a second matrix multiplication unit 36, and an adder 37.
The image feature F, of size C×W×H, serves as the input to the channel-level feature enhancement module, where C denotes the feature extraction channels, W denotes the feature width, and H denotes the feature length.
Convolution unit 31 and convolution unit 32 each perform a convolution operation on the image feature F (C×W×H) to reduce its dimensions, yielding the dimension-reduced image features: the first dimension-reduction feature matrix Qc of size C×(H*W) and the second dimension-reduction feature matrix Hc of size (H*W)×C, where Hc is the transpose of Qc. Qc and Hc are then fed into the first matrix multiplication unit 34, where they are multiplied to output a weight matrix Mc of size C×C; Mc is passed to the normalization unit 35, which applies softmax to obtain the first weight matrix M'c (C×C) corresponding to the feature extraction channels.
Convolution unit 33 performs a convolution operation on the image feature F (C×W×H) to obtain the first intermediate feature matrix Fc1 (C×H×W); the second matrix multiplication unit 36 then multiplies M'c (C×C) with Fc1 to obtain the enhanced second intermediate feature matrix Fh1 (C×H×W).
Further, the adder 37 adds the second intermediate feature matrix Fh1 and the first intermediate feature matrix Fc1 to obtain the final first enhanced image feature F1.
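A minimal PyTorch sketch of this channel-level module follows. The 1×1 convolution kernels and the softmax axis are assumptions; the patent specifies only the convolutions, the two matrix products, softmax normalization, and the residual addition:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelEnhancement(nn.Module):
    """Sketch of the channel-level enhancement of fig. 3 (units 31-37)."""

    def __init__(self, channels: int):
        super().__init__()
        self.conv_q = nn.Conv2d(channels, channels, kernel_size=1)  # convolution unit 31
        self.conv_h = nn.Conv2d(channels, channels, kernel_size=1)  # convolution unit 32
        self.conv_f = nn.Conv2d(channels, channels, kernel_size=1)  # convolution unit 33

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q = self.conv_q(x).view(b, c, h * w)                  # Qc: C x (H*W)
        k = self.conv_h(x).view(b, c, h * w).transpose(1, 2)  # Hc: (H*W) x C
        m = F.softmax(torch.bmm(q, k), dim=-1)                # units 34-35: M'c (C x C)
        f1 = self.conv_f(x).view(b, c, h * w)                 # first intermediate matrix Fc1
        fh = torch.bmm(m, f1)                                 # unit 36: Fh1 = M'c x Fc1
        return (fh + f1).view(b, c, h, w)                     # adder 37: F1 = Fh1 + Fc1
```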
The process of obtaining the second enhanced image feature includes: multiplying the second dimension-reduction feature matrix by the first dimension-reduction feature matrix to obtain a second weight matrix corresponding to the pixels, and obtaining the second enhanced image feature based on the image features and the second weight matrix. Optionally, a convolution operation is performed on the image features to obtain a third intermediate feature matrix, the second weight matrix is multiplied by the third intermediate feature matrix to obtain a fourth intermediate feature matrix, and the third and fourth intermediate feature matrices are added to obtain the second enhanced image feature. In the embodiment of the disclosure, feature enhancement on the pixels improves the expression capability of the image features, so the accuracy of image recognition can be improved.
The process of obtaining the second enhanced image feature is explained below with reference to fig. 4. As shown in fig. 4, the pixel-level feature enhancement module includes a convolution unit 41, a convolution unit 42, a convolution unit 43, a first matrix multiplication unit 44, a normalization unit 45, a second matrix multiplication unit 46, and an adder 47.
The image feature F (C×W×H) serves as the input to the pixel-level feature enhancement module.
Convolution unit 41 and convolution unit 42 each perform a convolution operation on the image feature F (C×W×H) to reduce its dimensions, yielding the dimension-reduced image features: the first dimension-reduction feature matrix Qc of size C×(H*W) and the second dimension-reduction feature matrix Hc of size (H*W)×C, where Hc is the transpose of Qc. Hc and Qc are then fed into the first matrix multiplication unit 44, where matrix multiplication yields a weight matrix MP of size (H*W)×(H*W); MP is passed to the normalization unit 45 for softmax normalization to obtain the second weight matrix M'P corresponding to the pixels.
Convolution unit 43 performs a convolution operation on the image feature F (C×W×H) to obtain the third intermediate feature matrix Fc2 (C×H×W); the second matrix multiplication unit 46 then multiplies M'P ((H*W)×(H*W)) with Fc2 to obtain the enhanced fourth intermediate feature matrix Fh2 (C×H×W).
Further, the adder 47 adds the fourth intermediate feature matrix Fh2 and the third intermediate feature matrix Fc2 channel-wise to obtain the final second enhanced image feature F2.
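The pixel-level module mirrors the channel-level sketch, with the multiplication order of the dimension-reduction matrices reversed so that the weight matrix is (H*W)×(H*W); the exact order of the product against Fc2 is an assumption chosen to make the shapes consistent:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PixelEnhancement(nn.Module):
    """Sketch of the pixel-level enhancement of fig. 4 (units 41-47)."""

    def __init__(self, channels: int):
        super().__init__()
        self.conv_q = nn.Conv2d(channels, channels, kernel_size=1)  # convolution unit 41
        self.conv_h = nn.Conv2d(channels, channels, kernel_size=1)  # convolution unit 42
        self.conv_f = nn.Conv2d(channels, channels, kernel_size=1)  # convolution unit 43

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q = self.conv_q(x).view(b, c, h * w)                  # Qc: C x (H*W)
        k = self.conv_h(x).view(b, c, h * w).transpose(1, 2)  # Hc: (H*W) x C
        m = F.softmax(torch.bmm(k, q), dim=-1)                # units 44-45: M'P ((H*W) x (H*W))
        f2 = self.conv_f(x).view(b, c, h * w)                 # third intermediate matrix Fc2
        fh = torch.bmm(f2, m)                                 # unit 46: Fh2
        return (fh + f2).view(b, c, h, w)                     # adder 47: F2 = Fh2 + Fc2
```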
S204, weighting the first enhanced image feature and the second enhanced image feature to obtain a target image feature.
In the embodiment of the present disclosure, obtaining the target image feature requires a weighted calculation based on the first enhanced image feature and the second enhanced image feature. Let the target image feature be F. As shown in figs. 3 and 4, F1 is the image feature obtained through channel-level enhancement and F2 is the image feature obtained through pixel-level enhancement. After the enhanced features are obtained, F1 and F2 are fused according to the weights of the first and second enhanced image features, that is, F = a × F1 + b × F2, where a and b are learnable weight parameters. It can be understood that a and b are tuned during the training and testing of the image texture recognition model of the embodiment of the present disclosure.
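A minimal sketch of the fusion, assuming scalar weights initialized to 1.0 (the patent says only that a and b are learnable):

```python
import torch
import torch.nn as nn

class WeightedFusion(nn.Module):
    """F = a * F1 + b * F2 with learnable scalar weights a and b."""

    def __init__(self):
        super().__init__()
        self.a = nn.Parameter(torch.ones(1))  # initialization to 1.0 is an assumption
        self.b = nn.Parameter(torch.ones(1))

    def forward(self, f1: torch.Tensor, f2: torch.Tensor) -> torch.Tensor:
        return self.a * f1 + self.b * f2
```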
S205, acquiring the texture type of the image to be recognized based on the target image features.
According to the image recognition method, an image to be recognized is acquired and its image features are extracted; dimension reduction is performed on the image features to obtain dimension-reduced image features; the dimension-reduced image features are enhanced on the feature extraction channels to obtain first enhanced image features and on the pixels to obtain second enhanced image features; and the texture type of the image to be recognized is acquired based on the first and second enhanced image features. In the disclosure, after the image features are acquired, feature enhancement is performed in these two respects to strengthen the expression capability of the features, so that sufficient image features are available to improve the accuracy of classifying and recognizing the texture types of images.
The image texture recognition model referred to in the above embodiments is explained below. First, a nonlinear mapping model is constructed, and then a training data set is acquired, where the training data set comprises sample images and the texture classes labeled for those sample images. The constructed nonlinear mapping model is trained on the training data set to finally obtain an image texture recognition model capable of recognizing image textures.
Alternatively, as shown in fig. 5, the network structure of the image classification and recognition model may include: a feature extraction layer 51; a feature enhancement layer 52, which comprises a channel-level feature enhancement sublayer 521 and a pixel-level feature enhancement sublayer 522; a feature fusion layer 53; a Fully Connected (FC) layer 54; and an L2 norm normalization (L2 norm) layer 55. The image to be recognized is input into the image classification and recognition model shown in fig. 5: image features are extracted by the feature extraction layer 51, channel-level and pixel-level feature enhancement is performed by the feature enhancement layer 52, feature fusion is performed by the feature fusion layer 53, the fused image features are fully connected by the FC layer 54, and finally the result is input into the L2 norm layer 55 for mapping to obtain the texture class of the image to be recognized.
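Putting the layers together, an end-to-end sketch of the fig. 5 pipeline might look as follows; the toy backbone, the 16×16 spatial size, and the class count are illustrative assumptions, and ChannelEnhancement, PixelEnhancement, and WeightedFusion are the sketches given above:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextureRecognitionModel(nn.Module):
    """Sketch of the fig. 5 pipeline: layers 51-55."""

    def __init__(self, channels: int = 256, num_classes: int = 10):
        super().__init__()
        self.backbone = nn.Sequential(                        # feature extraction layer 51
            nn.Conv2d(3, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d((16, 16)),
        )
        self.channel_branch = ChannelEnhancement(channels)    # sublayer 521
        self.pixel_branch = PixelEnhancement(channels)        # sublayer 522
        self.fusion = WeightedFusion()                        # feature fusion layer 53
        self.fc = nn.Linear(channels * 16 * 16, num_classes)  # FC layer 54

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        feat = self.backbone(image)
        fused = self.fusion(self.channel_branch(feat), self.pixel_branch(feat))
        logits = self.fc(fused.flatten(1))
        return F.normalize(logits, p=2, dim=1)                # L2 norm layer 55

# Hypothetical usage: scores for a batch of 4 RGB images.
model = TextureRecognitionModel()
scores = model(torch.randn(4, 3, 64, 64))  # shape (4, 10), L2-normalized per row
```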
Corresponding to the image recognition methods provided by the above embodiments, an embodiment of the present disclosure further provides an image recognition apparatus. Since the image recognition apparatus provided by the embodiment of the present disclosure corresponds to the image recognition methods provided by the above embodiments, the implementation of the image recognition method is also applicable to the image recognition apparatus and will not be described in detail in the following embodiments.
Fig. 6 is a schematic structural diagram of an image recognition apparatus according to another embodiment of the present disclosure. As shown in fig. 6, the image recognition apparatus 600 includes: a feature extraction module 61, a dimensionality reduction module 62, a first enhancement module 63, a second enhancement module 64, and a texture recognition module 65. Wherein:
the feature extraction module 61 is configured to acquire an image to be identified and extract image features of the image to be identified;
the dimension reduction module 62 is configured to perform dimension reduction processing on the image features to obtain dimension reduction image features;
a first enhancement module 63, configured to enhance the image features on the feature extraction channel to obtain first enhanced image features;
a second enhancement module 64 for enhancing the image feature in pixels to obtain a second enhanced image feature;
and the texture recognition module 65 is configured to obtain a texture type of the image to be recognized based on the first enhanced image feature and the second enhanced image feature.
The image recognition device acquires an image to be recognized and extracts its image features; performs dimension reduction on the image features to obtain dimension-reduced image features; enhances the dimension-reduced image features on the feature extraction channels to obtain first enhanced image features and on the pixels to obtain second enhanced image features; and acquires the texture type of the image to be recognized based on the first and second enhanced image features. In the disclosure, after the image features are acquired, feature enhancement is performed in these two respects to strengthen the expression capability of the features, so that sufficient image features are available to improve the accuracy of classifying and recognizing the texture types of images.
Fig. 7 is a schematic structural diagram of an image recognition apparatus according to another embodiment of the present disclosure. As shown in fig. 7, the image recognition apparatus 700 includes: a feature extraction module 71, a dimension reduction module 72, a first enhancement module 73, a second enhancement module 74 and a texture recognition module 75.
It should be noted that the feature extraction module 71, the dimension reduction module 72, the first enhancement module 73, the second enhancement module 74, and the texture recognition module 75 have the same structures and functions as the feature extraction module 61, the dimension reduction module 62, the first enhancement module 63, the second enhancement module 64, and the texture recognition module 65, respectively.
In the embodiment of the present disclosure, the dimension reduction module 72 is configured to fuse the feature length and feature width dimensions of the image features to obtain a first dimension-reduction feature matrix and a second dimension-reduction feature matrix, where feature elements in the same row of the first dimension-reduction feature matrix belong to the same feature extraction channel, each column element corresponds to one pixel, and the second dimension-reduction feature matrix is the transpose of the first dimension-reduction feature matrix. The first dimension-reduction feature matrix and the second dimension-reduction feature matrix are used to obtain the first enhanced image feature and the second enhanced image feature.
In the disclosed embodiment, the first enhancement module 73 includes a first matrix multiplication unit 731 and a first acquisition unit 732.
The first matrix multiplication unit 731 is configured to multiply the first dimension reduction feature matrix and the second dimension reduction feature matrix to obtain a first weight matrix corresponding to the feature extraction channel.
A first obtaining unit 732, configured to obtain a first enhanced image feature based on the image feature and the first weight matrix.
The first obtaining unit 732 is further configured to: perform a convolution operation on the image features to obtain a first intermediate feature matrix; multiply the first weight matrix by the first intermediate feature matrix to obtain a second intermediate feature matrix; and add the first intermediate feature matrix and the second intermediate feature matrix to obtain the first enhanced image feature.
In the embodiment of the present disclosure, the second enhancement module 74 includes a second matrix multiplication unit 741 and a second obtaining unit 742.
And a second matrix multiplication unit 741, configured to multiply the second dimension-reduced feature matrix and the first dimension-reduced feature matrix, and acquire a second weight matrix corresponding to the pixel.
A second obtaining unit 742 is configured to obtain a second enhanced image feature based on the image feature and the second weight matrix.
The second obtaining unit 742 is further configured to: perform a convolution operation on the image features to obtain a third intermediate feature matrix; multiply the second weight matrix by the third intermediate feature matrix to obtain a fourth intermediate feature matrix; and add the third intermediate feature matrix and the fourth intermediate feature matrix to obtain the second enhanced image feature.
The texture recognition module 75 in the embodiment of the present disclosure includes: a weighting unit 751 and a recognition unit 752.
The weighting unit 751 is configured to weight the first enhanced image feature and the second enhanced image feature to obtain a target image feature.
The identifying unit 752 is configured to identify a texture type of the image to be identified based on the target image feature.
In the disclosure, after the image features are acquired, feature enhancement is performed in these two respects to strengthen the expression capability of the features, so that sufficient image features are available to improve the accuracy of classifying and recognizing the texture types of images.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
FIG. 8 illustrates a schematic block diagram of an example electronic device 800 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 8, the device 800 includes a computing unit 801 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 802 or a computer program loaded from a storage unit 808 into a Random Access Memory (RAM) 803. In the RAM 803, various programs and data required for the operation of the device 800 can also be stored. The computing unit 801, the ROM 802, and the RAM 803 are connected to each other by a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.
A number of components in the device 800 are connected to the I/O interface 805, including: an input unit 806, such as a keyboard, a mouse, or the like; an output unit 807 such as various types of displays, speakers, and the like; a storage unit 808, such as a magnetic disk, optical disk, or the like; and a communication unit 809 such as a network card, modem, wireless communication transceiver, etc. The communication unit 809 allows the device 800 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The computing unit 801 may be any of various general and/or special purpose processing components with processing and computing capabilities. Some examples of the computing unit 801 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, or microcontroller. The computing unit 801 executes the methods and processes described above, such as the image recognition method. For example, in some embodiments, the image recognition method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 808. In some embodiments, part or all of the computer program can be loaded and/or installed onto the device 800 via the ROM 802 and/or the communication unit 809. When the computer program is loaded into the RAM 803 and executed by the computing unit 801, one or more steps of the image recognition method described above may be performed. Alternatively, in other embodiments, the computing unit 801 may be configured to perform the image recognition method in any other suitable manner (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuits, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems On Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include being implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), the internet, and blockchain networks.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or cloud host, which is a host product in the cloud computing service system that overcomes the defects of difficult management and weak service scalability found in traditional physical hosts and VPS (Virtual Private Server) services. The server may also be a server of a distributed system, or a server incorporating a blockchain.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved, and the present disclosure is not limited herein.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims (17)

1. An image recognition method, comprising:
acquiring an image to be identified, and extracting image characteristics of the image to be identified;
performing dimension reduction processing on the image features to obtain dimension-reduced image features;
enhancing the dimension-reduced image features on a feature extraction channel to obtain first enhanced image features;
enhancing the dimension-reduced image features on pixels to obtain second enhanced image features;
and acquiring the texture type of the image to be identified based on the first enhanced image feature and the second enhanced image feature.
2. The image recognition method according to claim 1, wherein the performing dimension reduction processing on the image feature to obtain a dimension-reduced image feature comprises:
fusing two dimensions of feature length and feature width in the image features to obtain a first dimension-reduction feature matrix and a second dimension-reduction feature matrix, wherein feature elements in the same row in the first dimension-reduction feature matrix belong to the same feature extraction channel, each column element corresponds to one pixel, and the second dimension-reduction feature matrix is a transposed matrix of the first dimension-reduction feature matrix;
the dimension-reduced image features comprise the first dimension-reduced feature matrix and the second dimension-reduced feature matrix, and are used for acquiring the first enhanced image features and the second enhanced image features.
3. The image recognition method of claim 2, wherein the enhancing the dimension-reduced image feature on a feature extraction channel to obtain a first enhanced image feature comprises:
multiplying the first dimension reduction feature matrix and the second dimension reduction feature matrix to obtain a first weight matrix corresponding to the feature extraction channel;
and acquiring the first enhanced image characteristic based on the image characteristic and the first weight matrix.
4. The image recognition method of claim 3, wherein the obtaining the first enhanced image feature based on the image feature and the first weight matrix comprises:
performing convolution operation on the image features to obtain a first intermediate feature matrix;
multiplying the first weight matrix and the first intermediate feature matrix to obtain a second intermediate feature matrix;
and adding the first intermediate feature matrix and the second intermediate feature matrix to obtain the first enhanced image feature.
5. The image recognition method of claim 2, wherein the enhancing the reduced-dimension image feature on a pixel to obtain a second enhanced image feature comprises:
multiplying the second dimension reduction feature matrix with the first dimension reduction feature matrix to obtain a second weight matrix corresponding to the pixel;
and acquiring the second enhanced image characteristic based on the image characteristic and the second weight matrix.
6. The image recognition method of claim 5, wherein the obtaining the second enhanced image feature based on the image feature and the second weight matrix comprises:
performing convolution operation on the image features to obtain a third intermediate feature matrix;
multiplying the second weight matrix with the third intermediate feature matrix to obtain a fourth intermediate feature matrix;
and adding the third intermediate feature matrix and the fourth intermediate feature matrix to obtain the second enhanced image feature.
7. The image recognition method according to any one of claims 1 to 6, wherein the obtaining the texture type of the image to be recognized based on the first enhanced image feature and the second enhanced image feature comprises:
weighting the first enhanced image feature and the second enhanced image feature to obtain a target image feature;
and identifying the texture type of the image to be identified based on the target image feature.
8. An image recognition apparatus comprising:
the feature extraction module is used for acquiring an image to be identified and extracting the image features of the image to be identified;
the dimension reduction module is used for carrying out dimension reduction processing on the image features to obtain dimension-reduced image features;
the first enhancement module is used for enhancing the image features on a feature extraction channel to obtain first enhanced image features;
the second enhancement module is used for enhancing the image features on pixels to obtain second enhanced image features;
and the texture recognition module is used for acquiring the texture type of the image to be identified based on the first enhanced image feature and the second enhanced image feature.
9. The image recognition device according to claim 8, wherein the dimension reduction module is configured to fuse two dimensions, namely a feature length and a feature width, in the image features to obtain a first dimension reduction feature matrix and a second dimension reduction feature matrix, wherein feature elements in a same row in the first dimension reduction feature matrix belong to a same feature extraction channel, each column element corresponds to one pixel, and the second dimension reduction feature matrix is a transposed matrix of the first dimension reduction feature matrix;
the first dimension reduction feature matrix and the second dimension reduction feature matrix are used for obtaining the first enhanced image feature and the second enhanced image feature.
10. The image recognition device of claim 9, wherein the first enhancement module comprises:
a first matrix multiplication unit, configured to multiply the first dimension reduction feature matrix and the second dimension reduction feature matrix to obtain a first weight matrix corresponding to the feature extraction channel;
a first obtaining unit, configured to obtain the first enhanced image feature based on the image feature and the first weight matrix.
11. The image recognition apparatus according to claim 10, wherein the first acquisition unit is further configured to:
performing convolution operation on the image features to obtain a first intermediate feature matrix;
multiplying the first weight matrix and the first intermediate feature matrix to obtain a second intermediate feature matrix;
and adding the first intermediate feature matrix and the second intermediate feature matrix to obtain the first enhanced image feature.
12. The image recognition device of claim 9, wherein the second enhancement module comprises:
a second matrix multiplication unit, configured to multiply the second dimension-reduced feature matrix and the first dimension-reduced feature matrix, and obtain a second weight matrix corresponding to the pixel;
a second obtaining unit, configured to obtain the second enhanced image feature based on the image feature and the second weight matrix.
13. The image recognition apparatus according to claim 12, wherein the second acquisition unit is further configured to:
performing convolution operation on the image features to obtain a third intermediate feature matrix;
multiplying the second weight matrix with the third intermediate feature matrix to obtain a fourth intermediate feature matrix;
and adding the third intermediate feature matrix and the fourth intermediate feature matrix to obtain the second enhanced image feature.
14. The image recognition device according to any one of claims 8 to 13, wherein the texture recognition module includes:
the weighting unit is used for weighting the first enhanced image feature and the second enhanced image feature to obtain a target image feature;
and the identification unit is used for identifying the texture type of the image to be identified based on the target image feature.
15. An electronic device, comprising:
at least one processor, and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the image recognition method of any one of claims 1-7.
16. A non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the image recognition method of any one of claims 1-7.
17. A computer program product comprising a computer program which, when executed by a processor, implements an image recognition method according to any one of claims 1-7.
CN202011606881.3A 2020-12-30 2020-12-30 Image recognition method, device, electronic equipment and storage medium Active CN112651451B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011606881.3A CN112651451B (en) 2020-12-30 2020-12-30 Image recognition method, device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011606881.3A CN112651451B (en) 2020-12-30 2020-12-30 Image recognition method, device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112651451A true CN112651451A (en) 2021-04-13
CN112651451B CN112651451B (en) 2023-08-11

Family

ID=75364329

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011606881.3A Active CN112651451B (en) 2020-12-30 2020-12-30 Image recognition method, device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112651451B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113343295A (en) * 2021-06-07 2021-09-03 支付宝(杭州)信息技术有限公司 Image processing method, device, equipment and storage medium based on privacy protection
CN114463584A (en) * 2022-01-29 2022-05-10 北京百度网讯科技有限公司 Image processing method, model training method, device, apparatus, storage medium, and program
CN117252449A (en) * 2023-11-20 2023-12-19 水润天府新材料有限公司 Full-penetration drainage low-noise pavement construction process and system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110111313A (en) * 2019-04-22 2019-08-09 腾讯科技(深圳)有限公司 Medical image detection method and relevant device based on deep learning
CN111126254A (en) * 2019-12-23 2020-05-08 Oppo广东移动通信有限公司 Image recognition method, device, equipment and storage medium
CN111428807A (en) * 2020-04-03 2020-07-17 桂林电子科技大学 Image processing method and computer-readable storage medium
US20200320369A1 (en) * 2018-03-30 2020-10-08 Tencent Technology (Shenzhen) Company Limited Image recognition method, apparatus, electronic device and storage medium
WO2020232886A1 (en) * 2019-05-21 2020-11-26 平安科技(深圳)有限公司 Video behavior identification method and apparatus, storage medium and server

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200320369A1 (en) * 2018-03-30 2020-10-08 Tencent Technology (Shenzhen) Company Limited Image recognition method, apparatus, electronic device and storage medium
CN110111313A (en) * 2019-04-22 2019-08-09 腾讯科技(深圳)有限公司 Medical image detection method and relevant device based on deep learning
WO2020232886A1 (en) * 2019-05-21 2020-11-26 平安科技(深圳)有限公司 Video behavior identification method and apparatus, storage medium and server
CN111126254A (en) * 2019-12-23 2020-05-08 Oppo广东移动通信有限公司 Image recognition method, device, equipment and storage medium
CN111428807A (en) * 2020-04-03 2020-07-17 桂林电子科技大学 Image processing method and computer-readable storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
XIBIN SONG et al.: "Channel Attention based Iterative Residual Learning for Depth Map Super-Resolution", arXiv *
吴睿曦; 肖秦琨: "Multi-object Image Recognition Based on Deep Networks and Data Augmentation", 国外电子测量技术 (Foreign Electronic Measurement Technology), no. 05
李军锋; 何双伯; 冯伟夏; 熊山; 薛江; 周青云: "Augmented Reality Transformer Image Recognition Technology Based on Improved CNN", 现代电子技术 (Modern Electronics Technique), no. 07

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113343295A (en) * 2021-06-07 2021-09-03 支付宝(杭州)信息技术有限公司 Image processing method, device, equipment and storage medium based on privacy protection
CN113343295B (en) * 2021-06-07 2023-01-24 支付宝(杭州)信息技术有限公司 Image processing method, device, equipment and storage medium based on privacy protection
CN114463584A (en) * 2022-01-29 2022-05-10 北京百度网讯科技有限公司 Image processing method, model training method, device, apparatus, storage medium, and program
CN117252449A (en) * 2023-11-20 2023-12-19 水润天府新材料有限公司 Full-penetration drainage low-noise pavement construction process and system
CN117252449B (en) * 2023-11-20 2024-01-30 水润天府新材料有限公司 Full-penetration drainage low-noise pavement construction process and system

Also Published As

Publication number Publication date
CN112651451B (en) 2023-08-11

Similar Documents

Publication Publication Date Title
CN112651451B (en) Image recognition method, device, electronic equipment and storage medium
CN112819007B (en) Image recognition method, device, electronic equipment and storage medium
CN112949710B (en) Image clustering method and device
CN115063875B (en) Model training method, image processing method and device and electronic equipment
CN113591918B (en) Training method of image processing model, image processing method, device and equipment
CN113177472A (en) Dynamic gesture recognition method, device, equipment and storage medium
CN112862006A (en) Training method and device for image depth information acquisition model and electronic equipment
CN114092759A (en) Training method and device of image recognition model, electronic equipment and storage medium
CN112561879A (en) Ambiguity evaluation model training method, image ambiguity evaluation method and device
CN113177449A (en) Face recognition method and device, computer equipment and storage medium
CN112862005A (en) Video classification method and device, electronic equipment and storage medium
CN112784732A (en) Method, device, equipment and medium for recognizing ground object type change and training model
CN114913339A (en) Training method and device of feature map extraction model
CN113888560A (en) Method, apparatus, device and storage medium for processing image
CN113569912A (en) Vehicle identification method and device, electronic equipment and storage medium
CN112580666A (en) Image feature extraction method, training method, device, electronic equipment and medium
CN111932530A (en) Three-dimensional object detection method, device and equipment and readable storage medium
CN115116111B (en) Anti-disturbance human face living body detection model training method and device and electronic equipment
CN112560848B (en) Training method and device for POI (Point of interest) pre-training model and electronic equipment
CN114741697B (en) Malicious code classification method and device, electronic equipment and medium
CN114881227A (en) Model compression method, image processing method, device and electronic equipment
CN113903071A (en) Face recognition method and device, electronic equipment and storage medium
CN113205131A (en) Image data processing method and device, road side equipment and cloud control platform
CN114049491A (en) Fingerprint segmentation model training method, fingerprint segmentation device, fingerprint segmentation equipment and fingerprint segmentation medium
CN112990201A (en) Text box detection method and device, electronic equipment and computer storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant