CN112651451A - Image recognition method and device, electronic equipment and storage medium - Google Patents

Image recognition method and device, electronic equipment and storage medium Download PDF

Info

Publication number
CN112651451A
Authority
CN
China
Prior art keywords
image
feature
matrix
dimension
enhanced
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011606881.3A
Other languages
Chinese (zh)
Other versions
CN112651451B (en)
Inventor
宋希彬
周定富
方进
张良俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Baidu USA LLC
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Baidu USA LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd, Baidu USA LLC
Priority to CN202011606881.3A
Publication of CN112651451A
Application granted
Publication of CN112651451B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Medical Informatics (AREA)
  • Image Analysis (AREA)

Abstract

The present disclosure provides an image recognition method and apparatus, an electronic device, and a storage medium, relating to the technical field of image processing, and in particular to artificial intelligence fields such as computer vision and deep learning. The implementation scheme is: acquire an image to be recognized and extract its image features; perform dimension reduction on the image features to obtain dimension-reduced image features; enhance the dimension-reduced image features on the feature extraction channels to obtain first enhanced image features; enhance the dimension-reduced image features on the pixels to obtain second enhanced image features; and acquire the texture type of the image to be recognized based on the first enhanced image features and the second enhanced image features. By enhancing and fusing the image features in these two respects, the method and apparatus strengthen the expression capability of the image features and improve the accuracy of recognizing the texture types of images.

Description

Image recognition method and device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of image processing technology, and in particular, to the field of artificial intelligence techniques such as computer vision and deep learning.
Background
Texture information of images can be predicted with traditional machine learning or deep learning, both of which require additional training data sets. However, the nonlinear expression capability of traditional machine learning is often limited, and deep learning suffers from insufficient image feature extraction, so the prediction accuracy for the texture information of images is low.
Disclosure of Invention
The disclosure provides a method, an apparatus, an electronic device, a storage medium and a computer program product for image recognition.
According to a first aspect of the disclosure, an image recognition method is provided, which includes: acquiring an image to be recognized and extracting image features of the image to be recognized; performing dimension reduction processing on the image features to obtain dimension-reduced image features; enhancing the dimension-reduced image features on the feature extraction channels to obtain first enhanced image features; enhancing the dimension-reduced image features on the pixels to obtain second enhanced image features; and acquiring the texture type of the image to be recognized based on the first enhanced image feature and the second enhanced image feature.
According to a second aspect of the present disclosure, an image recognition apparatus is provided, comprising: a feature extraction module for acquiring an image to be recognized and extracting the image features of the image to be recognized; a dimension reduction module for performing dimension reduction processing on the image features to obtain dimension-reduced image features; a first enhancement module for enhancing the image features on the feature extraction channels to obtain first enhanced image features; a second enhancement module for enhancing the image features on the pixels to obtain second enhanced image features; and a texture recognition module for acquiring the texture type of the image to be recognized based on the first enhanced image feature and the second enhanced image feature.
According to a third aspect of the present disclosure, an electronic device is provided, wherein the electronic device comprises a processor and a memory; the processor reads executable program code stored in the memory and executes a program corresponding to that code to implement the image recognition method set forth in the first aspect above.
According to a fourth aspect of the present disclosure, a computer-readable storage medium is provided, on which a computer program is stored; when executed by a processor, the program implements the image recognition method set forth in the first aspect above.
According to a fifth aspect of the present disclosure, a computer program product is provided; when the instructions in the computer program product are executed by a processor, the image recognition method set forth in the first aspect above is implemented.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
fig. 1 is a schematic flow chart diagram of an image recognition method according to an embodiment of the present disclosure;
FIG. 2 is a schematic flow chart diagram of an image recognition method according to another embodiment of the present disclosure;
FIG. 3 is a schematic flow chart diagram of an image recognition method according to another embodiment of the present disclosure;
FIG. 4 is a schematic flow chart diagram of an image recognition method according to another embodiment of the present disclosure;
FIG. 5 is a schematic flow chart diagram of an image recognition method according to another embodiment of the present disclosure;
FIG. 6 is a block diagram of an image recognition device according to an embodiment of the present disclosure;
FIG. 7 is a block diagram of an image recognition device according to an embodiment of the present disclosure;
fig. 8 is a schematic block diagram of an electronic device of an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Image Processing refers to techniques that analyze an image with a computer to achieve a desired result, and generally means digital image processing. A digital image is a large two-dimensional array of elements called pixels whose values, called gray-scale values, are captured by industrial cameras, video cameras, scanners, and the like. Image processing techniques generally comprise three parts: image compression; enhancement and restoration; and matching, description, and recognition.
Deep Learning (DL) is a research direction in the field of Machine Learning (ML); it was introduced into machine learning to bring the field closer to its original goal, artificial intelligence. Deep learning learns the intrinsic laws and representation hierarchies of sample data, and the information obtained in the learning process is very helpful for interpreting data such as text, images, and sounds. Its ultimate aim is to give machines the same analytical learning ability as humans, so that they can recognize data such as text, images, and sounds. Deep learning is a complex machine learning approach whose results in speech and image recognition far exceed those of the earlier related art.
Computer Vision is the science of how to make machines "see": using cameras and computers in place of human eyes to identify, track, and measure targets, and further processing the captured images so that the result is better suited for human observation or for transmission to an instrument for detection. As a scientific discipline, computer vision studies the theories and techniques needed to build artificial intelligence systems that can acquire "information" from images or multidimensional data. The information referred to here is information, as defined by Shannon, that can be used to help make a "decision". Because perception can be viewed as extracting information from sensory signals, computer vision can also be viewed as the science of how to make an artificial system "perceive" from images or multidimensional data.
Artificial Intelligence (AI) is the discipline that studies how to make computers simulate certain human thinking processes and intelligent behaviors (such as learning, reasoning, thinking, and planning); it involves both hardware-level and software-level technologies. Artificial intelligence technologies generally include computer vision, speech recognition, natural language processing, machine learning/deep learning, big data processing, and knowledge graph techniques.
Fig. 1 is a schematic flowchart of an image recognition method according to an embodiment of the present disclosure. As shown in the figure, the image recognition method comprises the following steps:
s101, obtaining an image to be identified, and extracting image characteristics of the image to be identified.
In the embodiment of the present disclosure, the image to be recognized may be a pre-collected image or an image collected in real time. Optionally, the image is a color image.
After the image to be recognized is acquired, in order to recognize or classify it, image features of the image to be recognized need to be extracted. The image features may include, but are not limited to, color features, texture features, shape features, and spatial relationship features of the image.
Alternatively, the image features of the image to be recognized may be extracted through a deep learning model or a machine learning model, that is, the image to be recognized is input into a trained feature extraction network, and the image features may be extracted based on the feature extraction network.
S102, performing dimension reduction processing on the image features to obtain dimension-reduced image features.
In the embodiment of the present disclosure, the same feature information in an image feature may be described from different dimensions; for example, one piece of feature information may be described along the feature extraction channel, feature length, and feature width dimensions. To reduce the amount of data processing and to enable the subsequent matrix multiplications, the embodiment of the present disclosure may perform dimension reduction processing on the image features to obtain dimension-reduced image features.
S103, enhancing the dimension-reduced image features on the feature extraction channel to obtain first enhanced image features.
In implementation, a plurality of feature extraction channels are required in the feature extraction process to extract the image features of the image to be recognized. To address the problem of insufficient image feature extraction in the related art, the embodiment of the present disclosure may perform feature enhancement on the feature extraction channels, so as to obtain a first enhanced image feature. Optionally, the dimension-reduced image features are convolved by a plurality of convolution networks to obtain enhancement weights for the feature extraction channels, and the first enhanced image feature may be obtained based on these channel-level enhancement weights and the image features.
S104, enhancing the dimension-reduced image features on the pixels to obtain second enhanced image features.
To solve the problem of insufficient image feature extraction in the related art, the embodiment of the present disclosure may likewise perform feature enhancement at the pixel level to obtain a second enhanced image feature. Optionally, the dimension-reduced image features are convolved by a plurality of convolution networks to obtain an enhancement weight for each pixel, and the second enhanced image feature may be obtained based on these pixel-level enhancement weights and the image features.
S105, acquiring the texture type of the image to be recognized based on the first enhanced image feature and the second enhanced image feature.
After the first enhanced image feature and the second enhanced image feature are obtained, the two enhanced image features are fused to obtain a final image feature. Optionally, the first enhanced image feature and the second enhanced image feature are weighted to obtain the final target image feature. The two enhanced image features make the image representation more expressive, which helps improve the accuracy of image recognition.
After the final target image features are obtained, classification and recognition are carried out based on them, and the texture type of the image to be recognized can be obtained. Optionally, the target image features are classified and recognized by a trained texture classification model, which finally outputs the texture type corresponding to the image to be recognized. For example, the texture type may include soil, road surface, leaves, and the like.
According to the image recognition method, an image to be recognized is acquired and its image features are extracted; dimension reduction is performed on the image features to obtain dimension-reduced image features; the dimension-reduced image features are enhanced on the feature extraction channels to obtain first enhanced image features and on the pixels to obtain second enhanced image features; and the texture type of the image to be recognized is acquired based on the first and second enhanced image features. In the disclosure, after the image features are acquired, feature enhancement is performed in these two respects to strengthen the expression capability of the features, so that sufficient image features are available to improve the accuracy of classifying and recognizing the texture types of images.
Fig. 2 is a flowchart illustrating an image recognition method according to another embodiment of the disclosure. As shown in fig. 2, the image recognition method specifically includes the following steps:
s201, acquiring an image to be identified, and extracting image characteristics of the image to be identified.
S202, performing dimension reduction processing on the image features to obtain a first dimension reduction feature matrix and a second dimension reduction feature matrix.
The feature elements in the same row of the first dimension-reduction feature matrix belong to the same feature extraction channel, each column corresponds to one pixel, and the second dimension-reduction feature matrix is the transpose of the first dimension-reduction feature matrix. In the present disclosure, the dimension-reduced image features may comprise the first dimension-reduction feature matrix and the second dimension-reduction feature matrix, which are used to obtain the first enhanced image feature and the second enhanced image feature.
In implementation, the same feature information in the image features can be described from different dimensions; for example, one piece of feature information can be described along the feature extraction channel, feature length, and feature width dimensions. To reduce the amount of data processing and to enable matrix multiplication, the embodiment of the present disclosure may perform dimension reduction on the image features. Optionally, the feature length and feature width dimensions of the image features may be fused to obtain the first dimension-reduction feature matrix and the second dimension-reduction feature matrix, as sketched below.
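As an illustration, a minimal sketch of this fusion step is shown below, assuming a PyTorch-style B×C×H×W tensor layout (the concrete shapes are hypothetical; the patent does not prescribe a framework):

```python
import torch

# Hypothetical sizes: batch B, channels C, feature length H, feature width W.
B, C, H, W = 2, 64, 32, 32
feat = torch.randn(B, C, H, W)  # image features from the feature extraction network

# Fuse the feature length and feature width dimensions into one.
q = feat.view(B, C, H * W)   # first dimension-reduction feature matrix: C x (H*W)
h = q.transpose(1, 2)        # second matrix is its transpose: (H*W) x C
```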
S203, acquiring a first enhanced image feature and a second enhanced image feature based on the first dimension-reduction feature matrix and the second dimension-reduction feature matrix.
The process of acquiring the first enhanced image feature includes: multiplying the first dimension-reduction feature matrix by the second dimension-reduction feature matrix to obtain a first weight matrix corresponding to the feature extraction channels, and then obtaining the first enhanced image feature based on the image features and the first weight matrix. Optionally, a convolution operation is performed on the image features to obtain a first intermediate feature matrix, the first weight matrix is multiplied by the first intermediate feature matrix to obtain a second intermediate feature matrix, and the first and second intermediate feature matrices are added to obtain the first enhanced image feature. In the embodiment of the disclosure, feature enhancement on the feature extraction channels strengthens the feature extraction capability of the channels, so the extracted image features are more expressive, which further improves the accuracy of image recognition.
The process of obtaining the first enhanced image feature is explained with reference to fig. 3. As shown in fig. 3, the channel-level feature enhancement module includes a convolution unit 31, a convolution unit 32, a convolution unit 33, a first matrix multiplication unit 34, a normalization unit 35, a second matrix multiplication unit 36, and an adder 37.
The image feature F, of size C×W×H, serves as the input to the channel-level feature enhancement module, where C denotes the feature extraction channels, W denotes the feature width, and H denotes the feature length.
Convolution unit 31 and convolution unit 32 each perform a convolution operation on the image feature F (C×W×H) to reduce its dimensions, yielding the dimension-reduced image features: the first dimension-reduction feature matrix Qc of size C×(H*W) and the second dimension-reduction feature matrix Hc of size (H*W)×C, where Hc is the transpose of Qc. Qc and Hc are then fed into the first matrix multiplication unit 34, where they are multiplied to output a weight matrix Mc of size C×C; Mc is passed to the normalization unit 35, which applies softmax to obtain the first weight matrix M'c (C×C) corresponding to the feature extraction channels.
Convolution unit 33 performs a convolution operation on the image feature F (C×W×H) to obtain the first intermediate feature matrix Fc1 (C×H×W); the second matrix multiplication unit 36 then multiplies M'c (C×C) with Fc1 to obtain the enhanced second intermediate feature matrix Fh1 (C×H×W).
Further, the adder 37 adds the second intermediate feature matrix Fh1 and the first intermediate feature matrix Fc1 to obtain the final first enhanced image feature F1.
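A minimal PyTorch sketch of this channel-level module follows. The 1×1 convolution kernels and the softmax axis are assumptions; the patent specifies only the convolutions, the two matrix products, softmax normalization, and the residual addition:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelEnhancement(nn.Module):
    """Sketch of the channel-level enhancement of fig. 3 (units 31-37)."""

    def __init__(self, channels: int):
        super().__init__()
        self.conv_q = nn.Conv2d(channels, channels, kernel_size=1)  # convolution unit 31
        self.conv_h = nn.Conv2d(channels, channels, kernel_size=1)  # convolution unit 32
        self.conv_f = nn.Conv2d(channels, channels, kernel_size=1)  # convolution unit 33

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q = self.conv_q(x).view(b, c, h * w)                  # Qc: C x (H*W)
        k = self.conv_h(x).view(b, c, h * w).transpose(1, 2)  # Hc: (H*W) x C
        m = F.softmax(torch.bmm(q, k), dim=-1)                # units 34-35: M'c (C x C)
        f1 = self.conv_f(x).view(b, c, h * w)                 # first intermediate matrix Fc1
        fh = torch.bmm(m, f1)                                 # unit 36: Fh1 = M'c x Fc1
        return (fh + f1).view(b, c, h, w)                     # adder 37: F1 = Fh1 + Fc1
```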
The process of obtaining the second enhanced image feature includes: multiplying the second dimension-reduction feature matrix by the first dimension-reduction feature matrix to obtain a second weight matrix corresponding to the pixels, and obtaining the second enhanced image feature based on the image features and the second weight matrix. Optionally, a convolution operation is performed on the image features to obtain a third intermediate feature matrix, the second weight matrix is multiplied by the third intermediate feature matrix to obtain a fourth intermediate feature matrix, and the third and fourth intermediate feature matrices are added to obtain the second enhanced image feature. In the embodiment of the disclosure, feature enhancement on the pixels improves the expression capability of the image features, so the accuracy of image recognition can be improved.
The process of obtaining the second enhanced image feature is explained below with reference to fig. 4. As shown in fig. 4, the pixel-level feature enhancement module includes a convolution unit 41, a convolution unit 42, a convolution unit 43, a first matrix multiplication unit 44, a normalization unit 45, a second matrix multiplication unit 46, and an adder 47.
The image feature F (C×W×H) serves as the input to the pixel-level feature enhancement module.
Convolution unit 41 and convolution unit 42 each perform a convolution operation on the image feature F (C×W×H) to reduce its dimensions, yielding the dimension-reduced image features: the first dimension-reduction feature matrix Qc of size C×(H*W) and the second dimension-reduction feature matrix Hc of size (H*W)×C, where Hc is the transpose of Qc. Hc and Qc are then fed into the first matrix multiplication unit 44, where matrix multiplication yields a weight matrix MP of size (H*W)×(H*W); MP is passed to the normalization unit 45 for softmax normalization to obtain the second weight matrix M'P corresponding to the pixels.
Convolution unit 43 performs a convolution operation on the image feature F (C×W×H) to obtain the third intermediate feature matrix Fc2 (C×H×W); the second matrix multiplication unit 46 then multiplies M'P ((H*W)×(H*W)) with Fc2 to obtain the enhanced fourth intermediate feature matrix Fh2 (C×H×W).
Further, the adder 47 adds the fourth intermediate feature matrix Fh2 and the third intermediate feature matrix Fc2 channel-wise to obtain the final second enhanced image feature F2.
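The pixel-level module mirrors the channel-level sketch, with the multiplication order of the dimension-reduction matrices reversed so that the weight matrix is (H*W)×(H*W); the exact order of the product against Fc2 is an assumption chosen to make the shapes consistent:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PixelEnhancement(nn.Module):
    """Sketch of the pixel-level enhancement of fig. 4 (units 41-47)."""

    def __init__(self, channels: int):
        super().__init__()
        self.conv_q = nn.Conv2d(channels, channels, kernel_size=1)  # convolution unit 41
        self.conv_h = nn.Conv2d(channels, channels, kernel_size=1)  # convolution unit 42
        self.conv_f = nn.Conv2d(channels, channels, kernel_size=1)  # convolution unit 43

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q = self.conv_q(x).view(b, c, h * w)                  # Qc: C x (H*W)
        k = self.conv_h(x).view(b, c, h * w).transpose(1, 2)  # Hc: (H*W) x C
        m = F.softmax(torch.bmm(k, q), dim=-1)                # units 44-45: M'P ((H*W) x (H*W))
        f2 = self.conv_f(x).view(b, c, h * w)                 # third intermediate matrix Fc2
        fh = torch.bmm(f2, m)                                 # unit 46: Fh2
        return (fh + f2).view(b, c, h, w)                     # adder 47: F2 = Fh2 + Fc2
```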
S204, weighting the first enhanced image feature and the second enhanced image feature to obtain a target image feature.
In the embodiment of the present disclosure, obtaining the target image feature requires a weighted calculation based on the first enhanced image feature and the second enhanced image feature. Let the target image feature be F. As shown in figs. 3 and 4, F1 is the image feature obtained through channel-level enhancement and F2 is the image feature obtained through pixel-level enhancement. After the enhanced features are obtained, F1 and F2 are fused according to the weights of the first and second enhanced image features, that is, F = a × F1 + b × F2, where a and b are learnable weight parameters. It can be understood that a and b are tuned during the training and testing of the image texture recognition model of the embodiment of the present disclosure.
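A minimal sketch of the fusion, assuming scalar weights initialized to 1.0 (the patent says only that a and b are learnable):

```python
import torch
import torch.nn as nn

class WeightedFusion(nn.Module):
    """F = a * F1 + b * F2 with learnable scalar weights a and b."""

    def __init__(self):
        super().__init__()
        self.a = nn.Parameter(torch.ones(1))  # initialization to 1.0 is an assumption
        self.b = nn.Parameter(torch.ones(1))

    def forward(self, f1: torch.Tensor, f2: torch.Tensor) -> torch.Tensor:
        return self.a * f1 + self.b * f2
```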
S205, acquiring the texture type of the image to be recognized based on the target image features.
According to the image recognition method, an image to be recognized is acquired and its image features are extracted; dimension reduction is performed on the image features to obtain dimension-reduced image features; the dimension-reduced image features are enhanced on the feature extraction channels to obtain first enhanced image features and on the pixels to obtain second enhanced image features; and the texture type of the image to be recognized is acquired based on the first and second enhanced image features. In the disclosure, after the image features are acquired, feature enhancement is performed in these two respects to strengthen the expression capability of the features, so that sufficient image features are available to improve the accuracy of classifying and recognizing the texture types of images.
The image texture recognition model referred to in the above embodiments is explained below. First, a nonlinear mapping model is constructed, and then a training data set is acquired, where the training data set comprises sample images and the texture classes labeled for those sample images. The constructed nonlinear mapping model is trained on the training data set to finally obtain an image texture recognition model capable of recognizing image textures.
Alternatively, as shown in fig. 5, the network structure of the image classification and recognition model may include: a feature extraction layer 51; a feature enhancement layer 52, which comprises a channel-level feature enhancement sublayer 521 and a pixel-level feature enhancement sublayer 522; a feature fusion layer 53; a Fully Connected (FC) layer 54; and an L2 norm normalization (L2 norm) layer 55. The image to be recognized is input into the image classification and recognition model shown in fig. 5: image features are extracted by the feature extraction layer 51, channel-level and pixel-level feature enhancement is performed by the feature enhancement layer 52, feature fusion is performed by the feature fusion layer 53, the fused image features are fully connected by the FC layer 54, and finally the result is input into the L2 norm layer 55 for mapping to obtain the texture class of the image to be recognized.
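Putting the layers together, an end-to-end sketch of the fig. 5 pipeline might look as follows; the toy backbone, the 16×16 spatial size, and the class count are illustrative assumptions, and ChannelEnhancement, PixelEnhancement, and WeightedFusion are the sketches given above:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextureRecognitionModel(nn.Module):
    """Sketch of the fig. 5 pipeline: layers 51-55."""

    def __init__(self, channels: int = 256, num_classes: int = 10):
        super().__init__()
        self.backbone = nn.Sequential(                        # feature extraction layer 51
            nn.Conv2d(3, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d((16, 16)),
        )
        self.channel_branch = ChannelEnhancement(channels)    # sublayer 521
        self.pixel_branch = PixelEnhancement(channels)        # sublayer 522
        self.fusion = WeightedFusion()                        # feature fusion layer 53
        self.fc = nn.Linear(channels * 16 * 16, num_classes)  # FC layer 54

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        feat = self.backbone(image)
        fused = self.fusion(self.channel_branch(feat), self.pixel_branch(feat))
        logits = self.fc(fused.flatten(1))
        return F.normalize(logits, p=2, dim=1)                # L2 norm layer 55

# Hypothetical usage: scores for a batch of 4 RGB images.
model = TextureRecognitionModel()
scores = model(torch.randn(4, 3, 64, 64))  # shape (4, 10), L2-normalized per row
```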
Corresponding to the image recognition methods provided by the above embodiments, an embodiment of the present disclosure further provides an image recognition apparatus. Since the image recognition apparatus provided by the embodiment of the present disclosure corresponds to the image recognition methods provided by the above embodiments, the implementation of the image recognition method is also applicable to the image recognition apparatus and will not be described in detail in the following embodiments.
Fig. 6 is a schematic structural diagram of an image recognition apparatus according to another embodiment of the present disclosure. As shown in fig. 6, the image recognition apparatus 600 includes: a feature extraction module 61, a dimensionality reduction module 62, a first enhancement module 63, a second enhancement module 64, and a texture recognition module 65. Wherein:
the feature extraction module 61 is configured to acquire an image to be identified and extract image features of the image to be identified;
the dimension reduction module 62 is configured to perform dimension reduction processing on the image features to obtain dimension reduction image features;
a first enhancement module 63, configured to enhance the image features on the feature extraction channel to obtain first enhanced image features;
a second enhancement module 64 for enhancing the image feature in pixels to obtain a second enhanced image feature;
and the texture recognition module 65 is configured to obtain a texture type of the image to be recognized based on the first enhanced image feature and the second enhanced image feature.
The image recognition device acquires an image to be recognized and extracts its image features; performs dimension reduction on the image features to obtain dimension-reduced image features; enhances the dimension-reduced image features on the feature extraction channels to obtain first enhanced image features and on the pixels to obtain second enhanced image features; and acquires the texture type of the image to be recognized based on the first and second enhanced image features. In the disclosure, after the image features are acquired, feature enhancement is performed in these two respects to strengthen the expression capability of the features, so that sufficient image features are available to improve the accuracy of classifying and recognizing the texture types of images.
Fig. 7 is a schematic structural diagram of an image recognition apparatus according to another embodiment of the present disclosure. As shown in fig. 7, the image recognition apparatus 700 includes: a feature extraction module 71, a dimension reduction module 72, a first enhancement module 73, a second enhancement module 74 and a texture recognition module 75.
It should be noted that the feature extraction module 71, the dimension reduction module 72, the first enhancement module 73, the second enhancement module 74, and the texture recognition module 75 have the same structures and functions as the feature extraction module 61, the dimension reduction module 62, the first enhancement module 63, the second enhancement module 64, and the texture recognition module 65, respectively.
In the embodiment of the present disclosure, the dimension reduction module 72 is configured to fuse the feature length and feature width dimensions of the image features to obtain a first dimension-reduction feature matrix and a second dimension-reduction feature matrix, where feature elements in the same row of the first dimension-reduction feature matrix belong to the same feature extraction channel, each column element corresponds to one pixel, and the second dimension-reduction feature matrix is the transpose of the first dimension-reduction feature matrix. The first dimension-reduction feature matrix and the second dimension-reduction feature matrix are used to obtain the first enhanced image feature and the second enhanced image feature.
In the disclosed embodiment, the first enhancement module 73 includes a first matrix multiplication unit 731 and a first acquisition unit 732.
The first matrix multiplication unit 731 is configured to multiply the first dimension reduction feature matrix and the second dimension reduction feature matrix to obtain a first weight matrix corresponding to the feature extraction channel.
A first obtaining unit 732, configured to obtain a first enhanced image feature based on the image feature and the first weight matrix.
The first obtaining unit 732 is further configured to: perform a convolution operation on the image features to obtain a first intermediate feature matrix; multiply the first weight matrix by the first intermediate feature matrix to obtain a second intermediate feature matrix; and add the first intermediate feature matrix and the second intermediate feature matrix to obtain the first enhanced image feature.
In the embodiment of the present disclosure, the second enhancement module 74 includes a second matrix multiplication unit 741 and a second obtaining unit 742.
And a second matrix multiplication unit 741, configured to multiply the second dimension-reduced feature matrix and the first dimension-reduced feature matrix, and acquire a second weight matrix corresponding to the pixel.
A second obtaining unit 742 is configured to obtain a second enhanced image feature based on the image feature and the second weight matrix.
The second obtaining unit 742 is further configured to: perform a convolution operation on the image features to obtain a third intermediate feature matrix; multiply the second weight matrix by the third intermediate feature matrix to obtain a fourth intermediate feature matrix; and add the third intermediate feature matrix and the fourth intermediate feature matrix to obtain the second enhanced image feature.
The texture recognition module 75 in the embodiment of the present disclosure includes: a weighting unit 751 and a recognition unit 752.
The weighting unit 751 is configured to weight the first enhanced image feature and the second enhanced image feature to obtain a target image feature.
The identifying unit 752 is configured to identify a texture type of the image to be identified based on the target image feature.
In the disclosure, after the image features are acquired, feature enhancement is performed in these two respects to strengthen the expression capability of the features, so that sufficient image features are available to improve the accuracy of classifying and recognizing the texture types of images.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
FIG. 8 illustrates a schematic block diagram of an example electronic device 800 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 8, the device 800 includes a computing unit 801 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 802 or a computer program loaded from a storage unit 808 into a Random Access Memory (RAM) 803. In the RAM 803, various programs and data required for the operation of the device 800 can also be stored. The computing unit 801, the ROM 802, and the RAM 803 are connected to each other by a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.
A number of components in the device 800 are connected to the I/O interface 805, including: an input unit 806, such as a keyboard, a mouse, or the like; an output unit 807 such as various types of displays, speakers, and the like; a storage unit 808, such as a magnetic disk, optical disk, or the like; and a communication unit 809 such as a network card, modem, wireless communication transceiver, etc. The communication unit 809 allows the device 800 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The computing unit 801 may be any of various general and/or special purpose processing components with processing and computing capabilities. Some examples of the computing unit 801 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, or microcontroller. The computing unit 801 executes the methods and processes described above, such as the image recognition method. For example, in some embodiments, the image recognition method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 808. In some embodiments, part or all of the computer program can be loaded and/or installed onto the device 800 via the ROM 802 and/or the communication unit 809. When the computer program is loaded into the RAM 803 and executed by the computing unit 801, one or more steps of the image recognition method described above may be performed. Alternatively, in other embodiments, the computing unit 801 may be configured to perform the image recognition method in any other suitable manner (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuits, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems On Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include being implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), the internet, and blockchain networks.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or cloud host, which is a host product in the cloud computing service system that overcomes the defects of difficult management and weak service scalability found in traditional physical hosts and VPS (Virtual Private Server) services. The server may also be a server of a distributed system, or a server incorporating a blockchain.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved, and the present disclosure is not limited herein.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims (17)

1. An image recognition method, comprising:
acquiring an image to be identified, and extracting image characteristics of the image to be identified;
performing dimension reduction processing on the image features to obtain dimension-reduced image features;
enhancing the dimension-reduced image features on a feature extraction channel to obtain first enhanced image features;
enhancing the dimension-reduced image features on pixels to obtain second enhanced image features;
and acquiring the texture type of the image to be identified based on the first enhanced image feature and the second enhanced image feature.
2. The image recognition method according to claim 1, wherein the performing dimension reduction processing on the image feature to obtain a dimension-reduced image feature comprises:
fusing two dimensions of feature length and feature width in the image features to obtain a first dimension-reduction feature matrix and a second dimension-reduction feature matrix, wherein feature elements in the same row in the first dimension-reduction feature matrix belong to the same feature extraction channel, each column element corresponds to one pixel, and the second dimension-reduction feature matrix is a transposed matrix of the first dimension-reduction feature matrix;
the dimension-reduced image features comprise the first dimension-reduced feature matrix and the second dimension-reduced feature matrix, and are used for acquiring the first enhanced image features and the second enhanced image features.
3. The image recognition method of claim 2, wherein the enhancing the dimension-reduced image feature on a feature extraction channel to obtain a first enhanced image feature comprises:
multiplying the first dimension reduction feature matrix and the second dimension reduction feature matrix to obtain a first weight matrix corresponding to the feature extraction channel;
and acquiring the first enhanced image characteristic based on the image characteristic and the first weight matrix.
4. The image recognition method of claim 3, wherein the obtaining the first enhanced image feature based on the image feature and the first weight matrix comprises:
performing convolution operation on the image features to obtain a first intermediate feature matrix;
multiplying the first weight matrix and the first intermediate feature matrix to obtain a second intermediate feature matrix;
and adding the first intermediate feature matrix and the second intermediate feature matrix to obtain the first enhanced image feature.
5. The image recognition method of claim 2, wherein the enhancing the reduced-dimension image feature on a pixel to obtain a second enhanced image feature comprises:
multiplying the second dimension reduction feature matrix with the first dimension reduction feature matrix to obtain a second weight matrix corresponding to the pixel;
and acquiring the second enhanced image characteristic based on the image characteristic and the second weight matrix.
6. The image recognition method of claim 5, wherein the obtaining the second enhanced image feature based on the image feature and the second weight matrix comprises:
performing convolution operation on the image features to obtain a third intermediate feature matrix;
multiplying the second weight matrix with the third intermediate feature matrix to obtain a fourth intermediate feature matrix;
and adding the third intermediate feature matrix and the fourth intermediate feature matrix to obtain the second enhanced image feature.
7. The image recognition method according to any one of claims 1 to 6, wherein the obtaining the texture type of the image to be recognized based on the first enhanced image feature and the second enhanced image feature comprises:
weighting the first enhanced image feature and the second enhanced image feature to obtain a target image feature;
and identifying the texture type of the image to be identified based on the target image feature.
8. An image recognition apparatus comprising:
the feature extraction module is used for acquiring an image to be identified and extracting the image features of the image to be identified;
the dimension reduction module is used for carrying out dimension reduction processing on the image features to obtain dimension-reduced image features;
the first enhancement module is used for enhancing the image features on a feature extraction channel to obtain first enhanced image features;
the second enhancement module is used for enhancing the image features on pixels to obtain second enhanced image features;
and the texture recognition module is used for acquiring the texture type of the image to be identified based on the first enhanced image feature and the second enhanced image feature.
9. The image recognition device according to claim 8, wherein the dimension reduction module is configured to fuse two dimensions, namely a feature length and a feature width, in the image features to obtain a first dimension reduction feature matrix and a second dimension reduction feature matrix, wherein feature elements in a same row in the first dimension reduction feature matrix belong to a same feature extraction channel, each column element corresponds to one pixel, and the second dimension reduction feature matrix is a transposed matrix of the first dimension reduction feature matrix;
the first dimension reduction feature matrix and the second dimension reduction feature matrix are used for obtaining the first enhanced image feature and the second enhanced image feature.
10. The image recognition device of claim 9, wherein the first enhancement module comprises:
a first matrix multiplication unit, configured to multiply the first dimension reduction feature matrix and the second dimension reduction feature matrix to obtain a first weight matrix corresponding to the feature extraction channel;
a first obtaining unit, configured to obtain the first enhanced image feature based on the image feature and the first weight matrix.
11. The image recognition apparatus according to claim 10, wherein the first acquisition unit is further configured to:
performing convolution operation on the image features to obtain a first intermediate feature matrix;
multiplying the first weight matrix and the first intermediate feature matrix to obtain a second intermediate feature matrix;
and adding the first intermediate feature matrix and the second intermediate feature matrix to obtain the first enhanced image feature.
12. The image recognition device of claim 9, wherein the second enhancement module comprises:
a second matrix multiplication unit, configured to multiply the second dimension-reduced feature matrix and the first dimension-reduced feature matrix, and obtain a second weight matrix corresponding to the pixel;
a second obtaining unit, configured to obtain the second enhanced image feature based on the image feature and the second weight matrix.
13. The image recognition apparatus according to claim 12, wherein the second acquisition unit is further configured to:
performing convolution operation on the image features to obtain a third intermediate feature matrix;
multiplying the second weight matrix with the third intermediate feature matrix to obtain a fourth intermediate feature matrix;
and adding the third intermediate feature matrix and the fourth intermediate feature matrix to obtain the second enhanced image feature.
14. The image recognition device according to any one of claims 8 to 13, wherein the texture recognition module includes:
the weighting unit is used for weighting the first enhanced image feature and the second enhanced image feature to obtain a target image feature;
and the identification unit is used for identifying the texture type of the image to be identified based on the target image feature.
15. An electronic device, comprising:
at least one processor, and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the image recognition method of any one of claims 1-7.
16. A non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the image recognition method of any one of claims 1-7.
17. A computer program product comprising a computer program which, when executed by a processor, implements an image recognition method according to any one of claims 1-7.
CN202011606881.3A 2020-12-30 2020-12-30 Image recognition method, device, electronic equipment and storage medium Active CN112651451B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011606881.3A CN112651451B (en) 2020-12-30 2020-12-30 Image recognition method, device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011606881.3A CN112651451B (en) 2020-12-30 2020-12-30 Image recognition method, device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112651451A true CN112651451A (en) 2021-04-13
CN112651451B CN112651451B (en) 2023-08-11

Family

ID=75364329

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011606881.3A Active CN112651451B (en) 2020-12-30 2020-12-30 Image recognition method, device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112651451B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113343295A (en) * 2021-06-07 2021-09-03 支付宝(杭州)信息技术有限公司 Image processing method, device, equipment and storage medium based on privacy protection
CN114463584A (en) * 2022-01-29 2022-05-10 北京百度网讯科技有限公司 Image processing method, model training method, device, apparatus, storage medium, and program
CN117252449A (en) * 2023-11-20 2023-12-19 水润天府新材料有限公司 Full-penetration drainage low-noise pavement construction process and system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110111313A (en) * 2019-04-22 2019-08-09 腾讯科技(深圳)有限公司 Medical image detection method and relevant device based on deep learning
CN111126254A (en) * 2019-12-23 2020-05-08 Oppo广东移动通信有限公司 Image recognition method, device, equipment and storage medium
CN111428807A (en) * 2020-04-03 2020-07-17 桂林电子科技大学 Image processing method and computer-readable storage medium
US20200320369A1 (en) * 2018-03-30 2020-10-08 Tencent Technology (Shenzhen) Company Limited Image recognition method, apparatus, electronic device and storage medium
WO2020232886A1 (en) * 2019-05-21 2020-11-26 平安科技(深圳)有限公司 Video behavior identification method and apparatus, storage medium and server

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200320369A1 (en) * 2018-03-30 2020-10-08 Tencent Technology (Shenzhen) Company Limited Image recognition method, apparatus, electronic device and storage medium
CN110111313A (en) * 2019-04-22 2019-08-09 腾讯科技(深圳)有限公司 Medical image detection method and relevant device based on deep learning
WO2020232886A1 (en) * 2019-05-21 2020-11-26 平安科技(深圳)有限公司 Video behavior identification method and apparatus, storage medium and server
CN111126254A (en) * 2019-12-23 2020-05-08 Oppo广东移动通信有限公司 Image recognition method, device, equipment and storage medium
CN111428807A (en) * 2020-04-03 2020-07-17 桂林电子科技大学 Image processing method and computer-readable storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
XIBIN SONG et al.: "Channel Attention based Iterative Residual Learning for Depth Map Super-Resolution", arXiv *
吴睿曦; 肖秦琨: "Multi-object Image Recognition Based on Deep Networks and Data Augmentation", 国外电子测量技术 (Foreign Electronic Measurement Technology), no. 05
李军锋; 何双伯; 冯伟夏; 熊山; 薛江; 周青云: "Augmented Reality Transformer Image Recognition Technology Based on Improved CNN", 现代电子技术 (Modern Electronics Technique), no. 07

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113343295A (en) * 2021-06-07 2021-09-03 支付宝(杭州)信息技术有限公司 Image processing method, device, equipment and storage medium based on privacy protection
CN113343295B (en) * 2021-06-07 2023-01-24 支付宝(杭州)信息技术有限公司 Image processing method, device, equipment and storage medium based on privacy protection
CN114463584A (en) * 2022-01-29 2022-05-10 北京百度网讯科技有限公司 Image processing method, model training method, device, apparatus, storage medium, and program
CN117252449A (en) * 2023-11-20 2023-12-19 水润天府新材料有限公司 Full-penetration drainage low-noise pavement construction process and system
CN117252449B (en) * 2023-11-20 2024-01-30 水润天府新材料有限公司 Full-penetration drainage low-noise pavement construction process and system

Also Published As

Publication number Publication date
CN112651451B (en) 2023-08-11

Similar Documents

Publication Publication Date Title
CN112651451B (en) Image recognition method, device, electronic equipment and storage medium
CN112819007B (en) Image recognition method, device, electronic equipment and storage medium
CN112949710B (en) Image clustering method and device
CN115063875B (en) Model training method, image processing method and device and electronic equipment
CN113591918B (en) Training method of image processing model, image processing method, device and equipment
CN113177472A (en) Dynamic gesture recognition method, device, equipment and storage medium
CN112862006A (en) Training method and device for image depth information acquisition model and electronic equipment
CN114092759A (en) Training method and device of image recognition model, electronic equipment and storage medium
CN112561879A (en) Ambiguity evaluation model training method, image ambiguity evaluation method and device
CN113177449A (en) Face recognition method and device, computer equipment and storage medium
CN112862005A (en) Video classification method and device, electronic equipment and storage medium
CN112784732A (en) Method, device, equipment and medium for recognizing ground object type change and training model
CN114913339A (en) Training method and device of feature map extraction model
CN113888560A (en) Method, apparatus, device and storage medium for processing image
CN113569912A (en) Vehicle identification method and device, electronic equipment and storage medium
CN112580666A (en) Image feature extraction method, training method, device, electronic equipment and medium
CN111932530A (en) Three-dimensional object detection method, device and equipment and readable storage medium
CN115116111B (en) Anti-disturbance human face living body detection model training method and device and electronic equipment
CN112560848B (en) Training method and device for POI (Point of interest) pre-training model and electronic equipment
CN114741697B (en) Malicious code classification method and device, electronic equipment and medium
CN114881227A (en) Model compression method, image processing method, device and electronic equipment
CN113903071A (en) Face recognition method and device, electronic equipment and storage medium
CN113205131A (en) Image data processing method and device, road side equipment and cloud control platform
CN114049491A (en) Fingerprint segmentation model training method, fingerprint segmentation device, fingerprint segmentation equipment and fingerprint segmentation medium
CN112990201A (en) Text box detection method and device, electronic equipment and computer storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant