CN110598717B - Image feature extraction method and device and electronic equipment - Google Patents

Image feature extraction method and device and electronic equipment

Info

Publication number
CN110598717B
Authority
CN
China
Prior art keywords
image
sub
processed
images
generating
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910873505.1A
Other languages
Chinese (zh)
Other versions
CN110598717A (en)
Inventor
郭梓超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Megvii Technology Co Ltd
Original Assignee
Beijing Megvii Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Megvii Technology Co Ltd filed Critical Beijing Megvii Technology Co Ltd
Priority to CN201910873505.1A priority Critical patent/CN110598717B/en
Publication of CN110598717A publication Critical patent/CN110598717A/en
Application granted granted Critical
Publication of CN110598717B publication Critical patent/CN110598717B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides an image feature extraction method, an image feature extraction device, and an electronic device, relating to the technical field of image processing. The method comprises the following steps: acquiring an image to be processed, where the image to be processed comprises an original image to be processed or a feature image to be processed; segmenting the image to be processed to obtain a plurality of sub-images; generating a target weight corresponding to each sub-image based on the image to be processed and a plurality of preset initial weights; for each sub-image, performing a convolution operation on the sub-image with its corresponding target weight to obtain a feature image of the sub-image; and generating a feature image of the image to be processed based on the feature images of the sub-images. The invention can effectively improve upon existing image feature extraction approaches.

Description

Image feature extraction method and device and electronic equipment
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to an image feature extraction method and apparatus, and an electronic device.
Background
Image feature extraction is a key step in computer vision and image processing; features must be extracted from images in many application scenarios, such as pedestrian detection and vehicle re-identification. Existing approaches mainly extract features from an image to be processed through a trained convolutional neural network with fixed weights. In a specific implementation, a weight is usually preset before training, updated and optimized according to a loss function during training, and, once training of the convolutional neural network is finished, used directly and unchanged to extract features from the original image.
However, the inventor finds that a convolutional neural network that applies the same fixed weights to every input image to be processed has difficulty extracting the feature information of each image well.
Disclosure of Invention
In view of the above, the present invention provides an image feature extraction method, an image feature extraction device, and an electronic device, which effectively improve upon conventional image feature extraction.
In order to achieve the above purpose, the embodiment of the present invention adopts the following technical solutions:
in a first aspect, an embodiment of the present invention provides an image feature extraction method, where the method includes: acquiring an image to be processed with features to be extracted; the image to be processed comprises an original image to be processed or a characteristic image to be processed; segmenting an image to be processed to obtain a plurality of sub-images; generating target weights corresponding to the sub-images based on the image to be processed and a plurality of preset initial weights; for each sub-image, performing convolution operation on the sub-image by adopting the target weight corresponding to the sub-image to obtain a characteristic image of the sub-image; and generating a characteristic image of the image to be processed based on the characteristic image of each sub-image.
Further, the step of generating a target weight corresponding to each sub-image based on the image to be processed and a plurality of preset initial weights includes: generating a plurality of linear weighting coefficients based on the image to be processed, the plurality of sub-images and a plurality of preset initial weights; and generating a target weight corresponding to each sub-image based on the linear weighting coefficient and a plurality of preset initial weights.
Further, the step of generating a plurality of linear weighting coefficients based on the image to be processed, the plurality of sub-images and a plurality of preset initial weights includes: performing dimensionality reduction operation on an image to be processed to obtain image information subjected to dimensionality reduction; constructing a full connection layer based on the number of the sub-images and the number of the initial weights; and calculating the image information after the dimension reduction through the full connection layer to generate a plurality of linear weighting coefficients.
Further, the step of generating a target weight corresponding to each sub-image based on the linear weighting coefficient and a plurality of preset initial weights includes: generating a target weight corresponding to each sub-image according to a preset linear weighting formula, a linear weighting coefficient and a plurality of preset initial weights; wherein, the preset linear weighting formula is as follows:
$$W_{nS} = \sum_{i=1}^{N} \alpha_i w_i$$

where $W_{nS}$ is the target weight corresponding to the S-th sub-image of the n-th image to be processed, $w_i$ is the i-th initial weight, $N$ is the number of initial weights, and $\alpha_i$ is the i-th linear weighting coefficient corresponding to the S-th sub-image of the n-th image to be processed.
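As an informal illustration (not part of the patent text), the linear combination above can be sketched in a few lines of NumPy. The function name and tensor shapes below are assumptions made for the sketch; the patent only specifies the weighted sum itself:

```python
import numpy as np

def target_weight(alphas, initial_weights):
    """Compute the target weight for one sub-image as the linear
    combination W_nS = sum_i alpha_i * w_i.
    `alphas` has shape (N,); `initial_weights` has shape
    (N, o, c, kh, kw), matching the weight dimensions described later
    in the description. Names are illustrative, not from the patent."""
    # Contract the coefficient axis against the first axis of the weights.
    return np.tensordot(alphas, initial_weights, axes=(0, 0))

# Example: N = 3 initial 1x1 kernels with o = c = 1
w = np.arange(3, dtype=float).reshape(3, 1, 1, 1, 1)  # w_0=0, w_1=1, w_2=2
a = np.array([0.2, 0.3, 0.5])
W = target_weight(a, w)
print(W.shape)               # (1, 1, 1, 1)
print(round(float(W), 6))    # 1.3, i.e. 0.2*0 + 0.3*1 + 0.5*2
```

The result has the same (o, c, kh, kw) shape as each initial weight, so it can be used directly as a convolution kernel for its sub-image.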
Further, the step of segmenting the image to be processed includes: segmenting an image to be processed according to a preset segmentation rule; the preset segmentation rule comprises a uniform segmentation rule and/or a non-uniform segmentation rule.
Further, the step of generating a feature image of the image to be processed based on the feature image of each sub-image includes: and splicing the characteristic images of the sub-images according to a preset segmentation rule to obtain the characteristic image of the image to be processed.
Further, the initial weight is obtained based on an optimization algorithm; the optimization algorithm comprises one or more of stochastic gradient descent (SGD), batch gradient descent (BGD), the Adam optimization algorithm, and root mean square propagation (RMSProp).
In a second aspect, an embodiment of the present invention provides an apparatus for extracting an image feature, where the apparatus includes: the image acquisition module is used for acquiring an image to be processed with features to be extracted; the image to be processed comprises an original image to be processed or a characteristic image to be processed; the image segmentation module is used for segmenting the image to be processed to obtain a plurality of sub-images; the weight generation module is used for generating target weights corresponding to the sub-images based on the image to be processed and a plurality of preset initial weights; the convolution operation module is used for performing convolution operation on each sub-image by adopting the target weight corresponding to the sub-image to obtain a characteristic image of the sub-image; and the characteristic image generation module is used for generating a characteristic image of the image to be processed based on the characteristic image of each sub-image.
In a third aspect, an embodiment of the present invention provides an electronic device, including: a processor and a storage device; the storage means has stored thereon a computer program which, when executed by the processor, performs the method as in the first aspect.
In a fourth aspect, the present invention provides a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to perform the steps of the method in the first aspect.
The embodiments of the invention provide an image feature extraction method, an image feature extraction device, and an electronic device. The acquired image to be processed is first segmented into a plurality of sub-images; target weights corresponding to the sub-images are then generated based on the image to be processed and a plurality of preset initial weights; finally, each sub-image is convolved with its corresponding target weight to obtain the feature image of that sub-image, and the feature image of the image to be processed is generated from the feature images of the sub-images. Compared with the prior art, in which a fixed weight is used no matter which image to be processed undergoes feature extraction, the method of this embodiment generates a corresponding target weight for each sub-image of the image to be processed and extracts features based on those target weights, finally obtaining the features of the image to be processed.
On the one hand, each sub-image corresponds to its own target weight, and that weight is strongly adapted to the sub-image. Convolving a sub-image with a weight adapted to it effectively improves the completeness and accuracy of its feature extraction, and improving the extraction effect of every sub-image improves the feature extraction effect of the whole image to be processed. On the other hand, the target weights are generated from the image to be processed and the initial weights; that is, they depend on the input, and different input images yield correspondingly different target weights. Extracting features with weights tied to the image to be processed therefore extracts its features in a more targeted manner, further improving the feature extraction effect.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practicing the techniques of the disclosure.
In order to make the aforementioned and other objects, features and advantages of the present invention comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
Fig. 1 is a schematic structural diagram of an electronic device according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating an image feature extraction method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram illustrating a segmentation method of an image to be processed according to an embodiment of the present invention;
FIG. 4 is a diagram illustrating an embodiment of the present invention for extracting image features;
fig. 5 shows a block diagram of an image feature extraction apparatus according to an embodiment of the present invention.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The inventor finds in the research process that when the image is subjected to feature extraction in the prior art, the weights of the convolutional neural network are predefined fixed weights, and the following problems can exist: (1) for one image to be processed, the fixed weight is adopted to carry out global feature extraction, and the difference of different areas in the whole image cannot be considered; (2) the fixed weight is adopted for feature extraction of different images to be processed, and the difference of different images cannot be considered. All the above problems can cause that the existing feature extraction method is difficult to accurately and completely acquire the feature information of the image. Based on this, in order to improve at least one of the above problems, embodiments of the present invention provide an image feature extraction method, an image feature extraction device, and an electronic device, which can effectively improve accuracy of image feature extraction and integrity of extracted features. The technology can be applied to various tasks of image feature extraction through a convolutional neural network, such as a face recognition task, a pedestrian detection task and the like. For ease of understanding, the following detailed description will discuss embodiments of the present invention.
Embodiment one:
first, an exemplary electronic device 100 for implementing the image feature extraction method and apparatus according to the embodiment of the present invention is described with reference to fig. 1.
As shown in fig. 1, an electronic device 100 includes one or more processors 102, one or more memory devices 104, an input device 106, an output device 108, and an image capture device 110, which are interconnected via a bus system 112 and/or other type of connection mechanism (not shown). It should be noted that the components and configuration of the electronic device 100 shown in FIG. 1 are exemplary only, and not limiting, and that the electronic device may have other components and configurations as desired.
The processor 102 may be a Central Processing Unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in the electronic device 100 to perform desired functions.
The storage 104 may include one or more computer program products that may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM), cache memory (cache), and/or the like. The non-volatile memory may include, for example, Read Only Memory (ROM), hard disk, flash memory, etc. On which one or more computer program instructions may be stored that may be executed by processor 102 to implement client-side functionality (implemented by the processor) and/or other desired functionality in embodiments of the invention described below. Various applications and various data, such as various data used and/or generated by the applications, may also be stored in the computer-readable storage medium.
The input device 106 may be a device used by a user to input instructions and may include one or more of a keyboard, a mouse, a microphone, a touch screen, and the like.
The output device 108 may output various information (e.g., images or sounds) to the outside (e.g., a user), and may include one or more of a display, a speaker, and the like.
The image capture device 110 may take images (e.g., photographs, videos, etc.) desired by the user and store the taken images in the storage device 104 for use by other components.
Exemplary electronic devices for implementing the image feature extraction method and apparatus according to the embodiments of the present invention may be implemented on smart terminals such as smart phones, tablet computers, and the like.
Embodiment two:
referring to a flowchart of an image feature extraction method shown in fig. 2, the method specifically includes the following steps:
Step S202: acquire an image to be processed whose features are to be extracted; the image to be processed comprises an original image to be processed or a feature image to be processed. The original image to be processed may be an initial image, such as an RGB image, obtained by capture with an image acquisition device, network download, local storage, or manual upload. The feature image to be processed may be a feature image obtained by extracting features from an initial image in advance with an existing feature extraction algorithm, such as the HOG (Histogram of Oriented Gradients) algorithm or the LBP (Local Binary Pattern) algorithm, or it may be a next-layer feature map obtained by performing a convolution operation on the initial image or on an intermediate feature map. In practical applications, the image to be processed may be expressed as a multi-dimensional tensor.
And step S204, segmenting the image to be processed to obtain a plurality of sub-images. In this embodiment, the image to be processed may be subjected to uniform segmentation or non-uniform segmentation, and the image to be processed is segmented into at least two local regions, where each local region is a sub-image.
Step S206, generating target weights corresponding to the sub-images based on the image to be processed and a plurality of preset initial weights. The preset initial weights may be fixed weights manually set based on a conventional work experience, or may be weights obtained by optimizing the manually set fixed weights.
And S208, performing convolution operation on each sub-image by adopting the target weight corresponding to the sub-image to obtain the characteristic image of the sub-image.
It will be appreciated that for each sub-image, when the sub-image is convolved with its corresponding target weight, the target weight is shared by multiple sliding windows of the sub-image; the target weights shared between different sub-images are different.
Step S210, generating a characteristic image of the image to be processed based on the characteristic image of each sub-image. The feature images of the sub-images may be recombined according to the manner in which the image to be processed was segmented into sub-images, thereby generating the feature image of the image to be processed.
In practical applications, compared with a conventional feature extraction operation (such as a convolution operation) that applies one fixed weight to every image to be processed, the process provided in this embodiment obtains different target weights for different images to be processed; that is, the weights vary with the input image, so steps S202 to S210 as a whole can be regarded as a dynamic convolution operation. Moreover, each sub-image obtained by segmenting the image to be processed shares one target weight within itself, and different sub-images use different target weights, so the steps can also be regarded as a locally shared convolution operation. Taken together, steps S202 to S210 form a dynamic, locally shared convolution operation. A convolutional neural network may perform this operation repeatedly: for example, after the feature image of the image to be processed is obtained in step S210, it can be taken as a new image to be processed, and steps S202 to S210 repeated on it for further feature extraction.
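A minimal sketch of this dynamic, locally shared convolution (steps S202 to S210) in NumPy follows. The 2x2 uniform split, the random untrained fully connected layer, the softmax normalization of the coefficients, and all names are illustrative assumptions for the sketch, not the patented implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d_single(x, w):
    """Naive 'valid' 2-D convolution of one (c, h, w) map with an
    (o, c, kh, kw) kernel; slow but enough for a sketch."""
    o, c, kh, kw = w.shape
    _, h, wd = x.shape
    out = np.zeros((o, h - kh + 1, wd - kw + 1))
    for i in range(out.shape[1]):
        for j in range(out.shape[2]):
            out[:, i, j] = np.tensordot(w, x[:, i:i+kh, j:j+kw], axes=3)
    return out

def dynamic_local_conv(x, init_weights):
    """Steps S202-S210 in miniature: split a (c, h, w) map into a 2x2
    uniform grid of sub-images, derive per-sub-image mixing coefficients
    from the spatially pooled input through a stand-in (random,
    untrained) fully connected layer, blend the N initial weights into
    one target weight per sub-image, convolve each sub-image with its
    own target weight, and stitch the feature maps back together."""
    n_init = init_weights.shape[0]
    n_sub = 4
    pooled = x.mean(axis=(1, 2))                   # dimension reduction to (c,)
    fc = rng.standard_normal((n_sub * n_init, pooled.shape[0]))
    logits = (fc @ pooled).reshape(n_sub, n_init)
    alphas = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    h2, w2 = x.shape[1] // 2, x.shape[2] // 2
    subs = [x[:, :h2, :w2], x[:, :h2, w2:], x[:, h2:, :w2], x[:, h2:, w2:]]
    feats = [conv2d_single(s, np.tensordot(a, init_weights, axes=(0, 0)))
             for s, a in zip(subs, alphas)]
    top = np.concatenate(feats[:2], axis=2)
    bottom = np.concatenate(feats[2:], axis=2)
    return np.concatenate([top, bottom], axis=1)

x = rng.standard_normal((3, 8, 8))         # one 3-channel 8x8 input map
w0 = rng.standard_normal((2, 4, 3, 1, 1))  # N=2 initial weights, 1x1 kernels
y = dynamic_local_conv(x, w0)
print(y.shape)  # (4, 8, 8): 1x1 kernels preserve each 4x4 sub-image's size
```

Each call derives fresh coefficients from the input, so different inputs yield different target weights, while every sliding window within one sub-image shares that sub-image's single target weight.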
The image feature extraction method provided by the embodiment of the invention first segments the acquired image to be processed into a plurality of sub-images; then generates target weights corresponding to the sub-images based on the image to be processed and a plurality of preset initial weights; and finally convolves each sub-image with its corresponding target weight to obtain the feature image of that sub-image, so that the feature image of the image to be processed can be generated from the feature images of the sub-images. Compared with the prior art, in which a fixed weight is used no matter which image to be processed undergoes feature extraction, the method of this embodiment generates a corresponding target weight for each sub-image and extracts features based on those target weights, finally obtaining the features of the image to be processed.
On the one hand, each sub-image corresponds to its own target weight, and that weight is strongly adapted to the sub-image. Convolving a sub-image with a weight adapted to it effectively improves the completeness and accuracy of its feature extraction, and improving the extraction effect of every sub-image improves the feature extraction effect of the whole image to be processed. On the other hand, the target weights are generated from the image to be processed and the initial weights; that is, they depend on the input, and different input images yield correspondingly different target weights. Extracting features with weights tied to the image to be processed therefore extracts its features in a more targeted manner, further improving the feature extraction effect.
When the image to be processed is segmented in the step S204, the image to be processed may be segmented according to a preset segmentation rule; the preset segmentation rule comprises a uniform segmentation rule and/or a non-uniform segmentation rule. For ease of understanding, examples of the segmentation modes of the uniform segmentation rule and the non-uniform segmentation rule are given below respectively:
the example of the segmentation mode according to the uniform segmentation rule is as follows: uniformly dividing the image to be processed into a plurality of local areas along the transverse direction and/or the longitudinal direction, wherein each local area is a sub-image; such as referring to the schematic diagram of the segmentation mode of the image to be processed shown in fig. 3, a mode of uniformly segmenting the image to be processed into four sub-images along the transverse direction and the longitudinal direction is shown.
The first example is a segmentation mode according to a non-uniform segmentation rule: and randomly segmenting the image to be processed, wherein each local area obtained by segmentation is a sub-image.
The second example of the segmentation mode according to the non-uniform segmentation rule is as follows: the image to be processed is segmented according to the characteristics of the image to be processed, for example, the image to be processed is a face image, the face image can be segmented into a plurality of local areas such as an eye area, a nose area, a mouth area and the like according to the distribution of key points of the face, and each local area is a sub-image.
Of course, the above is only an exemplary illustration of segmenting the image to be processed, and in practical applications, other segmenting manners may also be included, which is not limited herein.
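A minimal sketch of the uniform segmentation rule, assuming the spatial sizes divide evenly into the grid; the function name and grid parameters are illustrative assumptions:

```python
import numpy as np

def split_uniform(img, rows, cols):
    """Uniformly segment an (h, w) or (c, h, w) array into rows*cols
    local regions along the transverse and longitudinal directions;
    each region is one sub-image."""
    h, w = img.shape[-2], img.shape[-1]
    sh, sw = h // rows, w // cols
    return [img[..., r*sh:(r+1)*sh, c*sw:(c+1)*sw]
            for r in range(rows) for c in range(cols)]

# A 4x4 toy image split into four 2x2 sub-images, as in Fig. 3
img = np.arange(16).reshape(4, 4)
subs = split_uniform(img, 2, 2)
print(len(subs))          # 4
print(subs[0].tolist())   # [[0, 1], [4, 5]] -- the upper-left region
```

A non-uniform rule would replace the evenly spaced slices with arbitrary (e.g. keypoint-derived) region boundaries; the stitching in step S210 then simply reverses whichever slicing was used.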
To distinguish the different sub-images obtained by segmentation, the sub-images may be numbered sequentially; in practical applications, S = 1, 2, 3, … may be used to represent the ordinal numbers of the sub-images. The ordinal numbers can be assigned in a variety of ways, such as the following two:
the first method is as follows: firstly, determining the subimages arranged at the designated positions as the starting points of the serial numbers, and setting the ordinal number of the subimage as S to be 1; the designated position is usually a vertex position of the image to be processed, such as an upper left corner position, a lower left corner position, and the like of the image to be processed. Then, numbering the sub-images in a sequence number from the starting point according to a preset numbering direction to determine the sequence number of each sub-image; the numbering direction is, for example, from left to right, from top to bottom, and the ordinal numbers of the sub-images shown in fig. 3 are numbered in the numbering direction from left to right and then from top to bottom. The numbering mode is simple, and the method is suitable for scenes with relatively orderly sub-image arrangement, such as sub-images obtained after the images to be processed are segmented according to the uniform segmentation rule.
Mode two: first, the position coordinates of a preset key point of each sub-image are acquired; the preset key point is, for example, a vertex or the center point of the sub-image. The sub-images are then sorted based on these position coordinates. In a specific implementation, the sub-images can be sorted from small to large (or large to small) by the abscissa of the position coordinate, and sub-images with the same abscissa then sorted from small to large (or large to small) by the ordinate; of course, sorting by ordinate first and then by abscissa also works. Finally, the sub-images are given ordinal numbers according to the sorting result. This mode assigns ordinal numbers well under a wide variety of sub-image arrangements and is highly general.
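Mode two can be sketched in plain Python. The sketch sorts by ordinate first and then abscissa (one of the orders the text permits), with top-left corners standing in for the preset key points; all names are illustrative:

```python
def number_subimages(keypoints):
    """Assign ordinal numbers S = 1, 2, 3, ... to sub-images by sorting
    their key-point coordinates: ascending ordinate (top to bottom)
    first, then ascending abscissa (left to right). `keypoints` is a
    list of (x, y) points, one per sub-image; returns a mapping from
    the sub-image's original index to its ordinal number S."""
    order = sorted(range(len(keypoints)),
                   key=lambda i: (keypoints[i][1], keypoints[i][0]))
    return {idx: s + 1 for s, idx in enumerate(order)}

# Four sub-images of a 2x2 grid, given by their top-left corners (x, y)
corners = [(2, 0), (0, 0), (0, 2), (2, 2)]
numbers = number_subimages(corners)
print(numbers)  # {1: 1, 0: 2, 2: 3, 3: 4}
```

The sub-image whose key point is at (0, 0) gets S = 1, matching the left-to-right, top-to-bottom numbering of Fig. 3 for a uniform grid.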
Based on the image to be processed and the sub-images obtained by segmentation, the embodiment further provides a specific implementation manner for generating the target weights corresponding to the sub-images based on the image to be processed and a plurality of preset initial weights, which may be executed with reference to the following first step and second step:
the method comprises the following steps of firstly, generating a plurality of linear weighting coefficients based on an image to be processed, a plurality of sub-images and a plurality of preset initial weights.
And secondly, generating target weights corresponding to the sub-images based on the linear weighting coefficients and a plurality of preset initial weights.
For ease of understanding, the first and second steps are described separately below.
The embodiment first provides a method for generating a plurality of linear weighting coefficients applied in the first step, which can specifically refer to the following steps:
assuming that the image to be processed is a feature image X to be processed, the feature image X to be processed may be represented by a tensor with a dimension (n, c, h, w); wherein n represents the number of the characteristic images X to be processed; c represents the channel number of the input feature image of the feature extraction network, wherein the feature extraction network is a convolution neural network used for extracting the features of the feature image X to be processed, and can be ResNet34, VGGNet (visual Geometry Group network) and the like; h represents the height of the characteristic image X; w represents the width of the feature image X.
The dimension of each initial weight is (o, c, kh, kw); wherein o represents the number of channels of the output feature image of the feature extraction network, c represents the number of channels of the input feature image of the feature extraction network, kh represents the height of the convolution kernel, and kw represents the width of the convolution kernel.
The initial weight may be a fixed weight manually set based on a general work experience, or may be a weight obtained by optimizing the manually set fixed weight. When the initial weight is a weight obtained by optimizing a fixed weight set by a human, the initial weight may be obtained by referring to:
firstly, carrying out convolution operation on a sample image by adopting a fixed weight set by people to obtain a characteristic extraction result of the sample image; wherein the sample image comprises a plurality of different images. And then reversely adjusting the fixed weight through a preset optimization algorithm based on the feature extraction result of the sample image to obtain an optimized weight, and determining the optimized weight as an initial weight. The optimization algorithm comprises one or more of the following: SGD (Stochastic Gradient, random steepest Descent), BGD (Batch Gradient, Batch Gradient Descent), Adam optimization algorithm, root mean square error algorithm RMSProp. Of course, in practical applications, other optimization algorithms, such as an MBGD (Mini-Batch Gradient Descent) algorithm, may also be used, and are not limited herein.
For ease of understanding, this embodiment describes the method for obtaining the initial weight by taking the BGD algorithm as an example. For each manually preset fixed weight, the procedure is as follows: determine the fixed weight as an initial weight of the convolutional neural network; construct a computation graph for the convolutional neural network, and compute the gradient of the initial weight of the convolutional layer through the back-propagation algorithm based on the computation graph; then iteratively update the initial weight according to the BGD algorithm, the gradient of the initial weight, and the learning rate, until a preset number of iterations is reached or the loss computed on the feature extraction result of the sample image reaches a preset threshold, and take the updated weight as the optimized initial weight.
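As a minimal sketch of the iterative update described above (assuming a toy one-parameter quadratic loss rather than the patent's actual convolutional network), the gradient-descent loop might look like:

```python
# Toy batch-gradient-descent sketch: the loss (w - target)^2 and its gradient
# are hypothetical stand-ins for the network loss on the sample images.
def optimize_weight(w, lr=0.1, num_iters=100, target=3.0):
    for _ in range(num_iters):
        grad = 2.0 * (w - target)  # gradient of the toy loss
        w = w - lr * grad          # update step: move against the gradient
    return w

w_opt = optimize_weight(0.0)  # the optimized weight converges toward the target
```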
Based on the to-be-processed image (i.e., the to-be-processed feature image X) and the initial weight described above, the manner of generating the plurality of linear weighting coefficients in the first step may include the following steps (1) to (3):
(1) Perform a dimension reduction operation on the image to be processed to obtain dimension-reduced image information. In a specific implementation, the values of each image to be processed over the (h, w) dimensions may first be obtained and summed along the height and width; these sums are then averaged, and the resulting spatial average replaces the individual values in the (h, w) dimensions. In this case the h and w dimensions of all images to be processed are treated identically, and the influence of differences in the (h, w) dimensions on the images can be ignored, so the dimension of the image to be processed is reduced and image information of dimension (n, c) is obtained.
(2) Construct a fully connected layer based on the number of sub-images and the number of initial weights. The constructed fully connected layer has input dimension c and output dimension (S, N), where S is the number of sub-images and N is the number of initial weights.
(3) Process the dimension-reduced image information through the fully connected layer to generate a plurality of linear weighting coefficients. Specifically, linear weighting coefficients of dimension (n, S, N) are generated by matrix-multiplying the image information of dimension (n, c) with the fully connected layer of input dimension c and output dimension (S, N). It can be understood that, for the n images to be processed, each image to be processed corresponds to S sub-images and (S, N) linear weighting coefficients.
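Steps (1) to (3) above can be sketched as follows (a non-authoritative illustration; the random fully connected weights and toy shapes are assumptions for demonstration only):

```python
import numpy as np

n, c, h, w = 2, 3, 8, 8
S, N = 4, 2                             # number of sub-images and of initial weights

X = np.random.randn(n, c, h, w)         # images to be processed
pooled = X.mean(axis=(2, 3))            # (1) dimension reduction over (h, w): shape (n, c)
fc = np.random.randn(c, S * N)          # (2) fully connected layer: input c, output S*N
alpha = (pooled @ fc).reshape(n, S, N)  # (3) linear weighting coefficients, dimension (n, S, N)

print(alpha.shape)  # (2, 4, 2)
```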
Based on the generated linear weighting coefficients, the embodiment further provides a method for generating a target weight applied to each sub-image in the second step, which may specifically refer to the following steps:
Considering that the weights actually applied in feature extraction are usually weights optimized during the training of the convolutional neural network, the second step may, in a specific implementation, refer to the following steps 1 to 3:
step 1, generating a target weight corresponding to each sub-image based on a linear weighting coefficient and a plurality of preset initial weights.
Specifically, the target weight corresponding to each sub-image may be generated according to a preset linear weighting formula, a linear weighting coefficient, and a plurality of preset initial weights; wherein, the preset linear weighting formula is as follows:
W_ns = Σ_{i=1}^{N} α_i · w_i
where W_ns is the target weight corresponding to the Sth sub-image of the nth image to be processed, w_i is the ith initial weight, N is the number of initial weights, and α_i is the ith linear weighting coefficient corresponding to the Sth sub-image of the nth image to be processed. The preset linear weighting formula makes visually explicit the dependence of the target weight W_ns on the image to be processed and on the sub-image, including: the input image to be processed changes dynamically, the images to be processed being distinguished by the sequence number n = 1, 2, 3, …, and the target weight W_ns changes dynamically with the sequence number n; and further: for each image to be processed, the sub-images are distinguished by the sequence number S = 1, 2, 3, …, and the target weight W_ns changes dynamically with the sequence number S.
For ease of understanding the above target weight W_ns, reference may be made to the schematic diagram of extracting image features shown in fig. 4. Fig. 4 shows an image to be processed X_1 with n = 1; the image to be processed X_1 is split into 4 sub-images X_1S, namely X_11, X_12, X_13 and X_14; the number of initial weights is N = 2, namely w_1 and w_2; and the linear weighting coefficients α of dimension (n, S, N) comprise: α(1,1,1), α(1,2,1), α(1,3,1), α(1,4,1), α(1,1,2), α(1,2,2), α(1,3,2) and α(1,4,2).
The following target weight W_ns corresponding to each sub-image can be obtained according to the preset linear weighting formula:
W_11 = α(1,1,1)·w_1 + α(1,1,2)·w_2, W_12 = α(1,2,1)·w_1 + α(1,2,2)·w_2, W_13 = α(1,3,1)·w_1 + α(1,3,2)·w_2, and W_14 = α(1,4,1)·w_1 + α(1,4,2)·w_2.
The correspondence between the target weight W_ns and the sub-image is determined according to the value of the sub-image sequence number S; for example, the sub-image X_11 with S = 1 corresponds to the target weight W_11.
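The fig. 4 example can be sketched numerically as follows (random toy values; only the shapes and the linear combination mirror the preset linear weighting formula, and all concrete numbers are assumptions):

```python
import numpy as np

o, c, kh, kw = 4, 3, 3, 3
S, N = 4, 2
w_init = np.random.randn(N, o, c, kh, kw)  # initial weights w_1, w_2
alpha = np.random.randn(S, N)              # coefficients alpha(1, s, i) for the n = 1 image

# W_ns = sum_i alpha_i * w_i, computed for every sub-image s at once
target_W = np.tensordot(alpha, w_init, axes=(1, 0))  # shape (S, o, c, kh, kw)

# e.g. the target weight for sub-image X_11 (s = 1):
W_11 = alpha[0, 0] * w_init[0] + alpha[0, 1] * w_init[1]
```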
After the target weight corresponding to each sub-image is determined in the above manner, referring to fig. 4, the target weight is used to perform a convolution operation on the corresponding sub-image according to the convolution operation formula Y = WX.
Specifically, the target weight W_11 is used to perform a convolution operation on sub-image X_11 to obtain the feature image Y_11 of sub-image X_11; the target weight W_12 is used to perform a convolution operation on sub-image X_12 to obtain the feature image Y_12 of sub-image X_12; the target weight W_13 is used to perform a convolution operation on sub-image X_13 to obtain the feature image Y_13 of sub-image X_13; and the target weight W_14 is used to perform a convolution operation on sub-image X_14 to obtain the feature image Y_14 of sub-image X_14.
Based on the feature images Y_11, Y_12, Y_13 and Y_14 of the sub-images, the feature image Y of the image to be processed X is generated. In this embodiment, the feature images of the sub-images may be spliced according to the preset segmentation rule to obtain the feature image Y of the image to be processed. The feature image corresponding to any image to be processed can be represented as follows:
Y_n = (W_{n,S=1}·X_{S=1}, W_{n,S=2}·X_{S=2}, …, W_{n,S=S}·X_{S=S})
where Y_n is the feature image of the nth image to be processed, and the parentheses ( ) indicate that the S split sub-images are combined according to the preset splitting rule. The feature image Y_n has dimension (n, o, oh, ow), where o is the number of channels of the output feature image of the feature extraction network, oh is the height of the feature image Y_n, and ow is the width of the feature image Y_n. Finally, all the Y_n together form the output feature images of the n input images to be processed.
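A minimal end-to-end sketch of the split / per-sub-image convolution / splice pipeline described above (assuming, for brevity, 1×1 convolutions so that spatial sizes are preserved, and a uniform four-quadrant segmentation; all values are toy assumptions, not the patent's actual network):

```python
import numpy as np

c, o, h, w = 3, 4, 8, 8
X = np.random.randn(c, h, w)  # one image to be processed

# Split into 4 equal quadrants (uniform segmentation rule)
subs = [X[:, :4, :4], X[:, :4, 4:], X[:, 4:, :4], X[:, 4:, 4:]]

# One 1x1 target weight per sub-image (channel mixing only)
Ws = [np.random.randn(o, c) for _ in range(4)]

# Y = W * X for each sub-image: a 1x1 convolution is a matrix product over channels
Ys = [np.tensordot(W_s, sub, axes=(1, 0)) for W_s, sub in zip(Ws, subs)]

# Splice the sub-feature-images back according to the segmentation rule
top = np.concatenate([Ys[0], Ys[1]], axis=2)
bottom = np.concatenate([Ys[2], Ys[3]], axis=2)
Y = np.concatenate([top, bottom], axis=1)

print(Y.shape)  # (4, 8, 8)
```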
In summary, compared with the prior art, in which a fixed weight is adopted regardless of the image whose features are being extracted, the image feature extraction method provided in the above embodiment can generate a corresponding target weight for each sub-image of the image to be processed, and then perform feature extraction based on the target weights of the sub-images, finally obtaining the features of the image to be processed. On the one hand, each sub-image corresponds to its own target weight, and the target weight and the sub-image have a strongly adaptive relationship; performing the convolution operation on a sub-image with a target weight adapted to that sub-image can effectively improve the completeness and accuracy of the feature extraction of the sub-image and the feature extraction effect for each sub-image, thereby comprehensively improving the feature extraction effect for the whole image to be processed. On the other hand, the target weight is generated based on the image to be processed and the initial weights; that is, the target weight depends on the input image to be processed, and different input images yield correspondingly different target weights. Performing feature extraction with a target weight related to the image to be processed therefore extracts the features of the image in a more targeted manner, further improving the feature extraction effect.
Example three:
as to the image feature extraction method provided in the second embodiment, an embodiment of the present invention provides an image feature extraction device, and referring to a block diagram of a structure of an image feature extraction device shown in fig. 5, the device includes:
an image obtaining module 502, configured to obtain an image to be processed with features to be extracted; the image to be processed comprises an original image to be processed or a characteristic image to be processed;
the image segmentation module 504 is configured to segment an image to be processed to obtain a plurality of sub-images;
a weight generating module 506, configured to generate a target weight corresponding to each sub-image based on the image to be processed and a plurality of preset initial weights;
a convolution operation module 508, configured to perform convolution operation on each sub-image by using the target weight corresponding to the sub-image to obtain a feature image of the sub-image;
a feature image generating module 510, configured to generate a feature image of the image to be processed based on the feature image of each sub-image.
The image feature extraction device provided by the embodiment of the present invention segments the acquired image to be processed into a plurality of sub-images; then generates a target weight corresponding to each sub-image based on the image to be processed and a plurality of preset initial weights; and finally performs a convolution operation on each sub-image using the target weight corresponding to that sub-image to obtain the feature image of the sub-image, thereby generating the feature image of the image to be processed based on the feature images of the sub-images. Compared with the prior art, in which a fixed weight is adopted regardless of the image whose features are being extracted, the device provided by this embodiment can generate a corresponding target weight for each sub-image of the image to be processed, and then perform feature extraction based on the target weights of the sub-images, finally obtaining the features of the image to be processed.
On the one hand, each sub-image corresponds to its own target weight, and the target weight and the sub-image have a strongly adaptive relationship; performing the convolution operation on a sub-image with a target weight adapted to that sub-image can effectively improve the completeness and accuracy of the feature extraction of the sub-image and the feature extraction effect for each sub-image, thereby comprehensively improving the feature extraction effect for the whole image to be processed. On the other hand, the target weight is generated based on the image to be processed and the initial weights; that is, the target weight depends on the input image to be processed, and different input images yield correspondingly different target weights. Performing feature extraction with a target weight related to the image to be processed therefore extracts the features of the image in a more targeted manner, further improving the feature extraction effect.
In some embodiments, the weight generation module 506 is further configured to: generating a plurality of linear weighting coefficients based on the image to be processed, the plurality of sub-images and a plurality of preset initial weights; and generating a target weight corresponding to each sub-image based on the linear weighting coefficient and a plurality of preset initial weights.
In some embodiments, the weight generation module 506 is further configured to: performing dimensionality reduction operation on an image to be processed to obtain image information subjected to dimensionality reduction; constructing a full connection layer based on the number of the sub-images and the number of the initial weights; and calculating the image information after the dimension reduction through the full connection layer to generate a plurality of linear weighting coefficients.
In some embodiments, the weight generation module 506 is further configured to: generating a target weight corresponding to each sub-image according to a preset linear weighting formula, the linear weighting coefficients and a plurality of preset initial weights; wherein the preset linear weighting formula is:
W_ns = Σ_{i=1}^{N} α_i · w_i
where W_ns is the target weight corresponding to the Sth sub-image of the nth image to be processed, w_i is the ith initial weight, N is the number of initial weights, and α_i is the ith linear weighting coefficient corresponding to the Sth sub-image of the nth image to be processed.
In some embodiments, the image segmentation module 504 is further configured to segment the image to be processed according to a preset segmentation rule; the preset segmentation rules comprise uniform segmentation rules and/or non-uniform segmentation rules.
In some embodiments, the feature image generation module 510 is further configured to: and splicing the characteristic images of the sub-images according to a preset segmentation rule to obtain the characteristic image of the image to be processed.
In some embodiments, the initial weights are obtained based on an optimization algorithm; the optimization algorithm comprises one or more of stochastic gradient descent (SGD), batch gradient descent (BGD), the Adam optimization algorithm, and RMSProp.
The device provided in this embodiment has the same implementation principle and the same technical effects as those of the foregoing embodiment, and for the sake of brief description, reference may be made to corresponding contents in the foregoing embodiment two for parts of this embodiment that are not mentioned.
Example four:
based on the foregoing embodiments, the present embodiment provides an electronic device, including: a processor and a storage device; the storage device stores a computer program, and the computer program, when executed by the processor, executes any one of the image feature extraction methods provided in embodiment two.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process of the electronic device described above may refer to the corresponding process in the foregoing method embodiment, and is not described herein again.
Further, the present embodiment also provides a computer-readable storage medium, on which a computer program is stored, and when the computer program is executed by a processing device, the computer program performs the steps of any one of the methods provided in the second embodiment.
The computer program product of the image feature extraction method and device and of the electronic device provided by the embodiments of the present invention includes a computer-readable storage medium storing program code; the instructions included in the program code may be used to execute the methods described in the foregoing method embodiments. For specific implementation, reference may be made to the method embodiments, which are not repeated here.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
Finally, it should be noted that: the above-mentioned embodiments are only specific embodiments of the present invention, which are used for illustrating the technical solutions of the present invention and not for limiting the same, and the protection scope of the present invention is not limited thereto, although the present invention is described in detail with reference to the foregoing embodiments, those skilled in the art should understand that: any person skilled in the art can modify or easily conceive the technical solutions described in the foregoing embodiments or equivalent substitutes for some technical features within the technical scope of the present disclosure; such modifications, changes or substitutions do not depart from the spirit and scope of the embodiments of the present invention, and they should be construed as being included therein. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (8)

1. An image feature extraction method, characterized by comprising:
acquiring an image to be processed; the image to be processed comprises an original image to be processed or a characteristic image to be processed;
segmenting the image to be processed to obtain a plurality of sub-images;
generating a target weight corresponding to each sub-image based on the image to be processed and a plurality of preset initial weights;
for each sub-image, performing convolution operation on the sub-image by adopting the target weight corresponding to the sub-image to obtain a characteristic image of the sub-image;
generating a characteristic image of the image to be processed based on the characteristic image of each sub-image;
the step of generating a target weight corresponding to each of the sub-images based on the image to be processed and a plurality of preset initial weights includes:
generating a plurality of linear weighting coefficients based on the image to be processed, the plurality of sub-images and a plurality of preset initial weights;
generating a target weight corresponding to each of the sub-images based on the linear weighting coefficient and the preset plurality of initial weights;
the step of generating a plurality of linear weighting coefficients based on the image to be processed, the plurality of sub-images and a plurality of preset initial weights comprises:
performing dimensionality reduction operation on the image to be processed to obtain dimensionality-reduced image information;
constructing a fully connected layer based on the number of sub-images and the number of initial weights;
and calculating the image information after the dimensionality reduction through the full connection layer to generate a plurality of linear weighting coefficients.
2. The method of claim 1, wherein the step of generating a target weight corresponding to each of the sub-images based on the linear weighting coefficients and a plurality of preset initial weights comprises:
generating a target weight corresponding to each sub-image according to a preset linear weighting formula, the linear weighting coefficients and a plurality of preset initial weights; wherein the preset linear weighting formula is:
W_ns = Σ_{i=1}^{N} α_i · w_i
wherein W_ns is the target weight corresponding to the Sth sub-image of the nth image to be processed, w_i is the ith initial weight, N is the number of initial weights, and α_i is the ith linear weighting coefficient corresponding to the Sth sub-image of the nth image to be processed.
3. The method according to claim 1, wherein the step of segmenting the image to be processed comprises:
segmenting the image to be processed according to a preset segmentation rule; the preset segmentation rules comprise uniform segmentation rules and/or non-uniform segmentation rules.
4. The method according to claim 3, wherein the step of generating the feature image of the image to be processed based on the feature image of each of the sub-images comprises:
and splicing the characteristic images of the sub-images according to the preset segmentation rule to obtain the characteristic image of the image to be processed.
5. The method of claim 1, wherein the initial weights are obtained based on an optimization algorithm; the optimization algorithm comprises one or more of stochastic gradient descent SGD, batch gradient descent BGD, the Adam optimization algorithm, and the root mean square propagation algorithm RMSProp.
6. An apparatus for extracting image features, the apparatus comprising:
the image acquisition module is used for acquiring an image to be processed; the image to be processed comprises an original image to be processed or a characteristic image to be processed;
the image segmentation module is used for segmenting the image to be processed to obtain a plurality of sub-images;
the weight generation module is used for generating target weights corresponding to the sub-images based on the image to be processed and a plurality of preset initial weights;
the convolution operation module is used for performing convolution operation on each sub-image by adopting the target weight corresponding to the sub-image to obtain a characteristic image of the sub-image;
the characteristic image generation module is used for generating a characteristic image of the image to be processed based on the characteristic image of each sub-image;
the weight generation module is further to: generating a plurality of linear weighting coefficients based on the image to be processed, the plurality of sub-images and a plurality of preset initial weights; generating a target weight corresponding to each sub-image based on the linear weighting coefficient and a plurality of preset initial weights;
the weight generation module is further to: performing dimensionality reduction on an image to be processed to obtain image information subjected to dimensionality reduction; constructing a full connection layer based on the number of the sub-images and the number of the initial weights; and calculating the image information after the dimension reduction through the full connection layer to generate a plurality of linear weighting coefficients.
7. An electronic device, comprising: a processor and a storage device;
the storage device has stored thereon a computer program which, when executed by the processor, performs the method of any of claims 1 to 5.
8. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method according to any one of the claims 1 to 5.
CN201910873505.1A 2019-09-12 2019-09-12 Image feature extraction method and device and electronic equipment Active CN110598717B (en)

