CN107766863B - Image characterization method and server - Google Patents

Image characterization method and server Download PDF

Info

Publication number
CN107766863B
CN107766863B (application CN201610694360.5A)
Authority
CN
China
Prior art keywords
matrix
convolution
dimension reduction
matrix set
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610694360.5A
Other languages
Chinese (zh)
Other versions
CN107766863A (en)
Inventor
李�昊
孙修宇
刘巍
潘攀
华先胜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd filed Critical Alibaba Group Holding Ltd
Priority to CN201610694360.5A priority Critical patent/CN107766863B/en
Publication of CN107766863A publication Critical patent/CN107766863A/en
Application granted granted Critical
Publication of CN107766863B publication Critical patent/CN107766863B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/50 Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Processing (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

The embodiment of the application discloses an image characterization method and a server. The image characterization method comprises the following steps: acquiring an image matrix set corresponding to a target image in a preset color space; carrying out dimension reduction processing on the image matrix set to obtain at least one dimension reduction matrix set; calculating according to a preset algorithm by using elements with the same two-dimensional coordinate in a first dimension reduction matrix, a second dimension reduction matrix and a third dimension reduction matrix of the dimension reduction matrix set, and outputting a screening matrix corresponding to the dimension reduction matrix set; based on a screening matrix corresponding to the dimension reduction matrix set, performing data screening on the dimension reduction matrix set to obtain at least one characterization matrix set of the target image; wherein the set of characterization matrices includes a first characterization matrix, a second characterization matrix, and a third characterization matrix. The image characterization method and the server can reduce the operation workload generated when the characterization data of the target image are acquired.

Description

Image characterization method and server
Technical Field
The present application relates to the field of computer technologies, and in particular, to an image characterization method and a server.
Background
With the development of computer technology, images have become an important carrier of information. The characterization of an image generally refers to the process of obtaining characterization data of the image and using that data to characterize the image, where the characterization data is generally data capable of expressing information such as the color, texture, and shape contained in the image. Image characterization has wide application in fields such as image retrieval, image stitching, target detection and recognition, robot scene positioning, and video content analysis. For example, in the field of image search, after a query request including a target image is received, it is generally necessary to obtain the characterization data of the target image and compare it with the characterization data of each image in an image characterization database, in order to return the images in the database that match the query request.
In the prior art, the image characterization method is generally as follows:
the method comprises the steps of segmenting a target image by using a preset image segmentation algorithm to obtain a target area where an interested target is located from the target image, extracting data capable of expressing information such as color, texture and shape contained in the target area from the target area by using a preset data extraction algorithm, and taking the extracted data as representation data of the target image. The image segmentation algorithm may include a Roberts (Roberts) algorithm, a Sobel (Sobel) algorithm, and the like, and the data extraction algorithm may include a Fourier transform algorithm, a wavelet transform algorithm, and the like.
In the process of implementing the present application, the inventor finds that at least the following problems exist in the prior art:
in the prior art, when obtaining the representation data of the target image, a preset image segmentation algorithm is usually required to segment the target image so as to obtain a target region where the target of interest is located from the target image. However, image segmentation algorithms generally have a high computational complexity. Therefore, the method in the prior art described above generates a large amount of calculation work when acquiring the representation data of the target image, and places a large load on the server executing the method in the prior art described above. In particular, when the above-mentioned prior art method is applied to the field of image searching, the characterization data of a very large number of target images may need to be acquired at the same time, which may generate a huge amount of calculation workload, bring a huge load to a server performing the image searching, and even may cause a downtime of the server performing the image searching.
Disclosure of Invention
The embodiment of the application aims to provide an image characterization method and a server so as to reduce the operation workload generated when the characterization data of a target image is acquired.
In order to achieve the above object, an embodiment of the present application provides an image characterization method, including: acquiring an image matrix set corresponding to a target image in a preset color space, wherein the image matrix set at least comprises a first-dimensional matrix, a second-dimensional matrix and a third-dimensional matrix; performing dimensionality reduction on the image matrix set to obtain at least one dimensionality reduction matrix set, wherein each dimensionality reduction matrix set comprises a first dimensionality reduction matrix corresponding to the first dimensionality matrix, a second dimensionality reduction matrix corresponding to the second dimensionality matrix, and a third dimensionality reduction matrix corresponding to the third dimensionality matrix; calculating according to elements with the same two-dimensional coordinate in a first dimension reduction matrix, a second dimension reduction matrix and a third dimension reduction matrix of each dimension reduction matrix set according to a preset algorithm, and outputting a screening matrix corresponding to each dimension reduction matrix set; based on a screening matrix corresponding to the dimension reduction matrix set, performing data screening on each dimension reduction matrix set to obtain at least one characterization matrix set of the target image; wherein the set of characterization matrices includes a first characterization matrix, a second characterization matrix, and a third characterization matrix.
To achieve the above object, an embodiment of the present application provides a server, including: the image matrix set acquisition unit is used for acquiring an image matrix set corresponding to a target image in a preset color space, and the image matrix set at least comprises a first dimension matrix, a second dimension matrix and a third dimension matrix; the image matrix set dimension reduction processing unit is used for carrying out dimension reduction processing on the image matrix set to obtain at least one dimension reduction matrix set, wherein the dimension reduction matrix set comprises a first dimension reduction matrix corresponding to the first dimension matrix, a second dimension reduction matrix corresponding to the second dimension matrix and a third dimension reduction matrix corresponding to the third dimension matrix; the screening matrix output unit is used for calculating according to the elements with the same two-dimensional coordinate in the first dimension reduction matrix, the second dimension reduction matrix and the third dimension reduction matrix of the dimension reduction matrix set according to a preset algorithm and outputting a screening matrix corresponding to the dimension reduction matrix set; the dimension reduction matrix set data screening unit is used for carrying out data screening on the dimension reduction matrix set based on a screening matrix corresponding to the dimension reduction matrix set to obtain at least one representation matrix set of the target image; wherein the set of characterization matrices includes a first characterization matrix, a second characterization matrix, and a third characterization matrix.
To achieve the above object, an embodiment of the present application provides another server, including: a memory for storing program instructions; a processor for performing the functions implemented by the program instructions, comprising: acquiring an image matrix set corresponding to a target image in a preset color space, wherein the image matrix set at least comprises a first-dimensional matrix, a second-dimensional matrix and a third-dimensional matrix; performing dimensionality reduction on the image matrix set to obtain at least one dimensionality reduction matrix set, wherein each dimensionality reduction matrix set comprises a first dimensionality reduction matrix corresponding to the first dimensionality matrix, a second dimensionality reduction matrix corresponding to the second dimensionality matrix, and a third dimensionality reduction matrix corresponding to the third dimensionality matrix; calculating elements with the same two-dimensional coordinates in a first dimension reduction matrix, a second dimension reduction matrix and a third dimension reduction matrix of the dimension reduction matrix set according to a preset algorithm in the memory, and outputting a screening matrix corresponding to the dimension reduction matrix set; based on a screening matrix corresponding to the dimension reduction matrix set, performing data screening on the dimension reduction matrix set to obtain at least one characterization matrix set of the target image; wherein the set of characterization matrices includes a first characterization matrix, a second characterization matrix, and a third characterization matrix.
According to the technical scheme provided by the embodiment of the application, the dimension reduction is carried out on the image matrix set of the image, and data screening is carried out on the dimension reduced matrix after dimension reduction, so that the representation matrix representing the image is obtained. Compared with the prior art, the method and the device have the advantages that when the representation matrix of the target image is obtained, the process of segmenting the target image by using an image segmentation algorithm is avoided, and therefore the operation workload generated when the representation matrix of the target image is obtained can be reduced.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, it is obvious that the drawings in the following description are only some embodiments described in the present application, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a flowchart of an image characterization method according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of a dimension reduction process provided in an embodiment of the present application;
FIG. 3 is a diagram illustrating a dimension reduction process according to an embodiment of the present disclosure;
fig. 4 is a flowchart of an image search application scenario provided in an embodiment of the present application;
fig. 5 is a functional structure diagram of a server according to an embodiment of the present disclosure;
fig. 6 is a schematic functional structure diagram of another server provided in the embodiment of the present application.
Detailed Description
In order to make those skilled in the art better understand the technical solutions in the present application, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments obtained by a person of ordinary skill in the art without any inventive work based on the embodiments in the present application shall fall within the scope of protection of the present application.
Please refer to fig. 1. The embodiment of the application provides an image characterization method, which includes the following steps. The image characterization method in the embodiment of the present application may be executed by an electronic device that has certain image processing and arithmetic capabilities, such as a computer, a server, or a smartphone. Of course, the embodiments of the present application are not limited to the above examples.
Step S10: acquiring an image matrix set corresponding to a target image in a preset color space, wherein the image matrix set at least comprises a first-dimensional matrix, a second-dimensional matrix and a third-dimensional matrix.
In this embodiment, the target image is generally an image whose characterization matrices are to be acquired. The storage format of the target image includes, but is not limited to, the Bitmap format (BMP), the Joint Photographic Experts Group format (JPEG), the Tagged Image File Format (TIFF), and the like.
In this embodiment, the color space is typically a mathematical model that describes color using multiple color channels. Commonly used color spaces include the YUV color space, the YCbCr color space, the RGB color space, the HSL color space, and the like. The YUV color space typically includes 3 color channels, Y, U, and V, where color channel Y represents luminance and color channels U and V represent chrominance. The YCbCr color space typically includes 3 color channels, Y, Cb, and Cr, where color channel Y represents luminance, color channel Cb represents blue chrominance, and color channel Cr represents red chrominance. The RGB color space typically includes 3 color channels, R, G, and B, where color channel R represents red, color channel G represents green, and color channel B represents blue. The HSL color space typically includes 3 color channels, H, S, and L, where color channel H represents hue, color channel S represents saturation, and color channel L represents lightness.
The predetermined color space is typically the color space used when describing the colors of the target image. For example, the predetermined color space of the target image a1 may be an RGB color space.
In this embodiment, the target image may be represented by a plurality of data tables arranged in a preset channel order. The number of the data tables may be the same as the number of the color channels of the predetermined color space, and each data table may correspond to one color channel of the predetermined color space. The preset channel sequence is generally an arrangement sequence of the color channels in the predetermined color space.
For example, the predetermined color space of the target image a2 may be an RGB color space. Then, the target image a2 can be represented by 3 data tables arranged in the preset channel order, namely the data table TAB1, the data table TAB2, and the data table TAB3. The data table TAB1 may correspond to the color channel R of the RGB color space, the data table TAB2 may correspond to the color channel G, and the data table TAB3 may correspond to the color channel B. Since the preset channel order of the RGB color space is generally color channel R, color channel G, color channel B, the 3 data tables may be arranged in the order TAB1, TAB2, TAB3.
In this embodiment, each data table corresponding to the target image in a predetermined color space may be used as one dimensional matrix. Then, the target image may be represented by an image matrix set, and the image matrix set may include a plurality of dimension matrices arranged in a preset channel order. Each dimension matrix may correspond to a data table of the target image, i.e. may correspond to a color channel of the predetermined color space. The number of dimensional matrices in the image matrix set may be the same as the number of data tables of the target image, i.e., may be the same as the number of color channels of the predetermined color space.
Since a common color space usually includes at least 3 color channels, the image matrix set may also include at least 3 dimension matrices; that is, the image matrix set may include at least a first dimension matrix, a second dimension matrix, and a third dimension matrix. The first dimension matrix, the second dimension matrix, and the third dimension matrix may each correspond to one color channel of the predetermined color space, and may be arranged in the preset channel order.
The first, second, and third dimensional matrices may have the same number of rows and columns. The number of rows may be the same as the pixel length of the target image and the number of columns may be the same as the pixel width of the target image. The pixel length generally refers to a length of an image in a unit of a pixel, and the pixel width generally refers to a width of the image in a unit of a pixel. For example, the target image a1 may have a pixel length of 120 and a pixel width of 100.
Each element with the same two-dimensional coordinate in the first dimensional matrix, the second dimensional matrix and the third dimensional matrix may correspond to a pixel point of the target image. The two-dimensional coordinates of an element are typically the coordinates formed by the number of rows and columns of the element.
In a specific application scenario, the target image may correspond to an image rectangular coordinate system, which may be established with the upper left corner of the target image as the origin, the vertically downward direction as the positive direction of the Y axis, and the horizontally rightward direction as the positive direction of the X axis. The target image may have a pixel length of L and a pixel width of W, where W and L are positive integers. The predetermined color space may be an RGB color space. The first dimension matrix may be B_{W×L} and may correspond to the color channel R of the RGB color space; the second dimension matrix may be D_{W×L} and may correspond to the color channel G of the RGB color space; the third dimension matrix may be E_{W×L} and may correspond to the color channel B of the RGB color space. Then, the element b_{ij} in the first dimension matrix B_{W×L}, the element d_{ij} in the second dimension matrix D_{W×L}, and the element e_{ij} in the third dimension matrix E_{W×L} may be elements having the same two-dimensional coordinates (i, j). The elements b_{ij}, d_{ij}, and e_{ij} may correspond to the pixel point PX_{ij} of the target image, and the coordinate value of the pixel point PX_{ij} in the image rectangular coordinate system may be (i-1, j-1). The element b_{ij} may be the gray-scale value of the R channel of the pixel point PX_{ij}; the element d_{ij} may be the gray-scale value of the G channel of the pixel point PX_{ij}; the element e_{ij} may be the gray-scale value of the B channel of the pixel point PX_{ij}. Here, i is a positive integer between 1 and W, and j is a positive integer between 1 and L.
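For illustration only, the following sketch (not part of the original disclosure) shows one way step S10 could be realized with NumPy, assuming the target image has already been decoded into a height × width × 3 array in the RGB color space; the helper name image_matrix_set and the array layout are assumptions.

import numpy as np

def image_matrix_set(rgb_array: np.ndarray):
    """Split an RGB image array into the first, second, and third dimension
    matrices, one per color channel, in the preset channel order R, G, B."""
    assert rgb_array.ndim == 3 and rgb_array.shape[2] == 3
    first_dim = rgb_array[:, :, 0]   # gray-scale values of the R channel
    second_dim = rgb_array[:, :, 1]  # gray-scale values of the G channel
    third_dim = rgb_array[:, :, 2]   # gray-scale values of the B channel
    return first_dim, second_dim, third_dim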
In this embodiment, the subject of the image matrix set for acquiring the target image in the predetermined color space may be a server. The target image may be sent from the client to the server, and then the server may receive the target image sent from the client and may obtain an image matrix corresponding to the target image in a predetermined color space. Or, the target image may also be an image pre-stored in the server, and then the server may read the stored target image and may obtain an image matrix corresponding to the target image in a predetermined color space.
Step S11: and performing dimensionality reduction on the image matrix set to obtain at least one dimensionality reduction matrix set, wherein the dimensionality reduction matrix set comprises a first dimensionality reduction matrix corresponding to the first dimensionality matrix, a second dimensionality reduction matrix corresponding to the second dimensionality matrix, and a third dimensionality reduction matrix corresponding to the third dimensionality matrix.
Please refer to fig. 1 and fig. 2 together. In one embodiment, the image matrix set is subjected to dimension reduction, and a Convolutional Neural Network (CNN) may be used. The convolutional neural network dimensionality reduction model can be generally constructed by using historical data, wherein the historical data can comprise image data, matrix data and the like. For example, a dimension reduction modeling model may be established in advance and may be trained using image data to generate a convolutional neural network dimension reduction model. The training targets for the dimensionality reduction modeling model may be: under the condition of less sacrifice of information contained in the target image, the dimension reduction processing can be accurately carried out on the dimension matrix through the convolutional neural network dimension reduction model generated by the dimension reduction modeling model.
In the present embodiment, the dimension of the matrix is generally positively related to the number of rows, columns, number of elements, and rank of the matrix. Then, the dimension of the matrix may be represented using the number of rows, columns, number of elements, or rank of the matrix. The dimensions of the first dimension matrix, the second dimension matrix, and the third dimension matrix are typically higher. In order to reduce the operation workload when obtaining the representation data of the target image, the first dimension matrix, the second dimension matrix and the third dimension matrix may be substituted into a preset convolutional neural network dimension reduction model to perform dimension reduction processing so as to reduce the dimensions of the first dimension matrix, the second dimension matrix and the third dimension matrix.
The preset convolutional neural network dimension reduction model can comprise at least one first layer convolutional matrix set, wherein each first layer convolutional matrix set can be constructed by adopting historical data. For example, the preset convolutional neural network dimension reduction model may include 1, 2, 10, 50, or 1000 sets of first-layer convolutional matrices.
Each of the first layer convolution matrix sets may perform dimensionality reduction on the image matrix set from different angles, which may include color, texture, shape, and the like. For example, the preset convolutional neural network dimensionality reduction model may include 3 first-layer convolutional matrix sets, that is, may include a first-layer convolutional matrix set, a second first-layer convolutional matrix set, and a third first-layer convolutional matrix set. The first layer convolution matrix set can perform dimensionality reduction on the image matrix set from the perspective of color, the second first layer convolution matrix set can perform dimensionality reduction on the image matrix set from the perspective of texture, and the third first layer convolution matrix set can perform dimensionality reduction on the image matrix set from the perspective of shape.
Each of the first layer convolution matrix sets may include a first convolution matrix, a second convolution matrix, and a third convolution matrix, where the first convolution matrix, the second convolution matrix, and the third convolution matrix may be constructed using historical data, respectively. The first, second, and third convolution matrices may have the same number of rows and columns.
Each of the dimension reduction matrix sets may include a first dimension reduction matrix corresponding to the first dimension matrix, a second dimension reduction matrix corresponding to the second dimension matrix, and a third dimension reduction matrix corresponding to the third dimension matrix. The first, second, and third dimension reduction matrices may have the same number of rows and columns.
Each first-layer convolution matrix set may correspond to one dimension reduction matrix set, and each dimension reduction matrix set may correspond to one first-layer convolution matrix set. Then, the dimension reduction processing may include: performing a convolution operation between the first convolution matrix of each first-layer convolution matrix set and the first dimension matrix, and taking the obtained operation result as the first dimension reduction matrix in the corresponding dimension reduction matrix set, where the corresponding dimension reduction matrix set may be the dimension reduction matrix set corresponding to that first-layer convolution matrix set; performing a convolution operation between the second convolution matrix of each first-layer convolution matrix set and the second dimension matrix, and taking the obtained operation result as the second dimension reduction matrix in the corresponding dimension reduction matrix set; and performing a convolution operation between the third convolution matrix of each first-layer convolution matrix set and the third dimension matrix, and taking the obtained operation result as the third dimension reduction matrix in the corresponding dimension reduction matrix set.
The number of rows of the first dimension matrix is generally greater than or equal to the number of rows of the first convolution matrix in each first-layer convolution matrix set, and the number of columns of the first dimension matrix is generally greater than or equal to the number of columns of the first convolution matrix in each first-layer convolution matrix set. Then, for each first-layer convolution matrix set, the process of performing a convolution operation between the first convolution matrix of that set and the first dimension matrix to obtain an operation result may include: dividing the first dimension matrix into at least one sub-block dimension matrix; and performing a convolution operation between each sub-block dimension matrix and the first convolution matrix of the first-layer convolution matrix set to obtain the operation result of the first convolution matrix in the first-layer convolution matrix set. The number of rows of each sub-block dimension matrix of the first dimension matrix may be the same as the number of rows of the first convolution matrix in the first-layer convolution matrix set; the number of columns of each sub-block dimension matrix of the first dimension matrix may be the same as the number of columns of the first convolution matrix in the first-layer convolution matrix set; and the elements of the sub-block dimension matrices of the first dimension matrix may be entirely different from one another or partially the same.
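As a rough sketch of the sub-block convolution just described (the stride of 1, the absence of kernel flipping, and the "valid" coverage of the matrix are assumptions, none of which are fixed by the text), each sub-block dimension matrix has the same size as the convolution matrix and contributes one element of the operation result:

import numpy as np

def convolve_dimension_matrix(dim_matrix: np.ndarray, conv_matrix: np.ndarray, stride: int = 1) -> np.ndarray:
    """Slide conv_matrix over dim_matrix; each covered sub-block dimension matrix
    is multiplied element by element with conv_matrix and summed."""
    kh, kw = conv_matrix.shape
    h, w = dim_matrix.shape
    out_h = (h - kh) // stride + 1
    out_w = (w - kw) // stride + 1
    result = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            sub_block = dim_matrix[i * stride:i * stride + kh, j * stride:j * stride + kw]
            result[i, j] = np.sum(sub_block * conv_matrix)
    return result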
In a specific application scenario, the preset convolutional neural network dimension reduction model may include a first-layer convolution matrix set MD1, a first-layer convolution matrix set MD2, and a first-layer convolution matrix set MD3. The first-layer convolution matrix set MD1 may include a first convolution matrix MD1_MT1, a second convolution matrix MD1_MT2, and a third convolution matrix MD1_MT3. The first-layer convolution matrix set MD2 may include a first convolution matrix MD2_MT1, a second convolution matrix MD2_MT2, and a third convolution matrix MD2_MT3. The first-layer convolution matrix set MD3 may include a first convolution matrix MD3_MT1, a second convolution matrix MD3_MT2, and a third convolution matrix MD3_MT3.
The at least one dimension reduction matrix set may include a dimension reduction matrix set RD1, a dimension reduction matrix set RD2, and a dimension reduction matrix set RD3. Then, the first convolution matrix MD1_MT1 may be convolved with the first dimension matrix, and the operation result may be used as the first dimension reduction matrix in the dimension reduction matrix set RD1; the first convolution matrix MD2_MT1 may be convolved with the first dimension matrix, and the operation result may be used as the first dimension reduction matrix in the dimension reduction matrix set RD2; the first convolution matrix MD3_MT1 may be convolved with the first dimension matrix, and the operation result may be used as the first dimension reduction matrix in the dimension reduction matrix set RD3. The second convolution matrix MD1_MT2 may be convolved with the second dimension matrix, and the operation result may be used as the second dimension reduction matrix in the dimension reduction matrix set RD1; the second convolution matrix MD2_MT2 may be convolved with the second dimension matrix, and the operation result may be used as the second dimension reduction matrix in the dimension reduction matrix set RD2; the second convolution matrix MD3_MT2 may be convolved with the second dimension matrix, and the operation result may be used as the second dimension reduction matrix in the dimension reduction matrix set RD3. The third convolution matrix MD1_MT3 may be convolved with the third dimension matrix, and the operation result may be used as the third dimension reduction matrix in the dimension reduction matrix set RD1; the third convolution matrix MD2_MT3 may be convolved with the third dimension matrix, and the operation result may be used as the third dimension reduction matrix in the dimension reduction matrix set RD2; the third convolution matrix MD3_MT3 may be convolved with the third dimension matrix, and the operation result may be used as the third dimension reduction matrix in the dimension reduction matrix set RD3.
As shown in fig. 3, in an embodiment, the dimension reduction processing on the image matrix set may also be performed on the first dimension matrix, the second dimension matrix, and the third dimension matrix by means of block mapping. Specifically, a matrix can be divided into a plurality of different small matrices, each small matrix is operated on to obtain a representative value, and the obtained representative values form a new matrix, thereby realizing dimension reduction of the matrix.
Step S12: and calculating according to the elements with the same two-dimensional coordinates in the first dimension reduction matrix, the second dimension reduction matrix and the third dimension reduction matrix of the dimension reduction matrix set according to a preset algorithm, and outputting a screening matrix corresponding to the dimension reduction matrix set.
In this embodiment, the predetermined algorithm is generally a mathematical model for establishing the screening matrix. The preset algorithm can be obtained by training through historical data. The historical data may include image data and matrix data. The training targets for the preset algorithm may be: the screening matrix output by the preset algorithm can accurately screen the elements in the matrix.
In this embodiment, the target image may generally include a background region and a target region, where the target region is generally the region where the target of interest is located. The first dimension reduction matrix, the second dimension reduction matrix, and the third dimension reduction matrix include elements reflecting the target region and elements reflecting the background region. In order to enable the acquired characterization data to accurately reflect the information contained in the target image, the elements having the same two-dimensional coordinates in the first dimension reduction matrix, the second dimension reduction matrix, and the third dimension reduction matrix of each dimension reduction matrix set may be used as a value set for operation, and it may be determined whether the elements at those two-dimensional coordinates contain feature data of the target image. If the elements at a two-dimensional coordinate are judged to contain feature data, they may be retained; if they are judged not to contain feature data of the target image, for example because they correspond to the background part of the target image, the elements at that two-dimensional coordinate may be removed. In this way, the background part is removed from the characterization data, the elements identifying the feature data are retained, and the obtained characterization data can better represent the target image.
In this embodiment, the preset algorithm may include one or more preset value sets. The number of values in each preset value set may be one or more, and different preset value sets may contain the same number of values. Since a common color space typically includes at least 3 color channels, each preset value set may include at least 3 values.
In this embodiment, each dimension reduction matrix set may correspond to one preset value set. The preset value sets corresponding to different dimensionality reduction matrix sets can be the same or different. Then, the process of generating the screening matrix corresponding to each dimension reduction matrix set may include: for each dimension reduction matrix set, elements with the same two-dimensional coordinates in the first dimension reduction matrix, the second dimension reduction matrix and the third dimension reduction matrix of the dimension reduction matrix set can be used as a numerical value set, a preset numerical value set corresponding to the dimension reduction matrix set is operated, and an operation result is compared with a preset screening threshold value to determine the value of the two-dimensional coordinates in the screening matrix.
The preset screening threshold may be determined according to actual needs, and may be, for example, 0.1, 0.2, 0.5, 1, 10, 50, or 1000.
The predetermined algorithm may include an algorithm used when performing an operation between two value sets. For example, the value set R1 may include the values r_1, r_2, and r_3, and the value set R2 may include the values r_4, r_5, and r_6. Then, operating on the value set R1 and the value set R2 according to the preset algorithm yields the value RS, where RS = r_1×r_4 + r_2×r_5 + r_3×r_6.
In a specific application scenario, a value set formed by elements in the first dimension reduction matrix, the second dimension reduction matrix and the third dimension reduction matrix of the dimension reduction matrix set S, each having the same two-dimensional coordinate, may be used as a target value set. The elements of the target value set are typically arranged in order; then the target value set may be treated as a 3-dimensional target vector, and the elements of the target vector may be elements in the target value set. The numerical values of the preset numerical value set corresponding to the dimension reduction matrix set S are generally arranged in sequence; then, the preset value set may be used as a preset vector, and the elements of the preset vector may be elements in the preset value set. Then, the operating the target value set and the preset value set according to a preset algorithm may include: and carrying out inner product operation on the target vector corresponding to the target numerical value set and a preset vector corresponding to the preset numerical value set.
The process of outputting the screening matrix may include: when the operation result is larger than the preset screening threshold, the value of the element at the corresponding two-dimensional coordinates of the screening matrix is a first value; when the operation result is smaller than the preset screening threshold, the value of the element at the corresponding two-dimensional coordinates of the screening matrix is a second value, where the first value is larger than the second value. In a specific example, the first value may be 1 and the second value may be 0, so that the screening matrix is a matrix whose elements are all 0s and 1s.
It can be understood that when the operation result is equal to the preset screening threshold, the value of the element of the two-dimensional coordinate may be set according to actual needs, and may be the first value or the second value. In a specific example, when the operation result is equal to the preset filtering threshold, the value of the element of the two-dimensional coordinate is the first value.
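A minimal sketch of the screening-matrix computation for one dimension reduction matrix set, assuming NumPy arrays and treating the tie case as the first value (as in the specific example above); the function name screening_matrix is an illustrative placeholder:

import numpy as np

def screening_matrix(s1, s2, s3, preset_values, threshold, first_value=1, second_value=0):
    """For each two-dimensional coordinate, form the target vector from the three
    dimension reduction matrices, take its inner product with the preset vector,
    and compare the result with the preset screening threshold."""
    t1, t2, t3 = preset_values
    inner = s1 * t1 + s2 * t2 + s3 * t3   # element-wise inner products
    return np.where(inner >= threshold, first_value, second_value)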
In a specific application scenario, the dimension reduction matrix set S may include a first dimension reduction matrix S1_{2×2}, a second dimension reduction matrix S2_{2×2}, and a third dimension reduction matrix S3_{2×2}, each of which may be a 2×2 matrix. The preset value set corresponding to the dimension reduction matrix set S may be T, and the preset value set T may include the values t_1, t_2, and t_3. The screening matrix corresponding to the dimension reduction matrix set S may be a 2×2 matrix U_{2×2}. The preset screening threshold may be V_1, the first value may be V_2, and the second value may be V_3.
Then, the element s1_{11} of the first dimension reduction matrix S1_{2×2}, the element s2_{11} of the second dimension reduction matrix S2_{2×2}, and the element s3_{11} of the third dimension reduction matrix S3_{2×2} may be taken as a value set and operated on with the preset value set T according to the preset operation rule, i.e., s1_{11}×t_1 + s2_{11}×t_2 + s3_{11}×t_3. This value may be compared with the preset screening threshold to determine the value of the element u_{11} in the screening matrix U_{2×2}: when s1_{11}×t_1 + s2_{11}×t_2 + s3_{11}×t_3 is greater than the preset screening threshold V_1, the value of the element u_{11} may be the first value V_2; otherwise, the value of the element u_{11} may be the second value V_3.
Similarly, the value of s1_{12}×t_1 + s2_{12}×t_2 + s3_{12}×t_3 may be compared with the preset screening threshold V_1 to determine the value of the element u_{12} in the screening matrix U_{2×2}: when it is greater than the preset screening threshold V_1, the value of the element u_{12} may be the first value V_2; otherwise, the value of the element u_{12} may be the second value V_3.
Similarly, the value of s1_{21}×t_1 + s2_{21}×t_2 + s3_{21}×t_3 may be compared with the preset screening threshold V_1 to determine the value of the element u_{21} in the screening matrix U_{2×2}: when it is greater than the preset screening threshold V_1, the value of the element u_{21} may be the first value V_2; otherwise, the value of the element u_{21} may be the second value V_3.
Similarly, the value of s1_{22}×t_1 + s2_{22}×t_2 + s3_{22}×t_3 may be compared with the preset screening threshold V_1 to determine the value of the element u_{22} in the screening matrix U_{2×2}: when it is greater than the preset screening threshold V_1, the value of the element u_{22} may be the first value V_2; otherwise, the value of the element u_{22} may be the second value V_3.
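Using the screening_matrix sketch above with purely hypothetical numbers (none of these values appear in the original publication), the computation of U_{2×2} could look like:

import numpy as np

# Hypothetical 2 x 2 dimension reduction matrices and preset values.
s1 = np.array([[0.2, 0.9], [0.1, 0.8]])
s2 = np.array([[0.3, 0.7], [0.2, 0.6]])
s3 = np.array([[0.1, 0.8], [0.0, 0.9]])
u = screening_matrix(s1, s2, s3, preset_values=(0.5, 0.3, 0.2), threshold=0.5)
# Inner products: 0.21, 0.82, 0.11, 0.76 -> u = [[0, 1], [0, 1]]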
Step S13: based on a screening matrix corresponding to the dimension reduction matrix set, performing data screening on the dimension reduction matrix set to obtain at least one characterization matrix set of the target image; wherein the set of characterization matrices includes a first characterization matrix, a second characterization matrix, and a third characterization matrix.
In this embodiment, the characterization matrix is generally a matrix for expressing the characteristic information included in the target image. The information may typically include color, texture, and shape information.
In this embodiment, the first dimension reduction matrix, the second dimension reduction matrix, and the third dimension reduction matrix of each dimension reduction matrix set may be respectively operated with the screening matrix corresponding to the dimension reduction matrix set to obtain at least one characterization matrix set.
In this embodiment, the screening matrix, the first dimension reduction matrix, the second dimension reduction matrix, and the third dimension reduction matrix may have the same number of rows and columns. Then, for each dimension reduction matrix set, the first dimension reduction matrix, the second dimension reduction matrix, and the third dimension reduction matrix of the dimension reduction matrix set may be subjected to bitwise product with the screening matrix corresponding to the dimension reduction matrix set, so as to obtain a characterization matrix correspondingly. Specifically, the first dimension reduction matrix, the second dimension reduction matrix and the third dimension reduction matrix are respectively subjected to a bitwise product with the screening matrix to obtain a first characterization matrix, a second characterization matrix and a third characterization matrix.
In one specific example, the values of the elements in the screening matrix are selected from 1 and 0. Thus, after the bitwise (element-wise) product of a dimension reduction matrix with the screening matrix, the elements aligned with 0 are removed, and the elements of the dimension reduction matrix aligned with the value 1 in the screening matrix are retained in the generated characterization matrix. In this way, the screening matrix removes the elements of the dimension reduction matrix that correspond to the background part of the image and keeps the elements that correspond to the feature information in the image.
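A brief sketch of this data screening step, assuming the screening matrix is a 0/1 NumPy array with the same number of rows and columns as the dimension reduction matrices; the function name is illustrative:

def characterization_matrix_set(first_red, second_red, third_red, screen):
    """Bitwise (element-wise) product with the screening matrix: elements aligned
    with 0 are removed, elements aligned with 1 (feature information) are kept."""
    return first_red * screen, second_red * screen, third_red * screen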
In this embodiment, the first characterization matrix, the second characterization matrix, and the third characterization matrix may be further processed to obtain characterization data in other forms. For example, the elements in each characterization matrix obtained according to the above algorithm may be cumulatively summed, so that each characterization matrix set yields one value, and the at least one value obtained is used as the characterization data of the target image. Of course, other variations are possible; any method of generating characterization data that is based on the operations of the embodiments of the present application falls within the scope of protection of the present application.
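One possible form of the further processing mentioned above, sketched under the assumption that each characterization matrix set is a tuple of NumPy arrays and that a simple cumulative sum is used:

def characterization_values(characterization_sets):
    """Cumulatively sum the elements of each characterization matrix set so that
    every set yields a single value of characterization data."""
    return [float(sum(matrix.sum() for matrix in char_set)) for char_set in characterization_sets]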
In one embodiment, in step S11, in order to refine the processing so that the obtained characterization matrix can more accurately reflect the information contained in the target image, the step of performing dimension reduction processing on the image matrix set may further include: performing a convolution operation between the result of convolving the image matrix set with the at least one first-layer convolution matrix set and a second-layer convolution matrix set.
In this embodiment, each set of the second layer convolution matrices may be constructed using historical data. Each of the second set of layer convolution matrices may include a fourth convolution matrix, a fifth convolution matrix, and a sixth convolution matrix. The fourth, fifth, and sixth convolution matrices may have the same number of rows and columns. The number of rows of the fourth convolution matrix is generally less than or equal to the number of rows of the first convolution matrix, and the number of columns of the fourth convolution matrix is generally less than or equal to the number of columns of the first convolution matrix.
Each set of first layer convolution matrices may correspond to at least one set of second layer convolution matrices; each of the second set of layer convolution matrices may correspond to at least one of the first set of layer convolution matrices. Then, the performing dimension reduction processing may further include: performing convolution operation on an operation result of a first convolution matrix in each first layer of convolution matrix set and a fourth convolution matrix corresponding to a second layer of convolution matrix set respectively, and taking an obtained operation result as a first dimension reduction matrix in the corresponding dimension reduction matrix set, wherein the corresponding second layer of convolution matrix set can be a second layer of convolution matrix set corresponding to the first layer of convolution matrix set, and the corresponding dimension reduction matrix set can be a dimension reduction matrix set corresponding to the first layer of convolution matrix set; performing convolution operation on the operation result of the second convolution matrix in each first layer of convolution matrix set and a fifth convolution matrix corresponding to the second layer of convolution matrix set respectively, and taking the obtained operation result as a second dimension reduction matrix in a corresponding dimension reduction matrix set, wherein the corresponding second layer of convolution matrix set can be the second layer of convolution matrix set corresponding to the first layer of convolution matrix set, and the corresponding dimension reduction matrix set can be the dimension reduction matrix set corresponding to the first layer of convolution matrix set; and performing convolution operation on an operation result of a third convolution matrix in each first layer of convolution matrix set and a sixth convolution matrix corresponding to a second layer of convolution matrix set, and taking the obtained operation result as a third dimension reduction matrix in the corresponding dimension reduction matrix set, wherein the corresponding second layer of convolution matrix set can be a second layer of convolution matrix set corresponding to the first layer of convolution matrix set, and the corresponding dimension reduction matrix set can be a dimension reduction matrix set corresponding to the first layer of convolution matrix set.
Specifically, the operation result of the first convolution matrix in each first layer convolution matrix set may not be directly used as the first dimension reduction matrix in the corresponding dimension reduction matrix set; the operation result of the first convolution matrix in each first layer of convolution matrix set may be convolved with the fourth convolution matrix corresponding to the second layer of convolution matrix set again, and the operation result after the operation is performed again may be used as the first dimension reduction matrix in the corresponding dimension reduction matrix set.
Similarly, the operation result of the second convolution matrix in each first layer convolution matrix set may not be directly used as the second dimension reduction matrix in the corresponding dimension reduction matrix set; the operation result of the second convolution matrix in each first layer of convolution matrix set may be subjected to convolution operation with the fifth convolution matrix corresponding to the second layer of convolution matrix set again, and the operation result after the convolution operation is performed again may be used as the second dimension reduction matrix in the corresponding dimension reduction matrix set.
Similarly, the operation result of the third convolution matrix in each first layer convolution matrix set may not be directly used as the third dimension reduction matrix in the corresponding dimension reduction matrix set; the operation result of the third convolution matrix in each first layer of convolution matrix set may be subjected to convolution operation with the sixth convolution matrix corresponding to the second layer of convolution matrix set again, and the operation result after the convolution operation is performed again may be used as the third dimension reduction matrix in the corresponding dimension reduction matrix set.
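A sketch of the stacked two-layer structure described above, reusing the convolve_dimension_matrix helper from the earlier sketch and assuming, as one of the correspondences the text allows, one second-layer convolution matrix set per first-layer convolution matrix set:

def two_layer_reduction(dim_matrices, first_layer_set, second_layer_set):
    """Convolve each dimension matrix with its first-layer convolution matrix, then
    convolve that operation result again with the corresponding fourth, fifth, or
    sixth convolution matrix of the second-layer convolution matrix set."""
    reduced = []
    for dim_matrix, conv1, conv2 in zip(dim_matrices, first_layer_set, second_layer_set):
        intermediate = convolve_dimension_matrix(dim_matrix, conv1)
        reduced.append(convolve_dimension_matrix(intermediate, conv2))
    return tuple(reduced)  # first, second, and third dimension reduction matrices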
Of course, in this embodiment, the predetermined convolutional neural network may further include more levels of convolutional matrix sets for further refinement. The technical solution can be explained with reference to the foregoing, and is not described herein again.
Please refer to fig. 3. In another embodiment, in step S11, in order to refine the obtained representation matrix so that the obtained representation matrix can more accurately reflect the information contained in the target image, a block mapping process may be performed on the operation result of the first convolution matrix in each first layer convolution matrix set, and the obtained process result may be used as the first dimension reduction matrix in the corresponding dimension reduction matrix set, where the corresponding dimension reduction matrix set may be the dimension reduction matrix set corresponding to the first layer convolution matrix set; the operation result of the second convolution matrix in each first layer of convolution matrix set may be subjected to block mapping processing, and the obtained processing result is used as the second dimension reduction matrix in the corresponding dimension reduction matrix set, where the corresponding dimension reduction matrix set may be the dimension reduction matrix set corresponding to the first layer of convolution matrix set; the operation result of the third convolution matrix in each first layer of convolution matrix set may be subjected to block mapping processing, and the obtained processing result is used as the third dimension reduction matrix in the corresponding dimension reduction matrix set, where the corresponding dimension reduction matrix set may be the dimension reduction matrix set corresponding to the first layer of convolution matrix set.
Specifically, the operation result of the first convolution matrix in each first layer convolution matrix set may not be directly used as the first dimension reduction matrix in the corresponding dimension reduction matrix set; instead, the block mapping processing may be performed on the operation result of the first convolution matrix in each first layer convolution matrix set, and the obtained processing result is used as the first dimension reduction matrix corresponding to the dimension reduction matrix set.
For each first layer convolution matrix set, performing block mapping on an operation result of the first convolution matrix in the first layer convolution matrix set to obtain a processing result, which may include:
dividing the operation result of the first convolution matrix in the first layer of convolution matrix set into at least one subblock mapping matrix; respectively obtaining the mapping value of each subblock mapping matrix of the first convolution matrix in the first layer of convolution matrix set; and generating a processing result of the first convolution matrix in the first layer convolution matrix set based on the mapping value of each sub-block mapping matrix of the first convolution matrix in the first layer convolution matrix set. The row number and the column number of each subblock mapping matrix can be preset values; the elements of each sub-block mapping matrix of the first convolution matrix in the first layer of convolution matrix set can be different or partially the same; the mapping value may be an average value, a maximum value, a minimum value, or a median of each element in the subblock mapping matrix.
In a specific application scenario, the operation result of the first convolution matrix in a certain first-layer convolution matrix set may be a 6×6 matrix P_{6×6}. The matrix P_{6×6} may be divided into 4 sub-block mapping matrices. The first sub-block mapping matrix may include the elements p_{11}, p_{12}, p_{13}, p_{21}, p_{22}, p_{23}, p_{31}, p_{32}, and p_{33}; the second sub-block mapping matrix may include the elements p_{14}, p_{15}, p_{16}, p_{24}, p_{25}, p_{26}, p_{34}, p_{35}, and p_{36}; the third sub-block mapping matrix may include the elements p_{41}, p_{42}, p_{43}, p_{51}, p_{52}, p_{53}, p_{61}, p_{62}, and p_{63}; the fourth sub-block mapping matrix may include the elements p_{44}, p_{45}, p_{46}, p_{54}, p_{55}, p_{56}, p_{64}, p_{65}, and p_{66}. Then, the average value of the elements of the first sub-block mapping matrix may be used as the mapping value of the first sub-block mapping matrix; the average value of the elements of the second sub-block mapping matrix may be used as the mapping value of the second sub-block mapping matrix; the average value of the elements of the third sub-block mapping matrix may be used as the mapping value of the third sub-block mapping matrix; and the average value of the elements of the fourth sub-block mapping matrix may be used as the mapping value of the fourth sub-block mapping matrix. Based on the mapping value of each sub-block mapping matrix of the first convolution matrix in the first-layer convolution matrix set, the processing result of the first convolution matrix in the first-layer convolution matrix set, i.e., a 2×2 first dimension reduction matrix
Q_{2×2} = [ q_{11}  q_{12} ; q_{21}  q_{22} ],
can be generated. In the first dimension reduction matrix Q_{2×2}, q_{11} may be the average value of the elements of the first sub-block mapping matrix; q_{12} may be the average value of the elements of the second sub-block mapping matrix; q_{21} may be the average value of the elements of the third sub-block mapping matrix; and q_{22} may be the average value of the elements of the fourth sub-block mapping matrix.
The process of performing block mapping on the operation result of the second convolution matrix in each first layer convolution matrix set to obtain the processing result may be similar to the process of performing block mapping on the operation result of the first convolution matrix in each first layer convolution matrix set to obtain the processing result.
The process of performing block mapping on the operation result of the third convolution matrix in each first layer convolution matrix set to obtain the processing result may be similar to the process of performing block mapping on the operation result of the first convolution matrix in each first layer convolution matrix set to obtain the processing result.
Of course, in this embodiment, in order to further refine the processing, the processing result corresponding to the first convolution matrix in each first layer convolution matrix set may not be directly used as the first dimension reduction matrix in the corresponding dimension reduction matrix set; instead, the processing result corresponding to the first convolution matrix in each first layer convolution matrix set may be subjected to block mapping again, and the processing result after the block mapping is performed again is used as the first dimension reduction matrix in the corresponding dimension reduction matrix set. The processing result corresponding to the first convolution matrix may be a processing result obtained by performing block mapping on the operation result of the first convolution matrix.
Similarly, the processing result corresponding to the second convolution matrix in each first layer convolution matrix set may not be directly used as the second dimension reduction matrix in the corresponding dimension reduction matrix set; instead, the processing result corresponding to the second convolution matrix in each first layer convolution matrix set may be subjected to block mapping again, and the processed result is used as the second dimension reduction matrix corresponding to the dimension reduction matrix set. The processing result corresponding to the second convolution matrix may be a processing result obtained by performing block mapping on the operation result of the second convolution matrix.
Similarly, the processing result corresponding to the third convolution matrix in each first layer convolution matrix set may not be directly used as the third dimension reduction matrix in the corresponding dimension reduction matrix set; instead, the processing result corresponding to the third convolution matrix in each first layer convolution matrix set may be subjected to block mapping again, and the processed result is used as the third dimension reduction matrix corresponding to the dimension reduction matrix set. The processing result corresponding to the third convolution matrix may be a processing result obtained by performing block mapping on the operation result of the third convolution matrix.
It can be seen that, depending on the accuracy requirement, the present embodiment may nest the block mapping process multiple times. As for the number of nesting levels, the operation workload introduced by the repeated block mapping needs to be considered when deciding whether to continue the block mapping process.
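As a rough illustration of such nesting (again Python/NumPy; the reshape-based pooling helper and the chosen sizes are assumptions for illustration only), the block mapping below is applied twice, reducing a 12×12 operation result first to 4×4 and then to 2×2.

```python
import numpy as np

def block_map(result, k, reduce_fn=np.mean):
    """Block mapping via reshape: split the matrix into k x k sub-blocks
    and replace each sub-block by its mapping value."""
    r, c = result.shape
    blocks = result.reshape(r // k, k, c // k, k)
    return reduce_fn(blocks, axis=(1, 3))

R = np.random.rand(12, 12)      # operation result of a first convolution matrix
once = block_map(R, k=3)        # first block mapping: 12x12 -> 4x4
twice = block_map(once, k=2)    # nested block mapping:  4x4 -> 2x2
print(once.shape, twice.shape)  # (4, 4) (2, 2)
```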
It is to be understood that, when performing dimension reduction processing on a target image, the block mapping dimension reduction processing may also be performed on the first dimensional matrix, the second dimensional matrix, and the third dimensional matrix of the target image by directly adopting the block mapping manner described in this embodiment.
In one embodiment, after step S13, the method may further include: arranging at least one characterization matrix of the target image according to a preset characterization order to generate a characterization data set of the target image.
In this embodiment, the preset characterization order may be the arrangement order of the characterization data of the target image, and may be set flexibly according to actual needs.
In this embodiment, generating the characterization data set of the target image may include: for each characterization matrix of the target image, selecting the elements whose values are not equal to a third value from the characterization matrix, taking the set formed by the selected elements as a first element set, and obtaining a second representative value of the elements in the first element set; the second representative values corresponding to the characterization matrices of the target image may then be arranged according to the preset characterization order to obtain the characterization data set of the target image. The second representative value may be a sum, a product, an average value, a maximum value, a minimum value, or a median of the elements in the first element set.
In this embodiment, generating the characterization data set of the target image may further include: dividing each characterization matrix of the target image into at least one sub-block characterization matrix; for each sub-block characterization matrix, selecting the elements whose values are not equal to the third value from the sub-block characterization matrix, taking the set formed by the selected elements as a second element set, and obtaining a third representative value of the elements in the second element set; the third representative values corresponding to each characterization matrix of the target image may then be arranged according to the preset characterization order to obtain the characterization data set of the target image. The third representative value may be a sum, a product, an average value, a maximum value, a minimum value, or a median of the elements in the second element set.
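A minimal sketch of this arrangement step follows, assuming Python/NumPy, a third value of 0 and the sum as the second representative value; the function name characterization_data_set and the sample matrices are illustrative assumptions.

```python
import numpy as np

def characterization_data_set(char_matrices, third_value=0.0, represent_fn=np.sum):
    """For each characterization matrix, keep the elements not equal to the
    third value (the first element set), reduce them to a second representative
    value, and arrange the values in the preset characterization order."""
    data_set = []
    for m in char_matrices:                        # preset characterization order
        first_element_set = m[m != third_value]
        value = represent_fn(first_element_set) if first_element_set.size else 0.0
        data_set.append(value)
    return np.array(data_set)

# three illustrative characterization matrices of one target image
mats = [np.array([[0.0, 2.0], [3.0, 0.0]]),
        np.array([[1.0, 0.0], [0.0, 5.0]]),
        np.array([[0.0, 0.0], [4.0, 4.0]])]
print(characterization_data_set(mats))   # [5. 6. 8.]
```

Replacing np.sum with np.prod, np.mean, np.max, np.min, or np.median yields the other representative values listed above; the sub-block variant would apply the same reduction to each sub-block characterization matrix instead of to the whole characterization matrix.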
It should be noted that a common color space usually includes at least 3 color channels, and this embodiment takes such a color space as an example when introducing the technical scheme for obtaining the characterization matrices of the target image. However, this embodiment does not exclude applying its technical solution to a color space having 4 or more color channels. For example, when the predetermined color space of the target image includes at least 4 color channels, an image matrix set corresponding to the target image in the predetermined color space may be obtained, where the image matrix set may include at least a first-dimensional matrix, a second-dimensional matrix, a third-dimensional matrix, and a fourth-dimensional matrix; the image matrix set may be substituted into a preset convolutional neural network dimension reduction model for dimension reduction processing to obtain at least one dimension reduction matrix set, where each dimension reduction matrix set includes a first dimension reduction matrix corresponding to the first-dimensional matrix, a second dimension reduction matrix corresponding to the second-dimensional matrix, a third dimension reduction matrix corresponding to the third-dimensional matrix, and a fourth dimension reduction matrix corresponding to the fourth-dimensional matrix; elements with the same two-dimensional coordinates in the first, second, third, and fourth dimension reduction matrices of each dimension reduction matrix set may be taken as a numerical value set and substituted into a screening modeling model to generate a current screening model corresponding to each dimension reduction matrix set; and each dimension reduction matrix set may be substituted into the current screening model corresponding to that dimension reduction matrix set to obtain at least one characterization matrix of the target image.
According to the technical scheme provided by the embodiments of the present application, dimension reduction is performed on the image matrix set of an image, and data screening is performed on the dimension reduction matrices obtained after dimension reduction, so as to obtain the characterization matrices representing the image. Compared with the prior art, the process of segmenting the target image with an image segmentation algorithm is avoided when the characterization matrices are generated, so the operation workload generated when acquiring the characterization matrices of the target image can be reduced.
The image characterization method in the embodiment of the application can be applied to an image search application scene. Referring to fig. 4, the image search application scenario may include the following steps.
Step S40: a query request containing a target image is obtained.
The query request may be in the form of a string. The character string may contain encoding information of the target image.
The execution subject that handles the query request may be a server. The query request containing the target image may be sent by a client or a user to the server, and the server may then receive the query request containing the target image.
Step S41: acquiring an image matrix set corresponding to a target image in a preset color space, wherein the image matrix set at least comprises a first-dimensional matrix, a second-dimensional matrix and a third-dimensional matrix.
Step S42: and performing dimensionality reduction on the image matrix set to obtain at least one dimensionality reduction matrix set, wherein the dimensionality reduction matrix set comprises a first dimensionality reduction matrix corresponding to the first dimensionality matrix, a second dimensionality reduction matrix corresponding to the second dimensionality matrix, and a third dimensionality reduction matrix corresponding to the third dimensionality matrix.
Step S43: and calculating according to the elements with the same two-dimensional coordinates in the first dimension reduction matrix, the second dimension reduction matrix and the third dimension reduction matrix of the dimension reduction matrix set according to a preset algorithm, and outputting a screening matrix corresponding to the dimension reduction matrix set.
Step S44: based on a screening matrix corresponding to the dimension reduction matrix set, performing data screening on the dimension reduction matrix set to obtain at least one characterization matrix set of the target image; wherein the set of characterization matrices includes a first characterization matrix, a second characterization matrix, and a third characterization matrix.
Step S45: and returning the images matched with the query request according to the similarity values between at least one characterization matrix set of each image in a preset image characterization database and the at least one characterization matrix set of the target image.
The preset image characterization database may contain images and at least one characterization matrix set of each image. A set formed by at least one characterization matrix of the target image may be used as the characterization matrix set of the target image. For each image in the preset image characterization database, a set formed by at least one characterization matrix of the image may be used as the characterization matrix set of that image. Then, the similarity value between the characterization matrix set of the target image and the characterization matrix set of each image in the preset image characterization database may be calculated, and the images whose characterization matrix sets have a similarity value greater than a preset matching threshold may be returned. The preset matching threshold can be flexibly determined according to actual needs, and may be, for example, 10, 50, or 100.
The similarity value between the characterization matrix set of the target image and the characterization matrix set of each image in the preset image characterization database may be calculated using a distance measure such as the Euclidean distance or the Manhattan distance.
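As a rough sketch of the matching in step S45 (assuming Python/NumPy, that each characterization matrix set has already been flattened into a characterization data vector, and that candidates are simply ranked by distance; the database contents and the ranking rule are illustrative assumptions rather than the embodiment's thresholding):

```python
import numpy as np

def euclidean(a, b):
    return float(np.sqrt(np.sum((a - b) ** 2)))

def manhattan(a, b):
    return float(np.sum(np.abs(a - b)))

def search(query_vec, database, distance_fn=euclidean, top_k=3):
    """Rank the database images by the distance between their characterization
    data and the characterization data of the target image."""
    scored = [(name, distance_fn(query_vec, vec)) for name, vec in database.items()]
    return sorted(scored, key=lambda item: item[1])[:top_k]

# illustrative characterization data vectors of the database images
database = {
    "img_001": np.array([5.0, 6.0, 8.0]),
    "img_002": np.array([1.0, 0.5, 2.0]),
    "img_003": np.array([4.8, 6.2, 7.9]),
}
query = np.array([5.0, 6.1, 8.0])          # characterization data of the target image
print(search(query, database, top_k=2))    # img_001 and img_003 are the closest
```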
In this image search application scenario, a query request containing a target image can be acquired; at least one characterization matrix set of the target image can be acquired; and the images matched with the query request can be returned according to the similarity values between at least one characterization matrix set of each image in a preset image characterization database and the at least one characterization matrix set of the target image.
Please refer to fig. 5. The embodiment of the application provides a server. The server may include the following elements.
An image matrix set obtaining unit 50, configured to obtain an image matrix set corresponding to a target image in a predetermined color space, where the image matrix set at least includes a first-dimensional matrix, a second-dimensional matrix, and a third-dimensional matrix;
an image matrix set dimension reduction processing unit 51, configured to perform dimension reduction processing on the image matrix set to obtain at least one dimension reduction matrix set, where the dimension reduction matrix set includes a first dimension reduction matrix corresponding to the first dimension matrix, a second dimension reduction matrix corresponding to the second dimension matrix, and a third dimension reduction matrix corresponding to the third dimension matrix;
a screening matrix output unit 52, configured to perform operation according to a preset algorithm on elements having the same two-dimensional coordinate in the first dimension reduction matrix, the second dimension reduction matrix, and the third dimension reduction matrix of the dimension reduction matrix set, and output a screening matrix corresponding to the dimension reduction matrix set;
a dimension reduction matrix set data screening unit 53, configured to perform data screening on the dimension reduction matrix set based on a screening matrix corresponding to the dimension reduction matrix set, so as to obtain at least one characterization matrix set of the target image; wherein the set of characterization matrices includes a first characterization matrix, a second characterization matrix, and a third characterization matrix.
In this embodiment, the image matrix set dimension reduction processing unit 51 may include an image matrix set convolution operation subunit. The image matrix set convolution operation subunit may be configured to perform convolution operation on the image matrix set and at least one first layer convolution matrix set; wherein each first layer convolution matrix set is constructed by adopting historical data.
In this embodiment, the screening matrix output unit 52 may include a dimension reduction matrix set operation subunit and an operation result comparison subunit. The dimension reduction matrix set operation subunit is configured to take elements with the same two-dimensional coordinates in the first dimension reduction matrix, the second dimension reduction matrix and the third dimension reduction matrix of each dimension reduction matrix set as a numerical value set, and to perform an operation, according to a preset algorithm, on the numerical value set corresponding to the dimension reduction matrix set; the operation result comparison subunit is configured to compare the operation result with a preset screening threshold value, so as to determine the values of the elements in the screening matrix.
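The work of these two subunits can be sketched as follows, assuming Python/NumPy, the per-coordinate average as the preset algorithm, 1 and 0 as the first and second values, and element-wise multiplication as the data screening; all of these concrete choices are assumptions, since the embodiment leaves the preset algorithm, the threshold, and the screening rule configurable.

```python
import numpy as np

def screening_matrix(r1, r2, r3, threshold):
    """Combine the elements with the same two-dimensional coordinates of the
    three dimension reduction matrices (here: their average), compare the
    result with the screening threshold, and set the first value (1) where
    the threshold is exceeded and the second value (0) otherwise."""
    result = np.mean(np.stack([r1, r2, r3]), axis=0)  # preset algorithm per coordinate
    return np.where(result > threshold, 1.0, 0.0)

def data_screening(r1, r2, r3, threshold):
    """Data screening: keep the elements at coordinates where the screening
    matrix holds the first value (illustrated here as element-wise masking)."""
    s = screening_matrix(r1, r2, r3, threshold)
    return r1 * s, r2 * s, r3 * s   # first, second, third characterization matrices

# illustrative 2x2 dimension reduction matrices of one dimension reduction matrix set
r1 = np.array([[0.9, 0.1], [0.4, 0.8]])
r2 = np.array([[0.8, 0.2], [0.3, 0.7]])
r3 = np.array([[0.7, 0.1], [0.2, 0.9]])
print(data_screening(r1, r2, r3, threshold=0.5))
```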
Please refer to fig. 6. The embodiment of the application also provides another server. The server may include the following structure.
A memory 60 for storing program instructions.
In this embodiment, the memory may be a Random Access Memory (RAM), a Read-Only Memory (ROM), a cache, a Hard Disk Drive (HDD), or a memory card.
A processor 61; the functions that can be realized by the processor 61 executing the program instructions stored in the memory 60 include: acquiring an image matrix set corresponding to a target image in a predetermined color space, wherein the image matrix set at least comprises a first-dimensional matrix, a second-dimensional matrix and a third-dimensional matrix; performing dimension reduction processing on the image matrix set to obtain at least one dimension reduction matrix set, wherein each dimension reduction matrix set comprises a first dimension reduction matrix corresponding to the first-dimensional matrix, a second dimension reduction matrix corresponding to the second-dimensional matrix, and a third dimension reduction matrix corresponding to the third-dimensional matrix; performing an operation, according to a preset algorithm in the memory, on elements with the same two-dimensional coordinates in the first dimension reduction matrix, the second dimension reduction matrix and the third dimension reduction matrix of the dimension reduction matrix set, and outputting a screening matrix corresponding to the dimension reduction matrix set; and performing data screening on the dimension reduction matrix set based on the screening matrix corresponding to the dimension reduction matrix set to obtain at least one characterization matrix set of the target image; wherein the characterization matrix set includes a first characterization matrix, a second characterization matrix, and a third characterization matrix.
In the 1990s, an improvement to a technology could clearly be distinguished as an improvement in hardware (for example, an improvement to a circuit structure such as a diode, a transistor, or a switch) or an improvement in software (an improvement to a method flow). However, as technology advances, many of today's improvements to method flows can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming an improved method flow into a hardware circuit. Thus, it cannot be said that an improvement to a method flow cannot be realized by a hardware entity module. For example, a Programmable Logic Device (PLD), such as a Field Programmable Gate Array (FPGA), is an integrated circuit whose logic functions are determined by the user's programming of the device. A designer "integrates" a digital system on a PLD by programming it, without asking a chip manufacturer to design and fabricate a dedicated integrated circuit chip. Moreover, instead of manually making integrated circuit chips, such programming is nowadays mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development and writing, while the original code to be compiled must be written in a specific programming language called a Hardware Description Language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are the most commonly used at present. It will also be apparent to those skilled in the art that a hardware circuit implementing the logical method flow can easily be obtained merely by slightly programming the method flow into an integrated circuit using the above hardware description languages.
Those skilled in the art will also appreciate that, in addition to implementing the controller as pure computer readable program code, the same functionality can be implemented by logically programming method steps such that the controller is in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Such a controller may thus be considered a hardware component, and the means included therein for performing the various functions may also be considered as a structure within the hardware component. Or even means for performing the functions may be regarded as being both a software module for performing the method and a structure within a hardware component.
The means or elements described in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions.
For convenience of description, the above devices are described as being divided into various units by function, and are described separately. Of course, the functionality of the units may be implemented in one or more software and/or hardware when implementing the present application.
From the above description of the embodiments, it is clear to those skilled in the art that the present application can be implemented by software plus necessary general hardware platform. Based on such understanding, the technical solutions of the present application may be essentially or partially implemented in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the embodiments or some parts of the embodiments of the present application.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments can be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, embodiments of the server may be explained with reference to the introduction of embodiments of the method.
The application is operational with numerous general purpose or special purpose computing system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet-type devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
Although the present application has been described by way of embodiments, those of ordinary skill in the art will recognize that there are numerous variations and permutations of the present application that do not depart from the spirit of the application, and it is intended that the appended claims encompass such variations and permutations.

Claims (14)

1. A server, comprising: a memory and a processor;
the memory to store program instructions;
the functions implemented by the processor by executing the program instructions include: acquiring an image matrix set corresponding to a target image in a preset color space, wherein the image matrix set at least comprises a first-dimensional matrix, a second-dimensional matrix and a third-dimensional matrix; performing dimensionality reduction on the image matrix set to obtain at least one dimensionality reduction matrix set, wherein each dimensionality reduction matrix set comprises a first dimensionality reduction matrix corresponding to the first dimensionality matrix, a second dimensionality reduction matrix corresponding to the second dimensionality matrix, and a third dimensionality reduction matrix corresponding to the third dimensionality matrix; calculating elements with the same two-dimensional coordinates in a first dimension reduction matrix, a second dimension reduction matrix and a third dimension reduction matrix of the dimension reduction matrix set according to a preset algorithm in the memory, and outputting a screening matrix corresponding to the dimension reduction matrix set; wherein, the outputting the screening matrix corresponding to the dimensionality reduction matrix set comprises: for each dimension reduction matrix set, taking elements with the same two-dimensional coordinates in a first dimension reduction matrix, a second dimension reduction matrix and a third dimension reduction matrix of the dimension reduction matrix set as a numerical value set, and performing an operation, according to a preset algorithm, on the numerical value set corresponding to the dimension reduction matrix set; comparing the operation result with a preset screening threshold value to determine the value of the element in the screening matrix; based on a screening matrix corresponding to the dimension reduction matrix set, performing data screening on the dimension reduction matrix set to obtain at least one characterization matrix set of the target image; wherein the set of characterization matrices includes a first characterization matrix, a second characterization matrix, and a third characterization matrix.
2. An image characterization method, comprising:
acquiring an image matrix set corresponding to a target image in a preset color space, wherein the image matrix set at least comprises a first-dimensional matrix, a second-dimensional matrix and a third-dimensional matrix;
performing dimensionality reduction on the image matrix set to obtain at least one dimensionality reduction matrix set, wherein the dimensionality reduction matrix set comprises a first dimensionality reduction matrix corresponding to the first dimensionality matrix, a second dimensionality reduction matrix corresponding to the second dimensionality matrix, and a third dimensionality reduction matrix corresponding to the third dimensionality matrix;
calculating according to a preset algorithm by using elements with the same two-dimensional coordinate in a first dimension reduction matrix, a second dimension reduction matrix and a third dimension reduction matrix of the dimension reduction matrix set, and outputting a screening matrix corresponding to the dimension reduction matrix set; wherein, the outputting the screening matrix corresponding to the dimensionality reduction matrix set comprises: for each dimension reduction matrix set, taking elements with the same two-dimensional coordinates in a first dimension reduction matrix, a second dimension reduction matrix and a third dimension reduction matrix of the dimension reduction matrix set as a numerical value set, and performing an operation, according to a preset algorithm, on the numerical value set corresponding to the dimension reduction matrix set; comparing the operation result with a preset screening threshold value to determine the value of the element in the screening matrix;
based on a screening matrix corresponding to the dimension reduction matrix set, performing data screening on the dimension reduction matrix set to obtain at least one characterization matrix set of the target image; wherein the set of characterization matrices includes a first characterization matrix, a second characterization matrix, and a third characterization matrix.
3. The method of claim 2, wherein the first dimensional matrix, the second dimensional matrix, and the third dimensional matrix each correspond to a color channel of the predetermined color space.
4. The method as claimed in claim 2, wherein in the step of performing the dimension reduction processing on the image matrix set, the method comprises: performing convolution operation on the image matrix set and at least one first layer convolution matrix set; wherein each first layer convolution matrix set is constructed by adopting historical data.
5. The method of claim 4, wherein each of the first set of layer convolution matrices includes a first convolution matrix, a second convolution matrix, and a third convolution matrix; the step of performing the convolution operation includes:
performing convolution operation on the first convolution matrix of each first layer convolution matrix set and the first dimension matrix respectively, and taking an obtained operation result as a first dimension reduction matrix in the corresponding dimension reduction matrix set;
performing convolution operation on the second convolution matrix of each first layer convolution matrix set and the second dimensional matrix respectively, and taking the obtained operation result as a second dimension reduction matrix in the corresponding dimension reduction matrix set;
and performing convolution operation on the third convolution matrix of each first-layer convolution matrix set and the third-dimensional matrix respectively, and taking an obtained operation result as a third dimension reduction matrix in the corresponding dimension reduction matrix set.
6. The method of claim 5, wherein in the step of performing dimension reduction processing on the image matrix set, further comprising: and performing convolution operation on the result of the convolution operation on the image matrix set and at least one first-layer convolution matrix set and a second-layer convolution matrix set.
7. The method of claim 6, wherein each second layer convolution matrix set includes a fourth convolution matrix, a fifth convolution matrix, and a sixth convolution matrix; the step of performing convolution operation with the second layer convolution matrix set comprises the following steps:
performing convolution operation on the operation result of the first convolution matrix in each first layer of convolution matrix set and a fourth convolution matrix corresponding to the second layer of convolution matrix set;
performing convolution operation on the operation result of the second convolution matrix in each first layer of convolution matrix set and a fifth convolution matrix corresponding to the second layer of convolution matrix set;
and performing convolution operation on the operation result of the third convolution matrix in each first layer of convolution matrix set and the sixth convolution matrix corresponding to the second layer of convolution matrix set.
8. The method of claim 4, wherein the step of performing dimension reduction processing further comprises:
respectively carrying out block mapping processing on the operation result of the first convolution matrix in each first layer of convolution matrix set;
respectively carrying out block mapping processing on the operation result of the second convolution matrix in each first layer convolution matrix set;
and respectively carrying out block mapping processing on the operation result of the third convolution matrix in each first layer convolution matrix set.
9. The method according to claim 8, wherein for each first layer convolution matrix set, the step of performing block mapping processing on the operation result of the first convolution matrix in the first layer convolution matrix set to obtain a processing result comprises:
dividing the operation result of the first convolution matrix in the first layer of convolution matrix set into at least one subblock mapping matrix;
respectively obtaining the mapping value of each subblock mapping matrix of the first convolution matrix in the first layer of convolution matrix set;
generating a processing result of the first convolution matrix in the first layer convolution matrix set based on the mapping value of each sub-block mapping matrix of the first convolution matrix in the first layer convolution matrix set;
correspondingly, for each first layer convolution matrix set, the step of performing block mapping processing on the operation result of the second convolution matrix in the first layer convolution matrix set to obtain a processing result includes:
dividing the operation result of the second convolution matrix in the first layer convolution matrix set into at least one sub-block mapping matrix;
respectively obtaining the mapping value of each sub-block mapping matrix of the second convolution matrix in the first layer of convolution matrix set;
generating a processing result of the second convolution matrix in the first layer convolution matrix set based on the mapping value of each sub-block mapping matrix of the second convolution matrix in the first layer convolution matrix set;
correspondingly, for each first layer convolution matrix set, the step of performing block mapping processing on the operation result of the third convolution matrix in the first layer convolution matrix set to obtain a processing result includes:
dividing the operation result of the third convolution matrix in the first layer convolution matrix set into at least one sub-block mapping matrix;
respectively obtaining the mapping value of each sub-block mapping matrix of a third convolution matrix in the first layer of convolution matrix set;
and generating a processing result of the third convolution matrix in the first layer convolution matrix set based on the mapping value of each sub-block mapping matrix of the third convolution matrix in the first layer convolution matrix set.
10. The method of claim 9, wherein the mapping value is an average value, a maximum value, a minimum value, or a median of each element in the subblock mapping matrix.
11. The method of claim 2, wherein the step of determining values of elements in the screening matrix comprises:
when the operation result is larger than the preset screening threshold value, the value of the element of the two-dimensional coordinate corresponding to the screening matrix is a first value;
and when the operation result is smaller than the preset screening threshold value, the value of the element of the two-dimensional coordinate corresponding to the screening matrix is a second value, wherein the first value is larger than the second value.
12. The method of claim 2, wherein the method further comprises:
and arranging at least one characterization matrix of the target image according to a preset characterization order to generate a characterization data set of the target image.
13. A server, comprising:
the image matrix set acquisition unit is used for acquiring an image matrix set corresponding to a target image in a preset color space, and the image matrix set at least comprises a first dimension matrix, a second dimension matrix and a third dimension matrix;
the image matrix set dimension reduction processing unit is used for carrying out dimension reduction processing on the image matrix set to obtain at least one dimension reduction matrix set, wherein the dimension reduction matrix set comprises a first dimension reduction matrix corresponding to the first dimension matrix, a second dimension reduction matrix corresponding to the second dimension matrix and a third dimension reduction matrix corresponding to the third dimension matrix;
the screening matrix output unit is used for calculating according to the elements with the same two-dimensional coordinate in the first dimension reduction matrix, the second dimension reduction matrix and the third dimension reduction matrix of the dimension reduction matrix set according to a preset algorithm and outputting a screening matrix corresponding to the dimension reduction matrix set;
the dimension reduction matrix set data screening unit is used for carrying out data screening on the dimension reduction matrix set based on a screening matrix corresponding to the dimension reduction matrix set to obtain at least one representation matrix set of the target image; wherein the set of characterization matrices includes a first characterization matrix, a second characterization matrix, and a third characterization matrix;
wherein, the screening matrix output unit includes:
the dimensionality reduction matrix set arithmetic subunit is used for taking elements with the same two-dimensional coordinates in a first dimensionality reduction matrix, a second dimensionality reduction matrix and a third dimensionality reduction matrix of each dimensionality reduction matrix set as a numerical value set, and for performing an operation, according to a preset algorithm, on the numerical value set corresponding to the dimensionality reduction matrix set;
and the operation result comparison subunit is used for comparing the operation result with a preset screening threshold value so as to determine the value of the element in the screening matrix.
14. The server according to claim 13, wherein the image matrix set dimension reduction processing unit includes:
the image matrix set convolution operation subunit is used for performing convolution operation on the image matrix set and at least one first layer convolution matrix set; wherein each first layer convolution matrix set is constructed by adopting historical data.
CN201610694360.5A 2016-08-19 2016-08-19 Image characterization method and server Active CN107766863B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610694360.5A CN107766863B (en) 2016-08-19 2016-08-19 Image characterization method and server

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610694360.5A CN107766863B (en) 2016-08-19 2016-08-19 Image characterization method and server

Publications (2)

Publication Number Publication Date
CN107766863A CN107766863A (en) 2018-03-06
CN107766863B true CN107766863B (en) 2022-03-04

Family

ID=61263227

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610694360.5A Active CN107766863B (en) 2016-08-19 2016-08-19 Image characterization method and server

Country Status (1)

Country Link
CN (1) CN107766863B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112306922B (en) * 2020-11-12 2023-09-22 山东云海国创云计算装备产业创新中心有限公司 Multi-data-to-multi-port arbitration method and related device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1912950A (en) * 2006-08-25 2007-02-14 浙江工业大学 Device for monitoring vehicle breaking regulation based on all-position visual sensor
CN103839042A (en) * 2012-11-27 2014-06-04 腾讯科技(深圳)有限公司 Human face recognition method and human face recognition system
CN104463199A (en) * 2014-11-28 2015-03-25 福州大学 Rock fragment size classification method based on multiple features and segmentation recorrection
CN105654028A (en) * 2015-09-29 2016-06-08 厦门中控生物识别信息技术有限公司 True and false face identification method and apparatus thereof

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100246969A1 (en) * 2009-03-25 2010-09-30 Microsoft Corporation Computationally efficient local image descriptors
CN104657980A (en) * 2014-12-24 2015-05-27 江南大学 Improved multi-channel image partitioning algorithm based on Meanshift

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1912950A (en) * 2006-08-25 2007-02-14 浙江工业大学 Device for monitoring vehicle breaking regulation based on all-position visual sensor
CN103839042A (en) * 2012-11-27 2014-06-04 腾讯科技(深圳)有限公司 Human face recognition method and human face recognition system
CN104463199A (en) * 2014-11-28 2015-03-25 福州大学 Rock fragment size classification method based on multiple features and segmentation recorrection
CN105654028A (en) * 2015-09-29 2016-06-08 厦门中控生物识别信息技术有限公司 True and false face identification method and apparatus thereof

Also Published As

Publication number Publication date
CN107766863A (en) 2018-03-06

Similar Documents

Publication Publication Date Title
Chen et al. Learning spatial attention for face super-resolution
Chen et al. DISC: Deep image saliency computing via progressive representation learning
He et al. Enhanced boundary learning for glass-like object segmentation
Xiao et al. Deep salient object detection with dense connections and distraction diagnosis
EP4156017A1 (en) Action recognition method and apparatus, and device and storage medium
CN112232346B (en) Semantic segmentation model training method and device, and image semantic segmentation method and device
CN113704531A (en) Image processing method, image processing device, electronic equipment and computer readable storage medium
WO2021137946A1 (en) Forgery detection of face image
US11714921B2 (en) Image processing method with ash code on local feature vectors, image processing device and storage medium
CN111914654B (en) Text layout analysis method, device, equipment and medium
CN111292377B (en) Target detection method, device, computer equipment and storage medium
Lin et al. Frequency-aware camouflaged object detection
CN112256899B (en) Image reordering method, related device and computer readable storage medium
CN111709415B (en) Target detection method, device, computer equipment and storage medium
CN112560845A (en) Character recognition method and device, intelligent meal taking cabinet, electronic equipment and storage medium
CN114758145A (en) Image desensitization method and device, electronic equipment and storage medium
CN107766863B (en) Image characterization method and server
CN116977336A (en) Camera defect detection method, device, computer equipment and storage medium
CN117058554A (en) Power equipment target detection method, model training method and device
CN116976372A (en) Picture identification method, device, equipment and medium based on square reference code
CN115810152A (en) Remote sensing image change detection method and device based on graph convolution and computer equipment
US11995890B2 (en) Method and apparatus for tensor processing
CN114820666A (en) Method and device for increasing matting accuracy, computer equipment and storage medium
CN114332522A (en) Image identification method and device and construction method of residual error network model
CN112084874A (en) Object detection method and device and terminal equipment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant