CN111401122B - Knowledge classification-based complex target asymptotic identification method and device - Google Patents


Info

Publication number
CN111401122B
Authority
CN
China
Prior art keywords
resolution
asymptotic
low
bilinear
data set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911377824.XA
Other languages
Chinese (zh)
Other versions
CN111401122A (en)
Inventor
胡君
贺东华
方标新
韦章兵
贾小月
殷贺琦
刘丹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Aisino Corp
Original Assignee
Aisino Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Aisino Corp filed Critical Aisino Corp
Priority to CN201911377824.XA priority Critical patent/CN111401122B/en
Publication of CN111401122A publication Critical patent/CN111401122A/en
Application granted granted Critical
Publication of CN111401122B publication Critical patent/CN111401122B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/40 - Scenes; Scene-specific elements in video content
    • G06V20/41 - Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/25 - Fusion techniques
    • G06F18/253 - Fusion techniques of extracted features
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G06N3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a knowledge-classification-based complex target asymptotic identification method and device. The method comprises the following steps: image preprocessing, namely dividing an original image dataset I into datasets at multiple resolution levels, which serve as the reference datasets for asymptotic identification of a complex target; inputting the images in batches into a VGG-16 network pre-trained on the ImageNet dataset for feature extraction; performing bilinear feature fusion and trilinear feature fusion calculations on the features extracted at the various resolutions; and predicting the category using the fused features. The method combines the characteristics of trilinear pooling and bilinear pooling and plans the coarse-grained and fine-grained tasks of complex targets in a unified framework, addressing the feature references provided by coarse-grained tasks that are neglected in real-world fine-grained recognition.

Description

Knowledge classification-based complex target asymptotic identification method and device
Technical Field
The invention belongs to the field of image recognition, relates to fine-grained image recognition and retrieval, and particularly relates to a knowledge classification-based complex target asymptotic recognition method and device.
Background
In recent years, fine-grained image recognition and retrieval has become a research hotspot in the fields of visual computing and information retrieval. Although image recognition technology has been greatly developed in recent years, there are still many technical difficulties in fine-grained image recognition and retrieval.
The fine-grained image classification problem is to identify sub-classes within a broad class. What distinguishes fine-grained image analysis from general image tasks, and what makes it harder, is that the granularity of the classes to which the images belong is finer. The difficulty and challenge of fine-grained image tasks are certainly greater, both for computers and for the average person.
Although the prior art easily distinguishes objects with obvious appearance differences, such as cats and dogs, it still struggles with objects whose appearance differs only subtly, such as a Boeing 737 and a Boeing 747 airliner; the recognition of objects in such sub-classes is easily affected by their pose, viewing direction, and relative position.
However, with the development of artificial intelligence, more and more application scenarios require finer feature distinctions among objects of the same category, for example the identification of brands by merchants or of plants by botanists. Fine-grained image classification has broad research demand and application scenarios in both industry and academia. The related research subjects mainly include identifying different species of birds, dogs, flowers, cars, airplanes, and so on. In real life there is a great need to identify different sub-categories; in ecological protection, for example, efficiently identifying different species of organisms is an important prerequisite for ecological research.
Unlike the general image classification task, which distinguishes basic categories, fine-grained recognition is very challenging. In real-life scenarios, however, fine-grained tasks often appear together with coarse-grained tasks as the distance between the observer and the observed object shortens. In previous work, the combination of fine-grained and coarse-grained tasks was usually ignored: researchers focused on the fine-grained level and overlooked the instructive feature references provided by the accompanying coarse-grained tasks.
Therefore, it is necessary to propose a method for planning coarse-granularity tasks and fine-granularity tasks of complex targets in a unified framework, and further for fine-granularity image recognition.
Disclosure of Invention
The invention addresses the feature references provided by coarse-grained tasks, which are neglected in real-world fine-grained recognition.
According to one aspect of the present invention, there is provided a knowledge classification-based complex object asymptotic recognition method, the method comprising:
image preprocessing, namely dividing an original image dataset I into datasets at multiple resolution levels, which serve as the reference datasets for asymptotic identification of a complex target;
inputting the images in batches into a VGG-16 network pre-trained on the ImageNet dataset for feature extraction;
performing bilinear feature fusion and trilinear feature fusion calculations on the features extracted at the various resolutions;
and predicting the category using the fused features.
Further, the original image dataset I is divided into three image datasets with resolution from high to low: I_high, I_medium, I_low.
Further, the resolution r of the original image dataset is defined as the high resolution r_high, and the corresponding image dataset is determined as I_high.
The resolution of the original image dataset is gradually reduced to obtain image datasets at two further resolutions:
when the accuracy falls below the threshold t_med, the dataset at resolution r_med is determined as I_medium;
when the accuracy falls below the threshold t_low, the dataset at resolution r_low is determined as I_low.
Further, the three resolution levels are mapped one-to-one onto the biological taxonomy:
I_high corresponds to species, I_medium to genus, and I_low to family.
Further, the images are classified with an SVM classification algorithm, starting from the high resolution r_high, with classification through the taxonomic levels governed by the accuracy thresholds t_med and t_low.
Further, inputting the images in batches into the VGG-16 network pre-trained on the ImageNet dataset for feature extraction includes: extracting the relu5_1, relu5_2 and relu5_3 features of the three resolution image sets.
Further, the combination of bilinear features f_A(I) ∈ R^{hw×c} and f_B(I) ∈ R^{hw×c} equals f_A(I)^T f_B(I) ∈ R^{c×c}, where c is the number of feature maps and h and w represent the height and width of the feature maps;
the cross-layer factorized bilinear pooling is expressed as:
f = P^T (U^T X ∘ V^T Y)
where X represents one layer and Y another, U ∈ R^{hw×d} and V ∈ R^{hw×d} are projection matrices, P ∈ R^{d×c} is the classifier matrix, ∘ is the Hadamard product, d represents the dimension of the joint embedding, and f is the output of the bilinear model.
Further, the trilinear pooling method is expressed as:
f = P^T (U^T X ∘ V^T Y ∘ W^T Z)
where W ∈ R^{hw×d} represents a projection matrix and f combines three separate layers, X representing one layer and Y, Z the other two.
Further, the trilinear feature and the three bilinear features are fused, and a SoftMax vector is computed to obtain the prediction;
the three loss functions add up to the total loss:
l_full = l_high + l_medium + l_low,
where the loss at each resolution is defined as
l_high = loss(I_high), l_medium = loss(I_medium) and l_low = loss(I_low).
According to another aspect of the present invention, there is provided a complex object asymptotic recognition apparatus based on knowledge classification, the apparatus including: a memory storing computer executable instructions;
a processor executing computer executable instructions in the memory, the processor performing the steps of:
image preprocessing, namely dividing an original image dataset I into datasets at multiple resolution levels, which serve as the reference datasets for asymptotic identification of a complex target;
inputting the images in batches into a VGG-16 network pre-trained on the ImageNet dataset for feature extraction;
performing bilinear feature fusion and trilinear feature fusion calculations on the features extracted at the various resolutions;
and predicting the category using the fused features.
The invention provides a trilinear pooling method that integrates the characteristics of trilinear and bilinear pooling. It takes the feature interactions between layers into account while avoiding additional training parameters, captures inter-layer feature relationships better, and retains the efficiency and power of the cross-layer bilinear method.
Drawings
The foregoing and other objects, features and advantages of the disclosure will be apparent from the following more particular descriptions of exemplary embodiments of the disclosure as illustrated in the accompanying drawings wherein like reference numbers generally represent like parts throughout exemplary embodiments of the disclosure.
FIG. 1 is a flow chart of a knowledge classification based complex target asymptotic identification method of the present invention.
Fig. 2 is a schematic diagram illustrating an application of a complex object asymptotic recognition method according to an embodiment of the present invention.
FIG. 3 shows a sample of the correct predictions of the present invention on CUB200-2011.
FIG. 4 compares the recognition accuracy of the present invention on the CUB200-2011, Stanford Cars and FGVC-Aircraft datasets.
Detailed Description
Preferred embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While the preferred embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
The present invention aims to solve the problem of asymptotically identifying complex objects in real life, that is, identifying the class of an object at multiple resolutions (from low to high). To this end, the invention provides a complex target asymptotic identification method based on knowledge classification. The method combines the characteristics of trilinear pooling and bilinear pooling and plans the coarse-grained and fine-grained tasks of complex targets in a unified framework, addressing the feature references provided by coarse-grained tasks that are neglected in real-world fine-grained recognition.
FIG. 1 is a flow chart of a knowledge classification based complex target asymptotic identification method of the present invention. As shown in fig. 1, the present invention proposes a knowledge classification-based complex target asymptotic recognition method, which includes:
image preprocessing, namely dividing an original image dataset I into datasets at multiple resolution levels, which serve as the reference datasets for asymptotic identification of a complex target;
inputting the images in batches into a VGG-16 network pre-trained on the ImageNet dataset for feature extraction;
performing bilinear feature fusion and trilinear feature fusion calculations on the features extracted at the various resolutions;
and predicting the category using the fused features.
First, image preprocessing is performed.
The original image dataset I is divided into three resolution levels (high to low). The three newly generated image datasets I_high, I_medium, I_low serve as the reference datasets for asymptotic identification of the complex target. Specifically, the three resolutions are defined as follows:
First, we define the resolution r of the original images as the high resolution r_high, and classify these images at the finest taxonomic level with an SVM classification algorithm. We then gradually decrease the resolution of the original image dataset to obtain the other two resolutions.
As the resolution decreases, the species-level classification accuracy necessarily drops. When the accuracy falls below the threshold t_med, i.e. the classifier is no longer as accurate as the high-resolution classifier, we take the resolution at that moment as r_med and determine the corresponding image dataset as I_medium. The classification target is then changed to the genus level, and the same process is repeated, finally yielding r_low and I_low. The three resolutions and their corresponding datasets are thus determined by two parameters: the accuracy thresholds t_med and t_low.
In an embodiment of the invention, the actual settings are t_med = 0.8 and t_low = 0.8. Further, we map the three resolution levels one-to-one onto the biological taxonomy. For example, 200 species in total can be merged into 113 genera and 36 families. The original classification task is then re-planned as follows: I_high corresponds to the 200 species, while I_medium and I_low are used to classify the 113 genera and 36 families, respectively. Note that the three classifiers can be defined with the same CNN model, and all of their loss functions are added together.
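The two-threshold resolution search described above can be sketched in plain Python. Here `accuracy_at` is a hypothetical callback standing in for training and evaluating the SVM classifier at a given resolution and taxonomic level; the function and its names are illustrative, not taken from the patent.

```python
def choose_resolutions(resolutions, accuracy_at, t_med=0.8, t_low=0.8):
    """Scan resolutions from high to low.  The first resolution whose
    species-level accuracy drops below t_med becomes r_med; continuing
    from there, the first whose genus-level accuracy drops below t_low
    becomes r_low.  A sketch of the two-threshold procedure only."""
    r_med = r_low = None
    for r in resolutions:              # assumed sorted high -> low
        if r_med is None:
            if accuracy_at(r, "species") < t_med:
                r_med = r              # species accuracy degraded: fix I_medium
        elif r_low is None:
            if accuracy_at(r, "genus") < t_low:
                r_low = r              # genus accuracy degraded: fix I_low
    return r_med, r_low
```

Any monotone-decreasing accuracy estimate can drive the search; in the patent the estimate comes from re-evaluating the SVM classifier on down-sampled copies of the dataset.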
Next, the images are batch input to a VGG-16 network pre-trained on the ImageNet dataset to extract features.
The model input image size is 488×488. The projection-layer and normalized-exponential (softmax) layer parameters are initialized randomly. First, the parameters of the other layers are kept fixed and only the softmax layer is trained. The whole network is then fine-tuned by stochastic gradient descent with a step size of 8, momentum 0.9, weight decay 5×10^-4, learning rate 1×10^-3 and a periodic annealing factor of 0.5. Empirically, the dimension of the projection layer is set to 8192.
Notably, training over the three levels is cyclic: the 200-dimensional softmax layer is first tuned with I_high, the 113-dimensional softmax layer is then trained with I_medium, and finally the 36-dimensional classifier is trained with I_low, after which training returns to the highest dimension.
The present invention uses standard data augmentation methods. For example, the original image is first resized to 512×S, S being the larger edge, then randomly sampled and horizontally flipped during training (only center cropping is used at test time). The whole model is trained end to end.
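The fine-tuning described above is plain SGD with momentum and weight decay. A minimal sketch of a single parameter update under the stated hyperparameters; this is a generic illustration of the optimizer, not code from the patent.

```python
def sgd_momentum_step(w, grad, vel, lr=1e-3, momentum=0.9, weight_decay=5e-4):
    """One SGD update with momentum and L2 weight decay.
    w, grad, vel are flat lists of parameters, gradients and velocities."""
    # the weight-decay term is folded into the effective gradient
    vel = [momentum * v + g + weight_decay * p
           for v, g, p in zip(vel, grad, w)]
    w = [p - lr * v for p, v in zip(w, vel)]
    return w, vel
```

The periodic annealing mentioned in the description would simply multiply `lr` by 0.5 at each annealing step.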
Bilinear feature fusion and trilinear feature fusion are then computed on the relu5_1, relu5_2 and relu5_3 features extracted at the three resolutions.
Taking image I as input, two feature functions f_A and f_B (typically the last convolutional layer of a neural network) extract two features from the image. A bilinear vector is obtained at each position using the matrix outer product: the combination of bilinear features f_A(I) ∈ R^{hw×c} and f_B(I) ∈ R^{hw×c} equals f_A(I)^T f_B(I) ∈ R^{c×c}, where c is the number of feature maps and h and w represent their height and width. Note that h×w must be fixed, while c may be chosen differently for the two feature functions.
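For small inputs, the sum-pooled outer product f_A(I)^T f_B(I) can be written out directly. A pure-Python sketch for illustration; real implementations compute this on hw×c tensors with a single matrix multiplication.

```python
def bilinear_outer(fa, fb):
    """Combine features fa, fb (each an hw x c list of rows) into the
    c x c bilinear matrix fa^T fb by summing the per-position outer
    products over all hw spatial locations."""
    c = len(fa[0])
    out = [[0.0] * c for _ in range(c)]
    for row_a, row_b in zip(fa, fb):   # one (row_a, row_b) per position
        for i, a in enumerate(row_a):
            for j, b in enumerate(row_b):
                out[i][j] += a * b
    return out
```
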
In this invention, the cross-layer factorized bilinear pooling is expressed as:
f = P^T (U^T X ∘ V^T Y)
where X and Y are two different layers, U ∈ R^{hw×d} and V ∈ R^{hw×d} are projection matrices, P ∈ R^{d×c} is the classifier matrix, ∘ is the Hadamard product, d represents the dimension of the joint embedding, and f is the output of the bilinear model.
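The factorized form replaces the full c×c outer product with projections into a joint d-dimensional embedding followed by an elementwise Hadamard product. A minimal sketch with plain Python lists; matrices are stored row-wise, so the transposes in the formula fold into the storage convention, and the tiny dimensions and helper names are illustrative only.

```python
def project(M, x):
    """Matrix-vector product: M is a list of rows, x a plain vector."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def bilinear_pool(x, y, U, V, P):
    """Cross-layer factorized bilinear pooling f = P(Ux o Vy), where o is
    the elementwise (Hadamard) product of the two projected features."""
    joint = [a * b for a, b in zip(project(U, x), project(V, y))]
    return project(P, joint)
```

With identity matrices for U, V, P this reduces to the plain Hadamard product of the two layer features, which makes the role of the joint embedding easy to check by hand.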
Then, a trilinear feature is extracted with the trilinear pooling method provided by the invention, which uses three different layers X, Y and Z for feature extraction. The trilinear pooling replaces the Hadamard product over only two layers with one over three, and is therefore expressed as:
f = P^T (U^T X ∘ V^T Y ∘ W^T Z)
where W ∈ R^{hw×d} is a projection matrix and f combines the three separate layers.
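Extending the same factorized form to three layers replaces the two-way Hadamard product with a three-way one, adding only the extra projection matrix W. A self-contained pure-Python sketch with row-wise matrices; dimensions and helper names are illustrative, not from the patent.

```python
def project(M, x):
    """Matrix-vector product: M is a list of rows, x a plain vector."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def trilinear_pool(x, y, z, U, V, W, P):
    """Trilinear pooling f = P(Ux o Vy o Wz): project the three layer
    features into the joint embedding, multiply them elementwise,
    then map the result through the classifier matrix P."""
    joint = [a * b * c for a, b, c in
             zip(project(U, x), project(V, y), project(W, z))]
    return project(P, joint)
```
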
And finally, predicting the category by using the fused characteristics.
Finally, the trilinear feature and the three bilinear features are fused, and a SoftMax vector is computed to obtain the prediction. The loss function of the invention is expressed as:
l_full = l_high + l_medium + l_low,
where the loss at each resolution is defined as l_high = loss(I_high), l_medium = loss(I_medium) and l_low = loss(I_low). This concludes the description of the knowledge-classification-based complex target asymptotic recognition method.
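The total loss simply sums one term per resolution level. A pure-Python sketch assuming a softmax cross-entropy per-level loss; the patent states only that the three losses are added, so the concrete per-level loss here is an assumption.

```python
import math

def cross_entropy(logits, label):
    """Numerically stable softmax cross-entropy for a single sample."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    return -math.log(exps[label] / sum(exps))

def full_loss(level_outputs):
    """l_full = l_high + l_medium + l_low: one (logits, true_label) pair
    per classifier head (species, genus, family)."""
    return sum(cross_entropy(lg, lb) for lg, lb in level_outputs)
```
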
According to another embodiment of the present invention, there is provided a complex object asymptotic recognition apparatus based on knowledge classification, the apparatus including: a memory storing computer executable instructions;
a processor executing computer executable instructions in the memory, the processor performing the steps of:
image preprocessing, namely dividing an original image dataset I into datasets at multiple resolution levels, which serve as the reference datasets for asymptotic identification of a complex target;
inputting the images in batches into a VGG-16 network pre-trained on the ImageNet dataset for feature extraction;
performing bilinear feature fusion and trilinear feature fusion calculations on the features extracted at the various resolutions;
and predicting the category using the fused features.
Fig. 2 is a schematic diagram illustrating an application of a complex object asymptotic recognition method according to an embodiment of the present invention. As shown in FIG. 2, the identification method of the present invention is illustrated by the identification of a golden samara.
First, the pictures are divided into three sets by resolution: I_high, I_medium, I_low. A VGG-16 network is then trained to extract the relu5_1, relu5_2 and relu5_3 features of the three resolution images.
The bilinear features are combined on the basis of the three features relu5_1, relu5_2 and relu5_3; bilinear feature fusion with the cross-layer factorized bilinear pooling method then yields three bilinear features.
A trilinear feature is extracted using the trilinear pooling method.
Finally, the trilinear feature and the three bilinear features are fused, and a SoftMax vector is computed to obtain the prediction: the family classifier, the genus classifier and the species classifier determine in turn the family, the genus (Caragana) and finally the exact species.
FIG. 3 shows a sample of the correct predictions of the present invention on CUB200-2011. The CUB200-2011 dataset, proposed by the California Institute of Technology in 2010, is the benchmark image dataset for current fine-grained classification research. It contains 11,788 bird pictures covering 200 species, 113 genera and 36 families. Using the present identification method, some pictures from CUB200-2011 were taken for testing; the third row shows, via a visualization tool, categories mispredicted by the HBP algorithm, whereas our MLPH model predicts these categories correctly.
FIG. 4 compares the recognition accuracy of the method of the present invention on the CUB200-2011, Stanford Cars and FGVC-Aircraft datasets. The Stanford Cars data contain 16,185 car pictures in 196 categories, of which 8,144 are training data and 8,041 are test data; the categories are defined by year, manufacturer and model, and the 196 categories belong to 13 families. The FGVC-Aircraft dataset, a classical benchmark in fine-grained image classification research released in 2013 with the involvement of the Toyota Technological Institute at Chicago, comprises 10,000 aircraft pictures divided, according to the three-layer hierarchy of manufacturer, family and variant, into 100 variants, 70 families and 30 manufacturers. Comparative tests show that the recognition accuracy of the present method is significantly higher than that of the HBP method.
The invention plans the coarse-grained and fine-grained tasks of complex targets in a unified framework, addressing the feature references provided by coarse-grained tasks that are neglected in real-world fine-grained recognition. Experiments show that, compared with existing methods, the knowledge-classification-based complex target asymptotic recognition method achieves clearly improved recognition accuracy on the CUB200-2011, Stanford Cars and FGVC-Aircraft datasets, reaching the best accuracy on each.
The foregoing description of the embodiments of the present disclosure has been presented for purposes of illustration and description, and is not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the various embodiments described. The terminology used herein was chosen in order to best explain the principles of the embodiments, the practical application, or the technical improvements in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (8)

1. A knowledge classification-based complex target asymptotic identification method is characterized by comprising the following steps:
image preprocessing, namely dividing an original image dataset I into datasets at multiple resolution levels, which serve as the reference datasets for asymptotic identification of a complex target;
inputting the images in batches into a VGG-16 network pre-trained on the ImageNet dataset for feature extraction;
performing bilinear feature fusion and trilinear feature fusion calculations on the features extracted at the various resolutions;
the combination of bilinear features f_A(I) ∈ R^{hw×c} and f_B(I) ∈ R^{hw×c} equals f_A(I)^T f_B(I) ∈ R^{c×c}, where c is the number of feature maps and h and w represent the height and width of the feature maps;
the cross-layer factorized bilinear pooling is expressed as:
f = P^T (U^T X ∘ V^T Y)
wherein X represents one layer and Y another, U ∈ R^{hw×d} and V ∈ R^{hw×d} are projection matrices, P ∈ R^{d×c} is a classifier matrix, ∘ is the Hadamard product, d represents the dimension of the joint embedding, and f is the output of the bilinear model;
the trilinear pooling method is expressed as:
f = P^T (U^T X ∘ V^T Y ∘ W^T Z)
wherein W ∈ R^{hw×d} represents a projection matrix, and f combines three separate layers, X representing one layer and Y, Z the other two;
and predicting the category by using the fused characteristics.
2. The knowledge-classification-based complex target asymptotic recognition method according to claim 1, characterized in that the original image dataset I is divided into three image datasets with resolution from high to low: I_high, I_medium, I_low.
3. The knowledge-classification-based complex target asymptotic recognition method according to claim 2, characterized in that the resolution r of the original image dataset is defined as the high resolution r_high, and the corresponding image dataset is determined as I_high;
the resolution of the original image dataset is gradually reduced to obtain image datasets at two further resolutions:
when the accuracy falls below the threshold t_med, the dataset at resolution r_med is determined as I_medium;
when the accuracy falls below the threshold t_low, the dataset at resolution r_low is determined as I_low.
4. The knowledge-classification-based complex target asymptotic recognition method according to claim 3, characterized in that the three resolution levels are mapped one-to-one onto the biological taxonomy:
I_high corresponds to species, I_medium to genus, and I_low to family.
5. The knowledge-classification-based complex target asymptotic recognition method according to claim 3, characterized in that the images are classified with an SVM classification algorithm, starting from the high resolution r_high, with classification through the taxonomic levels governed by the accuracy thresholds t_med and t_low.
6. The knowledge-classification-based complex target asymptotic recognition method of claim 1, wherein inputting the images in batches into a VGG-16 network pre-trained on the ImageNet dataset for feature extraction includes: extracting the relu5_1, relu5_2 and relu5_3 features of the three resolution image sets.
7. The knowledge-classification-based complex target asymptotic recognition method of claim 1, wherein the trilinear feature and the three bilinear features are fused, and a SoftMax vector is computed to obtain the prediction;
the three loss functions add up to the total loss:
l_full = l_high + l_medium + l_low,
wherein the loss at each resolution is defined as
l_high = loss(I_high), l_medium = loss(I_medium) and l_low = loss(I_low).
8. A knowledge classification-based complex target asymptotic recognition device, the device comprising: a memory storing computer executable instructions;
a processor executing computer executable instructions in the memory, the processor performing the steps of:
image preprocessing, namely dividing an original image dataset I into datasets at multiple resolution levels, which serve as the reference datasets for asymptotic identification of a complex target;
inputting the images in batches into a VGG-16 network pre-trained on the ImageNet dataset for feature extraction;
performing bilinear feature fusion and trilinear feature fusion calculations on the features extracted at the various resolutions;
the combination of bilinear features f_A(I) ∈ R^{hw×c} and f_B(I) ∈ R^{hw×c} equals f_A(I)^T f_B(I) ∈ R^{c×c}, where c is the number of feature maps and h and w represent the height and width of the feature maps;
the cross-layer factorized bilinear pooling is expressed as:
f = P^T (U^T X ∘ V^T Y)
wherein X represents one layer and Y another, U ∈ R^{hw×d} and V ∈ R^{hw×d} are projection matrices, P ∈ R^{d×c} is a classifier matrix, ∘ is the Hadamard product, d represents the dimension of the joint embedding, and f is the output of the bilinear model;
the trilinear pooling method is expressed as:
f = P^T (U^T X ∘ V^T Y ∘ W^T Z)
wherein W ∈ R^{hw×d} represents a projection matrix, and f combines three separate layers, X representing one layer and Y, Z the other two;
and predicting the category by using the fused characteristics.
CN201911377824.XA 2019-12-27 2019-12-27 Knowledge classification-based complex target asymptotic identification method and device Active CN111401122B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911377824.XA CN111401122B (en) 2019-12-27 2019-12-27 Knowledge classification-based complex target asymptotic identification method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911377824.XA CN111401122B (en) 2019-12-27 2019-12-27 Knowledge classification-based complex target asymptotic identification method and device

Publications (2)

Publication Number Publication Date
CN111401122A CN111401122A (en) 2020-07-10
CN111401122B true CN111401122B (en) 2023-09-26

Family

ID=71430306

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911377824.XA Active CN111401122B (en) 2019-12-27 2019-12-27 Knowledge classification-based complex target asymptotic identification method and device

Country Status (1)

Country Link
CN (1) CN111401122B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112380392A (en) * 2020-11-17 2021-02-19 北京百度网讯科技有限公司 Method, apparatus, electronic device and readable storage medium for classifying video
US11748865B2 (en) 2020-12-07 2023-09-05 International Business Machines Corporation Hierarchical image decomposition for defect detection

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108875826A (en) * 2018-06-15 2018-11-23 武汉大学 A kind of multiple-limb method for checking object based on the compound convolution of thickness granularity
CN109086792A (en) * 2018-06-26 2018-12-25 上海理工大学 Based on the fine granularity image classification method for detecting and identifying the network architecture
CN109685115A (en) * 2018-11-30 2019-04-26 西北大学 A kind of the fine granularity conceptual model and learning method of bilinearity Fusion Features
CN110188816A (en) * 2019-05-28 2019-08-30 东南大学 Based on the multiple dimensioned image fine granularity recognition methods for intersecting bilinearity feature of multithread
CN110210550A (en) * 2019-05-28 2019-09-06 东南大学 Image fine granularity recognition methods based on integrated study strategy
WO2019169816A1 (en) * 2018-03-09 2019-09-12 中山大学 Deep neural network for fine recognition of vehicle attributes, and training method thereof
CN110263863A (en) * 2019-06-24 2019-09-20 南京农业大学 Fine granularity mushroom phenotype recognition methods based on transfer learning Yu bilinearity InceptionResNetV2

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009076218A2 (en) * 2007-12-07 2009-06-18 University Of Maryland Composite images for medical procedures


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Liu Shangwang; Gao Xiang. Fine-grained image classification method based on deep model transfer. Journal of Computer Applications, 2018, (Issue 08), full text. *
Liang Huagang; Wen Xiaoqian; Liang Dandan; Li Huaide; Ru Feng. Fine-grained food image recognition with a multi-level convolutional feature pyramid. Journal of Image and Graphics, 2019, (Issue 06), full text. *

Also Published As

Publication number Publication date
CN111401122A (en) 2020-07-10

Similar Documents

Publication Publication Date Title
CN109196514B (en) Image classification and labeling
Endres et al. Category-independent object proposals with diverse ranking
US9111375B2 (en) Evaluation of three-dimensional scenes using two-dimensional representations
JP2017062781A (en) Similarity-based detection of prominent objects using deep cnn pooling layers as features
CN101877064B (en) Image classification method and image classification device
CN111052144A (en) Attribute-aware zero-sample machine vision system by joint sparse representation
Zhou et al. Scene classification using multi-resolution low-level feature combination
Veeravasarapu et al. Adversarially tuned scene generation
CN111401122B (en) Knowledge classification-based complex target asymptotic identification method and device
JPWO2019146057A1 (en) Learning device, live-action image classification device generation system, live-action image classification device generation device, learning method and program
CN104966052A (en) Attributive characteristic representation-based group behavior identification method
Yadav et al. An improved deep learning-based optimal object detection system from images
Lou et al. Extracting 3D layout from a single image using global image structures
Boutell et al. Multi-label Semantic Scene Classification
Singh et al. Semantically guided geo-location and modeling in urban environments
CN112183464A (en) Video pedestrian identification method based on deep neural network and graph convolution network
CN110111365B (en) Training method and device based on deep learning and target tracking method and device
CN114492634B (en) Fine granularity equipment picture classification and identification method and system
WO2020119624A1 (en) Class-sensitive edge detection method based on deep learning
Shuai et al. Regression convolutional network for vanishing point detection
Ali et al. Human-inspired features for natural scene classification
CN116258937A (en) Small sample segmentation method, device, terminal and medium based on attention mechanism
CN113408546B (en) Single-sample target detection method based on mutual global context attention mechanism
Patil Car damage recognition using the expectation maximization algorithm and mask R-CNN
Anggoro et al. Classification of Solo Batik patterns using deep learning convolutional neural networks algorithm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant