CN112132004B - Fine granularity image recognition method based on multi-view feature fusion - Google Patents

Fine granularity image recognition method based on multi-view feature fusion

Info

Publication number
CN112132004B
CN112132004B (application CN202010992253.7A)
Authority
CN
China
Prior art keywords
feature
loss function
bilinear
image
fine
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010992253.7A
Other languages
Chinese (zh)
Other versions
CN112132004A (en)
Inventor
黄伟锋
张甜
常东良
马占宇
柳斐
王丹
刘念
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South To North Water Transfer Middle Route Information Technology Co ltd
Beijing University of Posts and Telecommunications
Original Assignee
South To North Water Transfer Middle Route Information Technology Co ltd
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South To North Water Transfer Middle Route Information Technology Co ltd, Beijing University of Posts and Telecommunications filed Critical South To North Water Transfer Middle Route Information Technology Co ltd
Priority to CN202010992253.7A priority Critical patent/CN112132004B/en
Publication of CN112132004A publication Critical patent/CN112132004A/en
Application granted granted Critical
Publication of CN112132004B publication Critical patent/CN112132004B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/25 Fusion techniques
    • G06F 18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/25 Determination of region of interest [ROI] or a volume of interest [VOI]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

A fine-grained image recognition method based on multi-view feature fusion relates to the technical field of image processing. It addresses the problems of existing fine-grained image recognition methods: they ignore the detail information of images, adapt poorly to visual differences between images, introduce complex loss functions, and increase the number of model parameters. The invention introduces a suppression branch that forces the network to find subtle discriminative features among confusable categories by suppressing the most salient region in the image. A similar-contrast learning module is introduced to fuse the feature vectors of same-class samples, increasing the interaction information between different images of the same category. A center loss function is also introduced to minimize the distance between features and their corresponding class centers, making the learned features more discriminative. The accuracy of fine-grained image recognition is thereby improved.

Description

Fine granularity image recognition method based on multi-view feature fusion
Technical Field
The invention relates to the technical field of image processing, in particular to a fine-grained image recognition method based on multi-view feature fusion.
Background
Fine-grained image classification aims at finer sub-classification within a basic category, such as species of birds or breeds of dogs. The task therefore requires capturing subtle inter-class differences and fully mining the discriminative features of an image.
Fine-grained objects are ubiquitous in real life, and fine-grained image recognition is accordingly an important research topic in computer vision. Current fine-grained image recognition faces three main challenges: (1) samples of the same category may exhibit large variance due to differences in pose, background, and shooting angle; (2) different categories, belonging to the same parent category, differ only in subtle regions, such as the beak or tail of a bird; (3) collecting and labeling fine-grained images is time-consuming and labor-intensive, as shown in fig. 1.
Existing methods mainly approach recognition from the following three directions: (1) fine-grained image recognition based on a localization-classification network; (2) learning more discriminative representations directly by developing powerful deep models for fine-grained recognition; (3) combining the global and local features of the image to realize fine-grained classification.
Prior art 1, fine-grained image classification with bilinear pooling (Bilinear Pooling), extracts features through a pair of pretrained twin convolutional neural networks and performs bilinear pooling over the channels of the features to obtain a high-order feature representation, enhancing the discriminative power of the features. The method owes its improvement in fine-grained recognition accuracy to this new pooling scheme.
The method provides a new bilinear pooling scheme, but offers no design specific to fine-grained image recognition in terms of the relations among fine-grained categories, the number of model parameters, or the number of detail regions. Nor does it consider that fine-grained images contain rich detail information under the combined influence of small inter-class differences and large intra-class differences.
Prior art 2 adopts a multi-attention multi-class constraint network (Multi-Attention Multi-Class Constraint): multiple attention regions of the input image are extracted through a one-squeeze multi-excitation module, metric learning is then introduced, and the network is trained with a triplet loss and a softmax loss, pulling together the same attention of same-class features and pushing apart different attentions or different classes, thereby strengthening the relations among parts and improving fine-grained recognition accuracy.
This method mainly uses metric learning to improve the sample distribution in feature space, so it adapts poorly to mining the visual differences within a pair of images. Moreover, the introduced loss function is complex, a large number of sample pairs must be constructed, and the parameter count of the model increases substantially.
Disclosure of Invention
The invention provides a fine-grained image recognition method based on multi-view feature fusion, aiming to solve the problems of existing fine-grained image recognition methods: neglected image detail information, poor adaptability to visual differences between images, complex introduced loss functions, and an increased number of model parameters.
A fine-grained image recognition method based on multi-view feature fusion is realized by the following steps:
Step one, bilinear feature extraction;
Inputting an original image into a bilinear feature extraction network, and fusing the feature maps output by different convolution layers to obtain bilinear feature vectors; the feature extraction network adopts a network structure pre-trained on the ImageNet dataset;
Step two, suppression branch learning, the specific process being as follows:
Step 2.1, generating an attention map from the feature maps output by different convolution layers of the feature extraction network of step one, according to their size and a threshold;
Step 2.2, generating a suppression mask from the attention map of step 2.1 and overlaying it on the original image to generate a suppression image with a masked local region;
Step 2.3, performing the bilinear feature extraction of step one on the suppression image of step 2.2 to obtain a bilinear feature vector, inputting it into a fully connected layer to obtain predicted class probability values, and calculating the multi-class cross entropy of the predicted values (a code sketch of this classification head is given after step five below);
Step three, similar-contrast module learning;
Step 3.1, randomly selecting N other images of the same category as the original image as positive sample images;
Step 3.2, sending the target image and the positive sample images of step 3.1 into the feature extraction network of step one for bilinear feature extraction, so as to obtain the bilinear feature vectors of several images of the same category;
Step 3.3, averaging the bilinear feature vectors of the several same-category images obtained in step 3.2 to obtain a fused feature vector, inputting it into a fully connected layer to obtain a predicted probability, and calculating the multi-class cross entropy of the prediction;
Step four, calculating the center loss function $L_C$;
Let $v_i$ be the bilinear feature of the $i$-th sample, $c_i$ the average feature of all samples of the class of sample $i$ (i.e. the class center), and $N$ the number of samples in the current batch; then the center loss function $L_C$ is:

$$L_C = \frac{1}{2}\sum_{i=1}^{N} \left\| v_i - c_i \right\|_2^2$$
Step five, calculating a model optimization loss function;
The cross-entropy loss function of the original image's bilinear feature vector, the cross-entropy loss function of the suppression image's bilinear feature vector, the cross-entropy loss function of the fused feature, and the center loss function are weighted and summed to obtain the loss function for model optimization.
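For concreteness, the classification head used in steps 2.3, 3.3, and five can be sketched as follows: a fully connected layer maps a bilinear feature vector to class logits, and the multi-class cross entropy is computed against the labels. This is a minimal PyTorch sketch; the bilinear dimension and class count are illustrative assumptions, not values fixed by the invention.

```python
import torch
import torch.nn.functional as F

d, num_classes = 2048, 200               # assumed bilinear dimension and class count
fc = torch.nn.Linear(d, num_classes)     # the fully connected layer of the head

def branch_ce_loss(bilinear_vec, labels):
    """bilinear_vec: (N, d) bilinear features; labels: (N,) class indices."""
    logits = fc(bilinear_vec)                # predicted class scores
    return F.cross_entropy(logits, labels)   # multi-class cross entropy
```

A head of this structure serves the original-image branch, the suppression branch, and the fused-feature branch alike; whether the branches share its weights is not specified by the text.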
The invention has the beneficial effects that: it comprehensively considers the large intra-class differences, small inter-class differences, and strong background-noise influence of fine-grained images. It introduces a suppression branch that, by suppressing the most salient region in the image, forces the network to find subtle discriminative features among confusable categories. It also introduces a similar-contrast learning module that fuses the feature vectors of same-class samples, increasing the interaction information between different images of the same category. It further introduces a center loss function that minimizes the distance between features and their corresponding class centers, making the learned features more discriminative.
Taken together, the method comprehensively uses global and local features in the decision process, achieves clear performance improvements on several fine-grained image classification tasks, is more robust than existing methods, and is easy to deploy in practice. The accuracy of fine-grained image recognition is improved.
Drawings
Panels a, b, c and d of fig. 1 are schematic diagrams of four groups of existing fine-grained images;
FIG. 2 is a schematic diagram of bilinear feature extraction in a fine-grained image recognition method based on multi-view feature fusion according to the invention;
FIG. 3 is a schematic diagram of similar contrast learning in a fine-grained image recognition method based on multi-view feature fusion according to the invention;
FIG. 4 is a schematic diagram of model optimization loss function calculation in a fine-grained image recognition method based on multi-view feature fusion according to the invention;
Fig. 5 is a feature visualization effect diagram obtained by a fine-grained image recognition method based on multi-view feature fusion according to the invention.
Detailed Description
The present embodiment is described with reference to figs. 2 to 5. A fine-grained image recognition method based on multi-view feature fusion is implemented by the following steps:
Step one, bilinear feature extraction: a ResNet-50 network structure pre-trained on ImageNet is adopted, original images of fixed size are input, and the feature maps output by different convolution layers are fused to obtain bilinear feature vectors.
With reference to fig. 2, the feature extraction step uses a network pre-trained on the ImageNet dataset as the base feature extraction network; a common image classification network such as VGGNet, GoogLeNet, or ResNet can be fine-tuned to adapt the model to the specific task. Specifically, the original image is input to the feature extraction network to obtain the feature maps output by the last two convolution layers, denoted $F_1 \in \mathbb{R}^{D_1 \times H \times W}$ and $F_2 \in \mathbb{R}^{D_2 \times H \times W}$, where $D_1$, $D_2$ are the channel counts of the two features and $H$ and $W$ are the height and width of the feature maps. To avoid an excessively high fused feature dimension while the generated feature vector still contains sufficient information, only the feature maps of $n$ randomly selected channels of $F_2$ are fused with $F_1$. At each spatial position, the feature vectors along the channel dimension of $F_1$ and $F_2$ are $f_1 \in \mathbb{R}^{D_1}$ and $f_2 \in \mathbb{R}^{n}$. The outer product of the two vectors gives a bilinear matrix $b = f_1 f_2^{\top} \in \mathbb{R}^{D_1 \times n}$. The bilinear matrices of all positions in the feature map are summed and the result is flattened into a vector, the bilinear vector $v \in \mathbb{R}^{d}$ with $d = D_1 \times n$ (i.e. $D_1 \times D_2$ when all channels are kept). The bilinear vector provides a stronger feature representation than a linear model.
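A minimal PyTorch sketch of this bilinear extraction follows, assuming a ResNet-50 backbone with its last two stages (layer3/layer4) as the two tapped convolution layers; the layer choice, the value of n_channels, and the signed-square-root plus L2 normalization at the end are illustrative assumptions rather than details fixed by the text.

```python
import torch
import torchvision

class BilinearExtractor(torch.nn.Module):
    def __init__(self, n_channels=512):
        super().__init__()
        backbone = torchvision.models.resnet50(weights="IMAGENET1K_V1")
        # Stem up to and including layer2; then tap layer3 (F1) and layer4 (F2).
        self.stem = torch.nn.Sequential(*list(backbone.children())[:-4])
        self.layer3 = backbone.layer3   # outputs D1 = 1024 channels
        self.layer4 = backbone.layer4   # outputs D2 = 2048 channels
        # Randomly fix n channel indices of F2 to keep the fused dimension manageable.
        self.register_buffer("idx", torch.randperm(2048)[:n_channels])

    def forward(self, x):
        f = self.stem(x)
        f1 = self.layer3(f)                       # (B, D1, H, W)
        f2 = self.layer4(f1)                      # (B, D2, H/2, W/2)
        f2 = torch.nn.functional.interpolate(f2, size=f1.shape[2:])  # match H, W
        f2 = f2[:, self.idx]                      # keep n random channels of F2
        B, D1, _, _ = f1.shape
        n = f2.shape[1]
        # Sum of per-position outer products, flattened into the bilinear vector.
        v = torch.einsum("bihw,bjhw->bij", f1, f2).reshape(B, D1 * n)
        v = torch.sign(v) * torch.sqrt(v.abs() + 1e-10)   # signed sqrt (common practice)
        return torch.nn.functional.normalize(v)           # L2 normalization
```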
Step two, suppression branch learning step:
A. Attention map generation step: the attention map is generated according to the size of the feature map and a threshold.
B. Suppression mask generation step: a suppression mask is generated from the attention map and overlaid on the original image to produce a suppression image with a masked local region.
C. Multi-class cross entropy calculation step: the suppression image is passed through step one to obtain its bilinear feature vector, which is input into a fully connected layer to obtain predicted class probabilities; the multi-class cross entropy of the predictions is then computed.
The suppression branch learning step comprises the following three aspects:
Step A: for a feature map $F \in \mathbb{R}^{D \times H \times W}$ output by a convolution layer of the feature extraction network, the per-channel average activations are computed and the channels are sorted by them in descending order; the top-5 values, normalized to a distribution $p_k$, are selected to calculate the entropy:

$$E = -\sum_{k=1}^{5} p_k \log p_k$$

The attention map $A$ is constructed by comparing the entropy with the magnitude of the threshold $\delta$: when the entropy is low, a single channel dominates and its map is used; when it is high, the top-5 channel maps are averaged. With $F_k$ denoting the two-dimensional feature map of the $k$-th channel after sorting:

$$A = \begin{cases} F_1, & E < \delta \\ \dfrac{1}{5}\displaystyle\sum_{k=1}^{5} F_k, & E \ge \delta \end{cases}$$
Step B: the attention map is enlarged to the original image size and its average value $m$ is calculated; with a threshold parameter $\theta$ set in the range between 0 and 1, $m\theta$ is taken as the threshold, elements of the attention map larger than $m\theta$ are set to 0 and all other elements to 1, giving the suppression mask $M$:

$$M(x, y) = \begin{cases} 0, & A(x, y) > m\theta \\ 1, & \text{otherwise} \end{cases}$$
Step C: the suppression mask is overlaid on the original image, giving the suppression image with the local region masked:

$$I_s(x, y) = I(x, y) \cdot M(x, y)$$

where $I(x, y)$ is the value of the original image $I$ at position $(x, y)$.
Because the most salient regions of the image are suppressed, attention is dispersed, forcing the neural network to learn discriminative information from other regions. This reduces the network's dependence on the training samples, prevents overfitting, and further improves the robustness of the model.
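The suppression branch can be sketched as below, under stated assumptions: the entropy is computed from the normalized top-5 per-channel averages, the comparison direction against δ follows the reading given above, and the hypothetical function name suppress and the defaults for delta and theta are illustrative, tunable choices.

```python
import torch
import torch.nn.functional as F

def suppress(image, feature_map, delta=1.5, theta=0.5):
    """image: (B, 3, Hi, Wi); feature_map: (B, D, H, W) from the backbone."""
    B = image.shape[0]
    chan_avg = feature_map.mean(dim=(2, 3))              # (B, D) per-channel averages
    top5, idx = chan_avg.topk(5, dim=1)                  # top-5 channels by average
    p = top5 / top5.sum(dim=1, keepdim=True)             # normalize to a distribution
    entropy = -(p * (p + 1e-10).log()).sum(dim=1)        # (B,) entropy E
    maps = feature_map[torch.arange(B)[:, None], idx]    # (B, 5, H, W) top-5 maps
    # High entropy: top channels are comparably informative, so average them;
    # low entropy: keep only the single strongest channel (assumed rule).
    attn = torch.where(entropy.view(B, 1, 1) > delta,
                       maps.mean(dim=1), maps[:, 0])     # (B, H, W) attention map A
    attn = F.interpolate(attn.unsqueeze(1), size=image.shape[2:],
                         mode="bilinear", align_corners=False)  # enlarge to image size
    m = attn.mean(dim=(2, 3), keepdim=True)              # per-image average m
    mask = (attn <= m * theta).float()                   # 0 where most salient
    return image * mask                                  # suppressed image I_s
```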
Step three, similar-contrast module learning step:
A. Image sampling step: N other images of the same category are randomly selected as positive samples.
B. Feature fusion step: the bilinear feature vectors of the target image and the randomly sampled positive sample images, obtained through step one, are fused; the resulting fused feature integrates the feature information of several images of the same category.
C. Fused-feature loss calculation step: the fused feature vector is input directly into a fully connected layer to obtain a predicted probability, and the multi-class cross entropy of the class prediction is computed.
Referring to fig. 3, in step A, N images of the same category as the input image are randomly selected and all sent into the bilinear feature extraction network of step one.
Step B averages the bilinear feature vectors of the several same-category images output by step A to obtain the fused feature vector:

$$V'(j) = \frac{1}{T + 1}\left( V(j) + \sum_{r=1}^{T} V_r(j) \right)$$

where $j$ is the position in the feature vector, $V(j)$ is the value of the target image's feature vector at the $j$-th position, $T$ is the number of selected positive samples, and $V_r(j)$ is the value of the $r$-th positive sample's feature vector at the $j$-th position.
Step four, center loss calculation step:
A. Class center generation step: during training, the learned center feature vector of each class is continuously updated.
B. Center loss calculation step: the distance between the bilinear feature vector of each input image and its class center vector is taken as the center loss and is continuously optimized during training.
In this embodiment, a feature vector is maintained for each class as the class center of that class and is updated continuously as training progresses. By penalizing the offset between each sample's bilinear feature vector and the center of its class, samples of the same class are pulled together as much as possible, and the complex construction of sample pairs is avoided. Let $v_i$ be the bilinear feature of the $i$-th sample, $c_i$ the average feature of all samples of the class of sample $i$ (i.e. the class center), and $N$ the number of samples in the current batch; the formula is as follows:

$$L_C = \frac{1}{2}\sum_{i=1}^{N} \left\| v_i - c_i \right\|_2^2$$
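A sketch of the center loss with running class centers follows; the text states only that the centers are continuously updated during training, so the small-step update with rate alpha is an assumed, common choice (as in Wen et al.'s center loss).

```python
import torch

class CenterLoss(torch.nn.Module):
    def __init__(self, num_classes, feat_dim, alpha=0.5):
        super().__init__()
        # One learned center per class, held outside the optimizer.
        self.register_buffer("centers", torch.zeros(num_classes, feat_dim))
        self.alpha = alpha

    def forward(self, v, labels):
        c = self.centers[labels]                       # (N, d) centers c_i
        loss = 0.5 * ((v - c) ** 2).sum()              # L_C over the batch
        with torch.no_grad():                          # move centers toward features
            self.centers.index_add_(0, labels, self.alpha * (v.detach() - c))
        return loss
```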
Step five, model optimization loss function calculation step:
The cross-entropy loss of the original image's bilinear features, the cross-entropy loss of the suppression image's bilinear features, the cross-entropy loss of the fused features, and the center loss are weighted and summed to obtain the model's optimization loss function.
Referring to fig. 4, denote the cross-entropy loss of the original image's bilinear feature vector by $L_{CE1}$, that of the suppression image's bilinear feature vector by $L_{CE2}$, that of the fused feature by $L_{CE3}$, and the center loss by $L_C$; the weighted summation of these losses gives the model's optimization loss function $L$:

$$L = L_{CE1} + L_{CE2} + L_{CE3} + \lambda L_C$$

where $\lambda$ is the weight of the center loss function.
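The overall objective can be sketched directly from the formula; unit weights on the three cross-entropy terms are assumed, since the text names only the center-loss weight λ.

```python
import torch.nn.functional as F

def total_loss(logits_orig, logits_supp, logits_fused, labels,
               v_orig, center_loss, lam=0.5):
    l_ce1 = F.cross_entropy(logits_orig, labels)    # original-image branch
    l_ce2 = F.cross_entropy(logits_supp, labels)    # suppressed-image branch
    l_ce3 = F.cross_entropy(logits_fused, labels)   # fused-feature branch
    l_c = center_loss(v_orig, labels)               # center loss L_C
    return l_ce1 + l_ce2 + l_ce3 + lam * l_c        # L = sum of CE terms + lambda*L_C
```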
With reference to fig. 5: the first row shows original images randomly selected from the datasets, the second row shows the class activation maps produced by the global branch for these inputs, and the third row shows the class activation maps produced by the suppression branch. It can be seen that in the global branch the network learns the most salient regions of the image, such as a bird's beak or a car's headlights, while in the suppression branch the network learns the subtle features that aid fine-grained classification, such as a bird's torso or a car's wheels. Combining the multiple views gives the network model a more comprehensive basis for judgment: it obtains the salient regions and also captures the subtle fine-grained features.
The fine-grained image recognition method of this embodiment introduces a new data augmentation scheme: guided by the attention map, part regions of the image are suppressed, dispersing attention so that the network learns more complementary regional feature information. A similar-contrast module is introduced to fuse feature information from several images of the same category, so that the representations of same-category images are as close as possible in the embedding space, improving classification performance.
The technical features of the above embodiments may be combined arbitrarily. For brevity of description, not all possible combinations of the technical features in the above embodiments are described; however, any combination of these technical features that involves no contradiction should be considered within the scope of this specification.
The above examples express only a few embodiments of the invention, described concretely and in detail, but they are not to be construed as limiting the scope of the invention. It should be noted that those skilled in the art can make several variations and improvements without departing from the concept of the invention, and all of these fall within the protection scope of the invention. Accordingly, the protection scope of the patent is subject to the appended claims.

Claims (2)

1. A fine-grained image recognition method based on multi-view feature fusion, characterized in that the method is realized by the following steps:
Step one, bilinear feature extraction;
Inputting an original image into a bilinear feature extraction network, and fusing the feature maps output by different convolution layers to obtain bilinear feature vectors; the feature extraction network adopts a network structure pre-trained on the ImageNet dataset;
Step two, suppression branch learning, the specific process being as follows:
Step 2.1, generating an attention map from the feature maps output by different convolution layers of the feature extraction network of step one, according to their size and a threshold;
In step 2.1, the specific process of generating the attention map is as follows:
for the feature map $F \in \mathbb{R}^{D \times H \times W}$ output by the last convolution layer of the feature extraction network, where $D$ is the number of channels and $H$ and $W$ are the height and width of the feature map, the per-channel average activations are computed and the channels are sorted by them in descending order; the top-5 values, normalized to a distribution $p_k$, give the entropy $E$:

$$E = -\sum_{k=1}^{5} p_k \log p_k$$

The attention map $A$ is constructed by comparing the entropy with the magnitude of the threshold $\delta$:

$$A = \begin{cases} F_1, & E < \delta \\ \dfrac{1}{5}\displaystyle\sum_{k=1}^{5} F_k, & E \ge \delta \end{cases}$$

wherein $F_k$ is the two-dimensional feature map corresponding to each channel after channel sorting;
Step 2.2, generating a suppression mask from the attention map of step 2.1 and overlaying it on the original image to generate a suppression image with a masked local region;
In step 2.2, the specific process of generating the suppression mask is as follows:
the attention map of step 2.1 is enlarged to the original image size and its average value $m$ is computed; with a threshold parameter $\theta$ set in the range between 0 and 1, $m\theta$ is taken as the threshold, elements of the attention map larger than $m\theta$ are set to 0 and all other elements to 1, giving the suppression mask $M$:

$$M(x, y) = \begin{cases} 0, & A(x, y) > m\theta \\ 1, & \text{otherwise} \end{cases}$$

wherein $A(x, y)$ is the value of the attention map $A$ at position $(x, y)$;
the suppression mask is overlaid on the original image to obtain the suppression image $I_s(x, y)$ with the local region masked:

$$I_s(x, y) = I(x, y) \cdot M(x, y)$$

wherein $I(x, y)$ is the value of the original image $I$ at position $(x, y)$;
Step 2.3, performing the bilinear feature extraction of step one on the suppression image of step 2.2 to obtain a bilinear feature vector, inputting it into a fully connected layer to obtain predicted class probability values, and calculating the multi-class cross entropy of the predicted values;
Step three, similar-contrast module learning;
Step 3.1, randomly selecting N other images of the same category as the original image as positive sample images;
Step 3.2, sending the target image and the positive sample images of step 3.1 into the feature extraction network of step one for bilinear feature extraction, so as to obtain the bilinear feature vectors of several images of the same category;
Step 3.3, averaging the bilinear feature vectors of the several same-category images obtained in step 3.2 to obtain a fused feature vector, inputting it into a fully connected layer to obtain a predicted probability, and calculating the multi-class cross entropy of the prediction;
Step four, calculating the center loss function $L_C$;
Let $v_i$ be the bilinear feature of the $i$-th sample, $c_i$ the average feature of all samples of the class of sample $i$ (i.e. the class center), and $N$ the number of samples in the current batch; then the center loss function $L_C$ is:

$$L_C = \frac{1}{2}\sum_{i=1}^{N} \left\| v_i - c_i \right\|_2^2$$
Step five, calculating a model optimization loss function;
The cross-entropy loss function of the original image's bilinear feature vector, the cross-entropy loss function of the suppression image's bilinear feature vector, the cross-entropy loss function of the fused feature, and the center loss function are weighted and summed to obtain the model's optimization loss function;
in step five, with the cross-entropy loss function of the original image's bilinear feature vector denoted $L_{CE1}$, that of the suppression image's bilinear feature vector denoted $L_{CE2}$, that of the fused feature denoted $L_{CE3}$, and the center loss function denoted $L_C$, the weighted summation gives the model's optimization loss function $L$, finally realizing fine-grained image recognition, expressed by the following formula:

$$L = L_{CE1} + L_{CE2} + L_{CE3} + \lambda L_C$$

where $\lambda$ is the weight of the center loss function.
2. The fine-grained image recognition method based on multi-view feature fusion according to claim 1, characterized in that: in step 3.3, the bilinear feature vectors of the several same-category images are averaged to obtain the fused feature vector $V'(j)$, expressed as follows:

$$V'(j) = \frac{1}{T + 1}\left( V(j) + \sum_{r=1}^{T} V_r(j) \right)$$

wherein $j$ is the position in the feature vector, $V(j)$ is the value of the target image's feature vector at the $j$-th position, $T$ is the number of selected positive samples, and $V_r(j)$ is the value of the $r$-th positive sample's feature vector at the $j$-th position.
CN202010992253.7A 2020-09-21 2020-09-21 Fine granularity image recognition method based on multi-view feature fusion Active CN112132004B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010992253.7A CN112132004B (en) 2020-09-21 2020-09-21 Fine granularity image recognition method based on multi-view feature fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010992253.7A CN112132004B (en) 2020-09-21 2020-09-21 Fine granularity image recognition method based on multi-view feature fusion

Publications (2)

Publication Number Publication Date
CN112132004A CN112132004A (en) 2020-12-25
CN112132004B (en) 2024-06-25

Family

ID=73841694

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010992253.7A Active CN112132004B (en) 2020-09-21 2020-09-21 Fine granularity image recognition method based on multi-view feature fusion

Country Status (1)

Country Link
CN (1) CN112132004B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112733912B (en) * 2020-12-31 2023-06-09 华侨大学 Fine granularity image recognition method based on multi-granularity countering loss
CN112712066B (en) * 2021-01-19 2023-02-28 腾讯科技(深圳)有限公司 Image recognition method and device, computer equipment and storage medium
CN112766378B (en) * 2021-01-19 2023-07-21 北京工商大学 Cross-domain small sample image classification model method focusing on fine granularity recognition
CN112800927B (en) * 2021-01-25 2024-03-29 北京工业大学 Butterfly image fine-granularity identification method based on AM-Softmax loss
CN112990270B (en) * 2021-02-10 2023-04-07 华东师范大学 Automatic fusion method of traditional feature and depth feature
CN113255793B (en) * 2021-06-01 2021-11-30 之江实验室 Fine-grained ship identification method based on contrast learning
CN113449613B (en) * 2021-06-15 2024-02-27 北京华创智芯科技有限公司 Multi-task long tail distribution image recognition method, system, electronic equipment and medium
CN113642571B (en) * 2021-07-12 2023-10-10 中国海洋大学 Fine granularity image recognition method based on salient attention mechanism
CN113705489B (en) * 2021-08-31 2024-06-07 中国电子科技集团公司第二十八研究所 Remote sensing image fine-granularity airplane identification method based on priori regional knowledge guidance
CN115424086A (en) * 2022-07-26 2022-12-02 北京邮电大学 Multi-view fine-granularity identification method and device, electronic equipment and medium
CN117725483A (en) * 2023-09-26 2024-03-19 电子科技大学 Supervised signal classification method based on neural network

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109685115A (en) * 2018-11-30 2019-04-26 西北大学 A kind of the fine granularity conceptual model and learning method of bilinearity Fusion Features

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110135502B (en) * 2019-05-17 2023-04-18 东南大学 Image fine-grained identification method based on reinforcement learning strategy
CN110210550A (en) * 2019-05-28 2019-09-06 东南大学 Image fine granularity recognition methods based on integrated study strategy
CN110222636B (en) * 2019-05-31 2023-04-07 中国民航大学 Pedestrian attribute identification method based on background suppression
CN110807465B (en) * 2019-11-05 2020-06-30 北京邮电大学 Fine-grained image identification method based on channel loss function
CN111523534B (en) * 2020-03-31 2022-04-05 华东师范大学 Image description method

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109685115A (en) * 2018-11-30 2019-04-26 西北大学 A kind of the fine granularity conceptual model and learning method of bilinearity Fusion Features

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
基于多视角融合的细粒度图像分类方法 (Fine-grained image classification method based on multi-view fusion); 黄伟锋, 张甜, 常东良, 闫冬, 王嘉希, 王丹, 马占宇; 信号处理 (Journal of Signal Processing), (09); full text *

Also Published As

Publication number Publication date
CN112132004A (en) 2020-12-25

Similar Documents

Publication Publication Date Title
CN112132004B (en) Fine granularity image recognition method based on multi-view feature fusion
Zhao et al. Discriminative feature learning for unsupervised change detection in heterogeneous images based on a coupled neural network
CN110443143B (en) Multi-branch convolutional neural network fused remote sensing image scene classification method
CN111881714B (en) Unsupervised cross-domain pedestrian re-identification method
Xie et al. Multilevel cloud detection in remote sensing images based on deep learning
Awad et al. Multicomponent image segmentation using a genetic algorithm and artificial neural network
CN107016357B (en) Video pedestrian detection method based on time domain convolutional neural network
Chuang et al. A feature learning and object recognition framework for underwater fish images
Dornaika et al. Building detection from orthophotos using a machine learning approach: An empirical study on image segmentation and descriptors
Ghamisi et al. Multilevel image segmentation based on fractional-order Darwinian particle swarm optimization
Firpi et al. Swarmed feature selection
CN111832514B (en) Unsupervised pedestrian re-identification method and unsupervised pedestrian re-identification device based on soft multiple labels
CN108596211B (en) Shielded pedestrian re-identification method based on centralized learning and deep network learning
CN111639564B (en) Video pedestrian re-identification method based on multi-attention heterogeneous network
CN104346620A (en) Inputted image pixel classification method and device, and image processing system
JP6867054B2 (en) A learning method and a learning device for improving segmentation performance used for detecting a road user event by utilizing a double embedding configuration in a multi-camera system, and a testing method and a testing device using the learning method and a learning device. {LEARNING METHOD AND LEARNING DEVICE FOR IMPROVING SEGMENTATION PERFORMANCE TO BE USED FOR DETECTING ROAD USER EVENTS USING DOUBLE EMBEDDING CONFIGURATION IN MULTI-CAMERA SYSTEM AND TESTING METHOD AND TESTING DEVICE USING THE SAME}
CN105761238B (en) A method of passing through gray-scale statistical data depth information extraction well-marked target
CN111709313B (en) Pedestrian re-identification method based on local and channel combination characteristics
CN110705591A (en) Heterogeneous transfer learning method based on optimal subspace learning
CN111161307A (en) Image segmentation method and device, electronic equipment and storage medium
Geetha et al. Detection and estimation of the extent of flood from crowd sourced images
CN110427835B (en) Electromagnetic signal identification method and device for graph convolution network and transfer learning
CN117475236B (en) Data processing system and method for mineral resource exploration
CN111274964B (en) Detection method for analyzing water surface pollutants based on visual saliency of unmanned aerial vehicle
Alsanad et al. Real-time fuel truck detection algorithm based on deep convolutional neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant