CN116258938A - Image retrieval and identification method based on autonomous evolution loss - Google Patents


Info

Publication number
CN116258938A
Authority
CN
China
Prior art keywords
images
loss function
softmax
gradient
classes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211577810.4A
Other languages
Chinese (zh)
Inventor
王鹏
***
张艳宁
吴瑞祺
杨路
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northwestern Polytechnical University
Original Assignee
Northwestern Polytechnical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northwestern Polytechnical University
Priority to CN202211577810.4A
Publication of CN116258938A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/53 Querying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an image retrieval and identification method based on autonomous evolution loss and proposes a new Softmax-like metric loss function, the gradient stop Softmax loss function. It shares parameters with the original Softmax loss function and differs from it in three main respects: a different γ, L2-normalized features, and class weights W_j through which gradient updates are stopped. Because the features used in the gradient stop Softmax loss function are L2-normalized, the distance metric used in the training stage is consistent with the one used in the testing stage; because the parameters are shared with the original Softmax loss function, the network still obtains good class representations (class centers), which solves the problem that training is difficult to converge. In the gradient stop Softmax loss function the class centers receive no gradient updates, but the sample features do, and this setting forces the sample features toward their class centers on the hypersphere. The method thereby mitigates the degraded model learning that results when deep metric learning is applied to image retrieval with only the Softmax loss function being learned.

Description

Image retrieval and identification method based on autonomous evolution loss
Technical Field
The invention belongs to the field of image retrieval, and particularly relates to an image retrieval and identification method based on autonomous evolution loss.
Background
The basic form of an image retrieval task is as follows: given a query image containing a specific instance (e.g., a specific object, scene, or building), find the images containing the same instance in an image database. Deep metric learning is one of the important methods widely used in image retrieval tasks.
In metric learning tasks based on autonomous evolution, deep metric learning (DML) aims to learn a similarity metric by mapping samples into a high-dimensional embedding space in which samples of the same instance lie close together and samples of different instances lie far apart. Typical applications of deep metric learning include image retrieval and person re-identification. Popular deep metric learning methods fall into pairwise-based methods and Softmax-based methods. Pairwise-based methods focus on finding effective sample-weighting strategies for existing pairwise losses (such as the contrastive loss and the triplet loss). They act directly on the distances between pairs of points in the embedding space, which is closely related to the objective of DML. As for Softmax-based methods, some existing work holds that good performance can also be achieved by training the model with the Softmax loss. In contrast to pairwise-based methods, Softmax-based methods can be viewed as approximating each class with a proxy and using all proxies to provide a global context in each training iteration.
Prior studies found that optimizing a Softmax-based loss amounts to an approximate bound optimization of a basic pairwise loss, and that minimizing the Softmax loss is equivalent to maximizing a discriminative view of the mutual information between features and labels. In practice, when a Softmax-based deep metric learning model is trained, the inner product without L2 normalization (i.e., the last fully connected layer) is the most widely used similarity measure, whereas features are typically L2-normalized in the test phase. The distance metric used in training therefore differs from the one used in testing. A simple way to close this gap is to apply L2 normalization directly during training. However, introducing L2 normalization makes it difficult for the network to converge and can cause training to fail.
Prior studies attribute this mainly to the fact that the L2-normalized inner product only takes values in [-1, 1], which prevents the predicted probability of the correct class from approaching 1 even when the samples are well separated. To address this convergence problem, researchers have added a scaling layer after the inner product. The scaling layer has a learnable parameter that scales the inner product output to a range wider than [-1, 1], which allows the Softmax loss to keep decreasing and thus helps the network converge. However, this approach does not guarantee that the network learns the optimal scaling parameter.
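The saturation effect described above can be checked with a short calculation. The sketch below is illustrative only: the function name, the class count of 100, and the chosen scale values are assumptions made for the example, not values taken from the cited studies. It computes the largest correct-class probability a Softmax can output when every logit is a cosine similarity in [-1, 1] multiplied by a scale factor.

```python
import math


def max_softmax_probability(num_classes: int, scale: float = 1.0) -> float:
    """Upper bound on the correct-class probability when every logit is a
    cosine similarity in [-1, 1] multiplied by `scale`.

    The best case is a correct-class cosine of +1 and a cosine of -1 for
    every other class.
    """
    best = math.exp(scale * 1.0)
    others = (num_classes - 1) * math.exp(scale * -1.0)
    return best / (best + others)


if __name__ == "__main__":
    # Hypothetical setting: 100 training classes, three different scales.
    for scale in (1.0, 10.0, 30.0):
        p_max = max_softmax_probability(100, scale)
        print(f"scale={scale:5.1f}  max probability={p_max:.6f}")
```

With 100 classes and no scaling, the bound stays below 0.07, so the Softmax loss cannot approach zero no matter how well the samples are separated; with a large scale the bound is numerically indistinguishable from 1.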
Disclosure of Invention
In view of the shortcomings of the prior art, the invention aims to provide an image retrieval and identification method based on autonomous evolution loss, which addresses the degraded model learning that results when deep metric learning is applied to image retrieval with only the Softmax loss function being learned.
The invention is realized by the following technical scheme:
The image retrieval and identification method based on autonomous evolution loss is characterized by comprising the following steps:
(1) Using a ResNet50 model pre-trained on the large-scale ImageNet classification dataset as the backbone network;
(2) Replacing global average pooling with generalized mean pooling, adding a batch normalization layer without scale and bias terms on top of the backbone network, removing the last BN-ReLU module in the backbone network, and computing the recall rate at test time using the Euclidean distance between L2-normalized features;
(3) Resizing all input images to a resolution of 256 x 256 and cropping them to 224 x 224; at the application stage, no data augmentation is applied to the input data, which is only resized to 256 x 256;
(4) Training the model for 100 epochs, setting the learning rate with a cosine annealing schedule, and setting γ=30 as the default value;
(5) Applying label smoothing to the Softmax loss function, and adding the gradient stop Softmax loss function to training only once the value of the Softmax loss function falls below 3;
The gradient stop Softmax loss function is given by equation (2):
[Equation (2): formula image not available]
where N is the number of samples in each input batch, c is the number of classes in the training set, f_i is the feature of the i-th sample, y_i is the label of the i-th sample, and W_j is the j-th column of the last fully connected layer, corresponding to the j-th class; one operator symbol (an image in the original) denotes L2 normalization and another (also an image) indicates that no gradient update is allowed through W_j; γ is a predefined scalar; Softplus(x) = log(1 + e^x); [further formula image not available].
(6) Fixing the model parameters: the network parameters are no longer updated by stochastic gradient descent, and the network is used only as an image feature extractor;
(7) Using as the output feature F the output of the ResNet-50 feature extraction network that was trained with the gradient stop Softmax loss function and from which the last BN-ReLU module has been removed;
(8) Obtaining the feature F_q of the query sample by model inference, and extracting and storing the features of all images in the image library as a feature sequence {F_1, …, F_m};
(9) Computing the Euclidean distance between the query feature F_q and every image feature in the image library: d_i = ||F_q - F_i||_2, i = 1, 2, 3, …, m;
(10) Obtaining the distance sequence D = [d_1, d_2, …, d_m];
(11) Sorting D by distance and taking the L images closest to the query sample; if any of these images has the same ID as the query sample, the image retrieval is considered successful.
Further, in step (5), the Softmax loss function and the gradient stop Softmax loss function are used in combination, and the total loss is expressed as follows:
L = L_softmax + L_SGSL (3)
The gradient stop Softmax loss function and the original Softmax loss function share parameters.
Further, the image library employed is constructed as follows:
CUB-200-2011 has 200 classes and 11,788 images; the first 100 classes (5,864 images) are used for training, and the remaining classes (5,924 images) are used for testing;
CARS196 has 196 classes and 16,185 images; the first 98 classes (8,054 images) are used for training, and the other 98 classes (8,131 images) are used for testing;
Stanford Online Products has 22,634 classes and 120,053 images; the first 11,318 classes (59,551 images) are used for training, and the other 11,316 classes (60,502 images) are used for testing;
In-shop Clothes has 50 fine-grained categories and 1,000 attributes and contains more than 800,000 images.
The invention provides an image retrieval and identification method based on autonomous evolution loss and proposes a new Softmax-like metric loss function that shares parameters with the original Softmax loss function and differs from it in three main respects: a different γ, L2-normalized features, and class weights W_j through which gradient updates are stopped. Because the features used in the gradient stop Softmax loss function are L2-normalized, the distance metric used in the training stage is consistent with the one used in the testing stage; sharing parameters with the original Softmax loss function also enables the network to obtain good class representations (class centers), which solves the problem that training is difficult to converge.
At the same time, the class centers receive no gradient updates in the gradient stop Softmax loss function, whereas the sample features do, and this setting forces the sample features toward their class centers on the hypersphere. The method thereby mitigates the degraded model learning that results when deep metric learning is applied to image retrieval with only the Softmax loss function being learned.
Drawings
FIG. 1 is a flowchart of an overall image retrieval method;
FIG. 2 is a schematic diagram of an original Softmax loss function and a gradient stop Softmax loss function according to the present invention;
Detailed Description
The invention will now be described in further detail with reference to specific examples, which are intended to illustrate, but not to limit, the invention.
As shown in FIG. 1, the invention provides an image retrieval and identification method based on autonomous evolution loss, which is implemented as follows:
(1) ResNet50 pre-trained on the large-scale ImageNet classification dataset is used as the backbone network;
(2) Generalized mean pooling replaces global average pooling, a batch normalization layer without scale and bias terms is added on top of the backbone network, and the recall rate at test time is computed using the Euclidean distance between L2-normalized features;
(3) All input images are resized to a resolution of 256 x 256 and cropped to 224 x 224, and each batch contains 64 images (16 IDs with 4 images per ID). At the application stage, no data augmentation is applied to the input data, which is only resized to 256 x 256;
(4) The model is trained for 100 epochs, the learning rate is set with a cosine annealing schedule, and γ=30 is set as the default value;
(5) Label smoothing is applied to the Softmax loss function, and the gradient stop Softmax loss function joins training only once the value of the Softmax loss function falls below 3;
(6) The model parameters are fixed: the network parameters are no longer updated by stochastic gradient descent, and the network is used only as an image feature extractor.
(7) During inference, the output feature F of the ResNet-50 feature extraction network that was trained with the gradient stop Softmax loss function and to which the BN-ReLU removal technique has been applied is used.
(8) For a query sample, the feature F_q is obtained by model inference; the features of all images in the image library are extracted and stored as a feature sequence {F_1, …, F_m}.
(9) The Euclidean distance between F_q and every image feature in the image library is computed: d_i = ||F_q - F_i||_2, i = 1, 2, 3, …, m.
(10) The distance sequence D = [d_1, d_2, …, d_m] is obtained.
(11) D is sorted by distance and the L images closest to the query sample are taken; if any of these images has the same ID as the query sample, the image retrieval is considered successful.
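Steps (8) to (11) can be sketched in a few lines of plain Python. This is a minimal illustration rather than the implementation used in the invention: the function names are hypothetical, features are represented as plain lists of floats, and the features are L2-normalized before the distance is computed, following the test-time setting of step (2).

```python
import math


def l2_normalize(vector):
    """Scale a feature vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_gallery(query_feature, gallery_features):
    """Steps (9)-(11): compute D = [d_1, ..., d_m] and return the gallery
    indices sorted from nearest to farthest."""
    q = l2_normalize(query_feature)
    distances = [euclidean_distance(q, l2_normalize(g)) for g in gallery_features]
    return sorted(range(len(distances)), key=lambda i: distances[i])


def retrieval_succeeds(query_id, gallery_ids, ranking, top_l):
    """A query counts as a success if any of its L nearest images shares its ID."""
    return any(gallery_ids[i] == query_id for i in ranking[:top_l])


def recall_at_l(query_features, query_ids, gallery_features, gallery_ids, top_l):
    """Fraction of queries for which retrieval succeeds within the top L."""
    hits = sum(
        retrieval_succeeds(qid, gallery_ids, rank_gallery(q, gallery_features), top_l)
        for q, qid in zip(query_features, query_ids)
    )
    return hits / len(query_features)


if __name__ == "__main__":
    # Toy 2-D features; IDs and values are made up for illustration.
    gallery = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
    gallery_ids = ["a", "a", "b", "c"]
    queries = [[0.8, 0.2], [0.1, 1.0]]
    query_ids = ["a", "c"]
    for top_l in (1, 4):
        print(top_l, recall_at_l(queries, query_ids, gallery, gallery_ids, top_l))
```

Averaging the per-query success flag over all queries gives the recall rate at L that is reported for the test sets.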
As shown in FIG. 2, the innovations of the scheme are mainly the gradient stop Softmax loss function and the BN-ReLU removal technique.
1. Gradient stop Softmax loss function
This section describes the gradient stop Softmax loss function (Stop-Gradient Softmax Loss, SGSL). Many existing methods delete the bias term in the last fully connected layer when training a classification network for metric learning, and the invention follows this setting. For a better understanding of the method of the invention, the original Softmax loss and its variants are briefly reviewed here. The original Softmax loss function without bias term is shown in equation (1):
[Equation (1): formula image not available]
where N is the number of samples in each input batch, c is the number of classes in the training set, f_i is the feature of the i-th sample, and y_i is the label of the i-th sample. W_j is the j-th column of the last fully connected layer, corresponding to the j-th class. In addition, Softplus(x) = log(1 + e^x) and
[formula image not available]
The gradient stop Softmax loss function proposed by the invention is similar to the standard Softmax loss function but differs in three respects:
1) γ in the gradient stop Softmax loss function is not fixed to 1 but set to a larger value; γ can be regarded as a scale parameter controlling the loss, but unlike a conventional scale parameter, γ acts only on the terms in which W_j and f_i belong to different classes;
2) W_j and f_i are both L2-normalized;
3) No gradient update is allowed through W_j.
Therefore, the gradient stop Softmax loss function proposed by the invention is defined by equation (2):
[Equation (2): formula image not available]
where one operator symbol (an image in the original) denotes L2 normalization, another (also an image) indicates that no gradient update is allowed through W_j, and γ is a predefined scalar. The other symbols have the same meaning as in equation (1). In the experiments, the original Softmax loss function and the proposed gradient stop Softmax loss function are combined, and the total loss is expressed as:
L = L_softmax + L_SGSL (3)
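Because the formula images for equations (1) and (2) are not available in this text, the following LaTeX block gives one reading that is consistent with the surrounding definitions. It is a hedged reconstruction, not the published formulas. The first identity is the standard bias-free Softmax (cross-entropy) loss rewritten with Softplus and log-sum-exp. The second is an assumed form of the gradient stop Softmax loss in which γ scales only the different-class terms, as stated in difference 1). The hat notation for L2 normalization and sg(·) for the stop-gradient operator are notational assumptions introduced here.

```latex
% Assumed notation: \hat{x} = x / \lVert x \rVert_2 (L2 normalization),
% \operatorname{sg}(\cdot) = stop-gradient (identity in the forward pass).
\mathrm{Softplus}(x) = \log\bigl(1 + e^{x}\bigr), \qquad
\mathrm{LSE}(x_1,\dots,x_n) = \log \sum_{k=1}^{n} e^{x_k}

% Standard bias-free Softmax loss, rewritten with Softplus and LSE:
L_{\mathrm{softmax}}
  = -\frac{1}{N}\sum_{i=1}^{N}
      \log \frac{e^{W_{y_i}^{\top} f_i}}{\sum_{j=1}^{c} e^{W_j^{\top} f_i}}
  = \frac{1}{N}\sum_{i=1}^{N}
      \mathrm{Softplus}\Bigl(\mathrm{LSE}_{j \neq y_i}\bigl(W_j^{\top} f_i\bigr)
      - W_{y_i}^{\top} f_i\Bigr)

% Assumed form of the gradient stop Softmax loss (gamma on different-class terms only):
L_{\mathrm{SGSL}}
  = \frac{1}{N}\sum_{i=1}^{N}
      \mathrm{Softplus}\Bigl(\tfrac{1}{\gamma}\,
      \mathrm{LSE}_{j \neq y_i}\bigl(\gamma\,\operatorname{sg}(\hat{W}_j)^{\top}\hat{f}_i\bigr)
      - \operatorname{sg}(\hat{W}_{y_i})^{\top}\hat{f}_i\Bigr)
```

Under this reading, setting γ = 1 and dropping the normalization and stop-gradient operators recovers the rewritten Softmax loss, which matches the statement that γ is "not fixed to 1" in the proposed loss.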
FIG. 2 illustrates the original Softmax loss function and the gradient stop Softmax loss function proposed by the invention; the two loss functions share parameters. The working principle of the gradient stop Softmax loss function can be analyzed as follows. Softplus(x) = log(1 + e^x) is a convex, monotonically increasing function that can be regarded as a smooth version of the positive-part function max(0, x). The log-sum-exp function, i.e.,
LSE(x_1, …, x_n) = log(e^{x_1} + … + e^{x_n}),
is a functional form frequently encountered in dynamic discrete choice models and can be viewed as a smooth version of the function that selects the maximum of a set of values. The larger γ is, the smaller the approximation error, so the invention sets γ=30 as the default value in equation (2).
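The claim that a larger γ gives a smaller approximation error can be checked numerically. The sketch below uses hypothetical function names and made-up cosine values; it compares the γ-scaled log-sum-exp with the true maximum and the γ-scaled Softplus with max(0, x). The bounds quoted in the docstrings are standard properties of these functions.

```python
import math


def softplus(x):
    """Numerically stable log(1 + e^x)."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def logsumexp(values):
    """Numerically stable log(sum(e^v))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def smooth_max(values, gamma):
    """(1/gamma) * LSE(gamma * x); exceeds max(x) by at most log(n) / gamma."""
    return logsumexp([gamma * v for v in values]) / gamma


def smooth_relu(x, gamma):
    """(1/gamma) * Softplus(gamma * x); exceeds max(0, x) by at most log(2) / gamma."""
    return softplus(gamma * x) / gamma


if __name__ == "__main__":
    cosines = [0.3, 0.1, -0.5, 0.25]  # illustrative values in [-1, 1]
    for gamma in (1.0, 10.0, 30.0):
        gap_max = smooth_max(cosines, gamma) - max(cosines)
        gap_relu = smooth_relu(0.2, gamma) - 0.2
        print(f"gamma={gamma:5.1f}  max gap={gap_max:.5f}  relu gap={gap_relu:.5f}")
```

Both gaps shrink in proportion to 1/γ, which is the motivation given for choosing a comparatively large default such as γ = 30.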
Based on the above analysis, equation (2) can be approximated as equation (4):
[Equation (4): formula image not available]
where [·]_+ denotes max(·, 0), the stop-gradient operator (an image in the original) indicates that no gradient update is allowed through W_j, and the cosine similarity (formula image in the original) is the normalized inner product of two vectors. It measures the similarity between features independently of their magnitude and is equivalent to the Euclidean distance between L2-normalized features.
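The equivalence between cosine similarity and the Euclidean distance of L2-normalized features follows from the identity ||a/||a|| - b/||b||||^2 = 2 - 2·cos(a, b), so ranking by one is the same as ranking by the other. The sketch below verifies the identity on two example vectors; the function names and the vectors are chosen only for illustration.

```python
import math


def l2_normalize(vector):
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def normalized_squared_distance(a, b):
    """Squared Euclidean distance between the L2-normalized vectors."""
    a_hat, b_hat = l2_normalize(a), l2_normalize(b)
    return sum((x - y) ** 2 for x, y in zip(a_hat, b_hat))


if __name__ == "__main__":
    f = [3.0, 4.0, 0.0]
    w = [1.0, 2.0, 2.0]
    print("cosine similarity      :", cosine_similarity(f, w))
    print("squared distance       :", normalized_squared_distance(f, w))
    print("2 - 2 * cosine         :", 2.0 - 2.0 * cosine_similarity(f, w))
```

Because the squared distance is a decreasing function of the cosine similarity, a feature that is more similar to a class center in the cosine sense is also closer to it on the unit hypersphere.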
From equation (4), it can be seen that the optimization objective of the gradient stop Softmax loss function is to make the cosine similarity between f_i and the center of its own class, W_{y_i}, greater than the maximum cosine similarity between f_i and the centers of all other classes (the corresponding symbol is an image in the original). In other words, the gradient stop Softmax loss function requires the features learned by the network to be closer to the representation of their own class (closeness being measured by the L2-normalized Euclidean distance) and farther from the representations of the other classes, which is closely related to the goal of deep metric learning. In addition, because the gradient stop Softmax loss function does not allow gradient updates through W, applying it in actual model training causes fewer convergence difficulties.
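The objective described in this paragraph can be illustrated with a forward-only computation. The sketch below is an assumption-laden illustration, not the published formula: it averages the hinge term [max over other classes of cos(W_j, f_i) minus cos(W_{y_i}, f_i)]_+ over a batch, with hypothetical function names and toy class centers. The stop-gradient operator does not change forward values, so it appears only as a comment.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def hinge_objective(features, labels, class_centers):
    """Batch mean of [max_{j != y_i} cos(W_j, f_i) - cos(W_{y_i}, f_i)]_+.

    In training, the class centers W_j would be treated as constants
    (stop-gradient), so only the features f_i would receive gradients.
    """
    total = 0.0
    for feature, label in zip(features, labels):
        sims = [cosine_similarity(center, feature) for center in class_centers]
        hardest_other = max(s for j, s in enumerate(sims) if j != label)
        total += max(hardest_other - sims[label], 0.0)
    return total / len(features)


if __name__ == "__main__":
    centers = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]  # toy class centers
    well_placed = [[0.9, 0.1], [0.2, 0.8]]           # nearest to own center
    misplaced = [[0.6, 0.8]]                          # label 0, nearer to center 1
    print(hinge_objective(well_placed, [0, 1], centers))
    print(hinge_objective(misplaced, [0], centers))
```

In an automatic-differentiation framework the constant treatment of the class centers would be expressed explicitly, for example by detaching the weight tensor (in PyTorch, `torch.Tensor.detach()`) before computing the similarities.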
2. BN-ReLU removal technique
In deep metric learning, the feature extraction network ResNet50 without its last fully connected layer is often used as the backbone network. Many methods add a batch normalization (BN) layer without scale and bias terms on top of the backbone network, because it smooths and normalizes the feature distribution and enhances intra-class compactness. However, this gives the last three layers of the network the form BN-ReLU-BN, which increases the learning burden of the model. Adding consecutive BN-ReLU modules brings no new information to the output features and may instead discard information useful for metric learning. Therefore, the invention applies the BN-ReLU removal technique and removes the last BN-ReLU module in the backbone network.
1. Data set selection
For the datasets, the method uses common image retrieval datasets to evaluate its recognition capability on fine-grained image retrieval tasks and to analyze the improvement in model performance brought by the gradient stop Softmax loss function. Specifically, the following datasets are used:
1) CUB-200-2011 has 200 classes and 11,788 images. The first 100 classes (5,864 images) are used for training and the remaining classes (5,924 images) are used for testing.
2) CARS196 has 196 classes and 16,185 images. The first 98 classes (8,054 images) are used for training and the other 98 classes (8,131 images) are used for testing.
3) Stanford Online Products (SOP) has 22,634 classes and 120,053 images. The first 11,318 classes (59,551 images) are used for training and the other 11,316 classes (60,502 images) are used for testing.
4) In-shop Clothes is a large clothing dataset with comprehensive annotations. It has 50 fine-grained categories and 1,000 attributes and contains over 800,000 images annotated with a large number of attributes, clothing landmarks, and correspondences between images taken in different scenarios, including shops, street snapshots, and consumers.
2. Implementation detail setting
The experiments were carried out on NVIDIA GeForce RTX 2080 Ti graphics processors using the PyTorch deep learning framework. The number of graphics processors used for training depends on the size of the dataset: 4 for the Stanford Online Products dataset and 2 for the other datasets. All experiments use ResNet50 pre-trained on the large-scale ImageNet classification dataset as the backbone network, with generalized mean pooling in place of global average pooling. As in most current deep metric learning methods, a batch normalization layer without scale and bias terms is added on top of the backbone network, and the recall rate at test time is computed using the Euclidean distance between L2-normalized features.
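Generalized mean (GeM) pooling, which replaces global average pooling here, can be sketched for a single channel as follows. The text does not state the exponent used, so p = 3 is purely an assumed example value, and the function name and the clamping constant are likewise illustrative. The sketch uses the commonly cited definition, the p-th root of the mean of the p-th powers of non-negative activations.

```python
def gem_pool(activations, p=3.0, eps=1e-6):
    """Generalized mean pooling over one channel's spatial activations.

    p = 1 gives average pooling; large p approaches max pooling.
    Activations are clamped to at least `eps` so that fractional powers
    are well defined (they are assumed to be non-negative, e.g. post-ReLU).
    """
    clamped = [max(a, eps) for a in activations]
    mean_power = sum(a ** p for a in clamped) / len(clamped)
    return mean_power ** (1.0 / p)


if __name__ == "__main__":
    channel = [1.0, 2.0, 3.0, 4.0]  # flattened H x W activations of one channel
    for p in (1.0, 3.0, 50.0):
        print(f"p={p:5.1f}  pooled={gem_pool(channel, p):.4f}")
```

The pooled value moves from the mean toward the maximum as p grows, so the exponent controls how strongly the descriptor emphasizes the most active spatial locations.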
In terms of parameter settings, all input images are resized to a resolution of 256×256 and cropped to 224×224, and each batch contains 64 images (16 IDs with 4 images per ID). The model is trained for 100 epochs, and the learning rate is set by a cosine annealing schedule. γ=30 is set as the default value. To build a robust model with good generalization ability, label smoothing is applied to the Softmax loss; for training stability, the gradient stop Softmax loss function only joins training once the value of the Softmax loss falls below 3.
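The schedule and the loss gating described above can be sketched as follows. Several details are assumptions because the text does not give them: the base learning rate of 0.01, the minimum learning rate of 0, and the choice to check the threshold at every step rather than switching the second loss on permanently once the threshold is first crossed. The function names are hypothetical.

```python
import math


def cosine_annealing_lr(epoch, total_epochs, base_lr, min_lr=0.0):
    """Cosine annealing: decays from base_lr at epoch 0 to min_lr at total_epochs."""
    cosine = math.cos(math.pi * epoch / total_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + cosine)


def total_loss(softmax_loss, sgsl_loss, threshold=3.0):
    """Add the gradient stop Softmax loss only when the Softmax loss is below
    the threshold (per-step check; an assumed reading of the gating rule)."""
    if softmax_loss < threshold:
        return softmax_loss + sgsl_loss
    return softmax_loss


if __name__ == "__main__":
    base_lr = 0.01  # assumed value, not stated in the text
    for epoch in (0, 25, 50, 75, 100):
        print(epoch, cosine_annealing_lr(epoch, 100, base_lr))
    print(total_loss(3.5, 0.4))  # early training: Softmax loss only
    print(total_loss(2.0, 0.4))  # later: both losses contribute
```

Delaying the second loss in this way lets the shared class centers reach a reasonable state under the Softmax loss before the features are pulled toward them.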
3. Model application
At this stage, no data augmentation is applied to the input data; the input images are only resized to 256×256. The model parameters are fixed: the network parameters are no longer updated by stochastic gradient descent, and the network is used only as an image feature extractor. During inference, the invention uses the output feature F of the ResNet-50 feature extraction network that was trained with the gradient stop Softmax loss function and to which the BN-ReLU removal technique has been applied. For a query sample, the feature F_q is obtained by model inference; the features of all images in the image library are extracted and stored as a feature sequence {F_1, …, F_m}; and the Euclidean distance between F_q and every image feature in the image library is computed:
d_i = ||F_q - F_i||_2, i = 1, 2, 3, …, m
This yields the distance sequence D = [d_1, d_2, …, d_m]. D is then sorted by distance and the L images closest to the query sample are taken; if any of these images has the same ID as the query sample, the image retrieval is considered successful.
In terms of recall, the method not only solves the problem that the model is difficult to converge but also reaches 75.9% on CUB-200-2011, 94.7% on CARS196, and 83.1% on SOP, exceeding conventional methods by at least 1.7%, 2.9%, and 1.7%, respectively.
The foregoing is illustrative of the invention and is not to be construed as limiting it; all modifications, equivalent substitutions, and improvements made within the spirit and principles of the invention fall within its scope of protection.

Claims (3)

1. The image retrieval and identification method based on autonomous evolution loss is characterized by comprising the following steps:
(1) Using a ResNet50 model pre-trained on the large-scale ImageNet classification dataset as the backbone network;
(2) Replacing global average pooling with generalized mean pooling, adding a batch normalization layer without scale and bias terms on top of the backbone network, removing the last BN-ReLU module in the backbone network, and computing the recall rate at test time using the Euclidean distance between L2-normalized features;
(3) Resizing all input images to a resolution of 256 x 256 and cropping them to 224 x 224; at the application stage, no data augmentation is applied to the input data, which is only resized to 256 x 256;
(4) Training the model for 100 epochs, setting the learning rate with a cosine annealing schedule, and setting γ=30 as the default value;
(5) Applying label smoothing to the Softmax loss function, and adding the gradient stop Softmax loss function to training only once the value of the Softmax loss function falls below 3;
The gradient stop Softmax loss function is given by equation (2):
[Equation (2): formula image not available]
where N is the number of samples in each input batch, c is the number of classes in the training set, f_i is the feature of the i-th sample, y_i is the label of the i-th sample, and W_j is the j-th column of the last fully connected layer, corresponding to the j-th class; one operator symbol (an image in the original) denotes L2 normalization and another (also an image) indicates that no gradient update is allowed through W_j; γ is a predefined scalar; Softplus(x) = log(1 + e^x); [further formula image not available].
(6) Fixing the model parameters: the network parameters are no longer updated by stochastic gradient descent, and the network is used only as an image feature extractor;
(7) Using as the output feature F the output of the ResNet-50 feature extraction network that was trained with the gradient stop Softmax loss function and from which the last BN-ReLU module has been removed;
(8) Obtaining the feature F_q of the query sample by model inference, and extracting and storing the features of all images in the image library as a feature sequence {F_1, …, F_m};
(9) Computing the Euclidean distance between the query feature F_q and every image feature in the image library: d_i = ||F_q - F_i||_2, i = 1, 2, 3, …, m;
(10) Obtaining the distance sequence D = [d_1, d_2, …, d_m];
(11) Sorting D by distance and taking the L images closest to the query sample; if any of these images has the same ID as the query sample, the image retrieval is considered successful.
2. The image retrieval and identification method based on autonomous evolution loss according to claim 1, wherein in step (5) the Softmax loss function and the gradient stop Softmax loss function are combined, and the total loss is expressed as follows:
L = L_softmax + L_SGSL (3)
The gradient stop Softmax loss function and the original Softmax loss function share parameters.
3. The image retrieval and identification method based on autonomous evolution loss according to claim 1, wherein the image library used is constructed as follows:
CUB-200-2011 has 200 classes and 11,788 images; the first 100 classes (5,864 images) are used for training, and the remaining classes (5,924 images) are used for testing;
CARS196 has 196 classes and 16,185 images; the first 98 classes (8,054 images) are used for training, and the other 98 classes (8,131 images) are used for testing;
Stanford Online Products has 22,634 classes and 120,053 images; the first 11,318 classes (59,551 images) are used for training, and the other 11,316 classes (60,502 images) are used for testing;
In-shop Clothes has 50 fine-grained categories and 1,000 attributes and contains more than 800,000 images.
CN202211577810.4A 2022-12-09 2022-12-09 Image retrieval and identification method based on autonomous evolution loss Pending CN116258938A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211577810.4A CN116258938A (en) 2022-12-09 2022-12-09 Image retrieval and identification method based on autonomous evolution loss

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211577810.4A CN116258938A (en) 2022-12-09 2022-12-09 Image retrieval and identification method based on autonomous evolution loss

Publications (1)

Publication Number Publication Date
CN116258938A true CN116258938A (en) 2023-06-13

Family

ID=86681633

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211577810.4A Pending CN116258938A (en) 2022-12-09 2022-12-09 Image retrieval and identification method based on autonomous evolution loss

Country Status (1)

Country Link
CN (1) CN116258938A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116955671A (en) * 2023-09-20 2023-10-27 吉林大学 Fine granularity image retrieval method and device
CN116955671B (en) * 2023-09-20 2023-12-01 吉林大学 Fine granularity image retrieval method and device
CN117975445A (en) * 2024-03-29 2024-05-03 江南大学 Food identification method, system, equipment and medium
CN117975445B (en) * 2024-03-29 2024-05-31 江南大学 Food identification method, system, equipment and medium

Similar Documents

Publication Publication Date Title
CN109800648B (en) Face detection and recognition method and device based on face key point correction
WO2021143396A1 (en) Method and apparatus for carrying out classification prediction by using text classification model
CN108288051B (en) Pedestrian re-recognition model training method and device, electronic equipment and storage medium
CN110516095B (en) Semantic migration-based weak supervision deep hash social image retrieval method and system
CN112507901B (en) Unsupervised pedestrian re-identification method based on pseudo tag self-correction
CN109993102B (en) Similar face retrieval method, device and storage medium
CN116258938A (en) Image retrieval and identification method based on autonomous evolution loss
CN112084895B (en) Pedestrian re-identification method based on deep learning
CN112949740B (en) Small sample image classification method based on multilevel measurement
CN112070058A (en) Face and face composite emotional expression recognition method and system
CN112800876A (en) Method and system for embedding hypersphere features for re-identification
WO2022199214A1 (en) Sample expansion method, training method and system, and sample learning system
CN113920472A (en) Unsupervised target re-identification method and system based on attention mechanism
CN114708903A (en) Method for predicting distance between protein residues based on self-attention mechanism
CN112464775A (en) Video target re-identification method based on multi-branch network
Okokpujie et al. Predictive modeling of trait-aging invariant face recognition system using machine learning
CN113392191B (en) Text matching method and device based on multi-dimensional semantic joint learning
CN112256895B (en) Fabric image retrieval method based on multitask learning
CN113535928A (en) Service discovery method and system of long-term and short-term memory network based on attention mechanism
CN111563180A (en) Trademark image retrieval method based on deep hash method
CN116796038A (en) Remote sensing data retrieval method, remote sensing data retrieval device, edge processing equipment and storage medium
CN116071544A (en) Image description prediction method oriented to weak supervision directional visual understanding
CN115100694A (en) Fingerprint quick retrieval method based on self-supervision neural network
CN114973350A (en) Cross-domain facial expression recognition method irrelevant to source domain data
CN110750672B (en) Image retrieval method based on deep measurement learning and structure distribution learning loss

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination