CN117541882B - Instance-based multi-view vision fusion transduction type zero sample classification method
- Publication number
- CN117541882B
- Authority
- CN
- China
- Prior art keywords
- pictures
- unseen
- view
- semantic
- matrix
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Multimedia (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Software Systems (AREA)
- Databases & Information Systems (AREA)
- Computing Systems (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Artificial Intelligence (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses an instance-based multi-view visual fusion transductive zero-shot classification method comprising the following steps: extracting multi-view visual features of seen-class images; feeding the multi-view visual features and the semantic attributes of the seen-class images into a multi-view visual-semantic mapping model, and learning the transformation matrices for the different views with the alternating direction method of multipliers (ADMM); predicting the semantic projections of unseen-class images with the learned transformation matrices; and extracting the final semantic representations of the unseen-class images from the semantic projections and identifying the unseen-class images based on those representations. The invention uses a single linear constraint to share visual information across views, which simplifies conventional multi-view information fusion models. In addition, to further mine the visual-semantic associations hidden in the unseen classes, a self-supervised learning strategy is provided that calibrates the semantics of the unseen-class images by exploiting the consistency among views, which can greatly improve zero-shot classification performance.
Description
Technical Field
The invention relates to the technical field of image recognition, and in particular to an instance-based multi-view visual fusion transductive zero-shot classification method.
Background
In recent years, zero-shot learning (ZSL) has received increasing attention. Unlike conventional pattern recognition, ZSL can recognize samples whose class labels were not used in training. It exploits the inherent association of semantic attributes across categories, constructing a mapping between visual features and semantic attributes in order to classify samples of unseen categories. Most current ZSL methods use only a single visual feature representation; in many practical scenarios, however, visual feature representations from multiple views are available through different channels. For high-resolution images, for example, different feature extractors (SIFT, SURF, PHOG, pre-trained deep networks, etc.) can be used to obtain features. Because the views differ from one another, instance-based multi-view visual data can provide a more comprehensive description than single-view data and, if used properly, is expected to greatly improve ZSL performance.
Disclosure of Invention
Purpose of the invention: the invention aims to provide an instance-based multi-view visual fusion transductive zero-shot classification method that improves the generalization performance of a zero-shot classifier and thereby identifies unseen-class images more accurately.
Technical scheme: the instance-based multi-view visual fusion transductive zero-shot classification method of the invention comprises the following steps:
(1) Extracting multi-view visual features of the seen-class images and the unseen-class images;
(2) Feeding the multi-view visual features of the seen-class images and the corresponding class semantic attributes into a multi-view visual-semantic mapping model, and learning the transformation matrices for the different views with the alternating direction method of multipliers (ADMM);
(3) Predicting the semantic projections of the unseen-class images with the learned transformation matrices;
(4) Extracting the final semantics of the unseen-class images from the semantic projections obtained in step (3), and identifying the unseen-class images.
Further, step (1) is specifically as follows: visual features are extracted with ResNet and GoogLeNet pre-trained on the ImageNet database, representing view A and view B, respectively.
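As an illustration of the two-view feature layout described in step (1), the following minimal sketch stacks one feature vector per picture column-wise, one matrix per view. The helper `stack_view_features` and the two `fake_view_*` extractors are hypothetical stand-ins introduced here (they are not real ResNet or GoogLeNet calls); the 2048 and 1024 dimensions are the ones stated in the embodiment.

```python
import numpy as np


def stack_view_features(images, extractor, dim):
    """Stack one feature vector per image column-wise into a (dim, n) matrix."""
    columns = []
    for image in images:
        feature = np.asarray(extractor(image), dtype=np.float64).ravel()
        if feature.shape[0] != dim:
            raise ValueError(f"expected {dim} values, got {feature.shape[0]}")
        columns.append(feature)
    return np.stack(columns, axis=1)


# Hypothetical stand-ins for the pre-trained networks (NOT real ResNet/GoogLeNet).
def fake_view_a(image):
    return np.full(2048, image.mean())


def fake_view_b(image):
    return np.full(1024, image.std())


rng = np.random.default_rng(0)
images = [rng.random((8, 8, 3)) for _ in range(4)]    # four toy "pictures"
X_a = stack_view_features(images, fake_view_a, 2048)  # view A, shape (2048, 4)
X_b = stack_view_features(images, fake_view_b, 1024)  # view B, shape (1024, 4)
print(X_a.shape, X_b.shape)
```

In practice the stand-in extractors would be replaced by the penultimate-layer outputs of the two pre-trained networks; only the column-per-picture layout is taken from the text.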
Further, the multi-view visual-semantic mapping model in step (2) is expressed as the following optimization problem:
[…];
subject to the constraints:
[…];
where […], […], […], […], and […] are the optimization variable matrices; […] denotes the feature matrix of the seen-class images at the v-th view, in which each column corresponds to one seen-class image; […] denotes the class semantic attribute matrix of the seen-class images, in which each column corresponds to one seen-class image; […] denotes the mean matrix of the seen-class semantic attributes, each column of which is the mean vector of all seen-class semantic attributes; […] is the dimension of the features at the v-th view; m is the dimension of the class semantic attributes; n is the number of seen-class images; […], […], […], […], and […] are hyperparameters; and V is the number of views.
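The matrix layout defined above (one column per seen-class picture, an m-dimensional attribute vector per class) can be sketched as follows. The function name `build_semantic_matrices` and the symbols `Y` and `Y_mean` are hypothetical labels introduced for this sketch, and the mean is assumed to be taken over the seen classes; the source does not state whether it is taken over classes or over pictures.

```python
import numpy as np


def build_semantic_matrices(class_attributes, labels):
    """Build the per-picture attribute matrix and the mean-attribute matrix.

    class_attributes: (m, c_seen) array, one column per seen class.
    labels: length-n sequence of seen-class indices, one per picture.
    """
    class_attributes = np.asarray(class_attributes, dtype=np.float64)
    labels = np.asarray(labels)
    Y = class_attributes[:, labels]                          # (m, n)
    # Assumption: the mean vector is averaged over the seen classes.
    mean_vec = class_attributes.mean(axis=1, keepdims=True)  # (m, 1)
    Y_mean = np.repeat(mean_vec, labels.shape[0], axis=1)    # (m, n)
    return Y, Y_mean


Y, Y_mean = build_semantic_matrices([[1, 0, 1], [0, 1, 1]], [0, 2, 2, 1])
print(Y.shape, Y_mean.shape)
```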
Further, the alternating direction method of multipliers in step (2) is specifically as follows:
Initialize:
[…];
Set the iteration counter […], and determine the convergence thresholds […], […] and the related parameters […], […], […];
Update […] by solving the following equation, where […] is a parameter of the alternating direction method of multipliers:
[…];
Update […] by solving the following optimization problem:
[…];
Update […] by solving the following equation:
[…];
Update […] by:
[…];
Update […] by:
[…];
Update the Lagrange multipliers […], […], […], and […] by the following formulas:
[…];
[…];
[…];
[…];
If
[…];
then the iteration has converged; otherwise, let […] and continue the update operations. The final transformation matrix obtained through the iteration is: […].
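The general shape of such an iteration (initialize, update the primal variables in turn, update the multipliers, test two convergence thresholds) can be illustrated on a much simpler problem. The sketch below is not the update scheme of this method, whose formulas are not reproduced here; it applies scaled-form ADMM to non-negative ridge regression, and the names `admm_nonneg_ridge`, `lam`, `rho`, `eps_pri`, and `eps_dual` are assumptions of the sketch.

```python
import numpy as np


def admm_nonneg_ridge(A, B, lam=0.1, rho=1.0, eps_pri=1e-6, eps_dual=1e-6,
                      max_iter=500):
    """Scaled-form ADMM for
        min_W 0.5*||A W - B||_F^2 + 0.5*lam*||W||_F^2  s.t.  W = Z, Z >= 0.
    Returns the non-negative iterate Z and the number of iterations used.
    """
    d, k = A.shape[1], B.shape[1]
    W = np.zeros((d, k))
    Z = np.zeros((d, k))
    U = np.zeros((d, k))                      # scaled Lagrange multiplier
    gram = A.T @ A + (lam + rho) * np.eye(d)  # fixed across iterations
    AtB = A.T @ B
    n_iter = 0
    for n_iter in range(1, max_iter + 1):
        W = np.linalg.solve(gram, AtB + rho * (Z - U))  # linear-equation update
        Z_old = Z
        Z = np.maximum(0.0, W + U)                      # projection onto Z >= 0
        U = U + W - Z                                   # multiplier update
        primal_res = np.linalg.norm(W - Z)
        dual_res = rho * np.linalg.norm(Z - Z_old)
        if primal_res < eps_pri and dual_res < eps_dual:
            break
    return Z, n_iter


rng = np.random.default_rng(0)
A = rng.standard_normal((60, 4))
W_true = rng.uniform(0.5, 1.5, size=(4, 3))
B = A @ W_true                                # noise-free toy data
W_hat, n_iter = admm_nonneg_ridge(A, B, lam=0.0)
print(n_iter, np.abs(W_hat - W_true).max())
```

The penalty parameter `rho` mainly affects how fast the iteration converges; the two residual thresholds play the role of the convergence thresholds mentioned in the text.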
Further, the semantic projection of the unseen-class images at a single view obtained in step (3) is:
[…];
where […] denotes the feature matrix of the unseen-class images at the v-th view, in which each column corresponds to one unseen-class image, and […] is the number of unseen-class images.
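A per-view projection of this kind might be sketched as below, assuming the learned transformation is a linear map applied as `W_v.T @ X_v`; that form, and the names `project_unseen`, `W_list`, and `X_unseen_list`, are assumptions of this sketch rather than the formula of the method.

```python
import numpy as np


def project_unseen(W_list, X_unseen_list):
    """Project unseen-class features into the semantic space, view by view.

    W_list[v]: (d_v, m) transformation matrix of view v (assumed linear map).
    X_unseen_list[v]: (d_v, n_u) features, one column per unseen picture.
    Returns a list of (m, n_u) semantic projections.
    """
    projections = []
    for W_v, X_v in zip(W_list, X_unseen_list):
        if W_v.shape[0] != X_v.shape[0]:
            raise ValueError("feature dimension mismatch between W_v and X_v")
        projections.append(W_v.T @ X_v)
    return projections


rng = np.random.default_rng(1)
W_list = [rng.standard_normal((5, 3)), rng.standard_normal((7, 3))]
X_list = [rng.standard_normal((5, 4)), rng.standard_normal((7, 4))]
print([S.shape for S in project_unseen(W_list, X_list)])
```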
Further, the final semantics of the unseen-class images are extracted in step (4) by the following formula:
[…];
where […] is the final semantic representation of the unseen-class images to be extracted, that is, the optimization variable; […]; […] is a diagonal matrix;
[…] is a hyperparameter.
Further, […] is calculated by the following formula:
[…];
where […] is a block matrix, […].
Further, the identification of the unseen-class images in step (4) comprises:
Averaging the final semantic representations of the unseen-class images over the views by the following formula:
[…];
Obtaining the class labels of the unseen-class images by the following formula:
[…];
where […] returns a vector containing the index of the largest element in each column of the input matrix; […] is the semantic attribute matrix of the unseen classes; […] is the number of unseen classes; and […] is the class label of the identified unseen-class images.
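The averaging and column-wise arg-max steps can be illustrated as follows. The text states only that the label is the index of the largest element in each column; the cosine similarity used to build the score matrix, and the names `classify_unseen`, `S_view_a`, `S_view_b`, and `A_unseen`, are assumptions of this sketch.

```python
import numpy as np


def classify_unseen(final_semantics, unseen_attributes):
    """Average per-view semantics and pick the best-matching unseen class.

    final_semantics: list of (m, n_u) arrays, one per view.
    unseen_attributes: (m, c_u) array, one column per unseen class.
    Returns a length-n_u vector of class indices.
    """
    S_avg = np.mean(np.stack(final_semantics, axis=0), axis=0)
    # Assumed similarity: cosine between attribute columns and semantic columns.
    S_unit = S_avg / (np.linalg.norm(S_avg, axis=0, keepdims=True) + 1e-12)
    A_unit = unseen_attributes / (
        np.linalg.norm(unseen_attributes, axis=0, keepdims=True) + 1e-12)
    scores = A_unit.T @ S_unit          # (c_u, n_u)
    return np.argmax(scores, axis=0)    # index of the largest entry per column


S_view_a = np.array([[0.9, 0.1], [0.1, 0.2], [0.0, 0.8]])
S_view_b = np.array([[0.7, 0.0], [0.2, 0.1], [0.1, 0.9]])
A_unseen = np.eye(3)                    # three toy unseen classes
labels = classify_unseen([S_view_a, S_view_b], A_unseen)
print(labels)  # [0 2]
```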
The instance-based multi-view visual fusion transductive zero-shot identification system of the invention comprises:
A data acquisition module for extracting multi-view visual features of the seen-class images and the unseen-class images;
A model learning module for feeding the multi-view visual features of the seen-class images and the corresponding class semantic attributes into a multi-view visual-semantic mapping model, and learning the transformation matrices for the different views with the alternating direction method of multipliers; predicting the semantic projections of the unseen-class images with the learned transformation matrices; and extracting the final semantic representations of the unseen-class images from the semantic projections;
A picture identification module for classifying the extracted final semantic representations of the unseen-class images.
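One way the three modules could be wired together is sketched below. All class and function names are hypothetical; a plain ridge-regression solver (`ridge_fit`) stands in for the ADMM-based learning, and the final-semantics extraction step is omitted, so this shows only the module boundaries and data flow, not the method itself.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

import numpy as np


@dataclass
class DataAcquisitionModule:
    extractors: Sequence[Callable]  # one feature extractor per view

    def extract(self, images) -> List[np.ndarray]:
        # One (d_v, n) matrix per view, one column per picture.
        return [np.stack([np.asarray(f(img), dtype=float) for img in images],
                         axis=1)
                for f in self.extractors]


@dataclass
class ModelLearningModule:
    fit_view: Callable  # (X_v, Y) -> W_v with shape (d_v, m)

    def learn(self, seen_views, Y):
        return [self.fit_view(X_v, Y) for X_v in seen_views]

    def project(self, W_list, unseen_views):
        return [W_v.T @ X_v for W_v, X_v in zip(W_list, unseen_views)]


@dataclass
class PictureIdentificationModule:
    unseen_attributes: np.ndarray  # (m, c_u), one column per unseen class

    def classify(self, semantics):
        S_avg = np.mean(semantics, axis=0)
        return np.argmax(self.unseen_attributes.T @ S_avg, axis=0)


def ridge_fit(X, Y, lam=1e-6):
    # Stand-in solver (plain ridge regression), NOT the ADMM procedure.
    return np.linalg.solve(X @ X.T + lam * np.eye(X.shape[0]), X @ Y.T)


acquire = DataAcquisitionModule([lambda x: x, lambda x: 2.0 * np.asarray(x)])
learner = ModelLearningModule(ridge_fit)
identify = PictureIdentificationModule(np.eye(2))

seen = acquire.extract([[1.0, 0.0], [0.0, 1.0]])    # two toy seen pictures
unseen = acquire.extract([[0.9, 0.1], [0.2, 0.8]])  # two toy unseen pictures
W_list = learner.learn(seen, np.eye(2))
labels = identify.classify(learner.project(W_list, unseen))
print(labels)
```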
A device of the invention comprises a memory, a processor, and a computer program stored in the memory and executable on the processor; when loaded into the processor, the computer program implements any one of the instance-based multi-view visual fusion transductive zero-shot classification methods described above.
Beneficial effects: compared with the prior art, the invention has the following notable advantages. Multi-view visual features carry richer and more comprehensive information about the training samples, which effectively improves the generalization performance of the zero-shot classifier and yields more accurate identification of unseen-class images. Compared with existing zero-shot learning methods, the method substantially improves the classification accuracy on unseen-class images, is simple and efficient, and has good application prospects in related fields such as pattern recognition, data mining, and computer vision.
Drawings
FIG. 1 is a flow chart of the present invention.
Description of the embodiments
The technical scheme of the invention is further described below with reference to the accompanying drawing.
As shown in FIG. 1, an embodiment of the invention provides an instance-based multi-view visual fusion transductive zero-shot classification method comprising the following steps:
(1) Extracting multi-view visual features of the seen-class images and the unseen-class images. Specifically, visual features are extracted with ResNet (fc 9 layer, 2048 dimensions) and GoogLeNet (fc 17 layer, 1024 dimensions) pre-trained on the ImageNet database, representing view A and view B, respectively.
(2) Feeding the multi-view visual features of the seen-class images and the corresponding class semantic attributes into a multi-view visual-semantic mapping model, and learning the transformation matrices for the different views with the alternating direction method of multipliers (ADMM). The multi-view visual-semantic mapping model is expressed as the following optimization problem P1:
[…];
subject to the constraints:
[…];
[…];
[…];
[…];
where […], […], […], […], and […] are the optimization variable matrices; […] denotes the feature matrix of the seen-class images at the v-th view, in which each column corresponds to one seen-class image; […] denotes the class semantic attribute matrix of the seen-class images, in which each column corresponds to one seen-class image; […] denotes the mean matrix of the seen-class semantic attributes, each column of which is the mean vector of all seen-class semantic attributes; […] is the dimension of the features at the v-th view; m is the dimension of the class semantic attributes; n is the number of seen-class images; […], […], […], […], and […] are hyperparameters; and V is the number of views. […] is the loss term; […] is the consistency term, which keeps the predictions of all views consistent on the seen-class samples. Constraint 1.1 is the single linear constraint that realizes the interactive sharing of visual information across views; constraints 1.2-1.4 are used to construct a reconstructable subspace in the mapping; constraint 1.5 is a non-negativity constraint. The input variables of problem P1 are […], […], […];
the variables to be solved are […], […], […], […], […].
The optimization problem P1 is solved with the alternating direction method of multipliers, specifically as follows:
Input the training set data […], […], […]; the hyperparameters […], […], […], […], […]; set the iteration counter […], and determine the convergence thresholds […], […] and the related parameters […], […], […];
Initialize:
[…];
Set the iteration counter […], and determine the convergence thresholds […], […] and the related parameters […], […], […];
Update […] by solving the following equation, where […] is a parameter of the alternating direction method of multipliers:
[…];
Update […] by solving the following optimization problem:
[…];
Update […] by solving the following equation:
[…];
Update […] by:
[…];
Update […] by:
[…];
Update the Lagrange multipliers […], […], […], and […] by the following formulas:
[…];
[…];
[…];
[…];
If
[…];
then the iteration has converged; otherwise, let […] and continue the update operations. The final transformation matrix obtained through the iteration is: […].
(3) Predicting the semantic projections of the unseen-class images with the learned transformation matrices. The semantic projection of the unseen-class images at a single view is:
[…];
where […] denotes the feature matrix of the unseen-class images at the v-th view, in which each column corresponds to one unseen-class image, and […] is the number of unseen-class images.
(4) Extracting the final semantics of the unseen-class images from the semantic projections obtained in step (3), and identifying the unseen-class images. The final semantics of the unseen-class images are extracted by the following formula:
[…];
where […] is the final semantic representation of the unseen-class images to be extracted, that is, the optimization variable;
[…]; […] is a diagonal matrix;
[…] is a hyperparameter.
[…] is calculated by the following formula:
[…];
where […] is a block matrix, […].
The identification of the unseen-class images comprises the following steps:
Averaging the final semantic representations of the unseen-class images over the views by the following formula:
[…];
Obtaining the class labels of the unseen-class images by the following formula:
[…];
where […] returns a vector containing the index of the largest element in each column of the input matrix; […] is the semantic attribute matrix of the unseen classes; […] is the number of unseen classes; and […] is the class label of the identified unseen-class images.
To verify the effect and performance of the proposed method, comparison experiments were carried out on three classic zero-shot classification datasets: AwA, CUB, and SUN. Table 1 lists the recognition accuracies on unseen classes for several existing ZSL methods.
Table 1. Comparison of the identification results of several methods
Compared with the other methods, the proposed instance-based multi-view visual fusion transductive zero-shot classification method makes full use of the feature information of the different views, shows a clear advantage in generalization performance, and achieves higher accuracy in recognizing unseen-class images.
An embodiment of the invention also provides an instance-based multi-view visual fusion transductive zero-shot identification system, comprising:
A data acquisition module for extracting multi-view visual features of the seen-class images and the unseen-class images;
A model learning module for feeding the multi-view visual features of the seen-class images and the corresponding class semantic attributes into a multi-view visual-semantic mapping model, and learning the transformation matrices for the different views with the alternating direction method of multipliers; predicting the semantic projections of the unseen-class images with the learned transformation matrices; and extracting the final semantic representations of the unseen-class images from the semantic projections;
A picture identification module for classifying the extracted final semantic representations of the unseen-class images.
An embodiment of the invention also provides a device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor; when loaded into the processor, the computer program implements any one of the instance-based multi-view visual fusion transductive zero-shot classification methods described above.
Claims (7)
1. An instance-based multi-view visual fusion transductive zero-shot classification method, characterized by comprising the following steps:
(1) Extracting multi-view visual features of the seen-class images and the unseen-class images;
(2) Feeding the multi-view visual features of the seen-class images and the corresponding class semantic attributes into a multi-view visual-semantic mapping model, and learning the transformation matrices for the different views with the alternating direction method of multipliers; the alternating direction method of multipliers is specifically as follows:
Initialize:
[…];
Set the iteration counter […], and determine the convergence thresholds […], […] and the parameters […], […], […];
Update […] by solving the following equation, where […] is a parameter of the alternating direction method of multipliers:
[…];
Update […] by solving the following optimization problem:
[…];
Update […] by solving the following equation:
[…];
Update […] by:
[…];
Update […] by:
[…];
Update the Lagrange multipliers […], […], […], and […] by the following formulas:
[…];
[…];
[…];
[…];
If
[…];
then the iteration has converged; otherwise, let […] and continue the update operations; the final transformation matrix obtained through the iteration is: […];
(3) Predicting the semantic projections of the unseen-class images with the learned transformation matrices;
(4) Extracting the final semantics of the unseen-class images from the semantic projections obtained in step (3), and identifying the unseen-class images.
2. The instance-based multi-view visual fusion transductive zero-shot classification method according to claim 1, wherein step (1) is specifically as follows: visual features are extracted with ResNet and GoogLeNet pre-trained on the ImageNet database, representing view A and view B, respectively.
3. The instance-based multi-view visual fusion transductive zero-shot classification method according to claim 1, wherein the multi-view visual-semantic mapping model in step (2) is expressed as the following optimization problem:
[…];
subject to the constraints:
[…];
where […], […], […], […], and […] are the optimization variable matrices; […] denotes the feature matrix of the seen-class images at the v-th view, in which each column corresponds to one seen-class image; […] denotes the class semantic attribute matrix of the seen-class images, in which each column corresponds to one seen-class image; […] denotes the mean matrix of the seen-class semantic attributes, each column of which is the mean vector of all seen-class semantic attributes; […] is the dimension of the features at the v-th view; m is the dimension of the class semantic attributes; n is the number of seen-class images; […], […], […], […], and […] are hyperparameters; and V is the number of views.
4. The instance-based multi-view visual fusion transductive zero-shot classification method according to claim 1, wherein the semantic projection of the unseen-class images at a single view obtained in step (3) is:
[…];
where […] denotes the feature matrix of the unseen-class images at the v-th view, in which each column corresponds to one unseen-class image, and […] is the number of unseen-class images.
5. The instance-based multi-view visual fusion transductive zero-shot classification method according to claim 1, wherein the final semantics of the unseen-class images are extracted in step (4) by the following formula:
[…];
where […] is the final semantic representation of the unseen-class images to be extracted, that is, the optimization variable; […]; […] is a diagonal matrix;
[…] is a hyperparameter.
6. The instance-based multi-view visual fusion transductive zero-shot classification method according to claim 1, wherein […] is calculated by the following formula:
[…];
where […] is a block matrix, […].
7. The instance-based multi-view visual fusion transductive zero-shot classification method according to claim 1, wherein the identification of the unseen-class images in step (4) comprises:
Averaging the final semantic representations of the unseen-class images over the views by the following formula:
[…];
Obtaining the class labels of the unseen-class images by the following formula:
[…];
where […] returns a vector containing the index of the largest element in each column of the input matrix; […] is the semantic attribute matrix of the unseen classes; […] is the number of unseen classes; and […] is the class label of the identified unseen-class images.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410017127.8A CN117541882B (en) | 2024-01-05 | 2024-01-05 | Instance-based multi-view vision fusion transduction type zero sample classification method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117541882A (en) | 2024-02-09 |
CN117541882B (en) | 2024-04-19 |
Family
ID=89796173
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410017127.8A Active CN117541882B (en) | 2024-01-05 | 2024-01-05 | Instance-based multi-view vision fusion transduction type zero sample classification method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117541882B (en) |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109643384A (en) * | 2016-08-16 | 2019-04-16 | 诺基亚技术有限公司 | Method and apparatus for zero sample learning |
CN110431565A (en) * | 2017-03-06 | 2019-11-08 | 诺基亚技术有限公司 | Zero sample learning method and system of direct-push and/or adaptive maximum boundary |
CN111222471A (en) * | 2020-01-09 | 2020-06-02 | 中国科学技术大学 | Zero sample training and related classification method based on self-supervision domain perception network |
KR20200130759A (en) * | 2019-04-25 | 2020-11-20 | 연세대학교 산학협력단 | Zero Shot Recognition Apparatus for Automatically Generating Undefined Attribute Information in Data Set and Method Thereof |
CN112801105A (en) * | 2021-01-22 | 2021-05-14 | 之江实验室 | Two-stage zero sample image semantic segmentation method |
CN113361646A (en) * | 2021-07-01 | 2021-09-07 | 中国科学技术大学 | Generalized zero sample image identification method and model based on semantic information retention |
CN113902969A (en) * | 2021-10-12 | 2022-01-07 | 西安电子科技大学 | Zero-sample SAR target identification method fusing similarity of CNN and image |
CN115424096A (en) * | 2022-11-08 | 2022-12-02 | 南京信息工程大学 | Multi-view zero-sample image identification method |
KR20230078134A (en) * | 2021-11-26 | 2023-06-02 | 연세대학교 산학협력단 | Device and Method for Zero Shot Semantic Segmentation |
CN116433977A (en) * | 2023-04-18 | 2023-07-14 | 国网智能电网研究院有限公司 | Unknown class image classification method, unknown class image classification device, computer equipment and storage medium |
CN117274726A (en) * | 2023-11-23 | 2023-12-22 | 南京信息工程大学 | Picture classification method and system based on multi-view supplementary tag |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11694042B2 (en) * | 2020-06-16 | 2023-07-04 | Baidu Usa Llc | Cross-lingual unsupervised classification with multi-view transfer learning |
CN114037879A (en) * | 2021-10-22 | 2022-02-11 | 北京工业大学 | Dictionary learning method and device for zero sample recognition |
Non-Patent Citations (3)
Title |
---|
A sharing multi-view feature selection method via Alternating Direction Method of Multipliers; Qiang Lin et al.; Neurocomputing; 2019-03-14; Vol. 333; pp. 124-134 * |
Zero-Shot Learning via Robust Latent Representation and Manifold Regularization; Min Meng et al.; IEEE Transactions on Image Processing; 2019-04-30; Vol. 28, No. 4; pp. 1824-1836 * |
Research on Image Classification Based on Zero-Shot Learning; Wang Xinjie; China Master's Theses Full-text Database, Information Science and Technology; 2020-01-15; I138-2247 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||