CN115205632B - Semi-supervised multi-view metric learning method in Riemann space - Google Patents

Semi-supervised multi-view metric learning method in Riemann space

Info

Publication number
CN115205632B
Authority
CN
China
Prior art keywords
matrix
view
semi
supervised
learning method
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210847014.1A
Other languages
Chinese (zh)
Other versions
CN115205632A (en)
Inventor
梁建青
梁吉业
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanxi Jinxinan Technology Co ltd
Original Assignee
Shanxi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanxi University
Priority to CN202210847014.1A
Publication of CN115205632A
Application granted
Publication of CN115205632B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/70 Labelling scene content, e.g. deriving syntactic or semantic representations
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a semi-supervised multi-view metric learning method in Riemann space, comprising the following steps: extracting multi-view features of images from a training set and generating sample pairs; constructing multi-view intra-class and inter-class scatter matrices, embedding semantic information into a feature subspace, and thereby transferring and fusing data and knowledge; embedding the data and knowledge from Euclidean space into a Riemannian manifold subspace to complete the feature mapping; and performing multi-view fusion to obtain a unified representation of the features. The invention addresses the heavy dependence of related techniques on strong supervision and on Euclidean space, provides an efficient metric learning method suited to complex application scenarios and weakly supervised labeling environments, and improves performance on weakly supervised heterogeneous data mining and pattern recognition tasks.

Description

Semi-supervised multi-view metric learning method in Riemann space
Technical Field
The invention belongs to the technical field of machine learning, and in particular relates to a semi-supervised multi-view metric learning method in Riemann space.
Background
Distance metrics play a decisive role in the performance of most machine learning methods. In complex and varied application scenarios, conventional metric functions cannot capture the real structure of the data. How to learn flexible distance metrics that are driven by the task and the data is a research hotspot in machine learning. As one of the mainstream techniques in the field, metric learning aims to learn a suitable metric automatically from data, and is widely used in face recognition, information retrieval, network link prediction and similar areas.
In the era of big data, data are high-dimensional, multi-source and heterogeneous, and only very weakly supervised. This makes it difficult to learn a fast and effective distance metric, and poses serious challenges for intelligent information processing in traditional machine learning, pattern recognition and related fields. Heavy dependence on strong supervision and on Euclidean space is a common problem in current metric learning research, and it greatly limits the range of practical applications of existing learning models and algorithms.
Disclosure of Invention
The invention provides a semi-supervised multi-view metric learning method in Riemann space that aims to overcome the heavy dependence on strong supervision and on Euclidean space. The invention can accurately describe the manifold distribution of data in weakly supervised labeling environments and non-Euclidean spaces, and improves the performance of metric learning on weakly supervised heterogeneous data.
The technical scheme of the invention is as follows: a semi-supervised multi-view metric learning method in Riemann space, comprising the following specific steps:
step 101: extracting multi-view features of the images from the training set and generating sample pairs;
step 102: constructing multi-view intra-class and inter-class scatter matrices, embedding semantic information into a feature subspace, and thereby transferring and fusing data and knowledge;
step 103: embedding the data and knowledge from Euclidean space into a Riemannian manifold subspace to complete the feature mapping;
step 104: performing multi-view fusion to obtain a unified representation of the features.
Optionally, step 101, extracting multi-view features of the images from the training set and forming sample pairs, further comprises:
feeding the training set into the HOG local feature descriptor, the SIFT feature descriptor and a deep convolutional neural network; after the bag-of-words model and the final fully connected layer of the feature extraction network, respectively, a 500-dimensional bag-of-words representation and a 1024-dimensional deep feature of each image are obtained; and the similar sample pair set S, the dissimilar sample pair set D and the unlabeled sample set U are obtained according to the sample labels.
Optionally, the loss function is: [equation not reproduced]
where $\mathcal{L}$ is the total metric learning loss function, $\mathcal{L}_{dis}$ is the discrimination loss, $\lambda_1$ and $\lambda_2$ are parameters controlling the balance between the objectives, $\mathcal{L}_{reg1}$ is the semi-supervised graph regularization loss, $\mathcal{L}_{reg2}$ is the metric regularization loss, $w_v$ is the weight of view $v$, $A^{(v)}$ is the metric matrix of view $v$, $S^{(v)}$ is the intra-class scatter matrix of view $v$, $D^{(v)}$ is the inter-class scatter matrix of view $v$, $X^{(v)}$ is the feature matrix of view $v$, $L$ is the Laplacian matrix, $D_{sld}(A^{(v)}, A_0)$ is the symmetrized LogDet divergence, and $A_0$ is a prior symmetric positive definite matrix.
Optionally, the discrimination loss $\mathcal{L}_{dis}$ yields a strongly discriminative distance metric under the metric matrix constructed for each view.
Optionally, the Laplacian matrix is $L = D - W$, where $D$ is a diagonal matrix and the adjacency matrix $W$ is defined as follows: [equation not reproduced]
Optionally, the semi-supervised graph regularization loss $\mathcal{L}_{reg1}$ and the Laplacian matrix $L$ are constructed under the manifold assumption that samples located within a local region of the low-dimensional manifold have similar classes.
Optionally, the metric regularization loss $\mathcal{L}_{reg2}$ guarantees that $A^{(v)}$ has a solution when the matrix $S^{(v)}$ is near-singular or non-invertible.
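Optionally, for the discrimination loss term $\mathcal{L}_{dis}$ of the objective function, the solution of the metric matrix $A^{(v)}$ is generalized by the following objective function: [equation not reproduced]
where $\delta_R$ is the Riemannian distance between symmetric positive definite (SPD) matrices,
$\delta_R(X, Y) := \|\log(Y^{-1/2} X Y^{-1/2})\|_F$ for $X, Y \succ 0$.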
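The Riemannian distance defined above can be evaluated numerically from eigenvalues alone. The following is a minimal sketch, assuming NumPy and SPD inputs; the helper names `spd_power` and `riemannian_distance` are illustrative and do not come from the patent.

```python
import numpy as np

def spd_power(M, p):
    """M^p for a symmetric positive definite M via its eigendecomposition."""
    vals, vecs = np.linalg.eigh(M)
    return (vecs * vals ** p) @ vecs.T

def riemannian_distance(X, Y):
    """delta_R(X, Y) = || log(Y^{-1/2} X Y^{-1/2}) ||_F for SPD X and Y."""
    Y_inv_sqrt = spd_power(Y, -0.5)
    middle = Y_inv_sqrt @ X @ Y_inv_sqrt
    middle = (middle + middle.T) / 2  # remove round-off asymmetry
    eigvals = np.linalg.eigvalsh(middle)
    # The Frobenius norm of a symmetric matrix logarithm equals the
    # Euclidean norm of the logarithms of the eigenvalues.
    return float(np.sqrt(np.sum(np.log(eigvals) ** 2)))

I2 = np.eye(2)
dist_self = riemannian_distance(I2, I2)            # distance of a matrix to itself
dist_scaled = riemannian_distance(I2, np.e * I2)   # log(e^{-1} I) = -I
```

Working with eigenvalues avoids forming the matrix logarithm explicitly, which is sufficient here because only its Frobenius norm is needed.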
Optionally, the metric matrix $A$ of each view is obtained first, and $w$ is then solved.
The metric learning method addresses the heavy dependence of related techniques on strong supervision and on Euclidean space, provides an efficient metric learning method suited to complex application scenarios and weakly supervised labeling environments, and improves the performance of metric learning on weakly supervised heterogeneous data.
Drawings
FIG. 1 is a flow chart of a semi-supervised multi-view metric learning method in Riemann space according to an embodiment of the present invention;
FIG. 2 shows the specific technical scheme of an embodiment of the present invention.
Detailed Description
To enable those skilled in the art to better understand the present invention, the technical solutions of the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of the present invention.
Assume that there are N samples from m views [notation not reproduced]. For each view, the invention obtains a strongly discriminative distance metric under the metric matrix constructed for that view. To make effective use of the large number of unlabeled samples, the invention constructs a Laplacian matrix and a semi-supervised graph regularization loss, based on the manifold assumption, to guide the data distribution. Because the intra-class scatter matrix may be near-singular or non-invertible, the invention uses the symmetrized LogDet divergence to construct a metric regularization loss, ensuring that the metric matrix of each view has a solution. Finally, the invention generalizes the solution of each metric matrix from Euclidean space to Riemann space, so that the learned distance metric better meets the requirements of complex real-world application scenarios. In the solving process, the invention first obtains the metric matrix of each view and then computes the view weights.
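The text does not spell out how a metric matrix turns into a distance. A common reading, assumed here, is the squared Mahalanobis-type distance $(x - y)^T A (x - y)$ with a symmetric positive definite $A$. The sketch below illustrates that assumption with NumPy; the function name and the numbers are hypothetical.

```python
import numpy as np

def metric_distance(x, y, A):
    """Squared distance (x - y)^T A (x - y) under a symmetric positive definite A."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(diff @ A @ diff)

# Hypothetical 2-D example: A stretches the first axis and shrinks the second.
A = np.array([[4.0, 0.0],
              [0.0, 0.25]])
x = np.array([1.0, 1.0])
y = np.array([0.0, 0.0])

d_euclid = metric_distance(x, y, np.eye(2))  # identity A gives the squared Euclidean distance
d_learned = metric_distance(x, y, A)         # the same pair measured under A
```

Learning $A^{(v)}$ per view then amounts to choosing, for each feature type, which directions of the feature space count more when comparing two samples.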
The steps of the invention are specifically described below with reference to fig. 1 and 2:
step 101: multi-view features of the image are extracted from the training set and pairs of samples are generated.
The training set is fed into the HOG local feature descriptor, the SIFT feature descriptor and a deep convolutional neural network. After the bag-of-words model and the final fully connected layer of the feature extraction network, respectively, a 500-dimensional bag-of-words representation and a 1024-dimensional deep feature of each image are obtained. The similar sample pair set S, the dissimilar sample pair set D and the unlabeled sample set U are then obtained according to the sample labels.
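The construction of S, D and U from labels can be sketched as follows. This is a minimal illustration, assuming that unlabeled samples are marked with `None` and that every pair of labeled samples is used; the patent does not state how pairs are sampled, and `build_pairs` is a hypothetical helper name.

```python
from itertools import combinations

def build_pairs(labels):
    """Split sample indices into similar pairs S, dissimilar pairs D and unlabeled set U.

    labels[i] is a class label, or None when sample i is unlabeled.
    """
    labeled = [i for i, y in enumerate(labels) if y is not None]
    U = [i for i, y in enumerate(labels) if y is None]
    S, D = [], []
    for i, j in combinations(labeled, 2):
        # Same label -> similar pair, different label -> dissimilar pair.
        (S if labels[i] == labels[j] else D).append((i, j))
    return S, D, U

labels = ["cat", "dog", None, "cat", None, "dog"]
S, D, U = build_pairs(labels)
```

With many labeled samples the number of pairs grows quadratically, so in practice a random subset of pairs would usually be drawn instead of the full enumeration.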
Step 102: constructing multi-view intra-class and inter-class scatter matrices, embedding semantic information into a feature subspace, and thereby transferring and fusing data and knowledge.
Following the large-margin idea, the discrimination loss $\mathcal{L}_{dis}$ is used to obtain a strongly discriminative distance metric under the metric matrix constructed for each view.
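The exact form of the discrimination loss is not reproduced in the text. The sketch below therefore assumes the form used in geometric mean metric learning, $\operatorname{tr}(AS) + \operatorname{tr}(A^{-1}D)$, with scatter matrices built as sums of outer products over similar and dissimilar pairs. Both the loss form and the names `pair_scatter` and `discrimination_loss` are assumptions for illustration, shown for a single view.

```python
import numpy as np

def pair_scatter(X, pairs):
    """Sum of outer products (x_i - x_j)(x_i - x_j)^T over the given index pairs."""
    d = X.shape[1]
    M = np.zeros((d, d))
    for i, j in pairs:
        diff = X[i] - X[j]
        M += np.outer(diff, diff)
    return M

def discrimination_loss(A, S_mat, D_mat):
    """Assumed loss tr(A S) + tr(A^{-1} D): small for close similar pairs and distant dissimilar pairs."""
    return float(np.trace(A @ S_mat) + np.trace(np.linalg.solve(A, D_mat)))

# Hypothetical single-view data: rows are samples, columns are features.
X = np.array([[0.0, 0.0],
              [0.1, 0.0],
              [3.0, 1.0],
              [3.0, 1.2]])
similar = [(0, 1), (2, 3)]
dissimilar = [(0, 2), (0, 3), (1, 2), (1, 3)]

S_mat = pair_scatter(X, similar)      # intra-class scatter
D_mat = pair_scatter(X, dissimilar)   # inter-class scatter
loss_identity = discrimination_loss(np.eye(2), S_mat, D_mat)
```

The first trace equals the summed squared distance of similar pairs under $A$; the second penalizes dissimilar pairs through $A^{-1}$, which is what lets the problem be solved in closed form later.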
Based on the manifold assumption that samples located within a local region of the low-dimensional manifold have similar classes, a semi-supervised graph regularization loss $\mathcal{L}_{reg1}$ and a Laplacian matrix $L = D - W$ are constructed, where $D$ is a diagonal matrix and the adjacency matrix $W$ is defined as follows: [equation not reproduced]
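Since the definition of the adjacency matrix is not reproduced in the text, the sketch below assumes a symmetric k-nearest-neighbour graph with 0/1 weights and the usual degree matrix with entries $D_{ii} = \sum_j W_{ij}$. The choice of graph, the value of k and the function names are illustrative assumptions only.

```python
import numpy as np

def knn_adjacency(X, k):
    """Symmetric 0/1 adjacency: W[i, j] = 1 if j is among the k nearest neighbours of i, or vice versa."""
    n = X.shape[0]
    sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(sq_dist, np.inf)      # a sample is not its own neighbour
    nearest = np.argsort(sq_dist, axis=1)[:, :k]
    W = np.zeros((n, n))
    for i in range(n):
        W[i, nearest[i]] = 1.0
    return np.maximum(W, W.T)              # symmetrise

def graph_laplacian(W):
    """L = D - W, with D the diagonal degree matrix, D[i, i] = sum_j W[i, j]."""
    return np.diag(W.sum(axis=1)) - W

# Hypothetical data: two tight clusters of three points each.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
W = knn_adjacency(X, k=2)
L = graph_laplacian(W)
```

The graph uses labeled and unlabeled samples alike, which is how the unlabeled set U can influence the learned metric through the regularization term.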
Because the intra-class scatter matrix may be near-singular or non-invertible, the symmetrized LogDet divergence is used to construct a metric regularization loss $\mathcal{L}_{reg2}$, ensuring that the metric matrix $A^{(v)}$ of each view has a solution. Its specific form is: [equation not reproduced]
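For reference, the symmetrized LogDet divergence between two $d \times d$ SPD matrices is commonly written $D_{sld}(A, A_0) = \operatorname{tr}(A A_0^{-1}) + \operatorname{tr}(A^{-1} A_0) - 2d$. The sketch below evaluates that standard definition with NumPy; how the patent combines it across views is not shown in the text, so only the divergence itself is illustrated.

```python
import numpy as np

def sym_logdet_divergence(A, A0):
    """D_sld(A, A0) = tr(A A0^{-1}) + tr(A^{-1} A0) - 2 d for SPD matrices of size d."""
    d = A.shape[0]
    term1 = np.trace(np.linalg.solve(A0, A))   # tr(A0^{-1} A) = tr(A A0^{-1})
    term2 = np.trace(np.linalg.solve(A, A0))   # tr(A^{-1} A0)
    return float(term1 + term2 - 2 * d)

A0 = np.eye(2)                 # hypothetical prior: the identity
A = np.diag([2.0, 0.5])
d_same = sym_logdet_divergence(A0, A0)
d_diff = sym_logdet_divergence(A, A0)
```

The divergence is zero only when the two matrices coincide and grows when $A$ approaches singularity, which is why it keeps the solution well defined.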
The total loss function is defined as follows: [equation not reproduced]
where $\mathcal{L}$ is the total metric learning loss function, $\mathcal{L}_{dis}$ is the discrimination loss, $\lambda_1$ and $\lambda_2$ are parameters controlling the balance between the objectives, $\mathcal{L}_{reg1}$ is the semi-supervised graph regularization loss, $\mathcal{L}_{reg2}$ is the metric regularization loss, $w_v$ is the weight of view $v$, $A^{(v)}$ is the metric matrix of view $v$, $S^{(v)}$ is the intra-class scatter matrix of view $v$, $D^{(v)}$ is the inter-class scatter matrix of view $v$, $X^{(v)}$ is the feature matrix of view $v$, $L$ is the Laplacian matrix, $D_{sld}(A^{(v)}, A_0)$ is the symmetrized LogDet divergence, and $A_0$ is a prior symmetric positive definite matrix.
Step 103: embedding the data and knowledge from Euclidean space into the Riemannian manifold subspace to complete the feature mapping.
First, $A$ is solved with $w$ fixed. For the discrimination loss term $\mathcal{L}_{dis}$ of the objective function, the solution of the metric matrix $A^{(v)}$ is generalized by the following objective function: [equation not reproduced]
where $\delta_R$ is the Riemannian distance between symmetric positive definite (SPD) matrices,
$\delta_R(X, Y) := \|\log(Y^{-1/2} X Y^{-1/2})\|_F$ for $X, Y \succ 0$.
The above problem has a closed-form solution in the Riemannian manifold subspace, in the form of a weighted geometric mean:
$A^{(v)} = (S^{(v)})^{-1} \#_t D^{(v)}$
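The weighted geometric mean of two SPD matrices is usually defined as $A \#_t B = A^{1/2} (A^{-1/2} B A^{-1/2})^t A^{1/2}$. The sketch below implements that standard definition with NumPy and applies it to two small hypothetical scatter matrices; the helper names and the example values are illustrative, and both matrices are assumed to be SPD.

```python
import numpy as np

def spd_power(M, p):
    """M^p for a symmetric positive definite M via its eigendecomposition."""
    vals, vecs = np.linalg.eigh(M)
    return (vecs * vals ** p) @ vecs.T

def geometric_mean(A, B, t=0.5):
    """A #_t B = A^{1/2} (A^{-1/2} B A^{-1/2})^t A^{1/2} for SPD A and B."""
    A_sqrt = spd_power(A, 0.5)
    A_inv_sqrt = spd_power(A, -0.5)
    middle = A_inv_sqrt @ B @ A_inv_sqrt
    middle = (middle + middle.T) / 2  # remove round-off asymmetry
    return A_sqrt @ spd_power(middle, t) @ A_sqrt

# Hypothetical SPD scatter matrices for one view.
S_mat = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
D_mat = np.array([[3.0, -1.0],
                  [-1.0, 4.0]])

S_inv = np.linalg.inv(S_mat)
A_half = geometric_mean(S_inv, D_mat, t=0.5)   # midpoint of the geodesic
A_start = geometric_mean(S_inv, D_mat, t=0.0)  # recovers S^{-1}
A_end = geometric_mean(S_inv, D_mat, t=1.0)    # recovers D
```

The parameter t moves the solution along the geodesic between $S^{-1}$ and $D$: values near 0 emphasize pulling similar pairs together, values near 1 emphasize pushing dissimilar pairs apart. At t = 1/2 the result satisfies $A S A = D$.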
Further, for the overall objective function, the solution of the metric matrix $A^{(v)}$ of each view is: [equation not reproduced]
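The closed form for the overall objective is not reproduced in the text. As a hedged illustration only, the derivation below assumes a per-view objective of the geometric mean metric learning type, $\operatorname{tr}(AS) + \operatorname{tr}(A^{-1}D) + \lambda_2 D_{sld}(A, A_0)$ with the view superscript dropped, together with the usual definition $D_{sld}(A, A_0) = \operatorname{tr}(A A_0^{-1}) + \operatorname{tr}(A^{-1} A_0) - 2d$. The graph term $\mathcal{L}_{reg1}$ is left out because its exact form is not given, and the symbols $\tilde S$ and $\tilde D$ are introduced here purely for illustration.

```latex
\begin{aligned}
f(A) &= \operatorname{tr}(A S) + \operatorname{tr}(A^{-1} D)
        + \lambda_2 \left[ \operatorname{tr}(A A_0^{-1}) + \operatorname{tr}(A^{-1} A_0) - 2d \right] \\
     &= \operatorname{tr}(A \tilde S) + \operatorname{tr}(A^{-1} \tilde D) - 2 d \lambda_2,
        \qquad \tilde S = S + \lambda_2 A_0^{-1}, \quad \tilde D = D + \lambda_2 A_0, \\
\nabla f(A) &= \tilde S - A^{-1} \tilde D A^{-1} = 0
        \;\Longrightarrow\; A \tilde S A = \tilde D, \\
A &= \tilde S^{-1} \,\#_{1/2}\, \tilde D
   = \tilde S^{-1/2} \left( \tilde S^{1/2} \tilde D \tilde S^{1/2} \right)^{1/2} \tilde S^{-1/2}.
\end{aligned}
```

Under these assumptions the regularizer only shifts the two scatter matrices. Because $A_0$ is positive definite, $\tilde S$ is invertible even when $S$ is singular, which is consistent with the stated purpose of the metric regularization loss. Any further term that is linear in $A$ could be absorbed into $\tilde S$ in the same way.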
Step 104: performing multi-view fusion to obtain a unified representation of the features.
After the metric matrix $A$ of each view is obtained with the alternating solution strategy, the constraints are incorporated into the objective function to construct a generalized Lagrangian function, which is differentiated to solve for $w$: [equation not reproduced]
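The resulting expression for $w$ is not reproduced in the text. One common multi-view weighting scheme, assumed here purely for illustration, minimizes $\sum_v w_v^r f_v$ subject to $\sum_v w_v = 1$ and $w_v \ge 0$, where $f_v > 0$ is the loss of view $v$ at the current $A^{(v)}$ and $r > 1$ is a smoothing exponent. Setting the derivative of the Lagrangian to zero gives $w_v \propto f_v^{1/(1-r)}$. The exponent `r`, the function name and the example losses are hypothetical.

```python
def view_weights(view_losses, r=2.0):
    """Minimise sum_v w_v**r * f_v subject to sum_v w_v = 1 (assumes r > 1 and f_v > 0).

    Stationarity of the Lagrangian gives r * w_v**(r - 1) * f_v = eta for every view,
    hence w_v is proportional to f_v**(1 / (1 - r)).
    """
    exponent = 1.0 / (1.0 - r)
    raw = [f ** exponent for f in view_losses]
    total = sum(raw)
    return [x / total for x in raw]

# Hypothetical per-view losses after solving each metric matrix.
losses = [1.0, 2.0, 4.0]
w = view_weights(losses, r=2.0)
```

Views with a smaller loss receive a larger weight, and the weights are positive by construction, so the non-negativity constraint never becomes active in this assumed scheme.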
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the invention.

Claims (9)

1. A semi-supervised multi-view metric learning method in Riemann space, characterized by comprising the following steps:
step 101: extracting multi-view features of the images from the training set and generating sample pairs;
step 102: constructing multi-view intra-class and inter-class scatter matrices, embedding semantic information into a feature subspace, and thereby transferring and fusing data and knowledge;
step 103: embedding the data and knowledge from Euclidean space into a Riemannian manifold subspace to complete the feature mapping;
step 104: performing multi-view fusion to obtain a unified representation of the features.
2. The semi-supervised multi-view metric learning method in Riemann space of claim 1, wherein step 101, extracting multi-view features of the images from the training set and forming sample pairs, further comprises:
feeding the training set into the HOG local feature descriptor, the SIFT feature descriptor and a deep convolutional neural network; after the bag-of-words model and the final fully connected layer of the feature extraction network, respectively, a 500-dimensional bag-of-words representation and a 1024-dimensional deep feature of each image are obtained; and the similar sample pair set S, the dissimilar sample pair set D and the unlabeled sample set U are obtained according to the sample labels.
3. The semi-supervised multi-view metric learning method in Riemann space of claim 1, wherein the loss function is: [equation not reproduced]
where $\mathcal{L}$ is the total metric learning loss function, $\mathcal{L}_{dis}$ is the discrimination loss, $\lambda_1$ and $\lambda_2$ are parameters controlling the balance between the objectives, $\mathcal{L}_{reg1}$ is the semi-supervised graph regularization loss, $\mathcal{L}_{reg2}$ is the metric regularization loss, $w_v$ is the weight of view $v$, $A^{(v)}$ is the metric matrix of view $v$, $S^{(v)}$ is the intra-class scatter matrix of view $v$, $D^{(v)}$ is the inter-class scatter matrix of view $v$, $X^{(v)}$ is the feature matrix of view $v$, $L$ is the Laplacian matrix, $D_{sld}(A^{(v)}, A_0)$ is the symmetrized LogDet divergence, and $A_0$ is a prior symmetric positive definite matrix.
4. The semi-supervised multi-view metric learning method of claim 3, wherein the discrimination loss $\mathcal{L}_{dis}$ yields a strongly discriminative distance metric under the metric matrix constructed for each view.
5. The semi-supervised multi-view metric learning method in Riemann space of claim 3, wherein the Laplacian matrix is $L = D - W$, where $D$ is a diagonal matrix and the adjacency matrix $W$ is defined as follows: [equation not reproduced]
6. The semi-supervised multi-view metric learning method of claim 3, wherein the semi-supervised graph regularization loss $\mathcal{L}_{reg1}$ and the Laplacian matrix $L$ are constructed under the manifold assumption that samples located within a local region of the low-dimensional manifold have similar classes.
7. The semi-supervised multi-view metric learning method of claim 3, wherein the metric regularization loss $\mathcal{L}_{reg2}$ guarantees that $A^{(v)}$ has a solution when the matrix $S^{(v)}$ is near-singular or non-invertible.
8. The semi-supervised multi-view metric learning method of claim 3, wherein, for the discrimination loss term $\mathcal{L}_{dis}$ of the objective function, the solution of the metric matrix $A^{(v)}$ is generalized by the following objective function: [equation not reproduced]
where $\delta_R$ is the Riemannian distance between symmetric positive definite (SPD) matrices,
$\delta_R(X, Y) := \|\log(Y^{-1/2} X Y^{-1/2})\|_F$ for $X, Y \succ 0$.
9. The semi-supervised multi-view metric learning method of claim 3, wherein the metric matrix $A$ of each view is obtained first, and $w$ is then solved.
CN202210847014.1A 2022-07-07 2022-07-07 Semi-supervised multi-view metric learning method in Riemann space Active CN115205632B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210847014.1A CN115205632B (en) 2022-07-07 2022-07-07 Semi-supervised multi-view metric learning method in Riemann space

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210847014.1A CN115205632B (en) 2022-07-07 2022-07-07 Semi-supervised multi-view metric learning method in Riemann space

Publications (2)

Publication Number Publication Date
CN115205632A CN115205632A (en) 2022-10-18
CN115205632B CN115205632B (en) 2023-07-18

Family

ID=83581743

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210847014.1A Active CN115205632B (en) 2022-07-07 2022-07-07 Semi-supervised multi-view metric learning method in Riemann space

Country Status (1)

Country Link
CN (1) CN115205632B (en)

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110414575A (en) * 2019-07-11 2019-11-05 东南大学 A kind of semi-supervised multiple labeling learning distance metric method merging Local Metric
CN110598733A (en) * 2019-08-05 2019-12-20 南京智谷人工智能研究院有限公司 Multi-label distance measurement learning method based on interactive modeling
CN111488951B (en) * 2020-05-22 2023-11-28 南京大学 Method for generating countermeasure metric learning model for RGB-D image classification

Also Published As

Publication number Publication date
CN115205632A (en) 2022-10-18

Similar Documents

Publication Publication Date Title
Wang et al. Adaptive fusion for RGB-D salient object detection
CN109977773B (en) Human behavior identification method and system based on multi-target detection 3D CNN
Shankar et al. Deep-carving: Discovering visual attributes by carving deep neural nets
Zheng et al. Centralized ranking loss with weakly supervised localization for fine-grained object retrieval.
CN107145862B (en) Multi-feature matching multi-target tracking method based on Hough forest
CN111553193A (en) Visual SLAM closed-loop detection method based on lightweight deep neural network
CN110472652B (en) Small sample classification method based on semantic guidance
CN111178208A (en) Pedestrian detection method, device and medium based on deep learning
CN106228539A (en) Multiple geometric primitive automatic identifying method in a kind of three-dimensional point cloud
Zhang et al. Robust adaptive learning with Siamese network architecture for visual tracking
Hu et al. Semantic SLAM based on improved DeepLabv3⁺ in dynamic scenarios
CN106127112A (en) Data Dimensionality Reduction based on DLLE model and feature understanding method
CN105654054A (en) Semi-supervised neighbor propagation learning and multi-visual dictionary model-based intelligent video analysis method
CN113569657A (en) Pedestrian re-identification method, device, equipment and storage medium
Chen et al. Human motion target posture detection algorithm using semi-supervised learning in internet of things
Si et al. [Retracted] Image Matching Algorithm Based on the Pattern Recognition Genetic Algorithm
CN108763926B (en) Industrial control system intrusion detection method with safety immunity capability
Ma et al. Rethinking safe semi-supervised learning: Transferring the open-set problem to a close-set one
CN115205632B (en) Semi-supervised multi-view metric learning method in Riemann space
Zhang [Retracted] Sports Action Recognition Based on Particle Swarm Optimization Neural Networks
CN116205905B (en) Power distribution network construction safety and quality image detection method and system based on mobile terminal
CN110751005A (en) Pedestrian detection method integrating depth perception features and kernel extreme learning machine
Zhu et al. Deep Neural Network Based Object Detection Algorithm With optimized Detection Head for Small Targets
Goswami et al. A comprehensive review on real time object detection using deep learing model
Ji et al. Research on indoor scene classification mechanism based on multiple descriptors fusion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20231206

Address after: Room 1806, Block B, Huanya Times Square, No. 7 Yari Street, Taiyuan Xuefu Park, Shanxi Comprehensive Reform Demonstration Zone, Taiyuan City, Shanxi Province, 030000

Patentee after: Shanxi Jinxinan Technology Co.,Ltd.

Address before: 030006 803, science and technology building, Shanxi University, No. 92, Wucheng Road, Xiaodian District, Taiyuan City, Shanxi Province

Patentee before: SHANXI University