CN107368599A - The visual analysis method and its analysis system of high dimensional data - Google Patents

Info

Publication number
CN107368599A
Authority
CN
China
Prior art keywords
subspace
mrow
cluster
point
established
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710620143.6A
Other languages
Chinese (zh)
Other versions
CN107368599B (en)
Inventor
夏佳志
***
廖胜辉
奎晓燕
王建新
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Central South University
Original Assignee
Central South University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Central South University filed Critical Central South University
Priority to CN201710620143.6A priority Critical patent/CN107368599B/en
Publication of CN107368599A publication Critical patent/CN107368599A/en
Application granted granted Critical
Publication of CN107368599B publication Critical patent/CN107368599B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2458Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462Approximate or statistical queries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2458Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465Query processing support for facilitating data mining operations in structured databases
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28Databases characterised by their database models, e.g. relational or object models
    • G06F16/283Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28Databases characterised by their database models, e.g. relational or object models
    • G06F16/284Relational databases
    • G06F16/285Clustering or classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Probability & Statistics with Applications (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a visual analysis method for high-dimensional data, comprising: establishing a local subspace difference-geodesic distance projection on the original high-dimensional data; establishing a point-cluster mapping; and establishing a visual analysis view of a series of subspaces. The invention also discloses an analysis system that implements this method. By establishing the local subspace difference-geodesic distance projection, the point-cluster mapping and the visual analysis view of the series of subspaces, the invention proposes a set of interactive visual analysis operations and provides a reliable technical basis for visual subspace clustering and analysis. It can guide and help users to analyse and explore high-dimensional data effectively, markedly reduces the number of trial-and-error attempts during high-dimensional data processing, reduces data redundancy, strengthens interaction in the analysis process, and improves the reliability of the results.

Description

Visual analysis method and analysis system for high-dimensional data
Technical field
The invention relates to a visual analysis method for high-dimensional data and to an analysis system for the same.
Background
With economic and technological development and the arrival of the digital society, data have become an indispensable part of production and daily life. People deal with vast amounts of data every day, such as financial data, scientific computing data and biomedical data. Data analysis has therefore become one of the most active fields of development. Data mining and visual analysis are important parts of information technology; in data mining and analysis, a powerful visual analysis method can achieve twice the result with half the effort.
High dimensionality is one of the key characteristics of big data. High-dimensional data usually contain subspace cluster structures. Automated subspace clustering methods often produce highly redundant results that are hard to understand. Visual analysis is an effective way to enhance user cognition and help users understand such results. However, most current visual analysis methods for subspace clustering visualise the results of automated methods rather than addressing the subspace clustering task itself, so they can hardly improve the results of subspace clustering methods.
Summary of the invention
A first object of the invention is to provide a visual analysis method for high-dimensional data that can effectively guide and help users to analyse and explore high-dimensional data.
A second object of the invention is to provide an analysis system for implementing the visual analysis method for high-dimensional data.
The visual analysis method for high-dimensional data provided by the invention comprises the following steps:
S1. Establish a local subspace difference-geodesic distance projection on the original high-dimensional data;
S2. Establish a point-cluster mapping on the local subspace difference-geodesic distance projection;
S3. Establish a visual analysis view of the series of subspaces from the local subspace difference-geodesic distance projection obtained in step S1 and the point-cluster mapping obtained in step S2.
In step S1, the local subspace difference-geodesic distance projection is established on the original high-dimensional data as follows:
A. For the high-dimensional data to be projected, establish a data-point correlation metric based on geodesic distance;
B. Establish a local subspace difference metric based on the metric established in step A;
C. From the metrics established in steps A and B, establish the local subspace difference-geodesic distance projection.
In step A, the data-point correlation metric based on geodesic distance is established as follows:
I. Build an SNN graph, which may have several connected components, on the high-dimensional dataset to be projected;
II. For the connected components obtained in step I, connect every pair of separate connected components;
III. Compute the shortest distance between every pair of points, thereby obtaining the geodesic distance.
In step II, two separate connected components are connected by joining the two closest data points, one from each component.
In step III, the shortest distance is computed with a shortest-path algorithm.
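Sub-steps I to III can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the SNN graph is read here as a mutual k-nearest-neighbour graph, Dijkstra's algorithm stands in for the unspecified shortest-path algorithm, and the function names (`mutual_knn_graph`, `components`, `bridge_components`, `geodesic_distances`) are hypothetical.

```python
import heapq
import math
from itertools import combinations


def euclid(a, b):
    """Euclidean distance between two points given as sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mutual_knn_graph(points, k):
    """Step I (assumed reading): edge p-q iff each is among the other's k nearest."""
    n = len(points)
    neighbours = []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: euclid(points[i], points[j]))
        neighbours.append(set(order[:k]))
    graph = {i: {} for i in range(n)}
    for i, j in combinations(range(n), 2):
        if j in neighbours[i] and i in neighbours[j]:
            graph[i][j] = graph[j][i] = euclid(points[i], points[j])
    return graph


def components(graph):
    """Connected components found by depth-first search."""
    seen, found = set(), []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], []
        while stack:
            u = stack.pop()
            comp.append(u)
            for v in graph[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        found.append(comp)
    return found


def bridge_components(graph, points):
    """Step II: join every pair of components through their closest pair of points."""
    for a, b in combinations(components(graph), 2):
        i, j = min(((i, j) for i in a for j in b),
                   key=lambda pair: euclid(points[pair[0]], points[pair[1]]))
        graph[i][j] = graph[j][i] = euclid(points[i], points[j])


def geodesic_distances(graph):
    """Step III: all-pairs shortest paths, one Dijkstra run per source."""
    n = len(graph)
    dist = [[math.inf] * n for _ in range(n)]
    for s in range(n):
        dist[s][s] = 0.0
        heap = [(0.0, s)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist[s][u]:
                continue  # stale heap entry
            for v, w in graph[u].items():
                if d + w < dist[s][v]:
                    dist[s][v] = d + w
                    heapq.heappush(heap, (d + w, v))
    return dist
```

Calling `mutual_knn_graph`, then `bridge_components`, then `geodesic_distances` yields a full distance matrix even when the initial graph is disconnected, which is the purpose of step II.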
In step B, the local subspace difference metric is established as follows:
1) Compute the weight of each dimension with the following formula:
$$\omega = [\omega_1, \omega_2, \ldots, \omega_i, \ldots, \omega_d], \qquad \omega_i = \frac{1/\sigma_i}{\sum_{j=1}^{d} 1/\sigma_j}$$
where $\omega$ is the dimension weight vector, $\omega_i$ is the weight of the i-th dimension, $\sigma_i$ is the variance of the data points in the i-th dimension, and d is the number of dimensions;
2) Compute the weighted distance between any two points in the SNN graph with the following formula:
$$d_{pq}[W] = \max\{d_{pq}[\omega_p],\ d_{pq}[\omega_q]\}$$
where $d_{pq}[W]$ is the weighted distance between points p and q, $\omega_p = [\omega_{p1}, \omega_{p2}, \ldots, \omega_{pi}, \ldots, \omega_{pd}]$ is the local subspace feature vector of point p, $\omega_q = [\omega_{q1}, \omega_{q2}, \ldots, \omega_{qi}, \ldots, \omega_{qd}]$ is the local subspace feature vector of point q, $d_i$ is the Euclidean distance between points p and q in the i-th dimension, $d_{pq}[\omega_p]$ is the weighted distance of point q relative to point p, and $d_{pq}[\omega_q]$ is the weighted distance of point p relative to point q;
3) Based on cosine similarity, build the difference matrix with the following formula:
$$d^{s}_{(p_i,p_j)} = 1 - \frac{\sum_{k=1}^{d} \omega_{ik}\,\omega_{jk}}{\sqrt{\sum_{k=1}^{d} \omega_{ik}^{2}}\ \sqrt{\sum_{k=1}^{d} \omega_{jk}^{2}}}$$
where $d^{s}_{(p_i,p_j)}$ is the difference value between points $p_i$ and $p_j$ based on cosine similarity; i and j are data-point indices in the range [0, n), and n is the size of the dataset.
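The weight formula of sub-step 1) and the cosine-based difference of sub-step 3) can be sketched as below. The sketch makes several assumptions that the text does not settle: the variance is the population variance, it is taken over a neighbourhood of points supplied by the caller, and no dimension has zero variance. The function names are hypothetical. The weighted distance of sub-step 2) is left out because its per-point formula is not reproduced in this text.

```python
import math
from statistics import pvariance


def dimension_weights(neighbourhood):
    """omega_i = (1 / sigma_i) / sum_j (1 / sigma_j), sigma_i = variance in dimension i.

    Assumes every dimension has non-zero variance over the given points.
    """
    columns = list(zip(*neighbourhood))
    inverse = [1.0 / pvariance(column) for column in columns]
    total = sum(inverse)
    return [value / total for value in inverse]


def cosine_difference(w_i, w_j):
    """d^s = 1 - cosine similarity of two local subspace feature vectors."""
    dot = sum(a * b for a, b in zip(w_i, w_j))
    norm_i = math.sqrt(sum(a * a for a in w_i))
    norm_j = math.sqrt(sum(b * b for b in w_j))
    return 1.0 - dot / (norm_i * norm_j)


def difference_matrix(weight_vectors):
    """n x n matrix of pairwise cosine-based difference values."""
    n = len(weight_vectors)
    return [[cosine_difference(weight_vectors[i], weight_vectors[j])
             for j in range(n)] for i in range(n)]
```

A dimension in which the points vary little receives a large weight, so two points whose neighbourhoods are compact in the same dimensions get a difference value close to 0.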
In step C, the local subspace difference-geodesic distance projection is established by using the MDS algorithm to map the data points to the x-axis according to the local subspace difference metric and to the y-axis according to the geodesic-distance-based data-point correlation metric.
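One way to realise this step is to run a one-dimensional MDS on each of the two distance matrices and pair the resulting coordinates. The sketch below assumes classical (Torgerson) MDS with a simple power iteration for the leading eigenvector; the text only says "MDS algorithm", so this particular variant and the function names are assumptions.

```python
import math


def classical_mds_1d(dist, iterations=200):
    """Embed points on a line from a symmetric distance matrix (classical MDS).

    Power iteration finds the eigenvalue of largest magnitude, which is the
    leading positive one when the distances are close to Euclidean.
    """
    n = len(dist)
    squared = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in squared]
    total_mean = sum(row_mean) / n
    # Double centring: B = -1/2 * J * D^2 * J
    b = [[-0.5 * (squared[i][j] - row_mean[i] - row_mean[j] + total_mean)
          for j in range(n)] for i in range(n)]
    v = [float(i + 1) for i in range(n)]  # fixed, deterministic start vector
    for _ in range(iterations):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return [0.0] * n  # all points coincide
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(b[i][j] * v[j] for j in range(n)) for i in range(n))
    scale = math.sqrt(max(eigenvalue, 0.0))
    return [scale * x for x in v]


def difference_geodesic_projection(difference, geodesic):
    """x from the local subspace difference matrix, y from the geodesic distances."""
    xs = classical_mds_1d(difference)
    ys = classical_mds_1d(geodesic)
    return list(zip(xs, ys))
```

Because each axis is embedded independently, horizontal separation in the resulting scatter plot reflects differing local subspaces and vertical separation reflects distance along the data manifold.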
In step S2, the point-cluster mapping is established as follows:
(I) Extract the data points that have been circle-selected in the local subspace difference-geodesic distance projection established in step S1; the selection is made interactively by the user and yields the point set to be processed;
(II) Compute the eigenvectors of the point set from step (I) using PCA (principal component analysis);
(III) Select the two eigenvectors from step (II) with the smallest eigenvalues and use them to generate the mapping onto a two-dimensional plane, thereby completing the point-cluster mapping.
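Steps (II) and (III) can be sketched as follows, taking the text at its word that the two eigenvectors with the smallest eigenvalues are used. The Jacobi eigenvalue routine is only a dependency-free stand-in for a library eigensolver, the covariance uses the population normalisation, and all function names are hypothetical.

```python
import math


def covariance(points):
    """Return the centred points and their population covariance matrix."""
    n, d = len(points), len(points[0])
    mean = [sum(p[k] for p in points) / n for k in range(d)]
    centred = [[p[k] - mean[k] for k in range(d)] for p in points]
    cov = [[sum(c[a] * c[b] for c in centred) / n for b in range(d)]
           for a in range(d)]
    return centred, cov


def jacobi_eigh(matrix, sweeps=50, tol=1e-12):
    """Eigenvalues and eigenvectors (as columns) of a symmetric matrix."""
    n = len(matrix)
    a = [row[:] for row in matrix]
    v = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes the (p, q) entry
                theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # columns: A <- A J
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rows: A <- J^T A
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
                for k in range(n):  # accumulate eigenvectors: V <- V J
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p], v[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
    return [a[i][i] for i in range(n)], v


def point_cluster_mapping(selected_points):
    """Project the selected points onto the two smallest-eigenvalue eigenvectors."""
    centred, cov = covariance(selected_points)
    values, vectors = jacobi_eigh(cov)
    order = sorted(range(len(values)), key=lambda k: values[k])
    first, second = order[0], order[1]
    d = len(values)
    return [(sum(c[k] * vectors[k][first] for k in range(d)),
             sum(c[k] * vectors[k][second] for k in range(d)))
            for c in centred]
```

The selected point set needs at least two dimensions; the directions of largest variance are the ones discarded by this choice of eigenvectors.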
In step S3, the visual analysis view of the series of subspaces is established as follows:
(a) Represent each subspace in a parameter-free way;
(b) Measure the similarity between subspaces;
(c) Based on the parameter-free subspace representation and the similarity measure, build the interactive interface for comparing subspaces, thereby completing the visual analysis of the high-dimensional data.
In step (a), the parameter-free representation of a subspace follows this rule:
A subspace $S_C$ containing a cluster C is defined as $\{\omega[p] : p \in C\}$, where $\omega[p]$ is the local subspace feature vector of point p.
In step (b), the similarity between subspaces is measured as follows:
(I) For a d-dimensional subspace containing a cluster C of n points, define its feature vector using the following formula:
where $[\omega_{i1}, \omega_{i2}, \ldots, \omega_{ij}, \ldots, \omega_{id}]$ is the subspace feature vector of point $p_i$ in cluster C;
(II) Use the cosine similarity of the feature vectors obtained in step (I) to measure the similarity of two subspaces.
In step (c), the interactive interface for comparing subspaces is built according to the following rules:
Assume that cluster $C_1$ is contained in subspace $V_1$ and cluster $C_2$ is contained in subspace $V_2$. The interface is then built as follows:
Subspace sum interface: the result of $V_1 + V_2$ is $C_1 \cap C_2$, and the active dimensions of $V_1 + V_2$ are the union of the active dimensions of $V_1$ and $V_2$;
Subspace intersection interface: the result of $V_1 \cap V_2$ is $C_1 \cup C_2$, and the active dimensions of $V_1 \cap V_2$ are the intersection of the active dimensions of $V_1$ and $V_2$;
Subspace list sorting interface: given a set of selected dimensions, the similarity between subspaces is judged from their feature vectors restricted to the selected dimensions, and a one-dimensional MDS (multidimensional scaling) algorithm maps the set of subspaces onto a one-dimensional coordinate axis, forming a sorted list of subspaces.
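The sum and intersection rules can be illustrated with a small value type. This is a sketch only: the `Subspace` class, its field names, and the choice of the `+` and `&` operators are hypothetical, and the sorting interface is not covered here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subspace:
    points: frozenset       # identifiers of the points in the contained cluster
    active_dims: frozenset  # indices of the active dimensions

    def __add__(self, other):
        """V1 + V2: clusters are intersected, active dimensions are united."""
        return Subspace(self.points & other.points,
                        self.active_dims | other.active_dims)

    def __and__(self, other):
        """V1 intersect V2: clusters are united, active dimensions are intersected."""
        return Subspace(self.points | other.points,
                        self.active_dims & other.active_dims)


v1 = Subspace(frozenset({1, 2, 3}), frozenset({0, 1}))
v2 = Subspace(frozenset({2, 3, 4}), frozenset({1, 2}))
combined = v1 + v2  # points {2, 3}, active dimensions {0, 1, 2}
shared = v1 & v2    # points {1, 2, 3, 4}, active dimensions {1}
```

The two operations are dual: adding dimensions can only narrow the set of points that cluster together in all of them, while keeping only the common dimensions widens it.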
The invention also discloses an analysis system that implements the visual analysis method for high-dimensional data. The system comprises, connected in series, a module for establishing the local subspace difference-geodesic distance projection, a module for establishing the point-cluster mapping, and a module for establishing the visual analysis view of the series of subspaces. The projection module establishes the local subspace difference-geodesic distance projection on the original high-dimensional data and passes it to the point-cluster mapping module. The point-cluster mapping module establishes the point-cluster mapping from the projection and passes it to the view module. The view module establishes the visual analysis view of the series of subspaces from the established projection and point-cluster mapping.
By establishing the local subspace difference-geodesic distance projection, the point-cluster mapping and the visual analysis view of the series of subspaces, the method and system provided by the invention offer a set of interactive visual analysis operations and a reliable technical basis for visual subspace clustering and analysis. They can guide and help users to analyse and explore high-dimensional data effectively, markedly reduce the number of trial-and-error attempts during high-dimensional data processing, reduce data redundancy, strengthen interaction in the analysis process, and improve the reliability of the results.
Brief description of the drawings
Fig. 1 is a flow chart of the method of the invention.
Fig. 2 is a functional block diagram of the system of the invention.
Detailed description of the embodiments
Fig. 1 shows the flow chart of the method of the invention. The visual analysis method for high-dimensional data provided by the invention comprises the following steps:
S1. Establish a local subspace difference-geodesic distance projection on the original high-dimensional data, specifically as follows:
A. For the high-dimensional data to be projected, establish a data-point correlation metric based on geodesic distance, specifically as follows:
I. Build an SNN graph, which may have several connected components, on the high-dimensional dataset to be projected. The SNN graph is a subgraph of the k-NN graph: in the SNN graph, there is an edge between points p and q if and only if p and q are k-nearest neighbours of each other;
II. For the connected components obtained in step I, connect every pair of separate connected components, specifically by joining the two closest data points, one from each component;
III. Compute the shortest distance between every pair of points with a shortest-path algorithm, thereby obtaining the geodesic distance;
B. Establish a local subspace difference metric based on the metric established in step A, specifically as follows:
1) Compute the weight of each dimension with the following formula:
$$\omega = [\omega_1, \omega_2, \ldots, \omega_i, \ldots, \omega_d], \qquad \omega_i = \frac{1/\sigma_i}{\sum_{j=1}^{d} 1/\sigma_j}$$
where $\omega$ is the dimension weight vector, $\omega_i$ is the weight of the i-th dimension, $\sigma_i$ is the variance of the data points in the i-th dimension, and d is the number of dimensions;
2) Compute the weighted distance between any two points in the SNN graph with the following formula:
$$d_{pq}[W] = \max\{d_{pq}[\omega_p],\ d_{pq}[\omega_q]\}$$
where $d_{pq}[W]$ is the weighted distance between points p and q, $\omega_p = [\omega_{p1}, \omega_{p2}, \ldots, \omega_{pi}, \ldots, \omega_{pd}]$ is the local subspace feature vector of point p, $\omega_q = [\omega_{q1}, \omega_{q2}, \ldots, \omega_{qi}, \ldots, \omega_{qd}]$ is the local subspace feature vector of point q, $d_i$ is the Euclidean distance between points p and q in the i-th dimension, $d_{pq}[\omega_p]$ is the weighted distance of point q relative to point p, and $d_{pq}[\omega_q]$ is the weighted distance of point p relative to point q;
3) Based on cosine similarity, build the difference matrix with the following formula:
$$d^{s}_{(p_i,p_j)} = 1 - \frac{\sum_{k=1}^{d} \omega_{ik}\,\omega_{jk}}{\sqrt{\sum_{k=1}^{d} \omega_{ik}^{2}}\ \sqrt{\sum_{k=1}^{d} \omega_{jk}^{2}}}$$
where $d^{s}_{(p_i,p_j)}$ is the difference value between points $p_i$ and $p_j$ based on cosine similarity; i and j are data-point indices in the range [0, n), and n is the size of the dataset;
C. From the metrics established in steps A and B, establish the local subspace difference-geodesic distance projection, specifically by using the MDS algorithm to map the data points to the x-axis according to the local subspace difference metric and to the y-axis according to the geodesic-distance-based data-point correlation metric. The x-axis thus represents the local subspace difference and the y-axis represents the geodesic distance metric. In the local subspace difference-geodesic distance projection, the clusters are separated into distinct regions;
S2. Establish a point-cluster mapping on the local subspace difference-geodesic distance projection, specifically as follows:
(I) Extract the data points that have been circle-selected in the local subspace difference-geodesic distance projection established in step S1; the points are selected by the user, which yields the point set to be processed;
(II) Compute the eigenvectors of the point set from step (I) using PCA (principal component analysis);
(III) Select the two eigenvectors from step (II) with the smallest eigenvalues and use them to generate the mapping onto a two-dimensional plane, thereby completing the point-cluster mapping;
S3. Establish a visual analysis view of the series of subspaces from the local subspace difference-geodesic distance projection obtained in step S1 and the point-cluster mapping obtained in step S2, specifically as follows:
(a) Represent each subspace in a parameter-free way, according to the following rule:
A subspace $S_C$ containing a cluster C is defined as $\{\omega[p] : p \in C\}$, where $\omega[p]$ is the local subspace feature vector of point p;
(b) Measure the similarity between subspaces, specifically as follows:
(I) For a d-dimensional subspace containing a cluster C of n points, define its feature vector using the following formula:
where
$[\omega_{i1}, \omega_{i2}, \ldots, \omega_{ij}, \ldots, \omega_{id}]$ is the subspace feature vector of point $p_i$ in cluster C;
(II) Use the cosine similarity of the feature vectors obtained in step (I) to measure the similarity of two subspaces;
(c) Based on the parameter-free subspace representation and the similarity measure, build the interactive interface for comparing subspaces, thereby completing the visual analysis of the high-dimensional data. The interface is built according to the following rules:
Assume that cluster $C_1$ is contained in subspace $V_1$ and cluster $C_2$ is contained in subspace $V_2$. Then:
Subspace sum interface: the result of $V_1 + V_2$ is $C_1 \cap C_2$, and the active dimensions of $V_1 + V_2$ are the union of the active dimensions of $V_1$ and $V_2$;
Subspace intersection interface: the result of $V_1 \cap V_2$ is $C_1 \cup C_2$, and the active dimensions of $V_1 \cap V_2$ are the intersection of the active dimensions of $V_1$ and $V_2$;
Subspace list sorting interface: given a set of selected dimensions, the similarity between subspaces is judged from their feature vectors restricted to the selected dimensions, and a one-dimensional MDS (multidimensional scaling) algorithm maps the set of subspaces onto a one-dimensional coordinate axis, forming a sorted list of subspaces.
Fig. 2 shows the functional block diagram of the system of the invention. The invention also discloses an analysis system that implements the visual analysis method for high-dimensional data. The system comprises, connected in series, a module for establishing the local subspace difference-geodesic distance projection, a module for establishing the point-cluster mapping, and a module for establishing the visual analysis view of the series of subspaces. The projection module establishes the local subspace difference-geodesic distance projection on the original high-dimensional data and passes it to the point-cluster mapping module. The point-cluster mapping module establishes the point-cluster mapping from the projection and passes it to the view module. The view module establishes the visual analysis view of the series of subspaces from the established projection and point-cluster mapping.

Claims (10)

1. A visual analysis method for high-dimensional data, comprising the following steps:
S1. Establish a local subspace difference-geodesic distance projection on the original high-dimensional data;
S2. Establish a point-cluster mapping on the local subspace difference-geodesic distance projection;
S3. Establish a visual analysis view of the series of subspaces from the local subspace difference-geodesic distance projection obtained in step S1 and the point-cluster mapping obtained in step S2.
2. The visual analysis method for high-dimensional data according to claim 1, characterised in that the local subspace difference-geodesic distance projection of step S1 is established on the original high-dimensional data as follows:
A. For the high-dimensional data to be projected, establish a data-point correlation metric based on geodesic distance;
B. Establish a local subspace difference metric based on the metric established in step A;
C. From the metrics established in steps A and B, establish the local subspace difference-geodesic distance projection.
3. The visual analysis method for high-dimensional data according to claim 2, characterised in that the data-point correlation metric based on geodesic distance of step A is established as follows:
I. Build an SNN graph, which may have several connected components, on the high-dimensional dataset to be projected;
II. For the connected components obtained in step I, connect every pair of separate connected components;
III. Compute the shortest distance between every pair of points, thereby obtaining the geodesic distance.
4. The visual analysis method for high-dimensional data according to claim 3, characterised in that the local subspace difference metric of step B is established as follows:
1) Compute the weight of each dimension with the following formula:
$$\omega = [\omega_1, \omega_2, \ldots, \omega_i, \ldots, \omega_d], \qquad \omega_i = \frac{1/\sigma_i}{\sum_{j=1}^{d} 1/\sigma_j}$$
where $\omega$ is the dimension weight vector, $\omega_i$ is the weight of the i-th dimension, $\sigma_i$ is the variance of the data points in the i-th dimension, and d is the number of dimensions;
2) Compute the weighted distance between any two points in the SNN graph with the following formula:
$$d_{pq}[W] = \max\{d_{pq}[\omega_p],\ d_{pq}[\omega_q]\}$$
where $d_{pq}[W]$ is the weighted distance between points p and q, $\omega_p = [\omega_{p1}, \omega_{p2}, \ldots, \omega_{pi}, \ldots, \omega_{pd}]$ is the local subspace feature vector of point p, $\omega_q = [\omega_{q1}, \omega_{q2}, \ldots, \omega_{qi}, \ldots, \omega_{qd}]$ is the local subspace feature vector of point q, $d_i$ is the Euclidean distance between points p and q in the i-th dimension, $d_{pq}[\omega_p]$ is the weighted distance of point q relative to point p, and $d_{pq}[\omega_q]$ is the weighted distance of point p relative to point q;
3) Based on cosine similarity, build the difference matrix with the following formula:
$$d^{s}_{(p_i,p_j)} = 1 - \frac{\sum_{k=1}^{d} \omega_{ik}\,\omega_{jk}}{\sqrt{\sum_{k=1}^{d} \omega_{ik}^{2}}\ \sqrt{\sum_{k=1}^{d} \omega_{jk}^{2}}}$$
where $d^{s}_{(p_i,p_j)}$ is the difference value between points $p_i$ and $p_j$ based on cosine similarity; i and j are data-point indices in the range [0, n), and n is the size of the dataset.
5. The visual analysis method for high-dimensional data according to claim 4, characterised in that the point-cluster mapping of step S2 is established as follows:
(I) Extract the data points that have been circle-selected in the local subspace difference-geodesic distance projection established in step S1; the points are selected interactively by the user, which yields the point set to be processed;
(II) Compute the eigenvectors of the point set from step (I) using PCA;
(III) Select the two eigenvectors from step (II) with the smallest eigenvalues and use them to generate the mapping onto a two-dimensional plane, thereby completing the point-cluster mapping.
6. The visual analysis method for high-dimensional data according to claim 5, characterised in that the visual analysis view of the series of subspaces of step S3 is established as follows:
(a) Represent each subspace in a parameter-free way;
(b) Measure the similarity between subspaces;
(c) Based on the parameter-free subspace representation and the similarity measure, build the interactive interface for comparing subspaces, thereby completing the visual analysis of the high-dimensional data.
7. The visual analysis method for high-dimensional data according to claim 6, characterised in that the parameter-free representation of a subspace in step (a) follows this rule:
A subspace $S_C$ containing a cluster C is defined as $\{\omega[p] : p \in C\}$, where $\omega[p]$ is the local subspace feature vector of point p.
8. The visual analysis method for high-dimensional data according to claim 7, characterised in that the similarity between subspaces in step (b) is measured as follows:
(I) For a d-dimensional subspace containing a cluster C of n points, define its feature vector using the following formula:
where $[\omega_{i1}, \omega_{i2}, \ldots, \omega_{ij}, \ldots, \omega_{id}]$ is the subspace feature vector of point $p_i$ in cluster C;
(II) Use the cosine similarity of the feature vectors obtained in step (I) to measure the similarity of two subspaces.
9. The visual analysis method for high-dimensional data according to claim 8, characterised in that the interactive interface for comparing subspaces in step (c) is built according to the following rules:
Assume that cluster $C_1$ is contained in subspace $V_1$ and cluster $C_2$ is contained in subspace $V_2$. Then:
Subspace sum interface: the result of $V_1 + V_2$ is $C_1 \cap C_2$, and the active dimensions of $V_1 + V_2$ are the union of the active dimensions of $V_1$ and $V_2$;
Subspace intersection interface: the result of $V_1 \cap V_2$ is $C_1 \cup C_2$, and the active dimensions of $V_1 \cap V_2$ are the intersection of the active dimensions of $V_1$ and $V_2$;
Subspace list sorting interface: given a set of selected dimensions, the similarity between subspaces is judged from their feature vectors restricted to the selected dimensions, and a one-dimensional MDS algorithm maps the set of subspaces onto a one-dimensional coordinate axis, forming a sorted list of subspaces.
10. An analysis system implementing the visual analysis method for high-dimensional data according to any one of claims 1 to 9, characterised in that it comprises, connected in series, a module for establishing the local subspace difference-geodesic distance projection, a module for establishing the point-cluster mapping, and a module for establishing the visual analysis view of the series of subspaces; the projection module establishes the local subspace difference-geodesic distance projection on the original high-dimensional data and passes it to the point-cluster mapping module; the point-cluster mapping module establishes the point-cluster mapping from the projection and passes it to the view module; the view module establishes the visual analysis view of the series of subspaces from the established projection and point-cluster mapping.
CN201710620143.6A 2017-07-26 2017-07-26 Visual analysis method and system for high-dimensional data Expired - Fee Related CN107368599B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710620143.6A CN107368599B (en) 2017-07-26 2017-07-26 Visual analysis method and system for high-dimensional data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710620143.6A CN107368599B (en) 2017-07-26 2017-07-26 Visual analysis method and system for high-dimensional data

Publications (2)

Publication Number Publication Date
CN107368599A CN107368599A (en) 2017-11-21
CN107368599B CN107368599B (en) 2020-06-23

Family

ID=60307219

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710620143.6A Expired - Fee Related CN107368599B (en) 2017-07-26 2017-07-26 Visual analysis method and system for high-dimensional data

Country Status (1)

Country Link
CN (1) CN107368599B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040241709A1 (en) * 2000-03-22 2004-12-02 Patterson David E. Visualizing high dimensional descriptors of molecular structures
CN102163224A (en) * 2011-04-06 2011-08-24 中南大学 Adaptive spatial clustering method
CN105868352A (en) * 2016-03-29 2016-08-17 天津大学 High-dimensional data dimension ordering method based on dimension correlation analysis
CN106203516A (en) * 2016-07-13 2016-12-07 中南大学 A kind of subspace clustering visual analysis method based on dimension dependency

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
夏佳志: "一种基于子空间聚类的局部相关可视分析方法", 《计算机辅助设计与图形学学报》 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108021664A (en) * 2017-12-04 2018-05-11 北京工商大学 A kind of multidimensional data correlation visual analysis method and system based on dimensional projections
CN108021664B (en) * 2017-12-04 2020-05-05 北京工商大学 Multidimensional data correlation visual analysis method and system based on dimension projection
CN108090182A (en) * 2017-12-15 2018-05-29 清华大学 A kind of distributed index method and system of extensive high dimensional data
CN111428631A (en) * 2020-03-23 2020-07-17 中南大学 Visual identification and sorting method for flight control signals of unmanned aerial vehicle
CN111428631B (en) * 2020-03-23 2023-05-05 中南大学 Visual identification and sorting method for unmanned aerial vehicle flight control signals
CN115952426A (en) * 2023-03-10 2023-04-11 中南大学 Distributed noise data clustering method based on random sampling and user classification method

Also Published As

Publication number Publication date
CN107368599B (en) 2020-06-23

Similar Documents

Publication Publication Date Title
CN107368599A (en) The visual analysis method and its analysis system of high dimensional data
US10657686B2 (en) Gragnostics rendering
CN105740651B (en) A kind of construction method of particular cancers difference expression gene regulated and control network
CN107369183A (en) Towards the MAR Tracing Registration method and system based on figure optimization SLAM
CN108428015B (en) Wind power prediction method based on historical meteorological data and random simulation
CN102629275A (en) Face and name aligning method and system facing to cross media news retrieval
CN109447100A (en) A kind of three-dimensional point cloud recognition methods based on the detection of B-spline surface similitude
CN115311730B (en) Face key point detection method and system and electronic equipment
CN112085072A (en) Cross-modal retrieval method of sketch retrieval three-dimensional model based on space-time characteristic information
CN105205135A (en) 3D (three-dimensional) model retrieving method based on topic model and retrieving device thereof
Zhao et al. Non-aligned multi-view multi-label classification via learning view-specific labels
Liang et al. MVCLN: multi-view convolutional LSTM network for cross-media 3D shape recognition
CN114926742A (en) Loop detection and optimization method based on second-order attention mechanism
Hou et al. Fast multi-view outlier detection via deep encoder
Yu et al. A novel multi-feature representation of images for heterogeneous IoTs
CN111914912B (en) Cross-domain multi-view target identification method based on twin condition countermeasure network
Liu et al. Multi-modal fusion based on depth adaptive mechanism for 3D object detection
CN111724298A (en) Dictionary optimization and mapping method for digital rock core super-dimensional reconstruction
Pan et al. A kernel-based probabilistic collaborative representation for face recognition
Qv et al. LG: A clustering framework supported by point proximity relations
CN115273177A (en) Method, device and equipment for recognizing face types of heterogeneous faces and storage medium
N'Cir et al. Kernel overlapping k-means for clustering in feature space
Li et al. Fuzzy granule manifold alignment preserving local topology
CN111353538A (en) Similar image matching method based on deep learning
CN107423763A (en) The two-dimensional projection's method and its optical projection system of high dimensional data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20200623

Termination date: 20210726