CN107368599A - Visual analysis method and analysis system for high-dimensional data - Google Patents
- Publication number
- CN107368599A (application CN201710620143.6A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2462—Approximate or statistical queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2465—Query processing support for facilitating data mining operations in structured databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/283—Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
- G06F16/285—Clustering or classification
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Probability & Statistics with Applications (AREA)
- Fuzzy Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Computational Linguistics (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a visual analysis method for high-dimensional data, comprising: establishing a local subspace difference-geodesic distance projection on the original high-dimensional data; establishing a mapping of the clustered point cluster; and establishing a visual analysis view of the series of subspaces. The invention also discloses an analysis system that implements the method. By establishing the projection, the mapping, and the view, the invention provides a series of interactive visual analysis operations and a reliable technical basis for visual subspace clustering and analysis. It can effectively guide and help users to analyze and explore high-dimensional data, significantly reduces the number of trial-and-error attempts during high-dimensional data processing, reduces data redundancy, strengthens the interactivity of the analysis process, and improves the reliability of the results.
Description
Technical field
The present invention relates to a visual analysis method for high-dimensional data and an analysis system implementing it.
Background
With the development of the economy and technology and the arrival of the digital society, data has become an indispensable part of production and daily life. People deal with vast amounts of data every day, such as financial data, scientific computing data, and biomedical data. Data analysis has therefore become one of the most active fields of development. Data mining and visual analysis are important parts of information technology; in data mining and analysis, a powerful visual analysis method can achieve twice the result with half the effort.
High dimensionality is one of the key characteristics of big data. High-dimensional data usually contains subspace cluster structures. Automated subspace clustering methods often produce highly redundant subspace clustering results that are hard to understand. Visual analysis is an effective way to enhance user cognition and aid understanding. However, most current visual analysis methods for subspace clustering visualize the results of automated methods rather than addressing the subspace clustering task itself, so they can hardly improve the results of subspace clustering methods.
Summary of the invention
A first object of the present invention is to provide a visual analysis method for high-dimensional data that can effectively guide and help users to analyze and explore high-dimensional data.
A second object of the present invention is to provide an analysis system for implementing the visual analysis method for high-dimensional data.
The visual analysis method for high-dimensional data provided by the invention comprises the following steps:
S1. Establish a local subspace difference-geodesic distance projection on the original high-dimensional data;
S2. Establish a mapping of the clustered point cluster in the local subspace difference-geodesic distance projection;
S3. Establish a visual analysis view of the series of subspaces according to the local subspace difference-geodesic distance projection obtained in step S1 and the mapping of the clustered point cluster obtained in step S2.
Establishing the local subspace difference-geodesic distance projection on the original high-dimensional data in step S1 specifically uses the following steps:
A. For the high-dimensional data to be projected, establish a data point correlation metric based on geodesic distance;
B. Establish a local subspace difference metric according to the metric established in step A;
C. Establish the local subspace difference-geodesic distance projection according to the metrics established in steps A and B.
Establishing the data point correlation metric based on geodesic distance in step A specifically uses the following steps:
I. Construct an SNN graph with several connected components from the high-dimensional data set to be projected;
II. For the connected components from step I, connect every two separate connected components;
III. Compute the shortest distance between any two points, thereby obtaining the geodesic distance.
Connecting two separate connected components in step II specifically means connecting the two closest data points of the two connected components.
Computing the shortest distance in step III is specifically done with a shortest-path algorithm.
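Steps I to III can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation. It assumes that the SNN graph joins two points only when each is among the other's k nearest neighbours (the reading given in the detailed description), and it uses Dijkstra's algorithm as the shortest-path algorithm, which the text does not name. All function names (`mutual_knn_graph`, `bridge_components`, `geodesic_distances`) are hypothetical.

```python
import heapq
import math


def mutual_knn_graph(points, k):
    """Step I (assumed reading): edge p-q iff each is a k-nearest neighbour of the other."""
    n = len(points)
    dist = [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]
    knn = []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        knn.append(set(order[:k]))
    adj = {i: {} for i in range(n)}
    for i in range(n):
        for j in knn[i]:
            if i in knn[j]:
                adj[i][j] = adj[j][i] = dist[i][j]
    return adj, dist


def connected_components(adj):
    """Depth-first search that lists the connected components of the graph."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], []
        while stack:
            u = stack.pop()
            comp.append(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        comps.append(comp)
    return comps


def bridge_components(adj, dist):
    """Step II: join every pair of components through their two closest points."""
    comps = connected_components(adj)
    for a in range(len(comps)):
        for b in range(a + 1, len(comps)):
            i, j = min(((i, j) for i in comps[a] for j in comps[b]),
                       key=lambda ij: dist[ij[0]][ij[1]])
            adj[i][j] = adj[j][i] = dist[i][j]


def geodesic_distances(points, k):
    """Step III: all-pairs shortest paths on the bridged graph (Dijkstra, an assumption)."""
    adj, dist = mutual_knn_graph(points, k)
    bridge_components(adj, dist)
    n = len(points)
    result = []
    for src in range(n):
        best = [math.inf] * n
        best[src] = 0.0
        heap = [(0.0, src)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > best[u]:
                continue  # stale heap entry
            for v, w in adj[u].items():
                if d + w < best[v]:
                    best[v] = d + w
                    heapq.heappush(heap, (best[v], v))
        result.append(best)
    return result
```

For a U-shaped point set such as `[(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0)]` with `k=2`, the geodesic distance between the two ends follows the curve, whereas the Euclidean distance cuts straight across the opening.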
Establishing the local subspace difference metric in step B specifically uses the following steps:
1) Compute the weight of each dimension using the following formula:
$$\omega = [\omega_1, \omega_2, \ldots, \omega_i, \ldots, \omega_d], \qquad \omega_i = \frac{1/\sigma_i}{\sum_{j=1}^{d} 1/\sigma_j}$$
where $\omega$ is the dimension weight vector, $\omega_i$ is the weight of the i-th dimension, $\sigma_i$ is the variance of the data points in the i-th dimension, and d is the number of dimensions;
2) Compute the weighted distance between any two points in the SNN graph using the following formula:
$$d_{pq}[W] = \max\{d_{pq}[\omega_p],\ d_{pq}[\omega_q]\}$$
where $d_{pq}[W]$ is the weighted distance between point p and point q, $\omega_p = [\omega_{p1}, \omega_{p2}, \ldots, \omega_{pi}, \ldots, \omega_{pd}]$ is the local subspace feature vector of point p, $\omega_q = [\omega_{q1}, \omega_{q2}, \ldots, \omega_{qi}, \ldots, \omega_{qd}]$ is the local subspace feature vector of point q, $d_i$ is the Euclidean distance between point p and point q in the i-th dimension, $d_{pq}[\omega_p]$ is the weighted distance of point q relative to point p, and $d_{pq}[\omega_q]$ is the weighted distance of point p relative to point q;
3) Based on cosine similarity, establish the difference matrix according to the following formula:
$$d^{s}_{(p_i,p_j)} = 1 - \frac{\sum_{k=1}^{d} \omega_{ik} \cdot \omega_{jk}}{\sqrt{\sum_{k=1}^{d} \omega_{ik}^{2}} \cdot \sqrt{\sum_{k=1}^{d} \omega_{jk}^{2}}}$$
where $d^{s}_{(p_i,p_j)}$ is the cosine-similarity-based difference value between points $p_i$ and $p_j$; i and j are data point indices in the range [0, n), and n is the size of the data set.
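The dimension weights of step 1) and the cosine-based difference of step 3) can be sketched as follows. The code assumes that the variance is taken over whichever set of points is passed in (for example the neighbourhood of a point; the text does not say which set is used), and it guards against zero variance with a small `eps`, which is an addition not found in the source. The function names are hypothetical.

```python
import math
from statistics import pvariance


def dimension_weights(points, eps=1e-12):
    """omega_i = (1 / sigma_i) / sum_j (1 / sigma_j), sigma_i = variance in dimension i."""
    d = len(points[0])
    # eps avoids division by zero for constant dimensions (assumption, not in the source)
    inv = [1.0 / max(pvariance([p[i] for p in points]), eps) for i in range(d)]
    total = sum(inv)
    return [x / total for x in inv]


def cosine_difference(w_i, w_j):
    """d^s = 1 - cosine similarity of two local subspace feature vectors."""
    dot = sum(a * b for a, b in zip(w_i, w_j))
    norm_i = math.sqrt(sum(a * a for a in w_i))
    norm_j = math.sqrt(sum(b * b for b in w_j))
    return 1.0 - dot / (norm_i * norm_j)


def difference_matrix(weight_vectors):
    """n x n matrix of cosine-based differences between all pairs of feature vectors."""
    n = len(weight_vectors)
    return [[cosine_difference(weight_vectors[i], weight_vectors[j])
             for j in range(n)] for i in range(n)]
```

A dimension with small variance receives a large weight, so two points whose weight vectors emphasise the same dimensions have a difference value near 0, and points that emphasise disjoint dimensions have a value near 1.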
Establishing the local subspace difference-geodesic distance projection in step C specifically uses the MDS algorithm: the data points in the space are mapped to the x-axis by the local subspace difference metric and to the y-axis by the geodesic-distance data point correlation metric.
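One way to obtain a single coordinate per point from a pairwise distance matrix is one-dimensional classical (Torgerson) MDS. The text only says "MDS algorithm", so the classical variant and the power iteration used below are assumptions chosen to keep the sketch dependency-free. The function name `mds_1d` is hypothetical.

```python
import math


def mds_1d(dist, iters=500):
    """One-dimensional classical MDS of a symmetric distance matrix (list of lists)."""
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row = [sum(r) / n for r in sq]
    tot = sum(row) / n
    # Double centering: B = -1/2 * J * D^2 * J
    b = [[-0.5 * (sq[i][j] - row[i] - row[j] + tot) for j in range(n)] for i in range(n)]
    # Power iteration for the dominant eigenvector of B
    v = [float(i + 1) for i in range(n)]
    for _ in range(iters):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return [0.0] * n  # all points coincide
        v = [x / norm for x in w]
    lam = sum(v[i] * sum(b[i][j] * v[j] for j in range(n)) for i in range(n))
    return [math.sqrt(max(lam, 0.0)) * x for x in v]
```

Under these assumptions, the x coordinates of the projection would come from calling `mds_1d` on the local subspace difference matrix and the y coordinates from calling it on the geodesic distance matrix. The sign of each axis is arbitrary, as with any MDS embedding.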
Establishing the mapping of the clustered point cluster in step S2 specifically uses the following steps:
(I) Extract the information of the data points circled in the local subspace difference-geodesic distance projection established in step S1; the circling is performed interactively by the user and yields the data point set to be processed;
(II) Compute the eigenvectors of the data point set from step (I) using PCA (Principal Component Analysis);
(III) Select the two eigenvectors with the smallest eigenvalues from step (II) to generate the mapping transformation onto a two-dimensional plane, thereby completing the mapping of the clustered point cluster.
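Steps (II) and (III) can be sketched without external libraries by diagonalising the covariance matrix with Jacobi rotations. The eigen-solver is an assumption (the text names only PCA), and the function names are hypothetical. The selection of the two smallest eigenvalues follows the text as written.

```python
import math


def covariance(points):
    """Population covariance matrix and the mean-centred points."""
    n, d = len(points), len(points[0])
    mean = [sum(p[i] for p in points) / n for i in range(d)]
    centered = [[p[i] - mean[i] for i in range(d)] for p in points]
    cov = [[sum(r[i] * r[j] for r in centered) / n for j in range(d)] for i in range(d)]
    return cov, centered


def jacobi_eigh(matrix, sweeps=50, tol=1e-20):
    """Eigenvalues and eigenvectors (as columns) of a symmetric matrix."""
    n = len(matrix)
    a = [row[:] for row in matrix]
    v = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                diff = a[q][q] - a[p][p]
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
                for k in range(n):  # accumulate eigenvectors
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p], v[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
    return [a[i][i] for i in range(n)], v


def cluster_mapping(points):
    """Project the selected points onto the two eigenvectors with the smallest eigenvalues."""
    cov, centered = covariance(points)
    values, vectors = jacobi_eigh(cov)
    order = sorted(range(len(values)), key=lambda i: values[i])
    first, second = order[0], order[1]
    d = len(values)
    return [(sum(r[k] * vectors[k][first] for k in range(d)),
             sum(r[k] * vectors[k][second] for k in range(d))) for r in centered]
```

With this selection rule the direction of largest variance is discarded, so the resulting plane shows the selected points along the dimensions in which they are most compact.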
Establishing the visual analysis view of the series of subspaces in step S3 specifically uses the following steps:
(a) Represent each subspace in a parameter-free form;
(b) Measure the similarity between subspaces;
(c) Based on the parameter-free subspace representation and the similarity measure, establish interactive operation interfaces for comparing subspaces, thereby completing the visual analysis of the high-dimensional data.
The parameter-free representation of a subspace in step (a) specifically uses the following rule:
A subspace $S_C$ containing a cluster C is defined as $\{\omega[p] : p \in C\}$, where $\omega[p]$ is the local subspace feature vector of point p.
Measuring the similarity between subspaces in step (b) specifically uses the following steps:
(I) For a d-dimensional subspace containing a cluster C of n points, define the feature vector of the subspace, where $[\omega_{i1}, \omega_{i2}, \ldots, \omega_{ij}, \ldots, \omega_{id}]$ is the subspace feature vector of point $p_i$ in cluster C;
(II) Use the cosine similarity of the feature vectors obtained in step (I) to measure the similarity of two subspaces.
Establishing the interactive operation interfaces for comparing subspaces in step (c) specifically uses the following rules.
Assume that cluster $C_1$ is contained in subspace $V_1$ and cluster $C_2$ is contained in subspace $V_2$. The operation interfaces are then established as follows:
Subspace sum interface: the result of subspace $V_1 + V_2$ is $C_1 \cap C_2$, and the active dimensions of $V_1 + V_2$ are the union of the active dimensions of $V_1$ and $V_2$;
Subspace intersection interface: the result of subspace $V_1 \cap V_2$ is $C_1 \cup C_2$, and the active dimensions of $V_1 \cap V_2$ are the intersection of the active dimensions of $V_1$ and $V_2$;
Subspace list sorting interface: given a selected set of dimensions, the similarity of the subspaces is judged from their feature vectors on the selected dimensions, and a one-dimensional MDS (multidimensional scaling) algorithm maps the set of subspaces onto a one-dimensional coordinate axis, forming a sorted list of subspaces.
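The sum and intersection rules above translate directly into set operations. The sketch below is illustrative only; the `Subspace` class and its field names are hypothetical, and a subspace is reduced to its set of active dimensions plus the set of point identifiers in its cluster.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subspace:
    dims: frozenset     # indices of the active dimensions
    cluster: frozenset  # identifiers of the points in the contained cluster


def subspace_sum(v1, v2):
    """V1 + V2: union of active dimensions, intersection of clusters."""
    return Subspace(v1.dims | v2.dims, v1.cluster & v2.cluster)


def subspace_intersection(v1, v2):
    """V1 ∩ V2: intersection of active dimensions, union of clusters."""
    return Subspace(v1.dims & v2.dims, v1.cluster | v2.cluster)


v1 = Subspace(frozenset({0, 1}), frozenset({"a", "b", "c"}))
v2 = Subspace(frozenset({1, 2}), frozenset({"b", "c", "d"}))
print(subspace_sum(v1, v2))           # more dimensions, fewer points
print(subspace_intersection(v1, v2))  # fewer dimensions, more points
```

The two operations pull in opposite directions: adding dimensions can only keep or shrink the set of points that cluster together, and dropping dimensions can only keep or enlarge it.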
The invention also discloses an analysis system implementing the visual analysis method for high-dimensional data. The system comprises three modules connected in series: a local subspace difference-geodesic distance projection establishing module, a clustered point cluster mapping establishing module, and a series-subspace visual analysis view establishing module. The projection establishing module establishes the local subspace difference-geodesic distance projection on the original high-dimensional data and passes it to the mapping establishing module. The mapping establishing module establishes the mapping of the clustered point cluster according to the local subspace difference-geodesic distance projection and passes it to the view establishing module. The view establishing module establishes the visual analysis view of the series of subspaces according to the established local subspace difference-geodesic distance projection and the mapping of the clustered point cluster.
By establishing the local subspace difference-geodesic distance projection, the mapping of the clustered point cluster, and the visual analysis view of the series of subspaces, the visual analysis method and analysis system provided by the invention offer a series of interactive visual analysis operations and a reliable technical basis for visual subspace clustering and analysis. They can effectively guide and help users to analyze and explore high-dimensional data, significantly reduce the number of trial-and-error attempts during high-dimensional data processing, reduce data redundancy, strengthen the interactivity of the analysis process, and improve the reliability of the results.
Brief description of the drawings
Fig. 1 is a flowchart of the method of the invention.
Fig. 2 is a functional block diagram of the system of the invention.
Detailed description
Fig. 1 shows the flowchart of the method of the invention. The visual analysis method for high-dimensional data provided by the invention comprises the following steps:
S1. Establish a local subspace difference-geodesic distance projection on the original high-dimensional data, specifically using the following steps:
A. For the high-dimensional data to be projected, establish a data point correlation metric based on geodesic distance, specifically using the following steps:
I. Construct an SNN graph with several connected components from the high-dimensional data set to be projected. An SNN graph is a subgraph of the k-NN graph: an edge joins points p and q if and only if p and q are k-nearest neighbors of each other;
II. For the connected components from step I, connect every two separate connected components, specifically by connecting the two closest data points of the two components;
III. Compute the shortest distance between any two points with a shortest-path algorithm, thereby obtaining the geodesic distance;
B. Establish a local subspace difference metric according to the metric established in step A, specifically using the following steps:
1) Compute the weight of each dimension using the following formula:
$$\omega = [\omega_1, \omega_2, \ldots, \omega_i, \ldots, \omega_d], \qquad \omega_i = \frac{1/\sigma_i}{\sum_{j=1}^{d} 1/\sigma_j}$$
where $\omega$ is the dimension weight vector, $\omega_i$ is the weight of the i-th dimension, $\sigma_i$ is the variance of the data points in the i-th dimension, and d is the number of dimensions;
2) Compute the weighted distance between any two points in the SNN graph using the following formula:
$$d_{pq}[W] = \max\{d_{pq}[\omega_p],\ d_{pq}[\omega_q]\}$$
where $d_{pq}[W]$ is the weighted distance between point p and point q, $\omega_p = [\omega_{p1}, \omega_{p2}, \ldots, \omega_{pi}, \ldots, \omega_{pd}]$ is the local subspace feature vector of point p, $\omega_q = [\omega_{q1}, \omega_{q2}, \ldots, \omega_{qi}, \ldots, \omega_{qd}]$ is the local subspace feature vector of point q, $d_i$ is the Euclidean distance between point p and point q in the i-th dimension, $d_{pq}[\omega_p]$ is the weighted distance of point q relative to point p, and $d_{pq}[\omega_q]$ is the weighted distance of point p relative to point q;
3) Based on cosine similarity, establish the difference matrix according to the following formula:
$$d^{s}_{(p_i,p_j)} = 1 - \frac{\sum_{k=1}^{d} \omega_{ik} \cdot \omega_{jk}}{\sqrt{\sum_{k=1}^{d} \omega_{ik}^{2}} \cdot \sqrt{\sum_{k=1}^{d} \omega_{jk}^{2}}}$$
where $d^{s}_{(p_i,p_j)}$ is the cosine-similarity-based difference value between points $p_i$ and $p_j$; i and j are data point indices in the range [0, n), and n is the size of the data set;
C. Establish the local subspace difference-geodesic distance projection according to the metrics established in steps A and B, specifically using the MDS algorithm: the data points in the space are mapped to the x-axis by the local subspace difference metric and to the y-axis by the geodesic-distance data point correlation metric. The x-axis thus represents the local subspace difference and the y-axis represents the geodesic distance metric. In the local subspace difference-geodesic distance projection, each cluster occupies its own region;
S2. Establish a mapping of the clustered point cluster in the local subspace difference-geodesic distance projection, specifically using the following steps:
(I) Extract the information of the data points circled in the local subspace difference-geodesic distance projection established in step S1; the data points are circled by the user and form the data point set to be processed;
(II) Compute the eigenvectors of the data point set from step (I) using PCA (Principal Component Analysis);
(III) Select the two eigenvectors with the smallest eigenvalues from step (II) to generate the mapping transformation onto a two-dimensional plane, thereby completing the mapping of the clustered point cluster;
S3. Establish a visual analysis view of the series of subspaces according to the local subspace difference-geodesic distance projection obtained in step S1 and the mapping of the clustered point cluster obtained in step S2, specifically using the following steps:
(a) Represent each subspace in a parameter-free form, specifically using the following rule:
A subspace $S_C$ containing a cluster C is defined as $\{\omega[p] : p \in C\}$, where $\omega[p]$ is the local subspace feature vector of point p;
(b) Measure the similarity between subspaces, specifically using the following steps:
(I) For a d-dimensional subspace containing a cluster C of n points, define the feature vector of the subspace, where $[\omega_{i1}, \omega_{i2}, \ldots, \omega_{ij}, \ldots, \omega_{id}]$ is the subspace feature vector of point $p_i$ in cluster C;
(II) Use the cosine similarity of the feature vectors obtained in step (I) to measure the similarity of two subspaces;
(c) Based on the parameter-free subspace representation and the similarity measure, establish interactive operation interfaces for comparing subspaces, thereby completing the visual analysis of the high-dimensional data. The operation interfaces are established using the following rules.
Assume that cluster $C_1$ is contained in subspace $V_1$ and cluster $C_2$ is contained in subspace $V_2$. Then:
Subspace sum interface: the result of subspace $V_1 + V_2$ is $C_1 \cap C_2$, and the active dimensions of $V_1 + V_2$ are the union of the active dimensions of $V_1$ and $V_2$;
Subspace intersection interface: the result of subspace $V_1 \cap V_2$ is $C_1 \cup C_2$, and the active dimensions of $V_1 \cap V_2$ are the intersection of the active dimensions of $V_1$ and $V_2$;
Subspace list sorting interface: given a selected set of dimensions, the similarity of the subspaces is judged from their feature vectors on the selected dimensions, and a one-dimensional MDS (multidimensional scaling) algorithm maps the set of subspaces onto a one-dimensional coordinate axis, forming a sorted list of subspaces.
Fig. 2 shows the functional block diagram of the system of the invention. The invention also discloses an analysis system implementing the visual analysis method for high-dimensional data. The system comprises three modules connected in series: a local subspace difference-geodesic distance projection establishing module, a clustered point cluster mapping establishing module, and a series-subspace visual analysis view establishing module. The projection establishing module establishes the local subspace difference-geodesic distance projection on the original high-dimensional data and passes it to the mapping establishing module. The mapping establishing module establishes the mapping of the clustered point cluster according to the local subspace difference-geodesic distance projection and passes it to the view establishing module. The view establishing module establishes the visual analysis view of the series of subspaces according to the established local subspace difference-geodesic distance projection and the mapping of the clustered point cluster.
Claims (10)
1. A visual analysis method for high-dimensional data, comprising the following steps:
S1. Establish a local subspace difference-geodesic distance projection on the original high-dimensional data;
S2. Establish a mapping of the clustered point cluster in the local subspace difference-geodesic distance projection;
S3. Establish a visual analysis view of the series of subspaces according to the local subspace difference-geodesic distance projection obtained in step S1 and the mapping of the clustered point cluster obtained in step S2.
2. The visual analysis method for high-dimensional data according to claim 1, characterized in that establishing the local subspace difference-geodesic distance projection on the original high-dimensional data in step S1 specifically uses the following steps:
A. For the high-dimensional data to be projected, establish a data point correlation metric based on geodesic distance;
B. Establish a local subspace difference metric according to the metric established in step A;
C. Establish the local subspace difference-geodesic distance projection according to the metrics established in steps A and B.
3. The visual analysis method for high-dimensional data according to claim 2, characterized in that establishing the data point correlation metric based on geodesic distance in step A specifically uses the following steps:
I. Construct an SNN graph with several connected components from the high-dimensional data set to be projected;
II. For the connected components from step I, connect every two separate connected components;
III. Compute the shortest distance between any two points, thereby obtaining the geodesic distance.
4. The visual analysis method for high-dimensional data according to claim 3, characterized in that establishing the local subspace difference metric in step B specifically uses the following steps:
1) Compute the weight of each dimension using the following formula:
$$\omega = [\omega_1, \omega_2, \ldots, \omega_i, \ldots, \omega_d], \qquad \omega_i = \frac{1/\sigma_i}{\sum_{j=1}^{d} 1/\sigma_j}$$
where $\omega$ is the dimension weight vector, $\omega_i$ is the weight of the i-th dimension, $\sigma_i$ is the variance of the data points in the i-th dimension, and d is the number of dimensions;
2) Compute the weighted distance between any two points in the SNN graph using the following formula:
$$d_{pq}[W] = \max\{d_{pq}[\omega_p],\ d_{pq}[\omega_q]\}$$
where $d_{pq}[W]$ is the weighted distance between point p and point q, $\omega_p = [\omega_{p1}, \omega_{p2}, \ldots, \omega_{pi}, \ldots, \omega_{pd}]$ is the local subspace feature vector of point p, $\omega_q = [\omega_{q1}, \omega_{q2}, \ldots, \omega_{qi}, \ldots, \omega_{qd}]$ is the local subspace feature vector of point q, $d_i$ is the Euclidean distance between point p and point q in the i-th dimension, $d_{pq}[\omega_p]$ is the weighted distance of point q relative to point p, and $d_{pq}[\omega_q]$ is the weighted distance of point p relative to point q;
3) Based on cosine similarity, establish the difference matrix according to the following formula:
$$d^{s}_{(p_i,p_j)} = 1 - \frac{\sum_{k=1}^{d} \omega_{ik} \cdot \omega_{jk}}{\sqrt{\sum_{k=1}^{d} \omega_{ik}^{2}} \cdot \sqrt{\sum_{k=1}^{d} \omega_{jk}^{2}}}$$
where $d^{s}_{(p_i,p_j)}$ is the cosine-similarity-based difference value between points $p_i$ and $p_j$; i and j are data point indices in the range [0, n), and n is the size of the data set.
5. The visual analysis method for high-dimensional data according to claim 4, characterized in that establishing the mapping of the clustered point cluster in step S2 specifically uses the following steps:
(I) Extract the information of the data points circled in the local subspace difference-geodesic distance projection established in step S1; the data points are circled interactively by the user, yielding the data point set to be processed;
(II) Compute the eigenvectors of the data point set from step (I) using PCA;
(III) Select the two eigenvectors with the smallest eigenvalues from step (II) to generate the mapping transformation onto a two-dimensional plane, thereby completing the mapping of the clustered point cluster.
6. The visual analysis method for high-dimensional data according to claim 5, characterized in that establishing the visual analysis view of the series of subspaces in step S3 specifically uses the following steps:
(a) Represent each subspace in a parameter-free form;
(b) Measure the similarity between subspaces;
(c) Based on the parameter-free subspace representation and the similarity measure, establish interactive operation interfaces for comparing subspaces, thereby completing the visual analysis of the high-dimensional data.
7. The visual analysis method for high-dimensional data according to claim 6, characterized in that the parameter-free representation of a subspace in step (a) specifically uses the following rule:
A subspace $S_C$ containing a cluster C is defined as $\{\omega[p] : p \in C\}$, where $\omega[p]$ is the local subspace feature vector of point p.
8. The visual analysis method for high-dimensional data according to claim 7, characterized in that measuring the similarity between subspaces in step (b) specifically uses the following steps:
(I) For a d-dimensional subspace containing a cluster C of n points, define the feature vector of the subspace, where $[\omega_{i1}, \omega_{i2}, \ldots, \omega_{ij}, \ldots, \omega_{id}]$ is the subspace feature vector of point $p_i$ in cluster C;
(II) Use the cosine similarity of the feature vectors obtained in step (I) to measure the similarity of two subspaces.
9. The visual analysis method for high-dimensional data according to claim 8, characterized in that establishing the interactive operation interfaces for comparing subspaces in step (c) specifically uses the following rules:
Assume that cluster $C_1$ is contained in subspace $V_1$ and cluster $C_2$ is contained in subspace $V_2$. The operation interfaces are then established as follows:
Subspace sum interface: the result of subspace $V_1 + V_2$ is $C_1 \cap C_2$, and the active dimensions of $V_1 + V_2$ are the union of the active dimensions of $V_1$ and $V_2$;
Subspace intersection interface: the result of subspace $V_1 \cap V_2$ is $C_1 \cup C_2$, and the active dimensions of $V_1 \cap V_2$ are the intersection of the active dimensions of $V_1$ and $V_2$;
Subspace list sorting interface: given a selected set of dimensions, the similarity of the subspaces is judged from their feature vectors on the selected dimensions, and a one-dimensional MDS algorithm maps the set of subspaces onto a one-dimensional coordinate axis, forming a sorted list of subspaces.
10. An analysis system implementing the visual analysis method for high-dimensional data according to any one of claims 1 to 9, characterized by comprising, connected in series, a local subspace difference-geodesic distance projection establishing module, a clustered point cluster mapping establishing module, and a series-subspace visual analysis view establishing module; the projection establishing module establishes the local subspace difference-geodesic distance projection on the original high-dimensional data and passes it to the mapping establishing module; the mapping establishing module establishes the mapping of the clustered point cluster according to the local subspace difference-geodesic distance projection and passes it to the view establishing module; the view establishing module establishes the visual analysis view of the series of subspaces according to the established local subspace difference-geodesic distance projection and the mapping of the clustered point cluster.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710620143.6A CN107368599B (en) | 2017-07-26 | 2017-07-26 | Visual analysis method and system for high-dimensional data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710620143.6A CN107368599B (en) | 2017-07-26 | 2017-07-26 | Visual analysis method and system for high-dimensional data |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107368599A (en) | 2017-11-21 |
CN107368599B (en) | 2020-06-23 |
Family
ID=60307219
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710620143.6A Expired - Fee Related CN107368599B (en) | 2017-07-26 | 2017-07-26 | Visual analysis method and system for high-dimensional data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107368599B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040241709A1 (en) * | 2000-03-22 | 2004-12-02 | Patterson David E. | Visualizing high dimensional descriptors of molecular structures |
CN102163224A (en) * | 2011-04-06 | 2011-08-24 | 中南大学 | Adaptive spatial clustering method |
CN105868352A (en) * | 2016-03-29 | 2016-08-17 | 天津大学 | High-dimensional data dimension ordering method based on dimension correlation analysis |
CN106203516A (en) * | 2016-07-13 | 2016-12-07 | 中南大学 | Subspace clustering visual analysis method based on dimension dependency |
Non-Patent Citations (1)
Title |
---|
夏佳志: "一种基于子空间聚类的局部相关可视分析方法", 《计算机辅助设计与图形学学报》 * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108021664A (en) * | 2017-12-04 | 2018-05-11 | 北京工商大学 | Multidimensional data correlation visual analysis method and system based on dimension projection |
CN108021664B (en) * | 2017-12-04 | 2020-05-05 | 北京工商大学 | Multidimensional data correlation visual analysis method and system based on dimension projection |
CN108090182A (en) * | 2017-12-15 | 2018-05-29 | 清华大学 | Distributed index method and system for large-scale high-dimensional data |
CN111428631A (en) * | 2020-03-23 | 2020-07-17 | 中南大学 | Visual identification and sorting method for flight control signals of unmanned aerial vehicle |
CN111428631B (en) * | 2020-03-23 | 2023-05-05 | 中南大学 | Visual identification and sorting method for unmanned aerial vehicle flight control signals |
CN115952426A (en) * | 2023-03-10 | 2023-04-11 | 中南大学 | Distributed noise data clustering method based on random sampling and user classification method |
Also Published As
Publication number | Publication date |
---|---|
CN107368599B (en) | 2020-06-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107368599A (en) | Visual analysis method and analysis system for high-dimensional data | |
US10657686B2 (en) | Gragnostics rendering | |
CN105740651B (en) | Construction method for a differentially expressed gene regulatory network of a specific cancer | |
CN107369183A (en) | MAR tracking registration method and system based on graph-optimization SLAM | |
CN108428015B (en) | Wind power prediction method based on historical meteorological data and random simulation | |
CN102629275A (en) | Face and name alignment method and system for cross-media news retrieval | |
CN109447100A (en) | Three-dimensional point cloud recognition method based on B-spline surface similarity detection | |
CN115311730B (en) | Face key point detection method and system and electronic equipment | |
CN112085072A (en) | Cross-modal retrieval method of sketch retrieval three-dimensional model based on space-time characteristic information | |
CN105205135A (en) | 3D (three-dimensional) model retrieving method based on topic model and retrieving device thereof | |
Zhao et al. | Non-aligned multi-view multi-label classification via learning view-specific labels | |
Liang et al. | MVCLN: multi-view convolutional LSTM network for cross-media 3D shape recognition | |
CN114926742A (en) | Loop detection and optimization method based on second-order attention mechanism | |
Hou et al. | Fast multi-view outlier detection via deep encoder | |
Yu et al. | A novel multi-feature representation of images for heterogeneous IoTs | |
CN111914912B (en) | Cross-domain multi-view target identification method based on twin condition countermeasure network | |
Liu et al. | Multi-modal fusion based on depth adaptive mechanism for 3D object detection | |
CN111724298A (en) | Dictionary optimization and mapping method for digital rock core super-dimensional reconstruction | |
Pan et al. | A kernel-based probabilistic collaborative representation for face recognition | |
Qv et al. | LG: A clustering framework supported by point proximity relations | |
CN115273177A (en) | Method, device and equipment for recognizing face types of heterogeneous faces and storage medium | |
N'Cir et al. | Kernel overlapping k-means for clustering in feature space | |
Li et al. | Fuzzy granule manifold alignment preserving local topology | |
CN111353538A (en) | Similar image matching method based on deep learning | |
CN107423763A (en) | Two-dimensional projection method and projection system for high-dimensional data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20200623; Termination date: 20210726 |