CN110175191A - Data filtering rule modeling method in data analysis - Google Patents
- Publication number
- CN110175191A
- Authority
- CN
- China
- Prior art keywords
- data
- column
- analysis
- type
- cnt
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2457—Query processing with adaptation to user needs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2465—Query processing support for facilitating data mining operations in structured databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2474—Sequence data queries, e.g. querying versioned data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/248—Presentation of query results
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Computational Linguistics (AREA)
- Fuzzy Systems (AREA)
- Mathematical Physics (AREA)
- Probability & Statistics with Applications (AREA)
- Software Systems (AREA)
- Complex Calculations (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention belongs to the technical field of data analysis and specifically concerns a data filtering rule modeling method for data analysis. The method comprises three main parts: (1) data column analysis and filtering; (2) data range analysis and filtering; (3) automatic visualization of the result set. By setting appropriate rules, the invention addresses how to use data filtering rules to build an analysis-and-filtering model in data analysis, and uses the model to analyze, filter, and intuitively display data. The invention helps users quickly screen data, find data subsets of interest, and analyze and mine the relationships between data items.
Description
Technical field
The invention belongs to data analysis technique fields, and in particular to the data filtering rule modeling method in data analysis.
Background technique
In the data ubiquitous epoch, the decision of user is increasingly by the driving of data.It is analyzed typically for data
As a result difference tends to significantly affect decision process.Select improper data, it is either intentional still unintentionally, may cause
The decision of mistake, misleading or " fragility ".For the user for having no data analysis experience particularly with data analysis, these are bad
The result of data analysis may result in serious economic loss.So guidance user carries out good data selection energy band to use
The data investigative analysis of family better quality is experienced.
In order to enable the user of no data analysis experience to eliminate as much as the Data Mining process of error and numerous of being easy
Trivial analysis filter condition setting, it is flat-footed to obtain good data analysis filter effect.There is no doubt that we need
A standardized process is wanted to determine how this carries out the selection of the filter analysis of data, how to be automated according to the feature of data
Carry out data filtering rule modeling.
Summary of the invention
The purpose of the invention is to provide a data filtering rule modeling method for interactive data exploration scenarios, so that the data in a data set can be analyzed and mined quickly and users can explore and analyze the data more easily.
For recommendation rule modeling on a data set, the desired characteristics are as follows:
1. Interpretability: recommendations are generated appropriately inside a visualization system;
2. Feasibility: the generated recommendations should be analytically meaningful and able to uncover potential associations between data;
3. Qualitative: given the nature of user exploration, the model must be built efficiently and robustly.
The data filtering rule modeling method provided by the invention comprises the following steps:
(1) Given a data set D consisting of a large amount of data, the importance of each data column is calculated using random forest feature selection, taking into account whether the user has specified key data. The detailed process is as follows:
(1.1) The importance score (variable importance measure) is denoted VIM and is expressed through the Gini index GI. Suppose there are m data columns $X_1, X_2, X_3, \ldots, X_m$. The Gini index score $VIM_j^{(Gini)}$ of each column $X_j$ is to be calculated, that is, the average change in node-splitting impurity of the j-th column over all decision trees of the random forest (RF). The Gini index is:
$$GI_m = \sum_{k=1}^{K} p_{mk}\, p_{mk'} = 1 - \sum_{k=1}^{K} p_{mk}^2$$
where K is the number of classes at node m in the decision trees of the RF, $p_{mk}$ is the proportion of class k at node m, and $p_{mk'} = 1 - p_{mk}$ is the complement of that proportion. Intuitively, $GI_m$ is the probability that two samples drawn at random from node m have different class labels.
(1.2) The importance of data column $X_j$ at node m, that is, the change in the Gini index before and after the split at node m, is:
$$VIM_{jm}^{(Gini)} = GI_m - GI_l - GI_r$$
where $GI_l$ and $GI_r$ denote the Gini indices of the two new nodes after the split.
(1.3) If the nodes at which data column $X_j$ appears in decision tree i form the set M, the importance of $X_j$ in the i-th tree is:
$$VIM_{ij}^{(Gini)} = \sum_{m \in M} VIM_{jm}^{(Gini)}$$
(1.4) If the random forest contains n trees, the importance of data column $X_j$ is:
$$VIM_j^{(Gini)} = \sum_{i=1}^{n} VIM_{ij}^{(Gini)}$$
(1.5) The columns are ranked by the calculated importance, and the two most important data columns are returned to the user as the analysis filter result. They are denoted A and B, where A ranks higher than B.
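The importance calculation of step (1) can be illustrated with a minimal Python sketch. It assumes the splits of an already trained forest are available as plain records; the record layout, the column names, and the function names are hypothetical and not part of the patent. It also assumes the change at a node is taken as the parent's Gini index minus the Gini indices of its two children, without weighting by node size, which differs from what many libraries do.

```python
from collections import Counter, defaultdict


def gini(labels):
    """Gini index of a node: 1 minus the sum of squared class proportions."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in Counter(labels).values())


def node_importance(parent, left, right):
    """Change in Gini index across one split (step 1.2), unweighted."""
    return gini(parent) - gini(left) - gini(right)


def column_importance(splits):
    """Sum the per-node changes of each column over all trees (steps 1.3-1.4)."""
    scores = defaultdict(float)
    for _tree_id, column, parent, left, right in splits:
        scores[column] += node_importance(parent, left, right)
    return dict(scores)


def top_two(scores):
    """Return the two most important columns, A then B (step 1.5)."""
    return sorted(scores, key=scores.get, reverse=True)[:2]


# Hypothetical split records:
# (tree id, splitting column, labels at the node, labels left, labels right)
splits = [
    (0, "sales_date", list("aabb"), list("aa"), list("bb")),
    (0, "price", list("aabb"), list("aab"), list("b")),
    (1, "region", list("aabb"), list("ab"), list("ab")),
    (1, "sales_date", list("aab"), list("aa"), list("b")),
]
scores = column_importance(splits)
a, b = top_two(scores)
```

In practice the per-node label sets would come from a fitted random forest rather than hand-written records; the sketch only shows how the per-node changes are aggregated and ranked.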
(2) Data range analysis and filtering. The invention describes how data range analysis and filtering is carried out for the two columns A and B. The detailed process is as follows:
(2.1) The data of columns A and B are first classified into three types: numeric type N, discrete-value type X, and time-series type T. Numeric data N are first discretized: the data are binned, each bin is denoted n', and the count of each bin is denoted CNT(n'). For discrete-value data X, the count of each discrete value is denoted CNT(x).
Because time-series data are often seasonal, the invention automatically divides the data into time bins according to the time range of data column T; each time bin obtained by binning column T is denoted t'. For example, if the data in T span 2017 to 2019, the time bins t' are divided by year; if T contains only data from 2019, the bins are divided by month; similarly, if T contains only data from January 2019, the bins are divided by day.
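The binning rules of step (2.1) can be sketched in Python as follows. The function names are hypothetical, and equal-width binning is an assumption: the text only says that numeric data are binned, not how.

```python
from collections import Counter
from datetime import date


def time_unit(dates):
    """Pick the time-bin unit from the range covered by the column."""
    lo, hi = min(dates), max(dates)
    if lo.year != hi.year:
        return "year"    # data span several years
    if lo.month != hi.month:
        return "month"   # data lie within a single year
    return "day"         # data lie within a single month


def time_bin(d, unit):
    """Map a date to its time bin t' for the chosen unit."""
    if unit == "year":
        return (d.year,)
    if unit == "month":
        return (d.year, d.month)
    return (d.year, d.month, d.day)


def numeric_bin_counts(values, k):
    """CNT(n') for k equal-width bins (equal width is an assumption)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k or 1.0  # avoid a zero width when all values are equal
    counts = Counter()
    for v in values:
        index = min(int((v - lo) / width), k - 1)  # the maximum goes in the last bin
        counts[index] += 1
    return counts


dates = [date(2019, 1, 5), date(2019, 7, 1), date(2019, 7, 9)]
unit = time_unit(dates)
time_counts = Counter(time_bin(d, unit) for d in dates)
price_counts = numeric_bin_counts([0, 1, 2, 3, 4, 5, 6, 7, 8, 10], 5)
```

Here the three dates all fall in 2019 but in different months, so they are binned by month.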
(2.2) From the three data types, two combined analysis-and-filtering models are formed, and data filtering analysis is performed on data set D (every "/" below means "or", not division). Specifically:
(2.2.1) A is time-series data and B is discrete-value or numeric data. Based on the unit of the time bins t' obtained in (2.1), an appropriate recent period is chosen for A as the first filter condition $t_{recent}$ (for example, the last three years, the last six months, or the last seven days; if there is not enough data, this filter is not generated). The data set filtered by the condition on column A is denoted $D^*$. Filtering data column B yields the discrete data column $B^*$ with values $x_1^*, x_2^*, \ldots, x_k^*$, or the numeric data column $B^*$, which is binned again to obtain $(n_1^*)', (n_2^*)', \ldots, (n_k^*)'$, where k is the number of bins. The three largest counts among $x^*/(n^*)'$, written $CNT(x^*)_{top3}/CNT((n^*)')_{top3}$, identify three discrete values $x_{max}^*$ or the numeric ranges of three bins $(n_{max}^*)'$, which serve as the second filter condition. The intersection of the two filter conditions, $t_{recent} \cap x_{max}^*/(n_{max}^*)'$, is the analysis filter condition of the combined model and is used to perform data filtering analysis on data set D.
(2.2.2) A is discrete-value or numeric data and B is time-series data. The count CNT(x)/CNT(n') of each discrete value or bin of A is calculated, and the five values $x_{top5}$ or bins $(n_{top5})'$ with the highest counts are chosen (if there are too few discrete values or bins, this filter is not generated); the corresponding value ranges serve as the first filter condition. The data set filtered by the condition on column A is denoted $D^*$. The time range $t_{max}$ of data column $B^*$ corresponding to the value $x_{max}$ or bin $(n_{max})'$ with the highest count in A is chosen as the second filter condition. The intersection of the two filter conditions, $x_{top5}/(n_{top5})' \cap t_{max}$, is the analysis filter condition of the combined model and is used to perform data filtering analysis on data set D.
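Case (2.2.1) with a discrete-value column B can be sketched as below. This is an illustration under stated assumptions, not the patented implementation: rows are hypothetical `(date, category)` pairs, the recent period is fixed to the last seven days, and the function name is invented for the example.

```python
from collections import Counter
from datetime import date, timedelta


def filter_recent_top_values(rows, days=7, top=3):
    """Intersect a recent-time condition on A with the most frequent values of B."""
    latest = max(d for d, _ in rows)
    cutoff = latest - timedelta(days=days)
    # First condition t_recent: keep only the most recent `days` days (this is D*).
    recent = [(d, x) for d, x in rows if d > cutoff]
    # Second condition: the `top` values of B with the largest counts within D*.
    counts = Counter(x for _, x in recent)
    top_values = {x for x, _ in counts.most_common(top)}
    # Combined condition: rows that satisfy both.
    return [(d, x) for d, x in recent if x in top_values]


rows = [
    (date(2019, 5, 14), "a"), (date(2019, 5, 13), "a"), (date(2019, 5, 12), "a"),
    (date(2019, 5, 12), "b"), (date(2019, 5, 11), "b"),
    (date(2019, 5, 10), "c"), (date(2019, 5, 9), "c"),
    (date(2019, 5, 8), "d"),
    (date(2019, 5, 7), "e"), (date(2019, 5, 1), "e"), (date(2019, 5, 1), "e"),
]
result = filter_recent_top_values(rows)
```

The value "e" is the most frequent over the whole data set but falls outside the recent window, so it is excluded; "d" is recent but not among the three most frequent recent values. Case (2.2.2) would apply the same idea with the roles of the two columns exchanged.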
(3) To present the analyzed and filtered data to the user, the invention automatically visualizes the result data set obtained from the analysis and filtering of steps (1) and (2). The detailed process is as follows:
(3.1) For visualization, the following are obtained from the result data set: the cardinality d(X) of column X; the maximum value max(X) and minimum value min(X) of column X; the number of records |X| of column X; the data type type(X) of column X; the count CNT(x') of each bin x' of column X (each discrete value of a discrete-value column X can be regarded as a bin); and the correlation coefficient correlation(x', CNT(x')) between each bin x' and its count CNT(x').
(3.2) A set of pruning rules is defined according to the column type type(X) obtained in (3.1): when the data type of column X is time-series, the visualization chart can be a bar chart or a line chart; when the data type of column X is discrete-value or numeric, the visualization chart can be a bar chart, a pie chart, or a scatter plot.
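The column statistics of (3.1) and the pruning rules of (3.2) can be sketched as a small Python helper. The type labels, chart labels, dictionary keys, and the type-detection heuristic are all illustrative assumptions; "bar" stands for the bar (column) chart of the text.

```python
from collections import Counter
from datetime import date

# Pruning rules of (3.2): candidate charts per column type (labels are illustrative).
CANDIDATE_CHARTS = {
    "temporal": ["bar", "line"],
    "discrete": ["bar", "pie", "scatter"],
    "numeric": ["bar", "pie", "scatter"],
}


def column_type(values):
    """Rough type detection; a real system would rely on schema information."""
    if all(isinstance(v, date) for v in values):
        return "temporal"
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        return "numeric"
    return "discrete"


def summarize(values):
    """Collect the per-column statistics listed in (3.1)."""
    counts = Counter(values)
    kind = column_type(values)
    return {
        "cardinality": len(counts),   # d(X)
        "max": max(values),           # max(X)
        "min": min(values),           # min(X)
        "records": len(values),       # |X|
        "type": kind,                 # type(X)
        "counts": dict(counts),       # CNT(x'), one bin per distinct value
        "charts": CANDIDATE_CHARTS[kind],
    }


summary = summarize(["north", "south", "north", "east"])
```

For simplicity the sketch treats every distinct value as its own bin; numeric columns would normally be binned first, as in step (2.1).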
(3.3) The invention proposes a data analysis method, relative information entropy, to determine how the result data set obtained from steps (1) and (2) is visualized automatically. The core idea is to calculate, for each data column X, the ratio of the information entropy of each chart type to the information entropy of the standardized chart, denoted $C(X)_1, C(X)_2, \ldots, C(X)_k$. The relative information entropies are compared, and the chart type corresponding to the maximum value $C(X)_{max}$ is the visualization type for data column X. The specific approach is as follows:
(3.3.1) The bar chart is one of the charts analysts use most often; the height differences between bars make differences in the data easier for users to recognize. Bar charts suit a wide range of scenarios and show the details of the data well when there are many elements x' (that is, many bins). The relative information entropy of the bar chart is calculated from the cardinality d(X) of column X, where |d(X)| denotes the value of the cardinality d(X);
(3.3.2) The pie chart can show multiple groups of data and the share of each group in the total. In a pie chart, the counts CNT(x') must be sufficiently distinct to highlight the share of each part, so the Shannon entropy $H = -\sum_{y} P(y) \log P(y)$ is introduced as part of the criterion, where y ranges over the values of CNT(x') and P(y) is the share of y, that is, the probability that y occurs in CNT(x');
(3.3.3) The advantage of the line chart is that it reflects how the same thing develops and changes over time. When the data CNT(x') and x' follow a certain distribution (for example, a linear, exponential, logarithmic, or low-order power distribution), the expression of the distribution is denoted distribution(x', CNT(x')) and the information entropy C(X) is 1; otherwise, the information entropy C(X) is 0:
C(X) = distribution(x', CNT(x'));
(3.3.4) The scatter plot represents the relationship between two variables on coordinate axes; it is scored using the correlation coefficient correlation(x', CNT(x')):
C(X) = correlation(x', CNT(x')).
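Two of the ingredients named in (3.3.2) and (3.3.4) can be written down directly. The sketch below assumes a base-2 logarithm for the Shannon entropy and the Pearson coefficient for the correlation; the text specifies neither, and the function names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(bin_counts):
    """Entropy of the values y of CNT(x'), with P(y) the share of bins having count y."""
    n = len(bin_counts)
    shares = [c / n for c in Counter(bin_counts).values()]
    return -sum(p * math.log2(p) for p in shares)


def pearson(xs, ys):
    """Pearson correlation coefficient between bin positions x' and counts CNT(x')."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # a constant series has no defined correlation; report 0
    return cov / (norm_x * norm_y)


flat = shannon_entropy([5, 5, 5, 5])     # every bin has the same count
varied = shannon_entropy([1, 2, 3, 4])   # every bin has a different count
rising = pearson([1, 2, 3], [2, 4, 6])   # counts grow linearly with the bin
```

Identical counts give zero entropy, which matches the intuition in the text that a pie chart is informative only when the counts are distinguishable.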
(3.4) The relative information entropies of column X under the various visualization charts are compared to obtain the maximum value $C(X)_{max}$. The result data set obtained from the analysis and filtering of steps (1) and (2) is then visualized using the chart type corresponding to $C(X)_{max}$.
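The final selection in (3.4) amounts to taking the chart with the largest score among the charts allowed for the column type. A minimal sketch, assuming the scores have already been computed and using hypothetical chart labels and score values:

```python
def choose_chart(scores, candidates):
    """Return the candidate chart whose score C(X) is largest."""
    eligible = {chart: s for chart, s in scores.items() if chart in candidates}
    if not eligible:
        raise ValueError("no candidate chart has a score")
    return max(eligible, key=eligible.get)


# Hypothetical scores for one time-series column and one discrete-value column.
temporal_choice = choose_chart({"bar": 0.4, "line": 1.0, "pie": 0.9}, ["bar", "line"])
discrete_choice = choose_chart(
    {"bar": 0.4, "pie": 0.9, "scatter": 0.2}, ["bar", "pie", "scatter"]
)
```

Restricting the comparison to the candidate list keeps a high score for a chart that the pruning rules exclude, such as a pie chart for a time-series column, from being selected.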
By setting appropriate rules, the invention addresses how to use data filtering rules to build an analysis-and-filtering model in data analysis, and uses the model to analyze, filter, and intuitively display data. The invention helps users quickly screen data, find data subsets of interest, and analyze and mine the relationships between data items.
Brief description of the drawings
Fig. 1 is an example of data column analysis.
Fig. 2 shows the data analysis and filtering process.
Fig. 3 shows examples of data analysis and filtering: (a) filtering by sales date; (b) filtering by price.
Fig. 4 compares visualizations of the result data set: (a) the result data set shown as a bar chart; (b) the result data set shown as a line chart.
Fig. 5 is a flow diagram of the method of the invention.
Specific embodiment
This section introduces the invention through a concrete data analysis system.
The data set used contains 33 columns and 344,355 records. Following the process described above, the data columns and then the data range are analyzed, and the resulting data are visualized and displayed to the user. As shown in Fig. 1, the data column analysis method takes the profit column as the key column and analyzes all remaining data columns; the result is that the sales date and price columns have the highest importance.
The data filtering rule model is built according to the scheme in step (2), combining filter conditions on the target columns sales date and price. Based on this model, the data analysis system obtains the analysis data through the sequence of operations shown in Fig. 2, yielding a sales date within the last month and the largest price bin, with data range 0-57. The resulting filter results are shown in the system examples of Fig. 3.
The invention uses automated visualization: it analyzes the result data set autonomously and displays it with an appropriate visualization chart. As shown in Fig. 4, displaying the data as a bar chart (left) is less suitable, whereas the line chart (right) shows the trend better than the bar chart does. The invention therefore uses the line chart on the right to display the price column.
Claims (1)
1. A data filtering rule modeling method in data analysis, comprising the following steps:
(1) given a data set D consisting of a large amount of data, calculating the importance of each data column using random forest feature selection, taking into account whether the user has specified key data; the detailed process is as follows:
(1.1) the importance score is denoted VIM and the Gini index is denoted GI; suppose there are m data columns $X_1, X_2, X_3, \ldots, X_m$; the Gini index score $VIM_j^{(Gini)}$ of each column $X_j$ is calculated, that is, the average change in node-splitting impurity of the j-th column over all decision trees of the random forest (RF); the Gini index is:
$$GI_m = \sum_{k=1}^{K} p_{mk}\, p_{mk'} = 1 - \sum_{k=1}^{K} p_{mk}^2;$$
where K is the number of classes at node m in the decision trees of the RF, $p_{mk}$ is the proportion of class k at node m, and $p_{mk'} = 1 - p_{mk}$ is the complement of that proportion;
(1.2) the importance of data column $X_j$ at node m, that is, the change in the Gini index before and after the split at node m, is:
$$VIM_{jm}^{(Gini)} = GI_m - GI_l - GI_r;$$
where $GI_l$ and $GI_r$ denote the Gini indices of the two new nodes after the split;
(1.3) if the nodes at which data column $X_j$ appears in decision tree i form the set M, the importance of $X_j$ in the i-th tree is:
$$VIM_{ij}^{(Gini)} = \sum_{m \in M} VIM_{jm}^{(Gini)};$$
(1.4) if the random forest contains n trees, the importance of data column $X_j$ is:
$$VIM_j^{(Gini)} = \sum_{i=1}^{n} VIM_{ij}^{(Gini)};$$
(1.5) the columns are ranked by the calculated importance, and the two most important data columns are returned to the user as the analysis filter result, denoted A and B, where A ranks higher than B;
(2) data range analysis and filtering; the detailed process is as follows:
(2.1) the data of columns A and B are first classified into three types: numeric type N, discrete-value type X, and time-series type T; numeric data N are first discretized: the data are binned, each bin is denoted n', and the count of each bin is denoted CNT(n'); for discrete-value data X, the count of each discrete value is denoted CNT(x);
for time-series type T, time bins are divided according to the time range of data column T, and each time bin obtained by binning column T is denoted t';
(2.2) from the three data types, two combined analysis-and-filtering models are formed, and data filtering analysis is performed on data set D; specifically:
(2.2.1) A is time-series data and B is discrete-value or numeric data; based on the unit of the time bins t' obtained in (2.1), an appropriate recent period is chosen for A as the first filter condition $t_{recent}$; the data set filtered by the condition on column A is denoted $D^*$; filtering data column B yields the discrete data column $B^*$ with values $x_1^*, x_2^*, \ldots, x_k^*$, or the numeric data column $B^*$, which is binned again to obtain $(n_1^*)', (n_2^*)', \ldots, (n_k^*)'$, where k is the number of bins; the three largest counts among $x^*/(n^*)'$, written $CNT(x^*)_{top3}/CNT((n^*)')_{top3}$, identify three discrete values $x_{max}^*$ or the numeric ranges of three bins $(n_{max}^*)'$, which serve as the second filter condition; the intersection of the two filter conditions, $t_{recent} \cap x_{max}^*/(n_{max}^*)'$, is the analysis filter condition of the combined model and is used to perform data filtering analysis on data set D;
(2.2.2) A is discrete-value or numeric data and B is time-series data; the count CNT(x)/CNT(n') of each discrete value or bin of A is calculated, the five values $x_{top5}$ or bins $(n_{top5})'$ with the highest counts are chosen, and the corresponding value ranges serve as the first filter condition; the data set filtered by the condition on column A is denoted $D^*$; the time range $t_{max}$ of data column $B^*$ corresponding to the value $x_{max}$ or bin $(n_{max})'$ with the highest count in A is chosen as the second filter condition; the intersection of the two filter conditions, $x_{top5}/(n_{top5})' \cap t_{max}$, is the analysis filter condition of the combined model and is used to perform data filtering analysis on data set D;
(3) to present the analyzed and filtered data to the user, the result data set obtained from the analysis and filtering of steps (1) and (2) is visualized automatically; the detailed process is as follows:
(3.1) for visualization, the following are obtained from the result data set: the cardinality d(X) of column X; the maximum value max(X) and minimum value min(X) of column X; the number of records |X| of column X; the data type type(X) of column X; the count CNT(x') of each bin x' of column X; and the correlation coefficient correlation(x', CNT(x')) between each bin x' and its count CNT(x');
(3.2) a set of pruning rules is defined according to the column type type(X) obtained in (3.1): when the data type of column X is time-series, the visualization chart is a bar chart or a line chart; when the data type of column X is discrete-value or numeric, the visualization chart is a bar chart, a pie chart, or a scatter plot;
(3.3) the data analysis method of relative information entropy is used to determine how the result data set obtained from steps (1) and (2) is visualized automatically; the core idea of the method is to calculate, for each data column X, the ratio of the information entropy of each chart type to the information entropy of the standardized chart, denoted $C(X)_1, C(X)_2, \ldots, C(X)_k$; the relative information entropies are compared, and the chart type corresponding to the maximum value $C(X)_{max}$ is the visualization type for data column X; specifically:
(3.3.1) in the bar chart, the height differences between bars make differences in the data easier for users to recognize; the relative information entropy of the bar chart is calculated from the cardinality d(X) of column X, where |d(X)| denotes the value of the cardinality d(X);
(3.3.2) the pie chart can show multiple groups of data and the share of each group in the total; in a pie chart, the counts CNT(x') must be sufficiently distinct to highlight the share of each part, so the Shannon entropy $H = -\sum_{y} P(y) \log P(y)$ is introduced as part of the criterion, where y ranges over the values of CNT(x') and P(y) is the share of y, that is, the probability that y occurs in CNT(x');
(3.3.3) the line chart reflects how the same thing develops and changes over time; when the data CNT(x') and x' follow a certain distribution, namely a linear, exponential, logarithmic, or low-order power distribution, the expression of the distribution is denoted distribution(x', CNT(x')) and the information entropy C(X) is 1; otherwise, the information entropy C(X) is 0:
C(X) = distribution(x', CNT(x'));
(3.3.4) the scatter plot represents the relationship between two variables on coordinate axes; it is scored using the correlation coefficient correlation(x', CNT(x')):
C(X) = correlation(x', CNT(x'));
(3.4) the relative information entropies of column X under the various visualization charts are compared to obtain the maximum value $C(X)_{max}$; the result data set obtained from the analysis and filtering of steps (1) and (2) is visualized using the chart type corresponding to $C(X)_{max}$.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910401717.XA CN110175191B (en) | 2019-05-14 | 2019-05-14 | Modeling method for data filtering rule in data analysis |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910401717.XA CN110175191B (en) | 2019-05-14 | 2019-05-14 | Modeling method for data filtering rule in data analysis |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110175191A (en) | 2019-08-27 |
CN110175191B (en) | 2023-06-27 |
Family
ID=67691033
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910401717.XA Active CN110175191B (en) | 2019-05-14 | 2019-05-14 | Modeling method for data filtering rule in data analysis |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110175191B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110766167A (en) * | 2019-10-29 | 2020-02-07 | 深圳前海微众银行股份有限公司 | Interactive feature selection method, device and readable storage medium |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105550374A (en) * | 2016-01-29 | 2016-05-04 | 湖南大学 | Random forest parallelization machine studying method for big data in Spark cloud service environment |
CN106295983A (en) * | 2016-08-08 | 2017-01-04 | 烟台海颐软件股份有限公司 | Power marketing data visualization statistical analysis technique and system |
CN106599325A (en) * | 2017-01-18 | 2017-04-26 | 河海大学 | Method for constructing data mining visualization platform based on R and HighCharts |
CN107103050A (en) * | 2017-03-31 | 2017-08-29 | 海通安恒(大连)大数据科技有限公司 | A kind of big data Modeling Platform and method |
CN107193967A (en) * | 2017-05-25 | 2017-09-22 | 南开大学 | A kind of multi-source heterogeneous industry field big data handles full link solution |
CN108171617A (en) * | 2017-12-08 | 2018-06-15 | 全球能源互联网研究院有限公司 | A kind of power grid big data analysis method and device |
CN109409647A (en) * | 2018-09-10 | 2019-03-01 | 昆明理工大学 | A kind of analysis method of the salary level influence factor based on random forests algorithm |
Non-Patent Citations (1)
Title |
---|
魏正韬: "基于非平衡数据的随机森林算法研究", 信息科技, no. 2018 * |
Also Published As
Publication number | Publication date |
---|---|
CN110175191B (en) | 2023-06-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8041714B2 (en) | Filter chains with associated views for exploring large data sets | |
Kosara et al. | Parallel sets: Interactive exploration and visual analysis of categorical data | |
Achtert et al. | Evaluation of clusterings--metrics and visual support | |
CN107766428B (en) | Method and system for automatically realizing data visualization | |
Ko et al. | Marketanalyzer: An interactive visual analytics system for analyzing competitive advantage using point of sale data | |
US7446769B2 (en) | Tightly-coupled synchronized selection, filtering, and sorting between log tables and log charts | |
US20090096812A1 (en) | Apparatus and method for morphing data visualizations | |
US20070050237A1 (en) | Visual designer for multi-dimensional business logic | |
CN108140025A (en) | For the interpretation of result of graphic hotsopt | |
Cheng et al. | Visually exploring missing values in multivariable data using a graphical user interface | |
CN108982377A (en) | Corn growth stage spectrum picture and chlorophyll content correlation and period division methods | |
EP2713319A1 (en) | Analyzing and displaying multidimensional data | |
Lu et al. | Palettailor: Discriminable colorization for categorical data | |
US20130054510A1 (en) | Automated system for preparing and presenting control charts | |
CN110321914B (en) | Oil quality analysis management and control system | |
CN110175191A (en) | Data filtering rule modeling method in data analysis | |
US20150032685A1 (en) | Visualization and comparison of business intelligence reports | |
US12014436B2 (en) | Intellectual-property landscaping platform | |
US20220100358A1 (en) | Intellectual-Property Landscaping Platform | |
Han et al. | Rankbrushers: interactive analysis of temporal ranking ensembles | |
CN112732878A (en) | Unstructured data analysis system and method | |
CN105022724A (en) | Automatic selection method of statistical symbol on the basis of statistical data and charting requirements | |
US7957932B1 (en) | Data analysis systems and related methods | |
Leite et al. | PhenoVis–A tool for visual phenological analysis of digital camera images using chronological percentage maps | |
Sifer | User interfaces for the exploration of hierarchical multi-dimensional data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||