CN109670037A - K-means Text Clustering Method based on topic model and rough set - Google Patents

K-means Text Clustering Method based on topic model and rough set

Info

Publication number
CN109670037A
CN109670037A CN201811324306.7A CN 109670037 A
Authority
CN
China
Prior art keywords
theme
text
reduction
topic model
clustering method
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811324306.7A
Other languages
Chinese (zh)
Inventor
谢珺
段利国
郝晓燕
梁凤梅
续欣莹
靳红伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Taiyuan University of Technology
Original Assignee
Taiyuan University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Taiyuan University of Technology filed Critical Taiyuan University of Technology
Priority to CN201811324306.7A priority Critical patent/CN109670037A/en
Publication of CN109670037A publication Critical patent/CN109670037A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a K-means text clustering method based on a topic model and rough sets. To address the shortcomings of the K-means algorithm, an optimization method for the initial center points is proposed. An LDA topic model exploits the co-occurrence of terms at the document level to efficiently extract the semantic information in the text while transforming the word space into a topic space, achieving topic dimensionality reduction. Rough-set knowledge reduction theory is then applied to delete redundant topic features, which improves the efficiency of topic feature extraction, optimizes the initial center points, and improves the K-means text clustering result.

Description

K-means Text Clustering Method based on topic model and rough set
Technical field
The present invention relates to the field of text clustering, and in particular to a K-means text clustering method based on a topic model and rough sets.
Background technique
With the development and application of network technology, information resources have grown explosively, and research on text mining, information filtering, and information retrieval faces unprecedented opportunities. Clustering is therefore becoming a core technology of text information mining. Text clustering is an important technique in text mining for discovering the distribution of data and its implicit patterns. Clustering partitions similar data into different groups so that the elements within each cluster share common traits, usually judged by a defined distance metric. K-means is a classic partition-based clustering algorithm that is widely used because its principle is simple, it is easy to implement, and it converges quickly. However, the algorithm produces different clustering results for different initial values, easily falls into local minima, and is sensitive to outliers. To address these shortcomings, an optimization method for the initial center points is proposed: an LDA topic model exploits the co-occurrence of terms at the document level to efficiently extract the semantic information in the text while transforming the word space into a topic space, achieving topic dimensionality reduction; rough-set knowledge reduction theory is then applied to delete redundant topic features, improving the efficiency of topic feature extraction, optimizing the selection of the initial center points, and improving the K-means text clustering result.
Summary of the invention
The object of the present invention is to avoid the deficiencies of the prior art and provide a K-means text clustering method based on a topic model and rough sets.
The object of the present invention is achieved by the following technical measures. A K-means text clustering method based on a topic model and rough sets is designed, comprising the steps of: choosing a text set and vectorizing it so that the text set is expressed as a text-term matrix; modeling the text-term matrix with an LDA topic model, estimating the model parameters to obtain a document-topic matrix while generating low-dimensional topic features, where a low-dimensional topic feature indicates the topic probability of each word appearing in the text set; converting the document-topic matrix into a topic-term decision system and reducing the topic features with a neighborhood rough set, obtaining the topic reduction set according to the importance of each topic; performing value reduction on the topic reduction set to obtain the complete topic reduction set, thereby optimizing the selection of the initial center points; and performing K-means text clustering on the complete topic reduction set.
Wherein, the step of modeling the text-term matrix with the LDA topic model comprises the steps of: randomly drawing a topic from the topic set corresponding to a document in the document set, randomly drawing a word from the word set corresponding to the drawn topic, and repeating these operations until all words in the document have been traversed; the document set is thus modeled with probabilistic statistics, yielding two matrices, a text-topic matrix and a topic-word matrix, from which the latent semantic information of the text is mined.
Wherein, before the step of vectorizing the text set, the method further includes preprocessing the text set; the preprocessing at least includes jieba word segmentation and stop-word removal.
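The preprocessing step above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the patent uses jieba segmentation for Chinese text, so the tokenizer is left as a pluggable callable (`jieba.lcut` in practice; a dependency-free whitespace split is used here as a stand-in), and the stop-word set is a hypothetical example.

```python
# Minimal preprocessing sketch: tokenize each document, then drop stop words.
# In the patent's setting, tokenize would be jieba.lcut for Chinese text.
def preprocess(docs, tokenize=str.split, stopwords=frozenset()):
    """Return a token list per document, with stop words removed."""
    return [[w for w in tokenize(doc) if w not in stopwords] for doc in docs]

docs = ["the cat sat", "the dog ran"]
tokens = preprocess(docs, stopwords={"the"})
# tokens == [["cat", "sat"], ["dog", "ran"]]
```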
Wherein, in the step of performing K-means text clustering on the complete topic reduction set, the K-means algorithm proceeds as follows, assuming the text set is divided into c classes:
Randomly select the initial centers of the c classes;
In the k-th iteration, compute the distance from each sample to each of the c class centers, and assign the sample to the class with the nearest center;
Update each class center, e.g. as the mean of its members;
Repeat the above steps to update all c cluster centers; if the center values no longer change, i.e. the objective function has converged, stop iterating.
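The loop above can be sketched in a few lines of NumPy. This is a baseline sketch under assumed toy data, not the patent's method (the patent's contribution is a better initialization that this loop would consume): random initial centers, nearest-center assignment, mean update, and stopping when the centers no longer change.

```python
import numpy as np

def kmeans(X, c, max_iter=100, seed=0):
    """Plain K-means: random init, assign to nearest center, mean update."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=c, replace=False)]
    for _ in range(max_iter):
        # distance of every sample to every center
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(c)])
        if np.allclose(new, centers):  # centers unchanged: objective converged
            break
        centers = new
    return labels, centers

# two tight pairs of points: each pair should land in one cluster
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
labels, centers = kmeans(X, 2)
```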
Wherein, in the step of reducing the topic features with the neighborhood rough set, the reduction of topic features includes topic reduction and topic-value reduction.
Wherein, in the step of obtaining the topic reduction set according to topic importance, the reduction computation judges whether each topic's importance is greater than zero, and topics whose importance is greater than zero are put into the reduction set.
Wherein, the method of computing topic importance is the method of computing attribute dependency. The specific steps are: compute the number of samples in the positive region under each topic subset, compute the difference in attribute dependency contributed by each topic from the computed positive regions, and thereby obtain the importance of each topic.
Wherein, after the step of performing K-means text clustering on the complete reduction set, the method further includes a cluster evaluation step.
Different from the prior art, the K-means text clustering method based on a topic model and rough sets of the present invention addresses the shortcomings of the K-means algorithm by proposing an optimization method for the initial center points: an LDA topic model exploits the co-occurrence of terms at the document level to efficiently extract the semantic information in the text while transforming the word space into a topic space, achieving topic dimensionality reduction; rough-set knowledge reduction theory is then applied to delete redundant topics, optimizing the selection of the initial center points and improving the K-means text clustering result.
Detailed description of the invention
Fig. 1 is a flow diagram of a K-means text clustering method based on a topic model and rough sets provided by the present invention;
Fig. 2 is a logic diagram of a K-means text clustering method based on a topic model and rough sets provided by the present invention;
Fig. 3 is a structural diagram of the text-topic matrix model in a K-means text clustering method based on a topic model and rough sets provided by the present invention.
Specific embodiment
The technical solution of the present invention is described in further detail below with reference to embodiments. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative labor shall fall within the scope of protection of the present invention.
Referring to Fig. 1 and Fig. 2: Fig. 1 is a flow diagram of a K-means text clustering method based on a topic model and rough sets provided by the present invention, and Fig. 2 is a logic diagram of the same method. The method includes the following steps:
S110: choose a text set and vectorize it so that the text set is expressed as a text-term matrix. The structure of the text-topic matrix model is shown in Fig. 3. As can be seen from Fig. 3, the LDA topic model forcibly assigns a topic to every word in the document set, so inactive topics may be retained, which distorts the topic distribution and causes the problem of topics being too broad.
S120: model the text-term matrix with the LDA topic model and estimate the model parameters to obtain a document-topic matrix while generating low-dimensional topic features; a low-dimensional topic feature indicates the topic probability of each word appearing in the text set.
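Step S120 can be sketched with scikit-learn's `LatentDirichletAllocation` — an assumption, since the patent does not name a library. The toy corpus and the topic count of 2 are hypothetical; `fit_transform` returns the document-topic matrix, each row a probability distribution over topics.

```python
# Sketch of S120 (assumed scikit-learn implementation, toy corpus).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = ["stock market trade price", "match team score goal",
        "market price stock", "team goal match"]
X = CountVectorizer().fit_transform(docs)           # text-term matrix
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topic = lda.fit_transform(X)                    # document-topic matrix
# each row of doc_topic is a distribution over the 2 topics, summing to 1
```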
S130: convert the document-topic matrix into a topic-term decision system and reduce the topic features with a neighborhood rough set, obtaining the topic reduction set according to the importance of each topic.
The document-topic matrix is converted into a topic-term decision system TDS = (TU, TC ∪ D, V, f), and the topic features are reduced with a neighborhood rough set, where TU is the M articles containing N topics, i.e. the text-topic matrix, which serves as the universe; TC is the K topics, i.e. the condition attribute set; D is the text category, i.e. the decision attribute; V is the topic values; and f is an information function that assigns topic values to terms. For the k-th topic, f_k: TC → V_k, where V_k is the value domain of the topic.
From the obtained document-topic matrix, the topic features are reduced with the neighborhood rough set, including topic reduction and topic-value reduction, so as to optimize the initial center points. The number of samples in the positive region under each topic subset is computed; from the computed positive region POS_k(D), the difference in dependency contributed by each topic is computed, giving the importance SIG of each topic. A lower bound on the importance is then entered manually; EFC is the control parameter of this lower bound and takes a value close to zero. In the algorithm the topic with the greatest importance is always retained, which ensures that the core is not reduced away. It follows that the neighborhood rough set can be used to evaluate the importance of data for classification.
S140: perform topic-value reduction on the topic reduction set RED to obtain the complete topic reduction set RED'.
The present invention introduces a neighborhood rough set model to reduce redundant topic features and thereby optimize the initial center points. Rough set theory describes the problem to be handled as an information system: an information system DT = (U, C ∪ D, V, f) is called a decision system, where U is the sample set, also called the universe {x_1, x_2, ..., x_n}; A = C ∪ D is the attribute set, in which C is the condition attribute set, also called the feature set {a_1, a_2, ..., a_m}, describing the feature information of each sample, and D is the decision attribute set; f is the information function of the decision system, with f_a the information function of attribute a, and V is the value domain of f. For numerical data, the similarity between samples, and hence the neighborhood relation, is judged by computing the distance between samples.
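The neighborhood rough set machinery described above (distance-based neighborhoods, positive region, dependency γ, attribute significance SIG) can be sketched numerically. The neighborhood radius δ and the toy decision system are assumptions for illustration, not values from the patent.

```python
import numpy as np

def neighborhood(X, i, delta):
    """Indices of samples within distance delta of sample i."""
    return np.where(np.linalg.norm(X - X[i], axis=1) <= delta)[0]

def positive_region(X, y, delta):
    """Samples whose entire neighborhood shares a single decision label."""
    return [i for i in range(len(X))
            if len({y[j] for j in neighborhood(X, i, delta)}) == 1]

def dependency(X, y, delta):
    """gamma_C(D) = |POS_C(D)| / |U|."""
    return len(positive_region(X, y, delta)) / len(X)

def significance(X, y, delta, a):
    """SIG(a): dependency with all attributes minus dependency without a."""
    rest = np.delete(X, a, axis=1)
    return dependency(X, y, delta) - dependency(rest, y, delta)

# toy decision system: attribute 0 separates the classes, attribute 1 is noise
X = np.array([[0.0, 0.5], [0.1, 0.4], [1.0, 0.5], [1.1, 0.4]])
y = np.array([0, 0, 1, 1])
# dropping the informative attribute lowers the dependency, so SIG(0) > SIG(1)
```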
S150: perform K-means text clustering on the complete topic reduction set.
Wherein, the step of modeling the text-term matrix with the LDA topic model comprises the steps of: randomly drawing a topic from the topic set corresponding to a document in the document set, randomly drawing a word from the word set corresponding to the drawn topic, and repeating these operations until all words in the document have been traversed; the document set is thus modeled with probabilistic statistics, yielding two matrices, a text-topic matrix and a topic-word matrix, from which the latent semantic information of the text is mined.
Wherein, before the step of vectorizing the text set, the method further includes preprocessing the text set; the preprocessing at least includes jieba word segmentation and stop-word removal.
Wherein, in the step of performing K-means text clustering on the complete reduction set, the K-means algorithm proceeds as follows, assuming the text set is divided into c classes:
Randomly select the initial centers of the c classes;
In the k-th iteration, compute the distance from each sample to each of the c class centers, and assign the sample to the class with the nearest center;
Update each class center, e.g. as the mean of its members;
Repeat the above steps to update all c cluster centers; if the center values no longer change, i.e. the objective function has converged, stop iterating.
Wherein, in the step of reducing the topic features with the neighborhood rough set, the reduction of topic features includes topic reduction and topic-value reduction.
Wherein, in the step of obtaining the topic reduction set according to topic importance, the reduction computation judges whether each topic's importance is greater than zero, and topics whose importance is greater than zero are put into the reduction set.
Wherein, the method of computing topic importance is the method of computing attribute dependency. The specific steps are: compute the number of samples in the positive region under each topic subset, compute the difference in attribute dependency contributed by each topic from the computed positive regions, and obtain the importance of each topic.
The method of computing importance used by the present invention is the method of computing attribute dependency; the dependency of the classification categories D on the text topics TC is
γ_TC(D) = |POS_TC(D)| / |TU|
After attribute reduction, the topic decision system yields a relational decision table of a relative reduct, RED(B) = (TU_B, T_B ∪ D, V, f). In RED(B) the redundant topics have been reduced away; each item of RED(B) is then regarded as a decision rule d_X, with X ∈ TU_B and X matching the rule. On the basis of this topic rule set, topic-value reduction is carried out.
Wherein, after the step of performing K-means text clustering on the complete topic reduction set, the method further includes a cluster evaluation step.
Specifically, the clustering result is evaluated with the F value, the harmonic mean of precision and recall. Given a predefined class i and a cluster class j, the formulas are as follows:
Precision: P(i, j) = N_ij / N_j
Recall: R(i, j) = N_ij / N_i
where N_ij is the number of texts of predefined class i contained in cluster class j, N_j is the actual number of texts in cluster class j, and N_i is the number of texts that should be in predefined class i.
The overall criterion for the clustering result is as follows:
F = Σ_i (n_i / n) · max_j { 2 · P(i, j) · R(i, j) / (P(i, j) + R(i, j)) }
where n is the number of test texts and n_i is the number of texts in predefined class i. It can be seen that the larger the F value, the better the clustering result.
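The F-value evaluation above can be sketched as follows; the counts N_ij, N_i, N_j are taken directly from the true class labels and the predicted cluster labels.

```python
# Sketch of the clustering F value: per predefined class, take the best
# F(i, j) over clusters, then weight by class size n_i / n.
from collections import Counter

def f_value(true_labels, cluster_labels):
    n = len(true_labels)
    N_i = Counter(true_labels)                    # texts per predefined class
    N_j = Counter(cluster_labels)                 # texts per cluster
    N_ij = Counter(zip(true_labels, cluster_labels))
    total = 0.0
    for i, n_i in N_i.items():
        best = 0.0
        for j, n_j in N_j.items():
            nij = N_ij.get((i, j), 0)
            if nij:
                p, r = nij / n_j, nij / n_i       # precision, recall
                best = max(best, 2 * p * r / (p + r))
        total += (n_i / n) * best
    return total

# a perfect clustering (up to renaming of cluster labels) scores F = 1.0
assert f_value([0, 0, 1, 1], ["a", "a", "b", "b"]) == 1.0
```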
To verify the validity of the algorithm herein, three improved k-means text clustering algorithms and several different models were selected for a comparative clustering experiment. The chosen data set is the Fudan University test corpus; texts of ten categories including art, economy, and sport were selected, 2000 articles in total, 200 articles per category, each text 500 to 8000 characters long. As can be seen, the algorithm herein outperforms the other three clustering algorithms, and the combined application of the LDA topic model and rough sets shows a clear advantage in text clustering, verifying that the model's clustering performance is good. The comparison results are shown in the table below.
Method F value (%)
Original k-means 73.67
Rough set 79.54
LDA topic model 84.19
Algorithm 1 87.31
Algorithm 2 78.68
Algorithm 3 85.32
The method of the present invention 92.03
Different from the prior art, the K-means text clustering method based on a topic model and rough sets of the present invention addresses the shortcomings of the K-means algorithm by proposing an optimization method for the initial center points: an LDA topic model exploits the co-occurrence of terms at the document level to efficiently extract the semantic information in the text while transforming the word space into a topic space, achieving topic dimensionality reduction; rough-set knowledge reduction theory is then applied to delete redundant topic features, improving the efficiency of topic feature extraction, optimizing the initial center points, and improving the K-means text clustering result.
The above are only embodiments of the present invention and do not limit the scope of the invention. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present invention, applied directly or indirectly in other related technical fields, is likewise included within the scope of patent protection of the present invention.

Claims (8)

1. A K-means text clustering method based on a topic model and rough sets, characterized by comprising:
choosing a text set and vectorizing it so that the text set is expressed as a text-term matrix;
modeling the text-term matrix with an LDA topic model and estimating the model parameters to obtain a document-topic matrix while generating low-dimensional topic features, wherein a low-dimensional topic feature indicates the topic probability of each word appearing in the text set;
converting the document-topic matrix into a topic-term decision system and reducing the topic features with a neighborhood rough set, obtaining the topic reduction set according to the importance of each topic;
performing topic-value reduction on the topic reduction set to obtain the complete topic reduction set;
performing K-means text clustering on the complete reduction set.
2. The K-means text clustering method based on a topic model and rough sets according to claim 1, characterized in that the step of modeling the text-term matrix with the LDA topic model comprises the steps of:
modeling the document set with probabilistic statistics to obtain two matrices, a text-topic matrix and a topic-word matrix, from which the latent semantic information of the text is mined;
randomly drawing a topic from the topic set corresponding to a document in the document set, randomly drawing a word from the word set corresponding to the drawn topic, and repeating these operations until all words in the document have been traversed.
3. The K-means text clustering method based on a topic model and rough sets according to claim 1, characterized by further comprising preprocessing the text set before the step of vectorizing the text set; wherein the preprocessing at least includes jieba word segmentation and stop-word removal.
4. The K-means text clustering method based on a topic model and rough sets according to claim 1, characterized in that, in the step of performing K-means text clustering on the complete topic reduction set, the K-means algorithm proceeds as follows, assuming the text set is divided into c classes:
randomly select the initial centers of the c classes;
in the k-th iteration, compute the distance from each text to each of the c class centers, and assign the sample to the class with the nearest center;
update each class center, e.g. as the mean of its members;
repeat the above steps to update all c cluster centers; if the center values no longer change, i.e. the objective function has converged, stop iterating.
5. The K-means text clustering method based on a topic model and rough sets according to claim 1, characterized in that, in the step of reducing the topic features with the neighborhood rough set, the reduction of topic features includes topic reduction and topic-value reduction.
6. The K-means text clustering method based on a topic model and rough sets according to claim 1, characterized in that, in the step of obtaining the topic reduction set according to topic importance, the reduction computation judges whether each topic's importance is greater than zero, and topics whose importance is greater than zero are put into the reduction set.
7. The K-means text clustering method based on a topic model and rough sets according to claim 6, characterized in that the method of computing topic importance is the method of computing attribute dependency, with the specific steps of: computing the number of samples in the positive region under each topic subset, computing the difference in dependency contributed by each topic from the computed positive regions, and obtaining the importance of each topic.
8. The K-means text clustering method based on a topic model and rough sets according to claim 1, characterized in that, after the step of performing K-means text clustering on the complete reduction set, the method further includes a cluster evaluation step.
CN201811324306.7A 2018-11-08 2018-11-08 K-means Text Clustering Method based on topic model and rough set Pending CN109670037A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811324306.7A CN109670037A (en) 2018-11-08 2018-11-08 K-means Text Clustering Method based on topic model and rough set

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811324306.7A CN109670037A (en) 2018-11-08 2018-11-08 K-means Text Clustering Method based on topic model and rough set

Publications (1)

Publication Number Publication Date
CN109670037A true CN109670037A (en) 2019-04-23

Family

ID=66142065

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811324306.7A Pending CN109670037A (en) 2018-11-08 2018-11-08 K-means Text Clustering Method based on topic model and rough set

Country Status (1)

Country Link
CN (1) CN109670037A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110598192A (en) * 2019-06-28 2019-12-20 太原理工大学 Text feature reduction method based on neighborhood rough set
CN111078852A (en) * 2019-12-09 2020-04-28 武汉大学 College leading-edge scientific research team detection system based on machine learning
CN111259110A (en) * 2020-01-13 2020-06-09 武汉大学 College patent personalized recommendation system
CN112800253A (en) * 2021-04-09 2021-05-14 腾讯科技(深圳)有限公司 Data clustering method, related device and storage medium
CN117520529A (en) * 2023-12-04 2024-02-06 四川三江数智科技有限公司 Text subject mining method for power battery

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103870751A (en) * 2012-12-18 2014-06-18 ***通信集团山东有限公司 Method and system for intrusion detection
CN107085164A (en) * 2017-03-22 2017-08-22 清华大学 A kind of electric network fault type determines method and device
CN108197295A (en) * 2018-01-22 2018-06-22 重庆邮电大学 Application process of the attribute reduction based on more granularity attribute trees in text classification

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103870751A (en) * 2012-12-18 2014-06-18 ***通信集团山东有限公司 Method and system for intrusion detection
CN107085164A (en) * 2017-03-22 2017-08-22 清华大学 A kind of electric network fault type determines method and device
CN108197295A (en) * 2018-01-22 2018-06-22 重庆邮电大学 Application process of the attribute reduction based on more granularity attribute trees in text classification

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
HONGXIN WAN et al.: "An Algorithm of LDA Topic Reduction Based on Rough Set", Applied Mechanics and Materials *
六月麦茬: "An overview of rough sets, neighborhood rough sets and real-domain rough sets", HTTPS://BLOG.CSDN.NET/LIUYUEMAICHA/ARTICLE/DETAILS/52355787 *
王春龙 et al.: "Application of an improved K-means algorithm based on LDA in text clustering", Journal of Computer Applications (《计算机应用》) *
靳红伟 et al.: "Text topic feature extraction based on neighborhood rough sets", Science Technology and Engineering (《科学技术与工程》) *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110598192A (en) * 2019-06-28 2019-12-20 太原理工大学 Text feature reduction method based on neighborhood rough set
CN111078852A (en) * 2019-12-09 2020-04-28 武汉大学 College leading-edge scientific research team detection system based on machine learning
CN111259110A (en) * 2020-01-13 2020-06-09 武汉大学 College patent personalized recommendation system
CN112800253A (en) * 2021-04-09 2021-05-14 腾讯科技(深圳)有限公司 Data clustering method, related device and storage medium
CN112800253B (en) * 2021-04-09 2021-07-06 腾讯科技(深圳)有限公司 Data clustering method, related device and storage medium
CN117520529A (en) * 2023-12-04 2024-02-06 四川三江数智科技有限公司 Text subject mining method for power battery

Similar Documents

Publication Publication Date Title
CN109670037A (en) K-means Text Clustering Method based on topic model and rough set
CN106383877B (en) Social media online short text clustering and topic detection method
CN104199857B (en) A kind of tax document hierarchy classification method based on multi-tag classification
CN102289522B (en) Method of intelligently classifying texts
CN106709754A (en) Power user grouping method based on text mining
CN106776538A (en) The information extracting method of enterprise's noncanonical format document
CN105956015A (en) Service platform integration method based on big data
CN103150374A (en) Method and system for identifying abnormal microblog users
CN105631479A (en) Imbalance-learning-based depth convolution network image marking method and apparatus
CN105678607A (en) Order batching method based on improved K-Means algorithm
CN109284626A (en) Random forests algorithm towards difference secret protection
CN111860981B (en) Enterprise national industry category prediction method and system based on LSTM deep learning
CN103049581B (en) A kind of web text classification method based on consistance cluster
CN109657063A (en) A kind of processing method and storage medium of magnanimity environment-protection artificial reported event data
CN109635010A (en) A kind of user characteristics and characterization factor extract, querying method and system
CN111079427A (en) Junk mail identification method and system
Dan et al. Research of text categorization on Weka
CN111641608A (en) Abnormal user identification method and device, electronic equipment and storage medium
CN106845536A (en) A kind of parallel clustering method based on image scaling
CN111191099A (en) User activity type identification method based on social media
CN105046323A (en) Regularization-based RBF network multi-label classification method
Abinaya et al. Spam detection on social media platforms
CN105005792A (en) KNN algorithm based article translation method
CN110084376B (en) Method and device for automatically separating data into boxes
CN109828995B (en) Visual feature-based graph data detection method and system

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20190423