CN114416975A - System and method for determining text similarity - Google Patents

System and method for determining text similarity

Publication number
CN114416975A
Authority
CN
China
Prior art keywords
question
answer data
clustering
data
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111572563.4A
Other languages
Chinese (zh)
Inventor
马谊骏
刘振宇
杨硕
***
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Aisino Corp
Original Assignee
Aisino Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Aisino Corp
Priority to CN202111572563.4A
Publication of CN114416975A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a system and a method for determining text similarity, belonging to the technical field of data processing. The system comprises: a clustering module, configured to acquire question-and-answer data of users, preprocess the question-and-answer data, select initial centroids from the preprocessed question-and-answer data, and obtain clustered question-and-answer data according to the initial centroids; and a text similarity determining module, configured to perform word segmentation on the question data in the clustering result, calculate the TF-IDF value of each segmented word, calculate the Simhash values of the clustered question-and-answer data according to the TF-IDF values, and determine the similarity between pieces of clustered question-and-answer data according to the Simhash values. Compared with the prior art, the invention calculates the weights of keywords in the question-and-answer corpus and the Simhash values, compares the similarity of pairs of question-and-answer data, and thereby converts unstructured data from a chat platform into structured data that an intelligent customer service system can recognize.

Description

System and method for determining text similarity
Technical Field
The present invention relates to the field of data processing technology, and more particularly, to a system and method for determining text similarity.
Background
With the rapid development of the internet in China, more and more users choose to use mobile phones or computer terminals to obtain the information they want. To meet this demand, enterprises provide online chat platforms to resolve users' problems. In the prior art, most enterprises combine human customer service with intelligent customer service: when a user initiates a session, a robot responds to the user's question, analyzes it, and gives a corresponding suggestion; if the user is not satisfied, the session is transferred to a human agent. Although human customer service can handle and answer most users' questions, it has two drawbacks: 1. the chat time of a single user is uncertain, and during peak consultation periods human agents cannot always be online or must handle the questions of several users at once, which leaves some users dissatisfied, for example because of overly long online waiting times; 2. the cost of maintaining a large human customer service staff is high, and the level of service is uneven because of differences in experience and other factors. To address this situation, improving the accuracy of intelligent customer service responses has become a research goal for companies. The question-and-answer corpus accumulated between human customer service and users over recent years contains a massive amount of valuable real information, but this data sits in online chat platform data in unstructured form, such as text, mailbox addresses, pictures, and HTML files. How to reasonably convert such unstructured data into structured data is a problem that companies with intelligent customer service systems urgently need to solve.
Existing methods for processing unstructured corpora take one of two approaches, each with its own problems: 1. adopting an unstructured data management system extended from a traditional relational database system, such as an unstructured data management system based on NoSQL; 2. judging the similarity of two vectors by the traditional method of calculating the cosine of the angle between them, and thereby calculating the similarity of unstructured texts. The approach based on an unstructured data management system is time-consuming, and the data held in the management system cannot be fed directly into training as structured data recognizable by intelligent customer service; further work is needed to export the data, so efficiency is low. For question-and-answer data, calculating text similarity from the cosine between vectors does not achieve the expected effect: when the data volume is large, thousands of similarities need to be compared, which occupies a large amount of memory, and the long running time fails to meet performance requirements. In summary, a method is needed for efficiently and accurately converting unstructured data into structured data recognizable by intelligent customer service, based on the question-and-answer data between human customer service and users.
Disclosure of Invention
In view of the above problems, the present invention provides a system for determining text similarity, comprising:
a clustering module, configured to acquire question-and-answer data of users, preprocess the question-and-answer data, select initial centroids from the preprocessed question-and-answer data, and obtain clustered question-and-answer data according to the initial centroids;
and a text similarity determining module, configured to perform word segmentation on the question data in the clustering result, calculate the TF-IDF value of each segmented word, calculate the Simhash values of the clustered question-and-answer data according to the TF-IDF values, and determine the similarity between pieces of clustered question-and-answer data according to the Simhash values.
Optionally, the text similarity determining module is further configured to merge clustered question-and-answer data whose similarity is greater than a preset value.
Optionally, the preprocessing specifically comprises cleaning the question-and-answer data.
Optionally, selecting the initial centroids of the preprocessed question-and-answer data specifically comprises:
generating a sample set from the preprocessed question-and-answer data, randomly selecting a point in the sample set as a cluster center, calculating the distance from each point in the sample set to the cluster center, determining a plurality of cluster centroids according to those distances, and taking the cluster centroids as the initial centroids.
Optionally, obtaining the clustered question-and-answer data specifically comprises:
for the initial centroids, determining the distance between each point in the sample set and each initial centroid vector, clustering the points according to those distances, re-determining new centroids in the sample set, and outputting the clustering result if the centroid vectors no longer change.
The invention also provides a method for determining text similarity, comprising:
acquiring question-and-answer data of users, preprocessing the question-and-answer data, selecting initial centroids from the preprocessed question-and-answer data, and obtaining clustered question-and-answer data according to the initial centroids;
performing word segmentation on the question data in the clustering result, calculating the TF-IDF value of each segmented word, calculating the Simhash values of the clustered question-and-answer data according to the TF-IDF values, and determining the similarity between pieces of clustered question-and-answer data according to the Simhash values.
Optionally, the method further comprises merging clustered question-and-answer data whose similarity is greater than a preset value.
Optionally, the preprocessing specifically comprises cleaning the question-and-answer data.
Optionally, selecting the initial centroids of the preprocessed question-and-answer data specifically comprises:
generating a sample set from the preprocessed question-and-answer data, randomly selecting a point in the sample set as a cluster center, calculating the distance from each point in the sample set to the cluster center, determining a plurality of cluster centroids according to those distances, and taking the cluster centroids as the initial centroids.
Optionally, obtaining the clustered question-and-answer data specifically comprises:
for the initial centroids, determining the distance between each point in the sample set and each initial centroid vector, clustering the points according to those distances, re-determining new centroids in the sample set, and outputting the clustering result if the centroid vectors no longer change.
Compared with the prior art, the invention first gathers data of the same type into the same cluster using a clustering algorithm and then, using a Simhash-based text similarity calculation method, compares the similarity of pairs of question-and-answer data by calculating the weights of keywords in the question-and-answer corpus and the Simhash values, thereby converting unstructured data from a chat platform into structured data that an intelligent customer service system can recognize.
Drawings
FIG. 1 is a block diagram of the system of the present invention;
FIG. 2 is a flow chart of the method of the present invention.
Detailed Description
The exemplary embodiments of the present invention will now be described with reference to the accompanying drawings. The present invention may, however, be embodied in many different forms and is not limited to the embodiments described herein, which are provided so that this disclosure is thorough and complete and fully conveys the scope of the present invention to those skilled in the art. The terminology used in the exemplary embodiments illustrated in the accompanying drawings is not intended to limit the invention. In the drawings, the same units/elements are denoted by the same reference numerals.
Unless otherwise defined, terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Further, it will be understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and will not be interpreted in an idealized or overly formal sense.
The present invention provides a system 100 for determining text similarity, as shown in FIG. 1, comprising:
a clustering module 101, configured to acquire question-and-answer data of users, preprocess the question-and-answer data, select initial centroids from the preprocessed question-and-answer data, and obtain clustered question-and-answer data according to the initial centroids;
a text similarity determining module 102, configured to perform word segmentation on the question data in the clustering result, calculate the TF-IDF value of each segmented word, calculate the Simhash values of the clustered question-and-answer data according to the TF-IDF values, and determine the similarity between pieces of clustered question-and-answer data according to the Simhash values.
The text similarity determining module is further configured to merge clustered question-and-answer data whose similarity is greater than a preset value.
The preprocessing specifically comprises cleaning the question-and-answer data.
Selecting the initial centroids of the preprocessed question-and-answer data specifically comprises:
generating a sample set from the preprocessed question-and-answer data, randomly selecting a point in the sample set as a cluster center, calculating the distance from each point in the sample set to the cluster center, determining a plurality of cluster centroids according to those distances, and taking the cluster centroids as the initial centroids.
Obtaining the clustered question-and-answer data specifically comprises:
for the initial centroids, determining the distance between each point in the sample set and each initial centroid vector, clustering the points according to those distances, re-determining new centroids in the sample set, and outputting the clustering result if the centroid vectors no longer change.
The invention is further illustrated by the following example.
The system of the present invention comprises:
First, a clustering module that clusters question-and-answer data of the same type. The clustering module proceeds as follows:
Step 1: obtain the users' question-and-answer data, then preprocess and clean it.
All question-and-answer data from roughly the last 3 years is extracted from the background chat data of the intelligent customer service official account and the social software customer service account. Regular expressions in Python are used to identify information such as text, pictures, and links in the raw data and to eliminate interference data, such as salutations and other useless information, so that only the users' questions and the answers given by customer service are retained. Pictures, links, and the like in the original answers are kept in the extracted answers. Finally, the question-and-answer data is stored as a dictionary in which the key is the question and the value is the answer.
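The cleaning step above can be sketched roughly as follows. This is only an illustration: the patent does not disclose its regular expressions, so the three patterns, the `[image]` placeholder format, and the function names `clean_question` and `build_qa_dict` are all assumptions.

```python
import re

# Hypothetical patterns; the patent does not give its actual regular expressions.
URL_RE = re.compile(r"https?://\S+")
IMAGE_RE = re.compile(r"\[(?:image|picture)\]", re.IGNORECASE)
GREETING_RE = re.compile(r"^(?:hi|hello|thanks|thank you|bye)[\s!.,]*$", re.IGNORECASE)


def clean_question(text):
    """Strip links and image placeholders from a user question."""
    text = URL_RE.sub("", text)
    text = IMAGE_RE.sub("", text)
    return " ".join(text.split())  # collapse leftover whitespace


def build_qa_dict(turns):
    """turns: iterable of (question, answer) pairs taken from the chat log."""
    qa = {}
    for question, answer in turns:
        question = clean_question(question)
        if not question or GREETING_RE.match(question):
            continue  # drop interference data such as bare salutations
        qa[question] = answer.strip()  # pictures and links in answers are kept
    return qa
```

The result is a plain dictionary keyed by question, matching the storage format described above; a later duplicate question would overwrite an earlier one, which may or may not be the desired behavior.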
Step 2: select the initial centroids.
From the preprocessed data, obtain a sample set $D = \{x_1, x_2, \ldots, x_m\}$, where each $x$ is one of the $m$ questions in the question-and-answer data. Randomly select one point from the set $D$ of all input data points as the first cluster center $\mu_1$. For each point $x_i$ in the data set $D$, calculate the distance between $x_i$ and the nearest of the cluster centers selected so far:
$$D(x_i) = \min_{r} \lVert x_i - \mu_r \rVert_2$$
where $x_i$ is a point in the data set and $\mu_r$, $r = 1, 2, \ldots$, ranges over the cluster centers already selected. A new point is then chosen as a new cluster center; the larger $D(x_i)$ is, the higher the probability that $x_i$ is chosen. This process is repeated, with the distances recalculated for all data points, until K cluster centroids have been selected, and these centroids are used as the initial centroids of the K-Means algorithm.
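A minimal sketch of this initialization, assuming samples are already numeric vectors (the patent does not say how questions are vectorized). The text only states that a larger D(x) means a higher selection probability; weighting by D(x) squared is the usual k-means++ choice and is an assumption here, as are the function names.

```python
import math
import random


def nearest_center_distance(point, centers):
    """D(x): Euclidean distance from a point to its nearest chosen center."""
    return min(math.dist(point, c) for c in centers)


def init_centroids(samples, k, seed=0):
    """Pick k initial centroids so that far-away points are favored."""
    rng = random.Random(seed)
    centers = [rng.choice(samples)]  # first center: uniformly at random
    while len(centers) < k:
        # Weight each point by D(x)^2 (assumed; the standard k-means++ choice),
        # so points far from every existing center are more likely to be picked.
        weights = [nearest_center_distance(x, centers) ** 2 for x in samples]
        centers.append(rng.choices(samples, weights=weights, k=1)[0])
    return centers
```

A point that is already a center has weight zero and is not drawn again. The sketch assumes the sample set contains at least k distinct points; otherwise all weights can become zero and `rng.choices` raises an error.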
Step 3: minimize the sum of the Euclidean distances from all samples to the centers of the clusters they belong to.
Starting from the initial centroids selected in step 2, calculate the distance between each sample $x_i$ and each centroid vector $\mu_j$:
$$d_{ij} = \lVert x_i - \mu_j \rVert_2$$
where $d_{ij}$ is the distance from the sample to the centroid vector. Then label $x_i$ with the class corresponding to the smallest $d_{ij}$:
$$\lambda_i = \arg\min_{j} d_{ij}$$
With the set of clusters denoted C, $x_i$ is placed in cluster $C_{\lambda_i}$:
$$C_{\lambda_i} = C_{\lambda_i} \cup \{x_i\}$$
For each $j = 1, 2, \ldots, k$, a new centroid is recalculated from all samples in $C_j$:
$$\mu_j = \frac{1}{|C_j|} \sum_{x \in C_j} x$$
If none of the k centroid vectors has changed, output the final cluster division, that is, the clustering result $C = \{C_1, C_2, \ldots, C_k\}$; otherwise, repeat the assignment and update steps.
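The assignment and update loop of step 3 might look like the sketch below. Samples and centroids are assumed to be tuples of floats; the `max_iter` cap and the handling of empty clusters are assumptions that the patent does not address.

```python
import math


def assign(samples, centroids):
    """Put each sample in the cluster of its nearest centroid."""
    clusters = [[] for _ in centroids]
    for x in samples:
        j = min(range(len(centroids)), key=lambda j: math.dist(x, centroids[j]))
        clusters[j].append(x)
    return clusters


def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def k_means(samples, centroids, max_iter=100):
    for _ in range(max_iter):
        clusters = assign(samples, centroids)
        # Keep the old centroid if a cluster ends up empty (an assumption).
        updated = [mean(c) if c else mu for c, mu in zip(clusters, centroids)]
        if updated == centroids:  # no centroid moved: converged
            break
        centroids = updated
    return clusters, centroids
```

For example, `k_means([(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)], [(0.0, 0.0), (10.0, 10.0)])` converges after two passes with centroids `(0.0, 1.0)` and `(10.0, 11.0)`.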
Second, a text similarity module based on Simhash. The text similarity module proceeds as follows:
Step 1: perform word segmentation on each question and calculate the TF-IDF value of each word.
TF-IDF stands for term frequency-inverse document frequency. It is used to determine which words are most likely to be asked about across all the question-and-answer data. The TF-IDF value is obtained by weighting how often a word occurs in a question by how rarely it occurs across all questions; a higher TF-IDF score indicates that the word is more closely related to the question. The formula is:
$$w_{x,y} = tf_{x,y} \times \log\left(\frac{N}{df_x}\right)$$
where $tf_{x,y}$ is the frequency of word $x$ in question-and-answer item $y$, $N$ is the total number of question-and-answer items, $df_x$ is the number of question-and-answer items that contain word $x$, and $w_{x,y}$ is the final score of word $x$.
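As a rough illustration of the TF-IDF weighting, the sketch below assumes the questions have already been segmented into word lists (the patent does not name a segmenter), takes the term frequency as the word's share of the question, and uses the natural logarithm; the log base and the function name `tf_idf` are assumptions.

```python
import math
from collections import Counter


def tf_idf(docs):
    """docs: list of token lists, one per question.

    Returns one {word: weight} dict per question.
    """
    n = len(docs)
    # df[word] = number of questions that contain the word at least once
    df = Counter(word for doc in docs for word in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({
            word: (count / len(doc)) * math.log(n / df[word])
            for word, count in counts.items()
        })
    return scores
```

A word that appears in every question gets weight 0, because log(N / df) is then log(1); such a word tells the questions apart no better than chance.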
Step 2: calculate the similarity between pieces of question-and-answer data using Simhash, and merge similar data.
The weight w of each word was calculated in step 1. For each piece of question-and-answer data, the three keywords with the highest weights are extracted, so that each piece can be expressed as $(x_1, w_1), (x_2, w_2), (x_3, w_3)$ with $w_1 < w_2 < w_3$. A hash function then maps each keyword to a binary string, for example $x_1 = 10101$, $x_2 = 00110$, $x_3 = 01011$. The results are then weighted. Let $w_3 = 4$, $w_2 = 3$, $w_1 = 2$; wherever a bit of a keyword's binary string is 1 the weight is taken as positive, and wherever it is 0 the weight is taken as negative, that is, $x_1 = [2, -2, 2, -2, 2]$, $x_2 = [-3, -3, 3, 3, -3]$, $x_3 = [-4, 4, -4, 4, 4]$. Next, the sequences are summed position by position to obtain $x_1 + x_2 + x_3 = [-5, -1, 1, 5, 3]$. Finally, the sequence is reduced: each positive position becomes 1 and each negative position becomes 0, giving $[0, 0, 1, 1, 1]$, which is the Simhash value of this piece of question-and-answer data. The Simhash value of every piece of question-and-answer data is calculated in this way, and whether two pieces are similar is judged by their Hamming distance: if the Hamming distance between two pieces is less than 3, they are judged to be similar.
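The worked example can be checked with a few lines of code. The sketch takes the keyword hashes as ready-made bit strings, exactly as in the example; how keywords are hashed in practice is not specified in the patent, and the function names are assumptions.

```python
def simhash(weighted_hashes):
    """weighted_hashes: list of (bit_string, weight) pairs of equal bit length."""
    width = len(weighted_hashes[0][0])
    totals = [0] * width
    for bits, weight in weighted_hashes:
        for i, bit in enumerate(bits):
            # a 1 bit adds the keyword's weight, a 0 bit subtracts it
            totals[i] += weight if bit == "1" else -weight
    # reduce: positive positions become 1, all others 0
    return "".join("1" if t > 0 else "0" for t in totals)


def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


example = [("10101", 2), ("00110", 3), ("01011", 4)]
print(simhash(example))  # 00111
```

With the values from the text, the per-position totals are [-5, -1, 1, 5, 3], so the fingerprint is 00111. Real fingerprints are typically much longer than 5 bits (64 is common), which is what makes a Hamming-distance threshold of 3 meaningful.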
The present invention further provides a method for determining text similarity, as shown in FIG. 2, comprising:
acquiring question-and-answer data of users, preprocessing the question-and-answer data, selecting initial centroids from the preprocessed question-and-answer data, and obtaining clustered question-and-answer data according to the initial centroids;
performing word segmentation on the question data in the clustering result, calculating the TF-IDF value of each segmented word, calculating the Simhash values of the clustered question-and-answer data according to the TF-IDF values, and determining the similarity between pieces of clustered question-and-answer data according to the Simhash values.
The method further comprises merging clustered question-and-answer data whose similarity is greater than a preset value.
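One possible reading of the merging step is sketched below, assuming that "similarity greater than a preset value" is implemented as "Hamming distance between Simhash fingerprints below a threshold". The greedy grouping strategy, the default threshold of 3, and the function names are assumptions rather than the patent's stated procedure.

```python
def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


def merge_similar(fingerprints, max_distance=3):
    """fingerprints: {question: simhash_bit_string} for one cluster.

    Returns lists of questions judged similar enough to merge.
    """
    groups = []  # each entry: (representative fingerprint, [questions])
    for question, fp in fingerprints.items():
        for rep, members in groups:
            if hamming(fp, rep) < max_distance:
                members.append(question)  # close to an existing group
                break
        else:
            groups.append((fp, [question]))  # start a new group
    return [members for _, members in groups]
```

Each question is compared only with one representative per group, so the result depends on input order; comparing against every member, or using a union-find structure, would be a stricter alternative.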
The preprocessing specifically comprises cleaning the question-and-answer data.
Selecting the initial centroids of the preprocessed question-and-answer data specifically comprises:
generating a sample set from the preprocessed question-and-answer data, randomly selecting a point in the sample set as a cluster center, calculating the distance from each point in the sample set to the cluster center, determining a plurality of cluster centroids according to those distances, and taking the cluster centroids as the initial centroids.
Obtaining the clustered question-and-answer data specifically comprises:
for the initial centroids, determining the distance between each point in the sample set and each initial centroid vector, clustering the points according to those distances, re-determining new centroids in the sample set, and outputting the clustering result if the centroid vectors no longer change.
Compared with the prior art, the invention first gathers data of the same type into the same cluster using a clustering algorithm and then, using a Simhash-based text similarity calculation method, compares the similarity of pairs of question-and-answer data by calculating the weights of keywords in the question-and-answer corpus and the Simhash values, thereby converting unstructured data from a chat platform into structured data that an intelligent customer service system can recognize.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein. The scheme in the embodiments of the application can be implemented in various computer languages, such as the object-oriented programming language Java and the interpreted scripting language JavaScript.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While the preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims (10)

1. A system for determining text similarity, the system comprising:
a clustering module, configured to acquire question-and-answer data of users, preprocess the question-and-answer data, select initial centroids from the preprocessed question-and-answer data, and obtain clustered question-and-answer data according to the initial centroids;
and a text similarity determining module, configured to perform word segmentation on the question data in the clustering result, calculate the TF-IDF value of each segmented word, calculate the Simhash values of the clustered question-and-answer data according to the TF-IDF values, and determine the similarity between pieces of clustered question-and-answer data according to the Simhash values.
2. The system of claim 1, wherein the text similarity determining module is further configured to merge clustered question-and-answer data whose similarity is greater than a preset value.
3. The system according to claim 1, wherein the preprocessing specifically comprises cleaning the question-and-answer data.
4. The system according to claim 1, wherein selecting the initial centroids of the preprocessed question-and-answer data specifically comprises:
generating a sample set from the preprocessed question-and-answer data, randomly selecting a point in the sample set as a cluster center, calculating the distance from each point in the sample set to the cluster center, determining a plurality of cluster centroids according to those distances, and taking the cluster centroids as the initial centroids.
5. The system according to claim 1, wherein obtaining the clustered question-and-answer data specifically comprises:
for the initial centroids, determining the distance between each point in the sample set and each initial centroid vector, clustering the points according to those distances, re-determining new centroids in the sample set, and outputting the clustering result if the centroid vectors no longer change.
6. A method for determining text similarity, the method comprising:
acquiring question-and-answer data of users, preprocessing the question-and-answer data, selecting initial centroids from the preprocessed question-and-answer data, and obtaining clustered question-and-answer data according to the initial centroids;
performing word segmentation on the question data in the clustering result, calculating the TF-IDF value of each segmented word, calculating the Simhash values of the clustered question-and-answer data according to the TF-IDF values, and determining the similarity between pieces of clustered question-and-answer data according to the Simhash values.
7. The method of claim 6, further comprising merging clustered question-and-answer data whose similarity is greater than a preset value.
8. The method according to claim 6, wherein the preprocessing specifically comprises cleaning the question-and-answer data.
9. The method according to claim 6, wherein selecting the initial centroids of the preprocessed question-and-answer data specifically comprises:
generating a sample set from the preprocessed question-and-answer data, randomly selecting a point in the sample set as a cluster center, calculating the distance from each point in the sample set to the cluster center, determining a plurality of cluster centroids according to those distances, and taking the cluster centroids as the initial centroids.
10. The method according to claim 6, wherein obtaining the clustered question-and-answer data specifically comprises:
for the initial centroids, determining the distance between each point in the sample set and each initial centroid vector, clustering the points according to those distances, re-determining new centroids in the sample set, and outputting the clustering result if the centroid vectors no longer change.
CN202111572563.4A 2021-12-21 2021-12-21 System and method for determining text similarity Pending CN114416975A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111572563.4A CN114416975A (en) 2021-12-21 2021-12-21 System and method for determining text similarity

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111572563.4A CN114416975A (en) 2021-12-21 2021-12-21 System and method for determining text similarity

Publications (1)

Publication Number Publication Date
CN114416975A true CN114416975A (en) 2022-04-29

Family

ID=81267911

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111572563.4A Pending CN114416975A (en) 2021-12-21 2021-12-21 System and method for determining text similarity

Country Status (1)

Country Link
CN (1) CN114416975A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination