CN113407694A - Customer service robot knowledge base ambiguity detection method, device and related equipment

Info

Publication number
CN113407694A
Authority
CN
China
Prior art keywords
category
deep learning
ambiguity
learning model
knowledge base
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110693227.9A
Other languages
Chinese (zh)
Other versions
CN113407694B (en)
Inventor
潘晟锋
刘云峰
吴悦
胡晓
汶林丁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Zhuiyi Technology Co Ltd
Original Assignee
Shenzhen Zhuiyi Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Zhuiyi Technology Co Ltd
Priority to CN202110693227.9A
Publication of CN113407694A
Application granted
Publication of CN113407694B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0281 Customer communication at a business location, e.g. providing product or service information, consulting

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Business, Economics & Management (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Development Economics (AREA)
  • Databases & Information Systems (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Accounting & Taxation (AREA)
  • Computing Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Finance (AREA)
  • Biomedical Technology (AREA)
  • Strategic Management (AREA)
  • Game Theory and Decision Science (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Human Computer Interaction (AREA)
  • Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application relates to a method, a device and related equipment for detecting ambiguity in a customer service robot knowledge base. The method comprises the following steps: constructing a knowledge base, wherein the knowledge base is divided according to FAQs, each FAQ has at least one similar question, and each FAQ constitutes one category; dividing the knowledge base into a test set and a training set for a deep learning model; training the deep learning model on the training set, and performing ambiguity detection by using the trained deep learning model; updating the knowledge base according to the ambiguity detection result; and repeating the above steps until the learning effect no longer improves. Because the knowledge base is updated according to the ambiguity detection result and the training step is repeated until the learning effect reaches the expected standard, the method assists in finding and correcting ambiguities in the knowledge base and yields a disambiguated knowledge base. Data extracted from the disambiguated knowledge base then serve as the training set and test set of the deep learning model, which further improves its learning effect.

Description

Customer service robot knowledge base ambiguity detection method, device and related equipment
This application is a divisional application of Chinese patent application No. 201810801678.8, entitled "Customer service robot knowledge base ambiguity detection method", filed with the Chinese Patent Office on July 19, 2018.
Technical Field
The application relates to the technical field of artificial intelligence, in particular to a customer service robot knowledge base ambiguity detection method, a customer service robot knowledge base ambiguity detection device and relevant equipment.
Background
As the number of internet users grows, the service pressure on enterprise customer service departments keeps increasing. Most of the questions users raise are repetitive, and such questions can be answered with a fixed template. To reduce the labor cost of a customer service center, a customer service robot can be introduced: a program determines the type of the user's question, gives a standard answer directly if the question belongs to an FAQ (Frequently Asked Questions), and otherwise transfers the user to a human agent.
In the related art, a customer service robot recognizes user intent with machine learning, converting intent recognition into a question classification problem. Each FAQ corresponds to one category, and each category has more than one similar question. All FAQs and their similar questions constitute the robot's knowledge base.
The effect of a machine learning model usually depends on the training data selected from the knowledge base, and the labeling accuracy of the training data in particular has a great influence on the model. However, because time and labor are limited, the knowledge base often contains a large amount of ambiguity: for example, a question is assigned to an incorrect category, or the semantics of two categories overlap. Such ambiguity may cause the model to learn incorrect knowledge and thereby reduce its accuracy.
Disclosure of Invention
In order to overcome the problems in the related art at least to a certain extent, the application provides a method, a device and related equipment for detecting ambiguity of a knowledge base of a customer service robot.
In a first aspect, the present application provides a method for detecting ambiguity in a knowledge base of a customer service robot, including:
constructing a knowledge base, wherein the knowledge base is divided according to FAQs, each FAQ has at least one similar question, and each FAQ constitutes one category;
dividing the knowledge base into a test set and a training set for a deep learning model;
training the deep learning model on the training set, and performing ambiguity detection by using the trained deep learning model;
updating the knowledge base according to the ambiguity detection result;
and repeating the above steps until the learning effect no longer improves, thereby obtaining a disambiguated knowledge base.
Further, dividing the knowledge base into a test set and a training set for the deep learning model includes: for each FAQ, randomly extracting a preset number of its similar questions as the test data of the corresponding category, and taking the remaining similar questions as the training data of that category. The test data of all categories constitute the test set, and the training data of all categories constitute the training set.
Further, the deep learning model comprises a feature extractor and a shallow classifier, and training the deep learning model on the training set includes the following steps:
inputting the questions of the FAQs in the training set into the deep learning model as the input part;
converting the questions in the input part into feature vectors by using the feature extractor in the deep learning model;
calculating a prediction result from the feature vectors by using the shallow classifier in the deep learning model, wherein the prediction result is the category corresponding to a question in the input part;
optimizing the model with an optimizer so as to minimize the average difference between the actual categories labeled for the questions in the training set and the prediction results of the deep learning model;
and evaluating the trained model on the test set by calculating the consistency rate between the model's prediction results and the actual categories labeled for the questions in the test set, which serves as the evaluation of the model's learning effect.
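The training and evaluation steps above can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the application: the hashed bag-of-words `extract_features` stands in for the feature extractor, a single softmax layer stands in for the shallow classifier, plain gradient descent on the mean cross-entropy stands in for the optimizer and the "average difference", and the toy questions, names and hyperparameters are hypothetical.

```python
import numpy as np

def extract_features(question, dim=32):
    """Stand-in feature extractor (assumption): hashed bag of words, L2-normalised."""
    vec = np.zeros(dim)
    for word in question.lower().split():
        vec[sum(ord(ch) for ch in word) % dim] += 1.0  # deterministic toy hash
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

def softmax(scores):
    shifted = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def train_classifier(features, labels, n_categories, lr=0.5, epochs=300):
    """Fit a softmax layer by gradient descent on the mean cross-entropy."""
    weights = np.zeros((features.shape[1], n_categories))
    one_hot = np.eye(n_categories)[labels]
    for _ in range(epochs):
        probs = softmax(features @ weights)
        weights -= lr * features.T @ (probs - one_hot) / len(features)
    return weights

def predict(weights, features):
    """Predicted category = index of the highest score."""
    return (features @ weights).argmax(axis=1)

def consistency_rate(predicted, labeled):
    """Share of questions whose predicted category equals the labeled one."""
    return float(np.mean(np.asarray(predicted) == np.asarray(labeled)))

# Hypothetical toy data: category 0 = returns, category 1 = refunds.
train = [("how do i return my order", 0), ("i want to return goods", 0),
         ("when will my refund arrive", 1), ("where is my refund", 1)]
test = [("can i return this item", 0), ("refund not received yet", 1)]

X_train = np.array([extract_features(q) for q, _ in train])
y_train = np.array([c for _, c in train])
W = train_classifier(X_train, y_train, n_categories=2)

X_test = np.array([extract_features(q) for q, _ in test])
y_test = np.array([c for _, c in test])
print("consistency rate on test set:", consistency_rate(predict(W, X_test), y_test))
```

In a real system the feature extractor and classifier would typically be trained jointly; the two-stage layout here only keeps the roles of the two components visible.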
Further, the ambiguity detection includes category ambiguity detection, labeling error detection and labeling ambiguity detection, and performing ambiguity detection by using the trained deep learning model includes:
detecting ambiguity by using the feature extractor in the deep learning model;
and detecting ambiguity by using the shallow classifier in the deep learning model.
Further, detecting ambiguity by using the feature extractor in the deep learning model includes:
converting the similar questions in a data set into feature vectors by using the feature extractor in the deep learning model, wherein the data set comprises the training set and/or the test set;
combining the feature vectors of the questions into question feature vector pairs (x, y), wherein the question corresponding to feature vector x and the question corresponding to feature vector y come from different categories;
calculating the vector similarity cos(x, y) of each question feature vector pair, where
$$\cos(x, y) = \frac{x \cdot y}{\lVert x \rVert \, \lVert y \rVert}$$
and sorting all question feature vector pairs from high to low by vector similarity, selecting the top-ranked pairs, and judging from them whether ambiguity exists.
Further, judging whether ambiguity exists from the top-ranked question feature vector pairs includes:
judging whether labeling ambiguity or a labeling error exists: extracting a first preset number of top-ranked question feature vector pairs, and manually checking whether the corresponding question pairs exhibit labeling ambiguity or labeling errors;
judging whether category ambiguity exists: for the first preset number of question feature vector pairs, counting how many times each corresponding category pair occurs, sorting the category pairs from high to low by number of occurrences, taking a second preset number of them, and manually checking whether category ambiguity exists.
Further, detecting ambiguity by using the shallow classifier in the deep learning model includes:
counting the classification results of the deep learning model to form a confusion matrix, wherein each row i of the confusion matrix corresponds to a labeled category, each column j corresponds to a category predicted by the deep learning model, element $x_{ij}$ is the number of questions labeled as category i and predicted by the model as category j, and element $x_{ji}$ is the number of questions labeled as category j and predicted by the model as category i;
calculating the number of samples labeled as category i in the data set, which is
$$\sum_{k} x_{ik}$$
where k ranges over all categories;
calculating the number of samples labeled as category j in the data set, which is
$$\sum_{k} x_{jk}$$
where k ranges over all categories;
calculating the proportion $P_{ij}$ of the samples labeled as category i that the deep learning model predicts as category j, and the proportion $P_{ji}$ of the samples labeled as category j that the model predicts as category i, according to:
$$P_{ij} = \frac{x_{ij}}{\sum_{k} x_{ik}}, \qquad P_{ji} = \frac{x_{ji}}{\sum_{k} x_{jk}}$$
wherein category i and category j are different categories, and the data set comprises the training set and/or the test set;
calculating the confusion degree of the category pair (category i, category j), wherein the confusion degree is the harmonic mean $S_{ij}$ of $P_{ij}$ and $P_{ji}$:
$$S_{ij} = \frac{2 P_{ij} P_{ji}}{P_{ij} + P_{ji}}$$
and judging from the confusion degree whether ambiguity exists between category i and category j.
Further, judging from the confusion degree whether ambiguity exists between category i and category j includes:
sorting the calculated confusion degrees;
and extracting a third preset number of category pairs with the highest confusion degrees, and manually checking whether category ambiguity exists.
Further, detecting ambiguity by using the shallow classifier in the deep learning model further includes: finding the data in a data set whose labeled category is inconsistent with the category predicted by the deep learning model, and manually checking whether labeling errors exist, wherein the data set comprises the training set and/or the test set.
Further, updating the knowledge base according to the ambiguity detection result includes:
manually rewriting and manually re-labeling each question detected as ambiguous, and deleting its original label;
and, for categories detected as ambiguous, regrouping and reassigning their similar questions, and deleting the original ambiguous categories.
In a second aspect, the present application provides a customer service robot knowledge base ambiguity detection apparatus, including:
a construction module, configured to construct a knowledge base, wherein the knowledge base is divided according to FAQs, each FAQ has at least one similar question, and each FAQ constitutes one category;
a dividing module, configured to divide the knowledge base into a test set and a training set for a deep learning model;
a training module, configured to train the deep learning model on the training set and perform ambiguity detection by using the trained deep learning model; the ambiguity detection includes category ambiguity detection, labeling error detection and labeling ambiguity detection, and performing ambiguity detection by using the trained deep learning model includes detecting ambiguity by using the shallow classifier in the deep learning model, which comprises: counting the classification results of the deep learning model to form a confusion matrix, wherein each row i of the confusion matrix corresponds to a labeled category, each column j corresponds to a category predicted by the deep learning model, element $x_{ij}$ is the number of questions labeled as category i and predicted by the model as category j, and element $x_{ji}$ is the number of questions labeled as category j and predicted by the model as category i; calculating the number of samples labeled as category i in the data set, which is
$$\sum_{k} x_{ik}$$
where k ranges over all categories; calculating the number of samples labeled as category j in the data set, which is
$$\sum_{k} x_{jk}$$
where k ranges over all categories; calculating the proportion $P_{ij}$ of the samples labeled as category i that the deep learning model predicts as category j, and the proportion $P_{ji}$ of the samples labeled as category j that the model predicts as category i, according to:
$$P_{ij} = \frac{x_{ij}}{\sum_{k} x_{ik}}, \qquad P_{ji} = \frac{x_{ji}}{\sum_{k} x_{jk}}$$
wherein category i and category j are different categories, and the data set comprises the training set and/or the test set; calculating the confusion degree of the category pair (category i, category j), wherein the confusion degree is the harmonic mean $S_{ij}$ of $P_{ij}$ and $P_{ji}$:
$$S_{ij} = \frac{2 P_{ij} P_{ji}}{P_{ij} + P_{ji}}$$
and judging from the confusion degree whether ambiguity exists between category i and category j;
an updating module, configured to update the knowledge base according to the ambiguity detection result;
and a repeating module, configured to repeat the above steps until the learning effect no longer improves, thereby obtaining a disambiguated knowledge base.
In a third aspect, the present application provides an electronic device, comprising:
at least one memory for storing a program;
at least one processor configured to load the program to perform the customer service robot knowledge base ambiguity detection method of any one of the first aspect.
In a fourth aspect, the present application provides a computer-readable storage medium storing a processor-executable program,
wherein the program, when executed by a processor, performs the customer service robot knowledge base ambiguity detection method of any one of the first aspect.
The technical solutions provided by the embodiments of the application can have the following beneficial effects:
because the knowledge base is updated according to the ambiguity detection result and the training step is repeated until the learning effect reaches the expected standard, the method assists in finding and correcting ambiguities in the knowledge base and yields a disambiguated knowledge base; data extracted from the disambiguated knowledge base then serve as the training set and test set of the deep learning model, which further improves its learning effect.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application.
Fig. 1 is a schematic flowchart of a customer service robot knowledge base ambiguity detection method according to an embodiment of the present application.
Fig. 2 is a flowchart illustrating a method for ambiguity detection in a knowledge base of a customer service robot according to another embodiment of the present application.
Detailed Description
The invention is described in detail below with reference to the figures and examples.
Fig. 1 is a schematic flowchart of a customer service robot knowledge base ambiguity detection method according to an embodiment of the present application.
As shown in fig. 1, the method of the present embodiment includes:
S11: Constructing a knowledge base, wherein the knowledge base is divided according to FAQs, each FAQ has a variable number of similar questions, and each FAQ constitutes one category.
A knowledge base is built on large-scale knowledge processing and is applied in fields such as natural language understanding, knowledge management, automatic question answering and reasoning. Intelligent customer service not only provides enterprises with a fine-grained knowledge management technology, but also establishes a fast and effective natural-language-based means of communication between enterprises and their many users. Taking the customer service robot knowledge base of an e-commerce enterprise as an example, the knowledge base comprises a plurality of FAQs, such as "return process" and "refund process". Taking "return process" as an example, the FAQ may include similar questions such as "How do I return what I bought yesterday?" and "What should I do if I want to return goods?".
S12: Dividing the knowledge base into a test set and a training set for the deep learning model.
N FAQs for which ambiguity is to be detected are selected from the knowledge base as N categories. For each FAQ, a preset number of similar questions are randomly extracted as the test data of that category, and the remaining similar questions are used as its training data. The test data of all categories constitute the test set, and the training data of all categories constitute the training set.
For example, suppose the knowledge base contains 10 FAQs with 20 similar questions each. If a preset number of similar questions, for example 3, are randomly extracted from each category as the test set of the deep learning model, the test set contains 30 similar questions and the remaining 170 similar questions form the training set.
It should be noted that the number of categories in the knowledge base and the number of similar questions in each category are not limited to those in this example.
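The per-category split of step S12 can be sketched in Python as follows. This is only an illustration under assumptions: the knowledge base is represented as a hypothetical dictionary mapping each FAQ (category) to its list of similar questions, and the function name, the fixed seed and the placeholder question strings are not from the application.

```python
import random

def split_knowledge_base(knowledge_base, n_test, seed=0):
    """Hold out n_test randomly chosen similar questions per FAQ as test data."""
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    train, test = [], []
    for category, questions in knowledge_base.items():
        if len(questions) <= n_test:
            raise ValueError(f"category {category!r} has too few similar questions")
        shuffled = list(questions)
        rng.shuffle(shuffled)
        test += [(q, category) for q in shuffled[:n_test]]
        train += [(q, category) for q in shuffled[n_test:]]
    return train, test

# The numbers from the example: 10 FAQs, 20 similar questions each, 3 held out per FAQ.
kb = {f"faq_{c}": [f"question {c}-{i}" for i in range(20)] for c in range(10)}
train_set, test_set = split_knowledge_base(kb, n_test=3)
print(len(train_set), len(test_set))  # 170 30
```

Splitting inside each category, rather than over the pooled questions, ensures that every category appears in both the training set and the test set.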
S13: Training the deep learning model on the training set, and performing ambiguity detection by using the trained deep learning model.
The deep learning model comprises a feature extractor and a shallow classifier.
The ambiguity detection includes category ambiguity detection, labeling error detection and labeling ambiguity detection.
The ambiguity includes:
category ambiguity: the meanings of two categories are very similar. For example, category 1 is "order problems" and category 2 is "product change and cancellation problems"; the semantics of category 1 and category 2 overlap because category 1 essentially covers category 2;
labeling ambiguity: a question could be labeled as more than one category. For example, category 1 is "product return questions" and category 2 is "product price questions"; the question "This is too expensive, I want to return it" has labeling ambiguity because it carries the meanings of both categories;
labeling error: a question is assigned to the wrong category. For example, category 1 is "product return questions" and category 2 is "product price questions"; if the question "I do not want it" is labeled as category 2, a labeling error has occurred.
The ambiguity detection is performed on the test set and/or the training set.
Performing ambiguity detection by using the trained deep learning model includes:
detecting ambiguity by using the feature extractor in the deep learning model;
detecting ambiguity by using the shallow classifier in the deep learning model.
Detecting ambiguity by using the feature extractor in the deep learning model includes:
converting the similar questions in a data set into feature vectors by using the feature extractor in the deep learning model, wherein the data set comprises the training set and/or the test set;
combining the feature vectors of the questions into question feature vector pairs (x, y), wherein the question corresponding to feature vector x and the question corresponding to feature vector y come from different categories;
calculating the vector similarity cos(x, y) of each question feature vector pair, where
$$\cos(x, y) = \frac{x \cdot y}{\lVert x \rVert \, \lVert y \rVert}$$
and sorting all question feature vector pairs from high to low by vector similarity, selecting the top-ranked pairs, and judging from them whether ambiguity exists.
Judging whether ambiguity exists from the top-ranked question feature vector pairs includes:
judging whether labeling ambiguity or a labeling error exists: extracting a first preset number, for example 30, of top-ranked question feature vector pairs, and manually checking whether the corresponding question pairs exhibit labeling ambiguity or labeling errors;
judging whether category ambiguity exists: for the first preset number of question feature vector pairs, counting how many times each corresponding category pair occurs, sorting the category pairs from high to low by number of occurrences, taking a second preset number of them, for example 20 category pairs, and manually checking whether category ambiguity exists.
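The feature-extractor-based detection above can be sketched in Python. The sketch assumes the feature vectors have already been computed and are non-zero; the two-dimensional toy vectors, the category names and the function names are hypothetical, and the exhaustive pairwise loop is chosen for clarity rather than for efficiency on a large knowledge base.

```python
from collections import Counter
from itertools import combinations

import numpy as np

def cosine(x, y):
    """Cosine similarity of two non-zero feature vectors."""
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))

def top_cross_category_pairs(vectors, labels, first_n):
    """Return the first_n most similar question pairs whose categories differ."""
    scored = []
    for a, b in combinations(range(len(vectors)), 2):
        if labels[a] != labels[b]:  # only pairs from different categories
            scored.append((cosine(vectors[a], vectors[b]), a, b))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:first_n]

def top_category_pairs(pairs, labels, second_n):
    """Count how often each category pair occurs among the selected pairs."""
    counts = Counter(tuple(sorted((labels[a], labels[b]))) for _, a, b in pairs)
    return counts.most_common(second_n)

# Hypothetical feature vectors and labels for five questions.
vectors = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [1.0, 0.05]])
labels = ["return", "refund", "price", "price", "refund"]

pairs = top_cross_category_pairs(vectors, labels, first_n=2)
for similarity, a, b in pairs:
    print(f"questions {a} and {b}: {labels[a]} vs {labels[b]}, cos = {similarity:.4f}")
print(top_category_pairs(pairs, labels, second_n=1))  # [(('refund', 'return'), 2)]
```

The question pairs printed first are the candidates for manual review of labeling ambiguity and labeling errors, while the most frequent category pairs are the candidates for manual review of category ambiguity.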
Detecting ambiguity by using the shallow classifier in the deep learning model includes:
counting the classification results of the deep learning model to form a confusion matrix, wherein each row i of the confusion matrix corresponds to a labeled category, each column j corresponds to a category predicted by the deep learning model, element $x_{ij}$ is the number of questions labeled as category i and predicted by the model as category j, and element $x_{ji}$ is the number of questions labeled as category j and predicted by the model as category i;
calculating the number of samples labeled as category i in the data set, which is
$$\sum_{k} x_{ik}$$
where k ranges over all categories;
calculating the number of samples labeled as category j in the data set, which is
$$\sum_{k} x_{jk}$$
where k ranges over all categories;
calculating the proportion $P_{ij}$ of the samples labeled as category i that the deep learning model predicts as category j, and the proportion $P_{ji}$ of the samples labeled as category j that the model predicts as category i, according to:
$$P_{ij} = \frac{x_{ij}}{\sum_{k} x_{ik}}, \qquad P_{ji} = \frac{x_{ji}}{\sum_{k} x_{jk}}$$
wherein category i and category j are different categories, and the data set comprises the training set and/or the test set;
calculating the confusion degree of the category pair (category i, category j), wherein the confusion degree is the harmonic mean $S_{ij}$ of $P_{ij}$ and $P_{ji}$:
$$S_{ij} = \frac{2 P_{ij} P_{ji}}{P_{ij} + P_{ji}}$$
and judging from the confusion degree whether ambiguity exists between category i and category j.
Judging from the confusion degree whether ambiguity exists between category i and category j includes:
sorting the calculated confusion degrees;
and extracting a third preset number, for example 5, of category pairs with the highest confusion degrees, and manually checking whether category ambiguity exists.
Detecting ambiguity by using the shallow classifier in the deep learning model further includes: finding the data in a data set whose labeled category is inconsistent with the category predicted by the deep learning model, and manually checking whether labeling errors exist, wherein the data set comprises the training set and/or the test set.
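The classifier-based detection above can be sketched in Python as follows. The sketch assumes categories are encoded as integers 0 to n-1 and that the model's predictions are already available; the function names and the toy label lists are hypothetical. Two choices are assumptions not stated in the text: a pair is skipped when one of its categories has no labeled samples, and the confusion degree is taken as 0 when both proportions are 0.

```python
import numpy as np

def confusion_matrix(labeled, predicted, n_categories):
    """x[i, j] = number of questions labeled i that the model predicts as j."""
    x = np.zeros((n_categories, n_categories), dtype=int)
    for i, j in zip(labeled, predicted):
        x[i, j] += 1
    return x

def confusion_degrees(x):
    """Return {(i, j): S_ij} for i < j, S_ij being the harmonic mean of P_ij and P_ji."""
    row_totals = x.sum(axis=1)  # number of samples labeled as each category
    degrees = {}
    for i in range(x.shape[0]):
        for j in range(i + 1, x.shape[0]):
            if row_totals[i] == 0 or row_totals[j] == 0:
                continue  # assumption: skip categories without labeled samples
            p_ij = x[i, j] / row_totals[i]
            p_ji = x[j, i] / row_totals[j]
            total = p_ij + p_ji
            degrees[(i, j)] = 0.0 if total == 0 else 2 * p_ij * p_ji / total
    return degrees

def most_confused_pairs(x, third_n):
    """Category pairs with the highest confusion degrees, for manual review."""
    ranked = sorted(confusion_degrees(x).items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:third_n]

def labeling_error_candidates(labeled, predicted):
    """Indices of samples whose labeled and predicted categories disagree."""
    return [n for n, (i, j) in enumerate(zip(labeled, predicted)) if i != j]

# Hypothetical labels and predictions for 3 categories with 10 samples each.
labeled = [0] * 10 + [1] * 10 + [2] * 10
predicted = [0] * 6 + [1] * 4 + [0] * 2 + [1] * 8 + [1] * 1 + [2] * 9

x = confusion_matrix(labeled, predicted, n_categories=3)
print(x)
print(most_confused_pairs(x, third_n=1))  # pair (0, 1): 2 * 0.4 * 0.2 / 0.6, about 0.267
print(len(labeling_error_candidates(labeled, predicted)))  # 7
```

Because the harmonic mean is dominated by the smaller of the two proportions, a pair scores high only when the model confuses the two categories in both directions, which is the signature of overlapping category semantics rather than of a few one-sided mislabeled samples.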
S14: Updating the knowledge base according to the ambiguity detection result, including:
manually rewriting and manually re-labeling each question detected as ambiguous, and deleting its original label;
and, for categories detected as ambiguous, regrouping and reassigning their similar questions, and deleting the original ambiguous categories.
S15: Repeating the above steps until the learning effect no longer improves, thereby obtaining a disambiguated knowledge base.
The learning effect is the consistency rate between the model's prediction results and the actual categories labeled for the questions in the test set. The consistency rate is, for example, the prediction accuracy, that is, the number of questions whose prediction matches the label divided by the total number of questions. The learning effect is considered to no longer improve when, for example, the prediction accuracy improves by less than 0.5%.
When the learning effect of the model no longer improves, the performance loss caused by ambiguity in the knowledge base has been eliminated, and the model can be trained on the knowledge base and deployed in a production environment.
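The outer loop of steps S12 to S15 can be sketched in Python. The two callables are hypothetical placeholders: `train_and_evaluate` stands for splitting, training and measuring test accuracy, and `review_and_update` stands for ambiguity detection followed by manual correction. Reading "less than 0.5%" as an absolute gain of 0.005 in accuracy is an assumption, as is the `max_rounds` safety limit.

```python
def refine_knowledge_base(knowledge_base, train_and_evaluate, review_and_update,
                          min_gain=0.005, max_rounds=20):
    """Repeat detection and correction until accuracy gains fall below min_gain."""
    accuracy = train_and_evaluate(knowledge_base)
    history = [accuracy]
    for _ in range(max_rounds):
        knowledge_base = review_and_update(knowledge_base)
        new_accuracy = train_and_evaluate(knowledge_base)
        history.append(new_accuracy)
        if new_accuracy - accuracy < min_gain:
            break  # learning effect no longer improves
        accuracy = new_accuracy
    return knowledge_base, history

# Stubs for illustration only: accuracies are simulated, and the "knowledge base"
# is just a revision counter incremented by each review round.
simulated_accuracies = iter([0.80, 0.85, 0.88, 0.882])

def fake_train_and_evaluate(kb):
    return next(simulated_accuracies)

def fake_review_and_update(kb):
    return kb + 1

final_kb, history = refine_knowledge_base(0, fake_train_and_evaluate, fake_review_and_update)
print(final_kb, history)  # 3 [0.8, 0.85, 0.88, 0.882]
```

In the simulated run the third review round raises accuracy by only 0.002, which is below the threshold, so the loop stops after three revisions of the knowledge base.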
In this embodiment, the knowledge base is updated according to the ambiguity detection result, and the training step is repeated until the learning effect reaches the expected standard. Ambiguity in the knowledge base is thus found with the assistance of the model and corrected manually, yielding a disambiguated knowledge base. Data extracted from the disambiguated knowledge base serve as the training set and test set of the deep learning model, which further improves the learning effect of the deep learning model.
Fig. 2 is a flowchart illustrating a method for ambiguity detection in a knowledge base of a customer service robot according to another embodiment of the present application.
As shown in fig. 2, the training of the deep learning model on the training set includes:
The concept of deep learning derives from research on artificial neural networks; a multi-layer perceptron with multiple hidden layers is one deep learning structure. Deep learning combines low-level features to form more abstract high-level representations of attribute categories or features, in order to discover distributed feature representations of the data. The deep learning model includes a feature extractor and a shallow classifier.
S21: inputting the question sentences in the training set into the deep learning model as an input part;
S22: converting the question sentences in the input part into feature vectors by using a feature extractor in the deep learning model;
The feature extractor is, for example, a recurrent neural network. The model reads each word of the question in sequence and outputs a feature vector of fixed dimension. It should be noted that the feature extractor is not limited to the recurrent neural network given as an example; any method that can convert a question into a feature vector of fixed dimension may be used as the feature extractor.
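By way of illustration only, a recurrent feature extractor of the kind mentioned for step S22 may be sketched as follows. The class name `TinyRNNExtractor`, the Elman-style update rule, the dimensions, the whitespace tokenization and the random, untrained weights are all assumptions made for this sketch; a practical system would learn the weights and use a tokenizer suited to the language of the questions.

```python
import math
import random
from typing import Dict, List


class TinyRNNExtractor:
    """Elman-style recurrent network: h_t = tanh(W_x e_t + W_h h_{t-1})."""

    def __init__(self, vocab: List[str], embed_dim: int = 8,
                 hidden_dim: int = 16, seed: int = 0):
        rng = random.Random(seed)
        rand = lambda rows, cols: [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                                   for _ in range(rows)]
        self.index: Dict[str, int] = {w: i for i, w in enumerate(vocab)}
        self.embed = rand(len(vocab) + 1, embed_dim)  # last row: unknown words
        self.w_x = rand(hidden_dim, embed_dim)
        self.w_h = rand(hidden_dim, hidden_dim)
        self.hidden_dim = hidden_dim

    def extract(self, question: str) -> List[float]:
        """Read the words in order and return the final hidden state."""
        h = [0.0] * self.hidden_dim
        for word in question.split():
            e = self.embed[self.index.get(word, len(self.index))]
            h = [
                math.tanh(sum(wx * x for wx, x in zip(self.w_x[k], e))
                          + sum(wh * p for wh, p in zip(self.w_h[k], h)))
                for k in range(self.hidden_dim)
            ]
        return h
```

Whatever the length of the question, the output has `hidden_dim` elements, which is the fixed-dimension property the shallow classifier relies on.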
S23: calculating a prediction result by utilizing a shallow classifier in the deep learning model according to the feature vector, wherein the prediction result is a category corresponding to a question in an input part;
The shallow classifier is, for example, a linear classifier. The classifier reads in a feature vector of fixed dimension, computes linear combinations of the vector elements to obtain a score for each category, and takes the category with the highest score as the prediction result. It should be noted that the shallow classifier is not limited to the linear classifier given as an example; any method that can convert a feature vector of fixed dimension into scores for the categories may be used as the shallow classifier.
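By way of illustration only, the scoring performed by a linear shallow classifier may be sketched as follows. The function names `linear_scores` and `predict`, and the use of one weight row plus one bias per category, are assumptions made for this sketch.

```python
from typing import List, Sequence


def linear_scores(weights: Sequence[Sequence[float]], bias: Sequence[float],
                  features: Sequence[float]) -> List[float]:
    """Score of each category: a linear combination of the feature elements."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, bias)]


def predict(weights: Sequence[Sequence[float]], bias: Sequence[float],
            features: Sequence[float], categories: Sequence[str]) -> str:
    """Return the category with the highest score."""
    scores = linear_scores(weights, bias, features)
    best = max(range(len(scores)), key=scores.__getitem__)
    return categories[best]
```

Because the classifier is shallow, differences between categories must already be present in the feature vector; two categories whose questions map to similar vectors receive similar scores and are easily confused.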
S24: optimizing the model with an optimizer so as to minimize the average difference between the actual categories labeled for the questions in the training set and the prediction results of the deep learning model;
The average difference is measured, for example, by a loss function. The loss function is, for example, cross entropy.
The optimizer is, for example, gradient descent. Gradient descent is an iterative method and one of the most commonly used methods for solving the model parameters of a machine learning algorithm, that is, for solving an unconstrained optimization problem. To find the minimum of the loss function, gradient descent iterates step by step until it obtains the minimized loss function and the corresponding model parameter values.
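By way of illustration only, minimizing a cross-entropy loss by gradient descent may be sketched as follows for a linear classifier acting on fixed feature vectors. The function names, the learning rate, the omission of bias terms and the use of full-batch updates are assumptions made for this sketch; in the described model the feature extractor would be optimized together with the classifier.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(weights, features_batch, labels) -> float:
    """Mean cross entropy between labeled categories and predictions."""
    loss = 0.0
    for x, y in zip(features_batch, labels):
        probs = softmax([sum(w * xi for w, xi in zip(row, x)) for row in weights])
        loss -= math.log(probs[y])
    return loss / len(labels)


def gradient_step(weights, features_batch, labels, lr: float = 0.5):
    """One full-batch gradient-descent update of a bias-free linear classifier."""
    grads = [[0.0] * len(row) for row in weights]
    for x, y in zip(features_batch, labels):
        probs = softmax([sum(w * xi for w, xi in zip(row, x)) for row in weights])
        for c, p in enumerate(probs):
            err = p - (1.0 if c == y else 0.0)  # d(loss)/d(score of category c)
            for d, xi in enumerate(x):
                grads[c][d] += err * xi / len(labels)
    return [[w - lr * g for w, g in zip(row, grow)]
            for row, grow in zip(weights, grads)]
```

Calling `gradient_step` repeatedly and monitoring `cross_entropy` gives the step-by-step iteration described above; labels are integer category indices in this sketch.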
S25: evaluating the trained model on the test set, and calculating the consistency rate between the model's predictions and the actual categories labeled for the questions in the test set as the evaluation of the model's learning effect, wherein the consistency rate is the prediction accuracy, that is, the number of questions whose prediction is consistent with the label divided by the total number of questions.
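By way of illustration only, the consistency rate of step S25 may be computed as follows. The function name `consistency_rate` and the error handling are assumptions made for this sketch.

```python
from typing import Sequence


def consistency_rate(labels: Sequence[str], predictions: Sequence[str]) -> float:
    """Number of questions predicted consistently, divided by the total."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    if not labels:
        raise ValueError("the test set must not be empty")
    consistent = sum(1 for l, p in zip(labels, predictions) if l == p)
    return consistent / len(labels)
```

For example, three consistent predictions out of four test questions give a consistency rate of 0.75.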
In this embodiment, the deep learning model is trained on the FAQs in the training set, the optimizer continuously optimizes the model during training, and iteration continuously improves the learning effect of the deep learning model and, with it, the accuracy of ambiguity detection.
An embodiment of the present application provides a customer service robot knowledge base ambiguity detection apparatus, including:
the construction module is used for constructing a knowledge base, wherein the knowledge base is divided according to FAQs, each FAQ is provided with at least one similar question sentence, and each FAQ is one category;
the dividing module is used for dividing the knowledge base into a test set and a training set of a deep learning model;
the training module is used for training a deep learning model on the training set and performing ambiguity detection with the trained deep learning model; the ambiguity detection includes category ambiguity detection, labeling error detection and labeling ambiguity detection, wherein performing ambiguity detection with the trained deep learning model comprises detecting ambiguity by using the shallow classifier in the deep learning model, which comprises: counting the classification results of the deep learning model to form a confusion matrix, wherein each row i of the confusion matrix corresponds to a labeled category, each column j corresponds to a category predicted by the deep learning model, element $x_{ij}$ is the number of questions labeled as category i and predicted by the model as category j, and element $x_{ji}$ is the number of questions labeled as category j and predicted by the model as category i; calculating the number of samples labeled as category i in the data set, the number of samples of category i being
$$\sum_k x_{ik}$$
where k is any category; calculating the number of samples labeled as category j in the data set, the number of samples of category j being
$$\sum_k x_{jk}$$
where k is any category; calculating the proportion $P_{ij}$ of samples labeled as category i in the data set that the deep learning model predicts as category j, and the proportion $P_{ji}$ of samples labeled as category j that it predicts as category i, $P_{ij}$ and $P_{ji}$ being calculated respectively as
$$P_{ij} = \frac{x_{ij}}{\sum_k x_{ik}}, \qquad P_{ji} = \frac{x_{ji}}{\sum_k x_{jk}}$$
wherein category i and category j are different categories, and the data set comprises the training set and/or the test set; calculating the confusion degree of the category pair (category i, category j), the confusion degree being the harmonic mean $S_{ij}$ of $P_{ij}$ and $P_{ji}$,
$$S_{ij} = \frac{2 P_{ij} P_{ji}}{P_{ij} + P_{ji}}$$
and judging whether ambiguity exists between category i and category j according to the confusion degree;
the updating module is used for updating the knowledge base according to an ambiguity detection result;
and the repeating module is used for repeating the above steps until the learning effect no longer improves, thereby obtaining a disambiguated knowledge base.
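By way of illustration only, the confusion-degree calculation performed by the training module may be sketched as follows. The function name `confusion_degrees` and the convention of returning 0 when either proportion is 0 (the harmonic mean is undefined when both are 0) are assumptions made for this sketch.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Sequence, Tuple


def confusion_degrees(labels: Sequence[str],
                      predictions: Sequence[str]) -> Dict[Tuple[str, str], float]:
    """Return the confusion degree of every pair of distinct labeled categories."""
    x = Counter(zip(labels, predictions))  # x[(i, j)]: labeled i, predicted j
    n = Counter(labels)                    # n[i]: samples labeled i (row sum)
    degrees = {}
    for i, j in combinations(sorted(n), 2):
        p_ij = x[(i, j)] / n[i]  # share of category i predicted as j
        p_ji = x[(j, i)] / n[j]  # share of category j predicted as i
        # Harmonic mean of the two proportions; 0 if either one is 0.
        degrees[(i, j)] = (2 * p_ij * p_ji / (p_ij + p_ji)
                           if p_ij and p_ji else 0.0)
    return degrees
```

As a worked example, if 2 of 4 questions labeled A are predicted as B and 1 of 5 questions labeled B is predicted as A, the proportions are 0.5 and 0.2 and the confusion degree is 2 × 0.5 × 0.2 / 0.7, about 0.286. The harmonic mean is high only when the confusion runs in both directions, which distinguishes mutually ambiguous categories from one-sided misclassification.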
In some embodiments, the apparatus further comprises:
the random extraction module is used for dividing the knowledge base into a test set and a training set of the deep learning model, which comprises: randomly extracting a preset number of the similar question sentences corresponding to each FAQ as test data of the category corresponding to that FAQ, and taking the remaining similar question sentences as training data of that category; the test data of all categories constitute the test set and the training data of all categories constitute the training set;
the sorting module is used for sorting the calculated confusion degrees and extracting a third preset number of category pairs whose confusion degrees rank highest, for manual checking of whether category ambiguity exists;
and the labeling module is used for detecting ambiguity by using the shallow classifier of the deep learning model, which further comprises: finding the data for which the actual category labeled in a data set is inconsistent with the category predicted by the deep learning model, and manually checking whether labeling errors exist, wherein the data set comprises the training set and/or the test set.
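By way of illustration only, the division performed by the random extraction module may be sketched as follows. The function name `split_knowledge_base`, the dictionary representation of the knowledge base and the fixed random seed are assumptions made for this sketch.

```python
import random
from typing import Dict, List, Tuple

Dataset = List[Tuple[str, str]]  # (question, category)


def split_knowledge_base(kb: Dict[str, List[str]], test_per_category: int,
                         seed: int = 0) -> Tuple[Dataset, Dataset]:
    """Randomly hold out `test_per_category` questions of each FAQ for testing."""
    rng = random.Random(seed)
    train: Dataset = []
    test: Dataset = []
    for category, questions in kb.items():
        shuffled = questions[:]  # copy so the knowledge base is left untouched
        rng.shuffle(shuffled)
        held_out = shuffled[:test_per_category]
        rest = shuffled[test_per_category:]
        test += [(q, category) for q in held_out]
        train += [(q, category) for q in rest]
    return train, test
```

Extracting the same number of questions from every FAQ keeps each category represented in the test set, so that every row of the confusion matrix has samples.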
One embodiment of the present application provides an electronic device, including:
at least one memory for storing a program;
at least one processor, configured to load the program to perform the customer service robot knowledge base ambiguity detection method according to the above embodiments.
One embodiment of the present application provides a computer-readable storage medium storing a processor-executable program, wherein the processor-executable program, when executed by a processor, performs the customer service robot knowledge base ambiguity detection method according to the above embodiments.
It is understood that the same or similar parts of the above embodiments may refer to one another, and that content not described in detail in one embodiment may be found in the same or similar parts of other embodiments.
It should be noted that, in the description of the present application, the terms "first", "second", etc. are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. Further, in the description of the present application, "a plurality" means at least two unless otherwise specified.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and the scope of the preferred embodiments of the present application includes other implementations in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present application.
It should be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps of the methods of the above embodiments may be implemented by a program instructing related hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present application may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc.
In the description herein, reference to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although embodiments of the present application have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present application, and that variations, modifications, substitutions and alterations may be made to the above embodiments by those of ordinary skill in the art within the scope of the present application.
It should be noted that the present invention is not limited to the above preferred embodiments. Those skilled in the art may derive other products in various forms in light of the present invention; however, regardless of any change in shape or structure, any technical solution that is the same as or similar to that of the present application falls within the protection scope of the present invention.

Claims (9)

1. A customer service robot knowledge base ambiguity detection method, characterized by comprising the following steps:
constructing a knowledge base, wherein the knowledge base is divided according to FAQs, each FAQ is provided with at least one similar question sentence, and each FAQ is of one category;
dividing the knowledge base into a test set and a training set of a deep learning model;
training a deep learning model on a training set, and performing ambiguity detection by using the deep learning model which is learned; the ambiguity detection includes: the category ambiguity detection, the labeling error detection and the labeling ambiguity detection, wherein the ambiguity detection by using the learned deep learning model comprises the following steps: detecting ambiguity by using a shallow classifier in a deep learning model, comprising:
counting the classification results of the deep learning model to form a confusion matrix, wherein each row i of the confusion matrix corresponds to a labeled category, each column j corresponds to a category predicted by the deep learning model, element $x_{ij}$ is the number of questions labeled as category i and predicted by the model as category j, and element $x_{ji}$ is the number of questions labeled as category j and predicted by the model as category i;
calculating the number of samples labeled as category i in the data set, the number of samples of category i being
$$\sum_k x_{ik}$$
where k is any category;
calculating the number of samples labeled as category j in the data set, the number of samples of category j being
$$\sum_k x_{jk}$$
where k is any category;
calculating the proportion $P_{ij}$ of samples labeled as category i in the data set that the deep learning model predicts as category j, and the proportion $P_{ji}$ of samples labeled as category j that it predicts as category i, $P_{ij}$ and $P_{ji}$ being calculated respectively as
$$P_{ij} = \frac{x_{ij}}{\sum_k x_{ik}}, \qquad P_{ji} = \frac{x_{ji}}{\sum_k x_{jk}}$$
wherein category i and category j are different categories, and the data set comprises the training set and/or the test set;
calculating the confusion degree of the category pair (category i, category j), the confusion degree being the harmonic mean $S_{ij}$ of $P_{ij}$ and $P_{ji}$,
$$S_{ij} = \frac{2 P_{ij} P_{ji}}{P_{ij} + P_{ji}}$$
judging whether ambiguity exists between category i and category j according to the confusion degree;
updating the knowledge base according to an ambiguity detection result;
and repeating the steps until the learning effect is not improved any more, and obtaining the knowledge base for eliminating the ambiguity.
2. The method of claim 1, wherein the partitioning the knowledge base into a test set and a training set of deep learning models comprises: randomly extracting a preset number of similar question sentences corresponding to each FAQ as test data of FAQ corresponding categories, and taking the rest similar question sentences as training data of the FAQ corresponding categories; the test data of all classes constitutes a test set and the training data of all classes constitutes a training set.
3. The method of claim 1, wherein training the deep learning model on the training set comprises:
inputting the question sentences in the training set into the deep learning model as an input part;
converting the question sentences in the input part into feature vectors by using a feature extractor in the deep learning model;
calculating a prediction result by utilizing a shallow classifier in the deep learning model according to the feature vector, wherein the prediction result is a category corresponding to a question in an input part;
optimizing a training model by using an optimizer, and minimizing the average difference between the actual category marked by the question in the training set and the prediction result of the deep learning model;
and evaluating the trained model on the test set, and calculating the consistency rate between the model's predictions and the actual categories labeled for the questions in the test set as the evaluation of the model's learning effect.
4. The method according to claim 1, wherein the determining whether the category i and the category j are ambiguous according to the confusion comprises:
sorting the calculated confusion degrees;
and extracting a third preset number of category pairs whose confusion degrees rank highest, and manually checking whether category ambiguity exists.
5. The method of claim 1, wherein detecting ambiguity using a deep learning model shallow classifier further comprises: and finding out data with inconsistent actual categories labeled in a data set and categories predicted by the deep learning model, and manually checking whether labeling errors exist, wherein the data set comprises a training set or/and a testing set.
6. The method of claim 1, wherein updating the knowledge base based on ambiguity detection comprises:
manually rewriting and manually re-labeling the detected ambiguous question, and deleting the original label;
and recombining and distributing the similar question sentences for the detected ambiguous categories, and deleting the original ambiguous categories.
7. A customer service robot knowledge base ambiguity detection device, characterized by comprising:
the construction module is used for constructing a knowledge base, wherein the knowledge base is divided according to FAQs, each FAQ is provided with at least one similar question sentence, and each FAQ is one category;
the dividing module is used for dividing the knowledge base into a test set and a training set of a deep learning model;
the training module is used for training a deep learning model on the training set and performing ambiguity detection with the trained deep learning model; the ambiguity detection includes category ambiguity detection, labeling error detection and labeling ambiguity detection, wherein performing ambiguity detection with the trained deep learning model comprises detecting ambiguity by using the shallow classifier in the deep learning model, which comprises: counting the classification results of the deep learning model to form a confusion matrix, wherein each row i of the confusion matrix corresponds to a labeled category, each column j corresponds to a category predicted by the deep learning model, element $x_{ij}$ is the number of questions labeled as category i and predicted by the model as category j, and element $x_{ji}$ is the number of questions labeled as category j and predicted by the model as category i; calculating the number of samples labeled as category i in the data set, the number of samples of category i being
$$\sum_k x_{ik}$$
where k is any category; calculating the number of samples labeled as category j in the data set, the number of samples of category j being
$$\sum_k x_{jk}$$
where k is any category; calculating the proportion $P_{ij}$ of samples labeled as category i in the data set that the deep learning model predicts as category j, and the proportion $P_{ji}$ of samples labeled as category j that it predicts as category i, $P_{ij}$ and $P_{ji}$ being calculated respectively as
$$P_{ij} = \frac{x_{ij}}{\sum_k x_{ik}}, \qquad P_{ji} = \frac{x_{ji}}{\sum_k x_{jk}}$$
wherein category i and category j are different categories, and the data set comprises the training set and/or the test set; calculating the confusion degree of the category pair (category i, category j), the confusion degree being the harmonic mean $S_{ij}$ of $P_{ij}$ and $P_{ji}$,
$$S_{ij} = \frac{2 P_{ij} P_{ji}}{P_{ij} + P_{ji}}$$
and judging whether ambiguity exists between category i and category j according to the confusion degree;
the updating module is used for updating the knowledge base according to an ambiguity detection result;
and the repeating module is used for repeating the steps until the learning effect is not improved any more, and obtaining the knowledge base for eliminating the ambiguity.
8. An electronic device, comprising:
at least one memory for storing a program;
at least one processor configured to load the program to perform the customer service robot knowledge base ambiguity detection method of any one of claims 1-6.
9. A computer-readable storage medium storing a processor-executable program, wherein the processor-executable program, when executed by a processor, performs the customer service robot knowledge base ambiguity detection method of any one of claims 1-6.
CN202110693227.9A 2018-07-19 2018-07-19 Method, device and related equipment for detecting ambiguity of customer service robot knowledge base Active CN113407694B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110693227.9A CN113407694B (en) 2018-07-19 2018-07-19 Method, device and related equipment for detecting ambiguity of customer service robot knowledge base

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810801678.8A CN109101579B (en) 2018-07-19 2018-07-19 Customer service robot knowledge base ambiguity detection method
CN202110693227.9A CN113407694B (en) 2018-07-19 2018-07-19 Method, device and related equipment for detecting ambiguity of customer service robot knowledge base

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN201810801678.8A Division CN109101579B (en) 2018-07-09 2018-07-19 Customer service robot knowledge base ambiguity detection method

Publications (2)

Publication Number Publication Date
CN113407694A true CN113407694A (en) 2021-09-17
CN113407694B CN113407694B (en) 2023-06-02

Family

ID=64846947

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202110693227.9A Active CN113407694B (en) 2018-07-19 2018-07-19 Method, device and related equipment for detecting ambiguity of customer service robot knowledge base
CN201810801678.8A Active CN109101579B (en) 2018-07-09 2018-07-19 Customer service robot knowledge base ambiguity detection method

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN201810801678.8A Active CN109101579B (en) 2018-07-09 2018-07-19 Customer service robot knowledge base ambiguity detection method

Country Status (1)

Country Link
CN (2) CN113407694B (en)

Also Published As

Publication number Publication date
CN113407694B (en) 2023-06-02
CN109101579B (en) 2021-11-23
CN109101579A (en) 2018-12-28

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant