CN110765270A - Training method and system of text classification model for spoken language interaction - Google Patents


Info

Publication number
CN110765270A
Authority
CN
China
Prior art keywords
training
context information
spoken language
text
classification model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911066202.5A
Other languages
Chinese (zh)
Other versions
CN110765270B (en)
Inventor
方艳
徐华
初敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
AI Speech Ltd
Original Assignee
AI Speech Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by AI Speech Ltd filed Critical AI Speech Ltd
Priority to CN201911066202.5A priority Critical patent/CN110765270B/en
Publication of CN110765270A publication Critical patent/CN110765270A/en
Application granted granted Critical
Publication of CN110765270B publication Critical patent/CN110765270B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval of unstructured textual data
    • G06F 16/35 Clustering; Classification
    • G06F 16/355 Class or cluster creation or modification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)

Abstract

An embodiment of the invention provides a training method for a text classification model for spoken language interaction. The method comprises the following steps: acquiring a spoken-language text corpus training set and dialogue history context information; performing corpus expansion on the training set through the dialogue history context information to enrich it; and establishing a text classification model based on a bidirectional long short-term memory (BLSTM) network and training the model on the dialogue history context information together with the expanded corpus training set, so that the model learns the domain classification of spoken text from the dialogue history context information. An embodiment of the invention also provides a training system for a text classification model for spoken language interaction. By determining the dialogue history context information, the embodiments construct a large amount of virtual dialogue text, making up for the shortage of corpus data; the dialogue history context information is also used as part of the model's input, helping the model improve the accuracy of domain classification.

Description

Training method and system of text classification model for spoken language interaction
Technical Field
The invention relates to the field of intelligent speech dialogue, and in particular to a training method and system for a text classification model for spoken language interaction.
Background
In text classification for spoken language interaction, a large amount of manually annotated corpus data is usually used to train a deep learning model, which learns text features automatically; after the model produces its output, the final domain must still be selected by rules designed around the preceding dialogue state.
In the process of implementing the invention, the inventor finds that at least the following problems exist in the related art:
Text classification methods based on feature engineering require manual effort to design text features, so the final performance of the model is limited by the quality of the feature design; moreover, the features used by such methods often suffer from sparsity and dimensional explosion, so the final classification performance is relatively low.
Whether based on feature engineering or on deep learning, the input to the classification model is only the current user's text, and the effect of dialogue history on classification is not considered. The domain output by the model is only an intermediate state: domain selection rules must be designed around the dialogue history to screen one or more domains from the candidates the model proposes as the final output. This whole process is cumbersome rather than simple, and hand-designed rules are often neither flexible nor accurate enough.
Disclosure of Invention
The disclosure addresses at least the following problems in the prior art: manually designing text features is time-consuming and labor-intensive; after the classification model produces a result, hand-designed rules are needed to decide the final domain, which again costs time and labor and lacks flexibility; and although dialogue information could help the model judge the domain and improve the accuracy of domain classification, existing models provide no way to incorporate it.
In a first aspect, an embodiment of the present invention provides a method for training a text classification model for spoken language interaction, including:
acquiring a spoken-language text corpus training set and dialogue history context information;
performing corpus expansion on the spoken-language text corpus training set through the dialogue history context information to enrich the training set;
and establishing a text classification model based on a bidirectional long short-term memory (BLSTM) network, and training the model on the dialogue history context information together with the expanded corpus training set, so that the model learns the domain classification of spoken text from the dialogue history context information.
In a second aspect, an embodiment of the present invention provides a training system for a text classification model for spoken language interaction, including:
an information acquisition program module, configured to acquire a spoken-language text corpus training set and dialogue history context information;
a corpus expansion program module, configured to perform corpus expansion on the spoken-language text corpus training set through the dialogue history context information to enrich the training set;
and a model training program module, configured to establish a text classification model based on a bidirectional long short-term memory (BLSTM) network and train the model on the dialogue history context information together with the expanded corpus training set, so that the model learns the domain classification of spoken text from the dialogue history context information.
In a third aspect, an embodiment of the present invention provides an electronic device, comprising: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the steps of the method for training a text classification model for spoken language interaction of any embodiment of the present invention.
In a fourth aspect, an embodiment of the present invention provides a storage medium, on which a computer program is stored, where the computer program is executed by a processor to implement the steps of the method for training a text classification model for spoken language interaction according to any embodiment of the present invention.
The embodiments of the invention have the following beneficial effects: the key factors influencing the domain of the next dialogue turn are extracted to determine the dialogue history context information, and a large amount of virtual dialogue text is constructed from that information, making up for the shortage of data in the spoken-language text corpus training set; the dialogue history context information is also used as part of the model's input, so that the model's output is the final domain result fitting the current dialogue scene. The system as a whole needs no cumbersome manual domain-judging step, saving time and labor, and the dialogue history context information helps the model improve the accuracy of domain classification.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in describing them are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flowchart of a method for training a text classification model for spoken language interaction according to an embodiment of the present invention;
FIG. 2 is a structural flow chart of a training method for a text classification model for spoken language interaction according to an embodiment of the present invention;
FIG. 3 is a diagram showing how dialogue history context information is added to the BLSTM model in a training method for a text classification model for spoken language interaction according to an embodiment of the present invention;
FIG. 4 is a performance comparison chart for a training method for a text classification model for spoken language interaction according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of a training system for a text classification model for spoken language interaction according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 is a flowchart of a training method of a text classification model for spoken language interaction according to an embodiment of the present invention, including the following steps:
s11: acquiring a spoken language text corpus training set and dialogue historical context information;
s12: performing corpus expansion on the spoken language text corpus training set through the dialogue historical contextual information to enrich the spoken language text corpus training set;
s13: and establishing a text classification model based on a bidirectional long-and-short time memory network, and training the text classification model through the conversation historical context information and the spoken language text corpus training set after corpus expansion, so that the text classification model learns the field classification of the spoken language text through the conversation historical context information.
In this embodiment, spoken text without dialogue information is relatively easy to obtain, but annotated data with dialogue context is relatively rare and time-consuming and labor-intensive to annotate. Dialogue history contains many kinds of content, including the domain of the previous turn, the machine's reply, and so on; how to use such information effectively is not obvious.
For step S11, training the spoken-language interactive text classification model requires not only the spoken-language text corpus training set but also additional dialogue history context information, so that the model is trained on data of multiple dimensions.
As an embodiment, the acquiring a spoken language text corpus training set and dialogue historical context information includes:
based on the domain set and intent set of the spoken language interaction and a set of reply templates for feeding back intents, extracting the related domain-intent pairs from the domain set and intent set;
extracting the reply templates matching each domain-intent pair from the reply template set to determine domain-intent-dialogue templates;
and taking the obtained domain-intent-dialogue templates as the dialogue history context information.
In this embodiment, the dialogue history context contains many kinds of information, but three items are the key factors affecting the domain of the next dialogue turn: the domain of the previous turn (pre_domain), the user intent of the previous turn (pre_intent), and the dialogue system's reply in the previous turn (pre_system_reply). The range of pre_domain is limited, namely the defined domain set; within a specific domain, the range of pre_intent is also limited, namely the defined intent set; and given a specific domain and intent, the templates for pre_system_reply are likewise limited, namely the defined set of reply templates. The system selects a suitable template from this limited set and replaces the variables in it with specific values to generate the final system reply. Therefore, when dialogue corpus data is insufficient, virtual dialogue text with context information can be constructed artificially. Together, pre_domain, pre_intent, and pre_system_reply (domain-intent-dialogue template) form the dialogue history context information of a sentence (dialog_context for short). For example, "music-play song-playing {song name} for you" is a complete dialog_context: "music" indicates that the previous turn's domain was music, "play song" indicates the previous turn's user intent, and "playing {song name} for you" is the previous turn's system reply template.
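As an illustrative sketch of this structure (the `DialogContext` class name and the `as_string` serialization are our own convention, not from the patent), the three key factors can be bundled into one record:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DialogContext:
    """Dialogue history context: the three key factors named above."""
    pre_domain: str        # domain of the previous dialogue turn
    pre_intent: str        # user intent of the previous turn
    pre_system_reply: str  # system reply template of the previous turn

    def as_string(self) -> str:
        # Serialize as "domain-intent-template", the form used in the description.
        return f"{self.pre_domain}-{self.pre_intent}-{self.pre_system_reply}"

# The example from the description:
ctx = DialogContext("music", "play song", "playing {song name} for you")
print(ctx.as_string())  # music-play song-playing {song name} for you
```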
In this embodiment, the domain set, the intent set, and the reply template set for feeding back intents of the spoken language interaction are pre-configured before the dialogue history context information is obtained; for example, the sets may be predefined manually, or obtained in other ways.
As for step S12, genuine annotated dialogue text with dialog_context is not easy to obtain, while plain text corpora without dialog_context are more common, so dialogue text must be constructed when dialogue corpus data is insufficient. The construction method is as follows: randomly select one domain from the domain set as the pre_domain of a text corpus item, select one intent from the intent set supported by that pre_domain as pre_intent, and randomly select one template from the reply templates supported by that pre_intent as pre_system_reply. The selected pre_domain, pre_intent, and pre_system_reply serve as the dialog_context of the sentence; finally, the original label is revised to a new domain label according to the current dialog_context, so that the new label matches the domain result in the current dialogue scene. With this construction method, one sentence can yield many sentences with different dialog_contexts, thereby enriching the spoken-language text corpus training set.
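The construction procedure above can be sketched as follows; the domain, intent, and reply-template sets here are hypothetical stand-ins for the predefined sets the patent assumes:

```python
import random

# Hypothetical configuration sets (the patent assumes these are predefined).
DOMAINS = ["music", "movie", "weather"]
INTENTS = {
    "music": ["play song", "pause"],
    "movie": ["find movie"],
    "weather": ["query weather"],
}
REPLY_TEMPLATES = {
    ("music", "play song"): ["playing {song name} for you"],
    ("music", "pause"): ["paused"],
    ("movie", "find movie"): ["found {quantity} {movie name} resources for you"],
    ("weather", "query weather"): ["the weather today is {weather}"],
}

def expand(text: str, n: int, rng: random.Random):
    """Attach n randomly constructed dialog_contexts to one plain sentence.
    (Relabeling the sample per context, as the description requires, would
    happen after this step and is omitted here.)"""
    samples = []
    for _ in range(n):
        pre_domain = rng.choice(DOMAINS)                           # random domain
        pre_intent = rng.choice(INTENTS[pre_domain])               # intent it supports
        pre_reply = rng.choice(REPLY_TEMPLATES[(pre_domain, pre_intent)])
        samples.append({
            "dialog_context": f"{pre_domain}-{pre_intent}-{pre_reply}",
            "text": text,
        })
    return samples

rng = random.Random(0)
virtual = expand("play Fleeting Years", 3, rng)
for s in virtual:
    print(s["dialog_context"], "|", s["text"])
```

One plain sentence thus becomes several context-bearing training samples, which is the corpus-enrichment effect described above.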
For step S13, the method models the task with a bidirectional long short-term memory network (BLSTM). One drawback of a conventional LSTM is that it can only use past content from the forward sequence, whereas in text classification the future content seen in the reverse sequence also plays a crucial role in the classification judgment. By processing the forward and reverse sequences, structured knowledge is extracted so that complementary information from past and future can be integrated for inference. A bidirectional LSTM achieves this by processing the data in both directions with two independent hidden layers, then feeding the hidden-layer outputs of both the forward and reverse sequences to the output layer. As shown in FIG. 2, dialog_context and text are fed into the BLSTM model together, so the model carries information about the dialogue domain; once the model outputs a domain, no further domain-selection judgment based on dialogue history is needed, and the model's output is the optimal domain classification result fitting the current context.
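A minimal numpy sketch of this architecture is shown below; the dimensions, the random initialization, and the choice to prepend dialog_context embeddings to the text embeddings along the time axis are our illustrative assumptions, not details fixed by the patent:

```python
import numpy as np

def lstm_last_hidden(x, Wx, Wh, b):
    """Run a single-direction LSTM over x (shape T x d); return the final hidden state."""
    H = Wh.shape[0]
    h = np.zeros(H)
    c = np.zeros(H)
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    for t in range(x.shape[0]):
        z = x[t] @ Wx + h @ Wh + b          # all four gates at once, shape (4H,)
        i, f, g, o = np.split(z, 4)
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        c = f * c + i * np.tanh(g)          # cell state update
        h = o * np.tanh(c)                  # hidden state update
    return h

def blstm_classify(seq, params):
    """Process seq forward and in reverse with two independent LSTMs,
    concatenate the two final hidden states, and apply a linear classifier."""
    h_fwd = lstm_last_hidden(seq, *params["fwd"])
    h_bwd = lstm_last_hidden(seq[::-1], *params["bwd"])
    features = np.concatenate([h_fwd, h_bwd])
    return features @ params["W_out"] + params["b_out"]  # one score per domain

rng = np.random.default_rng(0)
d, H, n_domains = 8, 16, 3                  # illustrative sizes
def lstm_params():
    return (rng.normal(scale=0.1, size=(d, 4 * H)),   # input-to-gates weights
            rng.normal(scale=0.1, size=(H, 4 * H)),   # hidden-to-gates weights
            np.zeros(4 * H))
params = {
    "fwd": lstm_params(),
    "bwd": lstm_params(),
    "W_out": rng.normal(scale=0.1, size=(2 * H, n_domains)),
    "b_out": np.zeros(n_domains),
}

# dialog_context token embeddings are prepended (in time) to the text embeddings,
# so both are visible to the model, as in FIG. 2.
context_emb = rng.normal(size=(4, d))   # e.g. 4 dialog_context tokens
text_emb = rng.normal(size=(6, d))      # e.g. 6 text tokens
scores = blstm_classify(np.vstack([context_emb, text_emb]), params)
print(scores.shape)                     # one score per domain
```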
The trained text classification model performs domain classification in the spoken-interaction scenario: for the sentence the current user speaks, it assigns all possible domains in light of the preceding dialogue state. The characteristic of this task is that the dialogue history at the previous moment strongly influences which domain the next turn belongs to, so the same text can receive different domain classifications under different dialogue histories. For example, consider assigning a domain to the sentence "play Fleeting Years": it could belong to either "music" or "movie", because "Fleeting Years" is both a song title and a movie title. If dialog_context is "music-play song-playing {song name} for you", the sentence more likely belongs to "music"; if dialog_context is "movie-find movie-found {quantity} {movie name} resources for you", it more likely belongs to "movie". In this way the text classification model learns the domain classification of spoken text from the dialogue history context information.
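The context-dependent disambiguation can be made concrete with a toy rule (in the patent the model learns this mapping from data; the hard-coded candidate table and tie-breaking here are purely illustrative):

```python
# Candidate domains for an ambiguous sentence: "Fleeting Years" is both
# a song title and a movie title, so two domains are plausible.
CANDIDATES = {"play Fleeting Years": {"music", "movie"}}

def resolve_domain(text, dialog_context):
    """Toy rule: prefer the candidate domain matching the context's pre_domain."""
    pre_domain = dialog_context.split("-", 1)[0]   # first field of dialog_context
    candidates = CANDIDATES.get(text, set())
    if pre_domain in candidates:
        return pre_domain
    return sorted(candidates)[0] if candidates else None  # arbitrary tie-break

print(resolve_domain("play Fleeting Years",
                     "music-play song-playing {song name} for you"))    # music
print(resolve_domain("play Fleeting Years",
                     "movie-find movie-found {quantity} {movie name} resources for you"))  # movie
```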
In this embodiment, the key factors influencing the domain of the next dialogue turn are extracted to determine the dialogue history context information, and a large amount of virtual dialogue text is constructed from that information, making up for the shortage of data in the spoken-language text corpus training set; the dialogue history context information is also used as part of the model's input, so that the model's output is the final domain result fitting the current dialogue scene. The system as a whole needs no cumbersome manual domain-judging step, saving time and labor, and the dialogue history context information helps the model improve the accuracy of domain classification.
As an implementation, in this embodiment, training the text classification model on the dialogue history context information and the expanded spoken-language text corpus training set includes:
feeding the dialogue history context information into the input layer of the BLSTM text classification model during training; or
feeding the dialogue history context information into the output layer of the BLSTM text classification model during training; or
feeding the dialogue history context information into both the input layer and the output layer of the BLSTM text classification model during training.
In this embodiment, the input to the BLSTM model is the embedding of each character or word. The output layer is a linear classifier whose input is the concatenation of the hidden states at both ends of the last time step of the BLSTM. The invention differs from the conventional BLSTM model in two respects: (1) the dialogue history context information is part of the model input; it can be added in several ways, namely to the BLSTM's input layer, as an input to the output layer, or to both, as shown in FIG. 3. (2) The model's output is a "1" or "-1" for each domain, where "1" means the text belongs to that domain and "-1" means it does not; the output of the overall system is all domains for which the model outputs "1", sorted by probability score.
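A sketch of decoding this ±1 output (thresholding the scores at zero and reusing the raw scores as the "probability score" for sorting are our assumptions):

```python
def decode_output(domain_scores):
    """Map per-domain scores to the ±1 encoding described above, and return
    the selected domains (those labelled 1) sorted by score, highest first."""
    labels = {d: (1 if s > 0 else -1) for d, s in domain_scores.items()}
    selected = sorted((d for d, l in labels.items() if l == 1),
                      key=lambda d: domain_scores[d], reverse=True)
    return labels, selected

labels, selected = decode_output({"music": 0.9, "movie": 0.2, "weather": -1.3})
print(labels)    # {'music': 1, 'movie': 1, 'weather': -1}
print(selected)  # ['music', 'movie']
```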
Two test sets are used to evaluate the classification performance of the system: correct text transcribed manually from the audio, and recognized text produced by the speech recognition system. The correct text contains 20,000 sentences and the recognized text contains 150,000 sentences; the performance is shown in FIG. 4. The baseline system is the classification result of the conventional method, that is, dialogue history context information is not added as model input, and after the model outputs a domain the final domain is selected according to the dialogue history. The figure shows that all three ways of adding dialogue history context information improve the system's performance relative to the baseline.
Fig. 5 is a schematic structural diagram of a training system for a text classification model for spoken language interaction according to an embodiment of the present invention, which can execute the training method for a text classification model for spoken language interaction according to any of the above embodiments and is configured in a terminal.
The training system of the text classification model for spoken language interaction provided by the embodiment comprises: an information acquisition program module 11, a corpus expansion program module 12 and a model training program module 13.
The information acquisition program module 11 is configured to acquire a spoken-language text corpus training set and dialogue history context information; the corpus expansion program module 12 is configured to perform corpus expansion on the spoken-language text corpus training set through the dialogue history context information to enrich the training set; and the model training program module 13 is configured to establish a text classification model based on a bidirectional long short-term memory (BLSTM) network and train the model on the dialogue history context information together with the expanded corpus training set, so that the model learns the domain classification of spoken text from the dialogue history context information.
Further, the information acquisition program module is configured to:
based on the domain set and intent set of the spoken language interaction and a set of reply templates for feeding back intents, extracting the related domain-intent pairs from the domain set and intent set;
extracting the reply templates matching each domain-intent pair from the reply template set to determine domain-intent-dialogue templates;
and taking the obtained domain-intent-dialogue templates as the dialogue history context information.
Further, the domain set, the intent set, and the reply template set for feeding back intents of the spoken language interaction are pre-configured before the dialogue history context information is obtained.
Further, the model training program module is configured to:
feed the dialogue history context information into the input layer of the BLSTM text classification model during training; or
feed the dialogue history context information into the output layer of the BLSTM text classification model during training; or
feed the dialogue history context information into both the input layer and the output layer of the BLSTM text classification model during training.
An embodiment of the invention also provides a non-volatile computer storage medium storing computer-executable instructions that can perform the method for training a text classification model for spoken language interaction in any of the above method embodiments.
as one embodiment, a non-volatile computer storage medium of the present invention stores computer-executable instructions configured to:
acquiring a spoken-language text corpus training set and dialogue history context information;
performing corpus expansion on the spoken-language text corpus training set through the dialogue history context information to enrich the training set;
and establishing a text classification model based on a bidirectional long short-term memory (BLSTM) network, and training the model on the dialogue history context information together with the expanded corpus training set, so that the model learns the domain classification of spoken text from the dialogue history context information.
The non-volatile computer-readable storage medium may be used to store non-volatile software programs, non-volatile computer-executable programs, and modules, such as the program instructions/modules corresponding to the methods in the embodiments of the present invention. One or more program instructions are stored in the non-volatile computer-readable storage medium and, when executed by a processor, perform the method for training a text classification model for spoken language interaction in any of the above method embodiments.
The non-volatile computer-readable storage medium may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the device, and the like. Further, the non-volatile computer-readable storage medium may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some embodiments, the non-transitory computer readable storage medium optionally includes memory located remotely from the processor, which may be connected to the device over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
An embodiment of the present invention further provides an electronic device, which includes: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the steps of the method for training a text classification model for spoken language interaction of any embodiment of the present invention.
The client of the embodiment of the present application exists in various forms, including but not limited to:
(1) Mobile communication devices: characterized by mobile communication capability, with voice and data communication as the primary goal. Such terminals include smart phones, multimedia phones, feature phones, and low-end phones.
(2) Ultra-mobile personal computer devices: these belong to the category of personal computers, have computing and processing functions, and generally also have mobile internet access. Such terminals include PDA, MID, and UMPC devices, such as tablet computers.
(3) Portable entertainment devices: these can display and play multimedia content. They include audio and video players, handheld game consoles, e-book readers, smart toys, and portable in-vehicle navigation devices.
(4) Other electronic devices with data processing capabilities.
In this document, relational terms such as first and second may be used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual such relationship or order between them. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to it. Without further limitation, an element defined by the phrase "comprising a(n) …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The apparatus embodiments described above are merely illustrative. Units described as separate parts may or may not be physically separate, and parts shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment, and one of ordinary skill in the art can understand and implement this without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general-purpose hardware platform, or alternatively by hardware. Based on this understanding, the technical solutions above may be embodied in the form of a software product stored in a computer-readable storage medium, such as ROM/RAM, a magnetic disk, or an optical disk, and include instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments or in parts of the embodiments.
Finally, it should be noted that the above embodiments are intended only to illustrate, not to limit, the technical solutions of the present invention. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced, and that such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A method for training a text classification model for spoken language interaction, comprising:
acquiring a spoken language text corpus training set and dialogue history context information;
performing corpus expansion on the spoken language text corpus training set with the dialogue history context information, so as to enrich the spoken language text corpus training set;
and establishing a text classification model based on a bidirectional long short-term memory (BiLSTM) network, and training the text classification model with the dialogue history context information and the corpus-expanded spoken language text corpus training set, so that the text classification model learns the domain classification of spoken text from the dialogue history context information.
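The corpus-expansion step of claim 1 can be sketched as follows. This is one illustrative reading, in which each labelled spoken-text sample is paired with dialogue-history context utterances to yield additional labelled samples; the function and variable names are hypothetical and not taken from the patent.

```python
# Illustrative sketch of corpus expansion with dialogue-history context:
# every original (text, label) sample is additionally emitted once per
# context utterance, with the context prepended to the text.

def expand_corpus(corpus, context_pool):
    """corpus: list of (text, domain_label); context_pool: list of
    dialogue-history utterances used to enrich the training set."""
    expanded = list(corpus)  # keep the original samples unchanged
    for text, label in corpus:
        for context in context_pool:
            # Prepend a plausible dialogue-history utterance; the label
            # (domain) of the spoken text is preserved.
            expanded.append((context + " " + text, label))
    return expanded

corpus = [("play some jazz", "music"), ("navigate home", "navigation")]
context_pool = ["what would you like to hear?", "where do you want to go?"]
expanded = expand_corpus(corpus, context_pool)
```

With 2 samples and 2 context utterances this yields the 2 originals plus 4 context-augmented samples, so the model sees each utterance both with and without preceding dialogue history.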
2. The method of claim 1, wherein said acquiring a spoken language text corpus training set and dialogue history context information comprises:
extracting the relevant domain-intention pairs from the domain set and the intention set, based on the domain set and the intention set of the spoken language interaction and the set of reply templates for feeding back intentions;
extracting the reply templates matching each domain-intention pair from the reply template set, and determining domain-intention-dialogue templates;
obtaining the domain-intention-dialogue templates and determining them as the dialogue history context information.
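Claims 2 and 3 describe deriving the dialogue-history context information from pre-configured domain, intention, and reply-template sets. A minimal sketch of that extraction follows; the data structures, set contents, and names are invented for illustration and are not the patented implementation.

```python
# Hypothetical pre-configured sets (claim 3): domains with their intents,
# and reply templates keyed by intent.
domains = {"music": ["play", "pause"], "navigation": ["set_destination"]}
reply_templates = {
    "play": "what would you like to hear?",
    "set_destination": "where do you want to go?",
}

def build_context_templates(domains, reply_templates):
    """Extract domain-intention pairs that have a matching reply template
    and attach that template, yielding domain-intention-dialogue templates
    usable as dialogue-history context information."""
    templates = []
    for domain, intents in domains.items():
        for intent in intents:
            if intent in reply_templates:  # keep only matched pairs
                templates.append((domain, intent, reply_templates[intent]))
    return templates

ctx = build_context_templates(domains, reply_templates)
```

Here "pause" has no configured reply, so only the two matched domain-intention-dialogue templates survive; those reply utterances are what serve as dialogue-history context.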
3. The method of claim 2, wherein the domain set, the intention set, and the reply template set for feeding back intentions of the spoken language interaction are pre-configured before the dialogue history context information is obtained.
4. The method of claim 1, wherein said training the text classification model with the dialogue history context information and the corpus-expanded spoken language text corpus training set comprises:
training with the dialogue history context information fed to the input layer of the text classification model of the bidirectional long short-term memory network; or
training with the dialogue history context information fed to the output layer of the text classification model of the bidirectional long short-term memory network; or
training with the dialogue history context information fed to both the input layer and the output layer of the text classification model of the bidirectional long short-term memory network.
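The three wiring options of claim 4 can be illustrated at the level of tensor shapes. The sketch below stubs the bidirectional LSTM with simple random projections (a real system would train an actual BiLSTM in a deep-learning framework); all dimensions and names are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d_emb, d_ctx, d_hid = 5, 8, 4, 16   # sequence length, embedding, context, hidden sizes

x = rng.normal(size=(T, d_emb))        # token embeddings of the spoken text
c = rng.normal(size=(d_ctx,))          # encoded dialogue-history context vector

def bilstm_stub(inputs, d_hid, rng):
    # Stand-in for a bidirectional LSTM: one linear map per direction,
    # mean-pooled over time, forward and backward states concatenated.
    W_f = rng.normal(size=(inputs.shape[1], d_hid))
    W_b = rng.normal(size=(inputs.shape[1], d_hid))
    h_f = np.tanh(inputs @ W_f).mean(axis=0)
    h_b = np.tanh(inputs[::-1] @ W_b).mean(axis=0)
    return np.concatenate([h_f, h_b])

# Option 1: context at the input layer -- concatenate c to every time step.
x_in = np.concatenate([x, np.tile(c, (T, 1))], axis=1)
h = bilstm_stub(x_in, d_hid, rng)

# Option 2: context at the output layer -- concatenate c to the sentence state
# before the domain classifier.
h_out = np.concatenate([bilstm_stub(x, d_hid, rng), c])

# Option 3: context at both the input and output layers.
h_both = np.concatenate([bilstm_stub(x_in, d_hid, rng), c])
```

A softmax over domain labels would then be applied to `h`, `h_out`, or `h_both`; the shapes show how the context vector widens the input (option 1) and/or the pre-classifier state (options 2 and 3).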
5. A training system for a text classification model for spoken language interaction, comprising:
an information acquisition program module, configured to acquire a spoken language text corpus training set and dialogue history context information;
a corpus expansion program module, configured to perform corpus expansion on the spoken language text corpus training set with the dialogue history context information, so as to enrich the spoken language text corpus training set;
and a model training program module, configured to establish a text classification model based on a bidirectional long short-term memory network, and to train the text classification model with the dialogue history context information and the corpus-expanded spoken language text corpus training set, so that the text classification model learns the domain classification of spoken text from the dialogue history context information.
6. The system of claim 5, wherein the information acquisition program module is configured to:
extract the relevant domain-intention pairs from the domain set and the intention set, based on the domain set and the intention set of the spoken language interaction and the set of reply templates for feeding back intentions;
extract the reply templates matching each domain-intention pair from the reply template set, and determine domain-intention-dialogue templates;
obtain the domain-intention-dialogue templates and determine them as the dialogue history context information.
7. The system of claim 6, wherein the domain set, the intention set, and the reply template set for feeding back intentions of the spoken language interaction are pre-configured before the dialogue history context information is obtained.
8. The system of claim 5, wherein the model training program module is configured to:
train with the dialogue history context information fed to the input layer of the text classification model of the bidirectional long short-term memory network; or
train with the dialogue history context information fed to the output layer of the text classification model of the bidirectional long short-term memory network; or
train with the dialogue history context information fed to both the input layer and the output layer of the text classification model of the bidirectional long short-term memory network.
9. An electronic device, comprising: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the steps of the method of any one of claims 1 to 4.
10. A storage medium on which a computer program is stored which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 4.
CN201911066202.5A 2019-11-04 2019-11-04 Training method and system of text classification model for spoken language interaction Active CN110765270B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911066202.5A CN110765270B (en) 2019-11-04 2019-11-04 Training method and system of text classification model for spoken language interaction


Publications (2)

Publication Number Publication Date
CN110765270A true CN110765270A (en) 2020-02-07
CN110765270B CN110765270B (en) 2022-07-01

Family

ID=69335559

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911066202.5A Active CN110765270B (en) 2019-11-04 2019-11-04 Training method and system of text classification model for spoken language interaction

Country Status (1)

Country Link
CN (1) CN110765270B (en)


Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107301225A (en) * 2017-06-20 2017-10-27 挖财网络技术有限公司 Short text classification method and device
CN108388638A (en) * 2018-02-26 2018-08-10 出门问问信息科技有限公司 Semantic analytic method, device, equipment and storage medium
CN108415923A (en) * 2017-10-18 2018-08-17 北京邮电大学 The intelligent interactive system of closed domain
CN108597519A (en) * 2018-04-04 2018-09-28 百度在线网络技术(北京)有限公司 A kind of bill classification method, apparatus, server and storage medium
CN108962224A (en) * 2018-07-19 2018-12-07 苏州思必驰信息科技有限公司 Speech understanding and language model joint modeling method, dialogue method and system
CN110209791A (en) * 2019-06-12 2019-09-06 百融云创科技股份有限公司 It is a kind of to take turns dialogue intelligent speech interactive system and device more


Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021135534A1 (en) * 2020-06-16 2021-07-08 平安科技(深圳)有限公司 Speech recognition-based dialogue management method, apparatus, device and medium
CN112000787A (en) * 2020-08-17 2020-11-27 上海小鹏汽车科技有限公司 Voice interaction method, server and voice interaction system
CN112000787B (en) * 2020-08-17 2021-05-14 上海小鹏汽车科技有限公司 Voice interaction method, server and voice interaction system
CN114036959A (en) * 2021-11-25 2022-02-11 北京房江湖科技有限公司 Method, apparatus, computer program product and storage medium for determining a context of a conversation
CN115687031A (en) * 2022-11-15 2023-02-03 北京优特捷信息技术有限公司 Method, device, equipment and medium for generating alarm description text
CN117576982A (en) * 2024-01-16 2024-02-20 青岛培诺教育科技股份有限公司 Spoken language training method and device based on ChatGPT, electronic equipment and medium
CN117576982B (en) * 2024-01-16 2024-04-02 青岛培诺教育科技股份有限公司 Spoken language training method and device based on ChatGPT, electronic equipment and medium

Also Published As

Publication number Publication date
CN110765270B (en) 2022-07-01

Similar Documents

Publication Publication Date Title
CN110765270B (en) Training method and system of text classification model for spoken language interaction
JP6799574B2 (en) Method and device for determining satisfaction with voice dialogue
CN110516253B (en) Chinese spoken language semantic understanding method and system
CN106534548B (en) Voice error correction method and device
CN111090727B (en) Language conversion processing method and device and dialect voice interaction system
CN110223692B (en) Multi-turn dialogue method and system for voice dialogue platform cross-skill
CN111832308A (en) Method and device for processing consistency of voice recognition text
CN112767969B (en) Method and system for determining emotion tendentiousness of voice information
CN111723207B (en) Intention identification method and system
CN111680129B (en) Training method and system of semantic understanding system
CN110597958B (en) Text classification model training and using method and device
CN110929045A (en) Construction method and system of poetry-semantic knowledge map
CN111833844A (en) Training method and system of mixed model for speech recognition and language classification
CN113128228A (en) Voice instruction recognition method and device, electronic equipment and storage medium
CN111046217B (en) Combined song generation method, device, equipment and storage medium
CN116821290A (en) Multitasking dialogue-oriented large language model training method and interaction method
CN115640398A (en) Comment generation model training method, comment generation device and storage medium
CN112749544B (en) Training method and system of paragraph segmentation model
CN112041809A (en) Automatic addition of sound effects to audio files
CN111128122B (en) Method and system for optimizing rhythm prediction model
CN111046674B (en) Semantic understanding method and device, electronic equipment and storage medium
CN111063337B (en) Large-scale voice recognition method and system capable of rapidly updating language model
CN110827802A (en) Speech recognition training and decoding method and device
CN114297372A (en) Personalized note generation method and system
CN114896988A (en) Unified dialog understanding method and framework

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 215123 building 14, Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou City, Jiangsu Province

Applicant after: Sipic Technology Co.,Ltd.

Address before: 215123 building 14, Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou City, Jiangsu Province

Applicant before: AI SPEECH Ltd.

GR01 Patent grant