CN108932945A

CN108932945A - Method and apparatus for processing a voice instruction

Info

Publication number: CN108932945A
Application number: CN201810233853.8A
Authority: CN
Inventors: 钱希; 杨琛
Original assignee: Beijing Orion Star Technology Co Ltd
Current assignee: Beijing Orion Star Technology Co Ltd
Priority date: 2018-03-21
Filing date: 2018-03-21
Publication date: 2018-12-04
Anticipated expiration: 2038-03-21
Also published as: CN108932945B

Abstract

This application discloses a method and apparatus for processing a voice instruction. The method includes: receiving, from a terminal, a voice instruction that carries the user's original intent; performing speech recognition on the voice instruction to generate its text information; parsing the text information to determine the corresponding parsed intent; retrieving, according to the parsed intent, the resource needed to execute the voice instruction and sending the resource to the terminal; and, upon determining that the original intent was not satisfied, labeling the voice instruction, the text information and the parsed intent as an error sample and saving it to an error sample library. The method decides whether to collect an error sample from the pattern of the human-computer interaction itself, so the user does not have to mark errors manually, which increases the likelihood of obtaining error samples.

Description

Method and apparatus for processing a voice instruction

Technical field

This application relates to the technical field of voice interaction for smart devices, and in particular to a method and apparatus for processing a voice instruction.

Background art

With the development of artificial intelligence technology, a wide variety of smart devices has appeared on the market; common ones include smartphones, smart speakers, smart TVs and intelligent robots. To improve the user experience, many smart devices provide voice input and voice output. The voice interaction system of such a device determines the user's intent from the voice instruction the user inputs and provides services accordingly.

A typical voice interaction system processes a user's instruction in three stages. First, an automatic speech recognition (ASR) system converts the voice instruction into text. Next, a semantic parsing system based on natural language processing (NLP) parses the intent that the text represents. Finally, the system requests the resources needed to carry out the task that fulfils the intent.

Both the speech recognition system and the natural language processing system require large amounts of labeled data for training. After error sample detection goes online, user input still has to be labeled continually to improve the accuracy of the error sample detection model. In the prior art, error samples are mostly labeled by users themselves through an active human-computer interaction. Because voice instructions follow no standard pattern, a single original intent may correspond to many different voice instructions. Collecting large amounts of labeled data, especially data that was wrongly recognized, usually brings a considerable improvement to the performance of the error sample detection model. In the prior art, however, the user has to switch to a separate interactive system to label an error sample. The procedure is cumbersome, so most users abandon it, and data actively submitted by users is very hard to obtain. Considerable manpower and resources must then be spent collecting error samples for the error sample library, which prevents the error sample detection model of existing voice interaction systems from being trained quickly enough to improve the user experience rapidly.

Summary of the invention

To solve the problems in the prior art, embodiments of this application provide a method and apparatus for processing a voice instruction, a smart device and a computer-readable storage medium, which obtain labeled error samples automatically from human-computer interaction and thereby improve the user experience of a voice interaction system quickly.

In one aspect, an embodiment of this application provides a method for processing a voice instruction, the method including:

receiving, from a terminal, a voice instruction that carries the user's original intent;

performing speech recognition on the voice instruction to generate the text information of the voice instruction;

parsing the text information to determine the parsed intent corresponding to the text information;

retrieving, according to the parsed intent, the resource needed to execute the voice instruction, and sending the resource to the terminal;

upon determining that the original intent was not satisfied, labeling the voice instruction, the text information and the parsed intent as an error sample and saving it to an error sample library.

Optionally, determining that the original intent was not satisfied includes:

repeatedly receiving, within a preset time period, voice instructions from the same user that have the same parsed intent; or

receiving, from the terminal, information that labels the voice instruction, the text information and the parsed intent as an error sample.

Optionally, a decision tree is used to determine whether voice instructions with the same parsed intent have been received repeatedly from the same user within the preset time period.

Optionally, the method further includes:

saving a record of the resource retrieval corresponding to the parsed intent;

if it is determined from the retrieval record, according to preset rules, that the parsed intent was not satisfied because of the resource retrieval, removing the error sample from the error sample library.

Optionally, the method further includes:

calculating the matching degree between the resource retrieval and the parsed intent, and storing the calculated value in the retrieval record;

if the value of the matching degree is less than a preset threshold, removing the error sample from the error sample library.

In another aspect, an embodiment of this application further provides a method for processing a voice instruction, the method including:

collecting a voice instruction that carries the user's original intent and sending the voice instruction to a server;

obtaining from the server the resource needed to execute the voice instruction;

upon determining that the original intent was not satisfied, sending to the server information that labels as an error sample the voice instruction, the text information obtained by speech recognition of the voice instruction, and the parsed intent obtained by parsing the text information.

Optionally, after obtaining the resource needed to execute the voice instruction, the method further includes:

providing, based on the obtained resource, information about the action corresponding to the voice instruction;

and determining that the original intent was not satisfied includes:

capturing an instruction to abandon the action corresponding to the voice instruction.

Optionally, after obtaining the resource needed to execute the voice instruction, the method further includes executing, based on the obtained resource, the action corresponding to the voice instruction;

and determining that the original intent was not satisfied includes:

capturing an instruction that terminates the action corresponding to the voice instruction within a predetermined time threshold.

In another aspect, an embodiment of this application further provides an apparatus for processing a voice instruction, the apparatus including: a receiving module, a speech recognition module, a parsing module, a resource retrieval module, a first error sample detection module and an error sample library. The receiving module is configured to receive, from a terminal, a voice instruction that carries the user's original intent. The speech recognition module is configured to perform speech recognition on the voice instruction and generate the text information of the voice instruction. The parsing module is configured to parse the text information and determine the corresponding parsed intent. The resource retrieval module is configured to retrieve, according to the parsed intent, the resource needed to execute the voice instruction and to send the resource to the terminal. The first error sample detection module is configured to determine that the original intent was not satisfied, and to label the voice instruction, the text information and the parsed intent as an error sample and save it to the error sample library. The error sample library is configured to store the error sample.

Optionally, the first error sample detection module determining that the original intent was not satisfied includes:

Optionally, the first error sample detection module is configured to use a decision tree to determine whether voice instructions with the same parsed intent have been received repeatedly from the same user within a preset time period.

Optionally, the first error sample detection module is further configured to:

Optionally, the apparatus includes: a collection module, an execution module and a second error sample detection module. The collection module is configured to collect a voice instruction that carries the user's original intent and to send the voice instruction to a server. The execution module is configured to obtain from the server the resource needed to execute the voice instruction. The second error sample detection module is configured to determine that the original intent was not satisfied, and to send to the server information that labels as an error sample the voice instruction, the text information obtained by speech recognition of the voice instruction, and the parsed intent obtained by parsing the text information.

Optionally, the execution module is further configured to, after obtaining the resource needed to execute the voice instruction,

Determining that the original intent was not satisfied includes:

Optionally, the execution module is further configured to, after obtaining the resource needed to execute the voice instruction, execute the action corresponding to the voice instruction based on the obtained resource;

Determining that the original intent was not satisfied includes:

In another aspect, an embodiment of this application further provides a smart device including a memory, a processor, and computer instructions stored in the memory and executable on the processor, wherein the processor implements the method for processing a voice instruction described above when executing the instructions.

In another aspect, an embodiment of this application further provides a computer-readable storage medium storing computer instructions, wherein the instructions implement the method for processing a voice instruction described above when executed by a processor.

The method and apparatus for processing a voice instruction provided by this application can automatically obtain wrongly recognized data from human-computer interaction, label it as error samples and save it to the error sample library. This greatly reduces the labor cost of labeling error samples, significantly speeds up the optimization of the error sample detection model, and thereby effectively improves the user experience of the voice interaction system.

Brief description of the drawings

Fig. 1 is a flow diagram of a server-side method for processing a voice instruction according to one embodiment of this application;

Fig. 2 is a flow diagram of a server-side method for processing a voice instruction according to another embodiment of this application;

Fig. 3 is a structural diagram of the decision tree of one embodiment of this application;

Fig. 4 is a flow diagram of a server-side method for processing a voice instruction according to another embodiment of this application;

Fig. 5 is a flow diagram of a server-side method for processing a voice instruction according to another embodiment of this application;

Fig. 6 is a flow diagram of a client-side method for processing a voice instruction according to one embodiment of this application;

Fig. 7 is a flow diagram of a client-side method for processing a voice instruction according to another embodiment of this application;

Fig. 8 is a flow diagram of a client-side method for processing a voice instruction according to another embodiment of this application;

Fig. 9 is a structural diagram of a server-side apparatus for processing a voice instruction according to one embodiment of this application;

Fig. 10 is a structural diagram of an apparatus for processing a voice instruction according to one embodiment of this application;

Fig. 11 is a structural diagram of the smart device of one specific embodiment of this application.

Detailed description of the embodiments

The details of this application are explained below through embodiments with reference to the accompanying drawings, which makes the content of this application easier to understand. This application can, however, be implemented in many ways other than the specific embodiments described here, and those skilled in the art can make similar extensions in combination with the prior art without departing from the substance of this application. This application is therefore not limited by the specific embodiments disclosed below.

In this application, "first", "second", "third", "fourth" and the like are used only to distinguish items from one another; they do not indicate importance or order, nor that one item presupposes another.

This application provides a method and apparatus for processing a voice instruction, a smart device and a storage medium, which are described in detail one by one in the embodiments below.

One embodiment of this application discloses a server-side method for processing a voice instruction. Referring to Fig. 1, the method includes:

Step 101: receiving, from a terminal, a voice instruction that carries the user's original intent;

Step 102: performing speech recognition on the voice instruction to generate the text information of the voice instruction;

Step 103: parsing the text information to determine the parsed intent corresponding to the text information;

Step 104: retrieving, according to the parsed intent, the resource needed to execute the voice instruction, and sending the resource to the terminal;

Step 105: upon determining that the original intent was not satisfied, labeling the voice instruction, the text information and the parsed intent as an error sample and saving it to the error sample library.

The method above decides whether to collect an error sample from the pattern of the human-computer interaction, so the user no longer needs to mark errors manually, which increases the likelihood of obtaining error samples.
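Steps 101 to 105 can be sketched as a small pipeline. This is only an illustrative sketch: the names `VoiceServer`, `ErrorSample`, `recognize`, `parse`, `retrieve` and `intent_unsatisfied` are hypothetical stand-ins for the ASR system, the NLP parser, the resource lookup and the error sample detector, none of which the application specifies at code level.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ErrorSample:
    audio: bytes   # the voice instruction
    text: str      # text information from speech recognition
    intent: str    # parsed intent


@dataclass
class VoiceServer:
    # All four callables are hypothetical stand-ins injected by the caller.
    recognize: Callable[[bytes], str]                # ASR
    parse: Callable[[str], str]                      # NLP intent parsing
    retrieve: Callable[[str], str]                   # resource retrieval
    intent_unsatisfied: Callable[[str, str], bool]   # (user_id, intent) -> bool
    error_samples: List[ErrorSample] = field(default_factory=list)

    def handle(self, user_id: str, audio: bytes) -> str:
        text = self.recognize(audio)        # step 102
        intent = self.parse(text)           # step 103
        resource = self.retrieve(intent)    # step 104 (returned to the terminal)
        if self.intent_unsatisfied(user_id, intent):   # step 105
            self.error_samples.append(ErrorSample(audio, text, intent))
        return resource
```

Passing the detector in as a callable keeps the sketch neutral about how "not satisfied" is decided; the embodiments below describe repeat detection and explicit terminal feedback as two ways of doing so.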

In one embodiment according to this application, determining in step 105 that the original intent was not satisfied covers two situations, which are described in turn below.

Take the first situation as an example. Suppose a user wants to buy the pizza named "dustbin" from a well-known pizza shop and inputs the voice instruction "taking out a dustbin Pizza" on a mobile phone. After the server receives the voice instruction, an error occurs during speech recognition: the voice instruction is converted into the text information "one dustbin of mother Pizza", and from this wrong text the server parses a wrong intent and searches for matching refuse collection service companies. Because the parsed intent does not match the user's intent, the request cannot be satisfied, and the user is likely to input the voice instruction "taking out a dustbin Pizza" again. This time the server correctly converts the voice instruction into the text information "taking out a dustbin Pizza", parses the correct intent from the content of the text, finds a list of restaurants selling pizza (matching "Pizza", "披萨" or "Piza"), and provides map routes to those shops. The user finds the address, telephone number and other details of the well-known pizza shop in the list and successfully orders the pizza named "dustbin". In this process, the wrongly recognized voice instruction "taking out a dustbin Pizza", the text information "one dustbin of mother Pizza" and the parsed intent "search for matching refuse collection service companies" are labeled as an error sample and saved to the error sample library, where they are used to improve the accuracy of the error sample detection model and to reduce the likelihood of later recognition or parsing errors.

This embodiment provides one way for the server to judge that the original intent was not satisfied: it detects whether the same user has repeatedly sent, within a short time, voice instructions representing the same parsed intent, and uses that to decide whether to collect an error sample. This removes the tedious step of the user marking and reporting errors manually and greatly increases the likelihood of obtaining error samples.
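The repeat check described above can be sketched with a sliding time window per user and parsed intent. The class name `RepeatDetector`, the 120-second window and the minimum of two earlier occurrences are illustrative assumptions, not values fixed by this embodiment.

```python
from collections import defaultdict, deque


class RepeatDetector:
    """Flags an instruction when the same user has already sent the same
    parsed intent at least `min_repeats` times inside the time window."""

    def __init__(self, window_seconds=120.0, min_repeats=2):
        self.window = window_seconds        # assumed preset time period
        self.min_repeats = min_repeats      # assumed repeat count
        self._seen = defaultdict(deque)     # (user_id, intent) -> timestamps

    def observe(self, user_id, intent, timestamp):
        times = self._seen[(user_id, intent)]
        # Drop occurrences that fell out of the window.
        while times and timestamp - times[0] > self.window:
            times.popleft()
        repeats = len(times)                # earlier occurrences still in window
        times.append(timestamp)
        return repeats >= self.min_repeats
```

A deque per key keeps each observation cheap, since expired timestamps are only ever removed from the front.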

In the second situation, if the server receives from the terminal information that labels the voice instruction, the text information and the parsed intent as an error sample, it determines that the user's original intent was not satisfied, labels the voice instruction, the text information and the parsed intent as an error sample, and saves it to the error sample library.

In another embodiment according to this application, as shown in Fig. 2, steps 201 to 204 are the same as steps 101 to 104 of the method shown in Fig. 1. In step 205, a learnable detection model is built as a decision tree to detect whether voice instructions with the same parsed intent have been received repeatedly from the same user within a preset time period, and thereby to determine whether the original intent was satisfied. If it was not, the voice instruction, the text information and the parsed intent are labeled as an error sample and saved to the error sample library.

A decision tree is a predictive model in machine learning that represents a mapping between object attributes and object values. Each internal node of the tree represents a test on an object attribute, each branch represents the objects that satisfy the node's condition, and each leaf node represents the prediction for the objects that reach it.

Fig. 3 shows the structure of the decision tree in one specific embodiment. The tree detects whether voice instructions with the same parsed intent have been received repeatedly from the same user within a preset time period, and thus determines whether the current voice instruction, text information and parsed intent should be labeled as an error sample and saved to the error sample library.

The decision tree of this structure bases its judgment on three attributes:

First, whether an instruction with the same intent has been received repeatedly;

Second, whether the instructions with the same intent come from the same user;

Third, whether the instruction was repeated two or more times within two minutes.

For example, suppose the server repeatedly receives voice instructions with the same intent, the instructions come from the same user, and they are repeated 3 times within 2 minutes. The root node of the decision tree tests whether the same intent was repeated, and the case follows the right branch (YES). The next node tests whether the instructions come from the same user, and the case again follows the right branch (YES). The last node tests whether the instruction was repeated two or more times within 2 minutes, and the case follows the right branch (YES). The case therefore lands on the leaf node "original intent not satisfied", and the current voice instruction, text information and parsed intent are labeled as an error sample and saved to the error sample library.
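The three-attribute tree walked through above can be written out by hand as nested tests. This is a hard-coded sketch of the example tree only, not a learned model; the names `Observation` and `original_intent_unsatisfied` are hypothetical, and "2 or more times" is read as inclusive.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    same_intent_repeated: bool       # attribute 1
    same_user: bool                  # attribute 2
    repeats_within_two_minutes: int  # attribute 3


def original_intent_unsatisfied(obs: Observation) -> bool:
    # Root node: was an instruction with the same intent received repeatedly?
    if not obs.same_intent_repeated:
        return False
    # Second node: do the repeated instructions come from the same user?
    if not obs.same_user:
        return False
    # Third node: repeated two or more times within two minutes?
    return obs.repeats_within_two_minutes >= 2
```

For the case in the text, `original_intent_unsatisfied(Observation(True, True, 3))` follows the YES branch at every node and reaches the "original intent not satisfied" leaf.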

The decision tree can be built with the ID3 (Iterative Dichotomiser 3) algorithm, the C4.5 algorithm, the CART algorithm, or similar.

ID3 is a greedy algorithm for constructing decision trees. It originates from the Concept Learning System and selects the test attribute by how quickly information entropy falls: at each node it chooses, as the splitting criterion, the attribute with the highest information gain among those not yet used for splitting, and it repeats this process until the resulting tree classifies the training samples perfectly.
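The attribute selection step of ID3 can be illustrated with a short sketch that computes entropy and information gain and picks the best attribute for one node. The function names and the dictionary-per-row data layout are assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy before the split minus the weighted entropy after it."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels, attributes):
    """Greedy ID3 choice: the attribute with the highest information gain."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))
```

A full ID3 implementation would call `best_attribute` recursively on each resulting subset, removing the chosen attribute each time.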

Like ID3, C4.5 is a greedy algorithm; the two differ in the criterion used for the splitting decision. Splitting on information gain favours features with many distinct values, so C4.5 instead bases the decision on the information gain ratio. C4.5 therefore builds the tree recursively in the same way as ID3 and differs only in selecting the feature with the largest information gain ratio as the splitting feature.
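The gain ratio criterion can be sketched as information gain divided by the entropy of the attribute's own values (the split information). The function names are illustrative, and the handling of a constant attribute (returning 0.0) is an assumption of this sketch.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(values).values())


def gain_ratio(column, labels):
    """Information gain of splitting on `column`, normalised by split info."""
    groups = {}
    for value, label in zip(column, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    split_info = entropy(column)   # grows with the number of distinct values
    return gain / split_info if split_info > 0 else 0.0
```

With four samples, an attribute that takes a different value for every sample has the same information gain as a clean two-valued attribute, but its split information is twice as large, so its gain ratio is halved. This is the bias toward many-valued features that the gain ratio corrects.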

The CART algorithm, also known as the classification and regression tree algorithm, produces binary trees; its binary splitting keeps the decision tree small and makes tree generation more efficient.

The greatest advantage of decision tree algorithms is that the model learns by itself: as long as the training examples are labeled well, an effective error sample detection model can be trained.

When a decision tree is deep, it easily overfits. Overfitting means that the hypothesis becomes overly strict in order to remain consistent with the training data. Avoiding overfitting is one of the core tasks of classifier design. Besides pruning by limiting the depth of the decision tree, overfitting can be prevented by building a large number of decision trees into a random forest, which avoids the weak generalization of a single tree. In other words, a single decision tree may overfit, but the overfitting can be reduced by adding breadth. Random forests also handle high-dimensional data well and train quickly even with many features. In addition, a random forest can detect interactions between features during training when predicting whether a sample is an error sample.

Therefore, the learnable detection model can be optimized with a random forest to filter error samples.

A typical method of building a random forest includes:

randomly selecting n samples from the sample set;

randomly selecting k attributes from all the attributes and choosing the best splitting attribute among them as a node to build a decision tree;

repeating the two steps above m times to build m decision trees;

combining the m decision trees into a random forest and deciding by vote which class a data item belongs to. Voting mechanisms include one-vote veto, majority rule, weighted majority, and so on.
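The construction steps above can be sketched as follows. To stay short, each "tree" here is a one-level stump rather than a full decision tree, sampling is done with replacement, and voting uses simple majority rule; all three are simplifying assumptions, and the function names are hypothetical.

```python
import random
from collections import Counter


def majority(labels):
    """Majority-rule vote over a list of labels."""
    return Counter(labels).most_common(1)[0][0]


def train_stump(rows, labels, attributes):
    """One-level tree: pick the attribute whose per-value majority fits best."""
    best = None
    for attr in attributes:
        by_value = {}
        for row, label in zip(rows, labels):
            by_value.setdefault(row[attr], []).append(label)
        rule = {value: majority(ls) for value, ls in by_value.items()}
        correct = sum(rule[row[attr]] == label for row, label in zip(rows, labels))
        if best is None or correct > best[0]:
            best = (correct, attr, rule)
    _, attr, rule = best
    default = majority(labels)   # used for attribute values unseen in training
    return lambda row: rule.get(row[attr], default)


def train_forest(rows, labels, n, k, m, seed=0):
    rng = random.Random(seed)
    attributes = sorted(rows[0])
    forest = []
    for _ in range(m):                                          # repeat m times
        picks = [rng.randrange(len(rows)) for _ in range(n)]    # n random samples
        subset = rng.sample(attributes, k)                      # k random attributes
        forest.append(train_stump([rows[i] for i in picks],
                                  [labels[i] for i in picks], subset))
    return forest


def predict(forest, row):
    return majority([tree(row) for tree in forest])
```

Swapping `majority` for a weighted vote or a one-vote veto would only change `predict`, which is why the voting mechanism is kept separate from training.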

With decision tree or random forest techniques, the everyday usage habits of users can be modeled so that error samples generated by certain special users, such as testers, are removed from the error sample library.

A concrete application is used below to illustrate the role of the random forest technique in this application, in which:

Users are divided into two classes: testers and ordinary users.

Each decision tree in the random forest corresponds to one feature used for classification; if the total number of features is 3, the forest contains 3 decision trees. The decision trees here are classification and regression trees.

Table 1 shows the parameters of the first decision tree, which classifies on the feature "average daily duration of voice instruction use":

Table 2 shows the parameters of the second decision tree, which classifies on the feature "average daily number of voice instructions uploaded":

Number of voice instructions uploaded	Tester	Ordinary user
500 or more per day	75%	1%
100 or more per day	85%	8%
50 or fewer per day	15%	75%
10 or fewer per day	1%	35%

Table 3 shows the parameters of the third decision tree, which classifies on the feature "average daily number of errors marked":

Number of errors marked	Tester	Ordinary user
More than 100 per day	80%	2%
50 or more per day	92%	15%
20 or fewer per day	30%	55%
10 or fewer per day	1%	30%

From the classification results of the three decision trees above, the class distribution for a specific user can be established from that user's information:

Feature	Feature value	Tester	Ordinary user
Average daily duration of voice instruction use	7	95%	5%
Average daily number of voice instructions uploaded	100	85%	8%
Average daily number of errors marked	50	92%	15%

The conclusion is that this user is a tester with a probability of about 91% and an ordinary user with a probability of about 9%. The user is therefore classified as a tester, and the error samples generated by this user's operations are removed from the error sample library.
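The text does not state how the three trees' outputs are combined. One reading that reproduces the quoted figures of about 91% and about 9% is a plain average of the per-tree probabilities, sketched below; the function name `average_vote` and the averaging rule itself are assumptions.

```python
def average_vote(tree_outputs):
    """Average each class's probability over all trees (assumed combination rule)."""
    classes = tree_outputs[0].keys()
    return {c: sum(t[c] for t in tree_outputs) / len(tree_outputs)
            for c in classes}


# Per-tree outputs for the specific user, taken from the table above.
outputs = [
    {"tester": 0.95, "ordinary": 0.05},   # daily duration of use
    {"tester": 0.85, "ordinary": 0.08},   # daily number of uploads
    {"tester": 0.92, "ordinary": 0.15},   # daily number of errors marked
]
scores = average_vote(outputs)
label = max(scores, key=scores.get)
```

Here (0.95 + 0.85 + 0.92) / 3 is roughly 0.907 and (0.05 + 0.08 + 0.15) / 3 is roughly 0.093, so the user is classed as a tester.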

In another embodiment according to this application, the method includes:

Step 401: receiving, from a terminal, a voice instruction that carries the user's original intent;

Step 402: performing speech recognition on the voice instruction to generate the text information of the voice instruction;

Step 403: parsing the text information to determine the parsed intent corresponding to the text information;

Step 404: retrieving, according to the parsed intent, the resource needed to execute the voice instruction, sending the resource to the terminal, and saving a record of the resource retrieval corresponding to the parsed intent;

Step 405: upon determining that the original intent was not satisfied, labeling the voice instruction, the text information and the parsed intent as an error sample and saving it to the error sample library;

Step 406: if it is determined from the retrieval record, according to preset rules, that the parsed intent was not satisfied because of the resource retrieval, removing the error sample from the error sample library.

Situations in which the parsed intent is not satisfied because of the resource retrieval include:

a network connection error that prevents a retrieval result from being obtained;

a wrong retrieval method that produces a wrong retrieval result; or

a limitation of the search library that prevents the needed retrieval result from being obtained.

An unsatisfied user intent may be caused by an error in speech recognition or semantic parsing, but it may also be caused by a failed or wrong resource retrieval. If the record of the resource retrieval is saved together with the error sample, then interfering error samples, in which speech recognition and semantic parsing were both correct and the user's intent went unsatisfied only because retrieval failed or went wrong, can later be removed from the error sample library, for example by manual review. This improves the accuracy of the error sample detection model.

In one specific application, the server receives the voice instruction "play the film 《AABBCC》" from the user and obtains the text information "play the film 《AABBCC》" by speech recognition. After the text information is parsed, resource retrieval finds no video resource for a film named 《AABBCC》. The user then inputs the voice instruction repeatedly, but because no matching film resource can be retrieved, the user's intent is never satisfied. The voice instruction "play the film 《AABBCC》", the corresponding text information and parsed intent, and the record of the resource retrieval corresponding to the parsed intent are therefore all labeled as an error sample and saved to the error sample library. Clearly no error occurred in speech recognition or semantic parsing, so this interfering error sample can be removed from the error sample library by manual screening.
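The rule-based removal of step 406 can be sketched as a filter over the error sample library. The status strings in `RETRIEVAL_FAILURES`, the `RetrievalRecord` fields and the function name are hypothetical; the application only says that preset rules are applied to the retrieval record.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical status codes for the three retrieval-side causes listed above.
RETRIEVAL_FAILURES = {"network_error", "wrong_query_mode", "not_in_library"}


@dataclass
class RetrievalRecord:
    status: str   # "ok" or one of the failure codes above


@dataclass
class ErrorSample:
    text: str
    intent: str
    retrieval: RetrievalRecord


def prune_retrieval_failures(library: List[ErrorSample]) -> List[ErrorSample]:
    """Keep only samples whose unmet intent was not caused by retrieval."""
    return [s for s in library if s.retrieval.status not in RETRIEVAL_FAILURES]
```

In the film example, a sample whose record carries a status such as `"not_in_library"` would be dropped, while a sample whose retrieval succeeded but whose intent was still unmet would stay in the library.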

In another embodiment according to this application, to avoid tedious manual screening, the matching degree between the resource and the request is saved in the retrieval record so that interfering error samples can be screened out automatically. The method includes:

Step 501: receiving, from a terminal, a voice instruction that carries the user's original intent;

Step 502: performing speech recognition on the voice instruction to generate the text information of the voice instruction;

Step 503: parsing the text information to determine the parsed intent corresponding to the text information;

Step 504: retrieving, according to the parsed intent, the resource needed to execute the voice instruction, sending the resource to the terminal, and saving a record of the resource retrieval corresponding to the parsed intent;

Step 505: calculating the matching degree between the resource retrieval and the parsed intent, and storing the calculated value in the retrieval record;

Step 506: upon determining that the original intent was not satisfied, labeling the voice instruction, the text information and the parsed intent as an error sample and saving it to the error sample library;

Step 507: if the value of the matching degree is less than a preset threshold, removing the error sample from the error sample library.

Through the steps above, error samples that were labeled because the resource retrieval failed to satisfy the user's intent can be screened out and removed from the error sample library, which further improves the accuracy of the error sample detection model.

In an optional embodiment, error samples that were labeled because the resource retrieval failed to satisfy the user's intent can be filtered out before they are saved to the error sample library.

There are many ways to calculate the matching degree between the resource retrieval and the parsed intent; a concrete application is used below for illustration. Suppose a user wants to search for information about singing at a KTV named "Lollipop" and therefore inputs the voice instruction "Lollipop KTV singing". Speech recognition of the voice instruction yields the corresponding text information "Lollipop KTV singing". Splitting the text information yields three keywords, "Lollipop", "KTV", and "singing", and parsing the keywords yields the parsed intent of searching for a KTV whose name contains the keyword "Lollipop". However, because no KTV named "Lollipop" exists, only the following four kinds of search results can be provided as feedback:

1. providing information on places for singing whose names do not contain the keyword "Lollipop";

2. placing a call to a contact whose name contains the keywords "Lollipop" and/or "KTV" and/or "singing";

3. adding a "Lollipop KTV singing" event to the calendar;

4. playing a song that includes the keywords "Lollipop" and/or "KTV".

Clearly, none of the four kinds of feedback above can satisfy the user's original intent. In each case, however, the matching degree between the resource retrieval and the parsed intent is low. Assume that the matching degree calculated for the first case is 50%, for the second case 40%, for the third case 30%, and for the fourth case 15%, all of which are less than the preset threshold of 70%. Then, even if the server repeatedly receives this same voice instruction from the same user, the corresponding error sample will ultimately be removed from the error sample library, because the value of the matching degree between the resource retrieval and the parsed intent is less than the preset threshold.
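The rejection rule of step 507, applied to the four feedback cases above, can be sketched as follows. This is only a minimal illustration under stated assumptions: the function name `reject_low_match_samples`, the dictionary layout, and the field name `matching_degree` are hypothetical and do not appear in the application; only the four percentages and the 70% threshold come from the example.

```python
# Hypothetical sketch of the step 507 filter; names and data layout are assumptions.
PRESET_THRESHOLD = 0.70  # the 70% threshold used in the example


def reject_low_match_samples(error_samples, threshold=PRESET_THRESHOLD):
    """Keep only error samples whose matching degree is not below the threshold.

    Samples below the threshold are treated as caused by resource retrieval
    and are therefore dropped from the error sample library.
    """
    return [s for s in error_samples if s["matching_degree"] >= threshold]


# The four feedback cases from the example, with their assumed matching degrees.
feedback_cases = [
    {"feedback": "singing venue without the keyword in its name", "matching_degree": 0.50},
    {"feedback": "call a contact matching the keywords", "matching_degree": 0.40},
    {"feedback": "add a calendar event", "matching_degree": 0.30},
    {"feedback": "play a song matching the keywords", "matching_degree": 0.15},
]

remaining = reject_low_match_samples(feedback_cases)
print(len(remaining))  # 0: every case is below the threshold and is rejected
```

A sample whose matching degree reached the threshold would stay in the library, since its unmet intent could not be attributed to the retrieval step.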

The calculation of the matching degree depends on the type of resource retrieval. Take the retrieval of a song as an example. The voice instruction is to play song S1, where S1 is the character string of the song title. However, an error in speech recognition or semantic parsing causes the song title contained in the final parsed intent to be the character string S2. S1 and S2 correspond to the pinyin character strings P1 and P2, respectively. The value of the matching degree when the character string S2 is used as the search condition is calculated by the formula M = 1 - d / max(len(P1), len(P2)), where M is the matching degree, d is the edit distance between P1 and P2, len(P1) is the length of the pinyin character string P1, len(P2) is the length of the pinyin character string P2, and max(len(P1), len(P2)) is the larger of the two string lengths.
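The formula above can be sketched in a few lines. The sketch assumes that d is the Levenshtein edit distance (insertions, deletions, and substitutions each cost 1), which the application does not specify, and that the pinyin strings P1 and P2 have already been produced elsewhere; the function names are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b (assumed definition of d)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def matching_degree(p1: str, p2: str) -> float:
    """M = 1 - d / max(len(P1), len(P2)) for two pinyin strings."""
    longest = max(len(p1), len(p2))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical (an assumption)
    return 1 - edit_distance(p1, p2) / longest


# Stand-in strings: one substitution in a string of length 4.
print(matching_degree("abcc", "adcc"))  # 0.75
```

With this definition M lies between 0 and 1, equals 1 for identical strings, and falls as the recognized title drifts further from the requested one.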

In one embodiment of the application, a client-side method for processing a voice instruction is disclosed. Referring to Fig. 6, the method includes:

Step 601: Acquire a voice instruction containing the user's original intent and send the voice instruction to a server;

Step 602: Obtain from the server the resource needed to execute the voice instruction;

Step 603: Determine that the original intent is unmet, and send to the server information that labels as an error sample the voice instruction, the text information obtained by speech recognition of the voice instruction, and the parsed intent obtained by parsing the text information.

The above method decides whether to collect an error sample according to the user's reaction pattern during human-computer interaction, so the user need not perform any manual error-labeling operation, which increases the likelihood of obtaining error samples.

Optionally, another embodiment of the application provides another client-side method for processing a voice instruction. Referring to Fig. 7, the method includes:

Step 701: Acquire a voice instruction containing the user's original intent and send the voice instruction to a server;

Step 702: Obtain from the server the resource needed to execute the voice instruction;

Step 703: Provide, based on the obtained resource, information about the execution action corresponding to the voice instruction;

Step 704: Capture an instruction to abandon the execution action corresponding to the voice instruction, determine that the original intent is unmet, and send to the server information that labels as an error sample the voice instruction, the text information obtained by speech recognition of the voice instruction, and the parsed intent obtained by parsing the text information.

In the above method, feedback information is provided to the user based on the obtained resource, and the feedback information tells the user which action the client is about to perform. The feedback information may be text feedback or voice feedback implemented with TTS (Text To Speech) technology; a common text-to-speech conversion unit can convert the text information into audio information. On seeing or hearing the feedback information, the user can judge whether the request will be met. In general, a user actively abandons the subsequent action only when the intent has not been satisfied. Therefore, if the client receives an instruction from the user to abandon the subsequent action, it can be inferred that the resource the client obtained for executing the voice instruction does not satisfy the user's actual request. At that point, the voice instruction, the text information obtained by speech recognition of the voice instruction, and the parsed intent obtained by parsing the text information are labeled as an error sample and saved to the error sample library. This not only spares the user the cumbersome process of labeling error samples manually, but also collects error samples automatically while fully conforming to the user's normal operating habits, which greatly increases the probability of obtaining valid error samples.
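The client-side check described above can be sketched as follows. This is a hedged illustration only: the set `ABANDON_COMMANDS`, the function `build_error_report`, and the layout of the returned dictionary are hypothetical, since the application does not define how an abandon instruction is recognized or how the labeling information is encoded.

```python
# Hypothetical abandon commands; the application does not list concrete ones.
ABANDON_COMMANDS = {"cancel", "stop", "never mind"}


def build_error_report(user_reply, voice_instruction, text_information, parsed_intent):
    """Return labeling information for the server if the user abandons the action.

    Returns None when the reply is not an abandon instruction, in which case
    the original intent is assumed to have been met and nothing is reported.
    """
    if user_reply.strip().lower() not in ABANDON_COMMANDS:
        return None
    return {
        "label": "error_sample",
        "voice_instruction": voice_instruction,
        "text_information": text_information,
        "parsed_intent": parsed_intent,
    }


report = build_error_report(
    " Cancel ",
    voice_instruction="<audio reference>",
    text_information="play video ADCC",
    parsed_intent="play the video named ADCC",
)
print(report["label"])  # error_sample
```

The report carries all three items together so that the server can store them as one error sample.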

Another embodiment of the application provides a client-side method for processing a voice instruction. Referring to Fig. 8, the method includes:

Step 801: Acquire a voice instruction containing the user's original intent and send the voice instruction to a server;

Step 802: Obtain from the server the resource needed to execute the voice instruction;

Step 803: Execute, based on the obtained resource, the execution action corresponding to the voice instruction;

Step 804: Capture an instruction that terminates the execution action corresponding to the voice instruction within a predetermined time threshold, determine that the original intent is unmet, and send to the server information that labels as an error sample the voice instruction, the text information obtained by speech recognition of the voice instruction, and the parsed intent obtained by parsing the text information.

For example, the voice instruction input by the user, "play video 《ABCC》", is acquired and uploaded to the server. An error occurs when the server performs speech recognition, and the voice instruction is recognized as the erroneous text information "play video 《ADCC》". The parsed intent obtained from the erroneous text information is to play the video named 《ADCC》, and the video resource 《ADCC》 needed to execute the parsed intent is retrieved according to the finally determined parsed intent. After obtaining the video resource 《ADCC》, the client starts playback. The user then finds that the video being played is not the requested 《ABCC》 and therefore issues an instruction to terminate playback 5 seconds into the video. The client captures this instruction and thereby determines that the user's original intent is unmet. The voice instruction "play video 《ABCC》", the text information "play video 《ADCC》" obtained by speech recognition of the voice instruction, and the parsed intent "play the video named 《ADCC》" obtained by parsing the text information are labeled as an error sample and saved to the error sample library.
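The check in step 804, applied to the playback example above, might look like the sketch below. The names `ErrorSample` and `detect_early_termination` are hypothetical, and the 10-second threshold is an assumed value chosen for illustration; the application only states that a predetermined time threshold exists and that the user stopped playback after 5 seconds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ErrorSample:
    voice_instruction: str
    text_information: str
    parsed_intent: str


def detect_early_termination(started_at: float, terminated_at: float,
                             time_threshold: float,
                             sample: ErrorSample) -> Optional[ErrorSample]:
    """Return the sample to report if the action was stopped within the threshold.

    Times are in seconds. A termination after the threshold is treated as a
    normal stop, so nothing is reported.
    """
    elapsed = terminated_at - started_at
    if 0 <= elapsed <= time_threshold:
        return sample
    return None


candidate = ErrorSample(
    voice_instruction="play video ABCC",
    text_information="play video ADCC",
    parsed_intent="play the video named ADCC",
)
# Playback stopped 5 seconds in, with an assumed threshold of 10 seconds.
flagged = detect_early_termination(0.0, 5.0, 10.0, candidate)
print(flagged is not None)  # True
```

The threshold separates a quick rejection of the wrong video from a user who simply stops watching later on.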

The above method likewise collects error samples automatically in a way that conforms to the user's normal operating habits, which spares the user the cumbersome process of labeling error samples manually and increases the probability of obtaining valid error samples.

One embodiment of the application discloses a server-side device for processing a voice instruction. Referring to Fig. 9, the device includes: a receiving module 901, a speech recognition module 902, a parsing module 903, a resource retrieval module 904, a first error sample detection module 905, and an error sample library 906. The receiving module 901 is configured to receive from a terminal a voice instruction containing the user's original intent. The speech recognition module 902 is configured to perform speech recognition on the voice instruction and generate the text information of the voice instruction. The parsing module 903 is configured to parse the text information and determine the parsed intent corresponding to the text information. The resource retrieval module 904 is configured to retrieve, according to the parsed intent, the resource needed to execute the voice instruction and to send the resource to the terminal. The first error sample detection module 905 is configured to determine that the original intent is unmet, label the voice instruction, the text information, and the parsed intent as an error sample, and save it to the error sample library 906. The error sample library 906 is configured to store the error samples.

The above device can decide whether to collect an error sample according to the user's reaction pattern during human-computer interaction, so the user need not perform any manual error-labeling operation, which increases the likelihood of obtaining error samples.

In another embodiment of the application, the first error sample detection module 905 determining that the original intent is unmet includes:

By capturing voice instructions that the same user repeatedly uploads within a short time and that represent the same parsed intent, the detection module 905 of the device determines when to collect an error sample. This omits the tedious step of the user manually giving error-label feedback and greatly increases the likelihood of obtaining error samples. Another way of judging is to save to the error sample library the error samples that the client has determined to meet the criterion that the original intent is unmet.
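The repeated-instruction check described above can be sketched with a simple sliding time window. Note that this is a fixed rule rather than a learnable detection model, and that the class name `RepeatDetector`, the 30-second period, and the minimum of two occurrences are assumptions made for illustration.

```python
from collections import defaultdict, deque


class RepeatDetector:
    """Flags a user who repeats the same parsed intent within a preset period."""

    def __init__(self, period_seconds: float, min_repeats: int = 2):
        self.period = period_seconds
        self.min_repeats = min_repeats
        # Timestamps of recent instructions, keyed by (user, parsed intent).
        self._history = defaultdict(deque)

    def observe(self, user_id: str, parsed_intent: str, timestamp: float) -> bool:
        """Record one instruction; return True if it counts as a repetition."""
        times = self._history[(user_id, parsed_intent)]
        times.append(timestamp)
        # Drop timestamps that fell out of the preset period.
        while times and timestamp - times[0] > self.period:
            times.popleft()
        return len(times) >= self.min_repeats


detector = RepeatDetector(period_seconds=30)
print(detector.observe("user-1", "play song X", 0))   # False: first occurrence
print(detector.observe("user-1", "play song X", 10))  # True: repeated within 30 s
```

When `observe` returns True, the server would treat the original intent as unmet and label the corresponding voice instruction, text information, and parsed intent as an error sample.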

In another embodiment of the application, the first error sample detection module is configured to build a learnable detection model in the form of a decision tree. The model detects whether voice instructions with the same parsed intent are repeatedly received from the same user within a preset period, and thereby determines whether the original intent has been met. If the original intent is unmet, the voice instruction, the text information, and the parsed intent are labeled as an error sample and saved to the error sample library.

The greatest advantage of the decision tree algorithm is that it enables the model to learn by itself: the training examples only need to be labeled well. The learnable detection model can be optimized with the random forest method so that it can filter error samples.

The random forest algorithm can prevent overfitting and overcomes the weak generalization ability of a single decision tree.

In another embodiment of the application, the first error sample detection module 905 is further configured to:

Because the user's intent may also go unsatisfied as a result of a resource retrieval failure or error, the device configured with the first error sample detection module 905 described above saves a record of the resource retrieval when it saves an error sample. Subsequently, through manual re-inspection or similar means, the interfering error samples whose speech recognition and semantic parsing were correct, and whose user intent went unsatisfied only because of a retrieval failure or error, can be removed from the error sample library, thereby improving the accuracy of the error sample detection model.

According to another embodiment of the application, the first error sample detection module 905 is further configured to:

The device configured with the first error sample detection module 905 described above can screen out the error samples that were labeled because the resource retrieval failed to satisfy the user's intent and remove them from the error sample library, which further improves the accuracy of the error sample detection model.

One embodiment of the application discloses a client-side device for processing a voice instruction. As shown in Fig. 10, the device includes: an acquisition module 1001, an execution module 1002, and a second error sample detection module 1003. The acquisition module 1001 is configured to acquire a voice instruction containing the user's original intent and send the voice instruction to a server. The execution module 1002 is configured to obtain from the server the resource needed to execute the voice instruction. The second error sample detection module 1003 is configured to determine that the original intent is unmet and to send to the server information that labels as an error sample the voice instruction, the text information obtained by speech recognition of the voice instruction, and the parsed intent obtained by parsing the text information.

To make it easier to describe how the client and the server cooperate, Fig. 10 also shows the structure of the server-side voice instruction processing device that cooperates with the client-side voice instruction processing device. The receiving module 1004 is configured to receive the voice instruction containing the user's original intent sent by the acquisition module 1001 of the terminal. The voice instruction is passed to the speech recognition module 1005 for speech recognition, which generates the text information of the voice instruction. The text information is passed to the parsing module 1006, which determines the parsed intent corresponding to the text information. The resource retrieval module 1007 then retrieves, according to the parsed intent, the resource needed to execute the voice instruction and sends the resource to the execution module 1002 of the terminal. After the execution module 1002 of the terminal obtains from the server the resource needed to execute the voice instruction, the second error sample detection module 1003 determines whether the user's original intent has been met. If it determines that the original intent is unmet, it sends to the server information that labels as an error sample the voice instruction, the text information obtained by speech recognition of the voice instruction, and the parsed intent obtained by parsing the text information. The first error sample detection module 1003 can save a record of the resource retrieval corresponding to the parsed intent; if it is determined from the retrieval record, according to preset rules, that the parsed intent went unmet because of the resource retrieval, the error sample is removed from the error sample library 1009.

The error sample library 1009 is generally located on the server side, although the possibility of setting up a dedicated error sample library on the client for a specific user is not excluded.

The above device can decide whether to collect an error sample according to the user's reaction pattern during human-computer interaction, so the user need not perform any manual error-labeling operation, which increases the likelihood of obtaining error samples.

According to one embodiment of the application, the execution module 1002 is further configured to, after obtaining the resource needed to execute the voice instruction, provide information about the execution action corresponding to the voice instruction based on the obtained resource;

Determining that the original intent is unmet includes:

The device configured with the above execution module 1002 not only spares the user the cumbersome process of labeling error samples manually, but also collects error samples automatically while fully conforming to the user's normal operating habits, which greatly increases the probability of obtaining valid error samples.

According to another embodiment of the application, the execution module 1002 is further configured to, after obtaining the resource needed to execute the voice instruction, execute the execution action corresponding to the voice instruction based on the obtained resource;

Determining that the original intent is unmet includes:

The device configured with the above execution module 1002 can likewise collect error samples automatically in a way that conforms to the user's normal operating habits, which spares the user the cumbersome process of labeling error samples manually and increases the probability of obtaining valid error samples.

One embodiment of the application provides a smart device 1100, as shown in Fig. 11, including but not limited to a memory 1101, a processor 1102, and computer instructions that are stored in the memory 1101 and executable on the processor 1102. The processor 1102 implements the voice instruction processing method described above when executing the instructions.

The above is an exemplary scheme of the smart device of this embodiment. It should be noted that the technical solution of the smart device and the technical solution of the voice instruction processing method described above belong to the same concept; for details not described in the technical solution of the smart device, refer to the description of the technical solution of the voice instruction processing method.

One embodiment of the application provides a computer-readable storage medium on which computer instructions are stored. When executed by a processor, the instructions implement the voice instruction processing method described above.

The computer instructions include computer program code, which may be in source code form, object code form, an executable file, certain intermediate forms, and so on. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), an electrical carrier signal, a telecommunication signal, a software distribution medium, and so on. It should be noted that the content covered by the computer-readable medium may be increased or decreased as appropriate according to the requirements of legislation and patent practice in a jurisdiction; for example, in certain jurisdictions, according to legislation and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunication signals.

The above is an exemplary scheme of the computer-readable storage medium of this embodiment. It should be noted that the technical solution of the storage medium and the technical solution of the voice instruction processing method described above belong to the same concept; for details not described in the technical solution of the storage medium, refer to the description of the technical solution of the voice instruction processing method.

It should be noted that, for simplicity of description, each of the foregoing method embodiments is expressed as a series of combined actions. Those skilled in the art should understand, however, that the application is not limited by the described order of actions, because according to the application certain steps may be performed in other orders or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and that the actions and modules involved are not necessarily required by the application.

In the above embodiments, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, refer to the related descriptions of the other embodiments.

The preferred embodiments of the application disclosed above are intended only to help illustrate the application. The alternative embodiments do not describe all details exhaustively, nor do they limit the application to the specific embodiments described. Obviously, many modifications and variations can be made in light of the content of this specification. These embodiments were chosen and specifically described in order to better explain the principles and practical applications of the application, so that those skilled in the art can better understand and use the application. The application is limited only by the claims and their full scope and equivalents.

Claims

1. A method for processing a voice instruction, characterized in that the method includes:

retrieving, according to the parsed intent, the resource needed to execute the voice instruction, and sending the resource to the terminal;

determining that the original intent is unmet, labeling the voice instruction, the text information, and the parsed intent as an error sample, and saving it to the error sample library.

2. The method according to claim 1, characterized in that determining that the original intent is unmet includes:

repeatedly receiving, within a preset period, voice instructions from the same user that have the same parsed intent; or receiving from the terminal information that labels the voice instruction, the text information, and the parsed intent as an error sample.

3. The method according to claim 2, characterized in that a decision tree is used to determine whether voice instructions with the same parsed intent are repeatedly received from the same user within the preset period.

4. The method according to claim 1 or 2, characterized in that the method further includes:

if it is determined from the retrieval record, according to preset rules, that the parsed intent went unmet because of the resource retrieval, removing the error sample from the error sample library.

5. The method according to claim 1 or 2, characterized in that the method further includes:

calculating the matching degree between the resource retrieval and the parsed intent, and storing the calculated matching degree value in the retrieval record;

6. A method for processing a voice instruction, characterized in that the method includes:

obtaining from the server the resource needed to execute the voice instruction;

determining that the original intent is unmet, and sending to the server information that labels as an error sample the voice instruction, the text information obtained by speech recognition of the voice instruction, and the parsed intent obtained by parsing the text information.

7. The method according to claim 6, characterized in that, after the resource needed to execute the voice instruction is obtained, the method further includes:

determining that the original intent is unmet includes:

8. The method according to claim 6, characterized in that, after the resource needed to execute the voice instruction is obtained, the method further includes:

determining that the original intent is unmet includes:

9. A device for processing a voice instruction, characterized in that the device includes: an acquisition module, an execution module, and a second error sample detection module; the acquisition module is configured to acquire a voice instruction containing the user's original intent and send the voice instruction to a server; the execution module is configured to obtain from the server the resource needed to execute the voice instruction; the second error sample detection module is configured to determine that the original intent is unmet and to send to the server information that labels as an error sample the voice instruction, the text information obtained by speech recognition of the voice instruction, and the parsed intent obtained by parsing the text information.

10. A computer-readable storage medium on which computer instructions are stored, characterized in that the instructions, when executed by a processor, implement the method for processing a voice instruction according to any one of claims 1 to 5 or 6 to 8.