CN115662430B - Input data analysis method, device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN115662430B
Authority
CN
China
Prior art keywords
analysis result
offline
input data
result
instruction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211338183.9A
Other languages
Chinese (zh)
Other versions
CN115662430A (en
Inventor
周文欢
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Apollo Zhilian Beijing Technology Co Ltd
Apollo Zhixing Technology Guangzhou Co Ltd
Original Assignee
Apollo Zhilian Beijing Technology Co Ltd
Apollo Zhixing Technology Guangzhou Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Apollo Zhilian Beijing Technology Co Ltd, Apollo Zhixing Technology Guangzhou Co Ltd filed Critical Apollo Zhilian Beijing Technology Co Ltd
Priority to CN202211338183.9A priority Critical patent/CN115662430B/en
Publication of CN115662430A publication Critical patent/CN115662430A/en
Application granted granted Critical
Publication of CN115662430B publication Critical patent/CN115662430B/en

Abstract

The disclosure provides an input data analysis method, an input data analysis device, electronic equipment and a storage medium, and relates to the field of data processing, in particular to voice technology, the Internet of Things and automatic driving. The implementation is as follows: input data provided by a user is sent to a server so that the server performs online analysis on the input data; offline analysis is performed on the input data to obtain an offline analysis result, and trusted detection is performed on the offline analysis result; when the offline analysis result is trusted and the online analysis result has not been received, the trusted offline analysis result is acquired and determined as the analysis result of the input data; when the offline analysis result is not trusted, or the online analysis result is received first, the online analysis result is acquired and determined as the analysis result of the input data. Embodiments of the disclosure can improve the accuracy of input data analysis.

Description

Input data analysis method, device, electronic equipment and storage medium
Technical Field
The disclosure relates to the field of data processing, in particular to voice technology, the Internet of Things and automatic driving, and specifically to an input data analysis method, an input data analysis device, electronic equipment and a storage medium.
Background
With the popularization of intelligent devices, man-machine interaction is becoming increasingly convenient. Voice interaction, gesture interaction and the like are more convenient than typing, a mouse or touch-screen control: they allow a machine to understand human language and respond to it, so that the machine can better serve people.
Specifically, the voice interaction device can upload the received voice to the cloud, and voice recognition and natural language understanding can be performed by means of the powerful processing capacity of the cloud.
Disclosure of Invention
The disclosure provides an input data parsing method, an input data parsing device, electronic equipment and a storage medium.
According to an aspect of the present disclosure, there is provided an input data parsing method, including:
sending input data provided by a user to a server, so that the server performs online analysis on the input data;
performing offline analysis on the input data to obtain an offline analysis result, and performing trusted detection on the offline analysis result;
when the offline analysis result is trusted and the online analysis result has not been received, acquiring the trusted offline analysis result, and determining the trusted offline analysis result as the analysis result of the input data;
and when the offline analysis result is not trusted, or the online analysis result is received first, acquiring the online analysis result, and determining the online analysis result as the analysis result of the input data.
According to an aspect of the present disclosure, there is provided an input data parsing apparatus including:
the input data acquisition module is used for sending input data provided by a user to the server so that the server can analyze the input data online;
the offline analysis trusted detection module is used for carrying out offline analysis on the input data to obtain an offline analysis result and carrying out trusted detection on the offline analysis result;
the trusted result acquisition module is used for acquiring the trusted offline analysis result and determining the trusted offline analysis result as the analysis result of the input data when the offline analysis result is trusted and the online analysis result has not been received;
the online result acquisition module is used for acquiring the online analysis result and determining the online analysis result as the analysis result of the input data when the offline analysis result is not trusted or the online analysis result is received first.
According to another aspect of the present disclosure, there is provided an electronic device including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the input data parsing method of any of the embodiments of the present disclosure.
According to another aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing the computer to perform the input data parsing method of any embodiment of the present disclosure.
According to another aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the input data parsing method of any embodiment of the present disclosure.
The embodiment of the disclosure can improve the accuracy of input data analysis.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The drawings are for a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a flow chart of a method of input data parsing disclosed in accordance with an embodiment of the present disclosure;
FIG. 2 is a flow chart of another input data parsing method disclosed in accordance with an embodiment of the present disclosure;
FIG. 3 is a flow chart of another input data parsing method disclosed in accordance with an embodiment of the present disclosure;
FIG. 4 is a scenario diagram of another input data parsing method disclosed in accordance with an embodiment of the present disclosure;
FIG. 5 is a block diagram of an input data parsing apparatus according to an embodiment of the present disclosure;
FIG. 6 is a block diagram of an electronic device for implementing an input data parsing method of an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Fig. 1 is a flowchart of an input data parsing method according to an embodiment of the present disclosure, which may be suitable for a case of parsing input data. The method of the embodiment can be executed by an input data analysis device, and the device can be realized in a software and/or hardware mode and is specifically configured in an electronic device with a certain data operation capability, wherein the electronic device can be a client device, and the client device can be a mobile phone, a tablet personal computer, a vehicle-mounted terminal, a desktop computer, an internet of things device and the like.
S101, sending input data provided by a user to a server, so that the server carries out online analysis on the input data.
The input data is the data from which an instruction is recognized and parsed; the instruction directs the corresponding module or device to act, thereby realizing man-machine interaction. The input data provided by the user may be text, voice, image, video and the like. The input data (i.e., the source data) may be sent to the server directly, or it may be processed first and the processed input data sent to the server. The server performs online analysis on the input data to determine the intention of the user. The server is located in a network and communicates with the current electronic device over that network; accordingly, the analysis performed by the server is an online analysis process, and the analysis result fed back by the server is the online analysis result.
S102, performing offline analysis on the input data to obtain an offline analysis result, and performing trusted detection on the offline analysis result.
Offline analysis means analyzing the input data with local resources, that is, without using network resources. The offline analysis result is the result obtained by an analysis process that runs locally and can therefore run in an environment where network resources are difficult to obtain, such as an environment with no network or a weak network. The offline analysis result may include an action and the object of the action, e.g., operation + object. Illustratively, if the input is the spoken sentence "open window", the offline analysis result is the operation "open" and the operation object "window". Trusted detection checks whether the offline analysis result is trusted. By way of example, at least one parameter such as confidence, accuracy, whether an instruction can be generated, and whether the generated instruction is usable may be used to detect whether the offline analysis result is trusted.
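The operation + object form of the offline analysis result, together with a simple trusted check, might be sketched as follows. This is only an illustration under stated assumptions: the names `OfflineResult`, `parse_offline` and `is_trusted`, the keyword sets, the fixed confidence values and the 0.8 threshold are all hypothetical and are not specified by the disclosure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OfflineResult:
    operation: Optional[str]  # e.g. "open"
    target: Optional[str]     # e.g. "window"
    confidence: float         # hypothetical score in [0, 1]


# Hypothetical toy vocabulary standing in for a real offline parser.
KNOWN_OPERATIONS = {"open", "close"}
KNOWN_TARGETS = {"window", "sunroof"}


def parse_offline(text: str) -> OfflineResult:
    """Parse text with local resources only (no network access)."""
    words = text.lower().split()
    operation = next((w for w in words if w in KNOWN_OPERATIONS), None)
    target = next((w for w in words if w in KNOWN_TARGETS), None)
    # Assumed scoring: high when both parts were found, low otherwise.
    confidence = 0.9 if operation and target else 0.2
    return OfflineResult(operation, target, confidence)


def is_trusted(result: OfflineResult, threshold: float = 0.8) -> bool:
    """Trusted only if an instruction can be formed and confidence is high enough."""
    return (
        result.operation is not None
        and result.target is not None
        and result.confidence >= threshold
    )
```

With these assumptions, `parse_offline("open window")` yields the operation "open" on the object "window" and passes the check, while an utterance outside the toy vocabulary does not.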
S103, when the offline analysis result is trusted and the online analysis result has not been received, acquiring the trusted offline analysis result, and determining the analysis result of the input data according to the trusted offline analysis result.
S104, when the offline analysis result is not trusted, or the online analysis result is received first, acquiring the online analysis result, and determining the online analysis result as the analysis result of the input data.
In practice, both the online analysis result and the trusted offline analysis result can be regarded as accurate. Of the two, whichever is acquired first in time is determined as the analysis result of the input data. If the offline analysis result is not trusted, the device waits for the online analysis result and determines the received online analysis result as the analysis result of the input data.
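The selection rule of S103 and S104 can be written as a small decision function. This is a sketch, not the patented implementation: the function name `choose_result` and the convention that `None` means "not yet received" are assumptions made for illustration.

```python
def choose_result(offline, offline_trusted, online):
    """Return (source, result), or None if the caller must keep waiting.

    `online` is None while the server has not answered yet;
    `offline` is None while offline analysis has not finished.
    """
    if online is not None:
        # The online result is already here: use it (S104).
        return ("online", online)
    if offline is not None and offline_trusted:
        # Trusted offline result and no online result yet (S103).
        return ("offline", offline)
    # Offline result missing or untrusted: wait for the server.
    return None
```

Calling this function each time either result arrives gives the "first trusted result wins" behaviour described above.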
Optionally, the input data parsing method further includes: under the condition that the offline analysis result is credible, acquiring the credible offline analysis result; intercepting an online analysis result sent by the server; waiting for receiving an online analysis result fed back by the server under the condition that the offline analysis result is not credible; and receiving an online analysis result fed back by the server.
When the trusted offline analysis result has been obtained and the online analysis result has not yet been received, the online analysis result is intercepted: it is not processed and can be discarded directly, and the resources reserved for waiting for and processing it are released. Conversely, if the online analysis result is obtained while offline analysis or trusted detection of the offline analysis result is still in progress, the offline analysis process is stopped and the resources it uses are released.
If the offline analysis result is not trusted, the device waits to receive the online analysis result. If the wait times out, a response-timeout message is fed back to the user. This avoids processing an untrusted offline analysis result, which would cause erroneous operations to be executed; in the automatic driving field, for example, processing an untrusted offline analysis result could even cause a safety accident.
Processing the trusted offline analysis result while intercepting the online analysis result, and waiting for the online analysis result when the offline analysis result is not trusted, improves the reliability of the analysis result that is adopted, and therefore the analysis accuracy and the accuracy of the interaction based on it. Because the analysis result acquired first can be executed correctly, using it as the input of subsequent processing also improves the response speed of man-machine interaction.
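The interplay of interception, cancellation and timeout described above could be sketched with `asyncio` as below. The function `resolve`, its parameters, the 3-second default timeout and the two demonstration parsers are hypothetical; error handling for a failed network request is omitted for brevity.

```python
import asyncio


async def resolve(input_data, online_parse, offline_parse, is_trusted, timeout=3.0):
    """Race online and offline analysis; return (source, result)."""
    online = asyncio.create_task(online_parse(input_data))
    offline = asyncio.create_task(offline_parse(input_data))
    done, _ = await asyncio.wait(
        {online, offline}, return_when=asyncio.FIRST_COMPLETED
    )

    if online in done:
        # Online result arrived first: stop offline analysis, free its resources.
        offline.cancel()
        return "online", online.result()

    offline_result = offline.result()
    if is_trusted(offline_result):
        # Trusted offline result: discard the still-pending online result.
        online.cancel()
        return "offline", offline_result

    # Untrusted offline result: wait for the server, up to `timeout` seconds.
    try:
        return "online", await asyncio.wait_for(online, timeout)
    except asyncio.TimeoutError:
        return "timeout", None


# Stand-in parsers for demonstration only: a slow server and a fast local parser.
async def demo_online(data):
    await asyncio.sleep(0.2)
    return "online:" + data


async def demo_offline(data):
    await asyncio.sleep(0.01)
    return "offline:" + data
```

For example, `asyncio.run(resolve("x", demo_online, demo_offline, lambda r: True))` adopts the offline result, whereas passing `lambda r: False` makes the call wait for the server or report a timeout.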
Optionally, an instruction is generated according to the analysis result of the input data, and the instruction is sent to a corresponding module, so that the module executes the instruction to realize the corresponding function.
In the prior art, for example during voice recognition and especially in a weak network state, an instruction may fail to return for a long time or a network error may be reported, which affects the user's use of the intelligent device. Meanwhile, because offline analysis has low accuracy, directly using the offline analysis result for instruction parsing easily leads to instruction parsing errors.
According to the technical scheme of the disclosure, online analysis and offline analysis are performed simultaneously, and the analysis result that arrives earliest and is trusted is adopted as the analysis result of the user's input data. This balances analysis efficiency and accuracy and achieves a real-time, accurate response to the user's request.
Fig. 2 is a flowchart of another input data parsing method disclosed in an embodiment of the present disclosure, which further optimizes and expands the above technical solution and may be combined with the above alternative embodiments. Performing trusted detection on the offline analysis result is embodied as: detecting the accuracy and the availability of the offline analysis result.
S201, input data provided by a user are sent to a server, so that the server carries out online analysis on the input data.
And acquiring user voice data, and sending the user voice data to a server for online voice recognition to obtain a voice recognition result. Under the condition of no or weak network, the current electronic equipment performs offline voice recognition on the user voice data to obtain a voice recognition result.
Optionally, the performing offline analysis on the input data to obtain an offline analysis result includes: performing voice recognition on the input data; and carrying out semantic analysis on the voice recognition result to obtain an offline analysis result.
The input data is user voice data, and the method is applied to a voice interaction scene. User voice data is audio data formed from the user's speech. When the user interacts by voice with the current electronic device, the device, with the user's authorization, records the user's voice to obtain the user voice data, which is determined as the input data provided by the user. Automatic speech recognition (ASR) may be used to implement voice recognition, specifically converting the voice signal into a text instruction. The voice recognition that the current electronic device performs on the user voice data is an offline process.
By acquiring the user's voice data and recognizing it as text to obtain a voice recognition result, the accuracy and real-time performance of voice interaction control can be improved for voice interaction scenes.
Optionally, the input data parsing method further includes: word segmentation is carried out on the voice recognition result to obtain at least one alternative word; acquiring pronunciation information of the alternative words; inquiring the expected words matched with the pronunciation information of the alternative words in the prestored expected words; and replacing the alternative words with the expected words, and correcting the voice recognition result.
The result of offline voice recognition is often not very accurate and can be corrected. The alternative words serve as the unit of correction: each is checked for errors and corrected if necessary. The pronunciation information may be the pronunciation of the alternative word. A pre-stored desired word is a word that is expected to be obtained, specifically a word in the control field that is associated with a control object, a control operation or the like. The desired words may be determined from the functions of the functional modules of the current electronic device and from its control functions. For example, the user says "zheyanglian", meaning that the control object is the "sunshade curtain", but the offline recognition result is likely to be "such year"; the desired word is "sunshade curtain". In addition, the user may have a dialect accent, so that the offline recognition result differs significantly from what the user actually wants to say. Desired words adapt the recognition result to the control-related content that the user wants to express; that is, the user's voice recognition result is normalized and standardized and thereby adapted to the control functions, so that the speech is recognized accurately.
The pronunciation of the alternative word is the same as or similar to the pronunciation of the desired word, in which case the desired word replaces the alternative word to correct the voice recognition result. For example, the pronunciations are the same or similar when the ratio of the number of syllables that the alternative word's pronunciation shares with the desired word to the total number of syllables of the alternative word is greater than or equal to a preset threshold. A syllable may be the pronunciation of one character, and a word generally consists of at least one character.
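The ratio test above can be illustrated with a short sketch. It assumes that pronunciations are given as lists of syllables and that syllables are compared position by position; the disclosure does not fix either choice, and the function names and the 0.6 threshold are hypothetical.

```python
from typing import Sequence


def same_syllable_ratio(candidate: Sequence[str], desired: Sequence[str]) -> float:
    """Share of the candidate's syllables that equal the desired word's
    syllable at the same position (assumed comparison rule)."""
    if not candidate:
        return 0.0
    same = sum(1 for c, d in zip(candidate, desired) if c == d)
    return same / len(candidate)


def pronunciations_similar(
    candidate: Sequence[str], desired: Sequence[str], threshold: float = 0.6
) -> bool:
    """True when the shared-syllable ratio reaches the preset threshold."""
    return same_syllable_ratio(candidate, desired) >= threshold
```

For the syllables `["zhe", "yang", "nian"]` and `["zhe", "yang", "lian"]`, two of three syllables agree, so the ratio is 2/3 and the pair counts as similar under the assumed threshold.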
A set of known, common test voice queries can be obtained; the test set may consist of single Chinese characters, phrases, sentence patterns and the like. However, because accents and actual pronunciation differ between users, offline voice recognition may misrecognize "sunshade curtain" as "such curtain", "this sunshade curtain", "the like-hiding year" and so on. When this occurs, the recognized alternative word, for example "such year", can be corrected to the desired word "sunshade curtain", giving a more accurate offline analysis result from which a correct instruction can be parsed.
Vocabulary correction can be achieved with a fuzzy syllable matching strategy. Specifically, querying the desired word that matches the pronunciation information of the alternative word may include: determining a desired word whose pronunciation information has at least two syllables identical to syllables in the pronunciation information of the alternative word as the desired word that matches. Illustratively, the pronunciation information of the alternative word "shading year" is "zheyangnian" and that of the desired word "sunshade curtain" is "zheyanglian"; the two syllables "zhe" and "yang" are identical, so the desired word "sunshade curtain" is determined to match the alternative word "shading year". The condition can also be tightened: a desired word is determined to match only if at least two consecutive syllables of its pronunciation information are identical to at least two consecutive syllables of the pronunciation information of the alternative word. The alternative words may also be in English, French, Japanese or another language, which is not particularly limited.
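A minimal sketch of the fuzzy syllable matching strategy follows, covering both the "at least two identical syllables" rule and the stricter "at least two consecutive identical syllables" variant. The lexicon contents, the function names and the choice to return the first matching desired word are assumptions for illustration only.

```python
from collections import Counter
from typing import Dict, List, Optional, Sequence

# Hypothetical pre-stored desired words with their syllables.
DESIRED_WORDS: Dict[str, List[str]] = {
    "sunshade curtain": ["zhe", "yang", "lian"],
    "window": ["che", "chuang"],
}


def shared_count(a: Sequence[str], b: Sequence[str]) -> int:
    """Number of syllables the two pronunciations have in common."""
    return sum((Counter(a) & Counter(b)).values())


def longest_shared_run(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest run of consecutive syllables found in both."""
    best = 0
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            if x == y:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def find_desired_word(
    candidate: Sequence[str],
    lexicon: Dict[str, List[str]],
    min_shared: int = 2,
    consecutive: bool = False,
) -> Optional[str]:
    """Return the first desired word sharing enough syllables, else None."""
    for word, syllables in lexicon.items():
        if consecutive:
            shared = longest_shared_run(candidate, syllables)
        else:
            shared = shared_count(candidate, syllables)
        if shared >= min_shared:
            return word
    return None
```

Requiring consecutive syllables reduces false matches at the cost of missing candidates whose shared syllables are separated by a misrecognized one.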
By correcting the voice recognition result, the accuracy of voice recognition can be improved, and further, the analysis result can be improved by performing off-line analysis according to the accurate voice recognition result.
S202, carrying out offline analysis on the input data to obtain an offline analysis result, and detecting the accuracy and the availability of the offline analysis result.
Accuracy detection checks whether the offline analysis result was analyzed accurately, and availability detection checks whether the offline analysis result can be used for subsequent operations. For example, accuracy may be detected by at least one of: the analysis confidence of the offline analysis result, whether additional data is available for analyzing the input data, and the processing accuracy of the input data enriched with that additional data. Availability may be detected by at least one of: whether the offline analysis result is executable, whether an instruction is generated, and whether the generated instruction is executable.
S203, under the condition that the offline analysis result is reliable and the online analysis result is not received, acquiring the reliable offline analysis result, and determining the reliable offline analysis result as the analysis result of the input data.
S204, under the condition that the offline analysis result is not credible or the online analysis result is preferentially received, acquiring the online analysis result, and determining the online analysis result as the analysis result of the input data.
Optionally, detecting the accuracy of the offline analysis result includes at least one of: acquiring the sentence recognition confidence score corresponding to the input data, and detecting whether the sentence recognition confidence score is greater than or equal to a preset confidence score threshold; and detecting whether information of a multi-round dialogue is available.
The sentence recognition confidence score is used to detect the accuracy of sentence recognition; in general, the higher the confidence score, the higher the accuracy. The input data is a text recognition result, and the analysis result is a semantic analysis result of that text recognition result. Illustratively, if the input data is a voice recognition result, the sentence recognition confidence score is a voice recognition confidence score; if the input data is an image recognition result, it is an image recognition confidence score. The confidence score threshold is used to detect whether the sentence recognition of the input data is accurate. When only the sentence recognition confidence score is considered, sentence recognition of the input data is determined to be accurate if the score is greater than or equal to the threshold, and inaccurate if the score is smaller than the threshold. For example, a pre-trained machine learning model may be used to produce the sentence recognition confidence score.
A multi-round dialogue refers to multiple rounds of questions and answers directed at the same intent in order to clarify the user's intention. If information of a multi-round dialogue is available, the input data is the dialogue data of one round of that dialogue, so the input data can be combined with the information of the multi-round dialogue to refine the user's intention; the intent of the input data is then identified more clearly and an accurate analysis result is easier to obtain. In addition, in a multi-round dialogue scene the input data can be analyzed based on the context state, so that the offline analysis result is corrected or optimized and its accuracy improved. That is, a multi-round dialogue enriches the input data with content that is more representative of the intent, thereby improving the accuracy of the offline analysis result. In a voice interaction scene, information about the dialogue is generally stored, including for example dialogue identification information, whether a multi-round dialogue has been completed, and whether the dialogue has ended. In a multi-round dialogue scene, information such as the preceding and following dialogue content, the dialogue type and the intent of the multi-round dialogue is stored. For example, when the input data is acquired, it may be detected whether information of a multi-round dialogue is available, which indicates whether the input data belongs to one round of a multi-round dialogue.
Information of a multi-round dialogue is determined to be available when the current dialogue containing the input data is a multi-round dialogue that has not ended; it is determined to be unavailable when the current dialogue is not a multi-round dialogue or the multi-round dialogue has ended. When information of a multi-round dialogue is available, sentence recognition of the input data is determined to be accurate; when it is not available, sentence recognition of the input data is determined to be inaccurate. For example, the intent of the first round may be determined from the input data provided by the user in the first round, and the input data and/or the intent may be examined to detect whether the dialogue is a multi-round dialogue; if so, the dialogue is marked as a multi-round dialogue and its not-ended state is recorded. The intents of multi-round dialogues may be preset. For example, the preset intent of a multi-round dialogue is a navigation intent and the intent of the input data is to navigate to a certain place; it can then be determined that information of a multi-round dialogue is available. Using the context information and the intent of the multi-round dialogue, the second input data provided by the user shortly afterwards can be determined to be the second round of the multi-round dialogue, and intent detection is performed on the second input data according to that context information and intent, i.e., the first round of dialogue and its determined intent.
As another example, the content of multi-round dialogues may be preset, for example content that describes a state. If the input data provided by the user in the first round is "hot weather", the first round can be determined to belong to a multi-round dialogue; the electronic device may then feed back the question "whether to open a window", which is the second round, and the input data provided by the user in the third round may be "yes" or "no". The third-round input data and the previous two rounds are the content of the same multi-round dialogue. Other cases exist and can be configured as needed.
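The availability rule for multi-round dialogue information can be sketched with a small session record. The `DialogSession` fields, the preset intent set and the helper names are hypothetical; the disclosure only states that dialogue identification, multi-round status, end status, content and intent may be stored.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical preset intents that open a multi-round dialogue.
MULTI_ROUND_INTENTS = {"navigation"}


@dataclass
class DialogSession:
    session_id: str
    is_multi_round: bool = False
    finished: bool = False
    intent: Optional[str] = None
    history: List[str] = field(default_factory=list)


def record_round(session: DialogSession, utterance: str, intent: Optional[str]) -> None:
    """Store one round; mark the session as multi-round for preset intents."""
    session.history.append(utterance)
    if intent in MULTI_ROUND_INTENTS:
        session.is_multi_round = True
        session.intent = intent


def multi_round_info_available(session: Optional[DialogSession]) -> bool:
    """Available only for a multi-round dialogue that has not ended."""
    return session is not None and session.is_multi_round and not session.finished
```

After a first round with a navigation intent, a second utterance arriving in the same session would find the information available and could be interpreted with the stored intent and history.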
When the sentence recognition confidence score is greater than or equal to the confidence score threshold and information of a multi-round dialogue is available, sentence recognition of the input data is determined to be accurate; when the sentence recognition confidence score is smaller than the threshold or information of a multi-round dialogue is not available, sentence recognition of the input data is determined to be inaccurate. Other ways of detecting the accuracy of the offline analysis result are also possible and are not particularly limited.
Determining the accuracy of the offline analysis result from the sentence recognition confidence score and the multi-round dialogue simplifies accuracy detection, and checking accuracy from several angles improves the precision of accuracy detection.
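The combined accuracy check might look like the following sketch. The function name, the 0.8 default threshold and the `require_multi_round` switch (which falls back to the confidence-only check) are assumptions, not part of the disclosure.

```python
def sentence_recognition_accurate(
    confidence: float,
    multi_round_available: bool,
    threshold: float = 0.8,
    require_multi_round: bool = True,
) -> bool:
    """Accurate when the score reaches the threshold and, if required,
    multi-round dialogue information is available."""
    if confidence < threshold:
        return False
    return multi_round_available or not require_multi_round
```

Setting `require_multi_round=False` corresponds to using the confidence score alone, which is one of the "at least one of" options.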
Optionally, the detecting the availability of the offline analysis result includes at least one of the following: acquiring the prediction intention of the offline analysis result, and detecting whether the prediction intention corresponding to the offline analysis result is an offline support intention or not; and detecting whether the offline analysis result is analyzed to obtain a trusted instruction.
An offline support intent is an intent whose corresponding function can be implemented in an offline scenario. It can be appreciated that the functions corresponding to some intents require network interaction to acquire resources. For example, a navigation intent requires map resource data from the network; in an offline scenario these resources cannot be acquired, so the function corresponding to the navigation intent cannot be implemented. In addition, the processing of some intents must be performed online, and the current electronic device in an offline scenario is not capable of performing the corresponding operations, for example verifying a user's authority information, which is performed by the server. In an offline scenario that operation cannot be performed, so the operation corresponding to the intent cannot be carried out. Specifically, an offline support intent may be an intent that does not depend on online resources or services. Optionally, the functions of the function modules of the current electronic device in the offline scenario may be collected, and the offline support intents determined accordingly and added to a white list, which stores the offline support intents.
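As a minimal sketch of the white list described above, the following Python collects the intents that local function modules can serve offline and checks a predicted intent against them. The module names, intent names, and function names are hypothetical and chosen only for illustration.

```python
def build_offline_whitelist(module_capabilities):
    """Collect every intent that some local function module can serve offline."""
    whitelist = set()
    for intents in module_capabilities.values():
        whitelist.update(intents)
    return whitelist


def is_offline_support_intent(predicted_intent, whitelist):
    """An intent counts as offline-supported only if it is in the white list."""
    return predicted_intent in whitelist


# Hypothetical modules and intent names, for illustration only.
capabilities = {
    "media_module": ["switch_song", "adjust_volume"],
    "window_module": ["open_window"],
}
whitelist = build_offline_whitelist(capabilities)
```

Under these assumptions, an intent such as navigation, which needs map data from the network, is simply absent from the white list and is therefore rejected.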
A trusted instruction is an instruction that can be executed. Detecting whether a trusted instruction is obtained by analysis specifically includes detecting whether an instruction can be generated and detecting whether the generated instruction is trusted. An instruction is trusted when the corresponding function module can execute it and obtain a correct execution result; specifically, the corresponding function module can obtain valid resources to execute the instruction correctly. For example, a trusted instruction does not depend on online resources and can be processed correctly with offline resources alone.
Optionally, the range of instructions that each function module can execute may be determined according to the functions of the function modules of the current electronic device in the offline scenario, thereby determining the range of trusted instructions. Whether an instruction within the trusted instruction range can be generated from the offline analysis result is then detected, so as to detect whether the offline analysis result is analyzed to obtain a trusted instruction. As another example, a range of specified fields for the offline analysis result may be configured based on the range of trusted instructions. If any field in that range is present in the offline analysis result, it is determined that the offline analysis result can be analyzed to obtain a trusted instruction; if none of the fields in that range is present in the offline analysis result, it is determined that the offline analysis result cannot be analyzed to obtain a trusted instruction. Availability may also be determined in other ways, which are not particularly limited.
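The field-based check in the preceding paragraph might be sketched as follows. This assumes the offline analysis result is available as a dictionary; the field names are hypothetical and stand in for whatever fields are configured from the trusted instruction range.

```python
# Hypothetical field names; in practice they would be derived from the
# range of instructions the local function modules can execute.
TRUSTED_INSTRUCTION_FIELDS = {"volume_level", "song_action", "window_action"}


def can_yield_trusted_instruction(offline_result):
    """True if at least one configured field is present in the offline result."""
    return any(field in offline_result for field in TRUSTED_INSTRUCTION_FIELDS)
```

For example, a result carrying `volume_level` passes the check, while a result containing only a navigation destination does not.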
By detecting the offline support intention and the trusted instruction, determining the availability of the offline parsing result, the availability detection process can be refined, and the availability detection accuracy can be improved by detecting the availability through a plurality of angles.
Optionally, the detecting whether the offline analysis result is analyzed to obtain a trusted instruction includes: detecting whether the offline analysis result is analyzed to obtain an instruction matching at least one function type, and determining an instruction resolvable detection result; acquiring the resource dependency type of the instruction obtained by analysis, and determining a resource effective detection result; and detecting, according to the instruction resolvable detection result and the resource effective detection result, whether the offline analysis result is analyzed to obtain a trusted instruction.
The function type is the type of the function corresponding to the instruction obtained by analysis, and can be determined according to the function to be realized by the input data or the intent corresponding to the offline analysis result. For example, in an Internet of Things scenario where the controlled electronic device is a sound box, the functions that can be implemented include playing songs, and the function types may include switching songs or adjusting the volume. The instructions matching these function types accordingly include a song switching instruction or a volume adjustment instruction.
The instruction resolvable detection result indicates whether an instruction can be obtained by analyzing the offline analysis result. The instruction resolvable detection result is either a resolvable result or an unresolvable result. For example, a plurality of analysis modes may be configured according to the function types, and the plurality of analysis modes are used to detect whether the offline analysis result can be analyzed. The outcome of the analysis may be an instruction matching a function type, a general instruction, or empty. If any analysis mode yields an instruction matching a function type, the instruction resolvable detection result is a resolvable result. If the analysis modes yield only a general instruction, or yield no instruction at all (that is, the obtained instruction is empty), the instruction resolvable detection result is an unresolvable result. The analysis process may, for example, replace specified characters in the offline analysis result with target characters that the function module can recognize; configuring an analysis mode for a function type may then mean configuring a replacement rule corresponding to that function type.
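One way to picture the replacement-rule analysis modes is the sketch below. It assumes the offline analysis result is available as text, uses hypothetical rules and instruction tokens, and leaves out the general-instruction case for brevity.

```python
# Hypothetical replacement rules, one analysis mode per function type:
# phrase found in the offline result -> token the function module understands.
REPLACEMENT_RULES = {
    "switch_song": {"next song": "MEDIA_NEXT", "previous song": "MEDIA_PREV"},
    "adjust_volume": {"louder": "VOLUME_UP", "quieter": "VOLUME_DOWN"},
}


def parse_with_mode(function_type, offline_text):
    """Apply one analysis mode; return the matched instruction token or None."""
    for phrase, token in REPLACEMENT_RULES[function_type].items():
        if phrase in offline_text:
            return token
    return None


def instruction_resolvable(offline_text):
    """Try every analysis mode; resolvable if any mode yields an instruction."""
    for function_type in REPLACEMENT_RULES:
        instruction = parse_with_mode(function_type, offline_text)
        if instruction is not None:
            return True, instruction
    return False, None
```

With these rules, "play the next song" resolves to the hypothetical token `MEDIA_NEXT`, while a weather question yields an empty instruction and is treated as unresolvable.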
The resource dependency type indicates the type of resource required while the instruction is executed. The resource dependency types include an online resource dependency type, an offline resource dependency type, and so on. As another example, the resource dependency types include a valid resource dependency type, an invalid (untrusted) resource dependency type, and so on. The resource effective detection result indicates whether the valid resources required during instruction execution can be acquired, and thus whether the instruction can be executed. The resource effective detection result is either valid resource or invalid resource, and may be determined according to the resource dependency type. Specifically, a correspondence between resource dependency types and resource effective detection results may be preset. In a specific example, if the resource dependency type is the online resource dependency type, the resource effective detection result is determined to be invalid resource; if the resource dependency type is the offline resource dependency type, the resource effective detection result is determined to be valid resource.
The instruction resolvable detection result and the resource effective detection result are used jointly to detect whether the offline analysis result is analyzed to obtain a trusted instruction. If the two results show that an instruction matching a function type can be obtained by analysis and that the instruction can be executed correctly, it is determined that the offline analysis result is analyzed to obtain a trusted instruction. If the two results show that no instruction matching a function type can be obtained by analysis, or that the instruction matching a function type cannot be executed correctly, it is determined that the offline analysis result cannot be analyzed to obtain a trusted instruction. Specifically, in the case where the instruction resolvable detection result is a resolvable result and the resource effective detection result is valid resource, it is determined that the offline analysis result is analyzed to obtain a trusted instruction; in the case where the instruction resolvable detection result is an unresolvable result or the resource effective detection result is invalid resource, it is determined that the offline analysis result cannot be analyzed to obtain a trusted instruction.
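The joint decision can be illustrated with a short sketch. The enum and the lookup table are hypothetical names; the mapping itself follows the specific example in the text, where online dependency means invalid resource and offline dependency means valid resource.

```python
from enum import Enum


class ResourceDependency(Enum):
    ONLINE = "online"
    OFFLINE = "offline"


# Preset correspondence between resource dependency type and the
# resource effective detection result (True means valid resource).
RESOURCE_VALID = {
    ResourceDependency.ONLINE: False,
    ResourceDependency.OFFLINE: True,
}


def is_trusted_instruction(resolvable, dependency):
    """Trusted only if the instruction is resolvable AND its resources are valid."""
    return resolvable and RESOURCE_VALID[dependency]
```

Either an unresolvable result or an invalid resource is enough to reject the instruction, which mirrors the "or" condition in the paragraph above.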
An instruction generation module may be configured to process the offline analysis result, generate an instruction adapted to a function module according to the information of that function module, and perform trusted detection on the instruction. The instruction generation module detects whether the offline analysis result is analyzed to obtain an instruction matching at least one function type and determines the instruction resolvable detection result; it acquires the resource dependency type of the instruction obtained by analysis and determines the resource effective detection result; and it detects, according to the instruction resolvable detection result and the resource effective detection result, whether the offline analysis result is analyzed to obtain a trusted instruction.
By detecting whether an instruction can be obtained by analyzing the offline analysis result and whether the instruction obtained depends on valid resources, it is detected whether the offline analysis result can be analyzed to obtain a trusted instruction. The availability of the offline analysis result is thus detected from both the analysis angle and the execution angle, which enriches the dimensions of availability detection, enlarges the detection range, and improves the precision of availability detection.
Optionally, the performing trusted detection on the offline analysis result may include: acquiring the sentence identification confidence score corresponding to the input data, and detecting whether the sentence identification confidence score is greater than or equal to a preset confidence score threshold; detecting whether information of a multi-round dialog can be acquired; acquiring the prediction intention of the offline analysis result, and detecting whether the prediction intention corresponding to the offline analysis result is an offline support intention; and detecting whether the offline analysis result is analyzed to obtain a trusted instruction. The offline analysis result is determined to be trusted in the case where the sentence identification confidence score is greater than or equal to the preset confidence score threshold, information of a multi-round dialog can be acquired, and the offline analysis result can be analyzed to obtain a trusted instruction; or in the case where the sentence identification confidence score is greater than or equal to the preset confidence score threshold, the prediction intention is an offline support intention, and the offline analysis result can be analyzed to obtain a trusted instruction. In all remaining cases the offline analysis result is not trusted.
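The two trusted cases listed above can be combined into a single predicate, sketched below. The class name, field names, and the default threshold of 0.8 are assumptions for illustration; the text does not specify a threshold value.

```python
from dataclasses import dataclass


@dataclass
class TrustChecks:
    confidence_score: float           # sentence identification confidence score
    multi_round_info_available: bool  # information of a multi-round dialog exists
    intent_offline_supported: bool    # prediction intention is in the white list
    trusted_instruction: bool         # a trusted instruction could be obtained


def offline_result_trusted(checks, threshold=0.8):
    """Apply the two trusted cases; every other combination is untrusted."""
    if checks.confidence_score < threshold or not checks.trusted_instruction:
        return False
    # Both cases share the score and instruction conditions and differ only in
    # whether multi-round information or an offline support intention is present.
    return checks.multi_round_info_available or checks.intent_offline_supported
```

Factoring out the shared conditions makes it visible that the confidence score and the trusted instruction are required in both cases.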
According to the technical scheme, whether the offline analysis result is reliable or not is determined by detecting the accuracy and the availability of the offline analysis result, the reliable detection dimension is increased, the detection range is increased, the reliable detection accuracy is improved, the reliable offline analysis result is obtained, and the accuracy of the analysis result corresponding to the input data is improved.
Fig. 3 is a flowchart of another input data parsing method disclosed in an embodiment of the present disclosure, further optimized and expanded based on the above technical solution, and may be combined with the above various alternative embodiments. The input data parsing method is optimized to further include: acquiring context information and dialogue types of a current multi-round dialogue; determining an association intention according to the context information; determining a target instruction according to the analysis result of the input data; acquiring a prediction intention corresponding to an analysis result of the input data; and determining a target functional module according to the association intention, the prediction intention and the dialogue type, and sending the target instruction to the target functional module so as to enable the target functional module to execute the target instruction.
S301, input data provided by a user are sent to a server, so that the server carries out online analysis on the input data.
S302, carrying out offline analysis on the input data to obtain an offline analysis result, and carrying out credible detection on the offline analysis result.
S303, under the condition that the offline analysis result is reliable and the online analysis result is not received, acquiring the reliable offline analysis result, and determining the reliable offline analysis result as the analysis result of the input data.
S304, under the condition that the offline analysis result is not credible or the online analysis result is preferentially received, acquiring the online analysis result, and determining the online analysis result as the analysis result of the input data.
S305, obtaining the context information and the dialogue type of the current multi-round dialogue.
In the case where it is detected that information of a multi-round dialog can be acquired, that is, the current dialog is a multi-round dialog, the information of the current multi-round dialog can be acquired, and this information may include the context information and the dialog type. The current multi-round dialog is the multi-round dialog to which the current dialog, containing the input data provided by the user, belongs. Context information is content associated with the current multi-round dialog. By way of example, the context information may include the duration, whether the multi-round dialog is completed, identification information, the intent determined in previous rounds of the dialog, and so forth. The dialog type is used to determine the scope of the user's reply content in the current multi-round dialog. By way of example, the dialog types may include a limited user dialog content type and a non-limited user dialog content type.
If the input data is input by the user in the first round of the dialog, the content associated with the input data can be determined as the context information of the current multi-round dialog, the dialog type can be determined, and the context information and the dialog type can be stored. If the input data is input by the user in the second round of the dialog, the pre-stored context information and dialog type can be acquired directly. In addition, the context information and the dialog type can be updated according to the input data of the second round for use in subsequent rounds; or only the context information and dialog type of the first round are used, without updating. Whether and how to update can be set as needed and is not particularly limited. Likewise, for the input data of subsequent rounds, the context information and dialog type recorded in the previous rounds can be acquired. The dialog type may be determined according to the context information: if the intent corresponding to the context information is an intent that limits the user's dialog content, for example navigating or making a call, the dialog type is determined to be the limited user dialog content type; if the intent corresponding to the context information is a non-limiting intent, for example opening a skylight, the dialog type is determined to be the non-limited user dialog content type.
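A minimal sketch of storing and reusing context across rounds is given below. It implements only the variant in which the first-round context is kept without updating; the class, the intent names, and the "limited"/"non_limited" labels are hypothetical.

```python
# Intents that restrict what the user may reply next (per the examples in the
# text: navigating or making a call). The names are hypothetical.
LIMITING_INTENTS = {"navigate", "make_call"}


class DialogContextStore:
    """Keeps the context information and dialog type of each multi-round dialog."""

    def __init__(self):
        self._store = {}

    def get_or_create(self, dialog_id, current_intent):
        """First round: derive and store the context. Later rounds: reuse it."""
        if dialog_id not in self._store:
            limited = current_intent in LIMITING_INTENTS
            self._store[dialog_id] = {
                "intent": current_intent,
                "dialog_type": "limited" if limited else "non_limited",
            }
        return self._store[dialog_id]
```

A second call with the same `dialog_id` returns the stored first-round context even if the new round carries a different intent, which corresponds to the "not updated" option described above.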
S306, determining the association intention according to the context information.
The association intention may refer to the user's intention determined in the previous rounds of the current multi-round dialog. The context information may include an intention, which is extracted from the context information and determined as the association intention. Alternatively, the context information includes the content of the previous rounds of the dialog, and the association intention is obtained by analyzing that content. S305-S306 may be performed simultaneously with S302.
S307, determining a target instruction according to the analysis result of the input data.
The target instruction is an instruction obtained by analyzing the analysis result of the input data, and is distributed to a function module for execution. The function module receives and executes the target instruction, thereby realizing the function corresponding to the user's intention. When the online analysis result is acquired first, the online analysis result is analyzed to obtain the target instruction, or the instruction issued by the server is acquired together with the online analysis result and determined as the target instruction. When the trusted offline analysis result is acquired first, the instruction obtained by analyzing the trusted offline analysis result is determined as the target instruction. In the foregoing embodiment, the offline analysis result is analyzed to obtain an instruction on which trusted detection is performed; accordingly, if a trusted instruction can be obtained by analysis and the online analysis result has not arrived, the trusted instruction obtained by analysis is determined as the target instruction. S307 is performed after S303.
S308, obtaining a prediction intention corresponding to the analysis result of the input data.
The predicted intent may refer to an intent determined by the input data. In the scenario of multiple rounds of conversations, the associated intent is the intent determined in the previous round of conversations in the current multiple rounds of conversations. The predicted intent is an intent determined in the current round of dialog.
S308 may be performed concurrently with S302. The execution order of S307 may also be exchanged with that of S304-S306. When the analysis result of the input data is the online analysis result, the intention corresponding to the online analysis result is received together with the online analysis result and is determined as the prediction intention corresponding to the analysis result of the input data.
S309, determining a target functional module according to the association intention, the prediction intention and the dialogue type, and sending the target instruction to the target functional module so that the target functional module executes the target instruction.
The associative intent, the predictive intent, and the dialog type are used to collectively determine the target functional module. The target function module is used for executing target instructions. The input data analysis method can be applied to an application scene of the Internet of things, and the target functional module can be configured in the Internet of things equipment. The internet of things device can be connected with a network device, receives and executes instructions through the network, and can be divided into fixed devices or mobile devices, for example, the fixed devices are intelligent home devices, and the mobile devices are vehicle-mounted devices and the like. Specifically, in the application scenario of the internet of things, the target function module includes a module of the internet of things device, and may include a table lifting module, a sound box, a cabinet door sliding module, a module for controlling a vehicle window, and the like. Illustratively, the target function module is a media control module for controlling playing media, such as controlling audio playing or video playing, etc. As another example, the target function module is a telephone module for establishing a telephone communication connection. For another example, the target function module is a navigation module for providing navigation functions. As another example, the target function module is a hardware control module, for example, for controlling windows, doors, air conditioners, and the like. In addition, there are other cases, and this is not particularly limited.
In practice, the association intention may differ from the prediction intention. In a multi-round dialog, the intentions expressed by the user are generally consistent, and a complete intention is described over several rounds. If the association intention differs from the prediction intention, it must be judged whether the user has changed intention. If the intention has not changed, a target intention is determined jointly from the association intention and the prediction intention, and the target function module corresponding to the target intention is determined; if the intention has changed, the target function module corresponding to the prediction intention is determined based on the prediction intention. The dialog type is used to select which intention or intentions determine the target intention, and hence the target function module. For example, according to the dialog type, the target function module corresponding to the prediction intention is selected, the target function module corresponding to the association intention is selected, or the target function module corresponding to a target intention determined jointly by the association intention and the prediction intention is selected, with different dialog types leading to different selections. As another example, the same selection is made according to both the dialog type and the target intention, with different combinations of dialog type and target intention leading to different selections.
Illustratively, the association intention is to adjust Bluetooth and the prediction intention is to adjust the volume. The dialog type is the limited user dialog content type, so the target intention is determined jointly from the association intention and the prediction intention, that is, the target intention is to adjust the Bluetooth volume. Without the association intention, it could not be determined from the prediction intention alone whether the system volume or the Bluetooth volume should be adjusted. The target instruction for adjusting the volume is sent to the Bluetooth module. As another example, the association intention is to open a window and the prediction intention corresponds to "the weather is hotter today". The dialog type is the non-limited user dialog content type, so the target intention is determined from the prediction intention, that is, the target intention is to query the weather. A target instruction for temperature detection is sent to a temperature control module, or an instruction for acquiring the weather is sent to a wireless module. Other examples exist; the above are not specifically limiting and should not be construed as limiting the present invention.
In fact, within the current multi-round dialog, the first round may generate a target instruction from the online analysis result and send it to the target function module, while the second round generates a target instruction from the trusted offline analysis result and sends it to the target function module; or the first round may use the trusted offline analysis result and the second round the online analysis result. Different rounds of the dialog then use different analysis results to generate the target instructions sent to the target function module. To avoid misrecognized intentions and false recalls across successive rounds caused by generating instructions from different analysis results, the correct target function module for the instruction is determined based on the context state, so that the correct function is realized. This improves the seamless connection between online and offline instructions, allows the target function module that executes the target instruction to be determined accurately when switching between online and offline analysis results, and improves the accuracy of instruction execution.
In addition, when obtaining the offline analysis result, the offline analysis of the input data can be performed according to the context information, which improves the accuracy of the offline analysis. That is, the input data analysis method further includes: in the case where information of a multi-round dialog can be acquired, acquiring the context information of the current multi-round dialog; performing offline analysis on the input data according to the context information of the current multi-round dialog; and performing trusted detection on the offline analysis result. Correspondingly, the context information can be sent to the server together with the input data, so that the server performs online analysis on the input data according to the context information.
Optionally, the determining a target function module according to the association intention, the prediction intention and the dialog type includes: determining the function module corresponding to the association intention as the target function module in the case where the dialog type is the limited user dialog content type; determining the function module corresponding to the prediction intention of the online analysis result as the target function module in the case where the dialog type is the non-limited user dialog content type and the analysis result of the input data is the online analysis result; or determining the function module corresponding to the association intention as the target function module in the case where the dialog type is the non-limited user dialog content type and the analysis result of the input data is the offline analysis result.
The limited user dialog content type means that the user's dialog content falls within a preset range. The non-limited user dialog content type means that the user's dialog content is unrestricted. For example, in a multi-round dialog, the current electronic device asks where to navigate to: 1. place A; 2. place B; 3. place C. The user's dialog content can only be chosen among these three options. The user may reply with the place name or with the option number; the form of the reply is not limited, but the dialog content is limited to the options provided. As another example, in a multi-round dialog, the current electronic device does not ask a question, or asks "what would you like to adjust?", and the user's dialog content is "the weather is really hot today". Here the user's dialog content is not limited and the user can reply freely. When the dialog type is the limited user dialog content type, the user can only choose dialog content within a preset range; the preset range is a range of dialog content determined based on the association intention, and different association intentions correspond to different preset ranges. In this case the user's intention is mainly the association intention, so the function module corresponding to the association intention is determined as the target function module. When the dialog type is the non-limited user dialog content type, the user's dialog content is unrestricted and the user's intention is mainly the prediction intention. However, a trusted offline analysis result is not necessarily accurate, so for a trusted offline analysis result the function module corresponding to the association intention is determined as the target function module. The online analysis result is accurate, so for the online analysis result the function module corresponding to the prediction intention is determined as the target function module.
Further, if the target function module cannot be determined from the prediction intention alone, it may be determined from the association intention and the prediction intention together. Illustratively, the association intention is to adjust Bluetooth and the prediction intention is to adjust the volume. The dialog type is the limited user dialog content type, so the Bluetooth module corresponding to adjusting Bluetooth is selected and determined as the target function module. As another example, the association intention is to open a window and the prediction intention is to make a call, and the dialog type is the non-limited user dialog content type. If the analysis result of the input data is the online analysis result, the telephone module corresponding to making a call is selected and determined as the target function module. If the analysis result of the input data is the trusted offline analysis result, the window control module corresponding to opening a window is selected and determined as the target function module.
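The three-way selection rule of this optional embodiment can be sketched as a small function. The string labels for dialog type and result source, the intent names, and the intent-to-module table are all hypothetical and serve only to illustrate the rule.

```python
def select_target_module(dialog_type, result_source, association_intent,
                         prediction_intent, intent_to_module):
    """Pick the function module that should execute the target instruction.

    dialog_type:   "limited" or "non_limited" user dialog content type
    result_source: "online" or "offline" (which analysis result was adopted)
    """
    if dialog_type == "limited":
        chosen = association_intent   # replies are bound to the earlier intent
    elif result_source == "online":
        chosen = prediction_intent    # online result is taken as accurate
    else:
        chosen = association_intent   # trusted offline result may be inaccurate
    return intent_to_module[chosen]


# Hypothetical mapping from intents to function modules.
MODULES = {
    "adjust_bluetooth": "bluetooth_module",
    "adjust_volume": "media_module",
    "open_window": "window_control_module",
    "make_call": "telephone_module",
}
```

Applied to the examples in the text, a limited dialog with association intent "adjust Bluetooth" selects the Bluetooth module, and a non-limited dialog with association intent "open window" and prediction intent "make a call" selects the telephone module for an online result and the window control module for an offline result.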
By determining the target function module from the prediction intention, the association intention and dialog type from the context of the multi-round dialog, and whether the analysis result is online or offline, different scenarios can be distinguished, the most accurate intention determined, and the corresponding function module determined as the target function module that executes the target instruction. This improves the accuracy of determining the target function module and of instruction execution, adapts to different function modules, and broadens the application scenarios.
In addition, if the dialog content provided by the user is not within the preset range, prompt information asking the user to reply within the preset range is provided. Optionally, the input data analysis method further includes: providing clarification information for the input data to the user in the case where the dialog type is the limited user dialog content type and the association intention differs from the prediction intention; acquiring new input data provided by the user; and determining the analysis result of the new input data. The clarification information prompts the user to reply within the preset range.
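The clarification condition might look like the following sketch. The prompt wording, the "limited" label, and the `allowed_replies` parameter are assumptions; the text only states that the clarification information prompts the user to reply within the preset range.

```python
def clarification_prompt(dialog_type, association_intent, prediction_intent,
                         allowed_replies):
    """Return a prompt when a limited dialog receives an off-range reply, else None."""
    if dialog_type == "limited" and association_intent != prediction_intent:
        return "Please reply with one of: " + ", ".join(allowed_replies)
    return None
```

When a prompt is returned, the caller would collect new input data from the user and run the analysis again on that new input.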
Optionally, the target function module includes a module of the vehicle-mounted device.
The vehicle-mounted device is a device installed on a vehicle; it can connect to a network and receive and execute instructions through the network. The input data analysis method can be applied to assisted driving and automatic driving scenarios. The target function module may be configured in the current electronic device, or the electronic device hosting the target function module and the current electronic device may be independent electronic devices that communicate through a network. In practice, mobile scenarios typically access the Internet through a mobile network, such as mobile phone data. In an enclosed environment such as a tunnel or an underground garage, the network may be weak or absent, so the network connection is unstable and the cloud service is unreliable. When the cloud service is unreliable, selecting a trusted offline analysis result as the basis for generating instructions improves the accuracy and reliability of vehicle control and the safety of the vehicle, and improves the analysis speed because there is no need to wait a long time for the online analysis result from the cloud.
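The choice between the online and offline results under an unreliable network, following steps S303 and S304, could be sketched as below. The function name, the use of `None` for "online result not yet received", and the `"wait_online"` marker are assumptions for illustration.

```python
def choose_result(offline_result, offline_trusted, online_result=None):
    """Decide which analysis result to adopt once the offline check has finished.

    online_result is None if the server's reply has not arrived yet.
    Returns a (source, result) pair.
    """
    if online_result is not None:
        # The online result arrived first: adopt it (S304).
        return "online", online_result
    if offline_trusted:
        # Trusted offline result and no online result yet: adopt it (S303).
        return "offline", offline_result
    # Offline result is untrusted: the caller keeps waiting for the server (S304).
    return "wait_online", None
```

In a tunnel or underground garage the online result may never arrive in time, so a trusted offline result lets the device act without a long wait.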
The target function module is configured as the module of the vehicle-mounted equipment, so that the application scene is enriched, the acquisition speed of the analysis result can be improved, the analysis accuracy is considered, the accuracy and the reliability of vehicle control can be improved, and the safety of the vehicle is improved.
According to the technical scheme, the association intention is determined according to the context information of the multi-round dialogue in the multi-round dialogue, the prediction intention corresponding to the analysis result of the input data is obtained, the target functional module is determined according to the association intention and the prediction intention and the dialogue type, and the target instruction is sent to the target functional module for execution, so that the target functional module for executing the target instruction can be accurately determined for accurately executing the target instruction, meanwhile, the instruction switching of the off-line analysis result can be realized, and the analysis accuracy and the execution accuracy during the switching of the off-line analysis result are improved.
Fig. 4 is a scene graph of another input data parsing method disclosed in accordance with an embodiment of the disclosure. The input data parsing method may include:
S401, recording by the voice client.
The voice client is started, the system recording function is enabled with the user's authorization, and the user's voice is recorded to obtain user voice data, which is determined as the input data provided by the user.
S402, input data provided by a user are sent to a server, so that the server carries out online voice recognition on the input data.
S403, the server performs on-line semantic analysis on the voice recognition result of the input data.
The online parsing includes online speech recognition and online semantic parsing.
The input data provided by the user is transmitted over the network to the online speech recognition engine (i.e., the server), and the online ASR (Automatic Speech Recognition) result returned by the online speech recognition engine is obtained. After the online ASR result is obtained, the cloud speech recognition server passes the recognized text to an online semantic processing server to obtain an NLU (Natural Language Understanding) semantic analysis result; the NLU result returned online is denoted r1.
S404, performing offline speech recognition on the input data.
The input data is provided to an offline speech recognition engine integrated into the client, which returns an offline ASR result. The speech recognition result is segmented into words to obtain at least one candidate word; pronunciation information of the candidate word is acquired; an expected word matching the pronunciation information of the candidate word is queried among the prestored expected words; and the candidate word is replaced with the expected word, thereby correcting the speech recognition result. Because the returned ASR result is recognized offline, it is often not very accurate and needs to be corrected. Specifically, the pinyin of the recognition result is acquired, and if more than two syllables of the recognized characters match the syllable pinyin of a target vocabulary item, the recognition result is replaced with that target vocabulary item.
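The correction step can be illustrated with a minimal sketch. It assumes the recognition result has already been segmented into words, and it uses two hypothetical lookup tables (`PRONUNCIATIONS` and `EXPECTED_WORDS`) in place of a real pronunciation dictionary and the prestored expected words. The threshold `min_match` reads "more than two syllables" as at least two, which is an interpretation rather than something the description fixes.

```python
from typing import Dict, List, Sequence

# Hypothetical pronunciation table: word -> toneless pinyin syllables.
PRONUNCIATIONS: Dict[str, List[str]] = {
    "打开": ["da", "kai"],
    "倒航": ["dao", "hang"],
}

# Hypothetical prestored expected words with their syllables.
EXPECTED_WORDS: Dict[str, List[str]] = {
    "导航": ["dao", "hang"],
}


def matching_syllables(a: Sequence[str], b: Sequence[str]) -> int:
    """Count the positions at which two syllable sequences agree."""
    return sum(1 for x, y in zip(a, b) if x == y)


def correct_recognition(
    words: List[str],
    pronunciations: Dict[str, List[str]] = PRONUNCIATIONS,
    expected_words: Dict[str, List[str]] = EXPECTED_WORDS,
    min_match: int = 2,  # assumed reading of the syllable threshold
) -> List[str]:
    """Replace each candidate word whose syllables match an expected word."""
    corrected = []
    for word in words:
        syllables = pronunciations.get(word, [])
        replacement = word
        for expected, expected_syllables in expected_words.items():
            if matching_syllables(syllables, expected_syllables) >= min_match:
                replacement = expected
                break
        corrected.append(replacement)
    return corrected
```

In this sketch, a misrecognized word that shares its syllables with an expected word is replaced, while words with no sufficiently close expected word pass through unchanged.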
S405, performing offline semantic analysis on the obtained voice recognition result to obtain an offline analysis result.
The corrected speech recognition result is provided to a local semantic analysis engine for semantic analysis, and an offline NLU semantic analysis result is obtained. Online analysis and offline analysis are carried out simultaneously. S403 is executed at the same time as S402. In practice, the online analysis result obtained by online recognition and analysis is more reliable, and the accuracy of online analysis is far higher than that of offline analysis, so in theory the online analysis result should be used. However, network fluctuation can make the online analysis result slow to return, and a network timeout can even prevent it from being returned at all, in which case it cannot be further analyzed to obtain an instruction for execution.
S406, acquiring the sentence recognition confidence score corresponding to the input data, and detecting whether the sentence recognition confidence score is greater than or equal to a preset confidence score threshold.
When the offline speech recognition result is corrected, the sentence recognition confidence score of the speech recognition result before correction is calculated. The score is calculated according to the degree to which the offline recognized words match the expected (predefined) words and their syllables. Specifically, the ratio of the number of words that are identical in the speech recognition results before and after correction to the number of words in the speech recognition result before correction is obtained and determined as the sentence recognition confidence score. If the sentence recognition confidence score is greater than or equal to the preset confidence score threshold, S407 is executed; otherwise, the offline analysis result is not trusted, and the online analysis result is waited for.
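The ratio described above can be sketched as follows. The sketch assumes that correction replaces words in place, so the results before and after correction are compared position by position; the threshold value is a hypothetical placeholder, since the description only says that it is preset.

```python
from typing import List

CONFIDENCE_THRESHOLD = 0.5  # hypothetical preset threshold


def sentence_confidence(before: List[str], after: List[str]) -> float:
    """Share of words left unchanged by correction, relative to the
    number of words in the result before correction."""
    if not before:
        return 0.0
    unchanged = sum(1 for b, a in zip(before, after) if b == a)
    return unchanged / len(before)


def passes_confidence_check(before: List[str], after: List[str]) -> bool:
    """True when the score reaches the preset threshold."""
    return sentence_confidence(before, after) >= CONFIDENCE_THRESHOLD
```

For a four-word result in which correction changes one word, the score is 3/4.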
S407, detecting whether the information of the multi-round dialogue can be acquired; or obtaining the prediction intention of the offline analysis result, and detecting whether the prediction intention corresponding to the offline analysis result is the offline support intention.
An instruction behavior is composed of a domain and an intent. A domain is a functional classification of content, and the intents subdivide a domain further. In practice, the input data is used to perform certain functions, and the user's purpose can be classified by function: the major categories are domains, and the minor categories into which a domain is subdivided are intents. A domain may be understood as a functional field. Some intents cannot be supported offline, for example because the required resources are unavailable or because they require network interaction, such as navigation-related information; these can be distinguished in advance. Intents that are determined to be supportable offline are added to a whitelist, which stores domains and the intents under each domain. The domains may be represented as vertical classes, such as a navigation vertical class, a music vertical class, a car control vertical class, a system control vertical class, a phone vertical class, and the like. If the domain to which the prediction intention belongs hits the whitelist, or the prediction intention itself hits the whitelist, the prediction intention is determined to be an offline support intention. If the information of the multi-round dialogue can be acquired, or the prediction intention is an offline support intention, S408 is performed; otherwise, the offline analysis result is not trusted, and the online analysis result is waited for.
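A whitelist lookup of this kind might look like the sketch below. The domain and intent names are invented for illustration, and storing whole domains separately from individual (domain, intent) pairs is one possible layout, not one the description prescribes.

```python
from typing import Set, Tuple

# Hypothetical whitelist contents, for illustration only.
WHITELISTED_DOMAINS: Set[str] = {"car_control", "system_control"}
WHITELISTED_INTENTS: Set[Tuple[str, str]] = {
    ("music", "play_local"),
    ("phone", "dial_contact"),
}


def is_offline_support_intent(domain: str, intent: str) -> bool:
    """True if the domain hits the whitelist or the intent itself does."""
    return domain in WHITELISTED_DOMAINS or (domain, intent) in WHITELISTED_INTENTS
```

An intent under a whitelisted domain passes regardless of its name, while an intent under any other domain passes only if that specific pair was whitelisted in advance.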
S408, detecting whether the offline analysis result is analyzed to obtain a trusted instruction.
The offline analysis result is sent to the instruction generation module, and the instruction and the trusted detection result fed back by the instruction generation module are obtained. The instruction generation module detects whether the offline analysis result can be analyzed to obtain an instruction matching at least one function type, which determines the instruction resolvable detection result; the resource dependency type of the instruction obtained by analysis is acquired, which determines the resource effective detection result. Whether the offline analysis result is analyzed to obtain a trusted instruction is then detected according to the instruction resolvable detection result and the resource effective detection result.
The instruction generation module may be divided into a plurality of functional instruction generating units, for example, an instruction generating unit of the navigation vertical class, an instruction generating unit of the music vertical class, an instruction generating unit of the car control vertical class, an instruction generating unit of the system control vertical class, and an instruction generating unit of the telephone vertical class. If the instruction generating unit of the current vertical class can process the offline analysis result and generate an instruction, and the instruction can acquire effective resources, the generated instruction is determined to be a trusted instruction. If the instruction depends on an online resource and cannot be processed offline, the generated instruction is determined to be an untrusted instruction. If none of the instruction generating units can process the offline analysis result and generate an instruction, a general instruction is generated and determined to be an untrusted instruction; for example, the general instruction is a TTS (Text To Speech) instruction that announces by voice that the instruction is not supported. The instruction generated by the instruction generating unit of the current vertical class is sent to the function module corresponding to that vertical class for execution. Illustratively, the instructions generated by the instruction generating unit of the navigation vertical class are executed by the navigation module.
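The trusted-instruction check can be sketched under some assumptions: each instruction generating unit is modeled as a function that returns an instruction or `None`, the resource dependency type is reduced to a boolean flag, and the single `music_unit` and its intent names are hypothetical stand-ins for the per-class units.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class Instruction:
    vertical: str
    action: str
    needs_online_resource: bool  # simplified resource dependency type


Unit = Callable[[dict], Optional[Instruction]]


def music_unit(parse_result: dict) -> Optional[Instruction]:
    """Hypothetical unit: handles only results in the music domain."""
    if parse_result.get("domain") != "music":
        return None
    intent = parse_result.get("intent", "")
    return Instruction("music", intent, needs_online_resource=(intent == "play_online"))


UNITS: Dict[str, Unit] = {"music": music_unit}


def generate_instruction(
    parse_result: dict, units: Dict[str, Unit] = UNITS
) -> Tuple[Instruction, bool]:
    """Return (instruction, trusted) for an offline analysis result."""
    for unit in units.values():
        instruction = unit(parse_result)
        if instruction is not None:
            # Resolvable; trusted only if it does not depend on online resources.
            return instruction, not instruction.needs_online_resource
    # No unit could handle the result: general, untrusted instruction.
    fallback = Instruction("general", "tts_not_supported", needs_online_resource=False)
    return fallback, False
```

The second element of the returned pair combines the two detection results: an instruction must be resolvable by some unit and must depend only on offline resources to count as trusted.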
S409, under the condition that the offline analysis result is reliable and the online analysis result is not received, acquiring the reliable offline analysis result and acquiring an offline instruction of the reliable offline analysis result.
When the sentence identification confidence score is greater than or equal to a preset confidence score threshold, the input data can acquire information of multiple rounds of conversations, and the offline analysis result can be analyzed to obtain a trusted instruction, the offline analysis result is determined to be trusted; or determining that the offline analysis result is reliable under the condition that the sentence identification confidence score is greater than or equal to a preset confidence score threshold, the prediction intention is the offline support intention, and the offline analysis result can be analyzed to obtain a trusted instruction. The offline parsing result of the remaining cases is not trusted.
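The two trusted cases above share the confidence check and the trusted-instruction check and differ only in the middle condition, so they can be folded into one expression. The function and parameter names below are illustrative and do not come from the disclosure.

```python
def offline_result_is_trusted(
    confidence: float,
    threshold: float,
    in_multi_round_dialogue: bool,
    is_offline_support_intent: bool,
    instruction_is_trusted: bool,
) -> bool:
    """Combine the S406, S407, and S408 checks into a single decision."""
    return (
        confidence >= threshold
        and (in_multi_round_dialogue or is_offline_support_intent)
        and instruction_is_trusted
    )
```

Any failed check makes the offline analysis result untrusted, in which case the online analysis result is waited for.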
S410, waiting to receive the online analysis result in the case that the offline analysis result is not trusted or while the offline analysis result is still being obtained.
In the case that the offline analysis result is not trusted, the client waits to receive the online analysis result.
S411, under the condition that the offline analysis result is not credible or the online analysis result is preferentially received, acquiring the online analysis result and acquiring an online instruction of the online analysis result.
The trusted instruction of the trusted offline analysis result is determined as the target instruction; or the online instruction of the online analysis result is acquired to generate the target instruction. The context information and dialogue type of the current multi-round dialogue are acquired; the association intention is determined according to the context information; the target instruction is determined according to the analysis result of the input data; the prediction intention corresponding to the analysis result of the input data is acquired; and the target function module is determined according to the association intention, the prediction intention, and the dialogue type, and the target instruction is sent to the target function module so that the target function module executes the target instruction.
For example, the user inputs "I want to make a call", which starts a multi-round interaction: a TTS voice broadcast prompts "you want to make a call", speech recognition is started and enters a listening state, and the user's voice is recorded to obtain the input data. At this point the input data is voice data input by the user within the multi-round dialogue, so the information of the multi-round dialogue can be acquired. If the user then inputs "today's weather", the normal behavior should be to call the contact named "today's weather". However, if the state of the context is not recorded, "today's weather" triggers the online instruction and today's weather is broadcast, which is inconsistent with expectations.
Therefore, at the beginning of each round of dialogue, the session information (session) of that round, that is, the aforementioned context information, is recorded. The session stores information such as the vertical class currently being processed (used to identify the current vertical class and to prevent the same dialogue from being recalled by other vertical classes), whether the dialogue is a multi-round dialogue, whether the dialogue has ended (used to signal the end of the current dialogue), and the session Id (a unique identifier for each round of dialogue).
When switching between offline and online instructions, regardless of whether the current result comes from offline analysis or online analysis, the information is read from the saved session, and the instruction is then distributed to the function module of the correct vertical class and executed. For example, after the user inputs "I want to make a call" and then "today's weather", because the context information is recorded, the input is not sent to the function module of the weather query vertical class to execute a query instruction for "today's weather"; instead, it is sent to the function module of the telephone vertical class to execute a contact lookup and dialing instruction for "today's weather", which then judges whether the contact exists in the address book and gives the correct execution behavior.
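The session record and the routing it enables can be sketched as below. The field names mirror the items listed for the session (current vertical class, multi-round flag, end flag, session Id), but the class itself and the `route` function are hypothetical illustrations rather than the disclosed implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Session:
    session_id: str      # unique identifier of this round of dialogue
    vertical: str        # vertical class currently being processed
    multi_round: bool    # whether this is a multi-round dialogue
    ended: bool = False  # whether the dialogue has ended


def route(session: Optional[Session], predicted_vertical: str) -> str:
    """Pick the vertical class whose function module should run the instruction.

    While a multi-round dialogue is still open, the saved session wins over
    the vertical class predicted for the new input, whether that prediction
    came from offline or online analysis.
    """
    if session is not None and session.multi_round and not session.ended:
        return session.vertical
    return predicted_vertical
```

With an open telephone session, an input that would otherwise be predicted as a weather query is still routed to the telephone vertical class, which matches the "today's weather" example.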
S412, executing the instruction.
The target instruction is sent to the target function module, specifically the designated target function module, so that the target function module executes the target instruction.
This technical scheme solves the problem of slow instruction response in a weak network environment and, at the same time, enables seamless handover between offline and online instructions. It also conveniently and quickly solves the problem of false instruction recall (recognition as another intention) caused by handling of the context during offline-online switching, which greatly improves the response speed perceived by the user and the user experience.
Fig. 5 is a block diagram of an input data parsing apparatus according to an embodiment of the present disclosure, which is applicable to a case of parsing input data. The device is realized by software and/or hardware, and is specifically configured in the electronic equipment with certain data operation capability.
An input data parsing apparatus 500 as shown in fig. 5, comprising: an input data acquisition module 501, an offline analysis trusted detection module 502, an offline result acquisition module 503 and an online result acquisition module 504; wherein,
an input data obtaining module 501, configured to send input data provided by a user to a server, so that the server performs online analysis on the input data;
the offline analysis trusted detection module 502 is configured to perform offline analysis on the input data to obtain an offline analysis result, and perform trusted detection on the offline analysis result;
an offline result obtaining module 503, configured to obtain a trusted offline analysis result and determine an analysis result of the input data according to the trusted offline analysis result when the offline analysis result is trusted and the online analysis result is not received;
The online result obtaining module 504 is configured to obtain an online analysis result and determine the online analysis result as an analysis result of the input data when the offline analysis result is not trusted or the online analysis result is preferentially received.
According to this technical scheme, online analysis and offline analysis are carried out simultaneously, and whichever analysis result arrives earliest and is trusted is adopted as the analysis result of the user's input data, which balances analysis efficiency and accuracy and achieves a real-time, accurate response to the user's request.
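One way to picture this selection is as a race between two concurrent tasks. The sketch below uses Python's `asyncio`; the coroutine arguments and the `is_trusted` callback are assumptions made for illustration, and handling of network errors or timeouts on the online path is deliberately left out.

```python
import asyncio
from typing import Any, Awaitable, Callable, Tuple


async def resolve(
    online: Awaitable[Any],
    offline: Awaitable[Any],
    is_trusted: Callable[[Any], bool],
) -> Tuple[str, Any]:
    """Run both analyses concurrently and pick the result to use."""
    online_task = asyncio.ensure_future(online)
    offline_task = asyncio.ensure_future(offline)
    done, _ = await asyncio.wait(
        {online_task, offline_task}, return_when=asyncio.FIRST_COMPLETED
    )
    if online_task in done:
        # The online result arrived first (or together): it takes priority.
        offline_task.cancel()
        return "online", online_task.result()
    offline_result = offline_task.result()
    if is_trusted(offline_result):
        # Trusted offline result and no online result yet: stop waiting
        # for the online result and use the offline one.
        online_task.cancel()
        return "offline", offline_result
    # Untrusted offline result: keep waiting for the online result.
    return "online", await online_task
```

Cancelling the online task when a trusted offline result is used plays the role of intercepting the late online result, so that it cannot trigger a second instruction.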
Further, the offline parsing trusted detection module 502 includes: an accuracy and availability detection unit, configured to detect the accuracy and the availability of the offline analysis result.
Further, the accuracy and availability detection unit comprises at least one of the following: a recognition accuracy detection subunit, configured to acquire the sentence recognition confidence score corresponding to the input data and detect whether the sentence recognition confidence score is greater than or equal to a preset confidence score threshold; and a multi-round dialogue detection subunit, configured to detect whether information of the multi-round dialogue is available.
Further, the accuracy and availability detection unit comprises at least one of the following: an intention detection subunit, configured to acquire the predicted intention of the offline analysis result and detect whether the predicted intention corresponding to the offline analysis result is an offline support intention; and a trusted instruction detection subunit, configured to detect whether the offline analysis result is analyzed to obtain a trusted instruction.
Further, the trusted instruction detection subunit is specifically configured to: detect whether the offline analysis result is analyzed to obtain an instruction matching at least one function type, and determine an instruction resolvable detection result; acquire the resource dependency type of the instruction obtained by analysis, and determine a resource effective detection result; and detect, according to the instruction resolvable detection result and the resource effective detection result, whether the offline analysis result is analyzed to obtain a trusted instruction.
Further, the offline parsing trusted detection module 502 includes: the voice recognition module is used for carrying out voice recognition on the input data; and the offline analysis module is used for carrying out semantic analysis on the voice recognition result to obtain an offline analysis result.
Further, the input data parsing device further includes: the recognition result word segmentation module is used for segmenting the voice recognition result to obtain at least one alternative word; the pronunciation information determining module is used for acquiring pronunciation information of the alternative words; the expected word inquiry module is used for inquiring expected words matched with pronunciation information of the alternative words in prestored expected words; and the recognition result correction module is used for replacing the candidate words with the expected words and correcting the voice recognition result.
Further, the input data parsing device further includes: the session information acquisition module is used for acquiring the context information and the session type of the current multi-round session; the intention determining module is used for determining the association intention according to the context information; the target instruction determining module is used for determining a target instruction according to the analysis result of the input data; the prediction intention determining module is used for obtaining a prediction intention corresponding to an analysis result of the input data; and the function module determining module is used for determining a target function module according to the association intention, the prediction intention and the dialogue type and sending the target instruction to the target function module so as to enable the target function module to execute the target instruction.
Further, the function module determining module includes: a first function determining unit, configured to determine, as a target function module, a function module corresponding to the association intention in a case where the dialogue type is a user dialogue content type; a second function determining unit, configured to determine, as a target function module, a function module corresponding to a prediction intention of the online analysis result when the dialogue type is a non-limiting user dialogue content type and the analysis result of the input data is the online analysis result; or a third function determining unit, configured to determine, as a target function module, a function module corresponding to the association intention when the dialogue type is a non-limiting user dialogue content type and the analysis result of the input data is an offline analysis result.
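The three determining units amount to a small decision rule, sketched below. The string labels `"limited"` and `"unlimited"` for the two dialogue types, the `"online"` and `"offline"` labels for the result source, and the `modules` mapping from intention to function module are all hypothetical names chosen for this illustration.

```python
from typing import Dict


def select_target_module(
    dialogue_type: str,       # "limited" or "unlimited" (hypothetical labels)
    result_source: str,       # "online" or "offline"
    association_intent: str,  # intention derived from the dialogue context
    predicted_intent: str,    # intention predicted from the analysis result
    modules: Dict[str, str],  # intention -> function module name
) -> str:
    """Choose the function module that should execute the target instruction."""
    if dialogue_type == "limited":
        intent = association_intent   # first unit: follow the context
    elif result_source == "online":
        intent = predicted_intent     # second unit: follow the online prediction
    else:
        intent = association_intent   # third unit: offline result, follow the context
    return modules[intent]
```

Only an online result in a dialogue that does not limit the user's reply is routed by its own predicted intention; in the other two cases the context of the dialogue decides.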
Further, the input data analyzing device further includes: the online result interception module is used for intercepting the online analysis result sent by the server under the conditions that the offline analysis result is reliable and the online analysis result is not received; and the online result waiting module is used for waiting to receive the online analysis result fed back by the server under the condition that the offline analysis result is not credible.
Further, the target function module includes a module of the vehicle-mounted device.
The input data analysis device can execute the input data analysis method provided by any embodiment of the disclosure, and has the corresponding functional modules and beneficial effects of executing the input data analysis method.
In the technical scheme of the disclosure, the collection, storage, use, processing, transmission, provision, and disclosure of users' personal information all comply with the relevant laws and regulations and do not violate public order and good customs.
According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium, and a computer program product.
Fig. 6 shows a schematic block diagram of an example electronic device 600 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in Fig. 6, the device 600 includes a computing unit 601 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 602 or a computer program loaded from a storage unit 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data required for the operation of the device 600 may also be stored. The computing unit 601, the ROM 602, and the RAM 603 are connected to each other by a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
Various components in the device 600 are connected to the I/O interface 605, including: an input unit 606 such as a keyboard, mouse, etc.; an output unit 607 such as various types of displays, speakers, and the like; a storage unit 608, such as a magnetic disk, optical disk, or the like; and a communication unit 609 such as a network card, modem, wireless communication transceiver, etc. The communication unit 609 allows the device 600 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The computing unit 601 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of computing unit 601 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 601 performs the respective methods and processes described above, such as the input data parsing method. For example, in some embodiments, the input data parsing method may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as storage unit 608. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 600 via the ROM 602 and/or the communication unit 609. When the computer program is loaded into the RAM 603 and executed by the computing unit 601, one or more steps of the input data parsing method described above may be performed. Alternatively, in other embodiments, the computing unit 601 may be configured to perform the input data parsing method in any other suitable way (e.g., by means of firmware).
Various implementations of the systems and techniques described here above may be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems On Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs, which may be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special purpose or general-purpose programmable processor, that may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server incorporating a blockchain.
It should be appreciated that steps may be reordered, added, or deleted using the various forms of the flows shown above. For example, the steps recited in the present disclosure may be performed in parallel, sequentially, or in a different order, provided that the desired results of the technical solutions of the present disclosure are achieved, and no limitation is imposed herein.
The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present disclosure are intended to be included within the scope of the present disclosure.

Claims (21)

1. An input data parsing method, comprising:
the method comprises the steps of sending input data provided by a user to a server, so that the server carries out online analysis on the input data;
performing offline analysis on the input data to obtain an offline analysis result, and detecting the accuracy and availability of the offline analysis result;
under the conditions that the offline analysis result is reliable and the online analysis result is not received, acquiring the reliable offline analysis result, and determining the reliable offline analysis result as the analysis result of the input data;
Under the condition that an offline analysis result is not credible or an online analysis result is preferentially received, acquiring the online analysis result, and determining the online analysis result as an analysis result of the input data;
wherein the detecting the availability of the offline analysis result comprises:
detecting whether an instruction matching at least one function type is obtained by parsing the offline analysis result, and determining an instruction resolvable detection result; the instruction resolvable detection result indicates whether an instruction can be obtained by parsing the offline analysis result;
acquiring the resource dependency type of the instruction obtained by parsing; the resource dependency type comprises an online resource dependency type or an offline resource dependency type;
in a case where the resource dependency type is the online resource dependency type, determining that the resource validity detection result is an invalid resource;
in a case where the resource dependency type is the offline resource dependency type, determining that the resource validity detection result is a valid resource;
and detecting, according to the instruction resolvable detection result and the resource validity detection result, whether a trusted instruction is obtained by parsing the offline analysis result.
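By way of illustration only, the availability detection recited in claim 1 might be sketched as follows. The class names, the `SUPPORTED_FUNCTION_TYPES` set, and the choice to combine the two detection results with a logical AND are assumptions made for this sketch; the claim does not specify them.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ResourceDependency(Enum):
    ONLINE = "online"
    OFFLINE = "offline"


@dataclass
class Instruction:
    function_type: str              # hypothetical label, e.g. "media"
    dependency: ResourceDependency  # resource dependency type of the instruction


# Hypothetical function types against which an instruction can be matched.
SUPPORTED_FUNCTION_TYPES = {"navigation", "media", "air_conditioner"}


def is_trusted_instruction(instruction: Optional[Instruction]) -> bool:
    """Combine the two detection results (assumed here to be a logical AND)."""
    # First detection: was an instruction obtained, and does it match at
    # least one function type?
    resolvable = (
        instruction is not None
        and instruction.function_type in SUPPORTED_FUNCTION_TYPES
    )
    if not resolvable:
        return False
    # Second detection: an online-dependent resource counts as invalid,
    # an offline-dependent resource counts as valid.
    resource_valid = instruction.dependency is ResourceDependency.OFFLINE
    return resource_valid
```

Under these assumptions, an instruction that needs an online resource is never treated as trusted when it comes from the offline path, even if it matches a known function type.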
2. The method of claim 1, wherein the detecting the accuracy of the offline analysis result comprises at least one of:
acquiring a sentence recognition confidence score corresponding to the input data, and detecting whether the sentence recognition confidence score is greater than or equal to a preset confidence score threshold; and
detecting whether information of a multi-round dialogue can be acquired.
3. The method of claim 1, wherein the detecting the availability of the offline analysis result further comprises:
and acquiring the predicted intention of the offline analysis result, and detecting whether the predicted intention corresponding to the offline analysis result is an offline support intention or not.
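As a non-limiting sketch, the score check of claim 2 and the intention check of claim 3 might look like the following. The threshold value and the contents of `OFFLINE_SUPPORTED_INTENTS` are hypothetical and chosen only for illustration.

```python
# Hypothetical preset threshold for the sentence recognition score.
CONFIDENCE_THRESHOLD = 0.8

# Hypothetical set of intentions that can be handled without the server.
OFFLINE_SUPPORTED_INTENTS = {"open_window", "play_local_music"}


def recognition_is_accurate(score: float,
                            threshold: float = CONFIDENCE_THRESHOLD) -> bool:
    """Return True when the score is greater than or equal to the threshold."""
    return score >= threshold


def intent_is_offline_supported(predicted_intent: str) -> bool:
    """Return True when the predicted intention is an offline support intention."""
    return predicted_intent in OFFLINE_SUPPORTED_INTENTS
```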
4. The method of claim 1, wherein the performing offline analysis on the input data to obtain an offline analysis result comprises:
performing voice recognition on the input data;
and carrying out semantic analysis on the voice recognition result to obtain an offline analysis result.
5. The method of claim 4, further comprising:
performing word segmentation on the voice recognition result to obtain at least one alternative word;
acquiring pronunciation information of the alternative word;
querying pre-stored expected words for an expected word matching the pronunciation information of the alternative word;
and replacing the alternative word with the expected word, so as to correct the voice recognition result.
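The correction steps of claim 5 can be illustrated with the minimal sketch below. The two lookup tables and the whitespace-based segmentation are assumptions for the example; a real system would presumably use a proper word segmenter and a pronunciation lexicon (for example, pinyin for Chinese input).

```python
from typing import Callable, Dict, List

# Hypothetical pronunciation table: word -> pronunciation key.
PRONUNCIATIONS: Dict[str, str] = {
    "write": "rait",
    "right": "rait",
    "turn": "tern",
}

# Hypothetical pre-stored expected words, indexed by pronunciation key.
EXPECTED_BY_PRONUNCIATION: Dict[str, str] = {"rait": "right"}


def correct_recognition(
    text: str,
    segment: Callable[[str], List[str]] = str.split,
) -> str:
    """Replace each word whose pronunciation matches a pre-stored expected word."""
    corrected = []
    for word in segment(text):                 # word segmentation
        pron = PRONUNCIATIONS.get(word)        # pronunciation information
        expected = (
            EXPECTED_BY_PRONUNCIATION.get(pron) if pron is not None else None
        )
        # Keep the original word when no expected word matches.
        corrected.append(expected if expected is not None else word)
    return " ".join(corrected)
```

With these tables, `correct_recognition("turn write")` yields `"turn right"`, while words with no matching expected word pass through unchanged.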
6. The method of claim 1, further comprising:
acquiring context information and dialogue types of a current multi-round dialogue;
determining an association intention according to the context information;
determining a target instruction according to the analysis result of the input data;
acquiring a prediction intention corresponding to an analysis result of the input data;
and determining a target functional module according to the association intention, the prediction intention and the dialogue type, and sending the target instruction to the target functional module so as to enable the target functional module to execute the target instruction.
7. The method of claim 6, wherein the determining a target function module according to the association intention, the prediction intention and the dialogue type comprises:
determining a function module corresponding to the association intention as the target function module in a case where the dialogue type is a limiting user dialogue content type;
determining a function module corresponding to the prediction intention of the online analysis result as the target function module in a case where the dialogue type is a non-limiting user dialogue content type and the analysis result of the input data is the online analysis result; or
determining the function module corresponding to the association intention as the target function module in a case where the dialogue type is a non-limiting user dialogue content type and the analysis result of the input data is an offline analysis result.
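The three branches of claim 7 reduce to a small decision function, sketched below for illustration. The enum labels `RESTRICTED` and `UNRESTRICTED` (standing for the dialogue type of the first branch and the non-limiting dialogue type, respectively), the `INTENT_TO_MODULE` mapping, and the module names are hypothetical.

```python
from enum import Enum


class DialogType(Enum):
    RESTRICTED = "restricted"      # dialogue type of the first branch
    UNRESTRICTED = "unrestricted"  # non-limiting user dialogue content type


class ResultSource(Enum):
    ONLINE = "online"
    OFFLINE = "offline"


# Hypothetical mapping from an intention to the function module that handles it.
INTENT_TO_MODULE = {
    "navigate": "navigation_module",
    "play_music": "media_module",
}


def select_target_module(
    dialog_type: DialogType,
    source: ResultSource,
    association_intent: str,
    predicted_intent: str,
) -> str:
    """Pick the target function module from the dialogue type and result source."""
    if dialog_type is DialogType.UNRESTRICTED and source is ResultSource.ONLINE:
        # Only an online result in an unrestricted dialogue follows the
        # prediction intention.
        intent = predicted_intent
    else:
        # Every other branch follows the intention derived from the context.
        intent = association_intent
    return INTENT_TO_MODULE[intent]
```

Read this way, the prediction intention is followed only when it comes from the online analysis result; in the other two branches the module is chosen from the context-derived association intention.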
8. The method of claim 1, further comprising:
in a case where the offline analysis result is trusted and the online analysis result has not been received, intercepting the online analysis result sent by the server;
and waiting to receive the online analysis result fed back by the server in a case where the offline analysis result is not trusted.
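The selection between the two results, together with the interception and waiting behaviour of claim 8, might be sketched as follows. Modelling the pending server response as a `concurrent.futures.Future`, the function name `choose_result`, the `timeout` parameter, and the use of `cancel()` to stand in for interception are all assumptions of this sketch.

```python
from concurrent.futures import Future
from typing import Tuple


def choose_result(
    online_future: Future,
    offline_result: str,
    offline_trusted: bool,
    timeout: float = 5.0,
) -> Tuple[str, str]:
    """Return a (source, result) pair for the input data."""
    if offline_trusted and not online_future.done():
        # Trusted offline result and no online result yet: use the offline
        # result and discard any online result that might arrive later.
        online_future.cancel()
        return ("offline", offline_result)
    # Offline result not trusted, or the online result arrived first:
    # use the online result, waiting for the server if necessary.
    return ("online", online_future.result(timeout=timeout))
```

A caller would typically submit the server request first, run offline analysis and the trust check locally, and then call `choose_result` with the outcome.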
9. The method of claim 6, wherein the target function module comprises a module of an in-vehicle device.
10. An input data parsing apparatus comprising:
the input data acquisition module is used for sending input data provided by a user to the server so that the server can analyze the input data online;
the offline analysis trusted detection module is used for carrying out offline analysis on the input data to obtain an offline analysis result and carrying out trusted detection on the offline analysis result;
the offline result acquisition module is used for acquiring a trusted offline analysis result and determining the trusted offline analysis result as the analysis result of the input data under the condition that the offline analysis result is trusted and the online analysis result is not received;
the online result acquisition module is used for acquiring the online analysis result and determining the online analysis result as the analysis result of the input data in a case where the offline analysis result is not trusted, or the online analysis result is received first;
the offline analysis trusted detection module comprises:
the accurate availability detection unit is used for detecting the accuracy and availability of the offline analysis result;
wherein the accurate availability detection unit comprises:
the trusted instruction detection subunit is used for detecting whether the offline analysis result is analyzed to obtain a trusted instruction;
the trusted instruction detection subunit is specifically configured to:
detecting whether an instruction matching at least one function type is obtained by parsing the offline analysis result, and determining an instruction resolvable detection result; the instruction resolvable detection result indicates whether an instruction can be obtained by parsing the offline analysis result;
acquiring the resource dependency type of the instruction obtained by parsing; the resource dependency type comprises an online resource dependency type or an offline resource dependency type;
in a case where the resource dependency type is the online resource dependency type, determining that the resource validity detection result is an invalid resource;
in a case where the resource dependency type is the offline resource dependency type, determining that the resource validity detection result is a valid resource;
and detecting, according to the instruction resolvable detection result and the resource validity detection result, whether a trusted instruction is obtained by parsing the offline analysis result.
11. The apparatus of claim 10, wherein the accurate availability detection unit comprises at least one of:
the recognition accuracy detection subunit is used for acquiring the sentence recognition confidence score corresponding to the input data and detecting whether the sentence recognition confidence score is greater than or equal to a preset confidence score threshold; and
and the multi-round dialogue detection subunit is used for detecting whether the information of the multi-round dialogue can be acquired.
12. The apparatus of claim 10, wherein the accurate availability detection unit further comprises:
and the intention detection subunit is used for acquiring the predicted intention of the offline analysis result and detecting whether the predicted intention corresponding to the offline analysis result is an offline support intention or not.
13. The apparatus of claim 10, wherein the offline analysis trusted detection module comprises:
the voice recognition module is used for carrying out voice recognition on the input data;
and the offline analysis module is used for carrying out semantic analysis on the voice recognition result to obtain an offline analysis result.
14. The apparatus of claim 13, further comprising:
the recognition result word segmentation module is used for segmenting the voice recognition result to obtain at least one alternative word;
the pronunciation information determining module is used for acquiring pronunciation information of the alternative words;
the expected word inquiry module is used for inquiring expected words matched with pronunciation information of the alternative words in prestored expected words;
and the recognition result correction module is used for replacing the alternative words with the expected words and correcting the voice recognition result.
15. The apparatus of claim 10, further comprising:
the session information acquisition module is used for acquiring the context information and the session type of the current multi-round session;
the intention determining module is used for determining the association intention according to the context information;
the target instruction determining module is used for determining a target instruction according to the analysis result of the input data;
the prediction intention determining module is used for obtaining a prediction intention corresponding to an analysis result of the input data;
and the function module determining module is used for determining a target function module according to the association intention, the prediction intention and the dialogue type and sending the target instruction to the target function module so as to enable the target function module to execute the target instruction.
16. The apparatus of claim 15, wherein the functional module determination module comprises:
a first function determining unit, configured to determine, as the target function module, a function module corresponding to the association intention in a case where the dialogue type is a limiting user dialogue content type;
a second function determining unit, configured to determine, as the target function module, a function module corresponding to the prediction intention of the online analysis result in a case where the dialogue type is a non-limiting user dialogue content type and the analysis result of the input data is the online analysis result; or
a third function determining unit, configured to determine, as the target function module, the function module corresponding to the association intention in a case where the dialogue type is a non-limiting user dialogue content type and the analysis result of the input data is an offline analysis result.
17. The apparatus of claim 10, further comprising:
the online result interception module is used for intercepting the online analysis result sent by the server under the conditions that the offline analysis result is reliable and the online analysis result is not received;
and the online result waiting module is used for waiting to receive the online analysis result fed back by the server under the condition that the offline analysis result is not credible.
18. The apparatus of claim 15, wherein the target function module comprises a module of an in-vehicle device.
19. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the input data parsing method of any one of claims 1-9.
20. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the input data parsing method according to any one of claims 1-9.
21. A computer program product comprising a computer program which, when executed by a processor, implements the input data parsing method according to any one of claims 1-9.
CN202211338183.9A 2022-10-28 2022-10-28 Input data analysis method, device, electronic equipment and storage medium Active CN115662430B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211338183.9A CN115662430B (en) 2022-10-28 2022-10-28 Input data analysis method, device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN115662430A CN115662430A (en) 2023-01-31
CN115662430B CN115662430B (en) 2024-03-29

Family

ID=84993082

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211338183.9A Active CN115662430B (en) 2022-10-28 2022-10-28 Input data analysis method, device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115662430B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016191319A1 (en) * 2015-05-27 2016-12-01 Google Inc. Local persisting of data for selectively offline capable voice action in a voice-enabled electronic device
WO2017166649A1 (en) * 2016-03-30 2017-10-05 乐视控股(北京)有限公司 Voice signal processing method and device
CN112331203A (en) * 2020-11-06 2021-02-05 深圳市欧瑞博科技股份有限公司 Intelligent household equipment control method and device, electronic equipment and storage medium
CN112331213A (en) * 2020-11-06 2021-02-05 深圳市欧瑞博科技股份有限公司 Intelligent household equipment control method and device, electronic equipment and storage medium
CN112509585A (en) * 2020-12-22 2021-03-16 北京百度网讯科技有限公司 Voice processing method, device and equipment of vehicle-mounted equipment and storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130085753A1 (en) * 2011-09-30 2013-04-04 Google Inc. Hybrid Client/Server Speech Recognition In A Mobile Device
US8972263B2 (en) * 2011-11-18 2015-03-03 Soundhound, Inc. System and method for performing dual mode speech recognition
US20180330714A1 (en) * 2017-05-12 2018-11-15 Apple Inc. Machine learned systems
CN111666396B (en) * 2020-06-05 2023-10-31 北京百度网讯科技有限公司 User intention understanding satisfaction evaluation method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN115662430A (en) 2023-01-31

Similar Documents

Publication Publication Date Title
US11887604B1 (en) Speech interface device with caching component
US11817094B2 (en) Automatic speech recognition with filler model processing
CN109961792B (en) Method and apparatus for recognizing speech
US10269346B2 (en) Multiple speech locale-specific hotword classifiers for selection of a speech locale
CN113327609B (en) Method and apparatus for speech recognition
US11978432B2 (en) On-device speech synthesis of textual segments for training of on-device speech recognition model
US11615784B2 (en) Control method and control apparatus for speech interaction
WO2020024620A1 (en) Voice information processing method and device, apparatus, and storage medium
JP2003308087A (en) System and method for updating grammar
KR20210098880A (en) Voice processing method, apparatus, device and storage medium for vehicle-mounted device
CN109697981B (en) Voice interaction method, device, equipment and storage medium
US20200193985A1 (en) Domain management method of speech recognition system
CN113611316A (en) Man-machine interaction method, device, equipment and storage medium
CN112863496B (en) Voice endpoint detection method and device
CN115662430B (en) Input data analysis method, device, electronic equipment and storage medium
CN113077793B (en) Voice recognition method, device, equipment and storage medium
JP2003140690A (en) Information system, electronic equipment, and program
CN115171695A (en) Voice recognition method, device, electronic equipment and computer readable medium
KR20110025510A (en) Electronic device and method of recognizing voice using the same
CN115394300B (en) Voice interaction method, voice interaction device, vehicle and readable storage medium
CN115910025A (en) Voice processing method, device, electronic equipment and medium
CN116895275A (en) Dialogue system and control method thereof
CN116524916A (en) Voice processing method and device and vehicle
CN116153310A (en) Voice dialogue interaction method, system, electronic equipment and storage medium
KR20210032200A (en) Apparatus and method for providing multilingual conversation service

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant