CN118248138A - Wake-free voice control method, device, equipment and storage medium - Google Patents


Publication number
CN118248138A
Authority
CN
China
Prior art keywords
wake
free
similarity
voice
target wake
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211668334.7A
Other languages
Chinese (zh)
Inventor
冯泯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Pateo Connect and Technology Shanghai Corp
Original Assignee
Pateo Connect and Technology Shanghai Corp
Application filed by Pateo Connect and Technology Shanghai Corp

Landscapes

  • Telephonic Communication Services (AREA)

Abstract

The invention discloses a wake-up-free voice control method, device, equipment, and storage medium. The method comprises: acquiring a voice instruction and determining a target wake-up-free word of the voice instruction; judging whether the similarity between the voice instruction and the target wake-up-free word is smaller than the similarity threshold corresponding to the target wake-up-free word; when the similarity is smaller than the similarity threshold, determining the intent of the voice instruction; judging whether the target wake-up-free word matches the intent of the voice instruction; and, when it matches, executing the target operation corresponding to the target wake-up-free word. The beneficial effect of the invention is that valid target wake-up-free words are still output while the probability of falsely triggering a target wake-up-free word is reduced, so that the recognition result of the target wake-up-free word is more accurate.

Description

Wake-free voice control method, device, equipment and storage medium
Technical Field
The present invention relates to the field of speech recognition, and in particular, to a wake-up-free speech control method, device, apparatus, and storage medium.
Background
As automobiles become more intelligent, they are increasingly equipped with intelligent voice assistants. An intelligent voice assistant receives the user's voice instructions and enables hands-free operation, which supports safe driving. The core technology behind this function is speech recognition. Wake-up-free voice control is a control method that can accept user instructions at any time: typically, the current display interface of the electronic device shows operation options to the user, each option corresponds to a different wake-up-free word, and the user issues a voice instruction to the voice assistant by speaking a wake-up-free word, which triggers the target operation corresponding to that word.
However, the recognition accuracy of wake-up-free voice control is not high. When the user's voice instruction cannot be matched to the corresponding wake-up-free word, the voice assistant either gives no feedback (the instruction is "unrecognizable") or produces a result that is not what the user actually intended (an "irrelevant answer"). This seriously degrades the user experience and lowers the user's opinion of wake-up-free voice control.
Disclosure of Invention
The invention aims to overcome the defect of low recognition accuracy of voice instructions by wake-up-free voice control in the prior art, and provides a wake-up-free voice control method, a device, equipment and a storage medium.
The invention solves the technical problems by the following technical scheme:
The invention provides a wake-up-free voice control method, applied to an electronic device, comprising the following steps:
acquiring a voice instruction and determining a target wake-up-free word of the voice instruction;
judging whether the similarity between the voice instruction and the target wake-up-free word is smaller than a similarity threshold corresponding to the target wake-up-free word;
determining an intent of the voice instruction when the similarity is less than the similarity threshold;
judging whether the target wake-up-free word matches the intent of the voice instruction;
and, when the target wake-up-free word matches the intent of the voice instruction, executing a target operation corresponding to the target wake-up-free word.
Preferably, the method further comprises:
when the number of times that the target wake-up-free word has matched the intent of the voice instruction is greater than a number threshold and the matching success rate is greater than a success rate threshold, updating the similarity threshold according to the similarity.
Preferably, updating the similarity threshold according to the similarity includes:
counting the number of occurrences of each similarity value;
updating the similarity threshold of the target wake-up-free word to the similarity value whose number of occurrences meets a preset condition; the preset condition includes that the number of occurrences is greater than a frequency threshold and/or that the number of occurrences is the maximum.
Preferably, determining the target wake-up-free word of the voice instruction includes:
sending the voice instruction to a first recognition engine, so that the first recognition engine determines the target wake-up-free word of the voice instruction.
Preferably, determining the intent of the voice instruction includes:
sending the voice instruction to a second recognition engine, so that the second recognition engine determines the intent of the voice instruction.
Preferably, the method further comprises:
uploading wake-up-free voice data to a cloud, so that the cloud updates the similarity threshold according to the wake-up-free voice data; wherein the wake-up-free voice data comprises: the voice instruction, the similarity, and the target wake-up-free word.
Preferably, when the similarity is greater than or equal to the similarity threshold, the target operation corresponding to the target wake-up-free word is executed.
The invention also provides a wake-up-free voice control device, which comprises:
the voice acquisition module is used for acquiring voice instructions;
The first recognition engine is used for determining a target wake-up-free word of the voice instruction;
The control module is used for judging whether the similarity between the voice instruction and the target wake-up-free word is smaller than a similarity threshold corresponding to the target wake-up-free word, and calling a second recognition engine when the similarity is smaller than the similarity threshold;
The second recognition engine is used for determining the intention of the voice instruction;
The control module is further configured to determine whether the target wake-up-free word matches the intent of the voice instruction, and execute a target operation corresponding to the target wake-up-free word when the target wake-up-free word matches the intent of the voice instruction.
The invention also provides an electronic device, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes the wake-up-free voice control method when executing the computer program.
The invention also provides a computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the wake-up free speech control method described above.
The beneficial effects of the invention are as follows. After the voice instruction is acquired, the recognition engine recognizes the target wake-up-free word. If the similarity between the voice instruction and the target wake-up-free word reaches the similarity threshold, the target operation of the target wake-up-free word is executed. If it does not, the target wake-up-free word is further compared with the intent of the voice instruction, and the target operation is executed only if the two match. The invention therefore ensures that valid target wake-up-free words are still output while reducing the probability of false triggering, so that the recognition result of the target wake-up-free word is more accurate.
Drawings
FIG. 1 is a flowchart of a wake-up free speech control method according to an exemplary embodiment of the present invention;
FIG. 2 is a schematic block diagram of a wake-up-free voice control device according to an exemplary embodiment of the present invention;
FIG. 3 is a schematic structural diagram of an electronic device according to an exemplary embodiment of the present invention.
Detailed Description
The invention is further illustrated by means of the following examples, which are not intended to limit the scope of the invention.
Fig. 1 is a flowchart of a wake-up-free voice control method according to an exemplary embodiment of the present invention, where the control method includes the following steps:
step 101, acquiring a voice instruction and determining a target wake-up-free word of the voice instruction.
In the wake-up-free word mode, the electronic device registers a plurality of wake-up-free words according to the display interface, and the wake-up-free words correspond to different operation actions. After the voice command of the user is obtained, the electronic device determines a target wake-up-free word of the voice command.
In one implementation, the similarity between the voice instruction and each registered wake-up-free word is calculated, and the wake-up-free word with the highest similarity is selected as the target wake-up-free word.
The similarity can be understood as the degree to which a currently displayed wake-up-free word matches the voice instruction: the higher the similarity, the better the match. It can also be understood as the confidence with which a wake-up-free word is taken as the target wake-up-free word: the higher the similarity, the more likely the wake-up-free word is close to the real intent of the user's voice instruction. Therefore, the wake-up-free word with the highest similarity is preferably selected as the target wake-up-free word. The similarity may be expressed as a percentage or as a decimal no greater than 1. For example, the similarity between the user's voice instruction "the fifth song" and the target wake-up-free word "the fifth song" may be 98.5%, or equivalently 0.985.
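As a non-limiting illustration, the selection of the target wake-up-free word can be sketched as follows. The function names are hypothetical, and the text similarity from Python's `difflib` is only a stand-in: the description does not specify how the engine computes similarity, and a real engine would score the audio itself.

```python
from difflib import SequenceMatcher


def similarity(command: str, word: str) -> float:
    """Stand-in similarity in [0, 1] (assumption: plain text comparison)."""
    return SequenceMatcher(None, command, word).ratio()


def pick_target(command: str, registered: list[str]) -> tuple[str, float]:
    """Return the registered wake-up-free word with the highest similarity."""
    scored = [(word, similarity(command, word)) for word in registered]
    return max(scored, key=lambda pair: pair[1])


registered_words = ["the first song", "the second song", "the fifth song"]
target, score = pick_target("the fifth song", registered_words)
print(target, score)
```

The returned score is the value that is later compared against the per-word similarity threshold.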
In one embodiment, the voice instruction is sent to a first recognition engine to determine a target wake-free word of the voice instruction by the first recognition engine.
In this step, the first recognition engine is preferably a hotword engine. After receiving the user's voice instruction, the hotword engine processes it to obtain the text of the target wake-up-free word, the similarity, the similarity threshold of the target wake-up-free word, and the voice stream used for recognizing the target wake-up-free word. The similarity is the similarity between the target wake-up-free word and the voice instruction, the similarity threshold is the threshold on that similarity for the target wake-up-free word, and the voice stream is the audio data of the original voice instruction after analog-to-digital conversion.
Step 102, judging whether the similarity between the voice command and the target wake-up free word is smaller than a similarity threshold corresponding to the target wake-up free word.
If the result is yes, step 103 is performed; if the result is no, step 105 is performed.
In this step, the similarity threshold is set per wake-up-free word; it may be preset initially and may be adjusted as needed. If the similarity between the target wake-up-free word and the voice instruction exceeds the similarity threshold, the currently selected target wake-up-free word is more likely to be close to the real intent of the voice instruction. Otherwise, the target wake-up-free word is less likely to agree with the real intent of the voice instruction. In addition, because the similarity threshold is a threshold on the similarity, the two use the same numerical representation. For example, if the similarity threshold is expressed as a percentage, the similarity should also be expressed as a percentage.
For example, if the similarity threshold of the target wake-up-free word "the fifth song" is set to 0.95, the word is recognized when the similarity between the voice instruction and the target wake-up-free word exceeds 0.95. In practice, if the user's pronunciation is unclear, the user speaks with a dialect accent, or the surrounding noise is loud, the similarity between the user's voice instruction "the fifth song" and the target wake-up-free word "the fifth song" may fall short of 0.95, possibly only narrowly. In that case the voice instruction is not recognized, in order to prevent false triggering. Conversely, if the environment is quiet and the user's pronunciation is standard, the similarity exceeds 0.95 and the target wake-up-free word "the fifth song" is recognized.
Similarly, if the user speaks a voice instruction that corresponds to none of the current wake-up-free words, its similarity to each of them is low. For example, suppose there are five wake-up-free words: "the first song", "the second song", "the third song", "the fourth song", and "the fifth song". When the user says "the fifteenth song", the similarity between this voice instruction and each of the five wake-up-free words is low and does not exceed any of their similarity thresholds, so the instruction is not recognized. The threshold must nevertheless be chosen with care. If it is set too high, for example 0.97, the voice instruction "the fifth song" is not recognized even when its similarity to the target wake-up-free word "the fifth song" exceeds 0.95 but does not exceed 0.97. If it is set too low, for example 0.87, the similarity between the voice instruction "the fifth song" and the target wake-up-free word easily exceeds 0.87, but so does the similarity between the similar-sounding instruction "the fifteenth song" and the target wake-up-free word "the fifth song", which causes misrecognition and an "irrelevant answer". The similarity threshold therefore needs to be tuned within a suitable range.
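The trade-off described above can be illustrated with a small sketch. The three thresholds (0.97, 0.95, 0.87) are taken from the example; the similarity scores assigned to each utterance are assumed values chosen purely for illustration, and the function name is hypothetical.

```python
def recognized(scores: dict[str, float], threshold: float) -> list[str]:
    """Return the utterances whose similarity exceeds the threshold."""
    return [utterance for utterance, s in scores.items() if s > threshold]


# Assumed similarities to the target wake-up-free word "the fifth song".
scores = {
    "the fifth song, clear": 0.96,
    "the fifth song, noisy": 0.93,
    "the fifteenth song": 0.90,
}

for threshold in (0.97, 0.95, 0.87):
    print(threshold, recognized(scores, threshold))
```

Under these assumed scores, 0.97 rejects every utterance, 0.95 accepts only the clearly spoken one, and 0.87 also accepts the unrelated "the fifteenth song".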
And step 103, determining the intention of the voice instruction when the similarity is smaller than a similarity threshold.
In one embodiment, the voice instruction is sent to a second recognition engine, which determines the intent of the voice instruction. In this step, the second recognition engine may be a local recognition engine. After receiving the user's voice instruction, the local recognition engine parses it directly into the intent of the voice instruction, that is, the user intent.
The user intent may take the form of natural-language text or keyword labels. For example, when the voice instruction is "the fifth song", the second recognition engine may output the natural-language text "the fifth song" as the user intent. Alternatively, it may output keyword labels; for the voice instruction "the fifth song" these include the number label "five" and the attribute label "song". When the displayed wake-up-free words form a list with numeric sequence numbers, recognition focuses on the number label. The number label is obtained by parsing the number in the voice instruction, and the attribute label is determined by the function of the current display interface. For example, if a music player is in use, labels such as "song", "play", or "pause" may be recognized; if map navigation is in use, labels such as "destination", "departure place", "start navigation", or "gas station" may be recognized. The keyword labels can be adjusted as needed, and weights can be assigned to different labels to improve the accuracy of the judgment.
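A minimal sketch of the keyword-label form of the intent is given below. The label vocabularies are those named in the text; the data structure, the function name, and the fallback rule (using the interface's first label when only a number is spoken) are assumptions made for illustration, not a description of the second recognition engine.

```python
from dataclasses import dataclass, field
from typing import Optional

ORDINALS = {"first": 1, "second": 2, "third": 3, "fourth": 4, "fifth": 5}

# Attribute labels available on each display interface (names assumed).
INTERFACE_LABELS = {
    "music_player": ["song", "play", "pause"],
    "navigation": ["destination", "departure place",
                   "start navigation", "gas station"],
}


@dataclass
class Intent:
    number: Optional[int] = None
    attributes: list[str] = field(default_factory=list)


def parse_intent(text: str, interface: str) -> Intent:
    """Extract a number label and interface-specific attribute labels."""
    lowered = text.lower()
    number = next((ORDINALS[t] for t in lowered.split() if t in ORDINALS), None)
    labels = INTERFACE_LABELS.get(interface, [])
    attributes = [label for label in labels if label in lowered]
    # Assumption: a bare number on a numbered list takes the interface's
    # primary attribute label.
    if not attributes and number is not None and labels:
        attributes = [labels[0]]
    return Intent(number, attributes)


print(parse_intent("the fifth song", "music_player"))
```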
Step 104, judging whether the target wake-up-free word matches the intent of the voice instruction. If the result is yes, step 105 is performed; if the result is no, no response is triggered.
In one embodiment, whether the target wake-up-free word matches the intent of the voice instruction is determined from the similarity between the text of the target wake-up-free word and the intent of the voice instruction. In step 103 the user's voice instruction was parsed into an intent; the similarity between that intent and the target wake-up-free word is now calculated. In the ideal case, the voice instruction "the fifth song" is parsed into the intent "the fifth song", text matching against the target wake-up-free word "the fifth song" yields a similarity of 1 (100%), and the two clearly match. Intent recognition may, however, be inaccurate: the voice instruction "the fifth song" may be parsed into some other meaning close to "the fifth song", so that the similarity between the text of the target wake-up-free word and the intent falls short of 1 (100%). A matching threshold is therefore needed to decide whether the target wake-up-free word matches the intent of the voice instruction. For example, with a matching threshold of 0.985 (98.5%), the target wake-up-free word is judged to match the intent when the similarity between its text and the intent exceeds 0.985, and judged not to match otherwise. The matching threshold can be adjusted as needed.
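As an illustrative sketch of this matching step, the check below compares the text of the target wake-up-free word with the recognized intent text and applies the matching threshold of 0.985 from the example. The use of `difflib.SequenceMatcher` as the text-similarity measure is an assumption; the description does not name a specific measure.

```python
from difflib import SequenceMatcher

MATCH_THRESHOLD = 0.985  # example value from the description


def intent_matches(target_word: str, intent_text: str,
                   threshold: float = MATCH_THRESHOLD) -> bool:
    """Return True when the text similarity exceeds the matching threshold."""
    score = SequenceMatcher(None, target_word, intent_text).ratio()
    return score > threshold


print(intent_matches("the fifth song", "the fifth song"))
print(intent_matches("the fifth song", "the fifteenth song"))
```

With a threshold this close to 1, only near-identical texts pass, which is consistent with the purpose of the check: confirming a low-similarity hotword hit rather than loosening it.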
Step 105, executing the target operation corresponding to the target wake-free word.
When the target wake-up-free word matches the intent of the voice instruction, the target wake-up-free word agrees with what the user intended, and the target operation corresponding to the target wake-up-free word can be executed.
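Steps 102 to 105 can be summarized in the following sketch. The two engines are represented by caller-supplied functions with hypothetical names, since their interfaces are not specified; the point of the sketch is only the control flow, in which the second recognition engine is invoked solely when the similarity is below the threshold.

```python
from typing import Callable, Optional


def handle_command(
    similarity: float,
    threshold: float,
    target_word: str,
    recognize_intent: Callable[[], str],
    intent_matches: Callable[[str, str], bool],
) -> Optional[str]:
    """Return the wake-up-free word whose operation should run, or None."""
    if similarity >= threshold:               # step 102: go straight to step 105
        return target_word
    intent = recognize_intent()               # step 103: second recognition engine
    if intent_matches(target_word, intent):   # step 104
        return target_word                    # step 105
    return None                               # no response is triggered


result = handle_command(
    0.90, 0.95, "the fifth song",
    recognize_intent=lambda: "the fifth song",
    intent_matches=lambda word, intent: word == intent,
)
print(result)
```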
In one embodiment, the correspondence between each wake-up-free word and its target operation may be preset; when the voice instruction matches a preset wake-up-free word, the target operation is determined from this correspondence.
In one embodiment, the control method further comprises:
when the number of times that the target wake-up-free word has historically matched the intent of the voice instruction is greater than the number threshold and the matching success rate is greater than the success rate threshold, updating the similarity threshold according to the similarities between the target wake-up-free word and the voice instruction that were observed in the history.
In this step, the more often the target wake-up-free word has historically matched the intent of the voice instruction, and the higher the matching success rate, the better the current target wake-up-free word matches the voice instruction and the more reliable it is. Therefore, when the number of historical matches is greater than the number threshold and the matching success rate is greater than the success rate threshold, the similarity threshold can be adjusted so that this reliable target wake-up-free word is recognized more easily. Preferably, the similarity threshold is lowered appropriately.
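The update condition can be sketched as follows. The class and function names, and the concrete threshold values used in the example call, are hypothetical; the description only requires that both the match count and the success rate exceed their respective thresholds.

```python
from dataclasses import dataclass


@dataclass
class MatchHistory:
    """Per-word record of intent-matching attempts (illustrative only)."""
    attempts: int = 0
    successes: int = 0

    def record(self, matched: bool) -> None:
        self.attempts += 1
        if matched:
            self.successes += 1

    @property
    def success_rate(self) -> float:
        return self.successes / self.attempts if self.attempts else 0.0


def should_update_threshold(history: MatchHistory,
                            number_threshold: int,
                            rate_threshold: float) -> bool:
    """Both conditions must hold before the similarity threshold is updated."""
    return (history.successes > number_threshold
            and history.success_rate > rate_threshold)


history = MatchHistory()
for matched in [True] * 55 + [False] * 5:
    history.record(matched)
print(should_update_threshold(history, number_threshold=50, rate_threshold=0.9))
```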
In one embodiment, updating the similarity threshold based on the similarity includes: counting the occurrence times of each similarity, and updating the similarity with the occurrence times meeting the preset condition into a similarity threshold of the target wake-free word.
The preset conditions comprise that the occurrence number is larger than a frequency threshold value and/or the occurrence number is the maximum value.
In this embodiment, for a given target wake-up-free word, the similarity obtained in each recognition is recorded, and the number of occurrences of each similarity value is counted. If the number of occurrences of a certain similarity value exceeds the frequency threshold and/or is the largest among all values, that value has occurred frequently in the past, and it can be set as the similarity threshold of the target wake-up-free word.
For example, suppose 1000 matches were made for a target wake-up-free word, in which a similarity of 75% occurred 2 times, a similarity of 85% occurred 990 times, and a similarity of 97% occurred 8 times. The similarity of 85% occurred most often, so 85% can be set as the similarity threshold of the target wake-up-free word.
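The counting rule can be sketched with the figures from this example. The function name and the frequency threshold of 500 are assumptions, and the sketch implements the variant in which both preset conditions must hold (most frequent value, and more occurrences than the frequency threshold). In practice the similarity values would presumably be rounded or binned before counting, which is also an assumption.

```python
from collections import Counter
from typing import Optional


def updated_threshold(similarities: list[float],
                      frequency_threshold: int) -> Optional[float]:
    """Return the most frequent similarity if it occurs often enough."""
    if not similarities:
        return None
    value, count = Counter(similarities).most_common(1)[0]
    return value if count > frequency_threshold else None


# 1000 matches: 75% twice, 85% 990 times, 97% 8 times.
observed = [0.75] * 2 + [0.85] * 990 + [0.97] * 8
print(updated_threshold(observed, frequency_threshold=500))
```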
In one embodiment, the control method further comprises: uploading wake-up-free voice data to the cloud, so that the cloud updates the similarity threshold according to the wake-up-free voice data. The wake-up-free voice data includes: the voice instruction, the similarity, and the target wake-up-free word.
In this step, in order to achieve data synchronization and save local storage space and computing resources, wake-up-free voice data may be uploaded to the cloud.
The control procedure of wake-up-free voice control is described below using a specific example (example one).
In this example, several wake-up-free words are currently displayed, one of which is "the fifth song". When the user issues the voice instruction "the fifth song", its similarity to each of the current wake-up-free words is calculated one by one. The wake-up-free word with the highest similarity is "the fifth song", which is taken as the target wake-up-free word. It is then judged whether the similarity between the target wake-up-free word "the fifth song" and the voice instruction exceeds the similarity threshold of that word. If it does, the target wake-up-free word is highly reliable, the voice instruction is recognized as "the fifth song", and the corresponding action is executed.
Further, in example one, suppose the similarity of the target wake-up-free word "the fifth song" does not exceed its similarity threshold. To determine whether the target wake-up-free word matches the intent of the voice instruction, the intent must be recognized: the recognition engine determines the text content of the voice instruction, which is then checked for consistency against the target wake-up-free word "the fifth song". If they are consistent, the voice instruction is recognized as "the fifth song" and the corresponding action is executed. A consistent result also indicates that the similarity threshold of the wake-up-free word "the fifth song" is too high and should be lowered adaptively.
The process of wake-up-free voice control is described below step by step, as a specific example:
1) Receiving a voice instruction issued by the user.
The microphone of the vehicle-mounted terminal is in a standby state, that is, it continuously collects external sound. Once the microphone captures speech from the user, it collects voice data until the speech ends. The voice data undergoes analog-to-digital conversion, and the resulting digital voice data is passed to the hotword engine. When the user's voice instruction contains a corresponding target wake-up-free word, the hotword engine triggers an instruction protocol callback. The callback protocol contains the text of the currently triggered target wake-up-free word, the value of the similarity between the voice instruction and the target wake-up-free word, the similarity threshold set for the target wake-up-free word, and the audio stream of the voice instruction.
2) Parsing the callback protocol content returned by the hotword engine.
Parsing the callback protocol yields the text of the currently triggered target wake-up-free word, the value of the similarity between the voice instruction and the target wake-up-free word, and the similarity threshold set for the target wake-up-free word; the audio stream of the voice instruction is cached. If the similarity value is greater than the similarity threshold, the currently selected target wake-up-free word is more likely to be close to the real intent of the voice instruction, that is, the voice instruction matches the target wake-up-free word, and the instruction corresponding to the target wake-up-free word is sent directly to the application software of the vehicle-mounted terminal for execution. Otherwise, if the similarity value is not greater than the similarity threshold, the current timestamp is obtained.
3) Performing speech recognition again on the audio stream of the voice instruction.
Using the timestamp obtained in the previous step, the audio stream of the corresponding voice instruction is retrieved from the cache and sent to the local recognition engine. After the local recognition engine finishes, the text-recognition callback content for the audio stream is obtained, and the text corresponding to the audio stream, that is, the local recognition text, is parsed from it.
4) Comparing the text of the target wake-up-free word with the local recognition text produced by the local recognition engine.
The local recognition text obtained in the previous step is compared with the text of the target wake-up-free word. If the two texts are inconsistent, the instruction corresponding to the target wake-up-free word is not sent to the application software of the vehicle-mounted terminal; if they are consistent, the instruction is sent directly to the application software of the vehicle-mounted terminal for execution.
5) Uploading to the cloud the text of each target wake-up-free word whose similarity to the current voice instruction was below the similarity threshold, together with the corresponding similarity value.
When the user is not performing wake-up-free voice control, the wake-up-free voice data produced during wake-up-free execution is uploaded to the cloud. This data includes the text of the current target wake-up-free word, the similarity threshold corresponding to it, the similarity between the target wake-up-free word and the voice instruction, and the audio stream information of the voice instruction. After the upload succeeds, the corresponding files on the vehicle-mounted terminal are deleted so that they do not occupy local storage space.
6) Analyzing the information collected by the cloud.
The purpose of the analysis is to set, based on historical data, a similarity threshold that better fits the actual application scenario, achieving continuous self-learning and self-optimization. From the collected similarity thresholds of the wake-up-free words and the similarity values actually measured against voice instructions, the cloud produces the relevant data analysis and continuously improves the accuracy of the similarity threshold.
7) Sending the similarity threshold obtained from the analysis to the vehicle-mounted terminal, replacing the original similarity threshold with the newly determined one, and pushing the hotword configuration file to a fixed directory of the vehicle-mounted terminal. This improves the accuracy with which the user's voice instructions are recognized and improves the user experience.
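The terminal-side part of this procedure (steps 2 to 5) can be sketched as below. The format of the callback protocol is not given in the description, so the JSON layout, the field names `word`, `similarity`, and `threshold`, and all function names are assumptions introduced for illustration; the timestamp-keyed audio cache and the actual upload are omitted.

```python
import json
from dataclasses import dataclass, asdict
from typing import Callable, Optional


@dataclass
class HotwordCallback:
    word: str          # text of the triggered target wake-up-free word
    similarity: float  # similarity between the instruction and the word
    threshold: float   # similarity threshold set for the word


def parse_callback(payload: str) -> HotwordCallback:
    """Step 2: parse the (assumed JSON) callback content."""
    data = json.loads(payload)
    return HotwordCallback(data["word"], float(data["similarity"]),
                           float(data["threshold"]))


def process(payload: str, audio: bytes,
            local_recognize: Callable[[bytes], str],
            upload_queue: list) -> Optional[str]:
    """Return the word whose instruction is sent to the application, or None."""
    callback = parse_callback(payload)
    if callback.similarity > callback.threshold:
        return callback.word                       # step 2: execute directly
    local_text = local_recognize(audio)            # step 3: recognize again
    upload_queue.append(asdict(callback))          # step 5: queue for the cloud
    return callback.word if local_text == callback.word else None  # step 4


queue: list = []
payload = json.dumps({"word": "the fifth song",
                      "similarity": 0.90, "threshold": 0.95})
print(process(payload, b"", lambda audio: "the fifth song", queue))
```

Queuing the below-threshold record rather than uploading it immediately reflects step 5, in which the upload happens when the user is not performing wake-up-free voice control.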
Referring to FIG. 2, a schematic block diagram of a wake-up-free voice control device according to an exemplary embodiment of the present invention is provided. The device includes the following modules:
The voice acquisition module 21 is configured to acquire a voice instruction.
The first recognition engine 22 is configured to determine a target wake-up-free word of the voice instruction.
The control module 23 is configured to judge whether the similarity between the voice instruction and the target wake-up-free word is less than the similarity threshold corresponding to the target wake-up-free word, and to invoke the second recognition engine when the similarity is less than the similarity threshold.
The second recognition engine 24 is configured to determine the intent of the voice instruction.
The control module 23 is further configured to judge whether the target wake-up-free word matches the intent of the voice instruction, and to execute the target operation corresponding to the target wake-up-free word when it does.
Optionally, the wake-up-free voice control device further comprises:
An updating module, configured to update the similarity threshold according to the similarity when the number of times the target wake-free word has been matched against the intention of the voice instruction is greater than a number threshold and the matching success rate is greater than a success rate threshold.
Optionally, the updating module includes:
a statistics unit, configured to count the number of occurrences of each similarity;
and an updating unit, configured to update the similarity whose number of occurrences meets a preset condition to be the similarity threshold of the target wake-free word. The preset condition includes that the number of occurrences is greater than a count threshold and/or that the number of occurrences is the maximum.
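The updating module can be sketched as below, under several assumptions that are not in the description: similarities are rounded to two decimals so that equal values can be counted, the preset condition is taken in its combined form (the most frequent similarity, accepted only if its count exceeds the count threshold), and the success rate is computed as successful matches divided by match attempts. The function and parameter names are hypothetical.

```python
from collections import Counter
from typing import Iterable, Optional


def should_update(attempts: int, successes: int,
                  number_threshold: int, success_rate_threshold: float) -> bool:
    """Gate for the update: enough matches and a high enough success rate."""
    if attempts <= 0 or attempts <= number_threshold:
        return False
    return successes / attempts > success_rate_threshold


def updated_threshold(similarities: Iterable[float], count_threshold: int,
                      digits: int = 2) -> Optional[float]:
    """Statistics unit plus updating unit.

    Counts how often each (rounded) similarity occurs and returns the most
    frequent one if its count exceeds `count_threshold`; otherwise returns
    None, meaning the existing threshold is kept.
    """
    counts = Counter(round(s, digits) for s in similarities)
    if not counts:
        return None
    value, count = counts.most_common(1)[0]
    return value if count > count_threshold else None
```

For example, with observed similarities 0.71, 0.712, 0.709, 0.65 and 0.80 and a count threshold of 2, the value 0.71 occurs three times after rounding and would become the new threshold.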
In one embodiment, the wake-up-free voice control apparatus further comprises:
A sending module, configured to upload wake-up-free voice data to the cloud so that the cloud can update the similarity threshold according to the wake-up-free voice data. The wake-up-free voice data includes: the voice instruction, the similarity, and the target wake-free word.
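As an illustration of the data the sending module uploads, the following sketch packs the three listed fields into a JSON payload. The record type, the field names, the choice of JSON, and the representation of the voice instruction as recognized text (rather than audio) are assumptions made for this example; the description does not specify a wire format.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class WakeFreeRecord:
    """One item of wake-up-free voice data (hypothetical field names)."""
    voice_instruction: str       # assumed here to be the recognized text
    similarity: float            # similarity measured by the first engine
    target_wake_free_word: str   # word the instruction was matched against


def to_upload_payload(records) -> str:
    """Serialize records into a JSON array for upload to the cloud."""
    return json.dumps([asdict(r) for r in records], ensure_ascii=False)
```

Keeping the measured similarity next to the matched word in each record is what allows the cloud to redo the per-word statistics described for the updating module.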
Optionally, when the similarity is greater than or equal to the similarity threshold, the target operation corresponding to the target wake-free word is performed.
Fig. 3 is a schematic structural diagram of an electronic device according to the present embodiment. The electronic device includes a memory, a processor, and a computer program stored on the memory and executable on the processor; when the processor executes the computer program, the wake-up-free voice control method of embodiment 1 is implemented. The electronic device 300 shown in fig. 3 is merely an example and should not be construed as limiting the functionality or scope of use of embodiments of the present invention.
Referring to fig. 3, the electronic device 300 may be embodied in the form of a general purpose computing device, which may be a server device, for example. Components of the electronic device 300 may include, but are not limited to: at least one processor 301, at least one memory 302, and a bus 303 connecting the different system components (including the memory 302 and the processor 301).
The bus 303 includes a data bus, an address bus, and a control bus.
Memory 302 may include volatile memory such as Random Access Memory (RAM) 321 and/or cache memory 322, and may further include Read Only Memory (ROM) 323.
Memory 302 may also include a program/utility 325 having a set (at least one) of program modules 324, such program modules 324 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment.
The processor 301 executes various functional applications and data processing, such as the wake-up-free voice control method of embodiment 1 of the present invention, by running a computer program stored in the memory 302.
The electronic device 300 may also communicate with one or more external devices 304 (e.g., a keyboard, a pointing device, etc.). Such communication may occur through an input/output (I/O) interface 305. The electronic device 300 may also communicate with one or more networks, such as a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet, through a network adapter 306. As shown, the network adapter 306 communicates with other modules of the electronic device 300 via the bus 303. It should be appreciated that, although not shown in the figures, other hardware and/or software modules may be used in connection with the electronic device 300, including, but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID (disk array) systems, tape drives, data backup storage systems, and the like.
It should be noted that although several units/modules or sub-units/modules of the electronic device are mentioned in the above detailed description, this division is merely exemplary and not mandatory. Indeed, according to embodiments of the present invention, the features and functions of two or more units/modules described above may be embodied in a single unit/module. Conversely, the features and functions of one unit/module described above may be further divided among a plurality of units/modules.
The present embodiment also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the wake-up-free speech control method provided in any of the above embodiments.
More specifically, the readable storage medium may include, but is not limited to: a portable disk, a hard disk, a random access memory, a read only memory, an erasable programmable read only memory, an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
In a possible implementation manner, the present invention may also be realized in the form of a program product, which comprises program code for causing a terminal device to execute the wake-up free speech control method provided by any of the above embodiments, when the program product is run on the terminal device.
The program code for carrying out the invention may be written in any combination of one or more programming languages, and may execute entirely on the user device, partly on the user device, as a stand-alone software package, partly on the user device and partly on a remote device, or entirely on a remote device.
While specific embodiments of the invention have been described above, it will be appreciated by those skilled in the art that this is by way of example only, and the scope of the invention is defined by the appended claims. Various changes and modifications to these embodiments may be made by those skilled in the art without departing from the principles and spirit of the invention, but such changes and modifications fall within the scope of the invention.

Claims (10)

1. A wake-up-free voice control method, applied to an electronic device, comprising:
acquiring a voice instruction and determining a target wake-up-free word of the voice instruction;
judging whether the similarity between the voice instruction and the target wake-up-free word is smaller than a similarity threshold corresponding to the target wake-up-free word;
determining an intent of the voice instruction when the similarity is less than the similarity threshold;
judging whether the target wake-up-free word matches the intent of the voice instruction; and
when the target wake-up-free word matches the intent of the voice instruction, executing a target operation corresponding to the target wake-up-free word.
2. The wake-up-free voice control method of claim 1, further comprising:
when the number of times that the target wake-up-free word is matched with the intent of the voice instruction is larger than a number threshold and the matching success rate is larger than a success rate threshold, updating the similarity threshold according to the similarity.
3. The wake-up-free voice control method of claim 2, wherein the updating the similarity threshold according to the similarity comprises:
counting the number of occurrences of the similarity;
updating the similarity whose number of occurrences meets a preset condition to be the similarity threshold of the target wake-up-free word; wherein the preset condition includes that the number of occurrences is greater than a number threshold and/or that the number of occurrences is the maximum.
4. The wake-up-free voice control method of claim 1, wherein the acquiring a voice instruction and determining a target wake-up-free word of the voice instruction comprises:
sending the voice instruction to a first recognition engine, so that the first recognition engine determines the target wake-up-free word of the voice instruction.
5. The wake-up-free voice control method of claim 1, wherein the determining an intent of the voice instruction comprises:
sending the voice instruction to a second recognition engine, so that the second recognition engine determines the intent of the voice instruction.
6. The wake-up-free voice control method of claim 1, further comprising:
uploading wake-up-free voice data to a cloud, so that the cloud updates the similarity threshold according to the wake-up-free voice data; wherein the wake-up-free voice data comprises: the voice instruction, the similarity and the target wake-up-free word.
7. The wake-up-free voice control method of any of claims 1-6, further comprising: performing a target operation corresponding to the target wake-up-free word when the similarity is greater than or equal to the similarity threshold.
8. A wake-up-free voice control device, comprising:
a voice acquisition module, configured to acquire a voice instruction;
a first recognition engine, configured to determine a target wake-up-free word of the voice instruction;
a control module, configured to judge whether the similarity between the voice instruction and the target wake-up-free word is smaller than a similarity threshold corresponding to the target wake-up-free word, and to call a second recognition engine when the similarity is smaller than the similarity threshold;
the second recognition engine, configured to determine the intent of the voice instruction;
wherein the control module is further configured to determine whether the target wake-up-free word matches the intent of the voice instruction, and to execute a target operation corresponding to the target wake-up-free word when the target wake-up-free word matches the intent of the voice instruction.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the wake-up free speech control method of any one of claims 1-7 when the computer program is executed.
10. A computer readable storage medium having stored thereon a computer program which when executed by a processor implements the wake-up free speech control method of any of claims 1-7.
CN202211668334.7A 2022-12-23 2022-12-23 Wake-free voice control method, device, equipment and storage medium Pending CN118248138A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211668334.7A CN118248138A (en) 2022-12-23 2022-12-23 Wake-free voice control method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN118248138A true CN118248138A (en) 2024-06-25

Family

ID=91555344

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211668334.7A Pending CN118248138A (en) 2022-12-23 2022-12-23 Wake-free voice control method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN118248138A (en)

Legal Events

Date Code Title Description
PB01 Publication
CB02 Change of applicant information

Country or region after: China

Address after: Room 3701, No. 866 East Changzhi Road, Hongkou District, Shanghai, 200080

Applicant after: Botai vehicle networking technology (Shanghai) Co.,Ltd.

Address before: Room 208, Building 4, No. 1411, Yecheng Road, Jiading Industrial Zone, Shanghai, Jiading District, Shanghai, 201821

Applicant before: Botai vehicle networking technology (Shanghai) Co.,Ltd.

Country or region before: China

SE01 Entry into force of request for substantive examination