CN112863514B - Voice application control method and electronic equipment - Google Patents


Info

Publication number
CN112863514B
CN112863514B (granted publication of application CN202110275681.2A)
Authority
CN
China
Prior art keywords
voice
application
semantic recognition
recognition result
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110275681.2A
Other languages
Chinese (zh)
Other versions
CN112863514A (en)
Inventor
蔡泽辉
金玉龙
马华锋
于春波
雷淼森
曹阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ecarx Hubei Tech Co Ltd
Original Assignee
Ecarx Hubei Tech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ecarx Hubei Tech Co Ltd filed Critical Ecarx Hubei Tech Co Ltd
Priority to CN202110275681.2A priority Critical patent/CN112863514B/en
Publication of CN112863514A publication Critical patent/CN112863514A/en
Application granted granted Critical
Publication of CN112863514B publication Critical patent/CN112863514B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00: Speech recognition
    • G10L 15/22: Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L 15/28: Constructional details of speech recognition systems
    • G10L 15/30: Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • G10L 2015/225: Feedback of the input speech

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • User Interface Of Digital Computer (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The invention provides a voice application control method and an electronic device. The method comprises: sending a received voice instruction to a voice cloud; receiving, from the voice cloud, a semantic recognition result in a set format that corresponds to the voice instruction and carries the application name of a customized voice interaction service; identifying, based on the application name, whether a target voice application corresponding to that name is customized locally; and if so, issuing the semantic recognition result to the target voice application corresponding to the application name in the client of the voice local server, so that the target voice application in the client parses the semantic recognition result and executes the corresponding action. Under this scheme, a voice application can reuse its own function pages, so no function-page template needs to be developed, avoiding the resource waste that template development causes.

Description

Voice application control method and electronic equipment
Technical Field
The invention relates to the technical field of voice interaction, in particular to a control method of voice application and electronic equipment.
Background
With the rapid development of automotive electronics, and in particular with the increasingly rich and complex functions of in-vehicle infotainment systems, traditional manual operation easily distracts users and poses a constant threat to driving safety. Voice recognition technology is introduced to keep the driver's hands and eyes on the driving task as much as possible, making operation simpler and more intelligent; for this reason it has developed and spread rapidly in the automotive electronics field in recent years.
Existing vehicle-mounted voice skill platforms provide templates for various scenarios, so that a corresponding template is supplied when the in-vehicle system environment contains a service matching a client application. However, because not every scenario can be anticipated, templates constrain an application's functions and page presentation, resulting in a uniform appearance; in addition, developing templates requires considerable time and money.
Disclosure of Invention
The present invention has been made in view of the above problems, and its object is to provide a voice application control method and an electronic device that overcome, or at least partially solve, those problems.
According to one aspect of the present invention, there is provided a control method for a voice application, applied to a voice local server, including:
the received voice command is sent to a voice cloud;
receiving a semantic recognition result in a set format, issued by the voice cloud, that corresponds to the voice instruction and carries the application name of the customized voice interaction service;
identifying whether a target voice application corresponding to the application name is customized locally based on the application name;
if yes, issuing the semantic recognition result to the target voice application corresponding to the application name in the client of the voice local server, so that the target voice application in the client parses the semantic recognition result and executes the action corresponding to it.
Optionally, before the identifying whether the target voice application corresponding to the application name is customized locally based on the application name, the method further includes: receiving a first voice interaction service customization request from any voice application, wherein the first voice interaction service customization request carries the application name of the voice application;
identifying whether the voice application customizes a voice interaction service on the voice cloud based on the application name;
if yes, the application name of the voice application is saved to the local.
Optionally, the identifying whether the target voice application corresponding to the application name is customized locally based on the application name includes: identifying whether the application name is stored locally;
if yes, determining that the target voice application corresponding to the application name is customized locally.
Optionally, after the issuing the semantic recognition result to the target voice application corresponding to the application name in the client applied to the voice local server, the method further includes:
and receiving the execution result information of the target voice application from the client, broadcasting and displaying the execution result information.
According to still another aspect of the present invention, there is also provided a control method of a voice application, applied to a voice cloud, including:
receiving and storing a second voice interaction service customization request from any voice application, wherein the second voice interaction service customization request carries the application name of the voice application and a plurality of voice customization scenes;
generating a plurality of voice sentence patterns and semantic recognition results of set formats corresponding to the voice sentence patterns one by one according to the voice customization scene;
acquiring a voice command received by a voice local server, determining a voice customization scene and a semantic recognition result corresponding to the voice command, and determining a target voice application corresponding to the voice command according to the voice customization scene;
writing the application name of the target voice application into the semantic recognition result, and issuing the semantic recognition result written into the application name of the target voice application to the voice local server.
Optionally, the determining the voice customization scene and the semantic recognition result corresponding to the voice instruction includes:
identifying whether the voice command is a fuzzy voice command;
if yes, determining a voice sentence pattern corresponding to the voice instruction, and determining a semantic recognition result and a voice customization scene corresponding to the voice instruction according to the voice sentence pattern;
if not, analyzing the voice command to obtain a semantic recognition result corresponding to the voice command, and extracting a voice customization scene corresponding to the voice command from the semantic recognition result.
Optionally, the determining, according to the voice customization scenario, the target voice application corresponding to the voice instruction includes: identifying whether a plurality of voice applications customize a voice customization scene corresponding to the voice instruction;
if yes, determining the target voice application from the voice applications according to a preset strategy.
Optionally, the determining the target voice application from the plurality of voice applications according to a preset policy includes: and determining the target voice application from the plurality of voice applications according to the execution accuracy and the user satisfaction of the plurality of voice applications.
Optionally, after receiving and saving the second voice interaction service customization request from any vehicle-mounted voice application, the method further includes: and transmitting the application name of the voice application for transmitting the second voice interaction service customization request to the voice local server.
Optionally, after receiving and saving the second voice interaction service customization request from any voice application, the method further includes: receiving a voice interaction service cancellation request from any voice application;
and deleting the application names corresponding to the voice applications, a plurality of customized scenes and semantic recognition results corresponding to the voice sentence patterns in each customized scene from the local according to the voice interaction service cancellation request.
According to one aspect of the present invention, there is also provided an electronic device including:
a processor;
a memory storing a computer program;
the computer program, when executed by the processor, causes the electronic device to perform the method as claimed in any one of the preceding claims.
In the scheme provided by the invention, after the received voice instruction is sent to the voice cloud, a semantic recognition result in a set format, issued by the voice cloud and carrying the application name of a customized voice interaction service, is received; whether a target voice application with that application name is customized locally is then recognized based on the application name. If yes, the semantic recognition result is issued to the target voice application corresponding to the application name in the client of the voice local server, so that the target voice application in the client parses the semantic recognition result and executes the corresponding action. Under this scheme, any voice application can customize a voice interaction service at the voice cloud, and can later parse the semantic recognition results issued by the voice local server and execute the corresponding actions. This achieves reuse of application functions, improves the flexibility of application presentation, removes the need to develop function-page templates, and avoids the resource waste such development causes.
The foregoing is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented according to the specification, and so that the above and other objects, features and advantages of the invention become more readily apparent, specific embodiments of the invention are described below in conjunction with the accompanying drawings.
Drawings
Some specific embodiments of the invention will be described in detail hereinafter by way of example and not by way of limitation with reference to the accompanying drawings. The same reference numbers will be used throughout the drawings to refer to the same or like parts or portions. It will be appreciated by those skilled in the art that the drawings are not necessarily drawn to scale. In the accompanying drawings:
FIG. 1 is a flow chart of a method of controlling a speech application according to one embodiment of the invention;
FIG. 2 is a flow chart of a method of controlling a speech application according to another embodiment of the present invention;
FIG. 3 is a flow chart of a method of controlling a speech application according to another embodiment of the present invention;
fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
It should be noted that the technical features of the embodiments and the alternative embodiments of the present invention may be combined with each other without conflict.
The present invention first provides a pipelined SDK (Software Development Kit), which serves as the exchange channel between a voice application, the voice cloud, and the voice local server.
The pipelined SDK includes a first interface, a second interface, and a third interface. The first interface is used to pass in the application key information and the callback function required when a voice application customizes a voice interaction service. The application key information may include the application name, the voice customization scenes, and so on. The application name is a mandatory field that the voice cloud uses to look up the voice application; the callback function is the function that receives the semantic recognition result corresponding to a voice instruction, through which the voice cloud returns semantic information to the voice application so that the application can execute the corresponding action. The semantic information in the semantic recognition result may be in Json format. The second interface is used by the voice application to feed back execution result information; it may also be designed in Json format, and the voice cloud may customize the required fields. The third interface is used by the voice application to cancel its voice interaction service.
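As a purely hypothetical illustration (the patent specifies only the roles of the three interfaces, not their signatures), the pipelined SDK could be sketched in Python as follows; all method and parameter names here are invented for clarity:

```python
import json
from typing import Callable, Dict, List

class PipelinedSDK:
    """Illustrative sketch of the three SDK interfaces described above.
    Method names are hypothetical; the patent only defines their roles."""

    def __init__(self):
        self._registry: Dict[str, dict] = {}

    def register_service(self, app_name: str, scenes: List[str],
                         callback: Callable[[dict], dict]) -> None:
        # First interface: the app supplies key info (application name,
        # voice customization scenes) plus a callback that will receive
        # the Json semantic recognition result.
        self._registry[app_name] = {"scenes": scenes, "callback": callback}

    def report_result(self, app_name: str, result: dict) -> str:
        # Second interface: the app feeds back execution result info,
        # serialized here as Json.
        return json.dumps({"application name": app_name, **result})

    def cancel_service(self, app_name: str) -> None:
        # Third interface: the app cancels its voice interaction service.
        self._registry.pop(app_name, None)

    def deliver(self, app_name: str, semantic_result: dict) -> dict:
        # Channel role: route a semantic recognition result to the
        # registered callback of the named application.
        return self._registry[app_name]["callback"](semantic_result)
```

A registration followed by delivery then simply invokes the stored callback with the cloud's semantic recognition result.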
On the basis of the above-mentioned pipelined SDK, the present invention proposes a control method of a voice application, and fig. 1 is a schematic flow chart of a control method of a voice application according to an embodiment of the present invention. Referring to fig. 1, the method may include at least the following steps.
S102: and sending the received voice instruction to the voice cloud.
In this step, the voice local server may receive a voice command through the microphone.
S104: and receiving a semantic recognition result which is sent by the voice cloud and carries a set format of an application name of the customized voice interaction service.
S106: and identifying whether the target voice application corresponding to the application name is customized locally based on the application name.
S108: if yes, issue the semantic recognition result to the target voice application corresponding to the application name in the client of the voice local server, so that the target voice application in the client parses the semantic recognition result and executes the corresponding action.
The voice application mentioned in the above steps may be any application in the system environment that needs customized voice interaction, for example: Baidu Maps, video applications, Kugou Music, Huoshan in-car entertainment, and the like.
In this embodiment, after the received voice instruction is sent to the voice cloud, a semantic recognition result in a set format, issued by the voice cloud and carrying the application name of a customized voice interaction service, is received, and whether a target voice application with that application name is customized locally is then recognized based on the application name. If yes, the semantic recognition result is issued to the target voice application corresponding to the application name in the client of the voice local server, so that the target voice application in the client parses the result and executes the corresponding action. Under this scheme, any voice application can customize a voice interaction service at the voice cloud and later parse the semantic recognition results issued by the voice local server and execute the corresponding actions, achieving reuse of application functions, improving the flexibility of application presentation, and avoiding the resource waste caused by developing function-page templates.
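A minimal sketch of the local-server side of steps S104 to S108, assuming the semantic recognition result arrives as a Json string and that client applications are reachable through plain callables (both assumptions, not details from the patent):

```python
import json

def dispatch_semantic_result(raw_result: str, local_app_names: set,
                             client_apps: dict) -> str:
    """Check whether the application name carried in the semantic
    recognition result is customized locally (S106); if so, forward the
    result to the matching target voice application in the client (S108)."""
    result = json.loads(raw_result)                  # S104: cloud-issued Json
    app_name = result.get("application name", "")
    if app_name not in local_app_names:              # name not stored locally
        return "unsupported"                         # broadcast unsupported info
    # Issue the result to the target app, which parses and executes it.
    return client_apps[app_name](result)
```

The "unsupported" return corresponds to the fallback described later, where the server broadcasts unsupported information and exits the voice interaction flow.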
The information in the semantic recognition result may be in Json format. In one example, if the voice command is "please set the air conditioner to the 2nd gear", the semantic recognition result received by the voice local server is as follows:
{
    "application name": "com.ecarx.vehicle",
    "domain": "AC",
    "entrylist": [
        {
            "RECOG_SLOT_AC_DegreeNum": "2",
            "RECOG_SLOT_SeatName": "CarSeat_Everywhere"
        }
    ],
    "intent": "INTENT_AC_TemperatureSet",
    "intentGroup": "INTENTGROUP_TOPLEVEL",
    "intent_ortho": "set the air conditioner to the 2nd gear",
    "tconf": "6000"
}
Referring to the above semantic recognition result, the application name is "com.ecarx.vehicle", and the semantic information includes "set the air conditioner to the 2nd gear".
After the semantic recognition result is issued to the target voice application, the target voice application analyzes the semantic information in the semantic recognition result, and further executes corresponding actions and displays corresponding pages.
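A sketch of what the target application's parsing step might look like for the example result above. The mapping from intent to action is invented for illustration; only the field names come from the example:

```python
import json

def handle_semantic_result(raw: str) -> str:
    """Hypothetical target-app callback: parse the Json semantic
    recognition result and map the intent plus slot values to an action."""
    result = json.loads(raw)
    intent = result["intent"]
    slots = result["entrylist"][0]       # slot dictionary from the example
    if intent == "INTENT_AC_TemperatureSet":
        gear = slots["RECOG_SLOT_AC_DegreeNum"]
        return "set air conditioner to gear " + gear
    return "unknown intent"              # app cannot serve this intent
```

After executing the action, the app would display its own function page and feed execution result information back through the SDK's second interface.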
In some embodiments of the present invention, before step S106 above, the method further comprises: receiving a first voice interaction service customization request from any voice application, wherein the request carries the application name of the voice application; identifying, based on the application name, whether the voice application has customized a voice interaction service on the voice cloud; and if yes, saving the application name of the voice application locally.
The first voice interaction service customization request may carry, in addition to an application name of the voice application, a plurality of voice customization scenes corresponding to the voice application.
For example, the voice customization scenes may include VEHICLE (body control), AC (air conditioning), PHONE (phone), SMART_HOME (smart home), AGEND (calendar), MEDIA (multimedia), FLIGHT (air ticket), TRAIN (train ticket), HOTEL (hotel), RESTAURANT (restaurant), STOCK (stock), WEATHER (weather), VIDEO (video), NAVI (navigation), SETTING (settings), and the like.
For example, if the voice application is a travel application such as Ctrip, the corresponding voice customization scenes may include "air ticket", "train ticket", "hotel", and so on.
The voice cloud can issue to the voice local server the application names of the voice applications that have customized a voice interaction service at the voice cloud. The voice local server then judges whether a given voice application has customized such a service by matching its application name against these pre-stored application names: if the matching succeeds, it is determined that the voice application has customized a voice interaction service at the voice cloud; if the matching fails, it is determined that it has not.
After determining that the voice application has customized a voice interaction service at the voice cloud, the voice local server can store the application name and the voice customization scenes of the voice application locally.
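The first-request flow just described can be sketched as follows, with the cloud-issued name set and the local store modeled as plain Python collections (an assumption made for brevity):

```python
def handle_customization_request(app_name: str, scenes: list,
                                 cloud_app_names: set,
                                 local_store: dict) -> bool:
    """Sketch of handling a first voice interaction service customization
    request: match the app name against the names issued by the voice
    cloud, and on success store the name and scenes locally."""
    if app_name in cloud_app_names:          # matching succeeds
        local_store[app_name] = list(scenes)
        return True
    return False                             # no cloud-side customization
```

The `local_store` populated here is what step S106 later consults when deciding whether a target voice application is customized locally.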
It should be noted that the application names issued by the cloud and the application names actively stored by the voice local server may be kept in different local storage modules of the voice local server, to prevent confusion.
Then, the step S106 of identifying, based on the application name, whether the target voice application corresponding to the application name is customized locally may include: identifying whether the application name is stored locally; and if yes, determining that the target voice application corresponding to the application name is customized locally.
Continuing the above example, if "com.ecarx.vehicle" is stored locally in the voice local server, this indicates that the target voice application corresponding to that application name is customized locally.
If it is recognized that the application name is not stored locally, unsupported information is broadcast and the current voice interaction flow is exited.
After the semantic recognition result is issued to the target voice application, the method further comprises: receiving the execution result information of the target voice application from the client, and broadcasting and displaying the execution result information. The execution result information may include information such as whether execution succeeded.
Continuing the above example, after the semantic recognition result is issued to the voice application whose application name is "com.ecarx.vehicle", if the returned execution result information is "the air conditioner has been set to the 2nd gear", this information can be broadcast through a loudspeaker and output to the display screen of the terminal device for display to the user, so that the user can learn how the target voice application executed the instruction.
The terminal device is, for example, a mobile device, a computer, a vehicle-mounted device built into a vehicle, or the like, or any combination thereof. In some embodiments, the mobile device may include, for example, a cell phone, a smart home device, a wearable device, a smart mobile device, a virtual reality device, etc., or any combination thereof.
Based on the same inventive concept, the invention also provides a control method of the voice application, which is applied to the voice cloud, and fig. 2 is a flow chart of a control method of the voice application according to another embodiment of the invention. Referring to fig. 2, the method may include at least the following steps.
Step S202: and receiving and storing a second voice interaction business customization request from any voice application, wherein the second voice interaction business customization request carries the application name of the voice application and a plurality of voice customization scenes.
Step S204: and generating a plurality of voice sentence patterns and semantic recognition results in a set format corresponding to the voice sentence patterns one by one according to the voice customization scene.
Step S206: and acquiring a voice command received by the voice local server, determining a voice customization scene and a semantic recognition result corresponding to the voice command, and determining a target voice application corresponding to the voice command according to the voice customization scene.
Step S208: writing the application name of the target voice application into a semantic recognition result, and transmitting the semantic recognition result written into the application name of the target voice application to a voice local server.
The voice sentence patterns mentioned in step S204 are usually sentence patterns whose semantics are unclear; such patterns generally contain no specific semantic information. For example, for the "air conditioner" voice customization scene, the corresponding voice sentence patterns may include "too hot", "lower it by two degrees", and so on, and the semantic recognition result for "too hot" or "lower it by two degrees" may be set to that of a sentence such as "turn the air conditioner down by one level". The purpose of setting voice sentence patterns and their corresponding semantic recognition results is that, when the sentence corresponding to a received voice instruction has unclear semantics, the voice sentence pattern matching that instruction can be determined first, and the semantic recognition result and voice customization scene corresponding to the instruction can then be determined from the sentence pattern.
In this embodiment, the voice cloud receives and stores a second voice interaction service customization request from any voice application. When a voice instruction received by the voice local server is obtained, the voice cloud determines the voice customization scene corresponding to the instruction and a semantic recognition result in the set format, determines the target voice application according to the scene, writes the application name of the target voice application into the semantic recognition result, and issues the result to the voice local server. Under this scheme, voice applications interact directly with the voice cloud, which serves as the access point for customizing voice interaction services; any voice application can customize a service at the voice cloud, so that by parsing semantic recognition results it can reuse its functions and pages. This improves the flexibility of voice applications, enriches the interaction experience, and saves the cost of developing function-page templates.
In some embodiments of the present invention, determining the voice customization scene and the semantic recognition result corresponding to the voice instruction, as mentioned in step S206, may include: identifying whether the voice instruction is a fuzzy voice instruction; if yes, determining the voice sentence pattern corresponding to the instruction, and determining the semantic recognition result and voice customization scene according to that sentence pattern; if not, parsing the instruction to obtain its semantic recognition result, and extracting the voice customization scene from that result.
To recognize whether the voice instruction is a fuzzy voice instruction, the voice text corresponding to the instruction may first be determined, and words in the text that do not affect the semantics, such as filler particles, may be deleted. It is then judged whether the voice instruction can be parsed into a semantic recognition result: if yes, the instruction is determined not to be a fuzzy voice instruction; if not, it is determined to be a fuzzy voice instruction.
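The branch just described can be sketched as follows. The `parser` callable and the fuzzy sentence-pattern table stand in for the cloud's real components, and the intent name in the test data is invented for illustration:

```python
def determine_scene_and_result(command: str, parser, fuzzy_table: dict):
    """Sketch of the S206 determination: a command that parses directly
    is non-fuzzy and carries its own scene; otherwise look up its
    sentence pattern in a preset fuzzy table."""
    parsed = parser(command)
    if parsed is not None:                   # non-fuzzy: parse succeeded
        return parsed["domain"], parsed
    entry = fuzzy_table.get(command)         # fuzzy: fall back to patterns
    if entry is None:
        return None, None                    # no matching sentence pattern
    return entry["scene"], entry["result"]
```

The fuzzy table here plays the role of the preset voice sentence patterns and their one-to-one semantic recognition results generated in step S204.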
For example, if the voice command is "please set the air conditioner to the 2nd gear", the following semantic recognition result can be obtained by directly parsing the command.
{*"domain":"AC",
*"entrylist":[
*
{
*"RECOG_SLOT_AC_DegreeNum":"2",
*"RECOG_SLOT_SeatName":"CarSeat_Everywhere"
*
}],
*"intent":"INTENT_AC_TemperatureSet",
*"intentGroup":"INTENTGROUP_TOPLEVEL",
* "intent_ortho": "air conditioner is tuned to 2 nd gear",
*"tconf":"6000"
*
}]
*}
From this semantic recognition result it can be determined that the voice command is a non-fuzzy voice command; the voice customization scene "AC (air conditioner)" and the semantic information "set the air conditioner to the 2nd gear" can be extracted from the result.
If the voice command is "too hot", parsing it cannot yield a semantic recognition result in the set format, so the command is determined to be a fuzzy voice command. For a fuzzy voice command, the voice sentence pattern corresponding to the command is determined first, and the semantic recognition result and voice customization scene corresponding to the command are then determined from that sentence pattern.
Because different voice applications can customize the same voice customization scenes, after the semantic recognition result and voice customization scene corresponding to the voice instruction are determined, the next step is to identify whether multiple voice applications have customized that scene; if yes, the target voice application is determined from among those applications according to a preset strategy.
For example, the "take network" and "go to network" may each customize the voice customization scenarios of "train ticket", "air ticket" and "accommodation" due to similar functions.
The determining the target voice application from the plurality of voice applications according to the preset strategy may include: and determining the target voice application from the plurality of voice applications according to the execution accuracy and the user satisfaction of the plurality of voice applications.
For example, if two voice applications have customized the voice customization scene corresponding to the voice command, then for each voice application the execution accuracy is multiplied by a first weight and the user satisfaction by a second weight, the two products are added to obtain a weighted value, and the voice application with the higher weighted value is determined as the target voice application.
The first weight and the second weight represent the relative emphasis placed on execution accuracy and user satisfaction respectively; in practical applications they can be set as needed, and the invention does not limit them.
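The weighted selection described above can be sketched as follows. The weight values, function names, and score ranges are illustrative assumptions, not specified by the patent.

```python
# Illustrative weights: relative emphasis on accuracy vs. satisfaction.
W_ACCURACY = 0.6       # first weight
W_SATISFACTION = 0.4   # second weight

def pick_target(candidates):
    """Pick the target voice application by weighted score.

    candidates: {app_name: (execution_accuracy, user_satisfaction)},
    both assumed to lie in [0, 1].
    """
    def score(item):
        _, (accuracy, satisfaction) = item
        return W_ACCURACY * accuracy + W_SATISFACTION * satisfaction
    return max(candidates.items(), key=score)[0]

apps = {"app_a": (0.90, 0.70), "app_b": (0.80, 0.95)}
# app_a: 0.6*0.90 + 0.4*0.70 = 0.82; app_b: 0.6*0.80 + 0.4*0.95 = 0.86
print(pick_target(apps))  # app_b
```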
For the semantic recognition result into which the application name of the target voice application is written, as mentioned in step S208, reference may be made to the semantic recognition result in the first example.
In addition, after the above step S202, the method further includes: issuing, to the voice local server, the application name of each voice application that sent a second voice interaction service customization request. In this way, when the voice local server receives a first voice interaction service customization request from any voice application, it can identify whether that application has customized a voice interaction service at the voice cloud according to whether its application name is stored at the voice local server.
In addition, after the above step S202, the method further includes: receiving a voice interaction service cancellation request from any voice application, and, according to the cancellation request, deleting locally the application name of that voice application, its customized scenes, and the semantic recognition results corresponding to the voice sentence patterns in each customized scene.
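The customization and cancellation bookkeeping described in the two paragraphs above might be modeled as a simple cloud-side registry. The class and method names here are hypothetical; the stored structure (app name, scenes, per-sentence-pattern results) follows the description.

```python
class CustomizationRegistry:
    """Hypothetical cloud-side store of customized voice interaction services."""

    def __init__(self):
        # app_name -> {scene -> {sentence_pattern -> semantic_result}}
        self.apps = {}

    def customize(self, app_name, scenes_with_patterns):
        # Second voice interaction service customization request: save
        # the app name, its scenes, and the per-pattern recognition results.
        self.apps[app_name] = scenes_with_patterns

    def cancel(self, app_name):
        # Cancellation request: delete the app name, all its customized
        # scenes, and every semantic recognition result under them.
        self.apps.pop(app_name, None)

    def is_customized(self, app_name):
        return app_name in self.apps

registry = CustomizationRegistry()
registry.customize("app_a", {
    "train ticket": {"book a ticket to X": {"domain": "train ticket"}},
})
print(registry.is_customized("app_a"))  # True
registry.cancel("app_a")
print(registry.is_customized("app_a"))  # False
```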
In some embodiments of the present invention, after the above step S208, the method may further include: receiving execution result information of the target voice application uploaded by the voice local server, and judging whether it indicates an execution failure. If it does, and multiple voice applications correspond to the voice command, a new target voice application is determined from the remaining candidates, its application name is written into the semantic recognition result corresponding to the voice command, and the semantic recognition result written with the new application name is issued to the voice local server; the voice local server then identifies, based on the application name, whether the voice application with that name is customized locally, and if so, issues the semantic recognition result to the target voice application of the client corresponding to the name. This cycle repeats until the received execution result information indicates success or every voice application corresponding to the voice command has been tried.
For example, suppose the voice command is "play Wind Sound", the voice customization scene corresponding to the command is "video", and the voice applications determined to have customized that scene are "Tencent video", "cool video" and "personal video". According to the execution accuracy and user satisfaction of each voice application, the target voice application is determined to be "Tencent video", and a semantic recognition result containing "Tencent video" is issued to the voice local server. After receiving from the voice local server execution result information indicating that "Tencent video" failed to play (the failure may be due to a functional fault of the application, an unsupported film source, etc.), a new target voice application can be determined according to the execution accuracy and user satisfaction of "cool video" and "personal video"; if the new target voice application is "cool video", a semantic recognition result containing "cool video" is issued to the voice local server. The cycle repeats until the received execution result information indicates success or all three voice applications have been tried.
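The failure-retry cycle described above can be sketched as a loop over the remaining candidates, stopping on the first success or when every candidate has failed. The function and application names below are placeholders, not taken from the patent.

```python
def dispatch_with_retry(candidates, execute):
    """Try candidate voice applications in order, best first.

    candidates: list of app names, ranked by weighted score.
    execute(app_name) -> True if the local server reports success.
    Returns the app that succeeded, or None if all failed.
    """
    remaining = list(candidates)
    while remaining:
        target = remaining.pop(0)      # current best candidate
        if execute(target):            # execution result from the local server
            return target              # success: stop cycling
    return None                        # every candidate failed

# e.g. the first app fails (functional fault / unsupported source), the next succeeds
results = {"video_app_1": False, "video_app_2": True, "video_app_3": True}
print(dispatch_with_retry(["video_app_1", "video_app_2", "video_app_3"],
                          results.get))  # video_app_2
```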
Having described the implementation of each link of the embodiments shown in fig. 1 and fig. 2, the implementation process of the control method of a voice application of the present invention is described in detail below in combination with the voice local server, the voice cloud and the voice application. Fig. 3 is a flowchart of a control method of a voice application according to another embodiment of the present invention. Referring to fig. 3, the method may include the following steps.
S302: the voice application sends a second voice interaction service customization request, which carries the application name of the voice application and a plurality of voice customization scenes; a plurality of voice sentence patterns, and semantic recognition results in a set format corresponding one-to-one to the voice sentence patterns, are then generated according to the voice customization scenes.
S304: the voice cloud stores information carried by the second voice interaction business customization request.
S306: and the voice cloud end transmits the application name of the voice application for transmitting the second voice interaction business customization request to the voice local server.
S308: the voice application sends a first voice interaction business customization request to the voice local server, wherein the first voice interaction business customization request carries an application name of the voice application.
In this step, the first voice interaction service customization request may carry, in addition to an application name of the voice application, a plurality of voice customization scenes corresponding to the voice application.
S310: the voice local server identifies, based on the application name, whether the voice application has customized a voice interaction service at the cloud; if so, the application name of the voice application is saved locally; if not, it is not saved.
In this step, in addition to saving the application name locally, a plurality of voice customization scenes corresponding to the voice application may be saved.
In this step, the voice local server matches the application name against the application names issued by the voice cloud; if the matching succeeds, it determines that the voice application has customized a voice interaction service at the cloud; if the matching fails, it determines that the voice application has not.
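The name matching performed by the voice local server in step S310 could look like the following sketch; the variable and function names are assumptions for illustration.

```python
# Names of apps that customized at the voice cloud, as issued to the local server.
cloud_registered = {"app_a", "app_b"}
# Names the local server has saved after a successful match.
local_registered = set()

def handle_first_customization_request(app_name: str) -> bool:
    """Admit a first voice interaction service customization request only if
    the app has already customized at the cloud (name matching succeeds)."""
    if app_name in cloud_registered:    # matching succeeded
        local_registered.add(app_name)  # save the application name locally
        return True
    return False                        # matching failed: do not save

print(handle_first_customization_request("app_a"))  # True
print(handle_first_customization_request("app_x"))  # False
```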
S312: and the voice local server sends the received voice instruction to the voice cloud.
S314: the voice cloud determines a voice customization scene and a semantic recognition result corresponding to the voice instruction, determines a target voice application corresponding to the voice instruction according to the voice customization scene, and writes an application name of the target voice application into the semantic recognition result.
S316: and the voice cloud end transmits the semantic recognition result of the application name written in the target voice application to the voice local server.
S318: the voice local server identifies whether the target voice application with the application name is customized locally; if yes, go to step S320; if not, return to step S312.
In this step, the voice local server identifies whether the voice application with the application name is customized locally by checking whether the application name is stored at the voice local server: if the application name is stored locally, the voice application with that name is determined to be customized locally; if not, it is determined not to be customized locally.
S320: the voice local server side issues the semantic recognition result to a target voice application of the client side corresponding to the application name, so that the semantic recognition result is analyzed through the target voice application, and actions corresponding to the semantic recognition result are executed.
S322: the target voice application sends the execution result information to the voice local server.
S324: and broadcasting and displaying the execution result information by the voice local server.
S326: and the voice local server sends the execution result information to the voice cloud.
S328: the voice cloud judges whether the execution result information indicates an execution failure; if so, the flow returns to step S314, until the received execution result information indicates success or every voice application corresponding to the voice command has been cycled through; if not, the current flow ends.
Based on the same inventive concept, the present invention also proposes an electronic device 400, and fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention. Referring to fig. 4, electronic device 400 includes a processor 410 and a memory 420 storing a computer program 421; when executed by the processor 410, the computer program 421 causes the electronic device 400 to perform the method of any embodiment of the method of controlling a voice application applied to a voice local server as described above or the method of any embodiment of the method of controlling a voice application applied to a voice cloud.
The present invention provides a control method of a voice application and an electronic device. In the provided scheme, after a received voice command is sent to the voice cloud, a semantic recognition result in a set format, issued by the voice cloud and carrying the application name of an application that has customized a voice interaction service, is received; whether the target voice application with that application name is customized locally is then identified based on the application name. If so, the semantic recognition result is issued to the target voice application corresponding to the application name in the client of the voice local server, so that the semantic recognition result is parsed by the target voice application in the client and the action corresponding to it is executed. Based on this technical scheme, any voice application can customize a voice interaction service at the voice cloud and subsequently parse the semantic recognition results issued by the voice local server and execute the corresponding actions, achieving reuse of application functions and improving the flexibility of application display; moreover, no function page template needs to be developed, avoiding the resource waste caused by template development.
It will be clear to those skilled in the art that the specific working procedures of the above-described systems, devices and units may refer to the corresponding procedures in the foregoing method embodiments, and are not repeated herein for brevity.
In addition, each functional unit in the embodiments of the present invention may be physically independent, two or more functional units may be integrated together, or all functional units may be integrated in one processing unit. The integrated functional units may be implemented in hardware or in software or firmware.
Those of ordinary skill in the art will appreciate that: the integrated functional units, if implemented in software and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in essence or in whole or in part in the form of a software product stored in a storage medium, comprising instructions for causing a computing device (e.g., a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the methods described in the embodiments of the present invention when the instructions are executed. And the aforementioned storage medium includes: a usb disk, a removable hard disk, a read-only memory (ROM), a random-access memory (RAM), a magnetic disk, or an optical disk, etc.
Alternatively, all or part of the steps of implementing the foregoing method embodiments may be implemented by hardware (such as a personal computer, a server, or a computing device such as a network device) associated with program instructions, where the program instructions may be stored on a computer-readable storage medium, and where the program instructions, when executed by a processor of the computing device, perform all or part of the steps of the method according to the embodiments of the present invention.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all technical features thereof can be replaced by others within the spirit and principle of the present invention; such modifications and substitutions do not depart from the scope of the invention.

Claims (9)

1. The control method of the voice application is applied to a voice local server and is characterized by comprising the following steps:
the received voice command is sent to a voice cloud;
receiving a semantic recognition result which is issued by the voice cloud and corresponds to the voice instruction and carries a set format of an application name of the customized voice interaction service;
identifying whether a target voice application corresponding to the application name is customized locally based on the application name;
if yes, the semantic recognition result is issued to a target voice application corresponding to the application name in a client applied to the voice local server, so that the semantic recognition result is analyzed through the target voice application in the client, and an action corresponding to the semantic recognition result is executed;
wherein the identifying, based on the application name, whether the target voice application corresponding to the application name is customized locally includes:
identifying whether the application name is stored locally;
if yes, determining that the target voice application corresponding to the application name is customized locally;
before the identifying whether the target voice application corresponding to the application name is customized locally based on the application name, the method further comprises:
receiving a first voice interaction business customization request from any voice application, wherein the first voice interaction business customization request carries an application name of the voice application;
identifying whether the voice application customizes a voice interaction service on the voice cloud based on the application name;
if yes, the application name of the voice application is saved to the local.
2. The method of claim 1, wherein after the step of issuing the semantic recognition result to the target voice application corresponding to the application name in the client applied to the voice local server, further comprises:
and receiving the execution result information of the target voice application from the client, broadcasting and displaying the execution result information.
3. The control method of the voice application is applied to a voice cloud and is characterized by comprising the following steps:
receiving and storing a second voice interaction business customization request from any voice application, wherein the second voice interaction business customization request carries an application name of the voice application and a plurality of voice customization scenes;
generating a plurality of voice sentence patterns and semantic recognition results of set formats corresponding to the voice sentence patterns one by one according to the voice customization scene;
acquiring a voice command received by a voice local server, determining a voice customization scene and a semantic recognition result corresponding to the voice command, and determining a target voice application corresponding to the voice command according to the voice customization scene;
writing the application name of the target voice application into the semantic recognition result, and issuing the semantic recognition result written into the application name of the target voice application to the voice local server;
wherein, the determining the voice customization scene and the semantic recognition result corresponding to the voice instruction comprises:
identifying whether the voice command is a fuzzy voice command;
if yes, determining a voice sentence pattern corresponding to the voice instruction, and determining a semantic recognition result and a voice customization scene corresponding to the voice instruction according to the voice sentence pattern.
4. The method of claim 3, further comprising, after identifying whether the voice command is a fuzzy voice command:
if the voice command is not the fuzzy voice command, analyzing the voice command to obtain a semantic recognition result of a set format corresponding to the voice command, and extracting a voice customization scene corresponding to the voice command from the semantic recognition result.
5. The method of claim 3, wherein the determining the target voice application to which the voice command corresponds from the voice customization scene comprises:
identifying whether a plurality of voice applications customize a voice customization scene corresponding to the voice instruction;
if yes, determining the target voice application from the voice applications according to a preset strategy.
6. The method of claim 5, wherein determining the target voice application from the plurality of voice applications according to a preset policy comprises:
and determining the target voice application from the plurality of voice applications according to the execution accuracy and the user satisfaction of the plurality of voice applications.
7. The method of claim 3, wherein after receiving and saving the second voice interaction service customization request from any voice application, the method further comprises:
and transmitting the application name of the voice application for transmitting the second voice interaction service customization request to the voice local server.
8. The method of claim 3, wherein after receiving and saving the second voice interaction service customization request from any voice application, the method further comprises:
receiving a voice interaction service cancellation request from any voice application;
and deleting the application names corresponding to the voice applications, a plurality of customized scenes and semantic recognition results corresponding to the voice sentence patterns in each customized scene from the local according to the voice interaction service cancellation request.
9. An electronic device, comprising:
a processor;
a memory storing a computer program;
the computer program, when executed by the processor, causes the electronic device to perform the method of claim 1 or 2 or the method of any of claims 3-8.
CN202110275681.2A 2021-03-15 2021-03-15 Voice application control method and electronic equipment Active CN112863514B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110275681.2A CN112863514B (en) 2021-03-15 2021-03-15 Voice application control method and electronic equipment

Publications (2)

Publication Number Publication Date
CN112863514A CN112863514A (en) 2021-05-28
CN112863514B true CN112863514B (en) 2024-03-15


Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6901431B1 (en) * 1999-09-03 2005-05-31 Cisco Technology, Inc. Application server providing personalized voice enabled web application services using extensible markup language documents
CN104683456A (en) * 2015-02-13 2015-06-03 腾讯科技(深圳)有限公司 Service processing method, server and terminal
US9361084B1 (en) * 2013-11-14 2016-06-07 Google Inc. Methods and systems for installing and executing applications
CN109120774A (en) * 2018-06-29 2019-01-01 深圳市九洲电器有限公司 Terminal applies voice control method and system
CN110060672A (en) * 2019-03-08 2019-07-26 华为技术有限公司 A kind of sound control method and electronic equipment
CN110797022A (en) * 2019-09-06 2020-02-14 腾讯科技(深圳)有限公司 Application control method and device, terminal and server
CN111002996A (en) * 2019-12-10 2020-04-14 广州小鹏汽车科技有限公司 Vehicle-mounted voice interaction method, server, vehicle and storage medium
CN111290796A (en) * 2018-12-07 2020-06-16 阿里巴巴集团控股有限公司 Service providing method, device and equipment
CN111383631A (en) * 2018-12-11 2020-07-07 阿里巴巴集团控股有限公司 Voice interaction method, device and system
CN111627435A (en) * 2020-04-30 2020-09-04 长城汽车股份有限公司 Voice recognition method and system and control method and system based on voice instruction
WO2020233074A1 (en) * 2019-05-21 2020-11-26 深圳壹账通智能科技有限公司 Mobile terminal control method and apparatus, mobile terminal, and readable storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9781262B2 (en) * 2012-08-02 2017-10-03 Nuance Communications, Inc. Methods and apparatus for voice-enabling a web application
CN109036396A (en) * 2018-06-29 2018-12-18 百度在线网络技术(北京)有限公司 A kind of exchange method and system of third-party application
CN109992248B (en) * 2019-02-25 2022-07-29 阿波罗智联(北京)科技有限公司 Method, device and equipment for realizing voice application and computer readable storage medium




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20220317

Address after: 430051 No. b1336, chuanggu startup area, taizihu cultural Digital Creative Industry Park, No. 18, Shenlong Avenue, Wuhan Economic and Technological Development Zone, Wuhan, Hubei Province

Applicant after: Yikatong (Hubei) Technology Co.,Ltd.

Address before: No.c101, chuanggu start up area, taizihu cultural Digital Industrial Park, No.18 Shenlong Avenue, Wuhan Economic Development Zone, Hubei Province

Applicant before: HUBEI ECARX TECHNOLOGY Co.,Ltd.

GR01 Patent grant