WO2015027789A1 - Language control method, device and terminal - Google Patents

Language control method, device and terminal Download PDF

Info

Publication number
WO2015027789A1
Authority
WO
WIPO (PCT)
Prior art keywords
voice
application
attribute information
instruction
action
Prior art date
Application number
PCT/CN2014/083505
Other languages
French (fr)
Chinese (zh)
Inventor
樊艳梅
蒋洪睿
Original Assignee
华为终端有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为终端有限公司
Publication of WO2015027789A1

Links

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/22: Procedures used during a speech recognition process, e.g. man-machine dialogue

Definitions

  • the present invention relates to the field of communications technologies, and in particular, to a voice control method, apparatus, and terminal.
  • GUI Graphical User Interface
  • the GUI refers to a user interface in which computer operations are presented graphically.
  • the graphical result of a running application is presented on the screen, and its intent is conveyed to the user through visual display, including the text, colors, components, and region divisions in the graphics. The user can see at a glance which operations the graphical interface supports, and performs the corresponding operations by inputting touch gestures on the screen.
  • with voice control, an operation originally performed by gesture input can instead be completed by inputting a voice command.
  • the voice assistant set on the existing smart terminal can perform voice operations on various applications that are provided in the terminal, such as telephone, short message, search, schedule, alarm clock and the like.
  • the voice assistant presets the command that each application can receive.
  • when the user opens an application, the user enters the dialogue scene of that application and inputs voice instructions through the dialogue to complete the desired operation.
  • however, the voice assistant framework in the existing smart terminal serves only the terminal's built-in applications; after various third-party applications are downloaded to the terminal, the voice assistant in the terminal cannot perform voice operations on them.
  • it can be seen that the existing voice assistant framework has a limited degree of openness and can hardly meet the user's need to use voice interaction with any newly installed application, resulting in a poor user experience.

Summary of the invention
  • in a first aspect, a voice control method includes: receiving a user's voice instruction for a first application; matching the voice instruction with a voice user interface (UI) resource of the first application to obtain an action instruction corresponding to the voice instruction, where the voice UI resource of the first application includes voice attribute information, action attribute information, and context attribute information of each component of the first application; and performing, on the first application, an operation corresponding to the action instruction.
  • the voice attribute information of a component is the text content corresponding to the voice instruction that triggers the component, and the action attribute information of the component is the operation performed after the component is triggered;
  • the context attribute information of the component is the running state in which the voice instruction of the component takes effect, and the running state includes a global state, an application state, or a page state.
  • in one implementation, the method further includes obtaining a current first running state of the terminal; the matching then includes: recognizing, by a voice engine, first text content corresponding to the voice instruction; matching the first running state and the first text content with the voice UI resource to obtain first context attribute information and first voice attribute information corresponding to the first running state and the first text content; and obtaining first action attribute information corresponding to the first context attribute information and the first voice attribute information, and using the operation corresponding to the first action attribute information as the action instruction corresponding to the voice instruction; or, alternatively: recognizing, by the voice engine, the first text content corresponding to the voice instruction; matching the first text content with the voice UI resource to obtain the first voice attribute information and the first context attribute information corresponding to the first text content; and, when the first running state is consistent with the first context attribute information, obtaining the first action attribute information corresponding to the first voice attribute information and using the corresponding operation as the action instruction.
  • receiving the user's voice instruction for the first application includes: receiving a voice instruction from the user to open the first application; or receiving a voice instruction from the user for a further operation on the first application on a page displayed after the first application is opened.
  • before receiving the user's voice instruction for the first application, the method may further include: when an application-opening voice command from the user corresponds to at least two applications, outputting options for the at least two applications;
  • receiving the user's voice instruction for the first application then includes: receiving the user's voice instruction for the first application selected from the at least two applications according to the options; or
  • receiving the user's application-opening voice command for the first application, where the first application is the application with the highest preset priority among the at least two applications corresponding to the voice command.
  • in a second aspect, a voice control apparatus includes: a receiving unit, configured to receive a user's voice instruction for a first application; a matching unit, configured to match the voice instruction received by the receiving unit with the voice UI resource of the first application to obtain an action instruction corresponding to the voice instruction, where the voice UI resource of the first application includes voice attribute information, action attribute information, and context attribute information of each component of the first application; and an executing unit, configured to perform, on the first application, an operation corresponding to the action instruction obtained by the matching unit.
  • the voice attribute information of a component is the text content corresponding to the voice instruction that triggers the component; the action attribute information of the component is the operation performed after the component is triggered; the context attribute information of the component is the running state in which the voice instruction of the component takes effect, and the running state includes a global state, an application state, or a page state.
  • the apparatus may further include an obtaining unit that obtains the current first running state of the terminal after the receiving unit receives the voice instruction. the matching unit may include: a first instruction identifying subunit, configured to recognize, by a voice engine, the first text content corresponding to the voice instruction; a first information matching subunit, configured to match the first running state obtained by the obtaining unit and the first text content recognized by the first instruction identifying subunit with the voice UI resource, to obtain the first context attribute information and the first voice attribute information corresponding to the first running state and the first text content; and a first instruction obtaining subunit, configured to obtain the first action attribute information corresponding to the first context attribute information and the first voice attribute information obtained by the first information matching subunit, and to use the operation corresponding to the first action attribute information as the action instruction corresponding to the voice instruction;
  • alternatively, the matching unit may include a second instruction identifying subunit, configured to recognize, by the voice engine, the first text content corresponding to the voice instruction, with the text matched first and the running state checked afterwards.
  • the receiving unit is specifically configured to receive a voice instruction from the user to open the first application, or to receive a voice instruction from the user for a further operation on the first application on the page displayed after the first application is opened.
  • the apparatus may further include an output unit, configured to output options for at least two applications when the user's application-opening voice instruction corresponds to the at least two applications; the receiving unit is then configured to receive the user's voice instruction for the first application selected according to the options, or to receive the application-opening voice command, where the first application is the application with the highest preset priority among the at least two applications corresponding to the voice command.
  • in a third aspect, a terminal includes a microphone, a memory, and a processor, where the memory is used to store a voice engine;
  • the microphone is configured to receive the user's voice instructions;
  • the processor is configured to: after the microphone receives a voice instruction for a first application, match the voice instruction with the voice user interface (UI) resource of the first application to obtain an action instruction corresponding to the voice instruction, where the voice UI resource of the first application includes voice attribute information, action attribute information, and context attribute information of each component of the first application, and perform, on the first application, the operation corresponding to the action instruction.
  • the voice attribute information of a component is the text content corresponding to the voice instruction that triggers the component, and the action attribute information of the component is the operation performed after the component is triggered;
  • the context attribute information of the component is the running state in which the voice instruction of the component takes effect, and the running state includes a global state, an application state, or a page state.
  • the processor is further configured to obtain a current first running state of the terminal
  • the processor is specifically configured to: recognize, by the voice engine, the first text content corresponding to the voice instruction; match the first running state and the first text content with the voice UI resource to obtain the first context attribute information and the first voice attribute information corresponding to the first running state and the first text content; obtain the first action attribute information corresponding to the first context attribute information and the first voice attribute information; and use the operation corresponding to the first action attribute information as the action instruction corresponding to the voice instruction; or: recognize, by the voice engine, the first text content corresponding to the voice instruction; match the first text content with the voice UI resource to obtain the first voice attribute information and the first context attribute information corresponding to the first text content; and, when the first running state is consistent with the first context attribute information, obtain the first action attribute information corresponding to the first voice attribute information and use the corresponding operation as the action instruction.
  • the processor is further configured to, when the voice instruction received by the microphone corresponds to at least two applications, output options for the at least two applications;
  • the microphone is specifically configured to receive the user's voice command for the first application, where the first application is the application with the highest preset priority among the at least two applications corresponding to the voice command.
  • in the embodiments of the present invention, when a voice instruction for a first application is received, the voice instruction is matched with the voice UI resource of the first application to obtain the action instruction corresponding to the voice instruction, where the voice UI resource of the first application includes the voice attribute information, the action attribute information, and the context attribute information of each component of the first application, and the operation corresponding to the action instruction is performed on the first application.
  • the embodiments of the invention expand the processing capability of the voice assistant framework in the terminal: because voice attribute information, action attribute information, and context attribute information are added for the different components of each application, the terminal can obtain an application's voice UI resource after parsing the application, and the corresponding action instruction can be obtained by matching against that resource. Voice operation can thus be applied to various third-party applications, satisfying the user's need to use voice interaction with any newly installed application at any time and improving the terminal user's experience.
  • FIG. 1 is a flow chart of one embodiment of a voice control method according to the present invention
  • FIG. 2 is a flow chart of another embodiment of a voice control method according to the present invention
  • FIG. 3 is a block diagram of an embodiment of a voice control device according to the present invention
  • FIG. 4 is a block diagram of another embodiment of a voice control device according to the present invention
  • FIG. 5 is a block diagram of another embodiment of a voice control device of the present invention
  • FIG. 6 is a block diagram of an embodiment of a terminal of the present invention.
  • Step 101 Receive a voice instruction of a user to a first application.
  • the terminal can obtain the voice command issued by the user through its built-in microphone.
  • the voice instruction for the first application may include a voice instruction to open the first application; for example, when the first application is a mail application, the user's voice instruction for the first application may be a voice instruction to open the mail application; or
  • the voice instruction for the first application may include a voice instruction for a further operation on the first application on a page displayed after the first application is opened; for example, when the first application is a mail application, the voice instruction may be an instruction to forward a mail, or to reply to a mail, on the mail viewing page after the mail application is opened.
  • when the user's application-opening voice command corresponds to at least two applications, the terminal may output options for them through the display interface and receive the user's voice instruction for the first application selected from the at least two applications.
  • for example, the terminal can output the options "short message application" and "everyday chat application", and the user can select one of them as the first application; if the user selects "everyday chat application", a voice command selecting the everyday chat application can be issued.
  • alternatively, the terminal may select, according to the preset priorities of the at least two applications, the application with the highest priority as the first application.
  • for example, the user's application-opening voice command is "send text message", and both the built-in short message application and an installed everyday chat application can send text messages; if the priority of the "short message application" is higher than that of the "everyday chat application", the terminal takes the "short message application" as the first application according to the priorities.
  • Step 102 Match the voice instruction with the voice UI resource of the first application to obtain the action instruction corresponding to the voice instruction, where the voice UI resource of the first application includes the voice attribute information, action attribute information, and context attribute information of each component of the first application.
  • each application may pre-define a voice user interface (UI) resource, where the voice UI resource may include voice attribute information of each component in the application.
  • the components of the application may include a LayOut component that launches the application, and various controls used after the application is launched, for example, buttons and check boxes.
  • the voice attribute information of a component is the text content corresponding to the voice instruction that triggers the component; the action attribute information of the component is the operation performed after the component is triggered; and the context attribute information of the component is the running state in which the voice instruction of the component takes effect, the running state including a global state, an application state, or a page state.
  • the global state means that a voice command for the component takes effect when received by the terminal in any running state;
  • the application state means that a voice command for the component takes effect only while the terminal is running the application to which the component belongs;
  • the page state means that a voice command for the component takes effect only while the terminal is currently displaying the page of the application that contains the component.
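The per-component attribute model and the three context states above can be sketched as a small data structure. This is only an illustration under assumed names; the patent does not prescribe classes, and the `effective_in` helper is an invented simplification of the state check.

```python
# Sketch of a component's voice UI attributes as described in the text.
# The attribute names VoiceCommandText/Action/Context come from the patent;
# the classes and the effective_in() helper are assumptions.
from dataclasses import dataclass
from enum import Enum

class Context(Enum):
    GLOBAL = "global"            # command takes effect in any running state
    APPLICATION = "application"  # only while the owning application is running
    PAGE = "page"                # only while the component's page is displayed

@dataclass
class VoiceComponent:
    name: str                       # component name, e.g. "btn_next_page"
    voice_command_text: str         # VoiceCommandText: triggering text content
    voice_command_action: str       # VoiceCommandAction: operation when triggered
    voice_command_context: Context  # VoiceCommandContext: effective scope

    def effective_in(self, running_state: Context) -> bool:
        # A global command always takes effect; otherwise the terminal's
        # current running state must match the component's context scope.
        return (self.voice_command_context is Context.GLOBAL
                or self.voice_command_context is running_state)
```

For example, a globally scoped LayOut component is effective even in the page state, while a page-scoped button is not effective in the application state.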
  • the terminal may parse the voice UI resource of each application to obtain voice attribute information, action attribute information, and context attribute information of different components in the application. It should be noted that, in the embodiment of the present invention, the terminal may parse the voice UI resource of the application when installing an application, or may parse the voice UI resource of the application when the application is used for the first time.
  • the embodiment of the present invention is not limited thereto.
  • the terminal may save, to the voice engine, the correspondence between the voice attribute information, action attribute information, and context attribute information of each component of the parsed application and the component name of each component; for a given component type there can be multiple different component instances, and different instances of the same component type are distinguished by component name, that is, the component name of each component instance corresponds to that component's voice attribute information (VoiceCommandText), action attribute information (VoiceCommandAction), and context attribute information (VoiceCommandContext).
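The correspondence table described above can be sketched as a simple mapping keyed by component name. The dictionary layout, the application and component names, and the attribute values are all assumptions; the patent does not fix a storage format.

```python
# Sketch of the correspondence saved to the voice engine: each component name
# maps to that component's VoiceCommandText, VoiceCommandAction, and
# VoiceCommandContext. Layout and values are invented for illustration.
voice_engine_table = {}

def save_component(app, component_name, text, action, context):
    # Instances of the same component type (e.g. a "Next Page" button and a
    # "Previous Page" button) are distinguished by their component names.
    voice_engine_table[(app, component_name)] = {
        "VoiceCommandText": text,
        "VoiceCommandAction": action,
        "VoiceCommandContext": context,
    }

save_component("email", "layout_main", "open email application", "open", "global")
save_component("email", "btn_next_page", "next page", "scroll_next", "page")
```

Looking up a component name then yields the full attribute triple needed for matching.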
  • after the terminal receives the voice command, the first running state of the terminal can be obtained, and the first text content corresponding to the voice instruction is recognized by the voice engine. The first running state and the first text content are then matched with the voice UI resource of the first application to obtain the first context attribute information and the first voice attribute information, and the first action attribute information corresponding to them; the operation corresponding to the first action attribute information is the action instruction corresponding to the voice command issued by the user. Alternatively, the first text content may be matched with the voice UI resource of the first application to obtain the first voice attribute information and the first context attribute information corresponding to the first text content; when the first running state is consistent with the first context attribute information, the first action attribute information corresponding to the first context attribute information and the first voice attribute information is obtained, and the operation corresponding to the first action attribute information is used as the action instruction corresponding to the voice instruction.
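The two matching orders described above can be sketched over a tiny in-memory voice UI resource. All entry contents and state names here are invented for illustration; a real resource would come from parsing the application.

```python
# Sketch of the two matching orders over an assumed voice UI resource.
RESOURCE = [
    # (VoiceCommandText, VoiceCommandContext, VoiceCommandAction)
    ("open email application", "global", "open_email"),
    ("next page", "page", "scroll_next"),
]

def match_state_and_text(running_state, text):
    # Order 1: match the running state and the recognized text against the
    # resource together, then take the matched entry's action attribute.
    for cmd_text, context, action in RESOURCE:
        if cmd_text == text and context in ("global", running_state):
            return action
    return None

def match_text_then_state(running_state, text):
    # Order 2: match the text first, then check that the running state is
    # consistent with the matched context attribute before taking the action.
    for cmd_text, context, action in RESOURCE:
        if cmd_text == text:
            return action if context in ("global", running_state) else None
    return None
```

Both orders yield the same action instruction; they differ only in whether the running state is used during matching or checked afterwards.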
  • Step 103 Perform an operation corresponding to the action instruction on the first application.
  • this embodiment expands the processing capability of the voice assistant framework in the terminal: voice attribute information, action attribute information, and context attribute information are added for the different components of each application, so that the terminal can obtain an application's voice UI resource after parsing the application and, when receiving a voice command for the application, match it against the voice UI resource to obtain the corresponding action instruction. Voice operation can thus be applied to various third-party applications, satisfying the user's need to use voice interaction with any newly installed application at any time and improving the terminal user's experience.
  • Step 201 A terminal obtains a voice UI resource of an application.
  • each application may pre-define a voice user interface (UI) resource, where the voice UI resource may include voice attribute information of each component in the application.
  • the components of the application may include a LayOut component that launches the application, and various controls used after the application is launched, for example, buttons and check boxes.
  • the voice attribute information of a component is the text content corresponding to the voice instruction that triggers the component; the action attribute information of the component is the operation performed after the component is triggered; and the context attribute information of the component is the running state in which the voice instruction of the component takes effect, the running state including a global state, an application state, or a page state.
  • the global state means that a voice command for the component takes effect when received by the terminal in any running state;
  • the application state means that a voice command for the component takes effect only while the terminal is running the application to which the component belongs;
  • the page state means that a voice command for the component takes effect only while the terminal is currently displaying the page of the application that contains the component.
  • the terminal may parse the voice UI resource of the application to obtain the voice attribute information, action attribute information, and context attribute information of the different components in the application, and save the correspondence between these attributes of each component and the component name of each component to the voice engine; for a given component type there can be multiple different component instances, and different instances of the same component type are distinguished by component name, that is, the component name of each component instance corresponds to that component's voice attribute information (VoiceCommandText), action attribute information (VoiceCommandAction), and context attribute information (VoiceCommandContext). For example, button components may be divided into a "Next Page" button, a "Previous Page" button, and so on.
  • for example, for the LayOut component of an email application, its voice attribute information and action attribute information can be defined accordingly, and its context attribute information VoiceCommandContext can be defined as "global", that is, a voice command to open the email application takes effect when received by the terminal in any running state.
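A hypothetical declarative definition for the email application's LayOut component, mirroring the three attribute names above, might look as follows. Only the "global" context value comes from the text; the other attribute values are invented for illustration.

```python
# Invented example of one component's voice UI attributes; only the
# "global" VoiceCommandContext value is taken from the description above.
email_layout = {
    "component": "LayOut",
    "VoiceCommandText": "open email application",  # text that triggers it
    "VoiceCommandAction": "open",                  # operation when triggered
    "VoiceCommandContext": "global",               # effective in any running state
}
```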
  • Step 202 The terminal receives a voice instruction of the user to the first application.
  • the terminal can obtain the voice command issued by the user through its built-in microphone.
  • the voice instruction for the first application may include a voice instruction to open the first application; for example, when the first application is a mail application, the user's voice instruction for the first application may be a voice instruction to open the mail application;
  • the voice instruction for the first application may also include a voice instruction for a further operation on the first application on a page displayed after the first application is opened, for example, a voice instruction to forward a mail, or to reply to a mail, on the mail viewing page after the mail application is opened.
  • Step 203 Obtain a current first running state of the terminal.
  • in this embodiment, the first application is an email application. Assume that the user's voice command to open the email application is received in step 202; the voice command may be specifically "open email" or "initiate email".
  • Step 204 Identify, by the voice engine, the first text content corresponding to the voice instruction.
  • the voice engine recognizes the voice instruction using a semantic recognition mode, which performs fuzzy recognition on the voice instruction; for example, whether the voice command issued by the user is "open email" or "initiate email", semantic analysis can determine that the first text content corresponding to the voice instruction is "open email application".
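The fuzzy recognition step above can be sketched as a normalization that collapses different phrasings of the same intent into a single first-text-content string. A real voice engine uses semantic analysis; the verb-synonym table here is only an invented illustration.

```python
# Invented sketch of fuzzy semantic recognition: "open email" and
# "initiate email" both normalize to the first text content
# "open email application". The synonym table is an assumption.
SYNONYMS = {"open": "open", "initiate": "open", "start": "open", "launch": "open"}

def first_text_content(utterance: str) -> str:
    words = utterance.lower().split()
    verb = SYNONYMS.get(words[0], words[0])  # collapse synonymous verbs
    target = " ".join(words[1:])
    if not target.endswith("application"):
        target += " application"
    return f"{verb} {target}"
```

Both example phrasings from the text collapse to the same first text content.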
  • Step 205 Match the first running state and the first text content with the voice UI resource of the first application, and obtain first context attribute information and first voice attribute information corresponding to the first running state and the first text content.
  • the voice engine saves the correspondence between the voice attribute information, action attribute information, and context attribute information of each component of the application and the component name of each component; so in this step, after the first running state and the first text content are obtained, they can be matched against this correspondence to obtain the first context attribute information and the first voice attribute information corresponding to the first running state and the first text content.
  • Step 206 Obtain the first action attribute information corresponding to the first context attribute information and the first voice attribute information, and use the operation corresponding to the first action attribute information as the action instruction corresponding to the voice instruction.
  • in this embodiment, the voice engine can determine that the matched action attribute information under the corresponding component is "Open", that is, the action instruction corresponding to the user's voice command "open email" or "initiate email" is the operation triggered by "Open", namely opening the mail application.
  • alternatively, the first text content may first be matched with the voice UI resource of the first application to obtain the first voice attribute information and the first context attribute information corresponding to the first text content; when the first running state is consistent with the first context attribute information, the first action attribute information corresponding to the first context attribute information and the first voice attribute information is obtained, and the operation corresponding to the first action attribute information is the action instruction corresponding to the voice instruction.
  • the embodiment of the present invention does not limit the matching manner of the voice UI resources.
  • Step 207 Perform an operation corresponding to the action instruction on the first application.
  • this embodiment expands the processing capability of the voice assistant framework in the terminal: voice attribute information, action attribute information, and context attribute information are added for the different components of each application, so that when parsing an application the terminal can obtain its voice UI resource, and the corresponding action instruction can be obtained by matching against that resource. Voice operation is thereby enabled for various third-party applications, satisfying the user's need to use voice interaction with any newly installed application at any time and improving the terminal user's experience.
  • corresponding to the foregoing method embodiments, the present invention also provides embodiments of a voice control apparatus and a terminal. Referring to FIG. 3, it is a block diagram of an embodiment of a voice control apparatus according to the present invention.
  • the apparatus includes: a receiving unit 310, a matching unit 320, and an executing unit 330.
  • the receiving unit 310 is configured to receive the user's voice instruction for the first application;
  • the matching unit 320 is configured to match the voice command received by the receiving unit 310 with the voice UI resource of the first application to obtain the action instruction corresponding to the voice instruction, where the voice UI resource of the first application includes the voice attribute information, action attribute information, and context attribute information of each component of the first application;
  • the executing unit 330 is configured to perform, on the first application, the operation corresponding to the action instruction obtained by the matching unit 320.
  • the receiving unit 310 may be specifically configured to receive a voice instruction from the user to open the first application, or to receive a voice instruction from the user for a further operation on the first application after the first application is opened.
  • the voice attribute information of a component is the text content corresponding to the voice instruction that triggers the component; the action attribute information of the component is the operation performed after the component is triggered; the context attribute information of the component is the running state in which the voice instruction of the component takes effect, the running state including a global state, an application state, or a page state.
  • the receiving unit 310 may be configured to receive the user's application-opening voice command for the first application, where the first application is the application with the highest preset priority among the at least two applications corresponding to the application-opening voice command.
  • referring to FIG. 4, it is a block diagram of another embodiment of a voice control apparatus according to the present invention.
  • the apparatus includes: a parsing unit 410, a saving unit 420, a receiving unit 430, an obtaining unit 440, a matching unit 450, and an executing unit 460.
  • the parsing unit 410 is configured to obtain the voice attribute information, the action attribute information, and the context attribute information of the different components of the first application by parsing the first application.
  • the saving unit 420 is configured to save the voice UI resource of the first application to the voice engine, where the voice UI resource includes the voice attribute information, action attribute information, and context attribute information of each component of the first application obtained by the parsing unit 410.
  • the receiving unit 430 is configured to receive the user's voice instruction for the first application; the obtaining unit 440 is configured to obtain the current first running state of the terminal after the receiving unit 430 receives the voice instruction; the matching unit 450 is configured to match the first running state obtained by the obtaining unit 440 and the voice command received by the receiving unit 430 with the voice UI resource of the first application saved by the saving unit 420, to obtain the action instruction corresponding to the voice command; and the executing unit 460 is configured to perform, on the first application, the operation corresponding to the action instruction obtained by the matching unit 450.
  • the matching unit 450 may include (not shown in FIG. 4): a first instruction identification subunit, configured to identify, by using a voice engine, the first text content corresponding to the voice instruction; a first information matching subunit, configured to match the first running state and the first text content identified by the first instruction identification subunit against the voice UI resource of the first application saved by the saving unit, to obtain the first context attribute information and the first voice attribute information corresponding to the first running state and the first text content; and a first instruction obtaining subunit, configured to obtain the first action attribute information corresponding to the first context attribute information and the first voice attribute information obtained by the first information matching subunit, and use the operation corresponding to the first action attribute information as the action instruction corresponding to the voice instruction.
  • alternatively, the matching unit 450 may include (not shown in FIG. 4): a second instruction identification subunit, configured to identify, by using a voice engine, the first text content corresponding to the voice instruction; a second information matching subunit, configured to match the first text content identified by the second instruction identification subunit against the voice UI resource of the first application saved by the saving unit, to obtain the first voice attribute information and the first context attribute information corresponding to the first text content; and a second instruction obtaining subunit, configured to, when the first running state is consistent with the first context attribute information obtained by the second information matching subunit, obtain the first action attribute information corresponding to the first context attribute information and the first voice attribute information, and use the operation corresponding to the first action attribute information as the action instruction corresponding to the voice instruction.
  • the receiving unit 430 may be specifically configured to receive a voice instruction from the user to open the first application, or to receive a voice instruction for a further operation performed by the user on the first application on the page displayed after the first application is opened.
  • the voice attribute information of a component is the text content corresponding to the voice instruction that triggers the component;
  • the action attribute information of the component is an operation performed after the component is triggered;
  • the context attribute information of the component is the running state when the voice instruction of the component is in effect; the running state includes a global state, an application state, or a page state.
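The three per-component attributes described above can be modeled as a small data structure. The sketch below is illustrative only (the class, field, and example values are not taken from the patent); it simply assumes, as the text states, that a component carries the trigger text (voice attribute), the operation performed when triggered (action attribute), and the running state in which its voice instruction is valid (context attribute: global, application, or page state).

```python
from dataclasses import dataclass
from enum import Enum

class RunState(Enum):
    GLOBAL = "global"            # instruction valid anywhere on the terminal
    APPLICATION = "application"  # valid while the application is running
    PAGE = "page"                # valid only on a specific page

@dataclass(frozen=True)
class VoiceComponent:
    voice_attr: str         # text content that triggers the component
    action_attr: str        # operation performed after the component is triggered
    context_attr: RunState  # running state in which the instruction is in effect

# A voice UI resource is the collection of such components for one application.
voice_ui_resource = [
    VoiceComponent("open settings", "launch_settings_page", RunState.APPLICATION),
    VoiceComponent("play", "start_playback", RunState.PAGE),
]
```

The enumeration mirrors the three running states named in the text; nothing beyond that structure is implied.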
  • FIG. 5 is a block diagram of another embodiment of a voice control apparatus according to the present invention.
  • the apparatus includes: a parsing unit 510, a saving unit 520, an output unit 530, a receiving unit 540, a matching unit 550, and an executing unit 560.
  • the parsing unit 510 is configured to obtain the voice attribute information, the action attribute information, and the context attribute information of the different components of the first application by parsing the first application.
  • the saving unit 520 is configured to save the voice UI resource of the first application to the voice engine, where the voice UI resource includes the voice attribute information, the action attribute information, and the context attribute information of each component of the first application obtained by the parsing unit 510.
  • the output unit 530 is configured to output options for the at least two applications when an application-open voice instruction received from the user corresponds to at least two applications; the receiving unit 540 is configured to receive a voice instruction for the first application selected by the user from the at least two applications according to the options output by the output unit 530; the matching unit 550 is configured to match the voice instruction received by the receiving unit 540 against the voice UI resource of the first application saved by the saving unit 520, to obtain an action instruction corresponding to the voice instruction; the executing unit 560 is configured to perform, on the first application, the operation corresponding to the action instruction obtained by the matching unit 550.
  • the receiving unit 540 may be specifically configured to receive a voice instruction from the user to open the first application, or to receive a voice instruction for a further operation performed by the user on the first application on the page displayed after the first application is opened.
  • the voice attribute information of a component is the text content corresponding to the voice instruction that triggers the component; the action attribute information of the component is an operation performed after the component is triggered; the context attribute information of the component is the running state when the voice instruction of the component is in effect; the running state includes a global state, an application state, or a page state.
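The FIG. 5 embodiment handles the case where one application-open voice instruction corresponds to several installed applications: the output unit presents options and the user's follow-up instruction selects one. A minimal sketch of that branching, with hypothetical function and application names (the patent does not prescribe any particular matching rule between the spoken name and installed applications):

```python
# Hypothetical sketch of the Fig. 5 behaviour: when an application-open voice
# instruction corresponds to at least two applications, options are output for
# the user to choose from; an unambiguous match is opened directly.

def handle_open_instruction(spoken_name, installed_apps):
    """Return ("open", app), ("options", apps) or ("none", None)."""
    matches = [app for app in installed_apps if spoken_name in app.lower()]
    if len(matches) >= 2:
        return ("options", matches)   # output unit presents these choices
    if matches:
        return ("open", matches[0])   # unambiguous: open it directly
    return ("none", None)

apps = ["Music Player A", "Music Player B", "Notes"]
```

The substring test is only a stand-in for whatever recognition-to-application mapping the voice engine actually performs.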
  • FIG. 6 is a block diagram of an embodiment of a terminal according to the present invention; the terminal includes a microphone 610, a memory 620, and a processor 630.
  • the memory 620 is configured to store a voice engine, and the microphone 610 is configured to receive a voice instruction from the user.
  • the processor 630 is configured to: after the microphone 610 receives a voice instruction from the user for the first application, match the voice instruction against the voice user interface (UI) resource of the first application to obtain an action instruction corresponding to the voice instruction, where the voice UI resource of the first application includes the voice attribute information, the action attribute information, and the context attribute information of each component of the first application; and perform, on the first application, the operation corresponding to the action instruction.
  • the microphone 610 may be specifically configured to receive a voice instruction from the user to open the first application, or to receive a voice instruction for a further operation performed by the user on the first application on the page displayed after the first application is opened.
  • the processor 630 may be further configured to obtain the current first running state of the terminal. in that case, the processor 630 may be specifically configured to: identify, by using the voice engine, the first text content corresponding to the voice instruction; match the first running state and the first text content against the voice UI resource to obtain the first context attribute information and the first voice attribute information corresponding to the first running state and the first text content; obtain the first action attribute information corresponding to the first context attribute information and the first voice attribute information; and use the operation corresponding to the first action attribute information as the action instruction corresponding to the voice instruction. alternatively, the processor 630 may be specifically configured to: identify, by using the voice engine, the first text content corresponding to the voice instruction; match the first text content against the voice UI resource to obtain the first voice attribute information and the first context attribute information corresponding to the first text content; and, when the first running state is consistent with the first context attribute information, obtain the first action attribute information corresponding to the first voice attribute information and use the operation corresponding to the first action attribute information as the action instruction corresponding to the voice instruction.
  • the processor 630 may be further configured to output options for the at least two applications when an application-open voice instruction received by the microphone from the user corresponds to at least two applications;
  • the microphone 610 may be specifically configured to receive a voice instruction for the first application selected by the user from the at least two applications according to the options.
  • alternatively, the microphone 610 may be specifically configured to receive an application-open voice instruction for the first application, where the first application is the application with the highest preset priority among the at least two applications corresponding to that application-open voice instruction.
  • the voice attribute information of the component is the text content corresponding to the voice instruction that triggers the component;
  • the action attribute information of the component is an operation performed after the component is triggered;
  • the context attribute information of the component is the running state when the voice instruction of the component is in effect; the running state includes a global state, an application state, or a page state.
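The processor description above names two matching orders: (a) match the running state and the recognized text against the voice UI resource together, and (b) match the text first and then check that the current running state is consistent with the matched component's context attribute. The sketch below is a hypothetical rendering of those two strategies (the tuple layout, names, and the "first text match wins" rule in strategy (b) are assumptions, not taken from the patent):

```python
# Illustrative sketch of the two matching strategies. A "component" is a
# (voice_attr, action_attr, context_attr) triple; speech-to-text is assumed
# to have been done already by the voice engine.

def match_state_and_text(components, running_state, text):
    """Strategy (a): match running state and text against the resource together."""
    for voice_attr, action_attr, context_attr in components:
        if voice_attr == text and context_attr == running_state:
            return action_attr
    return None

def match_text_then_state(components, running_state, text):
    """Strategy (b): match text first, then check state consistency
    (assumption: the first component whose text matches is the one checked)."""
    for voice_attr, action_attr, context_attr in components:
        if voice_attr == text:
            return action_attr if context_attr == running_state else None
    return None

components = [
    ("play", "start_playback", "page"),
    ("play", "open_player", "global"),
]
```

With duplicate trigger texts in different states, the two orders can disagree, which is why the ordering of the state check matters.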
  • in the embodiments of the present invention, when a voice instruction from the user for the first application is received, the voice instruction is matched against the voice UI resource of the first application to obtain the action instruction corresponding to the voice instruction, where the voice UI resource of the first application includes the voice attribute information, the action attribute information, and the context attribute information of each component of the first application, and the operation corresponding to the action instruction is performed on the first application.
  • the embodiments of the present invention extend the processing capability of the voice assistant framework in the terminal. because the voice attribute information, the action attribute information, and the context attribute information of the different components are added in each application, the terminal can obtain the voice UI resource of an application after parsing that application, and can obtain the corresponding action instruction by matching a received voice instruction against the voice UI resource of the application. this makes various third-party applications operable by voice, satisfies the user's need to install an application at any time and interact with it by voice at any time, and improves the terminal user experience.
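When an application-open instruction corresponds to several applications, one alternative described above resolves the ambiguity without prompting: the application with the highest preset priority is opened. A minimal sketch of that route, with hypothetical names and priority values (the patent does not specify how priorities are represented or configured):

```python
# Hypothetical sketch: resolve an ambiguous application-open instruction by
# preset priority instead of presenting options to the user.

def pick_by_priority(candidates, priorities):
    """Return the candidate application with the highest preset priority.
    `priorities` maps application name -> priority number (higher wins;
    applications without a configured priority default to 0)."""
    return max(candidates, key=lambda app: priorities.get(app, 0))

preset_priorities = {"Music Player A": 10, "Music Player B": 5}
chosen = pick_by_priority(["Music Player A", "Music Player B"], preset_priorities)
```

This is the counterpart of the options-based resolution: the same ambiguity, settled by configuration rather than by a further user choice.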
  • the techniques in the embodiments of the present invention can be implemented by means of software plus a necessary general hardware platform. based on such an understanding, the technical solutions in the embodiments of the present invention may, in essence or in the part contributing to the prior art, be embodied in the form of a software product. the software product may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, or an optical disc, and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to perform the methods described in the embodiments of the present invention or in portions of the embodiments.
  • the embodiments in this specification are described in a progressive manner; for the same or similar parts of the embodiments, reference may be made to each other, and each embodiment focuses on its differences from the other embodiments. in particular, because the apparatus embodiments are basically similar to the method embodiments, they are described relatively simply, and for relevant parts reference may be made to the description of the method embodiments.
  • the embodiments of the present invention described above are not intended to limit the scope of the present invention. any modifications, equivalent substitutions, and improvements made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

Disclosed are a language control method, device, and terminal. The method comprises: receiving a voice instruction from a user for a first application; matching the voice instruction against a voice UI resource of the first application to obtain an action instruction corresponding to the voice instruction, the voice UI resource of the first application including voice attribute information, action attribute information, and context attribute information for each component of the first application; and performing, on the first application, an operation corresponding to the action instruction. The embodiments of the present invention extend the processing capability of a voice assistant framework in a terminal and enable various third-party applications to be operated by voice, thereby meeting the user's need to install an application at any time and interact by voice at any time, and improving the terminal user experience.

Description

Language control method, device and terminal

This application claims priority to Chinese Patent Application No. 201310375572.3, filed with the Chinese Patent Office on August 26, 2013 and entitled "Language control method, device and terminal", the entire contents of which are incorporated herein by reference.
Technical Field

The present invention relates to the field of communications technologies, and in particular, to a voice control method, apparatus, and terminal.
Background

Intelligent terminals typically use a graphical user interface (GUI) to output information to the end user. A GUI is a computer user interface presented graphically. Under the existing GUI architecture, when an application is launched, the graphical output of the application is presented on the screen, and the intent is conveyed to the user visually, including the text, colors, components, and region divisions shown in the graphics; the user learns visually which operations can be performed on the graphical interface and performs the corresponding operations by inputting touch gestures on the screen. As automatic speech recognition technology matures, the operations originally performed by gesture input can be completed by inputting voice commands, simplifying the user's use of the GUI. The voice assistant provided on existing intelligent terminals can perform voice operations on the applications built into the terminal, including telephone, messaging, search, calendar, and alarm-clock applications. The voice assistant presets the commands that each application can receive; after the user opens an application, the user enters the dialogue scenario of that application and inputs voice instructions through the dialogue to complete the desired operations.
However, during research into the prior art, the inventors found that the voice assistant framework in existing intelligent terminals serves only the terminal's built-in applications; after various third-party applications are downloaded to the terminal, the voice assistant in the terminal cannot be used to operate them by voice. It can be seen that the existing voice assistant framework has a limited degree of openness and cannot meet the user's need to install an application at any time and interact with it by voice at any time, resulting in a poor user experience.

Summary of the Invention
本发明实施例中提供了语音控制方法、 装置及终端, 以解决现有技术无法 随时安装应用随时使用语音交互, 从而导致智能终端的用户体验不高的问题。 为了解决上述技术问题, 本发明实施例公开了如下技术方案: 第一方面, 提供一种语音控制方法, 所述方法包括: 接收用户对第一应用的语音指令; 将所述语音指令与所述第一应用的语音用户接口 UI资源进行匹配,获得与 所述语音指令对应的动作指令, 所述第一应用的语音 UI资源包含所述第一应 用的每个组件的语音属性信息、 动作属性信息和上下文属性信息; 对所述第一应用执行与所述动作指令对应的操作。 结合第一方面, 在第一方面的第一种可能的实现方式中, 所述组件的语音属性信息为触发所述组件的语音指令对应的文本内容; 所述组件的动作属性信息为触发所述组件后执行的操作; 所述组件的上下文属性信息为所述组件的语音指令生效时的运行状态, 所 述运行状态包括全局状态、 应用状态或页面状态。 结合第一方面, 或第一方面的第一种可能的实现方式, 在第一方面的第二 种可能的实现方式中, 所述接收用户对第一应用的语音指令后, 所述方法还包 括: 获得所述终端当前的第一运行状态; 所述将所述语音指令与所述第一应用的语音 UI资源进行匹配,获得与所述 语音指令对应的动作指令, 包括: 通过语音引擎识别所述语音指令对应的第一文本内容; 将所述第一运行状 态和所述第一文本内容与所述语音 UI资源进行匹配, 获得与所述第一运行状 态和所述第一文本内容对应的第一上下文属性信息和第一语音属性信息;获得 与所述第一上下文属性信息和第一语音属性信息所对应的第一动作属性信息, 将所述第一动作属性信息对应的操作作为与所述语音指令对应的动作指令;或 者, 通过语音引擎识别所述语音指令对应的第一文本内容; 将所述第一文本内 容与所述语音 UI资源进行匹配, 获得与所述第一文本内容对应的第一语音属 性信息和第一上下文属性信息;当所述第一运行状态与所述第一上下文属性信 息一致时, 获得与所述第一语音属性信息所对应的第一动作属性信息, 将所述 第一动作属性信息对应的操作作为与所述语音指令对应的动作指令。 结合第一方面, 或第一方面的第一种可能的实现方式, 或第一方面的第二 种可能的实现方式,在第一方面的第三种可能的实现方式中, 所述接收用户对 第一应用的语音指令, 包括: 接收用户开启第一应用的语音指令; 或者, 接收用户在第一应用开启后的页面上对第一应用进行的进一步操作的语音 指令。 结合第一方面, 或第一方面的第一种可能的实现方式, 或第一方面的第二 种可能的实现方式, 或第一方面的第三种可能的实现方式,在第一方面的第四 种可能的实现方式中, 所述接收用户对第一应用的语音指令前, 所述方法还包括: 当接收到用户 的应用开启语音指令对应至少两个应用时,输出所述至少两个应用的选项; 所 述接收用户对所述第一应用的语音指令包括:接收用户根据所述选项对从所述 至少两个应用中选择的第一应用的语音指令; 或者, 所述接收用户对第一应用的语音指令包括: 接收用户对第一应用的应用开 启语音指令,所述第一应用为所述应用开启语音指令对应的至少两个应用中预 设优先级最高的应用。 第二方面, 提供一种语音控制装置, 所述装置包括: 接收单元, 用于接收用户对第一应用的语音指令; 匹配单元, 用于将所述接收单元接收到的语音指令与所述第一应用的语音In the embodiment of the present invention, a voice control method, a device, and a terminal are provided to solve the problem that the prior art cannot install the application to use the voice interaction at any time, thereby causing the user experience of the smart terminal to be not high. 
In order to solve the above technical problem, the embodiment of the present invention discloses the following technical solution: In a first aspect, a voice control method is provided, where the method includes: receiving a voice instruction of a user to a first application; The voice user interface UI resource of the first application is matched, and the action instruction corresponding to the voice instruction is obtained, where the voice UI resource of the first application includes voice attribute information and action attribute information of each component of the first application. And context attribute information; performing an operation corresponding to the action instruction on the first application. With reference to the first aspect, in a first possible implementation manner of the first aspect, the voice attribute information of the component is a text content corresponding to a voice instruction that triggers the component, and the action attribute information of the component is triggered by the An operation performed after the component; the context attribute information of the component is an operation state when the voice instruction of the component is in effect, and the operation state includes a global state, an application state, or a page state. 
With the first aspect, or the first possible implementation manner of the first aspect, in the second possible implementation manner of the first aspect, after the receiving, by the user, the voice instruction of the first application, the method further includes Obtaining a current first running state of the terminal; the matching the voice command with the voice UI resource of the first application, and obtaining an action instruction corresponding to the voice command, including: identifying, by using a voice engine Corresponding to the first text content corresponding to the voice instruction; matching the first running state and the first text content with the voice UI resource to obtain a first operation state and the first text content a first context attribute information and first voice attribute information; obtaining first action attribute information corresponding to the first context attribute information and the first voice attribute information, and using the operation corresponding to the first action attribute information as a context Determining an action instruction corresponding to the voice instruction; or identifying, by the voice engine, the first text content corresponding to the voice instruction; The content is matched with the voice UI resource, and the first voice attribute information and the first context attribute information corresponding to the first text content are obtained; when the first running state is consistent with the first context attribute information Obtaining first action attribute information corresponding to the first voice attribute information, and performing an operation corresponding to the first action attribute information as an action instruction corresponding to the voice instruction. 
With reference to the first aspect, or the first possible implementation manner of the first aspect, or the second possible implementation manner of the first aspect, in a third possible implementation manner of the first aspect, The voice command of the first application includes: receiving a voice instruction that the user turns on the first application; or receiving a voice instruction that the user performs further operations on the first application on the page after the first application is opened. With reference to the first aspect, or the first possible implementation of the first aspect, or the second possible implementation of the first aspect, or the third possible implementation of the first aspect, in the first aspect In the four possible implementation manners, before the receiving the voice instruction of the user to the first application, the method further includes: outputting the at least two applications when the application opening voice command of the user receives the at least two applications The receiving the user's voice instruction to the first application includes: receiving, by the user, a voice instruction of the first application selected from the at least two applications according to the option; or The voice command of the application includes: receiving, by the user, an application for opening the voice command to the first application, where the first application is the application with the highest priority among the at least two applications corresponding to the voice instruction of the application. In a second aspect, a voice control apparatus is provided, where the apparatus includes: a receiving unit, configured to receive a voice instruction of a user to a first application; a matching unit, configured to: receive, by the receiving unit, a voice instruction and the first An application voice
UI 资源进行匹配, 获得与所述语音指令对应的动作指令, 所述第一应用的语 音 UI资源包含所述第一应用的每个组件的语音属性信息、 动作属性信息和上 下文属性信息; 执行单元, 用于对所述第一应用执行与所述匹配单元获得的动作指令对应 的操作。 结合第二方面, 在第二方面的第一种可能的实现方式中, 所述组件的语音属性信息为触发所述组件的语音指令对应的文本内容; 所述组件的动作属性信息为触发所述组件后执行的操作; 所述组件的上下文属性信息为所述组件的语音指令生效时的运行状态, 所 述运行状态包括全局状态、 应用状态或页面状态。 结合第二方面, 或第二方面的第一种可能的实现方式, 在第二方面的第二 种可能的实现方式中, 所述装置还包括: 获得单元, 用于所述接收单元接收到所述语音指令后, 获得所述终端当前 的第一运行状态; 所述匹配单元包括: 第一指令识别子单元, 用于通过语音引擎识别所述语音指令对应的第一文 本内容; 第一信息匹配子单元, 用于将所述获得单元获得的第一运行状态和所 述第一指令识别子单元识别出的第一文本内容与所述语音 UI资源进行匹配, 获得与所述第一运行状态和所述第一文本内容对应的第一上下文属性信息和 第一语音属性信息; 第一指令获得子单元, 用于获得与所述第一信息匹配子单 元获得的第一上下文属性信息和第一语音属性信息所对应的第一动作属性信 息,将所述第一动作属性信息对应的操作作为与所述语音指令对应的动作指令; 或者, 第二指令识别子单元, 用于通过语音引擎识别所述语音指令对应的第一文 本内容; 第二信息匹配子单元, 用于将所述第二指令识别子单元识别出的第一 文本内容与所述语音 UI资源进行匹配, 获得与所述第一文本内容对应的第一 语音属性信息和第一上下文属性信息; 第二指令获得子单元, 用于当所述第一 运行状态与所述第二信息匹配子单元获得的第一上下文属性信息一致时,获得 与所述第一上下文属性信息和第一语音属性信息所对应的第一动作属性信息, 将所述第一动作属性信息对应的操作作为与所述语音指令对应的动作指令。 结合第二方面, 或第二方面的第一种可能的实现方式, 或第二方面的第二 种可能的实现方式, 在第二方面的第三种可能的实现方式中, 所述接收单元, 具体用于接收用户开启第一应用的语音指令, 或者,接收用户在第一应用开启 后的页面上对第一应用进行的进一步操作的语音指令。 结合第二方面, 或第二方面的第一种可能的实现方式, 或第二方面的第二 种可能的实现方式, 或第二方面的第三种可能的实现方式,在第二方面的第四 种可能的实现方式中, 所述装置还包括: 输出单元, 用于当接收到用户的应用开启语音指令对应 至少两个应用时, 输出所述至少两个应用的选项; 所述接收单元, 具体用于接 用的语音指令; 或者, 所述接收单元, 具体用于接收用户对第一应用的应用开启语音指令, 所述 第一应用为所述应用开启语音指令对应的至少两个应用中预设优先级最高的 应用。 第三方面, 提供一种终端, 所述终端包括: 麦克风、 存储器和处理器, 其 中, 所述存储器, 用于存储语音引擎; 所述麦克风, 用于接收用户的语音指令; 所述处理器, 用于当所述麦克风接收用户对第一应用的语音指令后, 将所 述语音指令与所述第一应用的语音用户接口 UI资源进行匹配, 获得与所述语 音指令对应的动作指令, 所述第一应用的语音 UI资源包含所述第一应用的每 个组件的语音属性信息、动作属性信息和上下文属性信息, 并对所述第一应用 执行与所述动作指令对应的操作。 结合第三方面, 在第三方面的第一种可能的实现方式中, 所述组件的语音属性信息为触发所述组件的语音指令对应的文本内容; 所述组件的动作属性信息为触发所述组件后执行的操作; 所述组件的上下文属性信息为所述组件的语音指令生效时的运行状态, 所 述运行状态包括全局状态、 应用状态或页面状态。 结合第三方面, 或第三方面的第一种可能的实现方式, 在第三方面的第二 种可能的实现方式中,所述处理器,还用于获得所述终端当前的第一运行状态; 所述处理器, 具体用于通过语音引擎识别所述语音指令对应的第一文本内 容, 将所述第一运行状态和所述第一文本内容与所述语音 UI资源进行匹配, 获得与所述第一运行状态和所述第一文本内容对应的第一上下文属性信息和 第一语音属性信息,获得与所述第一上下文属性信息和第一语音属性信息所对 应的第一动作属性信息,将所述第一动作属性信息对应的操作作为与所述语音 指令对应的动作指令; 或者,通过语音引擎识别所述语音指令对应的第一文本 内容; 将所述第一文本内容与所述语音 UI资源进行匹配, 获得与所述第一文 本内容对应的第一语音属性信息和第一上下文属性信息;当所述第一运行状态 
与所述第一上下文属性信息一致时,获得与所述第一语音属性信息所对应的第 一动作属性信息,将所述第一动作属性信息对应的操作作为与所述语音指令对 应的动作指令。 结合第三方面, 或第三方面的第一种可能的实现方式, 或第三方面的第二 种可能的实现方式, 在第三方面的第三种可能的实现方式中, 所述麦克风, 具 体用于接收用户开启第一应用的语音指令, 或者,接收用户在第一应用开启后 的页面上对第一应用进行的进一步操作的语音指令。 结合第三方面, 或第三方面的第一种可能的实现方式, 或第三方面的第二 种可能的实现方式, 或第三方面的第三种可能的实现方式,在第三方面的第四 种可能的实现方式中, 所述处理器, 还用于当通过所述麦克风接收到用户的应用开启语音指令对 应至少两个应用时, 输出所述至少两个应用的选项; 所述麦克风, 具体用于接 The UI resource is matched to obtain an action instruction corresponding to the voice instruction, where the voice UI resource of the first application includes voice attribute information, action attribute information, and context attribute information of each component of the first application; And performing, for the first application, an operation corresponding to the action instruction obtained by the matching unit. In conjunction with the second aspect, in a first possible implementation of the second aspect, The voice attribute information of the component is the text content corresponding to the voice instruction that triggers the component; the action attribute information of the component is an operation performed after the component is triggered; the context attribute information of the component is the voice of the component The running state when the command is in effect, and the running state includes a global state, an application state, or a page state. 
With reference to the second aspect, or the first possible implementation manner of the second aspect, in a second possible implementation manner of the second aspect, the device further includes: an obtaining unit, where the receiving unit receives the After the voice command is performed, the current first running state of the terminal is obtained; the matching unit includes: a first instruction identifying subunit, configured to identify, by using a voice engine, a first text content corresponding to the voice instruction; a subunit, configured to match the first running state obtained by the obtaining unit and the first text content recognized by the first instruction identifying subunit with the voice UI resource, to obtain the first running state and a first context attribute information and a first voice attribute information corresponding to the first text content; a first instruction obtaining subunit, configured to obtain first context attribute information and a first voice obtained by matching the first information matching subunit The first action attribute information corresponding to the attribute information, and the operation corresponding to the first action attribute information is used as the motion corresponding to the voice instruction Or the second instruction identification subunit, configured to identify, by the speech engine, the first text content corresponding to the voice instruction; the second information matching subunit, configured to identify the second instruction identification subunit The first text content is matched with the voice UI resource, and the first voice attribute information and the first context attribute information corresponding to the first text content are obtained; the second instruction obtaining subunit is configured to be the first Obtaining the first action attribute information corresponding to the first context attribute information and the first voice attribute information when the running state is 
consistent with the first context attribute information obtained by the second information matching subunit, and the first The operation corresponding to the action attribute information is an action command corresponding to the voice command. With reference to the second aspect, or the first possible implementation manner of the second aspect, or the second possible implementation manner of the second aspect, in a third possible implementation manner of the second aspect, the receiving unit, Specifically, it is used to receive a voice instruction that the user starts the first application, or receive a voice instruction that the user performs further operations on the first application on the page after the first application is opened. With reference to the second aspect, or the first possible implementation of the second aspect, or the second possible implementation of the second aspect, or the third possible implementation of the second aspect, in the second aspect In the four possible implementation manners, the device further includes: an output unit, configured to output an option of the at least two applications when the application opening voice instruction of the user receives the at least two applications; the receiving unit, The receiving unit is configured to receive a voice command, where the first application is the at least two applications corresponding to the voice command of the application. The application with the highest priority is preset. 
In a third aspect, a terminal is provided, where the terminal includes: a microphone, a memory, and a processor, where the memory is used to store a voice engine; The microphone is configured to receive a voice instruction of the user, where the processor is configured to: after the microphone receives a voice instruction of the first application, the voice instruction and the voice user interface UI of the first application The resource is matched to obtain an action instruction corresponding to the voice instruction, where the voice UI resource of the first application includes voice attribute information, action attribute information, and context attribute information of each component of the first application, and The first application performs an operation corresponding to the action instruction. With reference to the third aspect, in a first possible implementation manner of the third aspect, the voice attribute information of the component is a text content corresponding to a voice instruction that triggers the component, and the action attribute information of the component is triggered by the An operation performed after the component; the context attribute information of the component is an operation state when the voice instruction of the component is in effect, and the operation state includes a global state, an application state, or a page state. 
With reference to the third aspect, or the first possible implementation manner of the third aspect, in a second possible implementation manner of the third aspect, the processor is further configured to obtain a current first running state of the terminal The processor is specifically configured to identify, by using a voice engine, the first text content corresponding to the voice instruction, and match the first running state and the first text content with the voice UI resource to obtain Determining, by the first operating state, the first context attribute information and the first voice attribute information corresponding to the first text content, obtaining first action attribute information corresponding to the first context attribute information and the first voice attribute information, The operation corresponding to the first action attribute information is used as an action instruction corresponding to the voice instruction; or the first text content corresponding to the voice instruction is recognized by a voice engine; and the first text content and the voice are Matching the UI resources, obtaining first voice attribute information and first context attribute information corresponding to the first text content; when the first transport Row status When the first context attribute information is consistent, the first action attribute information corresponding to the first voice attribute information is obtained, and the operation corresponding to the first action attribute information is used as an action instruction corresponding to the voice instruction. . 
With reference to the third aspect, or the first possible implementation manner of the third aspect, or the second possible implementation manner of the third aspect, in a third possible implementation manner of the third aspect, the microphone is specifically configured to receive a voice instruction of the user for opening the first application, or receive a voice instruction of the user for a further operation on the first application on a page displayed after the first application is opened. With reference to the third aspect, or the first possible implementation manner of the third aspect, or the second possible implementation manner of the third aspect, or the third possible implementation manner of the third aspect, in a fourth possible implementation manner of the third aspect, the processor is further configured to output options of at least two applications when an application-opening voice instruction received by the microphone corresponds to the at least two applications; and the microphone is specifically configured to receive a voice instruction of the user for a first application selected from the at least two applications according to the output options.
Alternatively, the microphone is specifically configured to receive an application-opening voice instruction of the user for the first application, where the first application is the application with the highest preset priority among the at least two applications corresponding to the application-opening voice instruction. In the embodiments of the present invention, when a voice instruction of a user for a first application is received, the voice instruction is matched with a voice UI resource of the first application to obtain an action instruction corresponding to the voice instruction, where the voice UI resource of the first application includes voice attribute information, action attribute information, and context attribute information of each component of the first application, and an operation corresponding to the action instruction is performed on the first application. The embodiments of the present invention extend the processing capability of the voice assistant framework in a terminal: because voice attribute information, action attribute information, and context attribute information are added for different components within each application, the terminal can obtain the voice UI resource of an application after parsing the application, and when a voice instruction for the application is received, a corresponding action instruction can be obtained by matching the voice UI resource of the application.
In this way, various third-party applications can be operated by voice, which satisfies a user's need to install an application at any time and use voice interaction at any time, and improves the use experience of the terminal user.
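The application-selection behavior described in these implementation manners (outputting options for the user to choose among, or falling back to a preset priority) can be sketched purely as an illustration; the function and data names below are hypothetical and not part of the patent:

```python
# Hypothetical sketch: choosing the first application when an
# application-opening voice instruction maps to several candidates.

def candidates_for(command, registry):
    """Return all applications registered for an application-opening command."""
    return [app for app, commands in registry.items() if command in commands]

def pick_first_application(command, registry, priorities, user_choice=None):
    apps = candidates_for(command, registry)
    if len(apps) <= 1:
        return apps[0] if apps else None
    if user_choice is not None:          # branch 1: options were output, user selects
        return user_choice if user_choice in apps else None
    # branch 2: fall back to the application with the highest preset priority
    return max(apps, key=lambda app: priorities.get(app, 0))

registry = {"Messages": {"send text message"}, "DailyChat": {"send text message"}}
priorities = {"Messages": 2, "DailyChat": 1}

assert pick_first_application("send text message", registry, priorities) == "Messages"
assert pick_first_application("send text message", registry, priorities,
                              user_choice="DailyChat") == "DailyChat"
```

Both branches resolve the ambiguity before any matching against a voice UI resource takes place.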
BRIEF DESCRIPTION OF THE DRAWINGS
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings required for describing the embodiments or the prior art are briefly introduced below. Apparently, a person of ordinary skill in the art may still derive other drawings from these accompanying drawings without creative efforts. FIG. 1 is a flowchart of an embodiment of a voice control method according to the present invention; FIG. 2 is a flowchart of another embodiment of the voice control method according to the present invention; FIG. 3 is a block diagram of an embodiment of a voice control apparatus according to the present invention; FIG. 4 is a block diagram of another embodiment of the voice control apparatus according to the present invention; FIG. 5 is a block diagram of another embodiment of the voice control apparatus according to the present invention; FIG. 6 is a block diagram of an embodiment of a terminal according to the present invention.
DESCRIPTION OF EMBODIMENTS
To make a person skilled in the art understand the technical solutions in the embodiments of the present invention better, and make the foregoing objectives, features, and advantages of the embodiments of the present invention clearer, the following further describes the technical solutions in the embodiments of the present invention in detail with reference to the accompanying drawings. Referring to FIG. 1, which is a flowchart of an embodiment of a voice control method according to the present invention: Step 101: Receive a voice instruction of a user for a first application. A terminal can generally obtain a voice instruction sent by a user through a microphone disposed on the terminal. In this embodiment, when the user wants to operate a first application installed in the terminal, the user may send a voice instruction to the terminal.
A voice instruction for the first application may include a voice instruction for opening the first application; for example, when the first application is a mail application, the voice instruction of the user for the first application may be a voice instruction for opening the mail application. Alternatively, a voice instruction for the first application may also include a voice instruction for a further operation on the first application on a page displayed after the first application is opened; for example, when the first application is a mail application, the voice instruction for the first application may be, after the mail application is opened, a voice instruction for forwarding a mail or replying to a mail on the mail viewing page. In this embodiment, if the terminal receives an application-opening voice instruction sent by the user, and the application-opening voice instruction corresponds to at least two applications, the terminal may output options of the at least two applications through a display interface, and receive a voice instruction of the user for a first application selected from the at least two applications. For example, the application-opening voice instruction sent by the user is "send a text message", and both the short message application in the terminal and an installed daily chat application can implement the function of sending a text message; the terminal may then output the options "short message application" and "daily chat application", and the user may select one application from these options as the first application. Assuming the user selects "daily chat application", the user simply sends a voice instruction for the daily chat application. Alternatively, if the terminal receives an application-opening voice instruction sent by the user, and the application-opening voice instruction corresponds to at least two applications, the terminal may select, according to preset priorities of the at least two applications, the application with the highest priority as the first application.
For example, the application-opening voice instruction sent by the user is "send a text message", both the short message application in the terminal and the installed daily chat application can implement the function of sending a text message, and the priority of the "short message application" is higher than that of the "daily chat application"; the terminal then takes the "short message application" as the first application according to the priorities. Step 102: Match the voice instruction with the voice UI resource of the first application to obtain an action instruction corresponding to the voice instruction, where the voice UI resource of the first application includes voice attribute information, action attribute information, and context attribute information of each component of the first application. In this embodiment, each application may pre-define a voice user interface (UI) resource, where the voice UI resource may include voice attribute information of each component in the application
(VoiceCommandText), action attribute information (VoiceCommandAction), and context attribute information (VoiceCommandContext). The components of an application may include a LayOut component that starts the application, and various controls available after the application is started, for example, a button (Button) and a check box
(CheckBox). The voice attribute information of a component is the text content corresponding to a voice instruction that triggers the component; the action attribute information of a component is an operation performed after the component is triggered; and the context attribute information of a component is a running state in which a voice instruction of the component takes effect, where the running state includes a global state, an application state, or a page state.
The global state means that a voice instruction of the component received by the terminal in any running state can take effect; the application state means that a voice instruction of the component can take effect only when it is received while the currently opened application is running; and the page state means that a voice instruction of the component can take effect only when it is received on a particular page of an application. The terminal may parse the voice UI resource of each application to obtain the voice attribute information, action attribute information, and context attribute information of the different components in the application. It should be noted that, in the embodiments of the present invention, the terminal may parse the voice UI resource of an application when the application is installed, or may parse the voice UI resource of an application when the application is used for the first time; this is not limited in the embodiments of the present invention. The terminal may save, to the voice engine, the correspondence between the voice attribute information, action attribute information, and context attribute information of each component of the parsed application and the component name of each component. For the same type of component, there may be multiple different component instances; different component instances of the same component type are distinguished by component name, that is, the component name of each component instance corresponds to the voice attribute information (VoiceCommandText), action attribute information (VoiceCommandAction), and context attribute information (VoiceCommandContext) of that component instance. After the terminal receives a voice instruction, the terminal can obtain its current first running state and identify, by using the voice engine, first text content corresponding to the voice instruction.
Then, the first running state and the first text content are matched with the voice UI resource of the first application to obtain first context attribute information and first voice attribute information corresponding to the first running state and the first text content, first action attribute information corresponding to the first context attribute information and the first voice attribute information is obtained, and the operation corresponding to the first action attribute information is used as the action instruction corresponding to the voice instruction sent by the user. Alternatively, the first text content may first be matched with the voice UI resource of the first application to obtain first voice attribute information and first context attribute information corresponding to the first text content; when the first running state is consistent with the first context attribute information, first action attribute information corresponding to the first context attribute information and the first voice attribute information is obtained, and the operation corresponding to the first action attribute information is used as the action instruction corresponding to the voice instruction. Step 103: Perform, on the first application, an operation corresponding to the action instruction. It can be seen from the foregoing embodiment that this embodiment extends the processing capability of the voice assistant framework in the terminal. Because voice attribute information, action attribute information, and context attribute information are added for different components within each application, the terminal can obtain the voice UI resource of an application after parsing the application; when a voice instruction for the application is received, a corresponding action instruction can be obtained by matching the voice UI resource of the application. In this way, various third-party applications can be operated by voice, which satisfies a user's need to install an application at any time and use voice interaction at any time, and improves the use experience of the terminal user.
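The two matching orders described in this embodiment can be sketched as follows; this is an illustrative Python sketch with hypothetical field names, not part of the patent:

```python
# Sketch of the two matching orders: state and text matched together,
# versus text matched first with a subsequent state-consistency check.

def match_state_first(state, text, resource):
    """Match the running state and the text content against the resource together."""
    for comp in resource:
        if comp["context"] == state and comp["text"] == text:
            return comp["action"]
    return None

def match_text_first(state, text, resource):
    """First match the text to obtain the attributes, then check state consistency."""
    for comp in resource:
        if comp["text"] == text:             # text -> voice/context attributes
            if comp["context"] == state:     # running state consistent with context
                return comp["action"]
    return None

resource = [
    {"text": "open email application", "context": "global", "action": "Open"},
    {"text": "next page", "context": "page", "action": "onClick"},
]

for match in (match_state_first, match_text_first):
    assert match("global", "open email application", resource) == "Open"
    assert match("application", "next page", resource) is None  # state mismatch
```

Both orders yield the same action instruction; they differ only in when the running state is consulted.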
Referring to FIG. 2, which is a flowchart of another embodiment of the voice control method according to the present invention: Step 201: The terminal obtains a voice UI resource of an application. In this embodiment, each application may pre-define a voice user interface (UI) resource, where the voice UI resource may include voice attribute information of each component in the application
(VoiceCommandText), action attribute information (VoiceCommandAction), and context attribute information (VoiceCommandContext). The components of an application may include a LayOut component that starts the application, and various controls available after the application is started, for example, a button (Button) and a check box
(CheckBox). The voice attribute information of a component is the text content corresponding to a voice instruction that triggers the component; the action attribute information of a component is an operation performed after the component is triggered; and the context attribute information of a component is a running state in which a voice instruction of the component takes effect, where the running state includes a global state, an application state, or a page state. The global state means that a voice instruction of the component received by the terminal in any running state can take effect; the application state means that a voice instruction of the component can take effect only when it is received while the currently opened application is running; and the page state means that a voice instruction of the component can take effect only when it is received on a particular page of an application.
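How the three context states scope the moment at which a component's voice instruction takes effect can be sketched as follows; the function and its parameters are illustrative assumptions, not from the patent:

```python
# Illustrative check of whether a component's voice instruction takes effect,
# given the terminal's current situation and the component's context attribute.

def instruction_effective(context_attr, current_app, current_page,
                          owner_app, owner_page):
    if context_attr == "global":       # effective in any running state
        return True
    if context_attr == "application":  # effective while the owning app is running
        return current_app == owner_app
    if context_attr == "page":         # effective only on the owning page
        return current_app == owner_app and current_page == owner_page
    return False

# A "next page" button belonging to the mail browsing page (context "page"):
assert instruction_effective("page", "email", "browse", "email", "browse")
assert not instruction_effective("page", "email", "edit", "email", "browse")
# A global application-opening instruction takes effect anywhere:
assert instruction_effective("global", None, None, "email", "browse")
```

The three states thus form progressively narrower scopes: global, then a single application, then a single page of that application.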
When an application is installed in the terminal, or after an application is opened in the terminal, the terminal may parse the voice UI resource of the application to obtain the voice attribute information, action attribute information, and context attribute information of the different components in the application, and save, to the voice engine, the correspondence between the voice attribute information, action attribute information, and context attribute information of each component and the component name of each component. For the same type of component, there may be multiple different component instances; different component instances of the same component type are distinguished by component name, that is, the component name of each component instance corresponds to the voice attribute information (VoiceCommandText), action attribute information (VoiceCommandAction), and context attribute information (VoiceCommandContext) of that component instance. For example, button components may include a "next page" button, a "previous page" button, and so on. For example, for the LayOut component of an email application, its voice attribute information
(VoiceCommandText) can be defined as "VoiceCommandText=Start Email Application", and its action attribute information (VoiceCommandAction) can be defined as
"VoiceCommandAction=Open", and its context attribute information (VoiceCommandContext) can be defined as "VoiceCommandContext=global", that is, a voice instruction for opening the email application received by the terminal in any running state can take effect. For another example, for the "next page" Button component on the mail browsing page displayed after the email application is opened, its voice attribute information
(VoiceCommandText) can be defined as "VoiceCommandText=next page", its action attribute information (VoiceCommandAction) can be defined as "VoiceCommandAction=onClick", and its context attribute information (VoiceCommandContext) can be defined as
"VoiceCommandContext=page", that is, the "next page" voice instruction takes effect only when it is received while the terminal is browsing the mail page, and does not take effect when, for example, the "next page" instruction is received on the mail editing page. Step 202: The terminal receives a voice instruction of a user for a first application. The terminal can generally obtain a voice instruction sent by the user through a microphone disposed on the terminal. In this embodiment, when the user wants to operate the first application installed in the terminal, the user may send a voice instruction to the terminal.
A voice instruction for the first application may include a voice instruction for opening the first application; for example, when the first application is a mail application, the voice instruction of the user for the first application may be a voice instruction for opening the mail application. A voice instruction for the first application may also include a voice instruction for a further operation on the first application on a page displayed after the first application is opened; for example, after the mail application is opened, a voice instruction for forwarding a mail or replying to a mail on the mail viewing page. Step 203: Obtain the current first running state of the terminal. In this embodiment, still taking the case in which the first application is an email application as an example, assume that a voice instruction of the user for opening the email application is received in step 202; the voice instruction may specifically be "open email", "start email", or the like. At this point, the terminal obtains the current first running state, where the first running state means that the terminal is currently in the global state, the application state, or the page state of an application. Step 204: Identify, by using the voice engine, first text content corresponding to the voice instruction. In this embodiment, the voice engine recognizes the voice instruction in a semantic recognition manner, and the semantic recognition manner performs fuzzy recognition on the voice instruction; for example, regardless of whether the voice instruction sent by the user is "open email" or "start email", semantic analysis can determine that the first text content corresponding to the voice instruction is "open email application". Step 205: Match the first running state and the first text content with the voice UI resource of the first application to obtain first context attribute information and first voice attribute information corresponding to the first running state and the first text content.
According to step 201, the voice engine saves the correspondence between the voice attribute information, action attribute information, and context attribute information of each component of the application and the component name of each component. Therefore, in this step, after the first running state and the first text content are obtained, they can be matched in the correspondence to obtain first context attribute information and first voice attribute information corresponding to the first running state and the first text content. For example, when the first text content corresponding to the voice instruction is "open email application" and the current first running state of the terminal is the global state, the voice engine matches the saved correspondence and obtains VoiceCommandContext and VoiceCommandText as "global" and "open email application" respectively. Step 206: Obtain first action attribute information corresponding to the first context attribute information and the first voice attribute information, and use the operation corresponding to the first action attribute information as the action instruction corresponding to the voice instruction. Following step 205, after VoiceCommandContext and VoiceCommandText are determined to be "global" and "open email application", the VoiceCommandAction of the corresponding component, "Open", can be obtained through the voice engine; that is, the action instruction corresponding to the user's voice instruction "open email" or "start email" is the operation triggered by "Open", namely opening the mail application.
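The email-application example in steps 203 to 206 can be put together in one illustrative sketch, where the component names, the correspondence table, and the recognizer stub are all hypothetical: parsed attributes are stored keyed by component name, and the recognized text plus the running state select the action:

```python
# Illustrative end-to-end run of the email example (all names hypothetical).
voice_ui = {
    # component name -> (VoiceCommandText, VoiceCommandAction, VoiceCommandContext)
    "email_layout":     ("open email application", "Open",    "global"),
    "next_page_button": ("next page",              "onClick", "page"),
}

def recognize(utterance):
    """Stand-in for the voice engine's fuzzy semantic recognition."""
    synonyms = {"open email": "open email application",
                "start email": "open email application"}
    return synonyms.get(utterance, utterance)

def resolve(utterance, running_state):
    text = recognize(utterance)                          # step 204
    for name, (t, action, context) in voice_ui.items():  # steps 205 and 206
        if t == text and context == running_state:
            return action
    return None

assert resolve("open email", "global") == "Open"    # step 203: global state
assert resolve("start email", "global") == "Open"   # fuzzy recognition
assert resolve("next page", "page") == "onClick"
assert resolve("next page", "global") is None       # page-scoped instruction
```

Note that "open email" and "start email" both resolve to the same action because the recognizer normalizes them to one text content before matching, mirroring the fuzzy semantic recognition described above.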
It should be noted that, in addition to the matching manner of the voice UI resource shown in the foregoing step 205 and step 206, in an actual application, the first text content may first be matched with the voice UI resource of the first application to obtain first voice attribute information and first context attribute information corresponding to the first text content; when the first running state is consistent with the first context attribute information, first action attribute information corresponding to the first context attribute information and the first voice attribute information is obtained, and the operation corresponding to the first action attribute information is used as the action instruction corresponding to the voice instruction. Which matching manner of the voice UI resource is used in an actual application is not limited in the embodiments of the present invention. Step 207: Perform, on the first application, an operation corresponding to the action instruction. It can be seen from the foregoing embodiment that this embodiment extends the processing capability of the voice assistant framework in the terminal. Because voice attribute information, action attribute information, and context attribute information are added for different components within each application, the terminal can obtain the voice UI resource of an application after parsing the application; when a voice instruction for the application is received, a corresponding action instruction can be obtained by matching the voice UI resource of the application. In this way, various third-party applications can be operated by voice, which satisfies a user's need to install an application at any time and use voice interaction at any time, and improves the use experience of the terminal user. Corresponding to the embodiments of the voice control method of the present invention, the present invention further provides embodiments of a voice control apparatus and a terminal. Referring to FIG.
3, which is a block diagram of an embodiment of a voice control apparatus according to the present invention. The apparatus includes a receiving unit 310, a matching unit 320, and an executing unit 330. The receiving unit 310 is configured to receive a voice instruction of a user for a first application. The matching unit 320 is configured to match the voice instruction received by the receiving unit 310 with a voice UI resource of the first application to obtain an action instruction corresponding to the voice instruction, where the voice UI resource of the first application includes voice attribute information, action attribute information, and context attribute information of each component of the first application. The executing unit 330 is configured to perform, on the first application, an operation corresponding to the action instruction obtained by the matching unit 320. Optionally, the receiving unit 310 may be specifically configured to receive a voice instruction of the user for opening the first application, or receive a voice instruction of the user for a further operation on the first application on a page displayed after the first application is opened. The voice attribute information of a component is the text content corresponding to a voice instruction that triggers the component; the action attribute information of a component is an operation performed after the component is triggered; and the context attribute information of a component is a running state in which a voice instruction of the component takes effect, where the running state includes a global state, an application state, or a page state. Optionally, the receiving unit 310 may be specifically configured to receive an application-opening voice instruction of the user for the first application, where the first application is the application with the highest preset priority among at least two applications corresponding to the application-opening voice instruction. Referring to FIG.
4, which is a block diagram of another embodiment of the voice control apparatus according to the present invention. The apparatus includes a parsing unit 410, a saving unit 420, a receiving unit 430, an obtaining unit 440, a matching unit 450, and an executing unit 460. The parsing unit 410 is configured to obtain voice attribute information, action attribute information, and context attribute information of different components of a first application by parsing the first application. The saving unit 420 is configured to save a voice UI resource of the first application to a voice engine, where the voice UI resource includes the voice attribute information, action attribute information, and context attribute information of each component of the first application obtained by the parsing unit 410. The receiving unit 430 is configured to receive a voice instruction of a user for the first application. The obtaining unit 440 is configured to obtain the current first running state of the terminal after the receiving unit 430 receives the voice instruction. The matching unit 450 is configured to match the first running state obtained by the obtaining unit 440 and the voice instruction received by the receiving unit 430 with the voice UI resource of the first application saved by the saving unit 420 to obtain an action instruction corresponding to the voice instruction. The executing unit 460 is configured to perform, on the first application, an operation corresponding to the action instruction obtained by the matching unit 450. In an optional implementation manner, the matching unit 450 may include (not shown in FIG.
4): a first instruction identification subunit, configured to identify, by using a voice engine, first text content corresponding to the voice instruction; a first information matching subunit, configured to match the first running state obtained by the obtaining unit 440 and the first text content recognized by the first instruction identification subunit with the voice UI resource of the first application saved by the saving unit 420, to obtain first context attribute information and first voice attribute information corresponding to the first running state and the first text content; and a first instruction obtaining subunit, configured to obtain first action attribute information corresponding to the first context attribute information and the first voice attribute information obtained by the first information matching subunit, and to use the operation corresponding to the first action attribute information as the action instruction corresponding to the voice instruction. In another optional implementation manner, the matching unit 450 may also include 
(not shown in FIG. 4): a second instruction identification subunit, configured to identify, by using a voice engine, first text content corresponding to the voice instruction; a second information matching subunit, configured to match the first text content recognized by the second instruction identification subunit with the voice UI resource of the first application saved by the saving unit 420, to obtain first voice attribute information and first context attribute information corresponding to the first text content; and a second instruction obtaining subunit, configured to: when the first running state is consistent with the first context attribute information obtained by the second information matching subunit, obtain first action attribute information corresponding to the first context attribute information and the first voice attribute information, and use the operation corresponding to the first action attribute information as the action instruction corresponding to the voice instruction. Optionally, the receiving unit 430 may be specifically configured to receive a voice instruction by which the user opens the first application, or to receive a voice instruction for a further operation on the first application on a page displayed after the first application is opened. The voice attribute information of a component is the text content corresponding to the voice instruction that triggers the component; the action attribute information of a component is the operation performed after the component is triggered; the context attribute information of a component is the running state in which the component's voice instruction takes effect, where the running state includes a global state, an application state, or a page state. 
Optionally, the receiving unit 430 may be specifically configured to receive an application-open voice instruction from the user for the first application, where the first application is the application with the highest preset priority among at least two applications corresponding to the application-open voice instruction. Referring to FIG. 5, which is a block diagram of another embodiment of a voice control apparatus according to the present invention, the apparatus includes: a parsing unit 510, a saving unit 520, an output unit 530, a receiving unit 540, a matching unit 550, and an executing unit 560. The parsing unit 510 is configured to obtain the voice attribute information, action attribute information, and context attribute information of the different components of a first application by parsing the first application. The saving unit 520 is configured to save the voice UI resource of the first application to a voice engine, where the voice UI resource includes the voice attribute information, action attribute information, and context attribute information of each component of the first application obtained by the parsing unit 510. The output unit 530 is configured to output options for at least two applications when an application-open voice instruction received from the user corresponds to the at least two applications. The receiving unit 540 is configured to receive the user's voice instruction for a first application selected from the at least two applications according to the options output by the output unit 530. The matching unit 550 is configured to match the voice instruction received by the receiving unit 540 with the voice UI resource of the first application saved by the saving unit 520, to obtain an action instruction corresponding to the voice instruction. The executing unit 560 is configured to perform, on the first application, the operation corresponding to the action instruction obtained by the matching unit 550. 
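As an illustrative sketch only (the disclosure does not specify any implementation, and all names below are hypothetical), the per-component voice UI resource described above, consisting of a voice attribute (trigger text), an action attribute (operation to perform), and a context attribute (running state in which the instruction takes effect), could be modeled like this:

```python
from dataclasses import dataclass
from enum import Enum


class RunningState(Enum):
    """The three context scopes named in the disclosure."""
    GLOBAL = "global"            # instruction valid anywhere on the terminal
    APPLICATION = "application"  # valid while the application is running
    PAGE = "page"                # valid only on a specific page


@dataclass
class VoiceComponent:
    """One UI component's entry in an application's voice UI resource."""
    voice_attr: str              # text content of the triggering voice instruction
    action_attr: str             # identifier of the operation performed when triggered
    context_attr: RunningState   # running state in which the instruction takes effect


# A voice UI resource is then simply the set of such entries for one application.
music_app_resource = [
    VoiceComponent("play", "action_play", RunningState.APPLICATION),
    VoiceComponent("next song", "action_next", RunningState.PAGE),
]
```

A parsing unit in the sense of FIG. 4 or FIG. 5 would build such a list for each installed application and hand it to the voice engine.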
Optionally, the receiving unit 540 may be specifically configured to receive a voice instruction by which the user opens the first application, or to receive a voice instruction for a further operation on the first application on a page displayed after the first application is opened. The voice attribute information of a component is the text content corresponding to the voice instruction that triggers the component; the action attribute information of a component is the operation performed after the component is triggered; the context attribute information of a component is the running state in which the component's voice instruction takes effect, where the running state includes a global state, an application state, or a page state. Referring to FIG. 6, which is a block diagram of an embodiment of a terminal according to the present invention, the terminal includes: a microphone 610, a memory 620, and a processor 630. The memory 620 is configured to store a voice engine; the microphone 610 is configured to receive a user's voice instruction; and the processor 630 is configured to: after the microphone 610 receives the user's voice instruction for a first application, match the voice instruction with the voice user interface (UI) resource of the first application to obtain an action instruction corresponding to the voice instruction, where the voice UI resource of the first application includes the voice attribute information, action attribute information, and context attribute information of each component of the first application, and perform, on the first application, the operation corresponding to the action instruction. In an optional implementation manner, the microphone 610 may be specifically configured to receive a voice instruction by which the user opens the first application, or to receive a voice instruction for a further operation on the first application on a page displayed after the first application is opened. 
In another optional implementation manner, the processor 630 may be further configured to obtain the current first running state of the terminal. The processor 630 may be specifically configured to: identify, by using the voice engine, first text content corresponding to the voice instruction; match the first running state and the first text content with the voice UI resource to obtain first context attribute information and first voice attribute information corresponding to the first running state and the first text content; obtain first action attribute information corresponding to the first context attribute information and the first voice attribute information; and use the operation corresponding to the first action attribute information as the action instruction corresponding to the voice instruction. Alternatively, the processor 630 may be configured to: identify, by using the voice engine, first text content corresponding to the voice instruction; match the first text content with the voice UI resource to obtain first voice attribute information and first context attribute information corresponding to the first text content; and, when the first running state is consistent with the first context attribute information, obtain first action attribute information corresponding to the first voice attribute information and use the operation corresponding to the first action attribute information as the action instruction corresponding to the voice instruction. In another optional implementation manner, the processor 630 may be further configured to output options for at least two applications when an application-open voice instruction received from the user through the microphone corresponds to the at least two applications; and the microphone 610 may be specifically configured to receive the user's voice instruction for a first application selected from the at least two applications according to the options. 
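The second matching order described above (recognize the text first, then confirm the terminal's running state against the component's context attribute) can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the resource entries, state strings, and function name are all assumptions.

```python
# Each entry: (voice_attr, action_attr, context_attr). Names are hypothetical.
MUSIC_RESOURCE = [
    ("play", "action_play", "application"),
    ("next song", "action_next", "page:now_playing"),
]


def match_instruction(first_text, first_running_state, resource):
    """Match recognized text against the voice attributes, then confirm that
    the terminal's current running state is consistent with the matching
    component's context attribute before returning its action attribute."""
    for voice_attr, action_attr, context_attr in resource:
        if voice_attr == first_text and context_attr == first_running_state:
            return action_attr
    return None  # no action instruction: text or running state did not match
```

Under this sketch, "play" yields `action_play` only while the application state is active; the same text in a different state yields no action instruction, which mirrors the consistency check the processor 630 performs.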
In another optional implementation manner, the microphone 610 may be specifically configured to receive an application-open voice instruction from the user for the first application, where the first application is the application with the highest preset priority among at least two applications corresponding to the application-open voice instruction. In the above embodiments, the voice attribute information of a component is the text content corresponding to the voice instruction that triggers the component; the action attribute information of a component is the operation performed after the component is triggered; and the context attribute information of a component is the running state matching the execution of the component, where the running state includes a global state, an application state, or a page state. It can be seen from the above embodiments that, when a user's voice instruction for a first application is received, the voice instruction is matched with the voice UI resource of the first application to obtain an action instruction corresponding to the voice instruction, where the voice UI resource of the first application includes the voice attribute information, action attribute information, and context attribute information of each component of the first application, and the operation corresponding to the action instruction is performed on the first application. The embodiments of the present invention extend the processing capability of the voice assistant framework in the terminal: because voice attribute information, action attribute information, and context attribute information are added for the different components within each application, after parsing an application the terminal can obtain the application's voice UI resource. 
When a voice instruction for the application is received, the corresponding action instruction can be obtained by matching against the application's voice UI resource, so that various third-party applications can be operated by voice. This satisfies the user's need to install an application at any time and immediately use voice interaction with it, improving the end-user experience. It will be apparent to those skilled in the art that the techniques in the embodiments of the present invention can be implemented by means of software plus a necessary general-purpose hardware platform. Based on such understanding, the technical solutions in the embodiments of the present invention, in essence or in the part contributing to the prior art, may be embodied in the form of a software product. The computer software product may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, or an optical disc, and includes several instructions for enabling a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the embodiments, or in certain parts of the embodiments, of the present invention. The embodiments in this specification are described in a progressive manner; for identical or similar parts, the embodiments may refer to one another, and each embodiment focuses on its differences from the other embodiments. In particular, the apparatus embodiments are described relatively simply because they are basically similar to the method embodiments; for relevant details, refer to the description of the method embodiments. The embodiments of the present invention described above do not limit the protection scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.
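The two disambiguation strategies the embodiments describe for application-open instructions, presenting options for every matching application or directly selecting the one with the highest preset priority, could be sketched as below. The registry layout, priority representation, and function name are assumptions for illustration, not part of the disclosure.

```python
def resolve_open_command(command, registry):
    """Disambiguate an application-open voice instruction.

    `registry` maps each open command to a list of (app_name, preset_priority)
    pairs. When several applications match, the terminal may either present
    the options to the user (strategy 1) or directly pick the application
    with the highest preset priority (strategy 2).
    """
    candidates = registry.get(command, [])
    if len(candidates) <= 1:
        return {"open": candidates[0][0] if candidates else None}
    return {
        "options": [name for name, _ in candidates],     # strategy 1: ask the user
        "open": max(candidates, key=lambda c: c[1])[0],  # strategy 2: highest priority
    }


registry = {
    "open music": [("MusicA", 2), ("MusicB", 5)],
    "open mail": [("Mail", 1)],
}
```

With this sketch, "open music" maps to two applications, so the terminal can either output `["MusicA", "MusicB"]` as options or open `MusicB` directly by priority, while "open mail" resolves without disambiguation.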

Claims

1. A voice control method, characterized in that the method comprises: receiving a user's voice instruction for a first application; matching the voice instruction with a voice user interface (UI) resource of the first application to obtain an action instruction corresponding to the voice instruction, wherein the voice UI resource of the first application comprises voice attribute information, action attribute information, and context attribute information of each component of the first application; and performing, on the first application, an operation corresponding to the action instruction.
2. The method according to claim 1, characterized in that: the voice attribute information of the component is the text content corresponding to the voice instruction that triggers the component; the action attribute information of the component is the operation performed after the component is triggered; and the context attribute information of the component is the running state in which the voice instruction of the component takes effect, wherein the running state comprises a global state, an application state, or a page state.
3. The method according to claim 1 or 2, characterized in that: after the receiving a user's voice instruction for a first application, the method further comprises: obtaining a current first running state of the terminal; and the matching the voice instruction with the voice UI resource of the first application to obtain an action instruction corresponding to the voice instruction comprises: identifying, by using a voice engine, first text content corresponding to the voice instruction; matching the first running state and the first text content with the voice UI resource to obtain first context attribute information and first voice attribute information corresponding to the first running state and the first text content; obtaining first action attribute information corresponding to the first context attribute information and the first voice attribute information; and using an operation corresponding to the first action attribute information as the action instruction corresponding to the voice instruction; or identifying, by using a voice engine, first text content corresponding to the voice instruction; matching the first text content with the voice UI resource to obtain first voice attribute information and first context attribute information corresponding to the first text content; and, when the first running state is consistent with the first context attribute information, obtaining first action attribute information corresponding to the first voice attribute information, and using an operation corresponding to the first action attribute information as the action instruction corresponding to the voice instruction.
4. The method according to any one of claims 1 to 3, characterized in that the receiving a user's voice instruction for a first application comprises: receiving a voice instruction by which the user opens the first application; or receiving a voice instruction for a further operation performed by the user on the first application on a page displayed after the first application is opened.
5. The method according to any one of claims 1 to 4, characterized in that: before the receiving a user's voice instruction for a first application, the method further comprises: when an application-open voice instruction received from the user corresponds to at least two applications, outputting options for the at least two applications; and the receiving a user's voice instruction for the first application comprises: receiving the user's voice instruction for a first application selected from the at least two applications according to the options; or the receiving a user's voice instruction for a first application comprises: receiving an application-open voice instruction from the user for the first application, wherein the first application is the application with the highest preset priority among at least two applications corresponding to the application-open voice instruction.
6. A voice control apparatus, characterized in that the apparatus comprises: a receiving unit, configured to receive a user's voice instruction for a first application; a matching unit, configured to match the voice instruction received by the receiving unit with a voice UI resource of the first application to obtain an action instruction corresponding to the voice instruction, wherein the voice UI resource of the first application comprises voice attribute information, action attribute information, and context attribute information of each component of the first application; and an executing unit, configured to perform, on the first application, an operation corresponding to the action instruction obtained by the matching unit.
7. The apparatus according to claim 6, characterized in that: the voice attribute information of the component is the text content corresponding to the voice instruction that triggers the component; the action attribute information of the component is the operation performed after the component is triggered; and the context attribute information of the component is the running state in which the voice instruction of the component takes effect, wherein the running state comprises a global state, an application state, or a page state.
8. The apparatus according to claim 6 or 7, characterized in that: the apparatus further comprises an obtaining unit, configured to obtain a current first running state of the terminal after the receiving unit receives the voice instruction; and the matching unit comprises: a first instruction identification subunit, configured to identify, by using a voice engine, first text content corresponding to the voice instruction; a first information matching subunit, configured to match the first running state obtained by the obtaining unit and the first text content recognized by the first instruction identification subunit with the voice UI resource, to obtain first context attribute information and first voice attribute information corresponding to the first running state and the first text content; and a first instruction obtaining subunit, configured to obtain first action attribute information corresponding to the first context attribute information and the first voice attribute information obtained by the first information matching subunit, and to use an operation corresponding to the first action attribute information as the action instruction corresponding to the voice instruction; or a second instruction identification subunit, configured to identify, by using a voice engine, first text content corresponding to the voice instruction; a second information matching subunit, configured to match the first text content recognized by the second instruction identification subunit with the voice UI resource, to obtain first voice attribute information and first context attribute information corresponding to the first text content; and a second instruction obtaining subunit, configured to: when the first running state is consistent with the first context attribute information obtained by the second information matching subunit, obtain first action attribute information corresponding to the first context attribute information and the first voice attribute information, and use an operation corresponding to the first action attribute information as the action instruction corresponding to the voice instruction.
9. The apparatus according to any one of claims 6 to 8, characterized in that the receiving unit is specifically configured to receive a voice instruction by which the user opens the first application, or to receive a voice instruction for a further operation performed by the user on the first application on a page displayed after the first application is opened.
10. The apparatus according to any one of claims 6 to 9, characterized in that: the apparatus further comprises an output unit, configured to output options for at least two applications when an application-open voice instruction received from the user corresponds to the at least two applications, and the receiving unit is specifically configured to receive the user's voice instruction for a first application selected from the at least two applications according to the options; or the receiving unit is specifically configured to receive an application-open voice instruction from the user for the first application, wherein the first application is the application with the highest preset priority among at least two applications corresponding to the application-open voice instruction.
11. A terminal, characterized in that the terminal comprises: a microphone, a memory, and a processor, wherein: the memory is configured to store a voice engine; the microphone is configured to receive a user's voice instruction; and the processor is configured to: after the microphone receives the user's voice instruction for a first application, match the voice instruction with a voice user interface (UI) resource of the first application to obtain an action instruction corresponding to the voice instruction, wherein the voice UI resource of the first application comprises voice attribute information, action attribute information, and context attribute information of each component of the first application, and perform, on the first application, an operation corresponding to the action instruction.
12. The terminal according to claim 11, characterized in that: the voice attribute information of the component is the text content corresponding to the voice instruction that triggers the component; the action attribute information of the component is the operation performed after the component is triggered; and the context attribute information of the component is the running state in which the voice instruction of the component takes effect, wherein the running state comprises a global state, an application state, or a page state.
13. The terminal according to claim 11 or 12, characterized in that: the processor is further configured to obtain a current first running state of the terminal; and the processor is specifically configured to: identify, by using a voice engine, first text content corresponding to the voice instruction; match the first running state and the first text content with the voice UI resource to obtain first context attribute information and first voice attribute information corresponding to the first running state and the first text content; obtain first action attribute information corresponding to the first context attribute information and the first voice attribute information; and use an operation corresponding to the first action attribute information as the action instruction corresponding to the voice instruction; or identify, by using a voice engine, first text content corresponding to the voice instruction; match the first text content with the voice UI resource to obtain first voice attribute information and first context attribute information corresponding to the first text content; and, when the first running state is consistent with the first context attribute information, obtain first action attribute information corresponding to the first voice attribute information, and use an operation corresponding to the first action attribute information as the action instruction corresponding to the voice instruction.
14. The terminal according to any one of claims 11 to 13, characterized in that the microphone is specifically configured to receive a voice instruction by which the user opens the first application, or to receive a voice instruction for a further operation performed by the user on the first application on a page displayed after the first application is opened.
15. The terminal according to any one of claims 11 to 14, characterized in that: the processor is further configured to output options for at least two applications when an application-open voice instruction received from the user through the microphone corresponds to the at least two applications, and the microphone is specifically configured to receive the user's voice instruction for a first application selected from the at least two applications according to the options; or the microphone is specifically configured to receive an application-open voice instruction from the user for the first application, wherein the first application is the application with the highest preset priority among at least two applications corresponding to the application-open voice instruction.
PCT/CN2014/083505 2013-08-26 2014-08-01 Language control method, device and terminal WO2015027789A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2013103755723A CN103442138A (en) 2013-08-26 2013-08-26 Voice control method, device and terminal
CN201310375572.3 2013-08-26

Publications (1)

Publication Number Publication Date
WO2015027789A1 true WO2015027789A1 (en) 2015-03-05

Family

ID=49695800

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/083505 WO2015027789A1 (en) 2013-08-26 2014-08-01 Language control method, device and terminal

Country Status (2)

Country Link
CN (1) CN103442138A (en)
WO (1) WO2015027789A1 (en)

Families Citing this family (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103442138A (en) * 2013-08-26 2013-12-11 华为终端有限公司 Voice control method, device and terminal
CN103841243A (en) * 2014-02-28 2014-06-04 深圳市中兴移动通信有限公司 Voice saving method and mobile terminal with voice saving function
US9589567B2 (en) * 2014-06-11 2017-03-07 Honeywell International Inc. Plant control system using voice as a control mechanism
CN104184890A (en) * 2014-08-11 2014-12-03 联想(北京)有限公司 Information processing method and electronic device
CN105575390A (en) * 2014-10-23 2016-05-11 中兴通讯股份有限公司 Voice control method and device
CN104318924A (en) * 2014-11-12 2015-01-28 沈阳美行科技有限公司 Method for realizing voice recognition function
CN106469040B (en) 2015-08-19 2019-06-21 华为终端有限公司 Communication means, server and equipment
CN105161106A (en) * 2015-08-20 2015-12-16 深圳Tcl数字技术有限公司 Voice control method of intelligent terminal, voice control device and television system
CN105487668B (en) * 2015-12-09 2020-06-16 腾讯科技(深圳)有限公司 Display method and device of terminal equipment
CN106023994B (en) * 2016-04-29 2020-04-03 杭州华橙网络科技有限公司 Voice processing method, device and system
CN107452383B (en) * 2016-05-31 2021-10-26 华为终端有限公司 Information processing method, server, terminal and information processing system
US10282218B2 (en) * 2016-06-07 2019-05-07 Google Llc Nondeterministic task initiation by a personal assistant module
CN106098063B (en) * 2016-07-01 2020-05-22 海信集团有限公司 Voice control method, terminal device and server
CN106448668A (en) * 2016-10-10 2017-02-22 山东浪潮商用***有限公司 Method for speech recognition and devices
CN107507614B (en) * 2017-07-28 2018-12-21 北京小蓦机器人技术有限公司 Method, equipment, system and the storage medium of natural language instructions are executed in conjunction with UI
CN107610700A (en) * 2017-09-07 2018-01-19 唐冬香 A kind of terminal control method and system based on MEMS microphone
CN108470566B (en) * 2018-03-08 2020-09-15 腾讯科技(深圳)有限公司 Application operation method and device
CN109741737B (en) * 2018-05-14 2020-07-21 北京字节跳动网络技术有限公司 Voice control method and device
CN110534110B (en) * 2018-05-25 2022-04-15 深圳市优必选科技有限公司 Robot and method, device and circuit for improving voice interaction recognition rate of robot
CN110691160A (en) * 2018-07-04 2020-01-14 青岛海信移动通信技术股份有限公司 Voice control method and device and mobile phone
CN109086028A (en) * 2018-07-27 2018-12-25 重庆柚瓣家科技有限公司 Voice UI and its implementation
CN109584879B (en) 2018-11-23 2021-07-06 华为技术有限公司 Voice control method and electronic equipment
JP7229906B2 (en) * 2019-12-06 2023-02-28 Tvs Regza株式会社 Command controller, control method and control program
CN111292742A (en) * 2020-01-14 2020-06-16 京东数字科技控股有限公司 Data processing method and device, electronic equipment and computer storage medium
WO2021195897A1 (en) * 2020-03-30 2021-10-07 华为技术有限公司 Voice control method and smart terminal

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102541574A (en) * 2010-12-13 2012-07-04 鸿富锦精密工业(深圳)有限公司 Application program opening system and method
CN102622085A (en) * 2012-04-11 2012-08-01 北京航空航天大学 Multidimensional sense man-machine interaction system and method
CN102868827A (en) * 2012-09-15 2013-01-09 潘天华 Method of using voice commands to control start of mobile phone applications
CN103442138A (en) * 2013-08-26 2013-12-11 华为终端有限公司 Voice control method, device and terminal

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101262700B1 (en) * 2011-08-05 2013-05-08 삼성전자주식회사 Method for Controlling Electronic Apparatus based on Voice Recognition and Motion Recognition, and Electric Apparatus thereof
US9256396B2 (en) * 2011-10-10 2016-02-09 Microsoft Technology Licensing, Llc Speech recognition for context switching
CN102520788B (en) * 2011-11-16 2015-01-21 歌尔声学股份有限公司 Voice identification control method
CN102830915A (en) * 2012-08-02 2012-12-19 聚熵信息技术(上海)有限公司 Semanteme input control system and method
CN103200329A (en) * 2013-04-10 2013-07-10 威盛电子股份有限公司 Voice control method, mobile terminal device and voice control system

Also Published As

Publication number Publication date
CN103442138A (en) 2013-12-11

Similar Documents

Publication Publication Date Title
WO2015027789A1 (en) Language control method, device and terminal
US9666190B2 (en) Speech recognition using loosely coupled components
US11676605B2 (en) Method, interaction device, server, and system for speech recognition
KR102334950B1 (en) Method and apparatus for managing background application
US9594496B2 (en) Method and apparatus for playing IM message
JP2020009459A (en) Voice control of interactive whiteboard appliances
WO2019154153A1 (en) Message processing method, unread message display method and computer terminal
KR102220945B1 (en) Apparatus and method for displaying an related contents information related the opponent party in terminal
KR20130112885A (en) Methods and apparatus for providing input to a speech-enabled application program
WO2016180260A1 (en) Method and apparatus for displaying instant messaging window and computer readable medium
CN102984050A (en) Method, client and system for searching voices in instant messaging
CN108933946B (en) Voice-control-based live broadcasting attention method, storage medium, electronic equipment and system
WO2020228033A1 (en) Sdk plug-in loading method and apparatus, and mobile terminal and storage medium
CN113094143A (en) Cross-application message sending method and device, electronic equipment and readable storage medium
CN112767936A (en) Voice conversation method, device, storage medium and electronic equipment
WO2021134237A1 (en) Video recording method and apparatus, and computer-readable storage medium
WO2017219446A1 (en) Icon processing method and device
WO2017032146A1 (en) File sharing method and apparatus
WO2018058895A1 (en) Terminal control method and apparatus based on rcs message
KR20150088532A (en) Apparatus for providing service during call and method for using the apparatus
WO2022213943A1 (en) Message sending method, message sending apparatus, electronic device, and storage medium
WO2016188227A1 (en) Intelligent terminal shortcut establishment method and device
WO2020192245A1 (en) Application starting method and apparatus, and computer system and medium
US20120203538A1 (en) Techniques for announcing conference attendance changes in multiple languages
CN103941961A (en) Prompting method, device and facility for application updating

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14839534

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14839534

Country of ref document: EP

Kind code of ref document: A1