WO2017113738A1 - Voice control method and device thereof - Google Patents

Voice control method and device thereof

Info

Publication number
WO2017113738A1
Authority
WO
WIPO (PCT)
Prior art keywords
human
interaction interface
instruction
graphic
computer interaction
Prior art date
Application number
PCT/CN2016/089578
Other languages
English (en)
French (fr)
Inventor
王蕊
崔洪贵
Original Assignee
乐视控股(北京)有限公司
乐视致新电子科技(天津)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 乐视控股(北京)有限公司 and 乐视致新电子科技(天津)有限公司
Priority to US15/241,417 (published as US20170193992A1)
Publication of WO2017113738A1

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0481Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
    • G06F3/0482Interaction with lists of selectable items, e.g. menus
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0481Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0484Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
    • G06F3/04847Interaction techniques to control parameter settings, e.g. interaction with sliders or dials
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/22Procedures used during a speech recognition process, e.g. man-machine dialogue
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/22Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/221Announcement of recognition results
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/22Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223Execution procedure of a spoken command

Definitions

  • This patent application relates to the field of communications, and in particular to voice control techniques.
  • In the course of implementing the present invention, the inventor found that the home pages of traditional intelligent speech recognition products in the mobile application market are dominated by piled-up content, and that such products mostly interact in the form of a dialogue.
  • Switching between the recording state and the standby state is usually done by tapping a trigger button, and the interface is filled with excessive text information or with content operations performed after semantic recognition; a user in a vehicle who needs to jump from the speech recognition result page or the semantic execution interface back to the recording state must perform complicated operations to do so.
  • A user who is driving, however, has stricter requirements on information acquisition: excessive redundant information and an overly complicated interactive interface raise the user's operation cost, lengthen operation time, and interfere with normal driving, so such a user interface is not well suited to in-vehicle products.
  • The purpose of some embodiments of the present invention is to provide a voice control method and a device thereof that streamline the human-computer interaction interface, simplify the operation flow, lower the user's operation cost, and reduce the impact on the user's normal driving.
  • An embodiment of the present invention provides a voice control method including the following steps: generating, from collected voice information, a corresponding instruction for execution, and generating a corresponding graphic, where the corresponding graphic displays the recognition result of the voice information; embedding the generated corresponding graphic into a view page and displaying, in the current human-computer interaction interface, the corresponding graphic generated from the most recently collected voice information; and, if a gesture sliding operation is detected in the human-computer interaction interface, displaying in the interface the corresponding graphic indicated by the gesture sliding operation and executing the instruction of the indicated graphic.
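  • As a concrete illustration of the flow above, the following minimal Kotlin sketch pairs each recognized utterance with its instruction and keeps the resulting graphics side by side in a view page. All names here (VoiceEntry, ViewPage, onSwipe) are illustrative assumptions, not terminology or API from the patent:

    ```kotlin
    // One recognized utterance: the "graphic" is the recognition text shown to
    // the user, and the "instruction" is modeled as a plain callback.
    data class VoiceEntry(
        val recognizedText: String,  // e.g. "Call Li Moumou"
        val instruction: () -> Unit  // corresponding instruction for execution
    )

    class ViewPage {
        private val entries = mutableListOf<VoiceEntry>() // embedded side by side, left to right
        private var current = -1                          // index of the graphic on screen

        fun onVoiceRecognized(entry: VoiceEntry) {
            entries.add(entry)          // embed the new graphic at the right-hand end
            current = entries.lastIndex // display the most recently collected one
            entry.instruction()         // execute its corresponding instruction
        }

        // steps > 0 moves toward earlier graphics (left), steps < 0 toward later ones (right)
        fun onSwipe(steps: Int) {
            val target = current - steps
            if (target in entries.indices) {
                current = target
                entries[current].instruction() // execute the instruction of the indicated graphic
            }
        }

        fun displayed(): VoiceEntry? = entries.getOrNull(current)
    }
    ```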
  • An embodiment of the present invention further provides a voice control device, including: an instruction generation module configured to generate a corresponding instruction from collected voice information; an instruction execution module configured to execute the instruction generated by the instruction generation module; a graphic generation module configured to generate, from the collected voice information, a corresponding graphic that displays the recognition result of the voice information; an embedding module configured to embed the generated corresponding graphic into a view page; a display module configured to display, in the current human-computer interaction interface, the corresponding graphic generated from the most recently collected voice information; and a gesture detection module configured to detect whether a gesture sliding operation occurs in the human-computer interaction interface.
  • Upon detecting a gesture sliding operation, the gesture detection module triggers the display module to display, in the human-computer interaction interface, the corresponding graphic indicated by the operation, and triggers the instruction execution module to execute the instruction of the indicated graphic.
  • Compared with the prior art, embodiments of the present invention collect voice information and recognize it to generate both a corresponding instruction for execution and a corresponding graphic displaying the recognition result; the graphic is embedded into a view page, and the graphic generated from the most recently collected voice information is displayed in the human-computer interaction interface. If a gesture sliding operation is detected in the interface, the graphic corresponding to the slide is displayed and its instruction is executed.
  • Different voice information generates different corresponding graphics, and the graphics are embedded side by side into the view page. In the step of displaying the graphic indicated by the gesture sliding operation, the graphic to the left or right of the currently displayed one is shown according to the sliding direction. Because the graphics sit side by side in the view page and follow the gesture, user operation is effectively simplified.
  • The corresponding graphics are embedded side by side into the view page from left to right in the order in which their voice information was collected. This left-to-right layout, combined with gesture sliding to select among the graphics, matches the user's operating habits.
  • The step of executing the corresponding instruction may include the following sub-steps: the in-vehicle device sends the instruction to an associated terminal; the associated terminal executes the instruction and feeds the execution result back to the in-vehicle device; and the in-vehicle device displays the received result in the human-computer interaction interface.
  • The associated terminal may be a mobile phone, and the association may be made over Bluetooth. Through this information exchange, the phone feeds the execution result back to the in-vehicle device, which displays it in the interface, so the user can read the result directly from the interface.
  • The human-computer interaction interface is divided into a first display area and a second display area; the corresponding graphic is displayed in the first area and the execution result in the second. Splitting the interface into two areas, each showing its own content, simplifies the interface style and makes its content clear at a glance; in an in-vehicle device in particular, it effectively trims redundant information so the user can obtain information quickly.
  • The background color of the first display area differs from that of the second. Distinct background colors give the two areas a clear boundary, letting the user locate the area holding the desired information by color alone and shortening the time spent finding it.
  • The areas of the first and second display areas are adjustable: if an area adjustment operation on either area is received, the areas are resized accordingly. Users can adjust the areas to suit their viewing habits, making the interface more flexible and reasonable and improving the user experience.
  • A button for triggering the voice recognition function is preset in the human-computer interaction interface. Before the step of generating, from collected voice information, the corresponding instruction for execution, the method further includes: if an operation on the button is detected, collecting voice with a voice collection device. Given the flexibility and randomness of actual use, this added button ensures that the voice information collection process is correct and deliberate, as the sketch below illustrates.
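  • A minimal sketch of that trigger, assuming a hypothetical SpeechCollector interface in place of whatever capture API the device actually provides:

    ```kotlin
    interface SpeechCollector { fun collect(): String }

    class RecognitionButton(
        private val collector: SpeechCollector,
        private val onVoice: (String) -> Unit
    ) {
        // The microphone is opened only after the button operation is detected,
        // so collection never starts accidentally.
        fun onClick() = onVoice(collector.collect())
    }
    ```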
  • FIG. 1 is a flowchart of a voice control method according to a first embodiment of the present invention;
  • FIG. 2 is a schematic diagram of a human-computer interaction interface according to the first, second, and third embodiments of the present invention;
  • FIG. 3 is a schematic diagram of graphic switching when the sliding direction of the gesture sliding operation is left to right, according to the first, second, and third embodiments of the present invention;
  • FIG. 4 is a schematic diagram of switching the displayed graphic to graphic A by a gesture sliding operation, according to the first, second, and third embodiments of the present invention;
  • FIG. 5 is a system structure diagram of a voice control device according to a fourth embodiment of the present invention.
  • A first embodiment of the present invention relates to a voice control method. The embodiment is applied to an in-vehicle device, and the specific flow is shown in FIG. 1.
  • In step 101, it is determined whether an operation of the voice recognition button is detected. Specifically, a button for triggering the voice recognition function is preset in the human-computer interaction interface (such as a touch screen) of the in-vehicle device. If no operation of the button by the user is detected, the process returns to the initial state and continues to check whether the user operates the button.
  • If an operation of the button is detected (for example, a tap), the process proceeds to step 102, and the in-vehicle device collects voice information with a voice collection device, for instance a microphone installed on the device.
  • In this embodiment, considering the flexibility and randomness of actual use, the voice collection device is started only when the button is detected to be operated, ensuring that the voice information collection process is correct and deliberate.
  • Next, in step 103, a corresponding instruction and a corresponding graphic are generated. The collected voice information yields an instruction for execution and a graphic that displays the recognition result, for example the text "Call Li Moumou". Different voice information generates different graphics.
  • Each graphic and its instruction can be saved in the in-vehicle device, so that recalling a graphic also recalls its instruction. Specifically, the graphics are embedded side by side into the view page, from left to right in the order in which their voice information was collected, and the graphic generated from the most recently collected voice information is displayed in the current human-computer interaction interface, as shown in FIG. 2.
  • In FIG. 2, the human-computer interaction interface is drawn with a solid border: C is the graphic currently displayed, B is the graphic of the voice information preceding C, and A is the graphic of the voice information preceding B. Displaying the most recent graphic in the current interface lets the user intuitively follow the current operation.
  • For example, the entire human-computer interaction interface (such as an app) exists as a single view page. Each time the user issues a voice recognition instruction, a corresponding graphic is generated in the voice view to present that single round of recognition and semantic understanding; when the user issues another instruction, another graphic is generated. Each issued instruction is completed this way, and the resulting graphics are embedded side by side from left to right in collection order, matching the user's operating habits.
  • In this embodiment, the human-computer interaction interface is divided into a first display area and a second display area; the graphic is shown in the first and the execution result in the second. As shown in FIG. 2, the upper area I is the first display area, for the graphic, and the lower area II is the second display area, for the execution result. Dividing the interface into two areas, each with its own content, simplifies the interface, removes redundant information, and makes the content clear at a glance, which is especially useful in an in-vehicle device; one way to model the split is sketched below.
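  • The field names and default colors in this sketch (the colors anticipate the second embodiment) are assumptions for illustration:

    ```kotlin
    class InteractionInterface(
        val firstAreaBackground: String = "black",  // area I background
        val secondAreaBackground: String = "white"  // area II background
    ) {
        var firstAreaText = ""   // area I: the recognition-result graphic
            private set
        var secondAreaText = ""  // area II: the execution result
            private set

        fun showGraphic(text: String) { firstAreaText = text }
        fun showResult(text: String) { secondAreaText = text }
    }
    ```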
  • Next, in step 104, the instruction to be executed is acquired. There are generally two cases:
  • First, the in-vehicle device takes the instruction of the most recent voice information, whose graphic is displayed in the current human-computer interaction interface, as the instruction to be executed.
  • Second, the instruction is obtained by gesture-sliding the interface. Because the graphics and instructions produced by earlier voice operations are stored in the in-vehicle device, the user can slide the interface to retrieve a needed instruction. If a gesture sliding operation is detected, the graphic it indicates is displayed in the interface, and that graphic's instruction becomes the instruction to be executed.
  • Specifically, by sliding horizontally on the interface the user switches to the graphic to the left or right of the one currently displayed, and its instruction is recalled. As shown in FIG. 3, sliding left to right switches from graphic C to graphic B of the preceding voice information (the interface is drawn with a solid border, and after the switch it displays B); sliding left to right again switches from B to A, the graphic before B, as shown in FIG. 4. Correspondingly, sliding right to left switches A back to B, the graphic of the following voice information.
  • Sliding the interface is thus enough to switch among voice information instructions, which simplifies the operation flow. In this step, the instruction acquired by the in-vehicle device is the one corresponding to the graphic displayed when the user stops the gesture sliding operation.
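  • A hedged sketch of this switching logic, reusing the ViewPage sketch above; the patent only says the relative displacement of the slide is judged, so the pixel threshold here is an assumed value:

    ```kotlin
    class SwipeDetector(private val page: ViewPage, private val thresholdPx: Float = 80f) {
        private var downX = 0f

        fun onTouchDown(x: Float) { downX = x }

        fun onTouchUp(x: Float) {
            val dx = x - downX
            when {
                dx > thresholdPx  -> page.onSwipe(+1)  // left-to-right: graphic on the left
                dx < -thresholdPx -> page.onSwipe(-1)  // right-to-left: graphic on the right
                // |dx| under the threshold: treated as a tap, no switch
            }
        }
    }
    ```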
  • Next, in step 105, it is determined whether an associated terminal is needed to execute the instruction. If not, the process proceeds to step 106: the in-vehicle device executes the acquired instruction and displays the result in the human-computer interaction interface.
  • If an associated terminal is needed, the process proceeds to step 107, and the in-vehicle device sends the instruction to the associated terminal. The terminal can be a mobile phone paired with the in-vehicle device over Bluetooth, in which case the device sends the instruction to the phone via Bluetooth.
  • In step 108, the associated terminal executes the instruction and feeds the result back to the in-vehicle device. The user can thus execute instructions either on the terminal (such as placing a call) or on the in-vehicle device, a flexibility that lets the user choose sensibly while driving.
  • In step 109, the in-vehicle device displays the received result in the interface so the user can see the operation just performed.
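  • The delegation in steps 105-109 could look like the sketch below, which reuses the InteractionInterface sketch above; TerminalLink is a stand-in for the Bluetooth channel to the paired phone, not a real Bluetooth API:

    ```kotlin
    interface TerminalLink {
        fun send(instruction: String)
        fun awaitResult(): String
    }

    class VehicleDevice(private val phone: TerminalLink, private val ui: InteractionInterface) {
        fun execute(instruction: String, needsTerminal: Boolean) {
            val result = if (needsTerminal) {
                phone.send(instruction)  // step 107: forward the instruction to the phone
                phone.awaitResult()      // step 108: the phone executes it and feeds back
            } else {
                runLocally(instruction)  // step 106: the in-vehicle device executes it itself
            }
            ui.showResult(result)        // step 109: show the result in the second display area
        }

        private fun runLocally(instruction: String): String = "executed: $instruction"
    }
    ```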
  • In this embodiment, then, voice information is collected, an instruction and a graphic are generated, the graphic is embedded into the view page, and the graphic of the most recently collected voice information is displayed in the current human-computer interaction interface. Gesture-sliding the interface switches and selects among voice information instructions: using the sliding effect produced when the screen is swiped, the relative displacement of the interface is judged and different responses are executed, which simplifies the operation flow and reduces the impact on normal driving when the in-vehicle device is operated.
  • A second embodiment of the present invention relates to a voice control method. It improves on the first embodiment mainly in that the background color of the first display area differs from that of the second: for example, the first area is black and the second white. Two sharply different colors give the areas a clear boundary, so the user can locate the area holding the desired information by color alone, shortening the time needed to find it.
  • A third embodiment of the present invention relates to a voice control method. It improves on the first and second embodiments mainly in that the areas of the first and second display areas are adjustable: if an area adjustment operation on either area is received, the areas are resized accordingly. In practice, the user drags the border of the first or second display area to a suitable position, and the heights of both areas follow the drag, adjusting their display proportions within the interface. Users can thus tune the layout flexibly and reasonably to their viewing habits, meeting different users' needs.
  • A fourth embodiment of the present invention relates to a voice control device, shown in FIG. 5, comprising: an instruction generation module configured to generate a corresponding instruction from collected voice information; an instruction execution module configured to execute the instruction generated by the instruction generation module; a graphic generation module configured to generate, from the collected voice information, a corresponding graphic that displays the recognition result; an embedding module configured to embed the generated graphic into a view page; a display module configured to display, in the current human-computer interaction interface, the graphic generated from the most recently collected voice information; and a gesture detection module configured to detect whether a gesture sliding operation occurs in the interface. Upon detecting a gesture sliding operation, the gesture detection module triggers the display module to display the indicated graphic in the interface and triggers the instruction execution module to execute that graphic's instruction.
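  • The module names below follow the patent's decomposition; the Kotlin signatures are assumptions, since the patent leaves the interfaces unspecified:

    ```kotlin
    interface InstructionGenerationModule { fun generate(voice: String): () -> Unit }
    interface InstructionExecutionModule  { fun execute(instruction: () -> Unit) }
    interface GraphicGenerationModule     { fun generate(voice: String): String }
    interface EmbeddingModule             { fun embed(graphic: String, instruction: () -> Unit) }
    interface DisplayModule               { fun showLatest() }
    interface GestureDetectionModule      { fun onGesture(displacementPx: Float) }
    ```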
  • This embodiment is an apparatus embodiment corresponding to the first embodiment, and the two can be implemented in cooperation. Technical details mentioned in the first embodiment remain valid here and are not repeated; conversely, details mentioned here also apply to the first embodiment.
  • Each module involved in this embodiment is a logic module: in practice, a logical unit may be one physical unit, part of a physical unit, or a combination of several physical units. To highlight the inventive part of the present invention, units not closely related to solving the technical problem posed by the invention are not introduced here, which does not mean that no other units exist in this embodiment.

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

A voice control method and a device thereof. The method includes the following steps: generating, from collected voice information, a corresponding instruction for execution, and generating a corresponding graphic, the corresponding graphic being used to display the recognition result of the voice information; embedding the generated corresponding graphic into a view page and displaying, in the current human-computer interaction interface, the corresponding graphic generated from the most recently collected voice information; and, if a gesture sliding operation is detected in the human-computer interaction interface, displaying in the interface the corresponding graphic indicated by the gesture sliding operation and executing the instruction of the indicated graphic. Embodiments of the method streamline the human-computer interaction interface, simplify the operation flow, lower the user's operation cost, and reduce the impact of operation on the user's normal driving.

Description

Voice control method and device thereof
Cross-Reference
This application claims priority to Chinese patent application No. 201511031185.3, filed with the Chinese Patent Office on December 30, 2015, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of communications, and in particular to voice control techniques.
Background
In the course of implementing the present invention, the inventor found that the home pages of traditional intelligent speech recognition products in the mobile application market are dominated by piled-up content, and that such products mostly interact in the form of a dialogue. Switching between the recording state and the standby state is usually done by tapping a trigger button, and the interface is filled with excessive text information or with content operations performed after semantic recognition; a user in a vehicle who needs to jump from the speech recognition result page or the semantic execution interface back to the recording state must perform complicated operations to do so.
A user who is driving, however, has stricter requirements on information acquisition: excessive redundant information and an overly complicated interactive interface raise the user's operation cost, lengthen operation time, and interfere with normal driving, so such a user interface is not well suited to in-vehicle products.
Summary
An objective of some embodiments of the present invention is to provide a voice control method and a device thereof that streamline the human-computer interaction interface, simplify the operation flow, lower the user's operation cost, and reduce the impact on the user's normal driving.
To solve the above technical problem, an embodiment of the present invention provides a voice control method including the following steps: generating, from collected voice information, a corresponding instruction for execution, and generating a corresponding graphic, the corresponding graphic being used to display the recognition result of the voice information; embedding the generated corresponding graphic into a view page and displaying, in the current human-computer interaction interface, the corresponding graphic generated from the most recently collected voice information; and, if a gesture sliding operation is detected in the human-computer interaction interface, displaying in the interface the corresponding graphic indicated by the gesture sliding operation and executing the instruction of the indicated graphic.
An embodiment of the present invention further provides a voice control device, including: an instruction generation module configured to generate a corresponding instruction from collected voice information; an instruction execution module configured to execute the instruction generated by the instruction generation module; a graphic generation module configured to generate, from the collected voice information, a corresponding graphic that displays the recognition result of the voice information; an embedding module configured to embed the generated corresponding graphic into a view page; a display module configured to display, in the current human-computer interaction interface, the corresponding graphic generated from the most recently collected voice information; and a gesture detection module configured to detect whether a gesture sliding operation occurs in the human-computer interaction interface. Upon detecting a gesture sliding operation, the gesture detection module triggers the display module to display, in the interface, the corresponding graphic indicated by the operation, and triggers the instruction execution module to execute the instruction of the indicated graphic.
Compared with the prior art, embodiments of the present invention collect voice information and recognize it to generate both a corresponding instruction for execution and a corresponding graphic displaying the recognition result; the graphic is embedded into a view page, and the graphic generated from the most recently collected voice information can be displayed in the human-computer interaction interface. If a gesture sliding operation is detected in the interface, the graphic corresponding to the slide is displayed and its instruction is executed. Using the sliding effect produced when the screen is swiped in the interface, the relative displacement of the interface is judged and different responses are executed, which simplifies the user's operation flow and reduces the impact on normal driving when the in-vehicle device is operated.
In one embodiment, different voice information generates different corresponding graphics, and the graphics are embedded side by side into the view page. In the step of displaying the graphic indicated by the gesture sliding operation, the graphic to the left or right of the currently displayed one is shown according to the sliding direction. Because the graphics sit side by side in the view page and follow the gesture, user operation is effectively simplified.
In one embodiment, the corresponding graphics are embedded side by side into the view page from left to right in the order in which their voice information was collected. This left-to-right layout, combined with gesture sliding to select among the graphics, matches the user's operating habits.
In one embodiment, the step of executing the corresponding instruction includes the following sub-steps: the in-vehicle device sends the instruction to an associated terminal; the associated terminal executes the instruction and feeds the execution result back to the in-vehicle device; and the in-vehicle device displays the received result in the human-computer interaction interface. The associated terminal may be a mobile phone, and the association may be made over Bluetooth. Through this information exchange, the phone feeds the execution result back to the in-vehicle device, which displays it in the interface, so the user can read the result directly from the interface.
In one embodiment, the human-computer interaction interface is divided into a first display area and a second display area; the corresponding graphic is displayed in the first area and the execution result in the second. Splitting the interface into two areas, each showing its own content, simplifies the interface style and makes its content clear at a glance; in an in-vehicle device in particular, it effectively trims redundant information so the user can obtain information quickly.
In one embodiment, the background color of the first display area differs from that of the second. Distinct background colors give the two areas a clear boundary, letting the user locate the area holding the desired information by color alone and shortening the time spent finding it.
In one embodiment, the areas of the first and second display areas are adjustable: if an area adjustment operation on either area is received, the areas are resized accordingly. Users can adjust the areas to suit their viewing habits, making the interface more flexible and reasonable and improving the user experience.
In one embodiment, a button for triggering the voice recognition function is preset in the human-computer interaction interface. Before the step of generating, from collected voice information, the corresponding instruction for execution, the method further includes: if an operation on the button is detected, collecting voice with a voice collection device. Given the flexibility and randomness of actual use, this added button ensures that the voice information collection process is correct and deliberate.
Brief Description of the Drawings
FIG. 1 is a flowchart of a voice control method according to a first embodiment of the present invention;
FIG. 2 is a schematic diagram of a human-computer interaction interface according to the first, second, and third embodiments of the present invention;
FIG. 3 is a schematic diagram of graphic switching when the sliding direction of the gesture sliding operation is left to right, according to the first, second, and third embodiments of the present invention;
FIG. 4 is a schematic diagram of switching the displayed graphic to graphic A by a gesture sliding operation, according to the first, second, and third embodiments of the present invention;
FIG. 5 is a system structure diagram of a voice control device according to a fourth embodiment of the present invention.
Detailed Description
To make the objectives, technical solutions, and advantages of some embodiments of the present invention clearer, the embodiments of the present invention are described in detail below with reference to the accompanying drawings. Those of ordinary skill in the art will understand that many technical details are given in the embodiments to help the reader understand the application; the technical solutions claimed in the claims, however, can be realized even without these details and with various changes and modifications based on the embodiments below.
A first embodiment of the present invention relates to a voice control method. The embodiment is applied to an in-vehicle device, and the specific flow is shown in FIG. 1.
In step 101, it is determined whether an operation of the voice recognition button is detected. Specifically, a button for triggering the voice recognition function is preset in the human-computer interaction interface (such as a touch screen) of the in-vehicle device. If no operation of the button by the user is detected, the process returns to the initial state and continues to check whether the user operates the button for triggering the voice recognition function;
If an operation of the button is detected (for example, a tap on the button is detected), the process proceeds to step 102, and the in-vehicle device collects voice information with a voice collection device, for example a microphone installed on the in-vehicle device.
In this embodiment, considering the flexibility and randomness of actual use, a button for triggering the voice recognition function is provided, and the voice collection device is started to collect voice only when the button is detected to be operated, ensuring that the voice information collection process is correct and deliberate.
Next, in step 103, a corresponding instruction and a corresponding graphic are generated. The collected voice information yields a corresponding instruction for execution and a corresponding graphic that displays the recognition result, for example the text "Call Li Moumou". Different voice information generates different corresponding graphics. Each graphic and its instruction can be saved in the in-vehicle device, so that recalling the graphic of a piece of voice information also recalls its instruction. Specifically, the graphics are embedded side by side into the view page, for example from left to right in the order in which their voice information was collected, and the graphic generated from the most recently collected voice information is displayed in the current human-computer interaction interface, as shown in FIG. 2. In FIG. 2, the interface is drawn with a solid border: C is the graphic currently displayed, B is the graphic of the voice information preceding C, and A is the graphic of the voice information preceding B. Displaying the most recent graphic in the current interface lets the user intuitively follow the current operation.
For example, the entire human-computer interaction interface (such as an app) exists as a single view page. Each time the user issues a voice information recognition instruction, a corresponding graphic is generated in the voice view to present that single round of recognition and semantic understanding; when the user issues another instruction, another graphic is generated. Each issued instruction is completed this way, and the resulting graphics are embedded side by side from left to right in the order in which the voice information was collected, matching the user's operating habits.
In this embodiment, the human-computer interaction interface is divided into a first display area and a second display area; the graphic is shown in the first and the execution result in the second. As shown in FIG. 2, the interface is drawn with a solid border: the upper area I is the first display area, for the graphic, and the lower area II is the second display area, for the execution result. Dividing the interface into two areas, each with its own content, simplifies the interface style, streamlines the displayed information, removes redundancy, and makes the content clear at a glance; in an in-vehicle device in particular, it helps the user obtain information quickly and minimizes the impact on driving.
Next, in step 104, the instruction to be executed is acquired. There are generally two cases:
First, the in-vehicle device takes the instruction of the most recent voice information displayed in the current human-computer interaction interface as the instruction to be executed.
Second, the instruction is obtained by gesture-sliding the interface. Because the graphics and instructions produced by earlier voice operations are stored in the in-vehicle device, the user can slide the interface to retrieve a needed instruction, which improves the experience and eases operation. If a gesture sliding operation is detected, the graphic it indicates is displayed in the interface and that graphic's instruction becomes the instruction to be executed.
Specifically, by sliding horizontally on the interface the user switches to the graphic to the left or right of the one currently displayed, and its instruction is recalled. As shown in FIG. 3, sliding left to right switches from graphic C to graphic B of the preceding voice information (the interface is drawn with a solid border, and after the switch it displays B); sliding left to right again switches from B to A, the graphic before B, as shown in FIG. 4. Correspondingly, sliding right to left switches A back to B, the graphic of the following voice information. Sliding the interface is thus enough to switch among voice information instructions, simplifying the operation flow. In this step, the instruction acquired by the in-vehicle device is the one corresponding to the graphic displayed when the user stops the gesture sliding operation.
Next, in step 105, it is determined whether an associated terminal is needed to execute the instruction. If the answer is no, the process proceeds to step 106: the in-vehicle device executes the acquired instruction and displays the result in the human-computer interaction interface.
If an associated terminal is needed, that is, the answer is yes, the process proceeds to step 107, and the in-vehicle device sends the instruction to the associated terminal. The terminal can be a mobile phone paired with the in-vehicle device over Bluetooth, in which case the device sends the instruction to the phone via Bluetooth.
Next, in step 108, the associated terminal executes the instruction and feeds the result back to the in-vehicle device. The user can execute instructions either on the terminal (such as placing a call) or on the in-vehicle device, a flexibility that lets the user choose sensibly while driving.
Next, in step 109, the in-vehicle device displays the received result in the interface so the user can see the operation just performed.
It is not hard to see that in this embodiment voice information is collected, an instruction and a graphic are generated, the graphic is embedded into the view page, and the graphic of the most recently collected voice information is displayed in the current human-computer interaction interface. In addition, gesture-sliding the interface switches and selects among voice information instructions. Using the sliding effect produced when the screen is swiped, the relative displacement of the interface is judged and different responses are executed, simplifying the operation flow and reducing the impact on normal driving when the in-vehicle device is operated.
A second embodiment of the present invention relates to a voice control method. It improves on the first embodiment mainly in that the background color of the first display area differs from that of the second: for example, the first area is black and the second white. Two sharply different colors give the areas a clear boundary, so the user can locate the area holding the desired information by color alone, shortening the time needed to find it.
A third embodiment of the present invention relates to a voice control method. It improves on the first and second embodiments mainly in that the areas of the first and second display areas are adjustable: if an area adjustment operation on either area is received, the areas are resized accordingly. In practice, the user drags the border of the first or second display area until it reaches a suitable position, and the heights of both areas follow the drag, adjusting their display proportions within the interface. Users can thus tune the layout flexibly and reasonably to their viewing habits, meeting different users' needs.
The steps of the above methods are divided only for clarity of description; in implementation they may be merged into one step, or a step may be split into multiple steps, and all such variants fall within the protection scope of this patent as long as they contain the same logical relationship. Adding insignificant modifications to the algorithm or flow, or introducing insignificant designs, without changing the core design of the algorithm and flow, also falls within the protection scope of the patent.
A fourth embodiment of the present invention relates to a voice control device, shown in FIG. 5, comprising: an instruction generation module configured to generate a corresponding instruction from collected voice information; an instruction execution module configured to execute the instruction generated by the instruction generation module; a graphic generation module configured to generate, from the collected voice information, a corresponding graphic that displays the recognition result; an embedding module configured to embed the generated graphic into a view page; a display module configured to display, in the current human-computer interaction interface, the graphic generated from the most recently collected voice information; and a gesture detection module configured to detect whether a gesture sliding operation occurs in the interface. Upon detecting a gesture sliding operation, the gesture detection module triggers the display module to display the indicated graphic in the interface and triggers the instruction execution module to execute that graphic's instruction.
It is not hard to see that this embodiment is an apparatus embodiment corresponding to the first embodiment and can be implemented in cooperation with it. Technical details mentioned in the first embodiment remain valid here and are not repeated; conversely, details mentioned here also apply to the first embodiment.
It is worth mentioning that each module in this embodiment is a logic module. In practice, a logical unit may be one physical unit, part of a physical unit, or a combination of several physical units. To highlight the inventive part of the present invention, units not closely related to solving the technical problem posed by the invention are not introduced here, which does not mean that no other units exist in this embodiment.
Those of ordinary skill in the art will understand that the above embodiments are specific examples of implementing the present invention, and that in practical applications various changes in form and detail may be made to them without departing from the spirit and scope of the invention.

Claims (10)

  1. A voice control method, comprising the following steps:
    generating, from collected voice information, a corresponding instruction for execution, and generating a corresponding graphic, the corresponding graphic being used to display a recognition result of the voice information;
    embedding the generated corresponding graphic into a view page, and displaying, in a current human-computer interaction interface, the corresponding graphic generated from the most recently collected voice information; and
    if a gesture sliding operation is detected in the human-computer interaction interface, displaying, in the human-computer interaction interface, the corresponding graphic indicated by the gesture sliding operation, and executing the instruction of the indicated corresponding graphic.
  2. The voice control method according to claim 1, wherein
    different voice information generates different corresponding graphics;
    the corresponding graphics are embedded side by side into the view page; and
    in the step of displaying the corresponding graphic indicated by the gesture sliding operation in the human-computer interaction interface, the corresponding graphic to the left or right of the currently displayed corresponding graphic is displayed according to the sliding direction of the gesture sliding operation.
  3. The voice control method according to claim 2, wherein
    the corresponding graphics are embedded side by side into the view page from left to right in the order in which the respective voice information was collected.
  4. The voice control method according to any one of claims 1 to 3, wherein the voice control method is applied to an in-vehicle device.
  5. The voice control method according to claim 4, wherein the step of executing the corresponding instruction comprises the following sub-steps:
    the in-vehicle device sends the instruction to an associated terminal;
    the associated terminal executes the instruction and feeds an execution result of the instruction back to the in-vehicle device; and
    the in-vehicle device displays the received execution result in the human-computer interaction interface.
  6. The voice control method according to claim 5, wherein the human-computer interaction interface is divided into a first display area and a second display area;
    the corresponding graphic is displayed in the first display area; and
    the execution result is displayed in the second display area.
  7. The voice control method according to claim 6, wherein a background color of the first display area differs from a background color of the second display area.
  8. The voice control method according to claim 6 or 7, wherein areas of the first display area and the second display area are adjustable; and
    if an area adjustment operation on the first display area or the second display area is received, the areas are adjusted according to the received area adjustment operation.
  9. The voice control method according to any one of claims 1 to 8, wherein a button for triggering a voice recognition function is preset in the human-computer interaction interface; and
    before the step of generating, from the collected voice information, the corresponding instruction for execution, the method further comprises:
    collecting voice with a voice collection device if an operation on the button is detected.
  10. A voice control device, comprising:
    an instruction generation module configured to generate a corresponding instruction from collected voice information;
    an instruction execution module configured to execute the corresponding instruction generated by the instruction generation module;
    a graphic generation module configured to generate a corresponding graphic from the collected voice information, the corresponding graphic being used to display a recognition result of the voice information;
    an embedding module configured to embed the generated corresponding graphic into a view page;
    a display module configured to display, in a current human-computer interaction interface, the corresponding graphic generated from the most recently collected voice information; and
    a gesture detection module configured to detect whether a gesture sliding operation occurs in the human-computer interaction interface,
    wherein, upon detecting a gesture sliding operation, the gesture detection module triggers the display module to display, in the human-computer interaction interface, the corresponding graphic indicated by the gesture sliding operation, and triggers the instruction execution module to execute the instruction of the indicated corresponding graphic.
PCT/CN2016/089578 2015-12-30 2016-07-10 Voice control method and device thereof WO2017113738A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US15/241,417 US20170193992A1 (en) 2015-12-30 2016-08-19 Voice control method and apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2015110311853 2015-12-30
CN201511031185.3A 2015-12-30 2015-12-30 Voice control method and device thereof

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US15/241,417 Continuation US20170193992A1 (en) 2015-12-30 2016-08-19 Voice control method and apparatus

Publications (1)

Publication Number Publication Date
WO2017113738A1 (zh)

Family

ID=56744061

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/089578 WO2017113738A1 (zh) 2015-12-30 2016-07-10 语音控制方法及其设备

Country Status (3)

Country Link
US (1) US20170193992A1 (zh)
CN (1) CN105912187A (zh)
WO (1) WO2017113738A1 (zh)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107039039A (zh) * 2017-06-08 2017-08-11 湖南中车时代通信信号有限公司 Voice-based in-vehicle human-computer interaction method and device for a train operation monitoring system
US10449440B2 (en) 2017-06-30 2019-10-22 Electronic Arts Inc. Interactive voice-controlled companion application for a video game
US10621317B1 (en) 2017-09-14 2020-04-14 Electronic Arts Inc. Audio-based device authentication system
CN110618750A (zh) * 2018-06-19 2019-12-27 阿里巴巴集团控股有限公司 Data processing method, apparatus, and machine-readable medium
CN109068010A (zh) * 2018-11-06 2018-12-21 上海闻泰信息技术有限公司 Voice content recording method and device
CN109669754A (zh) * 2018-12-25 2019-04-23 苏州思必驰信息科技有限公司 Dynamic display method for a voice interaction window, and voice interaction method and device with a retractable interaction window
CN110288989A (zh) * 2019-06-03 2019-09-27 安徽兴博远实信息科技有限公司 Voice interaction method and system
US10926173B2 (en) * 2019-06-10 2021-02-23 Electronic Arts Inc. Custom voice control of video game character
CN112210951B (zh) * 2019-06-24 2023-07-25 青岛海尔洗衣机有限公司 Water replenishment control method for a washing device
CN110290219A (zh) * 2019-07-05 2019-09-27 斑马网络技术有限公司 Data interaction method, apparatus, and device for an in-vehicle robot, and readable storage medium
CN111240477A (zh) * 2020-01-07 2020-06-05 北京汽车研究总院有限公司 In-vehicle human-computer interaction method and system, and vehicle having the system
CN111309283B (zh) * 2020-03-25 2023-12-05 北京百度网讯科技有限公司 Voice control method and apparatus for a user interface, electronic device, and storage medium
CN113495622A (zh) * 2020-04-03 2021-10-12 百度在线网络技术(北京)有限公司 Interaction mode switching method and apparatus, electronic device, and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103338311A (zh) * 2013-07-11 2013-10-02 成都西可科技有限公司 Method for launching an app from a smartphone lock-screen interface
CN104360805A (zh) * 2014-11-28 2015-02-18 广东欧珀移动通信有限公司 Application icon management method and device
CN104599669A (zh) * 2014-12-31 2015-05-06 乐视致新电子科技(天津)有限公司 Voice control method and device

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE60125597T2 (de) * 2000-08-31 2007-05-03 Hitachi, Ltd. Device for service mediation
US9542958B2 (en) * 2012-12-18 2017-01-10 Seiko Epson Corporation Display device, head-mount type display device, method of controlling display device, and method of controlling head-mount type display device
CN104049727A (zh) * 2013-08-21 2014-09-17 惠州华阳通用电子有限公司 Method for mutual control between a mobile terminal and an in-vehicle terminal
DE112014000297T5 (de) * 2014-02-18 2015-11-12 Mitsubishi Electric Corporation Speech recognition device and display method


Also Published As

Publication number Publication date
US20170193992A1 (en) 2017-07-06
CN105912187A (zh) 2016-08-31

Similar Documents

Publication Publication Date Title
WO2017113738A1 (zh) Voice control method and device thereof
US20150143285A1 (en) Method for Controlling Position of Floating Window and Terminal
CN105511781B (zh) Method, apparatus, and user equipment for starting an application
KR102449601B1 (ko) System and method for initiating elevator service by entering an elevator call
EP2813930A1 (en) Terminal reselection operation method and terminal
KR20150072074A (ko) Vehicle gesture recognition system and method for controlling the same
CN102855056A (zh) Terminal and terminal control method
WO2007061057A1 (ja) Gesture input device and method
CN104885047A (zh) Terminal and terminal operation method
CN106415469A (zh) Method and user interface for adapting a view on a display unit
WO2015043113A1 (zh) Method for one-handed operation of a handheld device screen
JP2010224658A (ja) Operation input device
WO2019114808A1 (zh) In-vehicle terminal device and display processing method for application components thereof
CN103853481A (zh) Method and system for simulating keys of a touch-screen mobile terminal
CN108733283A (zh) Contextual vehicle user interface
CN103677578A (zh) Method for masking part of a screen and portable terminal applying the same
EP3726360B1 (en) Device and method for controlling vehicle component
WO2012151926A1 (zh) Method and device for setting standby wallpaper display on a touch screen
CN105630395A (zh) Terminal operation method, terminal operation apparatus, and terminal
KR20140085942A (ko) Touch-based method and apparatus for manipulating graphic objects
CN104020989B (zh) Remote-application-based control method and system
KR20170107767A (ko) Vehicle terminal operation system and method thereof
CN105912158B (zh) Touch-screen photographing method and device for a mobile terminal, and mobile terminal
CN107450821A (zh) Method and device for invoking a menu by performing a gesture on an application interface
TWI607369B (zh) System and method for adjusting screen display

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16880536

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16880536

Country of ref document: EP

Kind code of ref document: A1