WO2017012511A1 - Voice control method, apparatus and projector device (语音控制方法、装置及投影仪设备) - Google Patents

Voice control method, apparatus and projector device (语音控制方法、装置及投影仪设备)

Info

Publication number
WO2017012511A1
WO2017012511A1 (PCT/CN2016/090170)
Authority
WO
WIPO (PCT)
Prior art keywords
voice
instruction
command
stored
wake
Prior art date
Application number
PCT/CN2016/090170
Other languages
English (en)
French (fr)
Inventor
朱渊
Original Assignee
中兴通讯股份有限公司 (ZTE Corporation)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司 (ZTE Corporation)
Publication of WO2017012511A1

Links

Images

Classifications

    • G - PHYSICS
    • G05 - CONTROLLING; REGULATING
    • G05B - CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B19/00 - Programme-control systems
    • G05B19/02 - Programme-control systems electric
    • G05B19/04 - Programme control other than numerical control, i.e. in sequence controllers or logic controllers
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 - Speech recognition
    • G10L15/26 - Speech to text systems

Definitions

  • This document relates to, but is not limited to, the field of communications, and in particular to a voice control method, apparatus, and projector device.
  • A projector, also known as a projection machine, is a device that can project images or video onto a screen. Through different interfaces it can be connected to a computer, a Video Compact Disc (VCD) player, a Digital Video Disc (DVD) player, a game console, and so on, to play the corresponding video signal. Projectors are widely used in homes, offices, schools, and entertainment venues. By application environment, projectors can be divided into the following categories: home theater projectors, portable business projectors, education and conference projectors, mainstream engineering projectors, professional theater projectors, and measurement projectors.
  • A common feature of these projectors is that they must be operated manually with a remote control; such manual operation is cumbersome, resulting in a poor user experience and a lack of fun.
  • The present invention provides a voice control method, apparatus, and projector device, which can solve the related-art problem that manually operating a projector is cumbersome and leads to a poor user experience.
  • This article provides a voice control method that includes:
  • the voice recognition state is a state in which operations are performed according to voice instructions;
  • determining that the projector device enters the voice recognition state comprises:
  • a touch signal following a predetermined track, a voice signal, or a button signal.
  • recognizing the received voice instruction and performing the operation corresponding to the voice instruction includes:
  • before recognizing the received voice instruction and performing the corresponding operation, the method further includes:
  • storing the obtained file name and/or application name in a specified location, where, when a file name stored in the specified location is recognized from speech, the file corresponding to that file name is invoked, and when an application name stored in the specified location is recognized from speech, the application corresponding to that application name is invoked.
  • receiving the input voice instruction includes:
  • the projector device receives the voice instruction through a peripheral device, where the peripheral device includes one or more of the following:
  • wired headsets, Bluetooth headsets.
  • a voice control device comprising:
  • a determining module configured to determine that the projector device enters a voice recognition state, where the voice recognition state is a state in which operations are performed according to voice instructions;
  • a receiving module configured to receive an input voice instruction;
  • an execution module configured to recognize the received voice instruction and perform the operation corresponding to the voice instruction.
  • the determining module includes:
  • a determining unit configured to determine that the projector device enters the voice recognition state by receiving a wake-up instruction, where the wake-up instruction includes one or more of the following:
  • a touch signal following a predetermined track, a voice signal, or a button signal.
  • the execution module includes:
  • a judging unit configured to determine whether an instruction matching the voice instruction is stored in advance;
  • an execution unit configured to perform the operation corresponding to the voice instruction when the judging unit determines that a matching instruction is stored in advance.
  • the foregoing apparatus further includes:
  • an obtaining module configured to obtain the file names of pre-stored files and/or the application names of pre-installed applications;
  • a storage module configured to store the file names and/or application names in a specified location;
  • the execution unit is further configured to invoke the file corresponding to a recognized file name when a file name stored in the specified location is recognized from speech, and to invoke the application corresponding to a recognized application name when an application name stored in the specified location is recognized from speech.
  • the receiving module receiving the input voice instruction includes:
  • receiving the voice instruction through a peripheral device supported by the projector device, where the peripheral device includes one or more of the following:
  • wired headsets, Bluetooth headsets.
  • a projector device comprising at least: a low-power wake-up chip, a speech engine, and a standard stream component, where
  • the low-power wake-up chip is configured to enter a voice recognition state according to a wake-up instruction, where the voice recognition state is a state in which operations are performed according to voice instructions;
  • the speech engine is configured to receive an input voice instruction;
  • the standard stream component is configured to recognize the received voice instruction and perform the operation corresponding to the voice instruction.
  • By determining that the projector device has entered a voice recognition state (a state in which operations are performed according to voice instructions), receiving an input voice instruction, and recognizing the received instruction and performing the corresponding operation, the scheme solves the related-art problem that manually operating a projector is cumbersome and leads to a poor user experience, reducing the complexity of projector operation and improving the user experience.
  • FIG. 1 is a flow chart of a voice control method according to an embodiment of the present invention.
  • FIG. 2 is a block diagram showing the structure of a voice control apparatus according to an embodiment of the present invention.
  • FIG. 3 is a structural block diagram of a determining module 22 in a voice control apparatus according to an embodiment of the present invention
  • FIG. 4 is a block diagram showing the structure of an execution module 26 in a voice control apparatus according to an embodiment of the present invention.
  • FIG. 5 is a block diagram showing another structure of a voice control apparatus according to an embodiment of the present invention.
  • FIG. 6 is a block diagram showing the structure of a voice-activated projector system according to an embodiment of the present invention.
  • FIG. 7 is a low power consumption wake-up flowchart of a voice-activated projector system in accordance with an embodiment of the present invention.
  • Figure 8 is a diagram showing the operational state of a voice-activated projector system in accordance with an embodiment of the present invention.
  • FIG. 1 is a flowchart of a voice control method according to an embodiment of the present invention; as shown in FIG. 1, the flow includes the following steps:
  • Step S102 determining that the projector device enters a voice recognition state, wherein the voice recognition state is a state in which an operation is performed according to the voice instruction;
  • Step S104 receiving an input voice instruction
  • Step S106 identifying the received voice command, and performing an operation corresponding to the voice command.
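The three steps above can be sketched as a minimal control loop. This is only an illustrative sketch, not the patented implementation: the class name, the wake phrase, and the command table are assumptions introduced here.

```python
from typing import Callable, Dict, Optional

class VoiceController:
    """Minimal sketch of steps S102-S106; names and values are illustrative."""

    def __init__(self, commands: Dict[str, Callable[[], str]]):
        self.commands = commands   # pre-stored instruction -> operation
        self.awake = False         # True while in the voice recognition state

    def wake(self, wake_word: str, expected: str = "hello projector") -> bool:
        # Step S102: enter the voice recognition state on a matching wake-up command.
        self.awake = (wake_word == expected)
        return self.awake

    def handle(self, instruction: str) -> Optional[str]:
        # Step S104: receive the input voice instruction (here, already as text).
        if not self.awake:
            return None
        # Step S106: recognize it and perform the matching pre-stored operation.
        operation = self.commands.get(instruction)
        return operation() if operation else "unable to recognize the command"

controller = VoiceController({"open the projection": lambda: "projection opened"})
controller.wake("hello projector")
print(controller.handle("open the projection"))  # projection opened
```

Unknown input falls through to the error prompt, matching the feedback behavior described later in the text.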
  • Through the above steps, the projector device can be operated by voice instructions, avoiding the cumbersome steps of manual operation. This solves the related-art problem that manually operating the projector is cumbersome and leads to a poor user experience, reducing the complexity of projector operation and improving the user experience.
  • determining that the projector device enters the voice recognition state comprises: determining that the projector device enters the voice recognition state by receiving a wakeup command, wherein the wakeup command includes one or more of the following:
  • a touch signal, a voice signal, and a button signal of a predetermined track are provided.
  • recognizing the received voice instruction and performing the operation corresponding to it includes: determining whether an instruction matching the voice instruction is stored in advance and, if a matching instruction is stored in advance, performing the operation corresponding to the voice instruction. If no matching instruction is stored, a prompt message, such as an "unable to recognize the command" prompt, may be fed back.
  • performing the operation corresponding to the voice instruction may include: executing the pre-stored instruction that matches the voice instruction.
  • before recognizing the received voice instruction and performing the corresponding operation, the method further includes: obtaining the file names of pre-stored files and/or the application names of pre-installed applications, and storing those file names and/or application names, where a file name is used to invoke the file corresponding to it according to a voice instruction, and an application name is used to invoke the application corresponding to it according to a voice instruction.
  • the purpose of storing these file names and application names is to conveniently invoke the corresponding files and applications according to voice instructions.
  • when a new file is stored or a new application is installed, the file name of the newly stored file and the application name of the newly installed application are stored in the same way as the pre-stored file names and pre-installed application names.
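As a sketch of the loading step described above, the helper below collects file names from a media directory and installed application names into a flat set of recognizable phrases. The directory layout and the phrase templates ("play ...", "open ...") are assumptions for illustration; a real engine would compile these into a grammar file.

```python
import os

def build_grammar(media_dir: str, installed_apps: list) -> set:
    """Collect file and app names into the set of recognizable phrases.

    Sketch only: shows how the vocabulary can track stored files and
    installed applications automatically, as the text describes.
    """
    phrases = set()
    for name in os.listdir(media_dir):
        stem, _ = os.path.splitext(name)
        phrases.add("play " + stem)   # e.g. "play holiday" for holiday.mp4
        phrases.add("open " + stem)
    for app in installed_apps:
        phrases.add("open " + app)    # newly installed apps join the grammar
    return phrases
```

Calling this again after a file is saved or an app is installed keeps the recognizable phrase set up to date, which mirrors the automatic reloading behavior described above.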
  • the projector device supports receiving the above voice instruction through a peripheral device, where the peripheral device includes one or more of the following:
  • wired headsets, Bluetooth headsets.
  • the method according to the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, or by hardware alone, although in many cases the former is the more common implementation.
  • the portions of the embodiments of the present invention that contribute to the related art may be embodied as a software product stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disc), including a number of instructions for causing a terminal device (which may be a mobile phone, a computer, a server, a network device, etc.) to perform the methods described in the embodiments of the present invention.
  • a voice control apparatus is also provided, which is used to implement the above embodiments and optional embodiments; what has already been described is not repeated.
  • as used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function.
  • although the apparatus described in the following embodiments is preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
  • FIG. 2 is a block diagram showing the structure of a voice control apparatus according to an embodiment of the present invention. As shown in FIG. 2, the apparatus includes a determining module 22, a receiving module 24, and an executing module 26. The apparatus will be described below.
  • the determining module 22 is configured to determine that the projector device enters a voice recognition state, where the voice recognition state is a state in which operations are performed according to voice instructions; the receiving module 24, connected to the determining module 22, is configured to receive the input voice instruction; the execution module 26, coupled to the receiving module 24, is configured to recognize the received voice instruction and perform the operation corresponding to it.
  • FIG. 3 is a block diagram showing the structure of the determining module 22 in the voice control apparatus according to the embodiment of the present invention.
  • the determining module 22 includes a determining unit 32, which will be described below.
  • the determining unit 32 is configured to determine that the projector device enters a voice recognition state by receiving a wake-up command, where the wake-up command includes one or more of the following:
  • a touch signal, a voice signal, and a button signal of a predetermined track are provided.
  • FIG. 4 is a structural block diagram of the execution module 26 in a voice control apparatus according to an embodiment of the present invention. As shown in FIG. 4, the execution module 26 includes a determining unit 42 and an execution unit 44, which are described below:
  • the determining unit 42 is configured to determine whether an instruction matching the voice instruction is stored in advance; the execution unit 44, connected to the determining unit 42, is configured to perform the operation corresponding to the voice instruction when the determining unit 42 finds that a matching instruction is stored in advance.
  • FIG. 5 is another structural block diagram of a voice control apparatus according to an embodiment of the present invention. As shown in FIG. 5, the apparatus includes, in addition to all the modules shown in FIG. 2, an acquisition module 52 and a storage module 54, which are described below:
  • the obtaining module 52 is configured to obtain a file name of the pre-stored file and/or an application name of the pre-installed application;
  • the storage module 54, connected to the obtaining module 52 and the execution module 26, is configured to store the file name and/or the application name in a specified location;
  • the execution unit is further configured to invoke the file corresponding to a recognized file name when a file name stored in the specified location is recognized from speech, and to invoke the application corresponding to a recognized application name when an application name stored in the specified location is recognized from speech.
  • the projector device described above supports receiving voice commands through a peripheral device, where the peripheral device includes one or more of the following:
  • wired headsets, Bluetooth headsets.
  • this embodiment further provides a projector device that includes at least: a low-power wake-up chip, a speech engine, and a standard stream component, where the low-power wake-up chip is configured to enter a voice recognition state according to a wake-up instruction, where
  • the voice recognition state is a state in which operations are performed according to voice instructions; the speech engine is configured to receive an input voice instruction; and the standard stream component is configured to recognize the received voice instruction and perform the operation corresponding to it.
  • the low-power wake-up chip may be connected to a voice engine, and the voice engine may be connected to a standard stream component, and the low-power wake-up chip and the standard stream component may or may not be connected.
  • the related technologies may include the following aspects:
  • Speech recognition technology is also known as Automatic Speech Recognition (ASR).
  • Speech recognition technology aims to convert the lexical content of human speech into computer-readable input such as key presses, binary codes, or character sequences. It differs from speaker recognition and speaker verification, which attempt to identify or verify the speaker rather than the lexical content of the speech.
  • Speech recognition technology allows a machine to transform a speech signal into corresponding text or commands through a process of recognition and understanding. It mainly involves three aspects: feature extraction, pattern-matching criteria, and model training.
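Of the three aspects just named, the pattern-matching criterion is the easiest to illustrate. The toy matcher below scores an input feature vector against stored templates by Euclidean distance; real engines use far richer features and models, so this is only an assumed, simplified criterion with made-up template values.

```python
import math

def match_template(features, templates):
    """Return the template name whose feature vector is nearest the input."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(templates, key=lambda name: dist(features, templates[name]))

# Two toy 2-D "feature" templates standing in for trained models.
templates = {"boot": [1.0, 0.0], "shutdown": [0.0, 1.0]}
print(match_template([0.9, 0.2], templates))  # boot
```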
  • speech recognition tasks can be roughly divided into three categories, namely, isolated word recognition, keyword recognition (or keyword spotting) and continuous speech recognition.
  • isolated word recognition is to identify previously known isolated words, such as "boot”, "shutdown”, etc.
  • continuous speech recognition task is to identify any continuous speech, such as a sentence or a paragraph
  • keyword spotting also targets continuous speech, but instead of recognizing all of the words it only detects where one or more known keywords appear, for example detecting the two words "computer" and "world" in a paragraph of speech.
  • voice recognition of isolated words may be adopted; that is, the voice instructions to be supported are pre-edited into a grammar file, which the engine compiles to generate the corresponding recognition range.
  • only the instructions pre-defined in the grammar are supported for the user.
  • Low-power digital signal processor (DSP) voice wake-up refers to a technique in which, after a terminal (e.g., a mobile phone or a wireless Access Point (AP)) goes to sleep, i.e., the Central Processing Unit (CPU) stops working, a dedicated DSP processing unit can, through a specific trigger, wake the CPU and bring it back into the working state. It is aimed at voice control scenarios that completely free the user's hands: on the basis of maximum power saving while the system sleeps, the device can be woken by voice. This line of work opens the way to replacing "finger + visual touch" input with "speech + auditory response" for device operation, achieving a fully voice-driven, intelligent human-computer interaction experience.
  • Speech interruption refers to a special speech recognition technique for recognizing speech over a steady-state background sound. With this function, when using a speech recognition system, the user does not have to wait for a beep before speaking; instead, the user's voice can interrupt the prompt tone and go directly into speech recognition (a process called barge-in).
  • the key to speech interruption is the speech endpoint detection function.
  • the purpose of endpoint detection is to distinguish the speech signal from the non-speech signal in the signal stream in a complex application environment, and to determine the start and end of the speech signal.
  • a real signal stream always carries some background sound, while the speech recognition model is trained on speech signals, and pattern matching is only meaningful between a speech signal and the speech model. Detecting the speech signal within the signal stream is therefore a necessary pre-processing step for speech recognition.
  • endpoint detection can involve two stages: first, parameters such as energy, zero-crossing rate, entropy, and pitch, together with parameters derived from them, are used to classify portions of the signal stream as speech or non-speech; the start and end of the speech signal are then determined from that classification.
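Two of the parameters just listed, short-time energy and zero-crossing rate, are enough for a minimal endpoint detector. The frame size and thresholds below are illustrative assumptions, not values from the patent.

```python
def frame_features(frame):
    """Short-time energy and zero-crossing count for one frame of samples."""
    energy = sum(s * s for s in frame)
    zero_crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return energy, zero_crossings

def detect_endpoints(frames, energy_thr=0.5, zcr_thr=10):
    """Classify each frame as speech/non-speech by thresholding, then
    return the (start, end) frame indices of the speech, or None."""
    speech = [
        i for i, f in enumerate(frames)
        if frame_features(f)[0] > energy_thr or frame_features(f)[1] > zcr_thr
    ]
    return (speech[0], speech[-1]) if speech else None
```

On a stream of silent frames this returns None; once voiced frames appear, it reports where the speech starts and ends, which is exactly the decision the recognizer needs before pattern matching.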
  • in an exemplary usage flow of this embodiment:
  • the user wakes up the projector with a wake-up command and enters a voice recognition state.
  • the wake-up command supports custom recording and training.
  • the user can also wake the projector manually, for example by long-pressing the device's home button, to enter the voice recognition state.
  • the user can then speak any preset voice instruction to tell the projector what to do next, such as: open the projection, close the projection, play ****, open *** (where *** is the file name of a video, the name of a PPT document, the application name of an installed application, etc.).
  • the file name or application name can be automatically loaded into the grammar; as long as an application is installed in the system, it can be automatically loaded into the grammar.
  • when the user inputs an instruction that is not preset by the projector, the projector prompts the user that the input is in error and enters a re-input flow.
  • the user can control video playback through the voice interrupt technique, and voice commands can be input at any time during playback.
  • the user can speak video control commands such as: increase the volume, turn down the volume, pause, resume playing, and exit playback.
  • the user can control the PPT play through the voice interrupt technology, and the voice command can be input at any time during the PPT presentation.
  • Users can say PPT control commands such as: previous page, next page, first page, last page, exit full screen, full screen playback, etc.
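The playback and presentation commands above map naturally onto a dispatch table. The sketch below is an assumption for illustration: the command strings follow the text, but the internal state (volume scale, step sizes, page counter) is invented here.

```python
class PlaybackController:
    """Sketch dispatcher for the video/PPT commands listed above."""

    def __init__(self):
        self.volume = 50      # assumed 0-100 scale
        self.playing = True
        self.page = 1

    def dispatch(self, command: str) -> str:
        handlers = {
            "increase the volume": self.volume_up,
            "turn down the volume": self.volume_down,
            "pause": self.pause,
            "resume playing": self.resume,
            "next page": self.next_page,
            "previous page": self.previous_page,
        }
        handler = handlers.get(command)
        # Unknown input triggers the error prompt described above.
        return handler() if handler else "unable to recognize the command"

    def volume_up(self):
        self.volume = min(100, self.volume + 10)
        return "volume %d" % self.volume

    def volume_down(self):
        self.volume = max(0, self.volume - 10)
        return "volume %d" % self.volume

    def pause(self):
        self.playing = False
        return "paused"

    def resume(self):
        self.playing = True
        return "playing"

    def next_page(self):
        self.page += 1
        return "page %d" % self.page

    def previous_page(self):
        self.page = max(1, self.page - 1)
        return "page %d" % self.page
```

A flat dispatch table like this fits the isolated-word grammar described earlier: every key in the table is exactly one phrase the grammar must contain.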
  • Peripherals such as wired headsets and Bluetooth headsets can be connected to the projector and used as voice input devices to control it. For example, if the user is standing far from the projector, it can be voice-controlled through a Bluetooth headset.
  • throughout the process, the projector shows a user interface (UI) prompt on the projector screen, and a voice or prompt tone tells the user when to start inputting an instruction, when input ends, when input is in error, and so on.
  • FIG. 6 is a block diagram showing the structure of a voice-activated projector system according to an embodiment of the present invention. As shown in FIG. 6, the system consists of three parts: a low-power wake-up chip module (the Low-power Wakeup DSP Chip in FIG. 6, i.e., the low-power wake-up chip described above), a recognition and broadcast engine module (the Voice Engine in FIG. 6, i.e., the speech engine described above), and a standard stream component module (the Standard Flow Component in FIG. 6, i.e., the standard stream component described above).
  • the main functions of each module are as follows:
  • the low-power wake-up chip module belongs to a hardware device and is configured to monitor a user's wake-up operation while the projector is sleeping;
  • the recognition and broadcast engine module is a core module for voice recognition and voice announcement, and is responsible for recognizing the collected audio.
  • FIG. 7 is a low power consumption wake-up flowchart of a voice-activated projector system according to an embodiment of the present invention. As shown in FIG. 7, the flow includes the following steps S702 to S712:
  • Step S702, the user speaks the wake-up word;
  • Step S704 the low-power wake-up chip continuously monitors the user voice input while the projector is sleeping;
  • Step S706 when the user's voice input is consistent with the wake-up words of the preset training, the low-power wake-up chip wakes up the CPU, and reports a wake-up event to the driver layer;
  • Step S708 the framework layer then notifies the application layer by means of a message
  • Step S710, the application layer starts the voice recognition process.
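The S702-S710 flow can be mimicked in a few lines. The layer names in the comments follow the text, but the queue-based event signalling and the event names are assumptions introduced purely for this sketch.

```python
import queue

def wake_monitor(audio_inputs, trained_wake_word, events):
    """Simulate the low-power chip's monitoring loop (steps S702-S710).

    Each input is compared with the trained wake word; on a match, a wake
    event is posted (driver layer reports upward) followed by a
    start-recognition event (framework layer notifies the app layer).
    """
    for heard in audio_inputs:
        if heard == trained_wake_word:
            events.put("wake_cpu")            # chip wakes the CPU, driver reports it
            events.put("start_recognition")   # application layer starts recognition
            return True
    return False   # no match: the projector keeps sleeping

events = queue.Queue()
wake_monitor(["background noise", "hi projector"], "hi projector", events)
print(events.get())  # wake_cpu
```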
  • the low-power wake-up chip makes it possible to completely free the user's hands and makes the voice control process a closed-loop operation. Since the low-power wake-up chip is a hardware component that cannot be fitted on some projector models, the system supports removing this module on low-end projectors, in which case the user can wake the device by other means, such as peripheral devices or the projector's buttons.
  • FIG. 8 is a diagram showing an operation state of a voice-activated projector system according to an embodiment of the present invention, which will be described below with reference to FIG. 8:
  • When the device is initialized and woken up, it enters the recording state and waits for the user to input a voice instruction.
  • the user has two possible behaviors at this point: either there is no sound and the recognition process times out, or a sound is recorded by the projector and the device enters the subsequent recognition state.
  • after entering the recognition state, if it is recognized that the user has spoken a correct instruction, the instruction is dispatched to the corresponding standard stream component for processing; if the instruction cannot be recognized, the user is prompted that the input is in error and may re-enter or exit.
  • recording interruption is a special recognition mode under steady-state background sound, such as voice control during video playback. In this mode, recording is kept continuously on to detect the user's voice input, with denoising applied against the steady-state background sound. If a voice input matching a preset dynamic command is detected, the engine returns the recognition result and informs the standard stream component to perform the corresponding operation, then continues to detect the next voice input; the interruptible recording does not stop until the user exits video playback.
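The barge-in loop above amounts to a small state machine: keep recording, execute any recognized dynamic command, ignore background sound, and stop only on exit. The command strings follow the text; the loop structure itself is an assumed simplification.

```python
def run_barge_in(voice_inputs,
                 commands=("increase the volume", "turn down the volume",
                           "pause", "resume playing", "exit playback")):
    """Continuously 'record' during playback, executing recognized
    commands and ignoring background sound, until playback is exited."""
    executed = []
    for utterance in voice_inputs:
        if utterance in commands:          # engine returns a recognition result
            executed.append(utterance)     # standard component performs it
            if utterance == "exit playback":
                break                      # only exiting stops the recording
        # anything else is treated as background sound and ignored
    return executed

print(run_barge_in(["hum", "pause", "resume playing", "exit playback"]))
```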
  • a voice-activated projector system is proposed to solve the problem.
  • the system combines hardware and software to enable the user to wake the projector by sound and issue voice commands.
  • the whole process can be a closed-loop operation: the entire chain is completed by voice control with no manual operation required, freeing the user's hands and greatly enhancing the efficiency and fun of using the projector.
  • the system supports tailoring, with features and hardware configuration trimmed as needed.
  • modules may be implemented by software or hardware.
  • the above may be implemented in, but is not limited to, the following forms: the modules are all located in the same processor, or the modules are respectively located in multiple processors.
  • the embodiment of the invention further provides a storage medium.
  • the foregoing storage medium may be configured to store program code for performing the following steps:
  • the foregoing storage medium may include, but is not limited to, media that can store program code, such as a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.
  • the processor performs steps S1-S3 according to the program code stored in the storage medium.
  • all or part of the steps may be completed by a program instructing related hardware (e.g., a processor), and the program may be stored in a computer-readable storage medium such as a read-only memory, a magnetic disk, or an optical disc.
  • all or part of the steps of the above embodiments may also be implemented using one or more integrated circuits.
  • the modules/units in the above embodiments may be implemented in hardware, for example by integrated circuits implementing their respective functions, or in the form of software function modules, for example by a processor executing program instructions stored in a memory to achieve the corresponding functions.
  • This application is not limited to any specific combination of hardware and software.
  • By determining that the projector device has entered a voice recognition state (a state in which operations are performed according to voice instructions), receiving an input voice instruction, and recognizing the received instruction and performing the corresponding operation, the embodiments solve the related-art problem that manually operating a projector is cumbersome and leads to a poor user experience, reducing the complexity of projector operation and improving the user experience.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Electrically Operated Instructional Devices (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

A voice control method, apparatus, and projector device, the method including: determining that a projector device has entered a voice recognition state, where the voice recognition state is a state in which operations are performed according to voice instructions (S102); receiving an input voice instruction (S104); and recognizing the received voice instruction and performing the operation corresponding to it (S106).

Description

Voice control method, apparatus, and projector device
Technical Field
This document relates to, but is not limited to, the field of communications, and in particular to a voice control method, apparatus, and projector device.
Background Art
A projector, also known as a projection machine, is a device that can project images or video onto a screen. Through different interfaces it can be connected to a computer, a Video Compact Disc (VCD) player, a Digital Video Disc (DVD) player, a game console, and so on, to play the corresponding video signal. Projectors are widely used in homes, offices, schools, and entertainment venues. By application environment, projectors can be divided into the following categories: home theater projectors, portable business projectors, education and conference projectors, mainstream engineering projectors, professional theater projectors, and measurement projectors.
These projectors share a common trait: operating them requires a handheld remote control, and such manual operation is cumbersome, leading to a poor user experience and a lack of fun.
No effective solution has yet been proposed for the related-art problem that manually operating a projector is cumbersome and leads to a poor user experience.
Summary of the Invention
The following is an overview of the subject matter described in detail herein. This overview is not intended to limit the scope of the claims.
This document provides a voice control method, apparatus, and projector device that can solve the related-art problem that manually operating a projector is cumbersome and leads to a poor user experience.
This document provides a voice control method, including:
determining that a projector device has entered a voice recognition state, where the voice recognition state is a state in which operations are performed according to voice instructions;
receiving an input voice instruction;
recognizing the received voice instruction, and performing the operation corresponding to the voice instruction.
Optionally, in the above method, determining that the projector device has entered the voice recognition state includes:
determining that the projector device enters the voice recognition state by receiving a wake-up instruction, where the wake-up instruction includes one or more of the following:
a touch signal following a predetermined track, a voice signal, a button signal.
Optionally, in the above method, recognizing the received voice instruction and performing the operation corresponding to it includes:
determining whether an instruction matching the voice instruction is stored in advance;
if the determination result is that a matching instruction is stored in advance, performing the operation corresponding to the voice instruction.
Optionally, in the above method, before recognizing the received voice instruction and performing the corresponding operation, the method further includes:
obtaining the file names of pre-stored files and/or the application names of pre-installed applications;
storing the obtained file names and/or application names in a specified location, where, when a file name stored in the specified location is recognized from speech, the file corresponding to that file name is invoked, and when an application name stored in the specified location is recognized from speech, the application corresponding to that application name is invoked.
Optionally, in the above method, receiving the input voice instruction includes:
the projector device receiving the voice instruction through a peripheral device, where the peripheral device includes one or more of the following:
a wired headset, a Bluetooth headset.
This document also discloses a voice control apparatus, comprising:
a determination module, configured to determine that a projector device has entered a voice recognition state, wherein the voice recognition state is a state in which operations are performed according to voice instructions;
a receiving module, configured to receive an input voice instruction;
an execution module, configured to recognize the received voice instruction and perform the operation corresponding to the voice instruction.
Optionally, in the above apparatus, the determination module comprises:
a determination unit, configured to determine that the projector device has entered the voice recognition state by receiving a wake-up instruction, wherein the wake-up instruction comprises one or more of the following:
a touch signal with a predetermined trajectory, a voice signal, or a key-press signal.
Optionally, in the above apparatus, the execution module comprises:
a judgment unit, configured to determine whether an instruction matching the voice instruction is stored in advance;
an execution unit, configured to perform the operation corresponding to the voice instruction when the judgment unit determines that a matching instruction is stored in advance.
Optionally, the above apparatus further comprises:
an acquisition module, configured to acquire the file names of pre-stored files and/or the application names of pre-installed applications;
a storage module, configured to store the file names and/or application names at a designated location,
the execution unit being further configured to, when a file name stored at the designated location is recognized by voice, invoke the file corresponding to the recognized file name, and when an application name stored at the designated location is recognized by voice, invoke the application corresponding to the recognized application name.
Optionally, in the above apparatus, the receiving module receiving the input voice instruction comprises:
receiving the voice instruction through a peripheral device supported by the projector device, wherein the peripheral device comprises one or more of the following:
a wired headset or a Bluetooth headset.
This document also discloses a projector device, comprising at least a low-power wake-up chip, a voice engine, and a standard flow component, wherein
the low-power wake-up chip is configured to enter a voice recognition state according to a wake-up instruction, the voice recognition state being a state in which operations are performed according to voice instructions;
the voice engine is configured to receive an input voice instruction;
the standard flow component is configured to recognize the received voice instruction and perform the operation corresponding to the voice instruction.
With the technical solution provided herein, it is determined that the projector device has entered a voice recognition state, wherein the voice recognition state is a state in which operations are performed according to voice instructions; an input voice instruction is received; and the received voice instruction is recognized and the corresponding operation is performed. This solves the problem in the related art that manual operation of a projector is cumbersome and leads to a poor user experience, reducing the complexity of projector operation and improving the user experience.
Other aspects will become apparent upon reading and understanding the drawings and detailed description.
Brief Description of the Drawings
FIG. 1 is a flowchart of a voice control method according to an embodiment of the present invention;
FIG. 2 is a structural block diagram of a voice control apparatus according to an embodiment of the present invention;
FIG. 3 is a structural block diagram of the determination module 22 in the voice control apparatus according to an embodiment of the present invention;
FIG. 4 is a structural block diagram of the execution module 26 in the voice control apparatus according to an embodiment of the present invention;
FIG. 5 is another structural block diagram of the voice control apparatus according to an embodiment of the present invention;
FIG. 6 is a structural block diagram of a voice-controlled projector system according to an embodiment of the present invention;
FIG. 7 is a low-power wake-up flowchart of the voice-controlled projector system according to an embodiment of the present invention;
FIG. 8 is a working state diagram of the voice-controlled projector system according to an embodiment of the present invention.
Embodiments of the Present Invention
The embodiments herein are described in detail below with reference to the drawings. It should be noted that, provided there is no conflict, the embodiments of this application and the features in the embodiments may be combined with one another in any manner.
It should also be noted that the terms "first", "second", and the like, herein and in the above drawings, are used to distinguish similar objects and do not necessarily describe a particular order or sequence.
This embodiment provides a voice control method. FIG. 1 is a flowchart of the voice control method according to an embodiment of the present invention. As shown in FIG. 1, the flow comprises the following steps:
Step S102: determine that the projector device has entered a voice recognition state, wherein the voice recognition state is a state in which operations are performed according to voice instructions;
Step S104: receive an input voice instruction;
Step S106: recognize the received voice instruction and perform the operation corresponding to the voice instruction.
Through the above steps, the projector device can be operated by voice instructions, avoiding the cumbersome steps of manual operation. This solves the problem in the related art that manual operation of a projector is cumbersome and degrades the user experience, reducing the complexity of projector operation and improving the user experience.
In an optional embodiment, determining that the projector device has entered the voice recognition state comprises: determining that the projector device has entered the voice recognition state by receiving a wake-up instruction, wherein the wake-up instruction comprises one or more of the following:
a touch signal with a predetermined trajectory, a voice signal, or a key-press signal.
In an optional embodiment, recognizing the received voice instruction and performing the corresponding operation comprises: determining whether an instruction matching the voice instruction is stored in advance, and, when a matching instruction is stored in advance, performing the operation corresponding to the voice instruction. If no instruction matching the voice instruction is stored, a prompt may be returned, for example "unable to recognize this instruction".
Performing the operation corresponding to the voice instruction may comprise executing the pre-stored instruction that matches the voice instruction.
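The judge-then-execute step above can be sketched as a lookup in a pre-stored command table. The command names, handlers, and prompt text below are invented for illustration and are not taken from the patent:

```python
# Hypothetical sketch of "match a pre-stored instruction, else prompt the user".
# Command names and handler actions are illustrative placeholders.

PRESTORED_COMMANDS = {
    "open projection":  lambda: print("projection on"),
    "close projection": lambda: print("projection off"),
    "volume up":        lambda: print("volume +1"),
}

def handle_voice_instruction(recognized_text: str) -> str:
    """Execute the pre-stored instruction matching the recognized text,
    or return an error prompt when no matching instruction is stored."""
    action = PRESTORED_COMMANDS.get(recognized_text.strip().lower())
    if action is None:
        return "unable to recognize this instruction"
    action()
    return "ok"
```

A caller would feed the recognizer's text output into `handle_voice_instruction` and play the returned prompt when matching fails.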
In an optional embodiment, before recognizing the received voice instruction and performing the corresponding operation, the method further comprises: acquiring the file names of pre-stored files and/or the application names of pre-installed applications, and storing these file names and/or application names, wherein a file name is used to invoke the corresponding file according to a voice instruction, and an application name is used to invoke the corresponding application according to a voice instruction. The purpose of storing the file names and application names is to conveniently invoke the corresponding files and applications by voice; when a new file is stored or a new application is installed, the file name of the newly stored file or the application name of the newly installed application is stored as well.
In an optional embodiment, the projector device supports receiving the voice instruction through a peripheral device, wherein the peripheral device comprises one or more of the following:
a wired headset or a Bluetooth headset.
From the description of the above embodiments, a person skilled in the art can clearly understand that the method according to the above embodiments may be implemented by software plus a necessary general-purpose hardware platform, or by hardware, though the former is the more common implementation in many cases. Based on this understanding, the essence of the embodiments of the present invention, or the part contributing to the related art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as a ROM/RAM, magnetic disk, or optical disc) and includes instructions for causing a terminal device (which may be a mobile phone, computer, server, network device, or the like) to perform the method described in the embodiments of the present invention.
This embodiment also provides a voice control apparatus for implementing the above embodiments and optional implementations; what has already been described is not repeated. As used below, the term "module" may be a combination of software and/or hardware implementing a predetermined function. Although the apparatus described in the following embodiments is preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
FIG. 2 is a structural block diagram of a voice control apparatus according to an embodiment of the present invention. As shown in FIG. 2, the apparatus comprises a determination module 22, a receiving module 24, and an execution module 26, described as follows.
The determination module 22 is configured to determine that the projector device has entered a voice recognition state, wherein the voice recognition state is a state in which operations are performed according to voice instructions. The receiving module 24, connected to the determination module 22, is configured to receive an input voice instruction. The execution module 26, connected to the receiving module 24, is configured to recognize the received voice instruction and perform the operation matching the voice instruction.
FIG. 3 is a structural block diagram of the determination module 22 in the voice control apparatus according to an embodiment of the present invention. As shown in FIG. 3, the determination module 22 comprises a determination unit 32, described as follows.
The determination unit 32 is configured to determine that the projector device has entered the voice recognition state by receiving a wake-up instruction, wherein the wake-up instruction comprises one or more of the following:
a touch signal with a predetermined trajectory, a voice signal, or a key-press signal.
FIG. 4 is a structural block diagram of the execution module 26 in the voice control apparatus according to an embodiment of the present invention. As shown in FIG. 4, the execution module 26 comprises a judgment unit 42 and an execution unit 44, described as follows:
The judgment unit 42 is configured to determine whether an instruction matching the voice instruction is stored in advance. The execution unit 44, connected to the judgment unit 42, is configured to perform the operation corresponding to the voice instruction when the judgment unit 42 determines that a matching instruction is stored in advance.
FIG. 5 is another structural block diagram of the voice control apparatus according to an embodiment of the present invention. As shown in FIG. 5, in addition to all the modules shown in FIG. 2, the apparatus further comprises an acquisition module 52 and a storage module 54, described as follows:
The acquisition module 52 is configured to acquire the file names of pre-stored files and/or the application names of pre-installed applications. The storage module 54, connected to the acquisition module 52 and the execution module 26, is configured to store the file names and/or application names at a designated location. The execution unit is further configured to invoke the file corresponding to a recognized file name when a file name stored at the designated location is recognized by voice, and to invoke the application corresponding to a recognized application name when an application name stored at the designated location is recognized by voice.
Optionally, the above projector device supports receiving voice instructions through a peripheral device, wherein the peripheral device comprises one or more of the following:
a wired headset or a Bluetooth headset.
This embodiment also provides a projector device comprising at least a low-power wake-up chip, a voice engine, and a standard flow component. The low-power wake-up chip is configured to enter a voice recognition state according to a wake-up instruction, the voice recognition state being a state in which operations are performed according to voice instructions; the voice engine is configured to receive an input voice instruction; the standard flow component is configured to recognize the received voice instruction and perform the corresponding operation. The low-power wake-up chip may be connected to the voice engine, and the voice engine may be connected to the standard flow component; the low-power wake-up chip and the standard flow component may or may not be connected to each other.
The embodiments of the present invention may involve the following technologies:
1. Speech recognition:
As a hot topic in the related art, speech recognition has penetrated many fields, moving human-computer interaction from "keyboard interaction" and "touch interaction" to "voice interaction", and making it possible to free people's hands and improve efficiency.
Speech recognition, also known as Automatic Speech Recognition (ASR), aims to convert the lexical content of human speech into computer-readable input, such as key presses, binary codes, or character sequences. It differs from speaker recognition and speaker verification, which attempt to identify or verify the speaker rather than the lexical content of the speech. Speech recognition is a technology that lets a machine convert a speech signal into the corresponding text or commands through recognition and understanding. It mainly involves three aspects: feature extraction, pattern matching criteria, and model training.
Depending on the recognition target, speech recognition tasks fall roughly into three categories: isolated word recognition, keyword recognition (also called keyword spotting), and continuous speech recognition. Isolated word recognition identifies isolated words known in advance, such as "power on" and "power off". Continuous speech recognition identifies arbitrary continuous speech, such as a sentence or a passage. Keyword spotting in a continuous speech stream targets continuous speech but does not transcribe all of it; it only detects where one or more known keywords occur, for example detecting the words "computer" and "world" in a passage.
Optionally, the embodiments of the present invention may use isolated word recognition: the voice instructions to be supported are compiled in advance into a grammar file, and the engine compiles it to generate the corresponding recognition scope. During use, only the instructions predefined in the grammar are supported.
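A minimal sketch of this isolated-word approach follows. The grammar "compiler" and the engine stand-in are invented for illustration; a real engine compiles a grammar file into a binary recognition scope rather than a Python set:

```python
# Illustrative sketch: only phrases predefined in the compiled grammar
# are accepted; anything outside the grammar is rejected.

def compile_grammar(phrases):
    """'Compile' a grammar: normalize each allowed phrase into the
    recognition scope the engine will match against."""
    return {p.strip().lower() for p in phrases}

def recognize(scope, transcript):
    """Stand-in for the engine: accept only phrases inside the scope."""
    text = transcript.strip().lower()
    return text if text in scope else None

scope = compile_grammar(["power on", "power off", "next page"])
```

With this restriction, out-of-grammar speech simply yields no result, which is what lets the system prompt "input error" instead of guessing.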
2. Low-power wake-up:
Low-power Digital Signal Processor (DSP) voice wake-up refers to a technology in which, after the wireless Access Point (AP) of a terminal (for example, a mobile phone) goes to sleep (that is, the Central Processing Unit (CPU) stops working), a processing unit specific to the DSP, triggered in a specific way, can wake the CPU and return it to a working state. It targets completely hands-free voice control scenarios: on top of maximizing power savings while the phone system sleeps, it enables waking the phone by voice. This work opens the way to operating a phone entirely through "voice + auditory feedback" instead of "finger + visual touch", achieving a fully voice-driven, intelligent human-computer interaction experience.
3. Barge-in:
Barge-in is a special speech recognition technique that performs recognition over a steady-state background sound. With this capability, a user of a speech recognition system no longer needs to wait for the beep before speaking; the user can interrupt the prompt tone by voice at any time and go directly into recognition (a process known as barge-in).
The key to barge-in is voice endpoint detection. The purpose of endpoint detection is to distinguish speech signals from non-speech signals in a signal stream under complex application conditions, and to determine where the speech signal begins and ends. Signal streams in the related art always contain some background sound, while speech recognition models are trained on speech signals, so pattern matching between signal and model is only meaningful on actual speech. Detecting the speech signal in the signal stream is therefore a necessary preprocessing step for speech recognition.
Endpoint detection may involve two processes:
a) Based on the characteristics of the speech signal, parameters such as energy, zero-crossing rate, entropy, and pitch, together with their derived parameters, are used to distinguish speech from non-speech in the signal stream.
b) After a speech signal is detected in the stream, it is determined whether this point is the start or end of an utterance. In commercial speech systems, the variable background and natural conversational style make pauses (non-speech) within a sentence more likely; in particular, there is always a silent gap before a plosive initial. This start/end determination is therefore especially important.
In addition, endpoint detection also serves to:
a) Reduce the amount of data the recognizer processes: it can greatly reduce the volume of signal transmission and the computational load of the recognizer, which matters for real-time recognition of spoken dialogue.
b) Reject non-speech signals: recognizing non-speech signals not only wastes resources but may also change the dialogue state and confuse the user.
c) Provide the speech start point required by systems with barge-in: when endpoint detection finds the start of speech, the system stops playing the prompt tone, completing the barge-in function.
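Process (a) above can be sketched with the two simplest parameters mentioned, short-time energy and zero-crossing rate. The thresholds here are arbitrary placeholders for the sketch, not values from the patent:

```python
# Minimal per-frame speech/non-speech decision over signed audio samples,
# using short-time energy and zero-crossing rate; thresholds are illustrative.

def frame_features(frame):
    """Short-time energy and zero-crossing rate of one frame of samples."""
    energy = sum(s * s for s in frame) / len(frame)
    zcr = sum(1 for a, b in zip(frame, frame[1:])
              if (a >= 0) != (b >= 0)) / len(frame)
    return energy, zcr

def is_speech(frame, energy_thresh=1000.0, zcr_thresh=0.2):
    """Flag a frame as speech when its energy is high, or when moderate
    energy coincides with high zero-crossing activity (unvoiced sounds)."""
    energy, zcr = frame_features(frame)
    return energy > energy_thresh or (energy > energy_thresh / 10 and zcr > zcr_thresh)
```

A full detector would smooth these per-frame decisions over time to locate utterance start and end points, as described in process (b).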
The technical scheme of the system is as follows:
While the device sleeps, the user wakes the projector with a wake-up instruction and enters the voice recognition state; the wake-up instruction supports custom recording and training.
The user may also wake the projector manually, for example by long-pressing the home key, to enter the voice recognition state.
The user can then speak any of the preset voice instructions to tell the projector what to do next, for example: open projection, close projection, play ****, open *** (where *** is the file name of a video, the name of a PPT document, the application name of an installed application, or the like). Once a file is copied to the projector's storage, its file name is automatically loaded into the speakable grammar; likewise, once an application is installed on the system, its name is automatically loaded into the speakable grammar. In other words, the file names of pre-stored files and/or the application names of pre-installed applications are acquired first, and the acquired names are then stored at a designated location (that is, loaded into the speakable grammar). When a file name stored at the designated location (loaded into the speakable grammar) is recognized by voice, the file corresponding to the recognized file name is invoked; when an application name stored at the designated location is recognized by voice, the application corresponding to the recognized application name is invoked.
If the user inputs an instruction that is not preset on the projector, the projector prompts the user about the input error and enters the re-input flow.
Once a video starts playing, the user can control playback entirely by voice through barge-in, that is, voice instructions can be input at any time during playback. Speakable video control instructions include: volume up, volume down, pause, resume playback, exit playback, and so on.
Once a PPT presentation starts, the user can likewise control it entirely by voice through barge-in, that is, voice instructions can be input at any time during the presentation. Speakable PPT control instructions include: previous page, next page, first page, last page, exit full screen, full-screen playback, and so on.
Voice control of the projector through peripheral devices is supported. After a peripheral device such as a wired headset or Bluetooth headset is connected to the projector, it can serve as a voice input device to control the projector; for example, the user can voice-control the projector through a Bluetooth headset from a location far from the projector.
Throughout the flow, the projector displays User Interface (UI) prompts on its screen, and a human voice or prompt tone tells the user when to start inputting an instruction, when input has ended, when input was wrong, and so on.
The scheme of the embodiments of the present invention is described in more detail below with reference to the drawings.
FIG. 6 is a structural block diagram of the voice-controlled projector system according to an embodiment of the present invention. As shown in FIG. 6, the system consists of three main parts: the low-power wake-up chip module (the Low-power Wakeup DSP Chip in FIG. 6, the same as the low-power wake-up chip above), the recognition and announcement engine module (the Voice Engine in FIG. 6, the same as the voice engine above), and the standard flow component module (the Standard Flow Component in FIG. 6, the same as the standard flow component above). The main functions of each module are as follows:
The low-power wake-up chip module is a hardware device configured to monitor the user's wake-up operation while the projector sleeps. The recognition and announcement engine module is the core module for speech recognition and voice announcement; it recognizes the collected audio and synthesizes the content to be announced. The standard flow component module implements each functional point, such as voice control of video playback and voice control of opening applications; each functional point exists as a flow with its own lifecycle.
FIG. 7 is the low-power wake-up flowchart of the voice-controlled projector system according to an embodiment of the present invention. As shown in FIG. 7, the flow comprises the following steps S702 to S712:
Step S702: the user inputs the wake-up word;
Step S704: the low-power wake-up chip continuously monitors the user's voice input while the projector sleeps;
Step S706: when the user's voice input matches the preset trained wake-up word, the low-power wake-up chip wakes the CPU and reports a wake-up event to the driver layer;
Step S708: the framework layer then notifies the application layer by message;
Step S710: the application layer launches the voice recognition flow;
Step S712: end.
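The chip-to-application reporting chain of steps S702 to S712 can be illustrated with plain callbacks standing in for the driver, framework, and application layers. All names, the wake word, and the log messages are invented for the sketch:

```python
# Illustrative layering of the wake flow: the (simulated) wake chip reports a
# wake-up event to the driver layer, the framework layer forwards it as a
# message, and the application layer starts the recognition flow.

LOG = []

def application_layer(msg):
    LOG.append(f"app: start recognition ({msg})")   # S710

def framework_layer(event):
    application_layer(f"msg:{event}")               # S708: notify app by message

def driver_layer(event):
    framework_layer(event)                          # S706: event reported upward

def wake_chip(voice_input, trained_word="hello projector"):
    if voice_input == trained_word:                 # S704/S706: input matches
        driver_layer("WAKE")                        # the trained wake-up word
```

Input that does not match the trained word is simply ignored, which is what keeps the CPU asleep until a genuine wake-up.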
The low-power wake-up chip makes it possible to completely free the user's hands and turn the voice control flow into a closed loop. Since the low-power wake-up chip is a hardware component that cannot be fitted on some projector models, the system supports trimming this module on low-end projectors, where the user can wake the device by other means, such as a peripheral device or the projector's keys.
FIG. 8 is the working state diagram of the voice-controlled projector system according to an embodiment of the present invention, described below with reference to FIG. 8:
After the device finishes initialization and is woken, it enters the recording state and waits for the user to input a voice instruction. Two things can then happen: either the user does not speak and the recognition flow times out and ends, or the user speaks, the projector records the input, and the flow enters the recognition state. In the recognition state, if a valid instruction is recognized, it is dispatched to the corresponding standard flow component for processing; if the instruction cannot be recognized, the user is prompted about the input error and asked to re-input or exit.
Recording barge-in is a special recognition mode over a steady-state background sound, such as voice control during video playback. Recording stays on continuously, detecting the user's voice input and denoising against the steady-state background. If a voice input matching a preset dynamic instruction is detected, the engine returns the recognition result and informs the standard flow component to perform the corresponding operation, then continues to detect the next voice input; recording barge-in does not stop until the user exits video playback.
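The recording and recognition states described above can be sketched as a small transition function. The state names, command set, and outputs are illustrative only, not the patent's actual implementation:

```python
# Minimal sketch of the FIG. 8 working states: recording -> recognizing ->
# dispatch to a flow component, or back to recording on an input error;
# silence in the recording state times out and ends the flow.

VALID = {"pause", "volume up", "exit playback"}

def step(state, spoken=None):
    """Return (next_state, output) for one transition of the sketch."""
    if state == "recording":
        if spoken is None:
            return "idle", "timeout"               # no voice: flow ends
        return "recognizing", spoken               # voice recorded
    if state == "recognizing":
        if spoken in VALID:
            return "dispatch", f"run:{spoken}"     # hand off to flow component
        return "recording", "input error, retry"   # unrecognized instruction
    return state, None
```

Barge-in corresponds to driving this loop continuously while playback audio is treated as steady-state background.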
The embodiments of the present invention propose a voice-controlled projector system to address the problems that manual projector operation is cumbersome and that the user experience is poor and uninteresting. Through hardware and software working together, the system lets the user wake the projector and issue voice instructions by sound. The entire flow is a closed loop: every step is completed by voice control without manual operation, freeing the user's hands and greatly improving the projector's efficiency and appeal. The system supports trimming; features and hardware configuration can be trimmed as needed.
It should be noted that the above modules may be implemented in software or hardware. For the latter, this can be achieved, without limitation, as follows: the above modules are all located in the same processor, or the above modules are distributed across multiple processors.
An embodiment of the present invention also provides a storage medium. Optionally, in this embodiment, the storage medium may be configured to store program code for performing the following steps:
S1: determine that the projector device has entered a voice recognition state, wherein the voice recognition state is a state in which operations are performed according to voice instructions;
S2: receive an input voice instruction;
S3: recognize the received voice instruction and perform the operation corresponding to the voice instruction.
Optionally, in this embodiment, the storage medium may include, but is not limited to, media capable of storing program code, such as a USB flash drive, Read-Only Memory (ROM), Random Access Memory (RAM), removable hard disk, magnetic disk, or optical disc.
Optionally, in this embodiment, the processor performs steps S1 to S3 according to the program code stored in the storage medium.
A person of ordinary skill in the art will understand that all or part of the steps in the above methods may be completed by a program instructing relevant hardware (for example, a processor); the program may be stored in a computer-readable storage medium such as a read-only memory, magnetic disk, or optical disc. Optionally, all or part of the steps of the above embodiments may also be implemented using one or more integrated circuits. Accordingly, the modules/units in the above embodiments may be implemented in hardware, for example by integrated circuits realizing their functions, or in the form of software function modules, for example by a processor executing program instructions stored in a memory. This application is not limited to any particular combination of hardware and software.
Industrial Applicability
Through the embodiments of the present invention, it is determined that the projector device has entered a voice recognition state, wherein the voice recognition state is a state in which operations are performed according to voice instructions; an input voice instruction is received; and the received voice instruction is recognized and the operation corresponding to it is performed. This solves the problem in the related art that manual operation of a projector is cumbersome and leads to a poor user experience, reducing the complexity of projector operation and improving the user experience.

Claims (11)

  1. A voice control method, comprising:
    determining that a projector device has entered a voice recognition state, wherein the voice recognition state is a state in which operations are performed according to voice instructions;
    receiving an input voice instruction;
    recognizing the received voice instruction, and performing the operation corresponding to the voice instruction.
  2. The method according to claim 1, wherein determining that the projector device has entered the voice recognition state comprises:
    determining that the projector device has entered the voice recognition state by receiving a wake-up instruction, wherein the wake-up instruction comprises one or more of the following:
    a touch signal with a predetermined trajectory, a voice signal, or a key-press signal.
  3. The method according to claim 1, wherein recognizing the received voice instruction and performing the operation corresponding to the voice instruction comprises:
    determining whether an instruction matching the voice instruction is stored in advance;
    when an instruction matching the voice instruction is stored in advance, performing the operation corresponding to the voice instruction.
  4. The method according to claim 1, further comprising, before recognizing the received voice instruction and performing the operation corresponding to the voice instruction:
    acquiring the file names of pre-stored files and/or the application names of pre-installed applications;
    storing the acquired file names and/or application names at a designated location, wherein when a file name stored at the designated location is recognized by voice, the file corresponding to the recognized file name is invoked, and when an application name stored at the designated location is recognized by voice, the application corresponding to the recognized application name is invoked.
  5. The method according to any one of claims 1 to 4, wherein receiving the input voice instruction comprises:
    the projector device receiving the voice instruction through a peripheral device, wherein the peripheral device comprises one or more of the following:
    a wired headset or a Bluetooth headset.
  6. A voice control apparatus, comprising:
    a determination module, configured to determine that a projector device has entered a voice recognition state, wherein the voice recognition state is a state in which operations are performed according to voice instructions;
    a receiving module, configured to receive an input voice instruction;
    an execution module, configured to recognize the received voice instruction and perform the operation corresponding to the voice instruction.
  7. The apparatus according to claim 6, wherein the determination module comprises:
    a determination unit, configured to determine that the projector device has entered the voice recognition state by receiving a wake-up instruction, wherein the wake-up instruction comprises one or more of the following:
    a touch signal with a predetermined trajectory, a voice signal, or a key-press signal.
  8. The apparatus according to claim 6, wherein the execution module comprises:
    a judgment unit, configured to determine whether an instruction matching the voice instruction is stored in advance;
    an execution unit, configured to perform the operation corresponding to the voice instruction when the judgment unit determines that a matching instruction is stored in advance.
  9. The apparatus according to claim 6, further comprising:
    an acquisition module, configured to acquire the file names of pre-stored files and/or the application names of pre-installed applications;
    a storage module, configured to store the file names and/or application names at a designated location,
    the execution unit being further configured to, when a file name stored at the designated location is recognized by voice, invoke the file corresponding to the recognized file name, and when an application name stored at the designated location is recognized by voice, invoke the application corresponding to the recognized application name.
  10. The apparatus according to any one of claims 6 to 9, wherein the receiving module receiving the input voice instruction comprises:
    receiving the voice instruction through a peripheral device supported by the projector device, wherein the peripheral device comprises one or more of the following:
    a wired headset or a Bluetooth headset.
  11. A projector device, comprising at least: a low-power wake-up chip, a voice engine, and a standard flow component, wherein
    the low-power wake-up chip is configured to enter a voice recognition state according to a wake-up instruction, wherein the voice recognition state is a state in which operations are performed according to voice instructions;
    the voice engine is configured to receive an input voice instruction;
    the standard flow component is configured to recognize the received voice instruction and perform the operation corresponding to the voice instruction.
PCT/CN2016/090170 2015-07-17 2016-07-15 Voice control method, apparatus, and projector device WO2017012511A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510424654.1 2015-07-17
CN201510424654.1A CN106356059A (zh) 2015-07-17 2015-07-17 Voice control method, apparatus, and projector device

Publications (1)

Publication Number Publication Date
WO2017012511A1 true WO2017012511A1 (zh) 2017-01-26

Family

ID=57833698

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/090170 WO2017012511A1 (zh) 2015-07-17 2016-07-15 语音控制方法、装置及投影仪设备

Country Status (2)

Country Link
CN (1) CN106356059A (zh)
WO (1) WO2017012511A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110908718A (zh) * 2018-09-14 2020-03-24 上海擎感智能科技有限公司 Method, system, storage medium, and device for activating voice navigation by face recognition

Families Citing this family (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018176387A1 (zh) * 2017-03-31 2018-10-04 深圳市红昌机电设备有限公司 Voice control method and system for a winding-type coil winding machine
CN106847285B (zh) * 2017-03-31 2020-05-05 上海思依暄机器人科技股份有限公司 Robot and voice recognition method thereof
CN107180631A (zh) * 2017-05-24 2017-09-19 刘平舟 Voice interaction method and apparatus
CN107360327B (zh) * 2017-07-19 2021-05-07 腾讯科技(深圳)有限公司 Voice recognition method, apparatus, and storage medium
CN107680592B (zh) * 2017-09-30 2020-09-22 惠州Tcl移动通信有限公司 Mobile terminal voice recognition method, mobile terminal, and storage medium
CN107920240A (zh) * 2017-12-27 2018-04-17 兴天通讯技术有限公司 Smart projector capable of voice control
CN108319171B (zh) * 2018-02-09 2020-08-07 广景视睿科技(深圳)有限公司 Dynamic projection method and apparatus based on voice control, and dynamic projection system
CN108566634B (zh) * 2018-03-30 2021-06-25 深圳市冠旭电子股份有限公司 Method and apparatus for reducing continuous wake-up delay of a Bluetooth speaker, and Bluetooth speaker
CN110505431A (zh) * 2018-05-17 2019-11-26 视联动力信息技术股份有限公司 Terminal control method and apparatus
CN108920128B (zh) * 2018-07-12 2021-10-08 思必驰科技股份有限公司 Presentation operation method and system
CN109375460B (zh) * 2018-12-27 2021-03-23 成都极米科技股份有限公司 Control method for a smart projector, and smart projector
CN110322873B (zh) * 2019-07-02 2022-03-01 百度在线网络技术(北京)有限公司 Method, apparatus, device, and storage medium for exiting a voice skill
CN110517697A (zh) * 2019-08-20 2019-11-29 中信银行股份有限公司 Intelligent prompt-tone barge-in apparatus for interactive voice response
CN110992960A (zh) * 2019-12-18 2020-04-10 Oppo广东移动通信有限公司 Control method and apparatus, electronic device, and storage medium
CN113160806A (zh) * 2020-01-07 2021-07-23 京东方科技集团股份有限公司 Projection system and control method thereof
CN111467198B (zh) * 2020-04-28 2022-12-09 天赋光彩医疗科技(苏州)有限公司 Eye-brightening and mind-refreshing instrument
CN113763944B (zh) * 2020-09-29 2024-06-04 浙江思考者科技有限公司 AI video cloud interaction system based on a human-like logic knowledge base
CN112530430A (zh) * 2020-11-30 2021-03-19 北京百度网讯科技有限公司 Vehicle-mounted operating system control method and apparatus, headset, terminal, and storage medium
CN113157350B (zh) * 2021-03-18 2022-06-07 福建马恒达信息科技有限公司 Office assistance system and method based on voice recognition
CN113127105B (zh) * 2021-03-18 2022-06-10 福建马恒达信息科技有限公司 Automatic voice tool invocation method for Excel
CN114097660A (zh) * 2021-11-08 2022-03-01 广州回味源蛋类食品有限公司 Duck egg screening apparatus

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6230137B1 (en) * 1997-06-06 2001-05-08 Bsh Bosch Und Siemens Hausgeraete Gmbh Household appliance, in particular an electrically operated household appliance
CN101740028A (zh) * 2009-11-20 2010-06-16 四川长虹电器股份有限公司 Voice control system for home appliance products
CN103885350A (zh) * 2014-03-19 2014-06-25 四川长虹电器股份有限公司 Method and apparatus for voice control of household electrical appliances
CN104216351A (zh) * 2014-02-10 2014-12-17 美的集团股份有限公司 Household appliance voice control method and system
CN104538030A (zh) * 2014-12-11 2015-04-22 科大讯飞股份有限公司 Control system and method capable of controlling home appliances by voice

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101648077A (zh) * 2008-08-11 2010-02-17 巍世科技有限公司 Voice command game control apparatus and method thereof
CN103971683A (zh) * 2013-01-24 2014-08-06 上海果壳电子有限公司 Voice control method, system, and handheld device
CN104599669A (zh) * 2014-12-31 2015-05-06 乐视致新电子科技(天津)有限公司 Voice control method and apparatus


Also Published As

Publication number Publication date
CN106356059A (zh) 2017-01-25

Similar Documents

Publication Publication Date Title
WO2017012511A1 (zh) Voice control method, apparatus, and projector device
JP6811758B2 (ja) Voice interaction method, apparatus, device, and storage medium
JP6926241B2 (ja) Hot word recognition speech synthesis
TWI576825B (zh) Sound recognition system and method for a robot system
TWI525532B (zh) Set the name of the person to wake up the name for voice manipulation
US9466286B1 (en) Transitioning an electronic device between device states
WO2017071182A1 (zh) Voice wake-up method, apparatus, and system
WO2020029500A1 (zh) Voice command customization method, apparatus, and device, and computer storage medium
CN112201246B (zh) Voice-based intelligent control method and apparatus, electronic device, and storage medium
CN110914828B (zh) Speech translation method and translation apparatus
WO2019007245A1 (zh) Processing method, control method, recognition method, apparatus therefor, and electronic device
US9293134B1 (en) Source-specific speech interactions
JP2023015054A (ja) Dynamic and/or context-specific hot words for invoking an automated assistant
US20210241768A1 (en) Portable audio device with voice capabilities
CN104247280A (zh) Voice-controlled communication connection
KR20140089863A (ko) Display apparatus, control method thereof, and method of controlling a display apparatus in a voice recognition system
JP2015520409A (ja) Embedded system for building space-saving speech recognition with user-definable constraints
CN106971723A (zh) Voice processing method and apparatus, and apparatus for voice processing
WO2016078214A1 (zh) Terminal processing method and apparatus, and computer storage medium
KR20220027251A (ko) Key phrase detection using audio watermarking
CN109817220A (zh) Voice recognition method, apparatus, and system
KR20150089145A (ko) Display apparatus performing voice control and voice control method thereof
KR20200052638A (ko) Electronic device and voice recognition method of electronic device
JP7173049B2 (ja) Information processing apparatus, information processing system, information processing method, and program
JP2020527739A (ja) Speaker diarization

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16827202

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16827202

Country of ref document: EP

Kind code of ref document: A1