CN109036430A - Voice control terminal - Google Patents

Publication number
CN109036430A
Authority
CN
China
Prior art keywords
voice control
cloud
control box
voice
semantic analysis
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811150055.5A
Other languages
Chinese (zh)
Inventor
陈琦
陈志军
占文
刘倩茹
梁毅
武文强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhu Xingtu Robot Technology Co Ltd
Original Assignee
Wuhu Xingtu Robot Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2018-09-29
Filing date
2018-09-29
Publication date
2018-12-18
Application filed by Wuhu Xingtu Robot Technology Co Ltd
Priority to CN201811150055.5A
Publication of CN109036430A
Legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems
    • G10L15/30 Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G10L15/1822 Parsing for meaning understanding
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Telephonic Communication Services (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The present invention discloses a voice control terminal comprising a voice control box, a cloud, a projection screen, a microphone, and a camera. The voice control box sends the collected audio data through the AIUI service to the cloud speech recognition/semantic analysis engine, which converts the speech into the required text through semantic understanding. The voice control box converts this text into the corresponding control instruction and sends it to a control center over a transport such as a network or serial port. The voice control box identifies the person issuing the control voice through face recognition, fingerprint authentication, and voiceprint recognition, and decides on the basis of that identification whether to send the voice instruction. A loudspeaker is provided in the voice control box, which supports full-duplex voice interaction and barge-in. Speech recognition and semantic analysis are performed by the cloud speech recognition/semantic analysis engine deployed in the cloud working together with the local speech recognition/semantic analysis engine deployed on the voice control box, which greatly reduces manual labor and improves working efficiency.

Description

Voice control terminal
Technical field
The present invention relates to the field of intelligent voice control technology, and in particular to a voice control terminal.
Background
With the development of science and technology, devices are becoming increasingly intelligent and voice control terminals increasingly common. Existing voice control terminals all have a user identification function, but most identify users by voiceprint and some by gesture, so the function is rather limited. Their cloud deployment is also limited to a single mode, which makes the terminals inefficient at processing audio information, and they have no voice instruction backtracking function. Solving these problems is therefore particularly important.
Summary of the invention
In view of the deficiencies of the prior art, the present invention provides a voice control terminal. The voice control box sends the collected audio data through the AIUI service to the cloud speech recognition/semantic analysis engine, which converts the speech into the required text through semantic understanding. The voice control box converts this text into the corresponding control instruction and sends it to a control center over a network or serial port. Speech recognition and semantic analysis are performed by the cloud speech recognition/semantic analysis engine deployed in the cloud working together with the local speech recognition/semantic analysis engine deployed on the voice control box.
To solve the above problems, the present invention provides a voice control terminal comprising:
Voice control box: sends the collected audio data through the AIUI service to the cloud speech recognition/semantic analysis engine, which converts the speech into the required text through semantic understanding; the voice control box converts this text into the corresponding control instruction and sends it to a control center over a network or serial port; a local speech recognition/semantic analysis engine is also provided in the voice control box for speech recognition and semantic analysis;
Cloud: receives and recognizes the audio data transmitted by the voice control box; the cloud comprises a private cloud and a public cloud, the public cloud being deployed on an open network and the private cloud on a local area network; a cloud speech recognition/semantic analysis engine is provided in the cloud and is distributed across the private cloud and the public cloud;
Projection screen: displays the processed audio data, content feedback, guidance, and configuration;
Microphone: acquires audio data; the microphone is connected to the voice control box by Bluetooth or a network interface;
Camera: acquires the user's face data; the camera is connected to the voice control box by USB or a network interface.
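The step in which the voice control box turns recognized text into a control instruction and sends it over a network or serial port can be sketched as follows. This is a minimal illustration, not the patent's implementation: the command table, the `ControlInstruction` type, and the newline-terminated JSON frame are all assumptions, since the source does not specify an instruction format.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Hypothetical command table: normalized recognized text -> (device, action).
COMMAND_TABLE = {
    "turn on the light": ("light", "on"),
    "turn off the light": ("light", "off"),
}


@dataclass(frozen=True)
class ControlInstruction:
    device: str
    action: str

    def to_frame(self) -> bytes:
        """Serialize as one newline-terminated JSON line (an assumed wire format)."""
        payload = json.dumps(
            {"device": self.device, "action": self.action}, sort_keys=True
        )
        return payload.encode("utf-8") + b"\n"


def text_to_instruction(text: str) -> Optional[ControlInstruction]:
    """Map recognized text to a control instruction, or None if unknown."""
    key = " ".join(text.lower().split())  # normalize case and whitespace
    entry = COMMAND_TABLE.get(key)
    if entry is None:
        return None
    return ControlInstruction(*entry)
```

The resulting bytes could then be written to whichever transport is in use, for example a TCP socket or a serial port handle; that choice is left open here because the source only names "network" and "serial port".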
In a further improvement, the voice control box identifies the person issuing the control voice through face recognition, fingerprint authentication, and voiceprint recognition, and decides on the basis of that identification whether to send the voice instruction.
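One way to combine the three identification results into a send/do-not-send decision is sketched below. The source does not say how the modalities are combined, so the majority-agreement policy, the `min_agree` parameter, and the function names are assumptions made for illustration only.

```python
from collections import Counter
from typing import Optional, Set


def identify_speaker(
    face: Optional[str],
    fingerprint: Optional[str],
    voiceprint: Optional[str],
    min_agree: int = 2,
) -> Optional[str]:
    """Return the user ID that at least `min_agree` modalities agree on.

    Each argument is the user ID reported by one recognizer, or None when
    that recognizer produced no match.
    """
    votes = Counter(
        uid for uid in (face, fingerprint, voiceprint) if uid is not None
    )
    if not votes:
        return None
    user, count = votes.most_common(1)[0]
    return user if count >= min_agree else None


def should_send(user: Optional[str], authorized: Set[str]) -> bool:
    """Send the voice instruction only for an identified, authorized user."""
    return user is not None and user in authorized
```

A stricter deployment could set `min_agree=3` so that all three modalities must match before an instruction is forwarded.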
In a further improvement, a loudspeaker is provided in the voice control box; the loudspeaker may be built in or external.
In a further improvement, the microphone uses an array algorithm for noise reduction and supports near-field and far-field sound pickup.
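The source does not name the array algorithm. As an illustration of the general idea, the sketch below shows delay-and-sum beamforming, a common microphone-array technique that is assumed here and not taken from the patent: each channel is shifted to undo its arrival delay, and the aligned channels are averaged so that the target signal adds coherently while uncorrelated noise is attenuated. Delays are whole samples for simplicity.

```python
from typing import List, Sequence


def delay_and_sum(
    channels: Sequence[Sequence[float]], delays: Sequence[int]
) -> List[float]:
    """Align and average microphone channels.

    channels: one equal-length sample sequence per microphone.
    delays:   for each microphone, how many samples later the target
              wavefront arrives compared with the reference microphone.
    """
    if len(channels) != len(delays):
        raise ValueError("one delay per channel is required")
    n = len(channels[0])
    out = [0.0] * n
    for samples, delay in zip(channels, delays):
        for i in range(n):
            j = i + delay  # read ahead to undo this channel's delay
            if 0 <= j < n:
                out[i] += samples[j]
    return [value / len(channels) for value in out]
```

In practice the delays would be derived from the array geometry and the estimated direction of the speaker, which is outside the scope of this sketch.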
In a further improvement, a backtracking module is provided in the voice control box for tracing an issued instruction back to the identity of the user who issued it.
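A backtracking module of this kind can be pictured as an append-only log that ties each issued instruction to a user identity. The sketch below is a minimal in-memory version; the class and method names are hypothetical, and a real module would presumably persist records and add timestamps, which the source does not describe.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class InstructionRecord:
    sequence: int      # position in the log, starting at 1
    user_id: str       # identity established before the instruction was sent
    instruction: str   # the instruction that was issued


class BacktrackLog:
    """Append-only record of who issued which instruction."""

    def __init__(self) -> None:
        self._records: List[InstructionRecord] = []

    def record(self, user_id: str, instruction: str) -> InstructionRecord:
        rec = InstructionRecord(len(self._records) + 1, user_id, instruction)
        self._records.append(rec)
        return rec

    def who_issued(self, instruction: str) -> List[str]:
        """Trace an instruction back to the users who issued it, in order."""
        return [r.user_id for r in self._records if r.instruction == instruction]

    def history_for(self, user_id: str) -> List[str]:
        """List the instructions a given user has issued, in order."""
        return [r.instruction for r in self._records if r.user_id == user_id]
```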
In a further improvement, the semantic understanding includes understanding of standard semantics and understanding of extended semantics.
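The source does not define standard and extended semantics. One plausible reading, assumed here purely for illustration, is that standard semantics covers canonical command phrasings while extended semantics covers indirect or paraphrased utterances that map to the same intent. The tables and intent names below are hypothetical.

```python
from typing import Optional, Tuple

# Canonical phrasings (assumed meaning of "standard semantics").
STANDARD = {
    "turn on the light": "light.on",
    "turn off the light": "light.off",
}

# Indirect phrasings that imply the same intent (assumed "extended semantics").
EXTENDED = {
    "it is too dark in here": "light.on",
    "lights please": "light.on",
    "i am going to sleep": "light.off",
}


def understand(text: str) -> Tuple[Optional[str], str]:
    """Return (intent, source), trying standard phrasings before extended ones."""
    key = " ".join(text.lower().split())
    if key in STANDARD:
        return STANDARD[key], "standard"
    if key in EXTENDED:
        return EXTENDED[key], "extended"
    return None, "unknown"
```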
The beneficial effects of the present invention are as follows. Audio data is acquired by the microphone, which is connected to the voice control terminal by a wired or wireless link such as USB or Bluetooth. The voice control box sends the collected audio data through the AIUI service to the cloud speech recognition/semantic analysis engine, which converts the speech into the required text through speech analysis, semantic understanding, and similar processing. The voice control box converts this text into the corresponding control instruction and sends it to the integrating manufacturer's control center over a transport such as a network or serial port. The voice control box identifies the person issuing the control voice through face recognition, fingerprint authentication, and voiceprint recognition, and decides on the basis of that identification whether to send the voice instruction; it also has a backtracking function, which makes it easy to trace issued instructions. The voice control box has a built-in loudspeaker and supports full-duplex voice interaction and barge-in. The microphone uses a microphone-array noise reduction algorithm and supports near-field and far-field sound pickup. Speech recognition and semantic analysis are performed by the cloud speech recognition/semantic analysis engine deployed in the cloud working together with the local speech recognition/semantic analysis engine deployed on the voice control box, which greatly reduces manual labor and improves working efficiency.
Brief description of the drawings
Fig. 1 is a system connection diagram of the present invention.
Fig. 2 is a schematic diagram of the cloud deployment of the present invention.
Fig. 3 is a framework diagram of the local speech recognition engine system of the present invention.
Detailed description of the embodiments
To deepen understanding of the present invention, it is further described below with reference to an embodiment. The embodiment serves only to explain the invention and does not limit its scope of protection.
As shown in Figs. 1, 2 and 3, this embodiment provides a voice control terminal comprising:
Voice control box: sends the collected audio data through the AIUI service to the speech recognition engine in the cloud, which converts the speech into the required text through semantic understanding; the voice control box converts this text into the corresponding control instruction and sends it to a control center over a network or serial port; a local speech recognition engine is provided in the voice control box;
Cloud: receives and recognizes the audio data transmitted by the voice control box; the cloud comprises a private cloud and a public cloud, the public cloud being deployed on an open network and the private cloud on a local area network; a cloud speech recognition/semantic analysis engine is provided in the cloud and is distributed across the private cloud and the public cloud;
Projection screen: displays the processed audio data;
Microphone: acquires audio data; the microphone is connected to the voice control box by Bluetooth or a MIC interface;
Camera: acquires the user's face data; the camera is connected to the voice control box by USB or a network interface.
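The cooperation between the private cloud, the public cloud, and the local engine on the voice control box can be sketched as an ordered fallback chain. The source does not state a routing policy, so the order used in the usage note (private cloud, then public cloud, then local engine) and the use of `ConnectionError` to signal an unreachable engine are assumptions.

```python
from typing import Callable, Optional, Sequence, Tuple

# An engine takes raw audio bytes and returns recognized text, or None
# when it cannot recognize the audio. Unreachable engines are assumed
# to raise ConnectionError.
Engine = Callable[[bytes], Optional[str]]


def recognize(
    audio: bytes, engines: Sequence[Tuple[str, Engine]]
) -> Optional[Tuple[str, str]]:
    """Try each engine in order; return (engine_name, text) from the first
    engine that is reachable and produces a result, else None."""
    for name, engine in engines:
        try:
            text = engine(audio)
        except ConnectionError:
            continue  # engine unreachable, fall through to the next one
        if text is not None:
            return name, text
    return None
```

A caller might pass `[("private_cloud", ...), ("public_cloud", ...), ("local", ...)]` so that the local engine on the voice control box still handles commands when neither cloud is reachable.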
The voice control box identifies the person issuing the control voice through face recognition, fingerprint authentication, and voiceprint recognition, and decides on the basis of that identification whether to send the voice instruction. A loudspeaker is provided in the voice control box. The microphone uses an array algorithm for noise reduction and supports near-field and far-field sound pickup. A backtracking module is provided in the voice control box for tracing an issued instruction back to the identity of the user who issued it. The local speech recognition engine is used for speech recognition and semantic analysis. The semantic understanding includes understanding of standard semantics and understanding of extended semantics.
In the present invention, audio data is acquired by the microphone, which is connected to the voice control terminal by a wired or wireless link such as USB or Bluetooth. The voice control box sends the collected audio data through the AIUI service to the cloud speech recognition/semantic analysis engine, which converts the speech into the required text through speech analysis, semantic understanding, and similar processing. The voice control box converts this text into the corresponding control instruction and sends it to the integrating manufacturer's control center over a transport such as a network or serial port. The voice control box identifies the person issuing the control voice through face recognition, fingerprint authentication, and voiceprint recognition, and decides on the basis of that identification whether to send the voice instruction; it also has a backtracking function, which makes it easy to trace issued instructions. The voice control box has a built-in loudspeaker and supports full-duplex voice interaction and barge-in. The microphone uses a microphone-array noise reduction algorithm and supports near-field and far-field sound pickup. Speech recognition and semantic analysis are performed by the cloud speech recognition/semantic analysis engine deployed in the cloud working together with the local speech recognition/semantic analysis engine deployed on the voice control box, which greatly reduces manual labor and improves working efficiency.
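Barge-in in a full-duplex dialogue means that the user can interrupt the terminal while it is still speaking. The small state machine below illustrates that behavior; the state names and event names are hypothetical and are not taken from the source.

```python
from enum import Enum
from typing import Iterable


class State(Enum):
    IDLE = "idle"
    LISTENING = "listening"
    SPEAKING = "speaking"


# (current state, event) -> next state. Unlisted pairs leave the state unchanged.
TRANSITIONS = {
    (State.IDLE, "user_speech"): State.LISTENING,
    (State.LISTENING, "reply_ready"): State.SPEAKING,
    (State.SPEAKING, "playback_done"): State.IDLE,
    # Barge-in: user speech during playback stops the reply and
    # returns the terminal to listening.
    (State.SPEAKING, "user_speech"): State.LISTENING,
}


def next_state(state: State, event: str) -> State:
    return TRANSITIONS.get((state, event), state)


def run(events: Iterable[str], state: State = State.IDLE) -> State:
    """Feed a sequence of events through the state machine."""
    for event in events:
        state = next_state(state, event)
    return state
```

Because the microphone stays open while the loudspeaker is playing, a real implementation would also need echo cancellation so that the terminal does not mistake its own reply for user speech; that part is not sketched here.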

Claims (6)

1. A voice control terminal, characterized by comprising:
Voice control box: sends the collected audio data through the AIUI service to the cloud speech recognition/semantic analysis engine, which converts the speech into the required text through semantic understanding; the voice control box converts this text into the corresponding control instruction and sends it to a control center over a network or serial port; a local speech recognition/semantic analysis engine is also provided in the voice control box for speech recognition and semantic analysis;
Cloud: receives and recognizes the audio data transmitted by the voice control box; the cloud comprises a private cloud and a public cloud, the public cloud being deployed on an open network and the private cloud on a local area network; a cloud speech recognition/semantic analysis engine is provided in the cloud and is distributed across the private cloud and the public cloud;
Projection screen: displays the processed audio data, content feedback, guidance, and configuration;
Microphone: acquires audio data; the microphone is connected to the voice control box by Bluetooth or a network interface;
Camera: acquires the user's face data; the camera is connected to the voice control box by USB or a network interface.
2. The voice control terminal according to claim 1, characterized in that the voice control box identifies the person issuing the control voice through face recognition, fingerprint authentication, and voiceprint recognition, and decides on the basis of that identification whether to send the voice instruction.
3. The voice control terminal according to claim 1, characterized in that a loudspeaker is provided in the voice control box.
4. The voice control terminal according to claim 1, characterized in that the microphone uses an array algorithm for noise reduction and supports near-field and far-field sound pickup.
5. The voice control terminal according to claim 1, characterized in that a backtracking module is provided in the voice control box for tracing an issued instruction back to the identity of the user who issued it.
6. The voice control terminal according to claim 1, characterized in that the semantic understanding includes understanding of standard semantics and understanding of extended semantics.
CN201811150055.5A 2018-09-29 2018-09-29 Voice control terminal Pending CN109036430A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811150055.5A CN109036430A (en) 2018-09-29 2018-09-29 Voice control terminal

Publications (1)

Publication Number Publication Date
CN109036430A true CN109036430A (en) 2018-12-18

Family

ID=64615139

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811150055.5A Pending CN109036430A (en) 2018-09-29 2018-09-29 Voice control terminal

Country Status (1)

Country Link
CN (1) CN109036430A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103247291A (en) * 2013-05-07 2013-08-14 华为终端有限公司 Updating method, device, and system of voice recognition device
CN103839549A (en) * 2012-11-22 2014-06-04 腾讯科技(深圳)有限公司 Voice instruction control method and system
CN104318924A (en) * 2014-11-12 2015-01-28 沈阳美行科技有限公司 Method for realizing voice recognition function
CN105045122A (en) * 2015-06-24 2015-11-11 张子兴 Intelligent household natural interaction system based on audios and videos
CN105202721A (en) * 2015-07-31 2015-12-30 广东美的制冷设备有限公司 Air conditioner and control method thereof
US20170302450A1 (en) * 2015-05-05 2017-10-19 ShoCard, Inc. Identity Management Service Using A Blockchain Providing Certifying Transactions Between Devices
CN107682536A (en) * 2017-09-25 2018-02-09 努比亚技术有限公司 A kind of sound control method, terminal and computer-readable recording medium
CN108470533A (en) * 2018-03-30 2018-08-31 南京七奇智能科技有限公司 Enhanced smart interactive advertisement system based on visual human and device

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109657091A (en) * 2019-01-02 2019-04-19 百度在线网络技术(北京)有限公司 State rendering method, device, equipment and the storage medium of interactive voice equipment
US11205431B2 (en) 2019-01-02 2021-12-21 Baidu Online Network Technology (Beijing) Co., Ltd. Method, apparatus and device for presenting state of voice interaction device, and storage medium
CN110691301A (en) * 2019-09-25 2020-01-14 晶晨半导体(深圳)有限公司 Method for testing delay time between far-field voice equipment and external loudspeaker
CN111785277A (en) * 2020-06-29 2020-10-16 北京捷通华声科技股份有限公司 Speech recognition method, speech recognition device, computer-readable storage medium and processor
CN112151062A (en) * 2020-09-27 2020-12-29 广州德初科技有限公司 Virtual sound insulation communication method based on cloud storage
CN112151062B (en) * 2020-09-27 2021-12-24 梅州国威电子有限公司 Sound insulation communication method
CN113223518A (en) * 2021-04-16 2021-08-06 讯飞智联科技(江苏)有限公司 Human-computer interaction method of edge computing gateway based on AI (Artificial Intelligence) voice analysis
CN113223518B (en) * 2021-04-16 2024-03-22 讯飞智联科技(江苏)有限公司 Human-computer interaction method of edge computing gateway based on AI voice analysis
CN114553922A (en) * 2022-02-07 2022-05-27 中煤信息技术(北京)有限公司 Voice-controlled coal mine comprehensive automation system and method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20181218