CN208985692U

CN208985692U - Voice control terminal

Info

Publication number: CN208985692U
Application number: CN201821604498.2U
Authority: CN
Inventors: 陈琦; 陈志军; 占文; 刘倩茹; 梁毅; 武文强
Original assignee: Wuhu Xingtu Robot Technology Co Ltd
Current assignee: Wuhu Xingtu Robot Technology Co Ltd
Priority date: 2018-09-29
Filing date: 2018-09-29
Publication date: 2019-06-14
Anticipated expiration: 2028-09-29

Abstract

The utility model discloses a voice control terminal comprising a voice control box, a cloud, a projection screen, a microphone and a camera. The voice control box sends the collected audio data through the AIUI service to the cloud speech recognition/semantic analysis engine in the cloud, where the speech is converted into the required text by semantic understanding; the voice control box then converts the text into a corresponding control instruction and sends it to a control centre over a network, a serial port or another transmission channel. The voice control box identifies the person issuing a control voice command by face recognition, fingerprint authentication and voiceprint recognition, and decides on the basis of that identification whether to send the voice instruction. A loudspeaker is provided in the voice control box, which supports full-duplex voice interaction and barge-in. Speech recognition and semantic analysis are carried out by the cloud speech recognition/semantic analysis engine deployed in the cloud working together with the local speech recognition/semantic analysis engine deployed on the voice control box, which greatly reduces manual labour and improves working efficiency.

Description

Voice control terminal

Technical field

The utility model relates to the technical field of intelligent voice control, and in particular to a voice control terminal.

Background technique

With the development of science and technology, technology is becoming increasingly intelligent and voice control terminals are becoming increasingly common. Existing voice control terminals all have a user identification function, but most identify the user by voiceprint and some by gesture, so the function is rather limited. The cloud deployment mode is also limited, so voice control terminals process audio information inefficiently, and they lack a voice instruction backtracking function. Solving these problems is therefore particularly important.

Utility model content

In view of the deficiencies of the prior art, the utility model provides a voice control terminal. The voice control box sends the collected audio data through the AIUI service to the cloud speech recognition/semantic analysis engine in the cloud, where the speech is converted into the required text by semantic understanding. The voice control box converts the text into a corresponding control instruction and sends it to a control centre over a network or a serial port. Speech recognition and semantic analysis are carried out by the cloud speech recognition/semantic analysis engine deployed in the cloud working together with the local speech recognition/semantic analysis engine deployed on the voice control box.

To solve the above problems, the utility model provides a voice control terminal comprising:

Voice control box: sends the collected audio data through the AIUI service to the cloud speech recognition/semantic analysis engine in the cloud, where the speech is converted into the required text by semantic understanding; the voice control box converts the text into a corresponding control instruction and sends it to a control centre over a network or a serial port. A local speech recognition/semantic analysis engine is also provided in the voice control box for speech recognition and semantic analysis;

Cloud: used to receive and recognize the audio data transmitted by the voice control box. The cloud comprises a private cloud and a public cloud; the public cloud is deployed on an open network and the private cloud is deployed within a local area network. A cloud speech recognition/semantic analysis engine is provided in the cloud and is distributed across the private cloud and the public cloud;

Projection screen: used to display the processed audio data, content feedback, guidance and configuration;

Microphone: used to collect audio data; the microphone is connected to the voice control box via Bluetooth or a network interface;

Camera: used to collect the user's face data; the camera is connected to the voice control box via USB or a network interface.
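The conversion of recognized text into a control instruction, and its transmission over a network or serial link, can be illustrated with a minimal sketch. The utility model does not disclose a command table or a wire format, so the phrase table `COMMAND_TABLE`, the `ControlInstruction` type and the newline-terminated JSON frame below are all assumptions made for illustration only.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Hypothetical phrase-to-command table; the patent does not define one.
COMMAND_TABLE = {
    "turn on the light": ("light", "on"),
    "turn off the light": ("light", "off"),
    "start the conveyor": ("conveyor", "start"),
}


@dataclass(frozen=True)
class ControlInstruction:
    device: str
    action: str

    def to_frame(self) -> bytes:
        """Serialize as one newline-terminated JSON line (an assumed format).

        The same bytes could be written to a TCP socket or to a serial port.
        """
        payload = {"device": self.device, "action": self.action}
        return (json.dumps(payload, sort_keys=True) + "\n").encode("utf-8")


def text_to_instruction(text: str) -> Optional[ControlInstruction]:
    """Return the instruction for recognized text, or None if it is not a known command."""
    key = " ".join(text.lower().split())  # normalize case and whitespace
    entry = COMMAND_TABLE.get(key)
    if entry is None:
        return None
    return ControlInstruction(*entry)
```

In this sketch, unrecognized text yields `None` rather than a default instruction, so nothing is sent to the control centre unless the text matches a known command.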

A further improvement is that the voice control box identifies the person issuing a control voice command by face recognition, fingerprint authentication and voiceprint recognition, and decides on the basis of that identification whether to send the voice instruction.

A further improvement is that a loudspeaker is provided in the voice control box; the loudspeaker may be built in or external.

A further improvement is that the microphone uses an array algorithm for noise reduction and supports near-field and far-field sound pickup.

A further improvement is that a backtracking module is provided in the voice control box for tracing an issued instruction back to the identity of the user who issued it.

A further improvement is that the semantic understanding includes understanding of the standard meaning and understanding of the extended meaning.
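Identity gating and instruction backtracking, as described in the improvements above, might be combined as in the following sketch. The utility model does not state how the three recognition results are combined or how the backtracking module stores its records; the policy that all three checks must agree on one authorized user, and the names `IdentityResult`, `AuditRecord` and `VoiceControlBox`, are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class IdentityResult:
    """User ID matched by each check, or None if that check found no match."""
    face: Optional[str]
    fingerprint: Optional[str]
    voiceprint: Optional[str]


@dataclass
class AuditRecord:
    user_id: str
    instruction: str


@dataclass
class VoiceControlBox:
    authorized_users: Set[str]
    audit_log: List[AuditRecord] = field(default_factory=list)

    def resolve_user(self, result: IdentityResult) -> Optional[str]:
        """Assumed policy: all three checks must agree on one authorized user."""
        ids = {result.face, result.fingerprint, result.voiceprint}
        if len(ids) != 1:
            return None  # the checks disagree
        user = ids.pop()
        if user is None or user not in self.authorized_users:
            return None
        return user

    def submit(self, result: IdentityResult, instruction: str) -> bool:
        """Record and release the instruction only if the identity is confirmed."""
        user = self.resolve_user(result)
        if user is None:
            return False  # identity not confirmed: the instruction is withheld
        self.audit_log.append(AuditRecord(user, instruction))
        return True

    def trace(self, instruction: str) -> List[str]:
        """Backtracking: list the users who issued a given instruction."""
        return [r.user_id for r in self.audit_log if r.instruction == instruction]
```

A looser policy, for example accepting any single matching check, would only change `resolve_user`; the audit log used for backtracking is independent of that choice.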

The beneficial effects of the utility model are as follows. The utility model collects audio data with a microphone, which is connected to the voice control terminal by wire or wirelessly, for example via USB or Bluetooth. The voice control box sends the collected audio data through the AIUI service to the cloud speech recognition/semantic analysis engine, where the speech is converted into the required text by speech analysis, semantic understanding and the like; the voice control box converts the text into a corresponding control instruction and sends it over a network, a serial port or another transmission channel to the control centre of the integrating manufacturer. The voice control box identifies the person issuing a control voice command by face recognition, fingerprint authentication and voiceprint recognition, and decides on the basis of that identification whether to send the voice instruction; it also has a backtracking function, which makes it convenient for users to trace issued instructions. The voice control box has a built-in loudspeaker and supports full-duplex voice interaction and barge-in. The microphone uses a microphone array algorithm for noise reduction and supports near-field and far-field sound pickup. Speech recognition and semantic analysis are carried out by the cloud speech recognition/semantic analysis engine deployed in the cloud working together with the local speech recognition/semantic analysis engine deployed on the voice control box, which greatly reduces manual labour and improves working efficiency.

Brief description of the drawings

Fig. 1 is the system connection diagram of the utility model.

Fig. 2 is the cloud deployment schematic diagram of the utility model.

Fig. 3 is the system framework diagram of the local speech recognition engine of the utility model.

Specific embodiment

To aid understanding of the utility model, it is further described below with reference to an embodiment. The embodiment serves only to explain the utility model and does not limit its scope of protection.

As shown in Figs. 1, 2 and 3, this embodiment provides a voice control terminal comprising:

Voice control box: sends the collected audio data through the AIUI service to the speech recognition engine in the cloud, where the speech is converted into the required text by semantic understanding; the voice control box converts the text into a corresponding control instruction and sends it to a control centre over a network or a serial port. A local speech recognition engine is provided in the voice control box;

Projection screen: used to display the processed audio data;

Microphone: used to collect audio data; the microphone is connected to the voice control box via Bluetooth or a MIC interface.

The voice control box identifies the person issuing a control voice command by face recognition, fingerprint authentication and voiceprint recognition, and decides on the basis of that identification whether to send the voice instruction. A loudspeaker is provided in the voice control box. The microphone uses an array algorithm for noise reduction and supports near-field and far-field sound pickup. A backtracking module is provided in the voice control box for tracing an issued instruction back to the identity of the user who issued it. The local speech recognition engine is used for speech recognition and semantic analysis. The semantic understanding includes understanding of the standard meaning and understanding of the extended meaning.
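The utility model does not define "standard meaning" and "extended meaning" any further. One possible reading, sketched below purely as an assumption, is a two-tier lookup: a canonical phrasing is matched first, and a looser paraphrase is then mapped to the same intent. The tables `STANDARD` and `EXTENDED` and the intent names are hypothetical.

```python
from typing import Optional, Tuple

# Hypothetical tables; neither the phrases nor the intent names come from the patent.
STANDARD = {
    "turn on the light": "LIGHT_ON",
    "turn off the light": "LIGHT_OFF",
}
EXTENDED = {
    "it is too dark in here": "LIGHT_ON",
    "lights out": "LIGHT_OFF",
}


def understand(text: str) -> Optional[Tuple[str, str]]:
    """Return (intent, tier), trying the standard meaning before the extended one."""
    key = " ".join(text.lower().split())  # normalize case and whitespace
    if key in STANDARD:
        return STANDARD[key], "standard"
    if key in EXTENDED:
        return EXTENDED[key], "extended"
    return None
```

A production semantic engine would generalize beyond fixed phrases; the sketch only shows how two differently worded utterances can resolve to one intent.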

The utility model collects audio data with a microphone, which is connected to the voice control terminal by wire or wirelessly, for example via USB or Bluetooth. The voice control box sends the collected audio data through the AIUI service to the cloud speech recognition/semantic analysis engine, where the speech is converted into the required text by speech analysis, semantic understanding and the like; the voice control box converts the text into a corresponding control instruction and sends it over a network, a serial port or another transmission channel to the control centre of the integrating manufacturer. The voice control box identifies the person issuing a control voice command by face recognition, fingerprint authentication and voiceprint recognition, and decides on the basis of that identification whether to send the voice instruction; it also has a backtracking function, which makes it convenient for users to trace issued instructions. The voice control box has a built-in loudspeaker and supports full-duplex voice interaction and barge-in. The microphone uses a microphone array algorithm for noise reduction and supports near-field and far-field sound pickup. Speech recognition and semantic analysis are carried out by the cloud speech recognition/semantic analysis engine deployed in the cloud working together with the local speech recognition/semantic analysis engine deployed on the voice control box, which greatly reduces manual labour and improves working efficiency.
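The utility model states that the cloud engine and the local engine cooperate but does not say how the work is divided between them. One plausible arrangement, shown here only as an assumption, is to try the cloud engine first and fall back to the engine on the voice control box when the cloud cannot be reached. The recognizers are plain callables standing in for the real engines; no AIUI API call is used or implied, and `EngineUnavailable` is a hypothetical exception.

```python
from typing import Callable, Tuple

# A recognizer takes raw audio bytes and returns recognized text.
Recognizer = Callable[[bytes], str]


class EngineUnavailable(Exception):
    """Raised by a recognizer that cannot be reached (hypothetical)."""


def recognize(audio: bytes, cloud: Recognizer, local: Recognizer) -> Tuple[str, str]:
    """Try the cloud engine first; fall back to the on-box engine if it is unreachable.

    Returns the recognized text and the name of the engine that produced it.
    """
    try:
        return cloud(audio), "cloud"
    except EngineUnavailable:
        return local(audio), "local"


# Example usage with stand-in engines.
def offline_cloud(audio: bytes) -> str:
    raise EngineUnavailable("no network")


def on_box_engine(audio: bytes) -> str:
    return "turn on the light"


text, source = recognize(b"\x00\x01", offline_cloud, on_box_engine)
```

Other divisions of labour are equally consistent with the text, for example handling wake words or a small command set locally and sending everything else to the cloud.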

Claims

1. A voice control terminal, characterized in that it comprises:

Voice control box: sends the collected audio data through the AIUI service to the cloud speech recognition/semantic analysis engine in the cloud, where the speech is converted into the required text by semantic understanding; the voice control box converts the text into a corresponding control instruction and sends it to a control centre over a network or a serial port; a local speech recognition/semantic analysis engine is also provided in the voice control box for speech recognition and semantic analysis;

Cloud: used to receive and recognize the audio data transmitted by the voice control box; the cloud comprises a private cloud and a public cloud, the public cloud being deployed on an open network and the private cloud being deployed within a local area network; a cloud speech recognition/semantic analysis engine is provided in the cloud and is distributed across the private cloud and the public cloud;

Microphone: used to collect audio data; the microphone is connected to the voice control box via Bluetooth or a network interface;

Camera: used to collect the user's face data; the camera is connected to the voice control box via USB or a network interface.

2. The voice control terminal according to claim 1, characterized in that the voice control box identifies the person issuing a control voice command by face recognition, fingerprint authentication and voiceprint recognition, and decides on the basis of that identification whether to send the voice instruction.

3. The voice control terminal according to claim 1, characterized in that a loudspeaker is provided in the voice control box.

4. The voice control terminal according to claim 1, characterized in that the microphone uses an array algorithm for noise reduction and supports near-field and far-field sound pickup.

5. The voice control terminal according to claim 1, characterized in that a backtracking module is provided in the voice control box for tracing an issued instruction back to the identity of the user who issued it.

6. The voice control terminal according to claim 1, characterized in that the semantic understanding includes understanding of the standard meaning and understanding of the extended meaning.