CN109036430A

CN109036430A - Voice control terminal

Info

Publication number: CN109036430A
Application number: CN201811150055.5A
Authority: CN
Inventors: 陈琦; 陈志军; 占文; 刘倩茹; 梁毅; 武文强
Original assignee: Wuhu Xingtu Robot Technology Co Ltd
Current assignee: Wuhu Xingtu Robot Technology Co Ltd
Priority date: 2018-09-29
Filing date: 2018-09-29
Publication date: 2018-12-18

Abstract

The present invention discloses a voice control terminal comprising a voice control box, a cloud, a projection screen, a microphone and a camera. The voice control box sends collected audio data through the AIUI service to the cloud speech recognition/semantic analysis engine in the cloud, where semantic understanding converts the speech into the required text; the voice control box converts this text into a corresponding control instruction and sends it to a control center over a transport such as a network or a serial port. The voice control box identifies the person issuing the control voice by face recognition, fingerprint authentication and voiceprint recognition, and decides from the identification result whether to send the voice instruction. A loudspeaker is provided in the voice control box, which supports full-duplex voice interaction and a barge-in function. Speech recognition and semantic analysis are carried out by the cloud speech recognition/semantic analysis engine deployed in the cloud working together with the local speech recognition/semantic analysis engine deployed on the voice control box, which greatly reduces manual labor and improves working efficiency.

Description

Voice control terminal

Technical field

The present invention relates to the technical field of intelligent voice control, and more particularly to a voice control terminal.

Background art

With the development of science and technology, devices are becoming increasingly intelligent and voice control terminals increasingly common. Existing voice control terminals generally provide user identification, but most identify the user by voiceprint and some by gesture, so the function is rather limited. Their cloud deployment mode is likewise limited, which makes these terminals inefficient at processing audio information, and they cannot trace a voice instruction back to its source. Solving these problems is therefore particularly important.

Summary of the invention

In view of the deficiencies of the prior art, the present invention provides a voice control terminal in which the voice control box sends collected audio data through the AIUI service to the cloud speech recognition/semantic analysis engine in the cloud, semantic understanding converts the speech into the required text, and the voice control box converts the text into a corresponding control instruction and sends it to the control center over a network or serial port. Speech recognition and semantic analysis are carried out by the cloud speech recognition/semantic analysis engine deployed in the cloud working together with the local speech recognition/semantic analysis engine deployed on the voice control box.

To solve the above problems, the present invention provides a voice control terminal comprising:

Voice control box: sends collected audio data through the AIUI service to the cloud speech recognition/semantic analysis engine in the cloud, where semantic understanding converts the speech into the required text; the voice control box converts the text into a corresponding control instruction and sends it to the control center over a network or serial port. A local speech recognition/semantic analysis engine for speech recognition and semantic analysis is also provided in the voice control box;

Cloud: receives and recognizes the audio data transmitted by the voice control box. The cloud comprises a private cloud and a public cloud; the public cloud is deployed on an open network and the private cloud is deployed in a local area network. A cloud speech recognition/semantic analysis engine is provided in the cloud and is distributed across the private cloud and the public cloud;

Projection screen: displays the processed audio data, content feedback, guidance and configuration;

Microphone: acquires audio data; the microphone is connected to the voice control box via Bluetooth or a network interface;

Camera: acquires the user's face data; the camera is connected to the voice control box via USB or a network interface.
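The step in which the voice control box converts recognized text into a control instruction and sends it over a network or serial port can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the phrase table `COMMAND_TABLE`, the `ControlInstruction` fields, the JSON payload and the serial frame layout (start byte, length, payload, checksum) are hypothetical and are not specified by the patent.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional

# Hypothetical phrase-to-command table; the patent does not define one.
COMMAND_TABLE = {
    "turn on the light": ("light", "on"),
    "turn off the light": ("light", "off"),
}


@dataclass(frozen=True)
class ControlInstruction:
    device: str
    action: str


def text_to_instruction(text: str) -> Optional[ControlInstruction]:
    """Map recognized text to a control instruction, or None if unknown."""
    entry = COMMAND_TABLE.get(text.strip().lower())
    return ControlInstruction(*entry) if entry else None


def encode_for_network(instr: ControlInstruction) -> bytes:
    """Serialize as UTF-8 JSON, e.g. as the body of a network message."""
    return json.dumps(asdict(instr), sort_keys=True).encode("utf-8")


def encode_for_serial(instr: ControlInstruction) -> bytes:
    """Assumed frame: 0xAA start byte, 1-byte length, payload, checksum."""
    payload = f"{instr.device}:{instr.action}".encode("ascii")
    if len(payload) > 255:
        raise ValueError("payload too long for a 1-byte length field")
    checksum = sum(payload) % 256  # simple additive checksum
    return bytes([0xAA, len(payload)]) + payload + bytes([checksum])
```

Keeping the instruction object separate from its encoders means the same instruction can be sent over either transport without touching the text-matching step.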

A further improvement is that the voice control box identifies the person issuing the control voice by face recognition, fingerprint authentication and voiceprint recognition, and decides from the identification result whether to send the voice instruction.

A further improvement is that a loudspeaker is provided in the voice control box; the loudspeaker may be built in or external.

A further improvement is that the microphone uses an array algorithm for noise reduction and supports near-field and far-field sound pickup.

A further improvement is that a backtracking module is provided in the voice control box for tracing an issued instruction back to the identity of the user who issued it.

A further improvement is that the semantic understanding includes understanding of the standard meaning and understanding of the extended meaning.
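The patent does not define how the standard and extended meanings differ. One possible reading, offered only as an assumption, is that a standard meaning is an exact canonical command phrase and an extended meaning is a looser paraphrase that maps to the same intent. The sketch below illustrates that reading; the tables `STANDARD` and `EXTENDED` and the intent names are hypothetical.

```python
from typing import Optional, Tuple

# Hypothetical canonical phrases (assumed "standard" meanings).
STANDARD = {
    "turn on the light": "light_on",
    "turn off the light": "light_off",
}

# Hypothetical paraphrase fragments (assumed "extended" meanings).
EXTENDED = {
    "light_on": ["too dark", "lights please"],
    "light_off": ["too bright", "lights out"],
}


def understand(text: str) -> Tuple[Optional[str], str]:
    """Return (intent, match_kind) where match_kind names the layer that matched."""
    norm = " ".join(text.lower().split())  # lowercase and collapse whitespace
    if norm in STANDARD:
        return STANDARD[norm], "standard"
    for intent, fragments in EXTENDED.items():
        if any(fragment in norm for fragment in fragments):
            return intent, "extended"
    return None, "unknown"
```

Under this reading, "Turn on the light" matches the standard layer, while "It is too dark in here" reaches the same intent through the extended layer.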

The beneficial effects of the present invention are as follows. Audio data is acquired by the microphone, which is connected to the voice control terminal by wire or wirelessly, for example via USB or Bluetooth. The voice control box sends the collected audio data through the AIUI service to the cloud speech recognition/semantic analysis engine, where speech analysis, semantic understanding and the like convert the speech into the required text; the voice control box converts the text into a corresponding control instruction and sends it to the control center of the integrating manufacturer over a transport such as a network or serial port. The voice control box identifies the person issuing the control voice by face recognition, fingerprint authentication and voiceprint recognition, decides from the identification result whether to send the voice instruction, and also provides a backtracking function so that issued instructions can conveniently be traced. The voice control box has a built-in loudspeaker and supports full-duplex voice interaction and a barge-in function. The microphone uses a microphone array noise reduction algorithm and supports near-field and far-field pickup. Speech recognition and semantic analysis are carried out by the cloud speech recognition/semantic analysis engine deployed in the cloud working together with the local speech recognition/semantic analysis engine deployed on the voice control box, which greatly reduces manual labor and improves working efficiency.
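The patent states that the cloud engine and the local engine cooperate but does not say how. A minimal sketch of one plausible policy, assumed here purely for illustration, is to try the cloud engine first and fall back to the local engine when the cloud is unreachable. The `Engine` type and the `recognize` function are hypothetical names, and each engine is modeled as a plain callable from audio bytes to text.

```python
from typing import Callable, Tuple

# An engine is modeled as any callable that turns audio bytes into text.
Engine = Callable[[bytes], str]


def recognize(audio: bytes, cloud_engine: Engine, local_engine: Engine) -> Tuple[str, str]:
    """Return (text, source), preferring the cloud engine (assumed policy)."""
    try:
        return cloud_engine(audio), "cloud"
    except (ConnectionError, TimeoutError):
        # Cloud unreachable or too slow: fall back to the on-box engine.
        return local_engine(audio), "local"
```

Returning the source alongside the text lets the caller log which engine produced each result.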

Brief description of the drawings

Fig. 1 is a system connection diagram of the present invention.

Fig. 2 is a schematic diagram of the cloud deployment of the present invention.

Fig. 3 is a system framework diagram of the local speech recognition engine of the present invention.

Specific embodiment

To deepen the understanding of the present invention, it is further described below with reference to an embodiment. The embodiment serves only to explain the invention and is not intended to limit its scope of protection.

As shown in Figs. 1, 2 and 3, this embodiment provides a voice control terminal comprising:

Voice control box: sends collected audio data through the AIUI service to the speech recognition engine in the cloud, where semantic understanding converts the speech into the required text; the voice control box converts the text into a corresponding control instruction and sends it to the control center over a network or serial port. A local speech recognition engine is provided in the voice control box;

Projection screen: displays the processed audio data;

Microphone: acquires audio data; the microphone is connected to the voice control box via Bluetooth or a MIC interface.

The voice control box identifies the person issuing the control voice by face recognition, fingerprint authentication and voiceprint recognition, and decides from the identification result whether to send the voice instruction. A loudspeaker is provided in the voice control box. The microphone uses an array algorithm for noise reduction and supports near-field and far-field pickup. A backtracking module is provided in the voice control box for tracing an issued instruction back to the identity of the user who issued it. The local speech recognition engine is used for speech recognition and semantic analysis. The semantic understanding includes understanding of the standard meaning and understanding of the extended meaning.
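The identity check and the backtracking module can be sketched together in Python. The patent does not state how the three recognition results are combined or what the backtracking module stores, so the sketch assumes that all three recognizers must agree on the same authorized user and that every instruction is logged with the identified user. The class and field names (`VoiceControlBox`, `TraceRecord`, `trace_log`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class TraceRecord:
    user: Optional[str]   # identified user, or None if identification failed
    instruction: str
    sent: bool            # whether the instruction was forwarded


@dataclass
class VoiceControlBox:
    authorized_users: Set[str]
    trace_log: List[TraceRecord] = field(default_factory=list)

    @staticmethod
    def resolve_identity(face: Optional[str], fingerprint: Optional[str],
                         voiceprint: Optional[str]) -> Optional[str]:
        """Assumed policy: accept a user only if all three recognizers agree."""
        if face is not None and face == fingerprint == voiceprint:
            return face
        return None

    def handle(self, instruction: str, face: Optional[str],
               fingerprint: Optional[str], voiceprint: Optional[str]) -> bool:
        """Decide whether to send the instruction and record the decision."""
        user = self.resolve_identity(face, fingerprint, voiceprint)
        sent = user is not None and user in self.authorized_users
        self.trace_log.append(TraceRecord(user, instruction, sent))
        return sent

    def trace(self, instruction: str) -> List[Optional[str]]:
        """Backtracking: list the users recorded for a given instruction."""
        return [r.user for r in self.trace_log if r.instruction == instruction]
```

Rejected attempts are logged as well, so a later trace shows both who issued an instruction and whether it was actually sent.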

In the present invention, audio data is acquired by the microphone, which is connected to the voice control terminal by wire or wirelessly, for example via USB or Bluetooth. The voice control box sends the collected audio data through the AIUI service to the cloud speech recognition/semantic analysis engine, where speech analysis, semantic understanding and the like convert the speech into the required text; the voice control box converts the text into a corresponding control instruction and sends it to the control center of the integrating manufacturer over a transport such as a network or serial port. The voice control box identifies the person issuing the control voice by face recognition, fingerprint authentication and voiceprint recognition, decides from the identification result whether to send the voice instruction, and also provides a backtracking function so that issued instructions can conveniently be traced. The voice control box has a built-in loudspeaker and supports full-duplex voice interaction and a barge-in function. The microphone uses a microphone array noise reduction algorithm and supports near-field and far-field pickup. Speech recognition and semantic analysis are carried out by the cloud speech recognition/semantic analysis engine deployed in the cloud working together with the local speech recognition/semantic analysis engine deployed on the voice control box, which greatly reduces manual labor and improves working efficiency.
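The patent does not name the microphone array noise reduction algorithm it uses. As an illustration only, the sketch below shows delay-and-sum beamforming, a common array technique chosen here as a stand-in: each channel is advanced by its known arrival delay so that the speech aligns across microphones, and the aligned channels are averaged so that uncorrelated noise partly cancels. The function name and the integer sample delays are assumptions.

```python
from typing import List, Sequence


def delay_and_sum(channels: Sequence[Sequence[float]],
                  delays: Sequence[int]) -> List[float]:
    """Average the channels after advancing each by its integer sample delay."""
    if len(channels) != len(delays) or not channels:
        raise ValueError("need one non-negative delay per channel")
    if any(d < 0 for d in delays):
        raise ValueError("delays must be non-negative")
    # Only samples present in every aligned channel can be averaged.
    usable = min(len(ch) - d for ch, d in zip(channels, delays))
    return [
        sum(ch[d + n] for ch, d in zip(channels, delays)) / len(channels)
        for n in range(max(usable, 0))
    ]
```

For example, if the second microphone hears the same signal one sample later than the first, passing delays `[0, 1]` realigns the two channels before averaging.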

Claims

1. A voice control terminal, characterized in that it comprises

2. The voice control terminal according to claim 1, characterized in that the voice control box identifies the person issuing the control voice by face recognition, fingerprint authentication and voiceprint recognition, and decides from the identification result whether to send the voice instruction.

3. The voice control terminal according to claim 1, characterized in that a loudspeaker is provided in the voice control box.

4. The voice control terminal according to claim 1, characterized in that the microphone uses an array algorithm for noise reduction and supports near-field and far-field pickup.

5. The voice control terminal according to claim 1, characterized in that a backtracking module is provided in the voice control box for tracing an issued instruction back to the identity of the user who issued it.

6. The voice control terminal according to claim 1, characterized in that the semantic understanding includes understanding of the standard meaning and understanding of the extended meaning.