CN106935240A - Voice translation method, device, terminal device and cloud server based on artificial intelligence - Google Patents
- Publication number
- CN106935240A (application number CN201710183965.2A)
- Authority
- CN
- China
- Prior art keywords
- languages
- voice
- target language
- terminal device
- text
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/005—Language recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/58—Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G10L15/30—Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Acoustics & Sound (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Signal Processing (AREA)
- Artificial Intelligence (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Machine Translation (AREA)
Abstract
This application proposes an artificial-intelligence-based voice translation method, device, terminal device, and cloud server. The voice translation method includes: receiving the source-language voice that a user inputs through a terminal device; sending the source-language voice to a cloud server; receiving the target-language audio file sent back by the cloud server; and playing the target-language audio file. The application enables real-time voice translation with high translation accuracy, meeting the translation needs of overseas travel scenarios.
Description
Technical field
The present application relates to the field of speech processing technology, and in particular to an artificial-intelligence-based voice translation method, device, terminal device, and cloud server.
Background technology
In the current overseas travel market, translation software runs mainly on mobile phones. Although such software can handle language exchange in some scenarios, its translation accuracy is low. Moreover, while traveling abroad, users rely heavily on map and camera applications (Application; hereinafter: APP), so invoking a translation APP requires switching applications, which makes real-time use unsatisfactory. Meanwhile, the growing population of middle-aged and elderly travelers faces a high learning cost for phone software and has a strong demand for a "press-and-translate" dedicated translator.
However, existing translation hardware products are essentially modified electronic dictionaries, offering mostly text lookup; products supporting real-time voice translation are rare and their accuracy is low. In addition, most existing translation hardware addresses language-learning needs, providing limited support and low translation accuracy for overseas travel scenarios.
Summary of the invention
The application aims to solve, at least to some extent, one of the technical problems in the related art.
To this end, the first object of the application is to propose an artificial-intelligence-based voice translation method that can realize real-time voice translation, meet the translation needs of overseas travel scenarios, and achieve high translation accuracy.
The second object of the application is to propose an artificial-intelligence-based speech translation device.
The third object of the application is to propose a terminal device.
The fourth object of the application is to propose a cloud server.
The fifth object of the application is to propose a storage medium containing computer-executable instructions.
To achieve these objects, the voice translation method based on artificial intelligence of the first-aspect embodiment of the application includes: receiving the source-language voice that a user inputs through a terminal device; sending the source-language voice to a cloud server; receiving the target-language audio file sent by the cloud server, where the audio file is obtained after the cloud server performs speech recognition on the source-language voice, determines that the voice is to be translated into at least one target language other than the source language among at least two target languages, translates the recognized text into the text of the determined target language, and performs speech synthesis on the translated target-language text; and playing the target-language audio file.
In the voice translation method of this embodiment, after the source-language voice input by the user through the terminal device is received, it is sent to the cloud server; the target-language audio file sent by the cloud server is then received and played. Real-time voice translation is thus realized, meeting the translation needs of overseas travel scenarios with high translation accuracy.
To achieve these objects, the voice translation method based on artificial intelligence of the second-aspect embodiment of the application includes: receiving the source-language voice sent by a terminal device; performing speech recognition on the source-language voice to convert it into source-language text; determining that the source-language voice is to be translated into at least one target language other than the source language among at least two target languages; translating the source-language text into the text of the determined target language and performing speech synthesis on the translated text to obtain a target-language audio file; and sending the target-language audio file to the terminal device for playback.
In the voice translation method of this embodiment, after receiving the source-language voice sent by the terminal device, the cloud server recognizes it into source-language text, determines the target language(s) other than the source language, translates the text into the determined target language, synthesizes the translated text into a target-language audio file, and finally sends the audio file to the terminal device for playback. Real-time voice translation is thus realized, meeting the translation needs of overseas travel scenarios with high translation accuracy.
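The server-side pipeline summarized above (speech recognition, then translation into each non-source target language, then speech synthesis) can be sketched as follows. This is an illustrative sketch only: the function names and signatures are assumptions, and the recognizer, translator, and synthesizer are toy stand-ins rather than real ASR/MT/TTS services.

```python
def server_translate(source_audio, source_lang, target_langs,
                     recognize, translate, synthesize):
    """Cloud-server pipeline: ASR -> MT -> TTS (stand-in callables)."""
    # Speech recognition: source-language audio -> source-language text.
    source_text = recognize(source_audio, source_lang)
    results = {}
    # Translate into every configured language except the source itself.
    for lang in target_langs:
        if lang == source_lang:
            continue
        target_text = translate(source_text, source_lang, lang)
        # Speech synthesis (TTS): target-language text -> audio bytes.
        results[lang] = synthesize(target_text, lang)
    return results

# Toy stand-ins so the pipeline can be exercised end to end.
def fake_recognize(audio, lang):
    return audio.decode()              # pretend the audio bytes are the transcript

def fake_translate(text, src, dst):
    return f"[{src}->{dst}] {text}"    # tag the text instead of translating

def fake_synthesize(text, lang):
    return text.encode()               # pretend the text bytes are PCM audio

out = server_translate(b"nearest subway station", "zh", ["zh", "en"],
                       fake_recognize, fake_translate, fake_synthesize)
```

Injecting the three services as parameters mirrors the modular split (recognition, determination/translation, synthesis) that the device embodiments below formalize as separate modules.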
To achieve these objects, the speech translation device based on artificial intelligence of the third-aspect embodiment of the application is arranged on a terminal device and includes: a receiving module for receiving the source-language voice input by a user through the terminal device; a sending module for sending the source-language voice to a cloud server; the receiving module further being used to receive the target-language audio file sent by the cloud server, where the audio file is obtained after the cloud server performs speech recognition on the source-language voice, determines that the voice is to be translated into at least one target language other than the source language among at least two target languages, translates the recognized text into the text of the determined target language, and performs speech synthesis on the translated text; and a playback module for playing the target-language audio file.
In the speech translation device of this embodiment, after the receiving module receives the source-language voice input by the user through the terminal device, the sending module sends it to the cloud server; the receiving module then receives the target-language audio file sent by the cloud server, and the playback module plays it. Real-time voice translation is thus realized, meeting the translation needs of overseas travel scenarios with high translation accuracy.
To achieve these objects, the speech translation device based on artificial intelligence of the fourth-aspect embodiment of the application is arranged on a cloud server and includes: a receiving module for receiving the source-language voice sent by a terminal device; a speech recognition module for performing speech recognition on the source-language voice to convert it into source-language text; a determining module for determining that the source-language voice is to be translated into at least one target language other than the source language among at least two target languages; a translation module for translating the source-language text into the text of the target language determined by the determining module; a speech synthesis module for performing speech synthesis on the translated target-language text to obtain a target-language audio file; and a sending module for sending the target-language audio file obtained by the speech synthesis module to the terminal device for playback.
In the speech translation device of this embodiment, after the receiving module receives the source-language voice sent by the terminal device, the speech recognition module converts it into source-language text; after the determining module determines the target language(s) other than the source language, the translation module translates the text into the determined target language, the speech synthesis module synthesizes the translated text into a target-language audio file, and the sending module sends the audio file to the terminal device for playback. Real-time voice translation is thus realized, meeting the translation needs of overseas travel scenarios with high translation accuracy.
To achieve these objects, the terminal device of the fifth-aspect embodiment of the application includes: one or more processors; a memory for storing one or more programs; a receiver for receiving the source-language voice input by a user through the terminal device and, after the voice is sent to a cloud server by a transmitter, for receiving the target-language audio file sent by the cloud server, where the audio file is obtained after the cloud server performs speech recognition on the source-language voice, determines that the voice is to be translated into at least one target language other than the source language among at least two target languages, translates the recognized text into the text of the determined target language, and performs speech synthesis on the translated text; and the transmitter, for sending the source-language voice to the cloud server. When the one or more programs are executed by the one or more processors, the one or more processors implement the method described above.
To achieve these objects, the sixth-aspect embodiment of the application provides a storage medium containing computer-executable instructions which, when executed by a computer processor, perform the method described above.
To achieve these objects, the cloud server of the seventh-aspect embodiment of the application includes: one or more processors; a memory for storing one or more programs; a receiver for receiving the source-language voice sent by a terminal device; and a transmitter for sending the target-language audio file to the terminal device for playback. When the one or more programs are executed by the one or more processors, the one or more processors implement the method described above.
To achieve these objects, the eighth-aspect embodiment of the application provides a storage medium containing computer-executable instructions which, when executed by a computer processor, perform the method described above.
Additional aspects and advantages of the application will be set forth in part in the following description, will partly become apparent from it, or will be learned through practice of the application.
Brief description of the drawings
The above and/or additional aspects and advantages of the application will become apparent and readily understood from the following description of embodiments with reference to the accompanying drawings, in which:
Fig. 1 is a flow chart of one embodiment of the artificial-intelligence-based voice translation method of the application;
Fig. 2 is a flow chart of another embodiment of the voice translation method;
Fig. 3 is a schematic diagram of one embodiment of the terminal device in the voice translation method;
Fig. 4 is a flow chart of a further embodiment of the voice translation method;
Fig. 5 is a flow chart of a further embodiment of the voice translation method;
Fig. 6 is a flow chart of a further embodiment of the voice translation method;
Fig. 7 is a flow chart of a further embodiment of the voice translation method;
Fig. 8 is a flow chart of a further embodiment of the voice translation method;
Fig. 9 is a flow chart of a further embodiment of the voice translation method;
Fig. 10 is a structural schematic diagram of one embodiment of the artificial-intelligence-based speech translation device of the application;
Fig. 11 is a structural schematic diagram of another embodiment of the speech translation device;
Fig. 12 is a structural schematic diagram of a further embodiment of the speech translation device;
Fig. 13 is a structural schematic diagram of a further embodiment of the speech translation device;
Fig. 14 is a structural schematic diagram of one embodiment of the terminal device of the application;
Fig. 15 is a structural schematic diagram of one embodiment of the cloud server of the application.
Detailed description of the embodiments
Embodiments of the application are described in detail below, with examples shown in the accompanying drawings, in which the same or similar reference numbers throughout denote the same or similar elements or elements with the same or similar functions. The embodiments described below with reference to the drawings are exemplary, intended only to explain the application, and are not to be construed as limiting it. On the contrary, the embodiments of the application include all changes, modifications, and equivalents falling within the spirit and scope of the appended claims.
Artificial intelligence (Artificial Intelligence; hereinafter: AI) is a new technical science that studies and develops theories, methods, techniques, and application systems for simulating, extending, and expanding human intelligence. As a branch of computer science, it attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a manner similar to human intelligence; research in the field includes robotics, speech recognition, image recognition, natural language processing, and expert systems.
Fig. 1 is a flow chart of one embodiment of the artificial-intelligence-based voice translation method of the application. As shown in Fig. 1, the voice translation method may include:
Step 101: receive the source-language voice input by the user through the terminal device.
Step 102: send the source-language voice to the cloud server.
Specifically, the terminal device may upload the source-language voice to the cloud server in pulse code modulation (Pulse Code Modulation; hereinafter: PCM) format.
Step 103: receive the target-language audio file sent by the cloud server, where the audio file is obtained after the cloud server performs speech recognition on the source-language voice, determines that the voice is to be translated into at least one target language other than the source language among at least two target languages, translates the recognized text into the text of the determined target language, and performs speech synthesis on the translated text.
Specifically, the target-language audio file sent by the cloud server to the terminal device is also a PCM-format file, and when performing speech synthesis on the translated target-language text, the cloud server may use a text-to-speech (Text To Speech; hereinafter: TTS) service.
Step 104: play the target-language audio file.
In the above voice translation method, after the source-language voice input by the user through the terminal device is received, it is sent to the cloud server; the target-language audio file sent by the cloud server is then received and played. Real-time voice translation is thus realized, meeting the translation needs of overseas travel scenarios with high translation accuracy.
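The terminal-side flow of steps 101 through 104 can be sketched as follows. The transport and playback objects are illustrative stand-ins for the PCM upload channel and the device's audio output, not an actual device API.

```python
class TranslationTerminal:
    """Terminal-side flow of Fig. 1: upload the source-language voice as
    PCM, receive the synthesized target-language audio, and play it."""

    def __init__(self, transport, play):
        self.transport = transport   # stand-in for the link to the cloud server
        self.play = play             # stand-in for the audio playback routine

    def translate_utterance(self, pcm_voice: bytes) -> bytes:
        self.transport.send(pcm_voice)           # step 102: send to the cloud
        target_audio = self.transport.receive()  # step 103: receive PCM result
        self.play(target_audio)                  # step 104: play it back
        return target_audio


class FakeTransport:
    """Stand-in cloud link that 'translates' by upper-casing the bytes."""
    def send(self, pcm):
        self._pending = pcm.upper()

    def receive(self):
        return self._pending


played = []
terminal = TranslationTerminal(FakeTransport(), played.append)
result = terminal.translate_utterance(b"ni hao")
```

Note that the terminal itself performs no recognition, translation, or synthesis; per the description, all three stages run on the cloud server, and both directions of the exchange use PCM audio.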
Fig. 2 is a flow chart of another embodiment of the artificial-intelligence-based voice translation method of the application. As shown in Fig. 2, step 101 in the embodiment of Fig. 1 may be:
Step 201: after the user triggers the translation button of the terminal device, receive the source-language voice input by the user through the microphone of the terminal device.
In this embodiment, the translation button of the terminal device may be a mechanical key or a virtual key provided on the terminal device; this embodiment does not limit the form of the translation button, but is illustrated with the translation button being a mechanical key on the terminal device.
The translation button may be triggered by a long press, a single click, a double click, or the like; this embodiment does not limit the triggering mode, but is illustrated with triggering by a long press.
It should be noted that in this embodiment the terminal device has a single translation button, as shown in Fig. 3, which is a schematic diagram of one embodiment of the terminal device in the voice translation method of the application.
That is, in the hardware design of the terminal device of this embodiment, speech recognition plus translation can be triggered with one mechanical key. In use, the user simply holds down the translation button and speaks the voice to be translated into the microphone, for example saying in Chinese "I want to go to the nearest subway station", then releases the translation button; the terminal device then plays the spoken result "I want to go to the nearest subway station", thereby realizing real-time "one-key" voice translation.
Fig. 4 is a flow chart of a further embodiment of the artificial-intelligence-based voice translation method of the application. As shown in Fig. 4, in the embodiment of Fig. 1, before step 103 the method may further include:
Step 401: obtain the target languages set by the user and upload them to the cloud server, so that the cloud server stores the identifier of the terminal device in correspondence with the target languages, where the target languages include at least two languages and the at least two languages include the source language.
Specifically, after the user sets the target languages on the terminal device, the terminal device obtains them and uploads them to the cloud server, which stores the identifier of the terminal device in correspondence with the target languages. The identifier of the terminal device may be any information that uniquely identifies it, for example its device number; this embodiment does not limit the form of the identifier.
The target languages may include at least two languages, among them the source language; that is, in this embodiment the terminal device can realize mutual voice translation among the at least two target languages set by the user. For example, suppose the user sets the target languages to "Chinese and English". If the user holds down the single translation button of the terminal device, says in Chinese "I want to go to the nearest subway station", and releases the button, then by the voice translation method provided by the application the terminal device plays the spoken English result "I want to go to the nearest subway station". Conversely, if the user holds down the button, says in English "I want to go to the nearest subway station", and releases the button, the terminal device plays the corresponding spoken Chinese result.
Similarly, the target languages may be "Chinese, English and Japanese", in which case the terminal device realizes mutual voice translation among Chinese, English, and Japanese: if the user inputs a Chinese sentence, the terminal device plays its Japanese translation and English translation in turn; if the user inputs an English sentence, the terminal device plays its Chinese translation and Japanese translation in turn; and so on.
As can be seen from the above, in this embodiment, whether Chinese is translated into English or English into Chinese, the source-language voice input is triggered by the same translation button, which improves the ease of use of the terminal device and is convenient for the user.
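The language-selection behavior described above — translate into every configured language except the detected source, in the configured order — can be sketched as a small helper. The function name and signature are illustrative assumptions.

```python
def playback_order(configured_langs, detected_source):
    """Given the user-configured language set (e.g. Chinese/English/Japanese)
    and the language detected for the input voice, return the languages to
    translate into and play, in configured order, excluding the source."""
    if detected_source not in configured_langs:
        raise ValueError("source language is not in the configured set")
    return [lang for lang in configured_langs if lang != detected_source]
```

For the "Chinese, English and Japanese" example, a Chinese input yields English and Japanese playback (in some order), an English input yields Chinese and Japanese, and so on; the source language is always excluded from its own output.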
Fig. 5 is the flow chart of the voice translation method further embodiment that the application is based on artificial intelligence, in the present embodiment,
Above-mentioned user includes first user and second user, and above-mentioned target language includes the first languages and the second languages;As shown in figure 5,
The above-mentioned voice translation method based on artificial intelligence can include:
Step 501: receive the source-language voice input by the first user through the terminal device.
Step 502: send the source-language voice to the cloud server.
Step 503: receive the audio file of the second language sent by the cloud server. The audio file of the second language is obtained by the cloud server performing speech recognition and voiceprint recognition on the source-language voice, determining that the source-language voice is first-language voice input by the first user through the terminal device, determining that the first-language voice is to be translated into the second language, translating the text obtained by speech recognition into second-language text, and performing speech synthesis on the translated second-language text.
Step 504: play the audio file of the second language.
Step 505: receive the voice of another source language input by the second user through the terminal device.
Step 506: send the voice of the other source language to the cloud server.
Step 507: receive the audio file of the first language sent by the cloud server. The audio file of the first language is obtained by the cloud server performing speech recognition and voiceprint recognition on the voice of the other source language, determining that it is second-language voice input by the second user through the terminal device, determining that the second-language voice is to be translated into the first language, translating the text obtained by speech recognition into first-language text, and performing speech synthesis on the translated first-language text.
Step 508: play the audio file of the first language.
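The terminal-side round trip of steps 501-508 can be sketched as follows. This is an illustrative toy, not the disclosed implementation: `cloud_translate`, the dict-based voice representation, and the language tags are assumptions standing in for the cloud server's recognition, translation and synthesis services.

```python
# Sketch of the terminal-side flow of steps 501-508 (illustrative only).
# cloud_translate stands in for the cloud server described in the text:
# it recognizes which configured language was spoken, translates into the
# other target language, and returns a synthesized audio file.

def cloud_translate(voice, target_languages):
    """Toy stand-in: reads the language tag carried with the voice and
    returns audio in the other configured target language."""
    source = voice["lang"]
    other = [lang for lang in target_languages if lang != source][0]
    return {"lang": other, "audio": f"tts({voice['text']} -> {other})"}

def round_trip(voice, target_languages=("zh", "en")):
    # Steps 501-502 / 505-506: receive the voice and send it to the cloud.
    # Steps 503-504 / 507-508: receive and play the translated audio file.
    return cloud_translate(voice, target_languages)

# First user speaks Chinese (steps 501-504)...
reply1 = round_trip({"lang": "zh", "text": "nearest subway station?"})
# ...second user answers in English (steps 505-508).
reply2 = round_trip({"lang": "en", "text": "go straight, then left"})
print(reply1["lang"], reply2["lang"])  # en zh
```

Note how each round is symmetric: the same entry point serves both users, and the translation direction is decided entirely on the server side.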
As described above, this embodiment can realize multiple rounds of mutual voice translation. Still taking Chinese and English as the target languages as an example, in scenarios such as passing through entry/exit customs, ordering and paying in a restaurant, bargaining while shopping, and/or checking into or out of a hotel, the first user can long-press the translation button of the terminal device and input a section of Chinese speech to the terminal device through its microphone. According to the artificial-intelligence-based voice translation method provided by the present application, the terminal device then obtains the English-translation voice corresponding to the Chinese speech and plays it back. After listening to this section of English-translation voice, the second user can likewise long-press the translation button of the terminal device and input a section of English speech through the microphone; according to the artificial-intelligence-based voice translation method provided by the present application, the terminal device obtains the Chinese-translation voice corresponding to the English speech and plays it back. In this way, the first user and the second user can communicate smoothly through the terminal device, which fully meets the translation needs of scenarios such as overseas travel.
Further, in the artificial-intelligence-based voice translation method provided by the present application, the terminal device can also supply its wireless communication signal to another terminal device, so that the other terminal device can connect to the internet. Specifically, the wireless communication signal of the terminal device can be a Wireless Fidelity (hereinafter: WiFi) signal. That is, in this embodiment the terminal device also has a WiFi function: a user can search for and connect to the WiFi provided by the terminal device over the wireless network, meeting the internet-access needs of at least one electronic device such as a mobile phone and/or a computer, at a lower cost and with a more stable signal than overseas roaming on a mobile phone's cellular network.
The terminal device in the above embodiment combines a real-time voice translation function with a portable-WiFi function: the user can both enjoy the network freely and, when needed, invoke real-time voice translation of 26 languages with a single key. In scenarios such as business communication, multilingual study, entry/exit tourism and/or scenic-spot guidance, it can efficiently meet the user's internet-access and translation needs and improve the user experience.
Fig. 6 is a flow chart of a further embodiment of the artificial-intelligence-based voice translation method of the present application. As shown in Fig. 6, the artificial-intelligence-based voice translation method may include:
Step 601: receive the source-language voice sent by the terminal device.
Specifically, the source-language voice is in PCM format.
Step 602: perform speech recognition on the source-language voice and convert it into source-language text.
Step 603: determine that the source-language voice is to be translated into at least one target language other than the source language among the at least two target languages.
Step 604: translate the source-language text into text of the determined target language, and perform speech synthesis on the translated target-language text to obtain the audio file of the target language.
Specifically, speech synthesis can be performed on the translated target-language text through a TTS service to obtain the audio file of the target language.
Step 605: send the audio file of the target language to the terminal device for the terminal device to play.
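The cloud-side pipeline of steps 601-605 can be sketched as below. All names (`asr`, `mt`, `tts`) are illustrative toy stand-ins under the assumption that each stage behaves as the text describes; this is not the disclosed implementation.

```python
# Sketch of the cloud-side pipeline of steps 601-605 (illustrative only).
# asr, mt and tts are toy stand-ins for the speech-recognition,
# machine-translation and TTS services named in the text.

def asr(pcm_voice):
    # Step 602: convert the PCM voice into source-language text.
    return pcm_voice["text"], pcm_voice["lang"]

def mt(text, source, target):
    # Step 604 (first half): translate the source text into the target language.
    return f"[{source}->{target}] {text}"

def tts(text, lang):
    # Step 604 (second half): synthesize the translated text into audio.
    return {"lang": lang, "audio": f"pcm({text})"}

def handle_voice(pcm_voice, target_languages):
    text, source = asr(pcm_voice)                           # step 602
    targets = [t for t in target_languages if t != source]  # step 603
    # Steps 604-605: one audio file per target language, returned to the terminal.
    return [tts(mt(text, source, t), t) for t in targets]

files = handle_voice({"lang": "zh", "text": "hello"}, ("zh", "en"))
```

With two configured target languages this yields exactly one audio file, in the language the speaker did not use.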
In the above artificial-intelligence-based voice translation method, after the source-language voice sent by the terminal device is received, speech recognition is performed on it to convert it into source-language text; after it is determined that the source-language voice is to be translated into at least one target language other than the source language among the at least two target languages, the source-language text is translated into text of the determined target language and speech synthesis is performed on the translated text to obtain the target-language audio file, which is finally sent to the terminal device for playback. Real-time voice translation can thus be realized, meeting the translation needs of overseas-travel scenarios with high translation accuracy.
Fig. 7 is a flow chart of a further embodiment of the artificial-intelligence-based voice translation method of the present application. In this embodiment, the target languages may include a first language and a second language. As shown in Fig. 7, in the embodiment shown in Fig. 6 of the present application, step 603 may include:
Step 701: perform voiceprint recognition on the source-language voice, and determine that it is first-language voice input by the first user through the terminal device.
Step 702: according to the pre-saved target languages corresponding to the identifier of the terminal device, determine that the first-language voice is to be translated into an audio file of the second language.
That is, in this embodiment, after the cloud server first determines through voiceprint recognition that the source-language voice is first-language voice input by the first user, it finds, according to the identifier of the terminal device, that the target languages corresponding to that identifier include the first language and the second language; since the source language is the first language, the cloud server can determine that the first-language voice needs to be translated into an audio file of the second language.
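The direction-selection logic of steps 701-702 can be sketched as follows. The enrolled-voiceprint table, the matcher, and the device-identifier mapping are all assumptions for illustration; a real voiceprint recognizer would compare acoustic features, not look up a speaker id.

```python
# Sketch of steps 701-702 (illustrative): voiceprint recognition decides
# which user spoke, and the device's saved target-language pair then fixes
# the translation direction.

DEVICE_TARGETS = {"device-42": ("zh", "en")}  # saved per device identifier

def voiceprint_language(voice, enrolled):
    # Toy matcher: look the speaker up in an enrolled-voiceprint table.
    return enrolled[voice["speaker"]]

def pick_direction(voice, device_id, enrolled):
    first, second = DEVICE_TARGETS[device_id]
    spoken = voiceprint_language(voice, enrolled)  # step 701
    # Step 702: if the first language was spoken, translate into the second,
    # and vice versa.
    return (first, second) if spoken == first else (second, first)

enrolled = {"user-1": "zh", "user-2": "en"}
direction = pick_direction({"speaker": "user-1"}, "device-42", enrolled)
```

The same lookup, applied to the second user's voice, yields the reverse direction, which is what makes the multi-round mutual translation of Fig. 8 work.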
Fig. 8 is a flow chart of a further embodiment of the artificial-intelligence-based voice translation method of the present application. As shown in Fig. 8, the method may include:
Step 801: receive the source-language voice sent by the terminal device.
Step 802: perform speech recognition on the source-language voice and convert it into source-language text.
Step 803: perform voiceprint recognition on the source-language voice, and determine that it is first-language voice input by the first user through the terminal device.
Step 804: according to the pre-saved target languages corresponding to the identifier of the terminal device, determine that the first-language voice is to be translated into an audio file of the second language.
Step 805: translate the source-language text into second-language text, and perform speech synthesis on the translated second-language text to obtain the audio file of the second language.
Step 806: send the audio file of the second language to the terminal device for the terminal device to play.
Step 807: receive the voice of another source language sent by the terminal device.
Step 808: perform voiceprint recognition on the voice of the other source language, and determine that it is second-language voice input by the second user through the terminal device.
Step 809: perform speech recognition on the second-language voice and convert it into second-language text.
Step 810: according to the pre-saved target languages corresponding to the identifier of the terminal device, determine that the second-language voice is to be translated into an audio file of the first language.
Step 811: translate the second-language text into first-language text, and perform speech synthesis on the first-language text to obtain the audio file of the first language.
Step 812: send the audio file of the first language to the terminal device for the terminal device to play.
As described above, this embodiment can realize multiple rounds of mutual voice translation. Still taking Chinese and English as the target languages as an example, in scenarios such as passing through entry/exit customs, ordering and paying in a restaurant, bargaining while shopping, and/or checking into or out of a hotel, the first user can long-press the translation button of the terminal device and input a section of Chinese speech through its microphone. The terminal device then sends this Chinese speech to the cloud server, which, according to the artificial-intelligence-based voice translation method provided by the present application, translates it into an English audio file and sends the translated English audio file to the terminal device, where it is played back. After listening to this section of English voice, the second user can likewise long-press the translation button and input a section of English speech through the microphone; the terminal device sends it to the cloud server, which translates the English voice into a Chinese audio file and sends the translated Chinese audio file to the terminal device, where it is played back. In this way, the first user and the second user can communicate smoothly through the terminal device, which fully meets the translation needs of scenarios such as overseas travel.
Fig. 9 is a flow chart of a further embodiment of the artificial-intelligence-based voice translation method of the present application. As shown in Fig. 9, in the embodiment shown in Fig. 6 of the present application, before step 604 the method may also include:
Step 901: receive the target languages uploaded by the terminal device, and save the identifier of the terminal device in correspondence with the target languages. The target languages include at least two languages, and the at least two languages include the source language.
In this case, step 604 can be:
Step 902: according to the identifier of the terminal device, call the corpus of the target languages corresponding to that identifier, translate the source-language text into text of the determined target language, and perform speech synthesis on the translated target-language text to obtain the audio file of the target language.
In this embodiment, after the terminal device obtains the target languages set by the user, it can upload them to the cloud server, and the cloud server saves the identifier of the terminal device in correspondence with the target languages. The identifier of the terminal device is information that can uniquely identify it, such as its device number; this embodiment does not limit the form of the identifier of the terminal device.
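The correspondence kept in step 901 can be sketched as a small store keyed by the device identifier. The class and method names are illustrative assumptions, not part of the disclosure.

```python
# Sketch of step 901 (illustrative): the cloud server keeps a mapping from
# each terminal device's unique identifier (e.g. its device number) to the
# target languages the user configured on that device.

class TargetLanguageStore:
    def __init__(self):
        self._by_device = {}

    def save(self, device_id, target_languages):
        # Called when the terminal uploads the user's target-language setting.
        self._by_device[device_id] = tuple(target_languages)

    def lookup(self, device_id):
        # Called in step 902 before choosing the translation corpus.
        return self._by_device[device_id]

store = TargetLanguageStore()
store.save("SN-0001", ["zh", "en"])
```

At translation time the server only needs the device identifier carried with the request to recover the configured language pair.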
The target languages can include at least two languages, and the at least two languages include the source language; that is, this embodiment can realize mutual voice translation among the at least two target languages set by the user. As an example, assume the target languages set by the user are "Chinese and English". If, while holding down the single translation button of the terminal device, the user says to it in Chinese "I want to go to the nearest subway station" and then releases the button, then through the artificial-intelligence-based voice translation method provided by the present application the terminal device will play the voice result "I want to go to the nearest subway station" in English. Conversely, if, while holding down the single translation button, the user says in English "I want to go to the nearest subway station" and then releases the button, through the artificial-intelligence-based voice translation method provided by the present application the terminal device will play the corresponding Chinese voice result.
Similarly, the target languages can also be "Chinese, English and Japanese", in which case this embodiment realizes mutual voice translation among Chinese, English and Japanese. If the user inputs a Chinese sentence, then through the artificial-intelligence-based voice translation method provided by the present application the terminal device will play the Japanese translation and the English translation of that sentence in turn; if the user inputs an English sentence, the terminal device will play its Chinese translation and Japanese translation in turn, and so on, which will not be repeated here.
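The three-language case generalizes the two-language one: the source text is translated into every configured target language except the one that was spoken, and the results are played in order. The sketch below is illustrative; `translate` is a toy stand-in for the cloud translation service.

```python
# Sketch of the multi-language fan-out (illustrative): translate the spoken
# sentence into each configured target language other than the source, in
# the configured order, ready to be played one after another.

def translate(text, source, target):
    # Toy stand-in for the cloud machine-translation service.
    return f"{target}:{text}"

def fan_out(text, source, target_languages=("zh", "en", "ja")):
    # One translation per non-source language, preserving configured order.
    return [translate(text, source, t)
            for t in target_languages if t != source]

outputs = fan_out("hello", source="en")  # played as Chinese, then Japanese
```

With only two configured languages the same function degenerates to the single-translation behavior of the earlier embodiments.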
From the above description it can be seen that in this embodiment, whether Chinese needs to be translated into English or English into Chinese, the input of the source-language voice is triggered by the same translation button. In the present application this improves the ease of use of the terminal device and is convenient for users.
Figure 10 is a structural schematic diagram of an embodiment of the artificial-intelligence-based speech translation apparatus of the present application. The artificial-intelligence-based speech translation apparatus in this embodiment can be provided on a terminal device to realize the methods provided by the embodiments shown in Figs. 1-5 of the present application. The terminal device can be a translation device with an integrated WiFi function; this embodiment does not limit the form of the terminal device.
As shown in Figure 10, the artificial-intelligence-based speech translation apparatus can include: a receiver module 1001, a sending module 1002 and a playing module 1003.
The receiver module 1001 is used to receive the source-language voice input by the user through the terminal device.
The sending module 1002 is used to send the source-language voice to the cloud server; specifically, the sending module 1002 can upload the source-language voice to the cloud server in PCM format.
The receiver module 1001 is also used to receive the audio file of the target language sent by the cloud server. The audio file of the target language is obtained by the cloud server performing speech recognition on the source-language voice, determining that it is to be translated into at least one target language other than the source language among the at least two target languages, translating the text obtained by speech recognition into text of the determined target language, and performing speech synthesis on the translated target-language text. Specifically, the target-language audio file sent by the cloud server to the terminal device is also a PCM-format file, and the cloud server performs the speech synthesis on the translated target-language text using a TTS service.
The playing module 1003 is used to play the audio file of the target language.
In the above artificial-intelligence-based speech translation apparatus, after the receiver module 1001 receives the source-language voice input by the user through the terminal device, the sending module 1002 sends it to the cloud server; the receiver module 1001 then receives the audio file of the target language sent by the cloud server, and finally the playing module 1003 plays it. Real-time voice translation can thus be realized, meeting the translation needs of overseas-travel scenarios with high translation accuracy.
Figure 11 is a structural schematic diagram of another embodiment of the artificial-intelligence-based speech translation apparatus of the present application. In this embodiment, the receiver module 1001 is specifically used to receive, after the user triggers the translation button of the terminal device, the source-language voice input through the microphone of the terminal device. In this embodiment, the translation button of the terminal device can be a mechanical key provided on the terminal device, or a virtual key provided on the terminal device; this embodiment does not limit the form of the translation button, but takes a mechanical key provided on the terminal device as an example for illustration.
The translation button can be triggered by a long press, a single click, a double click, and so on; this embodiment does not limit the manner of triggering the translation button, but takes a long press as an example for illustration.
It should be noted that in this embodiment the terminal device has a single translation button, as shown in Fig. 3. That is, in its hardware design, the terminal device of this embodiment can trigger speech recognition plus translation with one mechanical key. In use, the user only needs to hold down the translation button, speak the voice to be translated into the microphone, for example "I want to go to the nearest subway station" in Chinese, and then release the button; the terminal device will play the voice result "I want to go to the nearest subway station" in English, thereby realizing real-time "one key" translation of voice.
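The press-speak-release interaction above can be sketched as a small state machine. The event names, the frame-based recorder, and the submit callback are all assumptions for illustration, not the disclosed hardware design.

```python
# Sketch of the "one key" interaction (illustrative): recording starts when
# the single translation button is pressed, microphone frames are buffered
# while it is held, and the captured voice is submitted for recognition and
# translation when it is released.

class OneKeyTranslator:
    def __init__(self, submit):
        self._submit = submit  # callback that sends the voice to the cloud
        self._buffer = None    # None means "not recording"

    def on_press(self):
        self._buffer = []      # start capturing microphone frames

    def on_frame(self, frame):
        if self._buffer is not None:
            self._buffer.append(frame)

    def on_release(self):
        voice, self._buffer = self._buffer, None
        return self._submit(voice)  # one key: recognition + translation

captured = []
t = OneKeyTranslator(submit=lambda v: captured.append(v) or len(v))
t.on_press()
t.on_frame("frame-1")
t.on_frame("frame-2")
n = t.on_release()
```

Frames arriving while the button is up are simply discarded, which matches the described push-to-talk behavior.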
Further, the artificial-intelligence-based speech translation apparatus can also include: an obtaining module 1004.
The obtaining module 1004 is used to obtain the target languages set by the user before the receiver module 1001 receives the audio file of the target language sent by the cloud server.
The sending module 1002 is also used to upload the target languages set by the user to the cloud server, so that the cloud server saves the identifier of the terminal device in correspondence with the target languages; the target languages include at least two languages, and the at least two languages include the source language.
Specifically, after the user sets the target languages on the terminal device, the obtaining module 1004 obtains them, and the sending module 1002 then uploads them to the cloud server, which saves the identifier of the terminal device in correspondence with the target languages. The identifier of the terminal device is information that can uniquely identify it, such as its device number; this embodiment does not limit the form of the identifier of the terminal device.
The target languages can include at least two languages, and the at least two languages include the source language; that is, this embodiment can realize mutual voice translation among the at least two target languages set by the user. As an example, assume the target languages set by the user are "Chinese and English". If, while holding down the single translation button of the terminal device, the user says to it in Chinese "I want to go to the nearest subway station" and then releases the button, the playing module 1003 will play the voice result "I want to go to the nearest subway station" in English; and if, while holding down the single translation button, the user says in English "I want to go to the nearest subway station" and then releases the button, the playing module 1003 will play the corresponding Chinese voice result.
Similarly, the target languages can also be "Chinese, English and Japanese", in which case the terminal device realizes mutual voice translation among Chinese, English and Japanese. If the user inputs a Chinese sentence, the playing module 1003 will play its Japanese translation and English translation in turn; if the user inputs an English sentence, the playing module 1003 will play its Chinese translation and Japanese translation in turn, and so on, which will not be repeated here.
From the above description it can be seen that in this embodiment, whether Chinese needs to be translated into English or English into Chinese, the input of the source-language voice is triggered by the same translation button. In the present application this improves the ease of use of the terminal device and is convenient for users.
In this embodiment, the user includes a first user and a second user, and the target languages include a first language and a second language.
The audio file of the target language includes the audio file of the second language. The audio file of the second language is obtained by the cloud server performing speech recognition and voiceprint recognition on the source-language voice, determining that it is first-language voice input by the first user through the terminal device, determining that the first-language voice is to be translated into the second language, translating the text obtained by speech recognition into second-language text, and performing speech synthesis on the translated second-language text.
The receiver module 1001 is also used to receive, after the playing module 1003 plays the audio file of the target language, the voice of another source language input by the second user through the terminal device.
The sending module 1002 is also used to send the voice of the other source language to the cloud server.
The receiver module 1001 is also used to receive the audio file of the first language sent by the cloud server. The audio file of the first language is obtained by the cloud server performing speech recognition and voiceprint recognition on the voice of the other source language, determining that it is second-language voice input by the second user through the terminal device, determining that the second-language voice is to be translated into the first language, translating the text obtained by speech recognition into first-language text, and performing speech synthesis on the translated first-language text.
The playing module 1003 is also used to play the audio file of the first language.
As described above, the artificial-intelligence-based speech translation apparatus of this embodiment can realize multiple rounds of mutual voice translation. Still taking Chinese and English as the target languages as an example, in scenarios such as passing through entry/exit customs, ordering and paying in a restaurant, bargaining while shopping, and/or checking into or out of a hotel, the first user can long-press the translation button of the terminal device and input a section of Chinese speech through its microphone; the artificial-intelligence-based speech translation apparatus then obtains the English-translation voice corresponding to the Chinese speech and plays it back. After listening to this section of English-translation voice, the second user can likewise long-press the translation button and input a section of English speech through the microphone; the apparatus obtains the Chinese-translation voice corresponding to the English speech and plays it back. In this way, the first user and the second user can communicate smoothly through the artificial-intelligence-based speech translation apparatus, which fully meets the translation needs of scenarios such as overseas travel.
Further, the artificial-intelligence-based speech translation apparatus can also include:
a wireless signal providing module 1005, used to supply the wireless communication signal of the terminal device to another terminal device, so that the other terminal device can connect to the internet.
Specifically, the wireless communication signal of the terminal device can be a WiFi signal. That is, in this embodiment the artificial-intelligence-based speech translation apparatus also has a WiFi function: a user can search for and connect to the WiFi it provides over the wireless network, meeting the internet-access needs of at least one electronic device such as a mobile phone and/or a computer, at a lower cost and with a more stable signal than overseas roaming on a mobile phone's cellular network.
The artificial-intelligence-based speech translation apparatus in the above embodiment combines a real-time voice translation function with a portable-WiFi function: the user can both enjoy the network freely and, when needed, invoke real-time voice translation of 26 languages with a single key. In scenarios such as business communication, multilingual study, entry/exit tourism and/or scenic-spot guidance, it can efficiently meet the user's internet-access and translation needs and improve the user experience.
Figure 12 is a structural schematic diagram of a further embodiment of the artificial-intelligence-based speech translation apparatus of the present application. The artificial-intelligence-based speech translation apparatus in this embodiment can serve as the cloud server, or as part of the cloud server, to realize the artificial-intelligence-based voice translation methods shown in the embodiments of Figs. 6-9 of the present application.
As shown in Figure 12, the artificial-intelligence-based speech translation apparatus can include: a receiver module 1201, a speech recognition module 1202, a determining module 1203, a translation module 1204, a speech synthesis module 1205 and a sending module 1206.
The receiver module 1201 is used to receive the source-language voice sent by the terminal device; specifically, the source-language voice is in PCM format.
The speech recognition module 1202 is used to perform speech recognition on the source-language voice received by the receiver module 1201 and convert it into source-language text.
The determining module 1203 is used to determine that the source-language voice is to be translated into at least one target language other than the source language among the at least two target languages.
The translation module 1204 is used to translate the source-language text into text of the target language determined by the determining module 1203.
The speech synthesis module 1205 is used to perform speech synthesis on the target-language text translated by the translation module 1204 to obtain the audio file of the target language; specifically, the speech synthesis module 1205 can perform the speech synthesis on the translated target-language text through a TTS service.
The sending module 1206 is used to send the audio file of the target language obtained by the speech synthesis module 1205 to the terminal device for the terminal device to play.
In the above artificial-intelligence-based speech translation apparatus, after the receiver module 1201 receives the source-language voice sent by the terminal device, the speech recognition module 1202 performs speech recognition on it and converts it into source-language text; after the determining module 1203 determines that the source-language voice is to be translated into at least one target language other than the source language among the at least two target languages, the translation module 1204 translates the source-language text into text of the determined target language, the speech synthesis module 1205 performs speech synthesis on the translated target-language text to obtain the target-language audio file, and finally the sending module 1206 sends it to the terminal device for playback. Real-time voice translation can thus be realized, meeting the translation needs of overseas-travel scenarios with high translation accuracy.
Figure 13 is a schematic structural diagram of a further embodiment of the speech translation apparatus based on artificial intelligence of the present application. In this embodiment, the target languages include a first language and a second language.
The determining module 1203 is specifically configured to perform voiceprint recognition on the speech in the source language, determine that the speech in the source language is speech in the first language input by a first user through the terminal device, and, according to the pre-saved target languages corresponding to the identifier of the terminal device, determine that the speech in the first language is to be translated into an audio file in the second language.
That is, in this embodiment, after the determining module 1203 determines through voiceprint recognition that the speech in the source language is speech in the first language input by the first user, it looks up, according to the identifier of the terminal device, the target languages corresponding to that identifier, which include the first language and the second language. Since the source language is the first language, the determining module 1203 can determine that the speech in the first language needs to be translated into an audio file in the second language.
In this embodiment, the audio file in the target language includes an audio file in the second language.
The receiving module 1201 is further configured to receive speech in another source language sent by the terminal device after the sending module 1206 has sent the audio file in the target language to the terminal device for playback.
The determining module 1203 is further configured to perform voiceprint recognition on the speech in the other source language and determine that it is speech in the second language input by a second user through the terminal device.
The speech recognition module 1202 is further configured to perform speech recognition on the speech in the second language and convert it into text in the second language.
The determining module 1203 is further configured to determine, according to the pre-saved target languages corresponding to the identifier of the terminal device, that the speech in the second language is to be translated into an audio file in the first language.
The translation module 1204 is further configured to translate the text in the second language into text in the first language.
The speech synthesis module 1205 is further configured to perform speech synthesis on the text in the first language to obtain an audio file in the first language.
The sending module 1206 is further configured to send the audio file in the first language to the terminal device for the terminal device to play.
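The direction-selection role of the determining module 1203 in this two-user embodiment can be sketched as follows. The identify_speaker function is a hypothetical placeholder for a real voiceprint-recognition model, and the language codes are assumptions for illustration.

```python
def identify_speaker(pcm_audio: bytes) -> str:
    """Stand-in for voiceprint recognition; returns 'first' or 'second'."""
    return "first" if pcm_audio[:1] == b"\x01" else "second"

def pick_direction(pcm_audio: bytes, saved_langs: tuple[str, str]) -> tuple[str, str]:
    """Return (source_lang, target_lang) for this utterance.

    saved_langs is the (first_language, second_language) pair stored against
    the terminal device's identifier, e.g. ('zh', 'en').
    """
    first_lang, second_lang = saved_langs
    if identify_speaker(pcm_audio) == "first":
        return first_lang, second_lang   # first user spoke: translate 1st -> 2nd
    return second_lang, first_lang       # second user spoke: translate 2nd -> 1st
```

Each round of the conversation thus reverses the translation direction automatically, without either user having to select a language on the device.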
As described above, this embodiment can realize multiple rounds of mutual speech translation. Still taking Chinese and English as the target languages, in scenarios such as passing through customs, paying a restaurant bill, bargaining while shopping, or checking into or out of a hotel, the first user can press and hold the translation button of the terminal device and input a segment of Chinese speech through the terminal device's microphone. The terminal device then sends this Chinese speech to the speech translation apparatus based on artificial intelligence, which, according to the speech translation method based on artificial intelligence provided by the present application, translates the Chinese speech into an English audio file, sends the translated English audio file back to the terminal device, and has it played by the terminal device. After hearing this English speech, the second user can likewise press and hold the translation button of the terminal device and input a segment of English speech through the microphone; the terminal device sends this English speech to the speech translation apparatus based on artificial intelligence, which translates it into a Chinese audio file, sends the translated Chinese audio file back to the terminal device, and has it played by the terminal device. In this way, the first user and the second user can communicate smoothly through the terminal device, fully meeting the translation needs of scenarios such as overseas travel.
Further, the above speech translation apparatus based on artificial intelligence may also include a preserving module 1207.
The receiving module 1201 is further configured to receive the target languages uploaded by the terminal device before the translation module 1204 translates the text in the source language into text in the determined target language.
The preserving module 1207 is configured to save, in correspondence, the identifier of the terminal device and the target languages, where the target languages include at least two languages and the at least two languages include the source language.
In this embodiment, the translation module 1204 is specifically configured to call, according to the identifier of the terminal device, the corpus of the target languages corresponding to that identifier, and translate the text in the source language into text in the determined target language.
In this embodiment, after the terminal device obtains the target languages set by the user, it can upload them to the speech translation apparatus based on artificial intelligence, and the preserving module 1207 saves the identifier of the terminal device in correspondence with the target languages. The identifier of the terminal device may be any information that uniquely identifies the terminal device, for example its device number; this embodiment does not limit the form of the identifier of the terminal device.
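The correspondence that the preserving module 1207 maintains between a device identifier and its target languages can be sketched as a small in-memory store; the class and method names below are illustrative, not taken from the specification.

```python
class TargetLanguageStore:
    """Toy sketch of the preserving module 1207's device-to-languages mapping."""

    def __init__(self) -> None:
        self._by_device: dict[str, list[str]] = {}

    def save(self, device_id: str, target_langs: list[str]) -> None:
        # The specification requires at least two target languages per device.
        if len(target_langs) < 2:
            raise ValueError("at least two target languages are required")
        self._by_device[device_id] = list(target_langs)

    def lookup(self, device_id: str) -> list[str]:
        # Used by the translation module to pick the corpora for this device.
        return self._by_device[device_id]
```

A real implementation would persist this mapping server-side and key it by whatever unique identifier the device reports, such as its device number.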
The target languages may include at least two languages, and the at least two languages include the source language; that is, this embodiment can realize mutual speech translation among the at least two target languages set by the user. As an example, suppose the target languages set by the user are Chinese and English. After pressing and holding the unique translation button of the terminal device, the user says the Chinese sentence "我想去最近的地铁站" ("I want to go to the nearest subway station") to the terminal device and then releases the translation button; the terminal device will play the speech result "I want to go to the nearest subway station". Conversely, if the user presses and holds the unique translation button of the terminal device, says the English sentence "I want to go to the nearest subway station" to the terminal device, and then releases the translation button, the terminal device will play the speech result "我想去最近的地铁站".
Similarly, the target languages may also be Chinese, English, and Japanese, in which case this embodiment realizes mutual speech translation among Chinese, English, and Japanese. If the user inputs a Chinese sentence, the speech translation apparatus based on artificial intelligence provided by the present application translates the sentence into Japanese and English in turn, and the terminal device plays the Japanese translation and the English translation of the sentence in sequence; if the user inputs an English sentence, the apparatus translates it into Chinese and Japanese in turn, and the terminal device plays the Chinese translation and the Japanese translation in sequence, and so on. Details are not repeated here.
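The three-language behavior described above (translating an utterance into every configured target language except its own, in a fixed order) can be sketched as follows. The PHRASES lookup is a toy stand-in for the per-language corpora that the translation module 1204 would actually call.

```python
PHRASES = {  # toy lookup standing in for real translation corpora
    ("zh", "en"): "I want to go to the nearest subway station",
    ("zh", "ja"): "一番近い地下鉄の駅に行きたいです",
}

def translate(text: str, src: str, dst: str) -> str:
    """Placeholder translator; falls back to a tagged echo when no phrase is known."""
    return PHRASES.get((src, dst), f"[{dst}] {text}")

def translate_round(text: str, src: str, targets: list[str]) -> list[tuple[str, str]]:
    """Translate into every target language except the source, in list order."""
    return [(dst, translate(text, src, dst)) for dst in targets if dst != src]
```

With targets ["zh", "en", "ja"], a Chinese input yields the English and Japanese renderings in that order, matching the sequential playback described in the embodiment.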
Figure 14 is a schematic structural diagram of an embodiment of the terminal device of the present application. The terminal device in this embodiment can implement the methods provided by the embodiments shown in Figures 1 to 5 of the present application. The terminal device may include: one or more processors; a memory for storing one or more programs; a receiver for receiving the speech in the source language input by a user through the terminal device; and a transmitter for sending the speech in the source language to the cloud server. After the transmitter sends the speech in the source language to the cloud server, the receiver receives the audio file in the target language sent by the cloud server, where the audio file in the target language is obtained by the cloud server performing speech recognition on the speech in the source language, determining that the speech in the source language is to be translated into at least one of at least two target languages other than the source language, translating the text obtained by speech recognition into text in the determined target language, and performing speech synthesis on the translated target-language text. When the one or more programs are executed by the one or more processors, the one or more processors implement the methods provided by the embodiments shown in Figures 1 to 5 of the present application.
Figure 14 shows a block diagram of an exemplary terminal device 12 suitable for implementing the embodiments of the present application. The terminal device 12 shown in Figure 14 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present application.
As shown in Figure 14, the terminal device 12 takes the form of a general-purpose computing device. The components of the terminal device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 connecting different system components (including the system memory 28 and the processing unit 16).
The bus 18 represents one or more of several classes of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus structures. By way of example, these architectures include, but are not limited to, the Industry Standard Architecture (hereinafter, ISA) bus, the Micro Channel Architecture (hereinafter, MCA) bus, the Enhanced ISA bus, the Video Electronics Standards Association (hereinafter, VESA) local bus, and the Peripheral Component Interconnect (hereinafter, PCI) bus.
The terminal device 12 typically includes a variety of computer-system-readable media. These media can be any available media accessible by the terminal device 12, including volatile and non-volatile media, and removable and non-removable media.
The system memory 28 may include computer-system-readable media in the form of volatile memory, such as random access memory (hereinafter, RAM) 30 and/or cache memory 32. The terminal device 12 may further include other removable/non-removable, volatile/non-volatile computer-system storage media. By way of example only, the storage system 34 may be used to read from and write to non-removable, non-volatile magnetic media (not shown in Figure 14, commonly referred to as a "hard disk drive"). Although not shown in Figure 14, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (such as a "floppy disk") may be provided, as well as an optical disc drive for reading from and writing to a removable non-volatile optical disc (for example, a Compact Disc Read-Only Memory (hereinafter, CD-ROM), a Digital Versatile Disc Read-Only Memory (hereinafter, DVD-ROM), or other optical media). In these cases, each drive may be connected to the bus 18 through one or more data media interfaces. The memory 28 may include at least one program product having a set of (for example, at least one) program modules configured to perform the functions of the embodiments of the present application.
A program/utility 40 having a set of (at least one) program modules 42 may be stored, for example, in the memory 28. Such program modules 42 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination thereof, may include an implementation of a network environment. The program modules 42 generally perform the functions and/or methods of the embodiments described in the present application.
The terminal device 12 may also communicate with one or more external devices 14 (such as a keyboard, a pointing device, a display 24, and the like), with one or more devices that enable a user to interact with the terminal device 12, and/or with any device (such as a network card or a modem) that enables the terminal device 12 to communicate with one or more other computing devices. Such communication may take place through input/output (I/O) interfaces 22. Moreover, the terminal device 12 may also communicate through a network adapter 20 with one or more networks, such as a local area network (hereinafter, LAN), a wide area network (hereinafter, WAN), and/or a public network such as the Internet. As shown in Figure 14, the network adapter 20 communicates with the other modules of the terminal device 12 through the bus 18. It should be understood that, although not shown in Figure 14, other hardware and/or software modules may be used in conjunction with the terminal device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.
The processing unit 16 executes various functional applications and data processing by running programs stored in the system memory 28, for example, implementing the speech translation method based on artificial intelligence provided by the embodiments shown in Figures 1 to 5 of the present application.
The present application also provides a storage medium containing computer-executable instructions, which, when executed by a computer processor, are used to perform the speech translation method based on artificial intelligence provided by the embodiments shown in Figures 1 to 5 of the present application.
The above storage medium containing computer-executable instructions may employ any combination of one or more computer-readable media. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (hereinafter, ROM), an erasable programmable read-only memory (hereinafter, EPROM) or flash memory, an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take various forms, including, but not limited to, an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code contained on a computer-readable medium may be transmitted using any appropriate medium, including, but not limited to, wireless, wired, optical cable, RF, and the like, or any suitable combination of the above.
Computer program code for carrying out the operations of the present application may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In cases involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
Figure 15 is a schematic structural diagram of an embodiment of the cloud server of the present application. The cloud server in this embodiment can implement the flows of the embodiments shown in Figures 6 to 9 of the present application. The cloud server may include: one or more processors; a memory for storing one or more programs; a receiver for receiving the speech in the source language sent by the terminal device; and a transmitter for sending the audio file in the target language to the terminal device for the terminal device to play. When the one or more programs are executed by the one or more processors, the one or more processors implement the speech translation method based on artificial intelligence provided by the embodiments shown in Figures 6 to 9 of the present application.
Figure 15 shows a block diagram of an exemplary cloud server 10 suitable for implementing the embodiments of the present application. The cloud server 10 shown in Figure 15 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present application.
As shown in Figure 15, the cloud server 10 takes the form of a general-purpose computing device. The components of the cloud server 10 may include, but are not limited to: one or more processors or processing units 160, a system memory 280, and a bus 180 connecting different system components (including the system memory 280 and the processing unit 160).
The bus 180 represents one or more of several classes of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus structures. By way of example, these architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
The cloud server 10 typically includes a variety of computer-system-readable media. These media can be any available media accessible by the cloud server 10, including volatile and non-volatile media, and removable and non-removable media.
The system memory 280 may include computer-system-readable media in the form of volatile memory, such as random access memory (RAM) 300 and/or cache memory 320. The cloud server 10 may further include other removable/non-removable, volatile/non-volatile computer-system storage media. By way of example only, the storage system 340 may be used to read from and write to non-removable, non-volatile magnetic media (not shown in Figure 15, commonly referred to as a "hard disk drive"). Although not shown in Figure 15, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (such as a "floppy disk") may be provided, as well as an optical disc drive for reading from and writing to a removable non-volatile optical disc (for example, a Compact Disc Read-Only Memory (CD-ROM), a Digital Versatile Disc Read-Only Memory (DVD-ROM), or other optical media). In these cases, each drive may be connected to the bus 180 through one or more data media interfaces. The memory 280 may include at least one program product having a set of (for example, at least one) program modules configured to perform the functions of the embodiments of the present application.
A program/utility 400 having a set of (at least one) program modules 420 may be stored, for example, in the memory 280. Such program modules 420 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination thereof, may include an implementation of a network environment. The program modules 420 generally perform the functions and/or methods of the embodiments described in the present application.
The cloud server 10 may also communicate with one or more external devices 140 (such as a keyboard, a pointing device, a display 240, and the like), with one or more devices that enable a user to interact with the cloud server 10, and/or with any device (such as a network card or a modem) that enables the cloud server 10 to communicate with one or more other computing devices. Such communication may take place through input/output (I/O) interfaces 220. Moreover, the cloud server 10 may also communicate through a network adapter 200 with one or more networks, such as a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet. As shown in Figure 15, the network adapter 200 communicates with the other modules of the cloud server 10 through the bus 180. It should be understood that, although not shown in Figure 15, other hardware and/or software modules may be used in conjunction with the cloud server 10, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.
The processing unit 160 executes various functional applications and data processing by running programs stored in the system memory 280, for example, implementing the speech translation method based on artificial intelligence provided by the embodiments shown in Figures 6 to 9 of the present application.
The present application also provides a storage medium containing computer-executable instructions, which, when executed by a computer processor, are used to perform the speech translation method based on artificial intelligence provided by the embodiments shown in Figures 6 to 9 of the present application.
The above storage medium containing computer-executable instructions may employ any combination of one or more computer-readable media. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM) or flash memory, an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take various forms, including, but not limited to, an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code contained on a computer-readable medium may be transmitted using any appropriate medium, including, but not limited to, wireless, wired, optical cable, RF, and the like, or any suitable combination of the above.
Computer program code for carrying out the operations of the present application may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In cases involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
It should be noted that in the description of the present application, term " first ", " second " etc. are only used for describing purpose, without
It is understood that to indicate or implying relative importance.Additionally, in the description of the present application, unless otherwise indicated, the implication of " multiple "
It is two or more.
Any process described otherwise above or method description in flow chart or herein is construed as, and expression includes
It is one or more for realizing specific logical function or process the step of the module of code of executable instruction, fragment or portion
Point, and the scope of the preferred embodiment of the application includes other realization, wherein can not press shown or discussion suitable
Sequence, including function involved by basis by it is basic simultaneously in the way of or in the opposite order, carry out perform function, this should be by the application
Embodiment person of ordinary skill in the field understood.
It should be appreciated that each several part of the application can be realized with hardware, software, firmware or combinations thereof.Above-mentioned
In implementation method, the software that multiple steps or method can in memory and by suitable instruction execution system be performed with storage
Or firmware is realized.If for example, realized with hardware, and in another embodiment, can be with well known in the art
Any one of row technology or their combination are realized:With the logic gates for realizing logic function to data-signal
Discrete logic, the application specific integrated circuit with suitable combinational logic gate circuit, programmable gate array
(Programmable Gate Array;Hereinafter referred to as:PGA), field programmable gate array (Field Programmable
Gate Array;Hereinafter referred to as:FPGA) etc..
Those skilled in the art are appreciated that to realize all or part of step that above-described embodiment method is carried
The rapid hardware that can be by program to instruct correlation is completed, and described program can be stored in a kind of computer-readable storage medium
In matter, the program upon execution, including one or a combination set of the step of embodiment of the method.
Additionally, each functional module in the application each embodiment can be integrated in a processing module, or
Modules are individually physically present, it is also possible to which two or more modules are integrated in a module.Above-mentioned integrated module
Both can be realized in the form of hardware, it would however also be possible to employ the form of software function module is realized.If the integrated module
Realized in the form of using software function module and as independent production marketing or when using, it is also possible to which storage can in a computer
In reading storage medium.
Storage medium mentioned above can be read-only storage, disk or CD etc..
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "an example", "a specific example", "some examples", and the like means that a specific feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic references to these terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples.
Although embodiments of the present application have been shown and described above, it should be understood that the above embodiments are exemplary and are not to be construed as limiting the present application; those of ordinary skill in the art may change, modify, replace, and vary the above embodiments within the scope of the present application.
Claims (24)
1. A voice translation method based on artificial intelligence, comprising:
receiving speech in a source language input by a user through a terminal device;
sending the speech in the source language to a cloud server;
receiving an audio file in a target language sent by the cloud server, wherein the audio file in the target language is obtained by the cloud server performing speech recognition on the speech in the source language, determining that the speech in the source language is to be translated into at least one target language, other than the source language, among at least two target languages, translating the text obtained by speech recognition into text in the determined target language, and performing speech synthesis on the translated text in the target language; and
playing the audio file in the target language.
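The claim-1 flow on the terminal side is a simple round trip: capture speech, upload it, receive the translated audio, play it. A minimal Python sketch of that loop, with the cloud call stubbed out (the function names `cloud_translate` and `translate_and_play` are illustrative, not from the patent):

```python
# Hypothetical client-side round trip from claim 1. `cloud_translate` stands in
# for the real cloud server, which would run speech recognition, translation,
# and speech synthesis before returning a target-language audio file.

def cloud_translate(source_audio: bytes) -> bytes:
    """Stub cloud server: pretend the returned bytes are synthesized speech."""
    return b"target-audio:" + source_audio

def translate_and_play(source_audio: bytes, play) -> bytes:
    target_audio = cloud_translate(source_audio)  # send speech, receive audio file
    play(target_audio)                            # play the target-language audio
    return target_audio

played = []
result = translate_and_play(b"ni hao", played.append)
```

A real implementation would replace the stub with an HTTP or socket upload and hand the returned file to the device's audio player.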
2. The method according to claim 1, wherein receiving the speech in the source language input by the user through the terminal device comprises:
receiving the speech in the source language input by the user through a microphone of the terminal device after the user triggers a translation button of the terminal device.
3. The method according to claim 1, further comprising, before receiving the audio file in the target language sent by the cloud server:
obtaining the target languages set by the user, and uploading the target languages set by the user to the cloud server, so that the cloud server stores an identifier of the terminal device in correspondence with the target languages, wherein the target languages comprise at least two languages, the at least two languages including the source language.
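The setup step in claim 3 amounts to the server keeping a mapping from terminal identifier to the user's configured target-language set. A toy sketch of that registry, assuming invented names throughout (the validation mirrors the claim's "at least two languages" requirement):

```python
# Device-id -> target-language-set registry, as described in claim 3.
registry: dict = {}

def upload_target_languages(device_id: str, languages: set) -> None:
    if len(languages) < 2:
        # Claim 3 requires at least two languages, one of which is the
        # source language, so a single-language upload is rejected here.
        raise ValueError("at least two languages required")
    registry[device_id] = set(languages)

upload_target_languages("device-42", {"zh", "en"})
```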
4. The method according to claim 1, wherein the user comprises a first user and a second user, and the target languages comprise a first language and a second language;
the audio file in the target language comprises an audio file in the second language, obtained by the cloud server performing speech recognition and voiceprint recognition on the speech in the source language, determining that the speech in the source language is speech in the first language input by the first user through the terminal device, determining that the speech in the first language is to be translated into the second language, translating the text obtained by speech recognition into text in the second language, and performing speech synthesis on the translated text in the second language; and
after playing the audio file in the target language, the method further comprises:
receiving speech in another source language input by the second user through the terminal device;
sending the speech in the other source language to the cloud server;
receiving an audio file in the first language sent by the cloud server, obtained by the cloud server performing speech recognition and voiceprint recognition on the speech in the other source language, determining that the speech in the other source language is speech in the second language input by the second user through the terminal device, determining that the speech in the second language is to be translated into the first language, translating the text obtained by speech recognition into text in the first language, and performing speech synthesis on the translated text in the first language; and
playing the audio file in the first language.
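The two-user exchange in claim 4 hinges on voiceprint recognition: identifying the speaker determines the source language, and therefore the direction of translation. A minimal sketch with voiceprints faked as speaker labels (the enrollment data and names are invented for illustration):

```python
# Speaker -> language mapping that voiceprint recognition would produce.
SPEAKER_LANGUAGE = {"first_user": "zh", "second_user": "en"}
# The two target languages of claim 4; translation always flips direction.
OTHER = {"zh": "en", "en": "zh"}

def route(voiceprint: str):
    """Return (source_language, target_language) for one utterance."""
    source = SPEAKER_LANGUAGE[voiceprint]
    return source, OTHER[source]
```

With this routing, the same terminal can serve an alternating conversation: each utterance is translated toward whichever of the two languages the identified speaker did not use.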
5. The method according to any one of claims 1-4, further comprising:
providing a wireless communication signal of the terminal device to another terminal device, so that the other terminal device can connect to the Internet.
6. A voice translation method based on artificial intelligence, comprising:
receiving speech in a source language sent by a terminal device;
performing speech recognition on the speech in the source language to convert it into text in the source language;
determining that the speech in the source language is to be translated into at least one target language, other than the source language, among at least two target languages;
translating the text in the source language into text in the determined target language, and performing speech synthesis on the translated text in the target language to obtain an audio file in the target language; and
sending the audio file in the target language to the terminal device for playback.
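Server side, claim 6 is a three-stage pipeline: recognition, translation, synthesis, fanned out to every configured target language except the source. A toy sketch over canned lookup tables (a real server would call ASR, MT, and TTS engines; every table and name here is invented):

```python
# Canned stand-ins for the three engines.
ASR = {b"voice:ni hao": ("zh", "ni hao")}   # audio -> (detected language, text)
MT = {("zh", "en", "ni hao"): "hello"}      # (source, target, text) -> text

def serve(audio: bytes, target_languages: set) -> dict:
    source, text = ASR[audio]                        # speech recognition
    audio_files = {}
    for target in target_languages - {source}:       # every target but the source
        translated = MT[(source, target, text)]      # text translation
        audio_files[target] = b"tts:" + translated.encode()  # speech synthesis
    return audio_files

out = serve(b"voice:ni hao", {"zh", "en"})
```

Excluding the source language from the fan-out matches the claim's "at least one target language other than the source language".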
7. The method according to claim 6, wherein the target languages comprise a first language and a second language, and determining that the speech in the source language is to be translated into at least one target language, other than the source language, among the at least two target languages comprises:
performing voiceprint recognition on the speech in the source language to determine that it is speech in the first language input by a first user through the terminal device; and
determining, according to the target languages corresponding to a pre-stored identifier of the terminal device, that the speech in the first language is to be translated into an audio file in the second language.
8. The method according to claim 7, wherein the audio file in the target language comprises an audio file in the second language, and after sending the audio file in the target language to the terminal device for playback, the method further comprises:
receiving speech in another source language sent by the terminal device;
performing voiceprint recognition on the speech in the other source language to determine that it is speech in the second language input by a second user through the terminal device;
performing speech recognition on the speech in the second language to convert it into text in the second language;
determining, according to the target languages corresponding to the pre-stored identifier of the terminal device, that the speech in the second language is to be translated into an audio file in the first language;
translating the text in the second language into text in the first language, and performing speech synthesis on the text in the first language to obtain an audio file in the first language; and
sending the audio file in the first language to the terminal device for playback.
9. The method according to any one of claims 6-8, further comprising, before translating the text in the source language into the text in the determined target language:
receiving the target languages uploaded by the terminal device, and storing an identifier of the terminal device in correspondence with the target languages, wherein the target languages comprise at least two languages, the at least two languages including the source language.
10. The method according to claim 9, wherein translating the text in the source language into the text in the determined target language comprises:
calling, according to the identifier of the terminal device, a corpus of the target language corresponding to the identifier of the terminal device, and translating the text in the source language into the text in the determined target language.
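Claim 10 keys the translation corpus off the terminal's identifier, so different devices can carry different language-pair resources. A sketch with a phrase table standing in for the corpus (all identifiers and data here are invented; a real system would load a translation model):

```python
# Per-device corpora: device id -> (source, target) -> phrase table.
CORPORA = {
    "device-42": {("zh", "en"): {"xie xie": "thank you"}},
}

def translate_text(device_id: str, source: str, target: str, text: str) -> str:
    corpus = CORPORA[device_id][(source, target)]  # corpus selected by device id
    return corpus[text]

translation = translate_text("device-42", "zh", "en", "xie xie")
```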
11. A voice translation apparatus based on artificial intelligence, provided on a terminal device, the apparatus comprising:
a receiving module, configured to receive speech in a source language input by a user through the terminal device;
a sending module, configured to send the speech in the source language to a cloud server;
the receiving module being further configured to receive an audio file in a target language sent by the cloud server, wherein the audio file in the target language is obtained by the cloud server performing speech recognition on the speech in the source language, determining that the speech in the source language is to be translated into at least one target language, other than the source language, among at least two target languages, translating the text obtained by speech recognition into text in the determined target language, and performing speech synthesis on the translated text in the target language; and
a playing module, configured to play the audio file in the target language.
12. The apparatus according to claim 11, wherein the receiving module is specifically configured to receive, after the user triggers a translation button of the terminal device, the speech in the source language input by the user through a microphone of the terminal device.
13. The apparatus according to claim 11, further comprising an obtaining module;
the obtaining module being configured to obtain the target languages set by the user before the receiving module receives the audio file in the target language sent by the cloud server; and
the sending module being further configured to upload the target languages set by the user to the cloud server, so that the cloud server stores an identifier of the terminal device in correspondence with the target languages, wherein the target languages comprise at least two languages, the at least two languages including the source language.
14. The apparatus according to claim 11, wherein the user comprises a first user and a second user, and the target languages comprise a first language and a second language;
the audio file in the target language comprises an audio file in the second language, obtained by the cloud server performing speech recognition and voiceprint recognition on the speech in the source language, determining that the speech in the source language is speech in the first language input by the first user through the terminal device, determining that the speech in the first language is to be translated into the second language, translating the text obtained by speech recognition into text in the second language, and performing speech synthesis on the translated text in the second language;
the receiving module is further configured to receive, after the playing module plays the audio file in the target language, speech in another source language input by the second user through the terminal device;
the sending module is further configured to send the speech in the other source language to the cloud server;
the receiving module is further configured to receive an audio file in the first language sent by the cloud server, obtained by the cloud server performing speech recognition and voiceprint recognition on the speech in the other source language, determining that the speech in the other source language is speech in the second language input by the second user through the terminal device, determining that the speech in the second language is to be translated into the first language, translating the text obtained by speech recognition into text in the first language, and performing speech synthesis on the translated text in the first language; and
the playing module is further configured to play the audio file in the first language.
15. The apparatus according to any one of claims 11-14, further comprising:
a wireless signal providing module, configured to provide a wireless communication signal of the terminal device to another terminal device, so that the other terminal device can connect to the Internet.
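Claims 11-15 restate the method as cooperating modules on the device. One way to read that decomposition is as an object with its collaborators injected; a structural sketch only (the module roles follow the claims, everything else is assumed):

```python
class DeviceTranslator:
    """Mirrors the receiving/sending/playing modules of claims 11-15."""

    def __init__(self, send_to_cloud, play_audio):
        self._send = send_to_cloud   # sending module: uploads audio to the cloud
        self._play = play_audio      # playing module: renders returned audio

    def handle_utterance(self, source_audio: bytes) -> None:
        # Receiving module: the cloud's reply is the target-language audio file.
        target_audio = self._send(source_audio)
        self._play(target_audio)

played = []
device = DeviceTranslator(lambda audio: b"echo:" + audio, played.append)
device.handle_utterance(b"hi")
```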
16. A voice translation apparatus based on artificial intelligence, provided on a cloud server, the apparatus comprising:
a receiving module, configured to receive speech in a source language sent by a terminal device;
a speech recognition module, configured to perform speech recognition on the speech in the source language to convert it into text in the source language;
a determining module, configured to determine that the speech in the source language is to be translated into at least one target language, other than the source language, among at least two target languages;
a translation module, configured to translate the text in the source language into text in the target language determined by the determining module;
a speech synthesis module, configured to perform speech synthesis on the text in the target language translated by the translation module to obtain an audio file in the target language; and
a sending module, configured to send the audio file in the target language obtained by the speech synthesis module to the terminal device for playback.
17. The apparatus according to claim 16, wherein the target languages comprise a first language and a second language; and
the determining module is specifically configured to perform voiceprint recognition on the speech in the source language to determine that it is speech in the first language input by a first user through the terminal device, and to determine, according to the target languages corresponding to a pre-stored identifier of the terminal device, that the speech in the first language is to be translated into an audio file in the second language.
18. The apparatus according to claim 17, wherein the audio file in the target language comprises an audio file in the second language;
the receiving module is further configured to receive, after the sending module sends the audio file in the target language to the terminal device for playback, speech in another source language sent by the terminal device;
the determining module is further configured to perform voiceprint recognition on the speech in the other source language to determine that it is speech in the second language input by a second user through the terminal device;
the speech recognition module is further configured to perform speech recognition on the speech in the second language to convert it into text in the second language;
the determining module is further configured to determine, according to the target languages corresponding to the pre-stored identifier of the terminal device, that the speech in the second language is to be translated into an audio file in the first language;
the translation module is further configured to translate the text in the second language into text in the first language;
the speech synthesis module is further configured to perform speech synthesis on the text in the first language to obtain an audio file in the first language; and
the sending module is further configured to send the audio file in the first language to the terminal device for playback.
19. The apparatus according to any one of claims 16-18, further comprising a storing module;
the receiving module being further configured to receive the target languages uploaded by the terminal device before the translation module translates the text in the source language into the text in the determined target language; and
the storing module being configured to store an identifier of the terminal device in correspondence with the target languages, wherein the target languages comprise at least two languages, the at least two languages including the source language.
20. The apparatus according to claim 19, wherein the translation module is specifically configured to call, according to the identifier of the terminal device, a corpus of the target language corresponding to the identifier of the terminal device, and to translate the text in the source language into the text in the determined target language.
21. A terminal device, comprising:
one or more processors;
a memory, configured to store one or more programs;
a receiver, configured to receive speech in a source language input by a user through the terminal device, and, after a transmitter sends the speech in the source language to a cloud server, to receive an audio file in a target language sent by the cloud server, wherein the audio file in the target language is obtained by the cloud server performing speech recognition on the speech in the source language, determining that the speech in the source language is to be translated into at least one target language, other than the source language, among at least two target languages, translating the text obtained by speech recognition into text in the determined target language, and performing speech synthesis on the translated text in the target language; and
the transmitter, configured to send the speech in the source language to the cloud server;
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method according to any one of claims 1-5.
22. A storage medium containing computer-executable instructions, wherein the computer-executable instructions, when executed by a computer processor, perform the method according to any one of claims 1-5.
23. A cloud server, comprising:
one or more processors;
a memory, configured to store one or more programs;
a receiver, configured to receive speech in a source language sent by a terminal device; and
a transmitter, configured to send an audio file in a target language to the terminal device for playback;
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method according to any one of claims 6-10.
24. A storage medium containing computer-executable instructions, wherein the computer-executable instructions, when executed by a computer processor, perform the method according to any one of claims 6-10.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710183965.2A CN106935240A (en) | 2017-03-24 | 2017-03-24 | Voice translation method, device, terminal device and cloud server based on artificial intelligence |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106935240A true CN106935240A (en) | 2017-07-07 |
Family
ID=59426343
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710183965.2A Pending CN106935240A (en) | 2017-03-24 | 2017-03-24 | Voice translation method, device, terminal device and cloud server based on artificial intelligence |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106935240A (en) |
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1602483A (en) * | 2001-12-17 | 2005-03-30 | 内维尼·加雅拉特尼 | Real time translator and method of performing real time translation of a plurality of spoken word languages |
US20110238407A1 (en) * | 2009-08-31 | 2011-09-29 | O3 Technologies, Llc | Systems and methods for speech-to-speech translation |
CN105117391A (en) * | 2010-08-05 | 2015-12-02 | 谷歌公司 | Translating languages |
CN103838714A (en) * | 2012-11-22 | 2014-06-04 | 北大方正集团有限公司 | Method and device for converting voice information |
CN104462069A (en) * | 2013-09-18 | 2015-03-25 | 株式会社东芝 | Speech translation apparatus and speech translation method |
CN105512113A (en) * | 2015-12-04 | 2016-04-20 | 青岛冠一科技有限公司 | Communication type voice translation system and translation method |
Cited By (24)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107749296A (en) * | 2017-10-12 | 2018-03-02 | 深圳市沃特沃德股份有限公司 | Voice translation method and device |
CN108010519A (en) * | 2017-11-10 | 2018-05-08 | 上海爱优威软件开发有限公司 | A kind of information search method and system |
CN107967264A (en) * | 2017-12-29 | 2018-04-27 | 深圳市译家智能科技有限公司 | Translator, translation system and interpretation method |
CN108415904A (en) * | 2018-01-12 | 2018-08-17 | 广东思派康电子科技有限公司 | A kind of binary channels real time translating method |
CN108319590A (en) * | 2018-01-25 | 2018-07-24 | 芜湖应天光电科技有限责任公司 | A kind of adaptive translator based on cloud service |
CN108595443A (en) * | 2018-03-30 | 2018-09-28 | 浙江吉利控股集团有限公司 | Simultaneous interpreting method, device, intelligent vehicle mounted terminal and storage medium |
CN108710615A (en) * | 2018-05-03 | 2018-10-26 | Oppo广东移动通信有限公司 | Interpretation method and relevant device |
CN108710615B (en) * | 2018-05-03 | 2020-03-03 | Oppo广东移动通信有限公司 | Translation method and related equipment |
CN110728976A (en) * | 2018-06-30 | 2020-01-24 | 华为技术有限公司 | Method, device and system for voice recognition |
CN110728976B (en) * | 2018-06-30 | 2022-05-06 | 华为技术有限公司 | Method, device and system for voice recognition |
CN109036451A (en) * | 2018-07-13 | 2018-12-18 | 深圳市小瑞科技股份有限公司 | A kind of simultaneous interpretation terminal and its simultaneous interpretation system based on artificial intelligence |
CN109272983A (en) * | 2018-10-12 | 2019-01-25 | 武汉辽疆科技有限公司 | Bilingual switching device for child-parent education |
CN109359307A (en) * | 2018-10-17 | 2019-02-19 | 深圳市沃特沃德股份有限公司 | Interpretation method, device and the equipment of automatic identification languages |
CN109522564A (en) * | 2018-12-17 | 2019-03-26 | 北京百度网讯科技有限公司 | Voice translation method and device |
CN109522564B (en) * | 2018-12-17 | 2022-05-31 | 北京百度网讯科技有限公司 | Voice translation method and device |
CN109979431A (en) * | 2019-03-20 | 2019-07-05 | 邱洵 | A kind of multifunctional intellectual language translation system |
CN110970014A (en) * | 2019-10-31 | 2020-04-07 | 阿里巴巴集团控股有限公司 | Voice conversion, file generation, broadcast, voice processing method, device and medium |
CN110970014B (en) * | 2019-10-31 | 2023-12-15 | 阿里巴巴集团控股有限公司 | Voice conversion, file generation, broadcasting and voice processing method, equipment and medium |
CN112995873A (en) * | 2019-12-13 | 2021-06-18 | 西万拓私人有限公司 | Method for operating a hearing system and hearing system |
CN111614781A (en) * | 2020-05-29 | 2020-09-01 | 王浩 | Audio processing method, terminal device and system based on cloud server |
CN113360127A (en) * | 2021-05-31 | 2021-09-07 | 富途网络科技(深圳)有限公司 | Audio playing method and electronic equipment |
CN114267358A (en) * | 2021-12-17 | 2022-04-01 | 北京百度网讯科技有限公司 | Audio processing method, device, apparatus, storage medium, and program |
CN114267358B (en) * | 2021-12-17 | 2023-12-12 | 北京百度网讯科技有限公司 | Audio processing method, device, equipment and storage medium |
CN115051991A (en) * | 2022-07-08 | 2022-09-13 | 北京有竹居网络技术有限公司 | Audio processing method and device, storage medium and electronic equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106935240A (en) | Voice translation method, device, terminal device and cloud server based on artificial intelligence | |
CN110288077B (en) | Method and related device for synthesizing speaking expression based on artificial intelligence | |
US20200294505A1 (en) | View-based voice interaction method, apparatus, server, terminal and medium | |
CN110490213B (en) | Image recognition method, device and storage medium | |
Shawai et al. | Malay language mobile learning system (MLMLS) using NFC technology | |
CN109036396A (en) | A kind of exchange method and system of third-party application | |
US20200184948A1 (en) | Speech playing method, an intelligent device, and computer readable storage medium | |
CN113592985B (en) | Method and device for outputting mixed deformation value, storage medium and electronic device | |
CN107562850A (en) | Music recommends method, apparatus, equipment and storage medium | |
CN108012173A (en) | A kind of content identification method, device, equipment and computer-readable storage medium | |
CN108133707A (en) | A kind of content share method and system | |
JP2019211747A (en) | Voice concatenative synthesis processing method and apparatus, computer equipment and readable medium | |
CN106537496A (en) | Terminal device, information provision system, information presentation method, and information provision method | |
CN109257659A (en) | Subtitle adding method, device, electronic equipment and computer readable storage medium | |
CN108564966A (en) | The method and its equipment of tone testing, the device with store function | |
CN107590216A (en) | Answer preparation method, device and computer equipment | |
CN104598443B (en) | Language service providing method, apparatus and system | |
KR20190005103A (en) | Electronic device-awakening method and apparatus, device and computer-readable storage medium | |
CN108073572A (en) | Information processing method and its device, simultaneous interpretation system | |
CN113257218B (en) | Speech synthesis method, device, electronic equipment and storage medium | |
KR20200027331A (en) | Voice synthesis device | |
CN108122555A (en) | The means of communication, speech recognition apparatus and terminal device | |
CN112272170A (en) | Voice communication method and device, electronic equipment and storage medium | |
CN110010127A (en) | Method for changing scenes, device, equipment and storage medium | |
CN108769891A (en) | A kind of audio frequency transmission method and mobile translation equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20170707 |