CN108595443A

CN108595443A - Simultaneous interpretation method, device, intelligent vehicle-mounted terminal, and storage medium

Info

Publication number: CN108595443A
Application number: CN201810286936.3A
Authority: CN
Inventors: 张鸿鸽; 徐钧
Original assignee: Zhejiang Geely Holding Group Co Ltd
Current assignee: Zhejiang Geely Holding Group Co Ltd
Priority date: 2018-03-30
Filing date: 2018-03-30
Publication date: 2018-09-28

Abstract

The present invention relates to the technical field of intelligent automobiles and provides a simultaneous interpretation method, a device, an intelligent vehicle-mounted terminal, and a storage medium. The method includes: obtaining speech to be translated in response to a simultaneous interpretation request; performing speech recognition on the speech to be translated to obtain text to be translated; performing language identification on the text to be translated to obtain its source language; and, according to a preset correspondence between source languages and target languages, translating the text to be translated into target-language text and outputting it as speech. By automatically identifying the language of the speech to be translated, the invention achieves automatic two-way translation between two languages, so that people in a vehicle who do not share a language can communicate normally.

Description

Simultaneous interpretation method, device, intelligent vehicle-mounted terminal, and storage medium

Technical field

The present invention relates to the technical field of intelligent automobiles, and in particular to a simultaneous interpretation method, a device, an intelligent vehicle-mounted terminal, and a storage medium.

Background

The automobile today is no longer limited to its traditional function as a means of transport. With the development of the times and the continual emergence of new technologies, automobiles are becoming increasingly intelligent. As more Chinese nationals travel abroad and more foreigners come to China for tourism and trade, drivers of taxis and commercial vehicles often find, when picking up passengers or receiving business visitors, that they cannot communicate normally because neither party understands the other's language.

Summary of the invention

The purpose of the embodiments of the present invention is to provide a simultaneous interpretation method, a device, an intelligent vehicle-mounted terminal, and a storage medium, so as to solve the problem that a driver and a passenger cannot communicate normally because of a language barrier.

To achieve the above purpose, the embodiments of the present invention adopt the following technical solutions:

In a first aspect, an embodiment of the present invention provides a simultaneous interpretation method. The method includes: obtaining speech to be translated in response to a simultaneous interpretation request; performing speech recognition on the speech to be translated to obtain text to be translated; performing language identification on the text to be translated to obtain its source language; and, according to a preset correspondence between source languages and target languages, translating the text to be translated into target-language text and outputting it as speech.

In a second aspect, an embodiment of the present invention further provides a simultaneous interpretation device that includes an acquisition module, a speech recognition module, a language identification module, and a translation module. The acquisition module obtains speech to be translated in response to a simultaneous interpretation request. The speech recognition module performs speech recognition on the speech to be translated to obtain text to be translated. The language identification module performs language identification on the text to be translated to obtain its source language. The translation module translates the text to be translated into target-language text according to a preset correspondence between source languages and target languages, and outputs it as speech.

In a third aspect, an embodiment of the present invention further provides an intelligent vehicle-mounted terminal that includes a vehicle-mounted microphone and a vehicle-mounted sound generator. The terminal further includes: a memory; a processor electrically connected to the vehicle-mounted microphone and the vehicle-mounted sound generator; and a simultaneous interpretation device that is stored in the memory and comprises one or more software function modules executed by the processor. These modules are: an acquisition module, which obtains the speech to be translated picked up by the vehicle-mounted microphone in response to a simultaneous interpretation request; a speech recognition module, which performs speech recognition on the speech to be translated to obtain text to be translated; a language identification module, which performs language identification on the text to be translated to obtain its source language; and a translation module, which translates the text to be translated into target-language text according to a preset correspondence between source languages and target languages and outputs it as speech through the vehicle-mounted sound generator.

In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored. When executed by a processor, the computer program implements the simultaneous interpretation method described above.

With the simultaneous interpretation method, device, intelligent vehicle-mounted terminal, and storage medium provided by the embodiments of the present invention, when a user needs simultaneous interpretation, the terminal responds to the user's simultaneous interpretation request and obtains the speech to be translated. It then performs speech recognition on that speech to obtain the text to be translated, performs language identification on the text to determine its source language, translates the text into target-language text according to the preset correspondence between source languages and target languages, and outputs the result as speech. Compared with the prior art, the embodiments of the present invention automatically identify the language of the speech to be translated and thereby achieve automatic two-way simultaneous translation between two languages, so that people in a vehicle who do not share a language can communicate normally.

To make the above objects, features, and advantages of the present invention clearer and easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.

Description of the drawings

To explain the technical solutions of the embodiments of the present invention more clearly, the drawings used in the embodiments are briefly introduced below. It should be understood that the following drawings show only some embodiments of the present invention and should therefore not be regarded as limiting its scope. A person of ordinary skill in the art can derive other related drawings from these drawings without creative effort.

Fig. 1 shows a block diagram of the intelligent vehicle-mounted terminal provided by an embodiment of the present invention.

Fig. 2 shows a flowchart of the simultaneous interpretation method provided by an embodiment of the present invention.

Fig. 3 is a flowchart of the sub-steps of step S102 shown in Fig. 2.

Fig. 4 is a flowchart of the sub-steps of step S103 shown in Fig. 2.

Fig. 5 shows a block diagram of the simultaneous interpretation device provided by an embodiment of the present invention.

Reference numerals: 100 - intelligent vehicle-mounted terminal; 101 - memory; 102 - memory controller; 103 - processor; 104 - peripheral interface; 105 - vehicle-mounted microphone; 106 - vehicle-mounted sound generator; 107 - display device; 200 - simultaneous interpretation device; 201 - acquisition module; 202 - speech recognition module; 203 - language identification module; 204 - translation module; 205 - display module.

Detailed description of the embodiments

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. The components of the embodiments described and illustrated in the drawings can generally be arranged and designed in a variety of different configurations. The following detailed description of the embodiments shown in the drawings is therefore not intended to limit the scope of the claimed invention, but merely represents selected embodiments. All other embodiments obtained by a person skilled in the art from these embodiments without creative effort fall within the protection scope of the present invention.

It should be noted that similar reference numerals and letters denote similar items in the following drawings. Once an item has been defined in one drawing, it therefore need not be defined or explained again in subsequent drawings. In addition, in the description of the present invention, terms such as "first" and "second" are used only to distinguish items and should not be understood as indicating or implying relative importance.

Referring to Fig. 1, Fig. 1 shows a block diagram of the intelligent vehicle-mounted terminal 100 provided by an embodiment of the present invention. The intelligent vehicle-mounted terminal 100 can be used for simultaneous two-way translation between different languages. It may be a smartphone, an on-board computer, the instrument cluster of an automobile, a multimedia head unit, or the like. The intelligent vehicle-mounted terminal 100 includes a memory 101, a memory controller 102, a processor 103, a peripheral interface 104, a vehicle-mounted microphone 105, a vehicle-mounted sound generator 106, and a display device 107.

The memory 101, the memory controller 102, and the processor 103 are electrically connected to one another, directly or indirectly, to transmit or exchange data. For example, these elements may be electrically connected through one or more communication buses or signal lines. The simultaneous interpretation device 200 includes at least one software function module that can be stored in the memory 101 in the form of software or firmware, or built into the operating system (OS) of the intelligent vehicle-mounted terminal 100. The processor 103 executes the executable modules stored in the memory 101, such as the software function modules and computer programs included in the simultaneous interpretation device 200.

The memory 101 may be, but is not limited to, random access memory (RAM), read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), and the like. The memory 101 stores programs, and the processor 103 executes a program after receiving an execution instruction.

The processor 103 may be an integrated circuit chip with signal processing capability. The processor 103 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), a speech processor, a video processor, and the like. It may also be a digital signal processor, an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. It can implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of the present invention. The general-purpose processor may be a microprocessor, and the processor 103 may also be any conventional processor.

The vehicle-mounted microphone 105 picks up the spoken simultaneous interpretation request and the speech to be translated, and sends them to the intelligent vehicle-mounted terminal 100 so that the terminal starts the simultaneous interpretation function and performs simultaneous two-way translation between two different languages. The vehicle-mounted microphone 105 is a microphone for use in an automobile, that is, an energy-conversion device that converts sound signals into electrical signals. It may be any of the devices commonly referred to as an in-vehicle microphone; in this embodiment, the vehicle-mounted microphone 105 may be an in-car microphone.

The vehicle-mounted sound generator 106 is a sound generator for use in an automobile, that is, a device that converts electrical energy into sound. It outputs the spoken response to the simultaneous interpretation request and the translated speech. The vehicle-mounted sound generator 106 may be a vehicle-mounted horn, a vehicle-mounted loudspeaker, or the like; in this embodiment, it may be a vehicle-mounted loudspeaker.

The display device 107 is a human-machine interface device in the automobile. It displays the user interface of the intelligent vehicle-mounted terminal 100, can receive the user's simultaneous interpretation request by touch, and displays the text to be translated corresponding to the speech to be translated together with the translated target-language text. The display device 107 may be a touch screen.

First embodiment

Referring to Fig. 2, Fig. 2 shows a flowchart of the simultaneous interpretation method provided by an embodiment of the present invention. The simultaneous interpretation method of the first embodiment is applied to the intelligent vehicle-mounted terminal 100 and includes the following steps:

Step S101: obtain speech to be translated in response to a simultaneous interpretation request.

In this embodiment, the simultaneous interpretation request is a command triggered by the user to start the simultaneous interpretation function implemented by the simultaneous interpretation method. The request may be a voice command issued through the vehicle-mounted microphone 105, an operation command issued through the user interface of the intelligent vehicle-mounted terminal 100, or a key command issued by pressing a button on the steering wheel that is electrically connected to the intelligent vehicle-mounted terminal 100. After responding to the request, the intelligent vehicle-mounted terminal 100 enters the simultaneous interpretation function and sends a prompt to the user. The prompt may be a voice prompt played by the vehicle-mounted sound generator 106 or a visual prompt shown on the display device 107; for example, the vehicle-mounted sound generator 106 announces "Please input voice" or the display device 107 shows "Please input voice".
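The three trigger paths and the prompt described above can be sketched as a small dispatcher. This is only an illustration under stated assumptions: the names `RequestSource`, `InterpretationSession`, and `handle_request` are hypothetical and do not appear in the source.

```python
from dataclasses import dataclass, field
from enum import Enum


class RequestSource(Enum):
    """Hypothetical labels for the three ways a request can be triggered."""
    VOICE_COMMAND = "voice"      # spoken through the vehicle-mounted microphone 105
    TOUCH_OPERATION = "touch"    # entered on the terminal's user interface
    STEERING_WHEEL_KEY = "key"   # pressed on a steering-wheel button


@dataclass
class InterpretationSession:
    active: bool = False
    prompts: list = field(default_factory=list)


def handle_request(session: InterpretationSession, source: RequestSource) -> str:
    """Enter interpretation mode and return the prompt for the user."""
    session.active = True
    prompt = "Please input voice"
    # Record which trigger path was used together with the prompt issued.
    session.prompts.append((source.value, prompt))
    return prompt
```

Whatever the trigger, the terminal ends up in the same state, which is why the sketch keeps the source only as a label.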

After entering the simultaneous interpretation function, the intelligent vehicle-mounted terminal 100 starts to obtain the speech to be translated sent by the vehicle-mounted microphone 105. The speech to be translated is in-vehicle sound picked up by the vehicle-mounted microphone 105. It may be the voice of a person speaking in the vehicle, or sound produced by an audio device in the vehicle, for example an audio file played on a mobile phone.

Step S102: perform speech recognition on the speech to be translated to obtain text to be translated.

In this embodiment, the purpose of speech recognition is to convert the user's speech into a text string that the intelligent vehicle-mounted terminal 100 can read. The method of performing speech recognition on the speech to be translated may include the following:

First, the speech to be translated is preprocessed to eliminate the influence of noise, and acoustic features are extracted from it to obtain the audio data to be translated. Acoustic feature extraction both compresses the information in the speech to be translated and facilitates the subsequent speech recognition.

Second, the audio data to be translated is input into a pre-established audio recognition model and processed to obtain the text to be translated. The model first calculates the probability that the speech to be translated corresponds to each syllable and obtains syllable sequences for the speech. It then calculates the probability of the corresponding word sequence from the syllable sequences. Finally, it selects the word sequence with the highest combined syllable-sequence probability and word-sequence probability as the speech recognition result, that is, the text to be translated.

Referring to Fig. 3, step S102 may also include the following sub-steps:

Sub-step S1021: convert the speech to be translated into audio data to be translated.

In this embodiment, the speech to be translated is first enhanced by eliminating noise and channel distortion. The enhanced speech is then divided into frames, and a Fourier transform is applied to each frame to extract its feature vector. The feature vectors of all frames of the speech to be translated together constitute the audio data to be translated. For example, if the speech to be translated is "你好" ("hello"), framing yields two frames of speech, "你" and "好", and the audio data to be translated may then include the feature vector of "你" and the feature vector of "好".
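The framing and Fourier-transform step can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes fixed-length overlapping frames and uses the magnitude spectrum as the feature vector, neither of which is specified in the source, and the function names are hypothetical. Speech enhancement is omitted.

```python
import cmath
import math


def split_frames(samples, frame_len, hop):
    """Cut a sample sequence into overlapping fixed-length frames."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def magnitude_spectrum(frame):
    """Magnitudes of the discrete Fourier transform for bins 0..N/2."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]


def extract_features(samples, frame_len=8, hop=4):
    """Return one feature vector (a magnitude spectrum) per frame."""
    return [magnitude_spectrum(f) for f in split_frames(samples, frame_len, hop)]
```

The direct DFT here costs O(N^2) per frame and is used only to keep the sketch free of dependencies; a real system would use a fast Fourier transform.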

Sub-step S1022: input the audio data to be translated into the pre-established audio recognition model to obtain the text to be translated corresponding to the audio data.

In this embodiment, the audio recognition model includes an acoustic model, a language model, and a search space. The acoustic model is obtained by statistically modeling the acoustic features of a large number of speech samples, and is used to calculate the probability that the speech to be translated corresponds to each syllable. The language model is obtained by training on a large number of text samples and using probabilistic and statistical methods to model the statistical regularities of words, and is used to calculate the probability of the word sequence corresponding to a set of syllable sequences.

The search space is a syllable-level network whose nodes are syllables. It is built as follows. First, a word-level network is formed whose nodes are the words likely to appear in the text to be translated. These words can be determined in advance from the application scenario of the speech to be translated; for example, if the scenario is business reception, common words include "hotel", "country", and "city". Then, the word-level network is expanded into syllables to obtain the corresponding syllable-level network, which is the search space. For example, if the word-level network contains "hotel", the expanded syllable-level network may contain the syllable corresponding to "ho" and the syllable corresponding to "tel".
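The expansion from a word-level network to a syllable-level network can be sketched as below. This is a simplified illustration: the function name `build_search_space` is hypothetical, the network is reduced to a word-to-syllables mapping, and the syllable splits for "city" and "country" are invented for the example (only "ho" and "tel" come from the text above).

```python
def build_search_space(scene_words, syllabify):
    """Expand a word-level network into a syllable-level one.

    Returns a mapping from each word to its syllable sequence.
    """
    return {word: tuple(syllabify(word)) for word in scene_words}


# Hypothetical syllable table for a business-reception scenario.
SYLLABLES = {
    "hotel": ["ho", "tel"],
    "city": ["ci", "ty"],
    "country": ["coun", "try"],
}

# Iterating the table yields its words; the lookup supplies their syllables.
search_space = build_search_space(SYLLABLES, SYLLABLES.__getitem__)
```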

Speech recognition with the pre-established audio recognition model proceeds as follows: the audio data to be translated is input into the search space, the word sequence with the highest probability is determined in the search space according to the acoustic model and the language model, and that word sequence is taken as the text to be translated corresponding to the audio data.
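Selecting the most probable word sequence can be sketched as follows. This is a sketch under assumptions: it scores an explicit list of candidates exhaustively, whereas a practical decoder would search the network incrementally, and the names `best_word_sequence`, `acoustic_prob`, and `language_prob` are hypothetical stand-ins for the acoustic model and the language model.

```python
import math


def best_word_sequence(candidates, acoustic_prob, language_prob):
    """Return the candidate word sequence with the highest combined score.

    candidates: iterable of (word_sequence, syllable_sequence) tuples.
    The score is the sum of the two log-probabilities, which ranks
    candidates the same way as the product of the probabilities.
    """
    def score(candidate):
        words, syllables = candidate
        return math.log(acoustic_prob(syllables)) + math.log(language_prob(words))

    return max(candidates, key=score)[0]
```

Working in log space avoids numerical underflow when many small probabilities are multiplied along a long sequence.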

Step S103: perform language identification on the text to be translated to obtain the source language corresponding to the text to be translated.

In this embodiment, after the text to be translated is obtained, language features are first extracted from it. A language feature is a distinctive marker that sets one language apart from others, typically including distinctive letters, distinctive letter combinations, and the types and number of diacritical marks. Feature extraction picks out the distinctive letters, distinctive letter combinations, diacritic types, and diacritic counts that occur repeatedly and in large numbers in the text to be translated. The intelligent vehicle-mounted terminal 100 stores a language database in advance, which contains multiple language templates and the template language corresponding to each. After the language features of the text to be translated are obtained, they are matched one by one against the language templates in the database to obtain the source language corresponding to the text to be translated.
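The feature-extraction idea can be sketched as follows. This is a deliberately narrow illustration: it looks only at letter pairs that recur inside words and ignores distinctive single letters and diacritical marks, which the text above also lists. The function name and the `min_count` and `top_k` parameters are assumptions.

```python
from collections import Counter


def extract_language_features(text, min_count=2, top_k=3):
    """Return the most frequent repeated letter pairs found inside words."""
    counts = Counter()
    for word in text.lower().split():
        letters = "".join(ch for ch in word if ch.isalpha())
        counts.update(letters[i:i + 2] for i in range(len(letters) - 1))
    repeated = [(pair, n) for pair, n in counts.items() if n >= min_count]
    # Sort by descending count, then alphabetically for a stable result.
    repeated.sort(key=lambda item: (-item[1], item[0]))
    return [pair for pair, _ in repeated[:top_k]]
```

Raw pair frequency surfaces common pairs rather than distinctive ones, so a practical extractor would also weigh how rare each pair is in other languages.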

Referring to Fig. 4, step S103 may also include the following sub-steps:

Sub-step S1031: perform feature extraction on the text to be translated to obtain the language features of the text to be translated.

In this embodiment, a language feature is a distinctive marker that sets one language apart from others, typically including distinctive letters, distinctive letter combinations, and the types and number of diacritical marks. For example, if the text to be translated is "Een van de hoofdkenmerken van dit communicatiesysteem is dat het alleen door de mens kan worden voortgebracht en gebruikt en meestal ook alleen door de mens wordt begrepen", the extracted language features are "oo, ee, en".

Sub-step S1032: match the language features one by one against the multiple language templates in the language database to determine the source language corresponding to the language features.

In this embodiment, a language database is stored in advance on the intelligent vehicle-mounted terminal 100. The database contains multiple language templates and the template language corresponding to each; for example, the language template "oo, aa, uu, ee" corresponds to the template language Dutch. The language features obtained in sub-step S1031 are matched one by one against the language templates in the database, and the template language of a target template whose matching degree reaches a predetermined threshold is taken as the source language of the text to be translated. For example, if the language features are "oo, ee, uu" and the language template is "oo, aa, uu, ee", the matching degree is (3/4) × 100% = 75%, where 4 is the number of features in the language template and 3 is the number of extracted features that also appear in the template. The predetermined threshold is an empirical value; for example, it may be set to 70% for the sake of accuracy. Among the templates whose matching degree is greater than or equal to the threshold, the template language of the one with the highest matching degree is taken as the source language. For example, with a threshold of 70%, if the text to be translated matches the Dutch template at 72% and the Afrikaans template at 80%, Afrikaans is taken as the source language of the text to be translated.
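The matching rule and the worked 75% example can be sketched as follows. The function names and the dictionary layout of the template database are assumptions made for illustration, and templates are assumed to be non-empty.

```python
def matching_degree(features, template):
    """Share of template features that also appear in the text's features."""
    template = set(template)
    return len(template & set(features)) / len(template)


def identify_language(features, templates, threshold=0.70):
    """Return the best-matching template language, or None below threshold.

    templates: mapping from language name to its template features.
    """
    scored = {lang: matching_degree(features, tpl)
              for lang, tpl in templates.items()}
    qualified = {lang: s for lang, s in scored.items() if s >= threshold}
    return max(qualified, key=qualified.get) if qualified else None
```

With the features "oo, ee, uu" and the Dutch template "oo, aa, uu, ee", `matching_degree` returns 0.75, which clears the 70% threshold used in the example.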

In one implementation, the intelligent vehicle-mounted terminal 100 may instead be communicatively connected to a language database server on which the language database is stored. The intelligent vehicle-mounted terminal 100 sends the language features of the text to be translated to the language database server, which performs the language feature matching and returns the matching result to the intelligent vehicle-mounted terminal 100.

Step S104: according to the preset correspondence between source languages and target languages, translate the text to be translated into target-language text and output it as speech.

In this embodiment, the user sets the correspondence between source languages and target languages in advance through the operation interface of the intelligent vehicle-mounted terminal 100. The target language of the target-language text is then obtained from the source language determined in step S103 and this preset correspondence. For example, if the user has set Chinese-English two-way translation, then when the source language is Chinese the target language is English and the text to be translated must be translated into English, and when the source language is English the target language is Chinese and the text to be translated must be translated into Chinese.
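The two-way correspondence can be sketched as a symmetric lookup table. The function names and the language codes "zh" and "en" are illustrative assumptions; the source does not specify how the setting is stored.

```python
def make_intertranslation(lang_a, lang_b):
    """Build a two-way source-to-target mapping for one configured pair."""
    return {lang_a: lang_b, lang_b: lang_a}


def target_language(correspondence, source_language):
    """Look up the target language for an identified source language."""
    try:
        return correspondence[source_language]
    except KeyError:
        # The identified language is not part of the configured pair.
        raise ValueError(f"no target configured for {source_language!r}") from None
```

Because the mapping is symmetric, neither speaker has to indicate a translation direction: the identified source language alone selects the target.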

In this embodiment, the text to be translated may be translated into target-language text by inputting it into a pre-established recurrent neural network model, which outputs the corresponding target-language text. The recurrent neural network model works in two stages, encoding and decoding. First, the text to be translated is input into the model and encoded to obtain the vector sequence corresponding to the text. Then, the vector sequence is decoded by calculating the probability of each word in the target-language text, which finally yields the target-language text.

In one implementation, the process of inputting the text to be translated into the pre-established recurrent neural network model and obtaining the corresponding target-language text may be as follows:

First, the text to be translated is encoded to obtain its corresponding vector sequence. Each word in the text to be translated is input into the recurrent neural network model in order and converted into a corresponding word vector. The word vector of each word is also used as an input when generating the word vector of the next word. As a result, the word vector of the last word in fact contains the word-vector information of all the words before it, and it is therefore used as the vector sequence corresponding to the text to be translated.
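The way the last vector accumulates information from all earlier words can be sketched with a plain recurrent cell. This is an illustration under assumptions: the tanh cell, the zero initial state, and the function name `encode` are not specified in the source, and any weights passed in would normally come from training.

```python
import math


def encode(word_vectors, w_in, w_rec):
    """Fold word vectors into one hidden state with a simple recurrent cell.

    Each step computes h_t = tanh(w_in @ x_t + w_rec @ h_prev),
    starting from an all-zero state.
    """
    hidden = [0.0] * len(w_rec)
    for x in word_vectors:
        hidden = [
            math.tanh(sum(wi * xi for wi, xi in zip(w_in[r], x))
                      + sum(wh * hj for wh, hj in zip(w_rec[r], hidden)))
            for r in range(len(w_rec))
        ]
    return hidden
```

Since each state feeds into the next, changing or reordering any earlier word changes the final vector, which is why that single vector can stand for the whole sentence.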

Second, the vector sequence corresponding to the text to be translated is decoded, and the target-language text is obtained by calculating the probability of each of its words one by one. Decoding produces one word of the target-language text at a time. When each word is decoded, the previously decoded word is used as an input, the probability of each candidate word in the target-language text is determined, and the word with the highest probability is chosen as the next word of the target-language text. Repeating this finally yields the complete target-language text.
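The word-by-word decoding loop described above amounts to greedy decoding, sketched below. The `step` callable stands in for the decoder network and is hypothetical, as are the start marker, the end marker, and the length limit, none of which is mentioned in the source.

```python
def greedy_decode(context, step, start="<s>", end="</s>", max_len=20):
    """Emit one target word at a time, always taking the most probable one.

    step(context, previous_word) must return a dict that maps candidate
    words to probabilities.
    """
    output, previous = [], start
    for _ in range(max_len):
        probabilities = step(context, previous)
        word = max(probabilities, key=probabilities.get)
        if word == end:
            break
        output.append(word)
        previous = word  # the chosen word becomes the next input
    return output
```

Always taking the single most probable word is simple but can miss a better overall sentence; keeping several candidates alive is a common refinement that the source does not describe.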

It should be noted that the recurrent neural network model used in step S104 is established in advance by training. The training process is as follows:

First, the input parameters of the recurrent neural network model are initialized. The input parameters include the embedding size, the encoder input direction, the encoding depth, the decoding depth, the neural unit type, and the number of iterations. The embedding size is the length of the vector that represents a word, that is, the word-vector length. The encoder input direction is the order in which the input sequence of the text to be translated is fed in, and may be forward, reverse, or bidirectional: forward feeds the words in the order of the text to be translated, reverse feeds them in the opposite order, and bidirectional feeds them in both the original and the opposite order. The encoding depth is the number of layers in the encoding neural network, and the decoding depth is the number of layers in the decoding neural network. Common neural unit types are the simple recurrent neural network, the long short-term memory network, and the gated recurrent unit. Once the neural unit type is determined, the size of the corresponding neural network, the number of neurons in each layer, the activation function of each layer, and the initial values of the gradient weights are also initialized. The number of iterations mainly affects the training loss of the neural network.
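The input parameters listed above can be grouped in a small configuration object, and the three encoder input directions reduce to a choice of word order. The class name, field names, and default values below are illustrative assumptions, not values from the source.

```python
from dataclasses import dataclass


@dataclass
class TrainingConfig:
    embedding_size: int = 128            # word-vector length
    encoder_direction: str = "forward"   # "forward", "reverse", or "bidirectional"
    encoder_depth: int = 2               # layers in the encoding network
    decoder_depth: int = 2               # layers in the decoding network
    unit_type: str = "lstm"              # "rnn", "lstm", or "gru"
    iterations: int = 10000              # number of training iterations


def encoder_inputs(words, direction):
    """Return the input sequence or sequences fed to the encoder."""
    if direction == "forward":
        return [list(words)]
    if direction == "reverse":
        return [list(reversed(words))]
    if direction == "bidirectional":
        return [list(words), list(reversed(words))]
    raise ValueError(f"unknown encoder direction: {direction!r}")
```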

Second, training samples are input into the recurrent neural network model, and the model is trained by continually updating its weight parameters and offsets. Each training sample consists of a sample text to be translated and the corresponding target-language sample text; for example, a Chinese sentence meaning "Where are you going" paired with the English sentence "Where are you going" is one training sample, in which the Chinese sentence is the sample text to be translated and the English sentence is the corresponding target-language sample text. Each recurrent neural network includes at least one input layer, one hidden layer, and one output layer. The weight parameters include the weights from the input layer to the hidden layer, from the hidden layer to the hidden layer, and from the hidden layer to the output layer. The offsets include the offset of the hidden layer and the offset of the output layer. After the training samples are input, the model is trained with the backpropagation through time (BPTT) algorithm, and the weight parameters and offsets are updated with the gradient descent algorithm until their final values are obtained. The result is a trained recurrent neural network model that can translate text to be translated into target-language text.
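A single gradient descent update of the weight parameters and offsets can be sketched as follows. The sketch assumes the gradients have already been computed, for instance by BPTT, which is not shown; the function name, the dictionary layout, and the learning rate are assumptions.

```python
def gradient_descent_step(params, grads, learning_rate=0.1):
    """Return updated parameters: value - learning_rate * gradient.

    params and grads map the same parameter names to equal-length
    lists of floats.
    """
    return {name: [v - learning_rate * g for v, g in zip(values, grads[name])]
            for name, values in params.items()}
```

Training repeats this update over the samples for the configured number of iterations, moving each weight and offset a small step against its gradient.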

In this embodiment, after the intelligent vehicle-mounted terminal 100 obtains the target-language text, it can output the text as speech. That is, the intelligent vehicle-mounted terminal 100 first converts the target-language text into target-language audio and then plays it through the vehicle-mounted sound generator 106. In addition, because the user may fail to hear the target-language audio in time, for example because of noise, the text to be translated and the translated target-language text can be shown on the display device 107 to improve the user experience. The embodiment of the present invention may therefore also include step S105.

Step S105: display the text to be translated and the target language text corresponding to the text to be translated.

In the embodiment of the present invention, in addition to converting the target language text into target language audio and outputting it through the vehicle-mounted sound generator 106, the intelligent vehicle-mounted terminal 100 may also display the text to be translated and the corresponding target language text on the vehicle-mounted display device 107. A user who does not hear the target language audio because of noise or the like can then read the target language text on the display device 107, which improves the user experience.

Compared with the prior art, the embodiment of the present invention has the following advantages:

First, the voice to be translated is obtained based on the user's simultaneous interpretation request. The request may be a voice command, an operation, or a button press; these varied forms make it convenient for the user to initiate a request in different scenarios.

Second, the voice to be translated is converted into the text to be translated by speech recognition, and language identification is then performed automatically on the text to be translated. This realizes automatic simultaneous mutual translation between two languages, so that people in the vehicle who do not share a language can communicate normally.

Third exports the target language text after voiced translation to be translated with voice mode, can also will wait turning over Translation sheet and the corresponding target language text of text to be translated shown with display device 107, is output by voice and text The output for showing two kinds of forms, meets the needs of user is to output under different scenes, improves the usage experience of user.

Second embodiment

Please refer to Fig. 5, which shows a block diagram of the simultaneous interpretation device 200 provided by the embodiment of the present invention. The simultaneous interpretation device 200 is applied to the intelligent vehicle-mounted terminal 100 and includes an acquisition module 201, a speech recognition module 202, a language identification module 203, a translation module 204, and a display module 205.

The acquisition module 201 is configured to obtain the voice to be translated based on a simultaneous interpretation request.

The speech recognition module 202 is configured to perform speech recognition on the voice to be translated to obtain the text to be translated.

In the embodiment of the present invention, the speech recognition module 202 is specifically configured to: convert the voice to be translated into audio data to be translated; and input the audio data to be translated into a pre-established audio recognition model to obtain the text to be translated corresponding to the audio data to be translated.

The language identification module 203 is configured to perform language identification on the text to be translated to obtain the language to be translated corresponding to the text to be translated.

In the embodiment of the present invention, the language identification module 203 is specifically configured to: perform feature extraction on the text to be translated to obtain the language feature of the text to be translated; and match the language feature one by one against the multiple language templates in the language database to determine the language to be translated corresponding to the language feature.
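One possible reading of the feature extraction and one-by-one template matching is sketched below. It assumes that the language feature is a character n-gram frequency profile and that matching uses cosine similarity; the disclosure does not specify either choice. The names `extract_features`, `LANGUAGE_TEMPLATES`, and `identify_language`, and the tiny sample sentences used to build the templates, are hypothetical.

```python
import math
from collections import Counter


def extract_features(text, max_n=2):
    """Count character n-grams (n = 1..max_n), ignoring case and whitespace."""
    chars = [c for c in text.lower() if not c.isspace()]
    features = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(chars) - n + 1):
            features["".join(chars[i:i + n])] += 1
    return features


def cosine_similarity(a, b):
    """Cosine similarity between two n-gram count profiles."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


# Stand-in for the language database: one template per language, built here
# from a few illustrative sentences rather than from a real corpus.
LANGUAGE_TEMPLATES = {
    "zh": extract_features("你要去哪里 我想去机场 请问多少钱"),
    "en": extract_features("where are you going i want to go to the airport how much is it"),
}


def identify_language(text, templates=LANGUAGE_TEMPLATES):
    """Match the text's features against each template; return the best match."""
    features = extract_features(text)
    best_language, best_score = None, 0.0
    for language, template in templates.items():
        score = cosine_similarity(features, template)
        if score > best_score:
            best_language, best_score = language, score
    return best_language  # None when no template matches at all
```

Returning `None` when no template shares any feature with the input lets the caller decide how to handle an unsupported language instead of silently guessing.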

The translation module 204 is configured to translate the text to be translated into the target language text according to the preset correspondence between the language to be translated and the target language, and to output the target language text by voice.
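The preset correspondence between the language to be translated and the target language can be pictured as a lookup table. The following is a minimal sketch under the assumption that two languages are configured for mutual translation; the function names and the language codes `zh` and `en` are hypothetical.

```python
def make_bidirectional_pairs(language_a, language_b):
    """Build a correspondence in which each language maps to the other."""
    return {language_a: language_b, language_b: language_a}


def target_language_for(source_language, pairs):
    """Look up the target language preset for an identified source language."""
    try:
        return pairs[source_language]
    except KeyError:
        raise ValueError(
            f"no target language configured for {source_language!r}"
        ) from None


# Illustrative configuration: driver speaks Chinese, passenger speaks English.
LANGUAGE_PAIRS = make_bidirectional_pairs("zh", "en")
```

With such a table, whichever of the two languages is identified in the text to be translated, the other is selected as the target language, which is what allows mutual translation without the speakers switching modes.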

The display module 205 is configured to display the text to be translated and the target language text corresponding to the text to be translated.
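The way the modules cooperate can be sketched as a small pipeline. This is an illustrative outline only: the class `SimultaneousInterpreter` and its callable fields are hypothetical, and the stub functions in the demonstration stand in for the real speech recognition, language identification, translation, and speech synthesis components.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class SimultaneousInterpreter:
    """Hypothetical wiring of the recognition, identification, and translation steps."""
    recognize_speech: Callable[[bytes], str]        # speech recognition module
    identify_language: Callable[[str], str]         # language identification module
    translate: Callable[[str, str, str], str]       # translation module (text part)
    synthesize_speech: Callable[[str, str], bytes]  # voice output of the translation
    language_pairs: Dict[str, str]                  # preset language correspondence

    def interpret(self, audio: bytes) -> Tuple[str, str, bytes]:
        """Return (text to be translated, target language text, target audio)."""
        source_text = self.recognize_speech(audio)
        source_language = self.identify_language(source_text)
        target_language = self.language_pairs[source_language]
        target_text = self.translate(source_text, source_language, target_language)
        target_audio = self.synthesize_speech(target_text, target_language)
        return source_text, target_text, target_audio


# Stub components for demonstration; none of them performs real processing.
demo = SimultaneousInterpreter(
    recognize_speech=lambda audio: audio.decode("utf-8"),
    identify_language=lambda text: "en" if text.isascii() else "zh",
    translate=lambda text, src, dst: f"[{src}->{dst}] {text}",
    synthesize_speech=lambda text, language: text.encode("utf-8"),
    language_pairs={"zh": "en", "en": "zh"},
)

source_text, target_text, target_audio = demo.interpret(b"hello")
```

The first two returned values are what a display module would show, and the third is what would be sent to the sound generator.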

An embodiment of the present invention further discloses a computer-readable storage medium on which a computer program is stored; when the computer program is executed by the processor 103, the simultaneous interpretation method disclosed in the foregoing embodiment of the present invention is implemented.

In conclusion a kind of simultaneous interpreting method, device, intelligent vehicle mounted terminal and storage medium provided by the invention, institute The method of stating includes：It is asked based on simultaneous interpretation, obtains voice to be translated；It treats translated speech and carries out speech recognition, obtain waiting turning over Translation sheet；It treats cypher text and carries out languages identification, obtain the corresponding languages to be translated of text to be translated；According to pre-set Text to be translated is translated as target language text by the correspondence between languages to be translated and target language, and with voice side Formula is exported.Compared with prior art, the embodiment of the present invention carries out automatic identification by treating the languages of translated speech, realizes Automatic intertranslation in unison between bilingual so that can be with normal communication between the personnel of interior language obstacle.

In the several embodiments provided in this application, it should be understood that the disclosed device and method may also be implemented in other ways. The device embodiments described above are merely illustrative. For example, the flowcharts and block diagrams in the drawings show the possible architectures, functions, and operations of devices, methods, and computer program products according to multiple embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and each combination of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.

In addition, the function modules in the embodiments of the present invention may be integrated to form one independent part, each module may exist separately, or two or more modules may be integrated to form one independent part.

If the functions are implemented in the form of software function modules and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc. It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include" and "comprise", and any other variants thereof, are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article, or device that includes the element.

The foregoing is only a preferred embodiment of the present invention and is not intended to limit the present invention; for those skilled in the art, the present invention may have various modifications and changes. Any modification, equivalent replacement, improvement, or the like made within the spirit and principles of the present invention shall fall within the protection scope of the present invention. It should be noted that similar reference numerals and letters denote similar items in the drawings; therefore, once an item is defined in one drawing, it need not be further defined or explained in subsequent drawings.

Claims

1. A simultaneous interpretation method, characterized in that it is applied to an intelligent vehicle-mounted terminal, the method comprising:

obtaining a voice to be translated based on a simultaneous interpretation request;

performing speech recognition on the voice to be translated to obtain a text to be translated;

performing language identification on the text to be translated to obtain a language to be translated corresponding to the text to be translated;

translating the text to be translated into a target language text according to a preset correspondence between the language to be translated and a target language, and outputting the target language text by voice.

2. The method according to claim 1, characterized in that the step of performing speech recognition on the voice to be translated to obtain the text to be translated comprises:

converting the voice to be translated into audio data to be translated;

inputting the audio data to be translated into a pre-established audio recognition model to obtain the text to be translated corresponding to the audio data to be translated.

3. The method according to claim 1, characterized in that the intelligent vehicle-mounted terminal comprises a language database in which multiple language templates are stored, and the step of performing language identification on the text to be translated to obtain the language to be translated corresponding to the text to be translated comprises:

performing feature extraction on the text to be translated to obtain a language feature of the text to be translated;

matching the language feature one by one against the multiple language templates in the language database to determine the language to be translated corresponding to the language feature.

4. The method according to claim 1, characterized in that the step of translating the text to be translated into the target language text comprises:

inputting the text to be translated into a pre-established recurrent neural network model to obtain the target language text corresponding to the text to be translated.

5. The method according to claim 1, characterized in that the method further comprises:

displaying the text to be translated and the target language text corresponding to the text to be translated.

6. A simultaneous interpretation device, characterized in that it is applied to an intelligent vehicle-mounted terminal, the device comprising:

an acquisition module, configured to obtain a voice to be translated based on a simultaneous interpretation request;

a speech recognition module, configured to perform speech recognition on the voice to be translated to obtain a text to be translated;

a language identification module, configured to perform language identification on the text to be translated to obtain a language to be translated corresponding to the text to be translated;

a translation module, configured to translate the text to be translated into a target language text according to a preset correspondence between the language to be translated and a target language, and to output the target language text by voice.

7. The device according to claim 6, characterized in that the speech recognition module is specifically configured to:

convert the voice to be translated into audio data to be translated;

8. The device according to claim 6, characterized in that the intelligent vehicle-mounted terminal comprises a language database in which multiple language templates are stored, and the language identification module is specifically configured to:

9. An intelligent vehicle-mounted terminal, characterized in that the intelligent vehicle-mounted terminal comprises a vehicle-mounted microphone and a vehicle-mounted sound generator, and further comprises:

a memory;

a processor electrically connected with the vehicle-mounted microphone and the vehicle-mounted sound generator; and

a simultaneous interpretation device stored in the memory and comprising one or more software function modules executed by the processor, the simultaneous interpretation device comprising:

an acquisition module, configured to obtain, based on a simultaneous interpretation request, the voice to be translated picked up by the vehicle-mounted microphone;

a translation module, configured to translate the text to be translated into a target language text according to a preset correspondence between the language to be translated and a target language, and to output the target language text by voice through the vehicle-mounted sound generator.

10. A computer-readable storage medium on which a computer program is stored, characterized in that, when the computer program is executed by a processor, the method according to any one of claims 1-5 is implemented.