CN108270928A

CN108270928A - Speech recognition method and mobile terminal

Info

Publication number: CN108270928A
Application number: CN201810358427.7A
Authority: CN
Inventors: 赵俊杰
Original assignee: Vivo Mobile Communication Co Ltd
Current assignee: Vivo Mobile Communication Co Ltd
Priority date: 2018-04-20
Filing date: 2018-04-20
Publication date: 2018-07-10
Anticipated expiration: 2038-04-20
Also published as: CN108270928B

Abstract

An embodiment of the present invention provides a speech recognition method and a mobile terminal. The method includes: when the mobile terminal is in a speech recognition state, determining whether to interrupt speech recognition; if so, saving the first voice information obtained before the interruption and switching from the speech recognition state to a speech recognition interrupted state; when the mobile terminal is in the speech recognition interrupted state, determining whether to restore speech recognition; if so, switching from the speech recognition interrupted state to the speech recognition state and obtaining second voice information; and performing speech recognition on the one or more pieces of first voice information obtained before speech recognition was interrupted and the one or more pieces of second voice information obtained after speech recognition was restored, to obtain the complete semantics. This spares the user from re-entering voice after the voice input is interrupted and improves the user experience of the mobile terminal.

Description

Speech recognition method and mobile terminal

Technical field

Embodiments of the present invention relate to the field of communications, and in particular to a speech recognition method and a mobile terminal.

Background

With the continuous progress of speech recognition technology, a user of a mobile terminal can perform various operations by voice input. Existing speech recognition functions can only recognize one continuous segment of voice input by the user. When the user is interrupted while inputting voice because of special circumstances, for example greeting an acquaintance met by chance, or an incoming call causing the interface to be switched, the voice content input before the interruption is not a complete sentence, so the mobile terminal cannot obtain the semantics of that voice through speech recognition. The user then has to input the voice entered before the interruption all over again, which makes for a poor user experience.

Summary of the invention

Embodiments of the present invention provide a speech recognition method and a mobile terminal, which solve the problem that voice has to be re-entered after the voice input is interrupted.

According to a first aspect of the embodiments of the present invention, a speech recognition method is provided, applied to a mobile terminal, including: when the mobile terminal is in a speech recognition state, determining whether to interrupt speech recognition; if so, saving the first voice information obtained before speech recognition is interrupted, and switching from the speech recognition state to a speech recognition interrupted state; when the mobile terminal is in the speech recognition interrupted state, determining whether to restore speech recognition; if so, switching from the speech recognition interrupted state to the speech recognition state and obtaining second voice information; and performing speech recognition on the one or more pieces of first voice information obtained before speech recognition was interrupted and the one or more pieces of second voice information obtained after speech recognition was restored.

According to a second aspect of the embodiments of the present invention, a mobile terminal is provided, including: a first determining module, configured to determine whether to interrupt speech recognition when the mobile terminal is in a speech recognition state; a saving module, configured to save the first voice information obtained before speech recognition is interrupted when the first determining module determines to interrupt speech recognition; a first switching module, configured to switch from the speech recognition state to a speech recognition interrupted state when the first determining module determines to interrupt speech recognition; a second determining module, configured to determine whether to restore speech recognition when the mobile terminal is in the speech recognition interrupted state; a second switching module, configured to switch from the speech recognition interrupted state to the speech recognition state when the second determining module determines to restore speech recognition; an obtaining module, configured to obtain second voice information when the second determining module determines to restore speech recognition; and a speech recognition module, configured to perform speech recognition on the one or more pieces of first voice information obtained before speech recognition was interrupted and the one or more pieces of second voice information obtained after speech recognition was restored.

According to a third aspect of the embodiments of the present invention, another mobile terminal is provided, including a processor, a memory, and a computer program stored in the memory and executable on the processor, where the computer program, when executed by the processor, implements the steps of the speech recognition method according to the first aspect.

According to a fourth aspect of the embodiments of the present invention, a computer-readable storage medium is provided, where a computer program is stored on the computer-readable storage medium, and the computer program, when executed by a processor, implements the steps of the speech recognition method according to the first aspect.

In this way, when speech recognition is interrupted, the mobile terminal can switch to the speech recognition interrupted state and save the first voice information obtained before the interruption. When speech recognition is restored, it can switch back to the speech recognition state and obtain the second voice information input after the restoration. It then performs speech recognition on the one or more pieces of first voice information obtained before speech recognition was interrupted and the one or more pieces of second voice information obtained after speech recognition was restored, and obtains the complete semantics. This spares the user from re-entering voice after the voice input is interrupted and improves the user experience of the mobile terminal.

Description of the drawings

To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is a first flow diagram of the speech recognition method provided in an embodiment of the present invention;

Fig. 2 is a second flow diagram of the speech recognition method provided in an embodiment of the present invention;

Fig. 3 is a third flow diagram of the speech recognition method provided in an embodiment of the present invention;

Fig. 4 is a fourth flow diagram of the speech recognition method provided in an embodiment of the present invention;

Fig. 5 is a structure diagram of a mobile terminal provided in an embodiment of the present invention;

Fig. 6 is a structure diagram of another mobile terminal provided in an embodiment of the present invention.

Detailed description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are some, rather than all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.

Referring to Fig. 1, an embodiment of the present invention provides a speech recognition method, with the following steps:

Step 101: When the mobile terminal is in the speech recognition state, determine whether to interrupt speech recognition; if so, perform step 102; otherwise, continue to perform step 101;

In the embodiments of the present invention, the mobile terminal may be a mobile phone, a tablet computer, a laptop, a palmtop computer, a vehicle-mounted terminal, a wearable device, or the like.

The speech recognition state indicates that the speech recognition function of the mobile terminal is running. The speech recognition function may be provided by the mobile terminal itself or by a third-party application (such as WeChat or QQ); the embodiment of the present invention does not specifically limit how the speech recognition function is provided.

When the mobile terminal is in the speech recognition state, it may determine in real time whether to interrupt speech recognition. The determination may be based on a first voice instruction input by the user, a first button operation input by the user, a first switching trigger mode of the display interface of the mobile terminal, or the like; the embodiment of the present invention does not specifically limit the manner of determining whether to interrupt speech recognition.

Step 102: Save the first voice information obtained before speech recognition is interrupted, and switch from the speech recognition state to the speech recognition interrupted state;

In the embodiments of the present invention, the first voice information is voice information input by the user through the microphone of the mobile terminal and may be one or more pieces of voice information; the embodiment does not specifically limit its content. The speech recognition interrupted state indicates that the mobile terminal has temporarily suspended the speech recognition function and is waiting for speech recognition to be restored. Note that when the mobile terminal is in the speech recognition interrupted state, its speech recognition function is not turned off; it has merely paused receiving voice.

Step 103: When the mobile terminal is in the speech recognition interrupted state, determine whether to restore speech recognition; if so, perform step 104; otherwise, continue to perform step 103;

In the embodiments of the present invention, when the mobile terminal is in the speech recognition interrupted state, it may determine in real time whether to restore speech recognition. The determination may be based on a second voice instruction input by the user, a second button operation input by the user, a second switching trigger mode of the display interface of the mobile terminal, or the like; the embodiment of the present invention does not specifically limit the manner of determining whether to restore speech recognition.

Step 104: Switch from the speech recognition interrupted state to the speech recognition state and obtain the second voice information; then return to step 101, or perform step 105;

In the embodiments of the present invention, the second voice information is voice information input by the user through the microphone of the mobile terminal and may be one or more pieces of voice information; the embodiment does not specifically limit its content.

Step 105: Perform speech recognition on the one or more pieces of first voice information obtained before speech recognition was interrupted and the one or more pieces of second voice information obtained after speech recognition was restored;

Note that step 105 may be performed after the mobile terminal has performed steps 101 to 104 once or several times. Performing steps 101 to 104 several times means that the user was interrupted several times while using the mobile terminal for speech recognition. The embodiment of the present invention applies to scenarios in which the user is interrupted once or several times.

In the embodiments of the present invention, after speech recognition has been interrupted once or several times, the mobile terminal has the one or more pieces of first voice information obtained before the interruptions and the one or more pieces of second voice information obtained after the restorations. It performs speech recognition on these pieces of first and second voice information and thereby obtains the complete semantics.

For example, the user inputs the following first voice information to the mobile terminal through its microphone: "Please meet in the meeting room at 10 a.m.; during the meeting,". The mobile terminal then detects that speech recognition is interrupted and saves the first voice information. When the mobile terminal detects that speech recognition is restored, it continues by obtaining the following second voice information: "please set your phone to silent, thank you." After the voice input is complete, the mobile terminal performs speech recognition on: "Please meet in the meeting room at 10 a.m.; during the meeting, please set your phone to silent, thank you."

Note that the speech recognition itself may be implemented with existing speech recognition technology; the embodiment of the present invention does not specifically limit the manner of speech recognition.
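The flow of steps 101 to 105 can be read as a two-state machine. The following is a minimal sketch under stated assumptions, not the patented implementation: the names `State`, `Session`, `feed`, `interrupt`, `resume`, `finish` and `join_segments` are hypothetical, text strings stand in for audio, and the recognizer is a placeholder that simply joins the segments.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Callable, List


class State(Enum):
    RECOGNIZING = auto()  # speech recognition state
    INTERRUPTED = auto()  # speech recognition interrupted state


@dataclass
class Session:
    state: State = State.RECOGNIZING
    saved: List[str] = field(default_factory=list)  # segments saved at each interruption
    current: str = ""                               # segment currently being captured

    def feed(self, chunk: str) -> None:
        # Input is accepted only while recognizing; reception is paused when interrupted.
        if self.state is State.RECOGNIZING:
            self.current += chunk

    def interrupt(self) -> None:
        # Step 102: save what was captured so far, then switch to the interrupted state.
        if self.state is State.RECOGNIZING:
            self.saved.append(self.current)
            self.current = ""
            self.state = State.INTERRUPTED

    def resume(self) -> None:
        # Step 104: switch back so that further input can be captured.
        if self.state is State.INTERRUPTED:
            self.state = State.RECOGNIZING

    def finish(self, recognize: Callable[[List[str]], str]) -> str:
        # Step 105: recognize all saved segments plus the latest one, in input order.
        segments = self.saved + ([self.current] if self.current else [])
        return recognize(segments)


def join_segments(segments: List[str]) -> str:
    # Placeholder recognizer (assumption): a real one would process audio.
    return "".join(segments)


session = Session()
session.feed("Please meet in the meeting room at 10 a.m.; during the meeting, ")
session.interrupt()
session.feed("Hi, long time no see!")  # dropped: reception is paused
session.resume()
session.feed("please set your phone to silent, thank you.")
result = session.finish(join_segments)
```

Because `interrupt` appends to a list, the same sketch covers the case where steps 101 to 104 run several times before step 105.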

In the embodiments of the present invention, when the mobile terminal determines to restore speech recognition, it may output prompt information that includes information related to the first voice information. Referring to Fig. 2, an embodiment of the present invention provides another speech recognition method, with the following steps:

Step 201: When the mobile terminal is in the speech recognition state, determine whether to interrupt speech recognition; if so, perform step 202; otherwise, continue to perform step 201;

In the embodiments of the present invention, the speech recognition state indicates that the speech recognition function of the mobile terminal is running. The speech recognition function may be provided by the mobile terminal itself or by a third-party application (such as WeChat or QQ); the embodiment of the present invention does not specifically limit how the speech recognition function is provided.

Step 202: Save the first voice information obtained before speech recognition is interrupted, and switch from the speech recognition state to the speech recognition interrupted state;

In the embodiments of the present invention, the speech recognition interrupted state indicates that the mobile terminal has temporarily suspended the speech recognition function and is waiting for speech recognition to be restored. Note that when the mobile terminal is in the speech recognition interrupted state, its speech recognition function is not turned off; it has merely paused receiving voice.

Step 203: When the mobile terminal is in the speech recognition interrupted state, determine whether to restore speech recognition; if so, perform step 204; otherwise, continue to perform step 203;

Step 204: Output prompt information;

In the embodiments of the present invention, when speech recognition is restored, the mobile terminal may output prompt information that includes information related to the first voice information. The related information may be the content of the first voice information, reminding the user of the voice content already input before speech recognition was interrupted; the embodiment of the present invention does not specifically limit the content of the related information.

Step 205: Switch from the speech recognition interrupted state to the speech recognition state and obtain the second voice information; then return to step 201, or perform step 206;

Step 206: Perform speech recognition on the one or more pieces of first voice information obtained before speech recognition was interrupted and the one or more pieces of second voice information obtained after speech recognition was restored;

Note that step 206 may be performed after the mobile terminal has performed steps 201 to 205 once or several times. Performing steps 201 to 205 several times means that the user was interrupted several times while using the mobile terminal for speech recognition. The embodiment of the present invention applies to scenarios in which the user is interrupted once or several times.

In this way, when speech recognition is interrupted, the mobile terminal can switch to the speech recognition interrupted state and save the first voice information obtained before the interruption. When speech recognition is to be restored, it can switch back to the speech recognition state, and before resuming it can output prompt information that prompts the user to continue the voice input. After speech recognition is restored, it obtains the second voice information, then performs speech recognition on the one or more pieces of first voice information obtained before speech recognition was interrupted and the one or more pieces of second voice information obtained after speech recognition was restored, and obtains the complete semantics. This spares the user from re-entering voice after the voice input is interrupted and improves the user experience of the mobile terminal.
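A prompt of the kind output in step 204 could be assembled as follows. This is only an illustrative sketch: the function name `build_prompt`, the wording of the prompt, and the `max_chars` truncation are assumptions, and it presumes that a text form of the saved first voice information is available (the source leaves the form of the related information open).

```python
from typing import List


def build_prompt(saved_transcripts: List[str], max_chars: int = 60) -> str:
    """Build a resume prompt that reminds the user of what was already said."""
    # Drop empty pieces and join the rest in input order.
    heard = " ".join(t.strip() for t in saved_transcripts if t.strip())
    if not heard:
        return "Speech recognition resumed. Please continue."
    # Assumption: show only the tail of long input, since the user needs
    # the point where they stopped rather than the whole text.
    if len(heard) > max_chars:
        heard = "..." + heard[-max_chars:]
    return f'Speech recognition resumed. So far you said: "{heard}" Please continue.'


prompt = build_prompt(["Please meet at 10 a.m.;", " during the meeting,"])
```

Showing the tail rather than the head of the saved input is a design choice of this sketch; playing back the saved audio would serve the same purpose.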

In the embodiments of the present invention, the mobile terminal may determine whether to interrupt or restore speech recognition from a voice instruction input by the user, a button operation input by the user, or the switching trigger mode of the display interface of the mobile terminal. Referring to Fig. 3, an embodiment of the present invention provides another speech recognition method, with the following steps:

Step 301: Obtain the first voice instruction input by the user, then perform step 304;

Step 302: Obtain the first button operation input by the user, then perform step 305;

In the embodiments of the present invention, the first button operation may be an operation by the user on a physical button or a virtual button; the embodiment does not specifically limit how the first button operation is performed.

Step 303: Obtain the first switching trigger mode of the display interface of the mobile terminal, then perform step 306;

In the embodiments of the present invention, the first switching trigger mode of the display interface of the mobile terminal may be forced. For example, a call comes in while the user is inputting voice and the display interface is forcibly switched to the call interface. The embodiment of the present invention does not specifically limit the first switching trigger mode of the display interface of the mobile terminal.

Step 304: Determine whether to interrupt speech recognition according to whether the first voice instruction matches a first preset voice instruction; if so, perform step 307; otherwise, continue to perform step 301;

In the embodiments of the present invention, if the first voice instruction matches the first preset voice instruction, the mobile terminal determines to interrupt speech recognition. The first preset voice instruction may be a voice instruction preset by the mobile terminal or one set by the user; for example, it may be "wait" or "just a moment". The embodiment of the present invention does not specifically limit the content of the first preset voice instruction.

Step 305: Determine whether to interrupt speech recognition according to whether the first button operation matches a first preset button operation; if so, perform step 307; otherwise, continue to perform step 302;

In the embodiments of the present invention, if the first button operation matches the first preset button operation, the mobile terminal determines to interrupt speech recognition. The first preset button operation may be a button operation preset by the mobile terminal or one set by the user; for example, the user may set the first preset button operation to pressing the physical home button. The embodiment of the present invention does not specifically limit how the first preset button operation is performed.

Step 306: Determine whether to interrupt speech recognition according to whether the first switching trigger mode matches a first preset switching trigger mode; if so, perform step 307; otherwise, continue to perform step 303;

In the embodiments of the present invention, the mobile terminal may preset the first preset switching trigger mode, for example a forced switching mode. If the first switching trigger mode matches the first preset switching trigger mode, the mobile terminal determines to interrupt speech recognition. For example, when a call comes in while the user is inputting voice and the display interface is forcibly switched to the call interface, the mobile terminal determines to interrupt speech recognition. The embodiment of the present invention does not specifically limit the correspondence between the category of the first switching trigger mode and the determination to interrupt speech recognition.

Step 307: Save the first voice information obtained before speech recognition is interrupted, and switch from the speech recognition state to the speech recognition interrupted state; then perform step 308, step 309 or step 310;

In the embodiments of the present invention, when the mobile terminal is in the speech recognition interrupted state, it may determine in real time whether to restore speech recognition; optionally, it performs step 308, step 309 or step 310;

Step 308: Obtain the second voice instruction input by the user, then perform step 311;

Step 309: Obtain the second button operation input by the user, then perform step 312;

In the embodiments of the present invention, the second button operation may be an operation by the user on a physical button or a virtual button; the embodiment does not specifically limit how the second button operation is performed.

Step 310: Obtain the second switching trigger mode of the display interface of the mobile terminal, then perform step 313;

In the embodiments of the present invention, the second switching trigger mode of the display interface of the mobile terminal may be non-forced. For example, the user switches the display interface back to the speech recognition interface by bringing it up from the background. The embodiment of the present invention does not specifically limit the second switching trigger mode of the display interface of the mobile terminal.

Step 311: Determine whether to restore speech recognition according to whether the second voice instruction matches a second preset voice instruction; if so, perform step 314; otherwise, continue to perform step 308;

In the embodiments of the present invention, if the second voice instruction matches the second preset voice instruction, the mobile terminal determines to restore speech recognition. The second preset voice instruction may be a voice instruction preset by the mobile terminal or one set by the user; for example, it may be "I'm back" or "let's continue". The embodiment of the present invention does not specifically limit the content of the second preset voice instruction.

Step 312: Determine whether to restore speech recognition according to whether the second button operation matches a second preset button operation; if so, perform step 314; otherwise, continue to perform step 309;

In the embodiments of the present invention, if the second button operation matches the second preset button operation, the mobile terminal determines to restore speech recognition. The second preset button operation may be a button operation preset by the mobile terminal or one set by the user; for example, the user may set the second preset button operation to pressing the power button and the back button at the same time. The embodiment of the present invention does not specifically limit how the second preset button operation is performed.

Step 313: Determine whether to restore speech recognition according to whether the second switching trigger mode matches a second preset switching trigger mode; if so, perform step 314; otherwise, continue to perform step 310;

In the embodiments of the present invention, the mobile terminal may preset the second preset switching trigger mode, for example an optional (non-forced) switching mode. If the second switching trigger mode matches the second preset switching trigger mode, the mobile terminal determines to restore speech recognition, for example when the user switches the display interface back to the speech recognition interface by bringing it up from the background. The embodiment of the present invention does not specifically limit the correspondence between the category of the second switching trigger mode and the determination to restore speech recognition.

Step 314: Switch from the speech recognition interrupted state to the speech recognition state and obtain the second voice information; then return to step 301, step 302 or step 303, or perform step 315;

Step 315: Perform speech recognition on the one or more pieces of first voice information obtained before speech recognition was interrupted and the one or more pieces of second voice information obtained after speech recognition was restored;

For example, the user inputs the following first voice information to the mobile terminal through its microphone: "Please meet in the meeting room at 10 a.m.; during the meeting,". Taking voice instructions as the trigger, the user then inputs the first voice instruction "wait"; the mobile terminal detects the first voice instruction and saves the first voice information. When the user inputs the second voice instruction "I'm back", the mobile terminal detects the second voice instruction and continues by obtaining the following second voice information: "please set your phone to silent, thank you." After the voice input is complete, the mobile terminal performs speech recognition on: "Please meet in the meeting room at 10 a.m.; during the meeting, please set your phone to silent, thank you." The scenarios in which the mobile terminal determines whether to interrupt or restore speech recognition from a button operation input by the user or from the switching trigger mode of the display interface are similar to the above process and are not described again here.

Note that step 315 may be performed after the mobile terminal has performed steps 301 to 314 once or several times. Performing steps 301 to 314 several times means that the user was interrupted several times while using the mobile terminal for speech recognition. The embodiment of the present invention applies to scenarios in which the user is interrupted once or several times.

In this way, while performing speech recognition, the mobile terminal can interrupt speech recognition according to the first voice instruction input by the user, the first button operation, or the first switching trigger mode of the display interface, and can restore speech recognition according to the second voice instruction input by the user, the second button operation, or the second switching trigger mode of the display interface. It then performs speech recognition on the one or more pieces of first voice information obtained before speech recognition was interrupted and the one or more pieces of second voice information obtained after speech recognition was restored, and obtains the complete semantics. This spares the user from re-entering voice after the voice input is interrupted and improves the user experience of the mobile terminal.
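The matching logic of steps 304 to 306 and 311 to 313 amounts to comparing an observed trigger against a preset set for its kind. The sketch below illustrates this under stated assumptions: `TriggerConfig`, `next_interrupted` and the string labels (`"home"`, `"power+back"`, `"forced"`, `"optional"`) are hypothetical, and the preset values merely echo the examples in the text.

```python
from dataclasses import dataclass
from typing import Set


@dataclass(frozen=True)
class TriggerConfig:
    voice_commands: Set[str]     # preset voice instructions
    button_operations: Set[str]  # preset button operations
    switch_modes: Set[str]       # preset switching trigger modes

    def matches(self, kind: str, value: str) -> bool:
        presets = {
            "voice": self.voice_commands,
            "button": self.button_operations,
            "switch": self.switch_modes,
        }
        return value in presets.get(kind, set())


# First presets: interrupt speech recognition (steps 304 to 306).
INTERRUPT = TriggerConfig({"wait", "just a moment"}, {"home"}, {"forced"})
# Second presets: restore speech recognition (steps 311 to 313).
RESUME = TriggerConfig({"i'm back", "let's continue"}, {"power+back"}, {"optional"})


def next_interrupted(interrupted: bool, kind: str, value: str) -> bool:
    """Return the new 'interrupted' flag after observing one trigger."""
    # Only the presets relevant to the current state are consulted.
    config = RESUME if interrupted else INTERRUPT
    if config.matches(kind, value.strip().lower()):
        return not interrupted
    return interrupted  # no match: keep waiting in the current state
```

For instance, `next_interrupted(False, "voice", "Wait")` yields `True`, while a resume phrase heard in the recognizing state leaves the flag unchanged.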

In the embodiments of the present invention, once the mobile terminal has the one or more pieces of first voice information obtained before speech recognition was interrupted and the one or more pieces of second voice information obtained after speech recognition was restored, it may either first merge the first voice information with the second voice information and then perform speech recognition, or first perform speech recognition on the first voice information and the second voice information separately and then merge the recognition results into the complete semantics. Referring to Fig. 4, an embodiment of the present invention provides yet another speech recognition method, with the following steps:

Step 401: When the mobile terminal is in the speech recognition state, determine whether to interrupt speech recognition; if so, perform step 402; otherwise, continue to perform step 401;

Step 402: Save the first voice information obtained before speech recognition is interrupted, and switch from the speech recognition state to the speech recognition interrupted state;

Step 403: When the mobile terminal is in the speech recognition interrupted state, determine whether to restore speech recognition; if so, perform step 404; otherwise, continue to perform step 403;

Step 404: Switch from the speech recognition interrupted state to the speech recognition state and obtain the second voice information; then return to step 401, or perform step 405 or step 407;

Step 405: Merge the one or more pieces of first voice information and the one or more pieces of second voice information to obtain third voice information;

In the embodiments of the present invention, the mobile terminal uses existing voice processing technology to merge the obtained one or more pieces of first voice information and one or more pieces of second voice information into the third voice information, so that the multiple pieces of voice information input by the user in segments become one complete piece of voice information. Note that the embodiment of the present invention does not specifically limit the manner of merging the first voice information and the second voice information into the third voice information.

Step 406: Perform speech recognition on the third voice information to obtain a first speech recognition result;

Step 407: Perform speech recognition separately on the one or more pieces of first voice information and the one or more pieces of second voice information to obtain multiple second speech recognition results;

In the embodiments of the present invention, the mobile terminal performs speech recognition separately on each of the one or more pieces of first voice information and the one or more pieces of second voice information, and obtains the corresponding multiple second speech recognition results.

Step 408: Perform semantic analysis on the multiple second speech recognition results to obtain a third speech recognition result;

In the embodiments of the present invention, existing semantic analysis technology is used to merge the multiple second speech recognition results into one third speech recognition result; the embodiment does not specifically limit the manner of semantic analysis.

For example, the user inputs the following first voice information to the mobile terminal through its microphone: "Please meet in the meeting room at 10 a.m.; during the meeting,". The mobile terminal then detects that speech recognition is interrupted and saves the first voice information. When the mobile terminal detects that speech recognition is restored, it continues by obtaining the following second voice information: "please set your phone to silent, thank you." After the voice input is complete, the mobile terminal may merge the first voice information and the second voice information into the third voice information, "Please meet in the meeting room at 10 a.m.; during the meeting, please set your phone to silent, thank you.", and then perform speech recognition to obtain the first speech recognition result. Alternatively, it may perform speech recognition on the first voice information and the second voice information separately to obtain multiple second speech recognition results, and then perform semantic analysis on the multiple second speech recognition results to obtain the third speech recognition result.

It should be noted that steps 405 to 408 can be performed after the mobile terminal has performed steps 401 to 404 one or more times. Performing steps 401 to 404 multiple times indicates that the user was interrupted multiple times while using the mobile terminal for speech recognition. The embodiments of the present invention apply to scenarios in which the user is interrupted once or multiple times.

In this way, based on the one or more pieces of first voice information obtained before speech recognition was interrupted and the one or more pieces of second voice information obtained after speech recognition was restored, the mobile terminal can either first merge the first voice information and the second voice information into the third voice information and then recognize it to obtain the first voice recognition result, or first perform speech recognition separately on the first voice information and the second voice information to obtain multiple second voice recognition results and then merge them into the third voice recognition result. Either way, a complete semantic meaning is obtained, the user does not need to re-enter voice because voice input was interrupted, and the user experience of the mobile terminal is improved.

Referring to Fig. 5, an embodiment of the present invention provides a mobile terminal 500, including:

a first determining module 501, configured to determine, when the mobile terminal is in the speech recognition state, whether to interrupt speech recognition;

a saving module 502, configured to save, when the first determining module determines to interrupt speech recognition, the first voice information obtained before speech recognition is interrupted;

a first switching module 503, configured to switch from the speech recognition state to the speech recognition interrupted state when the first determining module determines to interrupt speech recognition;

a second determining module 504, configured to determine, when the mobile terminal is in the speech recognition interrupted state, whether to restore speech recognition;

a second switching module 505, configured to switch from the speech recognition interrupted state to the speech recognition state when the second determining module determines to restore speech recognition;

an acquisition module 506, configured to obtain the second voice information when the second determining module determines to restore speech recognition;

a speech recognition module 507, configured to perform speech recognition on the one or more pieces of first voice information obtained before speech recognition was interrupted and the one or more pieces of second voice information obtained after speech recognition was restored;

Optionally, the first determining module 501 includes:

a first acquiring unit 5011, configured to obtain a first voice instruction input by the user;

a first determination unit 5012, configured to determine whether to interrupt speech recognition according to whether the first voice instruction matches a first preset voice instruction;

and/or a second acquiring unit 5013, configured to obtain a first button operation input by the user;

a second determination unit 5014, configured to determine whether to interrupt speech recognition according to whether the first button operation matches a first preset button operation;

and/or a third acquiring unit 5015, configured to obtain a switching trigger mode of the display interface of the mobile terminal;

a third determination unit 5016, configured to determine whether to interrupt speech recognition according to whether the switching trigger mode matches a first preset switching trigger mode;

Optionally, the second determining module 504 includes:

a fourth acquiring unit 5041, configured to obtain a second voice instruction input by the user;

a fourth determination unit 5042, configured to determine whether to restore speech recognition according to whether the second voice instruction matches a second preset voice instruction;

and/or a fifth acquiring unit 5043, configured to obtain a second button operation input by the user;

a fifth determination unit 5044, configured to determine whether to restore speech recognition according to whether the second button operation matches a second preset button operation;

and/or a sixth acquiring unit 5045, configured to obtain a switching trigger mode of the display interface of the mobile terminal;

a sixth determination unit 5046, configured to determine whether to restore speech recognition according to whether the switching trigger mode matches a second preset switching trigger mode;
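The matching performed by the two determining modules can be sketched as below. This is a minimal sketch under stated assumptions: the `Triggers` type, the `matches_preset` function, and every preset value (the phrases, button names, and interface-switch events) are hypothetical and chosen only for illustration; the embodiment does not specify them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Triggers:
    """One set of preset triggers (all values are hypothetical examples)."""
    voice_instruction: str
    button_operation: str
    switch_trigger: str


# Hypothetical presets for interrupting and for restoring speech recognition.
INTERRUPT = Triggers("pause input", "long_press_volume_down", "incoming_call")
RESUME = Triggers("continue input", "long_press_volume_up", "call_ended")


def matches_preset(
    preset: Triggers,
    *,
    voice: Optional[str] = None,
    button: Optional[str] = None,
    switch: Optional[str] = None,
) -> bool:
    """Return True if any supplied input matches the corresponding preset.

    Mirrors the and/or structure of the units: a voice instruction, a button
    operation, or a display-interface switching trigger is enough on its own.
    """
    if voice is not None and voice.strip().lower() == preset.voice_instruction:
        return True
    if button is not None and button == preset.button_operation:
        return True
    if switch is not None and switch == preset.switch_trigger:
        return True
    return False
```

For instance, `matches_preset(INTERRUPT, switch="incoming_call")` would model an incoming call that switches the display interface and thereby interrupts speech recognition, and `matches_preset(RESUME, voice="continue input")` would model the user asking to continue.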

Optionally, the speech recognition module 507 includes:

a synthesis unit 5071, configured to synthesize the one or more pieces of first voice information and the one or more pieces of second voice information to obtain the third voice information;

a first recognition unit 5072, configured to perform speech recognition on the third voice information to obtain the first voice recognition result;

and/or a second recognition unit 5073, configured to perform speech recognition separately on the one or more pieces of first voice information and the one or more pieces of second voice information to obtain multiple second voice recognition results;

an analysis unit 5074, configured to perform semantic analysis on the multiple second voice recognition results to obtain the third voice recognition result;

Optionally, the mobile terminal further includes:

an output module 508, configured to output a prompt message, where the prompt message includes information related to the first voice information.
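As a rough illustration of the output module, the sketch below builds a prompt message from the first voice information. What counts as "information related to the first voice information" is not specified in this section, so the sketch assumes it is the tail of the text recognized before the interruption; the function name, the wording of the prompt, and the `max_chars` parameter are all hypothetical.

```python
def build_prompt(first_voice_text: str, max_chars: int = 20) -> str:
    """Build a reminder of what was said before the interruption.

    Assumption: the related information is the last `max_chars` characters
    of the text recognized from the first voice information.
    """
    tail = first_voice_text.strip()
    if len(tail) > max_chars:
        # Keep only the end of the utterance, where the user left off.
        tail = "..." + tail[-max_chars:]
    return f'Recognition resumed. You last said: "{tail}"'
```

Showing the end of the interrupted utterance lets the user continue the sentence from where it stopped instead of starting over.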

The mobile terminal provided in this embodiment of the present invention can implement each process implemented by the mobile terminal in the method embodiments of Fig. 1 to Fig. 4. To avoid repetition, details are not described here again.

In this way, the mobile terminal can switch to the speech recognition interrupted state when speech recognition is interrupted and save the first voice information obtained before the interruption, and can switch back to the speech recognition state when speech recognition is restored and obtain the second voice information after restoration. It then performs speech recognition based on the one or more pieces of first voice information obtained before speech recognition was interrupted and the one or more pieces of second voice information obtained after speech recognition was restored, obtaining a complete semantic meaning. This spares the user from re-entering voice because voice input was interrupted and improves the user experience of the mobile terminal.
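The switching between the two states, including repeated interruptions, can be sketched as a small state machine. This is an illustrative sketch only: the `Session` class and its method names are hypothetical, voice information is represented as text chunks, and discarding input that arrives while interrupted is an assumption rather than something the embodiment states.

```python
from enum import Enum, auto
from typing import List


class State(Enum):
    RECOGNIZING = auto()   # speech recognition state
    INTERRUPTED = auto()   # speech recognition interrupted state


class Session:
    """Tracks one voice input that may be interrupted any number of times."""

    def __init__(self) -> None:
        self.state = State.RECOGNIZING
        self.saved: List[str] = []    # pieces saved at each interruption
        self.current: List[str] = []  # chunks of the piece being captured

    def feed(self, chunk: str) -> None:
        # Assumption: input arriving while interrupted is discarded.
        if self.state is State.RECOGNIZING:
            self.current.append(chunk)

    def interrupt(self) -> None:
        if self.state is State.RECOGNIZING:
            if self.current:
                self.saved.append(" ".join(self.current))
            self.current = []
            self.state = State.INTERRUPTED

    def resume(self) -> None:
        if self.state is State.INTERRUPTED:
            self.state = State.RECOGNIZING

    def finish(self) -> List[str]:
        """Return every captured piece, in order, ready for recognition."""
        pieces = list(self.saved)
        if self.current:
            pieces.append(" ".join(self.current))
        return pieces
```

The list returned by `finish()` corresponds to the one or more pieces of first and second voice information that are then either synthesized and recognized once, or recognized separately and merged.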

Fig. 6 is a schematic diagram of the hardware structure of a mobile terminal that implements the embodiments of the present invention. As shown in the figure, the mobile terminal 600 includes but is not limited to components such as a radio frequency unit 601, a network module 602, an audio output unit 603, an input unit 604, a sensor 605, a display unit 606, a user input unit 607, an interface unit 608, a memory 609, a processor 610, and a power supply 611. Those skilled in the art will understand that the mobile terminal structure shown in Fig. 6 does not limit the mobile terminal, which may include more or fewer components than shown, combine certain components, or use a different component arrangement. In the embodiments of the present invention, the mobile terminal includes but is not limited to a mobile phone, a tablet computer, a laptop computer, a palmtop computer, a vehicle-mounted terminal, a wearable device, a pedometer, and the like.

In one embodiment, the processor 610 is configured to: determine, when the mobile terminal is in the speech recognition state, whether to interrupt speech recognition; if so, save the first voice information obtained before speech recognition is interrupted and switch from the speech recognition state to the speech recognition interrupted state; determine, when the mobile terminal is in the speech recognition interrupted state, whether to restore speech recognition; if so, switch from the speech recognition interrupted state to the speech recognition state and obtain the second voice information; and perform speech recognition on the one or more pieces of first voice information obtained before speech recognition was interrupted and the one or more pieces of second voice information obtained after speech recognition was restored.

It should be understood that, in the embodiments of the present invention, the radio frequency unit 601 can be used to receive and send signals while sending and receiving information or during a call. Specifically, it receives downlink data from a base station and passes the data to the processor 610 for processing, and it sends uplink data to the base station. Generally, the radio frequency unit 601 includes but is not limited to an antenna, at least one amplifier, a transceiver, a coupler, a low-noise amplifier, a duplexer, and the like. In addition, the radio frequency unit 601 can communicate with a network and other devices through a wireless communication system.

The mobile terminal provides the user with wireless broadband Internet access through the network module 602, for example helping the user send and receive e-mail, browse web pages, and access streaming media.

The audio output unit 603 can convert audio data received by the radio frequency unit 601 or the network module 602, or stored in the memory 609, into an audio signal and output it as sound. The audio output unit 603 can also provide audio output related to a specific function performed by the mobile terminal 600 (for example, a call signal reception sound or a message reception sound). The audio output unit 603 includes a loudspeaker, a buzzer, a receiver, and the like.

The input unit 604 is used to receive audio or video signals. The input unit 604 may include a graphics processing unit (GPU) 6041 and a microphone 6042. The graphics processor 6041 processes image data of still pictures or video obtained by an image capture device (such as a camera) in video capture mode or image capture mode. The processed image frames can be displayed on the display unit 606. Image frames processed by the graphics processor 6041 can be stored in the memory 609 (or another storage medium) or sent via the radio frequency unit 601 or the network module 602. The microphone 6042 can receive sound and process it into audio data. In telephone call mode, the processed audio data can be converted into a format that can be sent to a mobile communication base station via the radio frequency unit 601 and then output.

The mobile terminal 600 further includes at least one sensor 605, such as an optical sensor, a motion sensor, and other sensors. Specifically, the optical sensor includes an ambient light sensor and a proximity sensor. The ambient light sensor can adjust the brightness of the display panel 6061 according to the brightness of the ambient light, and the proximity sensor can turn off the display panel 6061 and/or the backlight when the mobile terminal 600 is moved to the ear. As one kind of motion sensor, an accelerometer can detect the magnitude of acceleration in all directions (generally three axes) and can detect the magnitude and direction of gravity when stationary. It can be used to identify the posture of the mobile terminal (for example, switching between landscape and portrait, related games, and magnetometer posture calibration) and for functions related to vibration recognition (for example, a pedometer or tap detection). The sensor 605 may also include a fingerprint sensor, a pressure sensor, an iris sensor, a molecular sensor, a gyroscope, a barometer, a hygrometer, a thermometer, an infrared sensor, and the like, which are not described in detail here.

The display unit 606 is used to display information input by the user or information provided to the user. The display unit 606 may include a display panel 6061, which may be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, or the like.

The user input unit 607 can be used to receive input numeric or character information and to generate key signal input related to user settings and function control of the mobile terminal. Specifically, the user input unit 607 includes a touch panel 6071 and other input devices 6072. The touch panel 6071, also called a touch screen, collects touch operations by the user on or near it (for example, operations performed by the user on or near the touch panel 6071 with a finger, a stylus, or any other suitable object or accessory). The touch panel 6071 may include two parts: a touch detection device and a touch controller. The touch detection device detects the position of the user's touch, detects the signal produced by the touch operation, and transmits the signal to the touch controller. The touch controller receives the touch information from the touch detection device, converts it into contact coordinates, sends the coordinates to the processor 610, and receives and executes commands sent by the processor 610. The touch panel 6071 can be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave. In addition to the touch panel 6071, the user input unit 607 may include other input devices 6072. Specifically, the other input devices 6072 may include but are not limited to a physical keyboard, function keys (such as volume control keys and a power key), a trackball, a mouse, and a joystick, which are not described in detail here.

Further, the touch panel 6071 can be overlaid on the display panel 6061. When the touch panel 6071 detects a touch operation on or near it, it transmits the operation to the processor 610 to determine the type of the touch event, and the processor 610 then provides corresponding visual output on the display panel 6061 according to the type of the touch event. Although in Fig. 6 the touch panel 6071 and the display panel 6061 are two independent components that implement the input and output functions of the mobile terminal, in some embodiments the touch panel 6071 and the display panel 6061 can be integrated to implement the input and output functions of the mobile terminal, which is not specifically limited here.

The interface unit 608 is the interface through which an external device is connected to the mobile terminal 600. For example, the external device may include a wired or wireless headset port, an external power supply (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device that has an identification module, an audio input/output (I/O) port, a video I/O port, an earphone port, and the like. The interface unit 608 can be used to receive input (for example, data information or power) from an external device and transmit the received input to one or more elements in the mobile terminal 600, or can be used to transfer data between the mobile terminal 600 and an external device.

The memory 609 can be used to store software programs and various data. The memory 609 may mainly include a program storage area and a data storage area. The program storage area can store an operating system, application programs required by at least one function (such as a sound playback function and an image playback function), and the like. The data storage area can store data created through use of the mobile phone (such as audio data and a phone book). In addition, the memory 609 may include high-speed random access memory and may also include non-volatile memory, for example at least one magnetic disk storage device, a flash memory device, or other volatile solid-state storage devices.

The processor 610 is the control center of the mobile terminal. It uses various interfaces and lines to connect the parts of the entire mobile terminal, and it performs the various functions of the mobile terminal and processes data by running or executing the software programs and/or modules stored in the memory 609 and calling the data stored in the memory 609, thereby monitoring the mobile terminal as a whole. The processor 610 may include one or more processing units. Preferably, the processor 610 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, application programs, and the like, and the modem processor mainly handles wireless communication. It can be understood that the modem processor may also not be integrated into the processor 610.

The mobile terminal 600 may also include a power supply 611 (such as a battery) that supplies power to the components. Preferably, the power supply 611 may be logically connected to the processor 610 through a power management system, so that functions such as charging management, discharging management, and power consumption management are implemented through the power management system.

In addition, the mobile terminal 600 includes some function modules that are not shown, which are not described in detail here.

Preferably, an embodiment of the present invention further provides a mobile terminal including a processor 610, a memory 609, and a computer program stored in the memory 609 and executable on the processor 610. When executed by the processor 610, the computer program implements each process of the above speech recognition method embodiments and can achieve the same technical effect. To avoid repetition, details are not described here again.

An embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored. When executed by a processor, the computer program implements each process of the above speech recognition method embodiments and can achieve the same technical effect. To avoid repetition, details are not described here again. The computer-readable storage medium is, for example, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

It should be noted that, in this document, the terms "comprise", "include", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements that are not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.

The above are only specific embodiments of the present application, but the protection scope of the present application is not limited thereto. Any change or replacement within the technical scope disclosed in the present application shall be covered by the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims

1. A speech recognition method, applied to a mobile terminal, characterized in that the method comprises:

determining, when the mobile terminal is in a speech recognition state, whether to interrupt speech recognition;

if so, saving first voice information obtained before speech recognition is interrupted, and switching from the speech recognition state to a speech recognition interrupted state;

determining, when the mobile terminal is in the speech recognition interrupted state, whether to restore speech recognition;

if so, switching from the speech recognition interrupted state to the speech recognition state, and obtaining second voice information;

performing speech recognition on the one or more pieces of first voice information obtained before speech recognition was interrupted and the one or more pieces of second voice information obtained after speech recognition was restored.

2. The method according to claim 1, characterized in that the determining whether to interrupt speech recognition comprises:

obtaining a first voice instruction input by a user;

determining whether to interrupt speech recognition according to whether the first voice instruction matches a first preset voice instruction;

or,

obtaining a first button operation input by the user;

determining whether to interrupt speech recognition according to whether the first button operation matches a first preset button operation;

or,

obtaining a first switching trigger mode of a display interface of the mobile terminal;

determining whether to interrupt speech recognition according to whether the first switching trigger mode matches a first preset switching trigger mode.

3. The method according to claim 1, characterized in that the determining whether to restore speech recognition comprises:

obtaining a second voice instruction input by a user;

determining whether to restore speech recognition according to whether the second voice instruction matches a second preset voice instruction;

or,

obtaining a second button operation input by the user;

determining whether to restore speech recognition according to whether the second button operation matches a second preset button operation;

or,

obtaining a second switching trigger mode of a display interface of the mobile terminal;

determining whether to restore speech recognition according to whether the second switching trigger mode matches a second preset switching trigger mode.

4. The method according to claim 1, characterized in that the performing speech recognition on the one or more pieces of first voice information obtained before speech recognition was interrupted and the one or more pieces of second voice information obtained after speech recognition was restored comprises:

synthesizing the one or more pieces of first voice information and the one or more pieces of second voice information to obtain third voice information;

performing speech recognition on the third voice information to obtain a first voice recognition result;

or,

performing speech recognition separately on the one or more pieces of first voice information and the one or more pieces of second voice information to obtain multiple second voice recognition results;

performing semantic analysis on the multiple second voice recognition results to obtain a third voice recognition result.

5. The method according to claim 1, characterized in that, before the step of obtaining second voice information, the method further comprises:

outputting a prompt message, wherein the prompt message includes information related to the first voice information.

6. A mobile terminal, characterized by comprising:

a first determining module, configured to determine, when the mobile terminal is in a speech recognition state, whether to interrupt speech recognition;

a saving module, configured to save, when the first determining module determines to interrupt speech recognition, first voice information obtained before speech recognition is interrupted;

a first switching module, configured to switch from the speech recognition state to a speech recognition interrupted state when the first determining module determines to interrupt speech recognition;

a second determining module, configured to determine, when the mobile terminal is in the speech recognition interrupted state, whether to restore speech recognition;

a second switching module, configured to switch from the speech recognition interrupted state to the speech recognition state when the second determining module determines to restore speech recognition;

an acquisition module, configured to obtain second voice information when the second determining module determines to restore speech recognition;

a speech recognition module, configured to perform speech recognition on the one or more pieces of first voice information obtained before speech recognition was interrupted and the one or more pieces of second voice information obtained after speech recognition was restored.

7. The mobile terminal according to claim 6, characterized in that the first determining module comprises:

a first acquiring unit, configured to obtain a first voice instruction input by a user;

a first determination unit, configured to determine whether to interrupt speech recognition according to whether the first voice instruction matches a first preset voice instruction;

and/or

a second acquiring unit, configured to obtain a first button operation input by the user;

a second determination unit, configured to determine whether to interrupt speech recognition according to whether the first button operation matches a first preset button operation;

and/or

a third acquiring unit, configured to obtain a switching trigger mode of a display interface of the mobile terminal;

a third determination unit, configured to determine whether to interrupt speech recognition according to whether the switching trigger mode matches a first preset switching trigger mode.

8. The mobile terminal according to claim 6, characterized in that the second determining module comprises:

a fourth acquiring unit, configured to obtain a second voice instruction input by a user;

a fourth determination unit, configured to determine whether to restore speech recognition according to whether the second voice instruction matches a second preset voice instruction;

and/or

a fifth acquiring unit, configured to obtain a second button operation input by the user;

a fifth determination unit, configured to determine whether to restore speech recognition according to whether the second button operation matches a second preset button operation;

and/or

a sixth acquiring unit, configured to obtain a switching trigger mode of a display interface of the mobile terminal;

a sixth determination unit, configured to determine whether to restore speech recognition according to whether the switching trigger mode matches a second preset switching trigger mode.

9. The mobile terminal according to claim 6, characterized in that the speech recognition module comprises:

a synthesis unit, configured to synthesize the one or more pieces of first voice information and the one or more pieces of second voice information to obtain third voice information;

a first recognition unit, configured to perform speech recognition on the third voice information to obtain a first voice recognition result;

and/or

a second recognition unit, configured to perform speech recognition separately on the one or more pieces of first voice information and the one or more pieces of second voice information to obtain multiple second voice recognition results;

an analysis unit, configured to perform semantic analysis on the multiple second voice recognition results to obtain a third voice recognition result.

10. The mobile terminal according to claim 6, characterized in that the mobile terminal further comprises:

an output module, configured to output a prompt message, wherein the prompt message includes information related to the first voice information.

11. A mobile terminal, characterized by comprising a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the speech recognition method according to any one of claims 1 to 5.