CN107230478A - Voice information processing method and system - Google Patents
- Publication number: CN107230478A (application CN201710302993.1A)
- Authority: CN (China)
- Prior art keywords: speech recognition, submodule, voice, obtains, module
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/02—Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M1/00—Substation equipment, e.g. for use by subscribers
- H04M1/64—Automatic arrangements for answering calls; Automatic arrangements for recording messages for absent subscribers; Arrangements for recording conversations
- H04M1/65—Recording arrangements for recording a message from the calling party
- H04M1/656—Recording arrangements for recording a message from the calling party for recording conversations
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Telephonic Communication Services (AREA)
Abstract
The invention provides a voice information processing method and system. The method includes: S100, acquiring the user's voice information; S200, segmenting the voice information to obtain multiple speech segments; S300, recognizing the speech segments to obtain corresponding speech recognition fragments; S400, processing the speech recognition fragments to obtain a speech recognition result. The system includes: an acquisition module, which acquires the user's voice information; an interception module, which segments the voice information to obtain multiple speech segments; a recognition module, which recognizes the speech segments to obtain corresponding speech recognition fragments; and a first processing module, which processes the speech recognition fragments to obtain a speech recognition result. The invention performs speech recognition while the voice is being recorded, which reduces the time a user would otherwise wait, after recording is complete, for recognition to run and the result to be output. It shortens the recording delay without affecting the normal recognition result and improves the user experience.
Description
Technical field
The present invention relates to the technical field of speech recognition, and in particular to a voice information processing method and system.
Background
With the rapid development of communication technology, speech recognition is used ever more widely, and network communication tools such as WeChat and Tencent QQ have gradually become some of the main tools for everyday communication. Voice messages in particular are popular with users because they are simple and convenient to use. On current intelligent terminals such as mobile phones and computers, communication tools can provide voice input and output functions.
In the prior art, speech recognition schemes do not take into account how long it takes before recognition starts. When the speech to be recognized is short, the user's waiting time is comparatively long; when it is long, not only is the waiting time very long, but the recognition is also incomplete, which seriously affects the user's needs. Moreover, the prior art sends the recording to the speech recognition module only after voice recording has ended, so the total time is the recording length plus the recognition time. This causes unnecessary waiting, wastes time, and degrades the user experience.
Summary of the invention
An object of the invention is to provide a voice information processing method and system that performs speech recognition while the voice is being recorded, reducing the time the user waits after voice recording is complete.
The technical solution provided by the present invention is as follows:
A voice information processing method, including the steps: S100, while the user is recording, periodically collecting and recognizing the user's voice information to obtain speech recognition fragments; S200, processing the speech recognition fragments to obtain a speech recognition result.
The present invention performs speech recognition during voice recording, so the user no longer has to wait until recording is complete for recognition to run and the recognition result to be output. This shortens the recording delay without affecting the normal recognition result and improves the user experience.
Further, step S100 includes the steps: S110, while the user is recording, collecting the user's voice information according to a preset collection rule to obtain the current speech segment; S120, recognizing the current speech segment according to a speech recognition library to obtain a speech recognition fragment; S130, obtaining the next speech segment and repeating steps S110-S130 until the user ends the recording. Here, the preset collection rule is collection at equal time intervals.
Further, S110 also includes the steps: S111, judging whether the current speech segment is a blank speech segment; if so, performing step S112; otherwise, performing step S120; S112, deleting the current speech segment and performing step S130.
Further, step S200 includes the step: S210, sorting and integrating the speech recognition fragments in order of collection time to obtain the speech recognition result.
Further, step S200 also includes the step: S220, outputting the speech recognition fragments in order of collection time to obtain the speech recognition result.
The present invention also provides a voice information processing system, including a control module and a processing module. The processing module is communicatively connected to the control module. The control module periodically collects and recognizes the user's voice information while the user is recording, obtaining speech recognition fragments. The processing module processes the speech recognition fragments obtained by the control module to obtain a speech recognition result.
Further, the control module includes a collection submodule and a recognition submodule, which are communicatively connected to each other. The collection submodule, while the user is recording, collects the user's voice information according to the preset collection rule, obtains the current speech segment, and sends it to the recognition submodule. The recognition submodule receives the current speech segment sent by the collection submodule and recognizes it according to the speech recognition library to obtain a speech recognition fragment. The collection submodule also obtains the next speech segment and sends it to the recognition submodule, until the user ends the recording. The recognition submodule also receives the next speech segment sent by the collection submodule and recognizes it according to the speech recognition library to obtain a speech recognition fragment, until the user ends the recording. Here, the preset collection rule is collection at equal time intervals.
Further, the control module also includes a judging submodule and a deletion submodule. The judging submodule is communicatively connected to the collection submodule, the deletion submodule, and the recognition submodule. The judging submodule judges whether the current speech segment is a blank speech segment; if so, it sends the result that the current speech segment is blank to the deletion submodule; otherwise, it sends the result that the current speech segment is not blank to the recognition submodule. The deletion submodule receives the judgment result sent by the judging submodule and deletes the current speech segment.
Further, the processing module includes a sorting submodule, which is communicatively connected to the control module. The sorting submodule sorts and integrates the speech recognition fragments in order of collection time to obtain the speech recognition result.
Further, the processing module also includes an output submodule, which is communicatively connected to the control module. The output submodule outputs the speech recognition fragments in order of collection time to obtain the speech recognition result.
The voice information processing method and system provided by the present invention can bring at least one of the following beneficial effects:
1. During recording, the speech segments collected from the recording are recognized as they are obtained. Compared with traditional speech recognition, the recognition result is produced faster, reducing the time the user waits for voice entry and speech recognition.
2. The present invention collects voice information through a FIFO queue and performs speech recognition through the FIFO queue. (FIFO is short for First In, First Out. A FIFO queue is a traditional sequential-execution structure, a first-in, first-out data buffer: the item that enters first is completed and retired before the second item is handled.) For a longer recording, this not only effectively reduces the waiting time for voice recording and speech recognition but also allows complete speech recognition.
3. The present invention performs speech recognition during voice recording, solving the problem that speech recognition could only be carried out after voice recording was complete.
4. The present invention shortens the recording delay without affecting the normal recognition result, improving the user experience.
5. The present invention can delete invalid speech segments, helping the user obtain speech recognition results more quickly.
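As a rough illustration of effect 1, the wait after the user stops speaking can be compared under a simple timing model. The model is an assumption made for this sketch and does not come from the patent: recognition time is taken to be proportional to audio length (`recog_ratio`), and recognition is assumed to keep up with recording. The function names are hypothetical.

```python
def wait_conventional(record_s: float, recog_ratio: float) -> float:
    """Recognition starts only after recording ends, so the user waits
    for the whole recording to be recognized."""
    return record_s * recog_ratio


def wait_segmented(record_s: float, interval_s: float, recog_ratio: float) -> float:
    """Segments are recognized while recording continues. If recognition
    keeps up (recog_ratio <= 1), at most one segment is still pending
    when recording ends, so the wait is bounded by that segment."""
    if recog_ratio > 1:
        raise ValueError("model assumes recognition keeps up with recording")
    pending_s = min(interval_s, record_s)  # upper bound on the last segment
    return pending_s * recog_ratio


# Assumed numbers: a 60 s recording, 1 s segments, recognition at 0.5x real time.
print(wait_conventional(60.0, 0.5))    # 30.0
print(wait_segmented(60.0, 1.0, 0.5))  # 0.5
```

Under these assumed numbers the remaining wait drops from the recognition time of the whole recording to that of a single segment; real figures depend on the recognizer.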
Brief description of the drawings
The above characteristics, technical features, advantages, and implementations of the voice information processing method and system are further described below, in a clear and easily understood manner, through preferred embodiments described with reference to the drawings.
Fig. 1 is a flow chart of one embodiment of the voice information processing method of the present invention;
Fig. 2 is a flow chart of another embodiment of the voice information processing method of the present invention;
Fig. 3 is a flow chart of another embodiment of the voice information processing method of the present invention;
Fig. 4 is a flow chart of another embodiment of the voice information processing method of the present invention;
Fig. 5 is a structural diagram of one embodiment of the voice information processing system of the present invention;
Fig. 6 is a structural diagram of another embodiment of the voice information processing system of the present invention;
Fig. 7 is a structural diagram of another embodiment of the voice information processing system of the present invention;
Fig. 8 is a structural diagram of another embodiment of the voice information processing system of the present invention;
Fig. 9 is a flow chart of an example of the voice information processing method of the present invention.
Detailed description
In order to explain the embodiments of the present invention or the technical solutions of the prior art more clearly, specific embodiments of the present invention are described below with reference to the drawings. Obviously, the drawings in the following description are only some embodiments of the present invention; those of ordinary skill in the art can derive other drawings, and other embodiments, from them without creative effort.
To keep the drawings simple, each figure only schematically shows the parts related to the present invention; the figures do not represent the actual structure of a product. In addition, to keep the drawings simple and easy to understand, where several parts in a figure have the same structure or function, only one of them is drawn or labeled. Herein, "one" does not only mean "only one"; it can also mean "more than one".
With reference to Fig. 1, the present invention provides one embodiment of a voice information processing method, including:
S110, while the user is recording, periodically collecting and recognizing the user's voice information to obtain speech recognition fragments;
S120, processing the speech recognition fragments to obtain a speech recognition result.
In this embodiment, speech recognition is performed during voice recording, so the user no longer has to wait until recording is complete for recognition to run and the recognition result to be output. This shortens the recording delay without affecting the normal recognition result and improves the user experience.
With reference to Fig. 2, the present invention provides another embodiment of a voice information processing method, including:
S210, while the user is recording, collecting the user's voice information according to a preset collection rule to obtain the current speech segment;
S220, recognizing the current speech segment according to a speech recognition library to obtain a speech recognition fragment;
S230, obtaining the next speech segment and repeating steps S210-S230 until the user ends the recording;
S240, sorting and integrating the speech recognition fragments in order of collection time to obtain the speech recognition result.
Here, the preset collection rule is collection at equal time intervals.
In this embodiment, there are many prior-art ways to build the speech recognition library, which are not described in detail here. During recording, each speech segment collected from the recording is recognized as it is obtained; compared with traditional speech recognition, the recognition result is produced faster, which reduces the time the user waits for voice entry and speech recognition. Voice information is collected through a FIFO queue, and speech recognition is performed through the FIFO queue. For a shorter recording, the speech recognition module does not have to wait until a recognition start time is reached before it can begin, which avoids unnecessary waiting. For a longer recording, this not only effectively reduces the waiting time for voice recording and speech recognition but also allows complete speech recognition. Users can set the preset collection rule according to their own preferences and needs. Avoiding unnecessary waiting saves time and improves the user experience.
For example, user A sets the collection rule to capture the voice information every 1 s during recording. After user A starts recording, this rule yields the speech segment Y1 for the first second, the speech segment Y2 for the second second, ..., and the speech segment Yn for the n-th second. As soon as Y1 is collected, the speech recognition module recognizes it and obtains speech recognition fragment S1; as soon as Y2 is collected, the module recognizes it and obtains speech recognition fragment S2; and so on. During recording, each speech segment is recognized immediately after it is collected, and the resulting speech recognition fragment is saved. The fragments are arranged in the order in which their segments were collected, so a nearly complete speech recognition result is available almost immediately after the recording ends, improving the efficiency of speech recognition.
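The 1 s example above (segments Y1..Yn recognized into fragments S1..Sn and then arranged by collection time) can be sketched as follows. This is a minimal sketch under stated assumptions, not the patent's implementation: the recording is simulated by a pre-filled sample buffer, the recognizer is a caller-supplied stand-in for the speech recognition library, and all names (`split_fixed_interval`, `recognize_in_order`) are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def split_fixed_interval(samples: Sequence[int], sample_rate: int,
                         interval_s: float) -> List[Tuple[int, Sequence[int]]]:
    """Cut the samples into equal-length segments, each tagged with its
    collection index (0 for Y1, 1 for Y2, ...)."""
    step = int(sample_rate * interval_s)
    if step <= 0:
        raise ValueError("interval too short for this sample rate")
    return [(i // step, samples[i:i + step])
            for i in range(0, len(samples), step)]


def recognize_in_order(segments: List[Tuple[int, Sequence[int]]],
                       recognize: Callable[[Sequence[int]], str]) -> str:
    """Recognize every segment (S220), then sort and join the fragments
    by collection index (S240)."""
    fragments = [(index, recognize(segment)) for index, segment in segments]
    fragments.sort(key=lambda pair: pair[0])
    return " ".join(text for _, text in fragments)


# Toy data: 6 samples at 2 samples per second, so three 1 s segments.
segments = split_fixed_interval(list(range(6)), sample_rate=2, interval_s=1.0)
result = recognize_in_order(segments, recognize=lambda seg: f"w{seg[0]}")
print(result)  # w0 w2 w4
```

In a live system the segments would arrive one at a time while recording continues; the explicit sort matters when fragments can finish out of order.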
The technology in this embodiment can be applied to fields including indoor equipment control and voice dialogue robots. Because speech recognition is performed while the voice is being recorded, it solves the problem that recognition could only start after recording was complete, and it shortens the recording delay without affecting the normal recognition result. The user's voice command is quickly converted into a speech recognition command and input to the smart home device or intelligent robot, which can then be controlled more conveniently and quickly according to the recognized command. The user does not need to operate by hand, and voice operation is faster than manual operation, which improves the user experience. This also avoids the situation, seen for example on shopping platforms such as Taobao, in which users prefer to transfer to manual customer service because speech recognition is inefficient; it raises the utilization of speech recognition, reduces wasted voice-service resources, reduces the workload of human customer service staff, and lowers labor costs. The embodiment can also be applied to voice search systems. Baidu voice search, for example, is a search mode in which the user speaks the search intent, such as "what will the weather be like tomorrow" or "how to make spicy diced chicken with peanuts". While the user is speaking, the speech is acquired and recognized at the same time, so the embodiment can obtain the desired result immediately and output the text version, such as "what will the weather be like tomorrow" or "how to make spicy diced chicken with peanuts". Voice search spares the user the trouble of typing and makes the whole search process smoother and more convenient.
With reference to Fig. 3, the present invention provides another embodiment of a voice information processing method, including:
S310, while the user is recording, collecting the user's voice information according to a preset collection rule to obtain the current speech segment;
S320, recognizing the current speech segment according to a speech recognition library to obtain a speech recognition fragment;
S330, outputting the speech recognition fragments in order of collection time to obtain the speech recognition result;
S340, obtaining the next speech segment and repeating steps S310-S330 until the user ends the recording.
Here, the preset collection rule is collection at equal time intervals.
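Unlike the Fig. 2 variant, this embodiment outputs each fragment as soon as it is recognized rather than integrating everything at the end. One way to sketch that behavior is a generator. The segment source and the recognizer below are hypothetical stand-ins, and the name `stream_fragments` is an assumption of this sketch.

```python
from typing import Callable, Iterable, Iterator, Tuple


def stream_fragments(segments: Iterable[bytes],
                     recognize: Callable[[bytes], str]) -> Iterator[Tuple[int, str]]:
    """Yield (collection index, text) for each segment as soon as it has
    been recognized, so output can start before recording ends."""
    for index, segment in enumerate(segments):
        yield index, recognize(segment)


def fake_microphone() -> Iterator[bytes]:
    # Stand-in for fixed-interval capture; a real source would block
    # until the next interval has been recorded.
    yield b"hel"
    yield b"lo"


shown = []
for index, text in stream_fragments(fake_microphone(), lambda seg: seg.decode()):
    shown.append(text)  # e.g. append to the text shown to the user

print(shown)  # ['hel', 'lo']
```

Because the generator pulls one segment at a time, fragments are emitted in collection order without a separate sorting step.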
In this embodiment, during recording, each speech segment collected from the recording is recognized as it is obtained, so recognition is fast and the user's waiting time is reduced. Voice information is collected through a FIFO queue and recognized through the FIFO queue; for a longer recording, this not only effectively reduces the waiting time for voice recording and speech recognition but also allows complete speech recognition. For example, suppose the effective duration for typical speech recognition is 30 s. If user B speaks without pause and records for 60 s, then with the conventional approach the long recording not only leads to a long wait, but the voice information is also too long for the speech recognition module to recognize the content of user B's recording completely.
The embodiment can also be applied to fields such as voice dialing, voice navigation, and dictation data entry. For example, in dictation data entry, as the user speaks, the speech recognition module immediately outputs what the user says in the entry field. Specifically, after recording starts, the collection rule set by user B yields the speech segment X1 for the first 0.5 s, the speech segment X2 for the second 0.5 s, ..., and the speech segment Xn for the n-th 0.5 s. As soon as X1 is collected, the speech recognition module recognizes it and obtains speech recognition fragment B1, and so on. During recording, each speech segment is recognized immediately after it is collected to obtain the corresponding speech recognition fragment, and the fragments are output in order of collection time to obtain the speech recognition result. If user B finds that part of the text in the entry field differs from what was said, the wrongly recognized part can be located by its position in the time order and recognized again.
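Because segments are collected at equal intervals, a point in time maps directly to a segment index, which is what makes re-recognition of a single wrong part possible. The sketch below assumes the collected segments are kept in memory so they can be recognized again; that storage, the helper names, and the recognizer are hypothetical.

```python
import math
from typing import Callable, List, Sequence


def segment_index_at(time_s: float, interval_s: float) -> int:
    """Index of the fixed-interval segment that contains time_s."""
    if time_s < 0 or interval_s <= 0:
        raise ValueError("time must be >= 0 and interval must be > 0")
    return math.floor(time_s / interval_s)


def re_recognize(fragments: List[str], stored_segments: Sequence[bytes],
                 time_s: float, interval_s: float,
                 recognize: Callable[[bytes], str]) -> List[str]:
    """Return a copy of fragments in which the fragment covering time_s
    has been recognized again from its stored segment."""
    index = segment_index_at(time_s, interval_s)
    updated = list(fragments)
    updated[index] = recognize(stored_segments[index])
    return updated


# 0.5 s segments: 0.7 s into the recording falls in the second segment.
fixed = re_recognize(["a", "x", "c"], [b"a", b"b", b"c"],
                     time_s=0.7, interval_s=0.5,
                     recognize=lambda seg: seg.decode())
print(fixed)  # ['a', 'b', 'c']
```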
With reference to Fig. 4, the present invention provides another embodiment of a voice information processing method, including:
S410, while the user is recording, collecting the user's voice information according to a preset collection rule to obtain the current speech segment;
S420, judging whether the current speech segment is a blank speech segment; if so, performing step S430; otherwise, performing step S440;
S430, deleting the current speech segment and performing step S450;
S440, recognizing the current speech segment according to a speech recognition library to obtain a speech recognition fragment;
S450, obtaining the next speech segment and repeating steps S410-S450 until the user ends the recording.
Here, the preset collection rule is collection at equal time intervals.
In this embodiment, invalid speech segments can be deleted, helping the user obtain speech recognition results more quickly. In the preprocessing before speech recognition, techniques based on, for example, the frequency and fluctuation of sound-wave changes while the user speaks can identify which parts of the user's voice information are valid speech and which are invalid, mark the time points of blank speech, and remove the invalid parts, that is, the blank speech segments. For example, assume that user C's voice information is captured according to a 2 s collection rule, that user C starts speaking at 14:30, and that user C does not speak during 14:33-14:36, so a silence of 3 s is detected. Under the collection rule of this embodiment, the speech segment captured for 14:33-14:35 is a blank speech segment and is marked. That part of the original voice information can then be considered invalid, and the speech recognition module does not need to perform speech recognition on it.
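The blank-segment branch (S420 to S440) can be sketched as below. The patent refers to sound-wave change frequency and fluctuation without giving a concrete test, so this sketch swaps in a simple mean-absolute-amplitude threshold as a stand-in; the threshold value, the function names, and the recognizer are all assumptions.

```python
from typing import Callable, Iterable, List, Sequence


def is_blank(segment: Sequence[float], threshold: float = 0.01) -> bool:
    """Treat a segment as blank when its mean absolute amplitude is
    below the threshold (a stand-in for the patent's unspecified test)."""
    if not segment:
        return True
    return sum(abs(x) for x in segment) / len(segment) < threshold


def recognize_non_blank(segments: Iterable[Sequence[float]],
                        recognize: Callable[[Sequence[float]], str]) -> List[str]:
    """Drop blank segments (S430) and recognize the rest (S440)."""
    fragments = []
    for segment in segments:
        if is_blank(segment):
            continue  # deleted: never sent to the recognizer
        fragments.append(recognize(segment))
    return fragments


recording = [[0.5, -0.5], [0.0, 0.0], [0.3, 0.2, 0.1]]
print(recognize_non_blank(recording, lambda seg: f"n{len(seg)}"))  # ['n2', 'n3']
```

Skipping the recognizer for silent segments is where the time saving comes from, since the recognition step is typically the expensive one.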
By using speech recognition technology, this embodiment can reduce key-press input and enhance interactivity with the user. By using a FIFO queue, multiple microphones can share one speech recognition engine, which improves engine utilization.
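A minimal sketch of several microphones sharing one recognition engine through a FIFO queue is shown below, using Python's standard `queue.Queue`. The job format `(mic_id, segment)`, the `None` sentinel, and the function name are assumptions of this sketch, and the recognizer is a stand-in.

```python
import queue
import threading
from typing import Callable, Dict, List


def run_shared_engine(jobs: queue.Queue,
                      recognize: Callable[[bytes], str],
                      results: Dict[str, List[str]]) -> None:
    """Single worker: take (mic_id, segment) jobs in arrival order and
    recognize them one by one until a None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:  # sentinel: all microphones have finished
            break
        mic_id, segment = job
        results.setdefault(mic_id, []).append(recognize(segment))


jobs: queue.Queue = queue.Queue()
results: Dict[str, List[str]] = {}
worker = threading.Thread(target=run_shared_engine,
                          args=(jobs, lambda seg: seg.decode(), results))
worker.start()

# Segments from two microphones interleave in the same queue.
for job in [("mic1", b"turn"), ("mic2", b"play"),
            ("mic1", b"left"), ("mic2", b"music")]:
    jobs.put(job)
jobs.put(None)
worker.join()

print(results)  # {'mic1': ['turn', 'left'], 'mic2': ['play', 'music']}
```

With a single consumer, the queue's first-in, first-out order keeps each microphone's fragments in collection order without any per-microphone locking.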
With reference to Fig. 5, the present invention provides one embodiment of a voice information processing system 1000, including a control module and a processing module; the processing module is communicatively connected to the control module;
The control module periodically collects and recognizes the user's voice information while the user is recording, obtaining speech recognition fragments;
The processing module processes the speech recognition fragments obtained by the control module to obtain a speech recognition result.
In this embodiment, speech recognition is performed during voice recording, so the user no longer has to wait until recording is complete for recognition to run and the recognition result to be output. This shortens the recording delay without affecting the normal recognition result and improves the user experience.
With reference to Fig. 6, the parts identical to the previous embodiment are not repeated here. The present invention provides another embodiment of a voice information processing system 1000, in which the control module includes a collection submodule and a recognition submodule, communicatively connected to each other, and the processing module includes a sorting submodule, communicatively connected to the control module;
The collection submodule, while the user is recording, collects the user's voice information according to the preset collection rule, obtains the current speech segment, and sends the current speech segment to the recognition submodule;
The recognition submodule receives the current speech segment sent by the collection submodule and recognizes it according to the speech recognition library to obtain a speech recognition fragment;
The collection submodule also obtains the next speech segment and sends it to the recognition submodule, until the user ends the recording;
The recognition submodule also receives the next speech segment sent by the collection submodule and recognizes it according to the speech recognition library to obtain a speech recognition fragment, until the user ends the recording;
The sorting submodule sorts and integrates the speech recognition fragments in order of collection time to obtain the speech recognition result;
Here, the preset collection rule is collection at equal time intervals.
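The module layout of this embodiment can be pictured with two small classes. This is a structural sketch only: the class and method names are hypothetical, the recognizer stands in for the speech recognition library, and the communication link between modules is reduced to a plain method call.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class ControlModule:
    """Collection and recognition submodules combined."""
    recognize: Callable[[bytes], str]
    fragments: List[Tuple[float, str]] = field(default_factory=list)

    def on_segment(self, collected_at: float, segment: bytes) -> None:
        # The collection submodule hands each segment to the recognition
        # submodule; the fragment is stored with its collection time.
        self.fragments.append((collected_at, self.recognize(segment)))


class ProcessingModule:
    """Sorting submodule: integrate fragments in order of collection time."""

    def integrate(self, fragments: List[Tuple[float, str]]) -> str:
        ordered = sorted(fragments, key=lambda item: item[0])
        return "".join(text for _, text in ordered)


control = ControlModule(recognize=lambda seg: seg.decode())
control.on_segment(1.0, b"lo")   # fragments may arrive out of order
control.on_segment(0.0, b"hel")
print(ProcessingModule().integrate(control.fragments))  # hello
```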
In this embodiment, there are many prior-art ways to build the speech recognition library, which are not described in detail here. During recording, each speech segment collected from the recording is recognized as it is obtained; compared with traditional speech recognition, the recognition result is produced faster, which reduces the time the user waits for voice entry and speech recognition. Voice information is collected through a FIFO queue and recognized through the FIFO queue; for a longer recording, this not only effectively reduces the waiting time for voice recording and speech recognition but also allows complete speech recognition. Users can set the preset collection rule according to their own preferences and needs. Avoiding unnecessary waiting saves time and improves the user experience. The technology in this embodiment can be applied to fields including indoor equipment control and voice dialogue robots. Because speech recognition is performed while the voice is being recorded, it solves the problem that recognition could only start after recording was complete, and it shortens the recording delay without affecting the normal recognition result. The user's voice command is quickly converted into a speech recognition command and input to the smart home device or intelligent robot, which can then be controlled more conveniently and quickly according to the recognized command. The user does not need to operate by hand, and voice operation is faster than manual operation, which improves the user experience. For specific examples, see the corresponding method embodiment.
With reference to Fig. 7, the parts identical to the previous embodiment are not repeated here. The present invention provides another embodiment of a voice information processing system 1000, in which the processing module further includes an output sub-module, and the output sub-module is communicatively connected to the control module;
The output sub-module outputs the speech recognition fragments in the time order of collection to obtain the speech recognition result.
Specifically, in this embodiment, as soon as a voice fragment has been collected during recording, it is recognized immediately to obtain the corresponding speech recognition fragment; the speech recognition fragments are output in the time order of collection to obtain the speech recognition result. If the user later finds that some part of the text in the input field differs from what was actually spoken, then, because the collection times are regular, the corresponding voice fragment can be located by its position in the collection order and recognized again, which greatly improves the user experience. Speech recognition is thus carried out during voice recording, which reduces the time the user must wait, after the recording is complete, for recognition to run and the result to be output; the recording delay is shortened without affecting the normal recognition result, and the user experience is improved.
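The time-ordered output and the lookup of a single fragment for re-recognition can be sketched as follows. This is a minimal illustration only: the `RecognizedFragment` type, the function names, and the 2-second default interval are assumptions made for the example, not part of the patent text.

```python
from dataclasses import dataclass


@dataclass
class RecognizedFragment:
    index: int      # position in the collection order (hypothetical field)
    start_s: float  # collection start time in seconds (hypothetical field)
    text: str       # recognized text for this fragment


def merge_in_time_order(fragments):
    """Sort fragments by collection time and join their text."""
    ordered = sorted(fragments, key=lambda f: f.start_s)
    return " ".join(f.text for f in ordered)


def fragment_at(fragments, time_s, interval_s=2.0):
    """Return the fragment covering a given moment, or None.

    Because fragments are collected at equal intervals, the one to
    re-recognize can be found from the time alone.
    """
    for f in fragments:
        if f.start_s <= time_s < f.start_s + interval_s:
            return f
    return None


# Fragments may finish recognition out of order; sorting restores the order.
fragments = [
    RecognizedFragment(1, 2.0, "the living room"),
    RecognizedFragment(0, 0.0, "turn on"),
    RecognizedFragment(2, 4.0, "light"),
]
result = merge_in_time_order(fragments)
```

Keeping the collection time with each fragment is what makes targeted re-recognition possible without reprocessing the whole recording.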
With reference to Fig. 8, the present invention provides another embodiment of a voice information processing system 1000, in which the control module includes: a collection sub-module, a recognition sub-module, a judging sub-module, and a deletion sub-module; the judging sub-module is communicatively connected to the collection sub-module, the deletion sub-module, and the recognition sub-module respectively;
The collection sub-module collects the user's voice information according to the preset collection rule during the user's recording, obtains a current voice fragment, and sends the current voice fragment to the judging sub-module;
The judging sub-module judges whether the current voice fragment is a blank voice fragment; if so, it sends the result that the current voice fragment is a blank voice fragment to the deletion sub-module; otherwise, it sends the result that the current voice fragment is not a blank voice fragment to the recognition sub-module;
The deletion sub-module receives the judgment result sent by the judging sub-module and deletes the current voice fragment;
The recognition sub-module receives the current voice fragment sent by the collection sub-module and recognizes the current voice fragment according to a speech recognition library to obtain a speech recognition fragment;
The collection sub-module also obtains the next voice fragment and sends it to the judging sub-module, until the user ends the recording;
The recognition sub-module also receives the next voice fragment sent by the collection sub-module and recognizes the next voice fragment according to the speech recognition library to obtain a speech recognition fragment, until the user ends the recording.
In the embodiment of the present invention, invalid voice fragments can be deleted, which helps the user obtain speech recognition results more quickly. In the preprocessing stage before speech recognition, techniques based on, for example, the frequency and fluctuation of sound-wave changes while the user is speaking can identify which parts of the user's voice information are valid speech and which parts are invalid, and the invalid parts, that is, the blank voice fragments, are removed. Speech recognition is thus carried out during voice recording, which reduces the time the user must wait, after the recording is complete, for recognition to run and the result to be output; the recording delay is shortened without affecting the normal recognition result, and the user experience is improved.
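A blank-fragment check of this kind could be sketched as below. The patent only says that the decision relies on the frequency and fluctuation of sound-wave changes; this example swaps in a plain RMS energy threshold as a stand-in, and the function names and the `threshold` value are assumptions.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a list of float samples in [-1, 1]."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def is_blank_fragment(samples, threshold=0.01):
    """Treat a fragment as blank when its energy is below a threshold.

    Assumption: a simple energy test replaces the waveform analysis
    that the description mentions but does not specify.
    """
    return rms(samples) < threshold


def recognize_non_blank(fragments, recognize, threshold=0.01):
    """Delete blank fragments and recognize the rest, keeping their order."""
    results = []
    for samples in fragments:
        if is_blank_fragment(samples, threshold):
            continue  # blank fragment: dropped, never sent to recognition
        results.append(recognize(samples))
    return results


# Example input: 0.1 s of silence and 0.1 s of a 440 Hz tone at 16 kHz.
silence = [0.0] * 1600
tone = [0.5 * math.sin(2 * math.pi * 440 * n / 16000) for n in range(1600)]
```

Dropping blank fragments before recognition means the recognizer is only invoked on audio that can actually contain speech.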
With reference to Fig. 9, the present invention provides an example of a voice information processing method, including:
1. Recording starts.
2. The recording module keeps recording, and the recording is intercepted once every 2 s.
3. The file is intercepted.
4. The recording result is sent to the speech recognition module for voice dictation.
5. The voice dictation result is put into a FIFO queue.
6. The semantic recognition module continuously performs semantic recognition on the sentences in the queue and understands them through semantic analysis.
7. According to the semantic recognition result, the corresponding command or answer is sent, completing the whole speech recognition process.
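The steps above can be sketched as a small pipeline. This is a simplified, sequential illustration: in the described system dictation runs while recording continues, whereas here the audio is already available as a list of samples. The `dictate` and `understand` callables are hypothetical placeholders for the speech recognition and semantic recognition modules.

```python
from collections import deque


def split_into_fragments(samples, sample_rate, interval_s=2.0):
    """Cut the recording into fixed-length fragments (steps 2 and 3)."""
    size = int(sample_rate * interval_s)
    return [samples[i:i + size] for i in range(0, len(samples), size)]


def run_pipeline(samples, sample_rate, dictate, understand, interval_s=2.0):
    """Dictate each fragment, queue the results, then interpret the sentence."""
    fifo = deque()
    for fragment in split_into_fragments(samples, sample_rate, interval_s):
        fifo.append(dictate(fragment))  # steps 4 and 5: dictate, then enqueue

    words = []
    while fifo:
        words.append(fifo.popleft())    # step 6: consume in arrival order

    sentence = " ".join(w for w in words if w)
    return understand(sentence)         # step 7: command or answer


# Toy usage: 10 samples at 2 samples/s give fragments of 4, 4, and 2 samples.
output = run_pipeline(
    list(range(10)),
    sample_rate=2,
    dictate=lambda fragment: f"chunk{fragment[0]}",
    understand=lambda sentence: sentence.upper(),
)
```

The FIFO queue guarantees that dictation results reach the semantic stage in the same order in which the fragments were recorded.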
In the embodiment of the present invention, intercepting once every 2 s is only an example; the interception interval can be set according to the user's preferences and needs. Speech recognition is carried out during voice recording, which reduces the time the user must wait, after the recording is complete, for recognition to run and the result to be output; the recording delay is shortened without affecting the normal recognition result, and the user experience is improved. By using a FIFO queue, multiple microphones can share one speech recognition engine, which improves engine utilization. For a shorter recording, the speech recognition module does not need to wait until a recognition start time is reached before it can begin recognition, which reduces the waiting time for speech recognition; for a longer recording, the waiting time for voice recording and speech recognition is effectively reduced and the speech can still be recognized completely. In this scheme the recording interval is two seconds: a recording is made every two seconds, the recording result is sent to the speech recognition module for recognition, and the recognition result is put into the FIFO queue, so that the results of continuous recording are all in the queue; the semantic recognition module then recognizes the spliced sentence, achieving fast speech recognition. Speech recognition is thus carried out during voice recording, which reduces the time the user must wait, after the recording is complete, for recognition to run and the result to be output; the recording delay is shortened without affecting the normal recognition result, and the user experience is improved.
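One way several microphones might share a single recognition engine through a FIFO queue is sketched below, using Python's standard `queue.Queue` and one worker thread. The microphone identifiers, the `recognize` callable, and the round-robin arrival order are assumptions made for the illustration; the patent does not describe the sharing mechanism in this detail.

```python
import queue
import threading


def shared_engine_worker(jobs, results, recognize):
    """Single recognition engine: consume fragments in FIFO order."""
    while True:
        job = jobs.get()
        if job is None:  # sentinel: no more fragments will arrive
            break
        mic_id, fragment = job
        results.setdefault(mic_id, []).append(recognize(fragment))


def recognize_from_microphones(streams, recognize):
    """Feed fragments from several microphones to one shared engine."""
    jobs = queue.Queue()
    results = {}
    worker = threading.Thread(
        target=shared_engine_worker, args=(jobs, results, recognize)
    )
    worker.start()

    # Interleave fragments round-robin to imitate simultaneous recording.
    longest = max((len(f) for f in streams.values()), default=0)
    for i in range(longest):
        for mic_id, fragments in streams.items():
            if i < len(fragments):
                jobs.put((mic_id, fragments[i]))

    jobs.put(None)
    worker.join()  # results are read only after the worker has finished
    return results


recognized = recognize_from_microphones(
    {"mic_a": ["a1", "a2"], "mic_b": ["b1"]},
    recognize=str.upper,
)
```

Because a single worker drains the queue in arrival order, each microphone's fragments stay in their recorded order while the engine is kept busy by whichever microphone has audio waiting.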
It should be noted that the above embodiments can be freely combined as needed. The above is only the preferred embodiment of the present invention. It should be pointed out that those of ordinary skill in the art can make several improvements and modifications without departing from the principle of the present invention, and these improvements and modifications shall also be regarded as falling within the protection scope of the present invention.
Claims (10)
1. A voice information processing method, characterized by comprising the steps of:
S100: periodically collecting and recognizing the user's voice information during the user's recording to obtain speech recognition fragments;
S200: processing the speech recognition fragments to obtain a speech recognition result.
2. The voice information processing method according to claim 1, characterized in that step S100 comprises the steps of:
S110: collecting the user's voice information according to a preset collection rule during the user's recording to obtain a current voice fragment;
S120: recognizing the current voice fragment according to a speech recognition library to obtain a speech recognition fragment;
S130: obtaining the next voice fragment and performing steps S110-S130, until the user ends the recording;
wherein the preset collection rule is collection at equal time intervals.
3. The voice information processing method according to claim 2, characterized in that step S110 further comprises the steps of:
S111: judging whether the current voice fragment is a blank voice fragment; if so, performing step S112; otherwise, performing step S120;
S112: deleting the current voice fragment and performing step S130.
4. The voice information processing method according to claim 1, characterized in that step S200 comprises the step of:
S210: sorting and integrating the speech recognition fragments in the time order of collection to obtain the speech recognition result.
5. The voice information processing method according to any one of claims 1-4, characterized in that step S200 further comprises the step of:
S220: outputting the speech recognition fragments in the time order of collection to obtain the speech recognition result.
6. A voice information processing system, characterized by comprising: a control module and a processing module; the processing module is communicatively connected to the control module;
the control module periodically collects and recognizes the user's voice information during the user's recording to obtain speech recognition fragments;
the processing module processes the speech recognition fragments obtained by the control module to obtain a speech recognition result.
7. The voice information processing system according to claim 6, characterized in that the control module comprises: a collection sub-module and a recognition sub-module; the collection sub-module is communicatively connected to the recognition sub-module;
the collection sub-module collects the user's voice information according to a preset collection rule during the user's recording, obtains a current voice fragment, and sends the current voice fragment to the recognition sub-module;
the recognition sub-module receives the current voice fragment sent by the collection sub-module and recognizes the current voice fragment according to a speech recognition library to obtain a speech recognition fragment;
the collection sub-module also obtains the next voice fragment and sends it to the recognition sub-module, until the user ends the recording;
the recognition sub-module also receives the next voice fragment sent by the collection sub-module and recognizes the next voice fragment according to the speech recognition library to obtain a speech recognition fragment, until the user ends the recording;
wherein the preset collection rule is collection at equal time intervals.
8. The voice information processing system according to claim 7, characterized in that the control module further comprises: a judging sub-module and a deletion sub-module, the judging sub-module being communicatively connected to the collection sub-module, the deletion sub-module, and the recognition sub-module respectively;
the judging sub-module judges whether the current voice fragment is a blank voice fragment; if so, it sends the result that the current voice fragment is a blank voice fragment to the deletion sub-module; otherwise, it sends the result that the current voice fragment is not a blank voice fragment to the recognition sub-module;
the deletion sub-module receives the judgment result sent by the judging sub-module and deletes the current voice fragment.
9. The voice information processing system according to claim 7, characterized in that the processing module comprises: a sorting sub-module; the sorting sub-module is communicatively connected to the control module;
the sorting sub-module sorts and integrates the speech recognition fragments in the time order of collection to obtain the speech recognition result.
10. The voice information processing system according to any one of claims 6-9, characterized in that the processing module further comprises: an output sub-module, the output sub-module being communicatively connected to the control module;
the output sub-module outputs the speech recognition fragments in the time order of collection to obtain the speech recognition result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710302993.1A CN107230478A (en) | 2017-05-03 | 2017-05-03 | A kind of voice information processing method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710302993.1A CN107230478A (en) | 2017-05-03 | 2017-05-03 | A kind of voice information processing method and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107230478A true CN107230478A (en) | 2017-10-03 |
Family
ID=59933174
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710302993.1A Pending CN107230478A (en) | 2017-05-03 | 2017-05-03 | A kind of voice information processing method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107230478A (en) |
Patent Citations (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1920947A (en) * | 2006-09-15 | 2007-02-28 | 清华大学 | Voice/music detector for audio frequency coding with low bit ratio |
CN101593520A (en) * | 2008-05-27 | 2009-12-02 | 北京凌声芯语音科技有限公司 | The implementation method that high-performance speech recognition coprocessor and association thereof handle |
CN101588415A (en) * | 2009-06-29 | 2009-11-25 | 中国农业大学 | Voice service method and voice service system |
CN102118886A (en) * | 2010-01-04 | 2011-07-06 | ***通信集团公司 | Recognition method of voice information and equipment |
CN101848277A (en) * | 2010-04-23 | 2010-09-29 | 中兴通讯股份有限公司 | Mobile terminal and method for storing conversation contents in real time |
CN102360187A (en) * | 2011-05-25 | 2012-02-22 | 吉林大学 | Chinese speech control system and method with mutually interrelated spectrograms for driver |
CN102376305A (en) * | 2011-11-29 | 2012-03-14 | 安徽科大讯飞信息科技股份有限公司 | Speech recognition method and system |
CN103366742A (en) * | 2012-03-31 | 2013-10-23 | 盛乐信息技术(上海)有限公司 | Voice input method and system |
CN104412323A (en) * | 2012-06-25 | 2015-03-11 | 三菱电机株式会社 | On-board information device |
CN104769670A (en) * | 2012-09-06 | 2015-07-08 | 萨热姆通信宽带简易股份有限公司 | Device and method for supplying a reference audio signal to an acoustic processing unit |
CN104157301A (en) * | 2014-07-25 | 2014-11-19 | 广州三星通信技术研究有限公司 | Method, device and terminal deleting voice information blank segment |
CN105989836A (en) * | 2015-03-06 | 2016-10-05 | 腾讯科技(深圳)有限公司 | Voice acquisition method, device and terminal equipment |
CN105702257A (en) * | 2015-08-12 | 2016-06-22 | 乐视致新电子科技(天津)有限公司 | Speech processing method and device |
CN105653729A (en) * | 2016-01-28 | 2016-06-08 | 努比亚技术有限公司 | Device and method for indexing sound recording file |
CN106019592A (en) * | 2016-07-15 | 2016-10-12 | 中国人民解放军63908部队 | Augmented reality optical transmission-type helmet mounted display pre-circuit and control method thereof |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110784591A (en) * | 2019-09-25 | 2020-02-11 | 福建新大陆软件工程有限公司 | Intelligent voice automatic detection method, device and system |
CN110797030A (en) * | 2019-10-24 | 2020-02-14 | 秒针信息技术有限公司 | Method and system for working hour statistics based on voice recognition |
CN110797030B (en) * | 2019-10-24 | 2022-06-07 | 上海明胜品智人工智能科技有限公司 | Method and system for working hour statistics based on voice recognition |
CN111508531A (en) * | 2020-04-23 | 2020-08-07 | 维沃移动通信有限公司 | Audio processing method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
TA01 | Transfer of patent application right | Effective date of registration: 20201102. Address after: 318015 no.2-3167, zone a, Nonggang City, no.2388, Donghuan Avenue, Hongjia street, Jiaojiang District, Taizhou City, Zhejiang Province. Applicant after: Taizhou Jiji Intellectual Property Operation Co.,Ltd. Address before: 201616 Shanghai city Songjiang District Sixian Road No. 3666. Applicant before: Phicomm (Shanghai) Co.,Ltd. |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20171003 |