CN104751856B - Speech sentence recognition method and device - Google Patents

Speech sentence recognition method and device

Info

Publication number
CN104751856B
CN104751856B CN201310753083.7A CN201310753083A
Authority
CN
China
Prior art keywords
sentence
sequence
composition
digital speech
specified
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310753083.7A
Other languages
Chinese (zh)
Other versions
CN104751856A (en)
Inventor
王左彪
王瑞鹏
吕广娜
王红梅
刘越
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd filed Critical China Mobile Communications Group Co Ltd
Priority to CN201310753083.7A priority Critical patent/CN104751856B/en
Publication of CN104751856A publication Critical patent/CN104751856A/en
Application granted granted Critical
Publication of CN104751856B publication Critical patent/CN104751856B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Machine Translation (AREA)

Abstract

The invention discloses a speech sentence recognition method and device, including: digitizing and preprocessing a to-be-recognized speech sentence signal that belongs to one of a plurality of preset sentence patterns to obtain a digital speech sequence; performing zero-crossing detection on the digital speech sequence to obtain the sequence segments contained in the digital speech sequence and the number of those segments; selecting, from a plurality of sentence structures, the sentence structure whose number of constituents equals the number of segments as the sentence structure of the digital speech sequence; determining, according to the constituent composition of that sentence structure, the sequence segment corresponding to each specified constituent in the digital speech sequence; and, for each sequence segment corresponding to a specified constituent, determining the semantics of the segment by comparing its feature parameters with the feature parameters in the pattern base corresponding to the specified constituent.

Description

Speech sentence recognition method and device
Technical field
The present invention relates to the field of pattern recognition, and in particular to a speech sentence recognition method and device.
Background art
With the trend toward intelligence in Internet-of-Things technology, smart appliances, intelligent robots, and the like can fulfill a user's requests through information exchange with the user, and speech recognition is widely used as the dialog interface between people and intelligent terminals. Only after receiving the user's voice information and understanding the meaning of the speech accurately and quickly can an intelligent terminal perform the corresponding control operation or reach the corresponding state. Enabling an intelligent terminal to understand the semantics of sentences formed from the user's continuous vocabulary is therefore an urgent problem to be solved.
At present, existing speech recognition methods mainly convert the acquired analog speech signal into digital speech data by sampling, analyze the digitized speech data, extract multiple feature values characterizing the speech, and, taking into account the correlation with the preceding or following speech, compute the similarity between each feature value and the feature values in a pattern base. The semantics of the pattern-base feature value with the greatest similarity is taken as the semantics of that feature value, and the semantics of the sentence formed by the multiple feature values is then obtained from the semantics corresponding to each of them.
In the above existing speech recognition methods, because semantics must be identified with reference to the correlation between preceding or following speech, the algorithm complexity is high, so that the processing efficiency of speech sentence recognition is low.
Summary of the invention
Embodiments of the present invention provide a speech sentence recognition method and device to solve the prior-art problem of low processing efficiency in speech sentence recognition.
An embodiment of the present invention provides a speech sentence recognition method, including:
digitizing and preprocessing a to-be-recognized speech sentence signal that belongs to one of a plurality of preset sentence patterns to obtain a digital speech sequence;
performing zero-crossing detection on the digital speech sequence to obtain a plurality of sequence segments contained in the digital speech sequence, and the number of the plurality of sequence segments;
selecting, from a plurality of sentence structures, the sentence structure whose number of constituents equals the number of segments as the sentence structure of the digital speech sequence, wherein the plurality of sentence structures have different numbers of constituents;
determining, according to the constituent composition of the sentence structure of the digital speech sequence, the sequence segment corresponding to each specified constituent in the digital speech sequence;
for each sequence segment corresponding to a specified constituent in the digital speech sequence, determining the semantics of that segment by comparing its feature parameters with the feature parameters in the pattern base corresponding to the specified constituent.
Further, determining the semantics of the segment corresponding to a specified constituent by comparing its feature parameters with the feature parameters in the pattern base corresponding to the specified constituent specifically includes:
comparing the similarity between the feature parameters of the segment corresponding to the specified constituent and each feature parameter in the pattern base corresponding to the specified constituent;
determining the semantics corresponding to the feature parameter with the greatest similarity in the pattern base corresponding to the specified constituent as the semantics of the segment corresponding to the specified constituent.
Further, before selecting, from the plurality of sentence structures, the sentence structure whose number of constituents equals the number of segments, the method also includes:
extracting feature information of the digital speech sequence that represents its grammar category;
determining the grammar category of the digital speech sequence from the feature information and preset feature information of a plurality of grammar categories, wherein the preset feature information of a grammar category is obtained from the extracted grammar-category feature information of a plurality of digital speech sequences known to belong to that category;
and selecting, from the plurality of sentence structures, the sentence structure whose number of constituents equals the number of segments specifically includes:
selecting, from the plurality of sentence structures contained in the grammar category of the digital speech sequence, the sentence structure whose number of constituents equals the number of segments.
Further, determining the grammar category of the digital speech sequence from the feature information and the preset feature information of the plurality of grammar categories specifically includes:
comparing the similarity between the feature information and the preset feature information of the plurality of grammar categories;
determining the grammar category corresponding to the preset feature information with the greatest similarity among the plurality of grammar categories as the grammar category of the digital speech sequence.
Further, the above method also includes:
performing, according to the preset operation mode corresponding to the grammar category of the digital speech sequence, the operation corresponding to the semantics of the sequence segments corresponding to the specified constituents of the digital speech sequence.
An embodiment of the present invention provides a speech sentence recognition device, including:
a preprocessing unit, configured to digitize and preprocess a to-be-recognized speech sentence signal that belongs to one of a plurality of preset sentence patterns to obtain a digital speech sequence;
a zero-crossing detection unit, configured to perform zero-crossing detection on the digital speech sequence to obtain a plurality of sequence segments contained in the digital speech sequence, and the number of the plurality of sequence segments;
a selecting unit, configured to select, from a plurality of sentence structures, the sentence structure whose number of constituents equals the number of segments as the sentence structure of the digital speech sequence, wherein the plurality of sentence structures have different numbers of constituents;
a first determining unit, configured to determine, according to the constituent composition of the sentence structure of the digital speech sequence, the sequence segment corresponding to each specified constituent in the digital speech sequence;
a second determining unit, configured to determine, for each sequence segment corresponding to a specified constituent in the digital speech sequence, the semantics of that segment by comparing its feature parameters with the feature parameters in the pattern base corresponding to the specified constituent.
Further, the second determining unit is specifically configured to compare the similarity between the feature parameters of the segment corresponding to the specified constituent and each feature parameter in the pattern base corresponding to the specified constituent;
and to determine the semantics corresponding to the feature parameter with the greatest similarity in the pattern base corresponding to the specified constituent as the semantics of the segment corresponding to the specified constituent.
Further, the above device also includes:
an extraction unit, configured to extract, before the sentence structure whose number of constituents equals the number of segments is selected from the plurality of sentence structures, feature information of the digital speech sequence that represents its grammar category;
a third determining unit, configured to determine the grammar category of the digital speech sequence from the feature information and preset feature information of a plurality of grammar categories, wherein the preset feature information of a grammar category is obtained from the extracted grammar-category feature information of a plurality of digital speech sequences known to belong to that category;
the selecting unit is specifically configured to select, from the plurality of sentence structures contained in the grammar category of the digital speech sequence, the sentence structure whose number of constituents equals the number of segments.
Further, the third determining unit is specifically configured to compare the similarity between the feature information and the preset feature information of the plurality of grammar categories;
and to determine the grammar category corresponding to the preset feature information with the greatest similarity among the plurality of grammar categories as the grammar category of the digital speech sequence.
Further, the above device also includes:
an execution unit, configured to perform, according to the preset operation mode corresponding to the grammar category of the digital speech sequence, the operation corresponding to the semantics of the sequence segments corresponding to the specified constituents of the digital speech sequence.
With the method provided by embodiments of the present invention, because the sentence structure of the digital speech sequence corresponding to the to-be-recognized speech sentence is determined first, and the semantics of each constituent of that structure is then determined in the pattern base corresponding to that constituent, the recognition time for the to-be-recognized speech sentence is reduced, thereby improving the processing efficiency of speech sentence recognition.
Other features and advantages of the application will be set forth in the following description, and in part will become apparent from the description or be understood by practicing the application. The objectives and other advantages of the application may be realized and attained by the structure particularly pointed out in the written description, claims, and accompanying drawings.
Brief description of the drawings
The accompanying drawings are provided for a further understanding of the present invention and constitute a part of the specification; together with the embodiments of the present invention, they serve to explain the present invention and are not to be construed as limiting the present invention. In the drawings:
Fig. 1 is a flowchart of the speech sentence recognition method provided by an embodiment of the present invention;
Fig. 2 is a flowchart of the speech sentence recognition method provided by Embodiment 1 of the present invention;
Fig. 3 is a schematic structural diagram of the speech sentence recognition device provided by Embodiment 2 of the present invention.
Detailed description of the embodiments
To provide an implementation that improves the processing efficiency of speech semantic recognition, embodiments of the present invention provide a speech sentence recognition method and device. Preferred embodiments of the present invention are described below with reference to the accompanying drawings. It should be understood that the preferred embodiments described here are only used to illustrate and explain the present invention and are not intended to limit it. Moreover, where no conflict arises, the embodiments in the application and the features in the embodiments may be combined with each other.
An embodiment of the present invention provides a speech sentence recognition method which, as shown in Fig. 1, includes:
Step 101: digitize and preprocess a to-be-recognized speech sentence signal that belongs to one of a plurality of preset sentence patterns to obtain a digital speech sequence.
Step 102: perform zero-crossing detection on the digital speech sequence to obtain the sequence segments contained in the digital speech sequence and the number of those segments.
Step 103: select, from a plurality of sentence structures, the sentence structure whose number of constituents equals the number of segments as the sentence structure of the digital speech sequence, wherein the plurality of sentence structures have different numbers of constituents.
Step 104: determine, according to the constituent composition of the sentence structure of the digital speech sequence, the sequence segment corresponding to each specified constituent in the digital speech sequence.
Step 105: for each sequence segment corresponding to a specified constituent in the digital speech sequence, determine the semantics of that segment by comparing its feature parameters with the feature parameters in the pattern base corresponding to the specified constituent.
In the above method provided by the embodiment of the present invention, a certain restriction is placed on the to-be-recognized speech sentence: it must belong to one of the plurality of preset sentence patterns.
The method and device provided by the present invention are described in detail below with reference to the accompanying drawings and specific embodiments.
Embodiment 1:
Fig. 2 is a flowchart of the speech sentence recognition method provided by this embodiment, which specifically includes the following processing flow:
Step 201: acquire a to-be-recognized speech sentence signal.
In this step, the to-be-recognized speech sentence may belong to one of a plurality of preset sentence patterns. The preset sentence patterns may be set according to sentence patterns that are common in people's natural language, concise, and frequently used.
Step 202: digitize and preprocess the to-be-recognized speech sentence signal to obtain a digital speech sequence.
In this step, the to-be-recognized speech sentence signal may be sampled, filtered, framed, and so on to obtain the digital speech sequence; the specific processing may use various prior-art methods and is not described in detail here. The elementary unit of the digital speech sequence may be the phoneme.
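As an illustration of the framing mentioned in step 202, the following minimal sketch splits a sampled signal into overlapping frames. The frame length and hop size are assumed values; real preprocessing would also include the filtering (and typically pre-emphasis and windowing) that the patent leaves to the prior art.

```python
import numpy as np

def preprocess(signal, frame_len=256, hop=128):
    """Split a 1-D sampled signal into overlapping frames.

    A minimal stand-in for the digitization step (sampling, filtering,
    framing); frame_len and hop are illustrative, not from the patent.
    """
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop)
    # Each frame starts hop samples after the previous one.
    return np.stack([signal[i * hop:i * hop + frame_len]
                     for i in range(n_frames)])

frames = preprocess(np.sin(np.linspace(0, 100, 8000)))
```

With 8000 samples, a 256-sample frame, and a 128-sample hop, this yields 61 frames, each overlapping its neighbor by half.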
Step 203: extract feature information of the digital speech sequence that represents its grammar category.
In this step, the grammar categories may be the declarative sentence and the interrogative sentence, and the feature information representing the grammar category may be feature information representing mood or intonation, for example feature information of interrogative modal particles or of the rising tone of an interrogative sentence. This feature information may be expressed with feature parameters of the speech, for example cepstral coefficients, differential cepstral coefficients, energy prominence parameters, and energy difference coefficients.
Step 204: compare the similarity between this feature information and the preset feature information of the plurality of grammar categories.
The preset feature information of a grammar category may be obtained from the extracted grammar-category feature information of a plurality of digital speech sequences known to belong to that category; for example, it may be obtained with a support vector machine (SVM) method.
In this step, the similarity comparison may use various prior-art methods; for example, the similarity between the feature information and the preset feature information of the plurality of grammar categories may be determined with a likelihood-probability method. The specific processing flow is not described in detail here.
Step 205: determine the grammar category corresponding to the preset feature information with the greatest similarity among the plurality of grammar categories as the grammar category of the digital speech sequence.
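Steps 203-205 amount to nearest-template classification. The sketch below uses cosine similarity as a simple stand-in for the likelihood-probability comparison the embodiment mentions; the per-category template vectors are invented for illustration and would in practice come from training (e.g. with an SVM).

```python
import numpy as np

# Hypothetical preset feature information per grammar category; purely
# illustrative values, not derived from the patent.
CATEGORY_TEMPLATES = {
    "declarative": np.array([1.0, 0.2, 0.1]),
    "interrogative": np.array([0.1, 0.9, 0.8]),
}

def classify_grammar(features):
    """Return the grammar category whose template is most similar
    (cosine similarity) to the extracted feature vector."""
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    return max(CATEGORY_TEMPLATES,
               key=lambda c: cos(features, CATEGORY_TEMPLATES[c]))
```

A feature vector dominated by the "intonation" dimensions would thus be labeled interrogative, which is the only decision the later structure-selection step needs.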
Step 206: perform zero-crossing detection on the digital speech sequence to obtain the sequence segments contained in the digital speech sequence and the number of those segments.
There is no strict ordering between step 206 and steps 203-205; step 206 may be performed first and steps 203-205 afterwards.
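The patent does not spell out the zero-crossing detection itself, so the following toy version simply splits a sequence at sufficiently long near-silent gaps; the thresholds are assumptions. What the later steps rely on is only that it yields the segments and their count.

```python
import numpy as np

def split_segments(seq, silence_thresh=0.05, min_gap=3):
    """Split a digital speech sequence into (start, end) half-open
    segments separated by near-silent gaps of at least min_gap samples.

    A toy stand-in for zero-crossing-based endpoint detection; the
    threshold values are illustrative.
    """
    active = np.abs(np.asarray(seq, dtype=float)) > silence_thresh
    segments, start, gap = [], None, 0
    for i, a in enumerate(active):
        if a:
            if start is None:
                start = i      # a new segment begins
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:  # gap long enough: close the segment
                segments.append((start, i - gap + 1))
                start, gap = None, 0
    if start is not None:       # sequence ended while a segment was open
        segments.append((start, len(seq)))
    return segments
```

The number of segments returned here is what step 207 matches against the constituent counts of the candidate sentence structures.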
Step 207: select, from the plurality of sentence structures contained in the grammar category of the digital speech sequence, the sentence structure whose number of constituents equals the number of segments.
In this step, when the grammar category is the interrogative sentence, the sentence structures of the interrogative sentence may include the yes-no question, the specific question, and the alternative question, where the sentence structure of the yes-no question may be: attribute, subject, adverbial, predicate, complement, attribute, object; the sentence structure of the specific question may be: subject, adverbial, predicate, object; and the sentence structure of the alternative question may be: subject, predicate, object.
The numbers of constituents of the plurality of sentence structures are different.
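Because the structures within a grammar category have pairwise different constituent counts, step 207 can be a simple table lookup. The tables below follow the interrogative patterns listed above; the English constituent names are illustrative translations.

```python
# Hypothetical structure tables keyed by grammar category, following the
# interrogative patterns given in the embodiment.
STRUCTURES = {
    "interrogative": [
        ["attribute", "subject", "adverbial", "predicate",
         "complement", "attribute", "object"],            # yes-no question
        ["subject", "adverbial", "predicate", "object"],  # specific question
        ["subject", "predicate", "object"],               # alternative question
    ],
}

def select_structure(category, n_segments):
    """Pick the structure whose constituent count equals the detected
    segment count; None if no preset pattern matches."""
    for structure in STRUCTURES[category]:
        if len(structure) == n_segments:
            return structure
    return None
```

A four-segment interrogative sequence, for example, is matched to the specific-question structure, so the i-th segment is then treated as the i-th constituent.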
Step 208: determine, according to the constituent composition of the sentence structure of the digital speech sequence, the sequence segment corresponding to each specified constituent in the digital speech sequence.
Step 209: compare the similarity between the feature parameters of the sequence segment corresponding to the specified constituent and each feature parameter in the pattern base corresponding to the specified constituent.
In this step, the specific similarity comparison may use the same method as in step 204 and is not described in detail here. The pattern bases corresponding to the constituents may be an object database, an action database, and a state database: among the sentence constituents, the subject may correspond to the object database, the predicate may correspond to the action database, and the object, adverbial, complement, and attribute may correspond to the state database.
Step 210: determine the semantics corresponding to the feature parameter with the greatest similarity in the pattern base corresponding to the specified constituent as the semantics of the sequence segment corresponding to the specified constituent.
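Restricted to a single constituent's pattern base, steps 209-210 reduce to a nearest-template search. In this sketch the pattern bases, their feature vectors, and the constituent-to-database mapping entries are invented for illustration, and Euclidean distance stands in for the unspecified similarity measure (smallest distance playing the role of greatest similarity).

```python
import numpy as np

# Illustrative pattern bases; the embodiment maps subject -> object
# database and predicate -> action database (the remaining constituents
# would map to a state database).
PATTERN_BASES = {
    "object": {"robot": np.array([1.0, 0.0]), "light": np.array([0.0, 1.0])},
    "action": {"clean": np.array([1.0, 1.0]), "stop": np.array([-1.0, 0.0])},
}
CONSTITUENT_DB = {"subject": "object", "predicate": "action"}

def segment_semantics(constituent, features):
    """Return the semantics whose template is nearest (Euclidean) to the
    segment's feature vector, searching only that constituent's base."""
    base = PATTERN_BASES[CONSTITUENT_DB[constituent]]
    return min(base, key=lambda sem: np.linalg.norm(base[sem] - features))
```

Searching only the relevant database, rather than one global vocabulary, is the source of the efficiency gain the patent claims.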
Step 211: perform, according to the preset operation mode corresponding to the grammar category of the digital speech sequence, the operation corresponding to the semantics of the sequence segments corresponding to the specified constituents of the digital speech sequence.
In this step, when the grammar category of the digital speech sequence is the declarative sentence, a relational database storing the correspondence between the semantics of a plurality of digital speech sequences and their respective preset operation modes may be searched for the preset operation mode corresponding to the semantics of the declarative sentence, and the corresponding operation is performed. For example, if the semantics of the sequence segments corresponding to the specified constituents of the digital speech sequence form an instruction telling an intelligent robot to clean, the intelligent robot performs the cleaning operation after receiving the instruction.
When the grammar category of the digital speech sequence is the interrogative sentence, the result data corresponding to the interrogative sentence may be looked up in the relational database and fed back to the user in the form of speech or text.
Further, when no result data corresponding to the interrogative sentence is found in the relational database, the result data corresponding to the interrogative sentence may be searched for through a cloud server.
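The dispatch described in step 211 and the two paragraphs above might be sketched as follows; the operation table, answer table, and cloud lookup are all placeholders for the relational database and cloud server the embodiment refers to.

```python
# Illustrative lookup tables; keys are tuples of constituent semantics.
OPERATIONS = {("robot", "clean"): "start_cleaning"}
LOCAL_ANSWERS = {("weather", "today"): "sunny"}

def cloud_search(key):
    """Stand-in for the cloud-server fallback search."""
    return "cloud:" + "/".join(key)

def dispatch(category, semantics):
    """Declaratives trigger a preset operation; interrogatives query the
    local table first and fall back to the cloud search."""
    key = tuple(semantics)
    if category == "declarative":
        return OPERATIONS.get(key)
    answer = LOCAL_ANSWERS.get(key)
    return answer if answer is not None else cloud_search(key)
```

So the same recognized semantics lead to an executed operation or an answered query depending only on the grammar category determined in step 205.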
With the method provided by the above embodiment of the present invention, because the sentence structure of the digital speech sequence corresponding to the to-be-recognized speech sentence is determined first, and the semantics of each constituent of that structure is then determined in the pattern base corresponding to that constituent, the recognition time for the to-be-recognized speech sentence is reduced, thereby improving the processing efficiency of speech sentence recognition.
Embodiment 2:
Based on the same inventive concept and corresponding to the speech sentence recognition method provided by the above embodiment of the present invention, Embodiment 2 of the present invention further provides a speech sentence recognition device, whose schematic structure is shown in Fig. 3 and which specifically includes:
a preprocessing unit 301, configured to digitize and preprocess a to-be-recognized speech sentence signal that belongs to one of a plurality of preset sentence patterns to obtain a digital speech sequence;
a zero-crossing detection unit 302, configured to perform zero-crossing detection on the digital speech sequence to obtain a plurality of sequence segments contained in the digital speech sequence, and the number of the plurality of sequence segments;
a selecting unit 303, configured to select, from a plurality of sentence structures, the sentence structure whose number of constituents equals the number of segments as the sentence structure of the digital speech sequence, wherein the plurality of sentence structures have different numbers of constituents;
a first determining unit 304, configured to determine, according to the constituent composition of the sentence structure of the digital speech sequence, the sequence segment corresponding to each specified constituent in the digital speech sequence;
a second determining unit 305, configured to determine, for each sequence segment corresponding to a specified constituent in the digital speech sequence, the semantics of that segment by comparing its feature parameters with the feature parameters in the pattern base corresponding to the specified constituent.
Further, the second determining unit 305 is specifically configured to compare the similarity between the feature parameters of the segment corresponding to the specified constituent and each feature parameter in the pattern base corresponding to the specified constituent;
and to determine the semantics corresponding to the feature parameter with the greatest similarity in the pattern base corresponding to the specified constituent as the semantics of the segment corresponding to the specified constituent.
Further, the above device also includes:
an extraction unit 306, configured to extract, before the sentence structure whose number of constituents equals the number of segments is selected from the plurality of sentence structures, feature information of the digital speech sequence that represents its grammar category;
a third determining unit 307, configured to determine the grammar category of the digital speech sequence from the feature information and preset feature information of a plurality of grammar categories, wherein the preset feature information of a grammar category is obtained from the extracted grammar-category feature information of a plurality of digital speech sequences known to belong to that category;
the selecting unit 303 is specifically configured to select, from the plurality of sentence structures contained in the grammar category of the digital speech sequence, the sentence structure whose number of constituents equals the number of segments.
Further, the third determining unit 307 is specifically configured to compare the similarity between the feature information and the preset feature information of the plurality of grammar categories;
and to determine the grammar category corresponding to the preset feature information with the greatest similarity among the plurality of grammar categories as the grammar category of the digital speech sequence.
Further, the above device also includes:
an execution unit 308, configured to perform, according to the preset operation mode corresponding to the grammar category of the digital speech sequence, the operation corresponding to the semantics of the sequence segments corresponding to the specified constituents of the digital speech sequence.
The functions of the above units may correspond to the respective processing steps in the flows shown in Fig. 1 or Fig. 2 and are not repeated here.
In summary, the scheme provided by embodiments of the present invention includes: digitizing and preprocessing a to-be-recognized speech sentence signal that belongs to one of a plurality of preset sentence patterns to obtain a digital speech sequence; performing zero-crossing detection on the digital speech sequence to obtain the sequence segments contained in it and the number of those segments; selecting, from a plurality of sentence structures, the sentence structure whose number of constituents equals the number of segments as the sentence structure of the digital speech sequence; determining, according to the constituent composition of that structure, the sequence segment corresponding to each specified constituent in the digital speech sequence; and, for each such segment, determining its semantics by comparing its feature parameters with the feature parameters in the pattern base corresponding to the specified constituent. Compared with the prior art, the method provided by embodiments of the present invention reduces the recognition time for the to-be-recognized speech sentence, thereby improving the processing efficiency of speech sentence recognition.
The speech sentences identification device that embodiments herein is provided can be realized by computer program.Art technology Personnel are it should be appreciated that above-mentioned Module Division mode is only one kind in numerous Module Division modes, if being divided into it His module or non-division module, all should be in the protection domain of the application as long as speech sentences identification device has above-mentioned function Within.
The application is with reference to method, the equipment according to the embodiment of the present application(System)And the flow of computer program product Figure and/or block diagram describe.It should be understood that can be by every first-class in computer program instructions implementation process figure and/or block diagram Journey and/or the flow in square frame and flow chart and/or block diagram and/or the combination of square frame.These computer programs can be provided The processors of all-purpose computer, special-purpose computer, Embedded Processor or other programmable data processing devices is instructed to produce A raw machine so that produced by the instruction of computer or the computing device of other programmable data processing devices for real The device for the function of being specified in present one flow of flow chart or one square frame of multiple flows and/or block diagram or multiple square frames.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or another programmable data processing device to work in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus, which implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or another programmable data processing device, such that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. Thus, if these modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to encompass them.

Claims (10)

  1. A speech sentence recognition method, characterized by comprising:
    digitizing and pre-processing a speech sentence signal to be identified that belongs to one of multiple preset specific sentence patterns, to obtain a digital speech sequence;
    performing zero-crossing examination on the digital speech sequence, to obtain multiple sequence segments that the digital speech sequence comprises, and the segment count of the multiple sequence segments;
    selecting, from multiple sentence structures, the sentence structure whose sentence-component count is identical to the segment count, as the sentence structure of the digital speech sequence, wherein the sentence-component counts of the multiple sentence structures differ from one another;
    determining, according to the sentence-component composition of the sentence structure of the digital speech sequence, the sequence segment corresponding to each specified sentence component in the digital speech sequence;
    for each sequence segment corresponding to a specified sentence component in the digital speech sequence, determining the semantics of that sequence segment by comparing the characteristic parameters of that sequence segment with the characteristic parameters in the pattern base corresponding to that specified sentence component.
  2. The method according to claim 1, characterized in that determining the semantics of the sequence segment corresponding to a specified sentence component, by comparing the characteristic parameters of that sequence segment with the characteristic parameters in the pattern base corresponding to that specified sentence component, specifically comprises:
    comparing the similarity between the characteristic parameters of the sequence segment corresponding to the specified sentence component and each characteristic parameter in the pattern base corresponding to that specified sentence component;
    determining the semantics corresponding to the characteristic parameter with the greatest similarity in the pattern base corresponding to the specified sentence component as the semantics of the sequence segment corresponding to that specified sentence component.
  3. The method according to claim 1, characterized in that, before the sentence structure whose sentence-component count is identical to the segment count is selected from the multiple sentence structures, the method further comprises:
    extracting characteristic information of the digital speech sequence that expresses its grammar category;
    determining the grammar category of the digital speech sequence from the characteristic information and preset characteristic information of multiple grammar categories, wherein the preset characteristic information of each grammar category is obtained from the grammar-category-expressing characteristic information extracted from multiple digital speech sequences known to belong to that grammar category;
    and in that selecting, from multiple sentence structures, the sentence structure whose sentence-component count is identical to the segment count specifically comprises:
    selecting, from the multiple sentence structures comprised in the grammar category of the digital speech sequence, the sentence structure whose sentence-component count is identical to the segment count.
  4. The method according to claim 3, characterized in that determining the grammar category of the digital speech sequence from the characteristic information and the preset characteristic information of multiple grammar categories specifically comprises:
    comparing the similarity between the characteristic information and the preset characteristic information of the multiple grammar categories;
    determining the grammar category corresponding to the preset characteristic information with the greatest similarity among the multiple grammar categories as the grammar category of the digital speech sequence.
  5. The method according to claim 3 or 4, characterized by further comprising:
    performing, according to the preset operation mode corresponding to the grammar category of the digital speech sequence, the operation corresponding to the semantics of the sequence segment corresponding to each specified sentence component of the digital speech sequence.
  6. A speech sentence recognition device, characterized by comprising:
    a pre-processing unit, configured to digitize and pre-process a speech sentence signal to be identified that belongs to one of multiple preset specific sentence patterns, to obtain a digital speech sequence;
    a zero-crossing examination unit, configured to perform zero-crossing examination on the digital speech sequence, to obtain multiple sequence segments that the digital speech sequence comprises, and the segment count of the multiple sequence segments;
    a selecting unit, configured to select, from multiple sentence structures, the sentence structure whose sentence-component count is identical to the segment count, as the sentence structure of the digital speech sequence, wherein the sentence-component counts of the multiple sentence structures differ from one another;
    a first determining unit, configured to determine, according to the sentence-component composition of the sentence structure of the digital speech sequence, the sequence segment corresponding to each specified sentence component in the digital speech sequence;
    a second determining unit, configured to determine, for each sequence segment corresponding to a specified sentence component in the digital speech sequence, the semantics of that sequence segment by comparing its characteristic parameters with the characteristic parameters in the pattern base corresponding to that specified sentence component.
  7. The device according to claim 6, characterized in that the second determining unit is specifically configured to: compare the similarity between the characteristic parameters of the sequence segment corresponding to the specified sentence component and each characteristic parameter in the pattern base corresponding to that specified sentence component;
    and determine the semantics corresponding to the characteristic parameter with the greatest similarity in the pattern base corresponding to the specified sentence component as the semantics of the sequence segment corresponding to that specified sentence component.
  8. The device according to claim 6, characterized by further comprising:
    an extraction unit, configured to extract, before the sentence structure whose sentence-component count is identical to the segment count is selected from the multiple sentence structures, characteristic information of the digital speech sequence that expresses its grammar category;
    a third determining unit, configured to determine the grammar category of the digital speech sequence from the characteristic information and preset characteristic information of multiple grammar categories, wherein the preset characteristic information of each grammar category is obtained from the grammar-category-expressing characteristic information extracted from multiple digital speech sequences known to belong to that grammar category;
    the selecting unit being specifically configured to select, from the multiple sentence structures comprised in the grammar category of the digital speech sequence, the sentence structure whose sentence-component count is identical to the segment count.
  9. The device according to claim 8, characterized in that the third determining unit is specifically configured to: compare the similarity between the characteristic information and the preset characteristic information of the multiple grammar categories;
    and determine the grammar category corresponding to the preset characteristic information with the greatest similarity among the multiple grammar categories as the grammar category of the digital speech sequence.
  10. The device according to claim 8 or 9, characterized by further comprising:
    an execution unit, configured to perform, according to the preset operation mode corresponding to the grammar category of the digital speech sequence, the operation corresponding to the semantics of the sequence segment corresponding to each specified sentence component of the digital speech sequence.
CN201310753083.7A 2013-12-31 2013-12-31 A kind of speech sentences recognition methods and device Active CN104751856B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310753083.7A CN104751856B (en) 2013-12-31 2013-12-31 A kind of speech sentences recognition methods and device

Publications (2)

Publication Number Publication Date
CN104751856A CN104751856A (en) 2015-07-01
CN104751856B true CN104751856B (en) 2017-12-22

Family

ID=53591415

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310753083.7A Active CN104751856B (en) 2013-12-31 2013-12-31 A kind of speech sentences recognition methods and device

Country Status (1)

Country Link
CN (1) CN104751856B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106228983B (en) * 2016-08-23 2018-08-24 北京谛听机器人科技有限公司 A kind of scene process method and system in man-machine natural language interaction
CN107895578B (en) * 2017-11-15 2021-07-20 百度在线网络技术(北京)有限公司 Voice interaction method and device
CN107919127B (en) * 2017-11-27 2021-04-06 北京地平线机器人技术研发有限公司 Voice processing method and device and electronic equipment
CN108959617B (en) * 2018-07-18 2022-03-25 上海萌番文化传播有限公司 Grammar feature matching method, device, medium and computing equipment
CN112185418B (en) * 2020-11-12 2022-05-17 度小满科技(北京)有限公司 Audio processing method and device

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5425127A (en) * 1991-06-19 1995-06-13 Kokusai Denshin Denwa Company, Limited Speech recognition method
US5621849A (en) * 1991-06-11 1997-04-15 Canon Kabushiki Kaisha Voice recognizing method and apparatus
CN1588538A (en) * 2004-09-29 2005-03-02 上海交通大学 Training method for embedded automatic sound identification system
CN101086843A (en) * 2006-06-07 2007-12-12 中国科学院自动化研究所 A sentence similarity recognition method for voice answer system
CN101154379A (en) * 2006-09-27 2008-04-02 夏普株式会社 Method and device for locating keywords in voice and voice recognition system
CN101171624A (en) * 2005-03-11 2008-04-30 株式会社建伍 Speech synthesis device, speech synthesis method, and program
CN101562012A (en) * 2008-04-16 2009-10-21 创而新(中国)科技有限公司 Method and system for graded measurement of voice
CN102426835A (en) * 2011-08-30 2012-04-25 华南理工大学 Method for identifying local discharge signals of switchboard based on support vector machine model
CN102760444A (en) * 2012-04-25 2012-10-31 清华大学 Support vector machine based classification method of base-band time-domain voice-frequency signal
CN102831891A (en) * 2011-06-13 2012-12-19 富士通株式会社 Processing method and system for voice data
CN102970618A (en) * 2012-11-26 2013-03-13 河海大学 Video on demand method based on syllable identification
CN103035241A (en) * 2012-12-07 2013-04-10 中国科学院自动化研究所 Model complementary Chinese rhythm interruption recognition system and method

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2655903B2 (en) * 1989-02-02 1997-09-24 シャープ株式会社 Voice recognition device
JP3049711B2 (en) * 1989-03-14 2000-06-05 ソニー株式会社 Audio processing device

Also Published As

Publication number Publication date
CN104751856A (en) 2015-07-01

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant