CN109872726A - Pronunciation evaluation method, device, electronic equipment and medium
- Publication number: CN109872726A
- Application number: CN201910234740.4A
- Authority: CN (China)
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
An embodiment of the invention discloses a pronunciation evaluation method, device, electronic equipment and medium. The method comprises: obtaining, in real time, the voice uttered by an assessment object; if a target keyword is identified in the voice obtained in real time, matching the identified target keyword against the standard keywords in an assessment sentence; if the value of the matching result is less than a confidence threshold, determining the currently identified target keyword to be an invalid keyword, and continuing to identify, from the voice obtained in real time, the target keywords that follow the invalid keyword; if the value of the matching result is greater than or equal to the confidence threshold, determining the currently identified target keyword to be an effective keyword, and determining the assessment result of the effective keyword according to the assessment object's pronunciation features for that keyword. The embodiments of the invention address the lack of objectivity in the results that existing pronunciation evaluation methods produce for children, improve the objectivity of pronunciation evaluation results, and increase the flexibility of the assessment interaction.
Description
Technical field
The embodiments of the present invention relate to the field of intelligent education technology, and in particular to a pronunciation evaluation method, device, electronic equipment and medium.
Background art
Assessing children's pronunciation gives a better understanding of a child's language ability, which plays an important role during the child's language-learning stage. For example, pronunciation evaluation can show whether a child pronounces correctly, how well the child understands the language, and how well the child copes with complex language.
Currently, in a speech-assessment interaction, the user starts speaking according to the on-screen prompt after hearing the start prompt tone played by the pronunciation evaluation product. When the assessment system detects the end point of the user's speech, or speech acquisition times out, it stops collecting the user's speech. The collected speech is then compared, as a whole, with the sentence template of the assessment system, and the user's pronunciation evaluation result is returned according to the comparison.
Because children's cognitive ability and self-control are relatively weak, they often cannot complete an assessment strictly as the product requires, and various random events occur during pronunciation evaluation, such as skipping unfamiliar words in the assessment sentence or reversing the order in which its words are pronounced. If the above scheme is still used, the evaluation result for the child's pronunciation lacks objectivity. Moreover, given the characteristics of children, the assessment interaction under the above scheme lacks flexibility.
Summary of the invention
The embodiments of the present invention provide a pronunciation evaluation method, device, electronic equipment and medium, so as to improve the objectivity and accuracy of pronunciation evaluation results and increase the flexibility of the assessment interaction.
In a first aspect, an embodiment of the invention provides a pronunciation evaluation method, comprising:
obtaining, in real time, the voice uttered by an assessment object;
if a target keyword is identified in the voice obtained in real time, matching the identified target keyword against the standard keywords in an assessment sentence;
if the value of the matching result is less than a confidence threshold, determining the currently identified target keyword to be an invalid keyword, and continuing to identify, from the voice obtained in real time, the target keywords that follow the invalid keyword;
if the value of the matching result is greater than or equal to the confidence threshold, determining the currently identified target keyword to be an effective keyword, and determining the assessment result of the effective keyword according to the assessment object's pronunciation features for the effective keyword.
In a second aspect, an embodiment of the invention further provides a pronunciation evaluation device, comprising:
a voice obtaining module, configured to obtain, in real time, the voice uttered by an assessment object;
a keyword matching module, configured to, if a target keyword is identified in the voice obtained in real time, match the identified target keyword against the standard keywords in an assessment sentence;
an invalid keyword determining module, configured to, if the value of the matching result is less than a confidence threshold, determine the currently identified target keyword to be an invalid keyword, and continue to identify, from the voice obtained in real time, the target keywords that follow the invalid keyword;
an effective keyword evaluation module, configured to, if the value of the matching result is greater than or equal to the confidence threshold, determine the currently identified target keyword to be an effective keyword, and determine the assessment result of the effective keyword according to the assessment object's pronunciation features for the effective keyword.
In a third aspect, an embodiment of the invention further provides electronic equipment, comprising:
one or more processors;
a storage device, configured to store one or more programs,
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the pronunciation evaluation method described in any embodiment of the invention.
In a fourth aspect, an embodiment of the invention further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the pronunciation evaluation method described in any embodiment of the invention.
The embodiments of the invention obtain the voice uttered by the assessment object in real time and identify the obtained voice in real time. As soon as a target keyword is identified, it is matched against the standard keywords in the assessment sentence, and the assessment result is then determined from the matching result and the assessment object's pronunciation features. This amounts to identifying and matching in a loop, one keyword at a time. It solves the problem that existing pronunciation evaluation methods produce results lacking objectivity for assessment objects with relatively weak self-control, improves the objectivity and accuracy of pronunciation evaluation results, and increases the flexibility of the assessment interaction.
Brief description of the drawings
Fig. 1 is a flow chart of the pronunciation evaluation method provided by Embodiment one of the invention;
Fig. 2 is a flow chart of another pronunciation evaluation method provided by Embodiment one of the invention;
Fig. 3 is a flow chart of the pronunciation evaluation method provided by Embodiment two of the invention;
Fig. 4 is a structural schematic diagram of the pronunciation evaluation device provided by Embodiment three of the invention;
Fig. 5 is a structural schematic diagram of the electronic equipment provided by Embodiment four of the invention.
Detailed description of the embodiments
The invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the invention rather than the entire structure.
Embodiment one
Fig. 1 is a flow chart of the pronunciation evaluation method provided by Embodiment one of the invention. This embodiment is applicable to pronunciation evaluation of assessment objects with relatively weak self-control, such as children. The method can be executed by a pronunciation evaluation device, which can be implemented in software and/or hardware and can be integrated in electronic equipment such as a mobile terminal, an intelligent home appliance or an intelligent education product.
As shown in Fig. 1, the pronunciation evaluation method provided by this embodiment may include:
S110: Obtain, in real time, the voice uttered by the assessment object.
After the pronunciation evaluation function of the electronic equipment is activated, a voice acquisition device on the equipment, such as a microphone, can be used to collect the voice uttered by the assessment object in real time. During pronunciation evaluation, the electronic equipment typically displays the standard assessment sentence on the screen, and the assessment object pronounces according to the displayed content, for example by reading along.
S120: If a target keyword is identified in the voice obtained in real time, match the identified target keyword against the standard keywords in the assessment sentence.
The electronic equipment can use speech recognition technology to identify the voice obtained in real time and extract the target keywords in it. The standard keywords in the assessment sentence can be obtained by splitting the assessment sentence. In this embodiment, both target keywords and standard keywords are words made up of at least one minimum language element, and the minimum language element differs from language to language. For Chinese, the minimum language element is a single Chinese character, so a target keyword or standard keyword may be a single character or a phrase made up of at least two characters. For English, the minimum language element is a single English word, so a target keyword or standard keyword may be a single word or a phrase made up of at least two words. At least two target keywords, or at least two standard keywords, can form a complete sentence.
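The splitting of an assessment sentence into minimum language elements can be sketched as follows. This is only an illustrative sketch: the function name `split_standard_keywords` and the regular expression are assumptions, not part of the patent, and the Chinese input is a hypothetical example. It treats each CJK character and each run of Latin letters as one element.

```python
import re


def split_standard_keywords(sentence):
    """Split an assessment sentence into minimum language elements.

    Assumption for illustration: one CJK character, or one run of
    Latin letters (an English word), counts as one element.
    """
    return re.findall(r"[\u4e00-\u9fff]|[A-Za-z']+", sentence)


# English: one element per word.
print(split_standard_keywords("You are beautiful"))  # ['You', 'are', 'beautiful']
# Chinese (hypothetical input): one element per character.
print(split_standard_keywords("你好吗"))
```

Grouping several elements into a phrase-level standard keyword would need a word segmenter, which this sketch deliberately leaves out.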
For example, take a Chinese sentence meaning "Children should form good habits from an early age" as the standard assessment sentence for a child's pronunciation evaluation. After the child utters the first Chinese character (glossed "small"), the electronic equipment obtains that voice, identifies the character in it, and matches it against the standard keywords contained in the assessment sentence. As the child continues speaking and utters the second character (glossed "friend"), the equipment again obtains the voice, identifies the character in it, and matches it against the standard keywords contained in the assessment sentence. Identification and matching of each character the child says continue until the child has spoken every character of the assessment sentence. Character-by-character identification and matching is used here only as an illustration and should not be construed as a specific limitation of this embodiment. The child's pronunciation can also be evaluated by identifying and matching a mixture of phrases and single characters. For example, after the child finishes saying the word "child", the equipment identifies the word "child" in the current voice in real time and matches it against the standard keywords contained in the assessment sentence. As speech continues and the child finishes saying the character glossed "should", the equipment again identifies that character in the voice and matches it against the standard keywords. The child can then go on to utter "from", "small", "form", "good" and "habit", and the equipment obtains and identifies each utterance in real time. The child may utter voice containing target keywords in any order and is not restricted to the keyword order defined by the normal word order of the assessment sentence.
Whether the target keyword in each voice segment obtained by the electronic equipment consists of one minimum language element or of at least two can be determined from the length of the pause that follows each utterance of the assessment object. The electronic equipment has a sufficiently high recognition sensitivity to distinguish these cases.
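One way to picture pause-based grouping is the sketch below. It assumes the recognizer already yields timestamped minimum language elements; the tuple layout, the function name `group_by_pause` and the 0.3 s gap are hypothetical choices made for illustration, since the patent does not specify how the pause is measured.

```python
def group_by_pause(elements, max_gap=0.3):
    """Group timestamped elements into target keywords by pause length.

    elements: list of (text, start_sec, end_sec) in time order.
    max_gap:  hypothetical pause threshold in seconds; a longer pause
              closes the current keyword and starts a new one.
    """
    keywords, current, last_end = [], [], None
    for text, start, end in elements:
        if current and start - last_end > max_gap:
            keywords.append(" ".join(current))
            current = []
        current.append(text)
        last_end = end
    if current:
        keywords.append(" ".join(current))
    return keywords


# "you" and "are" are spoken almost back to back; "beautiful" follows a long pause.
stream = [("you", 0.0, 0.2), ("are", 0.25, 0.4), ("beautiful", 1.0, 1.6)]
print(group_by_pause(stream))
```

Elements are joined with a space, which suits English; for Chinese characters an empty separator would be the natural choice.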
If no target keyword is identified in the voice obtained in real time, for example because the assessment object makes a sound with no real meaning such as a cough, the current voice can be discarded and the next segment of the assessment object's voice obtained.
S130: If the value of the matching result is less than the confidence threshold, determine the currently identified target keyword to be an invalid keyword, and continue to identify, from the voice obtained in real time, the target keywords that follow the invalid keyword.
The matching result between a target keyword and a standard keyword is used to determine whether the assessment object is pronouncing the content of the assessment sentence, and can be expressed as a specific numerical value. A matching value below the confidence threshold indicates that the currently identified target keyword does not belong to the assessment sentence, that is, it is an invalid keyword. The confidence threshold can be set adaptively according to the required matching precision.
Specifically, since a target keyword can be matched against every standard keyword in the assessment sentence, determining the currently identified target keyword to be an invalid keyword when the matching value is less than the confidence threshold may include: determining the matching result between the currently identified target keyword and each standard keyword and, if every matching result is less than the confidence threshold, determining the target keyword to be an invalid keyword; or, determining the matching result between the currently identified target keyword and each standard keyword, finding the maximum among these matching results and, if the maximum is less than the confidence threshold, determining the currently identified target keyword to be an invalid keyword. If the current target keyword is an invalid keyword, it is ignored, and the equipment continues to obtain and identify the voice uttered by the assessment object in real time, that is, it performs S110 to S120 again. This prevents content outside the assessment sentence (interfering speech) from disturbing the pronunciation evaluation, and makes the assessment interaction more flexible and user-friendly.
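The maximum-based variant of this decision can be sketched as follows. The patent does not say how the matching value is computed, so this sketch substitutes a plain text-similarity ratio from Python's `difflib` as a stand-in; the names `match_score` and `classify` and the threshold 0.8 are likewise assumptions made for illustration.

```python
from difflib import SequenceMatcher


def match_score(target, standard):
    """Stand-in matching value in [0, 1]; a real system would likely
    compare acoustic or phonetic representations instead of text."""
    return SequenceMatcher(None, target.lower(), standard.lower()).ratio()


def classify(target, standard_keywords, threshold=0.8):
    """Return (matched standard keyword, score), or (None, score) if invalid.

    The target is invalid when even its best match falls below the
    confidence threshold, which is equivalent to every match being below it.
    """
    scores = {s: match_score(target, s) for s in standard_keywords}
    best = max(scores, key=scores.get)
    if scores[best] < threshold:
        return None, scores[best]
    return best, scores[best]


keywords = ["You", "are", "beautiful"]
print(classify("Apple", keywords)[0])  # None: ignored as an invalid keyword
print(classify("you", keywords))       # ('You', 1.0)
```

Checking only the maximum avoids comparing every score with the threshold separately, and it also tells the caller which standard keyword the utterance should be scored against.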
For example, as shown in Fig. 2, the complete assessment sentence is "You are beautiful". The electronic equipment splits this sentence into words, giving three standard keywords, "You", "are" and "beautiful", and stores them. After the pronunciation evaluation function of the equipment is activated, it obtains the voice "Apple" uttered by the assessment object and identifies the target keyword "Apple" in real time. "Apple" is matched against the three standard keywords, and all three matching values are less than the confidence threshold, so "Apple" is determined to be an invalid keyword and is ignored: the word is not taken into account in the pronunciation evaluation, and the equipment continues to obtain the assessment object's voice after "Apple". Invalid keywords also include keywords that the equipment identifies in voice uttered by anyone other than the assessment object, and keywords identified in ambient sound.
In addition, if the currently identified target keyword is determined to be an invalid keyword, that is, it does not belong to the assessment sentence, a pre-stored sound bank can also be called. The assessment result of the target keyword is then determined from the standard pronunciation features stored for that keyword in the sound bank, and the assessment object is prompted that the current target keyword is not part of the assessment sentence. The assessment result can be presented as a score, a pronunciation grade or in another form; this embodiment places no specific limitation on it.
S140: If the value of the matching result is greater than or equal to the confidence threshold, determine the currently identified target keyword to be an effective keyword, and determine the assessment result of the effective keyword according to the assessment object's pronunciation features for the effective keyword.
A matching value greater than or equal to the confidence threshold indicates that the currently identified target keyword belongs to the assessment sentence, that is, it is an effective keyword. Effective keywords can be determined in a way similar to invalid keywords: either by comparing the matching result between the currently identified target keyword and each standard keyword with the confidence threshold, or by comparing the maximum of those matching results with the confidence threshold. If the current target keyword is an effective keyword, the assessment object's pronunciation features for that keyword can be compared with the keyword's standard pronunciation features in the assessment sentence to determine the assessment result of the effective keyword, that is, the quality of the assessment object's pronunciation of it, for example whether the pronunciation is accurate. While the assessment result of the effective keyword is being determined, the equipment can continue to obtain and identify the assessment object's voice, that is, perform S110 to S120 again.
For example, continuing with Fig. 2, after the electronic equipment obtains the assessment object's pronunciation of "You", it identifies the target keyword "You", determines through matching that "You" is an effective keyword, and then determines the assessment result for that word from the assessment object's pronunciation features for "You". When the assessment object has finished saying "You" and goes on to say "beautiful", the equipment, having identified "You", goes on to identify the target keyword "beautiful" and determines whether "beautiful" is an effective keyword. This process repeats until the pronunciation evaluation ends.
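The loop of Fig. 2 can be sketched as a function over a stream of recognized keywords. Everything named here is hypothetical: `run_assessment`, the `(keyword, features)` stream format, and the pluggable `match` and `score` callables stand in for the recognition, matching and pronunciation scoring components, which the patent does not define in detail.

```python
def run_assessment(recognized, standard_keywords, match, score, threshold=0.8):
    """Identify-and-match loop, one keyword at a time.

    recognized: iterable of (target_keyword, pronunciation_features).
    match:      callable(target, standard) -> matching value.
    score:      callable(features, standard) -> assessment result.
    Returns {standard keyword: assessment result} for effective keywords.
    """
    results = {}
    for target, features in recognized:
        best = max(standard_keywords, key=lambda s: match(target, s))
        if match(target, best) < threshold:
            continue  # invalid keyword: ignore it and keep listening
        results[best] = score(features, best)  # effective keyword
    return results


# Toy stand-ins: exact text match, and features that already hold a score.
exact = lambda t, s: 1.0 if t.lower() == s.lower() else 0.0
passthrough = lambda features, s: features

stream = [("Apple", 90), ("You", 70), ("beautiful", 80)]
print(run_assessment(stream, ["You", "are", "beautiful"], exact, passthrough))
```

Because each keyword is handled on its own, "Apple" is simply dropped and the skipped word "are" does not block the scoring of "beautiful".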
It should be noted that matching the previous target keyword against the standard keywords does not affect the equipment's acquisition of the assessment object's current voice or its identification of the current target keyword. As long as the assessment object's voice can be obtained in real time, real-time identification and matching of target keywords continue. The assessment results of the individual target keywords can be fed back to the assessment object together after the assessment ends, or each result can be fed back during the assessment as soon as it is determined.
Optionally, the method further includes: if it is detected that the number of identified effective keywords equals the number of standard keywords contained in the assessment sentence, stopping the acquisition of the voice uttered by the assessment object. Deciding when to stop obtaining voice from the number of effective keywords, that is, using that number as a termination condition for the pronunciation evaluation, makes it possible to obtain the assessment object's pronunciation of every standard keyword and therefore to determine the pronunciation quality for each of them. Compared with the prior art, this avoids the situation in which the assessment object does not finish the assessment sentence within the specified time and the pronunciation evaluation result therefore lacks objectivity. It also removes the limit on the assessment object's speaking time, making the evaluation process more flexible. The scheme of this embodiment can, of course, also decide when to stop obtaining the assessment object's voice according to a preset assessment time, which can be set flexibly according to factors such as the assessment object's speaking rate and behaviour during the evaluation.
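Both termination conditions can be combined in one small check, sketched below. The function name `should_stop` and its parameters are hypothetical. The sketch also assumes that a repeated keyword is counted once, by tracking the distinct standard keywords matched so far; the patent only speaks of "the number of effective keywords".

```python
import time


def should_stop(matched, standard_keywords, started_at, time_limit=None, now=None):
    """Decide whether to stop obtaining voice.

    matched:    set of distinct standard keywords matched so far.
    started_at: time.monotonic() value taken when the assessment began.
    time_limit: optional preset assessment time in seconds.
    now:        injectable clock value, mainly for testing.
    """
    if len(matched) == len(set(standard_keywords)):
        return True  # every standard keyword has been pronounced
    if time_limit is None:
        return False  # no time limit: keep listening
    now = time.monotonic() if now is None else now
    return now - started_at >= time_limit


keywords = ["You", "are", "beautiful"]
print(should_stop({"You", "are"}, keywords, started_at=0.0, now=1.0))
print(should_stop({"You", "are", "beautiful"}, keywords, started_at=0.0, now=1.0))
```

With `time_limit=None` the assessment object may pause or hesitate for as long as needed, which matches the flexibility the paragraph above describes.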
The technical solution of this embodiment obtains the voice uttered by the assessment object in real time and identifies the obtained voice in real time. As soon as a target keyword is identified, it is matched against the standard keywords in the assessment sentence. If the current target keyword is determined to be an invalid keyword, it is ignored and the target keywords that follow it continue to be identified from the voice obtained in real time. This effectively removes content outside the assessment sentence during pronunciation evaluation, filters out the assessment object's valid voice input, and prevents content outside the assessment sentence from affecting the accuracy of the assessment result. If the current target keyword is determined to be an effective keyword, its assessment result is determined while the equipment continues to obtain and identify the assessment object's voice, until the evaluation ends. This solves the problem that existing pronunciation evaluation methods produce results lacking objectivity for assessment objects with relatively weak self-control, weakens the influence that the pronunciation order of the keywords in the assessment sentence has on the assessment object's result, improves the applicability of the pronunciation evaluation method, improves the objectivity and accuracy of pronunciation evaluation results, and increases the flexibility of the assessment interaction.
Embodiment two
Fig. 3 is a flow chart of the pronunciation evaluation method provided by Embodiment two of the invention. This embodiment further optimizes and extends the above embodiment. As shown in Fig. 3, the method may include:
S210: In a duplex mode based on echo cancellation technology, obtain, in real time, the voice uttered by the assessment object. Duplex mode is the mode in which the voice acquisition device and the assessment-system prompt-tone playing device work simultaneously, and the prompt-tone playing device is used to prompt the assessment object to speak during the assessment interaction.
S220: If a target keyword is identified in the voice obtained in real time, match the identified target keyword against the standard keywords in the assessment sentence.
S230: If the value of the matching result is less than the confidence threshold, determine the currently identified target keyword to be an invalid keyword, and continue to identify, from the voice obtained in real time, the target keywords that follow the invalid keyword, that is, perform S210 to S220 again.
S240: If the value of the matching result is greater than or equal to the confidence threshold, determine the currently identified target keyword to be an effective keyword, and determine the assessment result of the effective keyword according to the assessment object's pronunciation features for the effective keyword. While the assessment result of the effective keyword is being determined, the equipment can continue to obtain and identify the assessment object's voice, that is, perform S210 to S220 again.
This embodiment obtains the assessment object's voice in real time in duplex mode. In other words, once the pronunciation evaluation function of the electronic equipment is activated, the voice acquisition device can capture the assessment object's voice at any time, and the assessment object does not need to wait until the prompt-tone playing device in the equipment has finished playing the prompt tone before starting to speak. This relaxes the constraint that the pronunciation evaluation process places on when the assessment object starts to pronounce. For children in particular, even if a child does not speak strictly according to the prompt tone, with this scheme the speech uttered before the prompt tone is not lost from the voice collected by the equipment merely because the child started speaking before the prompt tone was played (speaking early). That is, speaking early does not make the collected speech incomplete.
Likewise, because the voice acquisition device can capture voice at any time, the scheme of this embodiment places no specific limit on how long the assessment object pronounces during the evaluation. The point at which the assessment object finishes pronouncing is therefore also flexible, unlike the prior art, which requires the assessment object to finish pronouncing within a specified time in order to guarantee the accuracy of the result. For example, under the prior art, if the assessment object pauses for a long time or hesitates during the evaluation, this not only wastes the specified speaking time and affects the objectivity of the result, but may also cause the electronic equipment to conclude wrongly that the assessment object has finished and to end the evaluation. Because the scheme of this embodiment does not limit the pronunciation duration, the assessment object has ample time to speak, and hesitations or pauses fall within the tolerance of the assessment and do not affect the accuracy or objectivity of the final pronunciation evaluation result.
In addition, on the basis of the above duplex mode, echo-cancellation signal processing can be introduced into the pronunciation evaluation process. With it, system sounds such as prompt tones and guidance sounds issued by the assessment-system prompt-tone playing device, once picked up by the voice acquisition device, are adaptively cancelled using the reference loop of the playing device, so they do not become interfering sounds that affect the accuracy of the pronunciation evaluation result.
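To make the idea of adaptive cancellation against a reference signal concrete, the sketch below uses a normalized least-mean-squares (NLMS) filter. NLMS is a common choice for acoustic echo cancellation, but the patent does not name an algorithm, so this is a swapped-in illustration; the function name `nlms_cancel`, the filter length, the step size and the synthetic two-tap echo path are all assumptions.

```python
import random


def nlms_cancel(mic, ref, taps=4, mu=0.5, eps=1e-8):
    """Remove the part of `mic` that is predictable from `ref` (NLMS).

    mic: microphone samples (prompt-tone echo, plus any speech).
    ref: samples sent to the loudspeaker (the reference loop).
    Returns the residual signal after the estimated echo is subtracted.
    """
    weights = [0.0] * taps  # adaptive estimate of the echo path
    history = [0.0] * taps  # most recent reference samples, newest first
    residual = []
    for d, x in zip(mic, ref):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        error = d - estimate
        power = sum(h * h for h in history) + eps
        weights = [w + mu * error * h / power for w, h in zip(weights, history)]
        residual.append(error)
    return residual


# Synthetic check: the microphone hears only an echo of the prompt signal.
random.seed(0)
ref = [random.uniform(-1.0, 1.0) for _ in range(2000)]
echo = [0.6 * ref[n] + (0.3 * ref[n - 1] if n else 0.0) for n in range(len(ref))]
out = nlms_cancel(echo, ref)
before = sum(v * v for v in echo[-200:])
after = sum(v * v for v in out[-200:])
print(after < 0.01 * before)
```

In a real device the microphone signal would also contain the assessment object's speech, which is uncorrelated with the reference and therefore survives in the residual that is passed on to recognition.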
Optionally, the method further includes: determining a comprehensive pronunciation evaluation result for the assessment object according to the assessment results of the individual effective keywords, where the comprehensive result includes a comparison between the order in which the effective keywords were pronounced and the order of the corresponding standard keywords.
In this embodiment, once the assessment sentence is given, no strict requirement is placed on the order in which the assessment object pronounces its keywords. The assessment object can pronounce the keywords in the order in which they appear in the assessment sentence, can pronounce them out of order, or can skip an unfamiliar keyword and go straight to the keywords after it. Because this embodiment evaluates pronunciation one keyword at a time, a pronunciation evaluation result for a keyword is available as soon as the assessment object says it, and the result for the complete assessment sentence, that is, the comprehensive pronunciation evaluation result, is then derived from the results of the individual keywords. Compared with prior-art methods that evaluate a whole sentence as a unit, the scheme of this embodiment weakens the influence that the pronunciation order of the keywords has on the assessment object's result. This improves the objectivity and accuracy of the pronunciation evaluation result, gives it more reference value for understanding the assessment object's language ability, and improves the applicability and validity of the evaluation. For children in particular, whose cognitive ability and self-control are relatively weak, this scheme removes the constraints that existing evaluation methods impose: children can follow their nature and complete the pronunciation evaluation in any order, which improves the flexibility of the assessment interaction.
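A comprehensive result with an order comparison might be assembled as in the sketch below. The patent does not say how per-keyword results are aggregated, so the plain mean used here is an assumption, as are the function name `comprehensive_result`, the field names, and the simplification that each keyword is spoken at most once.

```python
def comprehensive_result(standard_keywords, spoken):
    """Combine per-keyword results into a sentence-level result.

    spoken: list of (matched standard keyword, score) in spoken order;
            assumed to contain each keyword at most once.
    """
    scores = dict(spoken)
    spoken_order = [keyword for keyword, _ in spoken]
    # Order the same keywords would have in the assessment sentence.
    expected_order = [k for k in standard_keywords if k in scores]
    return {
        "mean_score": sum(scores.values()) / len(scores) if scores else 0.0,
        "missing": [k for k in standard_keywords if k not in scores],
        "spoken_order": spoken_order,
        "in_standard_order": spoken_order == expected_order,
    }


# The child said "beautiful" first, then "You", and skipped "are".
result = comprehensive_result(
    ["You", "are", "beautiful"], [("beautiful", 80), ("You", 70)]
)
print(result)
```

Keeping the order comparison as a separate field lets the score reflect pronunciation quality alone while still reporting that the word order differed or that a keyword was skipped.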
The technical solution of this embodiment first obtains the voice uttered by the assessment object in real time in a duplex mode based on echo cancellation technology, which relaxes the constraint that the pronunciation evaluation process places on when the assessment object starts to pronounce and increases the flexibility of the assessment interaction. The obtained voice is then identified in real time. As soon as a target keyword is identified, it is matched against the standard keywords in the assessment sentence, and the assessment result is determined from the matching result and the assessment object's pronunciation features. This amounts to identifying and matching in a loop, one keyword at a time. It solves the problem that existing pronunciation evaluation methods produce results lacking objectivity for assessment objects with relatively weak self-control, weakens the influence that the pronunciation order of the keywords in the assessment sentence has on the assessment object's result, improves the objectivity and accuracy of pronunciation evaluation results, and further increases the flexibility of the assessment interaction.
Embodiment three
Fig. 4 is a structural schematic diagram of the pronunciation evaluation device provided by Embodiment three of the present invention. This embodiment is applicable to pronunciation evaluation of assessment objects with relatively weak self-control, such as children. The device may be implemented in software and/or hardware and may be integrated into an electronic device, for example an intelligent education product such as an intelligent robot.
As shown in Fig. 4, the pronunciation evaluation device provided in this embodiment may include a voice obtaining module 310, a keyword matching module 320, an invalid keyword determining module 330 and an effective keyword evaluation module 340, wherein:
The voice obtaining module 310 is configured to obtain, in real time, the voice issued by the assessment object;
The keyword matching module 320 is configured to, if a target keyword is identified in the voice obtained in real time, match the identified target keyword against the standard keywords in the assessment sentence;
The invalid keyword determining module 330 is configured to, if the value of the matching result is less than a confidence threshold, determine the currently identified target keyword to be an invalid keyword, and continue to identify the target keyword following the invalid keyword from the voice obtained in real time;
The effective keyword evaluation module 340 is configured to, if the value of the matching result is greater than or equal to the confidence threshold, determine the currently identified target keyword to be an effective keyword, and determine the assessment result of the effective keyword according to the assessment object's pronunciation features for that keyword.
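The interplay of the four modules can be sketched as a single loop. This is a minimal sketch under stated assumptions, not the patented implementation: plain string similarity from `difflib` stands in for acoustic keyword matching, the threshold value 0.8 is invented for illustration, and `score_pronunciation` is a hypothetical callback representing the pronunciation-feature scoring. The final check corresponds to the optional stop condition of this embodiment.

```python
from difflib import SequenceMatcher


def best_match(target, standard_keywords):
    """Return (standard keyword, score in [0, 1]) for the closest standard keyword."""
    scored = [(SequenceMatcher(None, target, s).ratio(), s) for s in standard_keywords]
    score, keyword = max(scored)
    return keyword, score


def assess_keywords(recognized, standard_keywords, score_pronunciation, threshold=0.8):
    """Classify recognized words as effective or invalid and score the effective ones."""
    effective = {}   # standard keyword -> assessment result
    invalid = []     # recognized words whose match score fell below the threshold
    for word in recognized:                  # stands in for the real-time stream
        keyword, score = best_match(word, standard_keywords)
        if score < threshold:
            invalid.append(word)             # invalid keyword: skip it, keep listening
            continue
        if keyword not in effective:         # assumption: the first valid utterance counts
            effective[keyword] = score_pronunciation(word, keyword)
        if len(effective) == len(standard_keywords):
            break                            # every standard keyword has been heard
    return effective, invalid


# Keywords spoken out of order, with one filler word and one trailing extra word.
standard = ["the", "apple", "is", "red"]
heard = ["apple", "uh", "red", "is", "the", "banana"]
results, skipped = assess_keywords(heard, standard, lambda spoken, std: 1.0)
```

Because each keyword is matched independently of its position, the out-of-order utterance still yields four effective keywords, and the filler word is simply set aside as invalid.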
Optionally, the device further includes:
A voice obtaining stop module, configured to stop obtaining the voice issued by the assessment object if it detects that the number of identified effective keywords equals the number of standard keywords included in the assessment sentence.
Optionally, the voice obtaining module 310 is specifically configured to:
Obtain the voice issued by the assessment object in real time in a duplex mode based on echo cancellation technology, wherein the duplex mode refers to a mode in which the voice acquisition device and the assessment system prompt tone playing device work at the same time, and the assessment system prompt tone playing device is used to prompt the assessment object to speak during the assessment interaction.
Optionally, the device further includes:
A comprehensive assessment result determining module, configured to determine the comprehensive pronunciation evaluation result of the assessment object according to the assessment result of each effective keyword, wherein the comprehensive pronunciation evaluation result includes a comparison of the pronunciation order of each effective keyword with that of the corresponding standard keyword.
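One way to picture the comprehensive result is as a per-keyword order comparison plus an aggregate score. The sketch below is only an assumption about its shape: the source does not specify how per-keyword results are combined, so the arithmetic mean, the function name `comprehensive_result` and the field names are all hypothetical.

```python
def comprehensive_result(spoken, standard_order):
    """Combine per-keyword results and compare spoken order with standard order.

    spoken: (standard keyword, score) pairs in the order they were pronounced
    standard_order: the keywords in the order they appear in the assessment sentence
    """
    comparison = []
    for spoken_index, (keyword, _score) in enumerate(spoken):
        standard_index = standard_order.index(keyword)
        comparison.append({
            "keyword": keyword,
            "spoken_index": spoken_index,
            "standard_index": standard_index,
            "same_position": spoken_index == standard_index,
        })
    # Assumed aggregation: a plain average of the per-keyword scores.
    mean_score = sum(score for _kw, score in spoken) / len(spoken) if spoken else 0.0
    return {"mean_score": mean_score, "order_comparison": comparison}


report = comprehensive_result(
    [("apple", 0.9), ("the", 0.7), ("is", 0.8), ("red", 1.0)],
    ["the", "apple", "is", "red"],
)
```

Keeping the order comparison separate from the score lets a reviewer see that the first two keywords were swapped without that swap lowering the pronunciation score itself.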
The pronunciation evaluation device provided by this embodiment of the present invention can execute the pronunciation evaluation method provided by any embodiment of the present invention, and has the functional modules and beneficial effects corresponding to the executed method. For content not described in detail in this embodiment, refer to the description in any method embodiment of the present invention.
Embodiment four
Fig. 5 is a structural schematic diagram of an electronic device provided by Embodiment four of the present invention. Fig. 5 shows a block diagram of an exemplary electronic device 412 suitable for implementing embodiments of the present invention. The electronic device 412 shown in Fig. 5 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present invention.
As shown in Fig. 5, the electronic device 412 takes the form of a general-purpose electronic device. The components of the electronic device 412 may include, but are not limited to: one or more processors 416, a storage device 428, a voice collection device 450, a sound playing device 452, and a bus 418 connecting the different system components (including the storage device 428, the processor 416, the voice collection device 450 and the sound playing device 452). The voice collection device 450 includes a microphone for collecting, in real time, the voice issued by the assessment object; the sound playing device 452 includes a loudspeaker for playing system prompt tones, such as a prompt tone that prompts the assessment object to speak.
The bus 418 represents one or more of several types of bus structures, including a storage device bus or storage device controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus structures. For example, these architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus and the Peripheral Component Interconnect (PCI) bus.
The electronic device 412 typically includes a variety of computer-system-readable media. These media may be any available media that can be accessed by the electronic device 412, including volatile and non-volatile media, and removable and non-removable media.
The storage device 428 may include computer-system-readable media in the form of volatile memory, such as a random access memory (RAM) 430 and/or a cache memory 432. The electronic device 412 may further include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, the storage system 434 may be used to read from and write to a non-removable, non-volatile magnetic medium (not shown in Fig. 5, commonly called a "hard disk drive"). Although not shown in Fig. 5, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (such as a "floppy disk") may be provided, as well as an optical disc drive for reading from and writing to a removable non-volatile optical disc (such as a Compact Disc Read-Only Memory (CD-ROM), a Digital Video Disc-Read Only Memory (DVD-ROM) or other optical media). In these cases, each drive may be connected to the bus 418 through one or more data media interfaces. The storage device 428 may include at least one program product having a set of (for example, at least one) program modules configured to perform the functions of the embodiments of the present invention.
A program/utility 440 having a set of (at least one) program modules 442 may be stored in, for example, the storage device 428. Such program modules 442 include, but are not limited to, an operating system, one or more application programs, other program modules and program data; each of these examples, or some combination of them, may include an implementation of a network environment. The program modules 442 generally perform the functions and/or methods of the embodiments described in the present invention.
The electronic device 412 may also communicate with one or more external devices 414 (such as a keyboard, a pointing device, a display 424, and so on), with one or more terminals that enable a user to interact with the electronic device 412, and/or with any terminal (such as a network card, a modem, and so on) that enables the electronic device 412 to communicate with one or more other computing terminals. Such communication may take place through an input/output (I/O) interface 422. The electronic device 412 may also communicate with one or more networks (such as a local area network (LAN), a wide area network (WAN) and/or a public network such as the Internet) through a network adapter 420. As shown in Fig. 5, the network adapter 420 communicates with the other modules of the electronic device 412 through the bus 418. It should be understood that, although not shown in the figure, other hardware and/or software modules may be used in conjunction with the electronic device 412, including but not limited to: microcode, terminal drivers, redundant processors, external disk drive arrays, Redundant Arrays of Independent Disks (RAID) systems, tape drives and data backup storage systems.
The processor 416 runs the programs stored in the storage device 428 to execute various functional applications and data processing, for example to implement the pronunciation evaluation method provided by any embodiment of the present invention. The method may include:
Obtaining, in real time, the voice issued by the assessment object;
If a target keyword is identified in the voice obtained in real time, matching the identified target keyword against the standard keywords in the assessment sentence;
If the value of the matching result is less than a confidence threshold, determining the currently identified target keyword to be an invalid keyword, and continuing to identify the target keyword following the invalid keyword from the voice obtained in real time;
If the value of the matching result is greater than or equal to the confidence threshold, determining the currently identified target keyword to be an effective keyword, and determining the assessment result of the effective keyword according to the assessment object's pronunciation features for the effective keyword.
Embodiment five
Embodiment five of the present invention further provides a computer-readable storage medium on which a computer program is stored. When the program is executed by a processor, it implements the pronunciation evaluation method provided by any embodiment of the present invention. The method may include:
Obtaining, in real time, the voice issued by the assessment object;
If a target keyword is identified in the voice obtained in real time, matching the identified target keyword against the standard keywords in the assessment sentence;
If the value of the matching result is less than a confidence threshold, determining the currently identified target keyword to be an invalid keyword, and continuing to identify the target keyword following the invalid keyword from the voice obtained in real time;
If the value of the matching result is greater than or equal to the confidence threshold, determining the currently identified target keyword to be an effective keyword, and determining the assessment result of the effective keyword according to the assessment object's pronunciation features for the effective keyword.
The computer storage medium of the embodiments of the present invention may be any combination of one or more computer-readable media. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in connection with an instruction execution system, apparatus or device.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate or transmit a program for use by or in connection with an instruction execution system, apparatus or device.
The program code contained on a computer-readable medium may be transmitted over any suitable medium, including but not limited to wireless, wire, optical cable, RF and the like, or any suitable combination of the above.
Computer program code for carrying out the operations of the present invention may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or terminal. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
Note that the above are only preferred embodiments of the present invention and the technical principles applied. Those skilled in the art will appreciate that the invention is not limited to the specific embodiments described herein, and that various obvious changes, readjustments and substitutions can be made without departing from the protection scope of the present invention. Therefore, although the present invention has been described in further detail through the above embodiments, it is not limited to those embodiments; it may include other equivalent embodiments without departing from the inventive concept, and its scope is determined by the scope of the appended claims.
Claims (10)
1. A pronunciation evaluation method, characterized by comprising:
Obtaining, in real time, the voice issued by an assessment object;
If a target keyword is identified in the voice obtained in real time, matching the identified target keyword against the standard keywords in an assessment sentence;
If the value of the matching result is less than a confidence threshold, determining the currently identified target keyword to be an invalid keyword, and continuing to identify the target keyword following the invalid keyword from the voice obtained in real time;
If the value of the matching result is greater than or equal to the confidence threshold, determining the currently identified target keyword to be an effective keyword, and determining the assessment result of the effective keyword according to the assessment object's pronunciation features for the effective keyword.
2. The method according to claim 1, characterized in that the method further comprises:
If it is detected that the number of identified effective keywords equals the number of standard keywords included in the assessment sentence, stopping obtaining the voice issued by the assessment object.
3. The method according to claim 1 or 2, characterized in that obtaining, in real time, the voice issued by the assessment object comprises:
Obtaining the voice issued by the assessment object in real time in a duplex mode based on echo cancellation technology, wherein the duplex mode refers to a mode in which a voice acquisition device and an assessment system prompt tone playing device work at the same time, and the assessment system prompt tone playing device is used to prompt the assessment object to speak during the assessment interaction.
4. The method according to claim 1, characterized in that the method further comprises:
Determining the comprehensive pronunciation evaluation result of the assessment object according to the assessment result of each effective keyword, wherein the comprehensive pronunciation evaluation result includes a comparison of the pronunciation order of each effective keyword with that of the corresponding standard keyword.
5. A pronunciation evaluation device, characterized by comprising:
A voice obtaining module, configured to obtain, in real time, the voice issued by an assessment object;
A keyword matching module, configured to, if a target keyword is identified in the voice obtained in real time, match the identified target keyword against the standard keywords in an assessment sentence;
An invalid keyword determining module, configured to, if the value of the matching result is less than a confidence threshold, determine the currently identified target keyword to be an invalid keyword, and continue to identify the target keyword following the invalid keyword from the voice obtained in real time;
An effective keyword evaluation module, configured to, if the value of the matching result is greater than or equal to the confidence threshold, determine the currently identified target keyword to be an effective keyword, and determine the assessment result of the effective keyword according to the assessment object's pronunciation features for the effective keyword.
6. The device according to claim 5, characterized in that the device further comprises:
A voice obtaining stop module, configured to stop obtaining the voice issued by the assessment object if it detects that the number of identified effective keywords equals the number of standard keywords included in the assessment sentence.
7. The device according to claim 5 or 6, characterized in that the voice obtaining module is specifically configured to:
Obtain the voice issued by the assessment object in real time in a duplex mode based on echo cancellation technology, wherein the duplex mode refers to a mode in which a voice acquisition device and an assessment system prompt tone playing device work at the same time, and the assessment system prompt tone playing device is used to prompt the assessment object to speak during the assessment interaction.
8. The device according to claim 5, characterized in that the device further comprises:
A comprehensive assessment result determining module, configured to determine the comprehensive pronunciation evaluation result of the assessment object according to the assessment result of each effective keyword, wherein the comprehensive pronunciation evaluation result includes a comparison of the pronunciation order of each effective keyword with that of the corresponding standard keyword.
9. An electronic device, characterized by comprising:
One or more processors;
A storage device for storing one or more programs,
Wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the pronunciation evaluation method according to any one of claims 1-4.
10. A computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the pronunciation evaluation method according to any one of claims 1-4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910234740.4A CN109872726A (en) | 2019-03-26 | 2019-03-26 | Pronunciation evaluating method, device, electronic equipment and medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109872726A true CN109872726A (en) | 2019-06-11 |
Family
ID=66921325
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910234740.4A Pending CN109872726A (en) | 2019-03-26 | 2019-03-26 | Pronunciation evaluating method, device, electronic equipment and medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109872726A (en) |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190611 |