CN106512393A - Application voice control method and system suitable for virtual reality environment - Google Patents
Application voice control method and system suitable for virtual reality environment
- Publication number
- CN106512393A (application CN201610899523.3A)
- Authority
- CN
- China
- Prior art keywords
- voice
- speech
- module
- time window
- voice command
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63F—CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
- A63F13/00—Video games, i.e. games using an electronically generated display having two or more dimensions
- A63F13/20—Input arrangements for video game devices
- A63F13/21—Input arrangements for video game devices characterised by their sensors, purposes or types
- A63F13/215—Input arrangements for video game devices characterised by their sensors, purposes or types comprising means for detecting acoustic signals, e.g. using a microphone
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/16—Sound input; Sound output
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Human Computer Interaction (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Computational Linguistics (AREA)
- Theoretical Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- User Interface Of Digital Computer (AREA)
Abstract
The invention provides an application voice control method and system suitable for a virtual reality environment. The method includes: a voice acquisition step: acquiring a voice input command of a user; a voice command recognition step: extracting one or more voice input words from the voice input command of the user, and obtaining a voice command by matching on the voice input words; and a control command acquisition step: acquiring the control command associated with the voice command. The method and system avoid the defect that command input is limited by the shortage of hardware input devices (such as a mouse and a keyboard) in a virtual reality game environment; greatly improve the feedback speed of obtaining a result by voice command; let the user control the input time instead of the system monitoring input in real time; and reduce interference from players' unintentional speech and from external sounds.
Description
Technical field
The present invention relates to the fields of virtual reality game technology and voice technology, and in particular to an application voice control method and system suitable for a virtual reality environment.
Background
As virtual reality technology gradually matures, it is attracting more and more attention, and virtual reality games are one of the focal points.
The video game industry has developed over several decades, and people have become used to controlling games with a mouse and keyboard. In a virtual reality environment, however, hardware limitations mean that a game cannot be controlled by mouse and keyboard. How to let players experience game content comfortably and naturally in a virtual reality environment has become a major problem that virtual reality game developers need to solve.
Over the years voice technology has developed greatly and has begun to move from highly specialized research and production fields into everyday life. The best-known example is speech recognition, which relies on a huge sample library and complex recognition algorithms to recognize vocabulary, and on artificial neural networks and grammar-rule-based speech processing mechanisms to compose complete sentences. This requires enormous material and human resources, and small and medium-sized enterprises can hardly bear the cost. Because of the size of the database and the complexity of the algorithms, recognition has a relatively high latency and cannot provide the immediate feedback people need when playing games. Moreover, human language is extremely complex, so recognition accuracy falls as the spoken input grows longer.
For these reasons, no company in the computer game field has yet applied voice technology to the control of game systems; games are still controlled through traditional input methods such as keyboard and mouse.
Summary of the invention
In view of the defects in the prior art, the object of the present invention is to provide an application voice control method and system suitable for a virtual reality environment.
An application voice control method suitable for a virtual reality environment provided according to the present invention comprises:
Voice acquisition step: acquiring a voice input command of a user;
Voice command recognition step: extracting one or more voice input words from the voice input command of the user, and obtaining a voice command by matching on the voice input words;
Control command acquisition step: acquiring the control command associated with the voice command.
Preferably, the voice acquisition step comprises:
Acquisition time window setting step: determining a voice acquisition time window according to an operation of the user;
Time-limited voice acquisition step: acquiring the voice input command of the user within the voice acquisition time window;
Sentence-break determination step: while the voice input command of the user is being acquired, treating a pronunciation pause greater than or equal to a pause-duration threshold as a sentence-break marker.
Preferably, the acquisition time window setting step comprises:
Time window start setting step: outside a voice acquisition time window, taking the moment at which the user operates the input device as the start time of the current voice acquisition time window;
Time window end setting step: while the current voice acquisition time window is open, taking the moment at which the user operates the input device as the end time of this voice acquisition time window.
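The start and end setting steps above can be read as a toggle driven by the input device. The following Python sketch illustrates that reading only; the class name `CaptureWindow` and its methods are hypothetical and do not appear in the patent, and timestamps are assumed to be plain seconds.

```python
class CaptureWindow:
    """Toggle-style voice acquisition window (illustrative sketch, names are assumptions)."""

    def __init__(self):
        self.start_time = None   # None means no window is currently open
        self.windows = []        # completed (start, end) pairs

    @property
    def is_open(self):
        return self.start_time is not None

    def on_device_operated(self, timestamp):
        """Operating the device outside a window opens one; inside a window it closes it."""
        if not self.is_open:
            self.start_time = timestamp
            return "opened"
        self.windows.append((self.start_time, timestamp))
        self.start_time = None
        return "closed"

    def accepts(self, timestamp):
        """Speech is only accepted while a window is open."""
        return self.is_open and timestamp >= self.start_time


window = CaptureWindow()
print(window.accepts(0.5))             # speech before the window opens is ignored
print(window.on_device_operated(1.0))  # first operation opens the window
print(window.accepts(1.5))             # speech inside the window is accepted
print(window.on_device_operated(3.0))  # second operation closes the window
```

A hold-to-talk button would fit the same interface by calling `on_device_operated` on both press and release.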
Preferably, the voice command recognition step comprises:
Word splitting step: according to the language model library, extracting one or more voice input words from the voice input command of the user, the one or more voice input words forming a to-be-recognized group;
Matching step: matching the to-be-recognized group against the language model library to obtain the speech recognition group in the language model library that matches the to-be-recognized group;
wherein speech recognition groups correspond one-to-one with voice commands.
Preferably, the language model library module is built solely from the voice commands, through:
Voice command presetting step: presetting one or more voice commands, wherein the voice commands are stored in the language model library;
Speech recognition group construction step: for each single voice command, constructing one or more keywords extracted from the voice command into a speech recognition group, wherein the speech recognition group is stored in the language model library module;
Command association step: establishing a one-to-one association between speech recognition groups and control commands, wherein the association is stored in the language model library module.
An application voice control system suitable for a virtual reality environment provided according to the present invention comprises:
Voice acquisition module: acquiring a voice input command of a user;
Voice command recognition module: extracting one or more voice input words from the voice input command of the user, and obtaining a voice command by matching on the voice input words;
Control command acquisition module: acquiring the control command associated with the voice command.
Preferably, the voice acquisition module comprises:
Acquisition time window setting module: determining a voice acquisition time window according to an operation of the user;
Time-limited voice acquisition module: acquiring the voice input command of the user within the voice acquisition time window;
Sentence-break determination module: while the voice input command of the user is being acquired, treating a pronunciation pause greater than or equal to a pause-duration threshold as a sentence-break marker.
Preferably, the acquisition time window setting module comprises:
Time window start setting module: outside a voice acquisition time window, taking the moment at which the user operates the input device as the start time of the current voice acquisition time window;
Time window end setting module: while the current voice acquisition time window is open, taking the moment at which the user operates the input device as the end time of this voice acquisition time window.
Preferably, the voice command recognition module comprises:
Word splitting module: according to the language model library, extracting one or more voice input words from the voice input command of the user, the one or more voice input words forming a to-be-recognized group;
Matching module: matching the to-be-recognized group against the language model library to obtain the speech recognition group in the language model library that matches the to-be-recognized group;
wherein speech recognition groups correspond one-to-one with voice commands.
Preferably, the system comprises:
Voice command presetting module: presetting one or more voice commands, wherein the voice commands are stored in the language model library;
Speech recognition group construction module: for each single voice command, constructing one or more keywords extracted from the voice command into a speech recognition group, wherein the speech recognition group is stored in the language model library module;
Command association module: establishing a one-to-one association between speech recognition groups and control commands, wherein the association is stored in the language model library module;
wherein the language model library module is built solely from the voice commands.
Compared with the prior art, the present invention has the following beneficial effects:
1. It compensates for and avoids the situation in which, in a virtual reality game environment, the lack of hardware input devices (such as a mouse and keyboard) leaves command input extremely limited (for example, with the existing HTC VIVE virtual reality game input devices, the user can control the game only through 2 handheld controllers, each of which has only 6 buttons).
2. The feedback speed of obtaining a result by voice command is greatly improved. Editing the speech model library reduces its scale; and because the grammar-rule-based speech processing mechanism is abandoned and only the spoken words themselves are matched, the amount of computation needed to recognize voice information is also greatly reduced.
3. The player controls the input time, rather than the system monitoring input at all times, which reduces interference from the player's unintentional speech and from external sounds. A pause-duration marker is set so that the player controls the pause duration, which reduces sentence-break errors caused by the brief pauses of natural speech.
4. The recognition rate of long sentences is greatly improved. Because human language is complex and arbitrary, it is very difficult for a computer to compose a complete sentence using a grammar-rule-based speech processing mechanism, so conventional speech recognition technology has a low recognition rate for long sentences. With the method and system of the present invention, the keywords in a voice command are matched and screened, so the more keywords a voice command contains, the easier it is to match correctly, which greatly raises the recognition probability of long sentences.
5. The cost of building a usable voice control system is greatly reduced. At present, acoustic models, dictionaries and even large-vocabulary language models for many languages are available for download, but most of the content of such huge model libraries is not actually needed; it cannot simply be deleted because of constraints of the speech recognition algorithm and considerations of future software content updates. Meanwhile, most enterprises cannot bear the cost of collecting dedicated speech data. With the method and system of the present invention, vendors can edit a language model library suited to their own game software by themselves. This both allows the voice resources required by content updates to be added and frees vendors from the acquisition cost of a huge acoustic model library and from complex semantic processing mechanisms. It gives vendors more ways to bring people enjoyment and to create more value for society.
6. It is closer to people's everyday habits and has a very low learning cost. The keyboard and mouse have existed for decades, yet many special groups still need a long time to learn and master them. Language, by contrast, is a skill that everyone habitually masters; it needs no relearning and is easier to accept, understand and remember.
7. Interaction and control in a virtual reality environment become better and more natural. In daily life people are used to interacting and controlling through language and gestures, and what virtual reality games emphasize is precisely a strong sense of immersion. With the method and system of the present invention, people are freed from the limitation of hand-only control and can interact and control in the more natural way of combining voice and gestures.
Description of the drawings
Other features, objects and advantages of the present invention will become more apparent on reading the detailed description of non-limiting embodiments made with reference to the following drawings:
Fig. 1 is a module relationship diagram of the present invention.
Fig. 2 is a schematic diagram of the speech processing principle of the present invention.
Fig. 3 is a step flowchart of the present invention.
Detailed description
The present invention is described in detail below with reference to specific embodiments. The following embodiments will help those skilled in the art to further understand the present invention, but do not limit it in any way. It should be noted that those of ordinary skill in the art can make a number of changes and improvements without departing from the inventive concept. These all fall within the scope of protection of the present invention.
An application voice control method suitable for a virtual reality environment provided according to the present invention comprises:
Voice acquisition step: acquiring a voice input command of a user;
Voice command recognition step: extracting one or more voice input words from the voice input command of the user, and obtaining a voice command by matching on the voice input words;
Control command acquisition step: acquiring the control command associated with the voice command.
Preferably, the voice acquisition step comprises:
Acquisition time window setting step: determining a voice acquisition time window according to an operation of the user;
Time-limited voice acquisition step: acquiring the voice input command of the user within the voice acquisition time window;
Sentence-break determination step: while the voice input command of the user is being acquired, treating a pronunciation pause greater than or equal to a pause-duration threshold as a sentence-break marker.
Preferably, the acquisition time window setting step comprises:
Time window start setting step: outside a voice acquisition time window, taking the moment at which the user operates the input device as the start time of the current voice acquisition time window;
Time window end setting step: while the current voice acquisition time window is open, taking the moment at which the user operates the input device as the end time of this voice acquisition time window.
Preferably, the voice command recognition step comprises:
Word splitting step: according to the language model library, extracting one or more voice input words from the voice input command of the user, the one or more voice input words forming a to-be-recognized group;
Matching step: matching the to-be-recognized group against the language model library to obtain the speech recognition group in the language model library that matches the to-be-recognized group;
wherein speech recognition groups correspond one-to-one with voice commands.
Preferably, the language model library module is built solely from the voice commands, through:
Voice command presetting step: presetting one or more voice commands, wherein the voice commands are stored in the language model library;
Speech recognition group construction step: for each single voice command, constructing one or more keywords extracted from the voice command into a speech recognition group, wherein the speech recognition group is stored in the language model library module;
Command association step: establishing a one-to-one association between speech recognition groups and control commands, wherein the association is stored in the language model library module.
The present invention also provides an application voice control system suitable for a virtual reality environment. The system can be implemented by carrying out the steps of the application voice control method described above. The system is described in detail below.
The application voice control system suitable for a virtual reality environment comprises: a voice command presetting module, which presets one or more voice commands, wherein the voice commands are stored in the language model library; a speech recognition group construction module, which, for each single voice command, constructs one or more keywords extracted from the voice command into a speech recognition group, wherein the speech recognition group is stored in the language model library module; and a command association module, which establishes a one-to-one association between speech recognition groups and control commands, wherein the association is stored in the language model library module. The language model library module is built solely from the voice commands.
Specifically, a traditional language model library (language model plus dictionary) holds a huge amount of information for an entire language, such as word pronunciations, occurrence probabilities and word combinations. The present invention instead builds the language model and dictionary only from the voice commands involved in an application such as a game, rather than using the model and dictionary of the whole language. This greatly reduces the scale of the language model and dictionary and thereby improves the accuracy and speed of speech recognition. In the speech recognition group construction module, the words in a voice command can be divided into two priorities, high and low, and the high-priority words are then used as keywords. The language model library module comprises a language model and a dictionary. The information stored in the language model constrains the word search: it defines the probability that a given word follows the previously recognized word, so that impossible words can be excluded from the matching process. For example, if "I" has been recognized, the probability that "eat" follows is high, whereas the probability that "egg" follows is very low. The dictionary contains the mapping from words to phonemes (phones). The pronunciation of each word is made up of phonemes, but because people pronounce words differently a word may have several mappings; for example, the phonemes of "Fire" include "F AY ER" or "F AY R". Storing these variants improves the recognition probability.
The application voice control system suitable for a virtual reality environment further comprises: a voice acquisition module, which acquires a voice input command of a user; a voice command recognition module, which extracts one or more voice input words from the voice input command of the user and obtains a voice command by matching on the voice input words; and a control command acquisition module, which acquires the control command associated with the voice command.
The voice acquisition module comprises: an acquisition time window setting module, which determines a voice acquisition time window according to an operation of the user; a time-limited voice acquisition module, which acquires the voice input command of the user within the voice acquisition time window; and a sentence-break determination module, which, while the voice input command of the user is being acquired, treats a pronunciation pause greater than or equal to a pause-duration threshold as a sentence-break marker. The acquisition time window setting module comprises: a time window start setting module, which, outside a voice acquisition time window, takes the moment at which the user operates the input device as the start time of the current voice acquisition time window; and a time window end setting module, which, while the current voice acquisition time window is open, takes the moment at which the user operates the input device as the end time of this voice acquisition time window.
Specifically, the input device may be a designated button on the virtual device. By activating this button the user controls the start and end times of voice input, so the game system does not need to monitor voice input at all times. While the designated button on the virtual device is not activated, the system is outside the voice acquisition time window: any voice input command the user utters is treated as invalid and is not passed to the game system, which largely avoids interference from the user's unintentional speech and from other sounds. In addition, a pronunciation pause of a certain duration (for example, a pause lasting 1 second) serves as the sentence-break marker: after the user has input a continuous stretch of speech, once the pause reaches 1 second the system automatically judges that this command input has ended. In this way the user controls the pauses between statements, which avoids sentence-break errors caused by the brief pauses of natural speech.
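The sentence-break rule described above can be sketched in Python as follows. This is an illustration only: it assumes an upstream recognizer already supplies each word with start and end times in seconds, and the function name `split_on_pauses` is hypothetical. The 1-second default mirrors the example threshold in the text.

```python
def split_on_pauses(timed_words, pause_threshold=1.0):
    """Group (word, start, end) tuples into commands, breaking wherever the
    silence between two words is at least pause_threshold seconds."""
    commands, current, last_end = [], [], None
    for word, start, end in timed_words:
        if last_end is not None and start - last_end >= pause_threshold:
            commands.append(current)   # the pause is long enough: close the command
            current = []
        current.append(word)
        last_end = end
    if current:
        commands.append(current)
    return commands


timed = [
    ("show", 0.0, 0.3), ("me", 0.5, 0.7), ("the", 0.8, 0.9), ("map", 1.0, 1.4),
    ("fire", 2.6, 3.0), ("debris", 3.1, 3.6),   # 1.2 s of silence before "fire"
]
print(split_on_pauses(timed))
```

Short gaps such as the 0.2 s between "show" and "me" stay inside one command, while the 1.2 s gap before "fire" starts a new one.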
The voice command recognition module comprises: a word splitting module, which, according to the language model library, extracts one or more voice input words from the voice input command of the user, the one or more voice input words forming a to-be-recognized group; and a matching module, which matches the to-be-recognized group against the language model library to obtain the speech recognition group in the language model library that matches the to-be-recognized group; wherein speech recognition groups correspond one-to-one with voice commands.
Specifically, the voice input words in the to-be-recognized group are matched and screened against the words contained in each speech recognition group. The speech recognition group with the highest matching degree is selected and, using this result as an index, the corresponding game command is looked up and the game system is controlled according to the command found. Both the voice input words and the words in the speech recognition groups are individual words, so matching is carried out word against word.
The application voice control system suitable for a virtual reality environment further comprises a device control module, which controls the game system according to the control command.
A preferred embodiment of the present invention is described below.
Example 1: using the voice command "show me the map" to open the map interface in a game.
Example 1 is implemented through the following steps:
Step 1: Suppose there are 3 voice commands: "show me the map", "show myself" and "fire debris". The related words ("show", "me", "the", "map", "myself", "fire", "debris") form the game's speech model library.
Step 2: Split each voice command and recombine its words according to their recognition priority to obtain the corresponding speech recognition group, as follows:
Voice command | Speech recognition group after splitting and recombination |
show me the map | "show" + "me" + "map" |
show myself | "show" + "myself" |
fire debris | "fire" + "debris" |
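Steps 1 and 2 can be sketched in Python as below. The patent only says that words are split into high and low priority; treating "the", "a" and "an" as the low-priority set is an assumption made for this example, as are the names `LOW_PRIORITY` and `build_recognition_group`.

```python
# Assumed low-priority words; the patent does not list them explicitly.
LOW_PRIORITY = {"the", "a", "an"}

VOICE_COMMANDS = ["show me the map", "show myself", "fire debris"]


def build_recognition_group(command, low_priority=LOW_PRIORITY):
    """Keep only the high-priority words of a command, in spoken order."""
    return tuple(word for word in command.lower().split() if word not in low_priority)


# Step 1: the small model library vocabulary is just the words of the preset commands.
vocabulary = sorted({word for command in VOICE_COMMANDS for word in command.split()})

# Step 2: one recognition group per voice command.
recognition_groups = {command: build_recognition_group(command) for command in VOICE_COMMANDS}

print(vocabulary)
for command, group in recognition_groups.items():
    print(command, "->", group)
```

The resulting groups match the table above: the article "the" is dropped from the first command and the other two commands keep both of their words.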
Step 3: Store each speech recognition group in association with its game command, for lookup in later steps, as follows:
Speech recognition group after splitting and recombination | Game control command |
"show" + "me" + "map" | Open the map interface |
"show" + "myself" | Open the character interface |
"fire" + "debris" | Cast a fireball |
Step 4: Acquire the voice input command spoken by the user and convert it into a to-be-recognized group. For example, if the user says "show me a map", the command is split into the voice input words "show" + "me" + "a" + "map".
Step 5: Match the to-be-recognized group "show" + "me" + "a" + "map" against each preset speech recognition group. For the group "show" + "me" + "map", all three keywords occur in the to-be-recognized group and in the correct order, so its matching degree is 100%. The full results are as follows:
Speech recognition group | Matching degree |
"show" + "me" + "map" | 100% |
"show" + "myself" | 50% |
"fire" + "debris" | 0% |
Screening on the matching results selects the speech recognition group with the highest matching degree, "show" + "me" + "map".
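The matching degrees in Step 5 are consistent with a simple rule: the fraction of a recognition group's keywords that occur among the spoken words. The Python sketch below implements that rule as an assumption; the patent does not give a formula, the function names are hypothetical, and word order is not checked here.

```python
def match_degree(spoken_words, recognition_group):
    """Fraction of the group's keywords that appear among the spoken words."""
    if not recognition_group:
        return 0.0
    spoken = set(spoken_words)
    hits = sum(1 for keyword in recognition_group if keyword in spoken)
    return hits / len(recognition_group)


def best_group(spoken_words, recognition_groups):
    """Return (group, degree) for the best match, or None when nothing matches at all."""
    scored = [(group, match_degree(spoken_words, group)) for group in recognition_groups]
    group, degree = max(scored, key=lambda pair: pair[1])
    return (group, degree) if degree > 0 else None


GROUPS = [("show", "me", "map"), ("show", "myself"), ("fire", "debris")]
to_be_recognized = "show me a map".split()

for group in GROUPS:
    print(" + ".join(group), f"{match_degree(to_be_recognized, group):.0%}")
print(best_group(to_be_recognized, GROUPS))
```

Under this rule the extra word "a" in the user's utterance has no effect on the score, which is how the scheme tolerates small deviations in wording.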
Step 6: In the associated storage module, look up the control command corresponding to the speech recognition group selected by matching (see Step 3), here "open the map interface", and send this game command to the game control system.
Step 7: On receiving the "open the map interface" command, the game control system performs the corresponding game feedback, and the flow ends.
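Steps 4 to 7 can be tied together in a short end-to-end Python sketch. Everything named here is hypothetical: the command identifiers such as `open_map_interface`, the `GameControlSystem` stub, and the keyword-fraction scoring, which is an assumed reading of the matching degree rather than a formula given in the patent.

```python
# Association table from Step 3; the command identifiers are illustrative.
COMMAND_TABLE = {
    ("show", "me", "map"): "open_map_interface",
    ("show", "myself"): "open_character_interface",
    ("fire", "debris"): "cast_fireball",
}


class GameControlSystem:
    """Stand-in for the game control system: it just records what it receives."""

    def __init__(self):
        self.received = []

    def execute(self, control_command):
        self.received.append(control_command)
        return f"executing {control_command}"


def handle_utterance(utterance, game, table=COMMAND_TABLE):
    """Split the utterance, pick the best recognition group, dispatch its command."""
    spoken = set(utterance.lower().split())

    def degree(group):
        return sum(keyword in spoken for keyword in group) / len(group)

    group = max(table, key=degree)
    if degree(group) == 0:
        return None               # no keyword matched: ignore the utterance
    return game.execute(table[group])


game = GameControlSystem()
print(handle_utterance("show me a map", game))
print(handle_utterance("good morning", game))
print(game.received)
```

Keeping the association in a plain table means new voice commands can be added by editing data rather than code, which fits the patent's emphasis on a small, editable model library.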
The above example shows the following:
The present invention can build a small, targeted language model library by selecting from an existing language model library only the speech data that the software requires, which greatly reduces the volume of data and saves the cost of collecting raw speech data.
Moreover, because word-recognition matching is used rather than recognition of the meaning of speech, the related computation is greatly reduced, which improves the feedback speed of voice commands.
In addition, because matching is done against speech recognition groups, each recognition group contains only keywords arranged in advance, and the more keywords there are, the more accurate the match. This not only improves the recognition success rate for long-sentence voice commands but also tolerates deviations in the user's wording when a voice command is spoken, making commands easier to remember and use.
The above is only a preferred embodiment of the present invention and is not intended to limit its scope of protection. Those skilled in the field of virtual reality games can devise many other modifications, equivalents and improved embodiments, including but not limited to: casting skills in a game by voice command, or controlling other game units in a game by voice command. Such modifications and embodiments fall within the spirit and scope disclosed in this application and should be included within the scope of protection of the present invention.
Those skilled in the art know that, besides implementing the system provided by the present invention and its devices, modules and units purely as computer-readable program code, the method steps can be logically programmed so that the system and its devices, modules and units realize the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. The system provided by the present invention and its devices, modules and units can therefore be regarded as a hardware component, and the devices, modules and units it contains for realizing various functions can also be regarded as structures within that hardware component; the devices, modules and units for realizing various functions can equally be regarded both as software modules implementing the method and as structures within the hardware component.
Specific embodiments of the present invention have been described above. It should be understood that the present invention is not limited to the particular embodiments described above; those skilled in the art may make various changes or modifications within the scope of the claims without affecting the substance of the present invention. Where no conflict arises, the embodiments of this application and the features in the embodiments may be combined with one another arbitrarily.
Claims (10)
1. An application voice control method suitable for a virtual reality environment, characterized by comprising:
Voice collection step: collecting a voice input instruction of a user;
Voice instruction recognition step: extracting one or more voice input words from the voice input instruction of the user, and obtaining a voice instruction by matching on the voice input words;
Control command obtaining step: obtaining the control command associated with the voice instruction.
2. The application voice control method suitable for a virtual reality environment according to claim 1, characterized in that the voice collection step comprises:
Collection time window setting step: determining a voice collection time window according to an operation of the user;
Time-limited voice collection step: collecting the voice input instruction of the user within the voice collection time window;
Sentence break judgment step: while the voice input instruction of the user is being collected, treating any pause in speech greater than or equal to a pause duration threshold as a sentence break marker.
3. The application voice control method suitable for a virtual reality environment according to claim 2, characterized in that the collection time window setting step comprises:
Time window start time setting step: when outside a voice collection time window, taking the moment at which the user operates the input device as the start time of the current voice collection time window;
Time window end time setting step: while the current voice collection time window is ongoing, taking the moment at which the user operates the input device as the end time of this voice collection time window.
4. The application voice control method suitable for a virtual reality environment according to claim 1, characterized in that the voice instruction recognition step comprises:
Word splitting step: according to the language model library, extracting one or more voice input words from the voice input instruction of the user, and forming the one or more voice input words into a group to be recognized;
Matching step: matching the group to be recognized in the language model library to obtain the speech recognition group in the language model library that matches the group to be recognized;
Wherein speech recognition groups correspond one-to-one with voice instructions.
5. The application voice control method suitable for a virtual reality environment according to claim 4, characterized in that the language model library module is built only from voice instructions, comprising:
Voice instruction presetting step: presetting one or more voice instructions, wherein the voice instructions are stored in the language model library;
Speech recognition group construction step: for a single voice instruction, constructing a speech recognition group from one or more key words extracted from the voice instruction, wherein the speech recognition group is stored in the language model library module;
Command association step: establishing a one-to-one association between speech recognition groups and control commands, wherein the association is stored in the language model library module.
6. An application voice control system suitable for a virtual reality environment, characterized by comprising:
Voice collection module: collecting a voice input instruction of a user;
Voice instruction recognition module: extracting one or more voice input words from the voice input instruction of the user, and obtaining a voice instruction by matching on the voice input words;
Control command obtaining module: obtaining the control command associated with the voice instruction.
7. The application voice control system suitable for a virtual reality environment according to claim 6, characterized in that the voice collection module comprises:
Collection time window setting module: determining a voice collection time window according to an operation of the user;
Time-limited voice collection module: collecting the voice input instruction of the user within the voice collection time window;
Sentence break judgment module: while the voice input instruction of the user is being collected, treating any pause in speech greater than or equal to a pause duration threshold as a sentence break marker.
8. The application voice control system suitable for a virtual reality environment according to claim 7, characterized in that the collection time window setting module comprises:
Time window start time setting module: when outside a voice collection time window, taking the moment at which the user operates the input device as the start time of the current voice collection time window;
Time window end time setting module: while the current voice collection time window is ongoing, taking the moment at which the user operates the input device as the end time of this voice collection time window.
9. The application voice control system suitable for a virtual reality environment according to claim 6, characterized in that the voice instruction recognition module comprises:
Word splitting module: according to the language model library, extracting one or more voice input words from the voice input instruction of the user, and forming the one or more voice input words into a group to be recognized;
Matching module: matching the group to be recognized in the language model library to obtain the speech recognition group in the language model library that matches the group to be recognized;
Wherein speech recognition groups correspond one-to-one with voice instructions.
10. The application voice control system suitable for a virtual reality environment according to claim 9, characterized by comprising:
Voice instruction presetting module: presetting one or more voice instructions, wherein the voice instructions are stored in the language model library;
Speech recognition group construction module: for a single voice instruction, constructing a speech recognition group from one or more key words extracted from the voice instruction, wherein the speech recognition group is stored in the language model library module;
Command association module: establishing a one-to-one association between speech recognition groups and control commands, wherein the association is stored in the language model library module;
Wherein the language model library module is built only from voice instructions.
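The collection behaviour recited in claims 2, 3, 7, and 8 can be sketched as follows. This is an illustrative reading under stated assumptions: the claims do not fix a data representation, so timestamps in seconds, the class name `CollectionWindow`, the function `split_on_pauses`, and the assumption that voiced stretches arrive sorted by time are all hypothetical.

```python
from typing import List, Optional, Tuple

Span = Tuple[float, float]  # (start_seconds, end_seconds), an assumed representation


class CollectionWindow:
    """Toggle a voice collection time window with one input-device action."""

    def __init__(self) -> None:
        self.start: Optional[float] = None  # None means no window is open
        self.closed: List[Span] = []        # finished (start, end) windows

    @property
    def is_open(self) -> bool:
        return self.start is not None

    def on_device_operated(self, timestamp: float) -> None:
        if self.start is None:
            # Outside a window: this moment becomes the window's start time.
            self.start = timestamp
        else:
            # Inside a window: this moment becomes the window's end time.
            self.closed.append((self.start, timestamp))
            self.start = None


def split_on_pauses(voiced: List[Span], pause_threshold: float) -> List[List[Span]]:
    """Group voiced stretches (sorted by time) into sentences.

    A silent gap greater than or equal to pause_threshold between two
    consecutive stretches is treated as a sentence break marker.
    """
    sentences: List[List[Span]] = []
    for span in voiced:
        if sentences and span[0] - sentences[-1][-1][1] < pause_threshold:
            sentences[-1].append(span)  # short pause: same sentence
        else:
            sentences.append([span])    # first stretch, or a long enough pause
    return sentences
```

In this sketch the same device action opens and closes the window, so the application only needs to track whether a window is currently open; the pause threshold is left as a parameter because the claims do not give a value.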
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610899523.3A CN106512393A (en) | 2016-10-14 | 2016-10-14 | Application voice control method and system suitable for virtual reality environment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610899523.3A CN106512393A (en) | 2016-10-14 | 2016-10-14 | Application voice control method and system suitable for virtual reality environment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106512393A true CN106512393A (en) | 2017-03-22 |
Family
ID=58332386
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610899523.3A Pending CN106512393A (en) | 2016-10-14 | 2016-10-14 | Application voice control method and system suitable for virtual reality environment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106512393A (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106782543A (en) * | 2017-03-24 | 2017-05-31 | 联想(北京)有限公司 | A kind of information processing method and electronic equipment |
CN106846959A (en) * | 2017-04-20 | 2017-06-13 | 成都景中教育软件有限公司 | Voice command and operation teaching aid in the method and device of software |
CN107274891A (en) * | 2017-05-23 | 2017-10-20 | 武汉秀宝软件有限公司 | A kind of AR interface alternation method and system based on speech recognition engine |
CN108389579A (en) * | 2018-02-09 | 2018-08-10 | 北京北行科技有限公司 | One kind is in VR virtual worlds speech control system and control method |
CN108986803A (en) * | 2018-06-26 | 2018-12-11 | 北京小米移动软件有限公司 | Scenery control method and device, electronic equipment, readable storage medium storing program for executing |
CN110299138A (en) * | 2019-06-28 | 2019-10-01 | 北京机械设备研究所 | A kind of augmented reality assembly technology instructs system and method |
CN110992757A (en) * | 2019-11-29 | 2020-04-10 | 北方工业大学 | High-risk chemical laboratory fire safety training method based on VR and artificial intelligence |
CN111326145A (en) * | 2020-01-22 | 2020-06-23 | 南京雷鲨信息科技有限公司 | Speech model training method, system and computer readable storage medium |
CN111622616A (en) * | 2020-04-15 | 2020-09-04 | 阜阳万瑞斯电子锁业有限公司 | Personal voice recognition unlocking system and method for electronic lock |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101383150A (en) * | 2008-08-19 | 2009-03-11 | 南京师范大学 | Control method of speech soft switch and its application in geographic information system |
CN105955698A (en) * | 2016-05-04 | 2016-09-21 | 深圳市凯立德科技股份有限公司 | Voice control method and apparatus |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106512393A (en) | Application voice control method and system suitable for virtual reality environment | |
CN110717339B (en) | Semantic representation model processing method and device, electronic equipment and storage medium | |
CN104484411B (en) | A kind of construction method of the semantic knowledge-base based on dictionary | |
CN101799849A (en) | Method for realizing non-barrier automatic psychological consult by adopting computer | |
CN106356057A (en) | Speech recognition system based on semantic understanding of computer application scenario | |
CN107972028A (en) | Man-machine interaction method, device and electronic equipment | |
Baur et al. | eXplainable cooperative machine learning with NOVA | |
CN102750125A (en) | Voice-based control method and control system | |
CN112487139A (en) | Text-based automatic question setting method and device and computer equipment | |
CN106776926A (en) | Improve the method and system of responsibility when robot talks with | |
CN110263653A (en) | A kind of scene analysis system and method based on depth learning technology | |
CN110457661A (en) | Spatial term method, apparatus, equipment and storage medium | |
Wagner et al. | Applying cooperative machine learning to speed up the annotation of social signals in large multi-modal corpora | |
CN110046232A (en) | Natural expression processing method, response method, equipment and the system of natural intelligence | |
CN110059166A (en) | Natural expression processing method, response method, equipment and the system of natural intelligence | |
CN110059168A (en) | The method that man-machine interactive system based on natural intelligence is trained | |
CN115293132A (en) | Conversation processing method and device of virtual scene, electronic equipment and storage medium | |
CN114676259B (en) | Conversation emotion recognition method based on causal perception interactive network | |
CN113591489A (en) | Voice interaction method and device and related equipment | |
CN110853669B (en) | Audio identification method, device and equipment | |
KR102139855B1 (en) | Intelligent personal assistant system based on the inner state of user | |
CN110008317A (en) | Natural expression processing method, response method, equipment and the system of natural intelligence | |
CN108804648A (en) | New word receiving and recording method based on voice search and electronic equipment | |
CN110263346B (en) | Semantic analysis method based on small sample learning, electronic equipment and storage medium | |
CN112885338A (en) | Speech recognition method, apparatus, computer-readable storage medium, and program product |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20170322 |