CN109326290A - Audio recognition method and device - Google Patents
- Publication number
- CN109326290A (application CN201811504208.1A)
- Authority
- CN
- China
- Prior art keywords
- user interface
- interface
- recognition result
- user
- recognition
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/221—Announcement of recognition results
Abstract
The invention discloses a speech recognition method comprising the following steps: obtaining user interface content; registering the user interface content; and, when a user voice instruction is received, determining the recognition result of the instruction according to the registered content. The invention also discloses a speech recognition device. The method and device of the present invention can improve the precision of speech recognition during voice interaction and significantly enhance the user experience.
Description
Technical field
The present invention relates to the technical field of speech recognition, and in particular to a speech recognition method and device.
Background art
As voice interaction technology matures, current products and methods based on voice interaction are prone to low recognition precision because of special cases such as polyphonic characters, multiple candidate results, and uncommon terms. For example, a user issues the voice instruction "Lotus Street" and expects it to be recognized as "Lotus Street", but the actual speech recognizer may return "Furong Street", which differs from the user's expectation and causes the problem of low recognition precision.
Summary of the invention
Through practice and experience, the inventor observed that a user's voice instructions are issued based on an application interface. The inventor further realized that, with the rapid development of voice interaction technology and the convenient services it provides, "what you see is what you can say" has become an irresistible trend. Under this trend, the operation of third-party applications is evolving from today's manual operation to operation by voice instruction, which will become mainstream. The inventor therefore conceived a new design to solve the above problem: register the content of the user interface of another application (such as an APP); a voice instruction issued against that interface can then be matched against the interface's word segments, and the matching result used as the recognition result. This improves the precision of speech recognition and enhances the user experience.
According to a first aspect of the present invention, a speech recognition method is provided, comprising the following steps:
obtaining user interface content;
registering the user interface content;
when a user voice instruction is received, determining the recognition result of the instruction according to the registered content.
According to a second aspect of the present invention, a speech recognition device is provided, comprising:
an interface content obtaining module for obtaining user interface content;
an interface content extraction module for registering the user interface content; and
a speech recognition module for determining, when a user voice instruction is received, the recognition result of the instruction according to the registered content.
According to a third aspect of the present invention, an electronic device is provided, comprising: at least one processor, and a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can carry out the steps of the above method.
According to a fourth aspect of the present invention, a storage medium is provided, on which a computer program is stored; the program, when executed by a processor, implements the steps of the above method.
The method and device provided by the present invention can improve the precision of speech recognition during voice interaction and significantly enhance the user experience.
Brief description of the drawings
Fig. 1 is a flowchart of a speech recognition method according to an embodiment of the present invention;
Fig. 2 is a flowchart of a speech recognition method according to another embodiment of the present invention;
Fig. 3 is a functional block diagram of a speech recognition device according to an embodiment of the present invention;
Fig. 4 is a block diagram of an electronic device according to an embodiment of the present invention.
Detailed description of the embodiments
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all of them. Based on these embodiments, all other embodiments obtained by a person of ordinary skill in the art without creative work shall fall within the protection scope of the present invention.
It should be noted that, in the absence of conflict, the embodiments of the present application and the features therein may be combined with each other.
The present invention may be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, and the like that perform specific tasks or implement specific abstract data types. The present invention may also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including storage devices.
In the present invention, terms such as "module", "device" and "system" refer to entities related to a computer, such as hardware, a combination of hardware and software, software, or software in execution. In particular, for example, an element may be, but is not limited to, a process running on a processor, a processor, an object, an executable, a thread of execution, a program, and/or a computer. An application program or script running on a server, or the server itself, may also be an element. One or more elements may reside within a process and/or thread of execution, and an element may be localized on one computer and/or distributed between two or more computers, and may be operated through various computer-readable media. Elements may also communicate by local and/or remote processes according to a signal having one or more data packets, for example, a signal from data interacting with another element in a local system, in a distributed system, and/or across a network such as the internet.
Finally, it should be noted that, herein, relational terms such as "first" and "second" are used merely to distinguish one entity or operation from another, without necessarily requiring or implying any actual relationship or order between them. Moreover, the terms "include" and "comprise" cover not only the listed elements but also other elements not explicitly listed, as well as elements inherent to the process, method, article or device concerned. In the absence of further restriction, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article or device that includes that element.
The speech recognition method of the embodiments of the present invention can be applied to any terminal device provided with a voice function, for example, a smart phone, a tablet computer or a smart home device; the invention is not limited in this regard. It enables users to obtain prompt and accurate responses while using these terminal devices, improving the user experience.
The invention will now be described in further detail with reference to the accompanying drawings.
Fig. 1 schematically shows a flowchart of a speech recognition method according to an embodiment of the present invention. As shown in Fig. 1, the present embodiment includes the following steps:
Step S101: obtain user interface content. A user interface is the user interface of an APP installed on the terminal device; the content can be obtained through the API of each APP's user interface.
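As a rough illustration of step S101, the sketch below walks a hypothetical widget tree and collects every visible text label. The dict-based node format and the field names ("text", "children") are assumptions for illustration, not an actual platform API.

```python
# Hypothetical sketch of step S101: obtaining user interface content by
# walking an APP's widget tree. Node shape and field names are assumed.

def collect_ui_content(node):
    """Depth-first traversal gathering every text label on the interface."""
    texts = []
    if node.get("text"):
        texts.append(node["text"])
    for child in node.get("children", ()):
        texts.extend(collect_ui_content(child))
    return texts

# A toy interface with two tappable street names.
ui_tree = {
    "text": None,
    "children": [
        {"text": "Lotus Street"},
        {"text": "Furong Street"},
    ],
}
print(collect_ui_content(ui_tree))  # ['Lotus Street', 'Furong Street']
```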
Step S102: register the user interface content. Word segmentation is performed on the content of each user interface; word segmentation is a mature technique and may be implemented with reference to the prior art. For example, "hello little speed" contains "hello" and "little speed", extracted in units of phrases. The extracted segments are determined as the interface segments of that user interface. The interface segments are then registered in the recognition engine through the engine's registration interface.
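Step S102 can be sketched as follows. The whitespace tokenizer stands in for a real word-segmentation tool (Chinese interface text would need a dedicated segmenter), and the in-memory set stands in for the recognition engine's registration interface; both are assumptions for illustration.

```python
# Sketch of step S102: segment each user interface's content and register
# the resulting segments. The naive segmenter and the set-based "engine"
# are stand-ins only, not the patent's actual implementation.

registered_segments = set()

def segment(text):
    # Placeholder: real deployments would call a mature word-segmentation
    # library, especially for Chinese interface text.
    return text.split()

def register_interface_content(ui_texts):
    for text in ui_texts:
        for phrase in segment(text):
            registered_segments.add(phrase)

register_interface_content(["Lotus Street", "Now Playing"])
print(sorted(registered_segments))  # ['Lotus', 'Now', 'Playing', 'Street']
```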
Step S103: when a user voice instruction is received, determine the recognition result of the instruction according to the registered content. A specific implementation is as follows: perform speech recognition on the received user voice instruction (the recognition itself may follow the prior art) to obtain a first recognition result, where the first recognition result is the result produced by a traditional or existing speech recognition method. Match the first recognition result against the interface segments registered in the previous step by similarity. If the matching succeeds, the matched interface segment is used as the final recognition result. Illustratively, the user issues the voice instruction "Lotus Street", and existing speech recognition produces the first recognition result "Furong Street"; matching "Furong Street" against the registered interface segments by pronunciation finds the interface segment "Lotus Street", and in this case the final recognition result of the first recognition result is corrected to "Lotus Street". That is, when an interface segment with the same or similar pronunciation is matched, the interface segment is preferred as the recognition result. If the matching fails, the first recognition result produced by the existing speech recognition method is used as the final recognition result. Illustratively, if no similar interface segment is matched, the first recognition result "Furong Street" is used as the recognition result.
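The matching in step S103 can be sketched as below. A real system would compare pronunciations (e.g. pinyin for Chinese text); here `difflib` string similarity is used as a stand-in, and the 0.6 threshold is an arbitrary assumption.

```python
import difflib

# Sketch of step S103: correct the first ASR result against registered
# interface segments. String similarity stands in for a real
# pronunciation-based comparison; the 0.6 threshold is arbitrary.

def similarity(a, b):
    return difflib.SequenceMatcher(None, a, b).ratio()

def resolve(first_result, registered_segments, threshold=0.6):
    best = max(registered_segments,
               key=lambda seg: similarity(seg, first_result),
               default=None)
    if best is not None and similarity(best, first_result) >= threshold:
        return best          # prefer the interface segment on a match
    return first_result      # matching failed: keep the plain ASR output

segments = {"Lotus Street", "Now Playing"}
print(resolve("Lotos Street", segments))   # Lotus Street
print(resolve("Open settings", segments))  # Open settings
```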
In a scenario where a user conducts voice interaction on a user interface, the embodiment of the present invention can quickly and accurately match the user's speech recognition result, improving recognition precision and avoiding the problems caused by special cases such as polyphonic characters and rare words. It effectively realizes the goal of "what you see is what you can say", so that all user interface operations can be carried out through voice interaction, enriching the user's voice-interaction experience and making the operation of various user interfaces richer and friendlier.
Fig. 2 schematically shows a flowchart of a speech recognition method according to another embodiment of the present invention. As shown in Fig. 2, the present embodiment includes:
Step S201: obtain user interface content. For a specific implementation, refer to step S101.
Step S202: configure a user interface identifier for the obtained user interface content. A specific implementation is as follows: according to the APP to which the user interface belongs, configure for the obtained user interface content an identifier that uniquely identifies that user interface. The identifier may consist of a field identifying the APP and a field identifying the user interface. Illustratively, the playlist user interface of a music APP may be assigned the identifier MUSIC.PLAYLIST001, where MUSIC denotes the music APP, PLAYLIST denotes the playlist, and 001 denotes the first user interface of the playlist.
In a preferred embodiment, the field identifying the user interface may directly reuse the user interface ID of the music APP itself.
In some embodiments, the user interface identifier may also be an ID assigned to the interface by the terminal device's system or by the respective application software. As long as each user interface is effectively identified, the embodiment of the present invention imposes no specific limitation on the implementation.
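The identifier scheme of step S202 can be sketched as below, following the MUSIC.PLAYLIST001 example; the exact format (app field, interface field, zero-padded index) is only one possible convention, not mandated by the patent.

```python
# Sketch of step S202: composing a unique user interface identifier from a
# field identifying the APP and a field identifying the interface, per the
# MUSIC.PLAYLIST001 example. The zero-padded format is an assumption.

def make_ui_id(app: str, screen: str, index: int) -> str:
    return f"{app.upper()}.{screen.upper()}{index:03d}"

print(make_ui_id("music", "playlist", 1))  # MUSIC.PLAYLIST001
```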
Step S203: register the user interface content. The implementation is essentially the same as step S102, the difference being that, after word segmentation is performed on each user interface's content and the interface segments are determined, the interface segments are registered in the recognition engine together with an association to the corresponding user interface identifier; that is, the identifier of the user interface to which each interface segment belongs is also registered.
Step S204: when a user voice instruction is received, determine the recognition result of the instruction according to the registered content and the user interface identifier. A specific implementation is as follows: when a user voice instruction is received, obtain the user interface the user is currently on by calling the corresponding system interface, and determine its user interface identifier. Then, according to the identifier of the current user interface, obtain the registered interface segments of the current user interface. Perform speech recognition on the received user voice instruction to obtain a first recognition result. Match the first recognition result against the registered interface segments of the current user interface, and determine the final recognition result of the first recognition result according to the matching result.
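Steps S203 and S204 can be sketched together: segments are registered under a user interface identifier, and an incoming first recognition result is matched only against the segments of the current interface. `difflib` similarity again stands in for pronunciation matching, and all names and the 0.6 threshold are illustrative assumptions.

```python
import difflib

# Sketch of steps S203-S204: per-interface registration and matching.
registry = {}  # user interface identifier -> set of interface segments

def register(ui_id, segments):
    registry.setdefault(ui_id, set()).update(segments)

def recognize(first_result, current_ui_id, threshold=0.6):
    candidates = registry.get(current_ui_id, set())
    ratio = lambda seg: difflib.SequenceMatcher(None, seg, first_result).ratio()
    best = max(candidates, key=ratio, default=None)
    # Prefer the current interface's segment when it is close enough.
    if best is not None and ratio(best) >= threshold:
        return best
    return first_result

register("MUSIC.PLAYLIST001", {"Lotus Street", "Shuffle"})
print(recognize("Lotos Street", "MUSIC.PLAYLIST001"))  # Lotus Street
print(recognize("Lotos Street", "MAPS.HOME001"))       # Lotos Street
```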
According to the present embodiment, because user interface identifiers are available, the further check of the first recognition result during voice interaction is based on the current user interface. The method can therefore effectively tell whether a voice instruction was issued against the current user interface, match the recognition result more accurately, and achieve higher speech recognition precision.
Fig. 3 schematically shows a functional block diagram of a speech recognition device according to an embodiment of the present invention. As shown in Fig. 3, the speech recognition device includes: an interface content obtaining module 4, an interface content extraction module 5 and a speech recognition module 6.
The interface content obtaining module 4 is used to obtain user interface content; it may be implemented by connecting to the API of each APP's user interface and capturing the user interface content through that interface.
The interface content extraction module 5 is used to register the user interface content. It comprises a word segmentation extraction unit 501 and a segment registering unit 502. The word segmentation extraction unit 501 performs word segmentation on the user interface content and determines the interface segments. The segment registering unit 502 registers the interface segments in the recognition engine; it may register the interface segments alone, or register them together with the interface identifier. For a specific implementation, refer to the method described above.
The speech recognition module 6 is used to determine, when a user voice instruction is received, the recognition result of the instruction according to the registered content. It includes a recognition unit 601 and a recognition verification unit 602. The recognition unit 601 performs speech recognition on the received user voice instruction to obtain a first recognition result; an existing speech recognition module may be used. The recognition verification unit 602 matches the first recognition result against the registered interface segments and determines the final recognition result of the first recognition result according to the matching result. The first recognition result may be matched against all registered interface segments, or only against the registered interface segments of the current user interface. For a specific implementation, refer to the method described above.
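The modules of Fig. 3 can be wired together as in the sketch below. All class and method names are illustrative; the segmenter and ASR are stubbed out, and a case-insensitive exact match stands in for the verification unit's similarity matching.

```python
# Sketch of the Fig. 3 device: modules 4/5 ingest interface content,
# module 6 recognizes and verifies. Names and internals are assumed.

class SpeechRecognitionDevice:
    def __init__(self, asr, segmenter):
        self.asr = asr              # recognition unit 601 (existing ASR)
        self.segmenter = segmenter  # word segmentation extraction unit 501
        self.segments = set()       # segment registering unit 502's store

    def ingest_interface(self, ui_texts):
        # Interface content obtaining + extraction modules (4 and 5).
        for text in ui_texts:
            self.segments.update(self.segmenter(text))

    def recognize(self, audio):
        # Speech recognition module (6) with verification unit 602:
        # case-insensitive exact match as a stand-in for similarity.
        first_result = self.asr(audio)
        for seg in self.segments:
            if seg.lower() == first_result.lower():
                return seg
        return first_result

device = SpeechRecognitionDevice(asr=lambda audio: "lotus street",
                                 segmenter=lambda text: [text])
device.ingest_interface(["Lotus Street", "Shuffle"])
print(device.recognize(b"<pcm audio>"))  # Lotus Street
```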
For a voice instruction issued against a user interface, the device of the embodiment of the present invention performs speech recognition, further matches the result against the interface segments, and corrects the recognition result according to the matching result, thereby improving the precision of speech recognition. Moreover, because the device corrects results based on segments extracted from the user interface, it can effectively extend the interaction modes of various applications, so that voice interaction works in all of them, realizing "what you see is what you can say" and enhancing the user experience.
In some embodiments, the embodiment of the present invention provides a non-volatile computer-readable storage medium storing one or more programs containing executable instructions, which can be read and executed by an electronic device (including but not limited to a computer, a server or a network device) to carry out any of the above speech recognition methods of the present invention.
In some embodiments, the embodiment of the present invention also provides a computer program product comprising a computer program stored on a non-volatile computer-readable storage medium; the computer program includes program instructions which, when executed by a computer, cause the computer to carry out any of the above speech recognition methods.
In some embodiments, the embodiment of the present invention also provides an electronic device comprising: at least one processor, and a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can carry out the speech recognition method.
In some embodiments, the embodiment of the present invention also provides a storage medium on which a computer program is stored, wherein the program, when executed by a processor, carries out the speech recognition method.
The speech recognition device of the above embodiments of the present invention can be used to carry out the speech recognition method of the embodiments of the present invention, and accordingly achieves the technical effects achieved by that method; details are not repeated here. In the embodiments of the present invention, the relevant functional modules may be implemented by a hardware processor.
Fig. 4 is a schematic diagram of the hardware structure of an electronic device for carrying out the speech recognition method provided by another embodiment of the present application. As shown in Fig. 4, the device includes:
one or more processors 410 and a memory 420; one processor 410 is taken as the example in Fig. 4.
The device for carrying out the speech recognition method may also include an input means 430 and an output means 440. The processor 410, memory 420, input means 430 and output means 440 may be connected by a bus or in other ways; connection by a bus is taken as the example in Fig. 4.
As a non-volatile computer-readable storage medium, the memory 420 can be used to store non-volatile software programs, non-volatile computer-executable programs and modules, such as the program instructions/modules corresponding to the speech recognition method in the embodiments of the present application. By running the non-volatile software programs, instructions and modules stored in the memory 420, the processor 410 executes the various functional applications and data processing of the server, i.e., implements the speech recognition method of the above method embodiments.
The memory 420 may include a program storage area and a data storage area, wherein the program storage area may store the operating system and the application programs required for at least one function, and the data storage area may store data created according to the use of the speech recognition device, etc. In addition, the memory 420 may include a high-speed random access memory and may also include a non-volatile memory, such as at least one magnetic disk storage device, flash memory device or other non-volatile solid-state storage device. In some embodiments, the memory 420 optionally includes memories remotely located relative to the processor 410; these remote memories may be connected to the speech recognition device through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks and combinations thereof.
The input means 430 can receive input numeric or character information and generate signals related to the user settings and function control of the speech recognition device. The output means 440 may include a display device such as a display screen.
The above one or more modules are stored in the memory 420 and, when executed by the one or more processors 410, carry out the speech recognition method in any of the above method embodiments.
The above product can carry out the method provided by the embodiments of the present application and has the corresponding functional modules and beneficial effects of carrying out the method. For technical details not described in detail in this embodiment, refer to the method provided by the embodiments of the present application.
The electronic devices of the embodiments of the present application exist in various forms, including but not limited to:
(1) Mobile communication devices: such devices are characterized by mobile communication functions, with voice and data communication as the main goal. This type of terminal includes smart phones (e.g. iPhone), multimedia phones, functional phones, low-end phones, etc.
(2) Ultra-mobile personal computer devices: such devices belong to the category of personal computers, have computing and processing functions, and generally also have mobile internet access. This type of terminal includes PDA, MID and UMPC devices, e.g. iPad.
(3) Portable entertainment devices: such devices can display and play multimedia content. This type of device includes audio and video players (e.g. iPod), handheld devices, e-books, intelligent toys and portable vehicle-mounted navigation devices.
(4) Servers: devices providing computing services. A server consists of a processor, hard disk, memory, system bus, etc.; its architecture is similar to that of a general-purpose computer, but because highly reliable services are required, the demands on processing capability, stability, reliability, security, scalability, manageability, etc. are higher.
(5) Other electronic devices with data interaction functions.
The device embodiments described above are merely illustrative; the units described as separate components may or may not be physically separated, and components shown as units may or may not be physical units. They may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
Through the above description of the embodiments, a person skilled in the art can clearly understand that each embodiment may be implemented by software plus a general hardware platform, or of course by hardware. Based on this understanding, the above technical solution, or the part of it that contributes beyond the related art, may essentially be embodied in the form of a software product. The computer software product may be stored in a computer-readable storage medium such as ROM/RAM, a magnetic disk or an optical disk, and includes instructions to cause a computer device (which may be a personal computer, a server, a network device, etc.) to carry out the method described in each embodiment or in certain parts of an embodiment.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, a person skilled in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of the technical features equivalently replaced; such modifications or replacements do not depart the essence of the corresponding technical solutions from the spirit and scope of the technical solutions of the embodiments of the present application.
Claims (10)
1. A speech recognition method, characterized by comprising:
obtaining user interface content;
registering the user interface content;
when a user voice instruction is received, determining the recognition result of the instruction according to the registered content.
2. The method according to claim 1, characterized in that registering the user interface content comprises:
performing word segmentation on the user interface content to determine interface segments;
registering the interface segments in a recognition engine.
3. The method according to claim 2, characterized in that determining, when a user voice instruction is received, the recognition result of the instruction according to the registered content comprises:
performing speech recognition on the received user voice instruction to obtain a first recognition result;
matching the first recognition result against the registered interface segments, and determining the final recognition result of the first recognition result according to the matching result.
4. The method according to claim 1, characterized in that the method further comprises:
configuring a user interface identifier for the obtained user interface content;
and registering the user interface content comprises:
performing word segmentation on each user interface's content to determine interface segments;
registering the interface segments in a recognition engine and associating them with the corresponding user interface identifier.
5. The method according to claim 4, characterized in that determining, when a user voice instruction is received, the recognition result of the instruction according to the registered content comprises:
when a user voice instruction is received, obtaining the user interface the user is currently on;
obtaining the registered interface segments of the current user interface according to the identifier of the current user interface;
performing speech recognition on the received user voice instruction to obtain a first recognition result;
matching the first recognition result against the registered interface segments of the current user interface, and determining the final recognition result of the first recognition result according to the matching result.
6. A speech recognition device, comprising:
an interface content acquisition module configured to acquire user interface content;
an interface content extraction module configured to register the user interface content; and
a speech recognition module configured to, when a user voice instruction is received, determine a recognition result of the user voice instruction according to the registered content.
7. The device according to claim 6, wherein the interface content extraction module comprises:
a segment extraction unit configured to perform word-segmentation extraction on the user interface content to determine interface segments; and
a segment registration unit configured to register the interface segments in a recognition engine.
8. The device according to claim 7, wherein the speech recognition module comprises:
a recognition unit configured to perform speech recognition on the received user voice instruction to obtain a first recognition result; and
a recognition verification unit configured to match the first recognition result against the registered interface segments and determine a final recognition result of the first recognition result according to the matching result.
9. An electronic device, comprising: at least one processor, and a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the steps of the method according to any one of claims 1-5.
10. A storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the steps of the method according to any one of claims 1-5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811504208.1A CN109326290A (en) | 2018-12-10 | 2018-12-10 | Audio recognition method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109326290A (en) | 2019-02-12
Family
ID=65256274
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811504208.1A Pending CN109326290A (en) | 2018-12-10 | 2018-12-10 | Audio recognition method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109326290A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112102832A (en) * | 2020-09-18 | 2020-12-18 | 广州小鹏汽车科技有限公司 | Speech recognition method, speech recognition device, server and computer-readable storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105513594A (en) * | 2015-11-26 | 2016-04-20 | 许传平 | Voice control system |
US20170017464A1 (en) * | 2006-11-03 | 2017-01-19 | Philippe Roy | Layered contextual configuration management system and method and minimized input speech recognition user interface interactions experience |
CN106710598A (en) * | 2017-03-24 | 2017-05-24 | 上海与德科技有限公司 | Voice recognition method and device |
CN107154262A (en) * | 2017-05-31 | 2017-09-12 | 北京安云世纪科技有限公司 | Voice operation method, device and mobile terminal
CN107608652A (en) * | 2017-08-28 | 2018-01-19 | 三星电子(中国)研发中心 | Method and apparatus for voice control of a graphical interface
2018
- 2018-12-10 CN CN201811504208.1A patent/CN109326290A/en active Pending
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170017464A1 (en) * | 2006-11-03 | 2017-01-19 | Philippe Roy | Layered contextual configuration management system and method and minimized input speech recognition user interface interactions experience |
CN105513594A (en) * | 2015-11-26 | 2016-04-20 | 许传平 | Voice control system |
CN106710598A (en) * | 2017-03-24 | 2017-05-24 | 上海与德科技有限公司 | Voice recognition method and device |
CN107154262A (en) * | 2017-05-31 | 2017-09-12 | 北京安云世纪科技有限公司 | Voice operation method, device and mobile terminal
CN107608652A (en) * | 2017-08-28 | 2018-01-19 | 三星电子(中国)研发中心 | Method and apparatus for voice control of a graphical interface
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112102832A (en) * | 2020-09-18 | 2020-12-18 | 广州小鹏汽车科技有限公司 | Speech recognition method, speech recognition device, server and computer-readable storage medium |
CN112102832B (en) * | 2020-09-18 | 2021-12-28 | 广州小鹏汽车科技有限公司 | Speech recognition method, speech recognition device, server and computer-readable storage medium |
Similar Documents
Publication | Title
---|---
KR102660922B1 (en) | Management layer for multiple intelligent personal assistant services
US11568876B2 | Method and device for user registration, and electronic device
US10043520B2 (en) | Multilevel speech recognition for candidate application group using first and second speech commands
CN109637548A (en) | Voice interaction method and device based on voiceprint recognition
US11188289B2 (en) | Identification of preferred communication devices according to a preference rule dependent on a trigger phrase spoken within a selected time from other command data
CN109471678A (en) | Voice midpoint controlling method and device based on image recognition
US20160240195A1 (en) | Information processing method and electronic device
US11610590B2 (en) | ASR training and adaptation
CN109741755A (en) | Voice wake-up word threshold management device and method for managing a voice wake-up word threshold
US11587568B2 (en) | Streaming action fulfillment based on partial hypotheses
CN109192212B (en) | Voice control method and device
WO2014117645A1 (en) | Information identification method and apparatus
WO2020119569A1 (en) | Voice interaction method, device and system
US20180190295A1 (en) | Voice recognition
JP7342286B2 (en) | Voice function jump method for human-machine interaction, electronic device and storage medium
US20170286049A1 (en) | Apparatus and method for recognizing voice commands
WO2017166651A1 (en) | Voice recognition model training method, speaker type recognition method and device
CN112652302B (en) | Voice control method, device, terminal and storage medium
CN111341315B (en) | Voice control method, device, computer equipment and storage medium
US20230133146A1 (en) | Method and apparatus for determining skill field of dialogue text
CN109559749A (en) | Combined decoding method and system for speech recognition system
CN109686370A (en) | Method and device for playing a "fight the landlord" card game based on voice control
WO2018094952A1 (en) | Content recommendation method and apparatus
CN109326290A (en) | Audio recognition method and device
CN110874176B (en) | Interaction method, storage medium, operating system and device
Legal Events

Date | Code | Title | Description
---|---|---|---
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| CB02 | Change of applicant information | Address after: 215123 building 14, Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou City, Jiangsu Province. Applicant after: Sipic Technology Co.,Ltd. Address before: 215123 building 14, Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou City, Jiangsu Province. Applicant before: AI SPEECH Co.,Ltd.
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20190212