CN106251864A - Speech recognition method and terminal - Google Patents

Speech recognition method and terminal

Info

Publication number
CN106251864A
CN106251864A (application CN201610626656.3A)
Authority
CN
China
Prior art keywords
terminal
voice signal
feature
object information
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610626656.3A
Other languages
Chinese (zh)
Inventor
向攀
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Jinli Communication Equipment Co Ltd
Original Assignee
Shenzhen Jinli Communication Equipment Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Jinli Communication Equipment Co Ltd
Priority to CN201610626656.3A
Publication of CN106251864A
Current legal status: Pending

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 - Speech recognition
    • G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Telephonic Communication Services (AREA)

Abstract

Embodiments of the present invention provide a speech recognition method and a terminal. The method includes: a terminal collects a voice signal; the terminal extracts object information of the voice signal from the voice signal; the terminal judges whether the object information is information of this terminal; and, when the object information is not information of this terminal, the terminal sends the voice signal to the terminal corresponding to the object information, so that the terminal corresponding to the object information extracts an instruction from the voice signal and executes it. By implementing the embodiments of the present invention, a user can operate a terminal by voice.

Description

Speech recognition method and terminal
Technical field
The present invention relates to the field of multimedia technology, and in particular to a speech recognition method and a terminal.
Background
Human-computer interaction is the study of how people interact with terminals; a terminal may be any of various machines, such as a mobile phone, a tablet computer, a washing machine or a television set. The human-computer interaction interface generally refers to the part visible to the user, through which the user communicates with the terminal. A user can interact with a terminal through a visual window, a joystick, a handle, a remote control and similar means, but all of these require the user to operate the terminal manually, so when manual operation is inconvenient the user is unable to operate the terminal.
Summary of the invention
Embodiments of the present invention provide a speech recognition method and a terminal, enabling a user to operate a terminal by voice.
A first aspect of the embodiments of the present invention provides a speech recognition method, including:
a terminal collects a voice signal;
the terminal extracts object information of the voice signal from the voice signal;
the terminal judges whether the object information is information of the terminal;
when the object information is not information of the terminal, the terminal sends the voice signal to the terminal corresponding to the object information, so that the terminal corresponding to the object information extracts an instruction from the voice signal and executes it.
A second aspect of the embodiments of the present invention provides a terminal, including:
a collecting unit, configured to collect a voice signal;
a first extraction unit, configured to extract object information of the voice signal from the voice signal collected by the collecting unit;
a first judging unit, configured to judge whether the object information extracted by the first extraction unit is information of the terminal;
a sending unit, configured to, when the first judging unit determines that the object information is not information of the terminal, send the voice signal collected by the collecting unit to the terminal corresponding to the object information, so that the terminal corresponding to the object information extracts an instruction from the voice signal and executes it.
In the embodiments of the present invention, a terminal collects a voice signal, extracts object information of the voice signal from the voice signal, judges whether the object information is information of this terminal, and, when the object information is not information of this terminal, sends the voice signal to the terminal corresponding to the object information, so that that terminal extracts an instruction from the voice signal and executes it. A user can therefore not only interact with a terminal directly, but also interact with another terminal indirectly through this terminal, so the user can operate terminals by voice.
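For illustration only, the following Python sketch shows the dispatch flow of the first aspect: recognise the captured voice signal, extract the object information, then either execute the instruction locally or forward the signal. The function and parameter names, and the idea of passing the speech-to-text, parsing, execution and forwarding steps in as callables, are assumptions of this sketch rather than anything specified by the embodiments.

```python
from typing import Callable, Tuple

def handle_voice(audio: bytes,
                 my_identity: str,
                 recognize: Callable[[bytes], str],
                 split_target_and_command: Callable[[str], Tuple[str, str]],
                 execute: Callable[[str], None],
                 forward: Callable[[str, bytes], None]) -> None:
    """One pass of the first-aspect method (steps 201-205 of Fig. 2)."""
    text = recognize(audio)                            # speech-to-text; the engine is not specified
    target, command = split_target_and_command(text)   # step 202: object information + instruction
    if target == my_identity:                          # step 203: is the signal addressed to this terminal?
        execute(command)                               # step 204: execute the instruction locally
    else:
        forward(target, audio)                         # step 205: hand the raw signal to the target terminal
```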
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings needed for describing the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
Fig. 1 is a schematic diagram of a network architecture provided by an embodiment of the present invention;
Fig. 2 is a schematic flowchart of a speech recognition method provided by an embodiment of the present invention;
Fig. 3 is a schematic flowchart of another speech recognition method provided by an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a terminal provided by an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of another terminal provided by an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of yet another terminal provided by an embodiment of the present invention.
Detailed description of the invention
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
The embodiments of the present invention provide a speech recognition method and a terminal, enabling a user to operate a terminal by voice. They are described in detail below.
To better understand the speech recognition method and terminal provided by the embodiments of the present invention, the network architecture used by the embodiments is described first. Referring to Fig. 1, Fig. 1 is a schematic diagram of a network architecture provided by an embodiment of the present invention. As shown in Fig. 1, the network architecture may include at least two terminals that are connected through a data network and can transmit data to each other over that network, and each of the at least two terminals has a voice collection function. The terminals may be mobile phones, tablet computers, washing machines, television sets, refrigerators and the like, and the data network may be the Internet, a LAN, a Wi-Fi network or the like.
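The embodiments do not specify how a terminal locates its peers on the data network. The sketch below therefore assumes a simple registry mapping each terminal's object information to a network address; the addresses and the registry itself are placeholders, not part of the disclosure.

```python
# Hypothetical registry: object information (terminal name or number) -> network address.
# The patent only requires that the terminals can exchange data over the Internet,
# a LAN or a Wi-Fi network; peer discovery is left open.
TERMINAL_REGISTRY = {
    "washing machine": "192.168.1.10",   # number 1 in the later example
    "refrigerator":    "192.168.1.11",   # number 2
    "television 1":    "192.168.1.12",   # living-room television, number 3
    "television 2":    "192.168.1.13",   # bedroom television, number 4
}

def address_of(object_info: str) -> str:
    """Resolve object information to the network address of the corresponding terminal."""
    return TERMINAL_REGISTRY[object_info]
```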
Based on the network architecture shown in Fig. 1, refer to Fig. 2, which is a schematic flowchart of a speech recognition method provided by an embodiment of the present invention. The method is described from the perspective of any one of the at least two terminals. As shown in Fig. 2, the method may include the following steps.
201. The terminal collects a voice signal.
In this embodiment, the terminal may collect the voice signal through a voice collection device such as a microphone or a sensor.
202. The terminal extracts object information of the voice signal from the voice signal.
In this embodiment, after the terminal collects the voice signal, it extracts the object information of the voice signal from the voice signal; the object information identifies which terminal the voice signal is intended for. The object information may be a terminal name: for example, when a household has only one washing machine, one television set and one refrigerator, the object information may simply be a terminal name such as washing machine, television set or refrigerator, and the voice signal may be "open the refrigerator" or the like. The object information may also be a terminal number: a unique number may be set for each terminal in advance, so that each terminal can later be identified by its number; for example, the washing machine is number 1, the refrigerator is number 2, the living-room television set is number 3 and the bedroom television set is number 4, the object information is 1, 2, 3 or 4, and the voice signal may be "start 1". The object information may also be a terminal name together with a terminal number: for example, when a household has at least two television sets, the terminal name "television set" alone cannot identify which television set the voice signal is intended for, so the television sets can additionally be numbered, e.g. television 1, television 2, and so on.
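As a rough sketch of how the object information (a name, a number, or a name plus a number) might be separated from the rest of a recognised transcript, the following uses a fixed set of known targets and a longest-match rule; both are assumptions, since the embodiment does not fix a grammar for the voice signal.

```python
KNOWN_TARGETS = {
    "washing machine", "refrigerator", "television 1", "television 2",  # terminal names (and name + number)
    "1", "2", "3", "4",                                                 # bare terminal numbers
}

def split_target_and_command(text: str):
    """Split a transcript such as 'refrigerator adjust the temperature to 5 degrees'
    or 'start 1' into (object information, remaining instruction).
    Longest match first, so 'television 1' is preferred over the bare number '1'."""
    for target in sorted(KNOWN_TARGETS, key=len, reverse=True):
        if target in text:
            return target, text.replace(target, "", 1).strip()
    raise ValueError("no known terminal is named in: " + text)
```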
203. The terminal judges whether the object information of the voice signal is information of this terminal. When the object information of the voice signal is information of this terminal, step 204 is performed; when the object information of the voice signal is not information of this terminal, step 205 is performed.
In this embodiment, after the terminal extracts the object information of the voice signal from the voice signal, it judges whether the object information is information of this terminal, i.e. it compares the object information of the voice signal with the information of this terminal. When the object information of the voice signal is identical to the information of this terminal, the object information is information of this terminal, i.e. the voice signal is intended for this terminal, and step 204 is performed; when they differ, the object information is not information of this terminal, i.e. the voice signal is not intended for this terminal, and step 205 is performed.
204. The terminal extracts an instruction from the voice signal and executes it.
In this embodiment, when it is judged that the object information of the voice signal is information of this terminal, i.e. the voice signal is intended for this terminal, the terminal extracts an instruction from the voice signal and executes it. For example, when the voice signal is "adjust the temperature of the refrigerator to 5 degrees", the instruction is "adjust the temperature to 5 degrees".
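A hypothetical handler for the extracted instruction is sketched below, using the "adjust the temperature to 5 degrees" example; the command pattern and the actuator function are illustrative assumptions, not part of the embodiment.

```python
import re

def set_refrigerator_temperature(value: int) -> None:
    """Stand-in for the real actuator on the refrigerator."""
    print(f"refrigerator temperature set to {value} degrees")

def execute_instruction(instruction: str) -> None:
    """Map the extracted instruction to an action (step 204)."""
    match = re.search(r"temperature\D*(\d+)\s*degrees?", instruction)
    if match:
        set_refrigerator_temperature(int(match.group(1)))
    else:
        print("unrecognised instruction:", instruction)

execute_instruction("adjust the temperature to 5 degrees")  # prints: refrigerator temperature set to 5 degrees
```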
205. The terminal sends the voice signal to the terminal corresponding to the object information, so that the terminal corresponding to the object information extracts the instruction from the voice signal and executes it.
In this embodiment, when it is judged that the object information of the voice signal is not information of this terminal, i.e. the voice signal is not intended for this terminal, the terminal sends the voice signal to the terminal corresponding to the object information, so that the terminal corresponding to the object information extracts the instruction from the voice signal and executes it, i.e. so that the terminal corresponding to the object information performs the operation of step 204.
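The embodiment leaves the transport between terminals open (Internet, LAN or Wi-Fi). Purely as an illustration, the raw voice signal could be forwarded to the target terminal over a plain TCP connection, as sketched below; the port number is a placeholder.

```python
import socket

def forward_voice_signal(audio: bytes, target_address: str, port: int = 50007) -> None:
    """Send the raw voice signal to the terminal that the object information
    resolves to (step 205), so that it can extract and execute the instruction."""
    with socket.create_connection((target_address, port)) as connection:
        connection.sendall(audio)
```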
In the speech recognition method described in Fig. 2, the object information of the voice signal is extracted from the voice signal, and it is judged whether the object information is information of this terminal; when the object information is not information of this terminal, the voice signal is sent to the terminal corresponding to the object information, so that that terminal extracts the instruction from the voice signal and executes it. A user can therefore not only interact with a terminal directly, but also interact with another terminal indirectly through this terminal, so the user can operate terminals by voice.
Based on the network architecture shown in Fig. 1, refer to Fig. 3, which is a schematic flowchart of another speech recognition method provided by an embodiment of the present invention. The method is described from the perspective of any one of the at least two terminals. As shown in Fig. 3, the method may include the following steps.
301. The terminal collects a voice signal.
In this embodiment, the terminal may collect the voice signal through a voice collection device such as a microphone or a sensor.
302. The terminal extracts object information of the voice signal from the voice signal.
In this embodiment, after the terminal collects the voice signal, it extracts the object information of the voice signal from the voice signal; the object information identifies which terminal the voice signal is intended for. The object information may be a terminal name: for example, when a household has only one washing machine, one television set and one refrigerator, the object information may simply be a terminal name such as washing machine, television set or refrigerator, and the voice signal may be "open the refrigerator" or the like. The object information may also be a terminal number: a unique number may be set for each terminal in advance, so that each terminal can later be identified by its number; for example, the washing machine is number 1, the refrigerator is number 2, the living-room television set is number 3 and the bedroom television set is number 4, the object information is 1, 2, 3 or 4, and the voice signal may be "start 1". The object information may also be a terminal name together with a terminal number: for example, when a household has at least two television sets, the terminal name "television set" alone cannot identify which television set the voice signal is intended for, so the television sets can additionally be numbered, e.g. television 1, television 2, and so on.
303. The terminal judges whether the object information of the voice signal is information of this terminal. When the object information of the voice signal is information of this terminal, step 304 is performed; when the object information of the voice signal is not information of this terminal, step 308 is performed.
In this embodiment, after the terminal extracts the object information of the voice signal from the voice signal, it judges whether the object information is information of this terminal, i.e. it compares the object information of the voice signal with the information of this terminal. When the object information of the voice signal is identical to the information of this terminal, the object information is information of this terminal, i.e. the voice signal is intended for this terminal, and step 304 is performed; when they differ, the object information is not information of this terminal, i.e. the voice signal is not intended for this terminal, and step 308 is performed.
304. The terminal judges whether this terminal has permissions set. When permissions are set on this terminal, step 305 is performed; when no permissions are set on this terminal, step 306 is performed.
In this embodiment, in order to protect the security of the terminal or to restrict certain users from using the terminal, permissions may be set on the terminal in advance: a first voice signal of a user who is allowed to use the terminal is collected, a first feature is extracted from the first voice signal, an allowable error value is set for the first feature, and the first feature and the allowable error value are stored. When it is judged that the object information of the voice signal is information of this terminal, i.e. the voice signal is intended for this terminal, the terminal may first judge whether permissions have been set on this terminal. When permissions are set on this terminal, some users may operate this terminal while other users may not, and step 305 is performed; when no permissions are set on this terminal, all users may operate this terminal, and step 306 is performed.
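A minimal enrollment sketch for this permission setup is given below. It assumes the feature is the (amplitude, phase, frequency) triple discussed in step 305 and reduces the signal to its dominant FFT component; that reduction and the allowable error values are assumptions, since the embodiment does not define how the feature is computed.

```python
import numpy as np

def extract_feature(signal: np.ndarray, sample_rate: int):
    """Reduce a voice signal to (amplitude, phase, frequency) using its dominant
    FFT component; one possible reading of the feature, not the patent's definition."""
    spectrum = np.fft.rfft(signal)
    k = int(np.argmax(np.abs(spectrum)))        # dominant frequency bin
    amplitude = float(np.abs(spectrum[k]))
    phase = float(np.angle(spectrum[k]))
    frequency = k * sample_rate / len(signal)
    return amplitude, phase, frequency

def enroll(first_voice_signal: np.ndarray, sample_rate: int,
           allowable_errors=(1.0, 0.1, 5.0)):
    """Store an allowed user's first feature with its allowable error values
    (the first, second and third preset values); the numbers are placeholders."""
    return {"feature": extract_feature(first_voice_signal, sample_rate),
            "allowable_errors": allowable_errors}
```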
305. The terminal extracts a feature of the voice signal and compares the feature with the stored voice features. When a voice feature matching this feature exists among the stored voice features, step 306 is performed; when no voice feature matching this feature exists among the stored voice features, step 307 is performed.
In this embodiment, when it is judged that permissions are set on this terminal, the feature of the voice signal is extracted and compared with the stored voice features. When a voice feature matching the feature of the voice signal exists among the stored voice features, the user corresponding to the voice signal has permission to operate this terminal, and step 306 is performed; when no such voice feature exists among the stored voice features, the user corresponding to the voice signal does not have permission to operate this terminal, and step 307 is performed. The feature of the voice signal may include amplitude, phase and frequency. Comparing the feature of the voice signal with the stored voice features means comparing the amplitude of the voice signal with a target amplitude, the phase of the voice signal with a target phase, and the frequency of the voice signal with a target frequency, where the target amplitude, the target phase and the target frequency belong to a target voice feature among the stored voice features. When the absolute value of the difference between the amplitude of the voice signal and the target amplitude is less than a first preset value, the absolute value of the difference between the phase of the voice signal and the target phase is less than a second preset value, and the absolute value of the difference between the frequency of the voice signal and the target frequency is less than a third preset value, it is determined that the target voice feature matches the feature of the voice signal. The first preset value, the second preset value and the third preset value are the allowable error values.
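Under the same assumptions as the enrollment sketch above, the comparison in step 305 could be written as follows, with the first, second and third preset values acting as the allowable error values.

```python
def feature_matches(feature, target_feature, allowable_errors) -> bool:
    """A stored target voice feature matches when all three absolute differences
    fall below the corresponding preset (allowable error) values."""
    amplitude, phase, frequency = feature
    target_amplitude, target_phase, target_frequency = target_feature
    first_preset, second_preset, third_preset = allowable_errors
    return (abs(amplitude - target_amplitude) < first_preset and
            abs(phase - target_phase) < second_preset and
            abs(frequency - target_frequency) < third_preset)

def is_authorised(feature, stored_profiles) -> bool:
    """Step 305: authorised if any stored profile matches; otherwise the voice
    signal is discarded (step 307)."""
    return any(feature_matches(feature, profile["feature"], profile["allowable_errors"])
               for profile in stored_profiles)
```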
306. The terminal extracts an instruction from the voice signal and executes it.
In this embodiment, when it is judged that the object information of the voice signal is information of this terminal, i.e. the voice signal is intended for this terminal, the terminal extracts an instruction from the voice signal and executes it. For example, when the voice signal is "adjust the temperature of the refrigerator to 5 degrees", the instruction is "adjust the temperature to 5 degrees".
307. The terminal discards the voice signal.
308. The terminal sends the voice signal to the terminal corresponding to the object information, so that the terminal corresponding to the object information extracts the instruction from the voice signal and executes it.
In this embodiment, when it is judged that the object information of the voice signal is not information of this terminal, i.e. the voice signal is not intended for this terminal, the terminal sends the voice signal to the terminal corresponding to the object information, so that the terminal corresponding to the object information extracts the instruction from the voice signal and executes it, i.e. so that the terminal corresponding to the object information performs the operations of steps 304 to 307.
In the speech recognition method described in Fig. 3, the object information of the voice signal is extracted from the voice signal, and it is judged whether the object information is information of this terminal; when the object information is not information of this terminal, the voice signal is sent to the terminal corresponding to the object information, so that that terminal extracts the instruction from the voice signal and executes it. A user can therefore not only interact with a terminal directly, but also interact with another terminal indirectly through this terminal, so the user can operate terminals by voice.
Based on the network architecture shown in Fig. 1, refer to Fig. 4, which is a schematic structural diagram of a terminal provided by an embodiment of the present invention. As shown in Fig. 4, the terminal may include:
a collecting unit 401, configured to collect a voice signal;
a first extraction unit 402, configured to extract object information of the voice signal from the voice signal collected by the collecting unit 401;
a first judging unit 403, configured to judge whether the object information extracted by the first extraction unit 402 is information of this terminal;
a sending unit 404, configured to, when the first judging unit 403 determines that the object information of the voice signal is not information of this terminal, send the voice signal collected by the collecting unit 401 to the terminal corresponding to the object information of the voice signal, so that the terminal corresponding to the object information of the voice signal extracts an instruction from the voice signal and executes it.
In the terminal described in Fig. 4, the object information of the voice signal is extracted from the voice signal, and it is judged whether the object information is information of this terminal; when the object information is not information of this terminal, the voice signal is sent to the terminal corresponding to the object information, so that that terminal extracts the instruction from the voice signal and executes it. A user can therefore not only interact with a terminal directly, but also interact with another terminal indirectly through this terminal, so the user can operate terminals by voice.
Based on the network architecture shown in Fig. 1, refer to Fig. 5, which is a schematic structural diagram of another terminal provided by an embodiment of the present invention. The terminal shown in Fig. 5 is obtained by optimizing the terminal shown in Fig. 4, and this terminal may further include:
an execution unit 405, configured to, when the first judging unit 403 determines that the object information of the voice signal is information of this terminal, extract an instruction from the voice signal collected by the collecting unit 401 and execute it.
As a possible implementation, the terminal may further include:
a second judging unit 406, configured to judge whether this terminal has permissions set;
a second extraction unit 407, configured to, when the second judging unit 406 determines that permissions are set on this terminal, extract a feature of the voice signal collected by the collecting unit 401;
a comparing unit 408, configured to compare the feature extracted by the second extraction unit 407 with the stored voice features and, when the comparison result is that a voice feature matching the feature of the voice signal exists among the stored voice features, trigger the execution unit 405 to perform the step of extracting an instruction from the voice signal and executing it.
Specifically, when the judgment result of the first judging unit 403 is that the object information of the voice signal is information of this terminal, the second judging unit 406 is triggered to judge whether this terminal has permissions set.
As a possible implementation, the feature of the voice signal may include amplitude, phase and frequency;
the comparing unit 408 is specifically configured to compare the amplitude of the voice signal with a target amplitude, the phase of the voice signal with a target phase, and the frequency of the voice signal with a target frequency, and, when the absolute value of the difference between the amplitude of the voice signal and the target amplitude is less than a first preset value, the absolute value of the difference between the phase of the voice signal and the target phase is less than a second preset value, and the absolute value of the difference between the frequency of the voice signal and the target frequency is less than a third preset value, determine that the target voice feature matches the feature of the voice signal, where the target amplitude, the target phase and the target frequency belong to the target voice feature among the stored voice features.
As a possible implementation, the object information of the voice signal may include:
a terminal name; or
a terminal number; or
a terminal name and a terminal number.
In the terminal described in Fig. 5, the object information of the voice signal is extracted from the voice signal, and it is judged whether the object information is information of this terminal; when the object information is not information of this terminal, the voice signal is sent to the terminal corresponding to the object information, so that that terminal extracts the instruction from the voice signal and executes it. A user can therefore not only interact with a terminal directly, but also interact with another terminal indirectly through this terminal, so the user can operate terminals by voice.
Based on the network architecture shown in Fig. 1, refer to Fig. 6, which is a schematic structural diagram of yet another terminal provided by an embodiment of the present invention. As shown in Fig. 6, the terminal may include at least one processor 601 (such as a CPU), a memory 602, a communication interface 603, a voice collection device 604 and at least one communication bus 605. The memory 602 may be a high-speed RAM memory, or a non-volatile memory, for example at least one disk memory. Optionally, the memory 602 may also be at least one storage device located remotely from the aforementioned processor 601. Among them:
the communication bus 605 is configured to implement connection and communication among these components;
the voice collection device 604 is configured to collect a voice signal and send it to the processor 601;
the memory 602 stores a set of program code, and the processor 601 is configured to call the program code stored in the memory 602 to perform the following operations:
extracting object information of the voice signal from the voice signal;
judging whether the object information of the voice signal is information of this terminal;
the communication interface 603 is configured to, when the object information of the voice signal is not information of this terminal, send the voice signal to the terminal corresponding to the object information of the voice signal, so that the terminal corresponding to the object information of the voice signal extracts an instruction from the voice signal and executes it.
As a possible implementation, the processor 601 is further configured to call the program code stored in the memory 602 to perform the following operation:
when the object information of the voice signal is information of this terminal, extracting an instruction from the voice signal and executing it.
As a possible implementation, when the object information of the voice signal is information of this terminal, the processor 601 is further configured to call the program code stored in the memory 602 to perform the following operations:
judging whether this terminal has permissions set;
when permissions are set on this terminal, extracting a feature of the voice signal;
comparing the feature of the voice signal with the stored voice features;
when a voice feature matching the feature of the voice signal exists among the stored voice features, performing the step of extracting an instruction from the voice signal and executing it.
As a possible implementation, the feature of the voice signal may include amplitude, phase and frequency;
the comparing, by the processor 601, of the feature of the voice signal with the stored voice features includes:
comparing the amplitude of the voice signal with a target amplitude, the phase of the voice signal with a target phase, and the frequency of the voice signal with a target frequency, where the target amplitude, the target phase and the target frequency belong to a target voice feature among the stored voice features; and
when the absolute value of the difference between the amplitude of the voice signal and the target amplitude is less than a first preset value, the absolute value of the difference between the phase of the voice signal and the target phase is less than a second preset value, and the absolute value of the difference between the frequency of the voice signal and the target frequency is less than a third preset value, determining that the target voice feature matches the feature of the voice signal.
As a possible implementation, the object information of the voice signal may include:
a terminal name; or
a terminal number; or
a terminal name and a terminal number.
Steps 201 and 301 may be performed by the voice collection device 604 in the terminal, steps 202-204 and 302-307 may be performed by the processor 601 and the memory 602 in the terminal, and steps 205 and 308 may be performed by the communication interface 603 in the terminal.
The collecting unit 401 may be implemented by the voice collection device 604 in the terminal; the first extraction unit 402, the first judging unit 403, the execution unit 405, the second judging unit 406, the second extraction unit 407 and the comparing unit 408 may be implemented by the processor 601 and the memory 602 in the terminal; and the sending unit 404 may be implemented by the communication interface 603 in the terminal.
In the terminal described in Fig. 6, the object information of the voice signal is extracted from the voice signal, and it is judged whether the object information is information of this terminal; when the object information is not information of this terminal, the voice signal is sent to the terminal corresponding to the object information, so that that terminal extracts the instruction from the voice signal and executes it. A user can therefore not only interact with a terminal directly, but also interact with another terminal indirectly through this terminal, so the user can operate terminals by voice.
The units of the embodiments of the present invention may be implemented with a general-purpose integrated circuit (such as a central processing unit, CPU) or with an application-specific integrated circuit (ASIC).
A person of ordinary skill in the art may be aware that the units and algorithm steps of the examples described in combination with the embodiments disclosed herein can be implemented by electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of their functions. Whether these functions are performed by hardware or by software depends on the particular application and the design constraints of the technical solution. A person skilled in the art may use different methods to implement the described functions for each particular application, but such implementation shall not be considered to go beyond the scope of the present invention.
A person skilled in the art may clearly understand that, for convenience and brevity of description, for the specific working processes of the terminals and units described above, reference may be made to the corresponding processes in the foregoing method embodiments, and details are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed terminal and method may be implemented in other ways. For example, the device embodiments described above are merely schematic; the division of the units is merely a logical functional division, and there may be other divisions in actual implementation, for example multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual couplings, direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, devices or units, and may also be electrical, mechanical or other forms of connection.
The steps of the methods in the embodiments of the present invention may be reordered, combined or deleted according to actual needs.
The units in the terminals of the embodiments of the present invention may be combined, divided or deleted according to actual needs.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units, i.e. they may be located in one place or distributed over multiple network elements. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments of the present invention.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of a software functional unit.
When the integrated unit is implemented in the form of a software functional unit and is sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device or the like) to perform all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a portable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.
The speech recognition method and terminal provided by the embodiments of the present invention have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present invention, and the description of the above embodiments is only intended to help understand the method of the present invention and its core idea. At the same time, for a person of ordinary skill in the art, there will be changes in the specific implementation and the scope of application according to the idea of the present invention. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (10)

1. A speech recognition method, characterized by comprising:
collecting, by a terminal, a voice signal;
extracting, by the terminal, object information of the voice signal from the voice signal;
judging, by the terminal, whether the object information is information of the terminal; and
when the object information is not information of the terminal, sending, by the terminal, the voice signal to the terminal corresponding to the object information, so that the terminal corresponding to the object information extracts an instruction from the voice signal and executes it.
2. The method according to claim 1, characterized in that the method further comprises:
when the object information is information of the terminal, extracting, by the terminal, an instruction from the voice signal and executing it.
3. The method according to claim 2, characterized in that, when the object information is information of the terminal, the method further comprises:
judging, by the terminal, whether the terminal has permissions set;
when permissions are set on the terminal, extracting, by the terminal, a feature of the voice signal;
comparing, by the terminal, the feature with stored voice features; and
when a voice feature matching the feature exists among the stored voice features, performing, by the terminal, the step of extracting an instruction from the voice signal and executing it.
4. The method according to claim 3, characterized in that the feature comprises amplitude, phase and frequency; and
the comparing, by the terminal, of the feature with the stored voice features comprises:
comparing, by the terminal, the amplitude with a target amplitude, the phase with a target phase, and the frequency with a target frequency, wherein the target amplitude, the target phase and the target frequency belong to a target voice feature among the stored voice features; and
when the absolute value of the difference between the amplitude and the target amplitude is less than a first preset value, the absolute value of the difference between the phase and the target phase is less than a second preset value, and the absolute value of the difference between the frequency and the target frequency is less than a third preset value, determining that the target voice feature matches the feature.
5. The method according to any one of claims 1 to 4, characterized in that the object information comprises:
a terminal name; or
a terminal number; or
a terminal name and a terminal number.
6. A terminal, characterized by comprising:
a collecting unit, configured to collect a voice signal;
a first extraction unit, configured to extract object information of the voice signal from the voice signal collected by the collecting unit;
a first judging unit, configured to judge whether the object information extracted by the first extraction unit is information of the terminal; and
a sending unit, configured to, when the judgment result of the first judging unit is that the object information is not information of the terminal, send the voice signal collected by the collecting unit to the terminal corresponding to the object information, so that the terminal corresponding to the object information extracts an instruction from the voice signal and executes it.
7. The terminal according to claim 6, characterized in that the terminal further comprises:
an execution unit, configured to, when the judgment result of the first judging unit is that the object information is information of the terminal, extract an instruction from the voice signal collected by the collecting unit and execute it.
8. The terminal according to claim 7, characterized in that the terminal further comprises:
a second judging unit, configured to judge whether the terminal has permissions set;
a second extraction unit, configured to, when the judgment result of the second judging unit is that permissions are set on the terminal, extract a feature of the voice signal collected by the collecting unit; and
a comparing unit, configured to compare the feature extracted by the second extraction unit with stored voice features and, when the comparison result is that a voice feature matching the feature exists among the stored voice features, trigger the execution unit to perform the step of extracting an instruction from the voice signal and executing it.
9. The terminal according to claim 8, characterized in that the feature comprises amplitude, phase and frequency; and
the comparing unit is specifically configured to compare the amplitude with a target amplitude, the phase with a target phase, and the frequency with a target frequency, and, when the absolute value of the difference between the amplitude and the target amplitude is less than a first preset value, the absolute value of the difference between the phase and the target phase is less than a second preset value, and the absolute value of the difference between the frequency and the target frequency is less than a third preset value, determine that a target voice feature matches the feature, wherein the target amplitude, the target phase and the target frequency belong to the target voice feature among the stored voice features.
10. The terminal according to any one of claims 6 to 9, characterized in that the object information comprises:
a terminal name; or
a terminal number; or
a terminal name and a terminal number.
CN201610626656.3A 2016-08-03 2016-08-03 Speech recognition method and terminal Pending CN106251864A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610626656.3A CN106251864A (en) 2016-08-03 2016-08-03 Speech recognition method and terminal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610626656.3A CN106251864A (en) 2016-08-03 2016-08-03 Speech recognition method and terminal

Publications (1)

Publication Number Publication Date
CN106251864A true CN106251864A (en) 2016-12-21

Family

ID=57606203

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610626656.3A Pending CN106251864A (en) 2016-08-03 2016-08-03 Speech recognition method and terminal

Country Status (1)

Country Link
CN (1) CN106251864A (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080097760A1 (en) * 2006-10-23 2008-04-24 Sungkyunkwan University Foundation For Corporate Collaboration User-initiative voice service system and method
CN101420543A (en) * 2008-12-05 2009-04-29 天津三星电子显示器有限公司 Method for voice controlling television and television therewith
CN103631211A (en) * 2012-08-29 2014-03-12 三星电子(中国)研发中心 Method, device and system for controlling household appliance device
CN103092181A (en) * 2012-12-28 2013-05-08 吴玉胜 Household appliance control method and system thereof based on intelligent television equipment

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109243443A (en) * 2018-09-28 2019-01-18 联想(北京)有限公司 Sound control method, device and electronic equipment
CN109243443B (en) * 2018-09-28 2022-05-31 联想(北京)有限公司 Voice control method and device and electronic equipment
CN110415682A (en) * 2019-07-08 2019-11-05 海尔优家智能科技(北京)有限公司 Control the method and device of smart machine

Similar Documents

Publication Publication Date Title
CN104992709B (en) A kind of the execution method and speech recognition apparatus of phonetic order
CN107305774B (en) Voice detection method and device
CN105446146B (en) Intelligent terminal control method, system and intelligent terminal based on semantic analysis
CN106658129A (en) Emotion-based terminal control method and apparatus, and terminal
WO2018000278A1 (en) Context sensitive multi-round dialogue management system and method based on state machines
CN102708454B (en) Solution of terminal fault provides method and device
CN105138110A (en) Voice interaction method and voice interaction device
WO2021135604A1 (en) Voice control method and apparatus, server, terminal device, and storage medium
CN104049721A (en) Information processing method and electronic equipment
CN103841272B (en) A kind of method and device sending speech message
CN106055260B (en) A kind of reading screen method and device of safety keyboard
JP6588673B2 (en) Virtual reality device and input control method of virtual reality device
CN105635778A (en) Voice interaction method and system of intelligent television
CN109074807A (en) Information processing equipment and information processing method
JP2016014967A (en) Information management method
CN108345442A (en) A kind of operation recognition methods and mobile terminal
CN106251864A (en) A kind of audio recognition method and terminal
CN104615358A (en) Application program starting method and electronic device
CN107704233A (en) A kind of information processing method and electronic equipment
CN106775349A (en) A kind of speech modification method and device of word content
CN106227537A (en) Display packing and device
CN104049869B (en) A kind of data processing method and device
CN106559540A (en) voice data processing method and device
CN105741841A (en) Voice control method and electronic equipment
US20160277698A1 (en) Method for vocally controlling a television and television thereof

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20161221