CN106251864A - Speech recognition method and terminal - Google Patents
Speech recognition method and terminal
- Publication number
- CN106251864A (application CN201610626656.3A)
- Authority
- CN
- China
- Prior art keywords
- terminal
- voice signal
- feature
- object information
- target
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
Abstract
Embodiments of the present invention provide a speech recognition method and a terminal. The method includes: the terminal collects a voice signal; extracts object information of the voice signal from the voice signal; judges whether the object information is information of this terminal; and, when the object information is not information of this terminal, sends the voice signal to the terminal corresponding to the object information, so that the terminal corresponding to the object information extracts an instruction from the voice signal and executes it. Implementing the embodiments of the present invention enables a user to operate a terminal by voice.
Description
Technical field
The present invention relates to the field of multimedia technology, and in particular to a speech recognition method and terminal.
Background
Human-computer interaction is the study of how people interact with terminals, where a terminal can be any of various machines, such as a mobile phone, tablet computer, washing machine, or television set. A human-computer interaction interface generally refers to the part visible to the user, through which the user communicates with the terminal. A user can interact with a terminal through a visual window, a joystick, a handle, a remote control, and the like, but all of these interaction modes require the user to operate the terminal manually; when manual operation is inconvenient, the user is unable to operate the terminal at all.
Summary of the invention
Embodiments of the present invention provide a speech recognition method and a terminal, enabling a user to operate a terminal by voice.
A first aspect of the embodiments of the present invention provides a speech recognition method, including:
The terminal collects a voice signal;
The terminal extracts object information of the voice signal from the voice signal;
The terminal judges whether the object information is information of the terminal;
When the object information is not information of the terminal, the terminal sends the voice signal to the terminal corresponding to the object information, so that the terminal corresponding to the object information extracts an instruction from the voice signal and executes it.
A second aspect of the embodiments of the present invention provides a terminal, including:
A collecting unit, configured to collect a voice signal;
A first extraction unit, configured to extract object information of the voice signal from the voice signal collected by the collecting unit;
A first judging unit, configured to judge whether the object information extracted by the first extraction unit is information of the terminal;
A transmitting unit, configured to, when the judgment result of the first judging unit is that the object information is not information of the terminal, send the voice signal collected by the collecting unit to the terminal corresponding to the object information, so that the terminal corresponding to the object information extracts an instruction from the voice signal and executes it.
In the embodiments of the present invention, a terminal collects a voice signal, extracts object information of the voice signal from the voice signal, and judges whether the object information is information of this terminal; when the object information is not information of this terminal, the terminal sends the voice signal to the terminal corresponding to the object information, so that that terminal extracts an instruction from the voice signal and executes it. Thus a user can not only interact directly with one terminal, but can also interact indirectly with another terminal through it, so that the user can operate terminals by voice.
Brief description of the drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings required in the embodiments are briefly described below. Obviously, the drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic diagram of a network architecture provided by an embodiment of the present invention;
Fig. 2 is a schematic flowchart of a speech recognition method provided by an embodiment of the present invention;
Fig. 3 is a schematic flowchart of another speech recognition method provided by an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a terminal provided by an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of another terminal provided by an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of yet another terminal provided by an embodiment of the present invention.
Detailed description of the invention
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
The embodiments of the present invention provide a speech recognition method and a terminal, enabling a user to operate a terminal by voice. These are described in detail below.
To better understand the speech recognition method and terminal provided by the embodiments of the present invention, the network architecture used by the embodiments is described first. Referring to Fig. 1, Fig. 1 is a schematic diagram of a network architecture provided by an embodiment of the present invention. As shown in Fig. 1, the network architecture can include at least two terminals connected through a data network, over which they can transmit data to each other; each of the terminals has a voice collection function. The terminals can be mobile phones, tablet computers, washing machines, television sets, refrigerators, and the like. The data network can be the Internet, a local area network, a WiFi network, or the like.
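The routing in the embodiments below presupposes that each terminal can resolve object information (a terminal name or a terminal number) to a peer on the data network. A minimal sketch of such a registry follows; all names, numbers, and addresses are illustrative assumptions of this sketch, not part of the patent.

```python
# Hypothetical device registry for the Fig. 1 architecture: each entry maps
# a terminal name to an assumed network address and a terminal number.
DEVICE_REGISTRY = {
    "washing machine": ("192.168.1.10", 1),
    "refrigerator": ("192.168.1.11", 2),
    "living room tv": ("192.168.1.12", 3),
    "bedroom tv": ("192.168.1.13", 4),
}

def address_of(identifier):
    """Resolve a terminal name or terminal number to a network address."""
    for name, (addr, number) in DEVICE_REGISTRY.items():
        if identifier == name or identifier == number:
            return addr
    return None
```

A terminal that judges a voice signal is not addressed to itself would look up the addressed terminal this way before transmitting the signal over the data network.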
Based on the network architecture shown in Fig. 1, refer to Fig. 2, which is a schematic flowchart of a speech recognition method provided by an embodiment of the present invention. The speech recognition method is described from the perspective of any one of the at least two terminals. As shown in Fig. 2, the speech recognition method may include the following steps.
201. The terminal collects a voice signal.
In this embodiment, the terminal can collect the voice signal through a voice collection device such as a microphone or a sensor.
202. The terminal extracts object information of the voice signal from the voice signal.
In this embodiment, after the terminal collects the voice signal, it extracts the object information of the voice signal from the voice signal. The object information identifies which terminal the voice signal is intended for. The object information can be a terminal name: for example, when a household has only one washing machine, one television set, and one refrigerator, the object information can simply be a terminal name such as "washing machine", "television set", or "refrigerator", and the voice signal can be "Open the refrigerator" or the like. The object information can also be a terminal number: a unique number can be set for each terminal in advance, and each terminal can later be identified by its number. For example, the washing machine is number 1, the refrigerator is number 2, the living-room television is number 3, the bedroom television is number 4, and so on; the object information is then 1, 2, 3, 4, etc., and the voice signal can be "Start 1". The object information can also be a terminal name together with a terminal number: for example, when a household has at least two television sets, the terminal name "television set" alone cannot identify which television the voice signal is intended for, so the television sets can be numbered, e.g. "television 1", "television 2", and so on.
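The extraction of object information described above can be sketched as a lookup over the recognized text. This is an assumed implementation for illustration only; the patent does not specify how the recognizer output is parsed, and the device names here are hypothetical.

```python
import re

# Terminal names this terminal knows about; purely illustrative.
KNOWN_NAMES = ["washing machine", "refrigerator", "television"]

def extract_object_info(utterance):
    """Recover the object information (terminal name and/or terminal number,
    if present) from a recognized utterance, as in step 202."""
    text = utterance.lower()
    # Terminal name: first known name that appears in the utterance.
    name = next((n for n in KNOWN_NAMES if n in text), None)
    # Terminal number: first standalone integer in the utterance.
    match = re.search(r"\b(\d+)\b", text)
    number = int(match.group(1)) if match else None
    return name, number
```

Under these assumptions, "Open the refrigerator" yields a name only, "Start 1" yields a number only, and "television 2" yields both, matching the three forms of object information described above.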
203. The terminal judges whether the object information of the voice signal is information of this terminal. When the object information of the voice signal is information of this terminal, step 204 is performed; when the object information of the voice signal is not information of this terminal, step 205 is performed.
In this embodiment, after the terminal extracts the object information of the voice signal from the voice signal, it judges whether the object information is information of this terminal, i.e. it compares the object information of the voice signal with the information of this terminal. When the object information of the voice signal is identical to the information of this terminal, this shows that the object information is information of this terminal, i.e. that the voice signal is intended for this terminal, and step 204 is performed. When the object information of the voice signal differs from the information of this terminal, this shows that the object information is not information of this terminal, i.e. that the voice signal is not intended for this terminal, and step 205 is performed.
204. The terminal extracts an instruction from the voice signal and executes it.
In this embodiment, when the terminal judges that the object information of the voice signal is information of this terminal, i.e. that the voice signal is intended for this terminal, the terminal extracts an instruction from the voice signal and executes it. For example, when the voice signal is "Set the refrigerator temperature to 5 degrees", the instruction is "set temperature to 5 degrees".
205. The terminal sends the voice signal to the terminal corresponding to the object information, so that the terminal corresponding to the object information extracts an instruction from the voice signal and executes it.
In this embodiment, when the terminal judges that the object information of the voice signal is not information of this terminal, i.e. that the voice signal is not intended for this terminal, the terminal sends the voice signal to the terminal corresponding to the object information, so that that terminal extracts an instruction from the voice signal and executes it; that is, so that the terminal corresponding to the object information performs the operation of step 204.
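Steps 203 to 205 together amount to a routing decision, which can be sketched as follows. The `execute` and `forward` callables stand in for the terminal's local instruction handling and its network transmission, and are assumptions of this sketch rather than part of the patent.

```python
def route_voice_signal(self_info, object_info, voice_signal, execute, forward):
    """Step 203: compare the object information with this terminal's own
    information, then either handle the signal locally or pass it on."""
    if object_info == self_info:
        # Step 204: extract the instruction from the signal and run it.
        execute(voice_signal)
        return "executed locally"
    # Step 205: send the signal on; the addressed terminal performs step 204.
    forward(object_info, voice_signal)
    return "forwarded"
```

Note that the whole voice signal is forwarded unchanged; instruction extraction always happens on the terminal the signal is addressed to.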
In the speech recognition method described in Fig. 2, the terminal extracts the object information of the voice signal from the voice signal and judges whether the object information is information of this terminal; when the object information is not information of this terminal, the terminal sends the voice signal to the terminal corresponding to the object information, so that that terminal extracts an instruction from the voice signal and executes it. Thus a user can not only interact directly with one terminal, but can also interact indirectly with another terminal through it, so that the user can operate terminals by voice.
Based on the network architecture shown in Fig. 1, refer to Fig. 3, which is a schematic flowchart of another speech recognition method provided by an embodiment of the present invention. The speech recognition method is described from the perspective of any one of the at least two terminals. As shown in Fig. 3, the speech recognition method may include the following steps.
301. The terminal collects a voice signal.
In this embodiment, the terminal can collect the voice signal through a voice collection device such as a microphone or a sensor.
302. The terminal extracts object information of the voice signal from the voice signal.
In this embodiment, after the terminal collects the voice signal, it extracts the object information of the voice signal from the voice signal. The object information identifies which terminal the voice signal is intended for. The object information can be a terminal name: for example, when a household has only one washing machine, one television set, and one refrigerator, the object information can simply be a terminal name such as "washing machine", "television set", or "refrigerator", and the voice signal can be "Open the refrigerator" or the like. The object information can also be a terminal number: a unique number can be set for each terminal in advance, and each terminal can later be identified by its number. For example, the washing machine is number 1, the refrigerator is number 2, the living-room television is number 3, the bedroom television is number 4, and so on; the object information is then 1, 2, 3, 4, etc., and the voice signal can be "Start 1". The object information can also be a terminal name together with a terminal number: for example, when a household has at least two television sets, the terminal name "television set" alone cannot identify which television the voice signal is intended for, so the television sets can be numbered, e.g. "television 1", "television 2", and so on.
303. The terminal judges whether the object information of the voice signal is information of this terminal. When the object information of the voice signal is information of this terminal, step 304 is performed; when the object information of the voice signal is not information of this terminal, step 308 is performed.
In this embodiment, after the terminal extracts the object information of the voice signal from the voice signal, it judges whether the object information is information of this terminal, i.e. it compares the object information of the voice signal with the information of this terminal. When the object information of the voice signal is identical to the information of this terminal, this shows that the object information is information of this terminal, i.e. that the voice signal is intended for this terminal, and step 304 is performed. When the object information of the voice signal differs from the information of this terminal, this shows that the object information is not information of this terminal, i.e. that the voice signal is not intended for this terminal, and step 308 is performed.
304. The terminal judges whether this terminal has permissions set. When this terminal has permissions set, step 305 is performed; when this terminal does not have permissions set, step 306 is performed.
In this embodiment, to protect the security of the terminal or to restrict certain users from using it, permissions can be set on the terminal in advance: a first voice signal of a user who is allowed to use the terminal is collected, a first feature is extracted from the first voice signal, an allowable error value is set for the first feature, and the first feature and the allowable error value are stored. When the terminal judges that the object information of the voice signal is information of this terminal, i.e. that the voice signal is intended for this terminal, the terminal can first judge whether permissions are set on it. When permissions are set, this shows that some users can operate the terminal and others cannot, and step 305 is performed; when no permissions are set, this shows that all users can operate the terminal, and step 306 is performed.
305. The terminal extracts the feature of the voice signal and compares this feature with the stored voice features. When a voice feature matching this feature exists among the stored voice features, step 306 is performed; when no voice feature matching this feature exists among the stored voice features, step 307 is performed.
In this embodiment, when the terminal judges that permissions are set on this terminal, it extracts the feature of the voice signal and compares this feature with the stored voice features. When a voice feature matching the feature of the voice signal exists among the stored voice features, this shows that the user corresponding to the voice signal has permission to operate this terminal, and step 306 is performed; when no matching voice feature exists among the stored voice features, this shows that the user corresponding to the voice signal does not have permission to operate this terminal, and step 307 is performed. The feature of the voice signal can include amplitude, phase, and frequency. Comparing the feature of the voice signal with a stored voice feature then means comparing the amplitude of the voice signal with a target amplitude, the phase of the voice signal with a target phase, and the frequency of the voice signal with a target frequency, where the target amplitude, target phase, and target frequency belong to a target voice feature among the stored voice features. When the absolute value of the difference between the amplitude of the voice signal and the target amplitude is less than a first preset value, the absolute value of the difference between the phase of the voice signal and the target phase is less than a second preset value, and the absolute value of the difference between the frequency of the voice signal and the target frequency is less than a third preset value, it is determined that the target voice feature matches the feature of the voice signal. The first preset value, the second preset value, and the third preset value are the allowable error values.
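The three-threshold comparison of step 305 can be written out directly. The feature triples and tolerance values below are illustrative assumptions; the patent fixes only the rule that each of the three absolute differences must be below its preset value.

```python
def features_match(feature, target, tolerances):
    """Step 305 match rule: a stored target voice feature matches the
    extracted (amplitude, phase, frequency) triple when
    |amplitude - target amplitude| < first preset value,
    |phase - target phase|         < second preset value, and
    |frequency - target frequency| < third preset value."""
    return all(abs(x - t) < tol
               for x, t, tol in zip(feature, target, tolerances))

def has_permission(feature, stored_features, tolerances):
    """The user may operate the terminal if any stored feature matches."""
    return any(features_match(feature, target, tolerances)
               for target in stored_features)
```

With a stored feature of (1.0, 0.50, 440.0) and tolerances (0.1, 0.05, 2.0), a captured triple of (1.05, 0.52, 441.0) matches, while (1.30, 0.52, 441.0) fails on the amplitude threshold and would lead to step 307 (the signal is discarded).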
306. The terminal extracts an instruction from the voice signal and executes it.
In this embodiment, when the voice signal is intended for this terminal and, where permissions are set, the feature comparison succeeds, the terminal extracts an instruction from the voice signal and executes it. For example, when the voice signal is "Set the refrigerator temperature to 5 degrees", the instruction is "set temperature to 5 degrees".
307. The terminal discards the voice signal.
308. The terminal sends the voice signal to the terminal corresponding to the object information, so that the terminal corresponding to the object information extracts an instruction from the voice signal and executes it.
In this embodiment, when the terminal judges that the object information of the voice signal is not information of this terminal, i.e. that the voice signal is not intended for this terminal, the terminal sends the voice signal to the terminal corresponding to the object information, so that that terminal extracts an instruction from the voice signal and executes it; that is, so that the terminal corresponding to the object information performs the operations of steps 304-307.
In the speech recognition method described in Fig. 3, the terminal extracts the object information of the voice signal from the voice signal and judges whether the object information is information of this terminal; when the object information is not information of this terminal, the terminal sends the voice signal to the terminal corresponding to the object information, so that that terminal extracts an instruction from the voice signal and executes it. Thus a user can not only interact directly with one terminal, but can also interact indirectly with another terminal through it, so that the user can operate terminals by voice.
Based on the network architecture shown in Fig. 1, refer to Fig. 4, which is a schematic structural diagram of a terminal provided by an embodiment of the present invention. As shown in Fig. 4, the terminal may include:
A collecting unit 401, configured to collect a voice signal;
A first extraction unit 402, configured to extract the object information of the voice signal from the voice signal collected by the collecting unit 401;
A first judging unit 403, configured to judge whether the object information extracted by the first extraction unit 402 is information of this terminal;
A transmitting unit 404, configured to, when the judgment result of the first judging unit 403 is that the object information of the voice signal is not information of this terminal, send the voice signal collected by the collecting unit 401 to the terminal corresponding to the object information of the voice signal, so that the terminal corresponding to the object information of the voice signal extracts an instruction from the voice signal and executes it.
In the terminal described in Fig. 4, the terminal extracts the object information of the voice signal from the voice signal and judges whether the object information is information of this terminal; when the object information is not information of this terminal, the terminal sends the voice signal to the terminal corresponding to the object information, so that that terminal extracts an instruction from the voice signal and executes it. Thus a user can not only interact directly with one terminal, but can also interact indirectly with another terminal through it, so that the user can operate terminals by voice.
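The division of labor among units 401-404 can be sketched as a small class. The microphone and network-send callables are assumed interfaces, and the dictionary shape of a voice signal is an illustrative simplification, not part of the patent.

```python
class Terminal:
    """Sketch of the Fig. 4 terminal with its four units."""

    def __init__(self, own_info, network_send):
        self.own_info = own_info          # this terminal's name or number
        self.network_send = network_send  # backend of transmitting unit 404

    def collect(self, microphone):
        # Collecting unit 401: obtain a voice signal from the capture device.
        return microphone()

    def extract_object_info(self, voice_signal):
        # First extraction unit 402: read out the addressed terminal.
        return voice_signal["target"]

    def handle(self, voice_signal):
        # First judging unit 403: is the signal intended for this terminal?
        target = self.extract_object_info(voice_signal)
        if target != self.own_info:
            # Transmitting unit 404: pass the signal on unchanged.
            self.network_send(target, voice_signal)
            return "forwarded"
        return "for this terminal"
```

The Fig. 5 refinement would add an execution unit and the permission-checking units on the "for this terminal" branch.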
Based on the network architecture shown in Fig. 1, refer to Fig. 5, which is a schematic structural diagram of another terminal provided by an embodiment of the present invention. The terminal shown in Fig. 5 is obtained by optimizing the terminal shown in Fig. 4, and may further include:
An execution unit 405, configured to, when the judgment result of the first judging unit 403 is that the object information of the voice signal is information of this terminal, extract an instruction from the voice signal collected by the collecting unit 401 and execute it.
As a possible embodiment, the terminal may further include:
A second judging unit 406, configured to judge whether this terminal has permissions set;
A second extraction unit 407, configured to, when the judgment result of the second judging unit 406 is that this terminal has permissions set, extract the feature of the voice signal collected by the collecting unit 401;
A comparing unit 408, configured to compare the feature extracted by the second extraction unit 407 with the stored voice features, and, when the comparison result of the comparing unit 408 is that a voice feature matching the feature of the voice signal exists among the stored voice features, trigger the execution unit 405 to perform the step of extracting an instruction from the voice signal and executing it.
Specifically, when the judgment result of the first judging unit 403 is that the object information of the voice signal is information of this terminal, the second judging unit 406 is triggered to judge whether this terminal has permissions set.
As a possible embodiment, the feature of the voice signal can include amplitude, phase, and frequency;
The comparing unit 408 is specifically configured to compare the amplitude of the voice signal with a target amplitude, the phase of the voice signal with a target phase, and the frequency of the voice signal with a target frequency, where the target amplitude, target phase, and target frequency belong to a target voice feature among the stored voice features; and, when the absolute value of the difference between the amplitude of the voice signal and the target amplitude is less than a first preset value, the absolute value of the difference between the phase of the voice signal and the target phase is less than a second preset value, and the absolute value of the difference between the frequency of the voice signal and the target frequency is less than a third preset value, to determine that the target voice feature matches the feature of the voice signal.
As a possible embodiment, the object information of the voice signal may include:
A terminal name; or
A terminal number; or
A terminal name and a terminal number.
In the terminal described in Fig. 5, the terminal extracts the object information of the voice signal from the voice signal and judges whether the object information is information of this terminal; when the object information is not information of this terminal, the terminal sends the voice signal to the terminal corresponding to the object information, so that that terminal extracts an instruction from the voice signal and executes it. Thus a user can not only interact directly with one terminal, but can also interact indirectly with another terminal through it, so that the user can operate terminals by voice.
Based on the network architecture shown in Fig. 1, refer to Fig. 6, which is a schematic structural diagram of yet another terminal provided by an embodiment of the present invention. As shown in Fig. 6, the terminal may include: at least one processor 601 (such as a CPU), a memory 602, a communication interface 603, a voice collection device 604, and at least one communication bus 605. The memory 602 can be a high-speed RAM memory or a non-volatile memory, for example at least one disk memory. Optionally, the memory 602 can also be at least one storage device located remotely from the processor 601. Here:
The communication bus 605 is used to realize the connection and communication between these components;
The voice collection device 604 is used to collect the voice signal and send it to the processor 601;
The memory 602 stores a set of program code, and the processor 601 is used to call the program code stored in the memory 602 to perform the following operations:
Extracting the object information of the voice signal from the voice signal;
Judging whether the object information of the voice signal is information of this terminal.
The communication interface 603 is used to, when the object information of the voice signal is not information of this terminal, send the voice signal to the terminal corresponding to the object information of the voice signal, so that the terminal corresponding to the object information of the voice signal extracts an instruction from the voice signal and executes it.
As a possible embodiment, the processor 601 is further used to call the program code stored in the memory 602 to perform the following operation:
When the object information of the voice signal is information of this terminal, extracting an instruction from the voice signal and executing it.
As a possible embodiment, when the object information of the voice signal is information of this terminal, the processor 601 is further used to call the program code stored in the memory 602 to perform the following operations:
Judging whether this terminal has permissions set;
When this terminal has permissions set, extracting the feature of the voice signal;
Comparing the feature of the voice signal with the stored voice features;
When a voice feature matching the feature of the voice signal exists among the stored voice features, performing the step of extracting an instruction from the voice signal and executing it.
As a possible embodiment, the features of the voice signal may include amplitude, phase, and frequency.
The processor 601 comparing the features of the voice signal with the stored voice features includes:
comparing the amplitude of the voice signal with a target amplitude, the phase of the voice signal with a target phase, and the frequency of the voice signal with a target frequency, where the target amplitude, the target phase, and the target frequency belong to a target voice feature among the stored voice features;
when the absolute value of the difference between the amplitude of the voice signal and the target amplitude is less than a first preset value, the absolute value of the difference between the phase of the voice signal and the target phase is less than a second preset value, and the absolute value of the difference between the frequency of the voice signal and the target frequency is less than a third preset value, determining that the target voice feature matches the features of the voice signal.
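The three-threshold comparison above can be written directly as a predicate. The concrete preset values and the dictionary representation of an (amplitude, phase, frequency) feature are illustrative assumptions, not values given in the patent.

```python
# Hypothetical sketch of the three-threshold feature match described above.
# Preset values and the feature representation are illustrative assumptions.

def features_match(signal, target, first_preset, second_preset, third_preset):
    """True when amplitude, phase and frequency of the voice signal each
    differ from the target voice feature by less than the preset value."""
    return (abs(signal["amplitude"] - target["amplitude"]) < first_preset
            and abs(signal["phase"] - target["phase"]) < second_preset
            and abs(signal["frequency"] - target["frequency"]) < third_preset)
```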
As a possible embodiment, the object information of the voice signal may include:
a terminal name; or
a terminal number; or
a terminal name and a terminal number.
Steps 201 and 301 may be performed by the voice acquisition device 604 in the terminal; steps 202-204 and 302-307 may be performed by the processor 601 and the memory 602 in the terminal; and steps 205 and 308 may be performed by the communication interface 603 in the terminal.
The collecting unit 401 may be implemented by the voice acquisition device 604 in the terminal; the first extraction unit 402, the first judging unit 403, the execution unit 405, the second judging unit 406, the second extraction unit 407, and the comparing unit 408 may be implemented by the processor 601 and the memory 602 in the terminal; and the transmitting unit 404 may be implemented by the communication interface 603 in the terminal.
In the terminal described in Fig. 6, the object information of the voice signal is extracted from the voice signal, and it is judged whether the object information is the information of this terminal. When the object information is not the information of this terminal, the voice signal is sent to the terminal corresponding to the object information, so that that terminal extracts an instruction from the voice signal and executes it. It can be seen that a user can not only interact with a terminal directly, but can also interact with another terminal indirectly through this terminal, thereby enabling the user to operate terminals by voice.
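The routing behavior described for Fig. 6 can be sketched as below. The utterance format ("&lt;terminal name&gt;, &lt;instruction&gt;") and every function name are illustrative assumptions, and the transport to the other terminal is stubbed out as a callback; the patent itself does not prescribe how object information is encoded in the signal.

```python
# Minimal sketch of the Fig. 6 routing logic, assuming the object
# information is a terminal name at the start of the recognized utterance.
# All names and the utterance format are hypothetical.

def extract_object_info(utterance):
    # e.g. "television, turn up the volume" -> "television"
    return utterance.split(",", 1)[0].strip()

def extract_instruction(utterance):
    # e.g. "television, turn up the volume" -> "turn up the volume"
    return utterance.split(",", 1)[1].strip()

def route(utterance, this_terminal, forward):
    """Execute locally when addressed to this terminal; otherwise forward
    the whole voice signal to the terminal named in the object information."""
    target = extract_object_info(utterance)
    if target == this_terminal:
        return ("execute", extract_instruction(utterance))
    forward(target, utterance)  # stubbed transport to the other terminal
    return ("forwarded", target)
```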
The units of the embodiments of the present invention may be implemented with a general-purpose integrated circuit (such as a central processing unit, CPU) or with an application-specific integrated circuit (ASIC).
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are performed in hardware or software depends on the particular application and the design constraints of the technical solution. Skilled artisans may implement the described functions in different ways for each particular application, but such implementations should not be considered beyond the scope of the present invention.
Those skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the terminals and units described above, reference may be made to the corresponding processes in the foregoing method embodiments, which are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed terminal and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative; for instance, the division of the units is only a division by logical function, and there may be other divisions in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, devices, or units, and may also be electrical, mechanical, or other forms of connection.
The steps in the methods of the embodiments of the present invention may be reordered, combined, and deleted according to actual needs.
The units in the terminals of the embodiments of the present invention may be combined, divided, and deleted according to actual needs.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network elements. Some or all of the units may be selected according to actual needs to achieve the purpose of the solutions of the embodiments of the present invention.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The above integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a portable hard drive, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk, or an optical disc.
The speech recognition method and terminal provided by the embodiments of the present invention have been described in detail above. Specific examples have been used herein to explain the principles and embodiments of the present invention, and the description of the above embodiments is only intended to help understand the method of the present invention and its core idea. Meanwhile, for those of ordinary skill in the art, there will be changes in the specific embodiments and the scope of application according to the idea of the present invention. In summary, the content of this specification should not be construed as limiting the present invention.
Claims (10)
1. A speech recognition method, characterized by comprising:
collecting, by a terminal, a voice signal;
extracting, by the terminal, object information of the voice signal from the voice signal;
judging, by the terminal, whether the object information is the information of the terminal;
when the object information is not the information of the terminal, sending, by the terminal, the voice signal to a terminal corresponding to the object information, so that the terminal corresponding to the object information extracts an instruction from the voice signal and executes it.
2. The method according to claim 1, characterized in that the method further comprises:
when the object information is the information of the terminal, extracting, by the terminal, an instruction from the voice signal and executing it.
3. The method according to claim 2, characterized in that, when the object information is the information of the terminal, the method further comprises:
judging, by the terminal, whether the terminal has permission enabled;
when the terminal has permission enabled, extracting, by the terminal, features of the voice signal;
comparing, by the terminal, the features with stored voice features;
when a voice feature matching the features exists among the stored voice features, performing, by the terminal, the step of extracting an instruction from the voice signal and executing it.
4. The method according to claim 3, characterized in that the features include amplitude, phase, and frequency;
the comparing, by the terminal, the features with the stored voice features comprises:
comparing, by the terminal, the amplitude with a target amplitude, the phase with a target phase, and the frequency with a target frequency, where the target amplitude, the target phase, and the target frequency belong to a target voice feature among the stored voice features;
when the absolute value of the difference between the amplitude and the target amplitude is less than a first preset value, the absolute value of the difference between the phase and the target phase is less than a second preset value, and the absolute value of the difference between the frequency and the target frequency is less than a third preset value, determining that the target voice feature matches the features.
5. The method according to any one of claims 1-4, characterized in that the object information includes:
a terminal name; or
a terminal number; or
a terminal name and a terminal number.
6. A terminal, characterized by comprising:
a collecting unit, configured to collect a voice signal;
a first extraction unit, configured to extract object information of the voice signal from the voice signal collected by the collecting unit;
a first judging unit, configured to judge whether the object information extracted by the first extraction unit is the information of the terminal;
a transmitting unit, configured to, when the judgment result of the first judging unit is that the object information is not the information of the terminal, send the voice signal collected by the collecting unit to a terminal corresponding to the object information, so that the terminal corresponding to the object information extracts an instruction from the voice signal and executes it.
7. The terminal according to claim 6, characterized in that the terminal further comprises:
an execution unit, configured to, when the judgment result of the first judging unit is that the object information is the information of the terminal, extract an instruction from the voice signal collected by the collecting unit and execute it.
8. The terminal according to claim 7, characterized in that the terminal further comprises:
a second judging unit, configured to judge whether the terminal has permission enabled;
a second extraction unit, configured to, when the judgment result of the second judging unit is that the terminal has permission enabled, extract features of the voice signal collected by the collecting unit;
a comparing unit, configured to compare the features extracted by the second extraction unit with stored voice features, and, when the comparison result of the comparing unit is that a voice feature matching the features exists among the stored voice features, trigger the execution unit to perform the step of extracting an instruction from the voice signal and executing it.
9. The terminal according to claim 8, characterized in that the features include amplitude, phase, and frequency;
the comparing unit is specifically configured to compare the amplitude with a target amplitude, the phase with a target phase, and the frequency with a target frequency, and, when the absolute value of the difference between the amplitude and the target amplitude is less than a first preset value, the absolute value of the difference between the phase and the target phase is less than a second preset value, and the absolute value of the difference between the frequency and the target frequency is less than a third preset value, determine that the target voice feature matches the features, where the target amplitude, the target phase, and the target frequency belong to the target voice feature among the stored voice features.
10. The terminal according to any one of claims 6-9, characterized in that the object information includes:
a terminal name; or
a terminal number; or
a terminal name and a terminal number.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610626656.3A CN106251864A (en) | 2016-08-03 | 2016-08-03 | A kind of audio recognition method and terminal |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106251864A true CN106251864A (en) | 2016-12-21 |
Family
ID=57606203
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610626656.3A Pending CN106251864A (en) | 2016-08-03 | 2016-08-03 | A kind of audio recognition method and terminal |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106251864A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080097760A1 (en) * | 2006-10-23 | 2008-04-24 | Sungkyunkwan University Foundation For Corporate Collaboration | User-initiative voice service system and method |
CN101420543A (en) * | 2008-12-05 | 2009-04-29 | 天津三星电子显示器有限公司 | Method for voice controlling television and television therewith |
CN103092181A (en) * | 2012-12-28 | 2013-05-08 | 吴玉胜 | Household appliance control method and system thereof based on intelligent television equipment |
CN103631211A (en) * | 2012-08-29 | 2014-03-12 | 三星电子(中国)研发中心 | Method, device and system for controlling household appliance device |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109243443A (en) * | 2018-09-28 | 2019-01-18 | 联想(北京)有限公司 | Sound control method, device and electronic equipment |
CN109243443B (en) * | 2018-09-28 | 2022-05-31 | 联想(北京)有限公司 | Voice control method and device and electronic equipment |
CN110415682A (en) * | 2019-07-08 | 2019-11-05 | 海尔优家智能科技(北京)有限公司 | Control the method and device of smart machine |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20161221 |