CN106448668A

CN106448668A - Method and apparatus for speech recognition

Info

Publication number: CN106448668A
Application number: CN201610882389.6A
Authority: CN
Inventors: 修志远; 刘永辉; 房兰涛; 马克勇; 林洪刚
Original assignee: Shandong Inspur Business System Co Ltd
Current assignee: Shandong Inspur Business System Co Ltd
Priority date: 2016-10-10
Filing date: 2016-10-10
Publication date: 2017-02-22

Abstract

Embodiments of the invention provide a method and apparatus for speech recognition. They relate to the field of digital television technology and are intended to improve the recognition rate of voice commands and the accuracy of identifying the operation desired by the user. The method comprises: receiving to-be-recognized voice command information sent by a set-top box; determining at least one piece of operation information based on the to-be-recognized voice command information; determining, according to the at least one piece of operation information, the type and the type weight of the at least one piece of operation information; determining, according to the at least one piece of operation information and its type, the validity weight corresponding to the at least one piece of operation information; determining, according to the type weight and the validity weight of the at least one piece of operation information, the command weight of the at least one piece of operation information; and sending the operation information with the largest command weight to the set-top box. The invention is applicable to scenarios in which a television terminal is controlled by voice.

Description

Method and apparatus for speech recognition

Technical field

The present invention relates to the field of digital television technology, and in particular to a method and apparatus for speech recognition.

Background

With the continuous progress of digital home technology, three-network convergence is drawing ever closer, and set-top boxes implement more and more functions. As these functions multiply, voice control of the set-top box is increasingly popular because it reduces the number of keys needed on the remote control or the set-top box.

A voice set-top box relies on a server to convert the voice command input by the user into an operation instruction that the set-top box can recognize. However, the same voice command may have a corresponding operation instruction in several different scenarios, for example in applications, video on demand, and live broadcast. In that case the server returns one of these operation instructions to the voice set-top box at random. As a result, when the voice set-top box acts on the operation instruction returned by the server, the action may not be the one the user wanted. The recognition rate of voice commands is therefore low, and so is the accuracy of identifying the operation desired by the user.

Summary of the invention

Embodiments of the invention provide a method and apparatus for speech recognition, in order to improve the recognition rate of voice commands and the accuracy of identifying the operation desired by the user.

To achieve the above object, embodiments of the invention adopt the following technical solutions:

An embodiment of the invention provides a method for speech recognition, including: receiving to-be-recognized voice command information sent by a set-top box; determining at least one piece of operation information according to the to-be-recognized voice command information; determining, according to the at least one piece of operation information, the type and the type weight of the at least one piece of operation information; determining, according to the at least one piece of operation information and the type of the at least one piece of operation information, the validity weight corresponding to the at least one piece of operation information; determining, according to the type weight and the validity weight of the at least one piece of operation information, the command weight of the at least one piece of operation information; and sending the operation information with the largest command weight to the set-top box.

Optionally, after the operation information with the largest command weight is sent to the set-top box, the method further includes: determining the operation information executed by the set-top box and, according to the operation information executed by the set-top box, updating the validity weight corresponding to that operation information.

Optionally, determining the command weight of the at least one piece of operation information according to its type weight and validity weight includes: determining the command weight using the formula command weight = type weight * 0.7 + (average validity weight) * 0.3.

Further, an embodiment of the invention provides a method for speech recognition, including: obtaining to-be-recognized voice command information; sending the to-be-recognized voice command information to a recognition server; receiving the operation information sent by the recognition server; and executing the operation information.

Optionally, executing the operation information includes: when there is one piece of operation information, executing it; or, when there are at least two pieces of operation information, sending them to the user, obtaining the operation information selected by the user, and executing the operation information selected by the user.

Further, an embodiment of the invention provides a recognition server, including: a receiving unit, configured to receive to-be-recognized voice command information sent by a set-top box; a determining unit, configured to determine at least one piece of operation information according to the to-be-recognized voice command information received by the receiving unit; the determining unit being further configured to determine, according to the at least one piece of operation information, the type and the type weight of the at least one piece of operation information; the determining unit being further configured to determine, according to the at least one piece of operation information and the type of the at least one piece of operation information, the validity weight corresponding to the at least one piece of operation information; the determining unit being further configured to determine, according to the type weight and the validity weight of the at least one piece of operation information, the command weight of the at least one piece of operation information; and a sending unit, configured to send the operation information with the largest command weight, as determined by the determining unit, to the set-top box.

Optionally, the recognition server further includes: a processing unit, configured to determine the operation information executed by the set-top box and, according to the operation information executed by the set-top box, update the validity weight corresponding to that operation information.

Optionally, the determining unit is specifically configured to determine the command weight of the at least one piece of operation information from its type weight and validity weight using the formula command weight = type weight * 0.7 + (average validity weight) * 0.3.

Further, an embodiment of the invention provides a set-top box, including: an obtaining unit, configured to obtain to-be-recognized voice command information; a sending unit, configured to send the to-be-recognized voice command information obtained by the obtaining unit to a recognition server; a receiving unit, configured to receive the operation information sent by the recognition server; and a processing unit, configured to execute the operation information received by the receiving unit.

Optionally, the processing unit is specifically configured to: when there is one piece of operation information, execute it; or, when there are at least two pieces of operation information, send them to the user, obtain the operation information selected by the user, and execute the operation information selected by the user.

Embodiments of the invention provide a method and apparatus for speech recognition, including: receiving to-be-recognized voice command information sent by a set-top box; determining at least one piece of operation information according to the to-be-recognized voice command information; determining, according to the at least one piece of operation information, the type weight of the at least one piece of operation information; determining, according to the at least one piece of operation information, the validity weight of the at least one piece of operation information; determining, according to the type weight and the validity weight of the at least one piece of operation information, the command weight of the at least one piece of operation information; and sending the operation information with the largest command weight to the set-top box. In this way, after receiving the to-be-recognized voice command information, the recognition server can determine at least one piece of operation information, calculate the command weight of each piece, and send the operation information with the largest command weight to the set-top box. Compared with the prior art, the invention calculates the command weight of each piece of operation information, which improves the recognition rate of voice commands and in turn the accuracy of identifying the operation desired by the user.

Brief description of the drawings

In order to explain the technical solutions of the embodiments of the invention more clearly, the drawings needed in the description of the embodiments or of the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 is a schematic flowchart of a method for speech recognition provided by an embodiment of the invention;

Fig. 2 is a schematic flowchart of another method for speech recognition provided by an embodiment of the invention;

Fig. 3 is a schematic flowchart of another method for speech recognition provided by an embodiment of the invention;

Fig. 4 is a schematic structural diagram of a recognition server provided by an embodiment of the invention;

Fig. 5 is a schematic structural diagram of another recognition server provided by an embodiment of the invention;

Fig. 6 is a schematic structural diagram of a set-top box provided by an embodiment of the invention.

Detailed description

The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the drawings in the embodiments of the invention. Obviously, the described embodiments are only some of the embodiments of the invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the invention, without creative effort, fall within the scope of protection of the invention.

An embodiment of the invention provides a method for speech recognition which, as shown in Fig. 1, includes:

Step 101: receive the to-be-recognized voice command information sent by a set-top box.

Specifically, the user can issue a voice command to the set-top box through a remote control device. After hearing the voice command input by the user, the remote control device can convert it into to-be-recognized voice command information in text form, that is, convert the user's spoken command into text, and send this text to the set-top box. The set-top box then forwards the received to-be-recognized voice command information to a recognition server that can identify the operation information corresponding to it. At this point the recognition server receives the to-be-recognized voice command information sent by the set-top box.

Step 102: determine at least one piece of operation information according to the to-be-recognized voice command information.

Specifically, the functions implemented by the set-top box can be divided into three classes: video on demand, live broadcast, and applications. The recognition server stores operation information separately for each of these three classes. After receiving the to-be-recognized voice command information, the recognition server can search the on-demand database, the live-broadcast database, and the application database respectively, and find each piece of operation information corresponding to the to-be-recognized voice command information.
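The lookup in step 102 might be sketched as follows. This is only an illustration: the three dictionaries stand in for the on-demand, live-broadcast, and application databases, and the keyword-matching rule, the entry strings, and the function name `find_operation_info` are assumptions that do not come from the patent text.

```python
# Hypothetical stand-ins for the on-demand, live-broadcast, and application
# databases; keys are keywords, values are the stored operation entries.
ON_DEMAND_DB = {"cctv": ["play_on_demand:CCTV documentary"]}
LIVE_DB = {"cctv": ["tune_live_channel:CCTV"]}
APPLICATION_DB = {"weather": ["launch_app:Weather"]}


def find_operation_info(command_text):
    """Collect every stored operation whose keyword occurs in the command text."""
    text = command_text.lower()
    matches = []
    # Search the three databases in turn, as step 102 describes.
    for database in (ON_DEMAND_DB, LIVE_DB, APPLICATION_DB):
        for keyword, operations in database.items():
            if keyword in text:
                matches.extend(operations)
    return matches


candidates = find_operation_info("Watch CCTV")
```

Here the same command matches an entry in both the on-demand and the live-broadcast database, which is exactly the ambiguity the weighting in the later steps is meant to resolve.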

Step 103: determine, according to the at least one piece of operation information, the type and the type weight of the at least one piece of operation information.

It should be noted that different pieces of operation information in the recognition server may be of different types, and that a type weight is set in advance for each type. For example, operation information can be classified into digital TV channel name, operation class (mute, shutdown, channel change, etc.), channel number, menu bar, two-way video on demand, and so on. The preset type weights may be, for example: operation class (mute, shutdown, channel change, etc.) 20, channel name 10, channel number 8, menu bar 9, and two-way video on demand 9.

Specifically, after finding the at least one piece of operation information corresponding to the to-be-recognized voice command information, the recognition server can parse each piece of operation information to determine its type. It can then look up the preset type weights according to the type of each piece of operation information and determine the type weight corresponding to that type. In this way the type and the corresponding type weight of each of the at least one piece of operation information are determined.
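As a minimal sketch, the type-weight lookup of step 103 can be written as a table lookup. The numeric weights are the example values given in the description; the identifier names and the assumption that each piece of operation information carries a `type` field are illustrative only.

```python
# Example type weights from the description; the key names are assumptions.
TYPE_WEIGHTS = {
    "operation_class": 20,   # mute, shutdown, channel change, etc.
    "channel_name": 10,
    "channel_number": 8,
    "menu_bar": 9,
    "two_way_vod": 9,
}


def type_and_weight(operation):
    """Return (type, type weight) for an operation that carries a 'type' field."""
    op_type = operation["type"]
    return op_type, TYPE_WEIGHTS[op_type]


result = type_and_weight({"type": "operation_class", "action": "mute"})
```

In this sketch an unknown type raises `KeyError`; how the recognition server treats operation information of an unlisted type is not stated in the source.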

Step 104: determine, according to the at least one piece of operation information and the type of the at least one piece of operation information, the validity weight corresponding to the at least one piece of operation information.

It should be noted that, in embodiments of the invention, a weight is preset for each concrete operation under each type of operation information; this weight is the validity weight. For example, weights are preset for mute under the operation class, for system settings under the operation class, for CCTV under digital channel name, for channel ** under channel number, for system settings under the menu bar, and so on.

Specifically, after determining the type of each piece of operation information, the recognition server can parse each piece of operation information to determine the concrete operation it corresponds to. It can then look up the validity weight of that concrete operation and thereby determine the validity weight of each piece of operation information.

Further, if several concrete operations of the same type are parsed from one piece of operation information, the validity weight of that operation information may be the average of the validity weights of those concrete operations.
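The averaging rule can be illustrated with a short sketch. The validity weight values below are invented for the example (the description gives no numbers for them), and the `(type, concrete operation)` key layout is an assumption.

```python
from statistics import mean

# Hypothetical preset validity weights keyed by (type, concrete operation).
VALIDITY_WEIGHTS = {
    ("operation_class", "mute"): 12,
    ("operation_class", "system_settings"): 4,
    ("channel_name", "CCTV"): 9,
}


def validity_weight(op_type, concrete_operations):
    """Average the preset validity weights of the parsed concrete operations."""
    weights = [VALIDITY_WEIGHTS[(op_type, name)] for name in concrete_operations]
    # With a single concrete operation the average is just its own weight.
    return mean(weights)


single = validity_weight("channel_name", ["CCTV"])
averaged = validity_weight("operation_class", ["mute", "system_settings"])
```

With these invented values, an operation information that parses to both "mute" and "system settings" gets the validity weight (12 + 4) / 2 = 8.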

Step 105: determine, according to the type weight and the validity weight of the at least one piece of operation information, the command weight of the at least one piece of operation information.

Specifically, after determining the type weight and the validity weight of the at least one piece of operation information, the recognition server can use them to calculate the command weight of each piece of operation information.

Further, determining the command weight of the at least one piece of operation information according to its type weight and validity weight includes: determining the command weight using the formula command weight = type weight * 0.7 + (average validity weight) * 0.3.
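A worked instance of the formula may help. The coefficients 0.7 and 0.3 and the type weight 20 come from the description; the validity weights 12 and 4 are hypothetical inputs chosen for the arithmetic, and the function name is an assumption.

```python
def command_weight(type_weight, validity_weights):
    """command weight = type weight * 0.7 + (average validity weight) * 0.3"""
    average_validity = sum(validity_weights) / len(validity_weights)
    return type_weight * 0.7 + average_validity * 0.3


# Operation class (type weight 20) with hypothetical validity weights 12 and 4:
# 20 * 0.7 + ((12 + 4) / 2) * 0.3 = 14 + 2.4, i.e. about 16.4.
weight = command_weight(20, [12, 4])
```

Because the type weight carries 70% of the result, the type of an operation dominates the ranking, while the validity weight mainly separates candidates whose types have equal or similar weights.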

Step 106: send the operation information with the largest command weight to the set-top box.

Specifically, after calculating the command weight of each piece of operation information, the recognition server can infer from the command weight how likely the user is to select that operation: the larger the command weight, the more likely the user is to select it. The operation information with the largest command weight is therefore sent to the set-top box preferentially.

In this way, after receiving the to-be-recognized voice command information, the recognition server can determine at least one piece of operation information, calculate the command weight of each piece, and send the operation information with the largest command weight to the set-top box. Compared with the prior art, the invention calculates the command weight of each piece of operation information, which improves the recognition rate of voice commands and in turn the accuracy of identifying the operation desired by the user.

An embodiment of the invention provides a method for speech recognition which, as shown in Fig. 2, includes:

Step 201: obtain to-be-recognized voice command information.

Specifically, the user can issue a voice command to the set-top box through a remote control device. After hearing the voice command input by the user, the remote control device can convert it into to-be-recognized voice command information in text form, that is, convert the user's spoken command into text, and send this text to the set-top box. The set-top box thus receives the to-be-recognized voice command information.

Step 202: send the to-be-recognized voice command information to a recognition server.

Specifically, the set-top box cannot itself identify the operation information corresponding to the to-be-recognized voice command information and must rely on the recognition server to do so. Therefore, after receiving the to-be-recognized voice command information, the set-top box sends it to the recognition server.

Step 203: receive the operation information sent by the recognition server.

Specifically, after receiving the to-be-recognized voice command information sent by the set-top box, the recognition server can determine operation information according to it and send this operation information to the set-top box. The set-top box receives the operation information.

Step 204: execute the operation information.

Specifically, after receiving the operation information, the set-top box can carry out the corresponding data processing according to it and display the corresponding picture through the television terminal.

Further, executing the operation information includes: when there is one piece of operation information, executing it; or, when there are at least two pieces of operation information, sending them to the user, obtaining the operation information selected by the user, and executing the operation information selected by the user.

That is, when the recognition server calculates the command weight of each piece of operation information, the maximum command weight may correspond to at least two pieces of operation information; in other words, at least two pieces of operation information have the same command weight and that weight is the maximum. In this case the recognition server sends all of these pieces of operation information to the set-top box. Of course, the maximum command weight may also correspond to a single piece of operation information, in which case the recognition server sends that operation information directly to the set-top box.
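The selection on the server side, including the tie case just described, might look like the following sketch. The dictionary layout, the operation names, and the weights are illustrative assumptions.

```python
def operations_with_max_weight(command_weights):
    """Return every operation whose command weight equals the maximum."""
    highest = max(command_weights.values())
    # More than one entry is returned when several operations tie for the maximum.
    return [op for op, weight in command_weights.items() if weight == highest]


unique = operations_with_max_weight({"mute": 16.4, "tune:CCTV": 9.7})
tied = operations_with_max_weight({"tune:CCTV": 9.7, "play_vod:CCTV": 9.7, "menu": 3.0})
```

The comparison uses exact equality, which is adequate when tied operations are computed from identical inputs; a real implementation working with floating-point weights might prefer a small tolerance.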

When the set-top box receives one piece of operation information, it can execute that operation information directly. When it receives at least two pieces of operation information, the set-top box cannot determine which one to execute. It can then feed the at least two pieces of operation information back to the user, that is, send them to the user, so that the user selects which one to execute. The user informs the set-top box of the selected operation information; the set-top box obtains the operation information selected by the user and executes it.

It should be noted that one specific way for the set-top box to feed the at least two pieces of operation information back to the user, for the user to make a selection, and for the set-top box to obtain the operation information selected by the user is as follows: the set-top box presents the at least two pieces of operation information to the user through the television terminal; the user makes a selection on the television terminal using the remote control device, which sends a selection instruction containing the selected operation information to the set-top box; after receiving the selection instruction, the set-top box parses it to learn which operation information the user selected, and then performs the corresponding processing. Other approaches are of course also possible, and the invention places no limitation on this.

It should be noted that, after receiving at least two pieces of operation information, the set-top box may also refrain from feeding them back to the user and instead select one piece of operation information for processing according to a preset selection rule. The selection rule can be set in advance by the user according to actual needs. The operation information to be executed can of course also be determined in other ways, and the invention places no limitation on this.
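The set-top box's decision logic described above can be sketched as a small function. The callback names `ask_user` and `preset_rule` are hypothetical, and giving the preset rule precedence over prompting the user is an assumption; the source presents the two as alternatives without ranking them.

```python
def choose_operation(operations, ask_user, preset_rule=None):
    """Pick the operation the set-top box should execute."""
    if len(operations) == 1:
        # A single piece of operation information is executed directly.
        return operations[0]
    if preset_rule is not None:
        # Assumed precedence: a user-configured rule avoids prompting the user.
        return preset_rule(operations)
    # Otherwise present the candidates (e.g. on the TV) and take the user's choice.
    return ask_user(operations)


# Usage with stand-in callbacks: the "user" picks the last candidate,
# the preset rule picks the first.
picked_by_user = choose_operation(["tune:CCTV", "play_vod:CCTV"], ask_user=lambda ops: ops[-1])
picked_by_rule = choose_operation(
    ["tune:CCTV", "play_vod:CCTV"],
    ask_user=lambda ops: ops[-1],
    preset_rule=lambda ops: ops[0],
)
```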

An embodiment of the invention provides a method for speech recognition, including: the set-top box obtains to-be-recognized voice command information, sends it to the recognition server, receives the operation information sent by the recognition server, and executes the operation information. In this way, after the set-top box sends the to-be-recognized voice command information to the recognition server, the recognition server can determine at least one piece of operation information, calculate the command weight of each piece, and send the operation information with the largest command weight to the set-top box. Compared with the prior art, the invention calculates the command weight of each piece of operation information, which improves the recognition rate of voice commands and in turn the accuracy of identifying the operation desired by the user.

An embodiment of the invention provides a method for speech recognition which, as shown in Fig. 3, includes:

Step 301: the set-top box obtains to-be-recognized voice command information.

For details, see step 201; they are not repeated here.

Step 302: the set-top box sends the to-be-recognized voice command information to the recognition server, and the recognition server receives the to-be-recognized voice command information sent by the set-top box.

For details, see steps 202 and 101; they are not repeated here.

Step 303: the recognition server determines at least one piece of operation information according to the to-be-recognized voice command information.

For details, see step 102; they are not repeated here.

Step 304: the recognition server determines, according to the at least one piece of operation information, the type and the type weight of the at least one piece of operation information.

For details, see step 103; they are not repeated here.

Step 305: the recognition server determines, according to the at least one piece of operation information and the type of the at least one piece of operation information, the validity weight corresponding to the at least one piece of operation information.

For details, see step 104; they are not repeated here.

Step 306: the recognition server determines, according to the type weight and the validity weight of the at least one piece of operation information, the command weight of the at least one piece of operation information.

For details, see step 105; they are not repeated here.

Step 307: the recognition server sends the operation information with the largest command weight to the set-top box, and the set-top box receives the operation information sent by the recognition server.

For details, see steps 106 and 203; they are not repeated here.

Step 308: the set-top box executes the operation information.

For details, see step 204; they are not repeated here.

Step 309: the set-top box feeds the executed operation information back to the recognition server; the recognition server determines the operation information executed by the set-top box and, according to it, updates the validity weight corresponding to that operation information.

Specifically, after executing the operation information, the set-top box needs to send the executed operation information to the recognition server. After receiving it, the recognition server can raise the validity weight corresponding to that operation information, for example by adding 1 to it. In this way, the validity weight corresponding to an operation information is updated each time according to the operation information executed by the set-top box, which raises the command weight of that operation information and further improves the recognition rate of voice commands.
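The feedback update in step 309 amounts to incrementing a stored counter. The sketch below assumes the validity weights live in a dictionary keyed by (type, concrete operation); the increment of 1 is the example from the description, while the starting value 12 and the default of 0 for an unseen operation are assumptions.

```python
def record_execution(validity_weights, executed_operation, increment=1):
    """Raise the validity weight of the operation the set-top box executed."""
    current = validity_weights.get(executed_operation, 0)  # assumed default of 0
    validity_weights[executed_operation] = current + increment
    return validity_weights[executed_operation]


weights = {("operation_class", "mute"): 12}
updated = record_execution(weights, ("operation_class", "mute"))
```

Since the validity weight feeds into the command weight with a coefficient of 0.3, each reported execution raises the command weight of an operation with a single concrete operation by 0.3, so frequently executed operations gradually rank higher.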

An embodiment of the invention provides a recognition server which, as shown in Fig. 4, includes:

A receiving unit 401, configured to receive the to-be-recognized voice command information sent by a set-top box.

A determining unit 402, configured to determine at least one piece of operation information according to the to-be-recognized voice command information received by the receiving unit 401.

The determining unit 402 is further configured to determine, according to the at least one piece of operation information, the type and the type weight of the at least one piece of operation information.

The determining unit 402 is further configured to determine, according to the at least one piece of operation information and the type of the at least one piece of operation information, the validity weight corresponding to the at least one piece of operation information.

The determining unit 402 is further configured to determine, according to the type weight and the validity weight of the at least one piece of operation information, the command weight of the at least one piece of operation information.

Specifically, the determining unit 402 is configured to determine the command weight of the at least one piece of operation information from its type weight and validity weight using the formula command weight = type weight * 0.7 + (average validity weight) * 0.3.

A sending unit 403, configured to send the operation information with the largest command weight, as determined by the determining unit 402, to the set-top box.

Further, as shown in Fig. 5, the above recognition server also includes:

A processing unit 404, configured to determine the operation information executed by the set-top box and, according to the operation information executed by the set-top box, update the validity weight corresponding to that operation information.

An embodiment of the invention provides a set-top box which, as shown in Fig. 6, includes:

An obtaining unit 501, configured to obtain to-be-recognized voice command information.

A sending unit 502, configured to send the to-be-recognized voice command information obtained by the obtaining unit 501 to a recognition server.

A receiving unit 503, configured to receive the operation information sent by the recognition server.

A processing unit 504, configured to execute the operation information received by the receiving unit 503.

Specifically, the processing unit 504 is configured to: when there is one piece of operation information, execute it; or, when there are at least two pieces of operation information, send them to the user, obtain the operation information selected by the user, and execute the operation information selected by the user.

The set-top box obtains to-be-recognized voice command information, sends it to the recognition server, receives the operation information sent by the recognition server, and executes the operation information. In this way, after the set-top box sends the to-be-recognized voice command information to the recognition server, the recognition server can determine at least one piece of operation information, calculate the command weight of each piece, and send the operation information with the largest command weight to the set-top box. Compared with the prior art, the invention calculates the command weight of each piece of operation information, which improves the recognition rate of voice commands and in turn the accuracy of identifying the operation desired by the user.

Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the invention, not to limit them. Although the invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the invention.

Claims

1. A method of speech recognition, characterized by comprising:

receiving to-be-recognized voice command information sent by a set top box;

determining at least one piece of operation information according to the to-be-recognized voice command information;

determining, according to the at least one piece of operation information, a type and a type weight of the at least one piece of operation information;

determining, according to the at least one piece of operation information and the type of the at least one piece of operation information, a validity weight corresponding to the at least one piece of operation information;

determining a command weight of the at least one piece of operation information according to the type weight of the at least one piece of operation information and the validity weight of the at least one piece of operation information;

sending the operation information with the largest command weight to the set top box.
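As an illustration only, the final selection step of claim 1 can be sketched in Python. The `Candidate` class, its field names, and the function name are hypothetical; the claim does not prescribe any data structure, and the command weight is assumed here to have been computed already.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical container for one piece of operation information."""
    operation: str         # e.g. a textual description of the operation
    command_weight: float  # assumed to be computed beforehand


def select_operation(candidates):
    """Return the candidate with the largest command weight."""
    if not candidates:
        # The claim requires at least one piece of operation information.
        raise ValueError("at least one piece of operation information is required")
    return max(candidates, key=lambda c: c.command_weight)
```

In this sketch, ties are resolved in favor of the first candidate encountered, which is a property of Python's `max` and not something the claim specifies.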

2. The method according to claim 1, characterized in that, after the sending of the operation information with the largest command weight to the set top box, the method further comprises:

determining the operation information executed by the set top box, and updating, according to the operation information executed by the set top box, the validity weight corresponding to the operation information.

3. method according to claim 1 and 2 is it is characterised in that the class of at least one operation information described in described basis The effectiveness weight of type weight and at least one operation information described, determines the instruction weight bag of at least one operation information described Include：

Type weight according at least one operation information described and the effectiveness weight of at least one operation information described, utilize Formula instructs weight=type weight * 0.7+（Effectiveness weighted mean）* 0.3, determine the finger of at least one operation information described Make weight.

4. A method of speech recognition, characterized by comprising:

obtaining to-be-recognized voice command information;

sending the to-be-recognized voice command information to a recognition server;

receiving operation information sent by the recognition server;

executing the operation information.

5. The method according to claim 4, characterized in that the executing of the operation information comprises:

when the number of pieces of operation information is one, executing the operation information;

or,

when the number of pieces of operation information is at least two, sending the operation information to a user;

obtaining the operation information selected by the user, and executing the operation information selected by the user.
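The branching in claim 5 might be sketched as follows. The names `handle_operations`, `execute`, and `ask_user` are hypothetical callbacks introduced for illustration; the claim does not say how operation information is presented to the user or how the selection is returned.

```python
def handle_operations(operations, execute, ask_user):
    """Execute a single operation directly, or let the user choose among several.

    operations: list of pieces of operation information received from the server
    execute:    callable that carries out one operation (assumed)
    ask_user:   callable that shows the list and returns the user's choice (assumed)
    """
    if not operations:
        raise ValueError("no operation information received")
    if len(operations) == 1:
        chosen = operations[0]
    else:
        # At least two candidates: defer the decision to the user.
        chosen = ask_user(operations)
    execute(chosen)
    return chosen
```

Passing the two behaviors in as callables keeps the sketch independent of any particular set top box user interface.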

6. A recognition server, characterized by comprising:

a receiving unit, configured to receive to-be-recognized voice command information sent by a set top box;

a determining unit, configured to determine at least one piece of operation information according to the to-be-recognized voice command information received by the receiving unit;

the determining unit being further configured to determine, according to the at least one piece of operation information, a type and a type weight of the at least one piece of operation information;

the determining unit being further configured to determine, according to the at least one piece of operation information and the type of the at least one piece of operation information, a validity weight corresponding to the at least one piece of operation information;

the determining unit being further configured to determine a command weight of the at least one piece of operation information according to the type weight of the at least one piece of operation information and the validity weight of the at least one piece of operation information;

a sending unit, configured to send the operation information with the largest command weight, as determined by the determining unit, to the set top box.

7. The recognition server according to claim 6, characterized by further comprising:

a processing unit, configured to determine the operation information executed by the set top box, and to update, according to the operation information executed by the set top box, the validity weight corresponding to the operation information.

8. The recognition server according to claim 6 or 7, characterized in that

the determining unit is specifically configured to determine the command weight of the at least one piece of operation information according to the type weight of the at least one piece of operation information and the validity weight of the at least one piece of operation information, using the formula: command weight = type weight * 0.7 + (mean validity weight) * 0.3.

9. A set top box, characterized by comprising:

an obtaining unit, configured to obtain to-be-recognized voice command information;

a sending unit, configured to send the to-be-recognized voice command information obtained by the obtaining unit to a recognition server;

a receiving unit, configured to receive operation information sent by the recognition server;

a processing unit, configured to execute the operation information received by the receiving unit.

10. The set top box according to claim 9, characterized in that

the processing unit is specifically configured to execute the operation information when the number of pieces of operation information is one;

or,