CN109584865B - Application program control method and device, readable storage medium and terminal equipment - Google Patents


Info

Publication number
CN109584865B
CN109584865B (application CN201811210044.1A)
Authority
CN
China
Prior art keywords
control instruction
word
voice
keyword
frequency
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811210044.1A
Other languages
Chinese (zh)
Other versions
CN109584865A (en)
Inventor
董亚荣 (Dong Yarong)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN201811210044.1A
Publication of CN109584865A
Application granted
Publication of CN109584865B
Legal status: Active


Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/44 Arrangements for executing specific programs
    • G06F 9/451 Execution arrangements for user interfaces
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L 2015/223 Execution procedure of a spoken command

Landscapes

  • Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to the field of computer technologies, and in particular to an application control method and device, a computer-readable storage medium, and a terminal device. After receiving a voice acquisition instruction, the method collects voice information input by a user, performs speech recognition on the collected voice information to obtain corresponding text information, determines a target control instruction for an application program through matching-degree calculation, and controls the application program to execute the operation corresponding to the target control instruction. With the embodiments of the invention, a user can issue application control instructions by voice, and the application automatically executes the corresponding operation; the interaction is simple, efficiency is greatly improved, and the user experience is better.

Description

Application program control method and device, readable storage medium and terminal equipment
Technical Field
The present invention relates to the field of computer technologies, and in particular, to an application control method, an application control device, a computer readable storage medium, and a terminal device.
Background
With the development of technology, more and more enterprises have adopted electronic office systems: users can apply for leave, business trips, reimbursement, outgoing work, and so on directly in office applications, which greatly improves work efficiency compared with traditional paper-based applications. However, operating existing office applications is still cumbersome: the corresponding function options can only be reached through repeated clicking and searching, which is time-consuming, labor-intensive, and makes for a poor user experience.
Disclosure of Invention
In view of the above, the embodiments of the present invention provide an application control method, an apparatus, a computer readable storage medium, and a terminal device, so as to solve the problem that the existing office application has complicated operation and poor user experience.
A first aspect of an embodiment of the present invention provides an application control method, which may include:
after receiving a voice acquisition instruction, acquiring voice information input by a user, wherein the voice information comprises a control instruction of an application program;
performing voice recognition on the collected voice information to obtain text information corresponding to the voice information;
respectively calculating the matching degree between the text information and each control instruction in a preset control instruction set;
and selecting a control instruction with highest matching degree with the text information from the control instruction set as a target control instruction for the application program, and controlling the application program to execute an operation corresponding to the target control instruction.
A second aspect of an embodiment of the present invention provides an application control apparatus, which may include:
the voice information acquisition module is used for acquiring voice information input by a user after receiving a voice acquisition instruction, wherein the voice information comprises a control instruction of an application program;
The voice recognition module is used for carrying out voice recognition on the collected voice information to obtain text information corresponding to the voice information;
the matching degree calculation module is used for calculating the matching degree between the text information and each control instruction in a preset control instruction set respectively;
The target control instruction selecting module is used for selecting a control instruction with highest matching degree with the text information from the control instruction set as a target control instruction for the application program;
and the operation execution module is used for controlling the application program to execute the operation corresponding to the target control instruction.
A third aspect of embodiments of the present invention provides a computer readable storage medium storing computer readable instructions which when executed by a processor perform the steps of:
after receiving a voice acquisition instruction, acquiring voice information input by a user, wherein the voice information comprises a control instruction of an application program;
performing voice recognition on the collected voice information to obtain text information corresponding to the voice information;
respectively calculating the matching degree between the text information and each control instruction in a preset control instruction set;
and selecting a control instruction with highest matching degree with the text information from the control instruction set as a target control instruction for the application program, and controlling the application program to execute an operation corresponding to the target control instruction.
A fourth aspect of the embodiments of the present invention provides a terminal device comprising a memory, a processor and computer readable instructions stored in the memory and executable on the processor, the processor executing the computer readable instructions to perform the steps of:
after receiving a voice acquisition instruction, acquiring voice information input by a user, wherein the voice information comprises a control instruction of an application program;
performing voice recognition on the collected voice information to obtain text information corresponding to the voice information;
respectively calculating the matching degree between the text information and each control instruction in a preset control instruction set;
and selecting a control instruction with highest matching degree with the text information from the control instruction set as a target control instruction for the application program, and controlling the application program to execute an operation corresponding to the target control instruction.
Compared with the prior art, the embodiments of the present invention have the following beneficial effects: after receiving a voice acquisition instruction, voice information input by a user is collected; speech recognition is performed on the collected voice information to obtain corresponding text information; a target control instruction for an application program is then determined through matching-degree calculation; and the application program is controlled to execute the operation corresponding to the target control instruction. With the embodiments of the invention, a user can issue application control instructions by voice, and the application automatically executes the corresponding operation; the interaction is simple, efficiency is greatly improved, and the user experience is better.
Drawings
To illustrate the technical solutions of the embodiments more clearly, the drawings used in the embodiments or in the description of the prior art are briefly introduced below. The drawings described below are only some embodiments of the present invention; a person skilled in the art can derive other drawings from them without inventive effort.
FIG. 1 is a flowchart of an embodiment of a method for controlling an application program according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of calculating the degree of matching between text information and each control instruction in a preset control instruction set, respectively;
FIG. 3 is a schematic flow chart of computing voiceprint feature vectors of speech information;
FIG. 4 is a block diagram of an embodiment of an application control device according to an embodiment of the present invention;
Fig. 5 is a schematic block diagram of a terminal device in an embodiment of the present invention.
Detailed Description
To make the objects, features, and advantages of the present invention clearer, the technical solutions in the embodiments are described in detail below with reference to the accompanying drawings. The embodiments described below are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person skilled in the art based on these embodiments without inventive effort fall within the scope of the invention.
Referring to fig. 1, an embodiment of an application control method according to an embodiment of the present invention may include:
step S101, after receiving a voice acquisition instruction, acquiring voice information input by a user.
The voice information includes a control instruction for an application program. While the user holds down the voice input button, the application collects voice information; when the user releases the button, a voice acquisition termination instruction is issued to the application, and the application ends the collection.
Step S102, performing voice recognition on the collected voice information to obtain text information corresponding to the voice information.
Speech recognition converts a segment of speech into corresponding text and mainly comprises feature extraction, an acoustic model, a language model, and decoding. To extract features effectively, preprocessing such as filtering and framing is usually applied to the collected voice information first, so that the audio signal to be analyzed is properly extracted from the raw signal.
Feature extraction works transform the speech information from the time domain to the frequency domain, providing the acoustic model with the appropriate feature vectors.
The score of each feature vector against the acoustic model is then calculated. A hidden Markov model (HMM) is preferably used for acoustic modeling in this embodiment. A Markov model is a discrete-time finite-state automaton; "hidden" means that the internal states of the model are not observable from outside, and only the output value at each moment can be observed. For speech recognition systems, the output values are usually the acoustic features computed from individual frames. The HMM makes two assumptions to characterize speech: the transition to a state depends only on the previous state, and the output value depends only on the current state (or current state transition), which greatly reduces model complexity. In speech recognition, HMMs are typically built with a left-to-right topology using self-loops and skip transitions; a phoneme is a three- to five-state HMM, a word is the concatenation of the HMMs of its constituent phonemes, and the full model for continuous speech recognition combines word HMMs with silence models.
The language model computes the probability of the candidate word sequences for the speech according to linguistic theory. This embodiment preferably uses an N-gram language model, which assumes that the occurrence of the N-th word depends only on the preceding N-1 words and on no other word; the probability of a whole sentence is then the product of the conditional probabilities of its words. These probabilities can be obtained by counting co-occurrences of N words in a corpus; bigram (N=2) and trigram (N=3) models are the most common. Language-model quality is typically measured by cross entropy and perplexity. Cross entropy reflects how difficult the model finds the text, or, from a compression viewpoint, how many bits are needed on average to encode each word. Perplexity is the average number of branching choices the model assigns to the text; its reciprocal can be regarded as the average probability of each word. Smoothing assigns probability mass to unseen N-gram combinations so that any word sequence receives a nonzero probability from the language model.
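The bigram case described above can be sketched in a few lines. The toy corpus, the add-one smoothing, and all counts below are illustrative assumptions, not the patent's actual model:

```python
from collections import Counter

def train_bigram(corpus):
    """Count unigrams and bigrams from a list of tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sent in corpus:
        tokens = ["<s>"] + sent          # sentence-start marker
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def sentence_prob(sent, unigrams, bigrams, vocab_size, alpha=1.0):
    """P(sentence) as a product of add-alpha smoothed bigram probabilities."""
    prob = 1.0
    tokens = ["<s>"] + sent
    for prev, cur in zip(tokens, tokens[1:]):
        prob *= (bigrams[(prev, cur)] + alpha) / (unigrams[prev] + alpha * vocab_size)
    return prob

corpus = [["i", "want", "to", "travel"], ["i", "want", "leave"]]
uni, bi = train_bigram(corpus)
vocab = len({w for s in corpus for w in s}) + 1   # +1 for "<s>"
p = sentence_prob(["i", "want", "leave"], uni, bi, vocab)
```

Smoothing is visible here: even a bigram never seen in the corpus would still receive the nonzero probability `alpha / (count + alpha * vocab)`.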
And finally, decoding the phrase sequence according to the existing dictionary to obtain text information corresponding to the voice information.
Step S103, the matching degree between the text information and each control instruction in a preset control instruction set is calculated respectively.
The control instruction set may include, but is not limited to, control instructions for creating a leave application, creating a business-trip application, creating a reimbursement application, creating an outgoing application, and the like.
As shown in fig. 2, step S103 may specifically include the following procedures:
Step S1031, determining the keyword set corresponding to each control instruction, and calculating the classification discrimination degree of each keyword in every keyword set.
Firstly, word segmentation is performed on each corpus entry in a preset corpus to obtain individual words.
The corpus comprises corpus sub-libraries corresponding to the respective control instructions; each sub-library can be built from statistics over large-scale user data. Specifically, the sentences that users habitually use when issuing a given control instruction are collected and added to the corpus sub-library corresponding to that instruction. For example, if user A habitually issues the instruction for creating a business-trip application with the sentence "I want to go on a business trip", and user B with the sentence "please help me create a business-trip application", both sentences are added as corpus entries to the sub-library corresponding to the instruction for creating a business-trip application.
Word segmentation splits a corpus entry into individual words. In this embodiment, entries can be segmented against a general-purpose dictionary, so that the resulting tokens are ordinary words; characters not found in the dictionary are split off individually. When the characters could form a word in either direction, the split is decided by word frequency. For example, the phrase 要求神拜佛 can be segmented as 要求/神拜佛 if the word 要求 ("ask") has the higher frequency, or as 要/求神拜佛 if 求神拜佛 ("pray to the gods for blessing") has the higher frequency.
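The frequency-based disambiguation described above can be illustrated with a minimal greedy segmenter. The dictionary, the frequencies, and the forward-scan strategy are all simplifying assumptions for illustration:

```python
def segment(text, freq_dict, max_len=4):
    """Greedy forward segmentation: at each position, among the dictionary
    words that match, pick the one with the highest frequency; fall back
    to a single character when nothing matches."""
    result, i = [], 0
    while i < len(text):
        candidates = [text[i:i + k] for k in range(1, max_len + 1)
                      if text[i:i + k] in freq_dict]
        word = max(candidates, key=lambda w: freq_dict[w]) if candidates else text[i]
        result.append(word)
        i += len(word)
    return result

# Hypothetical frequencies: the split chosen for the ambiguous phrase
# depends on which competing dictionary entry is more frequent.
seg_a = segment("要求神拜佛", {"要": 900, "要求": 50, "求神拜佛": 10})
seg_b = segment("要求神拜佛", {"要": 50, "要求": 900, "神拜佛": 10})
```

With the first dictionary the cheaper word 要 wins and the remainder matches 求神拜佛; with the second, 要求 wins and the remainder is segmented separately.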
Then, the frequency of each word in each corpus sub-library is counted, and the classification discrimination degree of each word is calculated according to the following formula:
ClassDeg_w = MAX(FreqSeq_w) / MEAN(FreqSeq'_w)
wherein w is the sequence number of a word, 1 ≤ w ≤ WordNum, WordNum is the total number of words, FreqSeq_w is the frequency sequence of the w-th word across the corpus sub-libraries, FreqSeq_w = [Freq_{w,1}, Freq_{w,2}, ..., Freq_{w,c}, ..., Freq_{w,ClassNum}], Freq_{w,c} is the frequency of the w-th word in the corpus sub-library corresponding to the c-th control instruction, FreqSeq'_w is the sequence remaining after the maximum value is removed from FreqSeq_w, namely FreqSeq'_w = FreqSeq_w - MAX(FreqSeq_w), MAX is the maximum function, MEAN is the arithmetic-mean function, and ClassDeg_w is the classification discrimination degree of the w-th word;
Then, words whose classification discrimination degree is greater than a preset discrimination threshold are selected as keywords; each keyword is assigned to the control instruction whose corpus sub-library yields the maximum value in FreqSeq_w.
The discrimination threshold may be set according to the practical situation; for example, it may be set to 5, 10, 20, or another value.
The control instruction corresponding to each keyword may be determined according to the following formula:
TgtKwSet_w = argmax(FreqSeq_w) = argmax(Freq_{w,1}, Freq_{w,2}, ..., Freq_{w,c}, ..., Freq_{w,ClassNum})
wherein TgtKwSet_w is the sequence number of the control instruction corresponding to the w-th keyword;
For example, suppose the word "illness" occurs 1000 times in the corpus sub-library corresponding to creating a leave application, 20 times in the sub-library corresponding to creating a reimbursement application, and once in the sub-library corresponding to creating a business-trip application. Its classification discrimination degree is then 1000 / ((20 + 1) / 2) ≈ 95.2.
This exceeds the discrimination threshold, so "illness" is determined to be a keyword; and since it occurs most frequently in the corpus sub-library corresponding to creating a leave application, it is assigned to the control instruction for creating a leave application.
Finally, the keywords corresponding to the c-th control instruction are assembled into the keyword set for the c-th control instruction, as shown in the following table:
Control instruction | Keyword set
Control instruction 1 | Set 1 = {keyword 1, keyword 2, keyword 3}
Control instruction 2 | Set 2 = {keyword 4, keyword 5, keyword 6}
Control instruction 3 | Set 3 = {keyword 7, keyword 8}
…… | ……
For example, the keyword set for the control instruction for creating a leave application may include keywords such as "ask for leave", "marriage leave", "ill", and "paternity leave".
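A minimal sketch of the keyword-selection step, assuming the discrimination degree is the largest per-corpus frequency divided by the mean of the remaining frequencies (a reconstruction consistent with the "illness" numbers in the text; the patent gives the formula only as an image):

```python
def class_discrimination(freq_seq):
    """Discrimination degree of a word across the corpus sub-libraries:
    assumed to be the largest frequency divided by the mean of the
    remaining frequencies."""
    peak = max(freq_seq)
    rest = list(freq_seq)
    rest.remove(peak)                      # drop one instance of the maximum
    mean_rest = sum(rest) / len(rest) if rest else 0.0
    return peak / mean_rest if mean_rest > 0 else float("inf")

def assign_keyword(freq_seq, threshold):
    """Return the index of the control instruction a keyword belongs to
    (the argmax of its frequency sequence), or None if the word is not
    discriminative enough to be a keyword."""
    if class_discrimination(freq_seq) <= threshold:
        return None
    return max(range(len(freq_seq)), key=lambda c: freq_seq[c])

# "illness": 1000 occurrences in the leave-application sub-library,
# 20 in reimbursement, 1 in business trip.
deg = class_discrimination([1000, 20, 1])
target = assign_keyword([1000, 20, 1], threshold=10)   # index 0: leave
```

A word that appears with similar frequency everywhere, e.g. `[5, 4, 6]`, falls below the threshold and is discarded.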
Step S1032, counting the occurrence frequency of each keyword in the text information.
Step S1033, calculating the matching degree between the text information and each control instruction.
Preferably, the matching degree between the text information and each control instruction may be calculated according to the following formula:
MatchDeg_c = Σ_{kn=1}^{KwNum_c} MsgKwNum_{c,kn} × ClassDeg_{c,kn}
wherein c is the sequence number of a control instruction, 1 ≤ c ≤ ClassNum, ClassNum is the total number of control instructions, kn is the sequence number of a keyword, 1 ≤ kn ≤ KwNum_c, KwNum_c is the total number of keywords in the keyword set corresponding to the c-th control instruction, MsgKwNum_{c,kn} is the number of occurrences in the text information of the kn-th keyword of the keyword set corresponding to the c-th control instruction, ClassDeg_{c,kn} is the classification discrimination degree of the kn-th keyword of that keyword set, and MatchDeg_c is the matching degree between the text information and the c-th control instruction.
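This discrimination-weighted count can be sketched directly; the tokens and discrimination-degree values below are illustrative:

```python
def matching_degree(text_tokens, keyword_set, class_deg):
    """MatchDeg_c: sum over the keywords of instruction c of
    (occurrences of the keyword in the text) * (its discrimination degree).

    text_tokens: list of words from the recognized text
    keyword_set: list of keywords for one control instruction
    class_deg:   dict mapping keyword -> discrimination degree
    """
    return sum(text_tokens.count(kw) * class_deg[kw] for kw in keyword_set)

tokens = ["i", "ill", "want", "leave"]
deg = {"ill": 95.0, "leave": 40.0, "travel": 80.0}
m_leave = matching_degree(tokens, ["ill", "leave"], deg)   # 95 + 40
m_trip = matching_degree(tokens, ["travel"], deg)          # no hit
```

Keywords with a high discrimination degree dominate the score, so a single strongly class-specific word like "ill" is enough to pull the text toward the leave-application instruction.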
And step S104, selecting a control instruction with highest matching degree with the text information from the control instruction set as a target control instruction for the application program.
The target control instruction for the application program may be determined according to the following formula:
TargetCmd = argmax(MatchDegSeq)
= argmax(MatchDeg_1, MatchDeg_2, ..., MatchDeg_c, ..., MatchDeg_ClassNum)
wherein MatchDegSeq = (MatchDeg_1, MatchDeg_2, ..., MatchDeg_c, ..., MatchDeg_ClassNum) is the sequence of matching degrees between the text information and the control instructions, and TargetCmd is the sequence number of the finally determined target control instruction for the application program.
Step S105, controlling the application program to execute an operation corresponding to the target control instruction.
Once the operation the user wants to perform has been determined, the corresponding steps can be executed automatically. For example, if the user's voice input is "I am ill", speech recognition and matching calculation determine that the user wants to create a leave application; the application is then controlled to open the corresponding interface automatically and create a new leave application for the user.
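Steps S104 and S105 together amount to an argmax over the matching-degree sequence followed by a dispatch. The handlers below are hypothetical stand-ins for opening the real interfaces:

```python
def select_target(match_degs):
    """TargetCmd = argmax over the matching-degree sequence."""
    return max(range(len(match_degs)), key=lambda c: match_degs[c])

# Hypothetical handlers; a real application would open the corresponding
# operation interface instead of returning a string.
handlers = [
    lambda: "open leave-application form",
    lambda: "open business-trip form",
    lambda: "open reimbursement form",
]

match_deg_seq = [135.0, 0.0, 12.5]          # illustrative MatchDeg values
action = handlers[select_target(match_deg_seq)]()
```

With the sample scores, instruction 0 (the leave application) has the highest matching degree, so its handler is invoked.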
Preferably, for security, after the voice information is collected and before speech recognition is performed on it, identity authentication may be performed on the voice information to prevent other users from impersonating the current user.
First, a voiceprint feature vector of the speech information is calculated.
As shown in fig. 3, the calculation process of the voiceprint feature vector of the voice information may include:
Step S301, dividing the voice information into M voice subsections.
Wherein M is an integer greater than 1, and its specific value may be set according to practical situations, for example, it may be set to 3, 5, 10 or other values, etc.
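Step S301 can be sketched as an even split of the sample sequence. How the patent balances leftover samples is not specified, so letting the first segments absorb the extras is an assumption:

```python
def split_segments(samples, m):
    """Divide a speech sample sequence into m nearly equal sub-segments;
    the first (len(samples) % m) segments receive one extra sample."""
    base, extra = divmod(len(samples), m)
    segs, start = [], 0
    for i in range(m):
        size = base + (1 if i < extra else 0)
        segs.append(samples[start:start + size])
        start += size
    return segs

segs = split_segments(list(range(10)), 3)   # segment lengths 4, 3, 3
```

The concatenation of the sub-segments reproduces the original sequence, so no samples are dropped or duplicated.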
Step S302, respectively calculating the Mel-frequency cepstral coefficient (MFCC) vector of each speech sub-segment.
Preferably, the MFCC vector of each speech sub-segment may be calculated according to the following formula:
MelVec_m = MFCCFuc(SubVoice_m)
wherein m is the sequence number of a speech sub-segment, 1 ≤ m ≤ M, SubVoice_m is the m-th speech sub-segment, MFCCFuc is a preset MFCC calculation function, MelVec_m is the MFCC vector of the m-th speech sub-segment, MelVec_m = (MelCoe_{m,1}, MelCoe_{m,2}, ..., MelCoe_{m,n}, ..., MelCoe_{m,N}), and MelCoe_{m,n} is the n-th MFCC of the m-th speech sub-segment.
Step S303, respectively calculating the weight coefficient of each voice subsection.
Preferably, a weight coefficient is calculated for each speech sub-segment, wherein Weight_m is the weight coefficient of the m-th speech sub-segment.
And S304, constructing the voiceprint feature vector of the voice information.
Preferably, the voiceprint feature vector of the voice information may be constructed as
VoPrintVec = (VpElem_1, VpElem_2, ..., VpElem_n, ..., VpElem_N)
wherein VoPrintVec is the voiceprint feature vector of the voice information and VpElem_n is its n-th element.
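One plausible way to combine the per-segment MFCC vectors and segment weights into a single voiceprint vector is an element-wise weighted sum. Both this combination rule and the uniform weights below are assumptions, since the patent's formulas for VpElem_n and Weight_m appear only as images:

```python
def voiceprint_vector(mel_vecs, weights):
    """Combine M per-segment MFCC vectors (each of length N) into one
    voiceprint vector: element n is the weighted sum over segments of
    the n-th coefficient. Weighted-sum form is an assumption."""
    n_coeffs = len(mel_vecs[0])
    return [sum(w * vec[n] for w, vec in zip(weights, mel_vecs))
            for n in range(n_coeffs)]

mel_vecs = [[1.0, 2.0], [3.0, 4.0]]   # M = 2 segments, N = 2 coefficients
weights = [0.5, 0.5]                  # uniform weights as a placeholder
vp = voiceprint_vector(mel_vecs, weights)
```

With uniform weights this reduces to averaging the per-segment MFCC vectors coefficient by coefficient.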
And then, inquiring the reference feature vector corresponding to the user in a preset database.
The reference feature vector is a voiceprint feature vector extracted in advance from the voice of the user corresponding to the currently logged-in account; its calculation is similar to the process described above and is not repeated here.
Then, the similarity between the voiceprint feature vector of the voice information and the reference feature vector is calculated, for example as the cosine similarity:
SimDeg = Σ_{n=1}^{N} (VpElem_n × StVpElem_n) / ( sqrt(Σ_{n=1}^{N} VpElem_n²) × sqrt(Σ_{n=1}^{N} StVpElem_n²) )
wherein n is the element index of the voiceprint feature vector, 1 ≤ n ≤ N, N is the total number of elements of the voiceprint feature vector, VpElem_n is the n-th element of the voiceprint feature vector of the voice information, StVpElem_n is the n-th element of the reference feature vector, and SimDeg is the similarity between the voiceprint feature vector of the voice information and the reference feature vector.
If the similarity between the voiceprint feature vector of the voice information and the reference feature vector is greater than a preset similarity threshold, the speaker is the user corresponding to the currently logged-in account, and the step of performing speech recognition on the collected voice information and the subsequent steps are executed. If the similarity is less than or equal to the similarity threshold, the speaker is not the user corresponding to the currently logged-in account; in that case the voice information is ignored, and speech recognition and the subsequent steps are not performed.
The similarity threshold may be set according to practical situations, for example, it may be set to 70%, 80%, 90%, or other values, and so on.
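Assuming the similarity measure is cosine similarity (the patent's formula is not reproduced in the text), the authentication gate can be sketched as:

```python
import math

def cosine_similarity(a, b):
    """SimDeg between the voiceprint vector and the reference vector,
    assumed here to be cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def authenticate(vp, reference, threshold=0.8):
    """Proceed to speech recognition only when the speaker's voiceprint
    is close enough to the enrolled reference."""
    return cosine_similarity(vp, reference) > threshold

ok = authenticate([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])   # identical voiceprints
bad = authenticate([1.0, 0.0], [0.0, 1.0])            # orthogonal voiceprints
```

The 0.8 default mirrors the 70 percent to 90 percent range suggested in the text; it is a tunable parameter, not a prescribed value.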
Preferably, in this embodiment, different operation permissions may be set for different users, and each user may only control the application program to execute the operations within his or her permission set, as shown in the following table:
User | Operation permissions
User 1 | Permission set 1 = {operation 1, operation 2, operation 3, operation 4}
User 2 | Permission set 2 = {operation 1, operation 3, operation 4}
User 3 | Permission set 3 = {operation 1, operation 2}
…… | ……
After the operation corresponding to the target control instruction is determined, the system checks whether the user has the corresponding operation permission. If not, the operation is not executed and the user is prompted accordingly; if so, the step of controlling the application program to execute the operation corresponding to the target control instruction is performed.
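The permission check amounts to a set lookup before dispatch; the user names and operation identifiers below are hypothetical:

```python
# Hypothetical permission table mirroring the one in the text.
PERMISSIONS = {
    "user1": {"op1", "op2", "op3", "op4"},
    "user2": {"op1", "op3", "op4"},
    "user3": {"op1", "op2"},
}

def execute_if_permitted(user, operation):
    """Run the operation only when the user's permission set contains it;
    otherwise return a prompt instead of executing anything."""
    if operation in PERMISSIONS.get(user, set()):
        return f"executed {operation}"
    return f"permission denied for {operation}"

r1 = execute_if_permitted("user3", "op2")   # within user3's permission set
r2 = execute_if_permitted("user3", "op4")   # outside user3's permission set
```

Unknown users fall through to an empty permission set, so they are denied by default.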
In summary, after receiving a voice acquisition instruction, the embodiments of the present invention collect voice information input by a user, perform speech recognition on the collected voice information to obtain corresponding text information, determine a target control instruction for an application program through matching-degree calculation, and control the application program to execute the operation corresponding to the target control instruction. With the embodiments of the invention, a user can issue application control instructions by voice, and the application automatically executes the corresponding operation; the interaction is simple, efficiency is greatly improved, and the user experience is better.
It should be understood that the sequence numbers of the steps in the foregoing embodiments do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present invention.
Corresponding to an application control method described in the above embodiments, fig. 4 shows a block diagram of an embodiment of an application control device according to an embodiment of the present invention.
In this embodiment, an application control device may include:
the voice information acquisition module 401 is configured to acquire voice information input by a user after receiving a voice acquisition instruction, where the voice information includes a control instruction of an application program;
the voice recognition module 402 is configured to perform voice recognition on the collected voice information to obtain text information corresponding to the voice information;
A matching degree calculating module 403, configured to calculate matching degrees between the text information and each control instruction in a preset control instruction set respectively;
A target control instruction selecting module 404, configured to select, from the control instruction set, a control instruction with the highest matching degree with the text information as a target control instruction for the application program;
And an operation execution module 405, configured to control the application program to execute an operation corresponding to the target control instruction.
Further, the matching degree calculating module may include:
The keyword set determining unit is used for determining the keyword set corresponding to each control instruction and calculating the classification discrimination degree of each keyword in every keyword set;
The frequency statistics unit is used for respectively counting the frequency of each keyword in the text information;
the matching degree calculating unit is configured to calculate the matching degree between the text information and each control instruction according to the following formula:

MatchDeg_c = Σ_{kn=1}^{KwNum_c} MsgKWNum_{c,kn} × ClassDeg_{c,kn}

where c is the sequence number of the control instruction, 1 ≤ c ≤ ClassNum, ClassNum is the total number of control instructions, kn is the sequence number of the keyword, 1 ≤ kn ≤ KwNum_c, KwNum_c is the total number of keywords in the keyword set corresponding to the c-th control instruction, MsgKWNum_{c,kn} is the number of occurrences in the text information of the kn-th keyword in the keyword set corresponding to the c-th control instruction, ClassDeg_{c,kn} is the classification discrimination degree of the kn-th keyword in the keyword set corresponding to the c-th control instruction, and MatchDeg_c is the matching degree between the text information and the c-th control instruction.
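As a concrete illustration of this matching degree calculation, a minimal sketch follows; the keyword sets, discrimination degrees, and input text are hypothetical values, not taken from the patent.

```python
# Hypothetical keyword sets: control instruction -> {keyword: ClassDeg}
keyword_sets = {
    1: {"open": 2.5, "launch": 1.8},
    2: {"close": 3.0, "exit": 2.2},
}

def match_degree(text_words, keywords):
    """MatchDeg_c: sum over keywords of (occurrences in text) x ClassDeg."""
    return sum(text_words.count(kw) * deg for kw, deg in keywords.items())

words = "please open the app open now".split()
scores = {c: match_degree(words, kws) for c, kws in keyword_sets.items()}
target = max(scores, key=scores.get)
# "open" occurs twice, so instruction 1 scores 2 x 2.5 = 5.0; instruction 2 scores 0
```

The instruction with the highest score is then taken as the target control instruction, as described in the method embodiments.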
Further, the keyword set determining unit may include:
the word segmentation processing subunit is configured to perform word segmentation on each corpus in a preset corpus database to obtain individual words, where the corpus database includes corpus sub-databases respectively corresponding to the control instructions;
the frequency statistics subunit is configured to count the frequency of occurrence of each word in each corpus sub-database;
the classification discrimination degree calculating subunit is configured to calculate the classification discrimination degree of each word according to the following formula:

ClassDeg_w = MAX(FreqSeq_w) / MAX(FreqSeq'_w)

where w is the sequence number of the word, 1 ≤ w ≤ WordNum, WordNum is the total number of words, FreqSeq_w is the frequency sequence of the w-th word across the corpus sub-databases, FreqSeq_w = [Freq_{w,1}, Freq_{w,2}, ..., Freq_{w,c}, ..., Freq_{w,ClassNum}], Freq_{w,c} is the frequency of occurrence of the w-th word in the corpus sub-database corresponding to the c-th control instruction, FreqSeq'_w is the sequence remaining after the maximum value is removed from FreqSeq_w, namely FreqSeq'_w = FreqSeq_w − MAX(FreqSeq_w), MAX is the maximum function, and ClassDeg_w is the classification discrimination degree of the w-th word;
the keyword determining subunit is configured to select, as keywords, the words whose classification discrimination degree is greater than a preset threshold, and to determine the control instruction corresponding to each keyword according to the following formula:

TgtKwSet_w = argmax(FreqSeq_w) = argmax(Freq_{w,1}, Freq_{w,2}, ..., Freq_{w,c}, ..., Freq_{w,ClassNum})

where TgtKwSet_w is the sequence number of the control instruction corresponding to the w-th keyword;
and the keyword set construction subunit is used for constructing each keyword corresponding to the c-th control instruction into a keyword set corresponding to the c-th control instruction.
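The word-segmentation, frequency-counting, discrimination-degree, and keyword-selection steps performed by these subunits can be sketched as follows. The toy corpus, the threshold value, and the choice to clamp a zero second-largest frequency to 1 are all assumptions made for illustration.

```python
from collections import Counter

# Toy corpus sub-databases, one per control instruction (hypothetical)
corpus = {
    1: ["open the app", "open music", "launch the app"],
    2: ["close the app", "exit the music", "close the window"],
}

# Frequency of each word in each corpus sub-database
freq = {c: Counter(w for doc in docs for w in doc.split())
        for c, docs in corpus.items()}
vocab = {w for f in freq.values() for w in f}

def class_deg(word):
    """ClassDeg_w = max frequency / second-largest frequency across the
    sub-databases (a zero second-largest is clamped to 1 here, an assumption,
    to avoid division by zero)."""
    seq = sorted((freq[c][word] for c in corpus), reverse=True)
    return seq[0] / max(seq[1], 1)

threshold = 1.5  # hypothetical discrimination threshold
keywords = {w: max(corpus, key=lambda c: freq[c][w])
            for w in vocab if class_deg(w) > threshold}
# "open" occurs 2x under instruction 1 and 0x under instruction 2, giving
# ClassDeg = 2 > 1.5, so it becomes a keyword of instruction 1; a common
# word like "the" (frequencies 2 and 3) is rejected
```

Grouping the selected keywords by their assigned instruction number then yields the keyword set for each control instruction.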
Further, the application control device may further include:
The voiceprint feature vector calculation module is used for calculating voiceprint feature vectors of the voice information;
The reference feature vector query module is used for querying a reference feature vector corresponding to the user in a preset database;
the similarity calculation module is configured to calculate the similarity between the voiceprint feature vector of the voice information and the reference feature vector according to the following formula:

where n is the element sequence number of the voiceprint feature vector of the voice information, 1 ≤ n ≤ N, N is the total number of elements of the voiceprint feature vector, VpElem_n is the n-th element of the voiceprint feature vector of the voice information, StVpElem_n is the n-th element of the reference feature vector, and SimDeg is the similarity between the voiceprint feature vector of the voice information and the reference feature vector.
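The similarity formula itself appears as an image in the original publication and is not reproduced in this text. Purely as a hedged illustration, the sketch below assumes a cosine similarity over the N vector elements; this is consistent with the element-wise definitions above but is not confirmed by this excerpt.

```python
import math

def sim_deg(vp, ref):
    """Assumed cosine similarity between the voiceprint feature vector
    (VpElem_1..VpElem_N) and the reference vector (StVpElem_1..StVpElem_N).
    The patent's actual formula is not shown in this excerpt."""
    dot = sum(a * b for a, b in zip(vp, ref))
    norm = math.sqrt(sum(a * a for a in vp)) * math.sqrt(sum(b * b for b in ref))
    return dot / norm if norm else 0.0

same = sim_deg([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])   # identical vectors
orthogonal = sim_deg([1.0, 0.0], [0.0, 1.0])       # unrelated vectors
```

Whatever the exact formula, the method embodiments proceed with speech recognition only when this similarity exceeds a preset threshold.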
Further, the voiceprint feature vector calculation module may include:
A voice sub-segment dividing unit, configured to divide the voice information into M voice sub-segments, where M is an integer greater than 1;
the MFCC vector calculating unit is configured to calculate the Mel-frequency cepstral coefficient (MFCC) vector of each voice sub-segment according to the following formula:

MelVec_m = MFCCFuc(SubVoice_m)

where m is the sequence number of the voice sub-segment, 1 ≤ m ≤ M, SubVoice_m is the m-th voice sub-segment, MFCCFuc is a preset MFCC calculation function, MelVec_m is the MFCC vector of the m-th voice sub-segment, MelVec_m = (MelCoe_{m,1}, MelCoe_{m,2}, ..., MelCoe_{m,n}, ..., MelCoe_{m,N}), and MelCoe_{m,n} is the n-th MFCC of the m-th voice sub-segment;
the weight coefficient calculating unit is configured to calculate the weight coefficient of each voice sub-segment according to the following formula:

where Weight_m is the weight coefficient of the m-th voice sub-segment;
and the voiceprint feature vector construction unit is configured to construct the voiceprint feature vector of the voice information according to the following formula:

VoPrintVec = (VpElem_1, VpElem_2, ..., VpElem_n, ..., VpElem_N)

where VoPrintVec is the voiceprint feature vector of the voice information.
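Putting the sub-segment division, per-segment MFCC extraction, weighting, and vector construction together, the following is a sketch under stated assumptions: a stand-in feature function takes the place of MFCCFuc, and uniform weights are used, since the patent's weight formula and MFCC implementation are not shown in this excerpt.

```python
def split_segments(samples, m):
    """Divide the signal into M roughly equal voice sub-segments."""
    step = len(samples) // m
    return [samples[i * step:(i + 1) * step] for i in range(m)]

def stand_in_mfcc(segment, n=4):
    """Placeholder for MFCCFuc: returns an N-element coefficient vector.
    A real implementation would compute Mel-frequency cepstral coefficients."""
    mean = sum(segment) / len(segment)
    return [mean * (k + 1) for k in range(n)]

def voiceprint(samples, m=3, n=4):
    segments = split_segments(samples, m)
    weights = [1.0 / m] * m  # assumed uniform Weight_m; the formula is not shown
    mel = [stand_in_mfcc(seg, n) for seg in segments]
    # VpElem_n taken here as the weighted sum of the n-th coefficient over segments
    return [sum(w * vec[k] for w, vec in zip(weights, mel)) for k in range(n)]

vec = voiceprint([0.1] * 30)  # 30 samples of a constant toy "signal"
```

The resulting N-element vector plays the role of VoPrintVec and would be compared against the stored reference vector by the similarity calculation module.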
It will be clearly understood by those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described apparatus, modules and units may refer to corresponding procedures in the foregoing method embodiments, and are not repeated herein.
In the foregoing embodiments, each embodiment is described with its own emphasis; for parts not described or detailed in one embodiment, reference may be made to the related descriptions of the other embodiments.
Fig. 5 shows a schematic block diagram of a terminal device according to an embodiment of the present invention, and for convenience of explanation, only a portion related to the embodiment of the present invention is shown.
In this embodiment, the terminal device 5 may be a computing device such as a desktop computer, a notebook computer, a palmtop computer, or a cloud server. The terminal device 5 may include a processor 50, a memory 51, and computer-readable instructions 52 stored in the memory 51 and executable on the processor 50, such as computer-readable instructions for performing the application control method described above. When executing the computer-readable instructions 52, the processor 50 implements the steps of the application control method embodiments described above, such as steps S101 to S105 shown in fig. 1. Alternatively, when executing the computer-readable instructions 52, the processor 50 implements the functions of the modules/units in the foregoing apparatus embodiments, such as the functions of modules 401 to 405 shown in fig. 4.
Illustratively, the computer-readable instructions 52 may be partitioned into one or more modules/units that are stored in the memory 51 and executed by the processor 50 to implement the present invention. The one or more modules/units may be a series of computer-readable instruction segments capable of performing specific functions, and the instruction segments are used to describe the execution of the computer-readable instructions 52 in the terminal device 5.
The processor 50 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory 51 may be an internal storage unit of the terminal device 5, such as a hard disk or memory of the terminal device 5. The memory 51 may also be an external storage device of the terminal device 5, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card provided on the terminal device 5. Further, the memory 51 may include both an internal storage unit and an external storage device of the terminal device 5. The memory 51 is used to store the computer-readable instructions and other instructions and data required by the terminal device 5, and may also be used to temporarily store data that has been output or is to be output.
The functional units in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
If the integrated units are implemented in the form of software functional units and sold or used as stand-alone products, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes a number of computer-readable instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes media capable of storing computer-readable instructions, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. An application control method, comprising:
after receiving a voice acquisition instruction, acquiring voice information input by a user, wherein the voice information comprises a control instruction of an application program;
performing voice recognition on the collected voice information to obtain text information corresponding to the voice information;
performing word segmentation on each corpus in a preset corpus database to obtain individual words, where the corpus database includes corpus sub-databases respectively corresponding to the control instructions in a preset control instruction set; counting the frequency of occurrence of each word in each corpus sub-database; taking the ratio of the maximum value to the second-largest value of the frequencies of occurrence of the w-th word across the corpus sub-databases as the classification discrimination degree of the w-th word, where w is the sequence number of the word, 1 ≤ w ≤ WordNum, and WordNum is the total number of words; selecting, as keywords, the words whose classification discrimination degree is greater than a preset threshold, and taking the control instruction corresponding to the corpus sub-database in which the w-th keyword occurs with the maximum frequency as the control instruction corresponding to the w-th keyword; and constructing the keywords corresponding to the c-th control instruction into the keyword set corresponding to the c-th control instruction, where c is the sequence number of the control instruction, 1 ≤ c ≤ ClassNum, and ClassNum is the total number of control instructions;
respectively counting the occurrence frequency of each keyword in the text information;
taking the sum of the products of the frequency of occurrence in the text information and the classification discrimination degree of each keyword in the keyword set corresponding to the c-th control instruction as the matching degree between the text information and the c-th control instruction;
and selecting a control instruction with highest matching degree with the text information from the control instruction set as a target control instruction for the application program, and controlling the application program to execute an operation corresponding to the target control instruction.
2. The application control method according to claim 1, wherein the step of taking, as the matching degree between the text information and the c-th control instruction, the sum of the products of the frequency of occurrence in the text information and the classification discrimination degree of each keyword in the keyword set corresponding to the c-th control instruction comprises:
calculating the matching degree between the text information and the c-th control instruction according to the following formula:

MatchDeg_c = Σ_{kn=1}^{KwNum_c} MsgKWNum_{c,kn} × ClassDeg_{c,kn}

where kn is the sequence number of the keyword, 1 ≤ kn ≤ KwNum_c, KwNum_c is the total number of keywords in the keyword set corresponding to the c-th control instruction, MsgKWNum_{c,kn} is the number of occurrences in the text information of the kn-th keyword in the keyword set corresponding to the c-th control instruction, ClassDeg_{c,kn} is the classification discrimination degree of the kn-th keyword in the keyword set corresponding to the c-th control instruction, and MatchDeg_c is the matching degree between the text information and the c-th control instruction.
3. The application control method according to claim 1, wherein the step of taking, as the classification discrimination degree of the w-th word, the ratio of the maximum value to the second-largest value of the frequencies of occurrence of the w-th word across the corpus sub-databases comprises:
calculating the classification discrimination degree of the w-th word according to the following formula:

ClassDeg_w = MAX(FreqSeq_w) / MAX(FreqSeq'_w)

where FreqSeq_w is the frequency sequence of the w-th word across the corpus sub-databases, FreqSeq_w = [Freq_{w,1}, Freq_{w,2}, ..., Freq_{w,c}, ..., Freq_{w,ClassNum}], Freq_{w,c} is the frequency of occurrence of the w-th word in the corpus sub-database corresponding to the c-th control instruction, FreqSeq'_w is the sequence remaining after the maximum value is removed from FreqSeq_w, namely FreqSeq'_w = FreqSeq_w − MAX(FreqSeq_w), MAX is the maximum function, and ClassDeg_w is the classification discrimination degree of the w-th word.
4. The application control method according to any one of claims 1 to 3, characterized by further comprising, before performing speech recognition on the collected speech information:
Calculating the voiceprint feature vector of the voice information, and inquiring the reference feature vector corresponding to the user in a preset database;
Calculating the similarity between the voiceprint feature vector of the voice information and the reference feature vector according to the following formula:
where n is the element sequence number of the voiceprint feature vector of the voice information, 1 ≤ n ≤ N, N is the total number of elements of the voiceprint feature vector, VpElem_n is the n-th element of the voiceprint feature vector of the voice information, StVpElem_n is the n-th element of the reference feature vector, and SimDeg is the similarity between the voiceprint feature vector of the voice information and the reference feature vector;
And if the similarity between the voiceprint feature vector of the voice information and the reference feature vector is greater than a preset similarity threshold, executing the step of carrying out voice recognition on the acquired voice information and the subsequent steps.
5. The application control method according to claim 4, wherein the calculating the voiceprint feature vector of the voice information includes:
dividing the voice information into M voice sub-segments, where M is an integer greater than 1;
calculating the Mel-frequency cepstral coefficient (MFCC) vector of each voice sub-segment according to the following formula:

MelVec_m = MFCCFuc(SubVoice_m)

where m is the sequence number of the voice sub-segment, 1 ≤ m ≤ M, SubVoice_m is the m-th voice sub-segment, MFCCFuc is a preset MFCC calculation function, MelVec_m is the MFCC vector of the m-th voice sub-segment, MelVec_m = (MelCoe_{m,1}, MelCoe_{m,2}, ..., MelCoe_{m,n}, ..., MelCoe_{m,N}), and MelCoe_{m,n} is the n-th MFCC of the m-th voice sub-segment;
calculating the weight coefficient of each voice sub-segment according to the following formula:

where Weight_m is the weight coefficient of the m-th voice sub-segment;
and constructing the voiceprint feature vector of the voice information according to the following formula:

VoPrintVec = (VpElem_1, VpElem_2, ..., VpElem_n, ..., VpElem_N)

where VoPrintVec is the voiceprint feature vector of the voice information.
6. An application control apparatus, comprising:
the voice information acquisition module is used for acquiring voice information input by a user after receiving a voice acquisition instruction, wherein the voice information comprises a control instruction of an application program;
The voice recognition module is used for carrying out voice recognition on the collected voice information to obtain text information corresponding to the voice information;
the matching degree calculation module is configured to: perform word segmentation on each corpus in a preset corpus database to obtain individual words, where the corpus database includes corpus sub-databases respectively corresponding to the control instructions in a preset control instruction set; count the frequency of occurrence of each word in each corpus sub-database; take the ratio of the maximum value to the second-largest value of the frequencies of occurrence of the w-th word across the corpus sub-databases as the classification discrimination degree of the w-th word, where w is the sequence number of the word, 1 ≤ w ≤ WordNum, and WordNum is the total number of words; select, as keywords, the words whose classification discrimination degree is greater than a preset threshold, and take the control instruction corresponding to the corpus sub-database in which the w-th keyword occurs with the maximum frequency as the control instruction corresponding to the w-th keyword; construct the keywords corresponding to the c-th control instruction into the keyword set corresponding to the c-th control instruction, where c is the sequence number of the control instruction, 1 ≤ c ≤ ClassNum, and ClassNum is the total number of control instructions; count the frequency of occurrence of each keyword in the text information; and take the sum of the products of the frequency of occurrence in the text information and the classification discrimination degree of each keyword in the keyword set corresponding to the c-th control instruction as the matching degree between the text information and the c-th control instruction;
The target control instruction selecting module is used for selecting a control instruction with highest matching degree with the text information from the control instruction set as a target control instruction for the application program;
and the operation execution module is used for controlling the application program to execute the operation corresponding to the target control instruction.
7. The application control device according to claim 6, wherein the matching degree calculation module includes:
the keyword set determining unit is configured to determine the keyword set corresponding to each control instruction, and to calculate the classification discrimination degree of each keyword in each keyword set;
The frequency statistics unit is used for respectively counting the frequency of each keyword in the text information;
the matching degree calculating unit is configured to calculate the matching degree between the text information and each control instruction according to the following formula:

MatchDeg_c = Σ_{kn=1}^{KwNum_c} MsgKWNum_{c,kn} × ClassDeg_{c,kn}

where kn is the sequence number of the keyword, 1 ≤ kn ≤ KwNum_c, KwNum_c is the total number of keywords in the keyword set corresponding to the c-th control instruction, MsgKWNum_{c,kn} is the number of occurrences in the text information of the kn-th keyword in the keyword set corresponding to the c-th control instruction, ClassDeg_{c,kn} is the classification discrimination degree of the kn-th keyword in the keyword set corresponding to the c-th control instruction, and MatchDeg_c is the matching degree between the text information and the c-th control instruction.
8. The application control device according to claim 7, wherein the keyword set determination unit includes:
the word segmentation processing subunit is configured to perform word segmentation on each corpus in the corpus database to obtain individual words;
the frequency statistics subunit is configured to count the frequency of occurrence of each word in each corpus sub-database;
the classification discrimination degree calculating subunit is configured to calculate the classification discrimination degree of each word according to the following formula:

ClassDeg_w = MAX(FreqSeq_w) / MAX(FreqSeq'_w)

where FreqSeq_w is the frequency sequence of the w-th word across the corpus sub-databases, FreqSeq_w = [Freq_{w,1}, Freq_{w,2}, ..., Freq_{w,c}, ..., Freq_{w,ClassNum}], Freq_{w,c} is the frequency of occurrence of the w-th word in the corpus sub-database corresponding to the c-th control instruction, FreqSeq'_w is the sequence remaining after the maximum value is removed from FreqSeq_w, namely FreqSeq'_w = FreqSeq_w − MAX(FreqSeq_w), MAX is the maximum function, and ClassDeg_w is the classification discrimination degree of the w-th word;
the keyword determining subunit is configured to select, as keywords, the words whose classification discrimination degree is greater than a preset threshold, and to determine the control instruction corresponding to each keyword according to the following formula:
TgtKwSet_w = argmax(FreqSeq_w) = argmax(Freq_{w,1}, Freq_{w,2}, ..., Freq_{w,c}, ..., Freq_{w,ClassNum})
Wherein TgtKwSet w is the sequence number of the control instruction corresponding to the w-th keyword;
and the keyword set construction subunit is used for constructing each keyword corresponding to the c-th control instruction into a keyword set corresponding to the c-th control instruction.
9. A computer readable storage medium storing computer readable instructions which, when executed by a processor, implement the steps of the application control method of any one of claims 1 to 5.
10. A terminal device comprising a memory, a processor and computer readable instructions stored in the memory and executable on the processor, wherein the processor, when executing the computer readable instructions, implements the steps of the application control method of any one of claims 1 to 5.
CN201811210044.1A 2018-10-17 2018-10-17 Application program control method and device, readable storage medium and terminal equipment Active CN109584865B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811210044.1A CN109584865B (en) 2018-10-17 2018-10-17 Application program control method and device, readable storage medium and terminal equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811210044.1A CN109584865B (en) 2018-10-17 2018-10-17 Application program control method and device, readable storage medium and terminal equipment

Publications (2)

Publication Number Publication Date
CN109584865A CN109584865A (en) 2019-04-05
CN109584865B true CN109584865B (en) 2024-05-31

Family

ID=65920096

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811210044.1A Active CN109584865B (en) 2018-10-17 2018-10-17 Application program control method and device, readable storage medium and terminal equipment

Country Status (1)

Country Link
CN (1) CN109584865B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110147216A (en) * 2019-04-16 2019-08-20 深圳壹账通智能科技有限公司 Page switching method, device, computer equipment and the storage medium of application program
CN110109365A (en) * 2019-04-24 2019-08-09 平安科技(深圳)有限公司 Speaker control method, device and computer readable storage medium
CN110171005A (en) * 2019-06-10 2019-08-27 杭州任你说智能科技有限公司 A kind of tourism robot system based on intelligent sound box
CN111292742A (en) * 2020-01-14 2020-06-16 京东数字科技控股有限公司 Data processing method and device, electronic equipment and computer storage medium
CN112825030B (en) * 2020-02-28 2023-09-19 腾讯科技(深圳)有限公司 Application program control method, device, equipment and storage medium
CN112599125A (en) * 2020-12-02 2021-04-02 一汽资本控股有限公司 Voice office processing method and device, terminal and storage medium
CN112581957B (en) * 2020-12-04 2023-04-11 浪潮电子信息产业股份有限公司 Computer voice control method, system and related device
CN116189673A (en) * 2021-11-29 2023-05-30 中兴通讯股份有限公司 Voice control method, terminal equipment, server and storage medium
CN114298026A (en) * 2021-12-03 2022-04-08 阿里健康科技(杭州)有限公司 Semantic analysis method, referral processing method and device

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH06161488A (en) * 1992-11-17 1994-06-07 Ricoh Co Ltd Speech recognizing device
CN101123428A (en) * 2006-08-09 2008-02-13 马昊 Intelligent electronic remote control switch for voice recognition capable of dynamic setting
CN101447185A (en) * 2008-12-08 2009-06-03 深圳市北科瑞声科技有限公司 Audio frequency rapid classification method based on content
CN201514761U (en) * 2009-09-23 2010-06-23 上海大屯能源股份有限公司 Household voice controller
CN107329843A (en) * 2017-06-30 2017-11-07 百度在线网络技术(北京)有限公司 Application program sound control method, device, equipment and storage medium
CN107492374A (en) * 2017-10-11 2017-12-19 深圳市汉普电子技术开发有限公司 A kind of sound control method, smart machine and storage medium
CN108182937A (en) * 2018-01-17 2018-06-19 出门问问信息科技有限公司 Keyword recognition method, device, equipment and storage medium
CN108597512A (en) * 2018-04-27 2018-09-28 努比亚技术有限公司 Method for controlling mobile terminal, mobile terminal and computer readable storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2690174C (en) * 2009-01-13 2014-10-14 Crim (Centre De Recherche Informatique De Montreal) Identifying keyword occurrences in audio data


Also Published As

Publication number Publication date
CN109584865A (en) 2019-04-05

Similar Documents

Publication Publication Date Title
CN109584865B (en) Application program control method and device, readable storage medium and terminal equipment
CN109509470B (en) Voice interaction method and device, computer readable storage medium and terminal equipment
WO2020082560A1 (en) Method, apparatus and device for extracting text keyword, as well as computer readable storage medium
JP5853029B2 (en) Passphrase modeling device and method for speaker verification, and speaker verification system
Wang et al. Acoustic segment modeling with spectral clustering methods
US7475013B2 (en) Speaker recognition using local models
CN109360572B (en) Call separation method and device, computer equipment and storage medium
US20120323560A1 (en) Method for symbolic correction in human-machine interfaces
WO2017084334A1 (en) Language recognition method, apparatus and device and computer storage medium
WO2021114841A1 (en) User report generating method and terminal device
WO2009101837A1 (en) Mark insertion device and mark insertion method
CN108538286A (en) A kind of method and computer of speech recognition
WO2017198031A1 (en) Semantic parsing method and apparatus
CN106202065B (en) Across the language topic detecting method of one kind and system
CN113314119B (en) Voice recognition intelligent household control method and device
KR101677859B1 (en) Method for generating system response using knowledgy base and apparatus for performing the method
CN112151015A (en) Keyword detection method and device, electronic equipment and storage medium
Harwath et al. Zero resource spoken audio corpus analysis
CN112669842A (en) Man-machine conversation control method, device, computer equipment and storage medium
CN111400489B (en) Dialog text abstract generating method and device, electronic equipment and storage medium
Chien Association pattern language modeling
JP6910002B2 (en) Dialogue estimation method, dialogue activity estimation device and program
EP4024393A2 (en) Training a speech recognition model
CN108899016B (en) Voice text normalization method, device and equipment and readable storage medium
CN114023336A (en) Model training method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant