CN113223520A - Voice interaction method, system and platform for semantic understanding of software operation live-action - Google Patents

Voice interaction method, system and platform for semantic understanding of software operation live-action

Info

Publication number
CN113223520A
Authority
CN
China
Prior art keywords
information
button
help
software
live
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110450524.0A
Other languages
Chinese (zh)
Other versions
CN113223520B (en)
Inventor
赵克
仇彦男
周一帆
高健壮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University
Priority to CN202110450524.0A
Publication of CN113223520A
Application granted
Publication of CN113223520B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/237 Lexical tools
    • G06F 40/247 Thesauruses; Synonyms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/30 Semantic analysis
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L 15/063 Training

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The invention discloses a voice interaction method, system and platform for semantic understanding of the software operation live-action. Button information of the operation buttons of interactive operation software is extracted; usage questions and help information are generated from it, yielding training data for the interactive help system; and a deep-learning interactive help model based on semantic understanding of the software operation live-action is constructed in advance. When a user poses a usage question in natural language, live-action natural language question information is generated and input into the model, which outputs a ranking score for each semantic parsing result. One or more semantic parsing results are selected, according to the ranking scores, as the understanding of the user's question, and interactive operation help is then provided on the basis of the interactive help model. The invention improves the reliability and accuracy of semantic understanding of the software operation live-action, overcomes the unintuitiveness of general help documentation, and greatly improves the user's experience of the product.

Description

Voice interaction method, system and platform for semantic understanding of software operation live-action
Technical Field
The invention relates to a voice interaction method, system and platform based on semantic understanding, applicable to computers, mobile terminals and various electrical appliances.
Background
The usability of the software in today's electronic devices largely determines how convenient those devices are to use. Users face three main problems when using such software: first, they do not know whether a given piece of software offers a function they need, or whether such a function has been developed at all; second, they cannot find the required function button within the software; third, they know the software has the function and where it is, but do not know how to use it, or use it poorly. Although software ships with help documentation, reading a manual and searching for functions falls far short of being guided in person by an expert. These three problems greatly reduce the efficiency of software use and at the same time hinder its wide adoption.
Natural language is the most convenient and natural way for humans to express their thoughts. In recent years, with deep learning and the appearance of the BERT natural language processing model, natural language has gradually become the mainstream mode of human-computer interaction. Because natural language dialogue is diverse, complex, elliptical and closely tied to the conversation scene, accurately understanding dialogue content has always been a hotspot and a difficulty of artificial intelligence research. Moreover, in deep learning, computing power, algorithms and data are the keys to success, and effectively obtaining a large amount of training data is itself a difficulty.
In short, how to make an electronic device accurately understand a user's usage questions and provide the kind of experience an expert would give when guiding the use of the product at hand is an urgent problem in human-computer interaction.
Disclosure of Invention
In order to overcome the above defects of the prior art, the present invention aims to provide an interactive help method and system based on semantic understanding of the software operation live-action, together with a development platform for them, so as to improve the efficiency of software interactive help in electrical and electronic devices and enhance the user's experience of the product.
The invention is realized by the following technical scheme.
The technical solution of the invention is as follows: an interactive help method based on semantic understanding of the software operation live-action comprises the following steps:
Step 1, extracting button information of the operation buttons of interactive operation software, generating usage questions and help information, and generating training data for the interactive help system based on semantic understanding of the software operation live-action;
Step 2, constructing in advance a deep-learning interactive help model based on semantic understanding of the software operation live-action;
Step 3, generating live-action natural language question information;
Step 4, inputting the live-action natural language question information into the deep-learning interactive help model based on semantic understanding of the software operation live-action, and outputting the ranking score of each semantic parsing result;
Step 5, selecting one or more semantic parsing results as the understanding result of the user's question according to the ranking scores, and then providing interactive operation help for the user based on the interactive help model.
Wherein the button information includes: button names, button range information, relationship information between buttons, and button function information.
Step 1 specifically comprises the following steps:
Step 1-1, extracting the button information of the operation buttons of the interactive operation software with the development platform of the interactive help system based on semantic understanding of the software operation live-action;
Step 1-2, generating, from knowledge of the button usage questions and by a rule-based (if-then) method, training data that combines the live-action usage questions with the user intention and the help information, and generating the software button relation tree (a minimal sketch follows this list);
Step 1-3, further generating training data from a synonym lexicon together with the live-action usage questions, user intention and help information;
Step 1-4, collecting other users' usage questions and help information for the interactive operation software, labeling them, and using them as training data;
Step 1-5, taking the training data as training samples.
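As a concrete illustration of step 1-2, the following is a minimal Python sketch of if-then template generation of question-intention-help training triples. The template wording, intent labels and scene-prefix format are illustrative assumptions, not the patent's actual rules.

# Minimal sketch of rule-based (if-then) generation of question-intent-help
# training triples for one button. Template wording, intent labels and the
# scene-prefix format are illustrative assumptions.
QUESTION_TEMPLATES = [
    ("Where is {name}?", "find_position"),
    ("What does {name} do?", "query_function"),
    ("How do I use {name}?", "how_to_use"),
]

def generate_training_triples(button_name, scene, help_text):
    """Yield (scene-combined question, intent, help) triples."""
    for template, intent in QUESTION_TEMPLATES:
        question = template.format(name=button_name)
        # Omission recovery: prepend the live-action (current menu) context.
        yield (f"[{scene}] {question}", intent, help_text)

for triple in generate_training_triples(
        "Word Count", "Word/Review menu",
        "Select the text, then click Review > Word Count."):
    print(triple)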
The combination method combines the usage questions with the live-action information, the user intention and the help information: the live-action information is used as part of the question information either in the form of natural-language omission recovery or through an attention mechanism.
Step 2 specifically comprises:
training with the training samples and the help information to obtain the interactive help model based on semantic understanding of the software operation live-action; the model adopts a task-oriented dialogue mode, and the software live-action information captured at the time of the user's question is added to the model in a weighted manner during training.
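The following is a minimal sketch of one way the live-action information might be added "in a weighted manner" during training, assuming a PyTorch-style model in which the embeddings of scene tokens are scaled by a weight before pooling; the weight value, dimensions and pooling choice are all assumptions, not the patent's actual architecture.

import torch
import torch.nn as nn

# Sketch: weighting the live-action (scene) tokens relative to the question
# tokens before classification. Vocabulary size, weight and model shape are
# assumptions for illustration only.
class SceneWeightedIntentModel(nn.Module):
    def __init__(self, vocab_size=30000, dim=128, n_intents=3, scene_weight=1.5):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.scene_weight = scene_weight
        self.classifier = nn.Linear(dim, n_intents)

    def forward(self, token_ids, scene_mask):
        # scene_mask is 1.0 for scene tokens, 0.0 for question tokens.
        emb = self.embed(token_ids)                       # (batch, seq, dim)
        w = 1.0 + (self.scene_weight - 1.0) * scene_mask  # per-token weight
        emb = emb * w.unsqueeze(-1)
        return self.classifier(emb.mean(dim=1))           # pooled logits

model = SceneWeightedIntentModel()
ids = torch.randint(0, 30000, (1, 8))
mask = torch.tensor([[1., 1., 0., 0., 0., 0., 0., 0.]])  # first 2 = scene
print(model(ids, mask).shape)  # torch.Size([1, 3])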
Step 3 specifically comprises the following steps:
Step 3-1, extracting the natural language question information of a user using the interactive operation software, and extracting the software's current live-action information, which represents the state while the user is using the software and includes the software name, the actual state of the menu bar, the screen size and the resolution; the live-action information is acquired from the button information provided by the operating system or the software interface, or from button operation information obtained by a hook method, system listening or polling scheduling;
Step 3-2, combining the live-action information with the natural language question information to generate the live-action natural language question information; the combination uses the live-action information as part of the question information either in the form of language omission recovery or through an attention mechanism.
Step 4 specifically comprises:
inputting the live-action natural language question information into the deep-learning interactive help model based on semantic understanding of the software operation live-action, and outputting the ranking score of each semantic parsing result, i.e. giving the live-action-combined semantic parsing results and their ranking scores;
the task-oriented dialogue realized with deep learning comprises a dialogue system built as a pipeline or end-to-end; intent recognition comprises rule-based intent recognition, long short-term memory network (LSTM) based intent recognition, or BERT-based intent recognition.
Step 5 specifically comprises the following steps:
Step 5-1, selecting one or more semantic parsing results as the understanding result of the user's question according to the ranking scores;
Step 5-2, providing interactive operation help for the user based on the interactive help model; the interactive help comprises:
button position finding help: a sequential prompt of the one or more buttons to click, based on the position of the current menu state of the interactive operation software and the button relation tree (see the sketch after this list);
button function query help: an introduction to the button's function, comprising text, picture and video introductions;
button how-to-use help: an introduction combining the live-action results of the button function query help and the button position finding help.
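A minimal sketch of the button position finding help: given a toy button relation tree and the current menu state, it derives the sequential click prompt, matching the behaviour of the Word Count embodiments described later. The tree contents are illustrative assumptions.

# Sketch: deriving the click sequence from the button relation tree, given
# the user's current menu position. The tree layout is an assumption.
TREE = {  # child -> parent
    "Review": None,
    "Word Count": "Review",
    "Spelling": "Review",
}

def path_to_root(button):
    path = []
    while button is not None:
        path.append(button)
        button = TREE[button]
    return path  # button ... root

def click_path(current, target):
    """Buttons to click, starting from the current menu position."""
    target_path = path_to_root(target)[::-1]      # root ... target
    if current in target_path:                    # skip what is already open
        target_path = target_path[target_path.index(current) + 1:]
    return target_path

print(click_path("Start", "Word Count"))   # ['Review', 'Word Count']
print(click_path("Review", "Word Count"))  # ['Word Count']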
The natural language question information is voice information or text information; voice information is converted into text information by voice recognition before the natural language information is semantically parsed.
The invention further provides a development platform for the interactive help system based on semantic understanding of the software operation live-action, comprising:
an interactive software button information acquisition module, used to help the system developer acquire button information from the interactive operation software, where the button function information, the button frame or vertex information, and the current button and its parent button can be filled in manually through a menu provided by the development platform;
a knowledge-based live-action natural language question-answer pair generation module, used to generate the software button relation tree from the button information and the button usage questions and to further generate training data; it also generates training data from a synonym lexicon together with the live-action usage questions, user intention and help information, and labels collected usage questions and help information from other users of the interactive operation software as additional training data; all of this training data serves as the training samples;
an interactive help model construction module, used to train on the training samples to obtain the interactive help model based on semantic understanding of the software operation live-action.
The invention also provides an interactive help system based on semantic understanding of the software operation live-action, comprising:
a live-action acquisition module, used to acquire live-action information including the software name and the current menu state;
a live-action natural language question information generation module, used to extract the natural language question information of a user using the interactive operation software and to combine it with the live-action information to generate live-action natural language question information;
an interactive help module based on semantic understanding of the software operation live-action, which adopts a deep-learning task-oriented dialogue system and is used to understand the user's natural language question containing live-action information, give interactive help for the question, and present the button how-to-use help by combining the live-action results of the button function query help and the button position finding help;
the system further comprises a voice recognition module, used to convert natural language voice information into text information before the semantic parsing module parses it.
Due to the adoption of the technical scheme, the invention has the following beneficial effects:
according to the interactive help method and system based on the software operation live-action semantic understanding and the development platform thereof provided by the embodiment of the invention, the button information of the interactive operation software is obtained through the development platform, the user intention-question-answer pair which is combined with the live-action information and is used in the interactive operation is generated, and the training sample formed by combining other help information in use trains the interactive help model based on the software operation live-action semantic understanding and used for deep learning.
And performing semantic analysis on the natural language input by the user in combination with the real scene to obtain one or more semantic analysis results, and sequencing the semantic analysis results according to sequencing learning to obtain a natural language understanding result.
Furthermore, the scheme of the invention solves the problem that the whole natural language understanding system can not be recovered due to the understanding error of the lack of semantic scene types existing in the deep learning method, fully utilizes the real-scene semantic analysis result information, realizes semantic result selection, finally obtains the natural language understanding result, greatly improves the reliability and the accuracy of the natural language understanding, and provides high-quality interactive help on the basis of the understanding.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the principles of the invention:
FIG. 1 is a flowchart of an interactive help method based on the semantic understanding of a software operation scene according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of an interactive help system development platform based on the semantic understanding of the software operation real scene according to the embodiment of the present invention;
fig. 3 is a schematic structural diagram of an interactive help system based on the semantic understanding of the software operation real scene according to the embodiment of the present invention.
Detailed Description
The present invention will now be described in detail with reference to the drawings and specific embodiments, wherein the exemplary embodiments and descriptions of the present invention are provided to explain the present invention without limiting the invention thereto.
Fig. 1 is a flowchart of a natural language understanding method according to an embodiment of the present invention, including the following steps:
and step 111, extracting button information of an interactive operation button of the interactive operation software, further generating use problems and help information, and generating training data of the interactive help system based on the semantic understanding of the software operation scene.
The button information includes: button name, button range information, relationship information between buttons, button function information. The button range information refers to the conversion relation information of different screen sizes and resolutions according to the range contained by the button frame or vertex information and the button range; the button relationship refers to the relationship between the current button and other buttons (called sub-buttons) which appear when the current button is clicked; the button relation tree refers to a tree-like hierarchical relation formed by the relations among the parent-child buttons of the interactive operation software; the button function information refers to description information of the button function. The acquisition of the button information is not limited to the 'button information' provided by using an operating system or a software interface, and for software without interface information, the interactive help method and system based on the semantic understanding of the software operation scene and the development platform thereof can be adopted to acquire the relevant information.
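A minimal sketch of a record for the button information just described, including the conversion of the button range to other screen sizes and resolutions; the field names and the linear scaling rule are assumptions.

from dataclasses import dataclass
from typing import Optional

# Sketch of a button information record; field names and the linear scaling
# rule for the button range are assumptions.
@dataclass
class ButtonInfo:
    name: str                     # button name, e.g. "Word Count"
    bounds: tuple                 # (left, top, right, bottom) in pixels
    base_resolution: tuple        # (width, height) at capture time
    parent: Optional[str] = None  # parent button in the relation tree
    function: str = ""            # description of the button's function

    def bounds_at(self, width, height):
        """Convert the captured button range to another screen size/resolution."""
        sx = width / self.base_resolution[0]
        sy = height / self.base_resolution[1]
        l, t, r, b = self.bounds
        return (l * sx, t * sy, r * sx, b * sy)

wc = ButtonInfo("Word Count", (105, 80, 180, 110), (1920, 1080),
                parent="Review", function="Counts the words in the selection.")
print(wc.bounds_at(1280, 720))  # button range scaled to a smaller screen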
In one embodiment, the button information of the "Word Count" button in Word software is obtained. Word software is started together with the development platform; the mouse is placed in turn at the upper-left and lower-right corners of the "Word Count" button, and the range information of the button is obtained by following the platform's prompts; the button name "Word Count" and its parent node, the "Review" button, are then filled in manually, followed by the function information and the usage steps; if needed, pictures explaining the function and short videos introducing the usage method can also be added.
Button information of the operation buttons is extracted with the development platform of the interactive help system based on semantic understanding of the software operation live-action, and training data combining the live-action usage questions, the user intention and the help information is generated from knowledge of the button usage questions by a rule-based (if-then) method; the software button relation tree, i.e. the tree-shaped hierarchy formed by the parent-child relationships of the software's buttons, is generated at the same time. In one embodiment, the live-action information is the "Review" menu of Word software.
Training data is further generated from the synonym lexicon together with the usage questions, user intention and help information combined with the live-action information. The live-action information refers to the current state of the interactive operation software and its menus.
In one embodiment, synonym descriptions of "word count" may be: "number of words", "how many words", or "count".
Other users' usage questions and help information for the interactive operation software are collected, labeled and used as training data; word count may also be described as: "count the number of words in this part".
The combination method is not limited to using the live-action information as part of the question information in the form of language omission recovery or through an attention mechanism.
The input sequence INPUT1 for the user's question is:
INPUT1 = <x1, x2, …, xm>;
the live-action-combined input sequence INPUT2 is:
INPUT2 = <z, x1, x2, …, xm>;
where z is the content corresponding to the live-action: it can be the un-recovered live-action dialogue content, or the corresponding live-action content after attention-mechanism processing.
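A minimal sketch of the two input sequences above; whitespace tokenization and the form of the live-action content z are simplifying assumptions.

# Sketch matching the formulas above: INPUT1 is the user's question tokens,
# INPUT2 prepends the live-action content z. Whitespace tokenization is a
# simplifying assumption.
def build_inputs(question, scene_content):
    input1 = question.split()            # <x1, x2, ..., xm>
    input2 = [scene_content] + input1    # <z, x1, x2, ..., xm>
    return input1, input2

q = "Where is the word count?"
z = "Word:Review-menu"                   # live-action content (assumed form)
print(build_inputs(q, z))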
The training data is taken as training samples.
Step 112: training with the training samples, the deep-learning interactive help model based on semantic understanding of the software operation live-action is constructed in advance. The model adopts a task-oriented dialogue mode, and the software live-action information at the time of the user's question is added to the model in a weighted manner during training.
The task-oriented dialogue realized with deep learning is not limited to a dialogue system built end-to-end or as a pipeline; intent recognition is not limited to rule-based intent recognition, long short-term memory network (LSTM) based intent recognition, or BERT-based intent recognition.
Step 113: for the user's question, the live-action natural language question information is generated at the time of use, which includes:
extracting the natural language question information of the user using the interactive operation software, and extracting the software's current live-action information, which represents the state while the user is using the software and includes, but is not limited to, the software name, the actual state of the menu bar, the screen size and the resolution.
Acquisition of the live-action information is not limited to the button information provided by the operating system or the software interface, or button operation information obtained by a hook method, system listening, or polling scheduling.
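A minimal sketch of live-action acquisition by polling scheduling; get_foreground_ui_state() is a hypothetical stand-in for an operating-system accessibility or UI-automation call (or a hook-based equivalent) and is not a real library function.

import time

# Sketch of live-action acquisition by polling; the helper below is
# hypothetical and only illustrates the kind of state that would be read.
def get_foreground_ui_state():
    # Hypothetical: would return software name, active menu, screen geometry.
    return {"software": "Word", "menu": "Review", "screen": (1920, 1080)}

def poll_live_action(interval_s=0.5, ticks=3):
    last = None
    for _ in range(ticks):                 # a real loop would run indefinitely
        state = get_foreground_ui_state()
        if state != last:                  # only report scene changes
            print("live-action changed:", state)
            last = state
        time.sleep(interval_s)

poll_live_action()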
The live-action information and the natural language question information are combined to generate the live-action natural language question information; the combination uses the live-action information as part of the question information either in the form of language omission recovery or through an attention mechanism.
In practical applications, the natural language information may be input by the user as text, as voice, or as both. If the input natural language information includes voice information, the voice information is converted into the corresponding text information by voice recognition technology.
Step 114: the live-action natural language question information is input into the deep-learning interactive help model based on semantic understanding of the software operation live-action, and the ranking score of each semantic parsing result is output; the semantic parsing results and their ranking scores are combined with the live-action.
This solves the problem that, in deep learning methods, an understanding error caused by a missing live-action cannot be recovered by the natural language understanding system as a whole; the live-action semantic parsing result information is fully used, semantic result selection is realized, and the natural language understanding result based on the software operation live-action is finally obtained, which can greatly improve the reliability and accuracy of live-action semantic understanding based on the software operation.
Step 115: one or more semantic parsing results are selected according to the ranking scores as the understanding result of the user's question, and interactive operation help is then provided for the user based on the interactive help model.
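A minimal sketch of this selection: raw scores for candidate semantic parses are normalized into ranking scores (softmax here, as one plausible choice) and the top-k kept as the understanding result; the candidates and score values are made-up illustrations.

import math

# Sketch: turning raw model scores for candidate semantic parses into ranking
# scores and keeping the top-k as the understanding result.
def rank_parses(candidates, k=1):
    total = sum(math.exp(s) for _, s in candidates)        # softmax normalizer
    ranked = sorted(((c, math.exp(s) / total) for c, s in candidates),
                    key=lambda cs: cs[1], reverse=True)
    return ranked[:k]

parses = [("find_position(Word Count)", 2.3),
          ("query_function(Word Count)", 0.4),
          ("how_to_use(Word Count)", -1.0)]
print(rank_parses(parses, k=2))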
The interactive help includes, but is not limited to: button position finding help, button function query help, and button how-to-use help, where:
button position finding help gives a sequential prompt of the one or more buttons to click, based on the position of the current menu state of the interactive operation software and the button relation tree;
button function query help gives an introduction to the button's function, not limited to text, picture and video introductions;
button how-to-use help gives an introduction combining the live-action results of the button function query help and the button position finding help.
After the task-oriented dialogue processing by deep learning, the output is:
OUTPUT1 = <y1, y2, …, yn>
where the output sequence includes the button operation commands.
This scheme overcomes the unintuitiveness of general help documentation; combined with the intuitive help of the live-action, it greatly improves the user's experience of the product and raises usage efficiency.
In Embodiment 1: under the "Start" menu of Word software, the user asks: "Where is the word count?" The live-action is the "Start" menu of Word software, and the prompt given by the interactive help method and system based on semantic understanding of the software operation live-action includes: after selecting the content to be counted, click the "Review" and then the "Word Count" buttons in sequence. The way of prompting the user is not limited to highlighting or flashing the button positions, placing the mouse directly at the button position, or prompting by voice or text.
In Embodiment 2: if the user asks the same question as in Embodiment 1 under the "Review" menu of Word software, the live-action is the "Review" menu of Word software, and the system prompt includes: after selecting the content to be counted, click the "Word Count" button.
Correspondingly, an embodiment of the present invention further provides a development platform for the interactive help system based on semantic understanding of the software operation live-action; fig. 2 is a schematic structural diagram of the platform, which includes:
an interactive software button information acquisition module (201 in fig. 2): used to help the system developer acquire button information from the interactive operation software, the button information including: button name, button range information, relationship information between buttons, and function information. The button function information is the description of the button's function, including text, pictures and videos; the button range information refers to the range delimited by the button frame or vertex information, together with its conversion across different screen sizes and resolutions; the button relationship information refers to the relationship between the current button and the other buttons (called child buttons) that appear when it is clicked; the button function information, the button frame or vertex information, and the current button and its parent button can be filled in manually through a menu provided by the development platform;
a knowledge-based live-action natural language question-answer pair generation module (202): used to generate, from the button information, training data combining the live-action usage questions, the user intention and the help information, and to generate the software button relation tree; to generate training data from the synonym lexicon together with the live-action usage questions, user intention and help information; and to label collected usage questions and help information from other users of the interactive operation software as training data; the training data serves as the training samples. The button relation tree is the tree-shaped hierarchy formed by the parent-child button relationships of the interactive operation software; the live-action information refers to the current state of the interactive operation software and its menus; the combination method is not limited to using the live-action information as part of the question information in the form of natural-language omission recovery or through an attention mechanism;
an interactive help model construction module based on semantic understanding of the software operation live-action (203): used to obtain the interactive help model based on semantic understanding of the software operation live-action from the training samples.
Correspondingly, an embodiment of the present invention further provides an interactive help system based on semantic understanding of the software operation live-action; fig. 3 is a schematic structural diagram of the system, which includes:
a live-action acquisition module (301): used to acquire live-action information including the software name and the current menu state;
a live-action natural language question information generation module (302): used to extract the natural language question information of the user using the interactive operation software, and the software's current live-action information representing the actual state of the menu bar, the screen size and the resolution while the user uses the software; the live-action information and the natural language question information are combined to generate the live-action natural language question information;
an interactive help module based on semantic understanding of the software operation live-action (303): adopting a deep-learning task-oriented dialogue system; used to understand the user's natural language question containing live-action information and to give, from the position of the current menu state of the software and the button relation tree, a sequential prompt of the one or more buttons to click; button function query help: an introduction to the button's function, not limited to text, picture and video introductions; button how-to-use help: an introduction combining the live-action results of the button function query help and the button position finding help.
The natural language information is voice information or text information.
The system further comprises a voice recognition module, used to convert the natural language voice information into text information before the semantic parsing module parses it.
With this method, the reliability and accuracy of semantic understanding based on the software operation live-action can be improved, the unintuitiveness of general help documentation is overcome, and, combined with the intuitive help of the live-action, the user's experience of the product is greatly improved.
The present invention is not limited to the above embodiments. Based on the technical solutions disclosed herein, those skilled in the art can, without creative effort, substitute or modify some technical features according to the disclosed technical content; such substitutions and modifications all fall within the protection scope of the present invention.

Claims (10)

1. An interactive help method based on semantic understanding of the software operation live-action, characterized by comprising the following steps:
Step 1, extracting button information of the operation buttons of interactive operation software, generating usage questions and help information, and generating training data for the interactive help system based on semantic understanding of the software operation live-action;
Step 2, constructing in advance a deep-learning interactive help model based on semantic understanding of the software operation live-action;
Step 3, generating live-action natural language question information;
Step 4, inputting the live-action natural language question information into the deep-learning interactive help model based on semantic understanding of the software operation live-action, and outputting the ranking score of each semantic parsing result;
Step 5, selecting one or more semantic parsing results as the understanding result of the user's question according to the ranking scores, and then providing interactive operation help for the user based on the interactive help model.
2. The method of claim 1, wherein the button information comprises:
button names, button range information, relationship information between buttons, and button function information;
the button range information refers to the range delimited by the button frame or vertex information, together with the conversion relationships of the button range across different screen sizes and resolutions;
the button relationship refers to the relationship between the current button and the child buttons that appear when the current button is clicked;
the button function information refers to description information of the button function.
3. The method according to claim 1, wherein step 1 is specifically:
Step 1-1, extracting the button information of the operation buttons of the interactive operation software with the development platform of the interactive help system based on semantic understanding of the software operation live-action;
Step 1-2, generating, from knowledge of the button usage questions and by a rule-based (if-then) method, training data that combines the live-action usage questions with the user intention and the help information, and generating the software button relation tree;
the button relation tree refers to the tree-shaped hierarchy formed by the parent-child button relationships of the interactive operation software;
Step 1-3, further generating training data from a synonym lexicon together with the live-action usage questions, user intention and help information;
Step 1-4, collecting other users' usage questions and help information for the interactive operation software, labeling them, and using them as training data;
Step 1-5, taking the training data as training samples;
the combination method combines the usage questions with the live-action information, the user intention and the help information, using the live-action information as part of the question information either in the form of natural-language omission recovery or through an attention mechanism.
4. The method according to claim 3, wherein step 2 is specifically:
training with the training samples and the help information to obtain the interactive help model based on semantic understanding of the software operation live-action; the model adopts a task-oriented dialogue mode, and the software live-action information at the time of the user's question is added to the model in a weighted manner during training.
5. The method according to claim 1, wherein step 3 is specifically:
step 3-1, extracting natural language question information of a user using interactive operation software;
extracting the software's current live-action information, which represents the state while the user is using the software and includes the software name, the actual state of the menu bar, the screen size and the resolution; the live-action information is acquired from the button information provided by the operating system or the software interface, or from button operation information obtained by a hook method, system listening or polling scheduling;
Step 3-2, combining the live-action information with the natural language question information to generate the live-action natural language question information, the combination using the live-action information as part of the question information either in the form of language omission recovery or through an attention mechanism.
6. The method according to claim 1, wherein step 4 is specifically:
inputting the live-action natural language question information into the deep-learning interactive help model based on semantic understanding of the software operation live-action, and outputting the ranking score of each semantic parsing result, i.e. giving the live-action-combined semantic parsing results and their ranking scores;
the task-oriented dialogue realized with deep learning comprises a dialogue system built as a pipeline or end-to-end; intent recognition comprises rule-based intent recognition, long short-term memory network based intent recognition, or BERT-based intent recognition.
7. The method according to claim 1, wherein step 5 is specifically:
step 5-1, selecting one or more semantic parsing results as the understanding result of the user's question according to the ranking scores;
step 5-2, providing interactive operation help for the user based on the interactive help model; the interactive help comprises:
button position finding help: a sequential prompt of the one or more buttons to click, based on the position of the current menu state of the interactive operation software and the button relation tree;
button function query help: an introduction to the button's function, comprising text, picture and video introductions;
button how-to-use help: an introduction combining the live-action results of the button function query help and the button position finding help.
8. The method according to any one of claims 1 to 7, wherein the natural language question information is voice information or text information;
for voice information, the natural language information is converted into text information by voice recognition before semantic parsing.
9. An interactive help system development platform based on software operation live-action semantic understanding, which is characterized by comprising:
an interactive software button information acquisition module, used to help the system developer acquire button information from the interactive operation software, wherein the button function information, the button frame or vertex information, and the current button and its parent button can be filled in manually through a menu provided by the development platform;
a knowledge-based live-action natural language question-answer pair generation module, used to generate the software button relation tree from the button information and the button usage questions, to further generate training data, to take the training data as training samples, and to generate training data from a synonym lexicon together with the live-action usage questions, user intention and help information;
and the interactive help model construction module is used for training according to the training sample to obtain the interactive help model based on the software operation real scene semantic understanding.
10. An interactive help system based on the semantic understanding of the software operation scene, which is characterized by comprising:
the live-action obtaining module is used for obtaining live-action information including software names and menu current state information;
the real-scene natural language question information generating module is used for extracting natural language question information of a user using interactive operation software; combining the live-action information and the natural language question information to generate live-action natural language question information;
an interactive help module based on semantic understanding of the software operation live-action: adopting a deep-learning task-oriented dialogue system; used to understand the user's natural language question containing live-action information, give interactive help for the question, and present the introduction combining the live-action results of the button function query help and the button position finding help;
the system further comprises: a voice recognition module, used to convert the natural language information into text information by voice recognition before the semantic parsing module semantically parses it.
CN202110450524.0A 2021-04-25 2021-04-25 Voice interaction method, system and platform for software operation live-action semantic understanding Active CN113223520B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110450524.0A CN113223520B (en) 2021-04-25 2021-04-25 Voice interaction method, system and platform for software operation live-action semantic understanding

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110450524.0A CN113223520B (en) 2021-04-25 2021-04-25 Voice interaction method, system and platform for software operation live-action semantic understanding

Publications (2)

Publication Number Publication Date
CN113223520A 2021-08-06
CN113223520B (en) 2024-03-26

Family

ID=77088970

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110450524.0A Active CN113223520B (en) 2021-04-25 2021-04-25 Voice interaction method, system and platform for software operation live-action semantic understanding

Country Status (1)

Country Link
CN (1) CN113223520B (en)


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180308006A1 (en) * 2017-04-24 2018-10-25 William Atkinson Method and Apparatus for Dynamic Evolving Cognitive questioning querying architecture iterative problem solving dynamically evolving feedback loop
CN109409234A (en) * 2018-09-27 2019-03-01 广东小天才科技有限公司 Method and system for assisting students in problem location learning
US10922493B1 (en) * 2018-09-28 2021-02-16 Splunk Inc. Determining a relationship recommendation for a natural language request
CN111158648A (en) * 2019-12-18 2020-05-15 西安电子科技大学 Interactive help system development method based on live-action semantic understanding and platform thereof
CN111240787A (en) * 2020-01-10 2020-06-05 西安电子科技大学 Interactive help method and system based on real scene semantic understanding

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
张紫婷; 云静; 许志伟; 刘利民: "Research and Application of Real-Scene English Scene Translation Based on Deep Learning", Journal of Inner Mongolia University of Technology (Natural Science Edition), no. 01, pages 63-69 *
王东波; 高瑞卿; 沈思; 李斌: "Automatic Classification of Questions in Pre-Qin Classics Based on Deep Learning", Journal of the China Society for Scientific and Technical Information, no. 11, pages 42-50 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113901213A (en) * 2021-10-08 2022-01-07 西安电子科技大学 Task-oriented dialogue-oriented live-action information classification method and system
CN113901213B (en) * 2021-10-08 2024-04-02 西安电子科技大学 Live-action information classification method and system for task-oriented dialogue

Also Published As

Publication number Publication date
CN113223520B (en) 2024-03-26

Similar Documents

Publication Publication Date Title
CN112073741A (en) Live broadcast information processing method and device, electronic equipment and storage medium
CN110309350B (en) Processing method, system, device, medium and electronic equipment for recitation tasks
US20240070397A1 (en) Human-computer interaction method, apparatus and system, electronic device and computer medium
CN113569037A (en) Message processing method and device and readable storage medium
CN112087657B (en) Data processing method and device
CN110691028B (en) Message processing method, device, terminal and storage medium
CN106407393A (en) An information processing method and device for intelligent apparatuses
CN111046148A (en) Intelligent interaction system and intelligent customer service robot
CN113407675A (en) Automatic education subject correcting method and device and electronic equipment
CN113392197A (en) Question-answer reasoning method and device, storage medium and electronic equipment
CN112232066A (en) Teaching outline generation method and device, storage medium and electronic equipment
KR102202372B1 (en) System for creating interactive media in which user interaction can be recognized by reusing video content, and method of operating the system
CN113223520B (en) Voice interaction method, system and platform for software operation live-action semantic understanding
CN117057318A (en) Domain model generation method, device, equipment and storage medium
CN112582073B (en) Medical information acquisition method, device, electronic equipment and medium
CN111125145A (en) Automatic system for acquiring database information through natural language
CN114064943A (en) Conference management method, conference management device, storage medium and electronic equipment
CN114416516A (en) Test case and test script generation method, system and medium based on screenshot
CN111158648B (en) Interactive help system development method based on live-action semantic understanding and platform thereof
CN113542797A (en) Interaction method and device in video playing and computer readable storage medium
CN111240787A (en) Interactive help method and system based on real scene semantic understanding
CN117193738A (en) Application building method, device, equipment and storage medium
CN109086391B (en) Method and system for constructing knowledge graph
CN111176624B (en) Method and device for generating stream type calculation index
JP2022544428A (en) Search item rewriting method, device, device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant