CN107526717B - Method for automatically generating natural language text by structured process model - Google Patents

Publication number
CN107526717B
Authority
CN
China
Prior art keywords
flow
text
model
natural language
tree
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710620781.8A
Other languages
Chinese (zh)
Other versions
CN107526717A (en)
Inventor
曾庆田
原桂远
李超
鲁法明
段华
周长红
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong University of Science and Technology
Original Assignee
Shandong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong University of Science and Technology
Priority to CN201710620781.8A
Publication of CN107526717A
Application granted
Publication of CN107526717B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/151 Transformation
    • G06F40/154 Tree transformation for tree-structured or markup documents, e.g. XSLT, XSL-FO or stylesheets
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/151 Transformation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/253 Grammatical analysis; Style critique
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a method for automatically generating natural language text from a structured process model, and belongs to the field of process mining. The method first uses label text parsing to acquire and analyze the label text in a BPMN process model, then uses process model structure conversion to convert the structure of the BPMN process model into a process structure tree, and finally uses natural language generation to produce the natural language text. The generated text is grammatically correct and semantically complete, so that non-specialists can understand the BPMN process model by reading it.

Description

Method for automatically generating natural language text by structured process model
Technical Field
The invention belongs to the field of process mining, and particularly relates to a method for automatically generating a natural language text by a structured process model.
Background
Currently, there are two kinds of schemes for generating natural language text from a BPMN (Business Process Model and Notation) process model: in the first, a BPMN process model expert reads and understands the model and writes the corresponding natural language text; in the second, the natural language text is generated automatically by means of natural language analysis.
The first method relies on a process model expert who has mastered the relevant BPMN knowledge; the expert reads and understands the BPMN process model and expresses its meaning in natural language text.
The second method uses natural language analysis to generate a natural language description of the process model; it has the advantages of low cost, high efficiency and simple operation. However, the generated text depends on predefined templates, so its quality depends on the completeness and correctness of the templates, and structures such as branches and choices in the process model cannot be described correctly.
Comparing the two methods: the results of the first are more accurate and authoritative, and can be tailored to the individual model, but process experts are hard to find and the time cost is high. The second has low development difficulty and high efficiency, but the text depends on templates and its quality cannot be guaranteed. The technique and approach proposed by the invention are novel as a whole and cannot be realized by existing natural language text generation methods.
Existing methods for generating natural language text from a BPMN process model include manual generation and generation by natural language analysis. Their technical defects are mainly reflected in the following aspects:
The manual generation scheme cannot avoid long turnaround times. As the process model grows in scale, even process model experts cannot guarantee the accuracy of the text they write, and manual generation is costly and inefficient.
The natural language analysis scheme generates natural language text from templates and label text information. Because it does not take the grammar of the generated text into account, grammatical correctness cannot be guaranteed and the readability of the generated text is poor. Nor can the natural language analysis method correctly handle structures such as branches, choices, loops and concurrency in the process model, so process structure information is lost, which is fatal to understanding the process model.
Disclosure of Invention
In view of the above technical problems in the prior art, the invention provides a method for automatically generating natural language text from a structured process model, which is reasonably designed, overcomes the shortcomings of the prior art and achieves good results.
To achieve this purpose, the invention adopts the following technical scheme:
A method for automatically generating natural language text from a structured process model adopts a label text information parsing module, a process model structure conversion module and a natural language text generation module;
the label text information parsing module is configured to acquire and parse the label text information of the model elements in a BPMN (Business Process Model and Notation) process model: it acquires the text information of nodes, edges and lanes, and then analyzes that text using semantic role labeling;
the process model structure conversion module is configured to convert the process model structure: it divides the process model into hierarchical process fragments, and stores the fragments and the relationships among them in a process structure tree;
the natural language text generation module is configured to generate the natural language text of the process model;
the method for automatically generating natural language text from a structured process model comprises the following steps:
Step 1: text parsing;
the label text information in the BPMN process model is acquired and parsed by the label text information parsing module: the text information of nodes, edges and lanes is acquired, and then analyzed using semantic role labeling;
Step 2: structure conversion;
the BPMN process model is traversed depth-first and divided into process fragments using the RPST (Refined Process Structure Tree) algorithm; the process model structure conversion module discovers the relationships that organize the fragments, and converts the process model structure from its graph representation into a tree representation;
Step 3: text generation;
the natural language text of the process model is generated by the natural language text generation module.
Preferably, step 2 specifically comprises the following steps:
Step 2.1: graph traversal;
the process model is traversed depth-first, visiting all nodes and edges in the model;
Step 2.2: structure division;
the graph structure of the process model is divided using the RPST algorithm into process fragments that each have a unique entry and a unique exit; any two fragments are either nested or disjoint;
Step 2.3: construction of the process structure tree;
the process fragments and the relationships among them are stored in a process structure tree, in which the levels of the tree represent the nesting relationships of the fragments and the nodes of the tree represent the fragments.
Preferably, step 3 specifically comprises the following steps:
Step 3.1: generation of the annotated process structure tree;
an annotated process structure tree is generated from the parsed label text information and the process structure tree;
Step 3.2: sentence generation from syntax trees;
the annotated process structure tree is traversed recursively, and sentences describing its leaf nodes are generated using syntax trees;
Step 3.3: natural language text generation;
the annotated process structure tree is traversed recursively, and the natural language text is generated from the sentences produced by the syntax trees according to the type of each process fragment;
Step 3.4: text structure organization;
paragraph marks, indentation and punctuation are added according to the process model structure to organize the paragraph structure of the natural language text.
Preferably, in the structured process model, the text information on an edge is stored in the edge's target node, so that the text on each edge is parsed and used only once.
The invention has the following beneficial technical effects:
Parsing of the label text information: existing natural language analysis methods do not consider the grammatical structure of the text they generate; the generated text depends on templates and has a single, fixed format. The invention acquires the text information of the model elements in the process model by parsing the label text, and generates the text from that information, which ensures consistency between the model and the text.
Process structure conversion based on the process structure tree: natural language analysis methods and manual text generation describe the process model on the basis of its local structure and do not treat the process model as a whole. The invention uses a process structure tree to represent the structure of the process model and divides the model into hierarchical process fragments, each representing a modular sub-process, so that the structure of the process model is captured more accurately.
Natural language generation based on syntax trees: the invention uses syntax trees and the parsed label text information to generate grammatically correct short texts; compared with the prior art, it ensures that the text is grammatically correct and semantically complete. The structure of the process model is described by recursively traversing the annotated process structure tree, which ensures consistency between the model and the text.
Drawings
FIG. 1 is a schematic diagram of the basic principle of the invention.
FIG. 2(a) is a schematic diagram of a process model structure.
FIG. 2(b) is a schematic diagram of a process structure tree.
FIG. 3 is a diagram of a syntax tree and the text generated from it.
FIG. 4 is a BPMN process diagram.
FIG. 5 shows the experimental results.
Detailed Description
The invention is described in further detail below with reference to the drawings and specific embodiments:
Starting from a BPMN process model, as shown in FIG. 1, the method first obtains the text information of the nodes, edges and lanes in the model and analyzes it using semantic role labeling. It then converts the process model structure using the RPST (Refined Process Structure Tree) algorithm and stores it in a process structure tree. Next, the parsed text information is merged with the process structure tree to produce an annotated process structure tree, and the text information in the annotated tree is converted into short texts using syntax trees. Finally, the natural language text of the BPMN process model is generated by traversing the annotated process structure tree and assembling the short texts generated from the syntax trees. Accordingly, the invention defines its functional modules and provides a detailed technical implementation for each of them. The main functional modules are: the label text information parsing module, the process model structure conversion module and the natural language text generation module.
1. Label text information parsing module
This module acquires the text information of the model elements in the BPMN process model, including the text of nodes, edges and lanes, and then analyzes that text using semantic role labeling.
In a BPMN process model, nodes, edges and lanes carry label text. This information must be acquired and parsed, because it is the source of the sentence constituents of the natural language text. To generate grammatically correct natural language text, semantic role labeling is used to analyze the text and obtain its subject, verb, object, clauses and similar information.
In a BPMN process model, role information is stored in pools and lanes, and the text in a node is usually a verb-object phrase; therefore, when the text of a node is parsed, the role information of its lane must be added as the subject of the verb-object phrase. To prevent edges from being parsed repeatedly, each edge is associated with its target node and the text on the edge is stored in that target node, so that each edge is parsed only once and parsing efficiency is improved. Note that gateway nodes serve as the decision points that represent branching, choice and concurrency in the process model, and they are processed slightly differently from ordinary activity nodes: a gateway node may have several incoming and several outgoing edges, all of which may carry text at the same time, so the wording of this text information needs to be adjusted manually.
Once this module has finished, the parsed information serves as the data source for natural language generation.
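The parsing step can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `ParsedLabel`, `parse_node_label` and `parse_model` are hypothetical, the input format is assumed, and a toy rule (first word is the verb, the rest is the object) stands in for real semantic role labeling. It shows only the two bookkeeping rules described above: the lane role becomes the subject, and edge text is stored on the target node.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ParsedLabel:
    subject: str                     # role taken from the lane
    verb: str
    obj: str
    incoming_texts: List[str] = field(default_factory=list)  # text of incoming edges


def parse_node_label(label: str, lane_role: str) -> ParsedLabel:
    # Toy stand-in for semantic role labeling (assumption):
    # the first word is the verb, the remainder is the object.
    verb, _, obj = label.strip().partition(" ")
    return ParsedLabel(subject=lane_role, verb=verb.lower(), obj=obj.lower())


def parse_model(
    nodes: Dict[str, Tuple[str, str]],       # node id -> (label, lane role)
    edges: List[Tuple[str, str, str]],       # (source id, target id, edge label)
) -> Dict[str, ParsedLabel]:
    parsed = {nid: parse_node_label(label, lane) for nid, (label, lane) in nodes.items()}
    for _source, target, text in edges:
        if text:
            # Edge text is stored on the target node, so each edge is handled once.
            parsed[target].incoming_texts.append(text)
    return parsed


nodes = {
    "t1": ("Check order", "sales clerk"),
    "t2": ("Reject order", "sales clerk"),
}
edges = [("t1", "t2", "order is invalid")]
parsed = parse_model(nodes, edges)
```

A real system would replace `parse_node_label` with a semantic role labeler; the surrounding bookkeeping would stay the same.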
2. Process model structure conversion module
This module converts the process model structure. The BPMN process model is represented as a graph and must be divided into hierarchical process fragments, each with a start node and an end node. Based on the relationships among them, the fragments are organized into a hierarchical tree, namely the process structure tree.
The process model is traversed by depth-first search and divided during the traversal using the RPST (Refined Process Structure Tree) algorithm. The RPST algorithm discovers the control flow of a business process and decomposes the original process into hierarchically related sub-processes; the decomposition is unique and modular.
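As a rough illustration of the traversal only (the RPST decomposition itself is not shown), the following Python sketch visits every node and edge reachable from the start node exactly once. The adjacency-list input and the function name `depth_first_traversal` are assumptions made for this example.

```python
from typing import Dict, List, Set, Tuple


def depth_first_traversal(
    successors: Dict[str, List[str]], start: str
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Return the nodes and edges reachable from `start` in depth-first order."""
    visited_nodes: List[str] = []
    visited_edges: List[Tuple[str, str]] = []
    seen: Set[str] = set()

    def visit(node: str) -> None:
        seen.add(node)
        visited_nodes.append(node)
        for target in successors.get(node, []):
            # Each node is expanded once, so each outgoing edge is recorded once.
            visited_edges.append((node, target))
            if target not in seen:
                visit(target)

    visit(start)
    return visited_nodes, visited_edges


# A small model with one split/join pair (hypothetical node names).
successors = {
    "start": ["a"],
    "a": ["split"],
    "split": ["b", "c"],
    "b": ["join"],
    "c": ["join"],
    "join": ["end"],
    "end": [],
}
nodes_visited, edges_visited = depth_first_traversal(successors, "start")
```

Because a node is never expanded twice, the same function also terminates on models that contain loops.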
The RPST algorithm can discover structures such as choice, skip, loop, parallelism and mutual exclusion. It decomposes a process into hierarchically related sub-process fragments, each of which has a type: trivial (a single edge), bond (multiple edges connecting the same two nodes), polygon (fragments connected in series between nodes), and rigid (any fragment that belongs to none of the other three types). In this way the process model can be decomposed and, once organized, yields a tree that corresponds to the structure of the process model. FIGS. 2(a) and 2(b) show a process model and its process structure tree. Because the division is performed by the RPST algorithm, the result is unique and modular, and the division is as simple as possible.
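The tree that results from the decomposition can be sketched as a small data structure. The Python example below does not compute an RPST; the tree is built by hand for a model with one split/join pair, and the class and helper names are hypothetical. It only illustrates the four fragment types, the nesting expressed by the tree levels, and the property that any two fragments are nested or disjoint.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import FrozenSet, Iterator, List, Optional, Tuple

Edge = Tuple[str, str]


class FragmentType(Enum):
    TRIVIAL = "trivial"   # a single edge
    POLYGON = "polygon"   # fragments in sequence
    BOND = "bond"         # fragments sharing the same entry and exit
    RIGID = "rigid"       # anything else


@dataclass
class Fragment:
    kind: FragmentType
    entry_node: str
    exit_node: str
    children: List["Fragment"] = field(default_factory=list)
    edge: Optional[Edge] = None   # set only on trivial fragments

    def edges(self) -> FrozenSet[Edge]:
        if self.kind is FragmentType.TRIVIAL:
            return frozenset([self.edge])
        return frozenset().union(*(child.edges() for child in self.children))


def trivial(source: str, target: str) -> Fragment:
    return Fragment(FragmentType.TRIVIAL, source, target, edge=(source, target))


def all_fragments(root: Fragment) -> Iterator[Fragment]:
    yield root
    for child in root.children:
        yield from all_fragments(child)


def depth(fragment: Fragment) -> int:
    return 1 + max((depth(child) for child in fragment.children), default=0)


def nested_or_disjoint(a: Fragment, b: Fragment) -> bool:
    ea, eb = a.edges(), b.edges()
    return ea <= eb or eb <= ea or ea.isdisjoint(eb)


# Hand-built tree for: start -> split -> (b | c) -> join -> end
upper = Fragment(FragmentType.POLYGON, "split", "join",
                 [trivial("split", "b"), trivial("b", "join")])
lower = Fragment(FragmentType.POLYGON, "split", "join",
                 [trivial("split", "c"), trivial("c", "join")])
bond = Fragment(FragmentType.BOND, "split", "join", [upper, lower])
root = Fragment(FragmentType.POLYGON, "start", "end",
                [trivial("start", "split"), bond, trivial("join", "end")])

fragments = list(all_fragments(root))
well_formed = all(nested_or_disjoint(a, b) for a in fragments for b in fragments)
```

Here the tree levels mirror the nesting of the fragments: the root polygon contains the bond, which contains one polygon per branch.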
3. Natural language text generation module
This module generates the natural language text of the process model. The parsed text information is merged with the process structure tree to form an annotated process structure tree. The annotated tree is traversed recursively, and short texts describing the behavior of the process nodes are generated from syntax trees and the parsed text information; FIG. 3 shows such a syntax tree and the text it generates, which describe the behavior of edges and nodes in the process model. The structural information of the process model is stored in the annotated process structure tree, and each leaf node of that tree represents an edge of the process model, so the short texts can be stored in the leaf nodes.
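Sentence generation from a syntax tree can be sketched as follows. This is an illustrative Python example under stated assumptions, not the method of FIG. 3: it produces English rather than Chinese, the node labels and the helper names (`SyntaxNode`, `build_sentence_tree`, `realize`) are hypothetical, and the third-person rule is a deliberately naive heuristic.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SyntaxNode:
    label: str                          # e.g. "S", "NP", "VP", "V"
    word: Optional[str] = None          # set on leaves only
    children: List["SyntaxNode"] = field(default_factory=list)

    def words(self) -> List[str]:
        # Reading the leaves from left to right yields the sentence.
        if self.word is not None:
            return [self.word]
        return [w for child in self.children for w in child.words()]


def third_person(verb: str) -> str:
    # Naive English inflection, sufficient for the example only.
    if verb.endswith(("s", "sh", "ch", "x", "o")):
        return verb + "es"
    return verb + "s"


def noun_phrase(noun: str) -> SyntaxNode:
    return SyntaxNode("NP", children=[SyntaxNode("DET", "the"), SyntaxNode("N", noun)])


def build_sentence_tree(subject: str, verb: str, obj: str,
                        condition: Optional[str] = None) -> SyntaxNode:
    verb_phrase = SyntaxNode("VP", children=[SyntaxNode("V", third_person(verb)),
                                             noun_phrase(obj)])
    children = [noun_phrase(subject), verb_phrase]
    if condition:
        # Text from an incoming edge becomes a leading conditional clause.
        clause = SyntaxNode("SBAR", children=[SyntaxNode("IN", "if"),
                                              SyntaxNode("CLAUSE", condition + ",")])
        children.insert(0, clause)
    return SyntaxNode("S", children=children)


def realize(tree: SyntaxNode) -> str:
    text = " ".join(tree.words())
    return text[0].upper() + text[1:] + "."


plain = realize(build_sentence_tree("sales clerk", "check", "order"))
conditional = realize(build_sentence_tree("sales clerk", "reject", "order",
                                          condition="the order is invalid"))
```

Building the tree first and flattening it afterwards keeps word order in one place, which is the point of using a syntax tree instead of string templates.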
During the recursive traversal, paragraph marks, indentation and punctuation are added according to the branch structure of the process, which improves the readability of the text; this module is therefore also key to the invention.
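The recursive traversal with indentation can be sketched in Python as below. The fragment kinds ("leaf", "sequence", "choice", "parallel"), the introductory phrases and the bullet layout are assumptions chosen for the example; the patent does not prescribe this exact output format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AnnotatedFragment:
    kind: str                                   # "leaf", "sequence", "choice" or "parallel"
    sentence: str = ""                          # short text stored on leaves
    children: List["AnnotatedFragment"] = field(default_factory=list)


INTRO = {
    "choice": "One of the following branches is executed:",
    "parallel": "The following branches are executed in parallel:",
}


def describe(fragment: AnnotatedFragment) -> List[str]:
    """Return the text lines for a fragment, indented relative to the fragment."""
    if fragment.kind == "leaf":
        return [fragment.sentence]
    if fragment.kind == "sequence":
        return [line for child in fragment.children for line in describe(child)]
    # Branching fragment: introduce it, then indent each branch.
    lines = [INTRO[fragment.kind]]
    for child in fragment.children:
        branch = describe(child)
        if not branch:
            continue
        lines.append("    - " + branch[0])
        lines.extend("      " + line for line in branch[1:])
    return lines


def leaf(sentence: str) -> AnnotatedFragment:
    return AnnotatedFragment("leaf", sentence)


tree = AnnotatedFragment("sequence", children=[
    leaf("The sales clerk checks the order."),
    AnnotatedFragment("choice", children=[
        leaf("If the order is valid, the sales clerk confirms the order."),
        AnnotatedFragment("sequence", children=[
            leaf("If the order is invalid, the sales clerk rejects the order."),
            leaf("The sales clerk informs the customer."),
        ]),
    ]),
    leaf("The process ends."),
])
lines = describe(tree)
text = "\n".join(lines)
```

Each level of nesting adds one level of indentation, so nested branches remain visually distinguishable in the generated text.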
Label text information parsing: the method acquires the text on the edges, nodes and lanes of the BPMN process model and analyzes it using semantic role labeling, obtaining for each node its subject, verb, object, clauses and the text on its incoming edges. This provides the sentence constituents for generating the natural language text, ensures consistency between the model and the text, and avoids loss of text information.
Process structure conversion based on the process structure tree: the invention analyzes and converts the structure of the BPMN process model, dividing the process into hierarchical process fragments. The fragments represent the modules of the process model, and their containment relationships represent its structure. The process structure tree represents the fragments and the relationships among them. Representing the structure of the process model as a process structure tree preserves the integrity of the structure, speeds up access to it and improves conversion efficiency.
Natural language text generation based on syntax trees: the invention uses syntax trees to convert the text information in the annotated process structure tree into short texts, which are grammatically correct and describe the behavior of nodes and edges. The structure stored in the process structure tree is then traversed and described using the short texts to generate the natural language text of the process model, with paragraph marks, indentation and punctuation added during generation. This describes the model elements of the process model, captures its structure, improves the readability of the text and ensures consistency between the model and the text.
The invention has been shown to be feasible through experiments, simulation and use. The results are as follows:
In an experiment, the scheme of the invention was applied to the BPMN process model of a bicycle manufacturer to generate a natural language description of that model. The BPMN process model shown in FIG. 4 is labeled in Chinese, and a corresponding natural language description in Chinese is generated. The experimental results are shown in FIG. 5.
It is to be understood that the above description is not intended to limit the present invention, and the present invention is not limited to the above examples, and those skilled in the art may make modifications, alterations, additions or substitutions within the spirit and scope of the present invention.

Claims (2)

1. A method for automatically generating a natural language text by a structured process model is characterized in that: a label text information analysis module, a flow model structure conversion module and a natural language text generation module are adopted;
the tag text information analysis module is configured to acquire and analyze tag text information of a model element in the BPMN process model, acquire text information including nodes, edges and lanes, and then use semantic role labeling to analyze the text information;
the flow model structure conversion module is configured to complete the conversion of the flow model structure, divide the flow model into flow segments with hierarchy, and store the flow segments and the relationship among the flow segments by using a flow structure tree;
a natural language text generation module configured to complete generation of a natural language text of the flow model;
the method for automatically generating the natural language text by the structured flow model comprises the following steps:
step 1: analyzing the text;
the method comprises the steps that label text information in a BPMN process model is obtained and analyzed through a label text information analyzing module, text information including nodes, edges and lanes is obtained, and then semantic role labeling is used for analyzing the text information;
step 2: structure conversion;
traversing the BPMN flow model in a depth-first mode, dividing the flow model into flow segments by using an RPST algorithm, finding the relationship of organizing the flow segments by using a flow model structure conversion module, and converting the flow model structure represented by a graph into a form represented by a tree; the method specifically comprises the following steps:
step 2.1: traversing the graph;
traversing the flow model by a depth-first traversal technology, and traversing all nodes and edges in the flow model;
step 2.2: dividing the structure;
dividing a graph structure of the flow model by using an RPST algorithm, dividing the flow model into flow fragments with unique inlets and unique outlets, and nesting or not intersecting the flow fragments;
step 2.3: constructing a flow structure tree;
storing the flow fragments and the relationship among the flow fragments by using a flow structure tree, wherein the hierarchy of the tree represents the nesting relationship of the flow fragments, and the nodes in the tree represent the flow fragments;
step 3: generating a text;
the generation of the natural language text of the flow model is completed through a natural language text generation module;
the method specifically comprises the following steps:
step 3.1: generating a flow structure tree with annotations;
generating a flow structure tree with annotations by using the parsed label text information and the flow structure tree;
step 3.2: generating a sentence by a grammar tree;
recursively traversing the annotated flow structure tree, generating statements describing leaf nodes using the syntax tree;
step 3.3: generating a natural language text;
recursively traversing the annotated flow structure tree, and generating a natural language text by using sentences generated by the syntax tree according to the types of flow fragments;
step 3.4: organizing a text structure;
adding paragraph marks, indentation and punctuation marks according to the flow model structure, and organizing the paragraph structure of the natural language text.
2. The method for automatically generating natural language text from a structured flow model as claimed in claim 1, wherein: in the structured flow model, the text information on the edge is stored in the target node, so that the text on the edge is analyzed and used only once.
CN201710620781.8A 2017-07-27 2017-07-27 Method for automatically generating natural language text by structured process model Active CN107526717B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710620781.8A CN107526717B (en) 2017-07-27 2017-07-27 Method for automatically generating natural language text by structured process model

Publications (2)

Publication Number Publication Date
CN107526717A (en) 2017-12-29
CN107526717B (en) 2021-01-01

Family

ID=60680118

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710620781.8A Active CN107526717B (en) 2017-07-27 2017-07-27 Method for automatically generating natural language text by structured process model

Country Status (1)

Country Link
CN (1) CN107526717B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108519963B (en) * 2018-03-02 2021-12-03 山东科技大学 Method for automatically converting process model into multi-language text
CN108681529B (en) * 2018-03-26 2022-01-25 山东科技大学 Multi-language text and voice generation method of flow model diagram
CN110175225A (en) * 2019-04-26 2019-08-27 美林数据技术股份有限公司 Non-structural text data processing method and device
CN112733515B (en) * 2020-12-31 2022-11-11 贝壳技术有限公司 Text generation method and device, electronic equipment and readable storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101668047A (en) * 2009-09-30 2010-03-10 北京航空航天大学 Method and device for automatically generating composite service description language
CN102520953A (en) * 2011-12-15 2012-06-27 北京航空航天大学 Page generating method based on BPMN (business process modeling notation) and device
CN104391730A (en) * 2014-08-03 2015-03-04 浙江网新恒天软件有限公司 Software source code language translation system and method
CN105975269A (en) * 2016-05-03 2016-09-28 北京航空航天大学 Process model-based demand verification method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
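Multi-view-oriented similarity calculation method for cross-department emergency response processes; 曾庆田 et al.; Computer Integrated Manufacturing Systems (计算机集成制造系统); 2015-02-28; Vol. 21, No. 2; full text *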

Also Published As

Publication number Publication date
CN107526717A (en) 2017-12-29

Similar Documents

Publication Publication Date Title
CN107526717B (en) Method for automatically generating natural language text by structured process model
Wang et al. Joint word alignment and bilingual named entity recognition using dual decomposition
Schuler et al. Broad-coverage parsing using human-like memory constraints
Riefer et al. Mining process models from natural language text: A state-of-the-art analysis
CN108681529B (en) Multi-language text and voice generation method of flow model diagram
CN110609983B (en) Structured decomposition method for policy file
Dawood From requirements engineering to uml using natural language processing–survey study
CN116501306B (en) Method for generating interface document code based on natural language description
US20220414463A1 (en) Automated troubleshooter
Abdelnabi et al. Generating uml class diagram from natural language requirements: A survey of approaches and techniques
CN106021224A (en) Bilingual discourse annotation method
CN113609838B (en) Document information extraction and mapping method and system
Arellano et al. Frameworks for natural language processing of textual requirements
CN102654873A (en) Tourism information extraction and aggregation method based on Chinese word segmentation
CN109062904A (en) Logical predicate extracting method and device
KR20140052328A (en) Apparatus and method for generating rdf-based sentence ontology
CN107526726B (en) Method for automatically converting Chinese process model into English natural language text
JIMCALE et al. An approach for detecting syntax and syntactic ambiguity in software requirement specification
CN108519963B (en) Method for automatically converting process model into multi-language text
Kamalabalan et al. Tool support for traceability of software artefacts
CN116245177A (en) Geographic environment knowledge graph automatic construction method and system and readable storage medium
CN114911893A (en) Method and system for automatically constructing knowledge base based on knowledge graph
Peng et al. UofR at SemEval-2016 task 8: Learning synchronous hyperedge replacement grammar for AMR parsing
Osman et al. Generate use case from the requirements written in a natural language using machine learning
CN117473054A (en) Knowledge graph-based general intelligent question-answering method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant