CN111666747A - Method for generating WORD document into description class data module conforming to S1000D standard - Google Patents

Info

Publication number
CN111666747A
Authority
CN
China
Prior art keywords
word document
standard
elements
rule
data module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010481742.6A
Other languages
Chinese (zh)
Inventor
冯彬
张悦
程铮
曹亢
马永起
蒙立荣
郑翠芳
齐天永
吴家菊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
COMPUTER APPLICATION RESEARCH INST CHINA ACADEMY OF ENGINEERING PHYSICS
Original Assignee
COMPUTER APPLICATION RESEARCH INST CHINA ACADEMY OF ENGINEERING PHYSICS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by COMPUTER APPLICATION RESEARCH INST CHINA ACADEMY OF ENGINEERING PHYSICS
Priority to CN202010481742.6A
Publication of CN111666747A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Document Processing Apparatus (AREA)

Abstract

The invention discloses a method for generating a description class data module conforming to the S1000D standard from a WORD document, comprising the following steps: determining WS rule elements, where a WS rule is a mapping rule between the elements of interest identified in a WORD document and the S1000D description class data module; establishing the WS rule mapping relationship and the conversion rules; and parsing and identifying the WORD document to be converted, converting the identified elements and content according to the WS rule mapping relationship and the conversion rules, and automatically generating a description class data module file conforming to the S1000D standard. With this method a user can generate an S1000D-conforming description class data module file from a WORD document accurately, conveniently and quickly, which improves the efficiency of authoring IETM data content and reduces its complexity.

Description

Method for generating WORD document into description class data module conforming to S1000D standard
Technical Field
The invention relates to a data editing method implemented by software, in particular to a method for generating a WORD document into a description class data module conforming to the S1000D standard.
Background
An Interactive Electronic Technical Manual (IETM) is a technical publication compiled in a standard digital format that presents content such as basic principles, operation and technical maintenance interactively, in forms such as text, graphics, tables, audio and video. IETM technology is now maturing, and some organizations have already used it to produce interactive electronic technical manuals that meet user requirements. Adopting advanced techniques and authoring tools is especially important in the process of producing an IETM.
Traditional IETM authoring tools fall into two types, dedicated authoring software and general-purpose authoring software, and can also be divided into domestic (Chinese) and foreign software. Foreign IETM authoring software started early, is technically mature, performs well, is feature-rich and has been validated by many users. However, the standard versions it adopts are older, the standards it supports are not always compatible with the Chinese IETM standards, its interactive functions are weak, it is expensive, and its maintenance and security are unsatisfactory.
However, whichever tool is used, domestic or foreign, dedicated or general-purpose, the text content of a WORD document must be turned into a data module that meets the requirements during authoring. A worker must either edit the content with a tool that has XML editor functions, or convert the WORD document with a tool, in which case the WORD document must first be processed into a specific format (for example by inserting tags) or the content format to be used must be set manually. In other words, when the content part of a data module is edited from the full text of a WORD document, there are two approaches: (1) manually copying the corresponding content to the corresponding positions one item at a time; (2) processing the WORD document into a specific format (for example by inserting tags), either by hand or with a manually operated tool, and then converting it. Both approaches have drawbacks: the first requires a great deal of manpower when an IETM is actually produced and, because it is manual, cannot guarantee accuracy and needs repeated checking; the second still requires the document to be brought into the specific format by hand or with a manually operated tool.
The S1000D standard is an international specification for producing technical publications using a common source database. It introduces two core concepts, the data module and the common source database, to ensure information sharing and exchange between IETMs, which makes it very convenient for IETM production. However, traditional IETM production methods offer no precedent for combining WORD documents with the S1000D standard and converting between them.
Disclosure of Invention
The invention aims to solve the above problems by providing a method for generating a description class data module conforming to the S1000D standard from a WORD document, so that an IETM can be produced efficiently.
The invention achieves this purpose through the following technical solution:
A method for generating a description class data module conforming to the S1000D standard from a WORD document, comprising the following steps:
step 1, determining WS rule elements, where a WS rule is a mapping rule between the elements of interest identified in a WORD document and the S1000D description class data module;
step 2, establishing a WS rule mapping relationship and conversion rules, where the WS rule mapping relationship is the mapping between the determined WS rule elements and the information elements of the description class data module conforming to the S1000D standard, and a conversion rule specifies how rule elements are converted according to their types, the organizational structure among them, and the mapping relationship;
step 3, parsing and identifying the WORD document to be converted, converting the identified elements and content according to the WS rule mapping relationship and the conversion rules, and automatically generating a description class data module file conforming to the S1000D standard.
Preferably, in step 1, determining the WS rule elements includes the following steps:
step 1.1, analyzing the general structure of a WORD document;
step 1.2, identifying the elements of interest in the WORD document structure;
step 1.3, determining the elements of interest in the WORD document structure as the WS rule elements.
Preferably, in step 1.2 and step 1.3, the elements of interest in the WORD document structure include, but are not limited to, title elements at all levels, table elements, figure elements, body elements, paragraph elements, letter number elements, and number elements.
Preferably, the mapping relationship in step 2 includes, but is not limited to, the mapping between title elements at each level and the title element defined by the S1000D standard, the mapping between table elements and the table element defined by the S1000D standard, the mapping between figure elements and the figure element defined by the S1000D standard, the mapping between body elements and the body element defined by the S1000D standard, the mapping between paragraph elements and the paragraph element defined by the S1000D standard, the mapping between letter number elements and the letter number element defined by the S1000D standard, and the mapping between number elements and the number element defined by the S1000D standard.
Preferably, the conversion rules in step 2 include a title element conversion rule, a table element conversion rule, a figure element conversion rule, a body element conversion rule, a paragraph element conversion rule, a letter number element conversion rule, and a number element conversion rule.
Preferably, step 3 comprises the following steps:
step 3.1, acquiring the WORD document material to be converted;
step 3.2, parsing the WORD document material, traversing it in a loop, and automatically converting the identified elements and content according to the mapping relationship and the conversion rules to generate a description class data module file conforming to the S1000D standard.
Preferably, step 3.1 is performed as follows: opening the WORD document material to be converted, obtaining a Document object, and positioning a pointer at the start of the WORD document, where the Document object represents the entire XML document, is the root of the document tree, and provides the initial (top-level) entry point for accessing the document data. Step 3.2 comprises the following steps:
step 3.2.1, obtaining a WORD document element object, parsing and identifying it, converting it according to the identification result with reference to the mapping relationship and the conversion rules, and adding the converted content data to the description class data module;
step 3.2.2, moving the WORD document pointer down and repeating step 3.2.1 until the pointer reaches the end of the WORD document;
step 3.2.3, closing the WORD document and saving the description class data module file.
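The document-tree access described in step 3.1 can be illustrated with a minimal sketch. It assumes the WORD document is stored in the .docx (WordprocessingML) format, whose main part is an XML file, and it uses a small hand-written XML string in place of a real file; the function name `iter_body_elements` and the sample content are hypothetical and are not taken from the patent.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used inside word/document.xml of a .docx file.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Hypothetical, hand-written stand-in for the document.xml of a tiny WORD file.
SAMPLE_DOCUMENT_XML = f"""<w:document xmlns:w="{W}">
  <w:body>
    <w:p><w:pPr><w:pStyle w:val="Heading1"/></w:pPr><w:r><w:t>Overview</w:t></w:r></w:p>
    <w:p><w:r><w:t>The pump moves </w:t></w:r><w:r><w:t>coolant.</w:t></w:r></w:p>
    <w:tbl><w:tr><w:tc><w:p><w:r><w:t>Cell</w:t></w:r></w:p></w:tc></w:tr></w:tbl>
  </w:body>
</w:document>"""


def iter_body_elements(document_xml):
    """Yield (tag, style, text) for each top-level child of w:body, in order."""
    root = ET.fromstring(document_xml)   # root of the document tree
    body = root.find("w:body", NS)
    for child in body:                   # advance one element at a time
        tag = child.tag.split("}")[1]    # strip the namespace: "p", "tbl", ...
        style_node = child.find("w:pPr/w:pStyle", NS)
        style = style_node.get(f"{{{W}}}val") if style_node is not None else None
        # Concatenate every text run found anywhere below this element.
        text = "".join(t.text or "" for t in child.iter(f"{{{W}}}t"))
        yield tag, style, text


elements = list(iter_body_elements(SAMPLE_DOCUMENT_XML))
```

Each yielded tuple corresponds to one position of the document pointer; a real converter would read the XML from the .docx package rather than from a string.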
The invention has the beneficial effects that:
The invention parses the WORD document, analyzes the relationship between the WORD document and the S1000D description class data module, establishes the mapping relationship between the elements of interest identified in the WORD document and the information elements of the S1000D description class data module, and automatically maps those elements and their content into the S1000D description class data module, thereby automatically generating description class data conforming to the S1000D standard. Because the invention parses the WORD document as full text, it is intended only for automatically generating description class data modules conforming to the S1000D standard; it does not cover automatic generation of other classes of data module, such as procedure classes or fault classes.
Drawings
FIG. 1 is a flow chart of a method for generating a WORD document into a description class data module conforming to the S1000D standard according to the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawing.
As shown in FIG. 1, the method for generating a description class data module conforming to the S1000D standard from a WORD document according to the present invention includes the following steps:
Step 1, determining WS rule elements, where a WS rule is a mapping rule between the elements of interest identified in a WORD document and the S1000D description class data module.
Step 1 specifically comprises the following steps:
step 1.1, analyzing the general structure of a WORD document;
step 1.2, identifying the elements of interest in the WORD document structure; these include, but are not limited to, title elements at all levels, table elements, figure elements, body elements, paragraph elements, letter number elements, and number elements;
step 1.3, determining the elements of interest in the WORD document structure as the WS rule elements.
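The identification in step 1.2 can be sketched as a small classifier. The patent does not state how elements are recognized, so every detail here is an assumption made for illustration: the `RawBlock` record, the use of a "Heading" style prefix to detect titles, and the regular expressions for letter-numbered and digit-numbered items are all hypothetical. Body and paragraph elements are folded into one kind because both map to the same target element.

```python
import re
from dataclasses import dataclass
from enum import Enum


class WSElement(Enum):
    """WS rule element kinds named in step 1.2."""
    TITLE = "title"
    TABLE = "table"
    FIGURE = "figure"
    PARAGRAPH = "paragraph"        # body text and paragraphs, folded together
    LETTER_NUMBER = "letter number"
    NUMBER = "number"


@dataclass
class RawBlock:
    """Hypothetical description of one block read from a WORD document."""
    style: str = ""
    text: str = ""
    is_table: bool = False
    has_picture: bool = False


# Assumed labelling conventions: "a) text" or "a. text", and "1) text" or "1. text".
LETTER_ITEM_RE = re.compile(r"^[A-Za-z][.)]\s+")
NUMBER_ITEM_RE = re.compile(r"^\d+[.)]\s+")


def classify(block):
    """Return the WS rule element kind for one block."""
    if block.is_table:
        return WSElement.TABLE
    if block.has_picture:
        return WSElement.FIGURE
    if block.style.lower().startswith("heading"):
        return WSElement.TITLE
    if LETTER_ITEM_RE.match(block.text):
        return WSElement.LETTER_NUMBER
    if NUMBER_ITEM_RE.match(block.text):
        return WSElement.NUMBER
    return WSElement.PARAGRAPH
```

For example, `classify(RawBlock(style="Heading 2", text="Pump"))` yields `WSElement.TITLE`, while a plain sentence falls through to `WSElement.PARAGRAPH`.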
Step 2, establishing a WS rule mapping relationship and conversion rules. The WS rule mapping relationship is the mapping between the determined WS rule elements and the information elements of the description class data module conforming to the S1000D standard; a conversion rule specifies how rule elements are converted according to their types, the organizational structure among them, and the mapping relationship.
The mapping relationship includes, but is not limited to, the mapping between title elements at each level and the title element defined by the S1000D standard, the mapping between table elements and the table element defined by the S1000D standard, the mapping between figure elements and the figure element defined by the S1000D standard, the mapping between body elements and the body element defined by the S1000D standard, the mapping between paragraph elements and the paragraph element defined by the S1000D standard, the mapping between letter number elements and the letter number element defined by the S1000D standard, and the mapping between number elements and the number element defined by the S1000D standard.
The conversion rules include a title element conversion rule, a table element conversion rule, a figure element conversion rule, a body element conversion rule, a paragraph element conversion rule, a letter number element conversion rule, and a number element conversion rule.
In this step, the mapping relationships are established as follows:
The mapping between title elements at each level and the title element defined by the S1000D standard is established as:
<levelledPara>
<title>title name</title>
</levelledPara>;
The mapping between table elements and the table element defined by the S1000D standard is established as:
<table>table</table>;
The mapping between figure elements and the figure element defined by the S1000D standard is established as:
<figure>figure</figure>;
The mapping between body elements and the body element defined by the S1000D standard is established as:
<para>text content</para>;
The mapping between paragraph elements and the paragraph element defined by the S1000D standard is established as:
<para>paragraph content</para>;
The mapping between letter number elements and the letter number element defined by the S1000D standard is established as:
<listItem>numeric value and content</listItem>,
wherein the letter number value is converted to a numeric value;
The mapping between number elements and the number element defined by the S1000D standard is established as:
<listItem>numeric value and content</listItem>.
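The conversion of a letter number value to a numeric value, mentioned for letter number elements above, might look like the following sketch. The patent does not say how labels longer than one letter are handled; treating them like spreadsheet column names ("aa" follows "z") is an assumption, and the function name `letter_label_to_number` is hypothetical.

```python
def letter_label_to_number(label):
    """Convert a letter label such as "a", "B" or "aa" to its ordinal number.

    Single letters map to 1..26. Multi-letter labels are read as bijective
    base-26 numerals, which is an assumption of this sketch.
    """
    value = 0
    for ch in label.strip().lower():
        if not "a" <= ch <= "z":
            raise ValueError(f"not a letter label: {label!r}")
        value = value * 26 + (ord(ch) - ord("a") + 1)
    if value == 0:
        raise ValueError("empty letter label")
    return value
```

With this rule a list item labelled "c" would be written with the numeric value 3.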
The above mapping relationships are summarized in the following table:
WS rule mapping table
[Table images BSA0000210092960000061 and BSA0000210092960000071 (WS rule mapping table) are not reproduced in this text.]
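In software, the mapping relationships listed in step 2 could be held in a simple lookup table. The sketch below only restates the element names given in the text; the dictionary keys, the name `WS_RULE_MAP` and the helper `target_elements` are hypothetical, and the original table image may contain details that are not captured here.

```python
# WS rule element kind -> target element(s) in the description class data module,
# outermost element first. The element names are those quoted in step 2.
WS_RULE_MAP = {
    "title": ("levelledPara", "title"),
    "table": ("table",),
    "figure": ("figure",),
    "body": ("para",),
    "paragraph": ("para",),
    "letter_number": ("listItem",),   # label is converted to a numeric value first
    "number": ("listItem",),
}


def target_elements(ws_element):
    """Return the target element names for a WS rule element kind."""
    try:
        return WS_RULE_MAP[ws_element]
    except KeyError:
        raise ValueError(f"no WS rule for element kind {ws_element!r}") from None
```

Keeping the rules in data rather than in branching code makes it straightforward to add further element kinds, which the text allows with its "includes, but is not limited to" wording.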
Step 3, parsing and identifying the WORD document to be converted, converting the identified elements and content according to the WS rule mapping relationship and the conversion rules, and automatically generating a description class data module file conforming to the S1000D standard.
Step 3 comprises the following steps:
step 3.1, acquiring the WORD document material to be converted, specifically: opening the WORD document material to be converted, obtaining a Document object, and positioning a pointer at the start of the WORD document;
step 3.2, parsing the WORD document material, traversing it in a loop, and automatically converting the identified elements and content according to the mapping relationship and the conversion rules to generate a description class data module file conforming to the S1000D standard.
Step 3.2 comprises the following specific steps:
step 3.2.1, obtaining a WORD document element object, parsing and identifying it, converting it according to the identification result with reference to the mapping relationship and the conversion rules, and adding the converted content data to the description class data module;
step 3.2.2, moving the WORD document pointer down and repeating step 3.2.1 until the pointer reaches the end of the WORD document;
step 3.2.3, closing the WORD document and saving the description class data module file.
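The traversal in steps 3.2.1 to 3.2.3 can be sketched as a loop over already identified blocks. This is an illustration under several assumptions, not the patented implementation: the input is a hypothetical list of `(kind, text)` pairs, the `description` wrapper element is assumed, every title opens a flat `levelledPara` regardless of its level, and the output is not validated against any S1000D schema.

```python
import xml.etree.ElementTree as ET


def convert(blocks):
    """Convert (kind, text) pairs into an XML string for the module content."""
    root = ET.Element("description")     # assumed wrapper element
    current = root                       # element that receives new content
    for kind, text in blocks:            # steps 3.2.1 and 3.2.2: one block per pass
        if kind == "title":
            current = ET.SubElement(root, "levelledPara")
            ET.SubElement(current, "title").text = text
        elif kind in ("body", "paragraph"):
            ET.SubElement(current, "para").text = text
        elif kind in ("letter_number", "number"):
            ET.SubElement(current, "listItem").text = text
        elif kind == "table":
            ET.SubElement(current, "table").text = text
        elif kind == "figure":
            ET.SubElement(current, "figure").text = text
        else:
            raise ValueError(f"unknown element kind: {kind!r}")
    # Step 3.2.3 would write this string to the data module file.
    return ET.tostring(root, encoding="unicode")


xml_text = convert([
    ("title", "Overview"),
    ("paragraph", "The pump moves coolant."),
    ("number", "1 Check the seal."),
])
```

A fuller converter would also nest `levelledPara` elements by title level and emit the identification and status section that a complete data module requires.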
The above embodiments are only preferred embodiments of the present invention and do not limit its technical solutions; any technical solution that can be realized on the basis of the above embodiments without creative effort shall be considered to fall within the protection scope of this patent.

Claims (7)

1. A method for generating a description class data module conforming to the S1000D standard from a WORD document, characterized by comprising the following steps:
step 1, determining WS rule elements, where a WS rule is a mapping rule between the elements of interest identified in a WORD document and the S1000D description class data module;
step 2, establishing a WS rule mapping relationship and conversion rules, where the WS rule mapping relationship is the mapping between the determined WS rule elements and the information elements of the description class data module conforming to the S1000D standard, and a conversion rule specifies how rule elements are converted according to their types, the organizational structure among them, and the mapping relationship;
step 3, parsing and identifying the WORD document to be converted, converting the identified elements and content according to the WS rule mapping relationship and the conversion rules, and automatically generating a description class data module file conforming to the S1000D standard.
2. The method for generating a description class data module conforming to the S1000D standard from a WORD document as recited in claim 1, wherein in step 1, determining the WS rule elements includes the following steps:
step 1.1, analyzing the general structure of a WORD document;
step 1.2, identifying the elements of interest in the WORD document structure;
step 1.3, determining the elements of interest in the WORD document structure as the WS rule elements.
3. The method for generating a description class data module conforming to the S1000D standard from a WORD document as recited in claim 2, wherein in step 1.2 and step 1.3, the elements of interest in the WORD document structure include, but are not limited to, title elements at all levels, table elements, figure elements, body elements, paragraph elements, letter number elements, and number elements.
4. The method for generating a description class data module conforming to the S1000D standard from a WORD document as claimed in claim 1, 2 or 3, wherein the mapping relationship in step 2 includes, but is not limited to, the mapping between title elements at each level and the title element defined by the S1000D standard, the mapping between table elements and the table element defined by the S1000D standard, the mapping between figure elements and the figure element defined by the S1000D standard, the mapping between body elements and the body element defined by the S1000D standard, the mapping between paragraph elements and the paragraph element defined by the S1000D standard, the mapping between letter number elements and the letter number element defined by the S1000D standard, and the mapping between number elements and the number element defined by the S1000D standard.
5. The method for generating a description class data module conforming to the S1000D standard from a WORD document as claimed in claim 1, 2 or 3, wherein the conversion rules in step 2 include a title element conversion rule, a table element conversion rule, a figure element conversion rule, a body element conversion rule, a paragraph element conversion rule, a letter number element conversion rule, and a number element conversion rule.
6. The method for generating a description class data module conforming to the S1000D standard from a WORD document as claimed in claim 1, 2 or 3, wherein step 3 comprises the following steps:
step 3.1, acquiring the WORD document material to be converted;
step 3.2, parsing the WORD document material, traversing it in a loop, and automatically converting the identified elements and content according to the mapping relationship and the conversion rules to generate a description class data module file conforming to the S1000D standard.
7. The method for generating a description class data module conforming to the S1000D standard from a WORD document as recited in claim 6, wherein step 3.1 is performed as follows: opening the WORD document material to be converted, obtaining a Document object, and positioning a pointer at the start of the WORD document; and step 3.2 comprises the following steps:
step 3.2.1, obtaining a WORD document element object, parsing and identifying it, converting it according to the identification result with reference to the mapping relationship and the conversion rules, and adding the converted content data to the description class data module;
step 3.2.2, moving the WORD document pointer down and repeating step 3.2.1 until the pointer reaches the end of the WORD document;
step 3.2.3, closing the WORD document and saving the description class data module file.
CN202010481742.6A 2020-05-29 2020-05-29 Method for generating WORD document into description class data module conforming to S1000D standard Pending CN111666747A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010481742.6A CN111666747A (en) 2020-05-29 2020-05-29 Method for generating WORD document into description class data module conforming to S1000D standard

Publications (1)

Publication Number Publication Date
CN111666747A (en) 2020-09-15

Family

ID=72385395

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010481742.6A Pending CN111666747A (en) 2020-05-29 2020-05-29 Method for generating WORD document into description class data module conforming to S1000D standard

Country Status (1)

Country Link
CN (1) CN111666747A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101055578A (en) * 2006-04-12 2007-10-17 龙搜(北京)科技有限公司 File content dredger based on rule
CN105786921A (en) * 2014-12-26 2016-07-20 北京航天测控技术有限公司 Data module conversion method and device for non-structured document
CN110990636A (en) * 2019-12-18 2020-04-10 哈尔滨工程大学 Intelligent data module acquisition and conversion method for diesel engine interactive electronic technical manual

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
王友刚: "WordML 在线性数字化技术手册中的应用", 《设计与应用》 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112559837A (en) * 2021-01-05 2021-03-26 广州华资软件技术有限公司 Business electronic file development method
CN112699641A (en) * 2021-03-25 2021-04-23 南京国睿信维软件有限公司 Method for quickly converting batch copy of WORD content to DM based on S1000D standard
CN112699641B (en) * 2021-03-25 2021-07-20 南京国睿信维软件有限公司 Method for quickly converting batch copy of WORD content to DM based on S1000D standard
CN115688690A (en) * 2022-11-16 2023-02-03 金航数码科技有限责任公司 Dynamic conversion method for converting Word document content into XML fragment conforming to S1000D standard
CN115688690B (en) * 2022-11-16 2023-10-03 金航数码科技有限责任公司 Dynamic conversion method for converting Word document content into XML fragment conforming to S1000D standard

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20200915)