CN112364830A

CN112364830A - Method for inputting user examination questionnaire based on word document

Info

Publication number: CN112364830A
Application number: CN202011381075.0A
Authority: CN
Inventors: 雷丽平
Original assignee: Changsha Ranxing Information Technology Co ltd
Current assignee: Changsha Ranxing Information Technology Co ltd
Priority date: 2020-11-30
Filing date: 2020-11-30
Publication date: 2021-02-12

Abstract

The invention belongs to the technical field of questionnaire entry, and in particular relates to a method for entering a user examination questionnaire based on a Word document, comprising the following steps: S1, separating the examination title from the document; S2, identifying the question types; S3, identifying the options, answers, answer analyses and scores of the questions; S4, identifying and storing the pictures and formulas in the document; S5, storing the document as an XML file that the system can read. The invention greatly reduces the user's workload of re-entering questions and the complexity of setting up question content, and identifies and enters an examination questionnaire more efficiently, accurately and comprehensively; the recognition tolerates a wide range of document formats, covers them broadly, and gives a good user experience.

Description

Method for inputting user examination questionnaire based on word document

Technical Field

The invention belongs to the technical field of questionnaire entry, and in particular relates to a method for entering a user examination questionnaire based on a Word document.

Background

Before organizing an examination, users such as companies usually already have the examination content in a Word document and must enter it into an examination or questionnaire system. To reduce the user's workload of re-entering question content and the complexity of setting it up, this scheme lets the user upload the existing Word document directly. The system reads the content of the Word document, identifies the questions, question types, scores, answers and answer analyses in the examination content, and stores them in a database in the system's own format, where they are used to create an examination paper that can be scored automatically.

Therefore, how to identify and enter an examination questionnaire more efficiently, accurately and comprehensively is an urgent problem for those skilled in the art.

Disclosure of Invention

In order to solve the problems in the prior art, the invention provides a method for inputting a user examination questionnaire based on a word document.

The technical scheme of the invention is realized as follows:

A method for entering a user examination questionnaire based on a Word document comprises the following steps:

S1, separating the examination title from the document;

S2, identifying the question type;

S3, identifying the options, answers, answer analysis and scores of the questions;

S4, identifying and storing the pictures and formulas in the document;

S5, storing the document as an XML file that the system can read.

Further, in step S1 the Word document is read using Aspose.Words, which retains all the formatting present in the Word document. The first non-empty line of the document, Firstlinetxt, is read, and it is judged whether Firstlinetxt begins with a question label; if so, the user is prompted to enter a title, and otherwise Firstlinetxt is returned as the title.

Further, step S2 comprises the following steps:

a. Identifying the question stem content: the first line of a new paragraph is read until a line ends with a line break, and all of it is treated by default as the stem content. To increase accuracy, one more line is read: if the new line is empty, starts with a number or a letter followed by ".", or starts with "answer analysis" or "answer", the stem content is taken to have ended; otherwise reading continues until one of these conditions is met;

b. Identifying the question type: the stem content is read and it is judged whether it contains fixed text that names a question type, such as "single-choice question", "multiple-choice question", "fill-in-the-blank question" or "short-answer question". If so, the question type is determined directly from that text. Otherwise the line after the stem is read and it is judged whether it starts with a number followed by "."; if so, the question is a question-and-answer question. Otherwise it is judged whether the line starts with a letter followed by "."; if so, it is judged whether there is only one correct answer: if there is, the question is a single-choice question, and otherwise it is a question-and-answer question. When the line does not start with a letter followed by ".", it is judged whether the stem contains a fill-in-the-blank mark; if so, it is judged whether there is only one such mark: if there is, the question is a single-blank fill-in question, and if not, a multi-blank fill-in question. When the stem contains no fill-in-the-blank mark, it is judged whether the stem contains a true/false mark; if so, the question is a true/false question, and otherwise it is treated as a question-and-answer question;

c. Identifying the end of a paragraph: the judgment is made according to the type and content of the question. If the question is judged to be a question-and-answer, true/false or fill-in-the-blank question, the next line of the paragraph is read; if the new line is empty or starts with a number followed by ".", it is judged to be the start of a new paragraph. Otherwise it is further judged whether the line contains an answer, an answer analysis, a score or the like; if it does not, it is judged to be a new paragraph; if it does, new lines continue to be read and identified until a new paragraph is identified. If the question is a choice question, a process of identifying the options is added.

Further, step S3 comprises the following steps:

d. Identifying the options of a question: after the question has been judged to be a choice question, the content of its options is identified. After the stem has been identified, the next line is read until a line break is reached, and what was read is treated as one section of content. If the section contains three or more spaces, it is divided into several options using the spaces as separators; otherwise the whole section is taken as the first option. If only one option has been identified, new sections continue to be read until there are more than two options. The content of the options is then read: if the first option starts with A, a or 1 and the second option starts with the corresponding B, b or 2, the corresponding letters are judged to be the option labels of the question; otherwise the options of the question are treated as unlabeled;

e. Identifying the answer to a question: for short-answer questions and document titles there is no answer by default, so no identification is required;

for a true/false question, the stem is read to see whether it contains a mark such as "right", "wrong", "×" or "√"; if so, the answer is identified accordingly. Otherwise the new section after the stem is read to see whether it contains "right", "wrong", "×" or "√"; if so, the answer is identified accordingly, and if not, the answer defaults to correct. For a single-choice question, the correct answer defaults to the first option. For a question with labeled options, the stem is read first: if the stem contains braces or parentheses and they enclose the letter of an option label, that letter is identified as the correct answer. Otherwise the content of the options is read, starting from the first option: if an option contains wording such as "correct answer", that option is identified as the correct answer. Otherwise the new section after the options is read, and if it contains an option label or content identical to an option, that option is identified as the correct answer. For a question with unlabeled options, the answer is identified directly from the options: if an option contains wording such as "correct answer", it is identified as the correct answer; otherwise the new section after the options is read, and if it contains content identical to an option, that option is identified as the correct answer;

for a multiple-choice question, more than one option may be identified as a correct answer;

for a single-blank fill-in question, it is checked whether there is text at the blank mark in the stem, or a field made up of braces and text; if so, that text is the correct answer. Otherwise the text of the new section after the stem is read as the correct answer;

for a multi-blank fill-in question, it is checked whether there is text at the blank marks in the stem, or fields made up of braces and text; if so, that text is the correct answer for the corresponding blank. Otherwise the text of the new section after the stem is read and split on spaces, and the correct answers for the different blanks are separated by "|";

f. Identifying the answer analysis: for choice questions the new section after the options is read, and for other question types the new section after the stem is read; it is judged whether the section contains wording such as "answer analysis" or "analysis". If so, the section is identified as the answer analysis; otherwise new sections continue to be read until the paragraph ends;

g. Identifying the score: the score defaults to 5. If the stem contains braces or parentheses with a number inside them, that number is identified as the score. If no score is found there, a new section is read to see whether it contains wording such as "score"; if so, the characters that follow are read, and if they are a number, that number is identified as the score.

Further, step S4 comprises the following steps:

The content of the Word document is read using Aspose.Words from the .NET class library, which automatically parses the document into a number of Node objects. The system loops over the nodes and checks whether each one is an Aspose.Words.Drawing.DrawingML node (both pictures and special formulas are read as Aspose.Words.Drawing.DrawingML). If it is, the data stream of the picture is uploaded to the cloud and stored as an internal cloud link, and the Aspose.Words.Drawing.DrawingML node is replaced with "<img src='" + link + "'/>".

Further, step S5 comprises the following steps:

First, each node is defined to store an examination question, a paragraph description or a page break. When the data of each paragraph is read, it is stored with js in the following structure:

dataNode._topic // sequence number of the question

dataNode._text // stem of the question

dataNode._isCeShi // whether it is an examination

dataNode._type // question type

dataNode._keyword // keyword in the question

dataNode._hasvalue // whether it has a score

dataNode._ceeshiSeDesc // answer analysis of the question

dataNode._ceshiValue // score of the question

dataNode.… // whether the question is mandatory

dataNode._randomChoice // whether the options are in random order

dataNode._ispandean // whether it is a true/false question

dataNode._select = new Array(); // options of the question

When all questions are uploaded to the server for storage, the structured data is joined into a single string with separator symbols; on the .NET back end the string is separated with split and stored as XML.

The scheme has the following effects:

The scheme greatly reduces the user's workload of re-entering questions and the complexity of setting up question content, and identifies and enters an examination questionnaire more efficiently, accurately and comprehensively; the recognition tolerates a wide range of document formats, covers them broadly, and gives a good user experience.

Drawings

FIG. 1 is a schematic flow chart of separating the examination title from the document in the method for entering a user examination questionnaire based on a Word document according to the present invention;

FIG. 2 is a schematic flow chart of identifying the question type in the method for entering a user examination questionnaire based on a Word document according to the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be described clearly and completely with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.

It should be noted that all directional indications (such as up, down, left, right, front and rear) in the embodiments of the present invention are only used to explain the relative positional relationship, movement and the like of the components in a specific posture (as shown in the drawings); if the specific posture changes, the directional indication changes accordingly.

In addition, the descriptions related to "first", "second", etc. in the present invention are for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In addition, technical solutions between various embodiments may be combined with each other, but must be realized by a person skilled in the art, and when the technical solutions are contradictory or cannot be realized, such a combination should not be considered to exist, and is not within the protection scope of the present invention.

Example 1

As shown in FIG. 1, a method for entering a user examination questionnaire based on a Word document comprises the following steps:

S1, separating the examination title from the document;

S2, identifying the question type;

S3, identifying the options, answers, answer analysis and scores of the questions;

S4, identifying and storing the pictures and formulas in the document;

S5, storing the document as an XML file that the system can read.

For step S1, that is, how to separate the examination title from documents of different types: when the document is read with Aspose.Words, all the formatting present in the Word document, such as spaces, line breaks and tables, is retained.

Existing documents generally put the examination title on the first line, but a Word document with no title cannot be ruled out, so the system's identification flow is as follows: read the first non-empty line of the document, Firstlinetxt, and judge whether Firstlinetxt begins with a question label; if so, prompt the user to enter a title, and otherwise return Firstlinetxt as the title (as shown in FIG. 1).
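The title-separation flow of step S1 can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the implementation of the invention: the function name separate_exam_title, the regular expression used to recognise a question label (a leading number or letter followed by a period-like separator) and the treatment of an empty document are all assumptions that do not appear in the text.

```python
import re

# Assumption: a question label is a leading number or letter followed by a
# period-like separator, e.g. "1." or "A.". The text does not define it.
QUESTION_LABEL = re.compile(r"^\s*(\d+|[A-Za-z])\s*[.．、]")


def separate_exam_title(lines):
    """Return (title, needs_user_input) for the lines of a document."""
    # The first non-empty line is what the text calls Firstlinetxt.
    first_line_txt = next((ln.strip() for ln in lines if ln.strip()), "")
    if not first_line_txt or QUESTION_LABEL.match(first_line_txt):
        # The document starts directly with a question (or is empty, an
        # assumed extra case), so the user has to be asked for a title.
        return None, True
    return first_line_txt, False
```

A caller would prompt the user for a title whenever the second element of the returned pair is true.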

For step S2, the examination content includes question content, paragraph descriptions, page breaks and so on, so the system has to distinguish what type of content a line holds and at which line a paragraph ends. To meet users' needs, the following types are provided: single-choice questions, multiple-choice questions, true/false questions, single-blank fill-in questions, multi-blank fill-in questions, short-answer questions, document titles, paragraph descriptions and the like. Identifying from the document content which type a paragraph belongs to, and at which line the paragraph ends, is the first step of identification. The specific identification steps are as follows:

a. Identifying the question stem content. The first line of a new paragraph is read until a line ends with a line break, and this content is treated by default as the stem content. Because some documents contain special cases, one more line is read: if the new line is empty, starts with a number or a letter followed by ".", or starts with "answer analysis", "answer" or the like, the system takes the stem content to have ended; otherwise reading continues until one of these conditions is met.

b. Identifying the question type (as shown in FIG. 2). It is judged whether the stem content contains fixed text that names a question type, such as "single-choice question", "multiple-choice question", "fill-in-the-blank question" or "short-answer question". If so, the question type is determined directly from that text. Otherwise the line after the stem is read and it is judged whether it starts with a number followed by "."; if so, the question is a question-and-answer question. Otherwise it is judged whether the line starts with a letter followed by "."; if so, it is judged whether there is only one correct answer: if there is, the question is a single-choice question, and otherwise it is a question-and-answer question. When the line does not start with a letter followed by ".", it is judged whether the stem contains a fill-in-the-blank mark; if so, it is judged whether there is only one such mark: if there is, the question is a single-blank fill-in question, and if not, a multi-blank fill-in question. When the stem contains no fill-in-the-blank mark, it is judged whether the stem contains a true/false mark; if so, the question is a true/false question, and otherwise it is treated as a question-and-answer question.

c. Identifying the end of a paragraph. Each paragraph is one question of the examination paper, so accurately identifying the content of each paragraph is also key to identifying the questions. The judgment is made mainly according to the type and content of the question. If the stem indicates a question-and-answer, true/false or fill-in-the-blank question, the next line of the paragraph is read; if the new line is empty or starts with a number followed by ".", it is judged to be the start of a new paragraph. Otherwise it is further judged whether the line contains an answer, an answer analysis, a score or the like; if it does not, it is judged to be a new paragraph; if it does, new lines continue to be read and identified until the content of a new paragraph is identified. If the question is a choice question (single-choice or multiple-choice), a process of identifying the options is added.
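The type-identification flow of step b can be sketched roughly as below. It is a simplified sketch under assumptions, not the flow of FIG. 2 itself: the English keyword table, the blank mark (three or more underscores), the true/false marks and the final fallback to a question-and-answer question are assumptions, and the sketch returns a generic "choice" type for lettered lines, leaving the count of correct answers to the later answer-identification step.

```python
import re

# Hypothetical English stand-ins for the fixed type-naming text.
TYPE_KEYWORDS = {
    "single-choice": "single_choice",
    "multiple-choice": "multiple_choice",
    "fill-in-the-blank": "fill_blank",
    "short-answer": "short_answer",
}
BLANK_MARK = re.compile(r"_{3,}")                   # assumed blank mark
TRUE_FALSE_MARK = re.compile(r"[√×]|[(（]\s*[)）]")  # assumed true/false marks


def classify(stem, next_line=""):
    """Guess the question type from the stem and the line that follows it."""
    lowered = stem.lower()
    for keyword, qtype in TYPE_KEYWORDS.items():
        if keyword in lowered:
            return qtype                 # the type is named in the stem
    if re.match(r"\s*\d+\.", next_line):
        return "question_answer"         # next line starts with number + "."
    if re.match(r"\s*[A-Za-z]\.", next_line):
        return "choice"                  # next line starts with letter + "."
    blanks = BLANK_MARK.findall(stem)
    if blanks:
        return "single_blank" if len(blanks) == 1 else "multi_blank"
    if TRUE_FALSE_MARK.search(stem):
        return "true_false"
    return "question_answer"             # assumed fallback
```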

For the step S3:

d. Identifying the options of a question. If the question is judged to be a choice question, the content of its options has to be identified. After the stem has been identified, the next line is read until a line break is reached, and the system treats what was read as one section of content. If the section contains three or more spaces, it is divided into several options using the spaces as separators; otherwise the whole section is taken as the first option. If only one option has been identified, new sections continue to be read until there are more than two options. The content of the options is then read: if the first option starts with A, a or 1 and the second option starts with the corresponding B, b or 2, the corresponding letters are the option labels of the question; otherwise the options of the question are treated as unlabeled. Subsequent options are identified differently depending on whether labels exist. If the options are labeled, then when a new section is read it is judged whether the section starts with the label that follows the previous option in sequence; if so, option identification continues, and otherwise option identification ends. If the options are unlabeled, the end of the options is judged by whether the new section is empty or contains wording such as "answer", "answer analysis" or "correct answer"; if it contains none of these, the section is identified as a new option, and otherwise option identification ends.
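The option splitting and label detection in step d might look like the following sketch. It assumes that "three or more spaces" means a run of three or more consecutive spaces acting as the separator, which the text does not state explicitly; the function names are illustrative only.

```python
import re

# Assumed pairs of leading characters that mark labeled options.
LABEL_PAIRS = {("A", "B"), ("a", "b"), ("1", "2")}


def split_options(section):
    """Split one section of text into options.

    Assumption: a run of three or more consecutive spaces separates options.
    A section without such a run is returned whole as a single option.
    """
    return [p.strip() for p in re.split(r" {3,}", section.strip()) if p.strip()]


def has_option_labels(options):
    """True if the first two options start with A/B, a/b or 1/2."""
    if len(options) < 2:
        return False
    return (options[0][:1], options[1][:1]) in LABEL_PAIRS
```

For example, a section such as "A. Apple   B. Pear   C. Fig" would be split into three labeled options, while a section without a long run of spaces stays a single option and further sections would be read.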

e. Identifying the answer to a question. Questions of different types generally lay out their answers differently, so the answer is identified according to the question type.

Short-answer questions and document titles: by default there is no answer, so no identification is required.

True/false questions: the stem is read to see whether it contains a mark such as "right", "wrong", "×" or "√"; if so, the answer is identified accordingly. Otherwise the new section after the stem is read to see whether it contains "right", "wrong", "×" or "√"; if so, the answer is identified accordingly, and if not, the answer defaults to correct.
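A minimal sketch of the true/false answer lookup described above is given here. The mark lists are assumptions (a check mark or the word "right" for true, a cross or the word "wrong" for false), and plain substring matching is used only to keep the example short.

```python
TRUE_MARKS = ("√", "right")   # assumed marks for a correct statement
FALSE_MARKS = ("×", "wrong")  # assumed marks for an incorrect statement


def find_mark(text):
    """Return True or False if the text carries a mark, otherwise None."""
    if any(mark in text for mark in TRUE_MARKS):
        return True
    if any(mark in text for mark in FALSE_MARKS):
        return False
    return None


def true_false_answer(stem, next_section=""):
    """Look in the stem first, then in the section after it."""
    for text in (stem, next_section):
        mark = find_mark(text)
        if mark is not None:
            return mark
    return True  # default stated in the text: the answer is "correct"
```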

Single-choice questions: to prevent the whole recognition from failing, the correct answer defaults to the first option.

For a question with labeled options, the stem is read first: if the stem contains braces or parentheses and they enclose the letter of an option label, that letter is identified as the correct answer. Otherwise the content of the options is read, starting from the first option: if an option contains wording such as "correct answer", that option is identified as the correct answer. Otherwise the new section after the options is read, and if it contains an option label or content identical to an option, that option is identified as the correct answer. For a question with unlabeled options, the answer is identified directly from the options: if an option contains wording such as "correct answer", it is identified as the correct answer; otherwise the new section after the options is read, and if it contains content identical to an option, that option is identified as the correct answer.
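For a single-choice question with labeled options, the lookup order described above could be sketched as follows. The sketch covers only the bracketed letter in the stem, the "correct answer" wording inside an option, and the default to the first option; the check of the section after the options is omitted for brevity. The function name and the (label, text) representation of options are assumptions.

```python
import re

# A single letter enclosed in parentheses (half- or full-width) or braces.
BRACKETED_LETTER = re.compile(r"[(（{]\s*([A-Za-z])\s*[)）}]")


def single_choice_answer(stem, options):
    """options: list of (label, text) pairs, e.g. [("A", "Apple"), ...]."""
    labels = [label.upper() for label, _ in options]
    match = BRACKETED_LETTER.search(stem)
    if match and match.group(1).upper() in labels:
        return match.group(1).upper()      # letter found inside brackets
    for label, text in options:
        if "correct answer" in text.lower():
            return label.upper()           # option marked as the answer
    return labels[0] if labels else None   # default: the first option
```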

Multiple-choice questions: the logic is the same as for single-choice questions, except that more than one correct answer may be identified.

Single-blank fill-in questions: it is checked whether there is text at the blank mark in the stem, or a field made up of braces and text; if so, that text is the correct answer. Otherwise the text of the new section after the stem is read as the correct answer.

Multi-blank fill-in questions: it is checked whether there is text at the blank marks in the stem, or fields made up of braces and text; if so, that text is the correct answer for the corresponding blank. Otherwise the text of the new section after the stem is read and split on spaces, and the correct answers for the different blanks are separated by "|".
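The multi-blank case can be illustrated with the short sketch below. It assumes that inline answers are written as text enclosed in braces inside the stem and that the fallback section lists one answer per blank separated by spaces; the function name is illustrative.

```python
import re

BRACED_TEXT = re.compile(r"\{([^{}]+)\}")  # text enclosed in braces


def multi_blank_answers(stem, next_section=""):
    """Return the answers for all blanks joined with "|"."""
    inline = [answer.strip() for answer in BRACED_TEXT.findall(stem)]
    if inline:
        answers = inline                 # answers written inside the stem
    else:
        answers = next_section.split()   # fall back to the next section
    return "|".join(answers)
```

An empty string is returned when neither the stem nor the following section yields an answer.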

f. Identifying the answer analysis. An answer analysis is not required in a question, so it defaults to empty. For choice questions the new section after the options is read, and for other question types the new section after the stem is read; it is judged whether the section contains wording such as "answer analysis" or "analysis". If so, the section is identified as the answer analysis; otherwise new sections continue to be read until the paragraph ends.

g. Identifying the score of a question. The score of a question defaults to 5. If the stem contains braces or parentheses with a number inside them, that number is identified as the score. If no score is found there, a new section is read to see whether it contains wording such as "score"; if so, the characters that follow are read, and if they are a number, that number is identified as the score.
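Score identification in step g can be sketched as below. The regular expressions are assumptions: the first accepts a number, optionally followed by the word "points", inside parentheses or braces in the stem, and the second accepts the word "score" followed by a number in the next section.

```python
import re

DEFAULT_SCORE = 5.0  # default score stated in the text

# Assumed patterns; the text does not give exact formats.
BRACKETED_NUMBER = re.compile(
    r"[(（{]\s*(\d+(?:\.\d+)?)\s*(?:points?)?\s*[)）}]")
SCORE_LINE = re.compile(r"score\s*[:：]?\s*(\d+(?:\.\d+)?)", re.IGNORECASE)


def extract_score(stem, next_section=""):
    """Return the score found in the stem, else in the next section."""
    match = BRACKETED_NUMBER.search(stem)
    if match:
        return float(match.group(1))
    match = SCORE_LINE.search(next_section)
    if match:
        return float(match.group(1))
    return DEFAULT_SCORE
```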

For the step S4:

When the content of the Word document is read with Aspose.Words from the .NET class library, the document is automatically parsed into a number of Node objects. The system loops over the nodes and checks whether each one is an Aspose.Words.Drawing.DrawingML node (both pictures and special formulas are read as Aspose.Words.Drawing.DrawingML). If it is, the data stream of the picture is uploaded to Alibaba Cloud and stored as an internal Alibaba Cloud link, and the Aspose.Words.Drawing.DrawingML node is replaced with "<img src='" + link + "'/>".
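The node loop of step S4 can be illustrated without Aspose.Words by the following Python sketch. It does not use the Aspose API at all: document nodes are modelled as plain dictionaries, and the upload function is a hypothetical callable supplied by the caller that returns the cloud link for a picture's bytes.

```python
def replace_drawings(nodes, upload):
    """Replace every drawing node with an <img> tag pointing at its link.

    nodes:  list of dicts, either {"kind": "text", "text": str} or
            {"kind": "drawing", "data": bytes} (an assumed stand-in for the
            nodes produced by the document reader).
    upload: hypothetical callable that stores the bytes and returns a URL.
    """
    parts = []
    for node in nodes:
        if node["kind"] == "drawing":
            link = upload(node["data"])           # store picture, get link
            parts.append("<img src='" + link + "'/>")
        else:
            parts.append(node["text"])
    return "".join(parts)
```

In a test, upload can be a stub that returns a fixed address, so that the replacement logic is exercised without any network access.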

For the step S5:

To make the stored data easier to read, the system defines each node to store an examination question, a paragraph description or a page break. When the data of each paragraph is read, it is stored with js in the following structure:

dataNode._topic // sequence number of the question

dataNode._text // stem of the question

dataNode._isCeShi // whether it is an examination

dataNode._type // question type

dataNode._keyword // keyword in the question

dataNode._hasvalue // whether it has a score

dataNode._ceeshiSeDesc // answer analysis of the question

dataNode._ceshiValue // score of the question

dataNode.… // whether the question is mandatory

dataNode._randomChoice // whether the options are in random order

dataNode._ispandean // whether it is a true/false question

dataNode._select = new Array(); // options of the question

When all questions are uploaded to the server for storage, the structured data is joined into a single string with special separator symbols; on the .NET back end the string is separated with split and stored as XML.
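The join-then-split round trip of step S5 can be sketched in Python as follows. The separator characters, the subset of fields and the XML element names are hypothetical, since the text only says that special symbols are used; the sketch also assumes that field values never contain the separators.

```python
import xml.etree.ElementTree as ET

FIELD_SEP = "§"      # hypothetical separator between fields
QUESTION_SEP = "¤"   # hypothetical separator between questions
FIELDS = ["topic", "type", "text", "ceshiValue"]  # illustrative subset


def pack(questions):
    """Client side: join the structured data into one string."""
    return QUESTION_SEP.join(
        FIELD_SEP.join(str(q[name]) for name in FIELDS) for q in questions
    )


def unpack_to_xml(packed):
    """Server side: split the string again and store it as XML text."""
    root = ET.Element("exam")
    for record in packed.split(QUESTION_SEP):
        node = ET.SubElement(root, "question")
        for name, value in zip(FIELDS, record.split(FIELD_SEP)):
            ET.SubElement(node, name).text = value
    return ET.tostring(root, encoding="unicode")
```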

The scheme greatly reduces the user's workload of re-entering questions and the complexity of setting up question content, and identifies and enters an examination questionnaire more efficiently, accurately and comprehensively; the recognition tolerates a wide range of document formats, covers them broadly, and gives a good user experience.

Finally, the above embodiments only illustrate the technical solutions of the present invention and do not limit them. Although the present invention has been described in detail with reference to the preferred embodiments, those skilled in the art should understand that modifications or equivalent substitutions may be made to the technical solutions of the present invention without departing from their spirit and scope, and all such modifications and substitutions shall be covered by the claims of the present invention.

Claims

1. A method for entering a user examination questionnaire based on a Word document, characterized by comprising the following steps:

S1, separating the examination title from the document;

S2, identifying the question type;

S3, identifying the options, answers, answer analysis and scores of the questions;

S4, identifying and storing the pictures and formulas in the document;

S5, storing the document as an XML file that the system can read.

2. The method for entering a user examination questionnaire based on a Word document according to claim 1, wherein in step S1 the document is read using Aspose.Words; the first non-empty line of the document, Firstlinetxt, is read, and it is judged whether Firstlinetxt begins with a question label; if so, the user is prompted to enter a title, and otherwise Firstlinetxt is returned as the title.

3. The method for entering a user examination questionnaire based on word document according to claim 2, wherein the step of S2 further comprises the steps of:

4. A method for entering a user examination questionnaire based on a word document, wherein the step of S3 comprises the following steps:

5. The method for entering a user examination questionnaire based on word document according to claim 4, wherein the step of S4 further comprises the steps of:

6. The method for entering a user examination questionnaire based on word document according to claim 5, wherein the step of S5 further comprises the steps of:

first, each node is defined to store an examination question, a paragraph description or a page break; when the data of each paragraph is read, it is stored in a structured form with js; when all questions are uploaded to the server for storage, the structured data is joined into a single string with separator symbols, and on the .NET back end the string is separated with split and stored as XML.