CN112765307A - Test paper test question splitting tool based on machine learning algorithm and splitting and extracting method thereof - Google Patents

Info

Publication number
CN112765307A
Authority
CN
China
Prior art keywords
test
question
test question
splitting
paper
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110036117.5A
Other languages
Chinese (zh)
Inventor
陈麟
许青
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xuzhou Jinlin Artificial Intelligence Technology Co ltd
Original Assignee
Xuzhou Jinlin Artificial Intelligence Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xuzhou Jinlin Artificial Intelligence Technology Co ltd
Priority to CN202110036117.5A
Publication of CN112765307A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G06F16/313 Selection or weighting of terms for indexing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/338 Presentation of query results
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/20 Education
    • G06Q50/205 Education administration or guidance

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Business, Economics & Management (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Educational Administration (AREA)
  • Tourism & Hospitality (AREA)
  • Strategic Management (AREA)
  • Educational Technology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

The invention discloses a test paper question splitting tool based on a machine learning algorithm, together with a splitting and extraction method thereof. The splitting tool comprises: a login verification system, which verifies the user's IP address and key and authorizes login to the system; a central processing unit, which performs logic processing on the test question data; a test question bank, which stores the test question data by category, according to the number, stem type and score type of the test questions; a test question text analysis module, which reads the test paper file and parses the text in the test paper; and a test question splitting module, which separates out information such as the question serial number, score, stem, options, answer and analysis. The invention saves the labor involved in collecting test questions, reduces simple repetitive work, and greatly improves the efficiency of collecting, splitting and recombining test question resources.

Description

Test paper test question splitting tool based on machine learning algorithm and splitting and extracting method thereof
Technical Field
The invention relates to the technical field of applying machine learning algorithms to test question splitting, and in particular to a test paper question splitting tool based on a machine learning algorithm and a splitting and extraction method thereof.
Background
Machine learning is a branch of artificial intelligence that learns the general rules behind data from large volumes of data and powerful computing capacity. Online education is network-based education that uses information technology and internet technology to distribute content and support rapid learning; representative online education platforms include internet cloud classes, cool learning and the like, and representative online education mobile applications include work sides, simian tutoring, subject shooting and the like.
However, collecting online education and teaching resources (such as a question bank) still relies on a large amount of simple, repetitive manual labor, which is inefficient and error-prone. Existing approaches therefore do not meet current needs, and a test paper question splitting tool based on a machine learning algorithm and a splitting and extraction method thereof are proposed.
Disclosure of Invention
The invention aims to provide a test paper question splitting tool based on a machine learning algorithm and a splitting and extraction method thereof, so as to solve the problems raised in the Background: collecting online education and teaching resources (such as a question bank) still requires a large amount of simple, repetitive manual labor, which is inefficient and error-prone.
To achieve this purpose, the invention provides the following technical solution: a test paper question splitting tool based on a machine learning algorithm, the splitting tool comprising:
a login verification system: verifies the user's IP address and key and authorizes login to the system;
a central processing unit: performs logic processing on the test question data;
a test question bank: stores the test question data by category, according to the number, stem type and score type of the test questions;
a test question text analysis module: reads the test paper file and parses the text in the test paper;
a test question splitting module: separates out information such as the question serial number, score, stem, options, answer and analysis;
a test question formatting generation module: recombines the test question information separated by the test question text analysis module, formats and rearranges it according to a preset rule, and enters it into the question bank;
a test question extraction module: extracts the recombined test questions stored in the question bank to the terminal device;
a cloud test question updating module: periodically updates the test question bank with test question information uploaded to the cloud;
a test question similarity comparison module: compares the recombined test questions extracted in the same batch;
a terminal: the receiving program on the terminal may be a WeChat mini program, a mobile app or a web client.
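The output of the test question splitting module can be illustrated with a minimal sketch. The text does not disclose the rules or model used for splitting, so the hand-written rules below are a rule-based stand-in rather than the machine learning approach itself. The input layout (a leading serial number, a score in parentheses, options lettered A to D, and "Answer:" and "Analysis:" markers) and the name `split_question` are assumptions made for illustration.

```python
import re


def split_question(text):
    """Split one question into its fields using hand-written rules.

    Assumed layout (not specified in the source):
        "<serial>. (<score> points) <stem> A. ... B. ... Answer: ... Analysis: ..."
    """
    # Peel the trailing sections off first: analysis, then answer.
    head, _, analysis = text.partition("Analysis:")
    head, _, answer = head.partition("Answer:")

    match = re.match(r"\s*(\d+)\.\s*\((\d+)\s*points?\)\s*(.*)", head, re.DOTALL)
    if match is None:
        raise ValueError("question does not match the assumed layout")
    serial, score, body = match.groups()

    # Everything before the first option letter is the stem; the remaining
    # pieces alternate between option letter and option text.
    parts = re.split(r"\s*\b([A-D])\.\s*", body)
    stem = parts[0].strip()
    options = {parts[i]: parts[i + 1].strip() for i in range(1, len(parts) - 1, 2)}

    return {
        "serial": int(serial),
        "score": int(score),
        "stem": stem,
        "options": options,
        "answer": answer.strip(),
        "analysis": analysis.strip(),
    }
```

A stem that itself contains a capital letter followed by a full stop would be mis-split by these rules, which is one reason a learned model may be preferred over fixed patterns.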
Preferably, the test question entity recognition algorithm of the test question text analysis module uses word embedding to obtain a semantic vector of the test question text, and a machine learning algorithm analyzes the semantic vector to obtain the test paper information and the test question information.
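The idea of turning text into a semantic vector and then classifying it can be sketched as follows. The text names neither the embedding model nor the classifier, so everything concrete here is an assumption: the two-dimensional toy vectors in `EMBEDDINGS`, the centroids in `CENTROIDS`, and the nearest-centroid rule are hypothetical placeholders for trained embeddings and a trained model.

```python
import math
import re

# Toy 2-dimensional word vectors; a real system would load trained embeddings.
EMBEDDINGS = {
    "what": (1.0, 0.0),
    "which": (1.0, 0.1),
    "calculate": (0.9, 0.2),
    "answer": (0.0, 1.0),
    "correct": (0.1, 1.0),
    "option": (0.2, 0.9),
}

# Hypothetical class centroids, as if learned from labelled lines of text.
CENTROIDS = {"stem": (1.0, 0.1), "answer": (0.1, 1.0)}


def embed(text):
    """Average the vectors of the known words to get one semantic vector."""
    words = re.findall(r"[a-z]+", text.lower())
    vectors = [EMBEDDINGS[w] for w in words if w in EMBEDDINGS]
    if not vectors:
        return (0.0, 0.0)
    return tuple(sum(component) / len(vectors) for component in zip(*vectors))


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    norm = math.hypot(*a) * math.hypot(*b)
    return 0.0 if norm == 0 else sum(x * y for x, y in zip(a, b)) / norm


def classify(text):
    """Label a line of text with the class whose centroid is most similar."""
    vector = embed(text)
    return max(CENTROIDS, key=lambda label: cosine(vector, CENTROIDS[label]))
```

Averaging word vectors is only one simple way to build a sentence-level vector; it ignores word order, which a sequence model would capture.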
Preferably, the test paper information includes the test paper name, year and region, and the test question information includes the question stem, score, serial number, analysis and answer.
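The two groups of fields listed above map naturally onto two record types. The sketch below is one possible layout; the class names, field names and types are assumptions, since the text lists only what information is held, not how it is represented.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class QuestionInfo:
    """Test question information: stem, score, serial number, analysis, answer."""
    serial: int
    score: float
    stem: str
    analysis: str = ""
    answer: str = ""


@dataclass
class PaperInfo:
    """Test paper information: name, year and region, plus its questions."""
    name: str
    year: int
    region: str
    questions: List[QuestionInfo] = field(default_factory=list)

    def total_score(self):
        """Sum of the scores of all questions on the paper."""
        return sum(q.score for q in self.questions)
```

Attaching the question list to the paper record is a design choice made here for convenience; the text does not say whether the two are stored together.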
Preferably, the test paper file may be in docx, doc, pdf, html or a similar format.
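Supporting several input formats usually comes down to dispatching on the file extension. The sketch below assumes that design; the text does not describe how files are read. Only HTML is implemented, using the standard library, and the docx, doc and pdf readers are deliberately left as stubs because they would need third-party libraries. All function names are hypothetical.

```python
from html.parser import HTMLParser
from pathlib import Path


class _TextExtractor(HTMLParser):
    """Collects the non-empty text nodes of an HTML document."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        if data.strip():
            self.chunks.append(data.strip())


def html_to_text(markup):
    """Return the visible text of an HTML string, one text node per line."""
    parser = _TextExtractor()
    parser.feed(markup)
    parser.close()
    return "\n".join(parser.chunks)


def read_html(path):
    return html_to_text(Path(path).read_text(encoding="utf-8"))


def _unsupported(path):
    # Placeholder: docx, doc and pdf need a third-party parser.
    raise NotImplementedError(f"no reader implemented for {Path(path).suffix}")


# Extension-to-reader table; adding a format means adding one entry.
READERS = {
    ".html": read_html,
    ".docx": _unsupported,
    ".doc": _unsupported,
    ".pdf": _unsupported,
}


def read_paper(path):
    """Read a test paper file as plain text, chosen by file extension."""
    suffix = Path(path).suffix.lower()
    if suffix not in READERS:
        raise ValueError(f"unrecognised test paper format: {suffix!r}")
    return READERS[suffix](path)
```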
Preferably, the output test paper format of the test question formatting generation module is editable, and the module rearranges the parsed text according to that format and outputs it according to service requirements.
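An editable output format can be modelled as a text template that is filled in per question. The sketch below uses the standard library's `string.Template`; the default layout, the field names and the ordering by serial number are assumptions, not details given in the text.

```python
from string import Template

# Assumed default layout; a caller can pass any other Template instead.
DEFAULT_FORMAT = Template("$serial. ($score points) $stem\nAnswer: $answer\n")


def format_question(question, layout=DEFAULT_FORMAT):
    """Render one question (a dict of fields) with the given layout."""
    return layout.substitute(question)


def format_paper(questions, layout=DEFAULT_FORMAT):
    """Render all questions, ordered by serial number."""
    ordered = sorted(questions, key=lambda q: q["serial"])
    return "\n".join(format_question(q, layout) for q in ordered)
```

Changing the output format then means supplying a different template, for example `Template("Q$serial: $stem")`, without touching the rendering code.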
Preferably, the test question similarity comparison module compares the differently recombined test question sets within the same batch, and when the similarity exceeds fifty percent, the formatting, generation and recombination of the individual test question set is performed again.
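The text gives the fifty percent threshold but does not define how similarity between two test question sets is measured. The sketch below assumes Jaccard similarity over question identifiers (shared questions divided by all distinct questions), which is one common choice; the function names are hypothetical.

```python
from itertools import combinations

THRESHOLD = 0.5  # the "fifty percent" threshold stated in the text


def jaccard(a, b):
    """Jaccard similarity of two collections of question identifiers."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def sets_to_regenerate(batch, threshold=THRESHOLD):
    """Return the indices of sets that are too similar to an earlier kept set."""
    flagged = set()
    for i, j in combinations(range(len(batch)), 2):
        if i in flagged or j in flagged:
            continue  # a flagged set will be rebuilt, so skip its comparisons
        if jaccard(batch[i], batch[j]) > threshold:
            flagged.add(j)  # keep the earlier set, regenerate the later one
    return sorted(flagged)
```

For example, two sets of four questions that share three of them have similarity 3/5 = 0.6, so the later set would be flagged for regeneration.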
A test paper question splitting and extraction method based on a machine learning algorithm, comprising the following steps:
step one: after logging in to the system, upload an existing test question set file or enter test questions, and run the test question entity recognition algorithm on the existing test questions through the test question text analysis module;
step two: analyze the semantic vector with a machine learning algorithm to obtain the test paper information and the test question information;
step three: split out information such as the question serial number, score, stem, options, answer and analysis;
step four: format the test questions: recombine the test question information separated by the test question text analysis module, then format and rearrange it into the question bank according to a preset rule;
step five: for the multiple test question sets generated in the same batch, compare the similarity between the output sets and screen the recombined sets;
step six: output the recombined test question sets to the terminal once layout is complete.
Preferably, during the similarity comparison, when the similarity between recombined test question sets in the same batch is greater than fifty percent, the formatting, generation and recombination of the individual test question set is performed again, and the test question sets are output to the terminal once the similarity is less than fifty percent.
Preferably, the test question entity recognition algorithm uses word embedding to obtain the semantic vector of the test question text.
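The regenerate-until-dissimilar behaviour described here can be sketched as a loop that redraws a candidate set whenever it is too similar to a set already accepted. Several details are assumptions: sets are drawn at random from the question bank, similarity is Jaccard similarity over question identifiers, a similarity of exactly fifty percent is accepted (the text covers only "greater than" and "less than"), and `max_attempts` is added so the loop always terminates.

```python
import random


def jaccard(a, b):
    """Jaccard similarity of two collections of question identifiers."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a or b) else 0.0


def generate_batch(bank_ids, set_size, set_count, rng, threshold=0.5, max_attempts=1000):
    """Draw `set_count` question sets whose pairwise similarity is at most `threshold`."""
    accepted = []
    attempts = 0
    while len(accepted) < set_count:
        attempts += 1
        if attempts > max_attempts:
            raise RuntimeError("could not build a batch below the similarity threshold")
        candidate = rng.sample(bank_ids, set_size)
        # A candidate too similar to any accepted set is discarded and redrawn.
        if all(jaccard(candidate, kept) <= threshold for kept in accepted):
            accepted.append(candidate)
    return accepted
```

A call such as `generate_batch(list(range(40)), 5, 3, random.Random(0))` would return three sets of five question identifiers; passing a seeded `random.Random` keeps the draw reproducible.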
Compared with the prior art, the invention has the following beneficial effects:
the invention uses a machine learning algorithm to analyze the semantic vector and obtain the test paper information and the test question information, which are then split and recombined in a targeted manner. This greatly improves the efficiency of collecting test questions as online education resources, saves the labor involved in test question collection, reduces simple repetitive work, and greatly improves the efficiency of collecting, splitting and recombining test question resources.
Drawings
FIG. 1 is a schematic system diagram of the splitting tool of the present invention;
FIG. 2 is a schematic structural diagram of the splitting and extraction method of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are only some of the embodiments of the present invention, not all of them.
Referring to FIG. 1 and FIG. 2, an embodiment of the present invention is: a test paper question splitting tool based on a machine learning algorithm, the splitting tool comprising:
a login verification system: verifies the user's IP address and key and authorizes login to the system;
a central processing unit: performs logic processing on the test question data;
a test question bank: stores the test question data by category, according to the number, stem type and score type of the test questions;
a test question text analysis module: reads the test paper file and parses the text in the test paper;
a test question splitting module: separates out information such as the question serial number, score, stem, options, answer and analysis;
a test question formatting generation module: recombines the test question information separated by the test question text analysis module, formats and rearranges it according to a preset rule, and enters it into the question bank;
a test question extraction module: extracts the recombined test questions stored in the question bank to the terminal device;
a cloud test question updating module: periodically updates the test question bank with test question information uploaded to the cloud;
a test question similarity comparison module: compares the recombined test questions extracted in the same batch;
a terminal: the receiving program on the terminal may be a WeChat mini program, a mobile app or a web client.
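The classified storage performed by the test question bank can be illustrated with a small relational table. This is a sketch under stated assumptions: the text names the classification keys but not the storage technology, so SQLite, the table and column names, and the reading of "number" as the question's serial number are all hypothetical.

```python
import sqlite3


def open_bank(path=":memory:"):
    """Open (or create) a question bank with one table of questions."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS questions (
               id INTEGER PRIMARY KEY,
               serial INTEGER NOT NULL,
               stem_type TEXT NOT NULL,
               score INTEGER NOT NULL,
               stem TEXT NOT NULL
           )"""
    )
    # Index the classification keys so lookups by category stay cheap.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_category ON questions (stem_type, score)"
    )
    return conn


def add_question(conn, serial, stem_type, score, stem):
    """Insert one question and return its row id."""
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            "INSERT INTO questions (serial, stem_type, score, stem) VALUES (?, ?, ?, ?)",
            (serial, stem_type, score, stem),
        )
    return cursor.lastrowid


def find_by_category(conn, stem_type, score):
    """Return (serial, stem) pairs for one stem type and score, by serial number."""
    rows = conn.execute(
        "SELECT serial, stem FROM questions"
        " WHERE stem_type = ? AND score = ? ORDER BY serial",
        (stem_type, score),
    )
    return rows.fetchall()
```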
Further, the test question entity recognition algorithm of the test question text analysis module uses word embedding to obtain a semantic vector of the test question text, and a machine learning algorithm analyzes the semantic vector to obtain the test paper information and the test question information.
Further, the test paper information includes the test paper name, year and region, and the test question information includes the question stem, score, serial number, analysis and answer.
Further, the test paper file may be in docx, doc, pdf, html or a similar format.
Further, the output test paper format of the test question formatting generation module is editable, and the module rearranges the parsed text according to that format and outputs it according to service requirements.
Further, the test question similarity comparison module compares the differently recombined test question sets within the same batch, and when the similarity exceeds fifty percent, the formatting, generation and recombination of the individual test question set is performed again.
A test paper question splitting and extraction method based on a machine learning algorithm, comprising the following steps:
step one: after logging in to the system, upload an existing test question set file or enter test questions, and run the test question entity recognition algorithm on the existing test questions through the test question text analysis module;
step two: analyze the semantic vector with a machine learning algorithm to obtain the test paper information and the test question information;
step three: split out information such as the question serial number, score, stem, options, answer and analysis;
step four: format the test questions: recombine the test question information separated by the test question text analysis module, then format and rearrange it into the question bank according to a preset rule;
step five: for the multiple test question sets generated in the same batch, compare the similarity between the output sets and screen the recombined sets;
step six: output the recombined test question sets to the terminal once layout is complete.
Further, during the similarity comparison, recombined test question sets in the same batch whose mutual similarity is greater than fifty percent are formatted and recombined again, and test question sets whose similarity is less than fifty percent are laid out and then output to the terminal.
Further, the test question entity recognition algorithm uses word embedding to obtain the semantic vector of the test question text.
The invention uses a machine learning algorithm to analyze the semantic vector and obtain the test paper information and the test question information, which are then split and recombined in a targeted manner. This greatly improves the efficiency of collecting test questions as online education resources, saves the labor involved in test question collection, reduces simple repetitive work, and greatly improves the efficiency of collecting, splitting and recombining test question resources.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.

Claims (9)

1. A test paper question splitting tool based on a machine learning algorithm, the splitting tool comprising:
a login verification system: verifies the user's IP address and key and authorizes login to the system;
a central processing unit: performs logic processing on the test question data;
a test question bank: stores the test question data by category, according to the number, stem type and score type of the test questions;
a test question text analysis module: reads the test paper file and parses the text in the test paper;
a test question splitting module: separates out information such as the question serial number, score, stem, options, answer and analysis;
a test question formatting generation module: recombines the test question information separated by the test question text analysis module, formats and rearranges it according to a preset rule, and enters it into the question bank;
a test question extraction module: extracts the recombined test questions stored in the question bank to the terminal device;
a cloud test question updating module: periodically updates the test question bank with test question information uploaded to the cloud;
a test question similarity comparison module: compares the recombined test questions extracted in the same batch;
a terminal: the receiving program on the terminal may be a WeChat mini program, a mobile app or a web client.
2. The machine learning algorithm-based test paper question splitting tool as claimed in claim 1, wherein: the test question entity recognition algorithm of the test question text analysis module uses word embedding to obtain a semantic vector of the test question text, and a machine learning algorithm analyzes the semantic vector to obtain the test paper information and the test question information.
3. The machine learning algorithm-based test paper question splitting tool as claimed in claim 2, wherein: the test paper information includes the test paper name, year and region, and the test question information includes the question stem, score, serial number, analysis and answer.
4. The machine learning algorithm-based test paper question splitting tool as claimed in claim 1, wherein: the test paper file may be in docx, doc, pdf, html or a similar format.
5. The machine learning algorithm-based test paper question splitting tool as claimed in claim 1, wherein: the output test paper format of the test question formatting generation module is editable, and the module rearranges the parsed text according to that format and outputs it according to service requirements.
6. The machine learning algorithm-based test paper question splitting tool as claimed in claim 1, wherein: the test question similarity comparison module compares the differently recombined test question sets within the same batch, and when the similarity exceeds fifty percent, the formatting, generation and recombination of the individual test question set is performed again.
7. A test paper question splitting and extraction method based on a machine learning algorithm, comprising the following steps:
step one: after logging in to the system, upload an existing test question set file or enter test questions, and run the test question entity recognition algorithm on the existing test questions through the test question text analysis module;
step two: analyze the semantic vector with a machine learning algorithm to obtain the test paper information and the test question information;
step three: split out information such as the question serial number, score, stem, options, answer and analysis;
step four: format the test questions: recombine the test question information separated by the test question text analysis module, then format and rearrange it into the question bank according to a preset rule;
step five: for the multiple test question sets generated in the same batch, compare the similarity between the output sets and screen the recombined sets;
step six: output the recombined test question sets to the terminal once layout is complete.
8. The test paper question splitting and extraction method based on a machine learning algorithm as claimed in claim 7, wherein: during the similarity comparison, recombined test question sets in the same batch whose mutual similarity is greater than fifty percent are formatted and recombined again, and test question sets whose similarity is less than fifty percent are laid out and then output to the terminal.
9. The test paper question splitting and extraction method based on a machine learning algorithm as claimed in claim 7, wherein: the test question entity recognition algorithm uses word embedding to obtain the semantic vector of the test question text.
CN202110036117.5A 2021-01-12 2021-01-12 Test paper test question splitting tool based on machine learning algorithm and splitting and extracting method thereof Pending CN112765307A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110036117.5A CN112765307A (en) 2021-01-12 2021-01-12 Test paper test question splitting tool based on machine learning algorithm and splitting and extracting method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110036117.5A CN112765307A (en) 2021-01-12 2021-01-12 Test paper test question splitting tool based on machine learning algorithm and splitting and extracting method thereof

Publications (1)

Publication Number Publication Date
CN112765307A true CN112765307A (en) 2021-05-07

Family

ID=75701597

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110036117.5A Pending CN112765307A (en) 2021-01-12 2021-01-12 Test paper test question splitting tool based on machine learning algorithm and splitting and extracting method thereof

Country Status (1)

Country Link
CN (1) CN112765307A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117473037A (en) * 2023-12-27 2024-01-30 广州云积软件技术有限公司 Examination question bank construction method and system based on large language model and electronic equipment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106354740A (en) * 2016-05-04 2017-01-25 上海秦镜网络科技有限公司 Electronic examination paper inputting method
CN110647885A (en) * 2019-09-17 2020-01-03 广州光大教育软件科技股份有限公司 Test paper splitting method, device, equipment and medium based on picture identification
CN111680669A (en) * 2020-08-12 2020-09-18 江西风向标教育科技有限公司 Test question segmentation method and system and readable storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117473037A (en) * 2023-12-27 2024-01-30 广州云积软件技术有限公司 Examination question bank construction method and system based on large language model and electronic equipment
CN117473037B (en) * 2023-12-27 2024-04-16 广州云积软件技术有限公司 Examination question bank construction method and system based on large language model and electronic equipment

Similar Documents

Publication Publication Date Title
CN107122416A (en) A kind of Chinese event abstracting method
CN110781668B (en) Text information type identification method and device
CN111866004B (en) Security assessment method, apparatus, computer system, and medium
CN111694937A (en) Interviewing method and device based on artificial intelligence, computer equipment and storage medium
CN109241330A (en) The method, apparatus, equipment and medium of key phrase in audio for identification
CN113312924A (en) Risk rule classification method and device based on NLP high-precision analysis label
CN115361176A (en) SQL injection attack detection method based on FlexUDA model
CN112765307A (en) Test paper test question splitting tool based on machine learning algorithm and splitting and extracting method thereof
CN111488501A (en) E-commerce statistical system based on cloud platform
CN112882899B (en) Log abnormality detection method and device
CN110377706B (en) Search sentence mining method and device based on deep learning
CN110929085B (en) System and method for processing electric customer service message generation model sample based on meta-semantic decomposition
CN114465875B (en) Fault processing method and device
CN114881312B (en) Short-term wind power prediction method based on improved depth forest
CN114880690B (en) Edge calculation-based source data time sequence refinement method
CN115659004A (en) Online learning system based on internet
CN113240443A (en) Entity attribute pair extraction method and system for power customer service question answering
CN106547913B (en) Page information collection and classification feedback method, device and system
Zhao et al. The Application of Artificial Intelligence in Enterprise Auditing
CN115082174B (en) Method, device, computer equipment and storage medium for identifying similar quality control of bonds
CN113806530A (en) Education resource indexing method
CN118013012A (en) Session recommendation method, device, equipment and medium
CN118277581A (en) Comprehensive annual report information extraction and evaluation system and method
CN118132050A (en) Software development system based on artificial intelligence
CN113343816A (en) Automatic testing method and system for OCR resume recognition algorithm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210507