CN109325098B - Reference resolution method for semantic analysis of mathematical questions - Google Patents

Publication number
CN109325098B
Authority
CN
China
Prior art keywords
entities
entity
mathematical
text
formula
Legal status
Active
Application number
CN201810964809.4A
Other languages
Chinese (zh)
Other versions
CN109325098A (en)
Inventor
梅阳阳
谢德刚
郑文娟
Current Assignee
Shanghai Mutual Education Intelligent Technology Co.,Ltd.
Original Assignee
Shanghai Hujiao Education Technology Co ltd
Priority date
2018-08-23
Filing date
2018-08-23
Publication date
2021-07-16

Abstract

A reference resolution method for semantic parsing of mathematical problems comprises the following steps: S1: classifying the problem texts and extracting the basic entities involved in each class of problem text; S2: parsing a given mathematical problem text and, if parsing succeeds, determining whether a sentence contains a reference to be resolved; S3: adding an evaluation of the candidate entities to the resolution process, including a further check of the grammar of the sentence in which each entity occurs, finding the entity actually referred to, and then performing the entity replacement operation.

Description

Reference resolution method for semantic analysis of mathematical questions
Technical Field
The invention belongs to the technical field of artificial intelligence, and particularly relates to a reference resolution method for semantic parsing of mathematical problems.
Background
References are common in natural language, and resolving them is a necessary step in understanding the semantics of a text. Reference resolution is applied in fields such as text summarization, machine translation, information extraction and automatic problem solving. Although the language of mathematical problems is relatively simple and standardized, references still occur frequently, and for rigorous and highly logical automatic mathematical problem solving, the quality of pronoun resolution directly affects how accurately the meaning of a problem is understood and therefore the success rate of automatic problem solving.
Natural language processing for specialized fields such as elementary mathematics cannot rely solely on general-purpose techniques from artificial intelligence. Existing general entity reference resolution covers only back reference and forward reference, which limits the range searched for the referent of a pronoun and easily leads to inaccurate resolution. A new reference resolution method suited to the expressions common in mathematical problems is therefore needed; it is important for understanding and automatically solving such problems.
Disclosure of Invention
The invention aims to provide a reference resolution method for semantic parsing of mathematical problems, so as to solve the problem of inaccurate reference resolution in existing methods.
In one embodiment of the present invention, a reference resolution method for semantic parsing of mathematical problems includes the following steps:
S1: classifying the problem texts and extracting the basic entities involved in each class of problem text;
S2: parsing a given mathematical problem text and, if parsing succeeds, determining whether a sentence contains a reference to be resolved;
S3: adding an evaluation of the candidate entities to the resolution process, including a further check of the grammar of the sentence in which each entity occurs, finding the entity actually referred to, and then performing the entity replacement operation.
The reference most common in mathematical problem texts is back reference, and the conventional approach takes the entity nearest before the pronoun phrase as the referent, which limits the range searched for the referent.
On the basis of the conventional reference resolution method, the embodiment of the invention adds an evaluation of the candidate entities, including a further check of the parts of speech and grammar before and after each candidate entity, and, combining the information obtained by parsing the pronoun phrase, selects the candidate entity with the highest credibility for reference resolution.
For the references that appear in mathematical problem texts, the invention first classifies the problems and then, over all entities identified by the algorithm, adds an evaluation of the candidate entities that includes the problem class and a check of the grammar of the sentence in which each candidate entity occurs. Combining this with the entity category information obtained by parsing the pronoun phrase, the candidate entity with the highest credibility is selected for reference resolution. This refines and optimizes the approach to reference resolution, and the method is simple, easy to operate and widely applicable. Compared with the conventional reference resolution method, the accuracy of reference resolution in mathematical problem texts is improved, the method performs well in the natural language processing stage of an automatic problem solving system, and it promotes the application of natural language processing technology in specialized fields such as mathematics.
Drawings
The above and other objects, features and advantages of exemplary embodiments of the present invention will become readily apparent from the following detailed description read in conjunction with the accompanying drawings. Several embodiments of the invention are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which:
FIG. 1 is a flow chart of a resolution method according to an embodiment of the present invention.
Detailed Description
According to one or more embodiments, as shown in FIG. 1, a reference resolution method for semantic parsing of mathematical problems includes the following steps:
S1: classifying the problem texts and extracting the basic entities involved in each class of problem text;
S2: parsing a given mathematical problem text and, if parsing succeeds, determining whether a sentence contains a reference to be resolved;
S3: adding an evaluation of the candidate entities to the resolution process, including a further check of the grammar of the sentence in which each entity occurs, finding the entity actually referred to, and then performing the entity replacement operation.
In the present invention, reference resolution in mathematical problem texts mainly targets basic entities such as sets, functions, equations, inequalities, straight lines and circles.
Step S1 specifically includes the following:
The mathematical problem texts are classified according to the chapters of the elementary mathematics textbook, and the main basic entities involved in each class of problem text are extracted and used as the candidate entities for reference resolution.
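Step S1 can be illustrated with a minimal sketch. The chapter names, keyword lists and entity names below are hypothetical placeholders chosen for illustration, and the keyword-count classifier is an assumed stand-in; the patent does not specify how the classification is carried out.

```python
from typing import Dict, List

# Hypothetical chapter -> main basic entities table (illustrative only).
CHAPTER_ENTITIES: Dict[str, List[str]] = {
    "sets": ["Set"],
    "functions": ["Function", "Equation", "Inequality"],
    "analytic_geometry": ["Line", "Circle", "Equation"],
}

# Hypothetical keyword cues used to assign a problem text to a chapter.
CHAPTER_KEYWORDS: Dict[str, List[str]] = {
    "sets": ["set", "subset", "intersection", "union"],
    "functions": ["function", "domain", "monotonic"],
    "analytic_geometry": ["line", "circle", "slope", "tangent"],
}


def classify_problem(text: str) -> str:
    """Return the chapter whose keywords occur most often in the text."""
    lowered = text.lower()
    scores = {
        chapter: sum(lowered.count(keyword) for keyword in keywords)
        for chapter, keywords in CHAPTER_KEYWORDS.items()
    }
    # On a tie, max() keeps the first chapter in dictionary order.
    return max(scores, key=scores.get)


def main_entities(text: str) -> List[str]:
    """Return the entity categories treated as main candidates for this text."""
    return CHAPTER_ENTITIES[classify_problem(text)]
```

For example, `main_entities("The circle C is tangent to the line l.")` selects the `analytic_geometry` chapter and returns its entity list.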
Step S2 specifically includes the following:
Parsing the mathematical problem text comprises formula identification, word segmentation, part-of-speech tagging of the non-formula text and sequence labeling of the formula part; after parsing succeeds, whether the text contains a reference to be resolved is determined from the parts of speech.
The formula part of the problem text is identified and sequence-labeled with a CRF (conditional random field) algorithm, and the non-formula part is segmented and part-of-speech tagged with a dictionary. Because the problem text is divided into a formula part and a non-formula part that are labeled by different methods, labeling of the whole text is faster and more controllable, the method is extensible and simple to maintain, and it gives good automatic labeling results.
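The split into formula and non-formula parts can be sketched as follows. This is not the CRF-based identification described above: as a simplifying assumption, formulas are taken to arrive already delimited by `$...$`, and the tiny `POS_DICT` dictionary and its tag names are hypothetical. The sketch only shows how the two parts receive different labels and how a pronoun tag signals a reference to be resolved.

```python
import re
from typing import List, Tuple

# Hypothetical mini dictionary: word -> part-of-speech tag.
POS_DICT = {
    "given": "v", "find": "v", "the": "det", "of": "prep",
    "set": "n", "function": "n", "domain": "n",
    "its": "pron", "this": "pron", "these": "pron",
}

FORMULA_RE = re.compile(r"\$([^$]*)\$")  # assumed formula delimiter
WORD_RE = re.compile(r"[A-Za-z]+")


def tag(text: str) -> List[Tuple[str, str]]:
    """Label formulas as FORMULA and dictionary-tag the remaining words."""
    tagged: List[Tuple[str, str]] = []

    def tag_words(chunk: str) -> None:
        for word in WORD_RE.findall(chunk):
            tagged.append((word, POS_DICT.get(word.lower(), "unk")))

    pos = 0
    for match in FORMULA_RE.finditer(text):
        tag_words(text[pos:match.start()])       # non-formula text before the formula
        tagged.append((match.group(1).strip(), "FORMULA"))
        pos = match.end()
    tag_words(text[pos:])                        # trailing non-formula text
    return tagged


def has_reference(tagged: List[Tuple[str, str]]) -> bool:
    """A text needs reference resolution if any token is tagged as a pronoun."""
    return any(pos_tag == "pron" for _, pos_tag in tagged)
```

Keeping the formula as a single token means the dictionary tagger never sees mathematical symbols, which is one way to read the separation of the two labeling methods.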
The parsed content consists mainly of entity categories and names, as in the set A = {x | 1 < x < 3} or the function f(x) = abs(x^2 - 1) - 2, which are structured templates common in problem texts. Similar phrase templates are searched for and parsed, and the sequence labels are corrected according to the words of the phrase template. That is, when a specific subcategory of an entity category appears, as in the quadratic function f(x) = a*x^2 + 4*x + 1, the formula is first identified as a Function entity during entity recognition, because most problems about functions describe their subject with the word "function"; the Function entity is then adjusted to a Quadratic Function entity according to the specific category description, and the sequence labels of the formula part are corrected accordingly. All entities are represented in predicate logic form in the automatic problem solving system. Default entities are also supplemented in this process.
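The template-based correction can be sketched as a two-stage assignment: a coarse category from the shape of the formula, then a refinement from the descriptor phrase that precedes it. The `REFINEMENTS` table, the `Entity` class and the predicate string format are assumptions made for illustration; the patent states only that entities are represented in predicate logic form.

```python
import re
from dataclasses import dataclass

# Hypothetical descriptor phrase -> refined entity category.
REFINEMENTS = {
    "quadratic function": "QuadraticFunction",
    "linear function": "LinearFunction",
    "function": "Function",
    "set": "Set",
}


@dataclass
class Entity:
    category: str
    name: str
    body: str

    def to_predicate(self) -> str:
        """Render the entity as a predicate-logic style string (assumed format)."""
        return f"{self.category}({self.name}, '{self.body}')"


def default_category(formula: str) -> str:
    """Coarse first guess from the shape of the formula alone."""
    if re.match(r"\s*[a-z]\s*\(\s*[a-z]\s*\)\s*=", formula):
        return "Function"
    if re.match(r"\s*[A-Z]\s*=\s*\{", formula):
        return "Set"
    return "Expression"


def parse_entity(descriptor: str, formula: str) -> Entity:
    """Build an entity, refining its category from the preceding descriptor."""
    name, _, body = formula.partition("=")
    category = default_category(formula)
    desc = descriptor.lower().strip()
    # Longest phrase first, so "quadratic function" wins over "function".
    for phrase in sorted(REFINEMENTS, key=len, reverse=True):
        if desc.endswith(phrase):
            category = REFINEMENTS[phrase]
            break
    return Entity(category, name.strip(), body.strip())
```

With the descriptor "the quadratic function", the formula f(x) = a*x^2 + 4*x + 1 is first guessed to be a `Function` and then refined to `QuadraticFunction`; with no descriptor it stays a `Function`.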
Step S3 specifically includes the following:
Based on the entity candidate rules of each problem category, all entities in each sentence are found, and each entity in the problem text is assigned a credibility according to whether it belongs to the main entities of that problem category. Entities with high credibility are selected, and whether each can serve as a referent in reference resolution is judged from the words, parts of speech and grammar before and after it; if so, the entity is added to the candidate set and its position is recorded.
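The candidate screening can be sketched as below. The credibility values, the threshold and the context rule (an entity qualifies only when introduced by a category noun or a defining word) are all hypothetical; the patent names the criteria (main entity of the category, surrounding words, parts of speech and grammar) without fixing concrete rules.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Mention:
    category: str   # e.g. "Function"
    name: str       # e.g. "f(x)"
    position: int   # token index in the problem text
    prev_word: str  # word immediately before the entity


@dataclass
class Candidate:
    mention: Mention
    credibility: float


# Hypothetical main entities per problem category.
MAIN_ENTITIES: Dict[str, Set[str]] = {
    "functions": {"Function", "Equation", "Inequality"},
    "sets": {"Set"},
}

HIGH, LOW = 1.0, 0.3  # assumed credibility values
# Assumed context rule: words that introduce a defined entity.
INTRODUCING_WORDS = {"function", "set", "line", "circle", "let", "given"}


def build_candidates(mentions: List[Mention], problem_category: str,
                     threshold: float = 0.5) -> List[Candidate]:
    """Keep high-credibility entities whose context marks them as referable."""
    main = MAIN_ENTITIES.get(problem_category, set())
    candidates: List[Candidate] = []
    for mention in mentions:
        credibility = HIGH if mention.category in main else LOW
        if credibility < threshold:
            continue  # not a main entity of this problem category
        if mention.prev_word.lower() not in INTRODUCING_WORDS:
            continue  # context does not introduce the entity
        candidates.append(Candidate(mention, credibility))
    return candidates
```

Each surviving candidate carries its position, so a later step can compare it with the position of the pronoun phrase.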
The pronoun phrases appearing in mathematical problems that contain references are then parsed to determine the category and the number of entities the pronoun refers to. A pronoun rarely appears alone: it is usually followed by the category being referred to, possibly with a numeral. Parsing these features of the phrase gives a preliminary determination of the category of the referred entity; combined with the candidate set, the referred entity can be found accurately, and the entity replacement operation, that is, the resolution operation, is performed at the position of the pronoun phrase.
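A minimal sketch of the pronoun phrase parsing and replacement follows, using English phrases such as "this function" or "these two sets" as stand-ins. The phrase pattern, the numeral table and the tie-break (take the nearest preceding candidates of the matching category) are assumptions for illustration; candidates are given as hypothetical `(category, name, character position)` tuples.

```python
import re
from typing import List, Tuple

Candidate = Tuple[str, str, int]  # (category, name, character position)

NUMERALS = {"two": 2, "three": 3}
CATEGORY_OF = {"function": "Function", "set": "Set",
               "line": "Line", "circle": "Circle"}

# pronoun + optional numeral + category noun, e.g. "these two sets"
PHRASE_RE = re.compile(
    r"\b(this|that|these|those)\s+(?:(two|three)\s+)?"
    r"(functions?|sets?|lines?|circles?)\b",
    re.IGNORECASE,
)


def resolve(text: str, candidates: List[Candidate]) -> str:
    """Replace each pronoun phrase with the entities it refers to."""
    ordered = sorted(candidates, key=lambda c: c[2])

    def replace(match: re.Match) -> str:
        count = NUMERALS.get((match.group(2) or "").lower(), 1)
        noun = match.group(3).lower()
        category = CATEGORY_OF[noun.rstrip("s")]
        # Only entities of the named category that appear before the phrase.
        earlier = [c for c in ordered
                   if c[0] == category and c[2] < match.start()]
        if len(earlier) < count:
            return match.group(0)  # not enough candidates: leave unresolved
        names = " and ".join(c[1] for c in earlier[-count:])
        return f"the {noun} {names}"

    return PHRASE_RE.sub(replace, text)
```

The category noun narrows the search before distance is considered, so in "Given the set A and the function f(x), find the domain of this function." the phrase resolves to f(x) even if a set were mentioned closer to the pronoun.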
It should be noted that, while the spirit and principles of the invention have been described with reference to several specific embodiments, the invention is not limited to the disclosed embodiments, and the division into aspects is for convenience of description only and does not mean that features in these aspects cannot be combined. The invention is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims (4)

1. A reference resolution method for semantic parsing of mathematical problems, characterized by comprising the following steps:
S1: classifying the problem texts and extracting the basic entities involved in each class of problem text, wherein step S1 includes classifying the mathematical problem texts according to the chapters of the elementary mathematics textbook, extracting the main basic entities involved in each class of problem text, and using them as the candidate entities for reference resolution;
S2: parsing a given mathematical problem text and, if parsing succeeds, determining whether a sentence contains a reference to be resolved;
S3: adding an evaluation of the candidate entities to the resolution process, including a further check of the grammar of the sentence in which each entity occurs, finding the entity actually referred to, and then performing the entity replacement operation.
2. The reference resolution method for semantic parsing of mathematical problems according to claim 1, wherein step S2 further comprises:
parsing the mathematical problem text by formula identification, word segmentation, part-of-speech tagging of the non-formula text and sequence labeling of the formula, and, after parsing succeeds, determining from the parts of speech whether the text contains a reference to be resolved, wherein
the formula in the mathematical problem text is identified and sequence-labeled with a CRF algorithm, and the non-formula part of the mathematical problem text is segmented and part-of-speech tagged with a dictionary, the mathematical problem text comprising the formula and the non-formula part.
3. The reference resolution method for semantic parsing of mathematical problems according to claim 1, wherein step S3 specifically includes the following steps:
finding all entities in each sentence based on the entity candidate rules of each problem category, and assigning each entity in the problem text a credibility according to whether it belongs to the main entities of that problem category;
selecting entities with high credibility, judging from the words, parts of speech and grammar before and after each entity whether it can serve as a referent in reference resolution, and if so, adding it to the candidate set and recording its position;
and parsing the pronoun phrases appearing in mathematical problems that contain references, and determining the category and the number of entities the pronoun refers to.
4. The reference resolution method for semantic parsing of mathematical problems according to claim 3, wherein
the pronoun is parsed to give a preliminary determination of the category of the referred entity, the referred entity is found accurately by combining the candidate set, and the entity replacement operation, that is, the reference resolution operation, is then performed at the position of the pronoun phrase.
CN201810964809.4A 2018-08-23 2018-08-23 Reference resolution method for semantic analysis of mathematical questions Active CN109325098B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810964809.4A CN109325098B (en) 2018-08-23 2018-08-23 Reference resolution method for semantic analysis of mathematical questions

Publications (2)

Publication Number Publication Date
CN109325098A (en) 2019-02-12
CN109325098B (en) 2021-07-16

Family

ID=65263233

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810964809.4A Active CN109325098B (en) 2018-08-23 2018-08-23 Reference resolution method for semantic analysis of mathematical questions

Country Status (1)

Country Link
CN (1) CN109325098B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110473551B (en) * 2019-09-10 2022-07-08 北京百度网讯科技有限公司 Voice recognition method and device, electronic equipment and storage medium
CN111695054A (en) * 2020-06-12 2020-09-22 上海智臻智能网络科技股份有限公司 Text processing method and device, information extraction method and system, and medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107153640A (en) * 2017-05-08 2017-09-12 成都准星云学科技有限公司 A kind of segmenting method towards elementary mathematics field
CN107168947A (en) * 2017-04-19 2017-09-15 成都准星云学科技有限公司 A kind of method and its system of new entity reference resolution
CN107203813A (en) * 2017-05-22 2017-09-26 成都准星云学科技有限公司 A kind of new default entity nomenclature and its system
CN107463553A (en) * 2017-09-12 2017-12-12 复旦大学 For the text semantic extraction, expression and modeling method and system of elementary mathematics topic
CN107894999A (en) * 2017-10-27 2018-04-10 成都准星云学科技有限公司 Towards the topic type automatic classification method and system based on thinking of solving a problem of elementary mathematics

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9715488B2 (en) * 2014-10-06 2017-07-25 International Business Machines Corporation Natural language processing utilizing transaction based knowledge representation

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
篇章中指代消解研究综述;周炫余等;《武汉大学学报(理学版)》;20140228;第60卷(第1期);全文 *

Also Published As

Publication number Publication date
CN109325098A (en) 2019-02-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP03 Change of name, title or address

Address after: Building 10, Lane 2277, Zuchongzhi Road, Pudong New Area Free Trade Pilot Zone, Shanghai, 200000

Patentee after: Shanghai Mutual Education Intelligent Technology Co.,Ltd.

Address before: Room 211, Building 29, No.368, Zhangjiang Road, China (Shanghai) pilot Free Trade Zone, Pudong New Area, Shanghai 201210

Patentee before: SHANGHAI HUJIAO EDUCATION TECHNOLOGY Co.,Ltd.