CN104751230A - Ontology-based automatic manuscript reviewing method - Google Patents

Ontology-based automatic manuscript reviewing method

Info

Publication number
CN104751230A
Authority
CN
China
Prior art keywords
contribution
individual
match
vocabulary
manuscript
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510156543.7A
Other languages
Chinese (zh)
Inventor
刘永坚
白立华
杨朝阳
杨慧
曾瑞
李文忠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Epoch Publish Medium Inc Co
Wuhan University of Technology WUT
Original Assignee
Epoch Publish Medium Inc Co
Wuhan University of Technology WUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2015-04-03
Filing date
2015-04-03
Publication date
2015-07-01
Application filed by Epoch Publish Medium Inc Co, Wuhan University of Technology WUT
Priority to CN201510156543.7A
Publication of CN104751230A

Landscapes

  • Machine Translation (AREA)

Abstract

The invention provides an ontology-based automatic manuscript reviewing method. The method comprises: selecting a domain ontology; pre-processing the manuscript to obtain a construction vocabulary; automatically constructing a manuscript individual model, including individual identification and the filling of data attributes and object attributes; and automatically reviewing the manuscript, including data attribute matching, object attribute matching, and returning the corresponding result information. The returned information is delivered to the user as the review result. The method uses ontology technology to construct individuals, reviews individual manuscripts automatically, and returns the review results, which greatly reduces staff workload and improves publishing efficiency.

Description

Ontology-based automatic manuscript reviewing method
Technical field
The present invention relates to the technical field of digital publishing, and in particular to an ontology-based automatic manuscript reviewing method for digital information processing.
Background art
Knowledge processing is an inevitable trend in the development of information technology. As the demands placed on knowledge applications grow, traditional knowledge base systems can no longer meet the new requirements. Ontologies have therefore been introduced into knowledge engineering, and ontology theory and technology are applied to the development of knowledge bases.
In the late 1970s, the construction techniques of expert systems, knowledge systems, and knowledge-intensive information systems developed into knowledge engineering, and the systems built in this way are called knowledge-based systems. Knowledge systems are the most important industrial and commercial products of the artificial intelligence discipline. They assist people in problem solving, for example detecting credit card fraud, accelerating ship design, assisting medical diagnosis, making scientific software more intelligent, providing front-office financial services, evaluating product quality and advertising, and supporting service recovery in electrical networks.
With the development and spread of knowledge systems, knowledge services have also become an inevitable trend in the development of information technology, and ontology-based knowledge engineering applications have begun to attract attention. Knowledge services for the digital publishing industry are a new application of knowledge engineering. At present, manuscripts are still reviewed manually, and no automated technique for reviewing manuscripts against a domain ontology exists.
Summary of the invention
The technical problem to be solved by the present invention is to address the deficiency described above by providing an ontology-based automatic manuscript reviewing method that uses ontology technology to construct individuals, automatically reviews individual manuscripts, and returns the review results, thereby greatly reducing staff workload and improving publishing efficiency.
The technical solution adopted by the present invention to solve this technical problem is as follows:
An ontology-based automatic manuscript reviewing method, characterized in that it comprises the following steps:
Domain ontology selection: select the domain ontology model corresponding to the field to which the manuscript belongs. This model is an ontology model built in a separate domain model system; it comprises classes, object attributes, data attributes, and rule information, and its corresponding individual models have already been established.
Manuscript pre-processing to obtain the construction vocabulary: split the manuscript in a computer system, segment its text into words with a word segmentation tool, and filter out stop words such as uninformative function words and auxiliary words to obtain a construction vocabulary. This vocabulary is used to build the individual model corresponding to the manuscript, and each vocabulary entry retains its position information.
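The pre-processing step can be illustrated with a minimal sketch. The source only says that a word segmentation tool and a stop-word filter are used; the regular-expression tokenizer, the tiny `STOP_WORDS` set, and the function name `build_construction_vocabulary` below are all assumptions chosen for illustration, and a Chinese manuscript would need a real segmentation tool instead of `\w+`.

```python
import re

# Hypothetical stop-word list; a real system would load a full list
# for the manuscript's language.
STOP_WORDS = {"the", "a", "an", "of", "is", "and", "in", "to"}


def build_construction_vocabulary(text):
    """Return (word, start_offset) pairs for every non-stop word in text.

    The start offset is kept so that later steps can point back to the
    place in the manuscript where the word occurred.
    """
    vocabulary = []
    for match in re.finditer(r"\w+", text):
        word = match.group().lower()
        if word not in STOP_WORDS:
            vocabulary.append((word, match.start()))
    return vocabulary


vocabulary = build_construction_vocabulary("The capital of France is Paris")
print(vocabulary)
```

Keeping the character offset next to each word is one simple way to satisfy the requirement that the vocabulary preserves position information.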
Automatic construction of the manuscript individual model: this comprises individual identification and the filling of data attributes and object attributes, and is implemented in the following steps:
Individual identification: taking the class information of the selected domain ontology model as the reference standard classes, classify the construction vocabulary in the computer system with a suitable classification algorithm or tool, compute the similarity between each word and the reference standard class words, and identify the individuals in the construction vocabulary according to a similarity threshold.
Filling data attributes and object attributes for individuals: match the individual names, data attribute names, and object attribute names under the corresponding reference standard class against the construction vocabulary. Matching is performed by similarity calculation, and its scope is limited to the words in the construction vocabulary that lie near the individual. When the matching degree reaches a threshold, the corresponding text data is filled into the data attribute or object attribute, and the text is marked as that attribute.
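The individual identification sub-step can be sketched as follows. The source does not name a classification algorithm or similarity measure, so this example assumes character-level similarity from Python's standard `difflib.SequenceMatcher` and skips the separate classification pass; `identify_individuals`, the `reference_classes` layout, and the 0.8 threshold are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Character-level similarity in [0, 1]; a stand-in for any similarity tool."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def identify_individuals(vocabulary, reference_classes, threshold=0.8):
    """Identify vocabulary words that name individuals of a reference class.

    vocabulary: list of (word, position) pairs.
    reference_classes: dict mapping class name -> list of known individual names.
    """
    found = []
    for word, position in vocabulary:
        best = None  # (score, class name, reference individual name)
        for class_name, names in reference_classes.items():
            for name in names:
                score = similarity(word, name)
                if score >= threshold and (best is None or score > best[0]):
                    best = (score, class_name, name)
        if best is not None:
            found.append({
                "word": word,
                "position": position,
                "class": best[1],
                "reference": best[2],
                "score": best[0],
            })
    return found


individuals = identify_individuals(
    [("capital", 4), ("paris", 25)],
    {"City": ["Paris", "Lyon"]},
)
print(individuals)
```

Only words whose best score reaches the threshold are kept, which mirrors the threshold-based identification described in the text.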
Automatic reviewing: this comprises data attribute matching, object attribute matching, and the return of the corresponding result information; the returned information is delivered to the user as the review result. It is implemented as follows:
Data attribute matching: match the data attributes of the individual in the selected domain ontology model that was successfully matched with the construction vocabulary against the data attributes of the identified manuscript individual. If the matching degree reaches a certain threshold, the match succeeds; otherwise the match fails and an individual information array is returned (the match type, here data attribute matching; the individual information, namely the reference individual and the manuscript individual; and the source information, namely the reference data attribute source information and the manuscript data attribute source information).
Object attribute matching: match the object attributes of the individual in the selected domain ontology model that was successfully matched with the construction vocabulary against the object attributes of the identified manuscript individual. If the matching degree reaches a certain threshold, the match succeeds; otherwise the match fails and an individual information array is returned (the match type, here object attribute matching; the individual information, namely the reference individual and the manuscript individual; and the source information, namely the reference object attribute source information and the manuscript data attribute source information).
Matching process: taking the attributes of the individual in the selected domain ontology model that was successfully matched with the vocabulary as the matching standard, compute the similarity between corresponding attributes with a similarity calculation method or tool. If the similarity reaches a certain threshold, the match succeeds; if it is below the threshold, the match fails.
Return information processing: according to the returned information array, mark the places with logical errors in red, then number the information and return it to the user for checking.
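The matching process and the returned information array can be sketched in a few lines. This is an illustration under stated assumptions, not the patented implementation: the dictionary layout of the individuals, the field names of each failure record, the function name `match_attributes`, and the use of `difflib.SequenceMatcher` with a 0.9 threshold are all hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Character-level similarity in [0, 1]; a stand-in for any similarity tool."""
    return SequenceMatcher(None, str(a).lower(), str(b).lower()).ratio()


def match_attributes(reference, manuscript, match_type, threshold=0.9):
    """Compare attribute values of a reference individual and a manuscript individual.

    Returns one record per attribute whose similarity falls below the threshold.
    Attributes missing from the manuscript individual are skipped.
    """
    failures = []
    for name, reference_value in reference["attributes"].items():
        if name not in manuscript["attributes"]:
            continue
        manuscript_value = manuscript["attributes"][name]
        if similarity(reference_value, manuscript_value) < threshold:
            failures.append({
                "match_type": match_type,
                "reference_individual": reference["name"],
                "manuscript_individual": manuscript["name"],
                "attribute": name,
                "reference_source": reference_value,
                "manuscript_source": manuscript_value,
            })
    return failures


reference = {"name": "Paris", "attributes": {"country": "France", "river": "Seine"}}
manuscript = {"name": "paris", "attributes": {"country": "Germany", "river": "Seine"}}
print(match_attributes(reference, manuscript, "data attribute"))
```

The same function can be called once with data attributes and once with object attributes, changing only `match_type`, since the text describes one shared matching process for both.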
The principle of the present invention is as follows: select the domain ontology model corresponding to the field to which the manuscript belongs and use it as the reference ontology model; apply a series of pre-processing steps to the author's manuscript to obtain the vocabulary used to build the manuscript individual model; build the manuscript individual model through individual identification and the filling of data attributes and object attributes; then perform automatic reviewing by matching the attributes of the reference model against those of the manuscript individual model, and return the review result information.
The beneficial effects of the present invention are:
The method of the invention uses ontology technology to construct individuals, automatically reviews individual manuscripts, and returns the review results, which greatly reduces staff workload and improves publishing efficiency.
Description of the drawings
Fig. 1 is a flow chart of an embodiment of the present invention.
Detailed description of the embodiments
The present invention is further described below with reference to an embodiment:
The ontology-based automatic manuscript reviewing method, as shown in Fig. 1, comprises the following steps:
Step (1) Selection of the reference ontology: select the domain ontology model corresponding to the field to which the manuscript belongs. This model is an ontology model built in a separate domain model system; it comprises classes, object attributes, data attributes, and rule information, and its corresponding individual models have already been established.
Step (2) Manuscript pre-processing to obtain the construction vocabulary: split the manuscript, segment its text into words with a word segmentation tool, and filter out stop words such as uninformative function words and auxiliary words to obtain a construction vocabulary. This vocabulary is used to build the individual model corresponding to the manuscript, and each vocabulary entry retains its position information.
Step (3) Automatic construction of the manuscript individual model: this comprises individual identification and the filling of data attributes and object attributes, and is implemented in the following steps:
(a) Individual identification: taking the class information of the selected domain ontology model as the reference standard classes, classify the construction vocabulary with a suitable classification algorithm or tool, compute the similarity between each word and the reference standard class words, and identify the individuals in the construction vocabulary according to a threshold;
(b) Filling data attributes and object attributes for individuals: match the individual names, data attribute names, and object attribute names under the corresponding reference standard class against the construction vocabulary. Matching is performed by similarity calculation, and its scope is limited to the words in the construction vocabulary that lie near the individual. When the matching degree reaches a threshold, the corresponding text data is filled into the data attribute or object attribute, and the text is marked as that attribute.
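Sub-step (b) restricts matching to words near the identified individual. A minimal sketch of that windowed search follows. The source does not say how the attribute value is located once an attribute name is matched, so this example assumes the value is simply the word immediately after the matched name; `fill_attributes`, the `window` size, and the 0.8 threshold are hypothetical as well.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Character-level similarity in [0, 1]; a stand-in for any similarity tool."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fill_attributes(words, individual_index, attribute_names, window=5, threshold=0.8):
    """Fill attributes of the individual at words[individual_index].

    Only words within `window` positions of the individual are compared with
    the reference attribute names.
    """
    filled = {}
    start = max(0, individual_index - window)
    end = min(len(words), individual_index + window + 1)
    for i in range(start, end):
        if i == individual_index:
            continue
        for attribute in attribute_names:
            if attribute in filled:
                continue
            if similarity(words[i], attribute) >= threshold and i + 1 < len(words):
                # Assumption: the value is the word right after the attribute name.
                filled[attribute] = {"value": words[i + 1], "position": i + 1}
    return filled


words = ["paris", "country", "france", "river", "seine"]
print(fill_attributes(words, 0, ["country", "river"]))
print(fill_attributes(words, 0, ["country", "river"], window=2))
```

With the default window both attributes are filled; with `window=2` the word "river" falls outside the search range, so only "country" is filled.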
Step (4) Automatic reviewing: this comprises data attribute matching, object attribute matching, and the return of the corresponding result information; the returned information is delivered to the user as the review result. It is implemented as follows:
(a) Data attribute matching: match the data attributes of the individual in the selected domain ontology model that was successfully matched with the vocabulary against the data attributes of the identified manuscript individual. If the matching degree reaches a certain threshold, the match succeeds; otherwise the match fails and an individual information array is returned (the match type, here data attribute matching; the individual information, namely the reference individual and the manuscript individual; and the source information, namely the reference data attribute source information and the manuscript data attribute source information);
(b) Object attribute matching: match the object attributes of the individual in the selected domain ontology model that was successfully matched with the vocabulary against the object attributes of the identified manuscript individual. If the matching degree reaches a certain threshold, the match succeeds; otherwise the match fails and an individual information array is returned (the match type, here object attribute matching; the individual information, namely the reference individual and the manuscript individual; and the source information, namely the reference object attribute source information and the manuscript data attribute source information);
(c) Matching process: taking the attributes of the individual in the selected domain ontology model that was successfully matched with the vocabulary as the matching standard, compute the similarity between corresponding attributes with a similarity calculation method or tool. If the similarity reaches a certain threshold, the match succeeds; if it is below the threshold, the match fails;
(d) Return information processing: according to the returned information array, mark the places with logical errors in red, then number the information and return it to the user for checking.
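Sub-step (d) can be sketched as a small post-processing function. The source does not specify an output format, so this example assumes HTML output with an inline red `span`, failure records that carry character offsets (`start`, `end`) and a `message`, and non-overlapping spans; `mark_failures` is a hypothetical name.

```python
def mark_failures(text, failures):
    """Mark failed spans in red and return (marked_text, numbered_report).

    failures: list of dicts with 'start', 'end' (character offsets into text)
    and 'message'. Spans are assumed not to overlap.
    """
    ordered = sorted(failures, key=lambda failure: failure["start"])
    numbered = list(enumerate(ordered, start=1))
    marked = text
    # Insert markup from the last span to the first so earlier offsets stay valid.
    for number, failure in reversed(numbered):
        start, end = failure["start"], failure["end"]
        span = '<span style="color:red">' + marked[start:end] + "</span>[" + str(number) + "]"
        marked = marked[:start] + span + marked[end:]
    report = [str(number) + ". " + failure["message"] for number, failure in numbered]
    return marked, report


marked, report = mark_failures(
    "Paris is in Germany.",
    [{"start": 12, "end": 19, "message": "country: expected France, found Germany"}],
)
print(marked)
print(report)
```

Numbering each marked span and repeating the number in the report lets the reviewer connect every highlighted place to the reason it was flagged.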
The scope of protection of the present invention is not limited to the embodiment described above. Those skilled in the art can obviously make various changes and modifications to the present invention without departing from its scope and spirit. If such changes and modifications fall within the scope of the claims of the present invention and their technical equivalents, the present invention is intended to include them as well.

Claims (6)

1. An ontology-based automatic manuscript reviewing method, characterized in that it comprises the following steps:
Domain ontology selection: select the domain ontology model corresponding to the field to which the manuscript belongs, this model being an ontology model built in a separate domain model system, comprising classes, object attributes, data attributes, and rule information, and having its corresponding individual models already established;
Manuscript pre-processing to obtain the construction vocabulary: split the manuscript in a computer system, segment its text into words with a word segmentation tool, and filter out stop words such as uninformative function words and auxiliary words to obtain a construction vocabulary, this vocabulary being used to build the individual model corresponding to the manuscript, and each vocabulary entry retaining its position information;
Automatic construction of the manuscript individual model: this comprises individual identification and the filling of data attributes and object attributes, wherein individual identification takes the class information of the selected domain ontology model as the reference standard classes, classifies the construction vocabulary in the computer system with a suitable classification algorithm or tool, computes the similarity between each word and the reference standard class words, and identifies the individuals in the construction vocabulary according to a similarity threshold; and wherein filling data attributes and object attributes for an individual matches the individual names, data attribute names, and object attribute names under the corresponding reference standard class against the construction vocabulary and, after a successful match, fills the matched word in as the corresponding attribute value of the corresponding manuscript individual;
Automatic reviewing: this comprises data attribute matching, object attribute matching, and the return of the corresponding result information, the returned information being delivered to the user as the review result.
2. The ontology-based automatic manuscript reviewing method as claimed in claim 1, characterized in that: in filling data attribute values and object attribute values for individuals, the matching is performed by similarity calculation, and its scope is limited to the words in the construction vocabulary that lie near the individual; when the matching degree reaches a threshold, the corresponding text data is filled into the data attribute or object attribute, and the text is marked as that attribute.
3. The ontology-based automatic manuscript reviewing method as claimed in claim 1, characterized in that: in the automatic reviewing, data attribute matching means matching the data attributes of the individual in the selected domain ontology model that was successfully matched with the construction vocabulary against the data attributes of the identified manuscript individual; if the matching degree reaches a certain threshold, the match succeeds; otherwise the match fails and an individual information array is returned.
4. The ontology-based automatic manuscript reviewing method as claimed in claim 1, characterized in that: in the automatic reviewing, object attribute matching means matching the object attributes of the individual in the selected domain ontology model that was successfully matched with the construction vocabulary against the object attributes of the identified manuscript individual; if the matching degree reaches a certain threshold, the match succeeds; otherwise the match fails and an individual information array is returned.
5. The ontology-based automatic manuscript reviewing method as claimed in claim 3 or 4, characterized in that: in the automatic reviewing, the matching process means taking the attributes of the individual in the selected domain ontology model that was successfully matched with the vocabulary as the matching standard and computing the similarity between corresponding attributes with a similarity calculation method or tool; if the similarity reaches a certain threshold, the match succeeds; if it is below the threshold, the match fails.
6. The ontology-based automatic manuscript reviewing method as claimed in claim 1, characterized in that: in the automatic reviewing, return information processing means marking the places with logical errors in red according to the returned information array, then numbering the information and returning it to the user for checking.
CN201510156543.7A 2015-04-03 2015-04-03 Ontology-based automatic manuscript reviewing method Pending CN104751230A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510156543.7A CN104751230A (en) 2015-04-03 2015-04-03 Ontology-based automatic manuscript reviewing method

Publications (1)

Publication Number Publication Date
CN104751230A true CN104751230A (en) 2015-07-01

Family

ID=53590875

Country Status (1)

Country Link
CN (1) CN104751230A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102141993A (en) * 2010-02-02 2011-08-03 同济大学 Semantic ontology-based case representation method
CN103593335A (en) * 2013-09-05 2014-02-19 姜赢 Chinese semantic proofreading method based on ontology consistency verification and reasoning
CN103699663A (en) * 2013-12-27 2014-04-02 中国科学院自动化研究所 Hot event mining method based on large-scale knowledge base

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
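Xie Ming (谢铭): "Automatic semantic annotation technology for linked data and knowledge representation" (关联数据和知识表示的自动语义标注技术), China Doctoral Dissertations Full-text Database, Information Science and Technology (《中国博士学位论文全文数据库 信息科技辑》) *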

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105354224A (en) * 2015-09-30 2016-02-24 百度在线网络技术(北京)有限公司 Knowledge data processing method and apparatus
CN105354224B (en) * 2015-09-30 2019-07-23 百度在线网络技术(北京)有限公司 The treating method and apparatus of knowledge data

Similar Documents

Publication Publication Date Title
CN108133045B (en) Keyword extraction method and system, and keyword extraction model generation method and system
CN110427461B (en) Intelligent question and answer information processing method, electronic equipment and computer readable storage medium
CN106484664B (en) Similarity calculating method between a kind of short text
CN110705301B (en) Entity relationship extraction method and device, storage medium and electronic equipment
CN110619568A (en) Risk assessment report generation method, device, equipment and storage medium
CN109918635A (en) A kind of contract text risk checking method, device, equipment and storage medium
CN106296195A (en) A kind of Risk Identification Method and device
CN111160452A (en) Multi-modal network rumor detection method based on pre-training language model
CN104021302A (en) Auxiliary registration method based on Bayes text classification model
CN103605972A (en) Non-restricted environment face verification method based on block depth neural network
CN104462053A (en) Inner-text personal pronoun anaphora resolution method based on semantic features
CN111159387A (en) Recommendation method based on multi-dimensional alarm information text similarity analysis
CN107480688A (en) Fine granularity image-recognizing method based on zero sample learning
CN105224937A (en) Based on the semantic color pedestrian of the fine granularity heavily recognition methods of human part position constraint
CN107943514A (en) The method for digging and system of core code element in a kind of software document
CN109726715A (en) A kind of character image serializing identification, structural data output method
CN110188359B (en) Text entity extraction method
US20220292861A1 (en) Docket Analysis Methods and Systems
CN110188714A (en) A kind of method, system and storage medium for realizing financial management under chat scenario
CN107357785A (en) Theme feature word abstracting method and system, feeling polarities determination methods and system
CN113609892A (en) Handwritten poetry recognition method integrating deep learning with scenic spot knowledge map
CN109117891B (en) Cross-social media account matching method fusing social relations and naming features
CN110399433A (en) A kind of data entity Relation extraction method based on deep learning
CN116010581A (en) Knowledge graph question-answering method and system based on power grid hidden trouble shooting scene
CN111143394B (en) Knowledge data processing method, device, medium and electronic equipment

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20150701
