CN103246641A - Text semantic information analyzing system and method - Google Patents

Text semantic information analyzing system and method

Publication number
CN103246641A
CN103246641A CN2013101822827A CN201310182282A CN103246641A CN 103246641 A CN103246641 A CN 103246641A CN 2013101822827 A CN2013101822827 A CN 2013101822827A CN 201310182282 A CN201310182282 A CN 201310182282A CN 103246641 A CN103246641 A CN 103246641A
Authority
CN
China
Prior art keywords
rule
target text
semantic information
text
natural language
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2013101822827A
Other languages
Chinese (zh)
Inventor
李营
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2013-05-16
Filing date
2013-05-16
Publication date
2013-08-14
Application filed by Individual
Priority to CN2013101822827A
Publication of CN103246641A
Legal status: Pending

Landscapes

  • Machine Translation (AREA)

Abstract

The invention discloses a text semantic information analyzing system and method. Quasi-natural-language rules are established according to a preset rule set and a target rule set, and data matching is performed on a target text according to those rules to obtain the semantic information of the target text, thereby achieving semantic information analysis of text based on quasi-natural-language rules. The quasi-natural-language rules are easy to understand and write and combine naturally; data matching and information extraction are unified in a single process; the accuracy of the semantic analysis is high; the rules are highly reusable; and the model is highly general.

Description

A text semantic information analysis system and method
Technical field
The present invention relates to the technical field of information analysis, and in particular to a text semantic information extraction system and method based on natural language rules.
Background
Before text is processed semantically, preprocessing is an extremely important step: the quality of the preprocessing directly determines the result of the text semantic analysis. Preprocessing Internet text for analysis has its own particular characteristics. On the one hand, Internet text is relatively loosely organized and contains a great deal of interfering text; on the other hand, its semi-structured features provide relatively rich semantic information.
Therefore, in the prior art, the open question is how to make effective use of the information in the text during semantic processing; solving it would greatly facilitate text semantic analysis.
Summary of the invention
In view of the problems in the prior art, the object of the present invention is to propose a text semantic information extraction system and method.
To achieve this object, the present invention adopts the following technical solutions:
A text semantic information analysis method, comprising:
establishing natural language rules according to a preset rule set and a target rule set;
performing data matching on a target text according to the natural language rules and obtaining the semantic information of the target text.
Preferably, establishing the natural language rules according to the preset rule set and the target rule set comprises:
obtaining a target text sample;
annotating the target text sample according to the preset rule set;
compiling statistics on the annotated target text sample according to the target rule set, extracting a collection of target rules, and establishing the natural language rules from that collection.
Preferably, after the target text sample is obtained, the method further comprises: segmenting the target text sample to obtain a set of sentences, and analyzing the set of sentences according to the preset rule set and the target rule set.
Preferably, performing data matching on the target text according to the natural language rules and obtaining the semantic information of the target text comprises: performing sentence-level and paragraph-level data matching on the target text according to the natural language rules, and then performing semantic induction on the matching results to obtain the semantic information of the target text.
A text semantic information analysis system, comprising:
a rule establishing module, configured to establish natural language rules according to a preset rule set and a target rule set;
a data matching module, connected to the rule establishing module and configured to perform data matching on a target text according to the natural language rules;
a semantic information acquisition module, connected to the data matching module and configured to obtain the semantic information of the target text according to the matching results.
Preferably, the rule establishing module is specifically configured to: obtain a target text sample, annotate the target text sample according to the preset rule set, compile statistics on the annotated target text sample according to the target rule set, extract a collection of target rules, and establish the natural language rules from that collection.
Preferably, the rule establishing module is further configured to: segment the target text sample to obtain a set of sentences, and analyze the set of sentences according to the preset rule set and the target rule set.
Preferably, the data matching module is specifically configured to perform sentence-level and paragraph-level data matching on the target text according to the natural language rules; the semantic information acquisition module is specifically configured to perform semantic induction on the matching results to obtain the semantic information of the target text.
Based on the technical solutions disclosed above, the present invention has the following beneficial effects:
In the present invention, natural language rules are established according to a preset rule set and a target rule set, and data matching is performed on the target text according to those rules to obtain the semantic information of the target text, thereby achieving text semantic information analysis based on natural language rules. The natural language rules are easy to understand and write and are naturally composable; data matching and information extraction are unified in a single process; the accuracy of the semantic analysis is high; the natural language rules are highly reusable; and the model is highly general.
Description of the drawings
Fig. 1 is a schematic flowchart of the text semantic information analysis method proposed by the present invention.
Fig. 2 is a schematic structural diagram of the text semantic information analysis system proposed by the present invention.
Embodiments
Fig. 1 shows a schematic flowchart of the text semantic information analysis method proposed by the present invention.
With reference to Fig. 1, the text semantic information analysis method proposed by the present invention comprises:
Step S1: establishing natural language rules according to a preset rule set and a target rule set;
Step S2: performing data matching on a target text according to the natural language rules and obtaining the semantic information of the target text.
In step S1, establishing the natural language rules according to the preset rule set and the target rule set comprises:
Step S11: obtaining a target text sample;
Step S12: annotating the target text sample according to the preset rule set;
Step S13: compiling statistics on the annotated target text sample according to the target rule set, extracting a collection of target rules, and establishing the natural language rules from that collection.
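Steps S12 and S13 can be illustrated with a minimal Python sketch. The patent does not specify the format of either rule set, so everything here is an assumption made for illustration: the preset rule set is modeled as named regular expressions (`PRESET_RULES`), the target rule set as the tags of interest (`TARGET_TAGS`), and a "natural language rule" as a sentence template in which annotated spans are replaced by tags. The threshold `min_count` is likewise hypothetical.

```python
import re
from collections import Counter

# Hypothetical preset rule set: tag name -> pattern used for annotation.
PRESET_RULES = {
    "PRICE": re.compile(r"\d+\s*(?:yuan|USD)"),
    "DATE": re.compile(r"\d{4}-\d{2}-\d{2}"),
}
# Hypothetical target rule set: the tags the final rules should capture.
TARGET_TAGS = {"PRICE", "DATE"}


def annotate(sentence):
    """Step S12: replace every span matched by a preset rule with its tag."""
    for tag, pattern in PRESET_RULES.items():
        sentence = pattern.sub(f"<{tag}>", sentence)
    return sentence


def extract_rules(samples, min_count=2):
    """Step S13: count annotated templates and keep the frequent ones
    that contain at least one target tag."""
    counts = Counter(annotate(s) for s in samples)
    return [
        template
        for template, n in counts.most_common()
        if n >= min_count and any(f"<{t}>" in template for t in TARGET_TAGS)
    ]


samples = [
    "The phone costs 1999 yuan",
    "The phone costs 2499 yuan",
    "Released on 2013-05-16",
    "Nothing to see here",
]
print(extract_rules(samples))
```

In this toy sample only the price template occurs twice, so it is the only rule extracted; the date template is seen once and falls below the assumed threshold.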
After step S11, that is, after the target text sample is obtained, the method further comprises: segmenting the target text sample to obtain a set of sentences, and analyzing the set of sentences according to the preset rule set and the target rule set.
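The segmentation step can be sketched as follows. The patent does not say how sentences are delimited, so this example assumes that a sentence ends at Chinese or Western terminal punctuation; the function name `split_sentences` is hypothetical.

```python
import re

# A sentence is a run of non-terminator characters followed by any
# terminators (Chinese or Western) that close it.
_SENTENCE = re.compile(r"[^。！？.!?]+[。！？.!?]*")


def split_sentences(text):
    """Segment a text sample into a list of trimmed, non-empty sentences."""
    return [s.strip() for s in _SENTENCE.findall(text) if s.strip()]


print(split_sentences("Hello world. How are you? 好的。谢谢！"))
```

This naive splitter also breaks on decimal points and abbreviations (for example "3.5"), so a practical system would need a more careful segmenter.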
In step S2, performing data matching on the target text according to the natural language rules and obtaining the semantic information of the target text comprises:
performing sentence-level and paragraph-level data matching on the target text according to the natural language rules, and then performing semantic induction on the matching results to obtain the semantic information of the target text.
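One possible reading of step S2 is sketched below in Python. The rule format, the paragraph delimiter (a blank line), and the form of the induced result are all assumptions, since the patent leaves them open. Each hypothetical rule is a regular expression with named groups, so that a successful match yields the extracted slots directly, in the spirit of the stated unification of matching and extraction.

```python
import re

# Hypothetical rules: a label plus a pattern whose named groups are the slots.
RULES = [
    ("price", re.compile(r"costs (?P<amount>\d+) (?P<currency>yuan|USD)")),
    ("release", re.compile(r"released on (?P<date>\d{4}-\d{2}-\d{2})", re.IGNORECASE)),
]


def match_sentence(sentence):
    """Sentence level: return (label, slots) for every rule that matches."""
    hits = []
    for label, pattern in RULES:
        m = pattern.search(sentence)
        if m:
            hits.append((label, m.groupdict()))
    return hits


def match_paragraph(paragraph):
    """Paragraph level: collect the sentence-level hits of one paragraph."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", paragraph) if s]
    return [hit for s in sentences for hit in match_sentence(s)]


def induce(text):
    """Semantic induction (assumed form): group all slots by rule label."""
    info = {}
    for paragraph in text.split("\n\n"):
        for label, slots in match_paragraph(paragraph):
            info.setdefault(label, []).append(slots)
    return info


text = "The phone costs 1999 yuan. It was released on 2013-05-16.\n\nShipping is free."
print(induce(text))
```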
Fig. 2 shows a schematic structural diagram of the text semantic information analysis system proposed by the present invention.
With reference to Fig. 2, the text semantic information analysis system proposed by the present invention comprises:
a rule establishing module, configured to establish natural language rules according to a preset rule set and a target rule set;
a data matching module, connected to the rule establishing module and configured to perform data matching on a target text according to the natural language rules;
a semantic information acquisition module, connected to the data matching module and configured to obtain the semantic information of the target text according to the matching results.
Further, the rule establishing module is specifically configured to: obtain a target text sample, annotate the target text sample according to the preset rule set, compile statistics on the annotated target text sample according to the target rule set, extract a collection of target rules, and establish the natural language rules from that collection.
Further, the rule establishing module is further configured to: segment the target text sample to obtain a set of sentences, and analyze the set of sentences according to the preset rule set and the target rule set.
Further, the data matching module is specifically configured to perform sentence-level and paragraph-level data matching on the target text according to the natural language rules; the semantic information acquisition module is specifically configured to perform semantic induction on the matching results to obtain the semantic information of the target text.
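The connection between the three modules can be sketched as a chain of small Python classes. This is only an illustration of the wiring: the class names, the `{slot}` template syntax, and the reduction of the rule establishing module to template compilation are assumptions, not the implementation described by the patent.

```python
import re


class RuleEstablishingModule:
    """Builds matching rules from a preset rule set and a target rule set."""

    def __init__(self, preset_rules, target_rules):
        self.preset_rules = preset_rules  # slot name -> regex fragment
        self.target_rules = target_rules  # templates with {slot} placeholders

    def build(self):
        compiled = []
        for template in self.target_rules:
            # Escape the literal text, then turn each {slot} into a named group.
            pattern = re.escape(template)
            for slot, fragment in self.preset_rules.items():
                placeholder = re.escape("{" + slot + "}")
                pattern = pattern.replace(placeholder, f"(?P<{slot}>{fragment})")
            compiled.append(re.compile(pattern))
        return compiled


class DataMatchingModule:
    """Connected to the rule module; matches the rules against a text."""

    def __init__(self, rule_module):
        self.patterns = rule_module.build()

    def match(self, text):
        return [m for p in self.patterns for m in p.finditer(text)]


class SemanticInfoModule:
    """Connected to the matching module; turns matches into slot records."""

    def __init__(self, matching_module):
        self.matching_module = matching_module

    def analyze(self, text):
        return [m.groupdict() for m in self.matching_module.match(text)]


rules = RuleEstablishingModule({"amount": r"\d+"}, ["costs {amount} yuan"])
analyzer = SemanticInfoModule(DataMatchingModule(rules))
print(analyzer.analyze("It costs 1999 yuan, the other costs 2499 yuan."))
```

Each module holds a reference only to the module it is connected to, mirroring the chain of connections in Fig. 2.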
In the present invention, natural language rules are established according to a preset rule set and a target rule set, and data matching is performed on the target text according to those rules to obtain the semantic information of the target text, thereby achieving text semantic information analysis based on natural language rules. The natural language rules are easy to understand and write and are naturally composable; data matching and information extraction are unified in a single process; the accuracy of the semantic analysis is high; the natural language rules are highly reusable; and the model is highly general.
The above is merely a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent replacement or modification made by a person skilled in the art, within the technical scope disclosed by the present invention and according to its technical solutions and inventive concept, shall fall within the scope of protection of the present invention.

Claims (8)

1. A text semantic information analysis method, characterized by comprising:
establishing natural language rules according to a preset rule set and a target rule set;
performing data matching on a target text according to the natural language rules and obtaining the semantic information of the target text.
2. The text semantic information analysis method according to claim 1, characterized in that establishing the natural language rules according to the preset rule set and the target rule set comprises:
obtaining a target text sample;
annotating the target text sample according to the preset rule set;
compiling statistics on the annotated target text sample according to the target rule set, extracting a collection of target rules, and establishing the natural language rules from that collection.
3. The text semantic information analysis method according to claim 2, characterized in that, after the target text sample is obtained, the method further comprises: segmenting the target text sample to obtain a set of sentences, and analyzing the set of sentences according to the preset rule set and the target rule set.
4. The text semantic information analysis method according to any one of claims 1-3, characterized in that performing data matching on the target text according to the natural language rules and obtaining the semantic information of the target text comprises: performing sentence-level and paragraph-level data matching on the target text according to the natural language rules, and then performing semantic induction on the matching results to obtain the semantic information of the target text.
5. A text semantic information analysis system, characterized by comprising:
a rule establishing module, configured to establish natural language rules according to a preset rule set and a target rule set;
a data matching module, connected to the rule establishing module and configured to perform data matching on a target text according to the natural language rules;
a semantic information acquisition module, connected to the data matching module and configured to obtain the semantic information of the target text according to the matching results.
6. The text semantic information analysis system according to claim 5, characterized in that the rule establishing module is specifically configured to: obtain a target text sample, annotate the target text sample according to the preset rule set, compile statistics on the annotated target text sample according to the target rule set, extract a collection of target rules, and establish the natural language rules from that collection.
7. The text semantic information analysis system according to claim 6, characterized in that the rule establishing module is further configured to: segment the target text sample to obtain a set of sentences, and analyze the set of sentences according to the preset rule set and the target rule set.
8. The text semantic information analysis system according to claim 5, characterized in that the data matching module is specifically configured to perform sentence-level and paragraph-level data matching on the target text according to the natural language rules; and the semantic information acquisition module is specifically configured to perform semantic induction on the matching results to obtain the semantic information of the target text.
CN2013101822827A 2013-05-16 2013-05-16 Text semantic information analyzing system and method Pending CN103246641A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2013101822827A CN103246641A (en) 2013-05-16 2013-05-16 Text semantic information analyzing system and method

Publications (1)

Publication Number Publication Date
CN103246641A true CN103246641A (en) 2013-08-14

Family

ID=48926168

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2013101822827A Pending CN103246641A (en) 2013-05-16 2013-05-16 Text semantic information analyzing system and method

Country Status (1)

Country Link
CN (1) CN103246641A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101330432A (en) * 2007-06-18 2008-12-24 阿里巴巴集团控股有限公司 System and method for implementing on-line QA
CN102439590A (en) * 2009-03-13 2012-05-02 发明机器公司 System and method for automatic semantic labeling of natural language texts
CN102567304A (en) * 2010-12-24 2012-07-11 北大方正集团有限公司 Filtering method and device for network malicious information
CN102799577A (en) * 2012-08-17 2012-11-28 苏州大学 Extraction method of semantic relation between Chinese entities
CN102866990A (en) * 2012-08-20 2013-01-09 北京搜狗信息服务有限公司 Thematic conversation method and device

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
张莉等: "领域本体半自动化建模工具的设计与实现", 《计算机与数字工程》 *
段宇锋等: "基于自主学习规则的中文物种描述文本的语义标注研究", 《现代图书情报技术》 *
沙丽华: "面向领域文档的语义标注方法研究", <中国优秀硕士学位论文全文数据库信息科技辑(月刊)> *

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104199803A (en) * 2014-07-21 2014-12-10 安徽华贞信息科技有限公司 Text information processing system and method based on combinational theory
CN104199803B (en) * 2014-07-21 2017-10-13 安徽华贞信息科技有限公司 A kind of text information processing system and method based on combinatorial theory
CN104166682A (en) * 2014-07-21 2014-11-26 安徽华贞信息科技有限公司 Method and system for extracting natural-language-like semantic information on the basis combinatorial theory
CN104166682B (en) * 2014-07-21 2018-05-01 安徽华贞信息科技有限公司 The semantic information abstracting method and system of a kind of natural language based on combinatorial theory
CN106815204A (en) * 2015-12-01 2017-06-09 北京国双科技有限公司 The segmentation method and device of judgement document
CN106469192A (en) * 2016-08-30 2017-03-01 北京奇艺世纪科技有限公司 A kind of determination method and device of text relevant
CN106469192B (en) * 2016-08-30 2021-07-30 北京奇艺世纪科技有限公司 Text relevance determining method and device
CN106649278B (en) * 2016-12-30 2019-11-15 三星电子(中国)研发中心 Extend the method and system of spoken dialogue system corpus
CN106649278A (en) * 2016-12-30 2017-05-10 三星电子(中国)研发中心 Method and system for extending spoken language dialogue system corpora
CN107608949A (en) * 2017-10-16 2018-01-19 北京神州泰岳软件股份有限公司 A kind of Text Information Extraction method and device based on semantic model
US11704505B2 (en) 2017-12-23 2023-07-18 Huawei Technologies Co., Ltd. Language processing method and device
CN108319586A (en) * 2018-01-31 2018-07-24 天闻数媒科技(北京)有限公司 A kind of generation of information extraction rule and semantic analysis method and device
CN108319586B (en) * 2018-01-31 2021-09-24 天闻数媒科技(北京)有限公司 Information extraction rule generation and semantic analysis method and device
CN109753659A (en) * 2018-12-28 2019-05-14 北京猎户星空科技有限公司 Semantic processes method, apparatus, electronic equipment and storage medium
CN109753659B (en) * 2018-12-28 2023-08-04 北京猎户星空科技有限公司 Semantic processing method, semantic processing device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN103246641A (en) Text semantic information analyzing system and method
CN103077164A (en) Text analysis method and text analyzer
CN103853834B (en) Text structure analysis-based Web document abstract generation method
CN103838796A (en) Webpage structured information extraction method
CN102693279B (en) Method, device and system for fast calculating comment similarity
JP2016508264A5 (en)
CN103514171B (en) Optically-based character recognition and the self-defined reptile method of vertical search
CN105183742A (en) Resume identification method
CN102254027A (en) Method for obtaining webpage contents in batch
CN103020044A (en) Machine-aided webpage translation method and system thereof
CN105068990B (en) A kind of English long sentence dividing method of more strategies of Machine oriented translation
CN105045847A (en) Method for extracting Chinese institutional unit name from text information
CN103559181A (en) Establishment method and system for bilingual semantic relation classification model
CN105528357A (en) Webpage content extraction method based on similarity of URLs and similarity of webpage document structures
CN107436931B (en) Webpage text extraction method and device
CN103309851B (en) The rubbish recognition methods of short text and system
CN110008473A (en) A kind of medical text name Entity recognition mask method based on alternative manner
CN102999523A (en) Intelligence digitizing method
CN103678284A (en) Method and device for translating page characters
CN102750374A (en) Data tracing and influence relationship analysis method based on database script
CN103377207B (en) Microblog users relation acquisition method based on script engine
CN111723297B (en) Dual-semantic similarity judging method for grid society situation research and judgment
CN108205542A (en) A kind of analysis method and system of song comment
CN103761222A (en) Semantic-analysis-algorithm pseudo-original identification method
CN103116448A (en) Extract method for visualizing information

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20130814
