CN103246641A - Text semantic information analyzing system and method - Google Patents
- Publication number: CN103246641A
- Authority: CN (China)
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abstract
The invention discloses a system and method for analyzing the semantic information of text. Quasi-natural-language rules are established according to a preset rule set and a target rule set, and data matching is performed on a target text according to these rules to obtain the semantic information of the target text, so that text semantic information analysis is carried out on the basis of quasi-natural-language rules. The quasi-natural-language rules are easy to understand and write and are naturally composable; data matching and information extraction are unified in one process; the semantic analysis is highly accurate; the rules are highly reusable; and the model is highly general.
Description
Technical field
The present invention relates to the technical field of information analysis, and in particular to a text semantic information extraction system and method based on natural language rules.
Background art
Preprocessing is an extremely important step before a text is processed semantically, and the quality of the preprocessing directly determines the result of the text semantic analysis. The preprocessing of Internet text has its own particularities: on the one hand, Internet text is relatively loosely organized and contains a large amount of interfering text; on the other hand, its semi-structured nature also provides relatively rich semantic information.
Therefore, in the prior art, how to make effective use of the text information during text semantic processing remains an open problem, and solving it would greatly facilitate text semantic analysis.
Summary of the invention
In view of the problems existing in the prior art, the object of the present invention is to provide a text semantic information extraction system and method.
To achieve this object, the present invention adopts the following technical solutions:
A text semantic information analysis method, comprising:
establishing natural language rules according to a preset rule set and a target rule set;
performing data matching on a target text according to the natural language rules and obtaining the semantic information of the target text.
Preferably, establishing the natural language rules according to the preset rule set and the target rule set comprises:
obtaining a target text sample;
annotating the target text sample according to the preset rule set;
performing statistics on the annotated target text sample according to the target rule set, extracting a target rule collection, and establishing the natural language rules according to the target rule collection.
Preferably, after the target text sample is obtained, the method further comprises: segmenting the target text sample to obtain a set of sentences, and analyzing the set of sentences according to the preset rule set and the target rule set.
Preferably, performing data matching on the target text according to the natural language rules and obtaining the semantic information of the target text comprises: performing sentence-level and paragraph-level data matching on the target text according to the natural language rules, and then performing semantic induction on the matching results to obtain the semantic information of the target text.
A text semantic information analysis system, comprising:
a rule establishing module, configured to establish natural language rules according to a preset rule set and a target rule set;
a data matching module, connected to the rule establishing module and configured to perform data matching on a target text according to the natural language rules;
a semantic information acquisition module, connected to the data matching module and configured to obtain the semantic information of the target text according to the matching results.
Preferably, the rule establishing module is specifically configured to: obtain a target text sample, annotate the target text sample according to the preset rule set, perform statistics on the annotated target text sample according to the target rule set, extract a target rule collection, and establish the natural language rules according to the target rule collection.
Preferably, the rule establishing module is further configured to: segment the target text sample to obtain a set of sentences, and analyze the set of sentences according to the preset rule set and the target rule set.
Preferably, the data matching module is specifically configured to perform sentence-level and paragraph-level data matching on the target text according to the natural language rules; the semantic information acquisition module is specifically configured to perform semantic induction on the matching results to obtain the semantic information of the target text.
Based on the above technical solutions, the present invention has the following beneficial effects:
In the present invention, natural language rules are established according to a preset rule set and a target rule set, and data matching is performed on a target text according to these rules to obtain the semantic information of the target text, so that text semantic information analysis is carried out on the basis of natural language rules. The natural language rules are easy to understand and write and are naturally composable; data matching and information extraction are unified in one process; the semantic analysis is highly accurate; the rules are highly reusable; and the model is highly general.
Brief description of the drawings
Fig. 1 is a schematic flowchart of the text semantic information analysis method proposed by the present invention.
Fig. 2 is a schematic structural diagram of the text semantic information analysis system proposed by the present invention.
Detailed description of the embodiments
Fig. 1 is a schematic flowchart of the text semantic information analysis method proposed by the present invention.
Referring to Fig. 1, the text semantic information analysis method proposed by the present invention comprises:
Step S1: establishing natural language rules according to a preset rule set and a target rule set;
Step S2: performing data matching on a target text according to the natural language rules and obtaining the semantic information of the target text.
In step S1, establishing the natural language rules according to the preset rule set and the target rule set comprises:
Step S11: obtaining a target text sample;
Step S12: annotating the target text sample according to the preset rule set;
Step S13: performing statistics on the annotated target text sample according to the target rule set, extracting a target rule collection, and establishing the natural language rules according to the target rule collection.
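As a non-limiting illustration, steps S11 to S13 may be sketched as follows. The present disclosure does not specify the format of either rule set, so the sketch assumes that the preset rule set maps a label to a regular expression, that the target rule set is a set of labels of interest, and that "statistics" means counting how often each annotated pattern occurs. The names `PRESET_RULES`, `annotate`, `build_rules`, `target_labels`, and `min_support`, as well as the sample sentences, are hypothetical.

```python
import re
from collections import Counter

# Assumed preset rule set: label -> regular expression (hypothetical format).
PRESET_RULES = {
    "PRICE": r"\d+(?:\.\d+)? yuan",
    "PRODUCT": r"phone|laptop",
}

def annotate(sentence, preset_rules):
    """Step S12 (assumed form): replace each matched span with its label."""
    for label, pattern in preset_rules.items():
        sentence = re.sub(pattern, f"<{label}>", sentence)
    return sentence

def build_rules(samples, preset_rules, target_labels, min_support=2):
    """Step S13 (assumed form): count annotated patterns that mention a
    target label and keep those seen at least min_support times."""
    counts = Counter()
    for sentence in samples:
        annotated = annotate(sentence, preset_rules)
        if any(f"<{label}>" in annotated for label in target_labels):
            counts[annotated] += 1
    return [pattern for pattern, n in counts.items() if n >= min_support]

# Step S11: target text samples (made-up data for illustration).
samples = [
    "the phone costs 1999 yuan",
    "the laptop costs 5999.5 yuan",
    "shipping is free",
]
rules = build_rules(samples, PRESET_RULES, target_labels={"PRICE"})
print(rules)
```

Under these assumptions, the two price sentences collapse into one template with labeled slots, which is the kind of readable, composable rule the description refers to; the sentence without a target label contributes nothing.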
After the target text sample is obtained in step S11, the method further comprises: segmenting the target text sample to obtain a set of sentences, and analyzing the set of sentences according to the preset rule set and the target rule set.
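The segmentation of a target text sample into a set of sentences may, for example, be sketched as below. The disclosure does not name a segmentation technique; this sketch assumes a simple split on Chinese and Western sentence-ending punctuation, and `split_sentences` is a hypothetical helper. It is deliberately naive: decimals and abbreviations containing a period would be split incorrectly.

```python
import re

# A run of non-terminator characters, optionally followed by one terminator.
_SENTENCE = re.compile(r"[^。！？.!?]+[。！？.!?]?")

def split_sentences(text):
    """Split text into sentences on Chinese and Western end punctuation."""
    return [s.strip() for s in _SENTENCE.findall(text) if s.strip()]

sentences = split_sentences(
    "The phone costs 1999 yuan. Shipping is free! 价格很低。质量好。"
)
print(sentences)
```

Each resulting sentence can then be analyzed against the preset rule set and the target rule set.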
In step S2, performing data matching on the target text according to the natural language rules and obtaining the semantic information of the target text comprises:
performing sentence-level and paragraph-level data matching on the target text according to the natural language rules, and then performing semantic induction on the matching results to obtain the semantic information of the target text.
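One possible reading of step S2 is sketched below. It assumes that a rule is a template with labeled slots such as `the <PRODUCT> costs <PRICE>`, that paragraphs are separated by blank lines, that paragraph-level matching amounts to collecting the sentence-level hits of each paragraph, and that semantic induction merges the captured slot values into one record. None of these details is stated in the disclosure, and `SLOT_PATTERNS`, `compile_rule`, `match_paragraph`, and `induce` are hypothetical names.

```python
import re

# Assumed slot patterns, keyed by label (hypothetical).
SLOT_PATTERNS = {"PRICE": r"\d+(?:\.\d+)? yuan", "PRODUCT": r"phone|laptop"}

def compile_rule(rule):
    """Turn a template into a regex whose named groups capture the slots.
    Assumes each label occurs at most once per rule."""
    parts = re.split(r"<(\w+)>", rule)  # literals at even indices, labels at odd
    regex = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            regex += re.escape(part)
        else:
            regex += f"(?P<{part}>{SLOT_PATTERNS[part]})"
    return re.compile(regex)

def match_paragraph(paragraph, rules):
    """Sentence-level matching; hits are collected per paragraph."""
    hits = []
    for sentence in re.findall(r"[^.!?]+[.!?]?", paragraph):
        for rule in rules:
            m = rule.search(sentence)
            if m:
                hits.append(m.groupdict())
    return hits

def induce(paragraph_hits):
    """Semantic induction (assumed form): merge slot values across paragraphs."""
    summary = {}
    for hits in paragraph_hits:
        for hit in hits:
            for slot, value in hit.items():
                summary.setdefault(slot, []).append(value)
    return summary

text = "the phone costs 1999 yuan. shipping is free.\n\nthe laptop costs 5999 yuan."
rules = [compile_rule("the <PRODUCT> costs <PRICE>")]
paragraph_hits = [match_paragraph(p, rules) for p in text.split("\n\n")]
semantic_info = induce(paragraph_hits)
print(semantic_info)
```

In this sketch the same compiled rule both detects a matching sentence and extracts the slot values, which illustrates how data matching and information extraction can be unified in one pass.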
Fig. 2 is a schematic structural diagram of the text semantic information analysis system proposed by the present invention.
Referring to Fig. 2, the text semantic information analysis system proposed by the present invention comprises:
a rule establishing module, configured to establish natural language rules according to a preset rule set and a target rule set;
a data matching module, connected to the rule establishing module and configured to perform data matching on a target text according to the natural language rules;
a semantic information acquisition module, connected to the data matching module and configured to obtain the semantic information of the target text according to the matching results.
Further, the rule establishing module is specifically configured to: obtain a target text sample, annotate the target text sample according to the preset rule set, perform statistics on the annotated target text sample according to the target rule set, extract a target rule collection, and establish the natural language rules according to the target rule collection.
Further, the rule establishing module is further configured to: segment the target text sample to obtain a set of sentences, and analyze the set of sentences according to the preset rule set and the target rule set.
Further, the data matching module is specifically configured to perform sentence-level and paragraph-level data matching on the target text according to the natural language rules; the semantic information acquisition module is specifically configured to perform semantic induction on the matching results to obtain the semantic information of the target text.
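The connection between the three modules of Fig. 2 may be illustrated by the following sketch, in which each module holds a reference to the module it is connected to. The class names, the method names, and the simplified rule format (label to regular expression, filtered by a set of target labels) are assumptions made for illustration only and are not taken from the disclosure.

```python
import re
from dataclasses import dataclass

@dataclass
class RuleBuildingModule:
    preset_rules: dict    # assumed: label -> regular expression
    target_labels: set    # assumed: labels of interest

    def build(self):
        """Return compiled rules for the target labels only."""
        return {label: re.compile(self.preset_rules[label])
                for label in sorted(self.target_labels)}

@dataclass
class DataMatchingModule:
    rule_module: RuleBuildingModule  # connected to the rule establishing module

    def match(self, text):
        """Return (label, matched text) pairs found in the target text."""
        rules = self.rule_module.build()
        return [(label, m.group(0))
                for label, rx in rules.items()
                for m in rx.finditer(text)]

@dataclass
class SemanticInfoModule:
    matching_module: DataMatchingModule  # connected to the data matching module

    def analyze(self, text):
        """Group the matching results by label as the semantic information."""
        info = {}
        for label, value in self.matching_module.match(text):
            info.setdefault(label, []).append(value)
        return info

rule_module = RuleBuildingModule(
    {"PRICE": r"\d+ yuan", "PRODUCT": r"phone|laptop"}, {"PRICE"}
)
system = SemanticInfoModule(DataMatchingModule(rule_module))
result = system.analyze("the phone costs 1999 yuan, the laptop 5999 yuan")
print(result)
```

Chaining the modules by reference mirrors the "connected to" relationships in the structural diagram: the semantic information module only sees matching results, and the matching module only sees rules.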
In the present invention, natural language rules are established according to a preset rule set and a target rule set, and data matching is performed on a target text according to these rules to obtain the semantic information of the target text, so that text semantic information analysis is carried out on the basis of natural language rules. The natural language rules are easy to understand and write and are naturally composable; data matching and information extraction are unified in one process; the semantic analysis is highly accurate; the rules are highly reusable; and the model is highly general.
The above is only a preferred embodiment of the present invention, and the scope of protection of the present invention is not limited thereto. Any equivalent replacement or change made by a person skilled in the art, within the technical scope disclosed by the present invention and according to the technical solutions and inventive concept of the present invention, shall fall within the scope of protection of the present invention.
Claims (8)
1. A text semantic information analysis method, characterized by comprising:
establishing natural language rules according to a preset rule set and a target rule set;
performing data matching on a target text according to the natural language rules and obtaining the semantic information of the target text.
2. The text semantic information analysis method according to claim 1, characterized in that establishing the natural language rules according to the preset rule set and the target rule set comprises:
obtaining a target text sample;
annotating the target text sample according to the preset rule set;
performing statistics on the annotated target text sample according to the target rule set, extracting a target rule collection, and establishing the natural language rules according to the target rule collection.
3. The text semantic information analysis method according to claim 2, characterized in that, after the target text sample is obtained, the method further comprises: segmenting the target text sample to obtain a set of sentences, and analyzing the set of sentences according to the preset rule set and the target rule set.
4. The text semantic information analysis method according to any one of claims 1 to 3, characterized in that performing data matching on the target text according to the natural language rules and obtaining the semantic information of the target text comprises: performing sentence-level and paragraph-level data matching on the target text according to the natural language rules, and then performing semantic induction on the matching results to obtain the semantic information of the target text.
5. A text semantic information analysis system, characterized by comprising:
a rule establishing module, configured to establish natural language rules according to a preset rule set and a target rule set;
a data matching module, connected to the rule establishing module and configured to perform data matching on a target text according to the natural language rules;
a semantic information acquisition module, connected to the data matching module and configured to obtain the semantic information of the target text according to the matching results.
6. The text semantic information analysis system according to claim 5, characterized in that the rule establishing module is specifically configured to: obtain a target text sample, annotate the target text sample according to the preset rule set, perform statistics on the annotated target text sample according to the target rule set, extract a target rule collection, and establish the natural language rules according to the target rule collection.
7. The text semantic information analysis system according to claim 6, characterized in that the rule establishing module is further configured to: segment the target text sample to obtain a set of sentences, and analyze the set of sentences according to the preset rule set and the target rule set.
8. The text semantic information analysis system according to claim 5, characterized in that the data matching module is specifically configured to perform sentence-level and paragraph-level data matching on the target text according to the natural language rules; and the semantic information acquisition module is specifically configured to perform semantic induction on the matching results to obtain the semantic information of the target text.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2013101822827A CN103246641A (en) | 2013-05-16 | 2013-05-16 | Text semantic information analyzing system and method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2013101822827A CN103246641A (en) | 2013-05-16 | 2013-05-16 | Text semantic information analyzing system and method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN103246641A (en) | 2013-08-14 |
Family
ID=48926168
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2013101822827A Pending CN103246641A (en) | 2013-05-16 | 2013-05-16 | Text semantic information analyzing system and method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103246641A (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104166682A (en) * | 2014-07-21 | 2014-11-26 | 安徽华贞信息科技有限公司 | Method and system for extracting natural-language-like semantic information on the basis combinatorial theory |
CN104199803A (en) * | 2014-07-21 | 2014-12-10 | 安徽华贞信息科技有限公司 | Text information processing system and method based on combinational theory |
CN106469192A (en) * | 2016-08-30 | 2017-03-01 | 北京奇艺世纪科技有限公司 | A kind of determination method and device of text relevant |
CN106649278A (en) * | 2016-12-30 | 2017-05-10 | 三星电子(中国)研发中心 | Method and system for extending spoken language dialogue system corpora |
CN106815204A (en) * | 2015-12-01 | 2017-06-09 | 北京国双科技有限公司 | The segmentation method and device of judgement document |
CN107608949A (en) * | 2017-10-16 | 2018-01-19 | 北京神州泰岳软件股份有限公司 | A kind of Text Information Extraction method and device based on semantic model |
CN108319586A (en) * | 2018-01-31 | 2018-07-24 | 天闻数媒科技(北京)有限公司 | A kind of generation of information extraction rule and semantic analysis method and device |
CN109753659A (en) * | 2018-12-28 | 2019-05-14 | 北京猎户星空科技有限公司 | Semantic processes method, apparatus, electronic equipment and storage medium |
US11704505B2 (en) | 2017-12-23 | 2023-07-18 | Huawei Technologies Co., Ltd. | Language processing method and device |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101330432A (en) * | 2007-06-18 | 2008-12-24 | 阿里巴巴集团控股有限公司 | System and method for implementing on-line QA |
CN102439590A (en) * | 2009-03-13 | 2012-05-02 | 发明机器公司 | System and method for automatic semantic labeling of natural language texts |
CN102567304A (en) * | 2010-12-24 | 2012-07-11 | 北大方正集团有限公司 | Filtering method and device for network malicious information |
CN102799577A (en) * | 2012-08-17 | 2012-11-28 | 苏州大学 | Extraction method of semantic relation between Chinese entities |
CN102866990A (en) * | 2012-08-20 | 2013-01-09 | 北京搜狗信息服务有限公司 | Thematic conversation method and device |
Non-Patent Citations (3)
Title |
---|
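张莉等 (Zhang Li et al.): "领域本体半自动化建模工具的设计与实现" [Design and implementation of a semi-automatic modeling tool for domain ontologies], 《计算机与数字工程》 (Computer & Digital Engineering) * |
段宇锋等 (Duan Yufeng et al.): "基于自主学习规则的中文物种描述文本的语义标注研究" [Research on semantic annotation of Chinese species description texts based on self-learned rules], 《现代图书情报技术》 (New Technology of Library and Information Service) * |
沙丽华 (Sha Lihua): "面向领域文档的语义标注方法研究" [Research on semantic annotation methods for domain documents], 《中国优秀硕士学位论文全文数据库信息科技辑(月刊)》 (China Master's Theses Full-text Database, Information Science and Technology, monthly) * |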
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104199803A (en) * | 2014-07-21 | 2014-12-10 | 安徽华贞信息科技有限公司 | Text information processing system and method based on combinational theory |
CN104199803B (en) * | 2014-07-21 | 2017-10-13 | 安徽华贞信息科技有限公司 | A kind of text information processing system and method based on combinatorial theory |
CN104166682A (en) * | 2014-07-21 | 2014-11-26 | 安徽华贞信息科技有限公司 | Method and system for extracting natural-language-like semantic information on the basis combinatorial theory |
CN104166682B (en) * | 2014-07-21 | 2018-05-01 | 安徽华贞信息科技有限公司 | The semantic information abstracting method and system of a kind of natural language based on combinatorial theory |
CN106815204A (en) * | 2015-12-01 | 2017-06-09 | 北京国双科技有限公司 | The segmentation method and device of judgement document |
CN106469192A (en) * | 2016-08-30 | 2017-03-01 | 北京奇艺世纪科技有限公司 | A kind of determination method and device of text relevant |
CN106469192B (en) * | 2016-08-30 | 2021-07-30 | 北京奇艺世纪科技有限公司 | Text relevance determining method and device |
CN106649278B (en) * | 2016-12-30 | 2019-11-15 | 三星电子(中国)研发中心 | Extend the method and system of spoken dialogue system corpus |
CN106649278A (en) * | 2016-12-30 | 2017-05-10 | 三星电子(中国)研发中心 | Method and system for extending spoken language dialogue system corpora |
CN107608949A (en) * | 2017-10-16 | 2018-01-19 | 北京神州泰岳软件股份有限公司 | A kind of Text Information Extraction method and device based on semantic model |
US11704505B2 (en) | 2017-12-23 | 2023-07-18 | Huawei Technologies Co., Ltd. | Language processing method and device |
CN108319586A (en) * | 2018-01-31 | 2018-07-24 | 天闻数媒科技(北京)有限公司 | A kind of generation of information extraction rule and semantic analysis method and device |
CN108319586B (en) * | 2018-01-31 | 2021-09-24 | 天闻数媒科技(北京)有限公司 | Information extraction rule generation and semantic analysis method and device |
CN109753659A (en) * | 2018-12-28 | 2019-05-14 | 北京猎户星空科技有限公司 | Semantic processes method, apparatus, electronic equipment and storage medium |
CN109753659B (en) * | 2018-12-28 | 2023-08-04 | 北京猎户星空科技有限公司 | Semantic processing method, semantic processing device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20130814 |