CN104731768B - An event location extraction method for Chinese news text - Google Patents
- Publication number
- CN104731768B
- Authority
- CN
- China
- Prior art keywords
- news text
- event
- feature
- location
- place
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Machine Translation (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
An event location extraction method for Chinese news text. First, the ICTCLAS Chinese word segmentation tool segments a Chinese news text T, and the words tagged as organization names, locative nouns, or place names are selected to form the candidate event location set. Next, for each word in the candidate set, a three-dimensional feature vector is built from a context feature, a position feature, and a topological feature. Finally, a Random Forest classifier uses these feature vectors to classify each candidate word as either an event location or a non-event location, which yields the extracted event locations. The invention combines several kinds of features found in news text, and it goes one step beyond place name recognition by identifying where the news event actually took place.
Description
Technical field
The present invention relates to text mining, natural language processing, and public opinion analysis within computer science and technology, and in particular to an event location extraction method for Chinese news text.
Background
News text contains words and phrases such as organization names, locative nouns, and place names, but these are not necessarily where the event took place. For example, the news text "On June 19, 2012, during the G20 summit held in Los Cabos, Mexico, Argentine President Cristina handed British Prime Minister Cameron a letter concerning sovereignty over the Malvinas Islands" contains the three place names "Mexico", "Los Cabos", and "Malvinas Islands", yet "Malvinas Islands" is not where the event occurred. How to identify the event location among organization names, locative nouns, and place names is an open problem in event extraction.
One existing patent is related to event location extraction for Chinese news text. The patent "A place name recognition method and device" (publication number CN103186524A) discloses a method and apparatus for place name recognition. The method segments the character string to be recognized to obtain candidate words, looks up the category of each candidate word in an address name library, and then traverses the candidate words. If the current candidate word belongs to the first category, it is added to the candidate place name set as a place name. If it belongs to the second category, it is combined with the candidate words related to it in the address name library to form a compound word, and the compound word is added to the candidate place name set as a place name. However, that patent can only recognize place names in text; it cannot identify the event location.
Summary of the invention
To overcome the shortcomings of the prior art described above, the object of the invention is to provide an event location extraction method for Chinese news text. The method extracts a context feature, a position feature, and a topological feature from the news text to form a feature vector, and uses a Random Forest classifier to identify event locations among the organization names, locative nouns, and place names obtained by word segmentation. It can therefore go beyond place name recognition and identify where the news event took place.
To achieve the above object, the technical solution of the invention is as follows.
An event location extraction method for Chinese news text, comprising the following steps:
Step one: candidate event location extraction
(1) First, segment the Chinese news text T with the ICTCLAS Chinese word segmentation tool to produce a sequence of 2-tuples S_T = (w_1, p_1), (w_2, p_2), ..., (w_i, p_i), ..., (w_n, p_n), where n is the number of segmented words, n > 0, w_i is a word produced by ICTCLAS, and p_i is the part of speech of w_i;
(2) From S_T, select in order all 2-tuples that satisfy one of the three conditions p_i = "ni", p_i = "nl", or p_i = "ns", which indicate that the corresponding w_i is an organization name, a locative noun, or a place name respectively. The w_i of all selected 2-tuples form a set W_T = {w'_1, w'_2, ..., w'_j, ..., w'_m}, which serves as the candidate event location set; m is the number of words in W_T, m > 0;
Step two: feature vector construction
For each w'_j in W_T, three features are selected: the context feature of w'_j in news text T, the position feature of w'_j in T, and the topological feature of w'_j in T. The three features are computed as follows:
Feature one: the context feature c_j of w'_j in news text T
The context feature of w'_j in T is the weight of the regular expression that w'_j matches, denoted c_j;
(1) If w'_j matches one regular expression in the table of regular expressions in T, say the k-th, then c_j = a_k. Here a_k is the proportion of words matching the k-th regular expression that are event locations, computed as a_k = |L_k| / |S_k|, where S_k is the set of all words in the labeled data that match the k-th regular expression, L_k is the set of words in S_k that are event locations in the labeled data, and the labeled data is a collection of news texts whose event locations have been annotated manually;
(2) If w'_j matches several regular expressions in the table in T, say the k_1-th, k_2-th, ..., k_l-th (l > 1), then c_j is computed by a combining formula [formula missing in the source text];
(3) If w'_j matches no regular expression in the table in T, then c_j = 0.
Feature two: the position feature p_j of w'_j in news text T
p_j = loc(w'_j, T), where loc(w'_j, T) is the position at which the word w'_j first appears in T, that is, the number of characters between the beginning of T and the first occurrence of w'_j.
Feature three: the topological feature t_j of w'_j in news text T
Initialize an empty set E. For every 2-tuple (w'_j, w'_i) ∈ W_T × W_T, perform the following two steps:
STEP1: Concatenate the string "http://www.baike.com/wiki/" with w'_j to form a URL, and download the page at that URL; if the download fails, skip the 2-tuple (w'_j, w'_i);
STEP2: Match the page file against the regular expression
/<a[^>]*href=["']([^'">]*)['"][^>]*>(.*)</a>/ig to obtain all anchor texts; if any anchor text contains w'_i, assign the result of E ∪ {(w'_j, w'_i)} to E;
After the two steps have been performed for every 2-tuple (w'_j, w'_i) ∈ W_T × W_T, build the directed graph G whose node set is W_T and whose directed edge set is E;
For every w'_j ∈ W_T, compute its clustering coefficient C(w'_j), which measures how tightly the nodes are connected to one another [formula for C(w'_j) missing in the source text];
In that formula, G_Δ(w'_j) is the number of closed triplets in G that contain w'_j, and G_∧(w'_j) is the number of open triplets in G that contain w'_j. A closed triplet is any three nodes in the graph that are pairwise connected; an open triplet is three nodes in the graph connected by two edges;
Let the topological feature t_j of w'_j in T be C(w'_j), i.e. t_j = C(w'_j);
From the context feature, position feature, and topological feature of w'_j in T, build the three-dimensional feature vector (c_j, p_j, t_j) of w'_j.
Step three: event location recognition
Select 100-200 feature vectors and manually assign each a class label (event location or non-event location) to form a training data set. Train a classifier with the Random Forest algorithm, and use it to classify each w'_j in W_T as an event location or a non-event location, thereby extracting the event locations.
Based on the context, position, and topological features of words in news text, the invention builds a classifier that extracts event locations automatically.
Brief description of the drawings
Fig. 1 shows the event location extraction process for Chinese news text.
Fig. 2 is an example of a directed graph used to compute the topological feature t_j.
Detailed description
The invention is described in detail below with reference to the accompanying drawings.
As shown in the drawings, an embodiment of the method consists of three steps: candidate event location extraction, feature vector construction, and event location recognition. Each is described below:
Step one: candidate event location extraction
a) First, segment the Chinese news text T with the ICTCLAS Chinese word segmentation tool to produce a sequence of 2-tuples S_T = (w_1, p_1), (w_2, p_2), ..., (w_i, p_i), ..., (w_n, p_n), where n is the number of segmented words, n > 0, w_i is a word produced by ICTCLAS, and p_i is the part of speech of w_i;
b) From S_T, select in order all 2-tuples that satisfy one of the three conditions p_i = "ni", p_i = "nl", or p_i = "ns", which indicate that the corresponding w_i is an organization name, a locative noun, or a place name respectively. The w_i of all selected 2-tuples form a set W_T = {w'_1, w'_2, ..., w'_j, ..., w'_m}, which serves as the candidate event location set.
Taking the news text "On June 19, 2012, during the G20 summit held in Los Cabos, Mexico, Argentine President Cristina handed British Prime Minister Cameron a letter concerning sovereignty over the Malvinas Islands" as an example, the steps above generate the set {Mexico, Los Cabos, Argentina, Britain, Malvinas Islands}.
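The selection in step b) can be sketched in a few lines of Python. This is a minimal sketch, not the patent's implementation: it assumes the segmenter output is already available as (word, tag) pairs, it does not call ICTCLAS, the function name `candidate_locations` is hypothetical, and the tags "v" and "n" on the non-candidate words are illustrative. Dropping repeated words is also an assumption, made because W_T is described as a set.

```python
# Part-of-speech tags named in the text:
# "ni" organization name, "nl" locative noun, "ns" place name.
CANDIDATE_TAGS = {"ni", "nl", "ns"}


def candidate_locations(tagged_words):
    """Build W_T from S_T: keep words tagged ni/nl/ns, in order, once each."""
    seen = set()
    candidates = []
    for word, tag in tagged_words:
        if tag in CANDIDATE_TAGS and word not in seen:
            seen.add(word)
            candidates.append(word)
    return candidates


# Hypothetical segmenter output for part of the example sentence
# (English glosses stand in for the Chinese words).
tagged = [
    ("Mexico", "ns"), ("Los Cabos", "ns"), ("hold", "v"), ("summit", "n"),
    ("Argentina", "ns"), ("president", "n"), ("Britain", "ns"),
    ("prime minister", "n"), ("Malvinas Islands", "ns"),
]
print(candidate_locations(tagged))
```

The list keeps the order of first appearance, which makes the later per-word feature computation reproducible.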
Step two: feature vector construction
For each w'_j in W_T, three features are selected: the context feature of w'_j in news text T, the position feature of w'_j in T, and the topological feature of w'_j in T. The three features are computed as follows:
Feature one: the context feature c_j of w'_j in news text T
The context feature of w'_j in T is the weight of the regular expression that w'_j matches, denoted c_j:
a) If w'_j matches one regular expression in the table of regular expressions in T, say the k-th, then c_j = a_k. Here a_k is the proportion of words matching the k-th regular expression that are event locations, computed as a_k = |L_k| / |S_k|, where S_k is the set of all words in the labeled data that match the k-th regular expression, L_k is the set of words in S_k that are event locations in the labeled data, and the labeled data is a collection of news texts whose event locations have been annotated manually;
b) If w'_j matches several regular expressions in the table in T, say the k_1-th, k_2-th, ..., k_l-th (l > 1), then c_j is computed by a combining formula [formula missing in the source text];
c) If w'_j matches no regular expression in the table in T, then c_j = 0.
Taking the news text "On June 19, 2012, during the G20 summit held in Los Cabos, Mexico, Argentine President Cristina handed British Prime Minister Cameron a letter concerning sovereignty over the Malvinas Islands" as an example, "Mexico" in the set {Mexico, Los Cabos, Argentina, Britain, Malvinas Islands} matches the first regular expression, "^W+$", so its c_j is 0.64.
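The weight a_k = |L_k| / |S_k| can be illustrated with a short Python sketch. Several things here are assumptions rather than part of the source: the labeled data is represented as (matched pattern indices, is-event-location) pairs, each labeled word occurrence counts once, the function names are hypothetical, and the toy data is made up. The multi-pattern case is deliberately left unimplemented because its combining formula does not survive in the text.

```python
from collections import Counter


def pattern_weights(labeled_words):
    """Return {k: a_k}, where a_k = |L_k| / |S_k|.

    labeled_words: iterable of (pattern_indices, is_event_location) pairs,
    one per labeled word, listing the patterns that word matches.
    """
    matched = Counter()   # |S_k|: words matching pattern k
    positive = Counter()  # |L_k|: those that are event locations
    for pattern_indices, is_event_location in labeled_words:
        for k in set(pattern_indices):
            matched[k] += 1
            if is_event_location:
                positive[k] += 1
    return {k: positive[k] / matched[k] for k in matched}


def context_feature(pattern_indices, weights):
    """Return c_j for a word that matches zero or one pattern."""
    indices = set(pattern_indices)
    if not indices:
        return 0.0                      # no pattern matched
    if len(indices) == 1:
        return weights.get(next(iter(indices)), 0.0)
    # The formula for several matching patterns is missing in the text.
    raise ValueError("combining rule for multiple patterns is not specified")


# Made-up labeled data: pattern 1 is matched by four words, three of them
# event locations; pattern 2 by two words, one of them an event location.
toy_labels = [([1], True), ([1], True), ([1], False), ([1, 2], True), ([2], False)]
weights = pattern_weights(toy_labels)
print(weights[1], weights[2])
```

Keeping the weights in a dictionary keyed by pattern index means the regular-expression table itself can be supplied separately.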
Feature two: the position feature p_j of w'_j in news text T
Statistical analysis of news texts with manually annotated event locations shows that the earlier an organization name, locative noun, or place name appears, the more likely it is to be the event location. The position feature is therefore introduced and defined as p_j = loc(w'_j, T), where loc(w'_j, T) is the position at which the word w'_j first appears in T, that is, the number of characters between the beginning of T and the first occurrence of w'_j.
Taking again the news text "On June 19, 2012, during the G20 summit held in Los Cabos, Mexico, Argentine President Cristina handed British Prime Minister Cameron a letter concerning sovereignty over the Malvinas Islands" as an example, "Mexico" first appears at position 12 in the text (a character count in the original Chinese text), so its p_j is 12.
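As a rough illustration, loc(w'_j, T) is the zero-based index of the first occurrence of the word, which Python's `str.find` returns directly. The Chinese string below is a back-translated rendering of the opening of the example sentence and is an assumption, as is the function name `position_feature`; returning None for a word that never occurs is a choice made for this sketch only.

```python
def position_feature(word, text):
    """Return p_j = loc(word, text): the number of characters that precede
    the first occurrence of word in text, or None if word does not occur."""
    index = text.find(word)
    return index if index >= 0 else None


# Assumed Chinese rendering of the start of the example sentence:
# "On June 19, 2012, during the G20 summit held in Los Cabos, Mexico".
text = "2012年6月19日，在墨西哥洛斯卡沃斯举行的G20峰会期间"
print(position_feature("墨西哥", text))  # 12
```

Twelve characters ("2012年6月19日，在") come before "墨西哥", which agrees with the position given for "Mexico" in the text.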
Feature three: the topological feature t_j of w'_j in news text T
Statistical analysis of news texts with manually annotated event locations shows that when an event has several locations, those locations are strongly correlated. Therefore, whether two places appear in each other's pages on Hudong Baike (the "Interactive Encyclopedia", www.baike.com) is used to describe whether they are correlated.
Initialize an empty set E. For every 2-tuple (w'_j, w'_i) ∈ W_T × W_T, perform the following two steps:
STEP1: Concatenate the string "http://www.baike.com/wiki/" with w'_j to form a URL, and download the page at that URL; if the download fails, skip the 2-tuple (w'_j, w'_i);
STEP2: Match the page file against the regular expression
/<a[^>]*href=["']([^'">]*)['"][^>]*>(.*)</a>/ig to obtain all anchor texts; if any anchor text contains w'_i, assign the result of E ∪ {(w'_j, w'_i)} to E.
After the two steps have been performed for every 2-tuple (w'_j, w'_i) ∈ W_T × W_T, build the directed graph G whose node set is W_T and whose directed edge set is E;
For the news text "On June 19, 2012, during the G20 summit held in Los Cabos, Mexico, Argentine President Cristina handed British Prime Minister Cameron a letter concerning sovereignty over the Malvinas Islands", this produces the directed graph shown in Fig. 2.
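The construction of E in STEP1 and STEP2 can be sketched as follows. This is a sketch under stated assumptions, not the patent's code: the pages are passed in as an already-downloaded dictionary so that nothing is fetched from www.baike.com, the function names are hypothetical, the sample HTML is made up, pairs of a word with itself are skipped, and the anchor-text group uses a non-greedy `(.*?)` instead of the greedy `(.*)` in the text so that several links on one line are separated.

```python
import re

# Python version of the anchor regular expression quoted in the text,
# case-insensitive like the original /i flag.
ANCHOR_RE = re.compile(
    r'''<a[^>]*href=["']([^'">]*)["'][^>]*>(.*?)</a>''',
    re.IGNORECASE | re.DOTALL,
)


def anchor_texts(page_html):
    """Return the anchor text of every link in the page."""
    return [match.group(2) for match in ANCHOR_RE.finditer(page_html)]


def build_edges(candidates, pages):
    """Return E: (source, target) pairs such that some anchor text on the
    page of source contains target. pages maps a word to its HTML, or has
    no entry when the download failed."""
    edges = set()
    for source in candidates:
        html = pages.get(source)
        if html is None:          # STEP1: page unavailable, skip this word
            continue
        texts = anchor_texts(html)
        for target in candidates:
            if target != source and any(target in t for t in texts):
                edges.add((source, target))   # STEP2: E = E ∪ {(source, target)}
    return edges


# Made-up pages standing in for downloaded encyclopedia entries.
pages = {
    "Mexico": '<p><a href="/wiki/a">Los Cabos</a>, <a href=\'/wiki/b\'>Argentina</a></p>',
    "Los Cabos": '<a class="link" href="/wiki/c">Mexico</a>',
}
candidates = ["Mexico", "Los Cabos", "Argentina", "Britain"]
edges = build_edges(candidates, pages)
print(sorted(edges))
```

Parsing each page once per source word, rather than once per pair, avoids repeating the regular-expression scan for every target.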
For every w'_j ∈ W_T, compute its clustering coefficient C(w'_j). The clustering coefficient measures how tightly nodes are connected to one another; here it measures the degree of correlation among the candidate event locations [formula for C(w'_j) missing in the source text];
In that formula, G_Δ(w'_j) is the number of closed triplets in G that contain w'_j, and G_∧(w'_j) is the number of open triplets in G that contain w'_j. A closed triplet is any three nodes in the graph that are pairwise connected; an open triplet is three nodes in the graph connected by two edges;
Let the topological feature t_j of w'_j in T be C(w'_j), i.e. t_j = C(w'_j).
According to Fig. 2, the topological feature of "Mexico" is 0.6, the value that appears in its feature vector (0.64, 12, 0.6).
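Counting the closed and open triplets that contain a node can be sketched as below. Because the formula for C(w'_j) does not survive in the text, the ratio used here, closed / (closed + open), is an assumption and may differ from the patent's formula (for example by a constant factor on the closed count). Treating the directed edges as undirected when counting is also an assumption, and the function names and the toy graph are hypothetical; the graph is not Fig. 2.

```python
from itertools import combinations


def triplet_counts(node, nodes, edges):
    """Count closed and open triplets that contain node.

    A triple of nodes is closed when all three pairs are linked and open
    when exactly two pairs are linked. Edge direction is ignored.
    """
    undirected = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    others = [n for n in nodes if n != node]
    closed = open_ = 0
    for a, b in combinations(others, 2):
        links = sum(
            frozenset(pair) in undirected
            for pair in ((node, a), (node, b), (a, b))
        )
        if links == 3:
            closed += 1
        elif links == 2:
            open_ += 1
    return closed, open_


def clustering_coefficient(node, nodes, edges):
    """Assumed ratio: closed triplets over all triplets containing node."""
    closed, open_ = triplet_counts(node, nodes, edges)
    total = closed + open_
    return closed / total if total else 0.0


# Toy directed graph: A, B, C form a triangle and D hangs off C.
toy_nodes = ["A", "B", "C", "D"]
toy_edges = {("A", "B"), ("B", "A"), ("B", "C"), ("C", "A"), ("C", "D")}
print(clustering_coefficient("A", toy_nodes, toy_edges))  # 0.5
```

In the toy graph, node A belongs to one closed triplet {A, B, C} and one open triplet {A, C, D}, which gives 0.5 under the assumed ratio.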
From the context feature, position feature, and topological feature of w'_j in T, build the three-dimensional feature vector (c_j, p_j, t_j) of w'_j.
For the news text "On June 19, 2012, during the G20 summit held in Los Cabos, Mexico, Argentine President Cristina handed British Prime Minister Cameron a letter concerning sovereignty over the Malvinas Islands", the feature vector of "Mexico" in this text is therefore (0.64, 12, 0.6).
Step three: event location recognition
Select 100-200 feature vectors and manually assign each a class label (event location or non-event location) to form a training data set.
Using the context feature, position feature, and topological feature of w'_j in T, train a classification model with the Random Forest algorithm. The model classifies each word in W_T as an event location or a non-event location, thereby extracting the event locations.
Claims (1)
1. An event location extraction method for Chinese news text, characterized in that it comprises the following steps:
Step one: candidate event location extraction
(1) First, segment the Chinese news text T with the ICTCLAS Chinese word segmentation tool to produce a sequence of 2-tuples S_T = (w_1, p_1), (w_2, p_2), ..., (w_i, p_i), ..., (w_n, p_n), where n is the number of segmented words, n > 0, w_i is a word produced by ICTCLAS, and p_i is the part of speech of w_i;
(2) From S_T, select in order all 2-tuples that satisfy one of the three conditions p_i = "ni", p_i = "nl", or p_i = "ns", which indicate that the corresponding w_i is an organization name, a locative noun, or a place name respectively; the w_i of all selected 2-tuples form a set W_T = {w'_1, w'_2, ..., w'_j, ..., w'_m}, which serves as the candidate event location set, where m is the number of words in W_T, m > 0;
Step two: feature vector construction
For each w'_j in W_T, three features are selected: the context feature of w'_j in news text T, the position feature of w'_j in T, and the topological feature of w'_j in T;
Step three: event location recognition
Select 100-200 feature vectors and manually assign each a class label (event location or non-event location) to form a training data set; train a classifier with the Random Forest algorithm, and use it to classify each w'_j in W_T as an event location or a non-event location, thereby extracting the event locations;
wherein the context feature of w'_j in news text T, the position feature of w'_j in T, and the topological feature of w'_j in T described in step two are computed as follows:
Feature one: the context feature c_j of w'_j in news text T
The context feature of w'_j in T is the weight of the regular expression that w'_j matches, denoted c_j;
(1) If w'_j matches one regular expression in the table of regular expressions in T, say the k-th, then c_j = a_k; a_k is the proportion of words matching the k-th regular expression that are event locations, computed as a_k = |L_k| / |S_k|, where S_k is the set of all words in the labeled data that match the k-th regular expression, L_k is the set of words in S_k that are event locations in the labeled data, and the labeled data is a collection of news texts whose event locations have been annotated manually;
(2) If w'_j matches several regular expressions in the table in T, say the k_1-th, k_2-th, ..., k_l-th (l > 1), then c_j is computed by a combining formula [formula missing in the source text];
(3) If w'_j matches no regular expression in the table in T, then c_j = 0;
Feature two: the position feature p_j of w'_j in news text T
p_j = loc(w'_j, T), where loc(w'_j, T) is the position at which the word w'_j first appears in T, that is, the number of characters between the beginning of T and the first occurrence of w'_j;
Feature three: the topological feature t_j of w'_j in news text T
Initialize an empty set E; for every 2-tuple (w'_j, w'_i) ∈ W_T × W_T, perform the following two steps:
STEP1: Concatenate the string "http://www.baike.com/wiki/" with w'_j to form a URL, and download the page at that URL; if the download fails, skip the 2-tuple (w'_j, w'_i);
STEP2: Match the page file against the regular expression
/<a[^>]*href=["']([^'">]*)['"][^>]*>(.*)</a>/ig to obtain all anchor texts; if any anchor text contains w'_i, assign the result of E ∪ {(w'_j, w'_i)} to E;
After the two steps have been performed for every 2-tuple (w'_j, w'_i) ∈ W_T × W_T, build the directed graph G whose node set is W_T and whose directed edge set is E;
For every w'_j ∈ W_T, compute its clustering coefficient C(w'_j), which measures how tightly the nodes are connected to one another [formula for C(w'_j) missing in the source text];
In that formula, G_Δ(w'_j) is the number of closed triplets in G that contain w'_j, and G_∧(w'_j) is the number of open triplets in G that contain w'_j; a closed triplet is any three nodes in the graph that are pairwise connected, and an open triplet is three nodes in the graph connected by two edges;
Let the topological feature t_j of w'_j in T be C(w'_j), i.e. t_j = C(w'_j);
From the context feature, position feature, and topological feature of w'_j in T, build the three-dimensional feature vector (c_j, p_j, t_j) of w'_j.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510097822.0A CN104731768B (en) | 2015-03-05 | 2015-03-05 | An event location extraction method for Chinese news text |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510097822.0A CN104731768B (en) | 2015-03-05 | 2015-03-05 | An event location extraction method for Chinese news text |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104731768A CN104731768A (en) | 2015-06-24 |
CN104731768B true CN104731768B (en) | 2017-10-20 |
Family
ID=53455666
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510097822.0A Expired - Fee Related CN104731768B (en) | 2015-03-05 | 2015-03-05 | An event location extraction method for Chinese news text |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104731768B (en) |
Families Citing this family (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105528421B (en) * | 2015-12-07 | 2018-09-04 | 中国人民大学 | A kind of search dimension method for digging for query word in mass data |
CN107016556B (en) * | 2016-01-27 | 2021-02-05 | 创新先进技术有限公司 | Data processing method and device |
CN110298039B (en) * | 2019-06-20 | 2023-05-30 | 北京百度网讯科技有限公司 | Event place identification method, system, equipment and computer readable storage medium |
CN111090994A (en) * | 2019-11-12 | 2020-05-01 | 北京信息科技大学 | Chinese-internet-forum-text-oriented event place attribution province identification method |
CN113127713A (en) * | 2019-12-30 | 2021-07-16 | 北大方正集团有限公司 | Method, device, equipment and storage medium for identifying news location |
CN113111169A (en) * | 2020-02-13 | 2021-07-13 | 北京明亿科技有限公司 | Deep learning model-based alarm receiving and processing text address information extraction method and device |
CN113111170A (en) * | 2020-02-13 | 2021-07-13 | 北京明亿科技有限公司 | Method and device for extracting alarm receiving and processing text track ground information based on deep learning model |
CN113111166A (en) * | 2020-02-13 | 2021-07-13 | 北京明亿科技有限公司 | Method and device for determining types of alarm receiving and processing places based on deep learning model |
CN111581954B (en) * | 2020-05-15 | 2023-06-09 | 中国人民解放军国防科技大学 | Text event extraction method and device based on grammar dependency information |
CN111914535B (en) * | 2020-07-31 | 2023-03-24 | 平安科技(深圳)有限公司 | Word recognition method and device, computer equipment and storage medium |
CN112307364B (en) * | 2020-11-25 | 2021-10-29 | 哈尔滨工业大学 | Character representation-oriented news text place extraction method |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103699568A (en) * | 2013-11-16 | 2014-04-02 | 西安交通大学城市学院 | Method for extracting hyponymy relation of field terms from wikipedia |
CN103902649A (en) * | 2014-02-17 | 2014-07-02 | 复旦大学 | Knowledge extraction method based on online encyclopedia link entities |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130086072A1 (en) * | 2011-10-03 | 2013-04-04 | Xerox Corporation | Method and system for extracting and classifying geolocation information utilizing electronic social media |
Non-Patent Citations (4)
Title |
---|
《随机森林特征选择》;王全才;《中国优秀硕士学位论文全文数据库 信息科技辑》;20120715;I138-1504 * |
事件报道中地点实体的提取研究;高燕;《中国优秀硕士学位论文全文数据库 信息科技辑》;20111215(第S1期);I141-139 * |
基于SimRank 的百度百科词条语义相似度计算;尹坤 等;《山东大学学报(工学版)》;20140317;第44卷(第3期);第0节,图1 * |
基于事件框架的突发事件信息抽取;冯礼;《中国优秀硕士学位论文全文数据库 信息科技辑》;20080715;第2.2.1、2.3.2、2.3.4节,图2、7 * |
Also Published As
Publication number | Publication date |
---|---|
CN104731768A (en) | 2015-06-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104731768B (en) | An event location extraction method for Chinese news text | |
CN106202010B (en) | Method and apparatus based on deep neural network building Law Text syntax tree | |
CN104391942B (en) | Short essay eigen extended method based on semantic collection of illustrative plates | |
CN104462066B (en) | Semantic character labeling method and device | |
CN106547739A (en) | A kind of text semantic similarity analysis method | |
JP2022534377A (en) | Text classification method, apparatus and computer readable storage medium | |
CN108121700A (en) | A kind of keyword extracting method, device and electronic equipment | |
CN103870000B (en) | The method and device that candidate item caused by a kind of pair of input method is ranked up | |
CN104809176A (en) | Entity relationship extracting method of Zang language | |
CN107766371A (en) | A kind of text message sorting technique and its device | |
CN107122349A (en) | A kind of feature word of text extracting method based on word2vec LDA models | |
CN106599032A (en) | Text event extraction method in combination of sparse coding and structural perceptron | |
CN112232058A (en) | False news identification method and system based on deep learning three-layer semantic extraction framework | |
CN103761239A (en) | Method for performing emotional tendency classification to microblog by using emoticons | |
CN103473380B (en) | A kind of computer version sensibility classification method | |
CN104361037B (en) | Microblogging sorting technique and device | |
CN110175221A (en) | Utilize the refuse messages recognition methods of term vector combination machine learning | |
CN109710769A (en) | A kind of waterborne troops's comment detection system and method based on capsule network | |
CN103577989A (en) | Method and system for information classification based on product identification | |
CN105893362A (en) | A method for acquiring knowledge point semantic vectors and a method and a system for determining correlative knowledge points | |
Nikhila et al. | Text imbalance handling and classification for cross-platform cyber-crime detection using deep learning | |
CN107679199A (en) | A kind of external the Chinese text readability analysis method based on depth local feature | |
Mahmud et al. | Deep learning based sentiment analysis from Bangla text using glove word embedding along with convolutional neural network | |
CN110008312A (en) | A kind of document writing assistant implementation method, system and electronic equipment | |
Sagcan et al. | Toponym recognition in social media for estimating the location of events |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20171020; Termination date: 20180305 |