CN109582958A - Disaster story line construction method and device - Google Patents

Disaster story line construction method and device

Info

Publication number
CN109582958A
Authority
CN
China
Prior art keywords
disaster
information
entity
specified
story line
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811382046.9A
Other languages
Chinese (zh)
Other versions
CN109582958B (en)
Inventor
周绮凤
倪进鑫
安超杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Research Institute of Xiamen University
Original Assignee
Shenzhen Research Institute of Xiamen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Research Institute of Xiamen University
Priority to CN201811382046.9A
Publication of CN109582958A
Application granted
Publication of CN109582958B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A10/00 TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE at coastal zones; at river basins
    • Y02A10/40 Controlling or monitoring, e.g. of flood or hurricane; Forecasting, e.g. risk assessment or mapping

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Business, Economics & Management (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Artificial Intelligence (AREA)
  • Economics (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention relates to a disaster story line construction method and device, belonging to the field of Semantic Web technology. The method comprises: collecting information related to a specified disaster; extracting triple entity information related to the specified disaster from the related information; extracting the relationships between the triple entity information; extracting the attributes of the specified disaster entities; and constructing the story line of the specified disaster according to the triple entity information, the relationships between the triple entity information, and the attributes of the specified disaster entities. The disaster story line is generated with a knowledge graph: knowledge graph construction techniques such as entity recognition, relation extraction, and attribute extraction extract useful information from news text to generate the disaster story line, which solves the prior-art problem that useful information cannot be extracted from massive information to construct a disaster story line.

Description

Disaster story line construction method and device
Technical field
The invention belongs to the field of Semantic Web technology, and in particular relates to a disaster story line construction method and device.
Background
The response to and handling of disaster events has always been an issue that demands society's attention. When a disaster occurs, the losses it causes can be reduced effectively if its pattern of development can be identified. At present, disaster-related information comes mostly from media news reports. If effective information can be extracted from these reports to reconstruct the complete course of a disaster's development, then when a disaster of the same type occurs again, targeted measures can be taken according to the evolution of that type of disaster, effectively reducing the losses it causes.
Currently, story line construction in the disaster domain mainly relies on document summarization. As technology advances, the growth of online information has accelerated sharply, and this rapid growth often leads to information overload. As a result, summarization-based disaster story line construction methods struggle to extract useful information from massive information quickly and accurately enough to construct a disaster story line.
Summary of the invention
To solve the technical problem in the prior art that useful information cannot be extracted from massive information to construct a disaster story line, the present invention provides a disaster story line construction method and device.
To achieve the above object, the present invention adopts the following technical solutions:
In one aspect, a disaster story line construction method is provided, the method comprising:
Collecting information related to a specified disaster;
Extracting triple entity information related to the specified disaster from the related information;
Extracting the relationships between the triple entity information;
Extracting the attributes of the specified disaster entities;
Constructing the story line of the specified disaster according to the triple entity information, the relationships between the triple entity information, and the attributes of the specified disaster entities.
Further optionally, collecting the information related to the specified disaster comprises:
Obtaining information related to the specified disaster from the Internet using web crawler technology;
Selecting preset target information from the crawled information.
Further optionally, selecting the preset target information from the crawled information comprises: selecting the preset target information from the crawled information using a network node importance measure based on degree and clustering coefficient.
Further optionally, extracting the triple entity information related to the specified disaster from the related information comprises: using a bidirectional recurrent neural network model combined with a conditional random field to extract disaster-related triple entity information from the preprocessed information.
Further optionally, extracting the relationships between the triple entity information comprises: using a bidirectional recurrent neural network model with an attention mechanism to extract the relationships between disaster entities.
Further optionally, extracting the attributes of the disaster entities comprises: using a Bootstrapping model to extract the attributes of the disaster entities.
Further optionally, constructing the story line of the specified disaster according to the triple entity information, the relationships between the triple entity information, and the attributes of the specified disaster entities comprises:
Constructing local disaster story lines;
Generating a global disaster story line.
Further optionally, constructing the local disaster story lines comprises:
Classifying by location entity to obtain the disaster entity relationships and disaster entity attributes for different locations;
Disambiguating the disaster entities;
Fusing the disaster attributes.
Further optionally, constructing the global disaster story line comprises:
Constructing a cost function that describes the degree of similarity between at least two local graphs;
Determining, according to the cost function, whether a directed edge connects the at least two local graphs;
Fusing the local graphs according to the cost function to construct the global story line;
When the number of the at least two local graphs is 2, the cost function includes:
In another aspect, a disaster story line construction device is provided, comprising: an information collection module, an entity information extraction module, an entity relation extraction module, an entity attribute extraction module, and a story line generation module;
The information collection module is used to collect information related to a specified disaster;
The entity information extraction module is used to extract triple entity information related to the specified disaster from the related information;
The entity relation extraction module is used to extract the relationships between the triple entity information;
The entity attribute extraction module is used to extract the attributes of the specified disaster entities;
The story line generation module is used to construct the story line of the specified disaster according to the triple entity information, the relationships between the triple entity information, and the attributes of the specified disaster entities.
In the embodiments of the present invention, the disaster story line is generated with a knowledge graph. Knowledge graph construction techniques such as entity recognition, relation extraction, and attribute extraction extract useful information from news text to generate the disaster story line, which solves the prior-art problem that useful information cannot be extracted from massive information to construct a disaster story line.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of one embodiment of the disaster story line construction method provided by the invention;
Fig. 2 is a diagram of the BLSTM-CRF model (bidirectional recurrent neural network combined with a conditional random field) used for named entity recognition in one embodiment of the disaster story line construction method provided by the invention;
Fig. 3 is a diagram of the Att-BLSTM model (bidirectional recurrent neural network with an attention mechanism) used for relation extraction in one embodiment of the disaster story line construction method provided by the invention;
Fig. 4 is a diagram of the Bootstrapping model used for attribute extraction in one embodiment of the disaster story line construction method provided by the invention;
Fig. 5 is a schematic diagram of a local story line in one embodiment of the disaster story line construction method provided by the invention;
Fig. 6 is a schematic diagram of the global story line in one embodiment of the disaster story line construction method provided by the invention;
Fig. 7 is a structural diagram of one embodiment of the disaster story line construction device provided by the invention.
Detailed description of the embodiments
To make the objects, technical solutions, and advantages of the present invention clearer, the technical solutions of the invention are described in detail below. Evidently, the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of the invention without creative effort fall within the scope of protection of the invention.
To illustrate the process and advantages of the method clearly, the present invention provides an embodiment of a disaster story line construction method.
Referring to Fig. 1, the method of this embodiment comprises:
Collecting information related to a specified disaster;
Extracting triple entity information related to the specified disaster from the related information;
Extracting the relationships between the triple entity information;
Extracting the attributes of the specified disaster entities;
Constructing the story line of the specified disaster according to the triple entity information, the relationships between the triple entity information, and the attributes of the specified disaster entities.
In the embodiments of the present invention, the disaster story line is generated with a knowledge graph. Knowledge graph construction techniques such as entity recognition, relation extraction, and attribute extraction extract useful information from news text to generate the disaster story line, which solves the prior-art problem that useful information cannot be extracted from massive information to construct a disaster story line.
Based on the above disaster story line construction method, the embodiment of the present invention provides an optional embodiment. Referring to Fig. 1, the disaster story line construction method of this embodiment may comprise the following steps:
S101: Collect information related to the specified disaster.
Google formally introduced the concept of the knowledge graph in 2012, and knowledge graphs quickly became a major research focus. A knowledge graph, also called knowledge domain mapping and known in library and information science as knowledge domain visualization or a knowledge domain map, is a family of graphs that display the development and structural relationships of knowledge. It uses visualization techniques to describe knowledge resources and their carriers, and to mine, analyze, construct, draw, and display knowledge and the interconnections within it. A knowledge graph is a semantic network that reveals the relationships between entities and can formally describe things in the real world. A knowledge graph is expressed as triples, that is, G = (E, R, S), where E denotes the entity set, R the relation set, and S the set of triples. A triple takes one of two basic forms: (entity 1, relation, entity 2) or (entity, attribute, attribute value). Knowledge graph construction techniques can extract useful information from a large volume of online information and present it graphically, revealing how a disaster develops and providing a reference for formulating targeted preventive measures.
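The two triple forms can be held in a small data structure. The following is a minimal Python sketch, not the patent's implementation; the class names, the helper `collect_sets`, and the example values are hypothetical and chosen only for illustration.

```python
from typing import Iterable, NamedTuple, Set, Tuple, Union


class RelationTriple(NamedTuple):
    """A triple of the form (entity 1, relation, entity 2)."""
    entity1: str
    relation: str
    entity2: str


class AttributeTriple(NamedTuple):
    """A triple of the form (entity, attribute, attribute value)."""
    entity: str
    attribute: str
    value: str


Triple = Union[RelationTriple, AttributeTriple]


def collect_sets(triples: Iterable[Triple]) -> Tuple[Set[str], Set[str]]:
    """Return the set of entities and the set of relations found in the triples."""
    entities: Set[str] = set()
    relations: Set[str] = set()
    for t in triples:
        if isinstance(t, RelationTriple):
            entities.update((t.entity1, t.entity2))
            relations.add(t.relation)
        else:
            entities.add(t.entity)  # attribute values are literals, not entities
    return entities, relations


# Hypothetical example data, for illustration only.
S = [
    RelationTriple("Typhoon A", "made_landfall_in", "City B"),
    AttributeTriple("Typhoon A", "max_wind_speed", "50 m/s"),
]
E, R = collect_sets(S)
```

Keeping relation triples and attribute triples as separate types makes it explicit that an attribute value is a literal rather than a node of the graph.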
The strong semantic expressiveness and concise presentation of knowledge graphs make them well suited to constructing disaster story lines. At present, story line construction in the disaster domain is mainly based on document summarization, and research based on knowledge graph algorithms is essentially absent. Constructing an information-rich, concise disaster story line with a knowledge graph is therefore particularly important.
In this embodiment, information related to a specified disaster is collected, for example information related to a typhoon disaster. In one specific implementation, web crawler technology, in particular a Python crawler, is used to obtain typhoon news reports from different platforms on the Internet, and a node importance method based on degree and clustering coefficient selects the key news from the crawled reports so as to remove redundant information.
Specifically, the degree index describes the number of neighbor nodes of a node:
$k_i = \sum_{j \in G} \delta_{ij}$
where $\delta_{ij} = 1$ if node $i$ is connected to node $j$, and $\delta_{ij} = 0$ otherwise.
The degree index reflects a node's ability to establish direct links with surrounding nodes, but it does not reflect how the node's neighbors are connected to one another.
The clustering coefficient describes the proportion of a node's neighbors that are also neighbors of each other, expressed as:
$c_i = \frac{2 e_i}{k_i (k_i - 1)}$
where $e_i$ is the number of triangles formed by node $i$ and any two of its neighbor nodes. In contrast to the degree index, the clustering coefficient reflects to some extent how the neighbors are connected to one another, but it does not reflect the size of the neighborhood. We therefore use the node's neighbor information together with the clustering coefficient and propose a new node importance index $p_i$:
where $f_i$ is the sum of the degree of node $i$ and the degrees of its neighbors, expressed as:
$f_i = k_i + \sum_{w \in \Delta_i} k_w$
where $k_w$ is the degree of node $w$ and $\Delta_i$ is the set of neighbor nodes of node $i$. The function $g_i$ is expressed as:
Using the network node importance measure based on degree and clustering coefficient, we select the important news from the crawled news documents to remove redundant information, obtaining the key information related to the typhoon as the information related to the specified disaster.
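The ingredients of the importance measure can be computed directly from an adjacency structure. The Python sketch below is illustrative only: it assumes an undirected graph stored as a dictionary of neighbor sets, assumes the standard clustering coefficient formula $c_i = 2e_i / (k_i(k_i - 1))$, and uses hypothetical function names and a toy graph. The combined index $p_i$ and the function $g_i$ are not implemented, and how news documents are turned into network nodes is left open.

```python
from itertools import combinations
from typing import Dict, Hashable, Set

Graph = Dict[Hashable, Set[Hashable]]  # node -> set of neighbor nodes (undirected)


def degree(g: Graph, i: Hashable) -> int:
    """k_i: number of neighbors of node i."""
    return len(g[i])


def clustering(g: Graph, i: Hashable) -> float:
    """c_i = 2 * e_i / (k_i * (k_i - 1)), with e_i the triangles through node i."""
    k = degree(g, i)
    if k < 2:
        return 0.0  # no pair of neighbors exists, so no triangle is possible
    e = sum(1 for u, v in combinations(g[i], 2) if v in g[u])
    return 2.0 * e / (k * (k - 1))


def neighborhood_degree(g: Graph, i: Hashable) -> int:
    """f_i: degree of node i plus the degrees of all its neighbors."""
    return degree(g, i) + sum(degree(g, w) for w in g[i])


# Toy undirected graph: triangle a-b-c with a pendant node d attached to c.
toy = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
```

In the toy graph, node `c` has the largest degree and neighborhood degree but a lower clustering coefficient than `a` or `b`, which is the trade-off the combined index is meant to balance.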
S102: Extract triple entity information related to the specified disaster from the related information.
Specifically, a bidirectional recurrent neural network model combined with a conditional random field is used to extract entities from the selected key text; the entities include, but are not limited to, person names, organization names, and place names.
Named entities are extracted from the news text using the BLSTM-CRF model.
A knowledge graph is expressed as triples, and the core of a triple is the entity, so the first step of information extraction is named entity recognition. Named entity recognition has become a basic technique for many natural language processing applications. In recent years, with the rapid development of deep learning, recurrent neural networks have shown strong capability in natural language processing tasks. In this embodiment, the BLSTM-CRF model is used for named entity recognition.
Referring to Fig. 2, Fig. 2 is a schematic diagram of the BLSTM-CRF model structure.
Input layer: for a given sentence $x = (x_1, x_2, \ldots, x_n)$, each $x_i$ is a one-hot vector representing the position of the character in the character dictionary. A word2vec model then projects each one-hot vector to a word vector, and these word vectors are the input of the BLSTM-CRF model.
LSTM layer: the bidirectional LSTM layer automatically extracts the features of the sentence. For a given sentence, the word vector of each character is the input at each time step of the bidirectional LSTM, and the hidden state sequences produced by the forward LSTM and the backward LSTM are concatenated to obtain the complete hidden state sequence. The LSTM layer is computed as follows:
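$i_t = \sigma(W_{xi} x_t + W_{hi} h_{t-1} + W_{ci} c_{t-1} + b_i)$ (input gate)
$f_t = \sigma(W_{xf} x_t + W_{hf} h_{t-1} + W_{cf} c_{t-1} + b_f)$ (forget gate)
$c_t = f_t \cdot c_{t-1} + i_t \cdot \tanh(W_{xc} x_t + W_{hc} h_{t-1} + b_c)$ (cell state)
$o_t = \sigma(W_{xo} x_t + W_{ho} h_{t-1} + W_{co} c_{t-1} + b_o)$ (output gate)
$h_t = o_t \cdot \tanh(c_t)$ (output)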
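The gate equations above can be traced in a few lines of plain Python. This is a sketch under stated assumptions rather than the model used in the embodiment: the peephole weights on $c_{t-1}$ are taken to be diagonal (element-wise), the parameter layout (`W`, `U`, `P`, `B` dictionaries keyed by gate) and all function names are hypothetical, and `zero_params` exists only to check shapes and gate arithmetic.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[Vector]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def pre_activation(W: Matrix, x: Vector, U: Matrix, h: Vector,
                   p: Vector, c: Vector, b: Vector) -> Vector:
    """Row-wise W x + U h + p * c + b, with p a diagonal (element-wise) peephole."""
    return [
        sum(w * xj for w, xj in zip(W[r], x))
        + sum(u * hj for u, hj in zip(U[r], h))
        + p[r] * c[r]
        + b[r]
        for r in range(len(b))
    ]


def lstm_step(x: Vector, h_prev: Vector, c_prev: Vector,
              params: Dict) -> Tuple[Vector, Vector]:
    """One time step: returns (h_t, c_t) from x_t, h_{t-1}, c_{t-1}."""
    W, U, P, B = params["W"], params["U"], params["P"], params["B"]
    none = [0.0] * len(c_prev)  # the cell candidate has no peephole term
    i = [sigmoid(z) for z in pre_activation(W["i"], x, U["i"], h_prev, P["i"], c_prev, B["i"])]
    f = [sigmoid(z) for z in pre_activation(W["f"], x, U["f"], h_prev, P["f"], c_prev, B["f"])]
    g = [math.tanh(z) for z in pre_activation(W["c"], x, U["c"], h_prev, none, c_prev, B["c"])]
    c = [ft * cp + it * gt for ft, cp, it, gt in zip(f, c_prev, i, g)]
    o = [sigmoid(z) for z in pre_activation(W["o"], x, U["o"], h_prev, P["o"], c_prev, B["o"])]
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]
    return h, c


def run_lstm(xs: List[Vector], params: Dict, hidden: int) -> List[Vector]:
    """Run one direction over the sequence and collect every hidden state."""
    h, c = [0.0] * hidden, [0.0] * hidden
    states = []
    for x in xs:
        h, c = lstm_step(x, h, c, params)
        states.append(h)
    return states


def bilstm(xs: List[Vector], forward: Dict, backward: Dict, hidden: int) -> List[Vector]:
    """Concatenate forward and backward hidden states at every position."""
    fwd = run_lstm(xs, forward, hidden)
    bwd = run_lstm(xs[::-1], backward, hidden)[::-1]
    return [hf + hb for hf, hb in zip(fwd, bwd)]


def zero_params(d_in: int, hidden: int) -> Dict:
    """All-zero parameters, useful only for checking shapes and gate arithmetic."""
    gates = ("i", "f", "c", "o")
    return {
        "W": {k: [[0.0] * d_in for _ in range(hidden)] for k in gates},
        "U": {k: [[0.0] * hidden for _ in range(hidden)] for k in gates},
        "P": {k: [0.0] * hidden for k in gates},
        "B": {k: [0.0] * hidden for k in gates},
    }
```

With all-zero parameters every gate evaluates to 0.5 and the cell candidate to 0, so `lstm_step` simply halves the previous cell state; this gives a quick sanity check before plugging in trained weights.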
CRF layer: the third layer of the model is the CRF layer. For an input sentence $x = (x_1, x_2, \ldots, x_n)$, let $P$ be the score matrix output by the BLSTM network. $P$ has size $n \times k$, where $k$ is the number of distinct labels, and $P_{ij}$ is the score of the $j$-th label for the $i$-th character of the sentence. For a sequence of predictions $y = (y_1, y_2, \ldots, y_n)$, we define its score as:
where $A$ is the matrix of transition scores from label $i$ to label $j$. We add start and end labels to the set of possible labels; they are the labels $y_0$ and $y_n$, denoting the beginning and the end of the sentence respectively. $A$ is therefore a square matrix of size $k + 2$. Applying a softmax over all possible label sequences gives the probability of the sequence $y$:
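A small numeric sketch may help make the CRF layer concrete. It assumes the standard linear-chain form, in which the score of a label sequence is the sum of its transition scores (including the transitions from the start label and to the end label) plus the per-position label scores from $P$, and the probability is a softmax of that score over all label sequences. The index convention (real labels `0..k-1`, start label `k`, end label `k+1`), the function names, and the toy matrices are assumptions made for illustration.

```python
import math
from itertools import product
from typing import Dict, List, Sequence, Tuple

Matrix = List[List[float]]


def sequence_score(P: Matrix, A: Matrix, y: Sequence[int]) -> float:
    """Transition scores along start -> y_1 ... y_n -> end plus emission scores."""
    k = len(P[0])
    path = [k] + list(y) + [k + 1]  # k is the start label, k + 1 the end label
    transitions = sum(A[a][b] for a, b in zip(path, path[1:]))
    emissions = sum(P[i][label] for i, label in enumerate(y))
    return transitions + emissions


def logsumexp(values: Sequence[float]) -> float:
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_partition(P: Matrix, A: Matrix) -> float:
    """Log of the sum of exp(score) over all label sequences (forward algorithm)."""
    k = len(P[0])
    alpha = [A[k][t] + P[0][t] for t in range(k)]
    for i in range(1, len(P)):
        alpha = [logsumexp([alpha[s] + A[s][t] for s in range(k)]) + P[i][t]
                 for t in range(k)]
    return logsumexp([alpha[t] + A[t][k + 1] for t in range(k)])


def sequence_prob(P: Matrix, A: Matrix, y: Sequence[int]) -> float:
    """Softmax over all label sequences, evaluated for the sequence y."""
    return math.exp(sequence_score(P, A, y) - log_partition(P, A))


def all_sequence_probs(P: Matrix, A: Matrix) -> Dict[Tuple[int, ...], float]:
    """Brute-force enumeration; exponential in sentence length, for checking only."""
    k = len(P[0])
    return {y: sequence_prob(P, A, y) for y in product(range(k), repeat=len(P))}


# Toy scores for a 3-character sentence with k = 2 labels (made-up numbers).
P = [[1.0, 0.2], [0.3, 0.9], [0.5, 0.5]]
A = [
    [0.1, 0.4, 0.0, 0.2],  # from label 0
    [0.3, 0.2, 0.0, 0.1],  # from label 1
    [0.5, 0.6, 0.0, 0.0],  # from the start label
    [0.0, 0.0, 0.0, 0.0],  # from the end label (never used)
]
```

The forward algorithm keeps the normalizer linear in sentence length, while `all_sequence_probs` enumerates every sequence and is only practical for tiny examples like this one.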
In this way, the triple entity information related to the specified disaster can be extracted from the related information; for a typhoon disaster, for example, typhoon-related information such as "typhoon", "Taiwan", and "Nepartak" is extracted.
S103: Extract the relationships between the triple entity information.
In this embodiment, eight types of relationships between entities are defined manually by analyzing the news text data. Relation extraction is then treated as a classification task, and a bidirectional recurrent neural network model with an attention mechanism extracts the relationships between entities.
Relation extraction is the task of finding semantic relations between nouns; in recent research it is often treated as a classification task. In the present invention, the Att-BLSTM model is used to extract the relationships between entities; its structure is shown in Fig. 3.
Compared with the BLSTM model, Att-BLSTM has one additional attention layer. Let $H = (h_1, h_2, \ldots, h_T)$ be the output of the BLSTM layer, where $T$ is the length of the input sentence. The representation $r$ of the sentence is then formed as a weighted sum of these output vectors:
$M = \tanh(H)$
$\alpha = \mathrm{softmax}(w^{T} M)$
$r = H \alpha^{T}$
$h^{*} = \tanh(r)$
where $H \in \mathbb{R}^{d^w \times T}$, $d^w$ is the dimension of the word vectors, and $w$ is a trained parameter vector.
In the last layer, a softmax function predicts the relation label; the prediction is given by the following formula:
The output of the softmax is a vector whose size equals the number of labels, where each element is the probability of the corresponding label. The label with the highest probability is taken as the relationship between the entities in the input sentence.
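The attention layer reduces the BLSTM outputs to a single sentence vector. The sketch below follows the four attention equations in plain Python; it stores $H$ as a list of its $T$ column vectors, and the function names are hypothetical. The final softmax classifier is left out.

```python
import math
from typing import List, Tuple

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(H: List[Vector], w: Vector) -> Tuple[Vector, Vector]:
    """H holds the T hidden vectors h_t (the columns of H); w has the same dimension.

    Returns (alpha, h_star): the attention weights and the sentence representation.
    """
    # w^T M with M = tanh(H): one scalar score per time step
    scores = [sum(wk * math.tanh(hk) for wk, hk in zip(w, h_t)) for h_t in H]
    alpha = softmax(scores)
    # r = H alpha^T: attention-weighted sum of the hidden vectors
    r = [sum(a * h_t[k] for a, h_t in zip(alpha, H)) for k in range(len(w))]
    h_star = [math.tanh(v) for v in r]
    return alpha, h_star
```

With `w` set to zeros every time step receives the same weight and `r` is the plain average of the hidden vectors; a trained `w` shifts the weight toward the positions that matter for the relation.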
On this basis, this embodiment can extract, for example, the relationship between "Taiwan" and "Nepartak" and the relationship between "typhoon" and "Taiwan".
S104: Extract the attributes of the specified disaster entities.
The entity attributes in the text are extracted using the Bootstrapping semi-supervised learning method.
Bootstrapping is a semi-supervised learning technique widely used in knowledge acquisition; it is a step-by-step, incremental learning method. It needs only a small amount of labeled data or an initial seed set, which it expands effectively through iterative learning until the required data set is obtained. Bootstrapping-based attribute information acquisition mainly involves selecting a small number of seed patterns and preparing a large amount of unlabeled text.
In Bootstrapping-based attribute pattern acquisition, the evaluation of candidate patterns plays a very important role. If a wrong pattern enters the next iteration as a seed pattern, the error is amplified and may even cause the whole pattern acquisition to fail. It is therefore necessary to estimate the confidence of each candidate pattern with a scoring function, rank the candidates, and select the top n candidates, or those above a threshold, to enter the iteration. Computing the similarity between seed patterns and candidate patterns is a good way to evaluate patterns. Commonly used similarity measures include the vector space model, edit distance, and the query likelihood model. The edit distance between two strings S1 and S2 is the minimum number of edit operations needed to turn S1 into S2. The permitted edit operations are substituting one character for another, inserting a character, and deleting a character; text similarity is computed from them. The present invention uses edit distance to evaluate the similarity between candidate patterns and seed patterns.
New pattern acquisition process:
(1) Preprocess the text: sentence splitting, word segmentation, and part-of-speech tagging;
(2) Find the sentences in the training corpus that contain trigger words, and extract the syntactic patterns of the descriptive sentences containing attribute trigger words as candidate patterns;
(3) Compute the similarity between each candidate pattern and the seed patterns based on edit distance;
(4) Compare the similarity computed in (3) with a given threshold, and retain the pattern if the similarity is greater than the threshold;
(5) Convert the patterns obtained in (4) into new seed patterns and carry out the next round of iteration to obtain new patterns.
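Steps (3) to (5) can be sketched as a short loop. The Python below is a minimal illustration, not the patent's implementation. It assumes that patterns are token tuples already extracted in steps (1) and (2), that similarity is the edit distance normalized by the longer pattern length (the normalization is an assumption), and that a candidate is accepted when its best similarity to any current seed exceeds the threshold. The function names and example patterns are hypothetical.

```python
from typing import List, Sequence, Tuple

Pattern = Tuple[str, ...]


def edit_distance(a: Sequence, b: Sequence) -> int:
    """Minimum number of substitutions, insertions, and deletions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute x by y (free if equal)
        prev = cur
    return prev[-1]


def similarity(a: Sequence, b: Sequence) -> float:
    """1.0 for identical sequences, 0.0 when every position must be edited."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def bootstrap_patterns(seeds: List[Pattern], candidates: List[Pattern],
                       threshold: float, max_rounds: int = 10) -> List[Pattern]:
    """Iteratively promote candidates that are similar enough to a current seed."""
    accepted = list(seeds)
    remaining = [c for c in candidates if c not in accepted]
    for _ in range(max_rounds):
        new = [c for c in remaining
               if max(similarity(c, s) for s in accepted) > threshold]
        if not new:
            break  # nothing passed the threshold, so the seed set has converged
        accepted.extend(new)  # accepted patterns become seeds for the next round
        remaining = [c for c in remaining if c not in new]
    return accepted


# Hypothetical token patterns, for illustration only.
seeds = [("wind", "speed", "is", "<VALUE>")]
candidates = [
    ("wind", "speed", "reached", "<VALUE>"),
    ("max", "wind", "speed", "reached", "<VALUE>"),
    ("rescue", "teams", "arrived"),
]
patterns = bootstrap_patterns(seeds, candidates, threshold=0.7)
```

In this toy run the second candidate is too far from the original seed (similarity 0.6) but is accepted in the second round through the first candidate (similarity 0.8). That is the expansion effect of bootstrapping, and it is also why a wrongly accepted pattern can amplify errors in later rounds.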
In this embodiment, the attributes and attribute values of "typhoon", "Taiwan", and "Nepartak" are obtained according to the acquired new patterns.
S105: Construct the story line of the specified disaster according to the triple entity information, the relationships between the triple entity information, and the attributes of the specified disaster entities.
The disaster story line is constructed from the information extracted from the news text and comprises local disaster story lines and a global disaster story line. A local story line is built by classifying by location entity and using the information obtained for each location. For the global disaster story line, we propose a cost function to determine whether a directed edge connects two local graphs, and finally fuse the local graphs to obtain the final global story line.
Local disaster story line construction:
The present invention first divides all news documents by time and then performs information extraction on each day's news separately. Information extraction achieves the goal of obtaining entities, relationships, and entity attribute information from unstructured and semi-structured data. However, these results may contain a large amount of redundant and erroneous information, so they must be cleaned and integrated. Knowledge fusion eliminates the ambiguity of concepts and removes redundant and erroneous information, thereby ensuring the quality of the knowledge.
For entity disambiguation, since the entities extracted here are place names and organization names, disambiguation is done by comparing the Baidu Baike pages that the entities link to.
For attribute fusion, the attributes are classified according to the trigger word categories above, and for each attribute type the most frequent attribute value is selected by voting.
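The voting step of attribute fusion amounts to a frequency count per entity and attribute. The following Python sketch shows one way to do it; the function name and the example observations are hypothetical, and ties are resolved in favor of the value seen first, which is an assumption.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

Observation = Tuple[str, str, str]  # (entity, attribute, attribute value)


def fuse_attributes(observations: Iterable[Observation]) -> Dict[Tuple[str, str], str]:
    """Keep, for each (entity, attribute) pair, the most frequently extracted value."""
    votes = defaultdict(Counter)
    for entity, attribute, value in observations:
        votes[(entity, attribute)][value] += 1
    return {key: counts.most_common(1)[0][0] for key, counts in votes.items()}


# Hypothetical extraction results from several news reports on the same day.
observations = [
    ("Typhoon A", "max_wind_speed", "50 m/s"),
    ("Typhoon A", "max_wind_speed", "50 m/s"),
    ("Typhoon A", "max_wind_speed", "48 m/s"),
    ("City B", "evacuated", "3000"),
]
fused = fuse_attributes(observations)
```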
Based on the above rules, a graph is obtained for each day during the typhoon disaster; connecting these graphs along the timeline gives the local disaster story line shown in Fig. 5.
Global story line construction:
The nodes generated by the local story lines are used to construct a directed graph (there is a directed edge from i to j only if i occurs earlier than j). To this end, we construct a cost function that describes the degree of similarity between two graphs, as follows:
where d(i, j) is the normalized distance between the locations described by graphs i and j, and $N_j$ is the number of triples in graph j. The cost takes both geographic location and the amount of information in a graph into account. In general, i tends to transition to a graph describing the same location as i; but a typhoon keeps moving, and more information is available around the typhoon's center, so the amount of information in a graph is also taken into account when constructing the cost.
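The linking of local graphs into a directed graph can be sketched as follows. Because the cost formula itself is not reproduced in this text, the sketch takes the cost as a function parameter. The rule of connecting each graph to its cheapest later graph, the `LocalGraph` fields, and the `toy_cost` placeholder (which merely grows with distance and shrinks with the number of triples) are all assumptions made for illustration, not the patent's formula.

```python
from typing import Callable, Dict, List, NamedTuple


class LocalGraph(NamedTuple):
    name: str       # identifier of the local graph
    day: int        # time index; a smaller value means earlier
    n_triples: int  # N_j, the number of triples in the graph


def link_local_graphs(graphs: List[LocalGraph],
                      cost: Callable[[LocalGraph, LocalGraph], float]) -> Dict[str, str]:
    """For every graph i, add one directed edge to the cheapest graph j later than i."""
    edges: Dict[str, str] = {}
    for i in graphs:
        later = [j for j in graphs if j.day > i.day]  # edges only go forward in time
        if later:
            edges[i.name] = min(later, key=lambda j: cost(i, j)).name
    return edges


# Hypothetical data and a placeholder cost, for illustration only.
graphs = [LocalGraph("A", 1, 12), LocalGraph("B", 2, 10), LocalGraph("C", 2, 20)]
distance = {("A", "B"): 0.2, ("A", "C"): 0.8}  # made-up normalized d(i, j)


def toy_cost(i: LocalGraph, j: LocalGraph) -> float:
    # Placeholder only: larger distance raises the cost, more triples in j lowers it.
    return distance[(i.name, j.name)] / j.n_triples


edges = link_local_graphs(graphs, toy_cost)
```

Passing the cost in as a parameter keeps the linking rule separate from the particular trade-off between distance and information amount, so a different cost can be substituted without touching the graph construction.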
Referring to Fig. 6, Fig. 6 shows the global disaster story line finally generated from the directed graph constructed with the cost function.
In the embodiments of the present invention, the disaster story line is generated with a knowledge graph. Knowledge graph construction techniques such as entity recognition, relation extraction, and attribute extraction extract useful information from news text to generate the disaster story line, which solves the prior-art problem that useful information cannot be extracted from massive information to construct a disaster story line.
Fig. 7 is a structural diagram of one embodiment of the disaster story line construction device provided by the invention.
Referring to Fig. 7, in this embodiment the disaster story line construction device comprises: an information collection module 61, an entity information extraction module 62, an entity relation extraction module 63, an entity attribute extraction module 64, and a story line generation module 65.
Specifically, the information collection module 61 is used to collect information related to the specified disaster; the entity information extraction module 62 is used to extract at least two pieces of entity information of the specified disaster from the related information; the entity relation extraction module 63 is used to extract the relationships between the at least two entities; the entity attribute extraction module 64 is used to extract the attributes of the specified disaster entities; and the story line generation module 65 is used to construct the story line of the specified disaster according to the at least two pieces of entity information, the relationships between the at least two entities, and the attributes of the specified disaster entities.
In the embodiments of the present invention, the disaster story line is generated with a knowledge graph. Knowledge graph construction techniques such as entity recognition, relation extraction, and attribute extraction extract useful information from news text to generate the disaster story line, which solves the prior-art problem that useful information cannot be extracted from massive information to construct a disaster story line.
Above description, only a specific embodiment of the invention, but scope of protection of the present invention is not limited thereto, it is any Those familiar with the art in the technical scope disclosed by the present invention, can easily think of the change or the replacement, and should all contain Lid is within protection scope of the present invention.Therefore, protection scope of the present invention should be based on the protection scope of the described claims.
It is understood that the same or similar parts of the above embodiments may refer to each other, and content not described in detail in some embodiments may refer to the same or similar content in other embodiments.
It should be noted that in the description of the invention, terms such as "first" and "second" are used for descriptive purposes only and shall not be interpreted as indicating or implying relative importance. In addition, in the description of the invention, unless otherwise indicated, "multiple" means at least two.
Any process or method description in a flow chart or otherwise described herein may be understood as representing a module, segment, or portion of code that includes one or more executable instructions for implementing specific logical functions or steps of the process. The scope of the preferred embodiments of the invention includes other implementations in which functions may be performed out of the order shown or discussed, including substantially concurrently or in the reverse order depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the invention belong.
It should be appreciated that each part of the invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented in software or firmware that is stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or a combination of the following techniques well known in the art may be used: a discrete logic circuit having logic gates for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.
Those skilled in the art will understand that all or part of the steps carried by the methods of the above embodiments may be completed by a program instructing relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one of the steps of the method embodiments or a combination thereof.
In addition, the functional units in the embodiments of the invention may be integrated in one processing module, may each exist physically alone, or two or more units may be integrated in one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented as a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that the specific features, structures, materials, or characteristics described in connection with that embodiment or example are included in at least one embodiment or example of the invention. In this specification, schematic uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although embodiments of the invention have been shown and described above, it is to be understood that the above embodiments are exemplary and shall not be construed as limiting the invention; those skilled in the art may change, modify, replace, and vary the above embodiments within the scope of the invention.

Claims (10)

1. A disaster story line construction method, characterized in that the method comprises:
collecting relevant information about a specified disaster;
extracting triple entity information related to the specified disaster from the relevant information;
extracting the relationships between the triple entity information;
extracting the attributes of the specified disaster entities;
constructing the story line of the specified disaster according to the triple entity information, the relationships between the triple entity information, and the attributes of the specified disaster entities.
2. The method according to claim 1, characterized in that collecting the relevant information about the specified disaster comprises:
obtaining relevant information about the specified disaster from the Internet using web crawler technology;
selecting preset target information from the crawled information.
3. The method according to claim 2, characterized in that selecting the preset target information from the crawled information comprises: selecting the preset target information from the crawled information using a network node importance measure based on degree and clustering coefficient.
4. The method according to claim 1, characterized in that extracting the triple entity information related to the specified disaster from the relevant information comprises: extracting the disaster-related triple entity information from the preprocessed information using a bidirectional recurrent neural network model combined with a conditional random field.
5. The method according to claim 1, characterized in that extracting the relationships between the triple entity information comprises: extracting the relationships between disaster entities using a bidirectional recurrent neural network model with an attention mechanism.
6. The method according to claim 1, characterized in that extracting the attributes of the disaster entities comprises: extracting the attributes of the disaster entities using a Bootstrapping model.
7. The method according to claim 1, characterized in that constructing the story line of the specified disaster according to the triple entity information, the relationships between the triple entity information, and the attributes of the specified disaster entities comprises:
constructing local disaster story lines;
generating a global disaster story line.
8. The method according to claim 7, characterized in that constructing the local disaster story lines comprises:
classifying by location entity to obtain the disaster entity relationships and disaster entity attributes for different locations;
performing disaster entity disambiguation;
performing disaster attribute fusion.
9. The method according to claim 7, characterized in that constructing the global disaster story line comprises:
constructing a cost function, the cost function being used to describe the degree of similarity between at least two graphs;
judging, according to the cost function, whether a directed edge connects the at least two local graphs;
fusing the cost function and the local graphs to construct the global story line;
when the number of the at least two local graphs is 2, the cost function comprises:
10. A disaster story line construction device, characterized by comprising: an information collection module, an entity information extraction module, an entity relation extraction module, an entity attribute extraction module, and a story line generation module;
the information collection module is used to collect relevant information about a specified disaster;
the entity information extraction module is used to extract triple entity information related to the specified disaster from the relevant information;
the entity relation extraction module is used to extract the relationships between the triple entity information;
the entity attribute extraction module is used to extract the attributes of the specified disaster entities;
the story line generation module is used to construct the story line of the specified disaster according to the triple entity information, the relationships between the triple entity information, and the attributes of the specified disaster entities.
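Claim 3 selects target information with a node importance measure based on degree and clustering coefficient, without giving a formula. The sketch below shows one possible reading under stated assumptions: the graph is an undirected adjacency dictionary, the local clustering coefficient follows its standard definition, and the weighted sum with parameter `alpha` is a hypothetical scoring rule chosen only for illustration, not the measure used in the patent.

```python
from itertools import combinations
from typing import Dict, List, Set

Graph = Dict[str, Set[str]]  # undirected graph: node -> set of neighbours


def clustering(graph: Graph, node: str) -> float:
    """Local clustering coefficient: fraction of neighbour pairs that are linked."""
    nbrs = graph[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(nbrs, 2) if v in graph[u])
    return 2.0 * links / (k * (k - 1))


def importance(graph: Graph, alpha: float = 0.5) -> Dict[str, float]:
    """Hypothetical score: alpha * normalised degree + (1 - alpha) * clustering."""
    max_deg = max((len(n) for n in graph.values()), default=0) or 1
    return {
        node: alpha * len(nbrs) / max_deg + (1 - alpha) * clustering(graph, node)
        for node, nbrs in graph.items()
    }


def select_top(graph: Graph, k: int, alpha: float = 0.5) -> List[str]:
    """Return the k highest-scoring nodes; ties are broken by node name."""
    scores = importance(graph, alpha)
    return sorted(scores, key=lambda n: (-scores[n], n))[:k]


# Toy graph: a triangle a-b-c plus a pendant node d attached to a.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
print(select_top(graph, 2))
```

In a crawler setting the nodes could stand for crawled documents or pages and the edges for links or shared entities between them; that mapping is likewise an assumption of this example.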
CN201811382046.9A 2018-11-20 2018-11-20 Disaster story line construction method and device Active CN109582958B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811382046.9A CN109582958B (en) 2018-11-20 2018-11-20 Disaster story line construction method and device

Publications (2)

Publication Number Publication Date
CN109582958A true CN109582958A (en) 2019-04-05
CN109582958B CN109582958B (en) 2023-07-18

Family

ID=65922787

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811382046.9A Active CN109582958B (en) 2018-11-20 2018-11-20 Disaster story line construction method and device

Country Status (1)

Country Link
CN (1) CN109582958B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110083709A (en) * 2019-04-28 2019-08-02 宁波深擎信息科技有限公司 A kind of knowledge mapping method for auto constructing and system based on description definition
CN110866190A (en) * 2019-11-18 2020-03-06 支付宝(杭州)信息技术有限公司 Method and device for training neural network model for representing knowledge graph

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140356846A1 (en) * 2012-02-06 2014-12-04 Su-Kam Intelligent Education Systems, Inc. Apparatus, systems and methods for interactive dissemination of knowledge
US20160125093A1 (en) * 2014-10-31 2016-05-05 Linkedin Corporation Partial graph incremental update in a social network
CN106156365A (en) * 2016-08-03 2016-11-23 北京智能管家科技有限公司 A kind of generation method and device of knowledge mapping
US20170056764A1 (en) * 2015-08-31 2017-03-02 Omniscience Corporation Event categorization and key prospect identification from storylines
CN106845474A (en) * 2015-12-07 2017-06-13 富士通株式会社 Image processing apparatus and method
CN107194422A (en) * 2017-06-19 2017-09-22 中国人民解放军国防科学技术大学 A kind of convolutional neural networks relation sorting technique of the forward and reverse example of combination
CN107330125A (en) * 2017-07-20 2017-11-07 云南电网有限责任公司电力科学研究院 The unstructured distribution data integrated approach of magnanimity of knowledge based graphical spectrum technology
CN108664615A (en) * 2017-05-12 2018-10-16 华中师范大学 A kind of knowledge mapping construction method of discipline-oriented educational resource
CN108763333A (en) * 2018-05-11 2018-11-06 北京航空航天大学 A kind of event collection of illustrative plates construction method based on Social Media
CN108776684A (en) * 2018-05-25 2018-11-09 华东师范大学 Optimization method, device, medium, equipment and the system of side right weight in knowledge mapping

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
CHUANHAI DONG ET AL.: "Character-Based LSTM-CRF with Radical-Level Features for Chinese Named Entity Recognition", 《NATURAL LANGUAGE UNDERSTANDING AND INTELLIGENT APPLICATIONS》 *
CHUANHAI DONG ET AL.: "Character-Based LSTM-CRF with Radical-Level Features for Chinese Named Entity Recognition", 《NATURAL LANGUAGE UNDERSTANDING AND INTELLIGENT APPLICATIONS》, 31 December 2016 (2016-12-31), pages 239 - 250, XP047363971, DOI: 10.1007/978-3-319-50496-4_20 *
QIFENG ZHOU ET AL.: "An Improved Textual Storyline Generating Framework for Disaster Information Management", 《2017 12TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS AND KNOWLEDGE ENGINEERING (ISKE)》 *
QIFENG ZHOU ET AL.: "An Improved Textual Storyline Generating Framework for Disaster Information Management", 《2017 12TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS AND KNOWLEDGE ENGINEERING (ISKE)》, 31 December 2017 (2017-12-31), pages 1 - 8 *

Also Published As

Publication number Publication date
CN109582958B (en) 2023-07-18

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant