CN103473263B - News event development process-oriented visual display method - Google Patents
- Publication number: CN103473263B
- Application number: CN201310303085.6A
- Authority: CN (China)
- Prior art keywords: news, data subset, event, sentence, word
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Document Processing Apparatus (AREA)
Abstract
The invention belongs to the field of natural language processing and computer applications and relates to a visual display method oriented to the development process of a news event. The method comprises the following steps: acquiring the news report web pages related to a specific news event from a news source and removing duplicates to obtain a duplicate-free news report data set for the event; dividing the data set into data subsets by news report time to obtain data subsets ordered by time; extracting the person and place elements of each report time from the corresponding data subset and generating an event summary on that basis; and displaying the event visually, taking as nodes the persons, places and event summary sentences that meet the importance requirement in the data subset of each report time and taking the association relations among them as edges. The method helps display the development process of the entire news event and the association relations among key elements such as persons and places completely, concisely and intuitively, thereby reducing the cost for a reader to acquire the news information.
Description
Technical field
The invention belongs to the field of natural language processing and computer application technology, and in particular relates to a visual display method oriented to the evolution process of a news event.
Background art
With the development and popularization of network and multimedia technology, the ways in which people obtain information about social news events have changed enormously. The central position of traditional news media such as newspapers, radio and television is gradually being taken over by Internet news media, and Internet news reports have become the public's main platform for obtaining information. However, Internet news reports are numerous and disorderly: reports with similar or even duplicated content waste readers' valuable time, and "fragmented" reports written from different perspectives make it hard to grasp the whole story of a news event quickly. A comprehensive, concise and intuitive method of displaying news events is therefore urgently needed to reduce the time the public spends obtaining news information. At present there are mainly two classes of methods for displaying Internet news events:
The first class of methods: professional news workers manually edit, classify and organize the Internet news reports of a particular news event so that the related information is better ordered and easier for users to read. For major news events, news websites such as Sina, Phoenix and NetEase compile special-topic news pages with particular care in order to provide more comprehensive information. The advantage of this approach is that it helps users grasp the evolution and details of the whole news event more fully. The disadvantages are that manual editing and organizing are costly, and that the special-topic pages compiled for a major news event still contain so many reports that a reader must spend a great deal of time reading them to grasp how the event evolved.
The second class of methods: information processing techniques such as search and clustering are used to collect and organize the information related to a news event; the information can be further classified and displayed by source and type (such as news, forums, blogs and videos) and can also be sorted by publication time. The advantage of this approach is that search, clustering and similar techniques collect and organize the event-related information and extract its topics automatically, which greatly reduces the cost of manual editing and organizing. The disadvantage is that it cannot display the evolution of a news event concisely and intuitively, nor can it display the association relations among elements such as persons and places in the event or how those relations evolve.
Summary of the invention
To address the shortcomings of the two existing classes of Internet news display techniques, the present invention provides a visual display method oriented to the evolution process of a news event, each step of which is executed by a purpose-written computer program. While reducing the cost of manual editing and organizing, the method displays the evolution of the whole news event and the association relations among elements such as persons and places comprehensively, concisely and intuitively.
The technical solution adopted by the present invention is a visual display method oriented to the evolution process of a news event, each step of which is executed by a purpose-written computer program. The method specifically includes:
A visual display method oriented to the evolution process of a news event, characterized in that: first, the news report web pages related to a particular news event are obtained from news sources and deduplicated to obtain a duplicate-free news report data set for the event; next, the news report data set is divided into data subsets by news report time, yielding data subsets ordered by report time; the person and place news event elements of each report time are extracted from the corresponding data subset, and the event summary of that report time is generated on this basis; finally, within the data subset of each news report time, the person and place news elements and the event summary sentences that meet the importance requirement are taken as nodes and the association relations among them as edges, and the event at that time point is displayed visually.
The visual display method oriented to the evolution process of a news event is characterized in that obtaining the news report web pages related to a particular news event and deduplicating them specifically includes: counting the occurrences of each character in the text of every news report web page and taking the characters whose occurrence count exceeds a preset requirement as the feature characters of that page; arranging all feature characters by occurrence count into a feature string and applying a hash method to it to generate a fixed-length feature code; and, if the feature codes of two news report web pages are identical, regarding one of them as a duplicate page.
The visual display method oriented to the evolution process of a news event is characterized in that selecting data subsets specifically includes: if the number of sentences in a data subset that contain persons or places exceeds a preset threshold, the data subset is judged to be an important data subset, and the news report time corresponding to it is judged to be an important report time of the particular news event.
The visual display method oriented to the evolution process of a news event is characterized in that generating the event summary specifically includes sorting the sentences in a data subset in descending order of importance and selecting the sentences whose importance exceeds a preset requirement as the event summary of the data subset for that report time. Sentence importance is defined in terms of two factors, a word-frequency weight and a news-element weight: the word-frequency weight of a sentence is computed with the TFIDF weighting method, and the news-element weight of a sentence is the relative frequency with which person and place news elements occur in the data subset. The specific process includes:
First, news event elements such as persons and places are extracted from each data subset to obtain news element phrase sets (for example, the place element phrase set $L=\{l_1, l_2, \ldots, l_n\}$), and the weights of each element phrase set are quantified. Taking the place element phrase set as an example, the weight of each place element is quantified by formula (1), in which the count term is the number of occurrences of place element $l_i$ in the data subset and $|L|$ is the total number of place elements in the data subset; $p(l_i)$ is taken as the weight of element $l_i$;
Second, the weight of each word in each data subset is calculated. The score $\mathrm{Score}(w_i)$ of each word $w_i$ in the data subset is computed with the TFIDF method according to formula (2), in which $TF(w_i)$ is the number of occurrences of word $w_i$ in the data subset, the inverse-frequency term is the inverse of the frequency of data subsets in the whole data set in which $w_i$ occurs, and $|W|$ is the total number of words in the data subset. $\mathrm{Score}(w_i)$ is taken as the weight of word $w_i$; if $w_i$ is an element phrase, its weight is adjusted using the news element weight, otherwise its weight is unchanged. Taking a word $w_i$ that is a place element as an example, the adjustment is given by formula (3), in which $\pi(w_i)$ denotes the weight of word $w_i$;
Third, each sentence $S_j$ is assigned a weight from the weights of the words it contains, by averaging those word weights:

$$\pi(S_j) = \frac{1}{n}\sum_{w_i \in S_j} \pi(w_i) \tag{4}$$

In formula (4), $\pi(S_j)$ denotes the weight of sentence $S_j$ and $n$ denotes the number of words contained in $S_j$;
Finally, the sentences are sorted in descending order of sentence weight, and the top-ranked sentences are chosen as the event summary of that time point.
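The definitions above fix the terms of formulas (1) to (3) but not their exact form. One reading consistent with them is sketched below as an assumption, not as the patent's own notation. It writes $c(l_i)$ for the occurrence count of $l_i$, reads $|L|$ as the total number of place-element occurrences so that the weights sum to 1, writes $N$ for the number of data subsets and $DF(w_i)$ for the number of data subsets in which $w_i$ occurs, and assumes a smoothed logarithm in (2) and a multiplicative adjustment in (3):

```latex
% Assumed reconstruction: c(l_i), N, DF(w_i), the smoothing in (2)
% and the multiplicative form of (3) are illustrative choices.
\begin{aligned}
p(l_i) &= \frac{c(l_i)}{|L|} && (1)\\
\mathrm{Score}(w_i) &= \frac{TF(w_i)}{|W|}\,\log\!\left(1 + \frac{N}{DF(w_i)}\right) && (2)\\
\pi(w_i) &=
\begin{cases}
\mathrm{Score}(w_i)\,\bigl(1 + p(l_i)\bigr), & w_i = l_i \in L\\
\mathrm{Score}(w_i), & \text{otherwise}
\end{cases} && (3)
\end{aligned}
```

Under this reading a word that is also a frequent place element gets a higher weight than an ordinary word with the same TFIDF score, which matches the stated intent of adjusting word weights by news-element weights.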
The visual display method oriented to the evolution process of a news event is characterized in that, in the visual display of the news event evolution process, the importance of a person or place news element is computed as the relative frequency with which the element occurs in the data subset; the event summary sentences are selected in the same way as in the event summary generation process described above; centrality analysis is performed on the determined nodes and edges, so that a more important node is drawn larger in the graph and placed closer to the middle of the graph; and, if an event summary sentence node contains a certain person or place, that event summary sentence node is connected by a line to the corresponding person or place node in the graph, and otherwise the nodes are not connected.
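The sizing and placement rule (more important nodes are larger and closer to the middle) can be sketched as follows. This is only an illustration under assumptions: the function name `layout`, the size range and the radial placement are hypothetical, and the importance value is used directly rather than a specific centrality measure, which the text does not name.

```python
import math

def layout(importance, min_size=10.0, max_size=40.0, max_radius=1.0):
    """Return {node: (x, y, size)} from {node: importance}.

    Hypothetical rule: size grows linearly with importance, and the
    distance from the centre shrinks linearly with importance.
    """
    if not importance:
        return {}
    lo = min(importance.values())
    hi = max(importance.values())
    span = (hi - lo) or 1.0  # avoid division by zero when all values are equal
    ordered = sorted(importance, key=lambda n: importance[n], reverse=True)
    placed = {}
    for k, node in enumerate(ordered):
        rel = (importance[node] - lo) / span      # 1.0 for the most important node
        size = min_size + rel * (max_size - min_size)
        radius = (1.0 - rel) * max_radius         # most important node sits at the centre
        angle = 2 * math.pi * k / len(ordered)    # spread nodes around the centre
        placed[node] = (radius * math.cos(angle), radius * math.sin(angle), size)
    return placed
```

A real drawing would also need to avoid overlaps between nodes; the sketch only shows how importance could drive size and distance from the centre.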
The beneficial effect of the invention is that it helps readers by displaying the development process of the whole news event and the association relations among elements such as persons and places comprehensively, concisely and intuitively.
Brief description of the drawings
Fig. 1 is a flow chart of the visual display method oriented to the evolution process of a news event according to the present invention;
Fig. 2 is a preferred embodiment of the visual display graph of a news event at a single time point according to the present invention;
Fig. 3 is a preferred embodiment of the visual display graph of a news event over several consecutive time points according to the present invention.
Detailed description of the embodiments
The specific embodiments of the present invention are described in detail below with reference to the accompanying drawings and the technical solution.
To make the objects, technical solutions and advantages of the present invention clearer, the invention is described in detail below with reference to the drawings and a concrete example.
Fig. 1 is a flow chart of the visual display method oriented to the evolution process of a news event provided by an embodiment of the present invention. As shown in Fig. 1, the method comprises the following steps:
Step 1: Obtain the news report web pages related to a particular news event from news sources.
In this step, the interface provided by a news platform can be used to obtain the related news report web pages according to keywords of the particular news event. The news platform can be the news channel of any website or the news data collected by a search engine.
Step 2: Deduplicate the obtained news report web pages to obtain the news report data set.
The deduplication in this step can use a web page deduplication method based on a hash of character frequencies. The specific process can include:
First, the occurrences of each text character in every obtained news report web page are counted, and the non-stop-word characters whose occurrence count exceeds a preset requirement are taken as the feature characters of that page;
Second, all feature characters of each news report web page are arranged by occurrence count into a feature string, and a hash method is applied to map the feature string to a fixed-length feature code;
Finally, the feature codes of all obtained news report web pages are compared pairwise; if the feature codes of two pages are identical, one of them is regarded as a duplicate page and removed from the news report data set.
Of course, the deduplication method based on a hash of character frequencies is only a preferred embodiment provided by the present invention; other web page deduplication methods can also be used.
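The character-frequency deduplication of Step 2 can be sketched as below. This is a minimal illustration under assumptions: the stop-character set, the threshold `MIN_COUNT`, the tie-breaking order and the choice of SHA-256 as the hash are all hypothetical, since the text only speaks of a "preset requirement" and "a hash method". A set lookup stands in for the pairwise comparison and gives the same result.

```python
import hashlib
from collections import Counter

# Hypothetical stop characters and threshold; neither is specified in the text.
STOP_CHARS = set(" \t\r\n,.;:!?()，。；：！？、的了是在")
MIN_COUNT = 2

def feature_code(text, min_count=MIN_COUNT, stop_chars=STOP_CHARS):
    """Map page text to a fixed-length code built from its frequent characters."""
    counts = Counter(ch for ch in text if ch not in stop_chars)
    # Feature characters: those occurring more than min_count times.
    frequent = [(ch, n) for ch, n in counts.items() if n > min_count]
    # Order by descending count; ties are broken by character so the result is deterministic.
    frequent.sort(key=lambda item: (-item[1], item[0]))
    feature_string = "".join(ch for ch, _ in frequent)
    return hashlib.sha256(feature_string.encode("utf-8")).hexdigest()

def deduplicate(pages):
    """Keep the first page seen for each feature code and drop the rest."""
    seen = set()
    unique = []
    for page in pages:
        code = feature_code(page)
        if code not in seen:
            seen.add(code)
            unique.append(page)
    return unique
```

One consequence of this design is that very short pages with no character above the threshold all share the code of the empty string, so a practical threshold would need to be tuned to page length.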
Step 3: From the news report data set, determine time-ordered data subsets according to report time and importance.
In this step, the data subsets can be determined as follows:
First, the web page texts in the news report data set are divided into data subsets by news report time: web page texts with the same report time are placed in the same data subset, labelled with that report time;
Second, all web page texts in each data subset are segmented into words, and the sentences that contain news event elements such as persons and places are marked;
Third, if the number of sentences in a data subset that contain news elements such as persons and places exceeds a preset threshold, the data subset is judged to be an important data subset and is retained; otherwise the data subset for that report time is discarded;
Finally, the retained data subsets are arranged in order of report time.
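The grouping and filtering of Step 3 can be sketched as follows. The sketch assumes that word segmentation and element recognition happen upstream: `has_element` is a hypothetical predicate that says whether a sentence contains a person or place, and report times are assumed to be strings that sort chronologically (for example ISO dates).

```python
from collections import defaultdict

def important_subsets(reports, has_element, threshold):
    """Group sentences by report time and keep only the important subsets.

    reports: iterable of (report_time, list_of_sentences) pairs.
    has_element: hypothetical predicate, True if a sentence names a person or place.
    threshold: a subset is kept when its marked-sentence count exceeds this value.
    Returns a dict ordered by report time.
    """
    subsets = defaultdict(list)
    for report_time, sentences in reports:
        subsets[report_time].extend(sentences)
    kept = {}
    for report_time in sorted(subsets):
        sentences = subsets[report_time]
        marked = sum(1 for s in sentences if has_element(s))
        if marked > threshold:
            kept[report_time] = sentences
    return kept
```

Because the retained subsets are inserted in sorted order, iterating over the returned dict visits the important report times chronologically.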
Step 4: Extract news event elements such as persons and places from each data subset and form the event summary of that time point.
In this step, the event summary of each time point can be extracted with a multi-document summarization method based on news event elements. The specific process includes:
First, news event elements such as persons and places are extracted from each data subset to obtain news element phrase sets (for example, the place element phrase set $L=\{l_1, l_2, \ldots, l_n\}$), and the weights of each element phrase set are quantified. Taking the place element phrase set as an example, the weight of each place element is quantified by formula (1), in which the count term is the number of occurrences of place element $l_i$ in the data subset and $|L|$ is the total number of place elements in the data subset. $p(l_i)$ is taken as the weight of element $l_i$.
Second, the weight of each word in each data subset is calculated. The score $\mathrm{Score}(w_i)$ of each word $w_i$ in the data subset is computed with the TFIDF method according to formula (2), in which $TF(w_i)$ is the number of occurrences of word $w_i$ in the data subset, the inverse-frequency term is the inverse of the frequency of data subsets in the whole data set in which $w_i$ occurs, and $|W|$ is the total number of words in the data subset. $\mathrm{Score}(w_i)$ is taken as the weight of word $w_i$. If $w_i$ is an element phrase, its weight is adjusted using the news element weight; otherwise its weight is unchanged. Taking a word $w_i$ that is a place element as an example, the adjustment is given by formula (3), in which $\pi(w_i)$ denotes the weight of word $w_i$.
Third, each sentence $S_j$ is assigned a weight from the weights of the words it contains, by averaging those word weights:

$$\pi(S_j) = \frac{1}{n}\sum_{w_i \in S_j} \pi(w_i) \tag{4}$$

In formula (4), $\pi(S_j)$ denotes the weight of sentence $S_j$ and $n$ denotes the number of words contained in $S_j$.
Finally, the sentences are sorted in descending order of sentence weight, and the top-ranked sentences are chosen as the event summary of that time point.
Of course, the multi-document summarization method based on news event elements is only a preferred embodiment provided by the present invention; other event summary extraction methods can also be used.
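The summary extraction of Step 4 can be sketched end to end as below. Several details are assumptions rather than the patent's stated formulas: the TFIDF variant (term frequency normalized by subset length, with a smoothed logarithm), the multiplicative boost applied to element words, and all function names. Sentences are assumed to be already segmented into lists of words, and the current subset is assumed to be one of `all_subsets`.

```python
import math
from collections import Counter

def element_weights(mentions):
    """Relative frequency of each element among all element mentions in a subset."""
    counts = Counter(mentions)
    total = sum(counts.values())
    return {e: c / total for e, c in counts.items()}

def word_weights(subset, all_subsets, elem_w):
    """TFIDF-style score per word; element words get an assumed multiplicative boost."""
    words = [w for sentence in subset for w in sentence]
    tf = Counter(words)
    n_subsets = len(all_subsets)
    weights = {}
    for w, f in tf.items():
        # Number of data subsets in which the word occurs (at least 1).
        df = max(1, sum(1 for sub in all_subsets if any(w in s for s in sub)))
        score = (f / len(words)) * math.log(1 + n_subsets / df)
        weights[w] = score * (1 + elem_w.get(w, 0.0))
    return weights

def summarize(subset, all_subsets, mentions, top_k=1):
    """Return the top_k (weight, sentence) pairs, where weight is the mean word weight."""
    elem_w = element_weights(mentions)
    w_w = word_weights(subset, all_subsets, elem_w)
    scored = [(sum(w_w[w] for w in s) / len(s), s) for s in subset if s]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

Averaging rather than summing word weights keeps long sentences from winning on length alone, which is the effect of the sentence-weight definition in formula (4).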
Step 5: Taking the person and place news elements and the event summary sentences that meet the importance requirement as nodes, and the association relations among them as edges, display the event of each time point visually.
In this step, the visual display of the event at each time point may proceed as follows:
First, the relative frequency with which each news element such as a person or place occurs in the data subset of a given time point is calculated and used as the measure of that element's importance;
Second, the news elements are sorted by importance; those whose importance exceeds a preset requirement are retained and the rest are discarded, which yields the news element nodes of that time point;
Third, the summary sentences of that time point are generated with the event summary generation method of Step 4, and the weight of each summary sentence is used as the measure of its importance, which yields the event summary sentence nodes of that time point;
Finally, with the association relations between the news element nodes and the event summary sentence nodes as edges, the visual display graph of the event at each time point is drawn. The association relation between nodes is as follows: if an event summary sentence node contains a certain person (or place), that event summary sentence node is connected by a line to the node of that person (or place) in the graph; otherwise the nodes are not connected. In addition, the visual display graph is built on a centrality analysis of the nodes and edges: a more important node is drawn larger and placed closer to the middle of the graph. Steps 1 to 5 are executed by a purpose-written computer program.
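The node and edge construction of Step 5 can be sketched as follows. The sketch makes some assumptions: a plain substring test stands in for matching an element inside a summary sentence, node names are assumed to be unique across persons, places and sentences, and degree centrality is used as one possible reading of "centrality analysis", which the text does not specify further.

```python
def build_graph(summary_sentences, persons, places):
    """Build nodes and edges for one time point.

    summary_sentences: {sentence_text: weight}
    persons, places: {name: importance}
    Returns (nodes, edges), where nodes maps a name to its kind and importance
    and edges is a list of (sentence_text, element_name) pairs.
    """
    nodes = {}
    for name, imp in persons.items():
        nodes[name] = {"kind": "person", "importance": imp}
    for name, imp in places.items():
        nodes[name] = {"kind": "place", "importance": imp}
    edges = []
    for sentence, weight in summary_sentences.items():
        nodes[sentence] = {"kind": "summary", "importance": weight}
        for name in list(persons) + list(places):
            if name in sentence:  # substring test stands in for element matching
                edges.append((sentence, name))
    return nodes, edges

def degree_centrality(nodes, edges):
    """Fraction of the other nodes each node is connected to."""
    degree = {n: 0 for n in nodes}
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    denom = max(len(nodes) - 1, 1)
    return {n: d / denom for n, d in degree.items()}
```

Since edges only ever join a summary sentence to an element it mentions, the resulting graph is bipartite, and elements that are never mentioned in a retained summary sentence remain isolated nodes.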
Step 6: Connect the associated nodes in the visual graphs of adjacent time points to display the evolution process of the particular news event.
In this step, the nodes in the visual graphs of adjacent time points can be connected as follows: if the same node appears in the visual graphs of two adjacent time points, the two identical nodes are connected; otherwise they are not connected.
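The linking rule of Step 6 reduces to a set intersection over consecutive time points. A minimal sketch, assuming each time point's graph is represented by a hypothetical `(time_label, set_of_node_names)` pair listed in chronological order:

```python
def link_adjacent(graphs):
    """Link identical nodes in the graphs of adjacent time points.

    graphs: list of (time_label, set_of_node_names) in chronological order.
    Returns a list of (earlier_time, later_time, node_name) links.
    """
    links = []
    for (t0, nodes0), (t1, nodes1) in zip(graphs, graphs[1:]):
        # Only nodes present at both adjacent time points are connected.
        for name in sorted(nodes0 & nodes1):
            links.append((t0, t1, name))
    return links
```

Only adjacent pairs are compared, so a node that disappears for one time point and returns later is not linked across the gap.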
Fig. 2 shows a preferred embodiment of the visual display graph. As shown in the figure, a triangle represents a person element in the data subset of a given time point, and the size of the triangle represents the importance of the person; a square represents a place element in the data subset of that time point, and the size of the square likewise represents the importance of the place; a circle represents an event summary sentence extracted from the data subset of that time point, and the size of the circle represents the importance of the summary sentence. The most important summary sentence in the figure is "Minister of Railways Sheng Guangzu and Vice Minister of Railways Hu Yadong rush to the scene of the Wenzhou high-speed train accident to take command", which contains the two person elements "Sheng Guangzu" and "Hu Yadong" and the place element "Wenzhou"; the largest circle is therefore connected to the two triangles representing "Sheng Guangzu" and "Hu Yadong" and to the square representing "Wenzhou".
Fig. 3 shows a preferred embodiment of connecting the visual graphs of adjacent time points. From top to bottom, the figure shows part of the visual display of the "Wenzhou high-speed train rear-end collision and derailment" event; the arrows in the figure connect identical places at adjacent time points.
The visual display method provided by this embodiment of the present invention helps readers by displaying the evolution of the whole news event and the association relations among elements such as persons and places comprehensively, concisely and intuitively.
Claims (3)
1. A visual display method oriented to the evolution process of a news event, characterized in that: first, the news report web pages related to a particular news event are obtained from news sources and deduplicated to obtain a duplicate-free news report data set for the event; next, the news report data set is divided into data subsets by news report time, yielding data subsets ordered by report time; the person and place news event elements of each report time are extracted from the corresponding data subset, and the event summary of that report time is generated on this basis; finally, within the data subset of each news report time, the person and place news elements and the event summary sentences that meet the importance requirement are taken as nodes and the association relations among them as edges, and the event at that time point is displayed visually; the above steps are executed by a purpose-written computer program;
selecting data subsets specifically includes: if the number of sentences in a data subset that contain persons or places exceeds a preset threshold, the data subset is judged to be an important data subset, and the news report time corresponding to it is judged to be an important report time of the particular news event;
generating the event summary specifically includes sorting the sentences in a data subset in descending order of importance and selecting the sentences whose importance exceeds a preset requirement as the event summary of the data subset for that report time; sentence importance is defined in terms of two factors, a word-frequency weight and a news-element weight: the word-frequency weight of a sentence is computed with the TFIDF weighting method, and the news-element weight of a sentence is the relative frequency with which person and place news elements occur in the data subset; the specific process includes:
first, person and place news event elements are extracted from each data subset to obtain news element phrase sets, for example the place element phrase set $L=\{l_1, l_2, \ldots, l_n\}$, and the weights of each element phrase set are quantified; the weight of each place element is quantified by formula (1), in which the count term is the number of occurrences of place element $l_i$ in the data subset and $|L|$ is the total number of place elements in the data subset; $p(l_i)$ is taken as the weight of element $l_i$;
second, the weight of each word in each data subset is calculated; the score $\mathrm{Score}(w_i)$ of each word $w_i$ in the data subset is computed with the TFIDF method according to formula (2), in which $TF(w_i)$ is the number of occurrences of word $w_i$ in the data subset, the inverse-frequency term is the inverse of the frequency of data subsets in the whole data set in which $w_i$ occurs, and $|W|$ is the total number of words in the data subset; $\mathrm{Score}(w_i)$ is taken as the weight of word $w_i$; if $w_i$ is an element phrase, its weight is adjusted using the news element weight, otherwise its weight is unchanged; taking a word $w_i$ that is a place element as an example, the adjustment is given by formula (3), in which $\pi(w_i)$ denotes the weight of word $w_i$;
third, each sentence $S_j$ is assigned a weight from the weights of the words it contains, by averaging those word weights:

$$\pi(S_j) = \frac{1}{n}\sum_{w_i \in S_j} \pi(w_i) \tag{4}$$

in formula (4), $\pi(S_j)$ denotes the weight of sentence $S_j$ and $n$ denotes the number of words contained in $S_j$;
finally, the sentences are sorted in descending order of sentence weight, and the top-ranked sentences are chosen as the event summary of that time point.
2. The visual display method oriented to the evolution process of a news event according to claim 1, characterized in that obtaining the news report web pages related to a particular news event and deduplicating them specifically includes: counting the occurrences of each character in the text of every news report web page and taking the characters whose occurrence count exceeds a preset requirement as the feature characters of that page; arranging all feature characters by occurrence count into a feature string and applying a hash method to it to generate a fixed-length feature code; and, if the feature codes of two news report web pages are identical, regarding one of them as a duplicate page.
3. The visual display method oriented to the evolution process of a news event according to claim 1 or 2, characterized in that, in the visual display of the news event evolution process, the importance of a person or place news element is computed as the relative frequency with which the element occurs in the data subset; the event summary sentences are selected in the same way as in the event summary generation process of claim 1; centrality analysis is performed on the determined nodes and edges, so that a more important node is drawn larger in the graph and placed closer to the middle of the graph; and, if an event summary sentence node contains a certain person or place, that event summary sentence node is connected by a line to the corresponding person or place node in the graph, and otherwise the nodes are not connected.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310303085.6A CN103473263B (en) | 2013-07-18 | 2013-07-18 | News event development process-oriented visual display method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310303085.6A CN103473263B (en) | 2013-07-18 | 2013-07-18 | News event development process-oriented visual display method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103473263A CN103473263A (en) | 2013-12-25 |
CN103473263B true CN103473263B (en) | 2017-02-08 |
Family
ID=49798111
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310303085.6A Active CN103473263B (en) | 2013-07-18 | 2013-07-18 | News event development process-oriented visual display method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103473263B (en) |
Families Citing this family (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104182504B (en) * | 2014-08-18 | 2017-06-06 | 合肥工业大学 | A kind of dynamic tracking of media event and summary algorithm |
CN104408093B (en) * | 2014-11-14 | 2018-01-26 | 中国科学院计算技术研究所 | A kind of media event key element abstracting method and device |
CN106156042B (en) * | 2015-03-26 | 2020-02-07 | 科大讯飞股份有限公司 | Hotspot information display method and system |
CN104915446B (en) * | 2015-06-29 | 2019-01-29 | 华南理工大学 | Event Evolvement extraction method and its system based on news |
WO2017107010A1 (en) * | 2015-12-21 | 2017-06-29 | 浙江核新同花顺网络信息股份有限公司 | Information analysis system and method based on event regression test |
CN105787095B (en) * | 2016-03-16 | 2019-09-27 | 广州索答信息科技有限公司 | The automatic generation method and device of internet news |
CN105912526A (en) * | 2016-04-15 | 2016-08-31 | 北京大学 | Sports game live broadcasting text based sports news automatic constructing method and device |
CN106709968A (en) * | 2016-11-30 | 2017-05-24 | 剧加科技(厦门)有限公司 | Data visualization method and system for play story information |
CN106874419B (en) * | 2017-01-22 | 2019-09-10 | 北京航空航天大学 | A kind of real-time hot spot polymerization of more granularities |
CN110020104B (en) * | 2017-09-05 | 2023-04-07 | 腾讯科技(北京)有限公司 | News processing method and device, storage medium and computer equipment |
CN110110089B (en) * | 2018-01-09 | 2021-03-30 | 网智天元科技集团股份有限公司 | Cultural relation graph generation method and system |
CN108170838B (en) * | 2018-01-12 | 2022-07-08 | 平安科技(深圳)有限公司 | Topic evolution visualization display method, application server and computer readable storage medium |
CN108427761B (en) * | 2018-03-21 | 2022-01-14 | 腾讯科技(深圳)有限公司 | News event processing method, terminal, server and storage medium |
CN110162651B (en) * | 2019-04-23 | 2023-07-14 | 南京邮电大学 | News content image-text disagreement identification system and identification method based on semantic content abstract |
CN111931092B (en) * | 2020-07-07 | 2022-07-12 | 浙江大学 | Data visualization exploration system based on Scrollytelling technology |
CN112328856A (en) * | 2020-10-30 | 2021-02-05 | 中国平安人寿保险股份有限公司 | Common event tracking method and device, computer equipment and computer readable medium |
CN113626668B (en) * | 2021-07-02 | 2024-05-14 | 武汉大学 | News multi-scale visualization method for map |
CN114040518A (en) * | 2021-11-26 | 2022-02-11 | 中国银行股份有限公司 | Network node display method and device |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102646114A (en) * | 2012-02-17 | 2012-08-22 | 清华大学 | News topic timeline abstract generating method based on breakthrough point |
Non-Patent Citations (3)
Title |
---|
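Research on a news web page deduplication method based on character statistics (基于字符统计的新闻网页去重方法研究); Jiang Jinping et al.; Sciencepaper Online (《中国科技论文在线》); 20130513; page 4, paragraph 3, lines 3-6 and paragraph 6; page 5, paragraph 2, lines 1-3 * |
Application of text visualization to the evolution of news events (文本可视化在新闻事件演变中的应用); Liu Xiaojuan et al.; Library and Information Service (《图书情报工作》); 20100920; vol. 54 (no. 18); full text * |
Design and implementation of an automatic generation *** for the evolution relations of news topic events (新闻话题事件演变关系自动生成***的设计和实现); Li Bin; Wanfang Data Knowledge Service Platform (《万方数据知识服务平台》); 20120531; page 22, paragraph 3; page 23, paragraphs 5-6; page 39, paragraph 1 * |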
Also Published As
Publication number | Publication date |
---|---|
CN103473263A (en) | 2013-12-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103473263B (en) | News event development process-oriented visual display method | |
JP5886733B2 (en) | Video group reconstruction / summarization apparatus, video group reconstruction / summarization method, and video group reconstruction / summarization program | |
CN110909164A (en) | Text enhancement semantic classification method and system based on convolutional neural network | |
CN102591612B (en) | General webpage text extraction method based on punctuation continuity and system thereof | |
CN103544176A (en) | Method and device for generating page structure template corresponding to multiple pages | |
CN103544255A (en) | Text semantic relativity based network public opinion information analysis method | |
CN104778209A (en) | Opinion mining method for ten-million-scale news comments | |
CN103810251B (en) | Method and device for extracting text | |
CN112667940B (en) | Webpage text extraction method based on deep learning | |
CN103870001A (en) | Input method candidate item generating method and electronic device | |
WO2016057984A1 (en) | Methods and systems for base map and inference mapping | |
CN103678412A (en) | Document retrieval method and device | |
CN111460162B (en) | Text classification method and device, terminal equipment and computer readable storage medium | |
WO2014000130A1 (en) | Method or system for automated extraction of hyper-local events from one or more web pages | |
CN106294330A (en) | A kind of scientific text selection method and device | |
CN102955853A (en) | Method and device for generating cross-language abstract | |
Ma et al. | Extracting unstructured data from template generated web documents | |
CN109522410A (en) | Document clustering method and platform, server and computer-readable medium | |
CN106485525A (en) | Information processing method and device | |
Liu et al. | Main content extraction from web pages based on node characteristics | |
Li et al. | Text mining and visualization of papers reviews using R language | |
D’Silva et al. | Development of a Konkani language dataset for automatic text summarization and its challenges | |
Das et al. | Sentiment analysis: what is the end user's requirement? | |
CN1604073A (en) | Method for conducting title and text logic connection for newspaper pages | |
Hsu et al. | Hierarchical comments-based clustering |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant |