CN114201600A - Public opinion text abstract extraction method, device, equipment and computer storage medium - Google Patents

Info

Publication number
CN114201600A
Authority
CN
China
Prior art keywords
public opinion
text
opinion text
abstract
sentence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111510633.3A
Other languages
Chinese (zh)
Inventor
陈佳颖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jindi Technology Co Ltd
Original Assignee
Beijing Jindi Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jindi Technology Co Ltd filed Critical Beijing Jindi Technology Co Ltd
Priority to CN202111510633.3A priority Critical patent/CN114201600A/en
Publication of CN114201600A publication Critical patent/CN114201600A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/34 Browsing; Visualisation therefor
    • G06F16/345 Summarisation for human users
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the application provide a method, a device and equipment for extracting an abstract from a public opinion text, and a computer storage medium, and relate to the technical field of computers. The method comprises the following steps: determining the similarity between each sentence in the public opinion text to be abstracted and the text title of the public opinion text; determining abstract candidate sentences of the public opinion text according to the similarity between each sentence in the public opinion text and the text title of the public opinion text; performing abstract extraction on the public opinion text through a public opinion text abstract extraction model to obtain scoring data of each sentence in the public opinion text belonging to the text abstract of the public opinion text; and determining the text abstract of the public opinion text from the abstract candidate sentences of the public opinion text according to the scoring data. The scheme can effectively ensure that the extracted abstract preserves the central gist of the public opinion text.

Description

Public opinion text abstract extraction method, device, equipment and computer storage medium
Technical Field
The embodiment of the application relates to the technical field of computers, in particular to a method and a device for extracting a summary of a public opinion text, electronic equipment and a computer storage medium.
Background
In the public opinion section of an enterprise detail page, a user can learn about the events, news and the like that have recently occurred at an enterprise. Besides the title of each public opinion text, the public opinion list in this section also displays part of the text content, so as to supplement the theme that the public opinion text expresses. At present, the first N words of the body of the public opinion text are used as this supplementary part, but these words cannot fully convey the real theme of the public opinion text.
Therefore, how to effectively ensure that the abstract extracted from a public opinion text preserves its central theme has become a technical problem to be solved urgently.
Disclosure of Invention
In view of the above, an embodiment of the present invention provides a method, an apparatus, an electronic device and a computer storage medium for abstracting a summary of a public opinion text, so as to solve the technical problem of how to effectively ensure the central gist of the abstract abstracted from the public opinion text in the prior art.
According to a first aspect of the embodiments of the present invention, there is provided a method for extracting an abstract from a public opinion text, the method including: determining the similarity between each sentence in the public opinion text to be abstracted and the text title of the public opinion text; determining abstract candidate sentences of the public opinion text according to the similarity between each sentence in the public opinion text and the text title of the public opinion text; performing abstract extraction on the public opinion text through a public opinion text abstract extraction model to obtain scoring data of each sentence in the public opinion text belonging to the text abstract of the public opinion text; and determining the text abstract of the public opinion text from the abstract candidate sentences of the public opinion text according to the scoring data of each sentence in the public opinion text belonging to the text abstract of the public opinion text.
According to a second aspect of the embodiments of the present invention, there is provided a device for extracting an abstract from a public opinion text, the device including: a first determining module, used for determining the similarity between each sentence in the public opinion text to be abstracted and the text title of the public opinion text; a second determining module, used for determining abstract candidate sentences of the public opinion text according to the similarity between each sentence in the public opinion text and the text title of the public opinion text; an abstract extraction module, used for performing abstract extraction on the public opinion text through a public opinion text abstract extraction model so as to obtain scoring data of each sentence in the public opinion text belonging to the text abstract of the public opinion text; and a third determining module, used for determining the text abstract of the public opinion text from the abstract candidate sentences of the public opinion text according to the scoring data of each sentence in the public opinion text belonging to the text abstract of the public opinion text.
According to a third aspect of embodiments of the present invention, there is provided an electronic apparatus, including: the system comprises a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface complete mutual communication through the communication bus; the memory is used for storing at least one executable instruction, and the executable instruction causes the processor to execute the operation corresponding to the public opinion text abstract extracting method in the first aspect.
According to a fourth aspect of embodiments of the present invention, there is provided a computer storage medium having stored thereon a computer program which, when executed by a processor, implements a method of abstracting a public opinion text as described in the first aspect.
According to the abstract extraction scheme for public opinion text provided by the embodiments of the present invention, the similarity between each sentence in the public opinion text to be abstracted and the text title of the public opinion text is determined, and the abstract candidate sentences of the public opinion text are determined according to this similarity. Abstract extraction is then performed on the public opinion text through a public opinion text abstract extraction model to obtain scoring data of each sentence in the public opinion text belonging to the text abstract of the public opinion text, and the text abstract of the public opinion text is determined from the abstract candidate sentences according to this scoring data. Compared with other existing approaches, determining the abstract candidate sentences according to the similarity between each sentence and the text title brings the finally extracted text abstract closer to the text title of the public opinion text, so that the central theme of the finally extracted text abstract is effectively ensured.
In addition, the scoring data of each sentence belonging to the text abstract of the public opinion text effectively ensures the information content of the sentences selected for the text abstract, so that the text abstract can be accurately determined from the abstract candidate sentences of the public opinion text, and the central gist of the finally extracted text abstract is effectively ensured.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments described in the embodiments of the present invention, and it is also possible for a person skilled in the art to obtain other drawings based on the drawings.
Fig. 1 is a flowchart illustrating a method for extracting an abstract from a public opinion text according to an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a public opinion text abstract extraction device in the second embodiment;
Fig. 3 is a schematic structural diagram of an electronic device in the third embodiment.
Detailed Description
In order to make those skilled in the art better understand the technical solutions in the embodiments of the present invention, the technical solutions in the embodiments of the present invention will be described clearly and completely with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments. All other embodiments obtained by a person skilled in the art based on the embodiments of the present invention shall fall within the scope of the protection of the embodiments of the present invention.
The following further describes specific implementation of the embodiments of the present invention with reference to the drawings.
Referring to fig. 1, a flowchart illustrating steps of a method for abstracting a summary of a public opinion text in this embodiment is shown.
Specifically, the method for extracting the abstract of the public opinion text provided by the embodiment includes the following steps:
in step S101, the similarity between each sentence in the public opinion text to be abstracted and the text title of the public opinion text is determined respectively.
In this embodiment, the public opinion text to be abstracted may be events or news in public opinion tiles in a website page of a business entity. It should be understood that the above description is only exemplary, and the present embodiment is not limited thereto.
In some optional embodiments, when the similarity between each sentence in the public opinion text to be abstracted and the text title of the public opinion text is respectively determined, semantic feature characterization data of each sentence in the public opinion text is respectively determined; determining semantic feature representation data of a text title of the public opinion text; respectively determining distance data between each sentence in the public opinion text and the text title of the public opinion text according to semantic feature representation data of each sentence in the public opinion text and semantic feature representation data of the text title of the public opinion text; and determining the similarity between each sentence in the public opinion text and the text title of the public opinion text according to the distance data between each sentence in the public opinion text and the text title of the public opinion text. Therefore, the distance data between each sentence in the public opinion text and the text title of the public opinion text can be accurately determined through the semantic feature representation data of each sentence in the public opinion text and the semantic feature representation data of the text title of the public opinion text, and the similarity between each sentence in the public opinion text and the text title of the public opinion text can be accurately determined through the distance data between each sentence in the public opinion text and the text title of the public opinion text. It should be understood that the above description is only exemplary, and the present embodiment is not limited thereto.
In a specific example, when determining the semantic feature representation data of each sentence in the public opinion text and of the text title of the public opinion text, a semantic feature representation vector of each sentence in the public opinion text and a semantic feature representation vector of the text title of the public opinion text are predicted through a semantic feature representation model. The semantic feature representation model may be any suitable neural network model capable of feature extraction or target object detection, including but not limited to a convolutional neural network, a reinforcement learning neural network, the generator network of a generative adversarial network, and the like. The specific structure of the neural network, such as the number of convolutional layers, the size of the convolution kernels, and the number of channels, can be set by those skilled in the art according to actual requirements. The semantic feature representation model may be a BERT (Bidirectional Encoder Representations from Transformers) model. When determining the distance data between each sentence in the public opinion text and the text title of the public opinion text, the cosine distance between each sentence and the text title is calculated according to the semantic feature representation vector of the sentence and the semantic feature representation vector of the text title.
When determining the similarity between each sentence in the public opinion text and the text title of the public opinion text, the greater the distance data between each sentence in the public opinion text and the text title of the public opinion text is, the smaller the similarity between each sentence in the public opinion text and the text title of the public opinion text is; the smaller the distance data between each sentence in the public opinion text and the text title of the public opinion text is, the greater the similarity between each sentence in the public opinion text and the text title of the public opinion text is. It should be understood that the above description is only exemplary, and the present embodiment is not limited thereto.
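As a non-limiting illustration of step S101, the following minimal sketch computes the cosine distance and the resulting similarity between sentence vectors and a title vector. The function names and the toy three-dimensional vectors are hypothetical; in practice the vectors would come from a semantic feature representation model such as BERT, which is not shown here. The sketch also assumes that similarity is taken as one minus the cosine distance, which is one simple way to satisfy the inverse relationship described above.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def title_similarities(sentence_vectors, title_vector):
    """Return one (cosine distance, similarity) pair per sentence, relative to the title."""
    results = []
    for vec in sentence_vectors:
        sim = cosine_similarity(vec, title_vector)
        distance = 1.0 - sim  # assumption: larger distance means smaller similarity
        results.append((distance, sim))
    return results


# Toy vectors standing in for model output (hypothetical values).
title_vec = [1.0, 0.0, 0.0]
sentence_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]]
pairs = title_similarities(sentence_vecs, title_vec)
```

In this toy input the first sentence points in the same direction as the title and therefore has the smallest distance, while the second is orthogonal to the title and has the largest.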
In step S102, a candidate sentence of the abstract of the public opinion text is determined according to the similarity between each sentence in the public opinion text and the text title of the public opinion text.
In some optional embodiments, when determining the abstract candidate sentences of the public opinion text according to the similarity between each sentence in the public opinion text and the text title of the public opinion text, performing descending order arrangement on each sentence in the public opinion text according to the similarity between each sentence in the public opinion text and the text title of the public opinion text respectively to obtain descending order arrangement results of each sentence in the public opinion text; and determining abstract candidate sentences of the public opinion text according to the descending order arrangement result of each sentence in the public opinion text. Therefore, through the descending order arrangement result of each sentence in the public opinion text, the abstract candidate sentence of the public opinion text can be accurately determined. It should be understood that the above description is only exemplary, and the present embodiment is not limited thereto.
In a specific example, when determining the abstract candidate sentences of the public opinion text according to the descending order arrangement result of each sentence in the public opinion text, the M sentences ranked first in the descending order are determined according to the descending order arrangement result, where M represents a natural number, and these M sentences are determined as the abstract candidate sentences of the public opinion text. Therefore, through the M sentences ranked first in the descending order in the public opinion text, the abstract candidate sentences of the public opinion text can be accurately determined. It should be understood that the above description is only exemplary, and the present embodiment is not limited thereto.
In a specific example, in order to make the extracted text abstract of the public opinion text closer to the text title of the public opinion text and ensure the central gist of the text abstract of the public opinion text, the text title of the public opinion text is taken into consideration of the extraction of the text abstract of the public opinion text. Therefore, the top M sentences can be selected as the candidate set of abstract sentences first according to the similarity with the text titles. It should be understood that the above description is only exemplary, and the present embodiment is not limited thereto.
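Step S102 can be sketched as follows, under the assumption that the title similarities are already available as a list of floats indexed by sentence position. The function name `select_candidates` and the sample values are hypothetical and serve only to illustrate the descending sort and the top-M cut.

```python
def select_candidates(similarities, m):
    """Return the indices of the m sentences most similar to the title, best first."""
    # Sort sentence indices by similarity in descending order; ties keep document order.
    order = sorted(range(len(similarities)), key=lambda i: similarities[i], reverse=True)
    return order[:m]


# Hypothetical title similarities for a four-sentence text.
sims = [0.42, 0.91, 0.10, 0.77]
candidates = select_candidates(sims, 2)
```

Returning indices rather than the sentences themselves keeps the candidates aligned with the per-sentence scoring data produced in the next step.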
In step S103, abstract extraction is performed on the public opinion text through a public opinion text abstract extraction model to obtain scoring data of each sentence in the public opinion text belonging to the text abstract of the public opinion text.
In this embodiment, the public opinion text abstract extraction model may be any suitable neural network model capable of feature extraction or target object detection, including but not limited to a convolutional neural network, a reinforcement learning neural network, the generator network of a generative adversarial network, and the like. The specific structure of the neural network, such as the number of convolutional layers, the size of the convolution kernels, and the number of channels, can be set by those skilled in the art according to actual requirements. The public opinion text abstract extraction model can be a LexRank model. The abstract of the public opinion text is extracted with the LexRank model so that the centrality and fluency of the text abstract of the public opinion text, and how closely it fits the text title of the public opinion text, are optimized. It should be understood that the above description is only exemplary, and the present embodiment is not limited thereto.
In some optional embodiments, when abstract extraction is performed on the public opinion text through the public opinion text abstract extraction model, the similarity between the sentences in the public opinion text is determined through the model; a scalar graph is constructed through the model by taking the sentences in the public opinion text as nodes and the similarity between the sentences as connecting lines between the nodes, wherein the thickness of the connecting line between two nodes represents the similarity between the sentences corresponding to those nodes; and the scoring data of each sentence in the public opinion text belonging to the text abstract of the public opinion text is obtained from the scalar graph through the model. Therefore, the scoring data of each sentence in the public opinion text belonging to the text abstract of the public opinion text can be accurately obtained through the public opinion text abstract extraction model. It should be understood that the above description is only exemplary, and the present embodiment is not limited thereto.
In a specific example, when the similarity between each sentence in the public opinion text is determined through the public opinion text abstract extraction model, determining the cosine distance between the semantic feature characterization vectors of each sentence in the public opinion text through the public opinion text abstract extraction model; and determining the similarity between the sentences in the public opinion text according to the cosine distance between the semantic feature representation vectors of the sentences in the public opinion text by the public opinion text abstract extraction model. It should be understood that the above description is only exemplary, and the present embodiment is not limited thereto.
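The graph construction described above can be sketched as a symmetric weight matrix in which entry (i, j) holds the cosine similarity between sentence i and sentence j, that is, the thickness of the connecting line between the two nodes. This is a minimal sketch under stated assumptions: the function names are hypothetical, the sentence vectors are toy values, and the diagonal is set to zero on the assumption that a sentence is not connected to itself.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_similarity_graph(sentence_vectors):
    """Build a symmetric matrix of pairwise sentence similarities (edge thickness)."""
    n = len(sentence_vectors)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            w = cosine_similarity(sentence_vectors[i], sentence_vectors[j])
            weights[i][j] = w  # the line between node i and node j
            weights[j][i] = w  # the graph is undirected
    return weights


# Toy two-dimensional sentence vectors (hypothetical values).
vectors = [[1.0, 0.0], [2.0, 0.0], [0.0, 3.0]]
graph = build_similarity_graph(vectors)
```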
In some optional embodiments, when the scoring data of each sentence in the public opinion text belonging to the text abstract of the public opinion text is obtained from the scalar graph through the public opinion text abstract extraction model, the model predicts the scoring data of each sentence according to the number of connecting lines and the thickness of the connecting lines of the node corresponding to that sentence in the scalar graph. Therefore, through the public opinion text abstract extraction model, the scoring data of each sentence belonging to the text abstract of the public opinion text can be accurately predicted according to the number and thickness of the connecting lines of the corresponding node in the scalar graph. It should be understood that the above description is only exemplary, and the present embodiment is not limited thereto.
In a specific example, the number of connecting lines of the node corresponding to a sentence in the scalar graph reflects the amount of information contained in that sentence: the more connecting lines the node has, the more information the sentence contains; the fewer connecting lines the node has, the less information the sentence contains. The thickness of the connecting lines of the node likewise reflects the amount of information contained in the sentence: the thicker the connecting lines, the more information the sentence contains; the thinner the connecting lines, the less information the sentence contains.
When the scoring data is predicted in this way, the public opinion text abstract extraction model calculates the amount of information contained in each sentence in the public opinion text according to the number and thickness of the connecting lines of the corresponding node in the scalar graph, and then determines the scoring data of each sentence belonging to the text abstract of the public opinion text according to the amount of information contained in that sentence. Therefore, the scoring data of each sentence in the public opinion text belonging to the text abstract of the public opinion text can be accurately determined according to the amount of information contained in each sentence. It should be understood that the above description is only exemplary, and the present embodiment is not limited thereto.
In a specific example, when the scoring data is determined according to the amount of information contained in each sentence in the public opinion text, the more information a sentence contains, the larger the scoring data of that sentence belonging to the text abstract of the public opinion text; the less information a sentence contains, the smaller the scoring data of that sentence belonging to the text abstract of the public opinion text. It should be understood that the above description is only exemplary, and the present embodiment is not limited thereto.
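One simple way to combine the number and the thickness of the connecting lines into a score is a thresholded weighted degree: a line counts only if its similarity exceeds a threshold, and the score of a node is the sum of the thicknesses of its counted lines. This is only a sketch under stated assumptions, not the scoring formula of the embodiment. The function name, the threshold value, and the sample matrix are hypothetical, and the published LexRank algorithm ranks nodes with a PageRank-style iteration rather than this simpler degree-based form.

```python
def score_sentences(weights, threshold=0.1):
    """Score each sentence by the summed thickness of its lines above the threshold."""
    scores = []
    for i, row in enumerate(weights):
        # Keep only lines to other nodes that are thick enough to count.
        links = [w for j, w in enumerate(row) if j != i and w > threshold]
        # More lines and thicker lines both increase the score.
        scores.append(sum(links))
    return scores


# Hypothetical symmetric similarity matrix for three sentences.
graph = [
    [0.0, 0.8, 0.5],
    [0.8, 0.0, 0.05],
    [0.5, 0.05, 0.0],
]
scores = score_sentences(graph)
```

In this toy matrix the first sentence has two counted lines, while each of the other two has one, so the first sentence receives the highest score.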
In step S104, a text abstract of the public opinion text is determined from the abstract candidate sentences of the public opinion text according to the score data of the text abstract of the public opinion text to which each sentence belongs.
In some optional embodiments, when the text abstract of the public opinion text is determined from the abstract candidate sentences of the public opinion text according to the scoring data of each sentence in the public opinion text belonging to the text abstract of the public opinion text, the scoring data of each abstract candidate sentence belonging to the text abstract is first determined from the scoring data of each sentence in the public opinion text; the abstract candidate sentences of the public opinion text are then arranged in descending order according to their scoring data, and the N abstract candidate sentences ranked first in the descending order are determined according to the descending order arrangement result, where N represents a natural number; and the text abstract of the public opinion text is generated from these N abstract candidate sentences. Therefore, the descending order arrangement result of the abstract candidate sentences can be accurately determined through the scoring data of the abstract candidate sentences belonging to the text abstract of the public opinion text, and the text abstract of the public opinion text can be accurately generated through that descending order arrangement result. It should be understood that the above description is only exemplary, and the present embodiment is not limited thereto.
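Step S104 can be sketched as follows, assuming that the candidate indices from step S102 and the per-sentence scores from step S103 are already available. The function name `select_summary` and the sample data are hypothetical. Restoring document order before joining the chosen sentences is an added assumption made for readability; the description above does not specify how the N sentences are assembled.

```python
def select_summary(sentences, candidate_indices, scores, n):
    """Pick the n highest-scoring candidate sentences and join them into an abstract."""
    # Only candidates are ranked; non-candidates are ignored even if they score highly.
    ranked = sorted(candidate_indices, key=lambda i: scores[i], reverse=True)
    chosen = sorted(ranked[:n])  # assumption: present sentences in document order
    return " ".join(sentences[i] for i in chosen)


# Hypothetical four-sentence text with candidates and scores from the earlier steps.
sentences = ["A.", "B.", "C.", "D."]
candidate_indices = [3, 1, 0]
scores = [0.2, 0.9, 0.95, 0.7]
summary = select_summary(sentences, candidate_indices, scores, 2)
```

Here the third sentence has the highest score but is not a candidate, so it is excluded; this reflects how the title-similarity filter and the model score are combined.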
Through the method for extracting an abstract from a public opinion text provided by the embodiments of the present invention, the similarity between each sentence in the public opinion text to be abstracted and the text title of the public opinion text is determined, and the abstract candidate sentences of the public opinion text are determined according to this similarity. Abstract extraction is then performed on the public opinion text through a public opinion text abstract extraction model to obtain scoring data of each sentence in the public opinion text belonging to the text abstract of the public opinion text, and the text abstract of the public opinion text is determined from the abstract candidate sentences according to this scoring data. Compared with other existing approaches, determining the abstract candidate sentences according to the similarity between each sentence and the text title brings the finally extracted text abstract closer to the text title of the public opinion text, so that the central theme of the finally extracted text abstract is effectively ensured.
In addition, the scoring data of each sentence belonging to the text abstract of the public opinion text effectively ensures the information content of the sentences selected for the text abstract, so that the text abstract can be accurately determined from the abstract candidate sentences of the public opinion text, and the central gist of the finally extracted text abstract is effectively ensured.
The method for extracting the abstract of the public opinion text provided by the embodiment can be executed by any suitable device with data processing capability, including but not limited to: a camera, a terminal, a mobile terminal, a PC, a server, an in-vehicle device, an entertainment device, an advertising device, a Personal Digital Assistant (PDA), a tablet computer, a notebook computer, a handheld game console, smart glasses, a smart watch, a wearable device, a virtual display device, a display enhancement device, or the like.
Referring to fig. 2, a schematic structural diagram of a public opinion text abstract extracting device in the second embodiment of the present application is shown.
The public opinion text abstract extraction device provided by this embodiment comprises: a first determining module 201, configured to respectively determine the similarity between each sentence in a public opinion text to be abstracted and the text title of the public opinion text; a second determining module 202, configured to determine abstract candidate sentences of the public opinion text according to the similarity between each sentence in the public opinion text and the text title of the public opinion text; an abstract extraction module 203, configured to perform abstract extraction on the public opinion text through a public opinion text abstract extraction model, so as to obtain scoring data for each sentence in the public opinion text belonging to the text abstract of the public opinion text; and a third determining module 204, configured to determine the text abstract of the public opinion text from the abstract candidate sentences of the public opinion text according to the scoring data for each sentence in the public opinion text belonging to the text abstract of the public opinion text.
Optionally, the first determining module 201 is specifically configured to: respectively determine semantic feature representation data of each sentence in the public opinion text; determine semantic feature representation data of the text title of the public opinion text; respectively determine distance data between each sentence in the public opinion text and the text title of the public opinion text according to the semantic feature representation data of each sentence and the semantic feature representation data of the text title; and determine the similarity between each sentence in the public opinion text and the text title of the public opinion text according to the distance data between each sentence and the text title.
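As an illustrative, non-limiting sketch of the steps performed by the first determining module 201, the following Python code assumes that the semantic feature representation data are fixed-length numeric vectors and that the distance data are cosine distances. The text does not specify the encoder or the distance measure, so both are assumptions, and the function names `cosine_distance` and `title_similarities` are hypothetical.

```python
import math
from typing import List, Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine distance in [0, 2]; assumed here to play the role of the distance data."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # treat an all-zero vector as unrelated to everything
    return 1.0 - dot / (norm_a * norm_b)


def title_similarities(
    sentence_vecs: Sequence[Sequence[float]], title_vec: Sequence[float]
) -> List[float]:
    """Convert each sentence-to-title distance into a similarity (1 - distance)."""
    return [1.0 - cosine_distance(v, title_vec) for v in sentence_vecs]
```

Under these assumptions a sentence whose vector points in the same direction as the title vector receives a similarity of 1, and an orthogonal sentence receives 0.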
Optionally, the second determining module 202 includes: an arrangement submodule, configured to arrange the sentences in the public opinion text in descending order according to the similarity between each sentence in the public opinion text and the text title of the public opinion text, so as to obtain a descending-order arrangement result of the sentences in the public opinion text; and a first determining submodule, configured to determine the abstract candidate sentences of the public opinion text according to the descending-order arrangement result of the sentences in the public opinion text.
Optionally, the first determining submodule is specifically configured to: determine, according to the descending-order arrangement result of the sentences in the public opinion text, the M top-ranked sentences in the public opinion text, where M is a natural number; and determine the M top-ranked sentences as the abstract candidate sentences of the public opinion text.
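The descending-order arrangement and the selection of the M top-ranked sentences can be sketched as follows. This is a minimal illustration only: the function name `top_m_candidates` is hypothetical, and the handling of ties (earlier sentences first) is an assumption that the text does not address.

```python
from typing import List, Sequence


def top_m_candidates(similarities: Sequence[float], m: int) -> List[int]:
    """Return the indices of the M sentences most similar to the title."""
    if m < 0:
        raise ValueError("m must be non-negative")
    # sorted() is stable even with reverse=True, so sentences with equal
    # similarity keep their original document order (assumed tie rule).
    order = sorted(
        range(len(similarities)), key=lambda i: similarities[i], reverse=True
    )
    return order[:m]
```

Returning indices rather than sentence strings lets a later step look up each candidate's scoring data by position.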
Optionally, the abstract extraction module 203 includes: a second determining submodule, configured to determine the similarity between the sentences in the public opinion text through the public opinion text abstract extraction model; a construction submodule, configured to construct, through the public opinion text abstract extraction model, a standard graph by taking the sentences in the public opinion text as nodes and the similarity between the sentences in the public opinion text as connecting lines between the nodes, where the thickness of a connecting line between two nodes represents the magnitude of the similarity between the sentences corresponding to those nodes; and an obtaining submodule, configured to obtain, through the public opinion text abstract extraction model and according to the standard graph, the scoring data for each sentence in the public opinion text belonging to the text abstract of the public opinion text.
Optionally, the obtaining submodule is specifically configured to: predict, through the public opinion text abstract extraction model, the scoring data for each sentence in the public opinion text belonging to the text abstract of the public opinion text according to the number and thickness of the connecting lines of the node corresponding to that sentence in the standard graph.
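One simple way to score a node from the number and thickness of its connecting lines is weighted degree centrality, sketched below. This is a stand-in chosen for illustration, not the public opinion text abstract extraction model itself, whose internal scoring rule the text does not disclose. The graph is read here as a weighted undirected graph; the `threshold` parameter, which decides when two sentences are connected at all, and the function names are assumptions.

```python
from typing import List, Sequence


def build_graph(
    sim: Sequence[Sequence[float]], threshold: float = 0.0
) -> List[List[float]]:
    """Sentences are nodes; an edge's weight ("thickness") is the similarity.

    Pairs whose similarity is not above `threshold` get no edge (assumed rule).
    """
    n = len(sim)
    graph = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Read only the upper triangle so the resulting graph is
            # symmetric even if the input matrix is slightly asymmetric.
            w = sim[i][j]
            if w > threshold:
                graph[i][j] = graph[j][i] = w
    return graph


def score_sentences(graph: Sequence[Sequence[float]]) -> List[float]:
    """Weighted degree: more edges and thicker edges both raise the score."""
    return [sum(row) for row in graph]
```

An iterative ranking over the same graph, in the style of TextRank, would be another plausible reading; the weighted degree is used here only because it depends on exactly the two quantities named in the text.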
Optionally, the third determining module 204 is specifically configured to: determine the scoring data of the abstract candidate sentences of the public opinion text according to the scoring data for each sentence in the public opinion text belonging to the text abstract of the public opinion text; arrange the abstract candidate sentences in descending order according to their scoring data, and determine, according to the descending-order arrangement result of the abstract candidate sentences, the N top-ranked abstract candidate sentences of the public opinion text, where N is a natural number; and generate the text abstract of the public opinion text from the N top-ranked abstract candidate sentences.
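The final selection can be sketched as below: the scoring data are looked up for the candidate sentences only, the candidates are ranked, and the N top-ranked ones are joined into the abstract. The function name `select_abstract`, the `sep` parameter, and the choice to emit the selected sentences in their original document order are assumptions made for illustration; the text only states that the abstract is generated from the N top-ranked candidates.

```python
from typing import Sequence


def select_abstract(
    sentences: Sequence[str],
    scores: Sequence[float],
    candidates: Sequence[int],
    n: int,
    sep: str = " ",
) -> str:
    """Pick the N best-scoring candidate sentences and join them."""
    # Only candidates are ranked; a high-scoring non-candidate is ignored.
    ranked = sorted(candidates, key=lambda i: scores[i], reverse=True)
    chosen = ranked[:n]
    # Assumption: restore document order so the abstract reads naturally.
    return sep.join(sentences[i] for i in sorted(chosen))
```

For example, with scores `[0.1, 0.9, 0.5, 0.7]` and candidates `[0, 2, 3]`, the sentence at index 1 is never selected despite having the highest score, because it was not close enough to the title to become a candidate.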
The apparatus for abstracting a public opinion text provided in this embodiment is used to implement a corresponding method for abstracting a public opinion text in the foregoing multiple method embodiments, and has the beneficial effects of the corresponding method embodiments, which are not described herein again.
Referring to fig. 3, a schematic structural diagram of an electronic device according to a third embodiment of the present invention is shown, and the specific embodiment of the present invention does not limit the specific implementation of the electronic device.
As shown in fig. 3, the electronic device may include: a processor 302, a communication interface 304, a memory 306, and a communication bus 308.
Wherein:
the processor 302, communication interface 304, and memory 306 communicate with each other via a communication bus 308.
A communication interface 304 for communicating with other electronic devices or servers.
The processor 302 is configured to execute the program 310, and may specifically execute relevant steps in the above-mentioned public opinion text abstract extraction method embodiment.
In particular, program 310 may include program code comprising computer operating instructions.
The processor 302 may be a central processing unit (CPU), an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits configured to implement an embodiment of the present invention. The electronic device comprises one or more processors, which may be of the same type, such as one or more CPUs, or of different types, such as one or more CPUs and one or more ASICs.
The memory 306 is used for storing the program 310. The memory 306 may comprise high-speed RAM and may also include non-volatile memory, such as at least one disk memory.
The program 310 may specifically be configured to cause the processor 302 to perform the following operations: respectively determining the similarity between each sentence in the public opinion text to be abstracted and the text title of the public opinion text; determining abstract candidate sentences of the public opinion text according to the similarity between each sentence in the public opinion text and the text title of the public opinion text; performing abstract extraction on the public opinion text through a public opinion text abstract extraction model to obtain scoring data for each sentence in the public opinion text belonging to the text abstract of the public opinion text; and determining the text abstract of the public opinion text from the abstract candidate sentences of the public opinion text according to the scoring data for each sentence in the public opinion text belonging to the text abstract of the public opinion text.
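The four operations can be composed into one routine, as in the hedged end-to-end sketch below. To stay self-contained it substitutes lowercase word counts for the semantic feature representation data and summed pairwise cosine similarity for the model's scoring data; both substitutions, along with the names `embed`, `cosine`, and `extract_abstract`, are assumptions for illustration and are not taken from the text.

```python
import math
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Stand-in for semantic features: lowercase word counts (assumption)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def extract_abstract(title: str, sentences: List[str], m: int, n: int) -> List[str]:
    vecs = [embed(s) for s in sentences]
    title_vec = embed(title)
    # Operation 1: similarity between each sentence and the title.
    title_sim = [cosine(v, title_vec) for v in vecs]
    # Operation 2: the M sentences closest to the title become candidates.
    order = sorted(range(len(sentences)), key=lambda i: title_sim[i], reverse=True)
    candidates = order[:m]
    # Operation 3: score every sentence; here, summed similarity to all others.
    scores = [
        sum(cosine(vecs[i], vecs[j]) for j in range(len(vecs)) if j != i)
        for i in range(len(vecs))
    ]
    # Operation 4: keep the N best-scoring candidates, in document order.
    best = sorted(candidates, key=lambda i: scores[i], reverse=True)[:n]
    return [sentences[i] for i in sorted(best)]
```

Note that scoring is computed over all sentences, while the final ranking is restricted to the candidates, so M controls how strongly the title constrains the result and N controls the length of the abstract.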
In an alternative embodiment, the program 310 is further configured to cause the processor 302, when respectively determining the similarity between each sentence in the public opinion text to be abstracted and the text title of the public opinion text, to: respectively determine semantic feature representation data of each sentence in the public opinion text; determine semantic feature representation data of the text title of the public opinion text; respectively determine distance data between each sentence in the public opinion text and the text title of the public opinion text according to the semantic feature representation data of each sentence and the semantic feature representation data of the text title; and determine the similarity between each sentence in the public opinion text and the text title of the public opinion text according to the distance data between each sentence and the text title.
In an alternative embodiment, the program 310 is further configured to cause the processor 302, when determining the abstract candidate sentences of the public opinion text according to the similarity between each sentence in the public opinion text and the text title of the public opinion text, to: arrange the sentences in the public opinion text in descending order according to the similarity between each sentence and the text title, so as to obtain a descending-order arrangement result of the sentences in the public opinion text; and determine the abstract candidate sentences of the public opinion text according to the descending-order arrangement result of the sentences in the public opinion text.
In an alternative embodiment, the program 310 is further configured to cause the processor 302, when determining the abstract candidate sentences of the public opinion text according to the descending-order arrangement result of the sentences in the public opinion text, to: determine, according to the descending-order arrangement result, the M top-ranked sentences in the public opinion text, where M is a natural number; and determine the M top-ranked sentences as the abstract candidate sentences of the public opinion text.
In an alternative embodiment, the program 310 is further configured to cause the processor 302, when performing abstract extraction on the public opinion text through the public opinion text abstract extraction model to obtain the scoring data for each sentence in the public opinion text belonging to the text abstract of the public opinion text, to: determine the similarity between the sentences in the public opinion text through the public opinion text abstract extraction model; construct, through the public opinion text abstract extraction model, a standard graph by taking the sentences in the public opinion text as nodes and the similarity between the sentences as connecting lines between the nodes, where the thickness of a connecting line between two nodes represents the magnitude of the similarity between the sentences corresponding to those nodes; and obtain, through the public opinion text abstract extraction model and according to the standard graph, the scoring data for each sentence in the public opinion text belonging to the text abstract of the public opinion text.
In an alternative embodiment, the program 310 is further configured to cause the processor 302, when obtaining the scoring data through the public opinion text abstract extraction model according to the standard graph, to: predict, through the public opinion text abstract extraction model, the scoring data for each sentence in the public opinion text belonging to the text abstract of the public opinion text according to the number and thickness of the connecting lines of the node corresponding to that sentence in the standard graph.
In an alternative embodiment, the program 310 is further configured to cause the processor 302, when determining the text abstract of the public opinion text from the abstract candidate sentences of the public opinion text according to the scoring data for each sentence in the public opinion text belonging to the text abstract of the public opinion text, to: determine the scoring data of the abstract candidate sentences according to the scoring data of all sentences in the public opinion text; arrange the abstract candidate sentences in descending order according to their scoring data, and determine, according to the descending-order arrangement result of the abstract candidate sentences, the N top-ranked abstract candidate sentences of the public opinion text, where N is a natural number; and generate the text abstract of the public opinion text from the N top-ranked abstract candidate sentences.
For specific implementation of each step in the program 310, reference may be made to corresponding steps and corresponding descriptions in units in the foregoing public opinion text abstract extracting method embodiments, which are not described herein again. It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described devices and modules may refer to the corresponding process descriptions in the foregoing method embodiments, and are not described herein again.
With the electronic device of this embodiment, the similarity between each sentence in the public opinion text to be abstracted and the text title of the public opinion text is respectively determined, and the abstract candidate sentences of the public opinion text are determined according to that similarity. Abstract extraction is then performed on the public opinion text through a public opinion text abstract extraction model to obtain scoring data for each sentence in the public opinion text belonging to the text abstract of the public opinion text, and the text abstract of the public opinion text is determined from the abstract candidate sentences according to that scoring data. Compared with other existing approaches, determining the abstract candidate sentences according to the similarity between each sentence in the public opinion text and the text title brings the finally extracted text abstract closer to the text title of the public opinion text, which effectively ensures the central theme of the finally extracted text abstract. In addition, the scoring data for each sentence in the public opinion text belonging to the text abstract of the public opinion text effectively ensures the information content of the sentences selected for the text abstract, so that the text abstract of the public opinion text can be accurately determined from the abstract candidate sentences of the public opinion text, and the central gist of the finally extracted text abstract is effectively ensured.
It should be noted that, according to the implementation requirement, each component/step described in the embodiment of the present invention may be divided into more components/steps, and two or more components/steps or partial operations of the components/steps may also be combined into a new component/step to achieve the purpose of the embodiment of the present invention.
The above-described method according to an embodiment of the present invention may be implemented in hardware or firmware, or as software or computer code storable in a recording medium such as a CD-ROM, a RAM, a floppy disk, a hard disk, or a magneto-optical disk, or as computer code originally stored in a remote recording medium or a non-transitory machine-readable medium, downloaded through a network, and stored in a local recording medium, so that the method described herein can be processed by such software stored on a recording medium using a general-purpose computer, a dedicated processor, or programmable or dedicated hardware such as an ASIC or FPGA. It is understood that the computer, processor, microprocessor controller or programmable hardware includes a storage component (e.g., RAM, ROM, flash memory, etc.) that can store or receive software or computer code that, when accessed and executed by the computer, processor or hardware, implements the public opinion text abstract extraction method described herein. Further, when a general-purpose computer accesses code for implementing the public opinion text abstract extraction method shown herein, execution of the code converts the general-purpose computer into a special-purpose computer for executing that method.
Those of ordinary skill in the art will appreciate that the various illustrative elements and method steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present embodiments.
The above embodiments are only for illustrating the embodiments of the present invention and not for limiting the embodiments of the present invention, and those skilled in the art can make various changes and modifications without departing from the spirit and scope of the embodiments of the present invention, so that all equivalent technical solutions also belong to the scope of the embodiments of the present invention, and the scope of patent protection of the embodiments of the present invention should be defined by the claims.

Claims (10)

1. A method for abstracting a summary of a public opinion text, the method comprising:
respectively determining the similarity between each sentence in the public opinion text to be abstracted and the text title of the public opinion text;
determining abstract candidate sentences of the public opinion text according to the similarity between each sentence in the public opinion text and the text title of the public opinion text;
performing abstract extraction on the public opinion text through a public opinion text abstract extraction model to obtain scoring data for each sentence in the public opinion text belonging to the text abstract of the public opinion text;
and determining the text abstract of the public opinion text from the abstract candidate sentences of the public opinion text according to the scoring data for each sentence in the public opinion text belonging to the text abstract of the public opinion text.
2. The method for abstracting a public opinion text according to claim 1, wherein the determining similarity between each sentence in the public opinion text to be abstracted and the text title of the public opinion text respectively comprises:
respectively determining semantic feature representation data of each sentence in the public opinion text;
determining semantic feature representation data of a text title of the public opinion text;
respectively determining distance data between each sentence in the public opinion text and the text title of the public opinion text according to semantic feature representation data of each sentence in the public opinion text and semantic feature representation data of the text title of the public opinion text;
and determining the similarity between each sentence in the public opinion text and the text title of the public opinion text according to the distance data between each sentence in the public opinion text and the text title of the public opinion text.
3. The method for abstracting a summary of a public opinion text as claimed in claim 1, wherein the determining of the candidate sentences for the summary of the public opinion text according to the similarity between each sentence in the public opinion text and the text title of the public opinion text comprises:
according to the similarity between each sentence in the public opinion text and the text title of the public opinion text, performing descending order arrangement on each sentence in the public opinion text to obtain a descending order arrangement result of each sentence in the public opinion text;
and determining abstract candidate sentences of the public opinion text according to the descending order arrangement result of each sentence in the public opinion text.
4. The method for abstracting a summary of a public opinion text as claimed in claim 3, wherein the determining the candidate sentences of the summary of the public opinion text according to the descending order of each sentence in the public opinion text comprises:
determining M sentences which are arranged at the top in a descending order mode in the public opinion text according to the descending order result of each sentence in the public opinion text, wherein M represents a natural number;
and determining M sentences which are arranged at the top in descending order in the public opinion text as abstract candidate sentences of the public opinion text.
5. The method for abstracting a summary of a public opinion text as claimed in claim 1, wherein the performing abstract extraction on the public opinion text through a public opinion text abstract extraction model to obtain scoring data for each sentence in the public opinion text belonging to the text abstract of the public opinion text comprises:
determining similarity among sentences in the public opinion text through the public opinion text abstract extraction model;
constructing, through the public opinion text abstract extraction model, a standard graph by taking the sentences in the public opinion text as nodes and the similarity among the sentences in the public opinion text as connecting lines among the nodes, wherein the thickness of a connecting line between two nodes represents the magnitude of the similarity between the sentences corresponding to those nodes;
and obtaining, through the public opinion text abstract extraction model and according to the standard graph, the scoring data for each sentence in the public opinion text belonging to the text abstract of the public opinion text.
6. The method for abstracting a summary of a public opinion text as claimed in claim 5, wherein the obtaining, through the public opinion text abstract extraction model and according to the standard graph, the scoring data for each sentence in the public opinion text belonging to the text abstract of the public opinion text comprises:
predicting, through the public opinion text abstract extraction model, the scoring data for each sentence in the public opinion text belonging to the text abstract of the public opinion text according to the number and thickness of the connecting lines of the node corresponding to that sentence in the standard graph.
7. The method for abstracting a summary of a public opinion text as claimed in claim 1, wherein the determining the text abstract of the public opinion text from the abstract candidate sentences of the public opinion text according to the scoring data for each sentence in the public opinion text belonging to the text abstract of the public opinion text comprises:
determining the scoring data of the abstract candidate sentences of the public opinion text according to the scoring data of all sentences in the public opinion text;
arranging the abstract candidate sentences of the public opinion text in descending order according to their scoring data, and determining, according to the descending-order arrangement result of the abstract candidate sentences, the N top-ranked abstract candidate sentences of the public opinion text, wherein N represents a natural number;
and generating the text abstract of the public opinion text from the N top-ranked abstract candidate sentences of the public opinion text.
8. An abstract extracting device for public opinion text, the device comprising:
the first determining module is used for respectively determining the similarity between each sentence in the public opinion text to be abstracted and the text title of the public opinion text;
the second determination module is used for determining abstract candidate sentences of the public opinion text according to the similarity between each sentence in the public opinion text and the text title of the public opinion text;
the abstract extraction module is used for performing abstract extraction on the public opinion text through a public opinion text abstract extraction model so as to obtain scoring data for each sentence in the public opinion text belonging to the text abstract of the public opinion text;
and the third determining module is used for determining the text abstract of the public opinion text from the abstract candidate sentences of the public opinion text according to the scoring data for each sentence in the public opinion text belonging to the text abstract of the public opinion text.
9. An electronic device, characterized in that the device comprises:
a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface communicate with each other through the communication bus;
the memory is used for storing at least one executable instruction, and the executable instruction causes the processor to execute the operation corresponding to the public opinion text abstract extracting method as claimed in any one of claims 1-7.
10. A computer storage medium, on which a computer program is stored, which when executed by a processor, implements a method for abstracting a summary of public opinion text as recited in any one of claims 1 to 7.
CN202111510633.3A 2021-12-10 2021-12-10 Public opinion text abstract extraction method, device, equipment and computer storage medium Pending CN114201600A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111510633.3A CN114201600A (en) 2021-12-10 2021-12-10 Public opinion text abstract extraction method, device, equipment and computer storage medium

Publications (1)

Publication Number Publication Date
CN114201600A true CN114201600A (en) 2022-03-18

Family

ID=80652482

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111510633.3A Pending CN114201600A (en) 2021-12-10 2021-12-10 Public opinion text abstract extraction method, device, equipment and computer storage medium

Country Status (1)

Country Link
CN (1) CN114201600A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109101489A (en) * 2018-07-18 2018-12-28 武汉数博科技有限责任公司 A kind of text automatic abstracting method, device and a kind of electronic equipment
CN111209752A (en) * 2019-11-13 2020-05-29 北京航空航天大学 Chinese extraction integrated unsupervised abstract method based on auxiliary information
CN111460131A (en) * 2020-02-18 2020-07-28 平安科技(深圳)有限公司 Method, device and equipment for extracting official document abstract and computer readable storage medium
CN112214576A (en) * 2020-09-10 2021-01-12 深圳价值在线信息科技股份有限公司 Public opinion analysis method, device, terminal equipment and computer readable storage medium
CN113342968A (en) * 2021-05-21 2021-09-03 中国石油天然气股份有限公司 Text abstract extraction method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20220318
