CN116562263A - Method, device, equipment and storage medium for evaluating document link continuity - Google Patents

Info

Publication number
CN116562263A
Authority
CN
China
Prior art keywords
event
argument
relation
document
continuity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310511318.5A
Other languages
Chinese (zh)
Inventor
王华珍
赵荐轩
何霆
李弼程
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huaqiao University
Original Assignee
Huaqiao University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huaqiao University
Priority to CN202310511318.5A
Publication of CN116562263A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The invention provides a method, a device, equipment and a storage medium for evaluating document link coherence. A pre-trained language model is called to fill event arguments one by one by mask prediction, restoring each event instance to an event description sentence. The event relation between two event description sentences is then mapped to a set of connective words, and the connective word with the highest confidence is selected as the event relation prediction result, the event relation prediction result comprising event argument coherence and event relation connectivity. Finally, an evaluation result of the document link coherence is generated by measuring and fusing the event argument coherence and the event relation connectivity. Evaluating document link coherence in this way widens the problem types and application scenarios to which automatic composition evaluation can be applied.

Description

Method, device, equipment and storage medium for evaluating document link continuity
Technical Field
The present invention relates to the field of natural language processing technologies, and in particular, to a method, an apparatus, a device, and a storage medium for evaluating document linking consistency.
Background
Event relations exist objectively between events and act on a collection of otherwise isolated events. Event relations can connect the events scattered through a text into an event relation network that gives the topological context of how the events develop. However, the prior art does not evaluate document link coherence on this basis, so the problem types and application scenarios to which automatic composition evaluation can be applied cannot be further widened.
In view of this, the present application is presented.
Disclosure of Invention
The invention discloses a method, a device, equipment and a storage medium for evaluating document link coherence, which aim to widen the problem types and application scenarios to which automatic composition evaluation can be applied, based on the evaluation of document link coherence.
The first embodiment of the invention provides a method for evaluating document link continuity, which comprises the following steps:
calling a pre-trained language model, filling event arguments one by one by mask prediction, and restoring an event instance to an event description sentence;
mapping the event relation between two event description sentences to a set of connective words, and selecting the connective word with the highest confidence as an event relation prediction result, wherein the event relation prediction result comprises event argument coherence and event relation connectivity;
generating an evaluation result of the document link coherence by measuring and fusing the event argument coherence and the event relation connectivity, wherein the event argument coherence is the proportion of the number of argument-coherent events in the event sets of two documents to the number of all events, and the event relation connectivity is the proportion of the number of event pairs having an event relation in the event sets of the two documents to the number of all event pairs.
Preferably, calling the pre-trained language model, filling event arguments one by one by mask prediction, and restoring the event instance to an event description sentence specifically comprises:
S201, taking the event trigger word of the event instance as the initial state of the event reconstruction statement, grouping the event arguments of the event instance by length, and in each round selecting one event argument group, in ascending order of length, as the candidate argument set;
S202, calling the pre-trained language model to predict, for each argument in the candidate argument set, every position at which it could be filled into the current event reconstruction statement, and filling the argument that obtains the highest confidence in the current prediction into the corresponding position of the event reconstruction statement;
S203, updating the event reconstruction statement and removing the filled argument from the candidate argument set;
S204, carrying out the next prediction, and repeating the above process until all arguments are filled.
Preferably, mapping the event relation between two event description sentences to a set of connective words, and selecting the connective word with the highest confidence as the event relation prediction result, specifically comprises:
splicing the event description sentences corresponding to the two event instances with mask markers in a prompt learning manner;
and predicting the mask markers with the pre-trained language model, selecting the connective word with the highest confidence, and mapping it to obtain the event relation between the two events.
Preferably, the document link coherence is the product of the event argument coherence and the event relation connectivity.
The second embodiment of the invention provides an evaluation device for document link continuity, which comprises:
an event description sentence restoring unit, used for calling the pre-trained language model, filling event arguments one by one by mask prediction, and restoring an event instance to an event description sentence;
an event relation prediction result selection unit, used for mapping the event relation between two event description sentences to a set of connective words, and selecting the connective word with the highest confidence as an event relation prediction result, wherein the event relation prediction result comprises event argument coherence and event relation connectivity;
an evaluation result generation unit, used for generating an evaluation result of the document link coherence by measuring and fusing the event argument coherence and the event relation connectivity, wherein the event argument coherence is the proportion of the number of argument-coherent events in the event sets of two documents to the number of all events, and the event relation connectivity is the proportion of the number of event pairs having an event relation in the event sets of the two documents to the number of all event pairs.
Preferably, the event description sentence restoring unit is specifically configured for:
S201, taking the event trigger word of the event instance as the initial state of the event reconstruction statement, grouping the event arguments of the event instance by length, and in each round selecting one event argument group, in ascending order of length, as the candidate argument set;
S202, calling the pre-trained language model to predict, for each argument in the candidate argument set, every position at which it could be filled into the current event reconstruction statement, and filling the argument that obtains the highest confidence in the current prediction into the corresponding position of the event reconstruction statement;
S203, updating the event reconstruction statement and removing the filled argument from the candidate argument set;
S204, carrying out the next prediction, and repeating the above process until all arguments are filled.
Preferably, the event relation prediction result selection unit is specifically configured for:
splicing the event description sentences corresponding to the two event instances with mask markers in a prompt learning manner;
and predicting the mask markers with the pre-trained language model, selecting the connective word with the highest confidence, and mapping it to obtain the event relation between the two events.
Preferably, the document link coherence is the product of the event argument coherence and the event relation connectivity.
A third embodiment of the present invention provides a device for evaluating document link coherence, including a memory and a processor, where the memory stores a computer program, and the computer program is capable of being executed by the processor to implement a method for evaluating document link coherence as set forth in any one of the above.
A fourth embodiment of the present invention provides a computer readable storage medium, where a computer program is stored, where the computer program can be executed by a processor of an apparatus in which the computer readable storage medium is located, to implement a method for evaluating document join consistency according to any one of the above claims.
According to the method, the device, the equipment and the storage medium for evaluating document link coherence provided by the invention, the pre-trained language model is first called to fill event arguments one by one by mask prediction, restoring each event instance to an event description sentence. The event relation between two event description sentences is then mapped to a set of connective words, and the connective word with the highest confidence is selected as the event relation prediction result, the event relation prediction result comprising event argument coherence and event relation connectivity. Finally, the evaluation result of the document link coherence is generated by measuring and fusing the event argument coherence and the event relation connectivity, wherein the event argument coherence is the proportion of the number of argument-coherent events in the event sets of two documents to the number of all events, and the event relation connectivity is the proportion of the number of event pairs having an event relation in the event sets of the two documents to the number of all event pairs. Evaluating document link coherence in this way widens the problem types and application scenarios to which automatic composition evaluation can be applied.
Drawings
FIG. 1 is a flowchart of a method for evaluating document link coherence according to a first embodiment of the present invention;
FIG. 2 is a schematic block diagram of a device for evaluating document link coherence according to a second embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
For a better understanding of the technical solution of the present invention, the following detailed description of the embodiments of the present invention refers to the accompanying drawings.
The terminology used in the embodiments of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be understood that the term "and/or" as used herein merely describes an association between associated objects and means that three relationships may exist. For example, A and/or B may represent: A exists alone, A and B exist together, or B exists alone. In addition, the character "/" herein generally indicates that the objects before and after it are in an "or" relationship.
Depending on the context, the word "if" as used herein may be interpreted as "when", "upon", "in response to determining" or "in response to detecting". Similarly, the phrase "if it is determined" or "if (a stated condition or event) is detected" may be interpreted as "when it is determined", "in response to determining", "when (the stated condition or event) is detected" or "in response to detecting (the stated condition or event)", depending on the context.
References to "first\second" in the embodiments merely distinguish similar objects and do not represent a particular ordering of the objects. It is to be understood that objects distinguished by "first\second" may be interchanged where appropriate, so that the embodiments described herein can be implemented in sequences other than those illustrated or described herein.
Specific embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
The invention discloses a method, a device, equipment and a storage medium for evaluating document link coherence, which aim to widen the problem types and application scenarios to which automatic composition evaluation can be applied, based on the evaluation of document link coherence.
A first embodiment of the present invention provides a method for evaluating document link coherence, which may be executed by a device for evaluating document link coherence (hereinafter referred to as an evaluating device), in particular, by one or more processors in the evaluating device, to at least implement the following steps:
S101, calling a pre-trained language model, filling event arguments one by one by mask prediction, and restoring an event instance to an event description sentence;
In this embodiment, the evaluating device may be a terminal with data processing and analysis capabilities, such as a desktop computer, a notebook computer, a server or a workstation. A corresponding operating system and application software may be installed in the evaluating device, and the functions required in this embodiment are implemented by the operating system and the application software together.
Specifically, in the present embodiment:
S201, taking the event trigger word of the event instance as the initial state of the event reconstruction statement, grouping the event arguments of the event instance by length, and in each round selecting one event argument group, in ascending order of length, as the candidate argument set;
S202, calling the pre-trained language model to predict, for each argument in the candidate argument set, every position at which it could be filled into the current event reconstruction statement, and filling the argument that obtains the highest confidence in the current prediction into the corresponding position of the event reconstruction statement;
S203, updating the event reconstruction statement and removing the filled argument from the candidate argument set;
S204, carrying out the next prediction, and repeating the above process until all arguments are filled.
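The grouping and candidate selection of S201 can be sketched in Python as follows. This is a minimal illustration rather than the patented implementation: the function names are hypothetical, argument length is assumed to be the number of characters, and the sample arguments 老虎 ("tiger") and 有一天 ("one day") are an assumed rendering of the original Chinese wording.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def group_arguments_by_length(arguments: List[str]) -> Dict[int, List[str]]:
    """Group event arguments by character length (hypothetical helper)."""
    groups: Dict[int, List[str]] = defaultdict(list)
    for arg in arguments:
        groups[len(arg)].append(arg)
    return dict(groups)


def next_candidate_set(groups: Dict[int, List[str]]) -> Optional[Tuple[int, List[str]]]:
    """Return (length, arguments) for the shortest non-empty group, else None."""
    for length in sorted(groups):
        if groups[length]:
            return length, list(groups[length])
    return None


# Assumed Chinese wording of the arguments "tiger" and "one day".
groups = group_arguments_by_length(["老虎", "有一天"])
first_round = next_candidate_set(groups)
```

Removing a filled argument from its group and calling `next_candidate_set` again gives the candidate set of the next round, which mirrors the ascending-order selection in S201.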
More specifically:
Suppose $E_k$ is an event instance consisting of N arguments and 1 trigger word, so that $E_k = \{trigger_{E_k}, arg_1, \ldots, arg_N\}$, where $trigger_{E_k}$ denotes the trigger word of event $E_k$ and $arg_n$ denotes the n-th event argument of $E_k$, $n \in [1, N]$. All elements of $E_k$ other than the trigger word are grouped by length. Assuming the longest event argument in $E_k$ has length I, the grouping yields the lists $Sort_1, \ldots, Sort_I$, where $Sort_i$, $i \in [1, I]$, is the list of arguments of $E_k$ whose length is i.
The event reconstruction statement $ERS_k$ corresponding to $E_k$ is initialized to the trigger word $trigger_{E_k}$. Round t ($t \in [1, N]$) of filling event arguments proceeds as follows. The foremost non-empty list among $Sort_1, \ldots, Sort_I$ is selected, and all arguments in it constitute the candidate argument set $candidate_t$; every argument in $candidate_t$ has the same length $len_{arg}$. The pre-trained language model is then used for t+1 predictions in total. The mask prediction template used for the j-th prediction is $template_{j\_ERS}$, $j \in [1, t+1]$, generated by the following rule: $len_{arg}$ mask markers $[MASK_o]$, $o \in [1, len_{arg}]$, are inserted in front of the j-th element of $ERS_k$; in particular, when j = t+1, the $len_{arg}$ mask markers are appended to the tail of $ERS_k$.
To avoid the low accuracy of predicting several mask markers simultaneously, the pre-trained language model predicts only one character of an argument at a time. The confidence $p([MASK_o] = word \mid template_{j\_ERS})$ with which the pre-trained language model predicts word at position $[MASK_o]$ of $template_{j\_ERS}$ is computed from the following quantities: the hidden-layer vector output by the pre-trained language model at position $[MASK_o]$; $W^T$, the transpose of the input-layer embedding weight matrix of the pre-trained language model; and TokenId, a function that converts an input character into its ID in the vocabulary of the pre-trained language model.
For an argument arg of length $len_{arg}$, $len_{arg}$ mask predictions are made greedily to obtain its confidence at the mask positions of $template_{j\_ERS}$. Filling is performed $len_{arg}$ times. In the m-th filling, $m \in [1, len_{arg}]$, the character of arg that obtains the highest confidence $p_m$ at its corresponding position replaces the mask marker at that position, and the remaining characters of arg are then predicted against the remaining mask markers of $template_{j\_ERS}$ in the next filling. The confidence $p(arg \mid template_{j\_ERS})$ with which the pre-trained language model fills the argument arg into $template_{j\_ERS}$ is computed from the confidences $p_1, \ldots, p_{len_{arg}}$ obtained in these fillings.
Finally, the event argument that obtains the highest confidence over all t+1 predictions is selected and filled into its corresponding position in the event reconstruction statement to obtain a new $ERS_k$, and the argument is removed from its list $Sort_{len_{arg}}$. The event argument filling process runs for N rounds, until all of the lists $Sort_1, \ldots, Sort_I$ are empty.
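One round of the filling procedure described above can be sketched as follows, under stated assumptions. The `score` callback stands in for the mask prediction of the pre-trained language model and is hypothetical; `toy_score` is a toy replacement that simply prefers positions before the trigger word; and averaging the per-character confidences is an assumption, since the aggregation formula is not reproduced in this text.

```python
from statistics import mean
from typing import Callable, List, Tuple

MASK = "[MASK]"
# Hypothetical scorer: (template tokens, mask index, character) -> confidence.
Scorer = Callable[[List[str], int, str], float]


def make_templates(ers: List[str], len_arg: int) -> List[List[str]]:
    """Insert len_arg masks before each element of ers, and once at the tail."""
    return [ers[:j] + [MASK] * len_arg + ers[j:] for j in range(len(ers) + 1)]


def greedy_fill(template: List[str], arg: str, score: Scorer) -> List[float]:
    """Fill arg into the mask run one character per step, best-scoring first."""
    tokens = list(template)
    start = tokens.index(MASK)         # first mask of the contiguous run
    remaining = list(range(len(arg)))  # character offsets still masked
    confidences = []
    while remaining:
        conf, offset = max((score(tokens, start + o, arg[o]), o) for o in remaining)
        tokens[start + offset] = arg[offset]
        remaining.remove(offset)
        confidences.append(conf)
    return confidences


def fill_one_round(ers: List[str], candidates: List[str], score: Scorer,
                   aggregate=mean) -> Tuple[List[str], str]:
    """Try every candidate in every slot and keep the most confident filling."""
    len_arg = len(candidates[0])
    best = None
    for j, template in enumerate(make_templates(ers, len_arg)):
        for arg in candidates:
            conf = aggregate(greedy_fill(template, arg, score))  # assumed aggregation
            if best is None or conf > best[0]:
                best = (conf, j, arg)
    _, j, arg = best
    return ers[:j] + [arg] + ers[j:], arg


def toy_score(tokens: List[str], index: int, char: str) -> float:
    """Toy stand-in for the language model: prefer slots before the trigger."""
    return 1.0 if index < tokens.index("闲逛") else 0.1


# Assumed Chinese wording: 闲逛 = strolling, 老虎 = tiger, 有一天 = one day.
ers, filled = fill_one_round(["闲逛"], ["老虎"], toy_score)
ers, filled = fill_one_round(ers, ["有一天"], toy_score)
```

In a real system the scorer would query a masked language model such as the MacBert model named in the example of this description; the control flow around it would stay the same.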
S102, mapping the event relation between two event description sentences to a set of connective words, and selecting the connective word with the highest confidence as an event relation prediction result, wherein the event relation prediction result comprises event argument coherence and event relation connectivity;
Specifically, in the present embodiment:
the event description sentences corresponding to the two event instances are spliced with mask markers in a prompt learning manner,
and the mask markers are predicted by the pre-trained language model, the connective word with the highest confidence is selected, and the connective word is mapped to obtain the event relation between the two events.
More specifically:
Assume that the event reconstruction statements corresponding to the events $E_a$ and $E_b$ whose relation is to be predicted are $ERS_a$ and $ERS_b$. The input template $template_{ERP}$ for event relation prediction is generated in the following form:
$template_{ERP} = ERS_a\,[MASK_1][MASK_2]\,ERS_b$
Three sets of connective words, $V_{Accompany}$, $V_{Follow}$ and $V_{Causal}$, are defined as follows:
$V_{Accompany}$ = { "accompanying", "simultaneous", "sudden", "following", "immediate" }
$V_{Follow}$ = { "following", "subsequent", "then", "subsequent" }
$V_{Causal}$ = { "cause", "thus", "cause", "generate" }
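As a rough sketch of how the template and the three connective word sets might be represented, the following Python fragment builds the prompt string and keeps the best-scoring connective word per relation. The function names are hypothetical, the connective words are copied from the translated sets above, and the confidence values are purely illustrative numbers rather than model output.

```python
from typing import Dict, List, Tuple

# Connective word sets as rendered in the text above.
CONNECTIVES: Dict[str, List[str]] = {
    "Accompany": ["accompanying", "simultaneous", "sudden", "following", "immediate"],
    "Follow": ["following", "subsequent", "then", "subsequent"],
    "Causal": ["cause", "thus", "cause", "generate"],
}


def build_erp_template(ers_a: str, ers_b: str, n_masks: int = 2) -> str:
    """Splice two event reconstruction statements around the mask markers."""
    masks = "".join(f"[MASK_{o}]" for o in range(1, n_masks + 1))
    return f"{ers_a} {masks} {ers_b}"


def best_per_relation(confidence: Dict[str, float]) -> Dict[str, Tuple[str, float]]:
    """For each relation, keep the connective word with the highest confidence."""
    return {
        relation: max(((v, confidence[v]) for v in words), key=lambda pair: pair[1])
        for relation, words in CONNECTIVES.items()
    }


template = build_erp_template("ERS_a", "ERS_b")
# Illustrative confidences only; a real system would obtain them from the model.
confidence = {"accompanying": 0.2, "simultaneous": 0.5, "sudden": 0.1,
              "following": 0.3, "immediate": 0.1, "subsequent": 0.9,
              "then": 0.4, "cause": 0.6, "thus": 0.7, "generate": 0.2}
best = best_per_relation(confidence)
```

Keeping the sets in one mapping from relation name to connective words makes the later step, mapping the winning connective word back to its relation, a simple dictionary lookup.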
Any connective word v from the three connective word sets is filled into $template_{ERP}$ to obtain the confidence of the event relation prediction. This confidence is computed from the values $p_i$, where $p_i$ is the highest confidence obtained each time a character of the connective word v is filled in a greedy manner. Let $v_{Accompany}$, $v_{Follow}$ and $v_{Causal}$ be the connective words with the highest event relation prediction confidence in $V_{Accompany}$, $V_{Follow}$ and $V_{Causal}$ respectively, with confidences $p_{Accompany}$, $p_{Follow}$ and $p_{Causal}$. Let
$result_{er} = softmax([p_{Accompany}, p_{Follow}, p_{Causal}])$
Finally, the event relation decision function of $E_a$ and $E_b$ selects the relation with the largest component of $result_{er}$:
$ERP(E_a, E_b) = \arg\max_{relation \in \{Accompany, Follow, Causal\}} result_{er}[relation]$
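The softmax step and the relation decision can be sketched in a few lines of Python. The sketch assumes that the decision function picks the relation with the largest softmax component, which agrees with the worked example in this description; the function names are hypothetical, and the three input confidences are the ones quoted in that example.

```python
import math
from typing import List, Tuple

RELATIONS = ("Accompany", "Follow", "Causal")


def softmax(values: List[float]) -> List[float]:
    """Numerically stable softmax over a short list of confidences."""
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def decide_relation(p_accompany: float, p_follow: float,
                    p_causal: float) -> Tuple[str, List[float]]:
    """Assumed decision rule: return the relation with the largest softmax value."""
    result_er = softmax([p_accompany, p_follow, p_causal])
    best_index = max(range(len(RELATIONS)), key=lambda i: result_er[i])
    return RELATIONS[best_index], result_er


# Confidences taken from the worked example in this description.
relation, result_er = decide_relation(0.231, 2.909, 0.856)
```

Because softmax is monotonic, the chosen relation is the same as the one with the largest raw confidence; the normalised vector mainly makes the three scores comparable across event pairs.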
S103, generating an evaluation result of the document link coherence by measuring and fusing the event argument coherence and the event relation connectivity, wherein the event argument coherence is the proportion of the number of argument-coherent events in the event sets of two documents to the number of all events, and the event relation connectivity is the proportion of the number of event pairs having an event relation in the event sets of the two documents to the number of all event pairs.
In this embodiment, the document link coherence is the product of the event argument coherence and the event relation connectivity.
Specifically, in the present embodiment:
Given text segments $text_a$ and $text_b$, let their corresponding event instance sets be denoted $ES_a$ and $ES_b$, and let the set of event relations between $text_a$ and $text_b$ be denoted $ER_{a,b}$, whose elements have the form $[relation \mid E_a, E_b]$, where $E_a \in ES_a$, $E_b \in ES_b$ and $relation \in \{Accompany, Follow, Causal\}$.
For any $E_a \in ES_a$, $E_a$ is identified as an argument-coherent event if some event argument $arg \in E_a$ also belongs to some event $E_b \in ES_b$, that is, $arg \in E_b$.
For any $E_b \in ES_b$, $E_b$ is identified as an argument-coherent event if some event argument $arg \in E_b$ also belongs to some event $E_a \in ES_a$, that is, $arg \in E_a$.
Let $num_{coherent\_event}$ be the total number of argument-coherent events contained in $ES_a$ and $ES_b$. The decision function for event argument coherence between $text_a$ and $text_b$ is then:
$TEAC(text_a, text_b) = \frac{num_{coherent\_event}}{|ES_a| + |ES_b|}$
where $|ES_a|$ and $|ES_b|$ denote the number of event instances in $ES_a$ and $ES_b$ respectively.
The decision function for event relation connectivity between $text_a$ and $text_b$ is the ratio of the number of event pairs having an event relation to the number of all event pairs:
$TERC(text_a, text_b) = \frac{|ER_{a,b}|}{N_{pair}}$
where $|ER_{a,b}|$ denotes the number of elements of $ER_{a,b}$, i.e. the number of event pairs between $text_a$ and $text_b$ that have an event relation, and $N_{pair}$ denotes the number of all event pairs of the two event sets.
The decision function for the link coherence between $text_a$ and $text_b$ is:
$TC(text_a, text_b) = TEAC(text_a, text_b) \times TERC(text_a, text_b)$
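The three measures can be sketched in Python as follows, under explicit assumptions: each event is represented only by the set of its arguments, "all event pairs" is taken to mean every cross-segment pair, the split of the nine example events into two segments is hypothetical, and every cross-segment pair is assumed to carry a relation. The function names are illustrative only.

```python
from typing import FrozenSet, List, Sequence, Tuple

Arguments = FrozenSet[str]       # arguments of one event, trigger excluded
Relation = Tuple[str, int, int]  # (relation, index in ES_a, index in ES_b)


def count_coherent(events: Sequence[Arguments], others: Sequence[Arguments]) -> int:
    """Count events sharing at least one argument with some event in others."""
    return sum(any(e & o for o in others) for e in events)


def teac(es_a: Sequence[Arguments], es_b: Sequence[Arguments]) -> float:
    """Event argument coherence: coherent events over all events."""
    coherent = count_coherent(es_a, es_b) + count_coherent(es_b, es_a)
    return coherent / (len(es_a) + len(es_b))


def terc(relations: List[Relation], es_a: Sequence[Arguments],
         es_b: Sequence[Arguments]) -> float:
    """Event relation connectivity; assumes all pairs = cross-segment pairs."""
    return len(relations) / (len(es_a) * len(es_b))


def tc(relations: List[Relation], es_a: Sequence[Arguments],
       es_b: Sequence[Arguments]) -> float:
    """Link coherence as the product of the two measures."""
    return teac(es_a, es_b) * terc(relations, es_a, es_b)


# Hypothetical split of the nine example events into two segments.
es_a = [frozenset(s) for s in (
    {"tiger", "one day"}, {"tiger", "fox"}, {"fox"}, {"fox", "tiger"}, {"fox"})]
es_b = [frozenset(s) for s in (
    {"tiger"}, {"tiger"}, {"tiger", "fox", "next day"}, {"tiger", "fox"})]
# Assumption: every cross-segment pair was assigned some relation.
relations = [("Follow", i, j) for i in range(len(es_a)) for j in range(len(es_b))]
score = tc(relations, es_a, es_b)
```

With these assumptions all nine events are argument-coherent and the score is 1, in line with the values reported in the example of this description.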
The above embodiment is further illustrated by the following example:
Step 1.1, given the input $E_1$ = {strolling, tiger, one day}, where strolling is the trigger word of $E_1$, i.e. $trigger_{E_1}$ = strolling. Grouping all elements of $E_1$ other than $trigger_{E_1}$ by length gives $Sort_1$ = {}, $Sort_2$ = {tiger}, $Sort_3$ = {one day}. The longest event argument of $E_1$ has length 3. Lengths are counted in characters of the original Chinese text, in which "tiger" is a two-character word.
Step 1.2, initialize the event reconstruction statement corresponding to $E_1$ as $ERS_1$ = strolling.
Round 1 of filling event arguments: since $Sort_1$ is empty, all arguments in $Sort_2$ are selected to constitute the candidate argument set $candidate_1$ = {tiger}. The 1st prediction uses $template_{1\_ERS}$ = $[MASK_1][MASK_2]$ strolling. Mask prediction with the pre-trained model MacBert gives $p([MASK_1] = old \mid template_{1\_ERS}) = 0.471$ and $p([MASK_2] = tiger \mid template_{1\_ERS}) = 0.581$, where "old" and "tiger" are the literal renderings of the first and second characters of the Chinese word for tiger. The 1st selection therefore replaces $[MASK_2]$ with "tiger", and $p_1 = 0.581$. The template is updated to $template_{1\_ERS}$ = $[MASK_1]$ tiger strolling, and mask prediction with MacBert gives $p([MASK_1] = old \mid template_{1\_ERS}) = 1.346$, so the 2nd selection replaces $[MASK_1]$ with "old", and $p_2 = 1.346$. This yields the confidence $p(tiger \mid template_{1\_ERS})$ of filling the argument tiger into $template_{1\_ERS}$. The 2nd prediction uses $template_{2\_ERS}$ = strolling $[MASK_1][MASK_2]$; the calculation is similar to that of the 1st prediction and gives $p(tiger \mid template_{2\_ERS}) = 0.389$, which is the lower of the two confidences. The argument tiger is therefore filled into the position given by $template_{1\_ERS}$ to obtain the new event reconstruction statement $ERS_1$ = tiger strolling, and tiger is removed from $Sort_2$.
Round 2 of filling event arguments: since $Sort_1$ and $Sort_2$ are empty, all arguments in $Sort_3$ are selected to constitute the candidate argument set $candidate_2$ = {one day}. The calculation is similar to round 1, and the final event reconstruction statement is $ERS_1$ = there is one day tiger strolling.
Step 2 is further specified as follows:
Step 2.1, the event reconstruction statements corresponding to the events $E_1$ and $E_2$ to be predicted are $ERS_1$ = there is one day tiger strolling and $ERS_2$ = tiger found fox. The input template $template_{ERP}$ generated for event relation prediction has the form:
$template_{ERP}$ = there is one day tiger strolling $[MASK_1][MASK_2]$ tiger found fox
"At the same time", "after" and "resulting in" are the elements with the highest confidence in $V_{Accompany}$, $V_{Follow}$ and $V_{Causal}$ respectively, with confidences 0.231, 2.909 and 0.856. Hence $result_{er} = softmax([0.231, 2.909, 0.856]) = [0.0574, 0.8354, 0.1072]$.
It follows that $ERP(E_1, E_2) = Follow$.
Step 3 is further specified as follows:
Step 3.1, given text segments $text_a$ and $text_b$ whose corresponding event sets are denoted $ES_a$ and $ES_b$, the events involved are $E_1$ = {strolling, tiger, one day}, $E_2$ = {find, tiger, fox}, $E_3$ = {clever, fox}, $E_4$ = {deceive, fox, tiger}, $E_5$ = {escape, fox}, $E_6$ = {anger, tiger}, $E_7$ = {angry, tiger}, $E_8$ = {grab, tiger, fox, next day}, $E_9$ = {eat, tiger, fox}.
Here strolling, find, clever, deceive, escape, anger, angry, grab and eat are the trigger words of $E_1$, $E_2$, $E_3$, $E_4$, $E_5$, $E_6$, $E_7$, $E_8$ and $E_9$ respectively.
The set of event relations between $text_a$ and $text_b$ is denoted $ER_{a,b}$.
Step 3.2, counting by traversing $ES_a$ and $ES_b$ gives $num_{coherent\_event} = 9$, so
$TEAC(text_a, text_b) = 1$
Step 3.3, counting by traversal gives
$TERC(text_a, text_b) = 1$
Step 3.4, the decision function for the link coherence of $text_a$ and $text_b$ gives:
$TC(text_a, text_b) = 1$
Therefore, $text_a$ and $text_b$ have link coherence.
It can be seen from the above example that the method for evaluating document link coherence provided by this embodiment predicts event relations effectively and, on that basis, yields an effective model for evaluating document link coherence.
Referring to fig. 2, a second embodiment of the present invention provides an apparatus for evaluating document link coherence, including:
an event description sentence restoring unit 201, configured to call a pre-trained language model, fill event arguments one by one by mask prediction, and restore an event instance to an event description sentence;
an event relation prediction result selection unit 202, configured to map the event relation between two event description sentences to a set of connective words, and select the connective word with the highest confidence as an event relation prediction result, where the event relation prediction result includes event argument coherence and event relation connectivity;
an evaluation result generation unit 203, configured to generate an evaluation result of the document link coherence by measuring and fusing the event argument coherence and the event relation connectivity, where the event argument coherence is the proportion of the number of argument-coherent events in the event sets of two documents to the number of all events, and the event relation connectivity is the proportion of the number of event pairs having an event relation in the event sets of the two documents to the number of all event pairs.
Illustratively, the computer programs described in the third and fourth embodiments of the present invention may be divided into one or more modules, which are stored in the memory and executed by the processor to carry out the present invention. The one or more modules may be a series of computer program instruction segments capable of performing specified functions, and these segments describe the execution of the computer program in the device for evaluating document link coherence, for example the device described in the second embodiment of the present invention.
The processor may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor or any conventional processor. The processor is the control center of the device for evaluating document link coherence and uses various interfaces and lines to connect the various parts of the entire device.
The memory may be used to store the computer program and/or the modules, and the processor implements the various functions of the method for evaluating document link coherence by running or executing the computer program and/or the modules stored in the memory and by invoking data stored in the memory. The memory may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, an application program required for at least one function (such as a sound playing function, a text conversion function, etc.), and the like; the data storage area may store data (such as audio data, text message data, etc.) created according to the use of the cellular phone, etc. In addition, the memory may include high-speed random access memory, and may also include non-volatile memory, such as a hard disk, internal memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card, at least one magnetic disk storage device, a flash memory device, or other volatile solid-state storage device.
The modules, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on this understanding, all or part of the flow of the methods of the above embodiments may also be completed by a computer program instructing related hardware, where the computer program may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of each method embodiment described above. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and so forth. It should be noted that the content contained in the computer-readable medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in a jurisdiction; for example, in certain jurisdictions, according to legislation and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunication signals.
It should be noted that the above-described apparatus embodiments are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. In addition, in the drawings of the apparatus embodiments provided by the invention, a connection between modules indicates that the modules are communicatively connected, which may be specifically implemented as one or more communication buses or signal lines. Those of ordinary skill in the art can understand and implement the present invention without undue burden.
The present invention is not limited to the above-mentioned embodiments, and any changes or substitutions that can be easily understood by those skilled in the art within the technical scope of the present invention are intended to be included in the scope of the present invention. Therefore, the protection scope of the present invention should be subject to the protection scope of the claims.

Claims (10)

1. A method for evaluating document link coherence, comprising:
invoking a pre-trained language model, filling in event arguments one by one by mask prediction, and restoring an event instance to an event description sentence;
mapping the event relation between two event description sentences to a set of associated words, and selecting the associated word with the highest confidence as an event relation prediction result, wherein the event relation prediction result comprises event argument coherence and event relation coherence;
generating an evaluation result of the document link coherence by measuring and fusing the event argument coherence and the event relation coherence, wherein the event argument coherence is the proportion of events whose event arguments are coherent in the two document event sets to all events, and the event relation coherence is the proportion of event pairs having an event relation in the two document event sets to all event pairs.
2. The method for evaluating document link coherence according to claim 1, wherein invoking the pre-trained language model, filling in event arguments one by one by mask prediction, and restoring the event instance to an event description sentence specifically comprises:
S201, taking the event trigger word in the event instance as the initial state of the reconstructed event description sentence, grouping the event arguments in the event instance by argument length, and in each round selecting one event argument group, in ascending order, as the candidate argument set;
S202, invoking the pre-trained language model to predict, for each argument in the candidate argument set, all positions to be filled in the current reconstructed event description sentence, and filling the argument with the highest confidence in the current prediction into the corresponding position of the reconstructed event description sentence;
S203, updating the reconstructed event description sentence and removing the argument from the candidate argument set;
S204, performing the next prediction, and repeating the above process until all arguments are filled.
3. The method for evaluating document link coherence according to claim 1, wherein mapping the event relation between two event description sentences to the set of associated words and selecting the associated word with the highest confidence as the event relation prediction result specifically comprises:
splicing the event description sentences corresponding to the two event instances with a mask token in a prompt-learning manner,
and predicting the mask token with the pre-trained language model, selecting the connective with the highest confidence, and mapping it to obtain the event relation between the two events.
4. The method for evaluating document link coherence according to claim 1, wherein the document link coherence is the product of the event argument coherence and the event relation coherence.
5. An apparatus for evaluating document link coherence, comprising:
an event description sentence restoring unit, configured to invoke a pre-trained language model, fill in event arguments one by one by mask prediction, and restore an event instance to an event description sentence;
an event relation prediction result selection unit, configured to map the event relation between two event description sentences to a set of associated words and select the associated word with the highest confidence as an event relation prediction result, wherein the event relation prediction result comprises event argument coherence and event relation coherence;
a document link coherence evaluation result generation unit, configured to generate an evaluation result of the document link coherence by measuring and fusing the event argument coherence and the event relation coherence, wherein the event argument coherence is the proportion of events whose event arguments are coherent in the two document event sets to all events, and the event relation coherence is the proportion of event pairs having an event relation in the two document event sets to all event pairs.
6. The apparatus for evaluating document link coherence according to claim 5, wherein the event description sentence restoring unit is specifically configured to perform:
S201, taking the event trigger word in the event instance as the initial state of the reconstructed event description sentence, grouping the event arguments in the event instance by argument length, and in each round selecting one event argument group, in ascending order, as the candidate argument set;
S202, invoking the pre-trained language model to predict, for each argument in the candidate argument set, all positions to be filled in the current reconstructed event description sentence, and filling the argument with the highest confidence in the current prediction into the corresponding position of the reconstructed event description sentence;
S203, updating the reconstructed event description sentence and removing the argument from the candidate argument set;
S204, performing the next prediction, and repeating the above process until all arguments are filled.
7. The apparatus for evaluating document link coherence according to claim 5, wherein the event relation prediction result selection unit is specifically configured to:
splice the event description sentences corresponding to the two event instances with a mask token in a prompt-learning manner,
and predict the mask token with the pre-trained language model, select the connective with the highest confidence, and map it to obtain the event relation between the two events.
8. The apparatus for evaluating document link coherence according to claim 5, wherein the document link coherence is the product of the event argument coherence and the event relation coherence.
9. A device for evaluating document link coherence, comprising a memory and a processor, wherein the memory stores a computer program executable by the processor to implement the method for evaluating document link coherence according to any one of claims 1 to 4.
10. A computer-readable storage medium storing a computer program executable by a processor of a device in which the computer-readable storage medium is located, to implement the method for evaluating document link coherence according to any one of claims 1 to 4.
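Steps S201 to S204 recited in claims 2 and 6 can be illustrated with a minimal Python sketch. Several points are assumptions made for illustration: the sentence is held as a list of tokens, a "position to be filled" is modeled as an insertion slot between tokens, argument length is taken as character count, and the pre-trained language model is replaced by a pluggable `score_slot` callable. The names `reconstruct_sentence`, `SlotScorer`, `toy_scorer` and `REFERENCE` are hypothetical, and the toy scorer simply rewards a fixed reference word order.

```python
from itertools import groupby
from typing import Callable, List, Sequence

# A scorer returns the confidence of inserting `argument` at `slot`
# in the current sentence. In practice it would wrap a masked language model.
SlotScorer = Callable[[Sequence[str], int, str], float]


def reconstruct_sentence(
    trigger: str, arguments: Sequence[str], score_slot: SlotScorer
) -> List[str]:
    """Greedily rebuild an event description sentence around its trigger word."""
    sentence = [trigger]  # S201: the trigger word is the initial state
    ordered = sorted(arguments, key=len)  # S201: shortest arguments first
    for _, group in groupby(ordered, key=len):  # S201: one length group per round
        candidates = list(group)
        while candidates:  # S204: repeat until the group is exhausted
            # S202: score every candidate argument at every insertion slot
            # and keep the single most confident (argument, slot) choice.
            _, slot, best = max(
                (
                    (score_slot(sentence, s, arg), s, arg)
                    for arg in candidates
                    for s in range(len(sentence) + 1)
                ),
                key=lambda item: item[0],
            )
            sentence.insert(slot, best)  # S203: update the sentence
            candidates.remove(best)  # S203: drop the filled argument
    return sentence


# Toy stand-in for the language model: confidence 1.0 when the trial sentence
# keeps the word order of a fixed reference sentence, otherwise 0.0.
REFERENCE = ["Acme", "acquired", "Initech", "in 2021"]


def toy_scorer(sentence: Sequence[str], slot: int, argument: str) -> float:
    trial = list(sentence[:slot]) + [argument] + list(sentence[slot:])
    return 1.0 if trial == sorted(trial, key=REFERENCE.index) else 0.0


tokens = reconstruct_sentence("acquired", ["Initech", "Acme", "in 2021"], toy_scorer)
print(" ".join(tokens))  # Acme acquired Initech in 2021
```

Processing one length group at a time keeps each prediction round small, and re-scoring after every insertion means later arguments are placed against the sentence as it currently stands rather than against the bare trigger word.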
CN202310511318.5A 2023-05-08 2023-05-08 Method, device, equipment and storage medium for evaluating document link continuity Pending CN116562263A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310511318.5A CN116562263A (en) 2023-05-08 2023-05-08 Method, device, equipment and storage medium for evaluating document link continuity

Publications (1)

Publication Number Publication Date
CN116562263A (en) 2023-08-08

Family

ID=87489274

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310511318.5A Pending CN116562263A (en) 2023-05-08 2023-05-08 Method, device, equipment and storage medium for evaluating document link continuity

Country Status (1)

Country Link
CN (1) CN116562263A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117171653A (en) * 2023-11-02 2023-12-05 成方金融科技有限公司 Method, device, equipment and storage medium for identifying information relationship
CN117171653B (en) * 2023-11-02 2024-01-23 成方金融科技有限公司 Method, device, equipment and storage medium for identifying information relationship

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination