CN111259673B - Legal decision prediction method and system based on feedback sequence multitask learning - Google Patents

Legal decision prediction method and system based on feedback sequence multitask learning

Info

Publication number
CN111259673B
Authority
CN
China
Prior art keywords: vector, criminal, case, task, prediction
Legal status: Active (the status listed is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Application number: CN202010031722.9A
Other languages: Chinese (zh)
Other versions: CN111259673A (en)
Inventors: 张春云 (Zhang Chunyun), 崔超然 (Cui Chaoran), 尹义龙 (Yin Yilong)
Current assignee: Shandong University of Finance and Economics (the listed assignee may be inaccurate; Google has not performed a legal analysis)
Original assignee: Shandong University of Finance and Economics
Events:
Application filed by Shandong University of Finance and Economics
Priority to CN202010031722.9A
Publication of CN111259673A
Application granted
Publication of CN111259673B
Legal status: Active

Classifications

    • G06N3/044 — Recurrent networks, e.g. Hopfield networks (under G06N3/04 Architecture; G06N3/02 Neural networks; G06N3/00 Computing arrangements based on biological models; Section G Physics, G06 Computing)
    • G06N3/045 — Combinations of networks
    • G06Q10/04 — Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem" (under G06Q10/00 Administration; Management)
    • G06Q50/18 — Legal services (under G06Q50/10 Services; G06Q50/00 ICT specially adapted for business processes of specific sectors)
    • Y02D10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

The invention discloses a legal decision prediction method and system based on feedback sequence multitask learning. The method comprises the following steps: a text feature representation of the case description is learned using a single-task, representation-learning-based legal prediction method; then, for each subtask, the information of the preceding task and the feedback information of the following task are taken as additional input to the current task, so that both the sequence relation and the reverse verification relation among the subtasks are considered, thereby realizing legal decision prediction based on feedback sequence multitask learning. By combining single-task representation learning with a feedback-based sequential multitask learning method, the invention exploits the advantages of both in legal decision prediction: it overcomes the drawback that single-task representation learning makes no targeted use of the complementary information of other tasks, and, compared with conventional multitask learning methods, it improves the accuracy and robustness of the decision prediction results.

Description

Legal decision prediction method and system based on feedback sequence multitask learning
Technical Field
The invention belongs to the technical field of judicial decision prediction, and particularly relates to a legal decision prediction method and system based on feedback sequence multitask learning.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
Legal decision prediction aims to predict the decision outcome of a legal case from the case facts. As a core technology of legal assistant systems, it has important application value and practical significance, and studying it in depth is worthwhile. On the one hand, legal decision prediction can provide low-cost, high-quality legal consulting services to people unfamiliar with legal terminology and complex decision procedures. On the other hand, it can provide convenient reference material for professionals (e.g., lawyers and judges) and improve their work efficiency. Currently, legal decision prediction mainly involves three subtasks: prediction of the relevant laws, prediction of the crimes, and prediction of the criminal periods. All three are currently treated as classification tasks over the related laws, crimes, and criminal periods. Representative current approaches are single-task legal decision prediction methods based on representation learning and multitask legal decision prediction methods built on several related subtasks.
The representation-learning-based legal decision prediction method mainly uses a deep neural network, trained on a large number of labeled samples, to encode the semantics of the case, realizing a mapping from the symbol space to a vector space; prediction of the relevant laws, crimes, and criminal periods is then performed on the semantic vector representation of the case description. The drawback of this single-task approach is that it addresses each task in isolation: classification of the single task is based only on the case description features, without considering the influence of the other tasks on the current one.
The multitask-based legal decision prediction method mainly exploits the associations among the subtasks of legal decision making: a shared representation in the shallow layers lets the tasks mutually exchange and complement the domain information they have learned, while the last layer of the model trains a different classifier on the features of each task, finally classifying several tasks in parallel. More specifically, the subtasks of legal decision prediction have both a dependency relationship (a sequence relation) and a verification relationship (feedback verification). In general, a judge first determines the relevant laws from the case description, then determines the crime based on those relevant laws, and finally determines the corresponding criminal period based on the relevant laws and the determined crime. Conversely, through feedback, the predicted crime can verify the relevant laws involved, and the predicted criminal period can verify both the laws and the crime involved. However, most existing multitask legal decision prediction methods simply apply a multitask learning classification framework to the several related tasks, and rarely consider the sequence relation and the feedback verification relation among them.
Disclosure of Invention
In order to overcome the shortcomings of the prior art, the invention provides a legal decision prediction method based on feedback sequence multitask learning. It overcomes the drawback of single-task representation learning, which considers only the features of one task and cannot exploit the information shared with other related tasks, and it adds sequence relation information and feedback verification information between tasks within a multitask framework.
To achieve the above object, one or more embodiments of the present invention provide the following technical solutions:
A legal decision prediction method based on feedback sequence multitask learning comprises the following steps:
a text feature representation of the case description is learned using a single-task, representation-learning-based legal prediction method;
for each subtask, the information of the preceding task and the feedback information of the following task are taken as input to the current task, so that the sequence relation and the reverse verification relation among the subtasks are considered, thereby realizing legal decision prediction based on feedback sequence multitask learning.
According to a further technical scheme, a training dataset of case descriptions and their related laws, crimes, and criminal periods is obtained from a data center server and stored in a database;
text feature representation learning is performed on each case description to obtain its vector representation;
text feature representation learning is performed on all laws, crimes, and criminal periods to obtain their feature vector representations.
According to a further technical scheme, a multitask pre-training model for law prediction, crime prediction, and criminal-period prediction from the case description is constructed, and the corresponding pre-classification vectors of the three subtasks are obtained;
the law-article vector pointed to by the pre-classification vector of the law prediction task is fused with the case description representation vector to obtain the case-law representation vector;
the crime vector corresponding to the crime pointed to by the pre-classification vector of the crime prediction task is fused with the case representation vector to obtain the case-crime representation vector;
the criminal-period vector corresponding to the criminal period pointed to by the pre-classification vector of the criminal-period prediction task is fused with the case representation vector to obtain the case-criminal-period representation vector;
the case-law vector, the case-crime vector, and the case-criminal-period vector are input into a bidirectional long short-term memory neural network to obtain high-level semantic representations of the three vectors;
classifiers for the law, crime, and criminal period are constructed on the high-level feature representations of the case-law, case-crime, and case-criminal-period vectors;
the high-level feature representations are input into the three classifiers to predict the law, crime, and criminal period.
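The feature-fusion step described above can be sketched as follows. This is a toy illustration with invented dimensions and randomly initialized weights, not the patent's trained model: the label vector pointed to by the largest pre-classification score is concatenated with the case vector and projected through a fully connected layer.

```python
import math
import random

def argmax(xs):
    # index of the largest pre-classification score
    return max(range(len(xs)), key=lambda i: xs[i])

def fully_connected(x, weights, bias):
    # one dense layer with tanh activation; one weight row per output unit
    return [math.tanh(sum(w_j * x_j for w_j, x_j in zip(row, x)) + b)
            for row, b in zip(weights, bias)]

def fuse(case_vec, score_vec, label_vecs, weights, bias):
    # pick the label vector pointed to by the largest pre-classification
    # score, concatenate it with the case vector, and project through an FC layer
    chosen = label_vecs[argmax(score_vec)]
    return fully_connected(case_vec + chosen, weights, bias)

# toy dimensions: 4-dim case vector, 4-dim label vectors, 4-dim fused output
random.seed(0)
d_i = [0.1, -0.2, 0.3, 0.05]           # case description vector (hypothetical)
lr_i = [0.1, 0.7, 0.2]                 # law pre-classification scores (hypothetical)
law_vecs = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
W = [[random.uniform(-0.5, 0.5) for _ in range(8)] for _ in range(4)]
b = [0.0] * 4
dl_i = fuse(d_i, lr_i, law_vecs, W, b)  # case-law representation vector
```

The same fusion applies unchanged to the crime and criminal-period tasks, with `cr_i`/`pr_i` in place of `lr_i`.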
According to a further technical scheme, the multitask pre-training model for law, crime, and criminal-period prediction from the case description is constructed, and the corresponding pre-classification vectors of the three subtasks are obtained, as follows:
the obtained case description vectors are input into a multitask classifier; training this multitask classification model pre-classifies the laws, crimes, and criminal periods, yielding the law classification vectors, crime prediction vectors, and criminal-period prediction vectors.
According to a further technical scheme, a BERT model is pre-trained on the case fact descriptions of the training dataset to obtain a language model for the legal prediction tasks, yielding vector representations of the D case fact descriptions.
According to a further technical scheme, the law-article vectors, crime description vectors, and criminal-period description vectors are obtained by dictionary lookup over the law-article contents, crime descriptions, and criminal-period descriptions, based on the BERT model.
According to a further technical scheme, using the case fact descriptions in the training dataset and their corresponding law, crime, and criminal-period labels, a hard-parameter-sharing multitask learning method is used to obtain the law classification vector, crime prediction vector, and criminal-period prediction vector of each case fact description.
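Hard parameter sharing can be sketched as a minimal, hypothetical model in which all three tasks share one hidden layer while each keeps its own output head. The dimensions and weights below are invented for illustration; the embodiment's actual model is trained, not hand-specified.

```python
import math

def dense(x, W, b):
    # linear layer: one weight row per output unit
    return [sum(wj * xj for wj, xj in zip(row, x)) + bi for row, bi in zip(W, b)]

def relu(x):
    return [max(0.0, v) for v in x]

def softmax(x):
    m = max(x)
    e = [math.exp(v - m) for v in x]
    s = sum(e)
    return [v / s for v in e]

class HardSharedMultitask:
    # hard parameter sharing: one hidden layer shared by all tasks,
    # plus one task-specific output head per task
    def __init__(self, shared, heads):
        self.shared = shared          # (W, b) of the shared hidden layer
        self.heads = heads            # task name -> (W, b) of that task's head

    def forward(self, x):
        h = relu(dense(x, *self.shared))
        return {task: softmax(dense(h, *Wb)) for task, Wb in self.heads.items()}

# toy model: 3-dim input, 2-dim shared layer, three task heads
shared = ([[0.2, -0.1, 0.4], [0.0, 0.3, -0.2]], [0.1, 0.0])
heads = {
    "law":   ([[0.5, -0.5], [0.1, 0.2]], [0.0, 0.0]),
    "crime": ([[0.3, 0.3], [-0.2, 0.1], [0.4, 0.0]], [0.0, 0.0, 0.0]),
    "term":  ([[0.2, 0.1], [0.1, 0.2]], [0.0, 0.0]),
}
model = HardSharedMultitask(shared, heads)
scores = model.forward([1.0, 0.5, -0.2])  # pre-classification scores per task
```

Each entry of `scores` plays the role of one pre-classification vector (lr_i, cr_i, or pr_i).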
According to a further technical scheme, the gates in LSTM blocks are used to encode the sequence relation between tasks in multitask learning and the verification relation between a subsequent task and the current task:
Step 1: for each d_i in the batch of case fact descriptions D, obtain the law prediction result vector lr_i from the pre-classification results of the multitask pre-trained classification model, take the law-article vector l_j corresponding to the largest element of lr_i, concatenate it with the case fact description vector d_i, and input the result to a fully connected layer to obtain the case-law vector representation dl_i.
Step 2: for each d_i, obtain the crime prediction result vector cr_i from the pre-classification results of the multitask pre-trained classification model, take the crime vectors corresponding to the elements of cr_i whose values exceed 0.5, concatenate these vectors with the case fact description vector d_i, and input the result to a fully connected layer to obtain the case-crime vector representation dc_i.
Step 3: for each d_i, obtain the criminal-period prediction result vector pr_i from the pre-classification results of the multitask pre-trained classification model, take the criminal-period vector p_j corresponding to the largest element of pr_i, concatenate it with the case fact description vector d_i, and input the result to a fully connected layer to obtain the case-criminal-period vector representation dp_i.
Step 4: randomly initialize the initial cell state c_0 and output state h_0 of the forward LSTM module; with the case-law vector dl_i as input, compute the module's cell state c_1^f and forward output state h_1^f.
Step 5: with c_1^f and h_1^f as the previous cell state and output state of the forward LSTM module, and the case-crime vector dc_i as input, compute the module's cell state c_2^f and forward output state h_2^f.
Step 6: with c_2^f and h_2^f as the previous cell state and output state of the forward LSTM module, and the case-criminal-period vector dp_i as input, compute the module's cell state c_3^f and forward output state h_3^f.
Step 7: randomly initialize the previous cell state and output state of the reverse LSTM module; with the case-criminal-period vector dp_i as input, compute the module's cell state c_1^b and reverse output state h_1^b.
Step 8: with c_1^b and h_1^b as the previous cell state and output state of the reverse LSTM module, and the case-crime vector dc_i as input, compute the module's cell state c_2^b and reverse output state h_2^b.
Step 9: with c_2^b and h_2^b as the previous cell state and output state of the reverse LSTM module, and the case-law vector dl_i as input, compute the module's cell state c_3^b and reverse output state h_3^b.
Step 10: concatenate the forward and reverse output states position by position to obtain [h_1^f; h_3^b], [h_2^f; h_2^b], and [h_3^f; h_1^b]; for the case description d_i, use them as inputs to the law classifier, the crime classifier, and the criminal-period classifier respectively, compute the cross-entropy loss for the batch, and update the parameters.
Step 11: if the iteration count is below the limit, jump back to Step 1.
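The bidirectional pass over the three fused vectors can be sketched as follows: a minimal pure-Python LSTM step (biases omitted, random weights, toy dimensions — all hypothetical, not the trained model) runs the sequence law → crime → criminal period forward, the reverse order backward, and concatenates the two output states at each position.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h, c, W):
    # one LSTM step with forget (f), input (i), output (o) gates and
    # candidate state (g); biases are omitted for brevity
    def gate(name, act):
        return [act(sum(w * v for w, v in zip(W[name][k], x + h)))
                for k in range(len(h))]
    f = gate("f", sigmoid); i = gate("i", sigmoid)
    o = gate("o", sigmoid); g = gate("g", math.tanh)
    c_new = [fk * ck + ik * gk for fk, ik, ck, gk in zip(f, i, c, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new

def bilstm(seq, W_f, W_b, hidden):
    zeros = [0.0] * hidden
    fwd, h, c = [], zeros, zeros
    for x in seq:                      # law -> crime -> term (sequence relation)
        h, c = lstm_step(x, h, c, W_f)
        fwd.append(h)
    bwd, h, c = [], zeros, zeros
    for x in reversed(seq):            # term -> crime -> law (feedback check)
        h, c = lstm_step(x, h, c, W_b)
        bwd.append(h)
    bwd.reverse()                      # realign to forward positions
    return [f + b for f, b in zip(fwd, bwd)]  # concatenate per position

random.seed(1)
dim, hidden = 4, 3
def rand_w():
    return {g: [[random.uniform(-0.5, 0.5) for _ in range(dim + hidden)]
                for _ in range(hidden)] for g in "fiog"}
dl, dc, dp = ([random.uniform(-1, 1) for _ in range(dim)] for _ in range(3))
outs = bilstm([dl, dc, dp], rand_w(), rand_w(), hidden)
# one 2*hidden-dim output per sub-task, fed to the law/crime/term classifiers
```

The per-position concatenation is what pairs each task's forward state with the reverse state computed after its following tasks, which is how the feedback verification enters the representation.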
The invention also discloses a legal decision prediction system based on feedback sequence multitask learning, comprising:
a text feature representation learning module, which learns a text feature representation of the case description using a single-task, representation-learning-based legal prediction method;
a legal decision prediction module, which takes the information of the preceding task and the feedback information of the following task of each subtask as input to the current task, considers the sequence relation and the reverse verification relation among the subtasks, and realizes legal decision prediction based on feedback sequence multitask learning.
The one or more technical solutions above have the following beneficial effects:
The invention extends legal decision prediction from considering a single subtask to a multitask learning method that considers the sequence relation and reverse verification relation among tasks. On the one hand, multitask learning lets the subtasks complement one another through shared information; on the other hand, taking the information of the preceding task and the feedback information of the following task as input to each current task accounts for the sequence relation and reverse verification relation among the subtasks and further improves the prediction accuracy of legal decision prediction.
By combining single-task representation learning with a feedback-based sequential multitask learning method, the invention exploits the advantages of both in legal decision prediction: it overcomes the drawback that single-task representation learning makes no targeted use of the complementary information of other tasks and, compared with conventional multitask learning methods, improves the accuracy and robustness of the decision prediction results.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention.
FIG. 1 is a flow chart of a legal decision prediction method based on feedback sequence multitasking learning in an embodiment of the invention;
FIG. 2 is a schematic diagram of a multi-task pre-training classification model according to an embodiment of the invention.
Detailed Description
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the invention. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the exemplary embodiments of the invention. As used herein, the singular forms are intended to include the plural forms as well, unless the context clearly indicates otherwise; furthermore, the terms "comprises" and/or "comprising", when used in this specification, specify the presence of the stated features, steps, operations, devices, components, and/or combinations thereof.
Embodiments of the invention and features of the embodiments may be combined with each other without conflict.
The general idea of the invention:
A text feature representation of the case description is learned using a single-task, representation-learning-based legal prediction method; the sequence relation and reverse verification relation among the subtasks are considered by taking the information of each subtask's preceding task and the feedback information of its following task as input to the current task; finally, legal decision prediction based on feedback sequence multitask learning is realized.
The key technical step: the sequence relation and reverse verification are realized with a bidirectional LSTM. The forward LSTM takes the preceding task and the current task as input, modeling the sequence between tasks; the reverse LSTM takes the following task and the current task as input, realizing reverse verification of the tasks. This part corresponds to the eighth step below.
Example 1
Referring to fig. 1, this embodiment discloses a legal decision prediction method based on feedback sequence multitask learning, comprising the following specific steps:
The first step: obtain a training dataset of case descriptions and their related laws, crimes, and criminal periods.
The second step: perform text feature representation learning on the case descriptions to obtain their vector representations.
The third step: perform text feature representation learning on all laws, crimes, and criminal periods to obtain their feature vector representations.
Fourth step: construct a multitask pre-training model for law prediction, crime prediction, and criminal-period prediction from the case description, and obtain the corresponding pre-classification vectors of the three subtasks.
Fifth step: fuse the law-article vector pointed to by the pre-classification vector of the law prediction task with the case description representation vector to obtain the case-law representation vector.
Sixth step: fuse the crime vector corresponding to the crime pointed to by the pre-classification vector of the crime prediction task with the case representation vector to obtain the case-crime representation vector.
Seventh step: fuse the criminal-period vector corresponding to the criminal period pointed to by the pre-classification vector of the criminal-period prediction task with the case representation vector to obtain the case-criminal-period representation vector.
Eighth step: input the case-law vector, the case-crime vector, and the case-criminal-period vector into a bidirectional long short-term memory neural network (Bidirectional Long Short-Term Memory, Bi-LSTM) to obtain high-level semantic representations of the three vectors.
Ninth step: construct classifiers for the law, crime, and criminal period on the high-level feature representations of these three vectors.
Tenth step: output the prediction results for the law, crime, and criminal period.
In the second step, a vector representation d_i (i = 1, 2, ..., D) of each case description text is obtained by a representation-learning-based method.
In the third step, representation learning is likewise used to obtain the vectors of each law article, crime, and criminal period, denoted l_i (i = 1, 2, ..., L), c_i (i = 1, 2, ..., C), and p_i (i = 1, 2, ..., P) respectively.
In the fourth step, the case description vectors d_i (i = 1, 2, ..., D) obtained in the second step are input into a multitask classifier; training this multitask classification model pre-classifies the laws, crimes, and criminal periods, yielding the law classification vectors lr_i, crime prediction vectors cr_i, and criminal-period prediction vectors pr_i (i = 1, 2, ..., D).
In the fifth step, each case description vector d_i is fused with the law-article vector l_j pointed to by the corresponding law prediction vector lr_i obtained in the fourth step, giving the case-law representation vector dl_i (i = 1, 2, ..., D).
In the sixth step, each case description vector d_i is fused with the crime vector c_j pointed to by the corresponding crime prediction vector cr_i obtained in the fourth step, giving the case-crime representation vector dc_i (i = 1, 2, ..., D).
In the seventh step, each case description vector d_i is fused with the criminal-period vector p_j pointed to by the corresponding criminal-period prediction vector pr_i obtained in the fourth step, giving the case-criminal-period representation vector dp_i (i = 1, 2, ..., D).
In the eighth step, the case-law, case-crime, and case-criminal-period vectors obtained in the fifth, sixth, and seventh steps are input in sequence into a Bi-LSTM network for training, yielding the corresponding high-level feature representations of the three vectors.
In the ninth step, these high-level feature representations are input into the three classifiers to predict the law, crime, and criminal period.
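The per-task classifiers of the ninth step can be sketched as a linear layer plus softmax, trained with cross-entropy as in Step 10; all numbers below are illustrative only.

```python
import math

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def cross_entropy(probs, gold):
    # negative log-likelihood of the gold class
    return -math.log(probs[gold])

def classify(h, W, b):
    # linear layer over the high-level feature representation, then softmax
    return softmax([sum(w * v for w, v in zip(row, h)) + bi
                    for row, bi in zip(W, b)])

h_law = [0.3, -0.1, 0.8]                      # hypothetical case-law features
W_law = [[0.5, 0.2, -0.1], [-0.3, 0.4, 0.6]]  # two candidate law articles
probs = classify(h_law, W_law, [0.0, 0.0])
loss = cross_entropy(probs, gold=1)
```

The crime and criminal-period classifiers have the same shape, differing only in the number of output classes.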
In this embodiment, text feature representation learning:
Feature representation learning for text means representing the semantic, syntactic, and other information of the text in a low-dimensional dense vector space through a modeling method, and then performing computation and reasoning in that space. Representation learning for text is largely divided into three granularities: word vector representations, sentence vector representations, and document vector representations.
This embodiment mainly uses the BERT model released by Google. BERT stands for Bidirectional Encoder Representations from Transformers, i.e., the encoder of a bidirectional Transformer. Its main innovation lies in the pre-training objectives, namely the masked language model (Masked Language Model) and next sentence prediction (Next Sentence Prediction), which capture word-, sentence-, and discourse-level features respectively. With the BERT model, pre-training on the case fact descriptions of the training dataset yields a language model for the legal prediction tasks, and from it the vector representations d_i (i = 1, 2, ..., D) of the D case fact descriptions. At the same time, for the law-article contents, crime descriptions, and criminal-period descriptions in this task, dictionary lookup based on the BERT model yields the law-article vectors l_i (i = 1, 2, ..., L), crime description vectors c_i (i = 1, 2, ..., C), and criminal-period description vectors p_i (i = 1, 2, ..., P).
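As a stand-in for the BERT-based "dictionary lookup", the sketch below uses a hypothetical token-embedding table with mean pooling. The real embodiment would use BERT's trained encoder; the tokens, dimensions, and values here are invented for illustration.

```python
# hypothetical 3-dim token embedding table standing in for a trained encoder
emb = {
    "theft":    [0.9, 0.1, 0.0],
    "property": [0.7, 0.2, 0.1],
    "article":  [0.0, 0.5, 0.5],
    "264":      [0.1, 0.4, 0.6],
}

def encode(tokens, table, dim=3):
    # mean-pool the embeddings of known tokens; unknown tokens are skipped
    vecs = [table[t] for t in tokens if t in table]
    if not vecs:
        return [0.0] * dim
    return [sum(v[k] for v in vecs) / len(vecs) for k in range(dim)]

crime_vec = encode(["theft", "property"], emb)  # crime description vector
law_vec = encode(["article", "264"], emb)       # law-article content vector
```

The same lookup-and-pool pattern would produce the criminal-period description vectors.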
In this embodiment, the multitask pre-trained classification model:
Referring to fig. 2, multitask learning (Multitask Learning, MTL) is one of the transfer learning approaches. Transfer learning can be understood as defining a source domain and a target domain, learning in the source domain, and transferring the learned knowledge to the target domain to improve learning there. Deep learning has two common multitask modes: hard and soft sharing of hidden-layer parameters. This embodiment takes the hard-parameter-sharing mechanism as an example, but is not limited to it; hard sharing is typically implemented by sharing the hidden layers among all tasks while keeping a separate output layer for each specific task.
Using the case fact descriptions in the training dataset and their corresponding law, crime, and criminal-period labels, the hard-parameter-sharing multitask learning method yields, for each case fact description, the law classification vector lr_i, the crime prediction vector cr_i, and the criminal-period prediction vector pr_i (i = 1, 2, ..., D).
In this embodiment, a two-way long and short memory module network (Bi-LSTM) modeling task sequence relationships and feedback relationships:
A long short-term memory network (LSTM) is a recurrent neural network designed to mitigate the vanishing-gradient problem of ordinary recurrent neural networks (RNNs). An LSTM network is built from LSTM blocks; the gates in each block can retain values over arbitrary time spans, and a block typically contains three gates: a forget gate, an input gate and an output gate. LSTM is mainly used to encode contextual information in time series. The invention employs the gates in the LSTM block to encode the sequential relationship between tasks in multi-task learning and the verification relationship between a subsequent task and the current task. The learning process is as follows:
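A minimal sketch of the three gates in one LSTM block, with illustrative random weights rather than a trained cell:

```python
import numpy as np

rng = np.random.default_rng(3)
DIM, HID = 8, 8
# One weight matrix per gate plus one for the candidate cell content.
Wf, Wi, Wo, Wc = (rng.normal(scale=0.1, size=(DIM + HID, HID)) for _ in range(4))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell(x, h_prev, c_prev):
    z = np.concatenate([x, h_prev])     # input + previous output state
    f = sigmoid(z @ Wf)                 # forget gate: what to drop from c
    i = sigmoid(z @ Wi)                 # input gate: what to write to c
    o = sigmoid(z @ Wo)                 # output gate: what to expose
    c = f * c_prev + i * np.tanh(z @ Wc)  # new cell state
    h = o * np.tanh(c)                  # new output state
    return h, c

x = rng.normal(size=DIM)
h1, c1 = lstm_cell(x, np.zeros(HID), np.zeros(HID))
print(h1.shape, c1.shape)
```

It is these gated updates of (h, c) that Steps 4-9 below chain across the law article, charge and prison-term tasks.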
Step 1: for each case fact description vector d_i in a batch of m descriptions drawn from D, obtain the prediction vector lr_i for law articles from the pre-classification results of the multi-task pre-trained classification model, take out the law article vector l_j corresponding to the largest element of lr_i, concatenate it with the case fact description vector d_i, and feed the result to a fully connected layer to obtain the case-law-article vector representation dl_i;
Step 2: for each d_i, obtain the charge prediction vector cr_i from the pre-classification results of the multi-task pre-trained classification model, take out the charge vectors c_j corresponding to the elements of cr_i whose value exceeds 0.5, concatenate these vectors with d_i, and feed the result to a fully connected layer to obtain the case-charge vector representation dc_i;
Step 3: for each d_i, obtain the prison-term prediction vector pr_i from the pre-classification results of the multi-task pre-trained classification model, take out the prison-term vector p_j corresponding to the largest element of pr_i, concatenate it with d_i, and feed the result to a fully connected layer to obtain the case-prison-term vector representation dp_i;
Step 4: randomly initialize the initial cell state c_0 and output state h_0 of the forward LSTM module; with the case-law-article vector dl_i as the input vector, compute the module's cell state c_1^f and forward output state h_1^f;
Step 5: with the cell state c_1^f and the forward output state h_1^f as the previous-moment cell state and input state of the forward LSTM module, and the case-charge vector dc_i as the input vector, compute the cell state c_2^f and the forward output state h_2^f;
Step 6: with the cell state c_2^f and the forward output state h_2^f as the previous-moment cell state and input state of the forward LSTM module, and the case-prison-term vector dp_i as the input vector, compute the cell state c_3^f and the forward output state h_3^f;
Step 7: with the randomly initialized previous-moment cell state and output state of the reverse LSTM module, and the case-prison-term vector dp_i as the input vector, compute the cell state c_1^b and the reverse output state h_1^b;
Step 8: with the cell state c_1^b and the reverse output state h_1^b as the previous-moment cell state and input state of the reverse LSTM module, and the case-charge vector dc_i as the input vector, compute the cell state c_2^b and the reverse output state h_2^b;
Step 9: with the cell state c_2^b and the reverse output state h_2^b as the previous-moment cell state and input state of the reverse LSTM module, and the case-law-article vector dl_i as the input vector, compute the cell state c_3^b and the reverse output state h_3^b.
Step 10: splice the forward and reverse output states of each task to obtain h_i^l = [h_1^f; h_3^b], h_i^c = [h_2^f; h_2^b] and h_i^p = [h_3^f; h_1^b], which, as the representations corresponding to the case fact description d_i, are input to the law article classifier, the charge classifier and the prison-term classifier respectively; the cross-entropy loss function for the batch input is computed and the parameters are updated.
Step 11: if the number of iterations is less than the limit, return to Step 1.
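The Steps 1-10 procedure can be sketched end to end as follows. The weights, label vectors and pre-classification scores below are illustrative stand-ins for the trained model, not the embodiment's actual parameters:

```python
import numpy as np

# Sketch: fuse the case vector with the selected label vectors (Steps 1-3),
# run a forward LSTM over (dl, dc, dp) and a reverse LSTM over (dp, dc, dl)
# (Steps 4-9), then splice matching outputs as classifier inputs (Step 10).
rng = np.random.default_rng(4)
DIM = 8

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def make_cell():
    W = rng.normal(scale=0.1, size=(4, 2 * DIM, DIM))
    def cell(x, h, c):
        z = np.concatenate([x, h])
        f, i, o = (sigmoid(z @ W[k]) for k in range(3))
        c_new = f * c + i * np.tanh(z @ W[3])   # forget + input gates
        h_new = o * np.tanh(c_new)              # output gate
        return h_new, c_new
    return cell

# Steps 1-3: select the predicted label vector(s) and fuse with the case vector.
W_fc = rng.normal(scale=0.1, size=(2 * DIM, DIM))
def fuse(d_i, label_vec):
    return np.tanh(np.concatenate([d_i, label_vec]) @ W_fc)

d_i = rng.normal(size=DIM)
law_vecs = rng.normal(size=(5, DIM))
charge_vecs = rng.normal(size=(4, DIM))
term_vecs = rng.normal(size=(3, DIM))
lr_i = np.array([.1, .6, .1, .1, .1])          # law article pre-classification
cr_i = np.array([.7, .2, .8, .1])              # multi-label charge scores
pr_i = np.array([.2, .5, .3])                  # prison-term pre-classification
dl = fuse(d_i, law_vecs[lr_i.argmax()])
dc = fuse(d_i, charge_vecs[cr_i > 0.5].mean(axis=0))
dp = fuse(d_i, term_vecs[pr_i.argmax()])

# Steps 4-9: one pass per direction, carrying (h, c) between tasks.
def run(cell, seq):
    h, c = rng.normal(size=DIM), rng.normal(size=DIM)  # random initial state
    outs = []
    for x in seq:
        h, c = cell(x, h, c)
        outs.append(h)
    return outs

hf = run(make_cell(), [dl, dc, dp])
hb = run(make_cell(), [dp, dc, dl])

# Step 10: splice forward and reverse outputs belonging to the same task.
h_law, h_charge, h_term = (np.concatenate([hf[j], hb[2 - j]]) for j in range(3))
print(h_law.shape, h_charge.shape, h_term.shape)
```

Each spliced vector carries both the preceding-task information (forward pass) and the feedback from subsequent tasks (reverse pass) before reaching its classifier.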
Example two
An object of this embodiment is to provide a computing device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of the legal decision prediction method based on feedback sequence multi-task learning of the first embodiment.
Example III
An object of the present embodiment is to provide a computer-readable storage medium.
A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the legal decision prediction method based on feedback sequence multi-task learning of the first embodiment.
Example IV
The invention also discloses a legal decision prediction system based on feedback sequence multitask learning, which comprises the following steps:
the text feature representation learning module is used for realizing text feature representation learning of the case description by using a single-task legal prediction method based on representation learning;
and the legal judgment prediction module takes the information of the preceding task and the feedback information of the following task of each subtask as the input of the current task, considers the sequence relation and the reverse verification relation among the subtasks, and realizes the legal judgment prediction based on feedback sequence multitask learning.
The steps involved in the devices of the second, third and fourth embodiments correspond to those of the first embodiment of the method, and the detailed description of the embodiments can be found in the related description section of the first embodiment. The term "computer-readable storage medium" should be taken to include a single medium or multiple media including one or more sets of instructions; it should also be understood to include any medium capable of storing, encoding or carrying a set of instructions for execution by a processor and that cause the processor to perform any one of the methods of the present invention.
It will be appreciated by those skilled in the art that the modules or steps of the invention described above may be implemented by a general-purpose computing device; alternatively, they may be implemented by program code executable by a computing device, so that they may be stored in a storage device and executed by the computing device, or fabricated separately as individual integrated circuit modules, or multiple modules or steps among them may be fabricated as a single integrated circuit module. The present invention is not limited to any specific combination of hardware and software.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
While the foregoing description of the embodiments of the present invention has been presented in conjunction with the drawings, it should be understood that it is not intended to limit the scope of the invention, but rather, it is intended to cover all modifications or variations within the scope of the invention as defined by the claims of the present invention.

Claims (8)

1. A law decision prediction method based on feedback sequence multitask learning is characterized by comprising the following steps:
the text characteristic representation learning of the case description is realized by using a single-task law prediction method based on representation learning;
by taking the information of the preceding task and the feedback information of the following task of each subtask as the input of the current task, the sequence relation and the reverse verification relation among the subtasks are considered, and legal judgment prediction based on feedback sequence multitask learning is realized;
constructing a multi-task pre-training model for law article prediction, charge prediction and prison-term prediction based on the case description, and obtaining the corresponding pre-classification vectors of the three subtasks;
performing feature fusion on the law article vector pointed to by the pre-classification vector of the law article prediction task and the case description representation vector to obtain a case-law-article representation vector;
performing feature fusion on the charge vector corresponding to the charge pointed to by the pre-classification vector of the charge prediction task and the case description representation vector to obtain a case-charge representation vector;
performing feature fusion on the prison-term vector corresponding to the prison term pointed to by the pre-classification vector of the prison-term prediction task and the case description representation vector to obtain a case-prison-term representation vector;
inputting the case-law-article vector, the case-charge vector and the case-prison-term vector into a bidirectional long short-term memory network to obtain high-level semantic representations of the three vectors;
constructing classifiers for law articles, charges and prison terms based on the high-level semantic representations of the case-law-article vector, the case-charge vector and the case-prison-term vector;
inputting the high-level semantic representations into the three classifiers to predict the law article, the charge and the prison term;
the gates in the LSTM block are adopted to encode the sequential relationship between tasks in multi-task learning and the verification relationship between the subsequent task and the current task:
Step 1: for each case description representation vector d_i in a batch of case fact descriptions D, obtain the prediction vector lr_i for law articles from the pre-classification results of the multi-task pre-trained classification model, take out the law article vector l_j corresponding to the largest element of lr_i, concatenate it with the case description representation vector d_i, and feed the result to a fully connected layer to obtain the case-law-article vector representation dl_i;
Step 2: for each case description representation vector d_i, obtain the charge prediction vector cr_i from the pre-classification results of the multi-task pre-trained classification model, take out the charge vectors corresponding to the selected elements of cr_i, concatenate these vectors with d_i, and feed the result to a fully connected layer to obtain the case-charge vector representation dc_i;
Step 3: for each case description representation vector d_i, obtain the prison-term prediction vector pr_i from the pre-classification results of the multi-task pre-trained classification model, take out the prison-term vector p_j corresponding to the largest element of pr_i, concatenate it with d_i, and feed the result to a fully connected layer to obtain the case-prison-term vector representation dp_i;
Step 4: randomly initialize the initial cell state c_0 and output state h_0 of the forward LSTM module; with the case-law-article vector dl_i as the input vector, compute the module's cell state c_1^f and forward output state h_1^f;
Step 5: with the cell state c_1^f and the forward output state h_1^f as the previous-moment cell state and input state of the forward LSTM module, and the case-charge vector dc_i as the input vector, compute the cell state c_2^f and the forward output state h_2^f;
Step 6: with the cell state c_2^f and the forward output state h_2^f as the previous-moment cell state and input state of the forward LSTM module, and the case-prison-term vector dp_i as the input vector, compute the cell state c_3^f and the forward output state h_3^f;
Step 7: with the randomly initialized previous-moment cell state and output state of the reverse LSTM module, and the case-prison-term vector dp_i as the input vector, compute the cell state c_1^b and the reverse output state h_1^b;
Step 8: with the cell state c_1^b and the reverse output state h_1^b as the previous-moment cell state and input state of the reverse LSTM module, and the case-charge vector dc_i as the input vector, compute the cell state c_2^b and the reverse output state h_2^b;
Step 9: with the cell state c_2^b and the reverse output state h_2^b as the previous-moment cell state and input state of the reverse LSTM module, and the case-law-article vector dl_i as the input vector, compute the cell state c_3^b and the reverse output state h_3^b;
Step 10: splice the forward and reverse output states of each task to obtain h_i^l = [h_1^f; h_3^b], h_i^c = [h_2^f; h_2^b] and h_i^p = [h_3^f; h_1^b], which, as the representations corresponding to the case description vector d_i, are input to the law article classifier, the charge classifier and the prison-term classifier respectively; the cross-entropy loss function for the batch input is computed and the parameters are updated;
Step 11: if the number of iterations is less than the limit, return to Step 1.
2. The legal decision prediction method based on feedback sequence multi-task learning as recited in claim 1, wherein a training data set of case descriptions and their related law articles, charges and prison terms is obtained from a data center server and stored in a database;
text feature representation learning is performed on the case descriptions to obtain their vector representations;
text feature representation learning is performed on all law articles, charges and prison terms to obtain their feature vector representations.
3. The legal decision prediction method based on feedback sequence multi-task learning as claimed in claim 1, wherein a multi-task pre-training model for law article prediction, charge prediction and prison-term prediction based on the case description is constructed, and the corresponding pre-classification vectors of the three subtasks are obtained:
the obtained case description representation vectors are input to a multi-task classifier, and the multi-task classification model is trained to pre-classify law articles, charges and prison terms, thereby obtaining law article classification vectors, charge prediction vectors and prison-term prediction vectors.
4. The legal decision prediction method based on feedback sequence multi-task learning of claim 1, wherein a language model for the legal prediction tasks is obtained by pre-training a BERT model on the case description training data set, yielding D case description representation vectors;
based on the BERT model, a dictionary-lookup scheme is adopted for the law article contents, charge descriptions and prison-term descriptions to obtain law article vectors, charge description vectors and prison-term description vectors.
5. The legal decision prediction method based on feedback sequence multi-task learning as claimed in claim 1, wherein the law article classification vector, charge prediction vector and prison-term prediction vector of each case description are obtained by applying a hard parameter sharing multi-task learning method to the case descriptions and the corresponding law article, charge and prison-term labels in the training data set.
6. A legal decision prediction system based on feedback sequence multitasking learning, comprising:
the text feature representation learning module is used for realizing text feature representation learning of the case description by using a single-task legal prediction method based on representation learning;
the legal judgment prediction module takes the information of the preceding task and the feedback information of the following task of each subtask as the input of the current task, considers the sequence relation and the reverse verification relation among the subtasks, and realizes the legal judgment prediction based on feedback sequence multitask learning;
constructing a multi-task pre-training model for law article prediction, charge prediction and prison-term prediction based on the case description, and obtaining the corresponding pre-classification vectors of the three subtasks;
performing feature fusion on the law article vector pointed to by the pre-classification vector of the law article prediction task and the case description representation vector to obtain a case-law-article representation vector;
performing feature fusion on the charge vector corresponding to the charge pointed to by the pre-classification vector of the charge prediction task and the case description representation vector to obtain a case-charge representation vector;
performing feature fusion on the prison-term vector corresponding to the prison term pointed to by the pre-classification vector of the prison-term prediction task and the case description representation vector to obtain a case-prison-term representation vector;
inputting the case-law-article vector, the case-charge vector and the case-prison-term vector into a bidirectional long short-term memory network to obtain high-level semantic representations of the three vectors;
constructing classifiers for law articles, charges and prison terms based on the high-level semantic representations of the case-law-article vector, the case-charge vector and the case-prison-term vector;
inputting the high-level semantic representations into the three classifiers to predict the law article, the charge and the prison term;
the gates in the LSTM block are adopted to encode the sequential relationship between tasks in multi-task learning and the verification relationship between the subsequent task and the current task:
Step 1: for each case description representation vector d_i in a batch of case fact descriptions D, obtain the prediction vector lr_i for law articles from the pre-classification results of the multi-task pre-trained classification model, take out the law article vector l_j corresponding to the largest element of lr_i, concatenate it with the case description representation vector d_i, and feed the result to a fully connected layer to obtain the case-law-article vector representation dl_i;
Step 2: for each case description representation vector d_i, obtain the charge prediction vector cr_i from the pre-classification results of the multi-task pre-trained classification model, take out the charge vectors corresponding to the selected elements of cr_i, concatenate these vectors with d_i, and feed the result to a fully connected layer to obtain the case-charge vector representation dc_i;
Step 3: for each case description representation vector d_i, obtain the prison-term prediction vector pr_i from the pre-classification results of the multi-task pre-trained classification model, take out the prison-term vector p_j corresponding to the largest element of pr_i, concatenate it with d_i, and feed the result to a fully connected layer to obtain the case-prison-term vector representation dp_i;
Step 4: randomly initialize the initial cell state c_0 and output state h_0 of the forward LSTM module; with the case-law-article vector dl_i as the input vector, compute the module's cell state c_1^f and forward output state h_1^f;
Step 5: with the cell state c_1^f and the forward output state h_1^f as the previous-moment cell state and input state of the forward LSTM module, and the case-charge vector dc_i as the input vector, compute the cell state c_2^f and the forward output state h_2^f;
Step 6: with the cell state c_2^f and the forward output state h_2^f as the previous-moment cell state and input state of the forward LSTM module, and the case-prison-term vector dp_i as the input vector, compute the cell state c_3^f and the forward output state h_3^f;
Step 7: with the randomly initialized previous-moment cell state and output state of the reverse LSTM module, and the case-prison-term vector dp_i as the input vector, compute the cell state c_1^b and the reverse output state h_1^b;
Step 8: with the cell state c_1^b and the reverse output state h_1^b as the previous-moment cell state and input state of the reverse LSTM module, and the case-charge vector dc_i as the input vector, compute the cell state c_2^b and the reverse output state h_2^b;
Step 9: with the cell state c_2^b and the reverse output state h_2^b as the previous-moment cell state and input state of the reverse LSTM module, and the case-law-article vector dl_i as the input vector, compute the cell state c_3^b and the reverse output state h_3^b;
Step 10: splice the forward and reverse output states of each task to obtain h_i^l = [h_1^f; h_3^b], h_i^c = [h_2^f; h_2^b] and h_i^p = [h_3^f; h_1^b], which, as the representations corresponding to the case description vector d_i, are input to the law article classifier, the charge classifier and the prison-term classifier respectively; the cross-entropy loss function for the batch input is computed and the parameters are updated;
Step 11: if the number of iterations is less than the limit, return to Step 1.
7. A computing device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of the legal decision prediction method based on feedback sequence multi-task learning of any one of claims 1-5.
8. A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the legal decision prediction method based on feedback sequence multi-task learning of any one of claims 1-5.
CN202010031722.9A 2020-01-13 2020-01-13 Legal decision prediction method and system based on feedback sequence multitask learning Active CN111259673B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010031722.9A CN111259673B (en) 2020-01-13 2020-01-13 Legal decision prediction method and system based on feedback sequence multitask learning

Publications (2)

Publication Number Publication Date
CN111259673A CN111259673A (en) 2020-06-09
CN111259673B true CN111259673B (en) 2023-05-09

Family

ID=70945221


Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112015659A (en) * 2020-09-02 2020-12-01 三维通信股份有限公司 Prediction method and device based on network model
CN112131370B (en) * 2020-11-23 2021-03-12 四川大学 Question-answer model construction method and system, question-answer method and device and trial system

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108229582A (en) * 2018-02-01 2018-06-29 浙江大学 Entity recognition dual training method is named in a kind of multitask towards medical domain
CN109241528A (en) * 2018-08-24 2019-01-18 讯飞智元信息科技有限公司 A kind of measurement of penalty prediction of result method, apparatus, equipment and storage medium
CN109255119A (en) * 2018-07-18 2019-01-22 五邑大学 A kind of sentence trunk analysis method and system based on the multitask deep neural network for segmenting and naming Entity recognition
CN109376227A (en) * 2018-10-29 2019-02-22 山东大学 A kind of prison term prediction technique based on multitask artificial neural network
CN109829055A (en) * 2019-02-22 2019-05-31 苏州大学 User's law article prediction technique based on filtering door machine
CN109919175A (en) * 2019-01-16 2019-06-21 浙江大学 A kind of more classification methods of entity of combination attribute information

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
"Legal Judgment Prediction via Multi-Perspective Bi-Feedback Network";Wenmian Yang 等;《arXiv》;20190516;第1-7页 *
融入罪名关键词的法律判决预测多任务学习模型;刘宗林 等;《清华大学学报(自然科学版)》;20190410;第1-8页 *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant