CN110119449A - Criminal case charge prediction method based on a sequence-enhanced capsule network - Google Patents
- Publication number
- CN110119449A (application CN201910396510.8A)
- Authority
- CN
- China
- Prior art keywords
- sequence
- charge
- capsule
- case
- net network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/18—Legal services
Abstract
The present invention relates to the field of intelligent law, and in particular to a criminal case charge prediction method based on a sequence-enhanced capsule network. The method comprises the following steps: S1, construct a training data set by obtaining the fact descriptions of cases and their charge judgment results as training data; S2, construct a sequence-enhanced capsule network model and train it on the training data; S3, input the fact description text of a new case into the sequence-enhanced capsule network model trained in S2, and the model automatically predicts the corresponding charge as the charge prediction result. The proposed model not only better captures the salient features and semantic information of legal text, but is also competitive on the low-frequency charge prediction problem. Focal loss is introduced as the loss function of the sequence-enhanced capsule network model, which further alleviates the severe class imbalance among charges in the low-frequency charge prediction task.
Description
Technical field
The present invention relates to the field of intelligent law, and in particular to a criminal case charge prediction method based on a sequence-enhanced capsule network.
Background art
In recent years, artificial intelligence technologies represented by deep learning and natural language processing have made major breakthroughs and have begun to stand out in the field of intelligent law, attracting wide attention from academia and industry. Intelligent law gives machines the ability to understand legal text and analyze cases, so that cases can be handled intelligently according to their circumstances.
Automatic charge prediction, as one of the most representative subtasks of intelligent law, plays an important role in legal assistant systems and has wide application in real life. For example, it can provide legal experts (such as lawyers and judges) with a reference charge for the defendant in a case, thereby assisting judges in deciding cases and improving work efficiency; it can also provide legal advice to ordinary people who are unfamiliar with legal terminology and complicated procedures. Automatic charge prediction uses machine learning or deep learning techniques to train a machine to determine the charge against a defendant in court (such as theft, robbery, or causing a traffic accident). Previous work has proposed many methods for automatic charge prediction. These methods fall into three categories: (1) traditional methods; (2) machine learning methods; (3) deep learning methods.
Traditional methods usually rely on mathematical formulas or quantitative calculation. Kort [Fred Kort. Predicting Supreme Court decisions mathematically: A quantitative analysis of the "right to counsel" cases. American Political Science Review, 1957, 51(1): 1-12] attempted to use quantitative methods to predict human events that are usually considered highly uncertain, namely decisions of the US Supreme Court. The study aimed to show that, in at least one area of judicial review, the factual elements that influence decisions can be identified from some decided cases, numerical values for these elements can be derived with a formula, and the decisions of the remaining cases in the designated area can then be predicted correctly. Nagel [Stuart S. Nagel. Applying correlation analysis to case prediction. Tex. L. Rev., 1963, 42: 1006] argued that litigation outcomes can be predicted scientifically; using reapportionment cases as an example, he demonstrated that prediction is possible by assigning correlation coefficients to four variables that appear in the cases. Such prediction helps parties planning litigation, theorists seeking to understand the judicial process, and legislators who must anticipate judicial reactions and seek public compliance with the law. Keown [R. Keown. Mathematical models for legal prediction. Computer/LJ, 1980, 2: 829] examined the feasibility of predicting judicial decisions mathematically. Using the linear model method of Haar, Sawyer and Cummings and the nearest neighbor method of Mackaay and Robillard on more than 1000 cases, he correctly predicted 99% of the decisions. This success creates a real opportunity and an urgent need to develop linear models in other special domains, which would not only verify empirically that the method is generally valid, but also provide additional prediction models for the legal profession. These traditional methods achieved some results in certain scenarios, but they are limited to small data sets with a small number of labels.
Owing to the success of machine learning in many fields, researchers began to use machine learning methods for charge prediction. This line of work usually focuses on extracting features from case facts and then predicting with a machine learning algorithm. Liu et al. [Chao-Lin Liu, Cheng-Tsung Chang, Jim-How Ho. Case instance generation and refinement for case-based criminal summary judgments in Chinese. 2004; Chao-Lin Liu, Chwen-Dar Hsieh. Exploring phrase-based classification of judicial documents for criminal charges in Chinese. In: Proc of International Symposium on Methodologies for Intelligent Systems. Springer, 2006, 681-690] proposed an algorithm based on K-Nearest Neighbor (KNN) for automatically generating and refining case instances for criminal summary judgments from real-world judgment texts. The algorithm extracts important legal information from past indictment documents to construct case instances, and then refines them by merging similar cases and deleting relatively irrelevant information from the cases. Lin et al. [Wan-Chen Lin, Tsung-Ting Kuo, Tung-Jia Chang. Exploiting machine learning models for Chinese legal documents labeling, case classification, and sentencing prediction. ROCLING XXIV (2012), 2012, 140] defined 21 kinds of legal element labels for the crimes of robbery and blackmail, and then used this legal element information to classify the two crimes and to predict the prison terms imposed for them. Mackaay et al. [Ejan Mackaay, Pierre Robillard. Predicting judicial decisions: The nearest neighbor rule and visual representation of case patterns. 1974] extracted features by clustering semantically similar N-grams. Sulea et al. [Octavia-Maria Sulea, Marcos Zampieri, Shervin Malmasi, et al. Exploring the Use of Text Classification in the Legal Domain. CoRR, 2017, abs/1710.09306] used cases and rulings of the French Supreme Court to investigate the application of text classification methods in the legal domain, and proposed a support vector machine based system that uses case descriptions, time spans and ruling features to predict the area of law and the ruling of a case. However, these methods extract only shallow text features or rely on hand-labeled features, which are difficult to collect on large data sets. As a result, they do not perform well when the amount of data is large.
In recent years, following the success of deep neural networks in natural language processing (NLP), computer vision (CV) and speech, some work has begun to apply them to the automatic charge prediction task and has shown large performance gains. Luo et al. [Bingfeng Luo, Yansong Feng, Jianbo Xu, et al. Learning to Predict Charges for Criminal Cases with Legal Basis. arXiv preprint arXiv:1707.09168, 2017] argued that relevant law articles play a very important role in the charge prediction task. They therefore proposed an attention-based neural network method that jointly models charge prediction and relevant article extraction in a unified framework, so that appropriate charges can be predicted effectively for cases expressed in different ways. However, this work cannot solve low-frequency charge prediction or multiple-charge prediction. Zhong et al. [Haoxi Zhong, Zhipeng Guo, Cunchao Tu, et al. Legal Judgment Prediction via Topological Learning. In: Proc of the 2018 Conference on Empirical Methods in Natural Language Processing. 2018, 3540-3549] considered the topological dependencies among the subtasks of legal judgment (charges, law articles, fines and terms of penalty) and proposed a topological multi-task learning framework that integrates the dependencies among multiple subtasks into judgment prediction. Hu et al. [Zikun Hu, Xiang Li, Cunchao Tu, et al. Few-shot charge prediction with discriminative legal attributes. In: Proc of the 27th International Conference on Computational Linguistics. 2018, 487-498] addressed low-frequency charges and easily confused charges by introducing several discriminative attributes of charges as an internal mapping between the fact description of a case and the charge. These attributes provide additional information for low-frequency charges and effective features for distinguishing confusing charges. They then proposed an attribute-attentive charge prediction model that infers attributes and charges simultaneously.
Further analysis of the above research shows that, although academia and industry have proposed a series of deep learning based automatic charge prediction algorithms and made considerable progress, existing methods still have shortcomings: (1) most existing work [9,10] ignores the low-frequency charge scenario of the automatic charge prediction task and considers only high-frequency charges, so it cannot solve the low-frequency charge prediction problem; (2) Hu et al. [11] achieved good results in the low-frequency charge scenario by using manually generated auxiliary information, but manual annotation costs a large amount of time and prevents a truly end-to-end deep learning model.
The national invention patent application "A criminal case charge prediction method based on memory neural networks" (publication date: 2019.02.22) builds a training data set from standardized case descriptions and their charges, trains a constructed memory neural network model on it, converts "case description feature vector"-"charge encoding" pairs into key-value pairs stored in the memory neural network model, and uses a multi-layer perceptron classifier to determine the charge of a criminal case. Although the model proposed by that method can also predict low-frequency charges, its memory module needs to compare the relationship between the true charge and the predicted charge. Because low-frequency charges have little data, with some charges having only a few cases, it is difficult for that method to achieve good results in the low-frequency charge prediction scenario.
Summary of the invention
Based on an in-depth analysis of the research results of the many scholars above, and in view of the problems in the prior art, the present invention proposes a criminal case charge prediction method based on a sequence-enhanced capsule network, in order to alleviate the low-frequency charge prediction problem in criminal cases.
To achieve the above object, the technical solution adopted by the present invention is a criminal case charge prediction method based on a sequence-enhanced capsule network, comprising the following steps:
S1: construct a training data set by obtaining the fact descriptions of cases and their charge judgment results as training data;
S2: construct a sequence-enhanced capsule network model and train it on the training data, comprising the following steps:
S2.1: construct the sequence-enhanced capsule network model, with the following specific steps:
S2.1.1: construct the initial capsule layer: segment the fact description text of the case into words and map it to a word vector sequence, which serves as the initial capsule layer $u = \{u_1, u_2, \dots, u_n\}$;
S2.1.2: construct the Multiple seq-caps layer: apply the Multiple seq-caps layer to the initial capsule layer $u$ obtained in S2.1.1 to extract features and obtain the main feature vector of the case fact description text; the Multiple seq-caps layer consists of two seq-caps layers;
S2.1.3: construct the residual unit layer based on the attention mechanism (attention layer): apply the attention mechanism to the initial capsule layer $u$ obtained in S2.1.1 to obtain the auxiliary feature vector $c$ of the case fact description text.
The attention layer works as follows: each of the $n$ initial capsules $u_i$ ($i = 1, 2, \dots, n$) in the initial capsule layer $u$ is transformed by the weight matrix $W$ to obtain $e_i$; the softmax function is then applied to the $e_i$ to obtain the importance weight $\alpha_i$ of each initial capsule $u_i$; finally, all initial capsules are summed according to their importance weights to obtain the auxiliary feature vector $c$ of the case fact description text. The formulas are as follows:
$$e_i = \tanh(W u_i + b)$$
$$\alpha_i = \frac{\exp(e_i)}{\sum_{j=1}^{n} \exp(e_j)}$$
$$c = \sum_{i=1}^{n} \alpha_i u_i$$
where $W$ is the weight matrix and $b$ is the bias vector.
S2.1.4: construct the output layer: the main feature vector of the case fact description text obtained in S2.1.2 is combined with the auxiliary feature vector $c$ obtained in S2.1.3, and the result is fed into a fully connected layer network.
S2.2: train the sequence-enhanced capsule network model;
S3: input the fact description text of a new case into the sequence-enhanced capsule network model trained in S2, and the model automatically predicts the corresponding charge as the charge prediction result.
Further, the data set in S1 comes from real criminal cases published on China Judgments Online; each case comprises two parts, the fact description of the case and the charge judgment result, which are used as training data.
Further, word segmentation in S2.1.1 uses pkuseg, the open-source tool from Peking University, and an embedding technique maps the text to a word vector sequence using word vectors trained with Word2vec.
Further, in S2.2 the sequence-enhanced capsule network model is trained with the focal loss function.
Compared with the prior art:
(1) The invention proposes a sequence-enhanced capsule network model that not only better captures the salient features and semantic information of legal text, but is also competitive on the low-frequency charge prediction problem.
(2) Focal loss is introduced as the loss function of the sequence-enhanced capsule network model, further alleviating the severe class imbalance among charges in the low-frequency charge prediction task.
(3) Compared with the current state-of-the-art methods, the proposed sequence-enhanced capsule network model achieves F1 improvements of 4.5% and 6.4% on the real data sets Criminal-S and Criminal-L, respectively. The experimental results consistently demonstrate the superiority and competitiveness of the sequence-enhanced capsule network model in the low-frequency charge scenario.
Brief description of the drawings
Fig. 1 is a flow chart of the method of the invention;
Fig. 2 is a schematic diagram of the sequence-enhanced capsule network model of the invention;
Fig. 3 is a schematic diagram of the seq-caps layer of the invention;
Fig. 4 is a schematic diagram of the attention layer of the invention.
Detailed description of the embodiments
The invention is further described below with reference to the accompanying drawings and specific embodiments.
A brief flow chart of the invention is shown in Fig. 1. The criminal case charge prediction method based on the sequence-enhanced capsule network model comprises the following steps:
S1: construct a training data set by obtaining the fact descriptions of cases and their charge judgment results as training data.
The invention is tested on three public real data sets, all of which come from criminal cases published on China Judgments Online; the fact descriptions of the cases and their charge judgment results are obtained as training data. Because the public data sets retain only the main charge of each case, each charge only needs to be encoded by mapping it to a unique integer.
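The charge encoding described in S1 can be sketched as follows. This is a minimal illustration, not the patent's implementation; the function name `build_charge_index` and the English charge names are assumptions chosen for the example.

```python
def build_charge_index(charges):
    """Map each distinct charge name to a unique integer, in first-seen order."""
    index = {}
    for charge in charges:
        if charge not in index:
            index[charge] = len(index)
    return index

# Hypothetical main charges of five cases (illustrative names only).
case_charges = ["theft", "robbery", "theft", "fraud", "robbery"]
charge_to_id = build_charge_index(case_charges)
encoded = [charge_to_id[c] for c in case_charges]
print(charge_to_id)  # {'theft': 0, 'robbery': 1, 'fraud': 2}
print(encoded)       # [0, 1, 0, 2, 1]
```

Because each case keeps only its main charge, a single integer per case is enough and no multi-label encoding is needed.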
S2: construct a sequence-enhanced capsule network model and train it on the training data, comprising the following steps:
S2.1: construct the sequence-enhanced capsule network model, which is shown in Fig. 2. Constructing the model comprises the following steps:
S2.1.1: construct the initial capsule layer: segment the fact description text of the case into words, and use an embedding technique to map the text to a word vector sequence using word vectors trained with Word2vec; this sequence serves as the initial capsule layer $u = \{u_1, u_2, \dots, u_n\}$.
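As a rough sketch of S2.1.1, the lookup from segmented words to word vectors might look like the code below. The tiny embedding table, the 3-dimensional vectors and the zero vector for unknown words are all assumptions for illustration; in practice the tokens would come from a segmenter such as pkuseg and the vectors from a trained Word2vec model.

```python
def to_initial_capsules(tokens, embeddings, dim):
    """Look up one word vector per token; unknown words map to a zero vector."""
    unknown = [0.0] * dim
    return [list(embeddings.get(tok, unknown)) for tok in tokens]

# Hypothetical pretrained vectors (a real table would be loaded from a Word2vec model).
embeddings = {
    "defendant": [0.1, 0.3, -0.2],
    "stole": [0.7, -0.1, 0.4],
    "phone": [0.0, 0.5, 0.5],
}
# Tokens are assumed to be the output of a word segmenter.
tokens = ["defendant", "stole", "a", "phone"]
u = to_initial_capsules(tokens, embeddings, dim=3)
print(len(u), len(u[0]))  # number of initial capsules and their dimension
```

Each row of `u` plays the role of one initial capsule $u_i$.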
S2.1.2: construct the Multiple seq-caps layer: apply the Multiple seq-caps layer to the initial capsule layer $u$ obtained in S2.1.1 to obtain the main feature vector of the case fact description text.
The Multiple seq-caps layer consists of two seq-caps layers. As shown in Fig. 3, each seq-caps layer consists of a sequence information encoder (Sequence Information Encode) and a dynamic routing converter (Dynamic Routing). The invention uses a long short-term memory network (LSTM) as the sequence information encoder. Taking the first seq-caps layer as an example, the initial capsule layer $u = \{u_1, u_2, \dots, u_n\}$ is passed into the seq-caps layer, and the LSTM formulas are as follows:
$$f_t = \sigma(W_f u_t + U_f h_{t-1} + b_f)$$
$$i_t = \sigma(W_i u_t + U_i h_{t-1} + b_i)$$
$$o_t = \sigma(W_o u_t + U_o h_{t-1} + b_o)$$
$$\tilde{c}_t = \tanh(W_c u_t + U_c h_{t-1} + b_c)$$
$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$$
$$h_t = o_t \odot \tanh(c_t)$$
The sequence information $h_t$ at time $t$ is obtained from the above formulas, where $f_t$, $i_t$ and $o_t$ are the forget gate, input gate and output gate of the LSTM, respectively; $\tilde{c}_t$ is the candidate value at the current time; $c_t$ is the state at the current time; $h_t$ is the output value at the current time; $W_f$, $W_i$, $W_o$, $W_c$ and $U_f$, $U_i$, $U_o$, $U_c$ are weight matrices; $b_f$, $b_i$, $b_o$, $b_c$ are bias vectors; $u_t$ is the current input value; $c_{t-1}$ is the state at the previous time; $h_{t-1}$ is the output value at the previous time; and $\sigma$ is the sigmoid function.
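To make the gate equations concrete, here is a minimal pure-Python sketch of one LSTM step under the standard formulation. The helper names (`affine`, `lstm_step`), the parameter layout and the toy all-zero parameters are assumptions for illustration only; a real model would use a deep learning framework and learned weights.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def affine(W, x, U, h, b):
    """Compute W x + U h + b for list-of-lists matrices and list vectors."""
    return [
        sum(w * xi for w, xi in zip(W[r], x))
        + sum(u_ * hi for u_, hi in zip(U[r], h))
        + b[r]
        for r in range(len(b))
    ]

def lstm_step(u_t, h_prev, c_prev, params):
    """One LSTM step; params maps a gate name to its (W, U, b) triple."""
    def gate(name, activation):
        W, U, b = params[name]
        return [activation(z) for z in affine(W, u_t, U, h_prev, b)]

    f_t = gate("f", sigmoid)       # forget gate
    i_t = gate("i", sigmoid)       # input gate
    o_t = gate("o", sigmoid)       # output gate
    c_hat = gate("c", math.tanh)   # candidate state
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t

# Toy setup (assumed): 2-dimensional inputs and hidden state, all-zero parameters.
zeros_m = [[0.0, 0.0], [0.0, 0.0]]
params = {name: (zeros_m, zeros_m, [0.0, 0.0]) for name in "fioc"}
h, c = [0.0, 0.0], [1.0, -1.0]
for u_t in [[0.5, -0.5], [1.0, 0.0]]:
    h, c = lstm_step(u_t, h, c, params)
print(h, c)
```

With all-zero parameters every gate equals 0.5 and the candidate is 0, so the cell state simply halves at each step, which makes the update rule easy to check by hand.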
The output of the sequence information encoder is then passed into the dynamic routing converter. First, the low-level capsules are mapped through the matrix $w_j$ to low-level capsule copies $u_{j|i}$. The dynamic routing mechanism then aggregates the copies $u_{j|i}$ into the high-level capsule layer. This step yields the output of the dynamic routing converter, $v = \{v_1, v_2, \dots, v_n\}$, where $v$ denotes the main feature vector of the case fact description text.
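The text does not spell out the routing procedure, so the sketch below assumes the commonly used routing-by-agreement scheme with a squash nonlinearity and three iterations. The function names, the iteration count and the toy capsule copies are assumptions, not details taken from the patent.

```python
import math

def squash(s):
    """Scale vector s so its length lies in [0, 1) while keeping its direction."""
    sq = sum(x * x for x in s)
    if sq == 0.0:
        return [0.0] * len(s)
    scale = sq / (1.0 + sq) / math.sqrt(sq)
    return [scale * x for x in s]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dynamic_routing(u_hat, iterations=3):
    """u_hat[i][j] is the copy of low-level capsule i intended for high-level capsule j."""
    n_low, n_high, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    logits = [[0.0] * n_high for _ in range(n_low)]
    v = []
    for _ in range(iterations):
        # Coupling coefficients: each low-level capsule distributes itself over high-level ones.
        coupling = [softmax(row) for row in logits]
        v = []
        for j in range(n_high):
            s_j = [sum(coupling[i][j] * u_hat[i][j][d] for i in range(n_low))
                   for d in range(dim)]
            v.append(squash(s_j))
        # Agreement update: raise the logit where a copy aligns with the high-level output.
        for i in range(n_low):
            for j in range(n_high):
                logits[i][j] += sum(a * b for a, b in zip(u_hat[i][j], v[j]))
    return v

# Toy copies (assumed): 3 low-level capsules, 2 high-level capsules, dimension 2.
u_hat = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.1], [0.0, 0.9]],
    [[0.9, 0.0], [0.1, 1.0]],
]
v = dynamic_routing(u_hat)
print(v)
```

The squash step keeps every high-level capsule shorter than 1, so its length can be read as a confidence while its direction carries the aggregated features.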
S2.1.3: construct the residual unit layer based on the attention mechanism (attention layer): apply the attention mechanism to the initial capsule layer $u = \{u_1, u_2, \dots, u_n\}$ to obtain the auxiliary feature vector $c$ of the case fact description text.
The attention layer is shown in Fig. 4:
Each of the $n$ initial capsules $u_i$ ($i = 1, 2, \dots, n$) in the initial capsule layer $u$ is transformed by the weight matrix $W$ to obtain $e_i$; the softmax function is then applied to the $e_i$ to obtain the importance weight $\alpha_i$ of each initial capsule $u_i$; finally, all initial capsule vectors are summed according to their importance weights to obtain the auxiliary feature vector $c$ of the case fact description text. The formulas are as follows:
$$e_i = \tanh(W u_i + b)$$
$$\alpha_i = \frac{\exp(e_i)}{\sum_{j=1}^{n} \exp(e_j)}$$
$$c = \sum_{i=1}^{n} \alpha_i u_i$$
where $W$ is the weight matrix and $b$ is the bias vector.
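The attention layer can be illustrated with a short sketch. It assumes that each score $e_i$ is a scalar, that is, $W$ acts as a single row vector `w` and `b` is a scalar; this simplification, the function name `attention_pool` and the toy inputs are assumptions rather than details stated in the text.

```python
import math

def attention_pool(u, w, b):
    """Return importance weights and the weighted sum of the capsules in u."""
    # e_i = tanh(w . u_i + b), one scalar score per initial capsule.
    scores = [math.tanh(sum(wk * uk for wk, uk in zip(w, u_i)) + b) for u_i in u]
    # Softmax over the scores gives the importance weights alpha_i.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    # c = sum_i alpha_i * u_i, the auxiliary feature vector.
    dim = len(u[0])
    c = [sum(a * u_i[d] for a, u_i in zip(alphas, u)) for d in range(dim)]
    return alphas, c

# Toy initial capsules and parameters (assumed values).
u = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
alphas, c = attention_pool(u, w=[1.0, 1.0], b=0.0)
print(alphas, c)
```

The third capsule gets the highest score here, so it contributes most to `c`; the weights always sum to 1.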
S2.1.4: construct the output layer: the main feature vector of the case fact description text obtained in S2.1.2 is combined with the auxiliary feature vector $c$ obtained in S2.1.3, and the result is fed into a fully connected layer network.
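One plausible reading of the output layer is sketched below. It assumes that "combined" means concatenating the flattened main feature capsules with the auxiliary vector, and that the fully connected layer is followed by a softmax over charges; both choices, the function name `predict_charge` and all numbers are assumptions for illustration.

```python
import math

def predict_charge(v, c, weights, bias):
    """Concatenate features, apply one dense layer and softmax, return (charge id, probabilities)."""
    features = [x for capsule in v for x in capsule] + list(c)
    logits = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(weights, bias)]
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return probs.index(max(probs)), probs

# Toy inputs (assumed): two 2-dimensional capsules, a 2-dimensional auxiliary vector, 3 charges.
v = [[0.5, 0.0], [0.0, 0.5]]
c = [1.0, 0.0]
weights = [
    [1.0, 0.0, 0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 1.0],
]
bias = [0.0, 0.0, 0.0]
charge_id, probs = predict_charge(v, c, weights, bias)
print(charge_id, probs)
```

The returned index is the integer charge code, which can be mapped back to a charge name with the encoding built in S1.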
S2.2: train the sequence-enhanced capsule network model: the model obtained in S2.1 is trained with the focal loss function, which is given by:
$$FL(\hat{p}) = -\alpha (1 - \hat{p})^{\gamma} \log(\hat{p})$$
where $\hat{p}$ is the model's estimated probability computed by the softmax function, $\alpha$ is the $\alpha$-balanced variable of focal loss, $(1 - \hat{p})^{\gamma}$ is the modulating factor, and $\gamma$ ($\gamma \neq 0$) is a tunable parameter that adjusts the effect of the modulating factor.
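A minimal sketch of the focal loss for a single example follows, using the standard α-balanced form. The default values `alpha=0.25` and `gamma=2.0` are commonly used settings and are assumptions here; the text does not state which values were used.

```python
import math

def focal_loss(p_true, alpha=0.25, gamma=2.0):
    """Focal loss for one example, where p_true is the softmax probability of the true charge."""
    return -alpha * (1.0 - p_true) ** gamma * math.log(p_true)

def cross_entropy(p_true):
    return -math.log(p_true)

easy = focal_loss(0.9)  # well-classified example, typical of a frequent charge
hard = focal_loss(0.1)  # poorly classified example, typical of a rare charge
print(easy, hard)
print(easy / cross_entropy(0.9), hard / cross_entropy(0.1))
```

The ratio to plain cross-entropy shows the intended effect: the modulating factor shrinks the loss of easy examples far more than that of hard ones, so rare, hard charges dominate the gradient.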
S3: input the fact description text of a new case into the sequence-enhanced capsule network model trained in S2, and the model automatically predicts the corresponding charge as the charge prediction result.
To demonstrate the effectiveness of the proposed criminal case charge prediction method based on the sequence-enhanced capsule network, it is compared on three data sets with several classic text classification methods and the two existing state-of-the-art charge prediction methods. In addition, to demonstrate the model's effectiveness on low-frequency charge prediction, we conducted a group of charge prediction experiments at different frequencies.
Table 1 shows the results of the proposed method and the baseline models on the three data sets. Overall, the proposed method outperforms all baselines on the three data sets by a significant margin. Specifically, compared with the previous state-of-the-art charge prediction method, the model obtains absolute F1 improvements of 4.5%, 2.5% and 6.4% on the three data sets, respectively, which illustrates the effectiveness of the proposed method for the charge prediction task. This trend shows that the proposed method can capture the high-level semantic representations of legal text that are vital to charge prediction.
Table 1: Comparison of charge prediction results on the real data sets, where MP denotes macro precision, MR denotes macro recall, and F1 denotes macro F1.
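For reference, the three macro-averaged metrics named in the Table 1 caption can be computed as in the sketch below. It assumes macro F1 is the mean of the per-charge F1 scores (another convention derives it from MP and MR); the function name and the toy labels are illustrative only.

```python
def macro_scores(y_true, y_pred):
    """Return (macro precision, macro recall, macro F1) averaged over all charges."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n

# Toy integer-encoded charges for six cases (assumed data).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
mp, mr, f1 = macro_scores(y_true, y_pred)
print(round(mp, 4), round(mr, 4), round(f1, 4))
```

Macro averaging gives every charge equal weight regardless of how many cases it has, which is why these metrics are sensitive to performance on low-frequency charges.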
Low-frequency charge comparison
Table 2: Comparison of low-frequency charge results on the real data set.
To further demonstrate the effectiveness of the proposed method on low-frequency charges, we conducted a group of experiments that split charges by frequency. Charges are divided into three parts by frequency (low, medium and high). Low-frequency charges are those that occur no more than 10 times across all data sets, high-frequency charges are those that occur more than 100 times, and all others are medium-frequency.
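The frequency split can be expressed directly in code. The thresholds (at most 10 occurrences for low, more than 100 for high) come from the text; the function name and the synthetic charge list are assumptions for illustration.

```python
from collections import Counter

def split_by_frequency(charges, low_max=10, high_min=100):
    """Group charge names into low, medium and high frequency buckets."""
    counts = Counter(charges)
    buckets = {"low": [], "medium": [], "high": []}
    for charge, n in sorted(counts.items()):
        if n <= low_max:
            buckets["low"].append(charge)      # occurs 10 times or fewer
        elif n > high_min:
            buckets["high"].append(charge)     # occurs more than 100 times
        else:
            buckets["medium"].append(charge)   # everything in between
    return buckets

# Synthetic example sitting exactly on the boundaries (assumed data).
charges = ["a"] * 10 + ["b"] * 11 + ["c"] * 100 + ["d"] * 101
buckets = split_by_frequency(charges)
print(buckets)  # {'low': ['a'], 'medium': ['b', 'c'], 'high': ['d']}
```

A charge with exactly 10 cases counts as low-frequency and one with exactly 100 cases counts as medium-frequency, matching the inclusive and exclusive bounds given above.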
Table 2 shows the performance of the proposed method on the Criminal-S data set at different frequencies. We compare the model of the invention with the state-of-the-art charge prediction model and the state-of-the-art text classification model on macro-F1 for low, medium and high frequency. As the table shows, the low-frequency macro-F1 is 53.8%, which is more than 65% higher than the LSTM-200 model and 4.1% higher than the state-of-the-art charge prediction model. The SECaps model not only alleviates the low-frequency charge prediction problem, but also provides an end-to-end model that reduces manual data annotation. The SECaps model has stronger vector and sequence representation capabilities, and focal loss performs well on class imbalance and hard-to-classify examples, which mitigates the shortcomings of low-frequency charge prediction.
Claims (9)
1. A criminal case charge prediction method based on a sequence-enhanced capsule network, characterized in that the method comprises the following steps:
S1: construct a training data set, obtaining the fact descriptions of cases and the corresponding charge verdicts as training data;
S2: construct a sequence-enhanced capsule network model and train it on the training data, comprising the following steps:
S2.1: construct the sequence-enhanced capsule network model, as follows:
S2.1.1: construct the initial capsule layer: segment the fact description text of the case into words and map the words to a word vector sequence, which serves as the initial capsule layer $u = \{u_1, u_2, \ldots, u_n\}$;
S2.1.2: construct the multiple seq-caps layers: extract features from the initial capsule layer $u$ obtained in S2.1.1 using the multiple seq-caps layers, obtaining the primary feature vector of the case fact description text;
S2.1.3: construct the attention layer: apply an attention mechanism to the initial capsule layer $u$ obtained in S2.1.1, obtaining the auxiliary feature vector $c$ of the case fact description text;
S2.1.4: construct the output layer: combine the primary feature vector obtained in S2.1.2 with the auxiliary feature vector $c$ obtained in S2.1.3, and feed the result to a fully connected layer;
S2.2: train the sequence-enhanced capsule network model;
S3: input the fact description text of a new case into the sequence-enhanced capsule network model trained in S2, and the model automatically predicts the corresponding charge as the charge prediction result.
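As a rough illustration of step S2.1.4 and the prediction in S3, the sketch below concatenates a primary feature vector with an auxiliary feature vector, applies one fully connected layer, and takes the most probable charge. Concatenation as the way of "combining", the softmax on top, and all names and toy numbers are assumptions made for the example.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def output_layer(primary, auxiliary, weights, bias):
    """Concatenate both feature vectors, apply a dense layer, return class probabilities."""
    features = list(primary) + list(auxiliary)   # assumed combination: concatenation
    logits = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(weights, bias)]
    return softmax(logits)

# Toy values (hypothetical): 2 charges, 2 primary features, 1 auxiliary feature
probs = output_layer([1.0, 0.0], [0.5],
                     weights=[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
                     bias=[0.0, 0.0])
predicted = max(range(len(probs)), key=probs.__getitem__)
print(predicted)  # 0
```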
2. The criminal case charge prediction method based on a sequence-enhanced capsule network according to claim 1, characterized in that: the data set in S1 consists of real criminal cases published on China Judgments Online, and each case comprises two parts, the fact description of the case and the charge verdict, which serve as the training data.
3. The criminal case charge prediction method based on a sequence-enhanced capsule network according to claim 1, characterized in that: word segmentation in S2.1.1 uses pkuseg, the open-source tool from Peking University, and an embedding layer maps the words to a word vector sequence using word vectors pretrained with Word2vec.
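The mapping from segmented words to the initial capsule layer can be pictured as a table lookup. The sketch below assumes the text has already been segmented (the claim names pkuseg for that step) and uses a tiny hand-written embedding table in place of pretrained Word2vec vectors; the function name, the zero vector for unknown words, and the toy data are all assumptions.

```python
def to_initial_capsules(tokens, embeddings, dim, unk=None):
    """Map each token to its word vector; unknown tokens get `unk` (default: zeros)."""
    unk = [0.0] * dim if unk is None else list(unk)
    return [list(embeddings.get(tok, unk)) for tok in tokens]

# Toy embedding table (hypothetical 2-dimensional vectors)
embeddings = {"被告人": [0.1, 0.2], "盗窃": [0.3, -0.1]}
tokens = ["被告人", "盗窃", "手机"]   # assumed output of the word segmenter

u = to_initial_capsules(tokens, embeddings, dim=2)
print(len(u), u[2])   # 3 [0.0, 0.0]
```

How out-of-vocabulary words are handled is a design choice that the source does not discuss; a zero vector is used here only to keep the example short.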
4. The criminal case charge prediction method based on a sequence-enhanced capsule network according to claim 1, characterized in that: in S2.1.2, the multiple seq-caps layers consist of two seq-caps layers.
5. The criminal case charge prediction method based on a sequence-enhanced capsule network according to claim 4, characterized in that: each seq-caps layer consists of a sequence information encoder and a dynamic routing converter.
6. The criminal case charge prediction method based on a sequence-enhanced capsule network according to claim 5, characterized in that: a long short-term memory (LSTM) network is used as the sequence information encoder.
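The claims name a dynamic routing converter without spelling out the procedure. As a hedged illustration, the sketch below implements the routing-by-agreement scheme commonly used in capsule networks; whether the patent uses exactly this variant is an assumption. The LSTM encoder and the learned transformation matrices are omitted, and the prediction vectors `u_hat` are assumed to be given.

```python
import math

def squash(v):
    """Shrink a vector so its length lies in [0, 1) while keeping its direction."""
    sq = sum(x * x for x in v)
    if sq == 0.0:
        return [0.0] * len(v)
    scale = sq / (1.0 + sq) / math.sqrt(sq)
    return [scale * x for x in v]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dynamic_routing(u_hat, iterations=3):
    """u_hat[i][j] is the prediction vector from input capsule i to output capsule j."""
    n_in, n_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_out for _ in range(n_in)]           # routing logits
    v = []
    for _ in range(iterations):
        c = [softmax(row) for row in b]                # coupling coefficients per input capsule
        v = []
        for j in range(n_out):
            s = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_in)) for d in range(dim)]
            v.append(squash(s))                        # output capsule j
        for i in range(n_in):                          # raise logits where prediction and output agree
            for j in range(n_out):
                b[i][j] += sum(u_hat[i][j][d] * v[j][d] for d in range(dim))
    return v

# Toy prediction vectors (hypothetical): 2 input capsules, 2 output capsules, dimension 2
u_hat = [[[1.0, 0.0], [0.0, 0.1]],
         [[0.9, 0.1], [0.1, 0.0]]]
v = dynamic_routing(u_hat)
print([round(math.sqrt(sum(x * x for x in cap)), 3) for cap in v])
```

The length of each output capsule can be read as how strongly the inputs agree on it, which is why the squashing function keeps lengths below 1.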
7. The criminal case charge prediction method based on a sequence-enhanced capsule network according to claim 1, characterized in that: in S2.1.3, the attention layer works as follows: each of the $n$ initial capsules $u_i$ ($i = 1, 2, \ldots, n$) in the initial capsule layer $u$ is transformed by a weight matrix $W$ to obtain a vector $e_i$; the softmax function is then applied to $e_i$ to obtain the importance weight $\alpha_i$ of each initial capsule $u_i$; finally, all initial capsules are summed according to their importance weights to obtain the auxiliary feature vector $c$ of the case fact description text. The formulas are as follows:
$$e_i = \tanh(W u_i + b)$$
$$\alpha_i = \frac{\exp(e_i)}{\sum_{j=1}^{n} \exp(e_j)}$$
$$c = \sum_{i=1}^{n} \alpha_i u_i$$
where $W$ is the weight matrix and $b$ is the bias vector.
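The attention layer described in this claim can be sketched in plain Python. For brevity the example assumes that the transformation produces one scalar score per capsule (so `W` is a single row and `b` a scalar), which is a simplification of the weight matrix and bias vector named in the claim; the function name and toy inputs are likewise assumptions.

```python
import math

def attention(u, W, b):
    """Return (alpha, c): importance weights and the weighted sum of the capsules in u."""
    # Score each capsule with tanh(W . u_i + b); scalar scores are an assumption
    scores = [math.tanh(sum(w * x for w, x in zip(W, ui)) + b) for ui in u]
    # Softmax over the n scores gives the importance weights
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    alpha = [e / total for e in exps]
    # Weighted sum of the initial capsules gives the auxiliary feature vector c
    dim = len(u[0])
    c = [sum(a * ui[d] for a, ui in zip(alpha, u)) for d in range(dim)]
    return alpha, c

# Toy capsules (hypothetical): with zero weights every capsule is equally important
alpha, c = attention([[1.0, 0.0], [0.0, 1.0]], W=[0.0, 0.0], b=0.0)
print(alpha, c)  # [0.5, 0.5] [0.5, 0.5]
```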
8. The criminal case charge prediction method based on a sequence-enhanced capsule network according to claim 1, characterized in that: in S2.2, the sequence-enhanced capsule network model is trained with the focal loss function.
9. The criminal case charge prediction method based on a sequence-enhanced capsule network according to claim 8, characterized in that: the focal loss function is given by:
$$FL(p_t) = -\alpha (1 - p_t)^{\gamma} \log(p_t)$$
where $p_t$ is the model's estimated probability computed by the softmax function, $\alpha$ is the α-balanced variable of focal loss, $(1 - p_t)^{\gamma}$ is a modulating factor, and $\gamma$ ($\gamma \neq 0$) is an adjustable parameter that tunes the effect of the modulating factor.
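To make the role of the modulating factor concrete, here is a minimal sketch of the α-balanced focal loss for a single example, written in its commonly published form. The symbol names, the small clamp that guards against log(0), and the default values `alpha=0.25` and `gamma=2.0` are assumptions for illustration; the source does not state which values were used.

```python
import math

def focal_loss(probs, target, alpha=0.25, gamma=2.0):
    """Focal loss for one example, given softmax probabilities and the true class index."""
    p_t = max(probs[target], 1e-12)            # estimated probability of the true class
    modulating = (1.0 - p_t) ** gamma          # close to 0 for easy, well-classified examples
    return -alpha * modulating * math.log(p_t)

# An easy example contributes far less loss than a hard one (toy probabilities)
easy = focal_loss([0.9, 0.1], target=0)
hard = focal_loss([0.1, 0.9], target=0)
print(easy < hard)  # True
```

Because well-classified examples are down-weighted, training concentrates on hard and rare charges, which is consistent with the motivation given for using focal loss on imbalanced charge data.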
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910396510.8A CN110119449B (en) | 2019-05-14 | 2019-05-14 | Criminal case criminal name prediction method based on sequence-enhanced capsule network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110119449A (en) | 2019-08-13 |
CN110119449B (en) | 2020-12-25 |
Family
ID=67522206
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910396510.8A Active CN110119449B (en) | 2019-05-14 | 2019-05-14 | Criminal case criminal name prediction method based on sequence-enhanced capsule network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110119449B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109241285A (en) * | 2018-08-29 | 2019-01-18 | 东南大学 | A kind of device of the judicial decision in a case of auxiliary based on machine learning |
US20190035099A1 (en) * | 2017-07-27 | 2019-01-31 | AI Incorporated | Method and apparatus for combining data to construct a floor plan |
CN109344839A (en) * | 2018-08-07 | 2019-02-15 | 北京市商汤科技开发有限公司 | Image processing method and device, electronic equipment, storage medium, program product |
CN109410575A (en) * | 2018-10-29 | 2019-03-01 | 北京航空航天大学 | A kind of road network trend prediction method based on capsule network and the long Memory Neural Networks in short-term of nested type |
CN109740148A (en) * | 2018-12-16 | 2019-05-10 | 北京工业大学 | A kind of text emotion analysis method of BiLSTM combination Attention mechanism |
CN110097096A (en) * | 2019-04-16 | 2019-08-06 | 天津大学 | A kind of file classification method based on TF-IDF matrix and capsule network |
Non-Patent Citations (3)
Title |
---|
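RAHUL KATARYA et al.: "Study on Text Classification using Capsule Networks", 2019 5th International Conference on Advanced Computing & Communication Systems (ICACCS) * |
沈炜域: "Text modeling method based on self-attention and dynamic routing", Software Guide (《软件导刊》) * |
郑毅: "Research on a capsule-style LSTM feature extraction algorithm for time series data", China Master's Theses Full-text Database, Information Science and Technology (《中国优秀硕士学位论文全文数据库 信息科技辑》) * |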
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111179971A (en) * | 2019-12-03 | 2020-05-19 | 杭州网易云音乐科技有限公司 | Nondestructive audio detection method and device, electronic equipment and storage medium |
CN111260114A (en) * | 2020-01-08 | 2020-06-09 | 昆明理工大学 | Low-frequency confusable criminal name prediction method for integrating case auxiliary sentence |
CN111260114B (en) * | 2020-01-08 | 2022-06-17 | 昆明理工大学 | Low-frequency confusable criminal name prediction method for integrating case auxiliary sentence |
CN113111895A (en) * | 2020-02-13 | 2021-07-13 | 北京明亿科技有限公司 | Support vector machine-based alarm handling and warning condition category determination method and device |
CN111985680A (en) * | 2020-07-10 | 2020-11-24 | 昆明理工大学 | Criminal multi-criminal name prediction method based on capsule network and time sequence |
CN111985680B (en) * | 2020-07-10 | 2022-06-14 | 昆明理工大学 | Criminal multi-criminal name prediction method based on capsule network and time sequence |
CN111881654A (en) * | 2020-08-01 | 2020-11-03 | 牡丹江师范学院 | Penalty test data amplification method based on multi-objective optimization |
CN112101559A (en) * | 2020-09-04 | 2020-12-18 | 中国航天科工集团第二研究院 | Case and criminal name inference method based on machine learning |
CN112101559B (en) * | 2020-09-04 | 2023-08-04 | 中国航天科工集团第二研究院 | Case crime name deducing method based on machine learning |
CN112231477A (en) * | 2020-10-20 | 2021-01-15 | 淮阴工学院 | Text classification method based on improved capsule network |
CN112231477B (en) * | 2020-10-20 | 2023-09-22 | 淮阴工学院 | Text classification method based on improved capsule network |
CN112256916A (en) * | 2020-11-12 | 2021-01-22 | 中国计量大学 | Short video click rate prediction method based on graph capsule network |
CN113033174A (en) * | 2021-03-23 | 2021-06-25 | 哈尔滨工业大学 | Case and criminal name judgment method and device based on output type similar door and storage medium |
CN113033174B (en) * | 2021-03-23 | 2022-06-10 | 哈尔滨工业大学 | Case classification method and device based on output type similar door and storage medium |
CN114781389A (en) * | 2022-03-04 | 2022-07-22 | 重庆大学 | Criminal name prediction method and system based on label enhanced representation |
CN114781389B (en) * | 2022-03-04 | 2024-04-05 | 重庆大学 | Crime name prediction method and system based on label enhancement representation |
Also Published As
Publication number | Publication date |
---|---|
CN110119449B (en) | 2020-12-25 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||