CN110119449B - Criminal case criminal name prediction method based on sequence-enhanced capsule network - Google Patents

Info

Publication number
CN110119449B
Authority
CN
China
Prior art keywords
sequence
criminal
case
capsule network
enhanced
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910396510.8A
Other languages
Chinese (zh)
Other versions
CN110119449A (en)
Inventor
彭黎
何从庆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan University
Original Assignee
Hunan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan University
Priority to CN201910396510.8A
Publication of CN110119449A
Application granted
Publication of CN110119449B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/18 Legal services

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Tourism & Hospitality (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • Economics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Business, Economics & Management (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Marketing (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Databases & Information Systems (AREA)
  • Biophysics (AREA)
  • Development Economics (AREA)
  • Biomedical Technology (AREA)
  • Game Theory and Decision Science (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Software Systems (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Mathematical Physics (AREA)
  • Molecular Biology (AREA)
  • Technology Law (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Alarm Systems (AREA)

Abstract

The invention relates to the field of intelligent law, and in particular to a criminal name prediction method for criminal cases based on a sequence-enhanced capsule network. The method comprises the following steps: S1, constructing a training data set by acquiring the fact description of each case and its criminal name judgment result as training data; S2, constructing a sequence-enhanced capsule network model and training it with the training data; S3, inputting the fact description text of a new case into the sequence-enhanced capsule network model trained in S2, whereupon the model automatically predicts the corresponding criminal name as the criminal name prediction result. The proposed model not only captures the salient features and semantic information of legal text better, but is also more competitive on the low-frequency criminal name prediction problem. A focal loss function is introduced as the loss function of the sequence-enhanced capsule network model, which further alleviates the severe imbalance among criminal names in the low-frequency criminal name prediction task.

Description

Criminal case criminal name prediction method based on sequence-enhanced capsule network
Technical Field
The invention relates to the field of intelligent law, and in particular to a criminal name prediction method for criminal cases based on a sequence-enhanced capsule network.
Background
In recent years, artificial intelligence technologies represented by deep learning and natural language processing have made great breakthroughs, and the field of intelligent law has attracted the attention of both academia and industry. Intelligent law gives machines the ability to understand legal texts and analyze cases, so that cases can be handled intelligently according to their facts.
Automatic criminal name prediction is one of the most representative subtasks in intelligent law. It plays an important role in legal assistant systems and is widely applied in real life. For example, it can provide legal experts (such as lawyers and judges) with a reference criminal name for the defendant in a case, assisting judges in deciding cases and improving working efficiency, and it can provide legal consultation for ordinary people who are unfamiliar with legal terms and complex procedures. Automatic criminal name prediction uses machine learning or deep learning techniques to train a model that determines the criminal name (such as theft, robbery, or traffic accident) of the defendant in a case. Previous research has proposed a number of methods for automatic criminal name prediction. These methods fall into three main categories: (1) traditional methods; (2) machine learning methods; (3) deep learning methods.
Traditional methods usually rely on mathematical formulas or quantitative calculation. Kort [Fred Kort. Predicting Supreme Court decisions mathematically: A quantitative analysis of the "right to counsel" cases. American Political Science Review, 1957, 51(1): 1-12] attempted to use quantitative methods to predict human events that are generally considered highly uncertain, namely the decisions of the Supreme Court of the United States. The study set out to show that, in at least one area of judicial review, cases that have already been decided can be used to determine the factual factors that influence a decision, that numerical values can be derived for these factors, and that the decisions in the remaining cases of that area can then be predicted correctly. Nagel [Stuart S. Nagel. Applying correlation analysis to case prediction. Tex. L. Rev., 1963, 42: 1006] held that litigation outcomes can be predicted scientifically and, using reapportionment cases as an example, demonstrated that prediction is possible by assigning correlation coefficients to the four variables that occur in the cases. Such prediction helps parties planning litigation, theorists seeking to understand the judicial process, legislators interpreting judicial responses, and members of the public seeking to comply with the law. Keown [R. Keown. Mathematical models for legal prediction. Computer/LJ, 1980, 2: 829] proposed that judicial decisions can be predicted mathematically. He correctly predicted 99% of the decisions in over 1000 cases using the linear models of Haar, Sawyer and Cummings and the nearest neighbor method of Mackaay and Robillard. This success created both a real opportunity and an urgent need to develop linear models in other specific areas, not only to verify empirically that the method is generally effective, but also to provide additional predictive models for the legal profession. These traditional methods are effective in some scenarios, but they are limited to small data sets with a small number of labels.
Following the success of machine learning in many areas, researchers began to apply machine learning methods to criminal name prediction. This line of work typically extracts features from the case facts and then uses a machine learning algorithm for prediction. Liu et al. [Chao-Lin Liu, Cheng-Tsung Chang, Jim-How Ho. Case instance generation and refinement for case-based criminal summary judgments in Chinese. 2004; Chao-Lin Liu, Chwen-Dar Hsieh. Exploring phrase-based classification of judicial documents for criminal charges in Chinese. Proc. of International Symposium on Methodologies for Intelligent Systems. Springer, 2006, 681-690] proposed a K-Nearest-Neighbor-based algorithm that automatically generates and refines case instances from real judgment texts for use in simple case decisions. The algorithm extracts important legal information from past litigation documents to construct case instances, which are then refined by merging similar cases and removing relatively irrelevant information. Lin et al. [Wan-Chen Lin, Tsung-Ting Kuo, Tung-Jia Chang. Exploiting machine learning models for Chinese legal documents labeling, case classification, and sentencing prediction. ROCLING XXIV (2012), 2012, 140] defined 21 legal element labels for "robbery" and "intimidation", and then used the legal element information to classify the two crimes and predict their sentence lengths. Mackaay et al. [Ejan Mackaay, Pierre Robillard. Predicting judicial decisions: The nearest neighbour rule and visual representation of case patterns. 1974] extracted features by clustering semantically similar N-grams. Sulea et al. [Octavia-Maria Sulea, Marcos Zampieri, Shervin Malmasi, et al. Exploring the Use of Text Classification in the Legal Domain. CoRR, 2017, abs/1710.09306] investigated text classification methods in the legal domain using cases and rulings of the French Supreme Court, and proposed a support-vector-machine-based system that uses the case description, time span and ruling features to predict the area of law and the ruling of a case. However, these methods extract only shallow text features or rely on manual labels, which are difficult to collect for large data sets. Their performance is therefore poor when the amount of data is large.
In recent years, with the success of deep neural networks in natural language processing (NLP), computer vision (CV) and speech, some work has begun to apply them to automatic criminal name prediction and has shown large performance gains. Luo et al. [Bingfeng Luo, Yansong Feng, Jianbo Xu, et al. Learning to Predict Charges for Criminal Cases with Legal Basis. arXiv preprint arXiv:1707.09168, 2017] argue that the relevant legal provisions play a very important role in criminal name prediction. They therefore propose an attention-based neural network that jointly models the criminal name prediction task and the relevant article extraction task in a unified framework, so that appropriate criminal names can be predicted effectively for cases described in different ways. However, this work addresses neither low-frequency criminal name prediction nor multiple criminal name prediction. Zhong et al. [Haoxi Zhong, Zhipeng Guo, Cunchao Tu, et al. Legal Judgment Prediction via Topological Learning. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 2018, 3540-3549] propose a topological multi-task learning framework that considers the topological dependencies among the subtasks of legal judgment (criminal names, law articles, fines, and terms of penalty) and incorporates these dependencies into judgment prediction. Hu et al. [Zikun Hu, Xiang Li, Cunchao Tu, et al. Few-shot charge prediction with discriminative legal attributes. In: Proceedings of the 27th International Conference on Computational Linguistics. 2018, 487-498] introduce several discriminative attributes of a crime as an internal mapping between the fact description and the criminal name. These attributes provide additional information for low-frequency criminal names and effective features for distinguishing confusable criminal names. They then propose an attribute-attentive prediction model that infers the attributes and the criminal name at the same time. Further analysis of the above research shows that, although academia and industry have proposed a series of deep-learning-based automatic criminal name prediction algorithms and made considerable progress, existing methods still have the following shortcomings: (1) most existing works [9,10] ignore the low-frequency criminal name scenario and consider only high-frequency criminal names, so they cannot handle low-frequency criminal name prediction well. (2) Hu et al. [11] achieved good results in low-frequency scenarios using manually produced auxiliary information, but manual annotation takes a great deal of time and prevents an end-to-end deep learning model.
A prior invention (published 2019.02.22) discloses a criminal name prediction method for criminal cases based on a memory neural network. It builds a training data set from standard case descriptions and their criminal names, trains a memory neural network model on this data set, converts case description feature vectors and criminal name codes into key-value pairs stored in the memory neural network model, and determines the criminal name with a multi-layer perceptron classifier.
Disclosure of Invention
Based on an in-depth analysis of the research results of numerous scholars at home and abroad, and aiming at the problems in the prior art, the invention provides a criminal name prediction method for criminal cases based on a sequence-enhanced capsule network, so as to alleviate the problem of low-frequency criminal name prediction in criminal cases.
To achieve this purpose, the invention adopts the following technical scheme: a criminal name prediction method for criminal cases based on a sequence-enhanced capsule network, comprising the following steps:
S1, constructing a training data set: acquiring the fact description of each case and its criminal name judgment result as training data;
S2, constructing a sequence-enhanced capsule network model and training it with the training data, comprising the following steps:
S2.1, constructing the sequence-enhanced capsule network model, specifically comprising the following steps:
S2.1.1, constructing the initial capsule layer: segmenting the fact description text of the case into words, mapping it into a word vector sequence, and taking the word vector sequence as the initial capsule layer $u = \{u_1, u_2, \dots, u_n\}$;
S2.1.2, constructing the Multiple seq-caps layer: extracting features from the initial capsule layer u obtained in S2.1.1 with the Multiple seq-caps layer to obtain the main feature vector of the case fact description text, wherein the Multiple seq-caps layer consists of two seq-caps layers;
S2.1.3, constructing a residual unit layer based on an attention mechanism (attention layer): applying the attention mechanism to the initial capsule layer u obtained in S2.1.1 to obtain the auxiliary feature vector c of the case fact description text:
the attention layer works as follows: each of the n initial capsules $u_i$ ($i = 1, 2, \dots, n$) in the initial capsule layer u is transformed by a weight matrix W to obtain a vector $e_i$; a softmax function is then applied to the vectors $e_i$ to obtain the importance weight $\alpha_i$ of each initial capsule $u_i$; finally, all initial capsules are summed according to their importance weights to obtain the auxiliary feature vector c of the case fact description text. The formulas are as follows:
$$e_i = \tanh(W u_i + b)$$
$$\alpha_i = \frac{\exp(e_i)}{\sum_{k=1}^{n} \exp(e_k)}$$
$$c = \sum_{i=1}^{n} \alpha_i u_i$$
where W is the weight matrix and b is the bias vector.
S2.1.4, constructing the output layer: the main feature vector of the case fact description text obtained in S2.1.2 and the auxiliary feature vector c obtained in S2.1.3 are combined and passed to a fully connected network.
S2.2, training the sequence-enhanced capsule network model;
S3, inputting the fact description text of a new case into the sequence-enhanced capsule network model trained in S2; the model automatically predicts the corresponding criminal name as the criminal name prediction result.
Further, the data set in S1 comes from real criminal cases published on China Judgments Online; each case includes two parts, the fact description of the case and the criminal name judgment result, which are used as training data.
Further, in S2.1.1 the word segmentation uses pkuseg, the open-source tool from Peking University, and an Embedding layer maps the segmented words into a word vector sequence using word vectors trained with Word2vec.
Further, in S2.2 the sequence-enhanced capsule network model is trained with a focal loss function.
Compared with the prior art:
(1) The invention provides a sequence-enhanced capsule network model that better captures the salient features and semantic information of legal texts and is more competitive on low-frequency criminal name prediction.
(2) A focal loss function is introduced as the loss function of the sequence-enhanced capsule network model, which further alleviates the severe imbalance among criminal names in the low-frequency criminal name prediction task.
(3) Compared with the current state-of-the-art method, the sequence-enhanced capsule network model provided by the invention improves F1 by 4.5% and 6.4% on the real data sets Criminal-S and Criminal-L, respectively. The experimental results demonstrate the superiority and competitiveness of the sequence-enhanced capsule network model in low-frequency criminal name scenarios.
Drawings
FIG. 1 is a flow chart of a method of the present invention;
FIG. 2 is a schematic diagram of the sequence-enhanced capsule network model of the present invention;
FIG. 3 is a schematic diagram of the seq-caps layer of the present invention;
FIG. 4 is a schematic diagram of the attention layer of the present invention.
Detailed Description
The invention is further described below with reference to the drawings and specific embodiments.
A brief flow diagram of the invention is shown in FIG. 1. The criminal name prediction method for criminal cases based on the sequence-enhanced capsule network model comprises the following steps:
S1, constructing a training data set: acquiring the fact description of each case and its criminal name judgment result as training data;
The invention carries out experiments on three public real-world data sets, all drawn from criminal cases published on China Judgments Online; the fact description and the criminal name judgment result of each case are obtained as training data. Since only the principal criminal name of each case is retained in the public data sets, each criminal name only needs to be mapped to a unique integer for encoding.
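The integer encoding of criminal names described above can be sketched as follows. This is a minimal illustration, not the patent's code: the function name `build_label_index` and the sample labels are hypothetical, and first-seen ordering is an assumption.

```python
def build_label_index(charges):
    """Map each distinct criminal name to a unique integer, in first-seen order."""
    index = {}
    for name in charges:
        if name not in index:
            index[name] = len(index)
    return index


# Hypothetical principal criminal names, one per case.
labels = ["theft", "robbery", "theft", "traffic accident"]
label_index = build_label_index(labels)
encoded = [label_index[name] for name in labels]
```

Any other one-to-one mapping (for example, sorting the names first) would serve equally well, as long as the same mapping is used for training and prediction.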
S2, constructing a sequence-enhanced capsule network model and training it with the training data, comprising the following steps:
S2.1, constructing the sequence-enhanced capsule network model, which is shown in FIG. 2. The construction of the model comprises the following steps:
S2.1.1, constructing the initial capsule layer: the fact description text of the case is segmented into words, an Embedding layer maps the words into a word vector sequence using word vectors trained with Word2vec, and the word vector sequence is taken as the initial capsule layer $u = \{u_1, u_2, \dots, u_n\}$.
S2.1.2, constructing the Multiple seq-caps layer: the Multiple seq-caps layer is applied to the initial capsule layer u obtained in S2.1.1 to obtain the main feature vector of the case fact description text.
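As a rough illustration of the initial capsule layer, the sketch below looks up one vector per segmented word. It assumes the text has already been segmented and that a word-to-vector table is available; the toy vectors, the function name `build_initial_capsules`, and the zero vector used for unknown words are all assumptions made for this example.

```python
def build_initial_capsules(tokens, embeddings, dim):
    """Look up one vector per token; unknown tokens map to a zero vector (assumption)."""
    zero = [0.0] * dim
    return [list(embeddings.get(tok, zero)) for tok in tokens]


# Hypothetical 3-dimensional vectors standing in for Word2vec output.
toy_embeddings = {
    "defendant": [0.1, 0.3, -0.2],
    "stole": [0.7, -0.1, 0.4],
    "phone": [0.0, 0.5, 0.5],
}
tokens = ["defendant", "stole", "a", "phone"]  # assumed output of a word segmenter
u = build_initial_capsules(tokens, toy_embeddings, dim=3)
```

Here `u` plays the role of the initial capsule layer, with one capsule per word.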
The Multiple seq-caps layer consists of two seq-caps layers. As shown in FIG. 3, each seq-caps layer consists of a sequence information encoder (Sequence Information Encoder) and a dynamic routing converter (Dynamic Routing). The invention uses a long short-term memory network (LSTM) as the sequence information encoder. Taking the first seq-caps layer as an example, the initial capsule layer $u = \{u_1, u_2, \dots, u_n\}$ is fed into the seq-caps layer, and the long short-term memory network is computed as follows:
$$f_t = \sigma(W_f u_t + U_f h_{t-1} + b_f),$$
$$i_t = \sigma(W_i u_t + U_i h_{t-1} + b_i),$$
$$o_t = \sigma(W_o u_t + U_o h_{t-1} + b_o),$$
$$\tilde{c}_t = \tanh(W_c u_t + U_c h_{t-1} + b_c),$$
$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t,$$
$$h_t = o_t \odot \tanh(c_t)$$
The above formulas give the sequence information $h_t$ at time $t$, where $f_t$, $i_t$ and $o_t$ are the forget gate, input gate and output gate of the LSTM, respectively; $\tilde{c}_t$ is the candidate value at the current time; $c_t$ is the state at the current time; $h_t$ is the output value at the current time; $W_f$, $W_i$, $W_o$, $W_c$ and $U_f$, $U_i$, $U_o$, $U_c$ are weight matrices; $b_f$, $b_i$, $b_o$, $b_c$ are bias vectors; $u_t$ is the current input value; $c_{t-1}$ is the state at the previous time; $h_{t-1}$ is the output value at the previous time; $\sigma$ is the sigmoid function; and $\odot$ denotes element-wise multiplication.
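A single LSTM time step, written with the standard gate equations (forget, input and output gates, candidate value, cell state, output), might look like the sketch below. It uses plain Python lists instead of a tensor library, and the parameter dictionary layout (`"Wf"`, `"Uf"`, `"bf"`, and so on) is a convention chosen for this example only.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, u, U, h, b):
    """Compute W u + U h + b for list-of-lists matrices and list vectors."""
    return [
        sum(w * x for w, x in zip(W[r], u))
        + sum(w * x for w, x in zip(U[r], h))
        + b[r]
        for r in range(len(b))
    ]


def lstm_step(u_t, h_prev, c_prev, params):
    """One LSTM time step: returns the new output h_t and state c_t."""
    p = params
    f = [sigmoid(z) for z in affine(p["Wf"], u_t, p["Uf"], h_prev, p["bf"])]  # forget gate
    i = [sigmoid(z) for z in affine(p["Wi"], u_t, p["Ui"], h_prev, p["bi"])]  # input gate
    o = [sigmoid(z) for z in affine(p["Wo"], u_t, p["Uo"], h_prev, p["bo"])]  # output gate
    c_tilde = [math.tanh(z) for z in affine(p["Wc"], u_t, p["Uc"], h_prev, p["bc"])]
    c = [fk * cp + ik * ck for fk, cp, ik, ck in zip(f, c_prev, i, c_tilde)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


def encode_sequence(u, params, hidden_dim):
    """Run the LSTM over all capsules u_1..u_n and collect every h_t."""
    h, c = [0.0] * hidden_dim, [0.0] * hidden_dim
    outputs = []
    for u_t in u:
        h, c = lstm_step(u_t, h, c, params)
        outputs.append(h)
    return outputs
```

The helper `encode_sequence` is hypothetical; it simply shows how the per-step outputs could be collected before being handed to the next stage.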
The output of the sequence information encoder is then passed to the dynamic routing converter. Each low-level capsule is first mapped by a matrix $w_j$ to a low-level capsule copy $u_{j|i}$. The dynamic routing mechanism then aggregates the copies $u_{j|i}$ into the high-level capsule layer, giving the output of the dynamic routing converter $v = \{v_1, v_2, \dots, v_n\}$, where v denotes the main feature vector of the case fact description text.
S2.1.3, constructing a residual unit layer based on an attention mechanism (attention layer): an attention mechanism is applied to the initial capsule layer $u = \{u_1, u_2, \dots, u_n\}$ to obtain the auxiliary feature vector c of the case fact description text.
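The text does not spell out the routing procedure, so the sketch below assumes the commonly used routing-by-agreement scheme for capsule networks, with a squash non-linearity and three iterations. The squash function, the iteration count, and the layout of `u_hat` are assumptions for illustration and may differ from the patented model.

```python
import math


def squash(s):
    """Scale vector s so its length lies in [0, 1) while keeping its direction."""
    norm_sq = sum(x * x for x in s)
    if norm_sq == 0.0:
        return [0.0] * len(s)
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq)
    return [scale * x for x in s]


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_routing(u_hat, iterations=3):
    """u_hat[i][j] is the copy of low-level capsule i intended for high-level capsule j."""
    n_low, n_high = len(u_hat), len(u_hat[0])
    dim = len(u_hat[0][0])
    b = [[0.0] * n_high for _ in range(n_low)]  # routing logits, start uniform
    v = []
    for _ in range(iterations):
        coupling = [softmax(row) for row in b]  # one distribution per low-level capsule
        v = []
        for j in range(n_high):
            s_j = [
                sum(coupling[i][j] * u_hat[i][j][d] for i in range(n_low))
                for d in range(dim)
            ]
            v.append(squash(s_j))
        # Raise the logit wherever a copy agrees with the high-level capsule it feeds.
        for i in range(n_low):
            for j in range(n_high):
                b[i][j] += sum(a * x for a, x in zip(u_hat[i][j], v[j]))
    return v
```

In this scheme, copies that point in the same direction as an emerging high-level capsule receive larger coupling weights in the next iteration.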
The attention layer is shown in FIG. 4:
Each of the n initial capsules $u_i$ ($i = 1, 2, \dots, n$) in the initial capsule layer u is transformed by a weight matrix W to obtain a vector $e_i$; a softmax function is then applied to the vectors $e_i$ to obtain the importance weight $\alpha_i$ of each initial capsule $u_i$; finally, all initial capsule vectors are summed according to their importance weights to obtain the auxiliary feature vector c of the case fact description text. The formulas are as follows:
$$e_i = \tanh(W u_i + b)$$
$$\alpha_i = \frac{\exp(e_i)}{\sum_{k=1}^{n} \exp(e_k)}$$
$$c = \sum_{i=1}^{n} \alpha_i u_i$$
where W is the weight matrix and b is the bias vector.
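The attention pooling just described can be sketched as below. To keep the example small it assumes that W has a single row, so that each score $e_i$ is a scalar and b is a scalar bias; the weighted sum $c = \sum_i \alpha_i u_i$ then follows directly. The function name `attention_pool` is hypothetical.

```python
import math


def attention_pool(u, w, b):
    """Return (alpha, c): softmax importance weights and the weighted sum of capsules.

    Assumption: w is a single weight row and b a scalar, so each score is a scalar.
    """
    scores = [math.tanh(sum(wk * x for wk, x in zip(w, u_i)) + b) for u_i in u]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    alpha = [e / total for e in exps]
    dim = len(u[0])
    c = [sum(a * u_i[d] for a, u_i in zip(alpha, u)) for d in range(dim)]
    return alpha, c


# Two toy capsules; zero weights give every capsule the same importance.
alpha, c = attention_pool([[1.0, 0.0], [0.0, 1.0]], w=[0.0, 0.0], b=0.0)
```

With non-zero weights, capsules whose score is higher contribute more to the auxiliary feature vector.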
S2.1.4, constructing the output layer: the main feature vector of the case fact description text obtained in S2.1.2 and the auxiliary feature vector c obtained in S2.1.3 are combined and passed to a fully connected network.
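One plausible reading of the output layer is sketched below: the high-level capsules are flattened, concatenated with the auxiliary vector, and passed through a single dense layer followed by softmax. The text only says the two feature vectors are combined, so concatenation, the single dense layer, and the function name `output_layer` are assumptions for illustration.

```python
import math


def output_layer(v, c, W, b):
    """Concatenate flattened capsules v with vector c, then apply dense + softmax."""
    features = [x for capsule in v for x in capsule] + list(c)
    logits = [
        sum(w * x for w, x in zip(row, features)) + bias
        for row, bias in zip(W, b)
    ]
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy shapes: two 2-dimensional capsules plus a 2-dimensional auxiliary vector,
# scored against two criminal names with all-zero (untrained) weights.
probs = output_layer(
    v=[[1.0, 0.0], [0.0, 1.0]],
    c=[0.5, 0.5],
    W=[[0.0] * 6, [0.0] * 6],
    b=[0.0, 0.0],
)
```

Each row of `W` corresponds to one criminal name, so the output has one probability per class.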
S2.2, training the sequence-enhanced capsule network model: the sequence-enhanced capsule network model obtained in S2.1 is trained with the focal loss function, which is expressed as follows:
$$FL(p_t) = -\alpha (1 - p_t)^{\gamma} \log(p_t)$$
where $p_t$ is the model's estimated probability calculated by the softmax function, $\alpha$ is the α-balanced variable of the focal loss, $(1 - p_t)^{\gamma}$ is the modulating factor, and $\gamma$ ($\gamma \neq 0$) is a tunable parameter that strengthens the effect of the modulating factor.
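For a single example, the focal loss can be sketched as below, assuming the standard form $-\alpha (1 - p_t)^{\gamma} \log(p_t)$, where $p_t$ is the softmax probability assigned to the true class. The default values `alpha=0.25` and `gamma=2.0` are common choices in the focal loss literature, not values stated in the text.

```python
import math


def focal_loss(probs, target, alpha=0.25, gamma=2.0):
    """Focal loss for one example; alpha and gamma defaults are assumptions."""
    p_t = probs[target]  # probability the model assigns to the true class
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


easy = focal_loss([0.9, 0.1], target=0)  # confident and correct
hard = focal_loss([0.1, 0.9], target=0)  # confident and wrong
```

Because the factor $(1 - p_t)^{\gamma}$ shrinks toward zero for well-classified examples, `easy` is far smaller than `hard`, which shifts training effort toward rare or difficult criminal names.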
S3, inputting the fact description text of a new case into the sequence-enhanced capsule network model trained in S2; the model automatically predicts the corresponding criminal name as the criminal name prediction result.
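At prediction time, the class with the highest output probability is mapped back to its criminal name. The sketch below assumes the model already produced a probability list; `predict_charge` and the sample mapping are hypothetical names and data.

```python
def predict_charge(probs, index_to_charge):
    """Return the criminal name whose class index has the highest probability."""
    best = max(range(len(probs)), key=lambda k: probs[k])
    return index_to_charge[best]


# Hypothetical inverse of the integer encoding used for training labels.
index_to_charge = {0: "theft", 1: "robbery", 2: "traffic accident"}
predicted = predict_charge([0.1, 0.7, 0.2], index_to_charge)
```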
To illustrate the effectiveness of the proposed criminal name prediction method for criminal cases based on the sequence-enhanced capsule network, the invention compares it on three data sets with several classical text classification methods and with the two most advanced prior-art criminal name prediction methods. In addition, to demonstrate the effectiveness of the model on low-frequency criminal name prediction, a group of experiments on criminal names of different frequencies is carried out.
Table 1 shows the results of the proposed model and the baseline models on the three data sets. Overall, the proposed method outperforms all baselines on all three data sets by a clear margin. Specifically, compared with the most advanced criminal name prediction method, the model of the invention achieves absolute F1 improvements of 4.5%, 2.5% and 6.4% on the three data sets, respectively, which demonstrates the effectiveness of the proposed method on the criminal name prediction task. This trend shows that the proposed method can capture the high-level semantic representation of legal text that is crucial to criminal name prediction.
Table 1: Comparison of criminal name prediction results on the real data sets, where MP denotes macro precision, MR denotes macro recall, and F1 denotes macro F1.
Figure BDA0002058316800000064
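The macro-averaged metrics named in the Table 1 caption can be computed as in the sketch below. It assumes that macro F1 is the unweighted mean of per-class F1 scores; some evaluations instead compute F1 from the macro precision and macro recall, and the text does not say which convention was used.

```python
def macro_scores(y_true, y_pred):
    """Return (macro precision, macro recall, macro F1) over all observed classes."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy labels for two classes.
mp, mr, f1 = macro_scores([0, 0, 1, 1], [0, 1, 1, 1])
```

Macro averaging gives every criminal name the same weight regardless of how often it occurs, which is why it is sensitive to performance on low-frequency classes.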
Low frequency criminal name comparison
Table 2: low frequency criminal name comparison under real data set
Figure BDA0002058316800000065
To further illustrate the effectiveness of the proposed method on low-frequency criminal names, a group of experiments is carried out in which the criminal names are split by frequency. The criminal names are divided into three parts by frequency (low, medium and high). A low-frequency criminal name is one that appears 10 times or fewer across the data set, a high-frequency criminal name is one that appears more than 100 times, and all others are medium-frequency.
Table 2 shows the performance of the proposed method on the Criminal-S data set at different frequencies. The low-frequency, medium-frequency and high-frequency macro-F1 results of the model of the invention are compared with those of the most advanced criminal name prediction model and the most advanced text classification model. As the table shows, the low-frequency macro-F1 is 53.8%, which is more than 65% higher than that of the LSTM-200 model and 4.1% higher than that of the most advanced criminal name prediction model. The SECaps (sequence-enhanced capsule network) model thus alleviates the problem of low-frequency criminal name prediction while providing an end-to-end model and reducing manual data annotation. The SECaps model has strong vector representation and sequence representation capabilities, and the focal loss performs well on imbalanced and hard-to-classify examples, so the weakness in low-frequency criminal name prediction can be overcome.
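The frequency split can be sketched as follows, using the thresholds stated in the text: at most 10 occurrences is low frequency, more than 100 is high frequency, and everything else is medium. The function names and sample labels are hypothetical.

```python
from collections import Counter


def frequency_bucket(count):
    """Classify a criminal name by how many cases carry it."""
    if count <= 10:
        return "low"
    if count > 100:
        return "high"
    return "medium"


def bucket_charges(labels):
    """Map each criminal name to its frequency bucket."""
    counts = Counter(labels)
    return {name: frequency_bucket(n) for name, n in counts.items()}


# Hypothetical label list with one criminal name at each boundary.
sample = ["a"] * 10 + ["b"] * 11 + ["c"] * 100 + ["d"] * 101
buckets = bucket_charges(sample)
```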

Claims (7)

1. A criminal name prediction method for criminal cases based on a sequence-enhanced capsule network, characterized by comprising the following steps:
S1, constructing a training data set: acquiring the fact description of each case and its criminal name judgment result as training data;
S2, constructing a sequence-enhanced capsule network model and training it with the training data, comprising the following steps:
S2.1, constructing the sequence-enhanced capsule network model, specifically comprising the following steps:
S2.1.1, constructing the initial capsule layer: segmenting the fact description text of the case into words, mapping it into a word vector sequence, and taking the word vector sequence as the initial capsule layer $u = \{u_1, u_2, \dots, u_n\}$;
S2.1.2, constructing the Multiple seq-caps layer: extracting features from the initial capsule layer u obtained in S2.1.1 with the Multiple seq-caps layer to obtain the main feature vector of the case fact description text;
the Multiple seq-caps layer consists of two seq-caps layers; each seq-caps layer consists of a sequence information encoder and a dynamic routing converter;
S2.1.3, constructing an attention layer: applying an attention mechanism to the initial capsule layer u obtained in S2.1.1 to obtain the auxiliary feature vector c of the case fact description text;
S2.1.4, constructing an output layer: combining the main feature vector of the case fact description text obtained in S2.1.2 with the auxiliary feature vector c obtained in S2.1.3, and passing the combined result to a fully connected layer network;
S2.2, training the sequence-enhanced capsule network model;
S3, inputting the fact description text of a new case into the sequence-enhanced capsule network model trained in S2; the model automatically predicts the corresponding criminal name as the criminal name prediction result.
2. The criminal case criminal name prediction method based on a sequence-enhanced capsule network according to claim 1, characterized in that: the data set in S1 comes from real criminal cases published on China Judgments Online, each case comprising two parts, the fact description of the case and the criminal name judgment result, which are used as training data.
3. The criminal case criminal name prediction method based on a sequence-enhanced capsule network according to claim 1, characterized in that: in S2.1.1, word segmentation uses pkuseg, the open-source tool from Peking University, and an embedding layer maps the words into a word vector sequence using word vectors trained with Word2vec.
4. The criminal case criminal name prediction method based on a sequence-enhanced capsule network according to claim 1, characterized in that: in S2.1.2, a long short-term memory network is used as the sequence information encoder.
5. The criminal case criminal name prediction method based on a sequence-enhanced capsule network according to claim 1, characterized in that the attention layer in S2.1.3 is as follows: each of the n initial capsules $u_i$ ($i = 1, 2, \dots, n$) in the initial capsule layer u is transformed by a weight matrix W to obtain a vector $e_i$; a softmax function is then applied to the $e_i$ to obtain the importance weight $\alpha_i$ of each initial capsule $u_i$; all the initial capsules are summed according to their importance weights to finally obtain the auxiliary feature vector c of the case fact description text; the formulas are as follows:
$$e_i = \tanh(W u_i + b)$$
$$\alpha_i = \frac{\exp(e_i)}{\sum_{j=1}^{n} \exp(e_j)}$$
$$c = \sum_{i=1}^{n} \alpha_i u_i$$
where W is the weight matrix and b is the bias vector.
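The attention layer of claim 5 can be sketched in a few lines of plain Python. This is a minimal illustration under one simplifying assumption: each score e_i = tanh(W u_i + b) is treated as a scalar, so W is a single weight row `w` and b a scalar. The function name `attention_pool` and the toy values are hypothetical.

```python
import math


def attention_pool(capsules, w, b):
    """Return (importance weights, auxiliary feature vector c) for capsules u_i."""
    # e_i = tanh(w . u_i + b), one scalar score per initial capsule
    scores = [math.tanh(sum(wk * uk for wk, uk in zip(w, u)) + b)
              for u in capsules]
    # softmax over the scores gives the importance weights
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # c is the importance-weighted sum of the initial capsules
    dim = len(capsules[0])
    c = [sum(a * u[d] for a, u in zip(weights, capsules)) for d in range(dim)]
    return weights, c


# Hypothetical toy input: three 2-dimensional initial capsules.
u = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w = [0.5, -0.5]
weights, c = attention_pool(u, w, b=0.0)
print(weights, c)
```

The weights sum to 1, so c stays on the same scale as the initial capsules regardless of the text length n.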
6. The criminal case criminal name prediction method based on a sequence-enhanced capsule network according to claim 1, characterized in that: in S2.2, the sequence-enhanced capsule network model is trained with a focal loss function.
7. The criminal case criminal name prediction method based on the sequence-enhanced capsule network, characterized in that the focal loss function is expressed as follows:
$$FL(p_t) = -\alpha (1 - p_t)^{\gamma} \log(p_t)$$
where $p_t$ is the model-estimated probability calculated by the softmax function, $\alpha$ is the α-balancing variable of focal loss, $(1 - p_t)^{\gamma}$ is the modulating factor, and $\gamma$ ($\gamma \neq 0$) is a tunable parameter that adjusts the effect of the modulating factor.
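To make the role of the tuning factor concrete, the following plain-Python sketch evaluates the standard focal loss on softmax probabilities. The notation `p_t` for the probability of the true class is an assumption, and the values `alpha=0.25` and `gamma=2.0` are placeholders rather than settings taken from the patent.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(p_t, alpha=0.25, gamma=2.0):
    """Focal loss for one sample, where p_t is the probability of the true class."""
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


# Hypothetical logits for three criminal names.
probs = softmax([2.0, 0.5, -1.0])

easy = focal_loss(probs[0])  # true class already predicted with high probability
hard = focal_loss(probs[2])  # true class predicted with low probability

# Ratio of hard-sample loss to easy-sample loss, with and without the factor.
ce_ratio = math.log(probs[2]) / math.log(probs[0])
fl_ratio = hard / easy
print(f"cross-entropy ratio={ce_ratio:.1f} focal-loss ratio={fl_ratio:.1f}")
```

The factor (1 - p_t)^γ shrinks the loss of well-classified samples much more than that of poorly classified ones, so hard samples, which low-frequency criminal names tend to produce, carry relatively more weight during training.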

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910396510.8A CN110119449B (en) 2019-05-14 2019-05-14 Criminal case criminal name prediction method based on sequence-enhanced capsule network

Publications (2)

Publication Number Publication Date
CN110119449A CN110119449A (en) 2019-08-13
CN110119449B true CN110119449B (en) 2020-12-25

Family

ID=67522206

Country Status (1)

Country Link
CN (1) CN110119449B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111179971A (en) * 2019-12-03 2020-05-19 杭州网易云音乐科技有限公司 Nondestructive audio detection method and device, electronic equipment and storage medium
CN111260114B (en) * 2020-01-08 2022-06-17 昆明理工大学 Low-frequency confusable criminal name prediction method for integrating case auxiliary sentence
CN113111895A (en) * 2020-02-13 2021-07-13 北京明亿科技有限公司 Support vector machine-based alarm handling and warning condition category determination method and device
CN111985680B (en) * 2020-07-10 2022-06-14 昆明理工大学 Criminal multi-criminal name prediction method based on capsule network and time sequence
CN111881654B (en) * 2020-08-01 2023-07-18 牡丹江师范学院 Criminal investigation test data amplification method based on multi-objective optimization
CN112101559B (en) * 2020-09-04 2023-08-04 中国航天科工集团第二研究院 Case crime name deducing method based on machine learning
CN112231477B (en) * 2020-10-20 2023-09-22 淮阴工学院 Text classification method based on improved capsule network
CN112256916B (en) * 2020-11-12 2021-06-18 中国计量大学 Short video click rate prediction method based on graph capsule network
CN113033174B (en) * 2021-03-23 2022-06-10 哈尔滨工业大学 Case classification method and device based on output type similar door and storage medium
CN114781389B (en) * 2022-03-04 2024-04-05 重庆大学 Crime name prediction method and system based on label enhancement representation

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109241285A (en) * 2018-08-29 2019-01-18 东南大学 A kind of device of the judicial decision in a case of auxiliary based on machine learning
CN109344839A (en) * 2018-08-07 2019-02-15 北京市商汤科技开发有限公司 Image processing method and device, electronic equipment, storage medium, program product
CN109410575A (en) * 2018-10-29 2019-03-01 北京航空航天大学 A kind of road network trend prediction method based on capsule network and the long Memory Neural Networks in short-term of nested type
CN109740148A (en) * 2018-12-16 2019-05-10 北京工业大学 A kind of text emotion analysis method of BiLSTM combination Attention mechanism
CN110097096A (en) * 2019-04-16 2019-08-06 天津大学 A kind of file classification method based on TF-IDF matrix and capsule network

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10482619B2 (en) * 2017-07-27 2019-11-19 AI Incorporated Method and apparatus for combining data to construct a floor plan

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Study on Text Classification using Capsule Networks;Rahul Katarya 等;《2019 5th International Conference on Advanced Computing & Communication Systems(ICACCS)》;20190316;第501-505页 *
基于自注意力与动态路由的文本建模方法;沈炜域;《软件导刊》;20190115(第1期);第56-60、64页 *
时间序列数据的胶囊式LSTM特征提取算法研究;郑毅;《中国优秀硕士学位论文全文数据库 信息科技辑》;20190115(第1期);第A002-1265页 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant