CN113032544B - Case automatic processing method and device based on big data and terminal equipment - Google Patents

Case automatic processing method and device based on big data and terminal equipment Download PDF

Info

Publication number
CN113032544B
CN113032544B (application CN202110542723.4A)
Authority
CN
China
Prior art keywords
case
processing method
model
historical
similarity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110542723.4A
Other languages
Chinese (zh)
Other versions
CN113032544A (en)
Inventor
周金明
陈贵龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Inspector Intelligent Technology Co Ltd
Original Assignee
Nanjing Inspector Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Inspector Intelligent Technology Co Ltd filed Critical Nanjing Inspector Intelligent Technology Co Ltd
Priority to CN202110542723.4A priority Critical patent/CN113032544B/en
Publication of CN113032544A publication Critical patent/CN113032544A/en
Application granted granted Critical
Publication of CN113032544B publication Critical patent/CN113032544B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3322Query formulation using system suggestions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • G06F16/3344Query execution using natural language analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/237Lexical tools
    • G06F40/247Thesauruses; Synonyms

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a big-data-based automatic case processing method, device, and terminal equipment. The method comprises: step 1, acquiring all processed historical cases as the historical cases to be matched, and computing a central-idea vector for each case; step 2, performing coarse-ranking matching of a new case against the processed historical cases; and step 3, after the coarse-ranking result is obtained, computing fine-ranking similarity with a text similarity matching algorithm and intelligently matching a processing result for the new case. By coarse-ranking the new case and then computing fine-ranking similarity with a text similarity matching algorithm, the processing result of the new case is obtained automatically, which greatly improves case-processing efficiency and saves substantial manpower and material resources.

Description

Case automatic processing method and device based on big data and terminal equipment
Technical Field
The invention relates to the field of big data case processing and natural language processing research, in particular to a case automatic processing method and device based on big data and terminal equipment.
Background
Most current case processing is traditional manual processing: problems are solved by hand. However, because China's population base is large, social problems are complex, the total number of cases is high, and the fields involved are varied, staff must provide solutions based on their own knowledge, professional accumulation, and work experience, which is time-consuming and labor-intensive. A staff member must manually judge the approximate type of a case from its text and determine a corresponding solution strategy; an intelligent method for processing cases automatically is lacking.
Disclosure of Invention
To overcome the defects of the prior art, the invention provides a big-data-based automatic case processing method, device, and terminal equipment, in which the processing result of a new case is obtained intelligently by coarse-ranking matching of the new case and then computing fine-ranking similarity with a text similarity matching algorithm, greatly improving case-processing efficiency and saving substantial manpower and material resources. The technical scheme is as follows:
in a first aspect, a big-data-based automatic case processing method is provided, comprising the following steps:
step 1, acquiring all processed historical cases as the historical cases to be matched, wherein each historical case comprises its case description and its processing result; extracting several keywords for each case from its case description and processing result, computing a word vector for each keyword with a Chinese BERT model, and averaging the keyword word vectors to obtain the central-idea vector of the case.
Step 2, performing coarse-ranking matching of the new case against the processed historical cases;
for a new case, several keywords are first selected from its case description, and their synonyms are added to form a search-term set W = {w_1, w_2, …, w_n}, where n is the number of search terms. The word vector of each search term is computed with the Chinese BERT model. Each search-term word vector and each central-idea vector is then normalized, i.e., divided by its norm, so that every normalized vector has length 1. Denote by A_i the normalized word vector of the new case's search term w_i, and by B the normalized central-idea vector of a given historical case.
Calculate the coarse-ranking similarity between the new case and each historical case: the coarse-ranking similarity is the average inner product of the new case's normalized search-term word vectors with the historical case's normalized central-idea vector, i.e., the coarse-ranking similarity C is:
C = (1/n) · Σ_{i=1}^{n} (A_i · B)
and acquire the historical cases whose coarse-ranking similarity exceeds a given threshold, then select the top N of them by coarse-ranking similarity as the coarse-ranking result.
Step 3, after the coarse-ranking result is obtained, compute fine-ranking similarity with a text similarity matching algorithm and intelligently match a processing result for the new case;
construct a case description-case description matching-degree model and a case description-processing method matching-degree model, and train both; the two models share the same structure, a BERT encoder followed by a binary-classification head.
Training a case description-case description matching degree model:
for any two historical cases, if their case descriptions describe the same fact, the pair is labeled matched; otherwise it is labeled unmatched. Training samples are obtained in this way.
The training process is as follows: take the two historical cases as text 1 and text 2, convert each word of the two texts into a word vector, and input them into a BERT model. The vector output at the first position ([CLS]) of the last BERT layer is fed into a linear binary classifier to obtain a matching score in the range 0-1. The pair is considered matched when the matching score is ≥ α, with α ∈ [0.5, 0.6]; otherwise it is considered unmatched. Training the parameters on the training samples yields the case description-case description matching-degree model, Model1.
Training a case description-processing method matching degree model:
for the historical cases, the case description and processing method of the same case are taken as a matched pair; otherwise the pair is considered unmatched. Training samples are obtained in this way.
The training process is as follows: take a case description and a processing method as text 1 and text 2, convert each word of the two texts into a word vector, and input them into a BERT model. The vector output at the first position ([CLS]) of the last BERT layer is fed into a linear binary classifier to obtain a matching score in the range 0-1. The pair is considered matched when the matching score is ≥ β, with β ∈ [0.6, 0.7]; otherwise it is considered unmatched. Training the parameters on the training samples yields the case description-processing method matching-degree model, Model2.
After Model1 and Model2 are trained, for a new case, the matching degree with each historical case in the coarse-ranking result is computed in turn: for a historical case H, concatenate the new case's description with H's description and input the pair into Model1 to obtain matching score S1, and concatenate the new case's description with H's processing method and input the pair into Model2 to obtain matching score S2. The fine-ranking similarity S between H and the new case is:
S = X1 · S1 + X2 · S2
where X1 and X2 are the weights of matching scores S1 and S2, respectively. Compute the fine-ranking similarity between the new case and each historical case in the coarse-ranking result in turn, select the historical case with the largest fine-ranking similarity, and take that case's processing method as the processing result of the new case.
Preferably, the method further comprises: when the processed historical cases are acquired in step 1 or a new case is acquired in step 2, if the case is in text form, the text is taken directly as the case description; if the case is in PDF or picture form, the text is first obtained through image recognition and then taken as the case description.
Preferably, during the acquisition of training samples for Model2, the method further includes: if the processing method of another case is also applicable to the present case, the present case's description and that processing method are also labeled matched.
Preferably, the method further comprises: during training of the two similarity models, augmented samples are added, namely: after each word in the text is converted into a word vector, a subset of the word vectors is chosen at random, one or more randomly selected dimensions of each chosen vector have a tiny value added or subtracted, and the perturbed vectors are then input into the similarity model for training.
Preferably, the method further comprises: replacing "select the historical case with the largest fine-ranking similarity and take its processing method as the processing result of the new case" with: select several historical cases with high fine-ranking similarity, and synthesize their processing methods to obtain the processing result of the new case.
Preferably, selecting the historical case with the largest fine-ranking similarity and taking its processing method as the processing result of the new case specifically comprises: if the largest fine-ranking similarity is greater than a threshold, directly select that historical case's processing method as the result; otherwise, generate a processing strategy with a model.
Further, generating a processing strategy with a model specifically comprises: construct a seq2seq model in which a BERT model is chosen as the encoder module and a BERT model is chosen as the decoder module. The input is a case's description and the output is the corresponding processing method; training the model on historical cases and their processing methods yields Model3, and inputting the new case's description into Model3 produces a generated processing method.
Further, the reasonableness of the processing method generated by Model3 is judged: the matching degree between the new case's description and the generated processing method is computed with the matching-degree model Model2; if the matching degree is greater than a set threshold, the generated method is considered suitable and can be used directly; otherwise it is adjusted manually.
Compared with the prior art, the technical scheme has the following beneficial effects: by coarse-ranking the new case and then computing fine-ranking similarity with a text similarity matching algorithm, the processing result of the new case is obtained intelligently, and an intelligent automatic processing strategy is provided that staff can adopt directly as a reference, greatly improving case-processing efficiency and saving substantial manpower and material resources.
Drawings
Fig. 1 is a diagram of a matching degree model structure according to an embodiment of the present disclosure.
Detailed Description
To clarify the technical solution and working principle of the invention, embodiments of the present disclosure are described in further detail below with reference to the accompanying drawings. All of the optional technical solutions above may be combined arbitrarily to form optional embodiments of the present disclosure, which are not repeated here.
The terms "step 1," "step 2," "step 3," and the like in the description and claims of this application and the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It should be understood that the data so used may be interchanged under appropriate circumstances such that the embodiments of the application described herein may, for example, be implemented in an order other than those described herein.
In a first aspect: the embodiment of the disclosure provides a case automatic processing method based on big data, which comprises the following steps:
Fig. 1 shows the structure of the matching-degree model provided in an embodiment of the present disclosure. With reference to the figure, the method mainly includes the following steps:
step 1, acquire all processed historical cases as the historical cases to be matched, wherein each historical case comprises its case description and its processing result; extract several keywords (e.g., 3 keywords) for each case from its case description and processing result, compute a word vector for each keyword with a Chinese BERT model, and average the keyword word vectors to obtain the central-idea vector of the case.
In practice, acquired cases are often not in text format, e.g., they arrive in PDF or picture form; obtaining the text of such attachments through image recognition achieves fast processing.
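Step 1 above can be sketched minimally as follows; this is a sketch under the assumption that keywords have already been extracted, and random stand-in vectors are used in place of the Chinese BERT keyword embeddings:

```python
import numpy as np

def central_idea_vector(keyword_vectors):
    # Average the keyword word vectors to obtain the case's central-idea vector.
    return np.mean(np.stack(keyword_vectors), axis=0)

# Stand-in for Chinese-BERT word vectors of e.g. 3 extracted keywords
# (BERT-base word vectors are 768-dimensional).
rng = np.random.default_rng(0)
kw_vecs = [rng.normal(size=768) for _ in range(3)]
center = central_idea_vector(kw_vecs)
```

In practice each keyword vector would come from the Chinese BERT model rather than a random generator; the averaging step is unchanged.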
Step 2, perform coarse-ranking matching of the new case against the processed historical cases.
For a new case, firstly, a plurality of keywords are selected from the case description, synonyms similar to the plurality of keywords are added to construct a search term set W { W }1,w2,……,wnAnd (6) calculating to obtain a word vector of each search word through a Chinese BERT model.
Historical cases that are substantially the same as the new case are obtained first. For each historical case, the coarse-ranking similarity with the new case is computed: first normalize the search-term word vectors and the central-idea vector, i.e., divide each vector by its norm so that the normalized vector has length 1. Denote by A_i the normalized word vector of the new case's search term w_i, and by B the normalized central-idea vector of the historical case. The coarse-ranking similarity C is the average inner product of the new case's normalized search-term word vectors with the historical case's normalized central-idea vector, namely
C = (1/n) · Σ_{i=1}^{n} (A_i · B)
Normalization ensures comparability. Note that the coarse-ranking similarity involves only simple vector inner products, with no complicated natural-language processing or model inference, so the computation parallelizes easily. Acquire the historical cases whose coarse-ranking similarity exceeds a given threshold, and select the N highest-scoring ones, from high to low, as the coarse-ranking result.
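Because only inner products are involved, the whole coarse-ranking step collapses into one matrix product. A sketch under illustrative choices of threshold and N (the vectors would in practice be Chinese BERT embeddings):

```python
import numpy as np

def coarse_rank(search_vecs, center_vecs, threshold=0.3, top_n=10):
    """Coarse ranking: normalize, average inner product, threshold, top-N.

    search_vecs: (n, d) word vectors of the new case's n search terms.
    center_vecs: (m, d) central-idea vectors of the m historical cases.
    Returns (indices of qualifying cases sorted high-to-low, all similarities).
    """
    A = search_vecs / np.linalg.norm(search_vecs, axis=1, keepdims=True)
    B = center_vecs / np.linalg.norm(center_vecs, axis=1, keepdims=True)
    # C_j = (1/n) * sum_i A_i . B_j  -- one matrix product covers all cases.
    sims = (A @ B.T).mean(axis=0)          # (m,) coarse similarity per case
    idx = np.argsort(-sims)                # high to low
    idx = idx[sims[idx] > threshold]       # keep only those above the threshold
    return idx[:top_n], sims
```

The `threshold=0.3` and `top_n=10` defaults are assumptions for illustration; the patent leaves both values open.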
Preferably, step 2 further includes: when a new case is acquired, if it is input directly as text, the text is taken as the case description; if it is in PDF or picture form, the text is first obtained through image recognition.
Step 3, after the coarse-ranking result is obtained, compute fine-ranking similarity with a more accurate text similarity matching algorithm and intelligently match a processing result for the new case;
two models were trained: the case description-case description matching degree model and the case description-processing method matching degree model. The two matching degree models have the same structure, and as shown in fig. 1, the matching degree model has a structure of BERT + two-class framework.
(1) Training case description-case description matching degree model:
for any two historical cases, if their case descriptions describe the same fact, the pair is labeled matched; otherwise it is labeled unmatched. Training samples are obtained in this way. The training process of the matching-degree model is: take the two historical cases as text 1 and text 2, convert each word of the two texts into a word vector, and input them into a BERT model. The vector output at the first position ([CLS]) of the last BERT layer is fed into a linear binary classifier to obtain a matching score in the range 0-1; the pair is considered matched when the score is ≥ 0.5, otherwise unmatched. Training the parameters on the training samples yields the case description-case description matching-degree model, Model1.
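A sketch of the scoring head only: the BERT encoder itself and the learning of the head's parameters are assumed, so a random stand-in is used for the [CLS] output. The sigmoid of a linear map gives the 0-1 matching score described above:

```python
import numpy as np

def match_score(cls_vec, w, b):
    # Linear binary head on the [CLS] vector; sigmoid yields a score in (0, 1).
    return 1.0 / (1.0 + np.exp(-(cls_vec @ w + b)))

def is_match(score, alpha=0.5):
    # Model1 declares a match when the score >= alpha, with alpha in [0.5, 0.6].
    return score >= alpha

rng = np.random.default_rng(1)
cls_vec = rng.normal(size=768)    # stand-in for BERT's last-layer [CLS] output
w = rng.normal(size=768) * 0.01   # head weights (learned in practice)
b = 0.0
s = match_score(cls_vec, w, b)
```

In a real implementation `cls_vec` would come from a fine-tuned Chinese BERT fed the concatenated text pair, and `w`, `b` would be trained jointly with it.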
(2) Training the case description-processing method matching-degree model, Model2:
for the historical cases, the case description and processing method of the same case are taken as a matched pair; otherwise the pair is considered unmatched. Training samples are obtained in this way. Preferably, the sample-acquisition process for Model2 further includes: if the processing method of another case is also applicable to the present case, the present case's description and that method are also labeled matched. The training process is: take a case description and a processing method as text 1 and text 2, convert each word of the two texts into a word vector, and input them into a BERT model. The vector output at the first position ([CLS]) of the last BERT layer is fed into a linear binary classifier to obtain a matching score in the range 0-1; the pair is considered matched when the score is ≥ 0.6, otherwise unmatched. Training the parameters on the training samples yields the case description-processing method matching-degree model, Model2. A higher matching-score threshold is set for Model2 because the match to a processing method must be stricter, making the matched method more accurate and feasible.
Preferably, during training of the two similarity models, augmented samples are added, namely: after each word in the text is converted into a word vector, a subset of the word vectors is chosen at random, and each chosen vector has a tiny value (e.g., 0.0000000001) added to or subtracted from a randomly selected dimension before the vectors are input into the similarity model for training. Adding small perturbations to the word vectors raises the training difficulty, so that the model still learns the central idea of a text even in the presence of a few typos or synonym substitutions, improving the model's robustness.
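The augmentation above can be sketched as follows; `frac`, the fraction of word vectors to perturb, is an illustrative choice not fixed by the text:

```python
import numpy as np

def augment(word_vecs, frac=0.3, eps=1e-10, rng=None):
    """Perturb a random subset of word vectors: each picked vector gets a tiny
    value (eps) added or subtracted on one randomly chosen dimension.
    Labels are unchanged; word_vecs is an (n_words, dim) array.
    """
    rng = rng or np.random.default_rng()
    out = word_vecs.copy()
    picked = rng.choice(len(out), size=max(1, int(frac * len(out))), replace=False)
    for i in picked:
        dim = rng.integers(out.shape[1])
        out[i, dim] += eps * rng.choice([-1.0, 1.0])
    return out
```

The perturbed array is then fed to the similarity model in place of (or alongside) the clean one during training.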
After Model1 and Model2 are trained, for the new case, the matching degree with each historical case in the coarse-ranking result is computed in turn: for a historical case H, concatenate the new case's description with H's description and input the pair into Model1 to obtain matching score S1, and concatenate the new case's description with H's processing method and input the pair into Model2 to obtain matching score S2. The fine-ranking similarity S between H and the new case is:
S = X1 · S1 + X2 · S2
where X1 and X2 are the weights of matching scores S1 and S2, respectively: X1 is set larger when the similarity between case descriptions matters more for the fine-ranking similarity, and X2 is set larger when the match between the case description and the processing method matters more.
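The weighted combination and the selection of the best candidate can be sketched as below; equal weights are an illustrative default, and `fine_similarity` / `best_case` are hypothetical helper names:

```python
def fine_similarity(s1, s2, x1=0.5, x2=0.5):
    # S = X1*S1 + X2*S2; set x1 larger when description-description similarity
    # matters more, x2 larger when the description-method match matters more.
    return x1 * s1 + x2 * s2

def best_case(candidates, x1=0.5, x2=0.5):
    # candidates: list of (case_id, S1, S2) pairs scored by Model1 and Model2.
    # Returns (case_id, S) for the candidate with the largest fine similarity.
    scored = [(cid, fine_similarity(s1, s2, x1, x2)) for cid, s1, s2 in candidates]
    return max(scored, key=lambda t: t[1])
```

The processing method of the returned `case_id` would then be adopted as the new case's processing result.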
Compute the fine-ranking similarity between the new case and each historical case in the coarse-ranking result in turn, select the historical case with the largest fine-ranking similarity, and take that case's processing method as the processing result of the new case.
Preferably, selecting the historical case with the largest fine-ranking similarity and taking its processing method as the processing result of the new case specifically comprises: if the largest fine-ranking similarity is greater than a threshold, directly select that historical case's processing method as the result; otherwise, generate a processing strategy with a model.
For a new case without a usable reference, a processing strategy is generated with a model: when no historical case has a fine-ranking similarity greater than the threshold (i.e., no historical case can serve as a reference), a suitable processing method is generated by a model.
Further, generating a processing strategy with a model specifically comprises: construct a seq2seq model in which a BERT model is chosen as the encoder module and a BERT model is chosen as the decoder module. The input is a case's description and the output is the corresponding processing method; training the model on historical cases and their processing methods yields Model3, and inputting the new case's description into Model3 produces a generated processing method.
Further, the reasonableness of the processing method generated by Model3 is judged: the matching degree between the new case's description and the generated processing method is computed with the matching-degree model Model2; if the matching degree is greater than a set threshold, the generated method is considered suitable and can be used directly; otherwise it is adjusted manually.
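The reasonableness check reduces to a small guard; in this sketch the threshold value and the `model2_score_fn` callable (a stand-in for Model2 inference) are assumptions:

```python
def accept_generated(description, method, model2_score_fn, threshold=0.6):
    # Keep Model3's generated method only if Model2 scores the
    # (description, method) pair above the threshold; otherwise flag it
    # for manual adjustment.
    score = model2_score_fn(description, method)
    return ("use", score) if score > threshold else ("manual", score)
```

In practice `model2_score_fn` would concatenate the pair, run the fine-tuned BERT classifier, and return its 0-1 matching score.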
In a second aspect, embodiments of the present disclosure provide a big-data-based automatic case processing apparatus that can, based on the same technical concept, implement or execute the big-data-based automatic case processing method of any of the possible implementations above.
Preferably, the apparatus comprises an acquisition unit, a coarse-ranking unit, and a fine-ranking unit;
the acquiring unit is configured to execute the step 1 of the case automatic processing method based on big data according to any one of all possible implementation manners.
The coarse arrangement unit is configured to execute the step 2 of the automatic case processing method based on big data according to any one of all possible implementation manners.
The fine ranking unit is used for executing the step 3 of the case automatic processing method based on big data in any one of all possible implementation modes.
It should be noted that when the big-data-based automatic case processing method of the above embodiments is implemented, the division into the functional modules above is only an example; in practical applications, the functions may be allocated to different functional modules as needed, i.e., the internal structure of the apparatus may be divided into different functional modules to complete all or part of the functions described above. In addition, the apparatus embodiment and the method embodiment provided above belong to the same concept; the specific implementation process is detailed in the method embodiment and is not repeated here.
In a third aspect, an embodiment of the present disclosure provides a terminal device that includes the big-data-based automatic case processing apparatus of any of the possible implementations above.
The invention has been described above by way of example with reference to the accompanying drawings. It should be understood that the invention is not limited to the specific embodiments described above; numerous insubstantial modifications made in accordance with the principles and solutions of the invention, or direct applications of its conception and technical scheme to other occasions without improvement, fall within the protection scope of the invention.

Claims (10)

1. A big-data-based automatic case processing method, characterized by comprising the following steps:
step 1, acquiring all processed historical cases as the historical cases to be matched, wherein each historical case comprises its case description and processing result; extracting several keywords for each case from its case description and processing result, computing a word vector for each keyword with a Chinese BERT model, and averaging the keyword word vectors to obtain the central-idea vector of the case;
step 2, performing coarse-ranking matching of the new case against the processed historical cases;
for a new case, first selecting several keywords from its case description and adding their synonyms to form a search-term set W = {w_1, w_2, …, w_n}, where n is the number of search terms; computing the word vector of each search term with the Chinese BERT model; normalizing each search-term word vector and each central-idea vector, i.e., dividing each vector by its norm so that the normalized vector has length 1; and denoting by A_i the normalized word vector of the new case's search term w_i and by B the normalized central-idea vector of a given historical case;
calculating the coarse-ranking similarity between the new case and each historical case, wherein the coarse-ranking similarity is the average inner product of the new case's normalized search-term word vectors with the historical case's normalized central-idea vector, i.e., the coarse-ranking similarity C is:
C = (1/n) · Σ_{i=1}^{n} (A_i · B)
acquiring the historical cases whose coarse-ranking similarity exceeds a given threshold, and selecting the top N of them by coarse-ranking similarity as the coarse-ranking result;
step 3, after the coarse-ranking result is obtained, computing fine-ranking similarity with a text similarity matching algorithm and intelligently matching a processing result for the new case;
constructing a case description-case description matching-degree model and a case description-processing method matching-degree model and training both, wherein the two models share the same structure, a BERT encoder followed by a binary-classification head;
training a case description-case description matching degree model:
for any two historical cases, labeling the pair matched if their case descriptions describe the same fact and unmatched otherwise, thereby obtaining training samples;
the training process being as follows: taking the two historical cases as text 1 and text 2, converting each word of the two texts into a word vector, inputting them into a BERT model, feeding the vector output at the first position ([CLS]) of the last BERT layer into a linear binary classifier to obtain a matching score in the range 0-1, considering the pair matched when the matching score is ≥ α, with α ∈ [0.5, 0.6], and otherwise unmatched; and training the parameters on the training samples to obtain the case description-case description matching-degree model, Model1;
training the case description-processing method matching degree model:
for any historical case, its case description and its own processing method are regarded as matched, while the case description of one case paired with the processing method of a different case is regarded as unmatched, thereby obtaining training samples;
the training process is as follows: a case description and a processing method are taken as text 1 and text 2 respectively, each word of the two texts is converted into a word vector and input into the BERT model; the vector output at the first [CLS] position of the last BERT layer is input into a linear binary classifier to obtain a matching score in the range 0-1; when the matching score is greater than or equal to β, β ∈ [0.6, 0.7], the pair is regarded as matched, otherwise as unmatched; the model parameters are trained on the training samples to obtain the case description-processing method matching degree model Model2;
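The thresholded matching decision shared by the two models can be sketched as follows. `score_fn` stands in for the BERT-plus-linear-classifier pipeline described above, which maps a text pair to a score in [0, 1]; the function names and the Jaccard stand-in scorer are illustrative, not from the patent:

```python
def is_match(score_fn, text1, text2, threshold):
    """Matched when the model's score reaches the threshold
    (alpha in [0.5, 0.6] for Model1, beta in [0.6, 0.7] for Model2)."""
    return score_fn(text1, text2) >= threshold

def toy_score(text1, text2):
    """Stand-in scorer: Jaccard word overlap, just to make the sketch runnable."""
    w1, w2 = set(text1.split()), set(text2.split())
    return len(w1 & w2) / len(w1 | w2) if (w1 | w2) else 0.0
```

In production, `score_fn` would run the concatenated pair through BERT and read the classifier's probability; only the threshold rule differs between Model1 and Model2.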
after Model1 and Model2 are obtained through training, for a new case, the matching degree between the new case and each historical case in the coarse-ranking result is calculated in turn: for a historical case H, the case description of the new case is concatenated with the case description of H and input into Model1 to obtain a matching score S1, and the case description of the new case is concatenated with the processing method of H and input into Model2 to obtain a matching score S2; the fine-ranking similarity S between H and the new case is then:
S = X1 × S1 + X2 × S2
wherein X1 and X2 are the weights of the matching score S1 and the matching score S2 respectively;
the fine-ranking similarity between the new case and each historical case in the coarse-ranking result is calculated in turn; the historical case with the largest fine-ranking similarity is selected, and the processing method of that historical case is taken as the processing result of the new case.
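The weighted combination S = X1·S1 + X2·S2 and the selection of the best historical case can be sketched as below; `model1` and `model2` are stand-in callables for the trained matching models, and the default weights of 0.5 are illustrative since the claim leaves X1 and X2 unspecified:

```python
def fine_similarity(s1, s2, x1=0.5, x2=0.5):
    """Fine-ranking similarity S = X1*S1 + X2*S2."""
    return x1 * s1 + x2 * s2

def pick_processing_method(new_case, coarse_result, model1, model2, x1=0.5, x2=0.5):
    """Score every historical case in the coarse-ranking result and return the
    processing method of the one with the largest fine-ranking similarity."""
    best_case, best_s = None, float("-inf")
    for hist in coarse_result:
        s1 = model1(new_case["description"], hist["description"])  # description vs description
        s2 = model2(new_case["description"], hist["method"])       # description vs method
        s = fine_similarity(s1, s2, x1, x2)
        if s > best_s:
            best_case, best_s = hist, s
    return best_case["method"], best_s
```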
2. The automatic case processing method based on big data as claimed in claim 1, further comprising: when a processed historical case is acquired in step 1 or a new case is acquired in step 2, if the case is submitted as text, the text is used directly as the case description; if the case is submitted as a PDF or an image, the case description is extracted through image recognition.
3. The method as claimed in claim 1, further comprising, during acquisition of the training samples for Model2: if the processing method of another case is also applicable to the present case, the case description of the present case and that processing method are likewise regarded as matched.
4. The automatic case processing method based on big data as claimed in claim 1, characterized in that augmented samples are added during training of the two matching degree models, namely: after each word in a text is converted into a word vector, a part of the word vectors is selected at random, one or more dimensions of each selected vector are perturbed by adding or subtracting a tiny value, and the perturbed vectors are then input into the model for training.
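The augmentation in claim 4 can be sketched as below. The fraction of vectors to perturb and the size of the perturbation are illustrative choices; the claim only says "a part" of the vectors and "a minimum value":

```python
import random

def augment_word_vectors(vectors, pick_ratio=0.2, epsilon=1e-3, rng=None):
    """Randomly select a fraction of the word vectors and perturb one or more
    of their dimensions by +/- a tiny value; returns new vectors, leaving
    the originals untouched."""
    rng = rng or random.Random()
    augmented = [list(v) for v in vectors]  # deep-enough copy of the list of vectors
    n_pick = max(1, int(len(augmented) * pick_ratio))
    for idx in rng.sample(range(len(augmented)), n_pick):
        dims = rng.sample(range(len(augmented[idx])),
                          rng.randint(1, len(augmented[idx])))
        for d in dims:
            augmented[idx][d] += rng.choice((-epsilon, epsilon))
    return augmented
```

Because ε is tiny relative to typical embedding magnitudes, the augmented sample keeps the original label while slightly widening the training distribution.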
5. The automatic case processing method based on big data according to claim 1, characterized in that selecting the historical case with the largest fine-ranking similarity and taking its processing method as the processing result of the new case is replaced with: selecting several historical cases with the highest fine-ranking similarity and synthesizing their processing methods to obtain the processing result of the new case.
6. The automatic case processing method based on big data according to any one of claims 1-5, characterized in that selecting the historical case with the largest fine-ranking similarity and taking its processing method as the processing result of the new case specifically comprises: if the largest fine-ranking similarity is greater than a threshold, the processing method of that historical case is selected directly as the result; otherwise, a processing strategy is generated by a model.
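The retrieve-or-generate dispatch in claim 6 reduces to a single comparison. The threshold value, function names, and the generator callable below are illustrative stand-ins (the generator corresponds to Model3 of claim 7):

```python
def choose_or_generate(best_method, best_similarity, generate_fn, threshold=0.8):
    """Reuse the best historical processing method when the largest
    fine-ranking similarity clears the threshold; otherwise fall back
    to a generative model. Returns the method and its provenance."""
    if best_similarity > threshold:
        return best_method, "retrieved"
    return generate_fn(), "generated"
```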
7. The automatic case processing method based on big data as claimed in claim 6, characterized in that generating the processing strategy by a model specifically comprises: constructing a seq2seq model in which a BERT model serves as the encoder and another BERT model serves as the decoder; the input is the case description of a case and the output is the corresponding processing method; the model is trained on historical cases and their corresponding processing methods to obtain Model3, and the case description of the new case is input into Model3 to obtain a generated processing method.
8. The method according to claim 7, further comprising judging the reasonableness of the processing method generated by Model3: the matching degree between the case description of the new case and the processing method generated by Model3 is calculated with the matching degree model Model2; if the matching degree is greater than a set threshold, the generated processing method is considered suitable and can be used directly; otherwise, it is adjusted manually.
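The reasonableness check in claim 8 can be sketched as below; `model2_score_fn` stands in for the trained Model2, and the names and threshold value are illustrative:

```python
def vet_generated_method(new_description, generated_method, model2_score_fn,
                         threshold=0.7):
    """Score the (case description, generated method) pair with Model2;
    accept the generated method above the threshold, otherwise flag it
    for manual adjustment."""
    score = model2_score_fn(new_description, generated_method)
    return ("accept", score) if score > threshold else ("manual_review", score)
```

This reuses the retrieval-time matching model as a cheap quality gate on the generator's output, so no extra model needs to be trained for the check.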
9. An automatic case processing device based on big data, characterized in that the device implements the automatic case processing method based on big data as claimed in any one of claims 1-8.
10. A terminal device, characterized in that the terminal device comprises the automatic case processing device based on big data according to claim 9.
CN202110542723.4A 2021-05-19 2021-05-19 Case automatic processing method and device based on big data and terminal equipment Active CN113032544B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110542723.4A CN113032544B (en) 2021-05-19 2021-05-19 Case automatic processing method and device based on big data and terminal equipment


Publications (2)

Publication Number Publication Date
CN113032544A CN113032544A (en) 2021-06-25
CN113032544B true CN113032544B (en) 2021-08-20

Family

ID=76455561

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110542723.4A Active CN113032544B (en) 2021-05-19 2021-05-19 Case automatic processing method and device based on big data and terminal equipment

Country Status (1)

Country Link
CN (1) CN113032544B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115188013B (en) * 2022-09-14 2023-06-30 泰豪信息技术有限公司 Risk prevention and control method, system, storage medium and equipment for decision book
CN115630834B (en) * 2022-12-21 2023-03-28 北京时代凌宇数字技术有限公司 Case dispatching method and device, electronic equipment and computer-readable storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110825879A (en) * 2019-09-18 2020-02-21 平安科技(深圳)有限公司 Case decision result determination method, device and equipment and computer readable storage medium
CN111144068A (en) * 2019-11-26 2020-05-12 方正璞华软件(武汉)股份有限公司 Similar arbitration case recommendation method and device
CN111259951A (en) * 2020-01-13 2020-06-09 北京明略软件***有限公司 Case detection method and device, electronic equipment and readable storage medium


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Legal Feature Enhanced Semantic Matching Network for Similar Case Matching; Zhilong Hong et al.; 2020 International Joint Conference on Neural Networks (IJCNN); 2020-09-28; pp. 1-8 *
A semi-supervised learning method for identifying documents of cases involving minors; Yang Shenghao et al.; Journal of South China University of Technology (Natural Science Edition); 2021-01-31; Vol. 49, No. 1; pp. 29-46 *


Similar Documents

Publication Publication Date Title
CN108595696A (en) A kind of human-computer interaction intelligent answering method and system based on cloud platform
CN110765246B (en) Question and answer method and device based on intelligent robot, storage medium and intelligent device
CN110781276A (en) Text extraction method, device, equipment and storage medium
CN112464662B (en) Medical phrase matching method, device, equipment and storage medium
CN113032544B (en) Case automatic processing method and device based on big data and terminal equipment
KR20200007969A (en) Information processing methods, terminals, and computer storage media
CN110188195B (en) Text intention recognition method, device and equipment based on deep learning
CN107239564B (en) Text label recommendation method based on supervision topic model
CN110909224B (en) Sensitive data automatic classification and identification method and system based on artificial intelligence
CN112418320B (en) Enterprise association relation identification method, device and storage medium
CN111368096A (en) Knowledge graph-based information analysis method, device, equipment and storage medium
CN107291775A (en) The reparation language material generation method and device of error sample
CN117131449A (en) Data management-oriented anomaly identification method and system with propagation learning capability
CN106250366B (en) A kind of data processing method and system for question answering system
CN113065352B (en) Method for identifying operation content of power grid dispatching work text
CN110825852B (en) Long text-oriented semantic matching method and system
CN116070642A (en) Text emotion analysis method and related device based on expression embedding
CN110413750A (en) The method and apparatus for recalling standard question sentence according to user's question sentence
CN114385876A (en) Model search space generation method, device and system
CN114328903A (en) Text clustering-based customer service log backflow method and device
CN115618092A (en) Information recommendation method and information recommendation system
CN111708862A (en) Text matching method and device and electronic equipment
CN117033464B (en) Log parallel analysis algorithm based on clustering and application
CN114942980B (en) Method and device for determining text matching
CN116910377B (en) Grid event classified search recommendation method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: An automatic case processing method, device and terminal equipment based on big data

Effective date of registration: 20220705

Granted publication date: 20210820

Pledgee: China Construction Bank Corporation Nanjing Jianye sub branch

Pledgor: Nanjing inspector Intelligent Technology Co.,Ltd.

Registration number: Y2022980009897

PC01 Cancellation of the registration of the contract for pledge of patent right

Date of cancellation: 20230720

Granted publication date: 20210820

Pledgee: China Construction Bank Corporation Nanjing Jianye sub branch

Pledgor: Nanjing inspector Intelligent Technology Co.,Ltd.

Registration number: Y2022980009897

PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: A method, device, and terminal device for automatic case processing based on big data

Effective date of registration: 20230803

Granted publication date: 20210820

Pledgee: China Construction Bank Corporation Nanjing Jianye sub branch

Pledgor: Nanjing inspector Intelligent Technology Co.,Ltd.

Registration number: Y2023980050832

PC01 Cancellation of the registration of the contract for pledge of patent right

Granted publication date: 20210820

Pledgee: China Construction Bank Corporation Nanjing Jianye sub branch

Pledgor: Nanjing inspector Intelligent Technology Co.,Ltd.

Registration number: Y2023980050832