CN113792127B - Rule recognition method and device based on big data, electronic equipment and medium - Google Patents

Info

Publication number
CN113792127B
Authority
CN
China
Prior art keywords
case
rule
data
character
character vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111082811.7A
Other languages
Chinese (zh)
Other versions
CN113792127A (en
Inventor
罗斯洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An International Smart City Technology Co Ltd
Original Assignee
Ping An International Smart City Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An International Smart City Technology Co Ltd
Priority to CN202111082811.7A
Publication of CN113792127A
Application granted
Publication of CN113792127B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/126 Character encoding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/18 Legal services

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Business, Economics & Management (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Databases & Information Systems (AREA)
  • Tourism & Hospitality (AREA)
  • Economics (AREA)
  • Technology Law (AREA)
  • Mathematical Physics (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Character Discrimination (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the field of artificial intelligence and discloses a rule recognition method based on big data, comprising: performing position vector coding, feature extraction and structure extraction on the characters in rule data and case data by using a pre-constructed relation recognition model, and recognizing a predicted association relationship between the rule data and the case data; calculating a loss value between the predicted association relationship and the actual association relationship, and adjusting the parameters of the relation recognition model to obtain a standard relation recognition model; identifying a first rule of a to-be-accepted case by using the standard relation recognition model; taking the rule of a history case that matches the to-be-accepted case as a second rule of the to-be-accepted case; and taking the rule that is the same in the first rule and the second rule as the final rule of the to-be-accepted case. The invention also relates to blockchain technology, in which the relevant data may be stored in a blockchain node. The invention further provides a rule recognition device based on big data, an electronic device and a medium. The invention can improve the accuracy and efficiency of rule recognition.

Description

Rule recognition method and device based on big data, electronic equipment and medium
Technical Field
The present invention relates to the field of artificial intelligence, and in particular, to a method and apparatus for identifying regulations based on big data, an electronic device, and a computer readable storage medium.
Background
Grid-based management divides an urban administrative district into unit grids according to a defined standard. Grid-based supervision is characterized by "one post, multiple responsibilities": during supervision, law enforcement personnel must apply different laws and regulations depending on the type of entity, the industry and the business scope involved.
However, much of the supervision workload in law enforcement falls on studying and interpreting the rule base unaided. Because the workload is heavy, the supervised objects are complex and the applicable rules are numerous, supervising law enforcement personnel who must select the rules for legal cases involving different entity types, industries and business scopes can only do so by understanding the rules manually and choosing the appropriate ones, so the efficiency and accuracy of rule recognition are low.
Disclosure of Invention
The invention provides a rule recognition method and device based on big data, an electronic device and a computer-readable storage medium, whose main purpose is to improve the accuracy and efficiency of rule recognition.
In order to achieve the above object, the present invention provides a rule recognition method based on big data, including:
acquiring regulation data and case data, and marking the actual association relation of the regulation data and the case data;
the coding layer in the pre-constructed relation recognition model is utilized to respectively carry out position vector coding on characters in the rule data and the case data, and a rule character vector set and a case character vector set are generated;
respectively carrying out feature extraction and structure extraction on the rule character vector set and the case character vector set by utilizing a feedforward attention mechanism in the relation recognition model to obtain a feature rule character vector set and a feature case character vector set;
performing association relation recognition on the characteristic rule character vector and the characteristic case character vector by utilizing a matching module in the relation recognition model to obtain a predicted association relation of the rule data and the case data;
calculating a loss value of the predicted association relationship and the actual association relationship by using a loss function in the relationship recognition model, and adjusting parameters of the relationship recognition model according to the loss value until the relationship recognition model meets a preset condition, so as to obtain a standard relationship recognition model after training is completed;
acquiring a to-be-accepted case, and identifying a first rule corresponding to the to-be-accepted case by using the standard relation recognition model;
matching the to-be-accepted case with the history cases in the pre-constructed history case library, and taking the rule corresponding to the history cases successfully matched as a second rule of the to-be-accepted case;
and taking the same rule in the first rule and the second rule as the final rule of the to-be-accepted case.
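The final step above, keeping only the rules that appear in both the first rule and the second rule, can be sketched as a simple intersection. This is a minimal illustration, not the patented implementation: the function name `final_rules` and the rule names in the sample lists are hypothetical, and the rules are assumed to be represented as plain strings.

```python
def final_rules(first_rules, second_rules):
    """Return the rules present in both candidate lists, in the order of first_rules."""
    second = set(second_rules)
    return [rule for rule in first_rules if rule in second]


# Hypothetical candidates: the first list stands for the model's output,
# the second for the rules of the matched history cases.
first = ["Food Safety Law", "Trademark Law"]
second = ["Advertising Law", "Food Safety Law"]
print(final_rules(first, second))
```

Requiring agreement between the two sources trades recall for precision: a rule proposed by only one source is discarded.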
Optionally, the performing association relationship recognition on the feature rule character vector and the feature case character vector by using a matching module in the relationship recognition model to obtain a predicted association relationship between the rule data and the case data includes:
acquiring rule upper and lower character semantics corresponding to the characteristic rule character vector and case upper and lower character semantics corresponding to the characteristic case character vector;
splicing the rule upper and lower character semantics and the case upper and lower character semantics respectively by utilizing a hidden Markov algorithm in the matching module to obtain a rule character matrix and a case character matrix;
calculating the association coefficient of the rule character matrix and the case character matrix by using an NLP matching algorithm in the matching module;
if the association coefficient is smaller than or equal to a preset association coefficient, determining that no association relationship exists between the rule data corresponding to the rule character matrix and the case data corresponding to the case character matrix;
if the association coefficient is larger than a preset association coefficient, determining that association relationship exists between the rule data corresponding to the rule character matrix and the case data corresponding to the case character matrix.
Optionally, the feature extraction and the structure extraction are performed on the rule character vector set and the case character vector set by using a feedforward attention mechanism in the pre-constructed relation recognition model, so as to obtain a feature rule character vector and a feature case character vector, which includes:
extracting features of the rule character vectors and the case character vectors by using a hidden Markov module in the feedforward attention mechanism to obtain feature character vectors;
and extracting structural information of the characteristic character vector by using an encoder in the feedforward attention mechanism to obtain a characteristic rule character vector set and a characteristic case character vector set.
Optionally, the coding layer in the pre-constructed relation recognition model is used for respectively performing position vector coding on characters in the rule data and the case data to generate a rule character vector set and a case character vector set, which includes:
performing position index coding on the characters in the rule data and the case data respectively by using the coding layer, to obtain rule character position indexes and case character position indexes;
converting the characters in the rule data and the case data into character vectors respectively by using the coding layer, to obtain initial rule character vectors and initial case character vectors;
combining the rule character position indexes and the initial rule character vectors to generate the rule character vector set;
and combining the case character position indexes and the initial case character vectors to generate the case character vector set.
Optionally, the marking the actual association relationship between the rule data and the case data includes:
and marking the actual association relation between the rule data and the case data by using a preset pointer algorithm.
Optionally, the calculating the loss value of the predicted association relationship and the actual association relationship by using the loss function in the relationship identification model includes:
calculating a loss value of the predicted association and the actual association by using the following loss function:
where $L$ denotes the loss value, $N$ denotes the number of predicted association relationships, $\hat{y}_i$ denotes the i-th predicted association relationship, and $y_i$ denotes the i-th actual association relationship.
Optionally, the matching the to-be-accepted case with the history case in the pre-constructed history case library, and taking the rule corresponding to the history case that is successfully matched as the second rule of the to-be-accepted case includes:
calculating the association degree of the to-be-accepted case and the history case in the pre-constructed history case library;
if the association degree is greater than a preset association degree, determining that the to-be-accepted case is successfully matched with the history case in the pre-constructed history case library, and taking the rule corresponding to the successfully matched history case as the second rule of the to-be-accepted case.
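The history-case matching described above can be sketched as follows. The source does not say how the association degree is computed, so this sketch assumes a Jaccard similarity over keyword sets; the function names, the dictionary layout of a history case, the sample cases and the 0.5 threshold are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two keyword collections (assumed association degree)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def second_rules(pending_keywords, history_cases, preset_degree=0.5):
    """Collect the rules of every history case whose degree exceeds the threshold."""
    matched = []
    for case in history_cases:
        degree = jaccard(pending_keywords, case["keywords"])
        if degree > preset_degree:  # strictly greater, as in the text above
            for rule in case["rules"]:
                if rule not in matched:
                    matched.append(rule)
    return matched


# Hypothetical history case library and to-be-accepted case.
history = [
    {"keywords": ["production", "sales", "sibutramine", "weight loss"],
     "rules": ["Food Safety Law"]},
    {"keywords": ["trademark", "registration"],
     "rules": ["Trademark Law"]},
]
pending = ["sales", "sibutramine", "weight loss", "health food"]
print(second_rules(pending, history))
```

In this example the first history case shares three of five distinct keywords with the pending case, so only its rule is returned.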
In order to solve the above problems, the present invention also provides a regulation recognition device based on big data, the device comprising:
the marking module is used for acquiring the rule data and the case data and marking the actual association relation between the rule data and the case data;
the model training module is used for carrying out position vector coding on characters in the rule data and the case data by utilizing a coding layer in a pre-constructed relation recognition model to generate a rule character vector set and a case character vector set, carrying out feature extraction and structure extraction on the rule character vector set and the case character vector set by utilizing a feedforward attention mechanism in the relation recognition model to obtain a feature rule character vector set and a feature case character vector set, carrying out association relation recognition on the feature rule character vector and the feature case character vector by utilizing a matching module in the relation recognition model to obtain a predicted association relation of the rule data and the case data, calculating a loss value of the predicted association relation and the actual association relation by utilizing a loss function in the relation recognition model, and adjusting parameters of the relation recognition model according to the loss value until the relation recognition model meets preset conditions to obtain a trained standard relation recognition model;
The rule acquisition module is used for acquiring a to-be-accepted case, identifying a first rule corresponding to the to-be-accepted case by using the standard relation identification model, matching the to-be-accepted case with a history case in a pre-constructed history case library, taking the rule corresponding to the history case which is successfully matched as a second rule of the to-be-accepted case, and taking the same rule in the first rule and the second rule as a final rule of the to-be-accepted case.
In order to solve the above-mentioned problems, the present invention also provides an electronic apparatus including:
a memory storing at least one computer program; and
a processor executing the computer program stored in the memory to implement the above big data based rule recognition method.
In order to solve the above-mentioned problems, the present invention also provides a computer-readable storage medium having stored therein at least one computer program that is executed by a processor in an electronic device to implement the above-mentioned big data based regulation recognition method.
In the embodiment of the invention, the actual association relationship between the rule data and the case data is first marked, which establishes the actual correspondence between the rule data and the case data; the correspondence subsequently predicted by the model can then be compared with this actual correspondence to judge the accuracy of the prediction. Secondly, the pre-constructed relation recognition model is trained with the association relationship between the rule data and the case data, and the trained model is used to determine the correspondence between rule data and case data and to identify the first rule corresponding to the to-be-accepted case, which improves the accuracy of rule recognition. Finally, the rule corresponding to a successfully matched history case is taken as the second rule of the to-be-accepted case, and the rule that is the same in the first rule and the second rule is taken as the final rule of the to-be-accepted case, which further screens for the accurate rule and improves the accuracy of rule recognition. Therefore, the rule recognition method and device based on big data, the electronic device and the medium provided by the embodiment of the invention can improve the accuracy and efficiency of rule recognition.
Drawings
FIG. 1 is a flow chart of a method for identifying regulations based on big data according to an embodiment of the invention;
FIG. 2 is a schematic block diagram of a rule recognition device based on big data according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the internal structure of an electronic device implementing a rule recognition method based on big data according to an embodiment of the present invention.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
The embodiment of the invention provides a rule recognition method based on big data. The execution subject of the method includes, but is not limited to, at least one of a server, a terminal and other electronic devices that can be configured to execute the method provided by the embodiment of the invention. In other words, the big data based rule recognition method may be performed by software or hardware installed on a terminal device or a server device, and the software may be a blockchain platform. The server includes, but is not limited to: a single server, a server cluster, a cloud server, a cloud server cluster, and the like.
Referring to fig. 1, which is a schematic flow chart of a big data based rule recognition method according to an embodiment of the present invention, in an embodiment of the present invention, the big data based rule recognition method includes:
S1, acquiring rule data and case data, and marking the actual association relationship between the rule data and the case data.
In the embodiment of the invention, the rule data may include the laws and regulations relating to food, medicines, medical devices, cosmetics, special equipment, industrial products, consumer goods, patents, trademarks and the like within the supervised area, together with the interpretation information corresponding to those laws and regulations. The laws and regulations may be obtained from documents such as the "Trademark Law of the People's Republic of China", the "Measures for the Supervision and Administration of Online Transactions", the "Regulations on the Supervision and Administration of Cosmetics", the "Administration and Punishment Law of the People's Republic of China" and the "Measures for the Supervision and Administration of Drug Production"; the interpretation information may be obtained from documents such as the "Measures for the Supervision and Administration of Inspection and Testing Institutions". The case data may include the law enforcement process and the penalty results of illegal cases within the supervised area, and may be obtained from a national judicial case website, namely the China trial process information disclosure website.
In detail, the marking the actual association relationship between the rule data and the case data includes: and marking the actual association relation between the rule data and the case data by using a preset pointer algorithm.
Preferably, the pointer algorithm may be a Span algorithm.
S2, respectively carrying out position vector coding on characters in the rule data and the case data by utilizing a coding layer in the pre-constructed relation recognition model, and generating a rule character vector set and a case character vector set.
In the embodiment of the invention, vectorizing and position-indexing the characters in the rule data yields a rule character vector set in which contextual semantics are connected; that is, the rule character vector set is the set of character vectors obtained after the rule data has been vectorized and position-indexed.
Likewise, vectorizing and position-indexing the characters in the case data yields a case character vector set in which contextual semantics are connected; that is, the case character vector set is the set of character vectors obtained after the case data has been vectorized and position-indexed.
In the embodiment of the invention, the pre-constructed relation recognition model may be built on Tiny-ALBERT, and comprises: a coding layer, a feedforward attention mechanism, a matching module and a loss function.
In the embodiment of the invention, the rule data and the case data contain a large number of characters, but a neural network can accept only numerical input; moreover, if the rule data and the case data were used directly to train the pre-constructed relation recognition model, the trained model could not recognize character positions in the rule data and the case data. The coding layer in the relation recognition model is therefore used to perform position vector coding on the characters in the rule data and the case data, which determines the position information of each character and enables the subsequent model training.
Preferably, the coding layer may be an embedding layer.
In detail, the encoding layer in the pre-constructed relation recognition model is used for respectively performing position vector encoding on characters in the rule data and the case data to generate a rule character vector set and a case character vector set, and the method comprises the following steps:
performing position index coding on the characters in the rule data and the case data respectively by using the coding layer, to obtain rule character position indexes and case character position indexes;
converting the characters in the rule data and the case data into corresponding character vectors respectively by using the coding layer, to obtain initial rule character vectors and initial case character vectors;
combining the rule character position indexes and the initial rule character vectors to generate the rule character vector set;
and combining the case character position indexes and the initial case character vectors to generate the case character vector set.
For example, suppose the rule data is "申请注册和使用商标，应当遵循诚实信用原则" ("the application for registration and the use of a trademark shall follow the principle of good faith"). Performing position index coding on the characters of "申请注册和使用商标" gives position index 0 for "申", 1 for "请", 2 for "注", 3 for "册", 4 for "和", 5 for "使", 6 for "用", 7 for "商" and 8 for "标".
Similarly, for the rule data "申请注册和使用商标", converting the characters into character vectors gives an initial rule character vector of 000 for "申", 001 for "请", 002 for "注", 003 for "册", and so on. Combining each character position index with the corresponding initial rule character vector then gives a rule character vector of 0-000 for "申", 1-001 for "请", 2-002 for "注", and so on.
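The position coding example above can be reproduced with a short sketch. It assumes the example clause in its original Chinese, "申请注册和使用商标" (a reconstruction from the machine-translated character glosses), and it uses a toy three-digit vector id in place of a real embedding vector; the function name `encode_positions` and the id assignment scheme are hypothetical.

```python
def encode_positions(text):
    """Pair every character with 'position index - toy vector id'."""
    vocab = {}  # character -> toy vector id, assigned in order of first appearance
    encoded = []
    for position, char in enumerate(text):
        vector_id = vocab.setdefault(char, len(vocab))
        encoded.append((char, f"{position}-{vector_id:03d}"))
    return encoded


for char, code in encode_positions("申请注册和使用商标"):
    print(char, code)
```

A repeated character keeps the same vector id but receives a new position index, which is the property that lets a model distinguish the same character at different positions.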
S3, respectively carrying out feature extraction and structure extraction on the rule character vector set and the case character vector set by utilizing a feedforward attention mechanism in the pre-constructed relation recognition model to obtain a feature rule character vector set and a feature case character vector set.
In the embodiment of the invention, the characteristic rule character vector set is a character vector set obtained by firstly carrying out characteristic extraction on the rule character vector set to obtain a characteristic character vector and then carrying out character structure information extraction on the characteristic character vector; the characteristic case character vector set is a character vector set obtained by firstly carrying out characteristic extraction on the case character vector set to obtain a characteristic character vector and then carrying out character structure information extraction on the characteristic character vector.
In the embodiment of the invention, the characteristic sequence extraction is carried out on the rule character vector and the case character vector by utilizing a feedforward attention mechanism in the pre-constructed relation recognition model, and the characteristic character vectors of the rule character vector and the case character vector are output. Wherein the feed forward attention mechanism comprises: a hidden markov module and an encoder.
In detail, the feature extraction and the structure extraction are performed on the rule character vector set and the case character vector set by using a feedforward attention mechanism in the pre-constructed relation recognition model to obtain a feature rule character vector and a feature case character vector, which comprises the following steps:
extracting features of the rule character vectors and the case character vectors by using a hidden Markov module in the feedforward attention mechanism to obtain feature character vectors;
and extracting structural information of the characteristic character vector by using an encoder in the feedforward attention mechanism to obtain a characteristic rule character vector set and a characteristic case character vector set.
In the embodiment of the invention, keyword features of the rule character vectors are extracted by a hidden Markov algorithm. For example, from the text "an enterprise produced and sold weight-loss health food containing the banned ingredient sibutramine and obtained exorbitant profits from it, in violation of the Food Safety Law", the feature keywords "production", "sales", "sibutramine", "weight loss", "health food", "Food Safety Law" and the like can be extracted.
Further, the structural information sequence of the feature character vector of a rule character vector may be extracted by the encoder; that is, the constituent structure of the rule character vector is extracted and the unstructured characters of the rule data are structured. For example, the constituent structure of "security" (pinyin "anquan") is the sequence "a", "n", "q", "u", "a", "n".
S4, carrying out association relation recognition on the feature rule character vector and the feature case character vector by utilizing a matching module in the relation recognition model to obtain a predicted association relationship between the rule data and the case data.
In the embodiment of the invention, the matching module can be used for carrying out keyword prediction on the characteristic rule character vector and the characteristic case character vector and calculating the association degree of the keyword prediction, wherein the matching module comprises: hidden markov algorithms and NLP matching algorithms.
In detail, the performing association relationship recognition on the feature rule character vector and the feature case character vector by using the matching module in the relationship recognition model to obtain a predicted association relationship between the rule data and the case data includes:
acquiring rule upper and lower character semantics corresponding to the characteristic rule character vector and case upper and lower character semantics corresponding to the characteristic case character vector;
splicing the rule upper and lower character semantics and the case upper and lower character semantics respectively by utilizing a hidden Markov algorithm in the matching module to obtain a rule character matrix and a case character matrix;
calculating the association coefficient of the rule character matrix and the case character matrix by using an NLP matching algorithm in the matching module;
if the association coefficient is smaller than or equal to a preset association coefficient, determining that no association relationship exists between the rule data corresponding to the rule character matrix and the case data corresponding to the case character matrix;
if the association coefficient is larger than a preset association coefficient, determining that association relationship exists between the rule data corresponding to the rule character matrix and the case data corresponding to the case character matrix.
In the embodiment of the invention, the characteristic rule character vector and the upper and lower character semantic relation corresponding to the characteristic case character vector can be spliced by using a hidden Markov algorithm to obtain a rule character matrix and a case character matrix:
$$O = (o_1, o_2, \ldots, o_T), \qquad S = (s_1, s_2, \ldots, s_T)$$
where $O$ denotes the feature case character vector of length $T$, $S$ denotes the keywords corresponding to the feature case character vector of length $T$, and the case character matrix $B$ is obtained from the relation between $O$ and $S$ by using the context character semantic relation.
Similarly, the method of finding the rule character matrix is the same as above.
For example, if the observation sequence O is "x, i, b, u, q, u, m, i, n, g, j, i, a, n, f, e, i" and the state sequence S is "sibutramine weight loss", the resulting case character matrix B is [sibutramine weight loss].
In the embodiment of the invention, the association coefficient between the rule character matrix and the case character matrix can be calculated by an NLP matching algorithm, for example a cosine similarity algorithm or a Jaccard similarity coefficient algorithm.
Specifically, the preset association coefficient may be set to 0.98.
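The threshold comparison described above can be sketched as follows. This is a minimal illustration that assumes the rule and case character matrices have been reduced to keyword lists and that cosine similarity over keyword counts is the chosen matching algorithm; the function names are hypothetical, and only the 0.98 threshold comes from the text.

```python
import math
from collections import Counter

PRESET_ASSOCIATION_COEFFICIENT = 0.98  # threshold value given in the text


def cosine_similarity(rule_keywords, case_keywords):
    """Cosine similarity of two keyword lists, using keyword counts as vectors."""
    rule_counts, case_counts = Counter(rule_keywords), Counter(case_keywords)
    dot = sum(rule_counts[k] * case_counts[k]
              for k in rule_counts.keys() & case_counts.keys())
    norm_rule = math.sqrt(sum(v * v for v in rule_counts.values()))
    norm_case = math.sqrt(sum(v * v for v in case_counts.values()))
    if norm_rule == 0 or norm_case == 0:
        return 0.0
    return dot / (norm_rule * norm_case)


def has_association(coefficient, threshold=PRESET_ASSOCIATION_COEFFICIENT):
    """An association exists only if the coefficient is strictly greater
    than the preset association coefficient."""
    return coefficient > threshold


rule = ["sibutramine", "weight loss"]
case = ["sibutramine", "weight loss"]
print(has_association(cosine_similarity(rule, case)))
```

A coefficient exactly equal to the threshold is treated as "no association", matching the "less than or equal to" branch of the steps above.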
S5, calculating a loss value of the predicted association relationship and the actual association relationship by using a loss function in the relationship recognition model, and adjusting parameters of the relationship recognition model according to the loss value until the relationship recognition model meets a preset condition, so as to obtain a standard relationship recognition model after training is completed.
In the embodiment of the invention, the loss function in the relationship recognition model is used to calculate the loss value between the predicted association relationship and the actual association relationship, and the parameters of the relationship recognition model are adjusted according to the loss value until the model meets the preset condition, giving the fully trained standard relationship recognition model. The preset condition may be set according to the actual model training scenario; for example, the preset condition may be that the loss value is smaller than a preset threshold.
In an embodiment of the present invention, the loss function includes a cross-entropy function.
In detail, the calculating the loss value of the predicted association relationship and the actual association relationship by using the loss function in the relationship identification model includes:
calculating a loss value of the predicted association and the actual association by using the following loss function:
wherein L(s) represents the loss value, and the remaining terms of the loss function represent, respectively, the number of predicted association relationships, the i-th predicted association relationship, and the i-th actual association relationship.
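As a sketch only, a cross-entropy loss consistent with these definitions could be computed as below. The binary form, the averaging over the number of predictions, and the names predicted, actual and eps are assumptions; the exact form used by the embodiment may differ.

```python
import math


def cross_entropy_loss(predicted, actual, eps=1e-12):
    """Mean binary cross-entropy between predicted probabilities and 0/1 labels.

    predicted[i] is the i-th predicted association (a probability in [0, 1]);
    actual[i] is the i-th actual association (1 if associated, else 0).
    """
    if not predicted or len(predicted) != len(actual):
        raise ValueError("predicted and actual must be non-empty and equal in length")
    total = 0.0
    for p, y in zip(predicted, actual):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(predicted)


print(cross_entropy_loss([0.9, 0.2, 0.7], [1, 0, 1]))
```

The loss approaches zero as the predicted associations agree with the actual ones, which is what the "loss value smaller than a preset threshold" condition checks.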
In an alternative embodiment, the adjustment of the parameters may be implemented by a stochastic gradient descent algorithm; for example, the parameters of the relationship recognition model may be adjusted using the following formula:
wherein the terms of the formula represent, respectively: the adjusted parameters; m, the number of feature rule character vectors and feature case character vectors; L(s), the loss value; the descent (gradient) function; the learning rate; and the abscissa and ordinate positions of the feature rule character vector and the feature case character vector.
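The general shape of stochastic gradient descent with a loss-threshold stopping condition can be sketched as follows. This is an illustration under stated assumptions, not the embodiment's update rule: it fits a single hypothetical parameter w with a squared-error loss standing in for L(s), and the names sgd_train, learning_rate and loss_threshold are invented here.

```python
import random


def sgd_train(samples, w=0.0, learning_rate=0.1, loss_threshold=1e-4,
              max_steps=10000, seed=0):
    """Fit y = w * x by stochastic gradient descent.

    Training stops once the mean loss falls below loss_threshold
    (the "preset condition") or after max_steps updates.
    """
    rng = random.Random(seed)
    for _ in range(max_steps):
        mean_loss = sum((w * x - y) ** 2 for x, y in samples) / len(samples)
        if mean_loss < loss_threshold:
            break
        x, y = rng.choice(samples)         # one randomly chosen sample per step
        gradient = 2.0 * (w * x - y) * x   # derivative of (w*x - y)^2 w.r.t. w
        w -= learning_rate * gradient      # move against the gradient
    return w


SAMPLES = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # generated by w = 2
print(sgd_train(SAMPLES))
```

Each step scales the gradient of the loss by the learning rate and subtracts it from the current parameter, which is the role the learning rate and the gradient term play in the definitions above.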
S6, acquiring a case to be accepted, and identifying a first rule corresponding to the case to be accepted by using the standard relation identification model.
In the embodiment of the invention, the to-be-accepted case may be an illegal case relating to cosmetics, special equipment, consumer products and the like in a supervision area. Specifically, the to-be-accepted case may be entered into the system by a user or by a staff member.
For example, the to-be-accepted case may be that design and manufacturing faults in a certain batch of consumer products have caused product defects that seriously endanger the personal safety of consumers. The to-be-accepted case is then input into the standard relationship recognition model, and the first rule obtained may be Article 3 of the "Interim Provisions on the Administration of Consumer Product Recalls".
And S7, matching the to-be-accepted case with the history cases in the pre-constructed history case library, and taking the rule corresponding to the history case which is successfully matched as a second rule of the to-be-accepted case.
In the embodiment of the present invention, the pre-constructed history case library includes the illegal cases handled in past time periods and the rules applied in handling those cases.
In detail, the matching the to-be-accepted case with the history case in the pre-constructed history case library, and taking the rule corresponding to the history case successfully matched as the second rule of the to-be-accepted case includes:
calculating the association degree of the to-be-accepted case and the history case in the pre-constructed history case library;
if the association degree is greater than the preset association degree, determining that the to-be-accepted case is successfully matched with the history case in the pre-constructed history case library, and taking the rule corresponding to the successfully matched history case as the second rule of the to-be-accepted case.
In an optional embodiment of the present invention, the association degree between the to-be-accepted case and the history cases in the pre-constructed history case library may be calculated by an NLP matching algorithm, for example a cosine similarity algorithm or a Jaccard similarity coefficient algorithm. The preset association degree may be set to 0.98, or to another value according to the actual service scenario.
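The history-case matching step can be sketched as follows, assuming each case is represented by a keyword list and using the Jaccard similarity coefficient as the matching algorithm. The record layout (keywords, rules), the function names and the sample library are hypothetical; only the 0.98 default comes from the text.

```python
def jaccard(keywords_a, keywords_b):
    """Jaccard similarity coefficient of two keyword collections."""
    set_a, set_b = set(keywords_a), set(keywords_b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def second_rules(pending_keywords, history_library, preset_degree=0.98):
    """Collect the rules of every history case whose association degree with
    the to-be-accepted case is greater than the preset association degree."""
    matched = []
    for record in history_library:
        if jaccard(pending_keywords, record["keywords"]) > preset_degree:
            for rule in record["rules"]:
                if rule not in matched:
                    matched.append(rule)
    return matched


# Hypothetical history case library for illustration.
HISTORY = [
    {"keywords": ["consumer product", "defect", "recall"], "rules": ["Rule A"]},
    {"keywords": ["cosmetics", "label"], "rules": ["Rule B"]},
]

print(second_rules(["recall", "defect", "consumer product"], HISTORY))
```

With a threshold as strict as 0.98, the Jaccard coefficient only matches cases whose keyword sets are practically identical, which is one reason the text allows other values for other service scenarios.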
S8, taking the same rule in the first rule and the second rule as the final rule of the to-be-accepted case.
In the embodiment of the present invention, if the first rule is Article 26 of the "Law of the People's Republic of China on the Protection of Consumer Rights and Interests" and the second rule is also Article 26 of that law, then Article 26 of that law is taken as the final rule of the to-be-accepted case.
Further, in the embodiment of the present invention, if the first rule and the second rule have no rule in common, the to-be-accepted case may be re-input into the relationship recognition model for training to obtain an updated first rule, and this is repeated until the updated first rule and the second rule have a rule in common, thereby improving the accuracy of rule recognition.
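Selecting the final rule amounts to intersecting the two candidate lists. The sketch below assumes rules are represented as strings; the function name final_rules is hypothetical, and an empty result stands for the "no rule in common" case in which the model is re-run.

```python
def final_rules(first_rules, second_rules):
    """Return the rules present in both lists, in the order of first_rules."""
    second = set(second_rules)
    return [rule for rule in first_rules if rule in second]


first = ["Consumer Rights Protection Law, Article 26", "Recall Provisions, Article 3"]
second = ["Consumer Rights Protection Law, Article 26"]

common = final_rules(first, second)
if common:
    print("final rule:", common)
else:
    print("no rule in common; the first rule needs to be re-identified")
```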
In the embodiment of the invention, first, the actual association relationship between the rule data and the case data is marked, so that the actual correspondence between the rule data and the case data can be determined; the correspondence subsequently predicted by the model can then be compared with this actual correspondence to judge the accuracy of the prediction. Second, the pre-constructed relationship recognition model is trained by using the association relationship between the rule data and the case data, and the trained model is used to determine the correspondence between rule data and case data and to identify the first rule corresponding to the to-be-accepted case, which improves the accuracy of rule recognition. Finally, the rule corresponding to the successfully matched history case is taken as the second rule of the to-be-accepted case, and the rule common to the first rule and the second rule is taken as the final rule, which further screens for the accurate rule and improves the accuracy of rule recognition. Therefore, the rule recognition method based on big data provided by the embodiment of the invention can improve the accuracy and efficiency of rule recognition.
Fig. 2 is a functional block diagram of the big data based rule recognition device of the present invention.
The big data based rule recognition device 100 of the present invention may be installed in an electronic device. Depending on the functions implemented, the device may comprise a marking module 101, a model training module 102, and a rule acquisition module 103. A module of the present invention, which may also be referred to as a unit, is a series of computer program segments that are stored in the memory of the electronic device, can be executed by the processor of the electronic device, and perform a fixed function.
In the present embodiment, the functions concerning the respective modules/units are as follows:
the marking module 101 is configured to obtain the rule data and the case data, and mark the actual association relationship between the rule data and the case data.
In the embodiment of the invention, the rule data may include the laws and regulations relating to food, medicines, medical instruments, cosmetics, special equipment, industrial products, consumer products, patents, trademarks and the like in a supervision area, together with the interpretation information corresponding to those laws and regulations. The laws and regulations can be acquired from documents such as the "Trademark Law of the People's Republic of China", the "network transaction supervision and management method", the "cosmetic supervision and management regulation", the "management and punishment method of the people's republic of China" and the "supervision and management method of medicine production", and the interpretation information can be acquired from documents such as the "inspection and detection institution supervision and management method". The case data may include the law enforcement processes and law enforcement punishment results of illegal cases in the supervision area, and can be obtained from a national judicial case website, namely the China Trial Process Information Disclosure Network.
In detail, the marking module 101 marks the actual association relationship between the rule data and the case data by performing the following operation: marking the actual association relationship between the rule data and the case data by using a preset pointer algorithm.
Preferably, the pointer algorithm may be a Span algorithm.
The model training module 102 is configured to perform position vector encoding on characters in the rule data and the case data by using an encoding layer in a pre-constructed relationship recognition model, generate a rule character vector set and a case character vector set, perform feature extraction and structure extraction on the rule character vector set and the case character vector set by using a feedforward attention mechanism in the relationship recognition model, obtain a feature rule character vector set and a feature case character vector set, perform association relationship recognition on the feature rule character vector and the feature case character vector by using a matching module in the relationship recognition model, obtain a predicted association relationship between the rule data and the case data, calculate a loss value of the predicted association relationship and the actual association relationship by using a loss function in the relationship recognition model, and adjust parameters of the relationship recognition model according to the loss value until the relationship recognition model meets a preset condition, thereby obtaining a trained standard relationship recognition model.
In the embodiment of the invention, vectorization and position indexing are performed on the characters in the rule data to obtain a rule character vector set whose context semantics are connected; that is, the rule character vector set is the set of character vectors obtained after the rule data has been vectorized and position-indexed.
In the embodiment of the invention, vectorization and position indexing are performed on the characters in the case data to obtain a case character vector set whose context semantics are connected; that is, the case character vector set is the set of character vectors obtained after the case data has been vectorized and position-indexed.
In the embodiment of the invention, the pre-constructed relationship recognition model can be built on Tiny-ALBERT, wherein the pre-constructed relationship recognition model comprises a coding layer, a feedforward attention mechanism, a matching module and a loss function.
In the embodiment of the invention, the rule data and the case data contain a large number of characters, but a neural network can only accept numerical input. If the raw rule data and case data were used directly to train the pre-constructed relationship recognition model, the trained model could not support position recognition of the rule data and the case data. Therefore, the coding layer in the relationship recognition model is used to perform position vector coding on the characters in the rule data and the case data, thereby determining the position information of each character in the rule data and the case data and enabling the subsequent model training.
Preferably, the coding layer may be an embedding layer.
In detail, the model training module 102 performs position vector encoding on characters in the rule data and the case data by using an encoding layer in a pre-constructed relationship recognition model by performing the following operations, to generate a rule character vector set and a case character vector set, including:
the coding layer is utilized to respectively carry out position index coding on characters in the legal data and the case data, and legal character position indexes and case character position indexes are obtained;
the coding layer is utilized to respectively convert characters in the legal data and the case data into corresponding character vectors, and an initial legal character vector and an initial case character vector are obtained;
combining the legal character position index and the initial legal character vector to generate a legal character vector set;
and combining the case character position index and the initial case character vector to generate a case character vector set.
For example, suppose the rule data is "申请注册和使用商标，应当遵循诚实信用原则" ("the application for registration and use of a trademark shall follow the principle of good faith"). Position index coding of the characters in "申请注册和使用商标" gives position index 0 for "申", 1 for "请", 2 for "注", 3 for "册", 4 for "和", 5 for "使", 6 for "用", 7 for "商" and 8 for "标".
Similarly, when the rule data is "申请注册和使用商标", converting the characters into character vectors gives an initial rule character vector of 000 for "申", 001 for "请", 002 for "注", 003 for "册", and so on. Combining each character position index with the corresponding initial rule character vector then gives a rule character vector of 0-000 for "申", 1-001 for "请", 2-002 for "注", and so on.
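The combination of position index and character vector in this example can be mimicked with a few lines of code. In this sketch a three-digit vocabulary id stands in for the real character vector produced by the coding layer, which is an assumption made purely to reproduce the "0-000" notation; the function name is hypothetical.

```python
def encode_with_positions(text):
    """Pair each character's position index with a stand-in character id.

    Ids are assigned in order of first appearance, so a repeated character
    keeps its id while its position index still changes.
    """
    vocabulary = {}
    encoded = []
    for position, character in enumerate(text):
        if character not in vocabulary:
            vocabulary[character] = len(vocabulary)
        encoded.append(f"{position}-{vocabulary[character]:03d}")
    return encoded


print(encode_with_positions("申请注册和使用商标"))
```

Because the nine characters in the example are all distinct, the position index and the stand-in id coincide here; for text with repeated characters the two parts would differ.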
In the embodiment of the invention, the characteristic rule character vector set is a character vector set obtained by firstly carrying out characteristic extraction on the rule character vector set to obtain a characteristic character vector and then carrying out character structure information extraction on the characteristic character vector; the characteristic case character vector set is a character vector set obtained by firstly carrying out characteristic extraction on the case character vector set to obtain a characteristic character vector and then carrying out character structure information extraction on the characteristic character vector.
In the embodiment of the invention, feature sequence extraction is performed on the rule character vector and the case character vector by using the feedforward attention mechanism in the pre-constructed relationship recognition model, and the feature character vectors of the rule character vector and the case character vector are output. The feedforward attention mechanism comprises a hidden Markov module and an encoder.
In detail, the model training module 102 performs feature extraction and structure extraction on the rule character vector set and the case character vector set by using a feedforward attention mechanism in the pre-constructed relationship recognition model by performing the following operations, to obtain a feature rule character vector and a feature case character vector, including:
extracting features of the obtained legal character vector and the case character vector by utilizing a hidden Markov module in the feedforward attention mechanism to obtain a feature character vector;
and extracting structural information of the characteristic character vector by using an encoder in the feedforward attention mechanism to obtain a characteristic rule character vector set and a characteristic case character vector set.
In the embodiment of the invention, keyword features of the rule character vector are extracted by the hidden Markov algorithm. For example, from the text "an enterprise produces and sells weight-loss health food containing the prohibited ingredient sibutramine and obtains exorbitant profits from it, in violation of the Food Safety Law", the feature keywords "production", "sales", "sibutramine", "weight loss", "health food", "Food Safety Law" and the like can be extracted.
Further, the structural information sequence of the feature character vector of the rule character vector may be extracted by the encoder; that is, the constituent structure of the rule character vector is extracted so that the unstructured characters of the rule data are structured. For example, the constituent structure of "security" (安全, "anquan") is the pinyin letter sequence "a", "n", "q", "u", "a", "n".
In the embodiment of the invention, the matching module can be used to perform keyword prediction on the feature rule character vector and the feature case character vector and to calculate the association degree of the predicted keywords, wherein the matching module comprises a hidden Markov algorithm and an NLP matching algorithm.
In detail, the model training module 102 performs association relation recognition on the feature rule character vector and the feature case character vector by using a matching module in the relation recognition model by performing the following operations, to obtain a predicted association relation between the rule data and the case data, including:
acquiring the rule context (upper and lower) character semantics corresponding to the feature rule character vector and the case context character semantics corresponding to the feature case character vector;
splicing the rule context character semantics and the case context character semantics respectively by using the hidden Markov algorithm in the matching module to obtain a rule character matrix and a case character matrix;
calculating the association coefficient between the rule character matrix and the case character matrix by using the NLP matching algorithm in the matching module;
if the association coefficient is less than or equal to a preset association coefficient, determining that no association relationship exists between the rule data corresponding to the rule character matrix and the case data corresponding to the case character matrix;
if the association coefficient is greater than the preset association coefficient, determining that an association relationship exists between the rule data corresponding to the rule character matrix and the case data corresponding to the case character matrix.
In the embodiment of the invention, the context (upper and lower) character semantic relations corresponding to the feature rule character vector and the feature case character vector can be spliced by using the hidden Markov algorithm to obtain the rule character matrix and the case character matrix:
$O = (o_1, o_2, \ldots, o_T)$, $S = (s_1, s_2, \ldots, s_T)$
wherein $O$ represents the feature case character vector of length T, and $S$ represents the keywords corresponding to the feature case character vector of length T; according to the relation between $O$ and $S$, the case character matrix B is obtained by using the context (upper and lower) character semantic relation.
Similarly, the rule character matrix is obtained in the same way.
For example, if the observation sequence O is "x, i, b, u, q, u, m, i, n, g, j, i, a, n, f, e, i" and the state sequence S is "sibutramine weight loss", the obtained case character matrix B is [sibutramine weight loss].
In the embodiment of the invention, the association coefficient between the rule character matrix and the case character matrix can be calculated by an NLP matching algorithm, for example a cosine similarity algorithm or a Jaccard similarity coefficient algorithm.
Specifically, the preset association coefficient may be set to 0.98.
In the embodiment of the invention, the loss function in the relationship recognition model is used to calculate the loss value between the predicted association relationship and the actual association relationship, and the parameters of the relationship recognition model are adjusted according to the loss value until the model meets the preset condition, giving the fully trained standard relationship recognition model. The preset condition may be set according to the actual model training scenario; for example, the preset condition may be that the loss value is smaller than a preset threshold.
In an embodiment of the present invention, the loss function includes a cross-entropy function.
In detail, the model training module 102 calculates a loss value of the predicted association and the actual association using a loss function in the relationship identification model by performing operations including:
calculating a loss value of the predicted association and the actual association by using the following loss function:
wherein L(s) represents the loss value, and the remaining terms of the loss function represent, respectively, the number of predicted association relationships, the i-th predicted association relationship, and the i-th actual association relationship.
In an alternative embodiment, the adjustment of the parameters may be implemented by a stochastic gradient descent algorithm; for example, the parameters of the relationship recognition model may be adjusted using the following formula:
wherein the terms of the formula represent, respectively: the adjusted parameters; m, the number of feature rule character vectors and feature case character vectors; L(s), the loss value; the descent (gradient) function; the learning rate; and the abscissa and ordinate positions of the feature rule character vector and the feature case character vector.
The rule obtaining module 103 is configured to obtain a case to be accepted, and identify a first rule corresponding to the case to be accepted by using the standard relationship identification model; matching the to-be-accepted case with the history cases in the pre-constructed history case library, and taking the rule corresponding to the history cases successfully matched as a second rule of the to-be-accepted case; and taking the same rule in the first rule and the second rule as the final rule of the to-be-accepted case.
In the embodiment of the invention, the to-be-accepted case may be an illegal case relating to cosmetics, special equipment, consumer products and the like in a supervision area. Specifically, the to-be-accepted case may be entered into the system by a user or by a staff member.
For example, the to-be-accepted case may be that design and manufacturing faults in a certain batch of consumer products have caused product defects that seriously endanger the personal safety of consumers. The to-be-accepted case is then input into the standard relationship recognition model, and the first rule obtained may be Article 3 of the "Interim Provisions on the Administration of Consumer Product Recalls".
In the embodiment of the present invention, the pre-constructed history case library includes the illegal cases handled in past time periods and the rules applied in handling those cases.
In detail, the rule obtaining module 103 matches the to-be-accepted case with the history cases in the pre-constructed history case library by executing the following operations, and uses the rule corresponding to the history case that is successfully matched as the second rule of the to-be-accepted case, including:
calculating the association degree of the to-be-accepted case and the history case in the pre-constructed history case library;
if the association degree is greater than the preset association degree, determining that the to-be-accepted case is successfully matched with the history case in the pre-constructed history case library, and taking the rule corresponding to the successfully matched history case as the second rule of the to-be-accepted case.
In an optional embodiment of the present invention, the association degree between the to-be-accepted case and the history cases in the pre-constructed history case library may be calculated by an NLP matching algorithm, for example a cosine similarity algorithm or a Jaccard similarity coefficient algorithm. The preset association degree may be set to 0.98, or to another value according to the actual service scenario.
In the embodiment of the present invention, if the first rule is Article 26 of the "Law of the People's Republic of China on the Protection of Consumer Rights and Interests" and the second rule is also Article 26 of that law, then Article 26 of that law is taken as the final rule of the to-be-accepted case.
Further, in the embodiment of the present invention, if the first rule and the second rule have no rule in common, the to-be-accepted case may be re-input into the relationship recognition model for training to obtain an updated first rule, and this is repeated until the updated first rule and the second rule have a rule in common, thereby improving the accuracy of rule recognition.
In the embodiment of the invention, first, the actual association relationship between the rule data and the case data is marked, so that the actual correspondence between the rule data and the case data can be determined; the correspondence subsequently predicted by the model can then be compared with this actual correspondence to judge the accuracy of the prediction. Second, the pre-constructed relationship recognition model is trained by using the association relationship between the rule data and the case data, and the trained model is used to determine the correspondence between rule data and case data and to identify the first rule corresponding to the to-be-accepted case, which improves the accuracy of rule recognition. Finally, the rule corresponding to the successfully matched history case is taken as the second rule of the to-be-accepted case, and the rule common to the first rule and the second rule is taken as the final rule, which further screens for the accurate rule and improves the accuracy of rule recognition. Therefore, the rule recognition device based on big data provided by the embodiment of the invention can improve the accuracy and efficiency of rule recognition.
Fig. 3 is a schematic structural diagram of an electronic device implementing the rule recognition method based on big data according to the present invention.
The electronic device may comprise a processor 10, a memory 11, a communication bus 12 and a communication interface 13, and may further comprise a computer program stored in the memory 11 and executable on the processor 10, such as a big data based rule recognition program.
The memory 11 includes at least one type of medium, including flash memory, a mobile hard disk, a multimedia card, a card memory (e.g., SD or DX memory, etc.), a magnetic memory, a local magnetic disk, an optical disk, etc. The memory 11 may in some embodiments be an internal storage unit of the electronic device, such as a mobile hard disk of the electronic device. The memory 11 may in other embodiments also be an external storage device of the electronic device, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card) or the like, which are provided on the electronic device. Further, the memory 11 may also include both an internal storage unit and an external storage device of the electronic device. The memory 11 may be used not only for storing application software installed in an electronic device and various types of data, such as codes of a regulation recognition program based on big data, but also for temporarily storing data that has been output or is to be output.
The processor 10 may in some embodiments be composed of integrated circuits, for example a single packaged integrated circuit, or multiple packaged integrated circuits with the same or different functions, including one or more central processing units (Central Processing Unit, CPU), microprocessors, digital processing chips, graphics processors, combinations of various control chips, and the like. The processor 10 is the control unit (Control Unit) of the electronic device; it connects the various parts of the entire electronic device using various interfaces and lines, runs or executes the programs or modules stored in the memory 11 (e.g., the rule recognition program based on big data), and invokes the data stored in the memory 11 to perform the various functions of the electronic device and process data.
The communication bus 12 may be a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like. The bus may be classified as an address bus, a data bus, a control bus, etc. The communication bus 12 is arranged to enable connection and communication between the memory 11 and the at least one processor 10, among other components. For ease of illustration, the bus is drawn as a single bold line in the figure, but this does not mean that there is only one bus or only one type of bus.
Fig. 3 shows only an electronic device with certain components; those skilled in the art will understand that the structure shown in Fig. 3 does not limit the electronic device, which may include fewer or more components than shown, combine certain components, or use a different arrangement of components.
For example, although not shown, the electronic device may further include a power source (such as a battery) for supplying power to the respective components. Preferably, the power source may be logically connected to the at least one processor 10 through a power management device, so that functions such as charge management, discharge management and power consumption management are implemented through the power management device. The power source may also include one or more direct current or alternating current power supplies, recharging devices, power failure detection circuits, power converters or inverters, power status indicators, and the like. The electronic device may further include various sensors, Bluetooth modules, Wi-Fi modules, etc., which are not described herein.
Optionally, the communication interface 13 may comprise a wired interface and/or a wireless interface (e.g., a Wi-Fi interface, a Bluetooth interface, etc.), typically used to establish a communication connection between the electronic device and other electronic devices.
Optionally, the communication interface 13 may further comprise a user interface, which may be a display (Display) or an input unit such as a keyboard (Keyboard), or a standard wired or wireless interface. In some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display may also be referred to as a display screen or display unit, and is used for displaying information processed in the electronic device and for displaying a visual user interface.
It should be understood that the embodiments described are for illustrative purposes only, and the scope of the patent application is not limited to this configuration.
The big data-based rule recognition program stored in the memory 11 of the electronic device is a combination of a plurality of computer programs which, when run on the processor 10, can implement:
acquiring rule data and case data, and marking the actual association relationship between the rule data and the case data;
performing position vector encoding on the characters in the rule data and the case data, respectively, by using the coding layer in a pre-constructed relation recognition model, to generate a rule character vector set and a case character vector set;
performing feature extraction and structure extraction on the rule character vector set and the case character vector set, respectively, by using the feedforward attention mechanism in the relation recognition model, to obtain a feature rule character vector set and a feature case character vector set;
performing association relationship recognition on the feature rule character vectors and the feature case character vectors by using the matching module in the relation recognition model, to obtain a predicted association relationship between the rule data and the case data;
calculating a loss value between the predicted association relationship and the actual association relationship by using the loss function in the relation recognition model, and adjusting the parameters of the relation recognition model according to the loss value until the relation recognition model meets a preset condition, to obtain a trained standard relation recognition model;
acquiring a to-be-accepted case, and identifying a first rule corresponding to the to-be-accepted case by using the standard relation recognition model;
matching the to-be-accepted case against the historical cases in a pre-constructed historical case library, and taking the rule corresponding to each successfully matched historical case as a second rule of the to-be-accepted case;
and taking the rule that appears in both the first rule and the second rule as the final rule of the to-be-accepted case.
Specifically, for how the processor 10 implements the computer program, refer to the description of the relevant steps in the embodiment corresponding to fig. 1, which is not repeated here.
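As an illustration only, the last three steps listed above (first rule, second rule, final rule) might be sketched as follows. The way the association degree between cases is computed is not specified here, so a character-set Jaccard overlap is assumed as a stand-in. The names `HistoricalCase`, `char_overlap`, `second_rules` and `final_rules` are hypothetical, and the first rule set is passed in directly rather than produced by a trained model.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Set


@dataclass(frozen=True)
class HistoricalCase:
    """A past case together with the rules that were applied to it."""
    text: str
    rules: FrozenSet[str]


def char_overlap(a: str, b: str) -> float:
    """Jaccard overlap of character sets (assumed stand-in association degree)."""
    set_a, set_b = set(a), set(b)
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def second_rules(case_text: str,
                 library: Iterable[HistoricalCase],
                 threshold: float) -> Set[str]:
    """Collect the rules of every historical case whose association degree
    with the incoming case is strictly greater than the threshold."""
    matched: Set[str] = set()
    for past in library:
        if char_overlap(case_text, past.text) > threshold:
            matched |= past.rules
    return matched


def final_rules(first: Iterable[str], second: Iterable[str]) -> Set[str]:
    """Keep only the rules proposed by both the model and the case match."""
    return set(first) & set(second)


if __name__ == "__main__":
    library = [
        HistoricalCase("noise complaint at night", frozenset({"R1", "R2"})),
        HistoricalCase("illegal parking on sidewalk", frozenset({"R3"})),
    ]
    first = {"R1", "R3"}  # placeholder for the model's output
    second = second_rules("noise complaint at night", library, threshold=0.8)
    print(final_rules(first, second))
```

Taking the intersection means a rule is returned only when both sources agree, which trades recall for precision; an empty result is possible when the two sources share no rule.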
Further, the modules/units integrated in the electronic device, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable medium. The computer-readable medium may be non-volatile or volatile. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, or a read-only memory (Read-Only Memory, ROM).
Embodiments of the present invention may also provide a computer readable storage medium storing a computer program which, when executed by a processor of an electronic device, may implement:
acquiring rule data and case data, and marking the actual association relationship between the rule data and the case data;
performing position vector encoding on the characters in the rule data and the case data, respectively, by using the coding layer in a pre-constructed relation recognition model, to generate a rule character vector set and a case character vector set;
performing feature extraction and structure extraction on the rule character vector set and the case character vector set, respectively, by using the feedforward attention mechanism in the relation recognition model, to obtain a feature rule character vector set and a feature case character vector set;
performing association relationship recognition on the feature rule character vectors and the feature case character vectors by using the matching module in the relation recognition model, to obtain a predicted association relationship between the rule data and the case data;
calculating a loss value between the predicted association relationship and the actual association relationship by using the loss function in the relation recognition model, and adjusting the parameters of the relation recognition model according to the loss value until the relation recognition model meets a preset condition, to obtain a trained standard relation recognition model;
acquiring a to-be-accepted case, and identifying a first rule corresponding to the to-be-accepted case by using the standard relation recognition model;
matching the to-be-accepted case against the historical cases in a pre-constructed historical case library, and taking the rule corresponding to each successfully matched historical case as a second rule of the to-be-accepted case;
and taking the rule that appears in both the first rule and the second rule as the final rule of the to-be-accepted case.
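The position vector encoding step above (a position index and a character vector per character, combined into one vector set) could look roughly like the sketch below. This is an assumption-laden illustration: the text does not say how the index and the vector are combined, so simple concatenation is assumed, and the character vectors are random placeholders rather than learned embeddings. The function name `encode_with_positions` and its parameters are hypothetical.

```python
import random
from typing import Dict, List


def encode_with_positions(text: str, dim: int = 4, seed: int = 0) -> List[List[float]]:
    """Return one vector per character: [position index, character vector...].

    The character vectors come from a placeholder lookup table filled with
    random values; a real coding layer would use learned embeddings.
    """
    rng = random.Random(seed)
    table: Dict[str, List[float]] = {}
    vectors: List[List[float]] = []
    for position, ch in enumerate(text):
        if ch not in table:
            # Assign each distinct character a fixed placeholder vector.
            table[ch] = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        # Assumed combination: prepend the position index to the vector.
        vectors.append([float(position)] + table[ch])
    return vectors


if __name__ == "__main__":
    rule_vectors = encode_with_positions("Article 12", dim=3)
    case_vectors = encode_with_positions("noise at night", dim=3)
    print(len(rule_vectors), len(case_vectors))
```

The same character always maps to the same vector within one call, while the leading index distinguishes repeated characters at different positions.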
Further, the computer usable medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created from the use of blockchain nodes, and the like.
In the several embodiments provided by the present invention, it should be understood that the disclosed apparatus, device and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative; the division of the modules is merely a division by logical function, and other divisions are possible in an actual implementation.
The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical units, may be located in one place, or may be distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional modules in the embodiments of the present invention may be integrated in one processing unit, each unit may exist physically on its own, or two or more units may be integrated in one unit. The integrated units may be implemented in the form of hardware, or in the form of hardware plus software functional modules.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof.
The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.
A blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain (Blockchain) is essentially a decentralized database: a chain of data blocks linked to one another by cryptographic methods, each of which contains information on a batch of network transactions and is used to verify the validity of that information (anti-counterfeiting) and to generate the next block. A blockchain may include a blockchain underlying platform, a platform product services layer, an application services layer, and the like.
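To make the idea of cryptographically linked data blocks concrete, the following is a minimal sketch, not the storage scheme of this embodiment. It assumes SHA-256 over a JSON serialization as the linking method; the function names `block_hash`, `append_block` and `is_valid` and the block fields are hypothetical, and consensus, networking and signatures are omitted entirely.

```python
import hashlib
import json
from typing import Any, Dict, List

Block = Dict[str, Any]


def block_hash(block: Block) -> str:
    """SHA-256 digest of the block's canonical JSON form."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_block(chain: List[Block], data: Any) -> None:
    """Append a block that records the hash of its predecessor."""
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"index": len(chain), "prev_hash": prev, "data": data})


def is_valid(chain: List[Block]) -> bool:
    """Check that every block still points at the hash of the block before it."""
    for i in range(1, len(chain)):
        if chain[i]["prev_hash"] != block_hash(chain[i - 1]):
            return False
    return True


if __name__ == "__main__":
    chain: List[Block] = []
    for record in ("rule data", "case data", "association labels"):
        append_block(chain, record)
    print(is_valid(chain))
```

Because each block stores the hash of the previous one, altering an earlier block invalidates the link held by its successor. In this sketch the last block is not protected by any later link.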
Furthermore, it is evident that the word "comprising" does not exclude other elements or steps, and that the singular does not exclude the plural. A plurality of units or means recited in the system claims may also be implemented by a single unit or means through software or hardware. Terms such as first and second are used to denote names, not any particular order.
Finally, it should be noted that the above-mentioned embodiments are merely for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made to the technical solution of the present invention without departing from the spirit and scope of the technical solution of the present invention.

Claims (8)

1. A big data-based rule recognition method, the method comprising:
acquiring rule data and case data, and marking the actual association relationship between the rule data and the case data;
performing position vector encoding on the characters in the rule data and the case data, respectively, by using the coding layer in a pre-constructed relation recognition model, to generate a rule character vector set and a case character vector set;
performing feature extraction and structure extraction on the rule character vector set and the case character vector set, respectively, by using the feedforward attention mechanism in the relation recognition model, to obtain a feature rule character vector set and a feature case character vector set;
performing association relationship recognition on the feature rule character vectors and the feature case character vectors by using the matching module in the relation recognition model, to obtain a predicted association relationship between the rule data and the case data;
calculating a loss value between the predicted association relationship and the actual association relationship by using the loss function in the relation recognition model, and adjusting the parameters of the relation recognition model according to the loss value until the relation recognition model meets a preset condition, to obtain a trained standard relation recognition model;
acquiring a to-be-accepted case, and identifying a first rule corresponding to the to-be-accepted case by using the standard relation recognition model;
matching the to-be-accepted case against the historical cases in a pre-constructed historical case library, and taking the rule corresponding to each successfully matched historical case as a second rule of the to-be-accepted case;
taking the rule that appears in both the first rule and the second rule as the final rule of the to-be-accepted case;
wherein performing feature extraction and structure extraction on the rule character vector set and the case character vector set, respectively, by using the feedforward attention mechanism in the pre-constructed relation recognition model, to obtain the feature rule character vectors and the feature case character vectors, comprises: performing feature extraction on the rule character vectors and the case character vectors by using a hidden Markov module in the feedforward attention mechanism, to obtain feature character vectors; and extracting structural information from the feature character vectors by using an encoder in the feedforward attention mechanism, to obtain the feature rule character vector set and the feature case character vector set;
and wherein performing association relationship recognition on the feature rule character vectors and the feature case character vectors by using the matching module in the relation recognition model, to obtain the predicted association relationship between the rule data and the case data, comprises: acquiring the rule contextual character semantics corresponding to the feature rule character vectors and the case contextual character semantics corresponding to the feature case character vectors; splicing the rule contextual character semantics and the case contextual character semantics, respectively, by using a hidden Markov algorithm in the matching module, to obtain a rule character matrix and a case character matrix; calculating an association coefficient between the rule character matrix and the case character matrix by using an NLP matching algorithm in the matching module; if the association coefficient is less than or equal to a preset association coefficient, determining that no association relationship exists between the rule data corresponding to the rule character matrix and the case data corresponding to the case character matrix; and if the association coefficient is greater than the preset association coefficient, determining that an association relationship exists between the rule data corresponding to the rule character matrix and the case data corresponding to the case character matrix.
2. The big data-based rule recognition method of claim 1, wherein performing position vector encoding on the characters in the rule data and the case data, respectively, by using the coding layer in the pre-constructed relation recognition model, to generate the rule character vector set and the case character vector set, comprises:
performing position index encoding on the characters in the rule data and the case data, respectively, by using the coding layer, to obtain rule character position indexes and case character position indexes;
converting the characters in the rule data and the case data into character vectors, respectively, by using the coding layer, to obtain initial rule character vectors and initial case character vectors;
combining the rule character position indexes and the initial rule character vectors to generate the rule character vector set;
and combining the case character position indexes and the initial case character vectors to generate the case character vector set.
3. The big data-based rule recognition method of any one of claims 1 to 2, wherein marking the actual association relationship between the rule data and the case data comprises:
marking the actual association relationship between the rule data and the case data by using a preset pointer algorithm.
4. The big data-based rule recognition method of claim 1, wherein calculating the loss value between the predicted association relationship and the actual association relationship by using the loss function in the relation recognition model comprises:
calculating the loss value between the predicted association relationship and the actual association relationship by using the following loss function: [formula not reproduced]
wherein the symbols of the formula denote, in order: the loss value, the number of predicted association relationships, the i-th predicted association relationship, and the i-th actual association relationship.
5. The big data-based rule recognition method of claim 1, wherein matching the to-be-accepted case against the historical cases in the pre-constructed historical case library, and taking the rule corresponding to each successfully matched historical case as the second rule of the to-be-accepted case, comprises:
calculating the association degree between the to-be-accepted case and the historical cases in the pre-constructed historical case library;
if the association degree is greater than a preset association degree, determining that the to-be-accepted case and the historical case in the pre-constructed historical case library are successfully matched, and taking the rule corresponding to the successfully matched historical case as the second rule of the to-be-accepted case.
6. A big data-based rule recognition apparatus for implementing the big data-based rule recognition method of any one of claims 1 to 5, comprising:
a marking module, configured to acquire rule data and case data, and to mark the actual association relationship between the rule data and the case data;
a model training module, configured to: perform position vector encoding on the characters in the rule data and the case data, respectively, by using the coding layer in a pre-constructed relation recognition model, to generate a rule character vector set and a case character vector set; perform feature extraction and structure extraction on the rule character vector set and the case character vector set, respectively, by using the feedforward attention mechanism in the relation recognition model, to obtain a feature rule character vector set and a feature case character vector set; perform association relationship recognition on the feature rule character vectors and the feature case character vectors by using the matching module in the relation recognition model, to obtain a predicted association relationship between the rule data and the case data; calculate a loss value between the predicted association relationship and the actual association relationship by using the loss function in the relation recognition model; and adjust the parameters of the relation recognition model according to the loss value until the relation recognition model meets a preset condition, to obtain a trained standard relation recognition model;
a rule acquisition module, configured to: acquire a to-be-accepted case; identify a first rule corresponding to the to-be-accepted case by using the standard relation recognition model; match the to-be-accepted case against the historical cases in a pre-constructed historical case library; take the rule corresponding to each successfully matched historical case as a second rule of the to-be-accepted case; and take the rule that appears in both the first rule and the second rule as the final rule of the to-be-accepted case.
7. An electronic device, the electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores computer program instructions executable by the at least one processor, to enable the at least one processor to perform the big data-based rule recognition method of any one of claims 1 to 5.
8. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the big data-based rule recognition method of any one of claims 1 to 5.
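For readers who want a concrete picture of the loss calculation of claim 4 and the threshold decision of claim 1, a hedged sketch follows. Neither the loss formula nor the "NLP matching algorithm" is reproduced in the text, so both are assumptions here: binary cross-entropy averaged over the predicted association relationships stands in for the loss, and cosine similarity of the flattened character matrices stands in for the association coefficient. The names `mean_bce`, `cosine` and `is_associated` are hypothetical.

```python
import math
from typing import List, Sequence


def mean_bce(predicted: Sequence[float], actual: Sequence[int],
             eps: float = 1e-12) -> float:
    """Assumed loss: binary cross-entropy averaged over all predictions."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("predicted and actual must be non-empty and equal in length")
    total = 0.0
    for p, y in zip(predicted, actual):
        p = min(max(p, eps), 1.0 - eps)  # keep log() away from 0
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(predicted)


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_associated(rule_matrix: List[List[float]],
                  case_matrix: List[List[float]],
                  preset: float = 0.5) -> bool:
    """An association exists only if the coefficient is strictly greater than
    the preset value; a coefficient equal to the preset means no association."""
    flat_rule = [v for row in rule_matrix for v in row]
    flat_case = [v for row in case_matrix for v in row]
    return cosine(flat_rule, flat_case) > preset


if __name__ == "__main__":
    print(mean_bce([0.9, 0.2, 0.7], [1, 0, 1]))
    print(is_associated([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]))
```

The strict inequality in `is_associated` mirrors the claim wording, under which a coefficient equal to the preset value counts as no association.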
CN202111082811.7A 2021-09-15 2021-09-15 Rule recognition method and device based on big data, electronic equipment and medium Active CN113792127B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111082811.7A CN113792127B (en) 2021-09-15 2021-09-15 Rule recognition method and device based on big data, electronic equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111082811.7A CN113792127B (en) 2021-09-15 2021-09-15 Rule recognition method and device based on big data, electronic equipment and medium

Publications (2)

Publication Number Publication Date
CN113792127A CN113792127A (en) 2021-12-14
CN113792127B CN113792127B (en) 2023-12-26

Family

ID=79183665

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111082811.7A Active CN113792127B (en) 2021-09-15 2021-09-15 Rule recognition method and device based on big data, electronic equipment and medium

Country Status (1)

Country Link
CN (1) CN113792127B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108536679A (en) * 2018-04-13 2018-09-14 腾讯科技(成都)有限公司 Name entity recognition method, device, equipment and computer readable storage medium
CN109919368A (en) * 2019-02-26 2019-06-21 西安交通大学 A kind of law article recommendation forecasting system and method based on associated diagram
CN110377632A (en) * 2019-06-17 2019-10-25 平安科技(深圳)有限公司 Lawsuit prediction of result method, apparatus, computer equipment and storage medium
CN110442744A (en) * 2019-08-09 2019-11-12 泰康保险集团股份有限公司 Extract method, apparatus, electronic equipment and the readable medium of target information in image
CN112966517A (en) * 2021-04-30 2021-06-15 平安科技(深圳)有限公司 Training method, device, equipment and medium for named entity recognition model
CN113255294A (en) * 2021-07-14 2021-08-13 北京邮电大学 Named entity recognition model training method, recognition method and device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050203899A1 (en) * 2003-12-31 2005-09-15 Anderson Steven B. Systems, methods, software and interfaces for integration of case law with legal briefs, litigation documents, and/or other litigation-support documents


Also Published As

Publication number Publication date
CN113792127A (en) 2021-12-14


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant