CN111782825A - Knowledge base construction method and device - Google Patents


Info

Publication number
CN111782825A
Authority
CN
China
Prior art keywords
audit
auditing
information
knowledge
domain
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010842045.9A
Other languages
Chinese (zh)
Inventor
何龙龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alipay Hangzhou Information Technology Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alipay Hangzhou Information Technology Co Ltd
Priority to CN202010842045.9A
Publication of CN111782825A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36: Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367: Ontology
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35: Clustering; Classification
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/279: Recognition of textual entities
    • G06F40/289: Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295: Named entity recognition
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00: Computing arrangements using knowledge-based models
    • G06N5/04: Inference or reasoning models

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The specification provides a knowledge base construction method and device. The method comprises: acquiring audit information and audit requirements, wherein the audit requirements correspond to a plurality of audit domains; grouping the audit information according to a preset grouping rule to generate a plurality of audit data groups; constructing an audit domain knowledge graph corresponding to each audit domain according to the plurality of audit data groups and the audit requirements; and generating an audit knowledge base from the audit domain knowledge graphs. Because the method builds a knowledge graph for each of several audit domains from the audit information and audit requirements and then generates the audit knowledge base from those graphs, the various kinds of audit information can be obtained in a timely and effective manner, which makes it easier to respond promptly with correct and reasonable changes.

Description

Knowledge base construction method and device
Technical Field
The specification relates to the technical field of knowledge graphs, and in particular to a knowledge base construction method and a knowledge base construction device.
Background
As the external regulatory environment becomes increasingly strict and the volume of data such as laws, regulations and penalty information keeps growing, accurately perceiving changes in this information has become a new challenge.
Different business requirements fall under different regulatory fields, and each regulatory field has its own regulatory rules. Changes in laws and regulations and newly added penalty information therefore need to be integrated according to the different requirements. Only by perceiving external information such as laws, regulations and penalty information effectively and accurately can correct and reasonable changes be made in response to external regulation, and internal business be adjusted effectively.
Therefore, an effective method is urgently needed for obtaining external laws, regulations and penalty information in a timely manner and integrating them into the corresponding audit domain.
Disclosure of Invention
In view of this, the embodiments of the present specification provide a knowledge base construction method. The present specification also relates to a knowledge base constructing apparatus, a computing device, and a computer-readable storage medium, which are used to solve the technical problems in the prior art.
According to a first aspect of embodiments of the present specification, there is provided a knowledge base construction method, including:
acquiring audit information and audit requirements, wherein the audit requirements correspond to a plurality of audit domains;
grouping the audit information according to a preset grouping rule to generate a plurality of audit data groups;
constructing an audit domain knowledge graph corresponding to each audit domain according to the plurality of audit data groups and the audit requirements;
and generating an audit knowledge base according to each audit domain knowledge graph.
Optionally, constructing an audit domain knowledge graph corresponding to each audit domain according to the plurality of audit data groups and the audit requirements includes:
determining a target audit data group corresponding to each audit domain;
and constructing the audit domain knowledge graph corresponding to the audit domain according to the target audit data group.
Optionally, constructing the audit domain knowledge graph corresponding to the audit domain according to the target audit data group includes:
determining target audit information in the target audit data group according to the audit domain;
performing knowledge processing on the target audit information to obtain candidate data;
performing knowledge fusion on the candidate data to obtain structured data;
and performing knowledge reasoning on the structured data to obtain the audit domain knowledge graph of the audit domain.
Optionally, performing knowledge processing on the target audit information to obtain candidate data includes:
performing entity extraction, relation extraction and attribute extraction on the target audit information to obtain the candidate entities, the relations and the attribute information of the candidate entities in the target audit information.
Optionally, performing knowledge fusion on the candidate data to obtain structured data includes:
performing entity disambiguation, entity normalization and coreference resolution on the candidate entities to determine target entities;
and determining the corresponding target relations and the attribute information of the target entities according to the target entities.
Optionally, performing knowledge reasoning on the structured data to obtain the audit domain knowledge graph of the audit domain includes:
constructing an ontology of the audit domain according to the target entities and the target relations;
and generating the audit domain knowledge graph of the audit domain according to the attribute information of the target entities and the ontology of the audit domain.
Optionally, acquiring the audit information includes:
acquiring first audit information and second audit information.
Optionally, grouping the audit information according to a preset grouping rule includes:
grouping the audit information according to a preset topic classification.
Optionally, the method further includes:
constructing an audit platform based on the audit knowledge base.
Optionally, the method further includes:
receiving updated audit information, and updating the audit knowledge base according to the updated audit information.
According to a second aspect of embodiments of the present specification, there is provided a knowledge base building apparatus including:
an acquisition module configured to acquire audit information and audit requirements, wherein the audit requirements correspond to a plurality of audit domains;
a grouping module configured to group the audit information according to a preset grouping rule to generate a plurality of audit data groups;
a construction module configured to construct an audit domain knowledge graph corresponding to each audit domain according to the plurality of audit data groups and the audit requirements;
and a generation module configured to generate an audit knowledge base according to each audit domain knowledge graph.
According to a third aspect of embodiments herein, there is provided a computing device comprising:
a memory and a processor;
the memory is configured to store computer-executable instructions, and the processor is configured to execute the computer-executable instructions to perform:
acquiring audit information and audit requirements, wherein the audit requirements correspond to a plurality of audit domains;
grouping the audit information according to a preset grouping rule to generate a plurality of audit data groups;
constructing an audit domain knowledge graph corresponding to each audit domain according to the plurality of audit data groups and the audit requirements;
and generating an audit knowledge base according to each audit domain knowledge graph.
According to a fourth aspect of embodiments herein, there is provided a computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the steps of any of the knowledge base construction methods.
In the knowledge base construction method provided by this specification, audit information and audit requirements are acquired, wherein the audit requirements correspond to a plurality of audit domains; the audit information is grouped according to a preset grouping rule to generate a plurality of audit data groups; an audit domain knowledge graph corresponding to each audit domain is constructed according to the plurality of audit data groups and the audit requirements; and an audit knowledge base is generated according to each audit domain knowledge graph. The method obtains the entities and relationships for constructing the knowledge graphs from the audit information, constructs the knowledge graph for each audit domain according to the audit domains in the audit requirements, and then generates the audit knowledge base. Regulatory changes can thus be observed accurately and effectively and the latest information obtained, so that the user can adjust the rules of its regulated business promptly and efficiently.
Drawings
FIG. 1 is a flow chart of a knowledge base construction method provided by an embodiment of the present specification;
FIG. 2 is a flow chart of a method for constructing an audit domain knowledge graph in a knowledge base construction method provided by an embodiment of the present specification;
FIG. 3 is a process flow diagram of a knowledge base construction method applied to the regulatory compliance field according to an embodiment of the present specification;
FIG. 4 is a schematic structural diagram of a knowledge base building apparatus provided in an embodiment of the present specification;
FIG. 5 is a block diagram of a computing device according to an embodiment of the present specification.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present description. This description may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein, as those skilled in the art will be able to make and use the present disclosure without departing from the spirit and scope of the present disclosure.
The terminology used in the description of the one or more embodiments is for the purpose of describing the particular embodiments only and is not intended to be limiting of the description of the one or more embodiments. As used in one or more embodiments of the present specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of the present specification refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It will be understood that, although the terms first, second, etc. may be used herein in one or more embodiments to describe various information, these information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, a first can also be referred to as a second and, similarly, a second can also be referred to as a first without departing from the scope of one or more embodiments of the present description. The word "if" as used herein may be interpreted as "at … …" or "when … …" or "in response to a determination", depending on the context.
As the regulatory environment grows stricter overall and the volume of data such as laws, regulations and penalty information keeps expanding, accurately perceiving changes in this information is a new challenge. In regulatory compliance checks of a service platform, the question is how to use changes in laws and regulations and newly added penalty information as the reference basis for the platform's compliance checks. The internal business of the service platform can be adjusted effectively only if external information is obtained in a timely and effective manner.
In the present specification, a knowledge base construction method is provided, and the present specification relates to a knowledge base construction apparatus, a computing device, and a computer-readable storage medium, which are described in detail one by one in the following embodiments.
Fig. 1 shows a flowchart of a knowledge base building method provided in an embodiment of the present specification, which specifically includes the following steps:
Step 102: acquire audit information and audit requirements, wherein the audit requirements correspond to a plurality of audit domains.
The audit information is the various kinds of information obtained from the external regulatory environment, that is, regulatory business data related to that environment, such as laws and regulations, regulatory focus areas, regulatory penalty information, expert opinions, regulatory disclosure information and compliance management systems.
The audit requirements are the requirements that apply to the different audit domains, for example audit submission requirements for the audit submission domain, compliance self-check requirements for the compliance self-check domain, and legal risk audit requirements for the legal risk domain.
In practical applications, audit information comes from a wide range of sources: it may be laws and regulations issued by the relevant departments, authoritative statements published by experts in the field, the compliance management systems of enterprises, and so on.
Optionally, acquiring the audit information includes: acquiring first audit information and second audit information.
The first audit information is regulatory business data that has legal effect, such as laws, regulations and penalty information; the second audit information is the remaining regulatory business data, other than the first audit information.
Step 104: group the audit information according to a preset grouping rule to generate a plurality of audit data groups.
After the audit information is obtained, it needs to be preliminarily screened and grouped according to a preset grouping rule. The preset grouping rule may be based on the topic of the audit information, on the region of the publisher of the audit information, on the time at which the audit information was acquired, and so on; this specification does not specifically limit the preset grouping rule.
Once the grouping rule is determined, the audit information is grouped accordingly and divided into a plurality of audit data groups, each of which contains multiple pieces of audit information.
Optionally, grouping the audit information according to a preset grouping rule includes:
grouping the audit information according to a preset topic classification.
In practical applications, the audit information is preferably grouped by preset topic classification. The preset topics may be a marketing topic, a quality topic, a report topic, a compliance self-check topic, a laws and regulations topic, an event topic, and the like.
Grouping the audit information according to a preset grouping rule can be called subgraph division, and the resulting topic groups can be called subgraphs, such as a marketing subgraph, a quality subgraph, a report subgraph, a compliance self-check subgraph, a legal person subgraph, a laws and regulations subgraph and an event subgraph.
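The grouping in step 104 can be sketched as follows. This is a minimal illustration, not the patented implementation: the record fields (`topic`, `region`, `acquired`) and the function name `group_audit_info` are assumptions introduced here, and the grouping rule is modeled as a plain function that returns the group key for a record.

```python
from collections import defaultdict

# Hypothetical audit records; the "topic", "region" and "acquired" fields are
# illustrative assumptions, not part of the specification.
audit_info = [
    {"id": "S1", "topic": "compliance self-check", "region": "CN", "acquired": "2020-05"},
    {"id": "S2", "topic": "laws and regulations", "region": "CN", "acquired": "2020-06"},
    {"id": "S3", "topic": "event", "region": "CN", "acquired": "2020-06"},
    {"id": "S4", "topic": "compliance self-check", "region": "CN", "acquired": "2020-07"},
]


def group_audit_info(records, rule):
    """Split records into audit data groups keyed by the value `rule` returns."""
    groups = defaultdict(list)
    for record in records:
        groups[rule(record)].append(record)
    return dict(groups)


# Preset grouping rule: group by topic (the preferred rule in the text).
subgraphs = group_audit_info(audit_info, rule=lambda r: r["topic"])

# The same function accepts other rules, e.g. grouping by acquisition time.
by_time = group_audit_info(audit_info, rule=lambda r: r["acquired"])
```

Passing the rule as a function keeps the grouping step independent of which rule is chosen, which matches the statement that the preset grouping rule is not specifically limited.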
Step 106: construct an audit domain knowledge graph corresponding to each audit domain according to the plurality of audit data groups and the audit requirements.
Each audit data group contains multiple pieces of audit information, which serve as the data basis for constructing a knowledge graph. According to the different audit requirements, the corresponding data is obtained from the audit information and the corresponding audit domain knowledge graph is constructed. In other words, the audit information in the audit data groups is used to build, according to the audit requirements, a network of data relationships for each audit domain that supports upper-layer data applications.
Optionally, constructing an audit domain knowledge graph corresponding to each audit domain according to the plurality of audit data groups and the audit requirements includes:
determining a target audit data group corresponding to each audit domain;
and constructing the audit domain knowledge graph corresponding to the audit domain according to the target audit data group.
In practical applications, the target audit data groups corresponding to each audit domain are determined first. For example, for the legal risk domain, the corresponding audit data groups are the legal person subgraph, the compliance self-check subgraph and the event subgraph; that is, data related to the legal risk domain can be looked up in these three subgraphs. Determining the target audit data groups for each audit domain means that the target data is searched for within the relevant groups rather than across the whole body of data, which saves query time and improves query efficiency.
After the target audit data groups are determined, an audit domain knowledge graph for the corresponding audit domain is constructed from the audit data in those groups. For example, a legal risk domain knowledge graph is constructed from the legal person subgraph, the compliance self-check subgraph and the event subgraph; a compliance self-check domain knowledge graph is constructed from the compliance self-check subgraph, the laws and regulations subgraph and the event subgraph; a regulatory reporting domain knowledge graph is constructed from the report subgraph and the quality subgraph; and so on.
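The selection of target audit data groups can be sketched as a lookup from audit domain to subgraph names. The three mappings below are the examples given in the text; the dictionary layout, the function name `target_groups` and the placeholder group contents are assumptions made for illustration.

```python
# Mapping taken from the examples in the text; the dictionary layout itself
# is an illustrative assumption.
DOMAIN_TO_SUBGRAPHS = {
    "legal risk": ["legal person", "compliance self-check", "event"],
    "compliance self-check": ["compliance self-check", "laws and regulations", "event"],
    "regulatory reporting": ["report", "quality"],
}


def target_groups(domain, audit_data_groups):
    """Return only the audit data groups that serve the given audit domain."""
    wanted = DOMAIN_TO_SUBGRAPHS[domain]
    return {name: audit_data_groups[name] for name in wanted if name in audit_data_groups}


# Placeholder groups: each subgraph name maps to a list of audit information IDs.
audit_data_groups = {
    "legal person": ["S1", "S2"],
    "compliance self-check": ["S3"],
    "event": ["S4"],
    "report": ["S5"],
    "quality": ["S6"],
}

legal_risk_groups = target_groups("legal risk", audit_data_groups)
```

Restricting later steps to `legal_risk_groups` is what narrows the search space described above: the report and quality subgraphs are never scanned for the legal risk domain.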
Optionally, referring to FIG. 2, which shows a flowchart of a method for constructing an audit domain knowledge graph within the knowledge base construction method, constructing the audit domain knowledge graph corresponding to an audit domain according to the target audit data group includes the following steps 202 to 208:
Step 202: determine target audit information in the target audit data group according to the audit domain.
A target audit data group contains multiple pieces of audit information, and the target audit data groups of different audit domains may overlap. For example, the compliance self-check subgraph can be a target audit data group for the legal risk domain and also for the compliance self-check domain. So although the target audit data group is determined according to the audit domain, some of the audit information in it may be unrelated to the current audit domain. The target audit information in the target audit data group is therefore determined first, according to the audit domain. For example, for the legal risk domain the corresponding target audit data groups are the legal person subgraph, the compliance self-check subgraph and the event subgraph. Suppose the compliance self-check subgraph contains 50 pieces of audit information {S1, S2, S3 … … S50}, of which {S1, S2, S3 … … S30} are related to the legal risk domain. Then {S1, S2, S3 … … S30} is the target audit information in the compliance self-check subgraph as determined by the legal risk domain.
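Step 202 amounts to filtering one data group by relevance to the current audit domain. The sketch below reproduces the 50-item example under one loud assumption: each piece of audit information carries a hypothetical `domains` tag listing the audit domains it is relevant to. The text does not say how relevance is decided, so the tag and the function name `select_target_info` are stand-ins.

```python
# Hypothetical records: each item carries the set of audit domains it is
# relevant to. How relevance is decided is not specified in the text.
compliance_subgraph = [
    {
        "id": f"S{i}",
        "domains": {"legal risk", "compliance self-check"} if i <= 30
        else {"compliance self-check"},
    }
    for i in range(1, 51)
]


def select_target_info(group, domain):
    """Keep only the audit information relevant to the given audit domain."""
    return [item for item in group if domain in item["domains"]]


target_info = select_target_info(compliance_subgraph, "legal risk")
```

With these tags, `target_info` holds the 30 items S1 to S30, while the same subgraph filtered for the compliance self-check domain would keep all 50.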
Step 204: perform knowledge processing on the target audit information to obtain candidate data.
Knowledge processing may also be referred to as information extraction, and is the first step in constructing a knowledge graph. Information extraction is a technique for automatically extracting structured information, such as entities, relationships and entity attributes, from unstructured and semi-structured data. Unstructured data is data whose structure is irregular or incomplete, which has no predefined data model, and which is inconvenient to represent in the two-dimensional logical tables of a database; it includes office documents in all formats, text, pictures, HTML, various reports, images, and audio and video information. Semi-structured data is data that has some structure compared with unstructured data, such as XML documents, JSON documents and encyclopedia entries.
Optionally, performing knowledge processing on the target audit information to obtain candidate data includes:
performing entity extraction, relation extraction and attribute extraction on the target audit information to obtain the candidate entities, the relations and the attribute information of the candidate entities in the target audit information.
Entity extraction, also known as named entity recognition (NER), refers to automatically recognizing named entities in a text dataset. For example, from the text "Group A was founded in 2009, with Wang serving as its director … …", entity extraction yields the entities "Group A" and "Wang".
Entity extraction on the target audit information yields a series of discrete named entities. To obtain semantic information, the associations between entities also need to be extracted from the relevant corpus, so that the entities are linked through relationships into a net-like knowledge structure; this is the task of relation extraction. Continuing the example above, from "Group A was founded in 2009, with Wang serving as its director … …" the relation extracted is that "Wang" is the "director" of "Group A".
The purpose of attribute extraction is to collect the attribute information of a specific entity from different information sources. For a public figure or a well-known enterprise, for example, information such as nickname, birthday, nationality and educational background can be obtained from publicly available online information. Continuing the example, for "Wang", the director of "Group A", attribute information such as being born in 1968, being a native of City C and holding a graduate degree can be found in publicly available online information.
The candidate data is the entities, the relationships and the attribute information of the entities obtained by performing knowledge processing on the target audit information.
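To make the three extraction outputs concrete, here is a toy sketch that pulls entities, one relation and one attribute out of an English paraphrase of the Group A example. It is only an illustration of the shape of the candidate data: the sentence wording, the regular expressions and the function name `extract` are assumptions, and a practical system would normally use trained NER and relation-extraction models rather than hand-written patterns.

```python
import re

# English paraphrase of the example sentence (an assumption on exact wording).
TEXT = "Group A was founded in 2009, with Wang serving as its director."


def extract(text):
    """Toy pattern-based extraction of entities, relations and attributes."""
    entities, relations, attributes = set(), [], {}

    # Entity + attribute: an organization and its founding year.
    founded = re.search(r"(Group [A-Z]) was founded in (\d{4})", text)
    if founded:
        org, year = founded.groups()
        entities.add(org)
        attributes.setdefault(org, {})["founded"] = int(year)

    # Entity + relation: a person holding a role in that organization.
    role = re.search(r"with (\w+) serving as its (\w+)", text)
    if role and founded:
        person, title = role.groups()
        entities.add(person)
        relations.append((person, f"{title}_of", founded.group(1)))

    return {"entities": entities, "relations": relations, "attributes": attributes}


candidates = extract(TEXT)
```

The returned dictionary mirrors the definition of candidate data: a set of candidate entities, a list of (subject, relation, object) triples, and a per-entity attribute map.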
Step 206: perform knowledge fusion on the candidate data to obtain structured data.
Knowledge processing yields entities, relationships and entity attribute information from the original target audit information, but much of the candidate data obtained this way is scattered and repetitive, and some of it is erroneous and introduces noise. The candidate data therefore needs to be fused: wrong information is eliminated, repeated data is normalized, and unclear references are resolved to definite entities.
Optionally, performing knowledge fusion on the candidate data to obtain structured data includes:
performing entity disambiguation, entity normalization and coreference resolution on the candidate entities to determine target entities;
and determining the corresponding target relations and the attribute information of the target entities according to the target entities.
In practical applications, the same word may have different meanings in different contexts, so entity disambiguation is required: its aim is to map the same word to different entities according to context. For example, when "James" appears in a basketball-related context it can be determined to be the NBA star, and when it appears in a movie-related context it can be determined to be the film director.
Similarly, in practical applications two words may correspond to the same entity, such as "Beijing" and "the capital of the country": literally they are two different entities, but they actually refer to the same one. An entity normalization (entity resolution) operation therefore needs to be performed on the candidate entities.
Coreference resolution is also an important step in knowledge fusion. The target audit information usually contains many pronouns such as "he", "it" and "they", and knowledge fusion needs to determine the entity each pronoun refers to. For example, take the sentence "Zhang San did not come to work yesterday because he went to hear Li Si's testimony, but he came to work today." After coreference resolution, each "he" can be determined to refer to "Zhang San" rather than "Li Si".
Through entity disambiguation, entity normalization, coreference resolution and similar operations on the candidate entities, the target entities can be determined. The relationships corresponding to the target entities and the attribute information of the target entities can then be determined on that basis. The entities, relationships and entity attribute information needed to construct the knowledge graph are thus ready, and the structured data is formed.
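Two of the fusion operations, entity normalization and entity disambiguation, can be illustrated with lookup tables. This is a deliberately simplified sketch: the alias table, the keyword sets and the function names are assumptions, and coreference resolution is left out because it generally needs a trained model rather than a table.

```python
# Alias table for entity normalization (illustrative assumption).
ALIASES = {"capital of the country": "Beijing", "Beijing": "Beijing"}

# Context keywords for disambiguation (illustrative assumption).
SENSES = {
    "James": [
        ("James (NBA player)", {"basketball", "NBA"}),
        ("James (film director)", {"movie", "film"}),
    ]
}


def normalize(mention):
    """Map different surface forms of one entity to a single canonical name."""
    return ALIASES.get(mention, mention)


def disambiguate(mention, context):
    """Pick the sense whose keywords overlap most with the context words."""
    senses = SENSES.get(mention)
    if not senses:
        return mention
    words = set(context.replace(",", " ").replace(".", " ").split())
    best, _ = max(senses, key=lambda sense: len(sense[1] & words))
    return best
```

For instance, `disambiguate("James", "James scored 30 points in the basketball game")` selects the NBA sense, while a context containing "movie" selects the director sense. When no keyword matches, this sketch falls back to the first listed sense, which a real system would want to treat as "unknown" instead.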
Step 208: perform knowledge reasoning on the structured data to obtain the audit domain knowledge graph of the audit domain.
The operations above yield a series of basic factual statements. However, facts are not the same as knowledge. To finally obtain a knowledge graph, knowledge reasoning needs to be performed on the data to obtain the associations between entities, the causal relationships between events, and so on.
Optionally, performing knowledge reasoning on the structured data to obtain the audit domain knowledge graph of the audit domain includes:
constructing an ontology of the audit domain according to the target entities and the target relations;
and generating the audit domain knowledge graph of the audit domain according to the attribute information of the target entities and the ontology of the audit domain.
The ontology (schema) refers to a set of concepts and a conceptual framework. It can be constructed manually by editing, or automatically in a data-driven manner.
After the entities are obtained, they are converted into vectors and the similarity between entities is calculated; entities whose similarity is greater than a threshold are determined to be similar entities. The entities are then classified into upper and lower levels according to their respective target relations, and the corresponding ontology is constructed. Finally, each entity is associated with its entity attribute information, and the knowledge graph of the corresponding domain is generated.
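The similarity step can be sketched with cosine similarity over entity vectors. The text does not name a similarity measure or an embedding method, so cosine similarity, the hand-made three-dimensional vectors, the entity names and the threshold of 0.9 are all assumptions chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similar_pairs(vectors, threshold):
    """Return entity pairs whose similarity is greater than the threshold."""
    names = sorted(vectors)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if cosine(vectors[a], vectors[b]) > threshold
    ]


# Hand-made 3-dimensional vectors; a real system would use learned embeddings.
vectors = {
    "penalty notice": [0.9, 0.1, 0.0],
    "fine decision": [0.8, 0.2, 0.1],
    "quarterly report": [0.0, 0.1, 0.9],
}
pairs = similar_pairs(vectors, threshold=0.9)
```

Here only "fine decision" and "penalty notice" exceed the threshold, so they would be treated as similar entities and placed under a common concept when the ontology is built.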
It should be noted that, in practical applications, most of the relationships in a generated knowledge graph are still incomplete, and the relationships between different entities can be further completed by inference from the existing ones. For example, if A and B are father and child, B is the director of Group C, and Group C is located in City D, it can be inferred that A lives in City D. In this way the missing relationships in the knowledge graph can be filled in to construct a complete knowledge graph.
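The completion example can be written as a single hand-coded rule over (subject, relation, object) triples. The relation names and the function name `infer_residence` are hypothetical, and the rule is a heuristic of the kind described in the text rather than a logically certain deduction.

```python
def infer_residence(triples):
    """Apply one hand-written rule and return the newly inferred triples.

    Rule (from the example in the text):
    father_of(a, b), director_of(b, c), located_in(c, d) => lives_in(a, d)
    """
    facts = set(triples)
    inferred = set()
    for a, r1, b in facts:
        if r1 != "father_of":
            continue
        for b2, r2, c in facts:
            if r2 != "director_of" or b2 != b:
                continue
            for c2, r3, d in facts:
                if r3 == "located_in" and c2 == c:
                    inferred.add((a, "lives_in", d))
    # Report only facts that were not already in the graph.
    return inferred - facts


graph = {
    ("A", "father_of", "B"),
    ("B", "director_of", "Group C"),
    ("Group C", "located_in", "City D"),
}
new_facts = infer_residence(graph)
```

Adding `new_facts` back into `graph` is what fills in the missing relationship. A production system would more likely use a rule engine or a learned link-prediction model than nested loops.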
In this specification, an audit domain knowledge graph is constructed for each audit domain according to the plurality of audit data groups and the audit requirements, for example a regulatory submission domain knowledge graph, a regulatory declaration domain knowledge graph, a compliance self-check domain knowledge graph, a legal risk domain knowledge graph and a regulatory situation domain knowledge graph.
Step 108: generate an audit knowledge base according to each audit domain knowledge graph.
The audit knowledge base combines the audit domain knowledge graphs of the different domains, so that changes in audit information can be observed quickly, accurately and effectively, and business content can be adjusted promptly and efficiently in response.
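One simple way to picture step 108 is a container that keeps each audit domain knowledge graph as a set of triples and lets a query run across all of them. The storage layout, the function names and the sample triples (including "Regulation X") are invented for illustration; the text only says the knowledge base may take the form of a graph database.

```python
def build_knowledge_base(domain_graphs):
    """Combine per-domain triple collections into one knowledge base keyed by domain."""
    return {domain: set(triples) for domain, triples in domain_graphs.items()}


def query(knowledge_base, entity):
    """Return (domain, triple) pairs that mention the entity, across all domains."""
    return sorted(
        (domain, triple)
        for domain, triples in knowledge_base.items()
        for triple in triples
        if entity in (triple[0], triple[2])
    )


# Invented sample graphs, one per audit domain.
domain_graphs = {
    "legal risk": [("Group C", "penalized_under", "Regulation X")],
    "compliance self-check": [("Regulation X", "requires", "annual self-check")],
}
kb = build_knowledge_base(domain_graphs)
hits = query(kb, "Regulation X")
```

A single query for "Regulation X" returns facts from both domains, which is the benefit of combining the domain graphs into one knowledge base.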
Optionally, after generating the audit knowledge base, the method further includes:
constructing an audit platform based on the audit knowledge base.
In practical applications, the audit knowledge base is relatively abstract and may take the form of a graph database, which a user without a computing background cannot use effectively. Therefore, after the audit knowledge base is generated, an audit platform can be constructed on top of it to present the graph database in a visual and concrete form, so that the user can query the content of the audit knowledge base quickly and directly through a graphical interface. The audit platform also stores the audit information, so the user can both use the audit knowledge base and query the content of the audit information through the platform, which makes the audit knowledge base convenient to use and improves the user experience.
Optionally, the method further includes:
receiving updated audit information, and updating the audit knowledge base according to the updated audit information.
The updated audit information refers to new audit information, such as new laws and regulations or new penalty information. In practical applications, the audit knowledge base is not fixed once it has been generated; it needs to be updated whenever updated audit information arrives, so that the new audit information becomes available in the knowledge base in time.
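The update step can be sketched as an incremental merge. The sketch assumes, hypothetically, that the updated audit information has already been turned into triples per audit field; the function name `update_knowledge_base` and the sample data are illustrative only.

```python
def update_knowledge_base(knowledge_base, updates):
    """Merge new triples into the knowledge base in place.

    Both arguments map a field name to a set of (head, relation, tail) triples.
    Returns, per field, only the triples that were not already present.
    """
    added = {}
    for field, triples in updates.items():
        graph = knowledge_base.setdefault(field, set())
        new_triples = set(triples) - graph
        graph |= new_triples
        if new_triples:
            added[field] = new_triples
    return added


# Hypothetical existing knowledge base and incoming update.
knowledge_base = {"legal risk": {("Company X", "penalized_under", "Rule 1")}}
updates = {
    "legal risk": {
        ("Company X", "penalized_under", "Rule 1"),  # already known
        ("Company Y", "penalized_under", "Rule 2"),  # new penalty information
    }
}
added = update_knowledge_base(knowledge_base, updates)
print(added)
```

Returning only the newly added triples gives a simple way to see what changed after each update.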
The knowledge base construction method provided by the specification acquires audit information and audit requirements, wherein the audit requirements correspond to a plurality of audit fields; groups the audit information according to a preset grouping rule to generate a plurality of audit data groups; constructs an audit domain knowledge graph corresponding to each audit field according to the plurality of audit data groups and the audit requirements; and generates an audit knowledge base according to each audit domain knowledge graph. With this method, the entities and relations used to construct the knowledge graphs are obtained from the audit information, a knowledge graph is constructed for each audit field named in the audit requirements, and the audit knowledge base is then generated, so that regulatory changes can be observed accurately and effectively, the latest information can be obtained, and the rules of the user's regulated business can be adjusted in a timely and efficient manner.
The knowledge base construction method provided in the present specification is further described below with reference to fig. 3, taking an application of the knowledge base construction method in the regulatory compliance field as an example. Fig. 3 shows a processing flow chart of a knowledge base construction method applied to the regulatory compliance field provided in an embodiment of the present specification, and specifically includes the following steps:
Step 302: acquiring audit information, such as laws and regulations and penalty information, and audit requirements for the supervision submission field, the compliance self-check field and the legal risk field.
In the embodiment provided in the present specification, laws and regulations and penalty information related to the regulatory compliance field are acquired, together with the audit requirements of the supervision submission field, the compliance self-check field and the legal risk field.
Step 304: dividing the laws and regulations and the penalty information into subgraphs by topic, namely a report subgraph, a compliance self-check subgraph, a legal person subgraph, a laws and regulations subgraph and an event subgraph.
In the embodiment provided by the specification, the acquired laws and regulations and penalty information are divided into subgraphs according to a topic classification, yielding a report subgraph, a compliance self-check subgraph, a legal person subgraph, a laws and regulations subgraph and an event subgraph.
Step 306: determining the target subgraphs corresponding to each audit field.
In the embodiment provided by the specification, the subgraphs corresponding to the supervision submission field are determined to be the report subgraph, the laws and regulations subgraph and the event subgraph; the subgraphs corresponding to the compliance self-check field are the compliance self-check subgraph, the laws and regulations subgraph and the event subgraph; and the subgraphs corresponding to the legal risk field are the legal person subgraph, the laws and regulations subgraph and the event subgraph.
Step 308: determining target information in the target subgraphs according to each audit field.
In the embodiment provided by the specification, a first target information set corresponding to the supervision submission field is determined in the report subgraph, the laws and regulations subgraph and the event subgraph; a second target information set corresponding to the compliance self-check field is determined in the compliance self-check subgraph, the laws and regulations subgraph and the event subgraph; and a third target information set corresponding to the legal risk field is determined in the legal person subgraph, the laws and regulations subgraph and the event subgraph.
Step 310: performing knowledge processing on the target audit information to obtain candidate data for each audit field.
In the embodiment provided by the present specification, knowledge processing is performed on the first target information set to obtain a first candidate data set for the supervision submission field, on the second target information set to obtain a second candidate data set for the compliance self-check field, and on the third target information set to obtain a third candidate data set for the legal risk field.
Step 312: performing knowledge fusion on the candidate data of each audit field to obtain the structured data corresponding to each audit field.
In the embodiment provided in this specification, knowledge fusion is performed on the first candidate data set to obtain supervision submission domain structured data, on the second candidate data set to obtain compliance self-check domain structured data, and on the third candidate data set to obtain legal risk domain structured data.
Step 314: performing knowledge reasoning on the structured data corresponding to each audit field to obtain the audit domain knowledge graph corresponding to each audit field.
In the embodiment provided by the specification, knowledge reasoning is performed on the supervision submission domain structured data to obtain a supervision submission domain knowledge graph; on the compliance self-check domain structured data to obtain a compliance self-check domain knowledge graph; and on the legal risk domain structured data to obtain a legal risk domain knowledge graph.
Step 316: constructing an audit knowledge base according to the audit domain knowledge graph of each audit field.
In the embodiment provided herein, the audit knowledge base is constructed from the supervision submission domain knowledge graph, the compliance self-check domain knowledge graph and the legal risk domain knowledge graph.
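Steps 306 and 308 of the flow above amount to a lookup from audit field to subgraphs followed by a collection of records. The sketch below illustrates this under stated assumptions: the field-to-subgraph mapping follows the embodiment, but the record format, the names `FIELD_SUBGRAPHS` and `target_information`, and the sample records are hypothetical. The specification does not say how target information is selected inside a subgraph, so the sketch simply takes every record.

```python
# Mapping from audit field to its target subgraphs, as listed in the embodiment.
FIELD_SUBGRAPHS = {
    "supervision submission": ["report", "laws and regulations", "event"],
    "compliance self-check": ["compliance self-check", "laws and regulations", "event"],
    "legal risk": ["legal person", "laws and regulations", "event"],
}


def target_information(subgraphs, field):
    """Collect the records of every subgraph mapped to the given audit field."""
    records = []
    for name in FIELD_SUBGRAPHS[field]:
        records.extend(subgraphs.get(name, []))
    return records


# Hypothetical records produced by the topic-based division of step 304.
subgraphs = {
    "report": ["quarterly report template"],
    "legal person": ["Company X registration record"],
    "laws and regulations": ["Rule 1 full text"],
    "event": ["penalty issued to Company X"],
}
first_set = target_information(subgraphs, "supervision submission")
third_set = target_information(subgraphs, "legal risk")
print(first_set)
print(third_set)
```

Each resulting set would then pass through knowledge processing, knowledge fusion and knowledge reasoning (steps 310 to 314) before the per-field graphs are combined in step 316.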
The knowledge base construction method provided by the specification acquires audit information and audit requirements, wherein the audit requirements correspond to a plurality of audit fields; groups the audit information according to a preset grouping rule to generate a plurality of audit data groups; constructs an audit domain knowledge graph corresponding to each audit field according to the plurality of audit data groups and the audit requirements; and generates an audit knowledge base according to each audit domain knowledge graph. With this method, the entities and relations used to construct the knowledge graphs are obtained from the audit information, a knowledge graph is constructed for each audit field named in the audit requirements, and the audit knowledge base is then generated, so that regulatory changes can be observed accurately and effectively, the latest information can be obtained, and the rules of the user's regulated business can be adjusted in a timely and efficient manner.
Corresponding to the above method embodiment, the present specification further provides an embodiment of a knowledge base building apparatus, and fig. 4 shows a schematic structural diagram of a knowledge base building apparatus provided in an embodiment of the present specification. As shown in fig. 4, the apparatus includes:
an obtaining module 402, configured to obtain audit information and audit requirements, where the audit requirements correspond to multiple audit fields;
a grouping module 404 configured to group the audit information according to a preset grouping rule to generate a plurality of audit data groups;
a construction module 406 configured to construct an audit domain knowledge graph corresponding to each audit field according to the plurality of audit data groups and the audit requirements;
a generation module 408 configured to generate an audit knowledge base from each audit domain knowledge graph.
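The four modules listed above can be pictured as pluggable stages of one pipeline. The following sketch is an assumption-laden illustration rather than the apparatus itself: the class name `KnowledgeBaseBuilder`, the callable signatures and the toy stage functions are all hypothetical, and the real construction module would perform knowledge processing, fusion and reasoning instead of the placeholder shown.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set, Tuple

Triple = Tuple[str, str, str]


@dataclass
class KnowledgeBaseBuilder:
    obtain: Callable[[], Tuple[List[dict], List[str]]]               # obtaining module 402
    group: Callable[[List[dict]], Dict[str, List[dict]]]             # grouping module 404
    construct: Callable[[Dict[str, List[dict]], str], Set[Triple]]   # construction module 406

    def generate(self) -> Dict[str, Set[Triple]]:
        """Generation module 408: one knowledge graph per audit field."""
        audit_info, audit_fields = self.obtain()
        groups = self.group(audit_info)
        return {field: self.construct(groups, field) for field in audit_fields}


def obtain():
    # Hypothetical audit information and a single audit field.
    info = [
        {"topic": "report", "text": "quarterly filing rule"},
        {"topic": "event", "text": "penalty issued"},
    ]
    return info, ["supervision submission"]


def group(info):
    # Group by a preset topic classification.
    groups = {}
    for item in info:
        groups.setdefault(item["topic"], []).append(item)
    return groups


def construct(groups, field):
    # Placeholder: emit one triple per record instead of real extraction.
    return {(field, "mentions", item["text"]) for items in groups.values() for item in items}


builder = KnowledgeBaseBuilder(obtain, group, construct)
knowledge_base = builder.generate()
print(knowledge_base)
```

Keeping each module behind a callable makes it straightforward to swap in the optional units described next without changing the overall flow.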
Optionally, the construction module 406 includes:
a determining unit configured to determine a target audit data group corresponding to each audit field;
a construction unit configured to construct an audit domain knowledge graph corresponding to the audit field according to the target audit data group.
Optionally, the construction unit includes:
a determining subunit configured to determine target audit information in the target audit data group according to the audit field;
a knowledge processing subunit configured to perform knowledge processing on the target audit information to obtain candidate data;
a knowledge fusion subunit configured to perform knowledge fusion on the candidate data to obtain structured data;
a knowledge reasoning subunit configured to perform knowledge reasoning on the structured data to obtain an audit domain knowledge graph of the audit field.
Optionally, the knowledge processing subunit is further configured to:
perform entity extraction, relation extraction and attribute extraction on the target audit information to obtain the candidate entities, the relations and the attribute information of the candidate entities in the target audit information.
Optionally, the knowledge fusion subunit is further configured to:
perform entity disambiguation, entity normalization and coreference resolution on the candidate entities to determine target entities;
determine the corresponding target relations and the attribute information of the target entities according to the target entities.
Optionally, the knowledge reasoning subunit is further configured to:
construct an ontology of the audit field according to the target entities and the target relations;
generate an audit domain knowledge graph of the audit field according to the attribute information of the target entities and the ontology of the audit field.
Optionally, the obtaining module 402 is further configured to:
acquire first audit information and second audit information.
Optionally, the grouping module 404 is further configured to:
group the audit information according to a preset topic classification.
Optionally, the apparatus further comprises:
a platform construction module configured to construct an audit platform based on the audit knowledge base.
Optionally, the apparatus further comprises:
an updating module configured to receive updated audit information and update the audit knowledge base according to the updated audit information.
The knowledge base building apparatus provided by the specification acquires audit information and audit requirements, wherein the audit requirements correspond to a plurality of audit fields; groups the audit information according to a preset grouping rule to generate a plurality of audit data groups; constructs an audit domain knowledge graph corresponding to each audit field according to the plurality of audit data groups and the audit requirements; and generates an audit knowledge base according to each audit domain knowledge graph. With this apparatus, the entities and relations used to construct the knowledge graphs are obtained from the audit information, a knowledge graph is constructed for each audit field named in the audit requirements, and the audit knowledge base is then generated, so that regulatory changes can be observed accurately and effectively, the latest information can be obtained, and the rules of the user's regulated business can be adjusted in a timely and efficient manner.
The above is an exemplary scheme of the knowledge base building apparatus of the present embodiment. It should be noted that the technical solution of the knowledge base building apparatus and the technical solution of the knowledge base construction method belong to the same concept; for details not described in the technical solution of the knowledge base building apparatus, reference may be made to the description of the technical solution of the knowledge base construction method.
Fig. 5 illustrates a block diagram of a computing device 500 provided according to an embodiment of the present description. The components of the computing device 500 include, but are not limited to, a memory 510 and a processor 520. Processor 520 is coupled to memory 510 via bus 530, and database 550 is used to store data.
Computing device 500 also includes access device 540, which enables computing device 500 to communicate via one or more networks 560. Examples of such networks include the Public Switched Telephone Network (PSTN), a Local Area Network (LAN), a Wide Area Network (WAN), a Personal Area Network (PAN), or a combination of communication networks such as the Internet. The access device 540 may include one or more of any type of network interface, wired or wireless, e.g., a Network Interface Card (NIC), an IEEE 802.11 Wireless Local Area Network (WLAN) wireless interface, a Worldwide Interoperability for Microwave Access (WiMAX) interface, an Ethernet interface, a Universal Serial Bus (USB) interface, a cellular network interface, a Bluetooth interface, a Near Field Communication (NFC) interface, and so forth.
In one embodiment of the present description, the above-described components of computing device 500, as well as other components not shown in FIG. 5, may also be connected to each other, such as by a bus. It should be understood that the block diagram of the computing device architecture shown in FIG. 5 is for purposes of example only and is not limiting as to the scope of the present description. Those skilled in the art may add or replace other components as desired.
Computing device 500 may be any type of stationary or mobile computing device, including a mobile computer or mobile computing device (e.g., tablet, personal digital assistant, laptop, notebook, netbook, etc.), mobile phone (e.g., smartphone), wearable computing device (e.g., smartwatch, smartglasses, etc.), or other type of mobile device, or a stationary computing device such as a desktop computer or PC. Computing device 500 may also be a mobile or stationary server.
Wherein processor 520 is configured to execute the following computer-executable instructions:
acquiring audit information and audit requirements, wherein the audit requirements correspond to a plurality of audit fields;
grouping the audit information according to a preset grouping rule to generate a plurality of audit data groups;
establishing an auditing domain knowledge graph corresponding to each auditing domain according to a plurality of auditing data groups and the auditing requirement;
and generating an audit knowledge base according to each audit domain knowledge graph.
The above is an illustrative scheme of the computing device of the present embodiment. It should be noted that the technical solution of the computing device and the technical solution of the knowledge base construction method belong to the same concept; for details not described in the technical solution of the computing device, reference may be made to the description of the technical solution of the knowledge base construction method.
An embodiment of the present specification also provides a computer readable storage medium storing computer instructions that, when executed by a processor, are operable to:
acquiring audit information and audit requirements, wherein the audit requirements correspond to a plurality of audit fields;
grouping the audit information according to a preset grouping rule to generate a plurality of audit data groups;
establishing an auditing domain knowledge graph corresponding to each auditing domain according to a plurality of auditing data groups and the auditing requirement;
and generating an audit knowledge base according to each audit domain knowledge graph.
The above is an illustrative scheme of the computer-readable storage medium of the present embodiment. It should be noted that the technical solution of the storage medium and the technical solution of the above knowledge base construction method belong to the same concept; for details not described in the technical solution of the storage medium, reference may be made to the description of the technical solution of the above knowledge base construction method.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
The computer instructions comprise computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content of the computer-readable medium may be increased or decreased as appropriate according to the requirements of legislation and patent practice in a jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunications signals.
It should be noted that, for simplicity of description, the foregoing method embodiments are described as a series of combined acts, but those skilled in the art should understand that the present specification is not limited by the described order of acts, because some steps may be performed in other orders or simultaneously according to the present specification. Further, those skilled in the art should also appreciate that the embodiments described in this specification are preferred embodiments, and that the acts and modules involved are not necessarily required by this specification.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
The preferred embodiments of the present specification disclosed above are intended only to aid in the description of the specification. The alternative embodiments do not describe every detail exhaustively, nor do they limit the invention to the specific embodiments described. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the specification and its practical application, and thereby to enable others skilled in the art to understand and use the specification. The specification is limited only by the claims and their full scope and equivalents.

Claims (13)

1. A knowledge base construction method, comprising:
acquiring audit information and audit requirements, wherein the audit requirements correspond to a plurality of audit fields;
grouping the audit information according to a preset grouping rule to generate a plurality of audit data groups;
constructing an audit domain knowledge graph corresponding to each audit field according to the plurality of audit data groups and the audit requirements;
and generating an audit knowledge base according to each audit domain knowledge graph.
2. The knowledge base construction method of claim 1, wherein constructing an audit domain knowledge graph corresponding to each audit field according to the plurality of audit data groups and the audit requirements comprises:
determining a target audit data group corresponding to each audit field;
and constructing an audit domain knowledge graph corresponding to the audit field according to the target audit data group.
3. The knowledge base construction method of claim 2, wherein constructing an audit domain knowledge graph corresponding to the audit field according to the target audit data group comprises:
determining target audit information in the target audit data group according to the audit field;
performing knowledge processing on the target audit information to obtain candidate data;
performing knowledge fusion on the candidate data to obtain structured data;
and performing knowledge reasoning on the structured data to obtain the audit domain knowledge graph of the audit field.
4. The knowledge base construction method of claim 3, wherein performing knowledge processing on the target audit information to obtain candidate data comprises:
performing entity extraction, relation extraction and attribute extraction on the target audit information to obtain the candidate entities, the relations and the attribute information of the candidate entities in the target audit information.
5. The knowledge base construction method of claim 4, wherein performing knowledge fusion on the candidate data to obtain structured data comprises:
performing entity disambiguation, entity normalization and coreference resolution on the candidate entities to determine target entities;
and determining the corresponding target relations and the attribute information of the target entities according to the target entities.
6. The knowledge base construction method of claim 5, wherein performing knowledge reasoning on the structured data to obtain the audit domain knowledge graph of the audit field comprises:
constructing an ontology of the audit field according to the target entities and the target relations;
and generating the audit domain knowledge graph of the audit field according to the attribute information of the target entities and the ontology of the audit field.
7. The knowledge base construction method of claim 1, wherein acquiring audit information comprises:
acquiring first audit information and second audit information.
8. The knowledge base construction method of claim 1, wherein grouping the audit information according to a preset grouping rule comprises:
grouping the audit information according to a preset topic classification.
9. The knowledge base construction method of claim 1, further comprising:
constructing an audit platform based on the audit knowledge base.
10. The knowledge base construction method of claim 1, further comprising:
receiving updated audit information, and updating the audit knowledge base according to the updated audit information.
11. A knowledge base building apparatus, comprising:
an acquisition module configured to acquire audit information and audit requirements, wherein the audit requirements correspond to a plurality of audit fields;
a grouping module configured to group the audit information according to a preset grouping rule to generate a plurality of audit data groups;
a construction module configured to construct an audit domain knowledge graph corresponding to each audit field according to the plurality of audit data groups and the audit requirements;
a generation module configured to generate an audit knowledge base according to each audit domain knowledge graph.
12. A computing device, comprising:
a memory and a processor;
the memory is configured to store computer-executable instructions, and the processor is configured to execute the computer-executable instructions to implement the method of:
acquiring audit information and audit requirements, wherein the audit requirements correspond to a plurality of audit fields;
grouping the audit information according to a preset grouping rule to generate a plurality of audit data groups;
establishing an auditing domain knowledge graph corresponding to each auditing domain according to a plurality of auditing data groups and the auditing requirement;
and generating an audit knowledge base according to each audit domain knowledge graph.
13. A computer readable storage medium storing computer instructions which, when executed by a processor, carry out the steps of the method of knowledge base construction according to any one of claims 1 to 10.
CN202010842045.9A 2020-08-20 2020-08-20 Knowledge base construction method and device Pending CN111782825A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010842045.9A CN111782825A (en) 2020-08-20 2020-08-20 Knowledge base construction method and device

Publications (1)

Publication Number Publication Date
CN111782825A true CN111782825A (en) 2020-10-16

Family

ID=72762357

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010842045.9A Pending CN111782825A (en) 2020-08-20 2020-08-20 Knowledge base construction method and device

Country Status (1)

Country Link
CN (1) CN111782825A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination