CN114971140A - Service data quality evaluation method oriented to data exchange - Google Patents

Service data quality evaluation method oriented to data exchange

Info

Publication number
CN114971140A
CN114971140A CN202210202369.5A CN202210202369A CN114971140A CN 114971140 A CN114971140 A CN 114971140A CN 202210202369 A CN202210202369 A CN 202210202369A CN 114971140 A CN114971140 A CN 114971140A
Authority
CN
China
Prior art keywords
data
quality
service
service data
business
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210202369.5A
Other languages
Chinese (zh)
Other versions
CN114971140B (en
Inventor
贾炜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BEIJING AEROSPACE INTELLIGENCE AND INFORMATION INSTITUTE
Beijing Institute of Computer Technology and Applications
Original Assignee
BEIJING AEROSPACE INTELLIGENCE AND INFORMATION INSTITUTE
Beijing Institute of Computer Technology and Applications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BEIJING AEROSPACE INTELLIGENCE AND INFORMATION INSTITUTE and Beijing Institute of Computer Technology and Applications
Priority to CN202210202369.5A (patent/CN114971140B/en)
Publication of CN114971140A (patent/CN114971140A/en)
Application granted
Publication of CN114971140B (patent/CN114971140B/en)
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06395 Quality analysis or management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Human Resources & Organizations (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Data Mining & Analysis (AREA)
  • Development Economics (AREA)
  • Educational Administration (AREA)
  • Tourism & Hospitality (AREA)
  • General Engineering & Computer Science (AREA)
  • Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Game Theory and Decision Science (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Software Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to a service data quality evaluation method oriented to data exchange, belonging to the field of computer data processing. The method constructs a service data knowledge graph and service data check rules based on the data standard specifications of the data element category and the information classification and coding category in the application system subsystem of the electronic inspection standard framework system. It then evaluates the quality of the service data, evaluates the quality of the exchanged data, and performs a comprehensive quality analysis of the data converged through exchange. The invention offers automatic and objective detection, and automatically checks, records and evaluates data quality after data exchange.

Description

Service data quality evaluation method oriented to data exchange
Technical Field
The invention belongs to the field of computer data processing and relates to a service data quality evaluation method oriented to data exchange, in particular to a service data quality evaluation method oriented to cross-network, cross-level data exchange in a large-scale distributed service application system.
Background
Some service application systems used nationwide are developed centrally but deployed and operated independently in each locality, and their service data is converged to superior organizations through a data exchange platform. This leads to scattered data, numerous data types, and uneven data quality. The technical problem such a system faces in practice is achieving safe and reliable cross-network, cross-level data sharing and exchange between superior and subordinate organizations.
Current approaches to this problem mainly include the following. (1) A data reconciliation platform periodically extracts data from both sides and generates hash value records, then checks (generally after an exchange or during idle periods between exchanges) whether the hash values for the same table match on both sides. This allows inconsistencies between the two sides of the data exchange to be detected in time. With the database tables and fields and the file directories and files of both sides configured in correspondence beforehand, this approach synchronizes structured and unstructured data and eliminates one-sided ledger entries. It is mainly suitable when the databases and file directories remain stable over long periods and the reconciliation configuration of both exchanging sides is consistent. (2) Data is managed on the data-providing side of the data exchange system to ensure data quality, such as integrity and consistency, and database synchronization technology is used on the data-receiving side to achieve reliable data exchange.
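The hash-based reconciliation in approach (1) can be sketched as follows. This is a minimal illustration only, not the implementation of any particular reconciliation platform: the row layout, the choice of SHA-256, and the names `row_hash` and `reconcile` are assumptions made for the example.

```python
import hashlib

def row_hash(row):
    """Hash one record; fields are joined with a separator unlikely to occur in data."""
    joined = "\x1f".join("" if v is None else str(v) for v in row)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()

def reconcile(sender_rows, receiver_rows):
    """Compare the hash sets of the same table on both sides of the exchange."""
    sent = {row_hash(r) for r in sender_rows}
    received = {row_hash(r) for r in receiver_rows}
    return {
        "missing_on_receiver": len(sent - received),
        "unexpected_on_receiver": len(received - sent),
    }

# Hypothetical sample rows: the second record never arrived on the receiving side.
sender = [(1, "case-001", "open"), (2, "case-002", "closed")]
receiver = [(1, "case-001", "open")]
result = reconcile(sender, receiver)
print(result)
```

A check of this kind only reports that the two sides differ. It says nothing about whether the data on either side meets the data standard, which is the gap the invention addresses.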
The prior art has the following disadvantages: the data of business application systems is large in scale and varied in type, and the systems themselves are continually evolving and being optimized. In practice, deployment and configuration differ from place to place, and local data management work is often insufficient. As a result, data quality problems arise after data exchange.
Disclosure of Invention
(I) Technical problem to be solved
The technical problem to be solved by the invention is how to provide a service data quality evaluation method oriented to data exchange, addressing the following situation: the data of a service application system is large in scale and varied in type, and the system is continually evolving and being optimized; in practice, deployment and configuration differ from place to place, and local data management work is often insufficient; as a result, data quality problems arise after data exchange.
(II) Technical solution
To solve this technical problem, the invention provides a service data quality evaluation method oriented to data exchange, comprising the following steps:
Step S1: service data knowledge graph construction
A service data knowledge graph and service data check rules are constructed based on the data standard specifications of the data element category and the information classification and coding category in the application system subsystem of the electronic inspection service standard framework system. The construction of the service data knowledge graph comprises the construction of the service data knowledge graph database and the generation and updating of the graph;
Step S2: quality evaluation of the superior organization's service data
Step S21: according to the service data check rules, evaluate the quality of the database table structure of the superior organization's service data to obtain base table structure quality A;
Step S22: according to the service data check rules, evaluate the quality of the specific service data in the superior organization's service database to obtain service data quality A;
Step S3: quality evaluation of the subordinate organizations' service data
Step S31: according to the service data check rules, evaluate the quality of the database table structure of each subordinate organization's service data to obtain base table structure quality Bn;
Step S32: according to the service data check rules, evaluate the quality of the specific service data in each subordinate organization's service database to obtain service data quality Bn;
Step S4: quality evaluation after subordinate organization data is exchanged to the superior organization
According to the service data check rules, evaluate the quality of the specific service data from each subordinate organization's service database that has been exchanged to the superior organization, to obtain service data quality BEn;
Step S5: comprehensive quality analysis of the exchanged and converged service data of all organizations
In the working network of the superior organization, comprehensively analyze base table structure quality A, service data quality A, base table structure quality Bn, service data quality Bn, and service data quality BEn.
Further, the service data knowledge graph database provides a multi-dimensional, multi-level description of the data elements, classifications and data types of the service data, stores them using the knowledge graph data structure, and manages in a unified way the data elements that are shared by, and those that differ between, different service data.
Further, the generation and updating of the graph extracts the data elements, classifications and data types of the service data from the service application data standard database, the case card database tables of service application system 2.0, and the document directory, constructs the service data knowledge graph, and provides an external access interface to the service data knowledge graph database.
Further, step S1 specifically comprises the following steps:
Step S11: construct the ontology library of the service data knowledge graph
Entity types, including events, people, organizations and business rules, are established for the data structure system of the business, and each entity is constructed; subclasses of people and subclasses of events are further established. The service types and the entity types of each of their subclasses are constructed. On the basis of the entity types, entity relationships and attributes are constructed; the key point is which entity types, entity attributes and numerical attributes the subclasses of each service type contain;
Step S12: construct the service data check rules
Service data check rules are constructed based on the data standard specifications of the data element category and the information classification and coding category in the application system subsystem of the electronic inspection standard framework system;
Step S13: exchange and deployment of the service data knowledge graph
The service data knowledge graph is deployed to the working networks of subordinate organizations through the cross-network, cross-level data exchange platform of all organizations, so that it can be accessed from the environment of the service application system data and is kept synchronized with the service data knowledge graph database of the superior organization.
Further, in step S12, the entity attributes and numerical attributes of each service data entity type are traversed and retrieved in order to check the integrity of the table entries of the actual service database; the relationships between service data entity types are retrieved and reasoned over in order to check the consistency between table entries of the actual service database; and the integrity of each service data item is counted by means of the necessity attribute of each service data entity type, so as to construct a service data integrity index system.
Further, step S21 specifically comprises: comparing the database table structure of the service data against the service data knowledge graph to find deviations, and calculating the integrity of the database table structure of the superior organization's service database to form an integrity report. The integrity report covers whether the relationships and attributes of each entity type in the service data knowledge graph exist in the base table structure of the service database, whether the data item relationships are consistent, and whether the data types are consistent, and forms a consistency list of the data structure. In addition, the integrity of the base table structure of the service database is obtained as the percentage of base table data items of the service database relative to the entity type attributes of the service data knowledge graph. The quality evaluation result formed in this step is called base table structure quality A.
Further, steps S22, S32 and S4 specifically comprise:
First, 100 records are extracted from each table.
Second, the integrity and compliance of each data item of each record are checked, and each data item of each record is scored according to the importance assigned to it in the service data check rules. The score of each data item is obtained by multiplying its integrity score by its importance, giving an evaluation result list for all extracted data records:
data item evaluation score = data item integrity × data item importance
Then, the evaluation scores of the data items of each database table form a one-dimensional array, and the average value of the array is the evaluation average score of that data table. The data item evaluation scores of all tables of the database thus form several one-dimensional arrays, and the evaluation average scores of the data tables form a single one-dimensional array, which gives the quality evaluation result.
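The scoring above can also be written compactly. The symbols are introduced here for illustration only and do not appear in the original text: $c_{t,r,i}$ is the integrity score of data item $i$ in sampled record $r$ of table $t$, $w_{t,i}$ is the importance of that data item in the check rules, $R_t$ is the set of records sampled from table $t$, and $I_t$ is the set of data items of table $t$.

```latex
s_{t,r,i} = c_{t,r,i} \times w_{t,i},
\qquad
\bar{s}_t = \frac{1}{|R_t|\,|I_t|} \sum_{r \in R_t} \sum_{i \in I_t} s_{t,r,i}
```

Under this reading, the quality evaluation result for a database with $T$ tables is the array $(\bar{s}_1, \dots, \bar{s}_T)$ of table averages.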
Further, step S31 specifically comprises: comparing the database table structure of the service data against the service data knowledge graph to find deviations, and calculating the integrity of the database table structure of the subordinate organization's service database to form an integrity report. The integrity report comprehensively evaluates whether the relationships and attributes of each entity type in the service data knowledge graph exist in the base table structure of the service database, whether the data item relationships are consistent, and whether the data types are consistent, and forms a consistency list of the data structure. In addition, the integrity of the base table structure of the service database is obtained as the percentage of base table data items of the service database relative to the entity type attributes of the service data knowledge graph. The quality evaluation result formed in this step is called base table structure quality Bn.
Further, in step S5:
Base table structure quality A and base table structure quality Bn are the quality evaluation results for the database table structures of the superior organization's and the subordinate organizations' business systems. They comprise the integrity of the database table structure relative to the latest service data system, and the deviation of the base table structure quality of each subordinate organization's business system from that of the superior organization's business system. If the deviation is not 0, the version and integrity of the subordinate organization's database must be checked and updated so that data exchange can proceed normally;
Service data quality A and service data quality Bn are the service data quality evaluation results for the business systems of the superior organization and the subordinate organizations. They are used to examine the overall quality of the service data of all organizations and serve as an important basis for data management and data exchange.
Further, in step S5, service data quality BEn is the quality evaluation result for subordinate organization service data after it has been exchanged to the superior organization. Service data quality BEn is compared one-to-one with service data quality Bn, that is, the service data exchanged by each subordinate organization to the superior organization is compared with the service data local to that subordinate organization, in order to evaluate the data exchange quality of the data exchange system.
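The two comparisons in step S5 can be sketched as follows. This is a simplified illustration under stated assumptions: each quality result is reduced to a single number per organization, the organization names and values are invented for the example, and `structure_deviation` and `exchange_loss` are hypothetical helper names.

```python
def structure_deviation(quality_a, quality_bn):
    """Deviation of each subordinate base table structure quality (Bn) from A."""
    return {org: round(q - quality_a, 6) for org, q in quality_bn.items()}

def exchange_loss(data_bn, data_ben):
    """Per-organization drop from local quality (Bn) to post-exchange quality (BEn)."""
    return {org: round(data_bn[org] - data_ben[org], 6) for org in data_bn}

# Illustrative values only.
structure_a = 0.98
structure_bn = {"org_1": 0.98, "org_2": 0.91}
deviation = structure_deviation(structure_a, structure_bn)
# A non-zero deviation means the subordinate database needs a version and integrity check.
needs_update = [org for org, d in deviation.items() if d != 0]

data_bn = {"org_1": 0.87, "org_2": 0.80}
data_ben = {"org_1": 0.87, "org_2": 0.72}
loss = exchange_loss(data_bn, data_ben)
# A positive loss means quality dropped between the local data and the exchanged data.
degraded = [org for org, d in loss.items() if d > 0]
print(needs_update, degraded)
```

Rounding before the comparison is a design choice of this sketch, made to avoid flagging organizations because of floating-point noise.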
(III) Advantageous effects
The invention provides a service data quality evaluation method oriented to data exchange, with the following main technical effects:
1. When the data standard specifications in the application system subsystem of the electronic inspection standard framework system are converted into a structured form using a knowledge graph, the technical characteristics of the graph data structure allow the requirements and constraints on service data in those specifications to be expressed completely and accurately. Compared with structuring them in a relational database, this expresses more comprehensively the attributes of the various service data, the relationships among the data, and the full content of the data standard specifications.
2. With the data standard specifications converted into a knowledge graph (that is, the service data knowledge graph), the table structures, inter-table relationships, primary keys and foreign keys of the databases in different service application systems can be flexibly compared and checked, to find whether those databases correctly express the requirements and constraints on service data in the data standard specifications of the application system subsystem of the electronic inspection service standard framework system. A single, unchanging reference is thereby used to check many varying systems.
3. The database table structure of the service application system deployed at the superior organization is checked using the service data knowledge graph, and the result serves as the reference for the database table structures of the service application systems of all organizations. This is an effective and objective way to detect whether the database version of the service application system deployed at subordinate organizations across the country is synchronized with the database tables of the superior organization, and whether it is deployed correctly. It is also the basis of data exchange: data exchange and data reconciliation can be carried out effectively only when the database table structures are the same. Compared with having data exchange platform maintainers manually investigate data exchange errors and incomplete data to judge whether a problem is caused by inconsistent database table structures, the method offers automatic and objective detection.
4. By checking the data content in the service application system databases of all organizations, both locally and after exchange, the quality distribution of data in the service application systems deployed across the country can be detected, and the local data quality can be objectively compared with that of the service data content exchanged by each organization to the superior organization, so that data quality after data exchange is automatically checked, recorded and evaluated.
Drawings
FIG. 1 is an architectural diagram of the present invention;
FIG. 2 is a flow chart of the service data quality evaluation method oriented to data exchange according to the present invention.
Detailed Description
To make the objects, content and advantages of the present invention clearer, embodiments of the invention are described in detail below with reference to the accompanying drawings and examples.
To solve the above problems, the present invention provides a service data quality evaluation method oriented to data exchange. The architecture of the invention is as follows:
0. Knowledge graph
A knowledge graph is knowledge-based data built on Semantic Web technology. It consists mainly of entity nodes, entity attributes (which are also nodes, but can only be tail nodes, never head nodes), and relationships among entities. The data structure that describes entity types, entity attribute types and entity relationships is called an ontology. Both ontologies and knowledge graphs are described by triples (head, relation, tail), also called (subject, predicate, object), for example: (person, typeOf, Class), (unit, typeOf, Class), (work unit, typeOf, relationship), (name, typeOf, attribute), (mobile phone number, typeOf, attribute). Here relations such as typeOf and has, and entities such as Class, entity and attribute, are reserved words of the ontology and the knowledge graph. Connecting the triples together yields a graph data structure. Because relationships and attributes are directed edges, both ontologies and knowledge graphs are directed graphs.
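As a small illustration of the triple representation, the example triples can be stored and queried as a directed graph. This is a sketch only: the in-memory adjacency list, the helper `tails`, and the two `has` triples are assumptions added for the example, not part of the patent's graph database.

```python
from collections import defaultdict

# Ontology triples taken from the examples in the text: (head, relation, tail).
triples = [
    ("person", "typeOf", "Class"),
    ("unit", "typeOf", "Class"),
    ("work unit", "typeOf", "relationship"),
    ("name", "typeOf", "attribute"),
    ("mobile phone number", "typeOf", "attribute"),
    # Hypothetical extra triples, added only to show a "has" edge.
    ("person", "has", "name"),
    ("person", "has", "mobile phone number"),
]

# Directed graph: head -> list of (relation, tail) edges.
graph = defaultdict(list)
for head, relation, tail in triples:
    graph[head].append((relation, tail))

def tails(head, relation):
    """Return all tail nodes reached from `head` through `relation`."""
    return [t for r, t in graph.get(head, []) if r == relation]

print(tails("person", "has"))
```

Because attributes only ever appear as tails of `has` edges in this sketch, a query starting from an attribute node returns nothing, which mirrors the rule that attributes cannot be head nodes.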
1. Business data knowledge graph management
Service data knowledge graph management comprises the service data knowledge graph database and the graph generation and updating modules.
(1) The service data knowledge graph database provides a multi-dimensional, multi-level description of the data elements, classifications and data types of the service data, stores them using the knowledge graph data structure (triples), and manages in a unified way the data elements that are shared by, and those that differ between, different service data (such as public information and event management).
Compared with describing the data structures, data types and data resources of service data using structures such as tables and trees, the knowledge graph can fully express the formal and semantic requirements of service data and is highly extensible.
(2) Graph generation and updating extracts information such as the data elements, classifications and data types of the service data from resources such as the service application data standard database, the database tables of service application system 2.0, and the document directory, and constructs the service data knowledge graph. It also provides an external access interface to the service data knowledge graph database.
2. Service data quality assessment
Service data quality evaluation assesses the quality of various data, such as the database table structure of the service application system, the structured data in the database, and related service files, and mainly evaluates the integrity, consistency and service relevance of the service data.
Service data quality evaluation is carried out mainly through quality evaluation rules. The rules are built on the service data knowledge graph and make full use of its capabilities, such as precise retrieval of entity and attribute data, rule reasoning, type checking, and searching relationships among entities. The service data is evaluated with respect to the base table structure, the integrity and consistency of structured data, and the integrity and relevance of service files, and the quality problems found are recorded in a problem list.
3. Data exchange quality assessment
In the cross-network, cross-level data sharing and exchange project between the superior organization and the provincial and municipal inspection centers, data exchange quality evaluation assesses the quality of the exchanged data. The evaluation results are converged to the superior organization through the data exchange platform, so that the quality of service data exchange in each region can be analyzed as a whole and corresponding measures taken.
The specific implementation of the invention comprises the following steps:
Step S1: service data knowledge graph construction
A service data knowledge graph, service data check rules and the like are constructed based on the data standard specifications of the data element category and the information classification and coding category in the application system subsystem of the electronic inspection standard framework system. The service data knowledge graph is constructed by means of the service data knowledge graph database and graph generation and updating. The specific construction steps are as follows:
Step S11: construct the ontology library of the service data knowledge graph
Entity types such as events, people, organizations and business rules are constructed for the business data structure system, each entity is constructed, and subclasses of people and subclasses of events are further constructed. The service types and the entity types of each of their subclasses are then constructed.
On the basis of the entity types, entity relationships and attributes are constructed. The emphasis is on which entity types, entity attributes and numerical attributes the subclasses of each service type contain.
Step S12: construct the service data check rules
Service data check rules are constructed based on the data standard specifications of the data element category and the information classification and coding category in the application system subsystem of the electronic inspection standard framework system. This mainly involves: traversing and retrieving the entity attributes and numerical attributes of each service data entity type to check the integrity of the table entries of the actual service database; retrieving and reasoning over the relationships between service data entity types to check the consistency between table entries of the actual service database; and counting the integrity of each service data item by means of the necessity attribute of each service data entity type, so as to construct a service data integrity index system.
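Two of the rule types named in step S12 can be sketched as follows. This is a minimal illustration under assumptions: the entity type `case_entity`, its attribute names, the `required` flag used to model the necessity attribute, and the helper names are all hypothetical, and the consistency check through entity relationships is not shown.

```python
# Hypothetical entity type from the knowledge graph: attribute -> necessity flag.
case_entity = {
    "case_id": {"required": True},
    "handler": {"required": True},
    "remark": {"required": False},
}

def missing_columns(entity, table_columns):
    """Integrity check: attributes of the entity type that have no table column."""
    return [attr for attr in entity if attr not in table_columns]

def required_fill_rate(entity, records):
    """For each required attribute, the share of records with a non-empty value."""
    rates = {}
    for attr, meta in entity.items():
        if not meta["required"]:
            continue
        filled = sum(1 for rec in records if rec.get(attr) not in (None, ""))
        rates[attr] = filled / len(records) if records else 0.0
    return rates

# Hypothetical table columns and records.
columns = ["case_id", "handler"]
records = [
    {"case_id": "A-1", "handler": "Li"},
    {"case_id": "A-2", "handler": ""},
]
print(missing_columns(case_entity, columns))
print(required_fill_rate(case_entity, records))
```

The per-attribute fill rates are one possible building block for the integrity index system mentioned in the text.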
Step S13: exchange and deployment of the service data knowledge graph
The service data knowledge graph is deployed to the working networks of subordinate organizations through the cross-network, cross-level data exchange platform of all organizations, so that it can be accessed from the environment of the service application system data and is kept synchronized with the service data knowledge graph database of the superior organization.
Step S2: quality evaluation of the superior organization's service data
Step S21: according to the service data check rules, evaluate the quality of the database table structure of the superior organization's service data
The service data check rules constructed in step S12 are applied to evaluate the database table structure quality of the superior organization's service database. The database table structure of the service data is compared against the service data knowledge graph to find deviations, and the integrity of the database table structure of the superior organization's service database is calculated to form an integrity report. The integrity report comprehensively evaluates whether the relationships and attributes of each entity type in the service data knowledge graph exist in the base table structure of the service database, whether the data item relationships are consistent, and whether the data types are consistent, and forms a consistency list of the data structure. In addition, the integrity of the base table structure of the service database is obtained as the percentage of base table data items of the service database relative to the entity type attributes of the service data knowledge graph.
The quality evaluation result formed in this step is called base table structure quality A.
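The structure comparison in step S21 might look like the following sketch. It assumes, for illustration, that both the knowledge graph attributes and the table columns are available as name-to-type mappings; the attribute names, type strings and the function `structure_integrity` are hypothetical, and relationship consistency is left out.

```python
def structure_integrity(graph_attributes, table_columns):
    """Compare table columns with knowledge-graph attributes (name -> data type).

    Returns the integrity percentage and a consistency list with one entry per attribute.
    """
    consistency = []
    present = 0
    for name, expected_type in graph_attributes.items():
        actual_type = table_columns.get(name)
        exists = actual_type is not None
        if exists:
            present += 1
        consistency.append({
            "attribute": name,
            "exists": exists,
            "type_consistent": exists and actual_type == expected_type,
        })
    percent = 100.0 * present / len(graph_attributes) if graph_attributes else 0.0
    return percent, consistency

# Hypothetical entity type attributes and actual table columns.
graph_attributes = {"case_id": "varchar", "filed_on": "date",
                    "handler": "varchar", "status": "varchar"}
table_columns = {"case_id": "varchar", "filed_on": "varchar", "handler": "varchar"}

percent, consistency = structure_integrity(graph_attributes, table_columns)
type_mismatches = [c["attribute"] for c in consistency
                   if c["exists"] and not c["type_consistent"]]
print(percent, type_mismatches)
```

Here the percentage counts attributes that exist in the table regardless of type, and type mismatches are reported separately in the consistency list; whether a mismatched type should also lower the percentage is not stated in the text.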
Step S22: Evaluate the quality of the specific business data in the superior organization's business database according to the business data check rules
The business data check rules constructed in step S12 are applied to evaluate the quality of the specific business data in the superior organization's business database. First, 100 records are extracted from each table. Second, the integrity and compliance of each data item of each record are checked, and each data item is scored according to its importance in the business data check rules. The score of a data item is the product of its integrity score (0 or 1) and its importance (one of three values: 0.1, 0.5, or 1). This yields a list of evaluation results for all extracted data records.
Data item evaluation score = data item integrity × data item importance
The data item evaluation scores of each database table form a one-dimensional array (called the data table scoring data set), and the mean of this array is calculated and called the data table evaluation average score. Across the whole database, the data item evaluation scores of all tables form multiple one-dimensional arrays, and the data table evaluation average scores form a single one-dimensional array.
This step produces a quality assessment result called service data quality A.
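The sampling and scoring procedure of step S22 can be sketched as follows. This is an illustrative sketch only: the function names, the record layout (a list of dictionaries), and the rule that an item is "complete" when it is neither `None` nor an empty string are assumptions. The text also calls for compliance checks, which are omitted here because their rules are not given.

```python
import random
from statistics import mean
from typing import Dict, List, Optional

SAMPLE_SIZE = 100  # records extracted per table, as stated in the text


def item_score(value: object, importance: float) -> float:
    """Score = integrity (0 or 1) x importance (0.1, 0.5, or 1)."""
    # Assumed integrity rule: the item is present and not empty.
    integrity = 0 if value is None or value == "" else 1
    return integrity * importance


def table_scores(records: List[dict], importance: Dict[str, float],
                 seed: Optional[int] = None) -> List[float]:
    """Build the one-dimensional score array for one table."""
    rng = random.Random(seed)
    if len(records) > SAMPLE_SIZE:
        sample = rng.sample(records, SAMPLE_SIZE)
    else:
        sample = records
    return [item_score(rec.get(name), weight)
            for rec in sample
            for name, weight in importance.items()]


def table_average(scores: List[float]) -> float:
    """Data table evaluation average score."""
    return mean(scores) if scores else 0.0


# Illustrative data only; field names and weights are made up.
importance = {"case_id": 1.0, "filed_on": 0.5, "remark": 0.1}
records = [
    {"case_id": "A1", "filed_on": "2022-03-03", "remark": ""},
    {"case_id": "A2", "filed_on": None, "remark": "ok"},
]

scores = table_scores(records, importance)
avg = table_average(scores)
database_averages = [avg]  # one entry per table in the database
```

Because the importance weights are less than or equal to 1, the table average is bounded by the mean importance of the table's data items, so averages are only directly comparable between tables with the same item weights.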
Step S3: Quality assessment of subordinate organization business data
The business data quality evaluation method and apparatus used by the superior organization are deployed to the subordinate organizations, where they evaluate the quality of the database table structure and the business data of each subordinate organization's business system.
Step S31: Evaluate the quality of the database table structure of the subordinate organization's business data according to the business data check rules
The business data check rules constructed in step S12 are applied to evaluate the database table structure quality of the subordinate organization's business database. The database table structure of the business data is compared against the business data knowledge graph to find deviations, and the integrity of the table structure of the subordinate organization's business database is calculated to form an integrity report. The integrity report covers whether the relations and attributes of each entity type in the business data knowledge graph exist in the table structure of the business database, whether the data item relations are consistent, whether the data types are consistent, and so on; these are evaluated together to form a consistency list of the data structure. In addition, the integrity of the table structure of the business database is obtained as the percentage of the database's table data items relative to the entity type attributes of the business data knowledge graph.
This step produces a quality assessment result called base table structure quality Bn.
The base table structure quality Bn is aggregated to the superior organization through the data exchange system.
Step S32: Evaluate the quality of the specific business data in the subordinate organization's business database according to the business data check rules
The business data check rules constructed in step S12 are applied to evaluate the quality of the specific business data in the subordinate organization's business database. First, 100 records are extracted from each table. Second, the integrity and compliance of each data item of each record are checked, and each data item is scored according to its importance in the business data check rules. The score of a data item is the product of its integrity score (0 or 1) and its importance (one of three values: 0.1, 0.5, or 1). This yields a list of evaluation results for all extracted data records.
Data item evaluation score = data item integrity × data item importance
The data item evaluation scores of each database table form a one-dimensional array (called the data table evaluation data set), and the mean of this array is calculated and called the data table evaluation average score. Across the whole database, the data item evaluation scores of all tables form multiple one-dimensional arrays, and the data table evaluation average scores form a single one-dimensional array.
This step produces a quality assessment result called service data quality Bn.
The service data quality Bn is aggregated to the superior organization through the data exchange system.
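The text does not specify the format in which the Bn results travel through the data exchange system. As one possible sketch, assuming a JSON payload and a hypothetical `QualityReport` record (both are illustrative assumptions, not part of the described method), a subordinate organization's results could be packaged like this:

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict


@dataclass
class QualityReport:
    """Hypothetical container for one subordinate organization's results."""
    organization_id: str
    structure_integrity: float  # base table structure quality Bn (percent)
    # Service data quality Bn: table name -> data table evaluation average score
    table_averages: Dict[str, float] = field(default_factory=dict)

    def to_json(self) -> str:
        """Serialize the report for transfer (assumed JSON encoding)."""
        return json.dumps(asdict(self), ensure_ascii=False, sort_keys=True)

    @classmethod
    def from_json(cls, payload: str) -> "QualityReport":
        """Rebuild the report on the receiving side."""
        return cls(**json.loads(payload))


# Illustrative values only.
sent = QualityReport("B1", 91.5, {"case": 0.75, "person": 0.5})
received = QualityReport.from_json(sent.to_json())
```

Keeping the per-table averages keyed by table name lets the superior organization later line them up against the scores it computes on the exchanged data.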
Step S4: Quality assessment after subordinate organization data is exchanged to the superior organization
The business data check rules constructed in step S12 are applied to evaluate the quality of the specific business data from the subordinate organizations' business databases after it has been exchanged to the superior organization. First, 100 records are extracted from each table. Second, the integrity and compliance of each data item of each record are checked, and each data item is scored according to its importance in the business data check rules. The score of a data item is the product of its integrity score (0 or 1) and its importance (one of three values: 0.1, 0.5, or 1). This yields a list of evaluation results for all extracted data records.
Data item evaluation score = data item integrity × data item importance
The data item evaluation scores of each database table form a one-dimensional array (called the data table evaluation data set), and the mean of this array is calculated and called the data table evaluation average score. Across the whole database, the data item evaluation scores of all tables form multiple one-dimensional arrays, and the data table evaluation average scores form a single one-dimensional array.
This step produces a quality assessment result called service data quality BEn.
Step S5: Comprehensive quality analysis of the exchanged and aggregated business data of all organizations
The quality assessment results, including base table structure quality A, service data quality A, base table structure quality Bn, service data quality Bn, and service data quality BEn, are comprehensively analyzed in the working network of the superior organization.
Base table structure quality A and base table structure quality Bn are the quality assessment results for the database table structures of the business systems of the superior organization and the subordinate organizations. They cover two main aspects: first, the integrity of the database table structure relative to the latest business data system (embodied and represented by the business data knowledge graph); second, the deviation of the base table structure quality of a subordinate organization's business system from that of the superior organization's business system. If the deviation is not 0, the subordinate organization's database must be checked for version and integrity and updated, so that the data exchange system can operate normally.
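The deviation check on the table structure results can be sketched as follows. The function names and the flat mapping from organization identifier to integrity percentage are assumptions made for illustration; the text only states that a non-zero deviation triggers a version and integrity check of the subordinate database.

```python
from typing import Dict, List


def structure_deviations(quality_a: float,
                         quality_bn: Dict[str, float]) -> Dict[str, float]:
    """Deviation of each subordinate organization's Bn from the superior's A."""
    return {org: round(bn - quality_a, 6) for org, bn in quality_bn.items()}


def needs_update(deviations: Dict[str, float],
                 tolerance: float = 0.0) -> List[str]:
    """Organizations whose deviation is not 0 (beyond an optional tolerance)."""
    return sorted(org for org, dev in deviations.items() if abs(dev) > tolerance)


# Illustrative values only.
quality_a = 98.0
quality_bn = {"B1": 98.0, "B2": 91.5}

deviations = structure_deviations(quality_a, quality_bn)
to_check = needs_update(deviations)
```

Comparing only the two percentages is a coarse test: two structures can reach the same percentage while missing different attributes, so comparing the consistency lists item by item would be a stricter variant of the same check.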
Service data quality A and service data quality Bn are the business data quality assessment results for the business systems of the superior organization and the subordinate organizations. They are mainly used to examine the overall quality of the business data of all organizations and serve as an important basis for data governance and data exchange.
Service data quality BEn is the quality assessment result of the business data that each subordinate organization has exchanged to the superior organization. Comparing service data quality BEn with service data quality Bn one-to-one, that is, comparing the business data each subordinate organization exchanged to the superior organization with that organization's local business data, is an effective way to assess the data exchange quality of the data exchange system.
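The one-to-one comparison of the locally computed scores with the scores computed on the exchanged data can be sketched as follows. The nested mapping (organization, then table, then average score) and the name `exchange_quality` are assumptions made for illustration.

```python
from typing import Dict, Optional

# Hypothetical layout: organization -> table name -> evaluation average score
Scores = Dict[str, Dict[str, float]]


def exchange_quality(local: Scores,
                     exchanged: Scores) -> Dict[str, Dict[str, Optional[float]]]:
    """Per-table difference between exchanged (BEn) and local (Bn) scores.

    A value of None marks a table that was scored locally but has no
    score on the superior organization's side.
    """
    result: Dict[str, Dict[str, Optional[float]]] = {}
    for org, tables in local.items():
        received = exchanged.get(org, {})
        result[org] = {
            table: (received[table] - avg if table in received else None)
            for table, avg in tables.items()
        }
    return result


# Illustrative values only.
local_bn = {"B1": {"case": 0.75, "person": 0.5}}
exchanged_ben = {"B1": {"case": 0.75}}

diff = exchange_quality(local_bn, exchanged_ben)
```

Because each score is computed from a sample of records, a small non-zero difference may reflect sampling rather than loss during exchange unless the same records are sampled on both sides.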
The invention has the following technical effects:
1. When the knowledge graph is used to convert the data standard specifications of the application system sub-system in the electronic inspection service standard framework system into structured data standard specifications, the technical characteristics of the graph data structure allow the requirements and constraints on business data in those specifications to be expressed completely and accurately. Compared with structuring by means of a relational database, this approach expresses more fully the attributes of the various business data, the relations among the data, and the entire content of the data standard specifications.
2. Using the data standard specifications converted into a knowledge graph (that is, the business data knowledge graph), the table structures, inter-table relations, primary keys, foreign keys, and so on of the databases in different business application systems can be flexibly compared and checked, to find out whether each database correctly expresses the requirements and constraints on business data in the data standard specifications of the application system sub-system in the electronic inspection service standard framework system. This achieves the technical effect of handling many varying databases with one unchanging reference.
3. The database table structure of the business application system deployed at the superior organization is checked against the business data knowledge graph, and the result serves as the baseline for the database table structures of the business application systems of all organizations. It is an effective and objective way to detect whether the database versions of the business application systems deployed at subordinate organizations across the country are synchronized with the database tables of the superior organization and whether they are deployed correctly. It is also the foundation of data exchange: data exchange and data reconciliation can be carried out effectively only when the database table structures are the same. Compared with having the maintenance staff of the data exchange platform manually check for exchange errors, incomplete data, and similar problems and judge whether a data exchange problem is caused by inconsistent database table structures, this approach has the technical effects of automated detection and good objectivity.
4. By checking the data content in the business application system databases of all organizations involved in the exchange, the quality distribution of the data in the business application systems deployed at inspection institutions across the country can be detected. This content is then objectively compared with the business data content that those institutions exchanged to the superior organization, achieving the technical effect of automatically checking, recording, and evaluating data quality after data exchange.
The above is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make several modifications and variations without departing from the technical principle of the present invention, and such modifications and variations should also be regarded as falling within the protection scope of the present invention.

Claims (10)

1. A service data quality evaluation method oriented to data exchange is characterized by comprising the following steps:
step S1: business data knowledge graph construction;
constructing a business data knowledge graph and business data check rules based on the data standard specifications of the data element class and the information classification and coding class in the application system sub-system of the electronic inspection service standard framework system, wherein the construction of the business data knowledge graph comprises construction of a business data knowledge graph database and generation and updating of the graph;
step S2: quality assessment of superior organization business data;
step S21: evaluating the quality of the database table structure of the superior organization's business data according to the business data check rules to obtain base table structure quality A;
step S22: evaluating the quality of the specific business data in the superior organization's business database according to the business data check rules to obtain service data quality A;
step S3: quality assessment of subordinate organization business data;
step S31: evaluating the quality of the database table structure of the subordinate organization's business data according to the business data check rules to obtain base table structure quality Bn;
step S32: evaluating the quality of the specific business data in the subordinate organization's business database according to the business data check rules to obtain service data quality Bn;
step S4: quality assessment after subordinate organization data is exchanged to the superior organization;
evaluating, according to the business data check rules, the quality of the specific business data from the subordinate organization's business database after it has been exchanged to the superior organization, to obtain service data quality BEn;
step S5: comprehensive quality analysis of the exchanged and aggregated business data of all organizations;
comprehensively analyzing base table structure quality A, service data quality A, base table structure quality Bn, service data quality Bn, and service data quality BEn in the working network of the superior organization.
2. The data-exchange-oriented service data quality evaluation method according to claim 1, wherein the business data knowledge graph database provides a multi-dimensional, multi-level description of the data elements, classifications, and data types of the business data, stores the data elements using the data structure of the knowledge graph, and uniformly manages identical and different data elements across different business data.
3. The data-exchange-oriented service data quality evaluation method according to claim 1, wherein the graph generation and updating extracts the data elements, classifications, and data types of the business data from the business application data standard database, the database tables of business application system 2.0, and the document directory, constructs the business data knowledge graph, and provides an external access interface to the business data knowledge graph database.
4. The data-exchange-oriented service data quality evaluation method according to any one of claims 1 to 3, wherein step S1 specifically comprises the following steps:
step S11: constructing the ontology base of the business data knowledge graph;
establishing entity types for the data structure system of the business, the entity types comprising events, persons, organizations, and business rules; establishing each entity, and further establishing subclasses of persons and subclasses of events; constructing the business types and the entity types of each subclass of the business types; on the basis of the entity types, constructing entity relations and attributes; the key point being that the subclasses of each business type contain entity types, entity attributes, and numerical attributes;
step S12: constructing the business data check rules;
constructing the business data check rules based on the data standard specifications of the data element class and the information classification and coding class in the application system sub-system of the electronic inspection service standard framework system;
step S13: exchanging and deploying the business data knowledge graph;
deploying the business data knowledge graph to the working networks of the subordinate organizations through the cross-network, cross-level data exchange platform of all organizations, in an environment where the business application system data can be accessed, and keeping it synchronized with the business data knowledge graph database of the superior organization.
5. The data-exchange-oriented service data quality evaluation method according to claim 4, wherein in step S12, the integrity of the table items of the actual business database is checked by traversing and retrieving the entity attributes and numerical attributes of each business data entity type; the consistency among the table items of the actual business database is checked by retrieving and reasoning over the relations between business data entity types; and the integrity of each business data item is counted using the necessity attribute of each business data entity type, so as to construct a business data integrity index system.
6. The data-exchange-oriented service data quality evaluation method according to claim 5, wherein step S21 specifically comprises: comparing the database table structure of the business data against the business data knowledge graph to find deviations, and calculating the integrity of the table structure of the superior organization's business database to form an integrity report; the integrity report covers whether the relations and attributes of each entity type in the business data knowledge graph exist in the table structure of the business database, whether the data item relations are consistent, and whether the data types are consistent, and a consistency list of the data structure is formed; in addition, the integrity of the table structure of the business database is obtained as the percentage of the database's table data items relative to the entity type attributes of the business data knowledge graph; this step produces a quality assessment result called base table structure quality A.
7. The data-exchange-oriented service data quality evaluation method according to claim 5, wherein steps S22, S32, and S4 each specifically comprise:
first, extracting 100 records from each table;
second, checking the integrity and compliance of each data item of each record, and scoring each data item of each record according to its importance in the business data check rules; the score of each data item is obtained by multiplying the integrity score of the data item by the importance of the data item; obtaining a list of evaluation results for all extracted data records;
data item evaluation score = data item integrity × data item importance
then, forming the data item evaluation scores of each database table into a one-dimensional array and calculating the mean of the array, namely the data table evaluation average score; the data item evaluation scores of all tables of the database form multiple one-dimensional arrays, and the data table evaluation average scores form a single one-dimensional array, thereby obtaining the quality assessment result.
8. The data-exchange-oriented service data quality evaluation method according to claim 5, wherein step S31 specifically comprises: comparing the database table structure of the business data against the business data knowledge graph to find deviations, and calculating the integrity of the table structure of the subordinate organization's business database to form an integrity report; the integrity report covers whether the relations and attributes of each entity type in the business data knowledge graph exist in the table structure of the business database, whether the data item relations are consistent, whether the data types are consistent, and so on, which are evaluated together to form a consistency list of the data structure; in addition, the integrity of the table structure of the business database is obtained as the percentage of the database's table data items relative to the entity type attributes of the business data knowledge graph; this step produces a quality assessment result called base table structure quality Bn.
9. The data-exchange-oriented service data quality evaluation method according to any one of claims 5 to 8, wherein in step S5,
base table structure quality A and base table structure quality Bn are the quality assessment results for the database table structures of the business systems of the superior organization and the subordinate organizations, and comprise the integrity of the database table structure relative to the latest business data system and the deviation of the base table structure quality of the subordinate organization's business system from that of the superior organization's business system; if the deviation is not 0, the subordinate organization's database must be checked for version and integrity and updated, so that the data exchange system can operate normally;
service data quality A and service data quality Bn are the business data quality assessment results for the business systems of the superior organization and the subordinate organizations, are used to examine the overall quality of the business data of all organizations, and serve as an important basis for data governance and data exchange.
10. The data-exchange-oriented service data quality evaluation method according to any one of claims 5 to 8, wherein in step S5, service data quality BEn is the quality assessment result of the business data that each subordinate organization has exchanged to the superior organization; service data quality BEn is compared one-to-one with service data quality Bn, that is, the business data each subordinate organization exchanged to the superior organization is compared with that organization's local business data, to assess the data exchange quality of the data exchange system.
CN202210202369.5A 2022-03-03 2022-03-03 Service data quality evaluation method oriented to data exchange Active CN114971140B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210202369.5A CN114971140B (en) 2022-03-03 2022-03-03 Service data quality evaluation method oriented to data exchange

Publications (2)

Publication Number Publication Date
CN114971140A true CN114971140A (en) 2022-08-30
CN114971140B CN114971140B (en) 2023-01-13

Family

ID=82975755

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210202369.5A Active CN114971140B (en) 2022-03-03 2022-03-03 Service data quality evaluation method oriented to data exchange

Country Status (1)

Country Link
CN (1) CN114971140B (en)

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050108631A1 (en) * 2003-09-29 2005-05-19 Amorin Antonio C. Method of conducting data quality analysis
CN106649840A (en) * 2016-12-30 2017-05-10 国网江西省电力公司经济技术研究院 Method suitable for power data quality assessment and rule check
CN109213819A (en) * 2018-08-16 2019-01-15 浪潮软件集团有限公司 Information resource sharing system
CN109815230A (en) * 2018-12-23 2019-05-28 国网浙江省电力有限公司 A kind of full-service data center Data Audit method of knowledge based map
CN109840268A (en) * 2018-12-23 2019-06-04 国网浙江省电力有限公司 A kind of universe data map construction method based on enterprise information model
CN110471995A (en) * 2019-08-14 2019-11-19 中电科新型智慧城市研究院有限公司 A kind of cross-cutting information share-and-exchange data model modeling method
CN110968629A (en) * 2019-11-27 2020-04-07 开普云信息科技股份有限公司 Cross-hierarchy heterogeneous data aggregation-based unified information resource management method and system
CN111159191A (en) * 2019-12-30 2020-05-15 深圳博沃智慧科技有限公司 Data processing method, device and interface
CN111597177A (en) * 2020-05-14 2020-08-28 重庆农村商业银行股份有限公司 Data governance method for improving data quality
CN111859969A (en) * 2020-07-20 2020-10-30 航天科工智慧产业发展有限公司 Data analysis method and device, electronic equipment and storage medium
CN112732924A (en) * 2020-12-04 2021-04-30 国网安徽省电力有限公司 Power grid data asset management system and method based on knowledge graph
CN112966901A (en) * 2021-02-04 2021-06-15 复旦大学 Lineage data quality analysis and verification method for inspection business collaborative flow
CN114037205A (en) * 2021-10-11 2022-02-11 北京市天元网络技术股份有限公司 Metadata quality checking method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant