CN111782736B - Data classification management method and system - Google Patents

Info

Publication number
CN111782736B
Authority
CN
China
Prior art keywords
label
prefix
predefined
labels
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010696437.9A
Other languages
Chinese (zh)
Other versions
CN111782736A (en)
Inventor
郑敏
吴呈良
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chaozhou Zhuoshu Big Data Industry Development Co Ltd
Original Assignee
Chaozhou Zhuoshu Big Data Industry Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chaozhou Zhuoshu Big Data Industry Development Co Ltd
Priority to CN202010696437.9A
Publication of CN111782736A
Application granted
Publication of CN111782736B
Active (current legal status)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/248 Presentation of query results

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the field of data management, and in particular provides a data classification management method and system. To label unlabeled tables, the method checks whether an unlabeled table exists. If one exists, it checks whether predefined labeling should be attempted: if so, the table is labeled automatically; if not, it is labeled manually. If no unlabeled table exists, the labels or label classifications are refined as needed and the tables are displayed. Compared with the prior art, the invention reduces the manual effort required for data management through a degree of automatic labeling, lets users view existing data by category along multiple dimensions through a three-level table / label / label-classification structure, and continuously improves existing data management through a warning-feedback model.

Description

Data classification management method and system
Technical Field
The invention relates to the field of data management, and particularly provides a method and a system for data classification management.
Background
With the development of computer science and information science, enterprises are paying increasing attention to building information systems. As these systems mature, they generate massive amounts of data in daily operation.
Data generated by different information systems may differ in organization and structure, and the systems may even produce temporary tables or tables whose meaning is ambiguous. Lacking suitable management tools, enterprises often find it difficult to use their data effectively, and data that has in fact been abandoned may occupy system resources for a long time because it was never labeled.
Disclosure of Invention
In view of the shortcomings of the prior art, the invention provides a practical data classification management method.
A further aim of the invention is to provide a data classification management system that is reasonably designed, safe, and applicable.
The technical solution adopted by the invention to solve this technical problem is as follows:
A data classification management method in which, to label unlabeled tables, it is checked whether an unlabeled table exists; if so, it is checked whether predefined labeling should be attempted, and if so the table is labeled automatically, and if not it is labeled manually;
if no unlabeled table exists, the labels or label classifications are refined as needed and the tables are displayed.
Further, before unlabeled tables are labeled, a label correspondence table is first created in the database to store the correspondence between data tables and labels, a predefined label table is created, and the correspondence between table-name prefixes and labels is initialized according to specific business rules.
Preferably, the field names in the label correspondence table are TABLE_NAME, LABEL_TYPE, and PREFIX_CHECK;
the field names in the predefined label table are TABLE_PREFIX, LABEL_NAME, and LABEL_TYPE.
Further, to check whether an unlabeled table exists, the tables of the database to be managed are inspected at a user-defined interval, and the system table list is compared with the label correspondence table to determine whether any table is unlabeled;
if an unlabeled table exists, it is checked whether its PREFIX_CHECK field exists and is empty, that is, whether the predefined labeling flow has not yet been run for it;
if so, the table is labeled using the predefined label table;
if the PREFIX_CHECK field is not empty, that is, the predefined labeling flow has already been run, the user is notified to apply custom labels to the tables that remain unlabeled, and the correspondence between every table and its predefined or custom label is recorded in the label correspondence table.
Further, after the initial labeling is complete, the user refines and classifies the existing labels: labels are merged and unified through standardization, and the label name field in the label correspondence table is updated.
A data classification management system comprises an inspection module, a marking module, a warning module, and a table display module. The inspection module inspects the tables in the database to be managed at a user-defined interval;
the marking module labels tables; the warning module notifies the user to apply custom labels to unlabeled tables; and the table display module lets the user view labeled tables by label.
Furthermore, a label correspondence table is created in the database to store the correspondence between data tables and labels, a predefined label table is created in the database, and the correspondence between table-name prefixes and labels is initialized according to specific business rules.
Preferably, the field names in the label correspondence table are TABLE_NAME, LABEL_TYPE, and PREFIX_CHECK;
the field names in the predefined label table are TABLE_PREFIX, LABEL_NAME, and LABEL_TYPE.
Further, the inspection module inspects the tables in the database to be managed at a user-defined interval and compares the system table list with the label correspondence table to determine whether any table is unlabeled;
if an unlabeled table exists, it is checked whether its PREFIX_CHECK field exists and is empty, that is, whether the predefined labeling flow has not yet been run for it;
if so, the marking module labels the table using the predefined label table;
if the PREFIX_CHECK field is not empty, that is, the predefined labeling flow has already been run, the warning module notifies the user to apply custom labels to the unlabeled table;
and the marking module records the correspondence between every table and its predefined or custom label in the label correspondence table.
Further, after the initial labeling is complete, the user refines and classifies the existing labels: labels are merged and unified through standardization, and the label name field in the label correspondence table is updated.
Compared with the prior art, the data classification management method and system have the following notable beneficial effects:
(1) A degree of automatic labeling effectively reduces the manual effort required for data management.
(2) The three-level table / label / label-classification structure lets users conveniently view existing data by category along multiple dimensions, and the warning-feedback model continuously refines existing data management.
Drawings
To illustrate the embodiments and technical solutions of the invention more clearly, the drawings used in their description are briefly introduced below. The drawings described below show only some embodiments of the invention; a person skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic structural diagram of a data classification management system.
Detailed Description
The present invention will be described in further detail with reference to specific embodiments in order to better understand the technical solutions of the present invention. It should be apparent that the described embodiments are only some embodiments of the present invention, and not all embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without making any creative effort belong to the protection scope of the present invention.
A preferred embodiment is given below.
As shown in Fig. 1, the data classification management method of this embodiment is as follows: to label unlabeled tables, it is checked whether an unlabeled table exists. If one exists, it is checked whether predefined labeling should be attempted: if so, the table is labeled automatically; if not, it is labeled manually. If no unlabeled table exists, the labels or label classifications are refined as needed and the tables are displayed.
The specific process is as follows:
Before unlabeled tables are labeled, a label correspondence table R_TABLE_LABEL is first created in the database to store the correspondence between data tables and labels, a predefined label table R_PREFIX_LABEL is created, and the correspondence between table-name prefixes and labels is initialized according to specific business rules.
The field names in the label correspondence table R_TABLE_LABEL are TABLE_NAME, LABEL_NAME, LABEL_TYPE, and PREFIX_CHECK. The field names in the predefined label table R_PREFIX_LABEL are TABLE_PREFIX, LABEL_NAME, and LABEL_TYPE.
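The two bookkeeping tables can be sketched as follows. This is a minimal illustration, assuming SQLite as the database and TEXT for the "character type" columns; the function name `create_label_tables` and the sample prefix rules (`ORD_`, `TMP_`) are hypothetical and do not come from the patent.

```python
import sqlite3

def create_label_tables(conn: sqlite3.Connection) -> None:
    """Create the two bookkeeping tables named in the description."""
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS R_TABLE_LABEL (
            TABLE_NAME   TEXT NOT NULL,  -- name of the data table
            LABEL_NAME   TEXT,           -- label name (NULL while unlabeled)
            LABEL_TYPE   TEXT,           -- label type
            PREFIX_CHECK TEXT            -- non-empty once predefined labeling has run
        );
        CREATE TABLE IF NOT EXISTS R_PREFIX_LABEL (
            TABLE_PREFIX TEXT NOT NULL,  -- table-name prefix
            LABEL_NAME   TEXT NOT NULL,  -- label assigned to matching tables
            LABEL_TYPE   TEXT            -- label type
        );
        """
    )

conn = sqlite3.connect(":memory:")
create_label_tables(conn)

# Hypothetical business rules: prefix -> (label name, label type).
conn.executemany(
    "INSERT INTO R_PREFIX_LABEL VALUES (?, ?, ?)",
    [("ORD_", "orders", "business"), ("TMP_", "temporary", "lifecycle")],
)
conn.commit()
```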
To check whether an unlabeled table exists, the tables of the database to be managed are inspected at a user-defined interval, and the system table list is compared with the label correspondence table to determine whether any table is unlabeled.
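The comparison of the system table list against the label correspondence table amounts to a set difference. The sketch below assumes SQLite, where the system table list is read from `sqlite_master`; it also assumes that a table counts as labeled only when it has a row with a non-NULL LABEL_NAME. The helper name `find_unlabeled_tables` and the sample tables are hypothetical.

```python
import sqlite3

# The bookkeeping tables themselves are not data tables to be labeled.
BOOKKEEPING = {"R_TABLE_LABEL", "R_PREFIX_LABEL"}

def find_unlabeled_tables(conn: sqlite3.Connection) -> list[str]:
    """Return the data tables that have no labeled row in R_TABLE_LABEL."""
    system_tables = {
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        )
    } - BOOKKEEPING
    labeled = {
        row[0]
        for row in conn.execute(
            "SELECT TABLE_NAME FROM R_TABLE_LABEL WHERE LABEL_NAME IS NOT NULL"
        )
    }
    return sorted(system_tables - labeled)

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE R_TABLE_LABEL (TABLE_NAME TEXT, LABEL_NAME TEXT,
                                LABEL_TYPE TEXT, PREFIX_CHECK TEXT);
    CREATE TABLE ORD_HEADER (id INTEGER);
    CREATE TABLE TMP_IMPORT (id INTEGER);
    INSERT INTO R_TABLE_LABEL VALUES ('ORD_HEADER', 'orders', 'business', 'Y');
    """
)
unlabeled = find_unlabeled_tables(conn)
```

In other database systems the system table list would come from that system's own catalog rather than `sqlite_master`.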
If an unlabeled table exists, it is checked whether its PREFIX_CHECK field exists and is empty, that is, whether the predefined labeling flow has not yet been run for it.
If so, the table is labeled using the predefined label table R_PREFIX_LABEL.
If the PREFIX_CHECK field is not empty, that is, the predefined labeling flow has already been run, the user is notified to apply custom labels to the tables that remain unlabeled, and the correspondence between every table and its predefined or custom label is recorded in the label correspondence table.
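The branch on PREFIX_CHECK described above can be sketched as a single inspection pass. Several details here are assumptions rather than statements from the patent: the value `'Y'` marks that the predefined flow has run, a table that matches no prefix is recorded with a NULL label so that the next pass raises a warning, the longest matching prefix wins, and the names `patrol_once` and `notify` are hypothetical. The list of unlabeled tables is taken as an input.

```python
import sqlite3
from typing import Callable, Iterable

def patrol_once(conn: sqlite3.Connection,
                unlabeled: Iterable[str],
                notify: Callable[[str], None]) -> None:
    """Run one inspection pass over the given unlabeled tables."""
    for table in unlabeled:
        row = conn.execute(
            "SELECT PREFIX_CHECK FROM R_TABLE_LABEL WHERE TABLE_NAME = ?",
            (table,),
        ).fetchone()
        already_tried = row is not None and row[0] not in (None, "")
        if already_tried:
            # Predefined labeling already ran and found nothing: ask the user.
            notify(table)
            continue
        # Automatic labeling: pick the longest prefix that matches the name.
        match = conn.execute(
            "SELECT LABEL_NAME, LABEL_TYPE FROM R_PREFIX_LABEL "
            "WHERE substr(?, 1, length(TABLE_PREFIX)) = TABLE_PREFIX "
            "ORDER BY length(TABLE_PREFIX) DESC LIMIT 1",
            (table,),
        ).fetchone()
        label_name, label_type = match if match else (None, None)
        # Record the result and mark the predefined flow as done.
        conn.execute("DELETE FROM R_TABLE_LABEL WHERE TABLE_NAME = ?", (table,))
        conn.execute(
            "INSERT INTO R_TABLE_LABEL VALUES (?, ?, ?, 'Y')",
            (table, label_name, label_type),
        )
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE R_TABLE_LABEL (TABLE_NAME TEXT, LABEL_NAME TEXT,
                                LABEL_TYPE TEXT, PREFIX_CHECK TEXT);
    CREATE TABLE R_PREFIX_LABEL (TABLE_PREFIX TEXT, LABEL_NAME TEXT,
                                 LABEL_TYPE TEXT);
    INSERT INTO R_PREFIX_LABEL VALUES ('ORD_', 'orders', 'business');
    """
)
alerts: list[str] = []
patrol_once(conn, ["ORD_HEADER", "MISC_DATA"], alerts.append)  # automatic pass
patrol_once(conn, ["MISC_DATA"], alerts.append)                # warning pass
```

After the first pass, `ORD_HEADER` carries the `orders` label and `MISC_DATA` is marked as tried but unlabeled; the second pass therefore routes `MISC_DATA` to the user instead of retrying the prefix match.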
After the initial labeling is complete, the user refines and classifies the existing labels: labels are merged and unified through standardization, and the label name field in the label correspondence table is updated.
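Label standardization can be illustrated as a rename of variant label names to one canonical name. The sketch assumes the user supplies the synonym mapping; the function name `standardize_labels` and the sample labels are hypothetical.

```python
import sqlite3

def standardize_labels(conn: sqlite3.Connection,
                       synonyms: dict[str, str]) -> int:
    """Rename each variant label to its canonical name; return rows changed."""
    changed = 0
    for variant, canonical in synonyms.items():
        cur = conn.execute(
            "UPDATE R_TABLE_LABEL SET LABEL_NAME = ? WHERE LABEL_NAME = ?",
            (canonical, variant),
        )
        changed += cur.rowcount
    conn.commit()
    return changed

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE R_TABLE_LABEL (TABLE_NAME TEXT, LABEL_NAME TEXT,
                                LABEL_TYPE TEXT, PREFIX_CHECK TEXT);
    INSERT INTO R_TABLE_LABEL VALUES ('CRM_CLIENT',   'client',   'business', 'Y');
    INSERT INTO R_TABLE_LABEL VALUES ('ERP_CUSTOMER', 'customer', 'business', 'Y');
    INSERT INTO R_TABLE_LABEL VALUES ('WEB_ACCOUNT',  'Customer', 'business', 'Y');
    """
)
# Labels with different names but the same meaning are merged into one.
changed = standardize_labels(conn, {"client": "customer", "Customer": "customer"})
```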
The system that implements the method is as follows:
A data classification management system comprises an inspection module, a marking module, a warning module, and a table display module. The inspection module inspects the tables in the database to be managed at a user-defined interval.
The marking module labels tables; the warning module notifies the user to apply custom labels to unlabeled tables; and the table display module lets the user view labeled tables by label.
The specific steps are as follows:
(1) A label correspondence table R_TABLE_LABEL is created in the database to store the correspondence between data tables and labels.
Field name | Data type | Note
TABLE_NAME | Character | Name of the data table
LABEL_NAME | Character | Label name
LABEL_TYPE | Character | Label type
PREFIX_CHECK | Character | Whether the predefined labeling flow has been run
(2) A predefined label table R_PREFIX_LABEL is created in the database, and the correspondence between table-name prefixes and labels is initialized according to specific business rules.
Field name | Data type | Note
TABLE_PREFIX | Character | Table name prefix
LABEL_NAME | Character | Label name
LABEL_TYPE | Character | Label type
(3) The inspection module inspects the tables in the database to be managed at a user-defined interval and compares the system table list with the label correspondence table R_TABLE_LABEL to determine whether any table is unlabeled.
If an unlabeled table exists, it is checked whether its PREFIX_CHECK field is empty, that is, whether the predefined labeling flow has not yet been run for it.
If so, the marking module labels the table using the predefined label table R_PREFIX_LABEL. If the PREFIX_CHECK field is not empty, that is, the predefined labeling flow has already been run, the warning module notifies the user to apply custom labels to the unlabeled table. The marking module records the correspondence between every table and its predefined or custom label in the label correspondence table R_TABLE_LABEL.
(4) After the initial labeling is complete, the user can refine and classify the existing labels. Refinement mainly concerns label standardization. Because data may be generated by several information systems, and users in different business domains may refer to the same entity by different names, custom labeling can produce labels that differ in name but share the same meaning. In this step, such labels are merged and unified through standardization, and the label name field in the label correspondence table R_TABLE_LABEL is updated. Label classification mainly concerns dividing labels into types, which narrows the screening range and improves query efficiency in practical use.
(5) The table display module lets the user view the labeled tables by label.
The above embodiments are only specific examples, and the scope of the present invention includes but is not limited to the above embodiments, and any suitable changes or substitutions that may be made by one of ordinary skill in the art according to the method and system claims for data classification management of the present invention shall fall within the scope of the present invention.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that various changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (4)

1. A data classification management method, characterized in that, to label unlabeled tables, it is checked whether an unlabeled table exists; if so, it is checked whether predefined labeling should be attempted, and if so the table is labeled automatically, and if not it is labeled manually;
if no unlabeled table exists, the labels or label classifications are refined as needed and the tables are displayed;
before unlabeled tables are labeled, a label correspondence table is first created in a database to store the correspondence between data tables and labels, a predefined label table is created, and the correspondence between table-name prefixes and labels is initialized according to specific business rules;
the field names in the label correspondence table are TABLE_NAME, LABEL_TYPE, and PREFIX_CHECK;
the field names in the predefined label table are TABLE_PREFIX, LABEL_NAME, and LABEL_TYPE;
to check whether an unlabeled table exists, the tables of the database to be managed are inspected at a user-defined interval, and the system table list is compared with the label correspondence table to determine whether any table is unlabeled;
if an unlabeled table exists, it is checked whether its PREFIX_CHECK field exists and is empty, that is, whether the predefined labeling flow has not yet been run for it;
if so, the table is labeled using the predefined label table;
if the PREFIX_CHECK field is not empty, that is, the predefined labeling flow has already been run, the user is notified to apply custom labels to the tables that remain unlabeled, and the correspondence between every table and its predefined or custom label is recorded in the label correspondence table.
2. The data classification management method according to claim 1, characterized in that, after the initial labeling is complete, the user refines and classifies the existing labels, the labels are merged and unified through standardization, and the label name field in the label correspondence table is updated.
3. A data classification management system, characterized by comprising an inspection module, a marking module, a warning module, and a table display module, wherein the inspection module inspects the tables in a database to be managed at a user-defined interval;
the marking module labels tables; the warning module notifies the user to apply custom labels to unlabeled tables; the table display module lets the user view labeled tables by label;
a label correspondence table is created in the database to store the correspondence between data tables and labels, a predefined label table is created in the database, and the correspondence between table-name prefixes and labels is initialized according to specific business rules;
the field names in the label correspondence table are TABLE_NAME, LABEL_TYPE, and PREFIX_CHECK;
the field names in the predefined label table are TABLE_PREFIX, LABEL_NAME, and LABEL_TYPE;
the inspection module inspects the tables in the database to be managed at a user-defined interval and compares the system table list with the label correspondence table to determine whether any table is unlabeled;
if an unlabeled table exists, it is checked whether its PREFIX_CHECK field exists and is empty, that is, whether the predefined labeling flow has not yet been run for it;
if so, the marking module labels the table using the predefined label table;
if the PREFIX_CHECK field is not empty, that is, the predefined labeling flow has already been run, the warning module notifies the user to apply custom labels to the unlabeled table;
and the marking module records the correspondence between every table and its predefined or custom label in the label correspondence table.
4. The data classification management system according to claim 3, characterized in that, after the initial labeling is complete, the user refines and classifies the existing labels, the labels are merged and unified through standardization, and the label name field in the label correspondence table is updated.
CN202010696437.9A 2020-07-20 2020-07-20 Data classification management method and system Active CN111782736B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010696437.9A CN111782736B (en) 2020-07-20 2020-07-20 Data classification management method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010696437.9A CN111782736B (en) 2020-07-20 2020-07-20 Data classification management method and system

Publications (2)

Publication Number Publication Date
CN111782736A (en) 2020-10-16
CN111782736B (en) 2022-07-26

Family

ID=72763547

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010696437.9A Active CN111782736B (en) 2020-07-20 2020-07-20 Data classification management method and system

Country Status (1)

Country Link
CN (1) CN111782736B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107239539A (en) * 2017-06-02 2017-10-10 山东浪潮商用***有限公司 A kind of user-defined model method based on relevant database
CN110750514A (en) * 2019-09-17 2020-02-04 福建天泉教育科技有限公司 Method and terminal for labeling main data
CN111090656A (en) * 2020-03-23 2020-05-01 北京大数元科技发展有限公司 Method and system for dynamically constructing object portrait
CN111191125A (en) * 2019-12-24 2020-05-22 长威信息科技发展股份有限公司 Data analysis method based on tagging

Also Published As

Publication number Publication date
CN111782736A (en) 2020-10-16

Similar Documents

Publication Publication Date Title
CN110990474A (en) Regional industry image analysis method and device
CN111459985A (en) Identification information processing method and device
CN108376364A (en) A kind of method, equipment and the terminal device of payment system reconciliation
CN112100181B (en) Data resource management method based on sand table
CN111897856A (en) Supervision message generation method and device, electronic equipment and readable storage medium
US10922328B2 (en) Method and system for implementing an on-demand data warehouse
CN112800755A (en) Data management method and system
CN116205396A (en) Data panoramic monitoring method and system based on data center
CN115809653A (en) Intelligent contract auditing method and system
CN104766240A (en) Electronic banking data processing system and method
CN114969040A (en) Data display method and device, electronic equipment and storage medium
CN113792081A (en) Method and system for automatically checking data assets
CN111782736B (en) Data classification management method and system
CN112669133A (en) Intelligent cost control reimbursement method capable of automatically matching according to application scenes
CN115952160B (en) Data checking method
CN113568873B (en) Intelligent policy file matching method and device
CN113344527B (en) Method and platform for integrally managing and storing judicial advice information
CN113673889A (en) Intelligent data asset identification method
CN112000870A (en) Declaration scheme generation method and system based on user information
CN110928979B (en) Method and apparatus for managing technical metadata
CN118313133A (en) PloughCAE model-based bolt automation label construction method, ploughCAE model-based bolt automation label construction system and storage medium
CN115689786A (en) Financial reimbursement checking method, medium, equipment and system based on industry characteristics
CN115774997A (en) Substitute material determination method, system and equipment
CN112766889A (en) Work task dynamic classification management method and device
CN116089617A (en) Statistical result output method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant