CN112069540A - Sensitive information processing method, device and medium - Google Patents

Info

Publication number
CN112069540A
Authority
CN
China
Prior art keywords
field
sensitive
suspected
acquiring
sensitive field
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010926974.8A
Other languages
Chinese (zh)
Inventor
李佳佳
左颖辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Life Insurance Company of China Ltd
Original Assignee
Ping An Life Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/62Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245Protecting personal data, e.g. for financial or medical purposes
    • G06F21/6254Protecting personal data, e.g. for financial or medical purposes by anonymising data, e.g. decorrelating personal data from the owner's identification

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Databases & Information Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Medical Informatics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The application relates to the technical field of blockchain, and in particular to a sensitive information processing method, device and medium. The method comprises the following steps: acquiring a first suspected sensitive field in test data; if the first suspected sensitive field belongs to neither a preset sensitive information table nor a preset sensitive information exception table, acquiring a first sensitivity probability of the first suspected sensitive field; if the first sensitivity probability is greater than a first threshold, acquiring a first desensitization rule of the first suspected sensitive field; and adding the first suspected sensitive field and the first desensitization rule to the preset sensitive information table. The method and device help prevent leakage of the sensitive information covered by the desensitization rule and improve data security.

Description

Sensitive information processing method, device and medium
Technical Field
The application relates to the technical field of blockchain, and mainly relates to a sensitive information processing method, a sensitive information processing device and a sensitive information processing medium.
Background
Day-to-day project development and testing requires a large amount of test data. To eliminate differences in data magnitude and data distribution between the test data and the data used in actual production, production data is used as test data. To avoid leaking sensitive information, the sensitive information in the production data (such as names, identity card numbers, telephone numbers and email addresses) is desensitized before the production database is imported into the test database. In the prior art, desensitization is performed according to manually configured desensitization rules. However, manual configuration is often late or incomplete, which easily leads to leakage of sensitive information.
Disclosure of Invention
The embodiment of the application provides a sensitive information processing method, a sensitive information processing device and a sensitive information processing medium, which can avoid sensitive information leakage and improve data security.
In a first aspect, an embodiment of the present application provides a sensitive information processing method, where:
acquiring a first suspected sensitive field in test data;
if the first suspected sensitive field belongs to neither a preset sensitive information table nor a preset sensitive information exception table, acquiring a first sensitivity probability of the first suspected sensitive field;
if the first sensitivity probability is larger than a first threshold value, acquiring a first desensitization rule of the first suspected sensitive field;
and adding the first suspected sensitive field and the first desensitization rule to the preset sensitive information table.
In a second aspect, an embodiment of the present application provides a sensitive information processing apparatus, where:
the storage unit is used for storing a preset sensitive information table and a preset sensitive information exception table;
the processing unit is used for acquiring a first suspected sensitive field in the test data; if the first suspected sensitive field belongs to neither the preset sensitive information table nor the preset sensitive information exception table, acquiring a first sensitivity probability of the first suspected sensitive field; if the first sensitivity probability is greater than a first threshold, acquiring a first desensitization rule of the first suspected sensitive field;
the storage unit is further configured to add the first suspected sensitive field and the first desensitization rule to the preset sensitive information table.
In a third aspect, an embodiment of the present application provides another sensitive information processing apparatus, including a processor, a memory, a communication interface, and one or more programs, where the one or more programs are stored in the memory and configured to be executed by the processor, and the programs include instructions for some or all of the steps described in the first aspect.
In a fourth aspect, the present application provides a computer-readable storage medium storing a computer program, where the computer program causes a computer to execute some or all of the steps described in the first aspect.
The embodiment of the application has the following beneficial effects:
With the sensitive information processing method, device and medium described above, if a first suspected sensitive field in the test data belongs to neither the preset sensitive information table nor the preset sensitive information exception table, the first sensitivity probability of the first suspected sensitive field is obtained. When the first sensitivity probability is greater than a first threshold, a first desensitization rule of the first suspected sensitive field is acquired. The first suspected sensitive field and the first desensitization rule are then added to the preset sensitive information table, which makes the preset sensitive information table more comprehensive, helps prevent leakage of the sensitive information covered by the desensitization rule, and improves data security.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
Wherein:
fig. 1 is a schematic flowchart of a sensitive information processing method according to an embodiment of the present application;
fig. 2 is a schematic flowchart of another sensitive information processing method provided in an embodiment of the present application;
fig. 3 is a schematic diagram of a setting page of a management terminal in a test project according to an embodiment of the present application;
fig. 4 is a schematic diagram of a display page of a management terminal according to an embodiment of the present application;
fig. 5 is a schematic logical structure diagram of a sensitive information processing apparatus according to an embodiment of the present application;
fig. 6 is a schematic physical structure diagram of a sensitive information processing apparatus according to an embodiment of the present application.
Detailed Description
In order to make the technical solutions of the present application better understood, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments obtained by a person of ordinary skill in the art without any inventive work according to the embodiments of the present application are within the scope of the present application.
The terms "first," "second," and the like in the description and claims of the present application and in the above-described drawings are used for distinguishing between different objects and not for describing a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
The network architecture to which the embodiment of the application applies comprises a server and an electronic device. The electronic device may be a personal computer (PC), a notebook computer, or a smartphone, and may also be an all-in-one machine, a palmtop computer, a tablet computer (pad), a smart television playback terminal, a vehicle-mounted terminal, or a portable device. A PC-side user terminal, such as a kiosk, may run an operating system including, but not limited to, Linux, Unix, the Windows series (e.g., Windows XP, Windows 7), or Mac OS X (the operating system of Apple computers). A mobile user terminal, such as a smartphone, may run an operating system including, but not limited to, Android, iOS (the operating system of Apple mobile phones), or Windows.
Servers are similar to general computer architectures and include processors, hard disks, memory, system buses, and the like, for providing services to electronic devices. The server may operate in a single device, or may operate in a server cluster formed by a plurality of servers, which is not limited herein.
The electronic device in the embodiment of the application can install and run the application program, and the server can be a server corresponding to the application program installed in the electronic device and provide application service for the application program. The application program may be a project development management platform, a sensitive information processing platform, or the like, or may be a separately integrated application program, or an applet embedded in another application, a system on a web page, or the like, which is not limited herein. The number of the electronic devices and the number of the servers are not limited in the embodiment of the application, and the servers can provide services for the electronic devices at the same time. The server may be implemented as a stand-alone server or as a server cluster of multiple servers.
In the embodiment of the present application, a preset sensitive information table and a preset sensitive information exception table may be stored in advance. The preset sensitive information table, for example sensitive_info_shield, includes sensitive fields and the desensitization rule corresponding to each sensitive field. The preset sensitive information exception table, for example test_positive_info_exception, includes the suspected sensitive fields that have been identified as non-sensitive. That is, the preset sensitive information table records the sensitive fields identified in the test database, and the preset sensitive information exception table records the suspected sensitive fields in the test database that were identified as non-sensitive. If the preset sensitive information table includes a suspected sensitive field, the suspected sensitive field is determined to be a sensitive field. If the preset sensitive information exception table includes a suspected sensitive field, the suspected sensitive field is a non-sensitive field. If the suspected sensitive field appears in neither table, it can neither be confirmed nor be ruled out as a sensitive field. It should be noted that the desensitization rule of a sensitive field may also be stored separately.
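The three-way lookup described above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the implementation of the application: the names classify_field, Status, sensitive_table and exception_table are hypothetical, and identifying a field by a (table name, column name) pair is an assumption made for the example.

```python
from enum import Enum


class Status(Enum):
    SENSITIVE = "sensitive"
    NON_SENSITIVE = "non-sensitive"
    UNDETERMINED = "undetermined"


def classify_field(field_key, sensitive_table, exception_table):
    """Classify a suspected sensitive field using the two preset tables.

    field_key: hypothetical identifier, e.g. (table_name, column_name).
    sensitive_table: dict mapping field keys to desensitization rules.
    exception_table: set of field keys known to be non-sensitive.
    """
    if field_key in sensitive_table:
        return Status.SENSITIVE
    if field_key in exception_table:
        return Status.NON_SENSITIVE
    # In neither table: a sensitivity probability has to be computed.
    return Status.UNDETERMINED


# Hypothetical table contents for illustration.
sensitive_table = {("POLICY", "CLIENT_NAME"): "mask_name"}
exception_table = {("PRODUCT", "PRODUCT_NAME")}
```

Only fields that come back as undetermined need the probability-based check; fields already present in either table are settled by the lookup alone.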
The preset sensitive information table and the preset sensitive information exception table may further include attribute information such as a field column name, field content, table identification, data identification, and a table name. Wherein, the field column name is the attribute class name of the field, for example: name, identification number, telephone number, etc. The field content is a value corresponding to the field column name, namely the field column name and the field content belong to an attribute-value relationship. The table identifier is used for representing a table, and the table name is the table name of the test data corresponding to the field. The data identification is a user identification of the test data.
In this embodiment, the preset sensitive information table, the preset sensitive information exception table, and the desensitization rule may be stored in the memory, or at a designated network location identified by, for example, a Uniform Resource Locator (URL). For example, the storage path of the preset sensitive information table is: http://XXX.com.cn/sensory.
The preset sensitive information table, the preset sensitive information exception table and the desensitization rule can also be stored in a block created on a blockchain network. Blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks linked by cryptographic methods, where each data block contains information about a batch of network transactions and is used to verify the validity (anti-counterfeiting) of that information and to generate the next block. A blockchain may include a blockchain underlying platform, a platform product services layer, and an application services layer. Storing the data in a distributed manner on the blockchain therefore helps ensure data security while allowing the information to be shared among different platforms.
The sensitive information processing method provided by the embodiment of the application can be executed by a sensitive information processing device, wherein the device can be realized by software and/or hardware, can be generally integrated in a server, and can improve the efficiency of demand scheduling.
Referring to fig. 1, fig. 1 is a schematic flow chart of a sensitive information processing method provided in the present application. The application of the method in the server is taken as an example for illustration, and the method comprises the following steps:
s101: a first suspiciously sensitive field in the test data is obtained.
In this embodiment of the application, the test data may be any data in the test database, or may also be newly imported data or newly modified data, and the like, which is not limited herein. The first suspected sensitive field is a suspected sensitive field in the test data, and may include a field column name, a field content, a table identifier, a data identifier, a table name, and the like, which is not limited herein.
The method for acquiring the first suspected sensitive field is not limited. For example, the test data may be compared with the production data, and the modified fields selected for keyword detection. The keywords may correspond to sensitive fields for names (client name, policyholder, insured person, user name, Chinese name, English name, etc.), for contact details (telephone, mobile phone, address, mail, fax, etc.), and for identification information (passport, identity card number, driver's license, account number, native country, company, stock code, enterprise homepage, etc.). It should be noted that a sensitive field may be in Chinese, English, or another language, which is not limited herein.
In one possible example, step S101 includes: acquiring a plurality of sensitive categories according to a preset sensitive information table; constructing a plurality of keywords according to the plurality of sensitive categories; generating a keyword detection script according to the plurality of keywords; and acquiring a first suspected sensitive field in the test data according to the keyword detection script.
The sensitive categories include personal information categories such as names, contact information and identification information. The preset sensitive information table comprises the sensitive fields as described above, namely the sensitive categories related to the sensitive fields are obtained for the queried sensitive fields, and then the keywords are constructed according to the sensitive categories, so that the comprehensiveness of the keywords can be improved.
The keyword detection script detects sensitive fields according to the keywords. The method for generating the keyword detection script is not limited in the application; the script can be built from the obtained keywords and the regular expressions corresponding to those keywords. For example: a mobile phone number has 11 digits, the first of which is 1; a Chinese identity card number has 18 digits, or 17 digits followed by the upper-case English letter X; an email address includes the special characters "@" and ".com", etc. Regular expressions can be constructed according to these rules. For example, the regular expression for a mobile phone number may be: 1(3\ d |47|5 (.
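The format rules stated above can be turned into regular expressions along the following lines. The Python sketch below is illustrative only: the patterns are simplified assumptions derived from the plain-language rules (11 digits starting with 1; 18 digits or 17 digits plus X; an "@" and a dot), not the regular expressions used in the application, and the names MOBILE_RE, ID_CARD_RE, EMAIL_RE and detect_category are hypothetical.

```python
import re

# Simplified, assumed patterns; real detection rules would be stricter.
MOBILE_RE = re.compile(r"1\d{10}")                  # 11 digits, first digit is 1
ID_CARD_RE = re.compile(r"\d{17}[\dX]")             # 18 digits, or 17 digits + X
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")  # contains "@" and a dot


def detect_category(value):
    """Return the first category whose pattern matches the whole value, else None."""
    for name, pattern in (("mobile", MOBILE_RE),
                          ("id_card", ID_CARD_RE),
                          ("email", EMAIL_RE)):
        if pattern.fullmatch(value):
            return name
    return None
```

Using fullmatch rather than search keeps a long digit string such as an identity card number from also being reported as a mobile phone number.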
For example, the keyword detection script includes query code written in Structured Query Language (SQL), as shown below. Here sid is the table identifier, owner is the data identifier, table_name is the table name, column_name is the field column name, and comments is the field content. The table name is the name of the test data table that contains the field.
Query code:
select sid, owner, table_name, column_name, comments from all_col_comments
where (column_name like '%APP%NAME%' or column_name like '%CLIENT%NAME%' or column_name like 'CHINESE%NAME%' or column_name like '%ENGLISH%NAME%'
or column_name like '%ADDR%' or column_name like '%TEL%' or column_name like '%EML%'
or comments like '%client name%' or comments like '%phone%' or comments like '%address%' or comments like '%certificate number%' or comments like '%account%' or comments like '%company%')
and (owner not in ('SYS','APPMGR','APPQOSSYS','DBMGR','XDB','ORDSYS'))
and (owner||table_name in (select owner||object_name from all_objects a where a.last_ddl_time > to_date('2019-10-26','yyyy-mm-dd')))
The above query code searches all_col_comments for fields in tables modified after 2019-10-26 whose field column name contains "APP NAME", "CLIENT NAME", "CHINESE NAME", "ENGLISH NAME", "ADDR", "TEL" or "EML", or whose field content contains "client name", "phone", "address", "certificate number", "account" or "company", and whose data identifier is not one of 'SYS', 'APPMGR', 'APPQOSSYS', 'DBMGR', 'XDB' or 'ORDSYS'.
It can be understood that, in the above example, a plurality of sensitive categories are obtained according to the preset sensitive information table, that is, the sensitive categories related to the sensitive fields are obtained for the queried sensitive fields, and then the keywords are constructed according to the sensitive categories, so that the comprehensiveness of the keywords can be improved. And then, generating a keyword detection script according to the plurality of keywords, and acquiring a first suspected sensitive field in the test data according to the keyword detection script, so that the accuracy of acquiring the suspected sensitive field can be improved.
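One possible way to generate such a keyword detection query from a list of keywords is sketched below in Python. It is an assumption-laden illustration rather than the method of the application: the function name build_detection_query and its parameters are hypothetical, the generated query is a reduced form of the query code shown earlier, and the keywords are assumed to come from a trusted configuration source (untrusted input should be passed as bind variables instead of being formatted into the SQL text).

```python
def build_detection_query(column_keywords, comment_keywords, excluded_owners):
    """Build a keyword detection query over all_col_comments.

    column_keywords: patterns matched against column_name.
    comment_keywords: patterns matched against comments.
    excluded_owners: owners (data identifiers) to skip, e.g. system schemas.
    """
    column_terms = [f"column_name like '%{kw}%'" for kw in column_keywords]
    comment_terms = [f"comments like '%{kw}%'" for kw in comment_keywords]
    keyword_clause = " or ".join(column_terms + comment_terms)
    owners = ", ".join(f"'{owner}'" for owner in excluded_owners)
    return (
        "select owner, table_name, column_name, comments "
        "from all_col_comments "
        f"where ({keyword_clause}) "
        f"and owner not in ({owners})"
    )
```

Wrapping all keyword terms in a single parenthesized group keeps the "or" conditions from swallowing the owner filter that follows.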
The time at which the first suspected sensitive field is acquired is not limited in the present application. It may be after the test database is updated, or the query may run periodically, for example every 14 hours or every 24 hours.
In one possible example, before step S101, the method further comprises: acquiring the sensitivity ratio of the test database corresponding to the test data; determining a first timing duration according to the sensitivity ratio; and executing step S101 when the first timer reaches the first timing duration.
The test database may be a set of all test data, or may be a data set of a test group corresponding to the test data, which is not limited herein. The sensitivity ratio is a ratio of the sensitive fields to the test database, and may be a ratio between the number of the sensitive fields and the total amount of data in the test database, which is not limited herein.
The first timer is a timer in the server and is used for timing; step S101 is executed when the elapsed time reaches the first timing duration. The timer restarts when step S101 is executed, or after all the test data in the test database have been checked, which is not limited herein.
The first timing duration can be determined according to the preset mapping relation between the sensitivity ratio and the timing duration, and as shown in the following table, when the sensitivity ratio is 0.5, the timing duration is 1 week.
Sensitivity ratio    Timing duration
(0, 0.3]             12 hours
(0.3, 0.7]           1 week
(0.7, 1]             1 month
It will be appreciated that the greater the sensitivity ratio, the lower the probability that a sensitive field has been missed in the test database. In this example, the first timing duration is determined according to the sensitivity ratio of the test database, and step S101 is executed when the first timer reaches the first timing duration, which avoids detecting suspected sensitive fields too frequently and reduces power consumption.
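The mapping from sensitivity ratio to timing duration can be sketched in Python as a simple interval lookup. The interval bounds and durations follow the mapping table in this example; the names DURATION_BY_RATIO and first_timing_duration are hypothetical, and treating "1 month" as 30 days is an assumption.

```python
from datetime import timedelta

# Upper bounds of the half-open intervals (low, high] and their durations.
# "1 month" is approximated as 30 days (assumption).
DURATION_BY_RATIO = [
    (0.3, timedelta(hours=12)),
    (0.7, timedelta(weeks=1)),
    (1.0, timedelta(days=30)),
]


def first_timing_duration(sensitivity_ratio):
    """Return the timing duration for a sensitivity ratio in (0, 1]."""
    if not 0 < sensitivity_ratio <= 1:
        raise ValueError("sensitivity ratio must be in (0, 1]")
    for upper_bound, duration in DURATION_BY_RATIO:
        if sensitivity_ratio <= upper_bound:
            return duration
```

For a sensitivity ratio of 0.5 the lookup falls in the interval (0.3, 0.7] and yields a timing duration of one week.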
Further, in a possible example, a first update frequency of the preset sensitive information table and a second update frequency of the preset sensitive information exception table are obtained; a second timing duration is acquired according to the sensitivity ratio; a third timing duration is acquired according to the first update frequency and the second update frequency; and the first timing duration is acquired according to the second timing duration and the third timing duration.
The first update frequency describes the number of updates made to the preset sensitive information table per unit time (for example, a day, a week, or a month), and the second update frequency describes the number of updates made to the preset sensitive information exception table per unit time. The greater the update frequency, the greater the probability that a sensitive field has been missed in the test database.
It will be appreciated that the second timing duration is obtained in accordance with the sensitivity ratio, and the third timing duration is obtained in accordance with the first update frequency and the second update frequency. And then the timing duration is obtained according to the second timing duration and the third timing duration, so that the effectiveness of timing monitoring can be improved.
S102: if the first suspected sensitive field belongs to neither the preset sensitive information table nor the preset sensitive information exception table, acquiring a first sensitivity probability of the first suspected sensitive field.
In the embodiment of the present application, the sensitivity probability describes the probability that the first suspected sensitive field is a sensitive field. As described above, if the first suspected sensitive field belongs to neither the preset sensitive information table nor the preset sensitive information exception table, it can neither be confirmed nor be ruled out as a sensitive field. Obtaining the sensitivity probability of the first suspected sensitive field therefore improves the accuracy of identifying sensitive fields.
The method for obtaining the sensitivity probability is not limited in the present application. In a possible example, step S102 includes: obtaining production data corresponding to the test data from a production database according to the data identifier of the test data; acquiring a first field from the production data according to the target field column name of the first suspected sensitive field; obtaining a similarity value between the first field and the first suspected sensitive field; and acquiring the first sensitivity probability of the first suspected sensitive field according to the similarity value.
The data identifier of the test data is unique and is a non-sensitive field; it is used to look up the production data, namely the original data, corresponding to the test data in the production database. The target field column name may be the keyword that caused the field to be flagged as a suspected sensitive field while the first suspected sensitive field was being acquired. The similarity value describes the similarity between the first field and the first suspected sensitive field.
It can be understood that, according to the data identifier of the test data, the production data corresponding to the test data is obtained from the production database, and then the first field is obtained from the production data according to the target field column name of the first suspected sensitive field. And then acquiring a similarity value between the first field and the first suspected sensitive field, and acquiring a first sensitivity probability of the first suspected sensitive field according to the similarity value. That is, the accuracy of acquiring the sensitivity probability can be improved by acquiring the sensitivity probability based on the difference between the field after desensitization and the field before desensitization.
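A minimal Python sketch of deriving a sensitivity probability from the similarity value is given below. The application does not specify the similarity measure or how the similarity value maps to a probability, so both are assumptions here: the similarity is computed with the standard library's difflib.SequenceMatcher, and the probability is taken as one minus the similarity, on the assumption that a test value differing strongly from its production value has been desensitized. The function names are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(production_value, test_value):
    """Similarity in [0, 1] between the production field and the test field."""
    return SequenceMatcher(None, production_value, test_value).ratio()


def sensitivity_probability(production_value, test_value):
    """Assumed mapping: the less similar the two values, the more likely
    the test field was desensitized and is therefore sensitive."""
    return 1.0 - similarity(production_value, test_value)
```

Other mappings are equally conceivable; the point of the sketch is only that the probability is a function of the difference between the field before and after desensitization.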
Further, in a possible example, after the similarity value between the first field and the first suspected sensitive field is obtained, the method further includes: acquiring operation information of the test data; acquiring a second sensitivity probability of the first suspected sensitive field according to the operation information; and acquiring the first sensitivity probability of the first suspected sensitive field according to the second sensitivity probability and the similarity value.
The operation information may include information such as an operation table name, an operation field name, operation field contents, operation types (e.g., data addition, deletion, modification, and query), and the like. The method for obtaining the operation information is not limited, the interactive operation instruction of the third-party application can be obtained, and then the interactive operation instruction is identified to obtain the operation information.
It can be understood that the operation information is dynamic interaction information. Checking it in addition to the similarity value between the field after desensitization and the field before desensitization can improve the accuracy of identifying sensitive information.
S103: and if the first sensitivity probability is larger than a first threshold value, acquiring a first desensitization rule of the first suspected sensitive field.
In the embodiment of the present application, the first threshold is not limited, and may be 50%. In one possible example, the method further comprises: and acquiring the first threshold according to the sensitivity ratio. It can be understood that the sensitivity ratio is used for describing the ratio of the sensitive field in the test database, and the first threshold value is dynamically set according to the sensitivity ratio, so that the accuracy of identifying the sensitive information can be improved.
In an embodiment of the present application, the first desensitization rule is the rule by which the first field is converted into the first suspected sensitive field. The first desensitization rule can be determined according to the field column name of the sensitive field, or can be derived by comparing against the source data in the production data and working backwards, which is not limited herein.
In one possible example, step S103 includes: acquiring a second desensitization rule corresponding to the production data from the preset sensitive information table according to the data identification; desensitizing the first field according to the second desensitization rule to obtain a second field; and acquiring a first desensitization rule of the first suspected sensitive field according to the second field and the first suspected sensitive field.
The second desensitization rule may be a preset desensitization rule corresponding to the production data, or may be a desensitization rule of a data table corresponding to the production data. The second field is obtained by carrying out desensitization treatment on the first field according to a second desensitization rule.
It can be understood that, in this example, the first field is desensitized according to the previously agreed second desensitization rule to obtain the second field, and the second field and the first suspected sensitive field are then compared to obtain the first desensitization rule, so that the accuracy of the obtained desensitization rule can be further improved.
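The comparison between the second field and the first suspected sensitive field can be sketched as follows, under the assumption that desensitization rules are simple masking rules that keep a prefix and a suffix and replace the middle with a mask character. `MaskRule`, `infer_mask_rule`, and `derive_first_rule` are hypothetical names; the application does not restrict rules to this form.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MaskRule:
    keep_prefix: int
    keep_suffix: int
    mask_char: str = "*"

    def apply(self, value: str) -> str:
        masked_len = max(len(value) - self.keep_prefix - self.keep_suffix, 0)
        if masked_len == 0:
            return value
        tail = value[len(value) - self.keep_suffix:]
        return value[:self.keep_prefix] + self.mask_char * masked_len + tail


def infer_mask_rule(original: str, desensitized: str,
                    mask_char: str = "*") -> Optional[MaskRule]:
    """Work backwards from a before/after pair to a masking rule, if any."""
    if original == desensitized or len(original) != len(desensitized):
        return None
    n = len(original)
    prefix = 0
    while prefix < n and original[prefix] == desensitized[prefix]:
        prefix += 1
    suffix = 0
    while suffix < n - prefix and original[n - 1 - suffix] == desensitized[n - 1 - suffix]:
        suffix += 1
    rule = MaskRule(prefix, suffix, mask_char)
    return rule if rule.apply(original) == desensitized else None


def derive_first_rule(first_field: str, suspected_field: str,
                      second_rule: MaskRule) -> Optional[MaskRule]:
    # Second field: the production value desensitized with the preset rule.
    second_field = second_rule.apply(first_field)
    if second_field == suspected_field:
        return second_rule  # the preset rule already explains the test value
    return infer_mask_rule(first_field, suspected_field, second_rule.mask_char)
```

If the preset rule reproduces the test value, it is reused; otherwise the sketch falls back to reverse calculation from the production value and returns `None` when no masking rule of this form fits.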
S104: Adding the first suspected sensitive field and the first desensitization rule to the preset sensitive information table.
As described above, the preset sensitive information table includes the sensitive field and the desensitization rule of the sensitive field. And when the first sensitivity probability is larger than a first threshold value, the first suspected sensitive field is indicated as a sensitive field. Therefore, the first suspected sensitive field and the first desensitization rule are added to the preset sensitive information table, the comprehensiveness of the preset sensitive information table can be improved, sensitive information corresponding to the desensitization rule is prevented from being leaked, and the data security is improved.
In the method shown in fig. 1, if a first suspected sensitive field in the test data belongs to neither the preset sensitive information table nor the preset sensitive information exception table, the first sensitivity probability of the first suspected sensitive field is obtained. When the first sensitivity probability is greater than a first threshold, a first desensitization rule of the first suspected sensitive field is acquired. The first suspected sensitive field and the first desensitization rule are then added to the preset sensitive information table, so that the comprehensiveness of the preset sensitive information table can be improved, sensitive information corresponding to the desensitization rule is prevented from being leaked, and data security is improved.
In a possible example, after step S102, if the first sensitivity probability is less than or equal to the first threshold, the first suspected sensitive field is added to the preset sensitive information exception table.
It can be understood that when the first sensitivity probability is less than or equal to the first threshold, the probability that the first suspected sensitive field is a sensitive field is small, and the first suspected sensitive field may be determined to be a non-sensitive field. Adding the first suspected sensitive field to the preset sensitive information exception table improves the comprehensiveness of that table, so that the field can be directly determined to be a non-sensitive field the next time it is identified, which improves identification efficiency.
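The decision flow of steps S101 to S104, together with the exception-table branch, can be summarized in the following sketch. The two tables are modelled as an in-memory dict and set, and the probability and rule lookups are passed in as callables; these representations and all names are assumptions made for illustration.

```python
from typing import Callable, Dict, Set


def process_suspected_field(field: str,
                            sensitive_table: Dict[str, str],
                            exception_table: Set[str],
                            get_probability: Callable[[str], float],
                            get_rule: Callable[[str], str],
                            threshold: float = 0.5) -> str:
    """Classify one suspected sensitive field and update the tables."""
    if field in sensitive_table:
        return "already_sensitive"
    if field in exception_table:
        return "already_non_sensitive"
    probability = get_probability(field)
    if probability > threshold:
        # S103 and S104: obtain the rule, then record field and rule together.
        sensitive_table[field] = get_rule(field)
        return "added_to_sensitive_table"
    # Probability at or below the threshold: remember the field as non-sensitive.
    exception_table.add(field)
    return "added_to_exception_table"
```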
Referring to fig. 2, fig. 2 is a schematic flowchart of another sensitive information processing method provided in the present application. The application of the method in the server is taken as an example for illustration, and the method comprises the following steps:
S201: Acquiring a first suspected sensitive field in the test data.
S202: If the first suspected sensitive field does not belong to a preset sensitive information table and the first suspected sensitive field does not belong to a preset sensitive information exception table, acquiring a first sensitivity probability of the first suspected sensitive field.
Step S201 and step S202 may refer to the description of step S101 and step S102, and are not described herein again.
S203: If the first sensitivity probability is less than or equal to a first threshold, sending the attribute information of the first suspected sensitive field to a management terminal corresponding to the test data.
In the embodiment of the application, the management terminal is a display device corresponding to the contact information of a pre-bound manager. For example, if the contact information is an email address, the management terminal is the electronic device on which the manager logs in to a mail application (or applet, official account, etc.).
The attribute information of the first suspected sensitive field may include a table identifier, a data identifier, a table name, a field column name, and a field content of the field, and may also include a sensitivity probability, which is not limited herein.
The manager may correspond to a test item. As shown in fig. 3, the setting page of the test item includes an item name, a contact mailbox, a mail template, and a content type. The item name is 'sensitive information inspection', namely the sensitive information inspection of the ABC test library. The contact mailbox is the mailbox of the manager, and is xxx @ xxx. The mail template is the template used for sending a mail to the manager, and the content type is the markup language in which the mail template is written. In fig. 3, the mail format is set using HyperText Markup Language (HTML). In this way, the attribute information of the first suspected sensitive field can be sent, in the form of the mail template, to the management terminal through the contact information of the manager (xxx @ xxx.com.cn).
As shown in fig. 4, the display page of the management terminal includes the avatar of the manager, the contact mailbox (i.e., the recipient), the receiving time (2020.07.25 (Saturday) 14:30), the item name ([sensitive information check] ABC test library sensitive information check), and the attribute information of the first suspected sensitive field. The attribute information is presented in the form of a table, which includes the table identifier, data identifier, table name, field column name, and field content of the field. Therefore, when checking the mail, the manager obtains the attribute information of the first suspected sensitive field and can judge whether the first suspected sensitive field is a sensitive field.
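A notification mail of the kind shown in figs. 3 and 4 could be assembled as in the following sketch, which uses the Python standard library `email` package. The function name, the recipient address, and the attribute keys are hypothetical placeholders; actual sending (for example through `smtplib`) is left out.

```python
from email.message import EmailMessage
from html import escape
from typing import Dict


def build_notification(recipient: str, item_name: str,
                       attributes: Dict[str, str]) -> EmailMessage:
    """Build an HTML mail that shows the field attributes as a one-row table."""
    header = "".join(f"<th>{escape(key)}</th>" for key in attributes)
    row = "".join(f"<td>{escape(str(value))}</td>" for value in attributes.values())
    html_body = f"<table border='1'><tr>{header}</tr><tr>{row}</tr></table>"

    message = EmailMessage()
    message["To"] = recipient
    message["Subject"] = item_name
    # Plain-text fallback for clients that do not render HTML.
    message.set_content("A suspected sensitive field needs review.")
    message.add_alternative(html_body, subtype="html")
    return message
```

Escaping the field content matters here because the field content comes from the test database and may itself contain markup characters.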
S204: If a confirmation instruction of the management terminal for the attribute information is received, acquiring a first desensitization rule of the first suspected sensitive field.
In an embodiment of the present application, the confirmation instruction is used to indicate that the first suspected sensitive field is a sensitive field. The first desensitization rule may refer to the description of step S103, and is not described herein.
S205: Adding the first suspected sensitive field and the first desensitization rule to the preset sensitive information table.
In the method shown in fig. 2, if a first suspected sensitive field in the test data belongs to neither the preset sensitive information table nor the preset sensitive information exception table, the first sensitivity probability of the first suspected sensitive field is obtained. When the first sensitivity probability is less than or equal to a first threshold, attribute information of the first suspected sensitive field is sent to a management terminal corresponding to the test data. When a confirmation instruction of the management terminal for the attribute information is received, the first suspected sensitive field is determined to be a sensitive field. A first desensitization rule of the first suspected sensitive field is then obtained, and the first suspected sensitive field and the first desensitization rule are added to the preset sensitive information table, so that the comprehensiveness of the preset sensitive information table can be improved, sensitive information corresponding to the desensitization rule is prevented from being leaked, and data security is improved.
In a possible example, after step S203, if a denial instruction of the management terminal for the attribute information is received, the first suspected sensitive field is added to the preset sensitive information exception table.
In an embodiment of the application, the denial instruction is used to indicate that the first suspected sensitive field is a non-sensitive field. It can be understood that, when a denial instruction sent by the management terminal is received, the first suspected sensitive field can be determined to be a non-sensitive field. Adding the first suspected sensitive field to the preset sensitive information exception table improves the comprehensiveness of that table, so that the field can be directly determined to be a non-sensitive field the next time it is identified, which improves identification efficiency.
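Handling of the confirmation and denial instructions can be sketched as below. The `Reply` enumeration and the in-memory table representations are assumptions; how the management terminal actually transmits its instruction is not specified in the application.

```python
from enum import Enum
from typing import Callable, Dict, Set


class Reply(Enum):
    CONFIRM = "confirm"  # the manager judges the field to be sensitive
    DENY = "deny"        # the manager judges the field to be non-sensitive


def handle_manager_reply(field: str, reply: Reply,
                         sensitive_table: Dict[str, str],
                         exception_table: Set[str],
                         get_rule: Callable[[str], str]) -> None:
    """Update the tables according to the manager's instruction."""
    if reply is Reply.CONFIRM:
        sensitive_table[field] = get_rule(field)
    elif reply is Reply.DENY:
        exception_table.add(field)
    else:
        raise ValueError(f"unsupported reply: {reply!r}")
```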
In a possible example, if the first suspected sensitive field does not belong to a preset sensitive information table and the first suspected sensitive field does not belong to a preset sensitive information exception table, the attribute information of the first suspected sensitive field is sent to a management terminal corresponding to the test data.
It can be understood that when the suspected sensitive field cannot be determined as a sensitive field or a non-sensitive field by the preset sensitive information table and the preset sensitive information exception table, manual judgment is introduced, so that false recognition caused by automatic processing can be avoided, and the accuracy of determining the sensitive information is improved.
The method of the embodiments of the present application is set forth above in detail and the apparatus of the embodiments of the present application is provided below.
Referring to fig. 5, fig. 5 is a schematic structural diagram of a sensitive information processing apparatus according to the present application, and as shown in fig. 5, the sensitive information processing apparatus 500 includes:
the storage unit 503 is configured to store a preset sensitive information table and a preset sensitive information exception table;
a processing unit 501, configured to obtain a first suspected sensitive field in test data; if the first suspected sensitive field does not belong to the preset sensitive information table and the first suspected sensitive field does not belong to the preset sensitive information exception table, acquiring a first sensitivity probability of the first suspected sensitive field; if the first sensitivity probability is larger than a first threshold value, acquiring a first desensitization rule of the first suspected sensitive field;
the storage unit 503 is further configured to add the first suspected sensitive field and the first desensitization rule to the preset sensitive information table.
In a possible example, the processing unit 501 is specifically configured to obtain, from a production database, production data corresponding to the test data according to a data identifier of the test data; acquiring a first field from the production data according to the target field column name of the first suspected sensitive field; obtaining a similarity value between the first field and the first suspected sensitive field; and acquiring a first sensitivity probability of the first suspected sensitive field according to the similarity value.
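The lookup of the first field and the computation of the similarity value might look like the following sketch. The production database is modelled as a dict keyed by data identifier, and `difflib.SequenceMatcher` from the Python standard library stands in for whatever similarity measure the application intends; both choices are assumptions. How the similarity value is turned into the first sensitivity probability is left open here because the text above does not fix it.

```python
from difflib import SequenceMatcher
from typing import Dict, Optional


def lookup_first_field(production_db: Dict[str, Dict[str, str]],
                       data_id: str, column_name: str) -> Optional[str]:
    """Fetch the production value stored under the same id and column name."""
    return production_db.get(data_id, {}).get(column_name)


def similarity_value(first_field: str, suspected_field: str) -> float:
    """Return a similarity in [0, 1]; 1.0 means the two strings are identical."""
    return SequenceMatcher(None, first_field, suspected_field).ratio()
```

For instance, a production phone number and its masked test counterpart share their unmasked digits, so their similarity lies strictly between 0 and 1.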
In one possible example, after obtaining the similarity value between the first field and the first suspected sensitive field, the processing unit 501 is further configured to obtain operation information of the test data; acquiring a second sensitivity probability of the first suspected sensitive field according to the operation information; and acquiring the first sensitivity probability of the first suspected sensitive field according to the second sensitivity probability and the similarity value.
In a possible example, the processing unit 501 is specifically configured to obtain, from the preset sensitive information table, a second desensitization rule corresponding to the production data according to the data identifier; desensitizing the first field according to the second desensitization rule to obtain a second field; and acquiring a first desensitization rule of the first suspected sensitive field according to the second field and the first suspected sensitive field.
In a possible example, the processing unit 501 is specifically configured to obtain a plurality of sensitive categories according to the preset sensitive information table; constructing a plurality of keywords according to the plurality of sensitive categories; generating a keyword detection script according to the plurality of keywords; and acquiring a first suspected sensitive field in the test data according to the keyword detection script.
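The keyword detection script can be illustrated by the sketch below, which builds one case-insensitive regular expression from the keywords of the selected sensitive categories and applies it to field column names. The category names and keyword lists in `CATEGORY_KEYWORDS` are invented for the example and would in practice be derived from the preset sensitive information table.

```python
import re
from typing import Iterable, List, Pattern

# Hypothetical keywords per sensitive category.
CATEGORY_KEYWORDS = {
    "phone": ["phone", "mobile", "tel"],
    "identity": ["id_card", "id_no"],
    "bank": ["bank_card", "account_no"],
}


def build_detector(categories: Iterable[str]) -> Pattern[str]:
    """Compile a single pattern that matches any keyword of the categories."""
    keywords = sorted({kw for c in categories
                       for kw in CATEGORY_KEYWORDS.get(c, [c])})
    if not keywords:
        raise ValueError("at least one sensitive category is required")
    return re.compile("|".join(re.escape(kw) for kw in keywords), re.IGNORECASE)


def find_suspected_fields(column_names: Iterable[str],
                          detector: Pattern[str]) -> List[str]:
    """Return the column names that contain any sensitive keyword."""
    return [name for name in column_names if detector.search(name)]
```

Matching on substrings keeps the sketch short but can over-match; a stricter script could match whole name segments or inspect field contents as well.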
In a possible example, the sensitive information processing apparatus 500 further includes a communication unit 502, configured to, after the first sensitivity probability of the first suspected sensitive field is obtained, send attribute information of the first suspected sensitive field to a management terminal corresponding to the test data if the first sensitivity probability is smaller than or equal to the first threshold;
the processing unit 501 is further configured to execute the step of obtaining the first desensitization rule of the first suspected sensitive field if the communication unit 502 receives a confirmation instruction of the management terminal for the attribute information; alternatively, if the communication unit 502 receives a denial instruction of the management terminal for the attribute information, the first suspected sensitive field is added to the preset sensitive information exception table.
In one possible example, before obtaining the first desensitization rule of the first suspected sensitive field, the processing unit 501 is further configured to obtain a sensitivity ratio of a test database corresponding to the test data; acquiring a timing duration according to the sensitivity ratio; and executing the step of obtaining the first desensitization rule of the first suspected sensitive field when a timer reaches the timing duration.
For detailed processes executed by each unit in the sensitive information processing apparatus 500, reference may be made to the execution steps in the foregoing method embodiments, which are not described herein again.
Referring to fig. 6, fig. 6 is a schematic structural diagram of another sensitive information processing apparatus according to an embodiment of the present application, where the sensitive information processing apparatus is a server corresponding to an electronic device or a document processing application. As shown in fig. 6, the sensitive information processing apparatus 600 includes a processor 610, a memory 620, a communication interface 630, and one or more programs 640. The related functions implemented by the communication unit 502 shown in fig. 5 can be implemented by the communication interface 630, the related functions implemented by the storage unit 503 shown in fig. 5 can be implemented by the memory 620, and the related functions implemented by the processing unit 501 shown in fig. 5 can be implemented by the processor 610.
The one or more programs 640 are stored in the memory 620 and configured to be executed by the processor 610, the programs 640 including instructions for:
acquiring a first suspected sensitive field in test data;
if the first suspected sensitive field does not belong to a preset sensitive information table and the first suspected sensitive field does not belong to a preset sensitive information exception table, acquiring a first sensitivity probability of the first suspected sensitive field;
if the first sensitivity probability is larger than a first threshold value, acquiring a first desensitization rule of the first suspected sensitive field;
and adding the first suspected sensitive field and the first desensitization rule to the preset sensitive information table.
In one possible example, in terms of obtaining the first sensitivity probability of the first suspected sensitive field, the program 640 specifically includes instructions for executing the following steps:
obtaining production data corresponding to the test data from a production database according to the data identification of the test data;
acquiring a first field from the production data according to the target field column name of the first suspected sensitive field;
obtaining a similarity value between the first field and the first suspected sensitive field;
and acquiring a first sensitivity probability of the first suspected sensitive field according to the similarity value.
In one possible example, after obtaining the similarity value between the first field and the first suspected sensitive field, the program 640 further includes instructions for executing the following steps:
acquiring operation information of the test data;
acquiring a second sensitivity probability of the first suspected sensitive field according to the operation information;
and acquiring the first sensitivity probability of the first suspected sensitive field according to the second sensitivity probability and the similarity value.
In one possible example, in terms of obtaining the first desensitization rule of the first suspected sensitive field, the program 640 specifically includes instructions for executing the following steps:
acquiring a second desensitization rule corresponding to the production data from the preset sensitive information table according to the data identification;
desensitizing the first field according to the second desensitization rule to obtain a second field;
and acquiring a first desensitization rule of the first suspected sensitive field according to the second field and the first suspected sensitive field.
In one possible example, in terms of obtaining the first suspected sensitive field in the test data, the program 640 specifically includes instructions for executing the following steps:
acquiring a plurality of sensitive categories according to the preset sensitive information table;
constructing a plurality of keywords according to the plurality of sensitive categories;
generating a keyword detection script according to the plurality of keywords;
and acquiring a first suspected sensitive field in the test data according to the keyword detection script.
In one possible example, after obtaining the first sensitivity probability of the first suspected sensitive field, the program 640 further includes instructions for executing the following steps:
if the first sensitivity probability is smaller than or equal to the first threshold, sending attribute information of the first suspected sensitive field to a management terminal corresponding to the test data;
if a confirmation instruction of the management terminal for the attribute information is received, executing the step of obtaining the first desensitization rule of the first suspected sensitive field; or,
if a denial instruction of the management terminal for the attribute information is received, adding the first suspected sensitive field to the preset sensitive information exception table.
In one possible example, before obtaining the first desensitization rule of the first suspected sensitive field, the program 640 further includes instructions for executing the following steps:
acquiring the sensitivity ratio of a test database corresponding to the test data;
acquiring a timing duration according to the sensitivity ratio;
and executing the step of obtaining the first desensitization rule of the first suspected sensitive field when a timer reaches the timing duration.
Embodiments of the present application also provide a computer storage medium, where the computer storage medium stores a computer program for causing a computer to execute part or all of the steps of any one of the methods described in the method embodiments, and the computer includes an electronic device and a server.
Embodiments of the application also provide a computer program product comprising a non-transitory computer readable storage medium storing a computer program operable to cause a computer to perform some or all of the steps of any of the methods recited in the method embodiments. The computer program product may be a software installation package, and the computer includes an electronic device and a server.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present application is not limited by the order of acts described, as some steps may occur in other orders or concurrently depending on the application. Further, those skilled in the art will also appreciate that the embodiments described in this specification are presently preferred and that no particular act or mode of operation is required in the present application.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative: the division into units is merely a logical division, and an actual implementation may use another division; for example, at least one unit or component may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may also be distributed on at least one network unit. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a hardware mode or a software program mode.
The integrated unit, if implemented in the form of a software program module and sold or used as a stand-alone product, may be stored in a computer readable memory. Based on such an understanding, the technical solution of the present application may be embodied in the form of a software product, which is stored in a memory and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the method according to the embodiments of the present application. The aforementioned memory includes various media capable of storing program code, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by associated hardware instructed by a program, which may be stored in a computer-readable memory, which may include: flash disk, ROM, RAM, magnetic or optical disk, and the like.
The foregoing detailed description of the embodiments of the present application has been presented to illustrate the principles and implementations of the present application, and the above description of the embodiments is only provided to help understand the method and the core concept of the present application; meanwhile, for a person skilled in the art, according to the idea of the present application, there may be variations in the specific embodiments and application scope, and in summary, the content of the present specification should not be construed as a limitation to the present application.

Claims (10)

1. A sensitive information processing method, comprising:
acquiring a first suspected sensitive field in test data;
if the first suspected sensitive field does not belong to a preset sensitive information table and the first suspected sensitive field does not belong to a preset sensitive information exception table, acquiring a first sensitivity probability of the first suspected sensitive field;
if the first sensitivity probability is larger than a first threshold value, acquiring a first desensitization rule of the first suspected sensitive field;
and adding the first suspected sensitive field and the first desensitization rule to the preset sensitive information table.
2. The method of claim 1, wherein obtaining the first sensitivity probability of the first suspected sensitive field comprises:
obtaining production data corresponding to the test data from a production database according to the data identification of the test data;
acquiring a first field from the production data according to the target field column name of the first suspected sensitive field;
obtaining a similarity value between the first field and the first suspected sensitive field;
and acquiring a first sensitivity probability of the first suspected sensitive field according to the similarity value.
3. The method of claim 2, wherein after said obtaining a similarity value between said first field and said first suspected sensitive field, said method further comprises:
acquiring operation information of the test data;
acquiring a second sensitivity probability of the first suspected sensitive field according to the operation information;
and acquiring the first sensitivity probability of the first suspected sensitive field according to the second sensitivity probability and the similarity value.
4. The method of claim 2, wherein obtaining the first desensitization rule for the first suspected sensitive field comprises:
acquiring a second desensitization rule corresponding to the production data from the preset sensitive information table according to the data identification;
desensitizing the first field according to the second desensitization rule to obtain a second field;
and acquiring a first desensitization rule of the first suspected sensitive field according to the second field and the first suspected sensitive field.
5. The method of any one of claims 1-4, wherein obtaining a first suspected sensitive field in the test data comprises:
acquiring a plurality of sensitive categories according to the preset sensitive information table;
constructing a plurality of keywords according to the plurality of sensitive categories;
generating a keyword detection script according to the plurality of keywords;
and acquiring a first suspected sensitive field in the test data according to the keyword detection script.
6. The method of any one of claims 1-4, wherein after said obtaining the first sensitivity probability of the first suspected sensitive field, the method further comprises:
if the first sensitivity probability is smaller than or equal to the first threshold, sending attribute information of the first suspected sensitive field to a management terminal corresponding to the test data;
if a confirmation instruction of the management terminal for the attribute information is received, executing the step of obtaining the first desensitization rule of the first suspected sensitive field; or,
if a denial instruction of the management terminal for the attribute information is received, adding the first suspected sensitive field to the preset sensitive information exception table.
7. The method of any one of claims 1-4, wherein prior to the obtaining the first desensitization rule of the first suspected sensitive field, the method further comprises:
acquiring the sensitivity ratio of a test database corresponding to the test data;
acquiring a timing duration according to the sensitivity ratio;
and executing the step of obtaining the first desensitization rule of the first suspected sensitive field when a timer reaches the timing duration.
8. A sensitive information processing apparatus, comprising:
the storage unit is used for storing a preset sensitive information table and a preset sensitive information exception table;
the processing unit is used for acquiring a first suspected sensitive field in the test data; if the first suspected sensitive field does not belong to the preset sensitive information table and the first suspected sensitive field does not belong to the preset sensitive information exception table, acquiring a first sensitivity probability of the first suspected sensitive field; if the first sensitivity probability is larger than a first threshold value, acquiring a first desensitization rule of the first suspected sensitive field;
the storage unit is further configured to add the first suspected sensitive field and the first desensitization rule to the preset sensitive information table.
9. A sensitive information processing apparatus comprising a processor, a memory, a communication interface, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the processor, the programs comprising instructions for performing the steps in the method of any of claims 1-7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program, the computer program causing a computer to implement the method of any one of claims 1-7.
CN202010926974.8A 2020-09-04 2020-09-04 Sensitive information processing method, device and medium Pending CN112069540A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010926974.8A CN112069540A (en) 2020-09-04 2020-09-04 Sensitive information processing method, device and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010926974.8A CN112069540A (en) 2020-09-04 2020-09-04 Sensitive information processing method, device and medium

Publications (1)

Publication Number Publication Date
CN112069540A true CN112069540A (en) 2020-12-11

Family

ID=73663683

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010926974.8A Pending CN112069540A (en) 2020-09-04 2020-09-04 Sensitive information processing method, device and medium

Country Status (1)

Country Link
CN (1) CN112069540A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112835903A (en) * 2021-02-01 2021-05-25 上海上讯信息技术股份有限公司 Sensitive data identification method and equipment

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105825138A (en) * 2015-01-04 2016-08-03 北京神州泰岳软件股份有限公司 Sensitive data identification method and device
CN106203145A (en) * 2016-08-04 2016-12-07 北京网智天元科技股份有限公司 Data desensitization method and relevant device
CN107145799A (en) * 2017-05-04 2017-09-08 山东浪潮云服务信息科技有限公司 Data desensitization method and device
CN109614816A (en) * 2018-11-19 2019-04-12 平安科技(深圳)有限公司 Data desensitization method, device and storage medium
WO2019134339A1 (en) * 2018-01-03 2019-07-11 平安科技(深圳)有限公司 Desensitization method and procedure, application server and computer readable storage medium
CN110222170A (en) * 2019-04-25 2019-09-10 平安科技(深圳)有限公司 Method, apparatus, storage medium and computer equipment for identifying sensitive data
CN111191281A (en) * 2019-12-25 2020-05-22 平安信托有限责任公司 Data desensitization processing method and device, computer equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
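刘聪 (Liu Cong) et al.: "Sensitive information identification method combining trigger events and part-of-speech analysis", Computer Engineering and Applications (《计算机工程与应用》), 30 October 2019 (2019-10-30), pages 1 - 8 *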

Similar Documents

Publication Publication Date Title
CN111400765B (en) Private data access method and device and electronic equipment
CN105391674B (en) Information processing method and system, server and client
CN106713579B (en) Telephone number identification method and device
CN109446837B (en) Text auditing method and device based on sensitive information and readable storage medium
CN108009435B (en) Data desensitization method, device and storage medium
CN105653947B (en) Method and device for assessing the data security risk of an application
CN113364753B (en) Anti-crawler method and device, electronic equipment and computer readable storage medium
CN110389941B (en) Database checking method, device, equipment and storage medium
CN111783138A (en) Sensitive data detection method and device, computer equipment and storage medium
CN109711189B (en) Data desensitization method and device, storage medium and terminal
CN115238286A (en) Data protection method and device, computer equipment and storage medium
CN116432604A (en) Data verification method and device and electronic equipment
CN108684044B (en) User behavior detection system, method and device
CN111737746A (en) Method for desensitizing dynamic configuration data based on java annotation
CN113282591B (en) Authority filtering method, authority filtering device, computer equipment and storage medium
CN114491646A (en) Data desensitization method and device, electronic equipment and storage medium
CN112069540A (en) Sensitive information processing method, device and medium
CN113869789A (en) Risk monitoring method and device, computer equipment and storage medium
CN110472121B (en) Business card information searching method and device, electronic equipment and computer readable storage medium
CN115544566A (en) Log desensitization method, device, equipment and storage medium
CN115544558A (en) Sensitive information detection method and device, computer equipment and storage medium
CN110825976B (en) Website page detection method and device, electronic equipment and medium
CN114662114A (en) Log-based code desensitization vulnerability detection method and related equipment
CN114912003A (en) Document searching method and device, computer equipment and storage medium
CN114707180A (en) Log desensitization method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination