WO2020063349A1 - Data protection method and device, apparatus, computer storage medium - Google Patents

Data protection method and device, apparatus, computer storage medium Download PDF

Info

Publication number
WO2020063349A1
WO2020063349A1 (PCT/CN2019/105390; CN2019105390W)
Authority
WO
WIPO (PCT)
Prior art keywords
data
privacy
processed
sub
model
Prior art date
Application number
PCT/CN2019/105390
Other languages
French (fr)
Chinese (zh)
Inventor
艾东梅 (Ai Dongmei)
Original Assignee
ZTE Corporation (中兴通讯股份有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corporation (中兴通讯股份有限公司)
Publication of WO2020063349A1 publication Critical patent/WO2020063349A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/62Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245Protecting personal data, e.g. for financial or medical purposes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/62Protecting access to data via a platform, e.g. using keys or access control rules

Definitions

  • the embodiments of the present application relate to, but are not limited to, privacy data protection technologies, and in particular, to a data protection method, device, device, and computer storage medium.
  • In the related art, methods for protecting user privacy data are not flexible enough: they cannot determine, according to actual needs, whether user data requires privacy protection.
  • the embodiments of the present application provide a data protection method, device, device, and computer storage medium, which can flexibly protect and manage private data.
  • An embodiment of the present application provides a data protection method.
  • the method includes:
  • each privacy sub-model is a data set representing a privacy attribute, the privacy attributes represented by the n privacy sub-models are different from each other, and n is an integer greater than 1;
  • when the correlation between the data to be processed and the corresponding privacy sub-model is greater than or equal to a preset correlation threshold, an early-warning message is generated to indicate that the data to be processed requires privacy protection.
  • An embodiment of the present application further provides a data protection device, where the device includes a processor and a memory configured to store a computer program capable of running on the processor; wherein,
  • the processor is configured to execute, when running the computer program, the steps of any one of the data protection methods described above.
  • An embodiment of the present application further provides a data protection device, where the device includes an acquisition module and a decision module, where:
  • the obtaining module is configured to obtain n privacy sub-models, where each privacy sub-model is a data set representing one privacy attribute, the privacy attributes represented by the n privacy sub-models are different from each other, and n is an integer greater than 1;
  • the decision module is configured to obtain data to be processed and determine the privacy sub-model corresponding to the data to be processed, and to generate warning information when the correlation between the data to be processed and the corresponding privacy sub-model is greater than or equal to a preset correlation threshold, so as to indicate that the data to be processed requires privacy protection.
  • An embodiment of the present application further provides a computer storage medium storing a computer program; when the computer program is executed by a processor, the steps of any one of the foregoing data protection methods are implemented.
  • In the embodiments of the present application, n privacy sub-models are first obtained, where each privacy sub-model is a data set representing one privacy attribute, the privacy attributes represented by the n privacy sub-models are different from each other, and n is an integer greater than 1. Then, the data to be processed is obtained and the privacy sub-model corresponding to it is determined. Finally, when the correlation between the data to be processed and the corresponding privacy sub-model is greater than or equal to a preset correlation threshold, an early-warning message is generated to indicate that the data to be processed requires privacy protection.
  • Because the n privacy attributes corresponding to the n privacy sub-models can be flexibly set according to the actual needs of the user, n privacy sub-models that meet those needs can be obtained. When the warning information is generated according to these n privacy sub-models, its generation is in line with the actual requirements. That is, by flexibly and autonomously setting the n privacy attributes in advance, warnings about the user's privacy data can be issued with a degree of flexibility and autonomy, which helps prevent the leakage of private data that requires privacy protection.
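The flow just summarised can be sketched end to end. This is a hedged toy illustration only: the string-similarity stand-in for `correlation`, the sample sub-models, and the threshold value are all assumptions, not the application's actual method.

```python
# Toy end-to-end sketch: obtain n privacy sub-models, find the sub-model
# corresponding to incoming data, and warn when the correlation reaches a
# preset threshold. All names, data, and the similarity measure are
# illustrative assumptions.
from difflib import SequenceMatcher

def correlation(text, sub_model):
    # crude stand-in for a learned semantic correlation measure
    return max(SequenceMatcher(None, text, rec).ratio() for rec in sub_model)

sub_models = {                      # n = 2 privacy sub-models (assumed)
    "identity": ["login name", "account password"],
    "interest": ["sports articles", "music playlists"],
}
THRESHOLD = 0.5                     # preset correlation threshold (assumed)

def protect(pending):
    # step 1: determine the corresponding privacy sub-model
    attr = max(sub_models, key=lambda a: correlation(pending, sub_models[a]))
    # step 2: compare the correlation with the preset threshold
    if correlation(pending, sub_models[attr]) >= THRESHOLD:
        return f"warning: '{pending}' may expose {attr} data"
    return None                     # below threshold: no protection needed

assert protect("account password reset") is not None
assert protect("today's weather forecast") is None
```

The two asserts show both branches of the decision: data close to a sub-model triggers a warning, while unrelated data passes through.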
  • FIG. 1 is a flowchart of a data protection method according to an embodiment of the present application.
  • FIG. 2 is a schematic diagram of a clustering result of training data according to an embodiment of the present application.
  • FIG. 3 is a flowchart of another data protection method according to an embodiment of the present application.
  • FIG. 4 is a schematic structural diagram of a data protection device according to an embodiment of the present application.
  • FIG. 5 is a schematic diagram of a hardware structure of a data protection device according to an embodiment of the present application.
  • In the related art, the method for protecting user privacy data is not flexible enough, and it cannot determine according to actual needs whether the user data needs privacy protection.
  • the embodiments of the present application can be applied to any scenario where privacy protection is required.
  • privacy protection is performed on user data generated when an application runs on a terminal
  • it can be implemented based on the technical solution provided in the embodiments of the present application.
  • the embodiments of the present application may be applied to a terminal or other devices, and the terminal or other devices described above may include devices such as a processor and a memory.
  • FIG. 1 is a flowchart of a data protection method according to an embodiment of the present application. As shown in FIG. 1, the process may include:
  • Step 101 Obtain n privacy sub-models; wherein each privacy sub-model is a data set representing a privacy attribute, the privacy attributes represented by the n privacy sub-models are different from each other, and n is an integer greater than 1.
  • In some embodiments, training data may be obtained first, where the training data represents user data generated when an application is running; then, with the preset n privacy attributes as the central objects, the training data is clustered to obtain the n privacy sub-models.
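The clustering step above can be sketched as follows. This is a hedged illustration only: the 2-D vectors standing in for text embeddings, the attribute names, and the Euclidean distance are assumptions, since the application leaves the concrete embedding and learning method open.

```python
# Hypothetical sketch of step 101: cluster training records around n preset
# privacy attributes used as initial central objects, then update the
# centres iteratively (a k-means-style procedure). All data is illustrative.

def assign(records, centres):
    # nearest centre by squared Euclidean distance in a toy 2-D space
    def d2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return [min(range(len(centres)), key=lambda c: d2(r, centres[c]))
            for r in records]

def update(records, labels, centres):
    # new centre = mean of the records assigned to it (keep old if empty)
    new = []
    for c in range(len(centres)):
        member = [r for r, lab in zip(records, labels) if lab == c]
        if member:
            new.append(tuple(sum(x) / len(member) for x in zip(*member)))
        else:
            new.append(centres[c])
    return new

attributes = ["identity", "interest"]           # n = 2 preset privacy attributes
centres = [(0.0, 1.0), (1.0, 0.0)]              # their embeddings (assumed)
records = [(0.1, 0.9), (0.2, 0.8), (0.9, 0.1)]  # embedded training data

for _ in range(10):                             # iterative clustering passes
    labels = assign(records, centres)
    centres = update(records, labels, centres)

# each final cluster is one privacy sub-model (a data set for one attribute)
sub_models = {a: [r for r, lab in zip(records, labels) if lab == i]
              for i, a in enumerate(attributes)}
assert len(sub_models["identity"]) == 2 and len(sub_models["interest"]) == 1
```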
  • In some embodiments, user raw data generated while an application is running may be obtained, and that raw data may be pre-processed to obtain the training data.
  • In some embodiments, at least one of the following pre-processing operations may be performed on the user raw data to obtain the training data: word segmentation, and filtering of useless words, where useless words may include punctuation, single characters, symbols, and other meaningless tokens. It should be noted that the above is only an example implementation of the pre-processing.
  • the pre-processing may also have other implementation manners, which are not limited in the embodiments of the present application.
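As a hedged sketch of this pre-processing (word segmentation plus filtering of useless words), the following uses a simple regex tokenizer; a real pipeline for Chinese text would use a dedicated segmenter, and the helper name `preprocess` is an illustrative assumption.

```python
import re

def preprocess(raw: str) -> list[str]:
    """Tokenise raw user text and drop useless words (punctuation,
    single characters, symbols), yielding training-data tokens."""
    tokens = re.findall(r"[A-Za-z0-9]+", raw.lower())  # crude segmentation
    return [t for t in tokens if len(t) > 1]           # filter single chars

print(preprocess("Logged in @ 9:05, read 2 articles on privacy!"))
```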
  • The above-mentioned user raw data may be user data generated when an application (App) of a mobile terminal runs, and may include various data generated by the user's use of each application of the mobile terminal, such as login information, reading records, consumption records, and preference details.
  • n privacy attributes can be set in advance according to the actual needs for protecting private data.
  • Each of the n privacy attributes indicates a privacy point that the user has determined to protect (that is, a privacy point that the user cares about most).
  • For example, the n privacy attributes may include "identity", "interest", and so on; n can be regarded as a preset protection degree coefficient: the greater the value of n, the more privacy points the user has determined to protect. Further, after setting the n privacy attributes, the user can change them according to actual needs, and the training data is then re-clustered based on the revised privacy attributes to obtain the corresponding privacy sub-models.
  • Through the protection degree coefficient n, users can flexibly determine the privacy protection strategy for their personal data; the coefficient n affects the scope of the privacy protection categories, and users set it according to the degree of privacy protection they require.
  • a user may input a protection degree coefficient n and n privacy attributes through a user interface (UI) of a terminal, which is convenient for user operations.
  • In some embodiments, the n privacy attributes and the user's raw data can be used as input data for constructing the n privacy sub-models; the input data is then processed to obtain the n privacy sub-models.
  • In some embodiments, clustering-based natural language processing methods commonly used in machine learning can automatically cluster the input data of the n privacy sub-models, iteratively updating the central object of each clustering pass until a final clustering result is obtained. The final clustering result may include n clusters whose privacy attributes are different from each other, and each cluster in the final result represents one privacy sub-model. It should be noted that the embodiments of the present application do not limit the structure or learning method of the machine learning model.
  • When a clustering pass is not the first, the central object of the cluster is updated so that the preset evaluation index of the current clustering result is higher than that of the previous clustering result.
  • The preset evaluation index of a clustering result indicates the proximity of the records within the same cluster and the separation between records of different clusters: the closer the records within each cluster and the farther apart the records of different clusters, the higher the preset evaluation index.
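One concrete reading of this evaluation index (an assumption; the application names no specific metric) is a score that grows with between-cluster separation and shrinks with within-cluster spread:

```python
# Hedged sketch of a "preset evaluation index": mean between-cluster distance
# minus mean within-cluster distance. Higher is better, matching the text:
# close records inside clusters, far-apart records across clusters.

def evaluation_index(records, labels):
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    intra, inter = [], []
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            bucket = intra if labels[i] == labels[j] else inter
            bucket.append(dist(records[i], records[j]))
    return sum(inter) / len(inter) - sum(intra) / len(intra)

points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
good = [0, 0, 1, 1]   # compact, well-separated clusters
bad = [0, 1, 0, 1]    # records scattered across clusters
assert evaluation_index(points, good) > evaluation_index(points, bad)
```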
  • a first clustering process is performed on the training data to obtain a first clustering result
  • Let m denote the total number of iterations of the iterative clustering method.
  • For each i from 2 to m:
  • with the goal that the preset evaluation index of the i-th clustering result is higher than that of the (i-1)-th clustering result, the central object of the (i-1)-th clustering is updated to obtain the central object of the i-th clustering;
  • then, the i-th clustering process is performed on the training data to obtain the i-th clustering result.
  • m may be a preset integer greater than 1, or may be determined by a preset iteration termination condition.
  • The preset iteration termination condition may be that the preset evaluation index of the clustering result can no longer be improved by updating the central object of the previous clustering.
  • Step 102 Obtain the data to be processed and determine the privacy sub-model corresponding to the data to be processed. When the correlation between the data to be processed and the corresponding privacy sub-model is greater than or equal to a preset correlation threshold, it is determined that the data to be processed requires privacy protection.
  • In some embodiments, user data generated while an application is running can be monitored, and the monitored user data is taken as the data to be processed.
  • the data to be processed is data to be uploaded by the terminal or data to be saved by the terminal.
  • In some embodiments, with the n privacy sub-models described above as the central objects, a machine learning algorithm computes which privacy sub-model the data to be processed belongs to (the privacy sub-model corresponding to the data to be processed).
  • In some embodiments, the correlation between the data to be processed and each of the n privacy sub-models can be determined separately, and the privacy sub-model with the highest correlation is used as the privacy sub-model corresponding to the data to be processed.
  • In some embodiments, the semantic distance between the data to be processed and each privacy sub-model can be calculated, and the correlation between them can be determined from that semantic distance: the smaller the semantic distance between the data to be processed and a privacy sub-model, the greater the privacy sensitivity of the data to be processed and the greater the correlation.
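One way to realise this semantic-distance step (an illustrative assumption; the application leaves the embedding and distance open) is cosine distance between embedding vectors, with the smallest-distance sub-model taken as the corresponding one:

```python
# Hedged sketch of step 102: cosine distance between embedding vectors as
# the "semantic distance". Smallest distance = highest correlation. The
# vectors and sub-model names below are assumptions for illustration.
import math

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (math.hypot(*a) * math.hypot(*b))

centres = {                      # centre vector of each privacy sub-model
    "identity": (1.0, 0.1),
    "interest": (0.1, 1.0),
}
pending = (0.9, 0.2)             # embedding of the data to be processed

distances = {name: cosine_distance(pending, c) for name, c in centres.items()}
corresponding = min(distances, key=distances.get)  # smallest distance wins
assert corresponding == "identity"
```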
  • the magnitude relationship between the correlation between the data to be processed and the corresponding privacy sub-model and a preset correlation threshold can be judged.
  • When the correlation between the data to be processed and the corresponding privacy sub-model is greater than or equal to the preset correlation threshold, it is determined that the data to be processed requires privacy protection.
  • In this case, an early-warning message may be generated to indicate that the data to be processed requires privacy protection.
  • When the correlation between the data to be processed and the corresponding privacy sub-model is less than the preset correlation threshold, it is determined that the data to be processed does not require privacy protection, and the process may end directly.
  • In some embodiments, the semantic distance between the data to be processed and the corresponding privacy sub-model may be calculated.
  • When the semantic distance between the data to be processed and the corresponding privacy sub-model is less than or equal to a preset semantic distance threshold, it is determined that the correlation between them is greater than or equal to the preset correlation threshold; otherwise,
  • when the semantic distance between the data to be processed and the corresponding privacy sub-model is greater than the preset semantic distance threshold, it is determined that the correlation between them is less than the preset correlation threshold.
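The equivalence described above can be stated in a few lines. A minimal sketch, assuming an arbitrary threshold value chosen for illustration:

```python
# Minimal sketch of the decision rule in the text: "correlation >= the
# correlation threshold" is implemented as "semantic distance <= a preset
# distance threshold". The threshold value is an illustrative assumption.
DISTANCE_THRESHOLD = 0.3

def needs_privacy_protection(semantic_distance: float) -> bool:
    # distance <= distance threshold  <=>  correlation >= correlation threshold
    return semantic_distance <= DISTANCE_THRESHOLD

assert needs_privacy_protection(0.1) is True    # close to sub-model: warn
assert needs_privacy_protection(0.7) is False   # far away: ignore
```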
  • Because the n privacy attributes corresponding to the n privacy sub-models can be flexibly set according to actual needs, n privacy sub-models that meet those needs can be obtained; further, when the warning information is generated according to these sub-models, its generation is in line with actual needs. That is, by flexibly and autonomously setting the n privacy attributes in advance, early-warning reminders about the user's privacy data can be achieved with a degree of flexibility and autonomy, which helps prevent the leakage of private data that requires privacy protection.
  • When the correlation between the data to be processed and the corresponding privacy sub-model is less than the preset correlation threshold, the data to be processed can be ignored, providing an unobstructed channel for data that does not need privacy protection.
  • the data protection method of the first embodiment of the present application may be implemented based on a processor of a terminal or the like.
  • In the related art, the method for protecting user privacy data is not flexible enough, and it cannot determine according to actual needs whether the user data needs privacy protection.
  • In the second embodiment, a machine learning method automatically extracts and aggregates the user's raw data on the terminal according to the preset n privacy attributes, generating a privacy protection scheme that meets the actual needs of the individual user. On this basis, it is detected whether data generated while an application is used matches the degree of openness of private information that the user expects, and corresponding measures are taken. By setting the n privacy attributes, the data generated during application use can be filtered and discriminated, protecting private data from misuse and even attack while still allowing effective use of the data. In other words, this embodiment takes the user's perspective, adhering to the principle of "letting users take charge of their own data": machine learning automatically builds a privacy data protection solution that meets the user's needs and then decides on and manages the privacy data the user may care about, so that privacy is protected even while information is provided in exchange for services.
  • FIG. 3 is a flowchart of another data protection method according to an embodiment of the present application. As shown in FIG. 3, the process may include:
  • Step 301 Obtain data to be processed and n privacy sub-models.
  • Step 302 Determine a privacy sub-model corresponding to the data to be processed.
  • The implementation of this step has been described in step 102 and is not repeated here.
  • Step 303 Determine whether the correlation between the data to be processed and the corresponding privacy sub-model is greater than or equal to a preset correlation threshold, and if yes, execute step 304; if not, end the process.
  • Step 304 Perform early warning or other processing on the data to be processed.
  • When the correlation between the data to be processed and the corresponding privacy sub-model is greater than or equal to the preset correlation threshold, it can be considered highly probable that the data to be processed belongs to a privacy category the user cares about; early-warning information can be generated to prompt that there is a risk of privacy leakage, or privacy protection may be performed directly on the data to be processed.
  • The embodiments do not limit how the warning information is displayed; for example, the terminal UI or other forms may be used to display it.
  • In some embodiments, when the data to be processed is about to be saved, uploaded, or subjected to other operations that may cause privacy leakage, the corresponding operation on the data to be processed can be blocked, and the user can be warned or reminded.
  • the data to be processed may also be added to the corresponding privacy sub-model to make the privacy sub-model more complete.
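This supplement step can be sketched minimally; the dictionary-of-lists representation of a privacy sub-model below is an illustrative assumption:

```python
# Hedged sketch of the supplement step: once a record is judged private, it
# is appended to its corresponding sub-model's data set, so the sub-model
# grows more complete over time.
sub_models = {"identity": ["account id", "login email"]}

def supplement(models: dict, attribute: str, record: str) -> None:
    """Add a record judged private to its corresponding sub-model,
    implementing the supplementary expansion described above."""
    models.setdefault(attribute, []).append(record)

supplement(sub_models, "identity", "passport number")
assert "passport number" in sub_models["identity"]
```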
  • a fourth embodiment of the present application provides a data protection device.
  • FIG. 4 is a schematic structural diagram of a data protection device according to an embodiment of the present application. As shown in FIG. 4, the device includes an obtaining module 401 and a decision module 402, where:
  • the obtaining module 401 is configured to obtain n privacy sub-models, where each privacy sub-model is a data set representing one privacy attribute, the privacy attributes represented by the n privacy sub-models are different from each other, and n is an integer greater than 1;
  • the decision module 402 is configured to obtain data to be processed and determine the privacy sub-model corresponding to the data to be processed, and to generate warning information when the correlation between the data to be processed and the corresponding privacy sub-model is greater than or equal to a preset correlation threshold, so as to indicate that the data to be processed requires privacy protection.
  • In some embodiments, the obtaining module 401 is specifically configured to obtain training data, where the training data represents user data generated when an application is running, and to cluster the training data with the preset n privacy attributes as the central objects to obtain the n privacy sub-models.
  • In some embodiments, the obtaining module 401 is specifically configured to take the preset n privacy attributes as the central objects and use an iterative clustering method to perform multiple clustering passes on the training data to obtain the n privacy sub-models; when a clustering pass is not the first, the central object is updated so that the preset evaluation index of the current clustering result is higher than that of the previous clustering result.
  • The preset evaluation index of the clustering result may be used to indicate the proximity of the records within the same cluster and the distance between the records of different clusters in the clustering result.
  • the data to be processed is data to be uploaded by the terminal or data to be saved by the terminal.
  • the decision module 402 is specifically configured to use, among the n privacy sub-models, a privacy sub-model with the highest correlation with the data to be processed as the privacy sub-model corresponding to the data to be processed.
  • the decision module 402 is further configured to perform privacy protection on the data to be processed when the correlation between the data to be processed and the corresponding privacy sub-model is greater than or equal to a preset correlation threshold.
  • In some embodiments, the decision module 402 is further configured to add the data to be processed to the corresponding privacy sub-model when the correlation between the data to be processed and the corresponding privacy sub-model is greater than or equal to the preset correlation threshold, thereby supplementing and expanding that privacy sub-model.
  • the decision module 402 is further configured to determine that the data to be processed does not require privacy protection when the correlation between the data to be processed and the corresponding privacy sub-model is less than a preset correlation threshold.
  • In practical applications, the above-mentioned obtaining module 401 and decision module 402 may be implemented by a central processing unit (CPU), a microprocessor unit (MPU), a digital signal processor (DSP), or a field-programmable gate array (FPGA) located in a terminal.
  • the functional modules in this embodiment may be integrated into one processing unit, or each unit may exist separately physically, or two or more units may be integrated into one unit.
  • the above integrated unit may be implemented in the form of hardware or in the form of software functional modules.
  • the integrated unit is implemented in the form of a software functional module and is not sold or used as an independent product, it may be stored in a computer-readable storage medium.
  • Based on this understanding, the technical solution of this embodiment, in essence, or the part that contributes to the existing technology, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) or a processor to perform all or part of the steps of the method described in this embodiment.
  • The foregoing storage media include media that can store program code, such as USB flash drives, removable hard disks, read-only memory (ROM), random access memory (RAM), magnetic disks, and optical discs.
  • The computer program instructions corresponding to the data protection method in this embodiment may be stored on a storage medium such as an optical disc, a hard disk, or a USB flash drive; when these instructions are read and executed by a processor, the steps of any one of the data protection methods of the foregoing embodiments are implemented.
  • FIG. 5 shows a data protection device 50 provided by an embodiment of the present application.
  • the device may include: a memory 51, a processor 52, and a bus 53;
  • the bus 53 is configured to connect the memory 51 and the processor 52 and to enable communication between them;
  • the memory 51 is configured to store a computer program and data
  • the processor 52 is configured to execute a computer program stored in the memory to implement the steps of any one of the data protection methods in the foregoing embodiments.
  • In practical applications, the above-mentioned memory 51 may be a volatile memory (for example, RAM), a non-volatile memory (for example, ROM, flash memory, a hard disk drive (HDD), or a solid-state drive (SSD)), or a combination of these types of memory, and it provides instructions and data to the processor 52.
  • The processor 52 may be at least one of an application-specific integrated circuit (ASIC), a DSP, a digital signal processing device (DSPD), a programmable logic device (PLD), an FPGA, a CPU, a controller, a microcontroller, or a microprocessor. It can be understood that, for different devices, other electronic components may be used to implement the processor function, which is not specifically limited in the embodiments of the present application.
  • the embodiments of the present application may be provided as a method, a system, or a computer program product. Therefore, this application may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) containing computer-usable program code.
  • These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to work in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device that implements the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
  • These computer program instructions can also be loaded onto a computer or other programmable data processing device, so that a series of operational steps are performed on the computer or other programmable device to produce a computer-implemented process; the instructions executed on the computer or other programmable device then provide steps for implementing the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the present application provide a data protection method and device, an apparatus, and a computer storage medium. The method comprises: acquiring n privacy sub-models; each privacy sub-model being a data set representing one type of privacy attribute, the privacy attributes represented by the n privacy sub-models being different from each other, and n being an integer greater than one; acquiring data to be processed, and determining a privacy sub-model corresponding to said data; when the correlation between said data and the corresponding privacy sub-model is equal to or greater than a preset correlation threshold, generating early-warning information to prompt that privacy protection needs to be performed on said data.

Description

一种数据保护方法、设备、装置和计算机存储介质Data protection method, equipment, device and computer storage medium 技术领域Technical field
本申请实施例涉及但不限于隐私数据保护技术,尤其涉及一种数据保护方法、设备、装置和计算机存储介质。The embodiments of the present application relate to, but are not limited to, privacy data protection technologies, and in particular, to a data protection method, device, device, and computer storage medium.
背景技术Background technique
随着移动互联网的高速发展,移动终端上的各种应用成为用户了解世界的重要工具;由于互联网的开放性和互通性,使得用户对个人的网上隐私也越来越关注。尽管用户隐私是敏感信息,但仍然随时暴露无遗,示例性地,用户的各种行为轨迹如搜索、浏览、下载、支付、位置、运动量等被各种网站、app和终端等收集、存储、分析,然后被用于精准化营销或其他商业用途,甚至带来信息泄露、身份被盗、恶意攻击等危害。With the rapid development of the mobile Internet, various applications on mobile terminals have become important tools for users to understand the world; due to the openness and interoperability of the Internet, users have become more and more concerned about personal online privacy. Although user privacy is sensitive information, it is still exposed at any time. For example, various user trajectories such as search, browsing, downloading, payment, location, and amount of exercise are collected, stored, and analyzed by various websites, apps, and terminals. , And then used for precision marketing or other commercial purposes, and even bring harm to information leakage, identity theft, and malicious attacks.
相关技术中,对用户隐私数据的保护方法不够灵活,不能够根据实际需求确定是否需要对用户数据进行隐私保护。In related technologies, a method for protecting user privacy data is not flexible enough, and it is impossible to determine whether privacy protection of user data is required according to actual needs.
发明内容Summary of the Invention
The embodiments of the present application provide a data protection method, device, apparatus, and computer storage medium capable of flexibly protecting and managing private data.
To achieve the above objective, the technical solutions of the embodiments of the present application are implemented as follows.
An embodiment of the present application provides a data protection method, the method comprising:
acquiring n privacy sub-models, wherein each privacy sub-model is a data set representing one privacy attribute, the privacy attributes represented by the n privacy sub-models are different from each other, and n is an integer greater than 1;
acquiring data to be processed, and determining the privacy sub-model corresponding to the data to be processed; and
when the correlation between the data to be processed and the corresponding privacy sub-model is greater than or equal to a preset correlation threshold, generating warning information to indicate that the data to be processed needs privacy protection.
An embodiment of the present application further provides a data protection device, the device comprising a processor and a memory configured to store a computer program runnable on the processor, wherein
the processor is configured, when running the computer program, to execute the steps of any one of the data protection methods described above.
An embodiment of the present application further provides a data protection apparatus, the apparatus comprising an acquisition module and a decision module, wherein:
the acquisition module is configured to acquire n privacy sub-models, wherein each privacy sub-model is a data set representing one privacy attribute, the privacy attributes represented by the n privacy sub-models are different from each other, and n is an integer greater than 1; and
the decision module is configured to acquire data to be processed and determine the privacy sub-model corresponding to it, and, when the correlation between the data to be processed and the corresponding privacy sub-model is greater than or equal to a preset correlation threshold, to generate warning information indicating that the data to be processed needs privacy protection.
An embodiment of the present application further provides a computer storage medium storing a computer program which, when executed by a processor, implements the steps of any one of the data protection methods described above.
In the data protection method, device, apparatus, and computer storage medium provided by the embodiments of the present application, n privacy sub-models are first acquired, wherein each privacy sub-model is a data set representing one privacy attribute, the privacy attributes represented by the n privacy sub-models are different from each other, and n is an integer greater than 1; then, data to be processed is acquired and the privacy sub-model corresponding to it is determined; finally, when the correlation between the data to be processed and the corresponding privacy sub-model is greater than or equal to a preset correlation threshold, warning information is generated to indicate that the data to be processed needs privacy protection.
With the above technical solution, because the n privacy attributes corresponding to the n privacy sub-models can be set flexibly by users according to their actual needs, n privacy sub-models matching those needs can be obtained; consequently, when these sub-models determine that warning information should be generated, the warning reflects the user's actual needs. In other words, by flexibly and autonomously setting the n privacy attributes in advance, warnings about user privacy data can be raised with a degree of flexibility and autonomy, preventing the leakage of private data that requires protection.
Brief Description of the Drawings

FIG. 1 is a flowchart of a data protection method according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a clustering result for training data according to an embodiment of the present application;
FIG. 3 is a flowchart of another data protection method according to an embodiment of the present application;
FIG. 4 is a schematic diagram of the composition of a data protection apparatus according to an embodiment of the present application;
FIG. 5 is a schematic diagram of the hardware structure of a data protection device according to an embodiment of the present application.

Detailed Description
The present application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are intended only to explain the application, not to limit it.
In the related art, there are two main methods for protecting user privacy data. (1) Protecting the application itself: specifically, a program-disguise trigger is created to camouflage the application's icon and name, so that even after the terminal's lock screen is released, the application retains strong privacy for the user and cannot be used by anyone other than the terminal's owner, thereby protecting the privacy of user data within the application. (2) Protecting the user data itself: specifically, the user's private data is transformed to generate anonymized data that hides personal privacy.
When method (1) is used, only the application's identification information is protected, so the scope of protection is small and the effect on user privacy data is limited. When method (2) is used, the user data must be anonymized, which requires an association to be established between the anonymized information and the real data; the fidelity of this mapping directly affects how the private data can be used. In other words, protection schemes that transform the user's private data interfere with the use of that data.
It can be seen that, in the related art, methods for protecting user privacy data are not flexible enough: they cannot determine, according to actual needs, whether privacy protection should be applied to user data.
The embodiments of the present application can be applied to any scenario requiring privacy protection. For example, privacy protection of user data generated while applications run on a terminal can be implemented based on the technical solutions provided herein; the embodiments place no restriction on the types of applications running on the terminal.
The embodiments of the present application can be applied to terminals or other devices; the terminals or other devices described above may include components such as a processor and a memory.
Based on the above, the following specific embodiments are proposed.
First Embodiment
The first embodiment of the present application describes a data protection method. FIG. 1 is a flowchart of a data protection method according to an embodiment of the present application. As shown in FIG. 1, the process may include the following steps.
Step 101: Acquire n privacy sub-models, wherein each privacy sub-model is a data set representing one privacy attribute, the privacy attributes represented by the n privacy sub-models are different from each other, and n is an integer greater than 1.
As an exemplary implementation of this step, training data may first be acquired, the training data representing user data generated while applications run; then, with n preset privacy attributes as the central objects, the training data is clustered to obtain the n privacy sub-models.
In practice, raw user data generated while applications run may be acquired and preprocessed to obtain the training data. For example, at least one of the following may be applied to the raw user data to obtain the training data: word segmentation, and filtering of useless words, where useless words may include punctuation, single characters, symbols, and other meaningless tokens. It should be noted that the above merely illustrates preprocessing by way of example; other preprocessing implementations are possible and are not limited by the embodiments of the present application.
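As an illustrative sketch of the preprocessing just described — word segmentation plus filtering of punctuation, single characters, and useless words — the following uses a simple regex tokenizer and a tiny English stop-word list; both are placeholder assumptions, since the embodiment leaves the segmenter and word list unspecified (a real implementation would likely use a language-specific segmenter, e.g. for Chinese):

```python
import re

# Illustrative stop-word list; not part of the embodiment.
STOP_WORDS = {"the", "a", "an", "of", "to", "and"}

def preprocess(raw_records):
    """Turn raw user records into training tokens: segment into words,
    then drop punctuation, single characters, and stop words."""
    tokens = []
    for record in raw_records:
        # Regex tokenization stands in for real word segmentation.
        for word in re.findall(r"[A-Za-z]+", record.lower()):
            if len(word) > 1 and word not in STOP_WORDS:
                tokens.append(word)
    return tokens

print(preprocess(["Paid $12 to BookStore!", "The login: user-name"]))
```

Running the sketch keeps only the meaningful words ("paid", "bookstore", "login", "user", "name"), discarding digits, symbols, and stop words, which matches the filtering behavior described above.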
As to the raw user data described above, for example, it may be user data generated while applications (Apps) run on a mobile terminal, and may include the various data a user produces when using each application, such as login information, reading activity, purchases, and preference details.
In practical applications, n different privacy attributes may be preset according to the actual need to protect private data. Each of the n privacy attributes represents a privacy point the user has decided to protect (i.e., a privacy point the user cares most about); for example, the n privacy attributes may include "identity", "interests", and so on. n can be regarded as a preset protection-degree coefficient: the larger the value of n, the more privacy points the user has chosen to protect. Furthermore, after the n privacy attributes are set, the user may change them according to actual needs, and the training data can then be re-clustered based on the revised privacy attributes to obtain the corresponding privacy sub-models.
It can be seen that, by setting the protection-degree coefficient n, the user can flexibly determine the privacy protection policy for personal data; the coefficient directly affects the scope of privacy protection. Users set the coefficient according to the degree of privacy protection they require.
In a specific implementation, the user may input the protection-degree coefficient n and the n privacy attributes through the terminal's user interface (UI), which is convenient to operate.
In the embodiments of the present application, after the n privacy attributes and the raw user data are acquired, they can be used as the input data for constructing the n privacy sub-models; this input data is then processed to obtain the n privacy sub-models.
As to how the n privacy sub-models are obtained, a clustering-based natural-language-processing method common in machine learning may be used to automatically cluster the input data, iteratively updating the central objects of each clustering pass until the final clustering result is obtained. Here, the final clustering result may include n clusters whose privacy attributes are different from each other, each cluster representing one privacy sub-model. It should be noted that the embodiments of the present application do not restrict the machine-learning model structure or learning method.
That is, after the training data and the n preset privacy attributes are obtained, the training data is clustered multiple times by an iterative clustering method with the n preset privacy attributes as the central objects, yielding the n privacy sub-models; in every clustering pass after the first, the central objects are updated so that the preset evaluation index of the current clustering result is higher than that of the previous clustering result.
Here, the preset evaluation index of a clustering result can be taken to express the proximity of records within the same cluster and the separation between records of different clusters: the closer the records within a cluster, and the farther apart the records of different clusters, the higher the evaluation index.
Specifically, a first clustering pass is performed on the training data to obtain a first clustering result;
let m denote the total number of iterations of the iterative clustering method; for i from 2 to m, on the basis of the (i-1)-th clustering result, the central objects of the (i-1)-th pass are updated — with the goal that the preset evaluation index of the i-th clustering result exceed that of the (i-1)-th — to obtain the central objects of the i-th pass; the training data is then clustered for the i-th time according to those central objects, yielding the i-th clustering result.
Here, m may be a preset integer greater than 1, or may be determined by a preset iteration-termination condition; in one example, the termination condition may be that the central objects of the previous pass can no longer be updated in a way that raises the evaluation index of the clustering result.
The clustering result for the training data in an embodiment of the present application is illustrated in FIG. 2. For ease of understanding, referring to FIG. 2: for each clustering pass i from 1 to m, the n central objects of the i-th pass are abstracted as n points; with these points as cores, the objects in the data set that are semantically closest to them are attracted and aggregated, forming clusters centered on the n attributes respectively. In FIG. 2, the four circles labeled K1, K2, K3, and Kn identify four clusters, each representing the various attributes related to one privacy attribute; for example, the cluster centered on "identity" gathers the attributes related to "identity", such as "name", and so on. If the user wants a wider scope of privacy protection, this can be achieved by adjusting the protection-degree coefficient: the larger the coefficient, the more privacy attributes need to be set.
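The iterative clustering described above — the n preset privacy attributes seeding the initial central objects, records being pulled to the nearest center, and centers being updated pass by pass — might be sketched as follows. Real semantic embeddings are replaced here with toy 2-D vectors, and the k-means-style mean update is an illustrative choice, not mandated by the embodiment:

```python
def cluster(records, centers, iters=10):
    """K-means-style clustering: `centers` are seeded from the n preset
    privacy attributes; each pass reassigns records to the nearest
    center, then moves each center to the mean of its cluster."""
    centers = [list(c) for c in centers]
    assign = []
    for _ in range(iters):
        # Assign every record to its closest center (squared Euclidean
        # distance stands in for a semantic distance).
        assign = [min(range(len(centers)),
                      key=lambda k: sum((r[d] - centers[k][d]) ** 2
                                        for d in range(len(r))))
                  for r in records]
        # Update each center to the mean of the records assigned to it.
        for k in range(len(centers)):
            members = [r for r, a in zip(records, assign) if a == k]
            if members:
                centers[k] = [sum(v[d] for v in members) / len(members)
                              for d in range(len(members[0]))]
    return assign, centers

# Toy vectors: two records near the "identity" seed, one near "interests".
records = [[0.1, 0.0], [0.2, 0.1], [0.9, 1.0]]
seeds = [[0.0, 0.0], [1.0, 1.0]]   # n = 2 preset privacy attributes
labels, final_centers = cluster(records, seeds)
print(labels)   # each record's cluster index
```

Each resulting cluster (all records sharing one label, together with its final center) corresponds to one privacy sub-model in the sense used above.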
Step 102: Acquire data to be processed and determine the privacy sub-model corresponding to it; when the correlation between the data to be processed and the corresponding privacy sub-model is greater than or equal to a preset correlation threshold, determine that the data to be processed needs privacy protection.
In practical applications, user data generated while applications run can be monitored, and the monitored data taken as the data to be processed. For example, the data to be processed may be data the terminal is about to upload or data the terminal is about to save.
After the data to be processed is acquired, with the n privacy sub-models described above as central objects, sufficient computation is performed by a machine-learning algorithm to determine the privacy sub-model to which the data belongs (the privacy sub-model corresponding to the data to be processed). For example, the correlation between the data to be processed and each of the n privacy sub-models may be determined, and the sub-model with the greatest correlation taken as the corresponding one.
Here, as to determining the correlation between the data to be processed and each privacy sub-model, in one example the semantic distance between them can be computed and the correlation derived from it: the smaller the semantic distance between the data and a sub-model, the more privacy-sensitive the data and the greater the correlation.
In a specific implementation, after the corresponding privacy sub-model is determined, the correlation between the data to be processed and that sub-model is compared with the preset correlation threshold. When the correlation is greater than or equal to the threshold, the data is determined to need privacy protection, and warning information can be generated to indicate this; when the correlation is below the threshold, the data is determined not to need privacy protection, and the process can simply end.
In the embodiments of the present application, the comparison may be implemented, for example, by computing the semantic distance between the data to be processed and the corresponding privacy sub-model: when that distance is less than or equal to a preset semantic-distance threshold, the correlation is determined to be greater than or equal to the correlation threshold; conversely, when the distance exceeds the semantic-distance threshold, the correlation is determined to be below the correlation threshold.
Because the n privacy attributes corresponding to the n privacy sub-models can be set flexibly according to actual needs, privacy sub-models matching those needs can be obtained, and warnings generated on their basis reflect actual needs. That is, by flexibly and autonomously presetting the n privacy attributes, warnings about user privacy data can be raised with a degree of flexibility and autonomy, preventing the leakage of private data that requires protection. In addition, when the correlation between the data to be processed and the corresponding privacy sub-model is below the preset correlation threshold, the data can be ignored, providing a safe channel and a solid guarantee for data that does not need privacy protection.
In practical applications, the data protection method of the first embodiment may be implemented based on the processor of a terminal or similar hardware.
In the related art, when a user installs an App on a mobile terminal, the user must accept all of the permissions the App declares, or the App cannot be installed and used; research in this field has also focused mainly on malware detection, cloud- and server-assisted schemes, and privacy protection methods based on information-flow control. Nothing in the App ecosystem — no agreement or policy — lets users decide for themselves which of their information is disclosed and which is hidden.
It can be seen that, in the related art, methods for protecting user privacy data are not flexible enough: they cannot determine, according to actual needs, whether privacy protection should be applied to user data.
In the embodiments of the present application, by contrast, a machine-learning method automatically distills and aggregates the raw user data on the terminal according to the n preset privacy attributes, generating a privacy protection scheme tailored to the individual user's actual needs. On this basis, the data generated while applications are used is checked against the degree of openness the user expects for private information, and corresponding measures are taken. It can be seen that, by setting the n privacy attributes, the data generated during application use can be filtered and screened, which both protects private data from abuse or even attack and allows the data to be used effectively. In other words, starting from the user's perspective and adhering to the principle of "letting users be the masters of their own data", the embodiments use machine learning to automatically construct a privacy-data protection scheme that fits the user's needs, and then make decisions about, and manage, the private data the user may care about, so that users can protect their privacy while providing information in exchange for services.
Second Embodiment
Further examples are described on the basis of the data protection method proposed in the foregoing embodiment of the present application.
FIG. 3 is a flowchart of another data protection method according to an embodiment of the present application. As shown in FIG. 3, the process may include the following steps.
Step 301: Acquire the data to be processed and the n privacy sub-models.
The implementation of this step has been described in the first embodiment and is not repeated here.
Step 302: Determine the privacy sub-model corresponding to the data to be processed.
The implementation of this step has been described in step 102 and is not repeated here.
Step 303: Judge whether the correlation between the data to be processed and the corresponding privacy sub-model is greater than or equal to the preset correlation threshold; if so, execute step 304; otherwise, end the process.
Step 304: Issue a warning for, or otherwise process, the data to be processed.
Specifically, when the correlation between the data to be processed and the corresponding privacy sub-model is greater than or equal to the preset correlation threshold, the data can be considered highly likely to fall within the privacy scope the user cares about; warning information can be generated to indicate a risk of privacy leakage, or privacy protection can be applied to the data directly. As to presenting the warning information, for example, the terminal's UI or another channel may be used. As to applying privacy protection, for example, when it is determined that the data is about to be saved, uploaded, or subjected to another operation that could leak privacy, the operation can be blocked and the user warned or reminded.
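The handling just described — blocking a save or upload that could leak privacy and alerting the user, while letting other data through — can be sketched as follows; the record format, return structure, and alert wording are all illustrative assumptions:

```python
def handle_operation(record, is_private, operation):
    """Gate a potentially leaking operation (save/upload): block it and
    return an alert when the data needs privacy protection, otherwise
    let the operation proceed unchanged."""
    if is_private:
        return {"allowed": False,
                "alert": f"Privacy risk: blocked '{operation}' of {record!r}"}
    return {"allowed": True, "alert": None}

print(handle_operation("name=Alice", True, "upload"))
print(handle_operation("weather=sunny", False, "save"))
```

In a real terminal the `alert` text would be surfaced through the UI mentioned above, and the blocked operation would simply not be performed.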
Further, when the correlation between the data to be processed and the corresponding privacy sub-model is greater than or equal to the preset correlation threshold, the data can also be added to the corresponding privacy sub-model, making the sub-model more complete.
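The refinement step above — adding flagged data to its corresponding privacy sub-model — could be sketched as follows; the sub-model's dictionary layout, and recomputing its center as the mean of its members after the addition, are illustrative choices not specified by the embodiment:

```python
def add_to_submodel(submodel, vec):
    """Append a flagged record's vector to its privacy sub-model and
    recompute the sub-model's center as the mean of its members."""
    submodel["members"].append(vec)
    dims = len(vec)
    n = len(submodel["members"])
    submodel["center"] = [sum(m[d] for m in submodel["members"]) / n
                          for d in range(dims)]
    return submodel

model = {"attribute": "identity",
         "members": [[1.0, 0.0]],
         "center": [1.0, 0.0]}
print(add_to_submodel(model, [0.0, 1.0])["center"])   # → [0.5, 0.5]
```

Updating the center this way means later semantic-distance comparisons automatically take the newly flagged data into account.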
Third Embodiment
On the basis of the data protection method proposed in the foregoing embodiments, the third embodiment of the present application provides a data protection apparatus.
FIG. 4 is a schematic diagram of the composition of a data protection apparatus according to an embodiment of the present application. As shown in FIG. 4, the apparatus includes an acquisition module 401 and a decision module 402, wherein:
the acquisition module 401 is configured to acquire n privacy sub-models, wherein each privacy sub-model is a data set representing one privacy attribute, the privacy attributes represented by the n privacy sub-models are different from each other, and n is an integer greater than 1; and
the decision module 402 is configured to acquire data to be processed and determine the privacy sub-model corresponding to it, and, when the correlation between the data to be processed and the corresponding privacy sub-model is greater than or equal to a preset correlation threshold, to generate warning information indicating that the data to be processed needs privacy protection.
In an implementation, the obtaining module 401 is specifically configured to obtain training data, the training data representing user data generated while applications run, and to cluster the training data using n preset privacy attributes as center objects, thereby obtaining the n privacy sub-models.
In an implementation, the obtaining module 401 is specifically configured to use the n preset privacy attributes as center objects and apply an iterative clustering method that clusters the training data multiple times to obtain the n privacy sub-models, wherein, in each clustering pass after the first, the center objects of the clusters are updated so that the preset evaluation index of the current clustering result is higher than that of the previous clustering result.
In an implementation, the preset evaluation index of a clustering result may indicate the closeness of the records within the same cluster and the degree of separation between records belonging to different clusters.
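The iterative clustering and evaluation index described in the implementations above can be sketched, under the simplifying assumption that privacy attributes and training records are numeric feature vectors, as a k-means-style loop whose center updates are kept only when they raise the evaluation index (taken here, by way of example only, as inter-center separation minus intra-cluster distance):

```python
# Illustrative sketch only: privacy attributes and training records are
# assumed to be numeric feature vectors; real user data would first have to
# be embedded into such a space.
import math

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def assign(records, centers):
    """Assign each record to its nearest center, yielding n clusters."""
    clusters = [[] for _ in centers]
    for r in records:
        i = min(range(len(centers)), key=lambda k: dist(r, centers[k]))
        clusters[i].append(r)
    return clusters

def evaluation_index(clusters, centers):
    """Higher is better: records close to their own center (cohesion) and
    centers far from one another (separation)."""
    cohesion = sum(dist(r, c) for cl, c in zip(clusters, centers) for r in cl)
    separation = sum(dist(a, b) for i, a in enumerate(centers)
                     for b in centers[i + 1:])
    return separation - cohesion

def cluster_privacy_submodels(records, preset_attributes, max_iter=10):
    # Preset privacy attributes serve as the initial center objects.
    centers = [list(a) for a in preset_attributes]
    clusters = assign(records, centers)
    best = evaluation_index(clusters, centers)
    for _ in range(max_iter):
        # Candidate update: move each center to the mean of its cluster.
        new_centers = [
            [sum(col) / len(cl) for col in zip(*cl)] if cl else c
            for cl, c in zip(clusters, centers)
        ]
        new_clusters = assign(records, new_centers)
        score = evaluation_index(new_clusters, new_centers)
        if score <= best:  # keep only updates that raise the evaluation index
            break
        centers, clusters, best = new_centers, new_clusters, score
    return clusters  # the n privacy sub-models (data sets)
```

The update rule mirrors the requirement that each non-first clustering pass improve on the previous one: a candidate set of centers is discarded unless the evaluation index rises.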
In an implementation, the data to be processed is data to be uploaded by the terminal or data to be saved by the terminal.
In an implementation, the decision module 402 is specifically configured to select, from among the n privacy sub-models, the privacy sub-model having the greatest correlation with the data to be processed as the privacy sub-model corresponding to the data to be processed.
In an implementation, the decision module 402 is further configured to apply privacy protection to the data to be processed when the correlation between the data to be processed and the corresponding privacy sub-model is greater than or equal to the preset correlation threshold.
In an implementation, the decision module 402 is further configured to add the data to be processed to the corresponding privacy sub-model when the correlation between the data to be processed and the corresponding privacy sub-model is greater than or equal to the preset correlation threshold, thereby supplementing and expanding the corresponding privacy sub-model.
In an implementation, the decision module 402 is further configured to determine that the data to be processed does not require privacy protection when the correlation between the data to be processed and the corresponding privacy sub-model is less than the preset correlation threshold.
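The decision module's behavior in the implementations above can be sketched as follows. The correlation measure (an inverse-distance similarity to each sub-model's centroid) and the numeric representation of the data are assumptions of this sketch, since the embodiments do not fix a particular correlation computation:

```python
import math

def centroid(submodel):
    return [sum(col) / len(submodel) for col in zip(*submodel)]

def correlation(record, submodel):
    """Illustrative correlation: a similarity in (0, 1] that decays with
    distance from the sub-model's centroid (an assumed measure)."""
    c = centroid(submodel)
    d = math.sqrt(sum((x - y) ** 2 for x, y in zip(record, c)))
    return 1.0 / (1.0 + d)

def decide(record, submodels, threshold):
    """Pick the sub-model with the greatest correlation; when the threshold
    is met, warn and expand that sub-model, otherwise no protection is
    needed."""
    scores = [correlation(record, m) for m in submodels]
    best = max(range(len(submodels)), key=lambda i: scores[i])
    if scores[best] >= threshold:
        submodels[best].append(record)        # supplement/expand the sub-model
        return {"warn": True, "submodel": best}
    return {"warn": False, "submodel": best}  # no privacy protection needed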
In practical applications, both the obtaining module 401 and the decision module 402 may be implemented by a Central Processing Unit (CPU), a Micro Processor Unit (MPU), a Digital Signal Processor (DSP), a Field Programmable Gate Array (FPGA), or the like located in the terminal.
In addition, the functional modules in this embodiment may be integrated into one processing unit, or each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional module.
If the integrated unit is implemented as a software functional module and is not sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this embodiment in essence, or the part that contributes to the related art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform all or some of the steps of the method described in this embodiment. The foregoing storage medium includes any medium capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disc.
Specifically, the computer program instructions corresponding to a data protection method in this embodiment may be stored on a storage medium such as an optical disc, a hard disk, or a USB flash drive; when the computer program instructions corresponding to the data protection method in the storage medium are read or executed by an electronic device, the steps of any one of the data protection methods of the foregoing embodiments are implemented.
Based on the same technical concept as the foregoing embodiments, FIG. 5 shows a data protection device 50 provided by an embodiment of the present application. The device may include a memory 51, a processor 52, and a bus 53, wherein:
the bus 53 is configured to connect the memory 51 and the processor 52 and to carry communication between these components;
the memory 51 is configured to store a computer program and data; and
the processor 52 is configured to execute the computer program stored in the memory to implement the steps of any one of the data protection methods of the foregoing embodiments.
In practical applications, the memory 51 may be a volatile memory such as a RAM; a non-volatile memory such as a ROM, a flash memory, a Hard Disk Drive (HDD), or a Solid-State Drive (SSD); or a combination of memories of the above kinds, and provides instructions and data to the processor 52.
The processor 52 may be at least one of an Application Specific Integrated Circuit (ASIC), a DSP, a Digital Signal Processing Device (DSPD), a Programmable Logic Device (PLD), an FPGA, a CPU, a controller, a microcontroller, or a microprocessor. It can be understood that, for different devices, other electronic components may be used to implement the above processor functions, which is not specifically limited in the embodiments of the present application.
Those skilled in the art should understand that the embodiments of the present application may be provided as a method, a system, or a computer program product. Therefore, the present application may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage and optical storage) containing computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of the method, device (system), and computer program product according to the embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or another programmable data processing device to operate in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus that implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or another programmable data processing device, so that a series of operation steps are performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
The above are merely preferred embodiments of the present application and are not intended to limit the protection scope of the present application.
Industrial applicability
With the technical solution described above, because the n privacy attributes corresponding to the n privacy sub-models can be set flexibly by the user according to actual needs, n privacy sub-models that meet those needs can be obtained; accordingly, when it is determined on the basis of these n privacy sub-models that warning information should be generated, the generation of the warning information likewise meets the actual needs. In other words, by flexibly and autonomously presetting the n privacy attributes, warnings and reminders about the user's private data can be provided with a degree of flexibility and autonomy, preventing the leakage of private data that requires privacy protection.

Claims (12)

  1. A data protection method, wherein the method comprises:
    obtaining n privacy sub-models, wherein each privacy sub-model is a data set representing one privacy attribute, the privacy attributes represented by the n privacy sub-models are different from one another, and n is an integer greater than 1; and
    obtaining data to be processed and determining the privacy sub-model corresponding to the data to be processed; and when the correlation between the data to be processed and the corresponding privacy sub-model is greater than or equal to a preset correlation threshold, generating warning information to indicate that the data to be processed requires privacy protection.
  2. The method according to claim 1, wherein obtaining the n privacy sub-models comprises:
    obtaining training data, wherein the training data represents user data generated while applications run; and
    clustering the training data using n preset privacy attributes as center objects to obtain the n privacy sub-models.
  3. The method according to claim 2, wherein clustering the training data using the n preset privacy attributes as center objects to obtain the n privacy sub-models comprises:
    using the n preset privacy attributes as center objects, applying an iterative clustering method that clusters the training data multiple times to obtain the n privacy sub-models, wherein, in each clustering pass after the first, the center objects of the clusters are updated so that the preset evaluation index of the current clustering result is higher than that of the previous clustering result.
  4. The method according to claim 3, wherein the preset evaluation index of a clustering result indicates the closeness of the records within the same cluster and the degree of separation between records belonging to different clusters.
  5. The method according to claim 1, wherein the data to be processed is data to be uploaded by a terminal or data to be saved by the terminal.
  6. The method according to claim 1, wherein determining the privacy sub-model corresponding to the data to be processed comprises:
    selecting, from among the n privacy sub-models, the privacy sub-model having the greatest correlation with the data to be processed as the privacy sub-model corresponding to the data to be processed.
  7. The method according to claim 1, further comprising: when the correlation between the data to be processed and the corresponding privacy sub-model is greater than or equal to the preset correlation threshold, applying privacy protection to the data to be processed.
  8. The method according to claim 1, further comprising: when the correlation between the data to be processed and the corresponding privacy sub-model is greater than or equal to the preset correlation threshold, adding the data to be processed to the corresponding privacy sub-model, thereby supplementing and expanding the corresponding privacy sub-model.
  9. The method according to any one of claims 1 to 8, further comprising:
    when the correlation between the data to be processed and the corresponding privacy sub-model is less than the preset correlation threshold, determining that the data to be processed does not require privacy protection.
  10. A data protection device, wherein the device comprises a processor and a memory configured to store a computer program executable on the processor, wherein
    the processor is configured to run the computer program to perform the steps of the method according to any one of claims 1 to 9.
  11. A data protection apparatus, wherein the apparatus comprises an obtaining module and a decision module, wherein
    the obtaining module is configured to obtain n privacy sub-models, wherein each privacy sub-model is a data set representing one privacy attribute, the privacy attributes represented by the n privacy sub-models are different from one another, and n is an integer greater than 1; and
    the decision module is configured to obtain data to be processed, determine the privacy sub-model corresponding to the data to be processed, and, when the correlation between the data to be processed and the corresponding privacy sub-model is greater than or equal to a preset correlation threshold, generate warning information to indicate that the data to be processed requires privacy protection.
  12. A computer storage medium having stored thereon a computer program, wherein when the computer program is executed by a processor, the steps of the method according to any one of claims 1 to 9 are implemented.
PCT/CN2019/105390 2018-09-30 2019-09-11 Data protection method and device, apparatus, computer storage medium WO2020063349A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811162220.9 2018-09-30
CN201811162220.9A CN110968889A (en) 2018-09-30 2018-09-30 Data protection method, equipment, device and computer storage medium

Publications (1)

Publication Number Publication Date
WO2020063349A1

Family

ID=69951173

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/105390 WO2020063349A1 (en) 2018-09-30 2019-09-11 Data protection method and device, apparatus, computer storage medium

Country Status (2)

Country Link
CN (1) CN110968889A (en)
WO (1) WO2020063349A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112818390A (en) * 2021-01-26 2021-05-18 支付宝(杭州)信息技术有限公司 Data information publishing method, device and equipment based on privacy protection
CN113742781B (en) * 2021-09-24 2024-04-05 湖北工业大学 K anonymous clustering privacy protection method, system, computer equipment and terminal

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014169600A1 (en) * 2013-09-06 2014-10-23 中兴通讯股份有限公司 Method, device and terminal for processing hidden file folder
CN104200170A (en) * 2014-04-15 2014-12-10 中兴通讯股份有限公司 Privacy protection method of electronic equipment and electronic equipment
CN106599709A (en) * 2015-10-15 2017-04-26 中兴通讯股份有限公司 Privacy information leakage prevention method and device as well as terminal
WO2017187207A1 (en) * 2016-04-29 2017-11-02 Privitar Limited Computer-implemented privacy engineering system and method
CN107563204A (en) * 2017-08-24 2018-01-09 西安电子科技大学 A kind of privacy leakage methods of risk assessment of anonymous data

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102231277A (en) * 2011-06-29 2011-11-02 电子科技大学 Method for protecting mobile terminal privacy based on voiceprint recognition
WO2016206041A1 (en) * 2015-06-25 2016-12-29 宇龙计算机通信科技(深圳)有限公司 Terminal data protection method and apparatus
CN106709588B (en) * 2015-11-13 2022-05-17 日本电气株式会社 Prediction model construction method and device and real-time prediction method and device
GB201610883D0 (en) * 2016-06-22 2016-08-03 Microsoft Technology Licensing Llc Privacy-preserving machine learning
CN107358111B (en) * 2017-08-28 2019-11-22 维沃移动通信有限公司 A kind of method for secret protection and mobile terminal
CN107819945B (en) * 2017-10-30 2020-11-03 同济大学 Handheld device browsing behavior authentication method and system integrating multiple factors

Also Published As

Publication number Publication date
CN110968889A (en) 2020-04-07


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19867696

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 24.08.2021)

122 Ep: pct application non-entry in european phase

Ref document number: 19867696

Country of ref document: EP

Kind code of ref document: A1