CN110968889A - Data protection method, equipment, device and computer storage medium - Google Patents

Data protection method, equipment, device and computer storage medium

Info

Publication number
CN110968889A
CN110968889A (application number CN201811162220.9A)
Authority
CN
China
Prior art keywords
privacy
data
processed
submodel
preset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811162220.9A
Other languages
Chinese (zh)
Inventor
艾东梅 (Ai Dongmei)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZTE Corp
Original Assignee
ZTE Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corp filed Critical ZTE Corp
Priority to CN201811162220.9A priority Critical patent/CN110968889A/en
Priority to PCT/CN2019/105390 priority patent/WO2020063349A1/en
Publication of CN110968889A publication Critical patent/CN110968889A/en

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/60: Protecting data
    • G06F 21/62: Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F 21/6218: Protecting access to a system of files or objects, e.g. local or distributed file system or database
    • G06F 21/6245: Protecting personal data, e.g. for financial or medical purposes

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the present invention provide a data protection method, device, apparatus, and computer storage medium. The method comprises: acquiring n privacy submodels, where each privacy submodel is a data set representing one privacy attribute, the privacy attributes represented by the n privacy submodels are different from each other, and n is an integer greater than 1; acquiring data to be processed and determining the privacy submodel corresponding to the data to be processed; and, when the correlation between the data to be processed and the corresponding privacy submodel is greater than or equal to a preset correlation threshold, generating early-warning information to prompt that the data to be processed needs privacy protection.

Description

Data protection method, equipment, device and computer storage medium
Technical Field
Embodiments of the present invention relate to, but are not limited to, privacy data protection technologies, and in particular to a data protection method, device, apparatus, and computer storage medium.
Background
With the rapid development of the mobile Internet, applications on mobile terminals have become important tools through which users learn about the world. Because of the openness and interoperability of the Internet, users are increasingly concerned about their personal online privacy. Although a user's privacy is sensitive information, it is exposed constantly: behavior traces such as searches, browsing, downloads, payments, location, and activity levels are collected, stored, and analyzed by websites, apps, and terminals, and then used for targeted marketing or other commercial purposes, sometimes even leading to information leakage, identity theft, or malicious attacks.
In the related art, methods for protecting user privacy data are not flexible enough and cannot determine, according to actual requirements, whether user data needs privacy protection.
Disclosure of Invention
Embodiments of the present invention provide a data protection method, device, apparatus, and computer storage medium capable of flexibly protecting and managing private data.
In order to achieve the above purpose, the technical solution of the embodiment of the present invention is realized as follows:
the embodiment of the invention provides a data protection method, which comprises the following steps:
acquiring n privacy submodels; each privacy submodel is a data set representing a privacy attribute, the privacy attributes represented by the n privacy submodels are different from each other, and n is an integer greater than 1;
acquiring data to be processed, and determining a privacy sub-model corresponding to the data to be processed;
and when the correlation between the data to be processed and the corresponding privacy submodel is greater than or equal to a preset correlation threshold value, generating early warning information to prompt that the data to be processed needs privacy protection.
An embodiment of the present invention further provides a data protection device, where the device includes a processor and a memory for storing a computer program that can be run on the processor; wherein:
the processor is configured to execute the steps of any one of the above-mentioned data protection methods when the computer program is executed.
An embodiment of the present invention further provides a data protection apparatus, the apparatus including an acquisition module and a decision module, wherein:
the acquisition module is used for acquiring n privacy submodels; each privacy submodel is a data set representing a privacy attribute, the privacy attributes represented by the n privacy submodels are different from each other, and n is an integer greater than 1;
the decision module is used for acquiring data to be processed and determining a privacy submodel corresponding to the data to be processed; and when the correlation between the data to be processed and the corresponding privacy submodel is greater than or equal to a preset correlation threshold value, generating early warning information to prompt that the data to be processed needs privacy protection.
An embodiment of the present invention further provides a computer storage medium storing a computer program that, when executed by a processor, implements the steps of any one of the data protection methods described above.
In the data protection method, device, apparatus, and computer storage medium provided by the embodiments of the present invention, n privacy submodels are first obtained, where each privacy submodel is a data set representing one privacy attribute, the privacy attributes represented by the n privacy submodels are different from each other, and n is an integer greater than 1; then the data to be processed is acquired and the privacy submodel corresponding to it is determined; finally, when the correlation between the data to be processed and the corresponding privacy submodel is greater than or equal to a preset correlation threshold, early-warning information is generated to prompt that the data to be processed needs privacy protection.
With the above technical solution, the n privacy attributes corresponding to the n privacy submodels can be flexibly set according to the user's actual requirements, so that n privacy submodels meeting those requirements are obtained; early-warning information generated according to these submodels therefore itself reflects the user's actual requirements. In other words, by flexibly and autonomously presetting the n privacy attributes, early warning on private user data gains a degree of flexibility and autonomy, and private data that should be protected can be prevented from leaking.
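For illustration only, the following Python sketch shows the overall decision flow just described. The patent does not prescribe any implementation; the function name decide and the caller-supplied correlation measure are assumptions introduced here.

    from typing import Callable, List, Optional

    def decide(data: str,
               submodels: List[dict],
               correlation: Callable[[str, dict], float],
               threshold: float) -> Optional[str]:
        """Find the privacy submodel most correlated with the data to be
        processed, then warn when that correlation reaches the preset
        correlation threshold."""
        best = max(submodels, key=lambda m: correlation(data, m))
        if correlation(data, best) >= threshold:
            return "early warning: the data to be processed needs privacy protection"
        return None  # correlation below the threshold: no privacy protection needed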
Drawings
FIG. 1 is a flow chart of a data protection method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a clustering result for training data according to an embodiment of the present invention;
FIG. 3 is a flow chart of another data protection method according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a data protection apparatus according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the hardware structure of a data protection device according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail below with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
In the related art, there are two main methods for protecting user privacy data. (1) Protection of application identification information: a program disguise trigger is created to disguise the icon and name of an application, so that even after the terminal's lock screen is released the application retains strong privacy for the user; this prevents anyone other than the terminal's owner from using the application and protects the privacy of the user data inside it. (2) Protection of user data: the user's private data is altered to generate anonymized, hidden data that protects personal privacy.
When method (1) is used to protect user privacy data, protection covers only the application's identification information, so the protection scope is very small and the effect on user privacy data is limited. When method (2) is used, the user data must be anonymized, so an association must be established between the anonymous information and the real data, and the fidelity of this mapping directly affects the usability of the private data; that is, a protection scheme that works by altering the user's private data can itself impair that data.
It can be seen that, in the related art, methods for protecting user privacy data are not flexible enough and cannot determine, according to actual requirements, whether to apply privacy protection to user data.
Embodiments of the present invention can be applied to any scenario requiring privacy protection; for example, privacy protection of user data generated while applications run on a terminal can be realized based on the technical solutions provided herein. The embodiments do not limit the types of applications running on the terminal.
Embodiments of the present invention may be applied to a terminal or other devices, which may include a processor, a memory, and the like.
Based on the above description, the following specific embodiments are proposed.
First embodiment
A first embodiment of the present invention describes a data protection method. FIG. 1 is a flowchart of the data protection method according to an embodiment of the present invention; as shown in FIG. 1, the flow may include:
step 101: acquiring n privacy submodels; each privacy submodel is a data set representing one privacy attribute, the privacy attributes represented by the n privacy submodels are different from each other, and n is an integer greater than 1.
For the implementation of this step, training data representing user data generated while an application runs may first be obtained; the training data is then clustered with the n preset privacy attributes as center objects to obtain the n privacy submodels.
In actual implementation, user raw data generated while an application runs can be obtained and preprocessed to obtain the training data. Illustratively, the training data may be obtained by performing at least one of the following operations on the user raw data: word segmentation and stop-word filtering, where stop words include punctuation, single characters, symbols, and other meaningless tokens. Note that this describes the preprocessing only by way of example; other implementations are possible, and the embodiments of the present invention are not limited in this respect.
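As a minimal sketch of such preprocessing, the following Python snippet performs crude word segmentation and stop-word filtering; the regex tokenizer and the stop-token list are assumptions standing in for whatever segmenter a real system (especially one handling Chinese text) would use.

    import re

    # Hypothetical stop-token list; the description only says that useless words
    # include punctuation, single characters, symbols, and other meaningless tokens.
    STOP_TOKENS = {"the", "a", "an", "of", "and", "to", "for"}

    def preprocess(raw_record: str) -> list:
        """Turn one user raw-data record into training tokens: crude word
        segmentation followed by stop-word filtering."""
        tokens = re.findall(r"\w+", raw_record.lower())  # regex stand-in for word segmentation
        return [t for t in tokens if len(t) > 1 and t not in STOP_TOKENS]

    # preprocess("User Alice paid $12 for a fitness app!")
    # -> ['user', 'alice', 'paid', '12', 'fitness', 'app']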
As for the user raw data described above, it may be user data generated when an Application (App) of the mobile terminal runs, and may include the various data generated as the user uses each application, such as login information, reading history, consumption records, and preference details.
In practical applications, n different privacy attributes can be preset according to the actual requirement for protecting private data; each of the n privacy attributes represents a privacy point the user has decided must be protected (i.e., a privacy point the user cares most about); for example, the n privacy attributes may include "identity", "interest", and so on. n can be regarded as a preset protection coefficient: the larger n is, the more privacy points the user has decided to protect. Furthermore, after the n privacy attributes are set, the user can change them according to actual requirements, and the training data can then be re-clustered based on the changed attributes to obtain the corresponding privacy submodels.
It can be seen that, by setting the protection coefficient n, the user can flexibly determine the privacy protection strategy for personal data; the protection coefficient n determines the scope of the privacy protection, and the user sets it according to the desired degree of protection.
In a specific implementation, the user can input the protection coefficient n and the n privacy attributes through a User Interface (UI) of the terminal, which is convenient to operate.
In the embodiments of the present invention, after the n privacy attributes and the user raw data are obtained, they can be used as input data for constructing the n privacy submodels; this input data is then processed to obtain the n privacy submodels.
To obtain the n privacy submodels, a clustering-based natural-language-processing method common in machine learning can automatically cluster this input data, iteratively updating the center object of each cluster until the final clustering result is obtained. The final clustering result contains n clusters whose privacy attributes differ from one another, and each cluster represents one privacy submodel. Note that the embodiments of the present invention do not limit the structure or learning method of the machine learning model.
That is to say, after the training data and the n preset privacy attributes are obtained, the training data is clustered multiple times by an iterative clustering method with the n preset privacy attributes as center objects to obtain the n privacy submodels; in each clustering pass after the first, the cluster center objects are updated so that the preset evaluation index of the current clustering result is higher than that of the previous clustering result.
Here, the preset evaluation index of a clustering result indicates how close the records within each cluster are to one another and how far apart the records of different clusters are; the closer the records within a cluster and the farther apart the clusters, the higher the index.
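The description leaves this evaluation index abstract. One standard index measuring exactly these two quantities (within-cluster proximity and between-cluster separation) is the mean silhouette coefficient; the sketch below, which assumes the records have already been embedded as numeric vectors, uses it purely as an illustrative stand-in.

    import numpy as np
    from sklearn.metrics import silhouette_score

    def evaluation_index(vectors: np.ndarray, labels: np.ndarray) -> float:
        """Mean silhouette coefficient: higher when records within a cluster are
        close together and records in different clusters are far apart."""
        return float(silhouette_score(vectors, labels))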
Specifically, a first clustering pass is performed on the training data to obtain the first clustering result;
let m denote the total number of iterations of the iterative clustering method. For i = 2 to m, starting from the (i-1)-th clustering result, the center objects of the (i-1)-th pass are updated, with the goal that the preset evaluation index of the i-th clustering result be higher than that of the (i-1)-th, to obtain the center objects of the i-th pass; the i-th clustering pass is then performed on the training data with these center objects to obtain the i-th clustering result.
Here, m may be a preset integer greater than 1, or may be determined by a preset iteration-termination condition; in one example, that condition may be that the center objects of the previous pass can no longer be updated so as to raise the preset evaluation index of the clustering result.
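Center-seeded iterative clustering of this kind is essentially k-means initialized at the n privacy-attribute vectors. The sketch below assumes the training records and the n privacy attributes have already been embedded as vectors (the patent fixes neither an embedding nor a clustering algorithm, so k-means is only one concrete instance).

    import numpy as np
    from sklearn.cluster import KMeans

    def build_submodels(training_vectors: np.ndarray, attribute_vectors: np.ndarray):
        """Cluster the training data with the n preset privacy attributes as the
        initial center objects; k-means' iterative center updates play the role
        of the center-object updates described above."""
        n = attribute_vectors.shape[0]  # the protection coefficient n
        km = KMeans(n_clusters=n, init=attribute_vectors, n_init=1).fit(training_vectors)
        # Each resulting cluster of records is one privacy submodel.
        submodels = [training_vectors[km.labels_ == k] for k in range(n)]
        return submodels, km.cluster_centers_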
The clustering of the training data in the embodiment of the present invention is illustrated below with reference to FIG. 2. For ease of understanding: in the i-th clustering pass (i from 1 to m), the n center objects are abstracted as n points; with these points as cores, the objects in the data set semantically closest to each center are attracted and aggregated, forming clusters centered on the n attributes. In FIG. 2, four clusters are identified by circles labeled K1, K2, K3, and Kn; each cluster gathers the various attributes related to one privacy attribute. For example, attributes related to "identity", such as "name" and "id", are aggregated in the cluster centered on "identity". If the user wants a larger scope of privacy protection, this can be achieved by adjusting the protection coefficient: the larger the coefficient, the more privacy attributes are set.
Step 102: acquiring data to be processed, and determining the privacy submodel corresponding to the data to be processed; when the correlation between the data to be processed and the corresponding privacy submodel is greater than or equal to a preset correlation threshold, determining that the data to be processed needs privacy protection.
In practical applications, the user data generated while applications run can be monitored, and whenever such user data is detected it is taken as the data to be processed. Illustratively, the data to be processed is data the terminal is about to upload or data the terminal is about to save.
After the data to be processed is obtained, the n privacy submodels are taken as center objects and a machine-learning computation determines the privacy submodel to which the data belongs (the privacy submodel corresponding to the data to be processed). For example, the correlation between the data to be processed and each of the n privacy submodels may be determined, and the privacy submodel with the maximum correlation taken as the corresponding one.
Here, as for determining the correlation between the data to be processed and each privacy submodel, in one example the semantic distance between them may be calculated and the correlation derived from it: the smaller the semantic distance between the data and a privacy submodel, the greater the privacy sensitivity of the data and the greater the correlation.
In a specific implementation, after the privacy submodel corresponding to the data to be processed is determined, the correlation between them is compared with the preset correlation threshold. When the correlation is greater than or equal to the threshold, it is determined that the data needs privacy protection, and early-warning information can be generated to prompt this; when the correlation is below the threshold, it is determined that the data does not need privacy protection, and the flow simply ends.
In the embodiments of the present invention, as for comparing the correlation with the preset correlation threshold, exemplarily, the semantic distance between the data to be processed and the corresponding privacy submodel may be calculated: when this distance is less than or equal to a preset semantic-distance threshold, the correlation is deemed greater than or equal to the preset correlation threshold; conversely, when the distance exceeds the semantic-distance threshold, the correlation is deemed below the correlation threshold.
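A minimal sketch of this decision step follows, with cosine distance standing in for the unspecified semantic distance and each submodel represented by its center vector; both choices are assumptions made for illustration.

    import numpy as np

    def semantic_distance(vec: np.ndarray, center: np.ndarray) -> float:
        """Cosine distance as a stand-in for the unspecified semantic distance."""
        cos = float(vec @ center) / (np.linalg.norm(vec) * np.linalg.norm(center))
        return 1.0 - cos

    def classify(vec: np.ndarray, centers: list, distance_threshold: float):
        """Pick the submodel with the maximum correlation (i.e., the minimum
        semantic distance), then apply the preset threshold as described above."""
        distances = [semantic_distance(vec, c) for c in centers]
        best = int(np.argmin(distances))  # maximum correlation == minimum distance
        return best, distances[best] <= distance_threshold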
Since the n privacy attributes corresponding to the n privacy submodels can be flexibly set according to actual requirements, privacy submodels that meet those requirements are obtained, and early-warning information generated according to them accords with the user's actual requirements; that is, by flexibly and autonomously presetting the n privacy attributes, the early warning of private user data gains a degree of flexibility and autonomy, and private data that should be protected can be kept from leaking. In addition, when the correlation between the data to be processed and the corresponding privacy submodel is below the preset correlation threshold, the data can simply be ignored, providing an unobstructed channel and a strong guarantee for data that does not need privacy protection.
In practical applications, the data protection method according to the first embodiment of the present invention may be implemented based on a processor of a terminal.
In the related art, on a mobile terminal, a user installing an App must agree to all authorizations the App declares, or the App cannot be installed or used; research in related fields focuses mainly on malware detection, cloud- and server-assisted approaches, and privacy protection based on information-flow control. There is no protocol or policy in the App ecosystem that lets the user decide which information is disclosed and which is hidden.
In the embodiments of the present invention, the user's raw data on the terminal is automatically refined and aggregated by machine learning according to the n preset privacy attributes, generating a privacy protection scheme that meets the actual needs of the individual user; on that basis, data generated while applications run is checked against the user's expected degree of privacy openness, and corresponding measures are taken. By setting the n privacy attributes, the data generated by applications can be filtered and screened, so private data is protected from abuse and even attack while data in general remains usable. In other words, from the user's perspective and with the aim of "letting users master their own data", a privacy-protection scheme matching the user's requirements is built automatically with machine learning, and data that may touch the user's private concerns is then decided upon and managed, so the user can share information to enjoy services while keeping privacy protected.
Second embodiment
The data protection method provided by the foregoing embodiment of the present invention is further illustrated.
FIG. 3 is a flowchart of another data protection method according to an embodiment of the present invention; as shown in FIG. 3, the flow may include:
step 301: and acquiring data to be processed and n privacy submodels.
The implementation of this step has already been described in the first embodiment, and is not described here again.
Step 302: and determining a privacy submodel corresponding to the data to be processed.
The implementation of this step has already been explained in step 102, and is not described here again.
Step 303: and judging whether the correlation between the data to be processed and the corresponding privacy submodel is greater than or equal to a preset correlation threshold, if so, executing the step 304, and if not, ending the process.
Step 304: and carrying out early warning or other processing on the data to be processed.
Specifically, when the correlation between the data to be processed and the corresponding privacy submodel is greater than or equal to the preset correlation threshold, the data can be considered highly likely to fall into a privacy category the user cares about; early-warning information can then be generated to prompt a risk of privacy information leakage, or privacy protection can be applied to the data directly. The early-warning information may, for example, be displayed in the terminal's UI or in another form. As for applying privacy protection, for example, when saving, uploading, or another operation on the data is determined to risk privacy leakage, that operation is blocked and the user is warned or prompted.
Further, when the correlation between the data to be processed and the corresponding privacy submodel is greater than or equal to the preset correlation threshold, the data can also be added to the corresponding privacy submodel, making that submodel more complete.
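The following sketch illustrates both reactions of step 304: warning (or blocking the pending operation) and folding the flagged record into its submodel. The running-mean center update is one simple, assumed way to realize the described supplementary expansion; the class and function names are hypothetical.

    import numpy as np

    class PrivacySubmodel:
        """One privacy attribute's cluster, kept as a center object and a count."""
        def __init__(self, center: np.ndarray, count: int = 1):
            self.center = center.astype(float)
            self.count = count

        def absorb(self, vec: np.ndarray) -> None:
            """Fold a newly flagged record into the submodel by a running-mean
            center update, making the submodel more complete."""
            self.count += 1
            self.center += (vec - self.center) / self.count

    def handle_sensitive(vec: np.ndarray, submodel: PrivacySubmodel,
                         block_pending_operation: bool = False) -> str:
        submodel.absorb(vec)
        if block_pending_operation:  # e.g. a save or upload judged risky
            return "operation blocked: privacy information leakage risk"
        return "early warning: privacy information leakage risk"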
Third embodiment
On the basis of the data protection methods proposed in the foregoing embodiments, a third embodiment of the present invention provides a data protection apparatus.
FIG. 4 is a schematic structural diagram of a data protection apparatus according to an embodiment of the present invention. As shown in FIG. 4, the apparatus includes an obtaining module 401 and a decision module 402, wherein:
an obtaining module 401, configured to obtain n privacy submodels; each privacy submodel is a data set representing a privacy attribute, the privacy attributes represented by the n privacy submodels are different from each other, and n is an integer greater than 1;
a decision module 402, configured to obtain data to be processed, and determine a privacy sub-model corresponding to the data to be processed; and when the correlation between the data to be processed and the corresponding privacy submodel is greater than or equal to a preset correlation threshold value, generating early warning information to prompt that the data to be processed needs privacy protection.
In an embodiment, the obtaining module 401 is specifically configured to obtain training data, where the training data is used to represent user data generated during an application runtime; and clustering the training data by taking n preset privacy attributes as central objects to obtain n privacy submodels.
In an embodiment, the obtaining module 401 is specifically configured to perform multiple clustering processing on the training data by using n preset privacy attributes as a central object and using an iterative clustering method to obtain n privacy submodels; when non-primary clustering processing is carried out, the clustered center object is updated, and the preset evaluation index of the current clustering result is higher than the preset evaluation index of the last clustering result.
In an embodiment, the preset evaluation index of the clustering result may be used to indicate: the proximity of individual records in the same cluster in the clustering result, and the distance between records of different clusters in the clustering result.
In an embodiment, the data to be processed is data to be uploaded by the terminal or data to be stored by the terminal.
In an embodiment, the decision module 402 is specifically configured to, in the n privacy submodels, use a privacy submodel with the highest correlation with the data to be processed as the privacy submodel corresponding to the data to be processed.
In an embodiment, the decision module 402 is further configured to perform privacy protection on the to-be-processed data when a correlation between the to-be-processed data and a corresponding privacy sub-model is greater than or equal to a preset correlation threshold.
In an embodiment, the decision module 402 is further configured to add the data to be processed to the corresponding privacy submodel when the correlation between the data to be processed and the corresponding privacy submodel is greater than or equal to a preset correlation threshold, so as to supplement and expand the corresponding privacy submodel.
In an embodiment, the decision module 402 is further configured to determine that the data to be processed does not need to be subjected to privacy protection when the correlation between the data to be processed and the corresponding privacy submodel is smaller than a preset correlation threshold.
In practical applications, the obtaining module 401 and the decision module 402 can be implemented by a Central Processing Unit (CPU), a Microprocessor Unit (MPU), a Digital Signal Processor (DSP), a Field-Programmable Gate Array (FPGA), or the like in the terminal.
In addition, each functional module in this embodiment may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware or a form of a software functional module.
Based on this understanding, the technical solution of this embodiment, in essence or in the part that contributes to the prior art, or in whole or in part, may be embodied as a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the method of this embodiment. The aforementioned storage medium includes any medium capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disc.
Specifically, the computer program instructions corresponding to the data protection method of this embodiment may be stored on a storage medium such as an optical disc, a hard disk, or a USB flash drive; when the computer program instructions on the storage medium are read and executed by an electronic device, the steps of any one of the data protection methods of the foregoing embodiments are implemented.
Based on the same technical concept as the foregoing embodiments, FIG. 5 shows a data protection device 50 provided by an embodiment of the present invention, which may include a memory 51, a processor 52, and a bus 53, wherein:
the bus 53 is used to connect the memory 51 and the processor 52 and to enable intercommunication between them;
the memory 51 for storing computer programs and data;
the processor 52 is configured to execute the computer program stored in the memory to implement the steps of any one of the data protection methods of the foregoing embodiments.
In practical applications, the memory 51 may be a volatile memory such as a Random Access Memory (RAM); a non-volatile memory such as a ROM, a flash memory, a Hard Disk Drive (HDD), or a Solid-State Drive (SSD); or a combination of such memories, and it provides instructions and data to the processor 52.
The processor 52 may be at least one of an Application-Specific Integrated Circuit (ASIC), a DSP, a Digital Signal Processing Device (DSPD), a Programmable Logic Device (PLD), an FPGA, a CPU, a controller, a microcontroller, or a microprocessor. It will be appreciated that the electronic device implementing the above processor functions may be another device; the embodiments of the present invention are not specifically limited in this respect.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention.

Claims (12)

1. A method for data protection, the method comprising:
acquiring n privacy submodels; each privacy submodel is a data set representing a privacy attribute, the privacy attributes represented by the n privacy submodels are different from each other, and n is an integer greater than 1;
acquiring data to be processed, and determining a privacy sub-model corresponding to the data to be processed; and when the correlation between the data to be processed and the corresponding privacy submodel is greater than or equal to a preset correlation threshold value, generating early warning information to prompt that the data to be processed needs privacy protection.
2. The method of claim 1, wherein the obtaining n privacy submodels comprises:
acquiring training data, wherein the training data is used for representing user data generated during the operation of an application;
and clustering the training data by taking n preset privacy attributes as central objects to obtain n privacy submodels.
3. The method according to claim 2, wherein the clustering the training data with n preset privacy attributes as a central object to obtain n privacy submodels comprises:
taking n preset privacy attributes as central objects, and performing clustering processing on the training data for multiple times by adopting an iterative clustering method to obtain n privacy sub-models; when non-primary clustering processing is carried out, the clustered center object is updated, and the preset evaluation index of the current clustering result is higher than the preset evaluation index of the last clustering result.
4. The method according to claim 3, wherein the preset evaluation index of the clustering result can be used to represent: the proximity of individual records in the same cluster in the clustering result, and the distance between records of different clusters in the clustering result.
5. The method according to claim 1, wherein the data to be processed is data to be uploaded by the terminal or data to be saved by the terminal.
6. The method of claim 1, wherein the determining the privacy submodel corresponding to the data to be processed comprises:
and in the n privacy submodels, taking the privacy submodel with the maximum correlation with the data to be processed as the privacy submodel corresponding to the data to be processed.
7. The method of claim 1, further comprising: and when the correlation between the data to be processed and the corresponding privacy submodel is greater than or equal to a preset correlation threshold value, carrying out privacy protection on the data to be processed.
8. The method of claim 1, further comprising: when the correlation between the data to be processed and the corresponding privacy submodel is greater than or equal to a preset correlation threshold, adding the data to be processed to the corresponding privacy submodel so as to supplement and expand the corresponding privacy submodel.
9. The method according to any one of claims 1 to 8, further comprising:
and when the correlation between the data to be processed and the corresponding privacy submodel is smaller than a preset correlation threshold value, determining that the data to be processed does not need privacy protection.
10. A data protection device, characterized in that the device comprises a processor and a memory for storing a computer program capable of running on the processor; wherein:
the processor is adapted to perform the steps of the method of any one of claims 1 to 9 when running the computer program.
11. A data protection apparatus, the apparatus comprising: an acquisition module and a decision module, wherein:
the acquisition module is used for acquiring n privacy submodels; each privacy submodel is a data set representing a privacy attribute, the privacy attributes represented by the n privacy submodels are different from each other, and n is an integer greater than 1;
the decision module is used for acquiring data to be processed and determining a privacy submodel corresponding to the data to be processed; and when the correlation between the data to be processed and the corresponding privacy submodel is greater than or equal to a preset correlation threshold value, generating early warning information to prompt that the data to be processed needs privacy protection.
12. A computer storage medium on which a computer program is stored, characterized in that the computer program realizes the steps of the method of any one of claims 1 to 9 when executed by a processor.
CN201811162220.9A 2018-09-30 2018-09-30 Data protection method, equipment, device and computer storage medium Pending CN110968889A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201811162220.9A CN110968889A (en) 2018-09-30 2018-09-30 Data protection method, equipment, device and computer storage medium
PCT/CN2019/105390 WO2020063349A1 (en) 2018-09-30 2019-09-11 Data protection method and device, apparatus, computer storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811162220.9A CN110968889A (en) 2018-09-30 2018-09-30 Data protection method, equipment, device and computer storage medium

Publications (1)

Publication Number Publication Date
CN110968889A 2020-04-07

Family

ID=69951173

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811162220.9A Pending CN110968889A (en) 2018-09-30 2018-09-30 Data protection method, equipment, device and computer storage medium

Country Status (2)

Country Link
CN (1) CN110968889A (en)
WO (1) WO2020063349A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112818390A (en) * 2021-01-26 2021-05-18 支付宝(杭州)信息技术有限公司 Data information publishing method, device and equipment based on privacy protection
CN113742781A (en) * 2021-09-24 2021-12-03 湖北工业大学 K anonymous clustering privacy protection method, system, computer equipment and terminal

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102231277A (en) * 2011-06-29 2011-11-02 电子科技大学 Method for protecting mobile terminal privacy based on voiceprint recognition
WO2016206041A1 (en) * 2015-06-25 2016-12-29 宇龙计算机通信科技(深圳)有限公司 Terminal data protection method and apparatus
CN106709588A (en) * 2015-11-13 2017-05-24 日本电气株式会社 Prediction model construction method and equipment and real-time prediction method and equipment
CN107358111A (en) * 2017-08-28 2017-11-17 维沃移动通信有限公司 A kind of method for secret protection and mobile terminal
US20170372226A1 (en) * 2016-06-22 2017-12-28 Microsoft Technology Licensing, Llc Privacy-preserving machine learning
CN107819945A (en) * 2017-10-30 2018-03-20 同济大学 The handheld device navigation patterns authentication method and system of comprehensive many factors

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104424444B (en) * 2013-09-06 2019-04-23 中兴通讯股份有限公司 Hidden folder processing method, device and terminal
CN104200170B (en) * 2014-04-15 2018-06-19 中兴通讯股份有限公司 The method for secret protection and electronic equipment of a kind of electronic equipment
CN106599709B (en) * 2015-10-15 2021-08-17 中兴通讯股份有限公司 Method, device and terminal for preventing privacy information leakage
CN109716345B (en) * 2016-04-29 2023-09-15 普威达有限公司 Computer-implemented privacy engineering system and method
CN107563204B (en) * 2017-08-24 2020-12-01 西安电子科技大学 Privacy disclosure risk assessment method for anonymous data

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102231277A (en) * 2011-06-29 2011-11-02 电子科技大学 Method for protecting mobile terminal privacy based on voiceprint recognition
WO2016206041A1 (en) * 2015-06-25 2016-12-29 宇龙计算机通信科技(深圳)有限公司 Terminal data protection method and apparatus
CN106709588A (en) * 2015-11-13 2017-05-24 日本电气株式会社 Prediction model construction method and equipment and real-time prediction method and equipment
US20170372226A1 (en) * 2016-06-22 2017-12-28 Microsoft Technology Licensing, Llc Privacy-preserving machine learning
CN107358111A (en) * 2017-08-28 2017-11-17 维沃移动通信有限公司 A kind of method for secret protection and mobile terminal
CN107819945A (en) * 2017-10-30 2018-03-20 同济大学 The handheld device navigation patterns authentication method and system of comprehensive many factors

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112818390A (en) * 2021-01-26 2021-05-18 支付宝(杭州)信息技术有限公司 Data information publishing method, device and equipment based on privacy protection
CN113742781A (en) * 2021-09-24 2021-12-03 湖北工业大学 K anonymous clustering privacy protection method, system, computer equipment and terminal
CN113742781B (en) * 2021-09-24 2024-04-05 湖北工业大学 K anonymous clustering privacy protection method, system, computer equipment and terminal

Also Published As

Publication number Publication date
WO2020063349A1 (en) 2020-04-02

Similar Documents

Publication Publication Date Title
EP3622402B1 (en) Real time detection of cyber threats using behavioral analytics
CN110958220B (en) Network space security threat detection method and system based on heterogeneous graph embedding
CN106682495B (en) Safety protection method and safety protection device
Zulkifli et al. Android malware detection based on network traffic using decision tree algorithm
US8434126B1 (en) Methods and systems for aiding parental control policy decisions
Mehtab et al. AdDroid: rule-based machine learning framework for android malware analysis
US8640190B1 (en) Parental control policy generation
CN111382430A (en) System and method for classifying objects of a computer system
WO2017032261A1 (en) Identity authentication method, device and apparatus
JP2017508210A (en) Application execution control using ensemble machine learning for identification
CN103886068A (en) Data processing method and device for Internet user behavior analysis
CN111143178B (en) User behavior analysis method, device and equipment
CN110348238B (en) Privacy protection grading method and device for application
CN104836781A (en) Method distinguishing identities of access users, and device
CN110968889A (en) Data protection method, equipment, device and computer storage medium
CN106301979B (en) Method and system for detecting abnormal channel
CN110543756B (en) Device identification method and device, storage medium and electronic device
TW201822054A (en) Network attack pattern determination apparatus, determination method, and computer program product thereof
CN112800419A (en) Method, apparatus, medium and device for identifying IP group
WO2019019711A1 (en) Method and apparatus for publishing behaviour pattern data, terminal device and medium
Malik Android system call analysis for malicious application detection
CN115065558A (en) Attack flow tracing method and device for APT attack
JP7075362B2 (en) Judgment device, judgment method and judgment program
US9412066B1 (en) Systems and methods for predicting optimum run times for software samples
CN111104963A (en) Target user determination method and device, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination