CN113709090A - System and method for determining group privacy disclosure risk - Google Patents

System and method for determining group privacy disclosure risk

Info

Publication number
CN113709090A
CN113709090A (application CN202011103737.8A)
Authority
CN
China
Prior art keywords
privacy
determining
group
data set
disclosure
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011103737.8A
Other languages
Chinese (zh)
Other versions
CN113709090B (en)
Inventor
张颖
张继东
袁海
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianyi Digital Life Technology Co Ltd
Original Assignee
Tianyi Smart Family Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianyi Smart Family Technology Co Ltd filed Critical Tianyi Smart Family Technology Co Ltd
Priority to CN202011103737.8A priority Critical patent/CN113709090B/en
Publication of CN113709090A publication Critical patent/CN113709090A/en
Application granted granted Critical
Publication of CN113709090B publication Critical patent/CN113709090B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/14Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00Data switching networks
    • H04L12/02Details
    • H04L12/16Arrangements for providing special services to substations
    • H04L12/18Arrangements for providing special services to substations for broadcast or conference, e.g. multicast
    • H04L12/185Arrangements for providing special services to substations for broadcast or conference, e.g. multicast with management of multicast group membership
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/14Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425Traffic logging, e.g. anomaly detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Systems and methods for determining group privacy disclosure risk are disclosed. The system comprises: a privacy data set preprocessing module for preprocessing the privacy data set; a privacy database for storing the preprocessed privacy data set; a group information preprocessing module for determining the openness of the privacy data sets based on the preprocessed privacy data sets and forming an openness matrix; a privacy disclosure degree calculation module for determining the degree of association of the privacy data sets, forming an association matrix, determining the triplet information of each type of privacy data set, and determining the privacy disclosure degree based on the association matrix and the triplet information; and a privacy disclosure risk determination module for determining the privacy disclosure risk based on the privacy disclosure degree. A corresponding method is also disclosed. The system and method take into account the associations among privacy data and the individualization of users, evaluate group privacy disclosure risk more accurately and objectively, and have strong extensibility and generalizability.

Description

System and method for determining group privacy disclosure risk
Technical Field
The present application relates to the field of information security technology, and more particularly, to a system and method for determining group privacy disclosure risk.
Background
With the rapid development of emerging technologies such as cloud computing, the Internet of Things and the Internet, data volumes have also grown explosively. During data collection, mining, analysis and application, if personal information is contained in the data records, the privacy of the person is easily revealed, which can have undesirable consequences for both individuals and society. Therefore, how to protect personal privacy has received more and more attention. Current research on privacy data protection focuses mainly on desensitizing data through a series of desensitization algorithms, or on reducing the possibility of privacy disclosure by means of privacy-preserving data publishing methods, such as the common Private Aggregation of Teacher Ensembles (PATE) or Differential Privacy (DP) protection methods, while relatively little model or algorithm research has been conducted on determining privacy disclosure risks.
Patent document CN 105871891 A discloses "a DNS privacy disclosure risk assessment method and system", in which quantitative evaluation of DNS privacy disclosure risk can be achieved. The method calculates the total DNS privacy disclosure risk value of a user or user group from the privacy disclosure degree, the privacy infringement difficulty and the privacy disclosure risk value of each privacy disclosure risk. However, its data source mainly consists of monitoring links in the server and collecting accessed domain names, the privacy disclosure risk of the data is judged from the interaction frequency between each application and the server, and the construction and evaluation indexes of the privacy data set are incomplete. For example, when the privacy data is comprehensively evaluated, the degree of openness of a user to a certain type of specific sensitive data and the associations among the data are not considered, so the evaluation method is not suitable for family scenarios and similar scenarios in which the data generated by some applications are strongly correlated due to the kinship or similar relationships among the members.
Patent document CN 110222058 A discloses a multi-source data association privacy disclosure risk assessment system based on FP-growth, in which vulnerability analysis is performed on different data sources, a privacy disclosure risk value RISK of a single data source is calculated to construct a privacy disclosure risk assessment index, a mapping assessment set is constructed by combining an asset influence coefficient C, a threat frequency T and a vulnerability severity V, a privacy disclosure risk factor entropy weight coefficient is calculated through multiple matrix transformations and a Markov chain, and a privacy disclosure risk assessment model is finally obtained. However, that patent only considers privacy disclosure risk from the perspective of data assets, its model ignores the association factors between people, and the algorithm's calculation process is relatively complex and unsuitable for privacy disclosure risk assessment on a per-group basis; with large data volumes the calculation cost is high, which hinders adoption.
Most current privacy disclosure risk assessment methods or systems target a single individual. In group applications, however, such as groups organized by household, the relationships among group members are correlated and often close, so current assessment methods that take the individual as the main research object have at least two problems in dimensions such as data generation, data use and privacy openness. First, because of the relationships among the members of a group, the data generated during application use are strongly associated, and individual-based privacy evaluation often ignores the influence of data association on privacy disclosure. Even where the concept of a user group has been introduced into some privacy evaluation models, the evaluation dimension is still a single group's data, and the associations between the data are ignored. The associations between the data act as enhanced background knowledge, so that the leakage of one member's privacy data may, to some extent, also leak the privacy of other group members. Second, privacy openness is not a strictly quantitative relationship. Owing to individual differences within a group, different individuals are open about their personal privacy to different degrees; from the perspective of personalized sensitive data, different individuals define the same kind of privacy differently and have different personalized privacy requirements when privacy data are published and processed. When the group is taken as the unit of privacy evaluation, these differences in privacy openness must therefore be taken into account.
Therefore, after studying the prior-art techniques for evaluating privacy disclosure risk, the inventors found that most technical solutions have at least the following two problems.
First, the relevance between data is not considered when evaluating the privacy disclosure risk, but in practical application, there often exists a potential invisible relevance between data. For example, some fields in a certain data record are associated with other data in some way, so that more privacy information can be deduced therefrom to cause privacy disclosure.
Second, the differential needs of different groups or individuals for privacy protection are not taken into account when assessing privacy leakage risks. In fact, different people have quite different requirements on privacy protection, and the definition standards of privacy disclosure are different. Therefore, when the privacy disclosure risk assessment is performed in groups (especially, user groups in households), the methods in the prior art cannot completely adapt to the requirements.
Therefore, there is a need in the art for a method and system for comprehensively, objectively and quantifiably determining the risk of privacy disclosure, so that the risk of privacy disclosure can be comprehensively evaluated in groups, particularly in families.
Disclosure of Invention
The following presents a simplified summary of one or more aspects in order to provide a basic understanding of such aspects. This summary is not an extensive overview of all contemplated aspects, and is intended to neither identify key or critical elements of all aspects nor delineate the scope of any or all aspects. Its sole purpose is to present some concepts of one or more aspects in a simplified form as a prelude to the more detailed description that is presented later.
The aim of the application is to provide a method and a system for determining group privacy disclosure risk. Taking the group as the unit, the method classifies the data involved in group applications, proposes a characterized privacy data set, establishes a privacy openness matrix through an algorithm, calculates a privacy disclosure risk coefficient in combination with personal privacy openness factors, and finally obtains a comprehensive determination of the group's privacy data disclosure risk. The result can serve as a reference basis for providing targeted and personalized privacy protection measures for millions of group members. Meanwhile, the method can also be extended to any organization or application scenario that needs to determine privacy disclosure risk on a per-group basis.
According to a first aspect of the present application, there is provided a system for determining a risk of group privacy disclosure, the system comprising: the system comprises a privacy data set preprocessing module, a privacy database, a group information preprocessing module, a privacy disclosure degree calculating module and a privacy disclosure risk determining module.
The privacy data set preprocessing module is used for preprocessing the privacy data set, which comprises: identifying the privacy data related to the various applications and services in the group scenario, i.e. defining the privacy data set, and generating a feature vector of the privacy data set after cleaning, formatting, filtering out useless data, normalizing repeated data, and standardizing the data.
The privacy database is used to store data related to privacy disclosure risks, including but not limited to: personal information of family members (such as age, identity card number, occupation, hobbies and interests, work unit and the like), APP access information and log information, internet access characteristic information and traffic information, basic information of the household smart devices, usage log information of the household smart devices, and other information. Each privacy data record consists of the following tuple:
{ [privacy tag metadata MetaDi]; [privacy tag metadata description MetaDSpec-i, which can be expressed as a regular expression or in Backus-Naur Form]; (keyword list (keyword 1, keyword 2, keyword 3, ..., keyword n; this element is optional)); feature value }.
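Purely as an illustration (not part of the patent), the following Python sketch shows one way such a record tuple could be represented in code; the field names, the example regular expression and the example values are assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional

# Minimal sketch of one privacy data record as described above.
# Field names and example values are illustrative assumptions, not the patent's own schema.
@dataclass
class PrivacyRecord:
    meta_d: str                             # privacy tag metadata MetaDi, e.g. "id_card"
    meta_d_spec: str                        # MetaDSpec-i: a regex or BNF description of the tag
    keywords: Optional[List[str]] = None    # optional keyword list (keyword 1..n)
    feature_value: float = 0.0              # feature value attached during preprocessing

record = PrivacyRecord(
    meta_d="id_card",
    meta_d_spec=r"\d{17}[\dXx]",            # assumed example pattern for an 18-character ID number
    keywords=["identity", "ID"],
    feature_value=2.0,
)
```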
The group information preprocessing module is used for determining the openness degree of each group member to the data and forming a privacy open matrix between the group members and the privacy data set according to the individual requirements of different members in the group to privacy protection and by combining the output of the privacy data set preprocessing module.
The privacy disclosure degree calculation module is used for determining the association degree of the privacy data sets and forming an association matrix R, for each type of privacy data set, determining the three-tuple information { disclosure severity Si, disclosure difficulty Bi, openness Fpi of the group }, and determining the privacy disclosure degree of each type of privacy data set based on the association matrix and the three-tuple information.
The privacy disclosure risk determination module is used for determining a privacy disclosure risk quantitative value of the group based on the privacy disclosure degree to determine the privacy disclosure risk of the group.
According to a second aspect of the present application, there is provided a method for determining a risk of privacy leakage for a group, the method comprising: pre-processing a set of privacy data about a group; determining the degree of association of the privacy data sets and forming an association matrix; the method comprises the steps of determining the openness ai of a group to each type of privacy data set based on the preprocessed privacy data sets, and forming a corresponding privacy open matrix; for each type of private data set, determining three-tuple information { leakage severity Si, leakage difficulty Bi, openness Fpi of the group }; determining the privacy disclosure degree of each type of privacy data set based on the incidence matrix and the triplet information; and determining a privacy disclosure risk quantification value of the group based on the privacy disclosure degree to determine the privacy disclosure risk of the group.
To the accomplishment of the foregoing and related ends, the one or more aspects comprise the features hereinafter fully described and particularly pointed out in the claims. The following description and the annexed drawings set forth in detail certain illustrative features of the one or more aspects. These features are indicative, however, of but a few of the various ways in which the principles of various aspects may be employed and the present description is intended to include all such aspects and their equivalents.
Drawings
So that the manner in which the above recited features of the present application can be understood in detail, a more particular description of the disclosure briefly summarized above may be had by reference to aspects, some of which are illustrated in the appended drawings. It is to be noted, however, that the appended drawings illustrate only certain typical aspects of this application and are therefore not to be considered limiting of its scope, for the description may admit to other equally effective aspects.
In the drawings:
fig. 1 is a schematic block diagram illustrating a system for determining group privacy disclosure risk according to an embodiment of the present application; and
fig. 2 is a flow diagram illustrating a method 200 for determining group privacy disclosure risk according to an embodiment of the present application.
Detailed Description
The detailed description set forth below in connection with the appended drawings is intended as a description of various configurations and is not intended to represent the only configurations in which the concepts described herein may be practiced. The detailed description includes specific details to provide a thorough understanding of the various concepts. It will be apparent, however, to one skilled in the art that these concepts may be practiced without these specific details. In some instances, well known components are shown in block diagram form in order to avoid obscuring such concepts.
It is to be understood that other embodiments will be evident based on the present disclosure, and that system, structural, process, or mechanical changes may be made without departing from the scope of the present disclosure.
As shown in fig. 1, a system for determining a risk of privacy disclosure of a group is illustrated. The system comprises: the privacy data set preprocessing module 10, the privacy database 20, the group information preprocessing module 30, the privacy disclosure degree calculating module 40, and the privacy disclosure risk determining module 50.
The privacy data set preprocessing module 10 is configured to preprocess the privacy data set, which comprises: identifying the privacy data related to the various applications and services in the group scenario, i.e. defining the privacy data set, and cleaning, formatting, filtering out useless data, normalizing repeated data, and standardizing the privacy data set to generate its feature vector.
The privacy database 20 stores the privacy data, which include but are not limited to: personal information of the members (such as age, identity card number, occupation, hobbies and interests, work unit and the like), access information and log information of various applications, internet access characteristic information and traffic information, basic information of the smart devices, usage log information of the smart devices, and other information.
Each privacy data record consists of the following tuple:
{ [privacy tag metadata MetaDi]; [privacy tag metadata description MetaDSpec-i, which can be expressed as a regular expression or in Backus-Naur Form]; (keyword list (keyword 1, keyword 2, keyword 3, ..., keyword n; this element is optional)); feature value }.
The group information preprocessing module 30 is configured to determine, according to personalized requirements of different members in the group for privacy protection, the openness degree of each member for the data in the group and form a privacy open matrix between the members and the privacy data, in combination with the output of the privacy data preprocessing module.
The privacy disclosure degree calculation module 40 is configured to determine the degree of association of the privacy data sets and form a correlation matrix R, determine, for each type of privacy data set, triplet information { disclosure severity Si, disclosure difficulty Bi, openness degree Fpi of the group }, and determine the degree of privacy disclosure of each type of privacy data set based on the correlation matrix and the triplet information.
The privacy disclosure risk determination module 50 is configured to determine the privacy disclosure risk of the members based on the feature vector values of the privacy data in combination with the privacy disclosure risk vector values of the privacy data, and to output the result.
The detailed implementation process and steps of the method 200 for determining a risk of group privacy disclosure according to the present application are described in detail below with reference to the flowchart in fig. 2.
Step one, S201: preprocessing private data sets for a group
This step includes the following substeps.
Business data, log data and the like that may be involved in the various applications and services of the group are first sorted through; after cleaning, parsing and collation, they are segmented with a tokenizer to obtain specific data sets. For each item of each data set, the privacy database is searched with the data set's keywords to perform feature matching; if the match succeeds, the data is privacy data and the corresponding feature value is added to its tuple, otherwise the tuple is discarded.
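As an illustration of the keyword matching just described, the following hedged Python sketch assumes a simple dictionary-backed privacy database and a trivial comma tokenizer; both are stand-ins for illustration only, not the patent's actual components.

```python
from typing import Dict, List, Optional, Tuple

# Assumed toy privacy database: maps a keyword to the feature value stored for it.
privacy_db: Dict[str, float] = {"search keyword": 2.0, "identity card": 2.0, "visit website": 0.5}

def tokenize(text: str) -> List[str]:
    # Stand-in for the word-segmentation step (a real tokenizer would be used in practice).
    return [t.strip() for t in text.split(",") if t.strip()]

def match_record(raw_record: str) -> Optional[List[Tuple[str, float]]]:
    """Return (keyword, feature value) pairs when the record matches the privacy database,
    otherwise None, meaning the tuple is discarded."""
    matches = [(tok, privacy_db[tok]) for tok in tokenize(raw_record) if tok in privacy_db]
    return matches or None

print(match_record("app open time, search keyword, user"))   # [('search keyword', 2.0)]
print(match_record("weather, news"))                          # None -> discarded
```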
X = {i1, i2, ..., im} is defined as a set of m different privacy data items ij, called a privacy data set, where ij is the feature vector corresponding to the privacy data item.
Assuming that P privacy data sets are finally obtained after sorting all applications and services, they can be expressed as {X1, X2, ..., Xp}. When there is an association between two privacy data sets Xi and Xj, the association condition given by the original formula is satisfied (the formula appears only as an image in the source and is not reproduced here).
When the numbers of privacy data items in the sets differ, take t = Max{ m | m is the number of data items of X1, X2, ..., Xp }. The P privacy data sets can then be represented by the p×t matrix Pr = (i_jk), the j-th row of which holds the feature values of privacy data set Xj.
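A minimal sketch of assembling Pr is shown below, using the feature vectors from the worked example later in the description; the zero padding for shorter sets is an assumption, since the patent only defines t.

```python
import numpy as np

# Assembling the matrix Pr from the per-set feature vectors. The vectors are those of
# the worked example below; padding shorter sets with zeros up to length t is an
# assumption, since the patent only defines t as the maximum number of items.
feature_vectors = [
    [1, 4, 3, 2],        # X1: APP usage
    [0.5, 0.5, 3, 2],    # X2: internet access
    [2, 4, 1, 0],        # X3: device information
    [2, 3, 4, 1],        # X4: personal information
]

t = max(len(v) for v in feature_vectors)    # t = Max{ m | m = number of items of Xi }
Pr = np.array([v + [0] * (t - len(v)) for v in feature_vectors], dtype=float)
print(Pr.shape)                             # (p, t) = (4, 4)
```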
Step two S202: determining the degree of association of the private data sets and generating an association matrix R
The relevance of the privacy data items is calculated from the matrix Pr to form the data relevance coefficient matrix R, as shown in formula I.
Formula I: R = (r_ji)_{t×t},
where each coefficient satisfies r_ij = r_ji and is calculated by formula (2), which appears only as an image in the source and is not reproduced here.
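Since the source shows the r_ij formula only as an image, the sketch below uses a Pearson-style correlation between the rows of Pr purely as a stand-in; it preserves the stated symmetry r_ij = r_ji but is not the patent's actual formula.

```python
import numpy as np

# Stand-in for the association matrix R. The patent's exact r_ij formula is shown only
# as an image, so a Pearson-style correlation between the feature rows of Pr is assumed
# here; it is symmetric with ones on the diagonal, but it is illustrative only.
def association_matrix(Pr: np.ndarray) -> np.ndarray:
    return np.corrcoef(Pr)   # correlation between the feature rows of Pr

Pr = np.array([
    [1.0, 4.0, 3.0, 2.0],
    [0.5, 0.5, 3.0, 2.0],
    [2.0, 4.0, 1.0, 0.0],
    [2.0, 3.0, 4.0, 1.0],
])
R = association_matrix(Pr)
print(np.allclose(R, R.T))   # True: r_ij = r_ji
```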
Step three, S203: determining the openness ai of the group u to a certain type of private data set and forming an open matrix
Due to individual differences among the members of the group, different individuals have different degrees of openness to their privacy data. The openness of each group member to each type of privacy data is divided into grades A, B, C, D, E, F, ..., which may be represented by different numbers {n1, n2, n3, ..., nn}, where each nk < 1 (k = 1, 2, 3, ...) and ni < nj when i < j.
Assuming that there are m members in group u, and that the openness of each member to the data items in the data sets {X1, X2, ..., Xp} is represented by a feature vector ui = (a1, a2, a3, ..., ap), the relationship between the group members' privacy openness and the p data sets can be represented as the m×p matrix U = (a_ik), whose i-th row is the openness vector ui of member i.
The minimum allowable openness of each data set Xi within the group is denoted Fp; then
Fp = ( min_k{uk(a1)}, min_k{uk(a2)}, ..., min_k{uk(ap)} ), where k = 1, ..., m and m is the number of group members.
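The sketch below illustrates the openness matrix and the column-wise minimum Fp; the individual grade values in U are invented for illustration, with only their column minima chosen to match the Fpi values of the worked example.

```python
import numpy as np

# Openness matrix U: one row per group member, one column per privacy data set.
# The individual grade values are invented for illustration; only the column minima
# are chosen to match the Fpi values {1/6, 1/4, 1/5, 1/8} of the worked example.
U = np.array([
    [1 / 6, 1 / 4, 1 / 5, 1 / 8],   # member 1: openness a1..a4 to data sets X1..X4
    [1 / 3, 1 / 4, 2 / 5, 1 / 4],   # member 2
    [1 / 2, 1 / 2, 1 / 5, 1 / 6],   # member 3
    [1 / 4, 1 / 3, 2 / 5, 1 / 5],   # member 4
])

Fp = U.min(axis=0)   # Fp = (min_k uk(a1), ..., min_k uk(ap)): the group's minimum openness per set
print(Fp)            # [0.16666667 0.25 0.2 0.125]
```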
Step four S204: determining three-tuple information { leakage severity Si, leakage difficulty degree Bi, and openness degree min { ai } of group u } of each type of data item of the privacy data set
Each type of privacy data set is evaluated in three dimensions: leakage severity Si, leakage difficulty Bi and data openness Fpi, forming a privacy data set triplet (leakage severity Si, leakage difficulty Bi, data openness Fpi). The higher the Si value, the greater the loss to the group members once the data are leaked and the more serious the consequences; Bi ≥ 1, and the higher the Bi value, the more difficult the data are to leak; Fpi ≤ 1, and the smaller the value, the less open the group members are about the data.
Step five S205: determining a degree of privacy disclosure for each type of privacy data set
For each privacy data set, a privacy disclosure risk coefficient θ is defined as follows:
Formula II: θ = (Si × Fpi)/Bi.
A privacy leakage vector T = [θ1, θ2, ..., θp] is created for the privacy data sets {X1, X2, ..., Xp}.
A privacy disclosure risk vector R_VALUE is then determined for the data sets {X1, X2, ..., Xp}, where R_VALUE is expressed as {Risk1, Risk2, ..., RiskP}:
Formula III: R_VALUE = R * T,
where R is the association matrix and T is the privacy leakage vector.
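Formulas II and III can be illustrated with the following sketch, using the triplet values from the worked example; the identity matrix stands in for the association matrix R, which in practice comes from step two.

```python
import numpy as np

# Illustration of formulas II and III. The S, B and Fp values are those of the worked
# example in the description; R is a placeholder identity matrix here (the real
# association matrix comes from step two, S202).
S = np.array([5, 3, 2, 8], dtype=float)        # leakage severity Si per data set
B = np.array([2, 6, 6, 6], dtype=float)        # leakage difficulty Bi per data set
Fp = np.array([1 / 6, 1 / 4, 1 / 5, 1 / 8])    # group openness Fpi per data set

T = (S * Fp) / B                               # formula II: theta_i = (Si * Fpi) / Bi
print(T)                                       # [0.4166... 0.125 0.0666... 0.1666...] = {5/12, 1/8, 1/15, 1/6}

R = np.eye(4)                                  # placeholder for the association matrix R
R_value = R @ T                                # formula III: R_VALUE = R * T
print(R_value)
```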
Step six S206: determining privacy exposure risk for a group
In combination with the frequency with which data are generated or collected in the actual application or service, let the number of occurrences of a privacy data item i_mn within a given time period be r_mn. The frequency of occurrence FRE_mn of each data item is then determined as
Formula IV: FRE_mn = r_mn / D,
where D is the length of the time period in days.
The privacy weight of data set Xk is defined as Weight_k = max{FRE_mk}.
A quantified value of the group's privacy disclosure risk is then determined according to formula V, so as to determine the group's privacy disclosure risk.
Formula V: [shown only as an image in the source; it weights the privacy disclosure risk vector R_VALUE by the weights Weight_k]
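Formula V appears only as an image in the source, so the exact combination is not recoverable; the sketch below assumes a simple weighted sum of the per-set risks as one plausible reading consistent with claim 5, and it will not necessarily reproduce the 4.49 value reported in the worked example.

```python
import numpy as np

# Formula IV and an assumed reading of formula V. The occurrence counts and the 30-day
# window are those of the worked example; the weighted-sum combination is an assumption
# (the patent's exact formula V may differ, so the printed value need not equal the
# 4.49 reported in the description).
counts = np.array([210, 120, 30, 10], dtype=float)   # max occurrences per data set in the period
days = 30
FRE = counts / days                                   # formula IV: per-day frequency of occurrence
weights = FRE                                         # Weight_k = max{FRE_mk}; one count per set here
R_value = np.array([0.337, 0.456, 0.036, -0.817])     # risk vector from formula III (worked example)

group_risk = float(weights @ R_value)                 # assumed weighted-sum reading of formula V
print(round(group_risk, 2))
```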
the present invention is described in further detail below with reference to the attached drawings. Taking a common family with 4 households as an example, assuming that the daily family service scene is relatively simple, the following record sets in three aspects are obtained by sorting data and logs generated in various applications of the family:
Set one: { APP usage: { APP open time, APP action, user, search keyword }, {1, 4, 3, 2} }
Set two: { internet access: { visit time, visited website, dwell time, keyword topic }, {1/2, 1/2, 3, 2} }
Set three: { device information: { device name, device action, time }, {2, 4, 1, 0} }
Set four: { personal information: { identity card, name, gender, age }, {2, 3, 4, 1} }
Note: the calculation of the feature values in the privacy database is outside the scope of this technical solution.
The private data set Pr can thus be derived:
Figure BDA0002726263490000101
step two: the degree of correlation of the private data sets is determined and a correlation matrix R is generated. The correlation coefficients r_ij are calculated by formula I; the resulting 4×4 matrix appears only as an image in the source and is not reproduced here.
step three: evaluating the openness ai of a family user group u to a certain type of private data set
Assuming that there are 4 members in the family, the openness of each member to the data sets is represented by a matrix, which appears only as an image in the source and is not reproduced here.
step four: for data sets one through four, the triplet information { leakage severity Si, leakage difficulty Bi, openness min{ai} of user group u } is confirmed respectively as follows:
Data set one: {Si = 5, Bi = 2, Fpi = 1/6}
Data set two: {Si = 3, Bi = 6, Fpi = 1/4}
Data set three: {Si = 2, Bi = 6, Fpi = 1/5}
Data set four: {Si = 8, Bi = 6, Fpi = 1/8}
Step five: calculating the privacy disclosure degree of each type of privacy data set, wherein the privacy disclosure risk coefficient formula of each data set is theta (Si) Fpi)/Bi, and the calculation result is as follows
Data set one X1, whose privacy leakage risk factor is calculated as: 5/12
Data set two X2, whose privacy leakage risk factor is calculated as: 3/24
Data set three X3, whose privacy leakage risk factor is calculated as: 1/15
Data set three X4, whose privacy leakage risk factor is calculated as: 1/6
Privacy leakage vectors are created for the data sets { X1, X2, X3, X4}
T[θ1,θ1,θ2,...θp]=5/12,1/8,1/15,1/6};
Defining the privacy leakage risk value vector of the data set { X1, X2, X3} as RVALUE, calculating the result according to formula III as follows:
0.337
0.456
0.036
-0.817
step six: and calculating the quantitative value of the risk of the privacy disclosure of the whole family. Assuming that the maximum number of times the data set { X1, X2, X3, X4} takes within 30 days is {210,120,30,10}, the frequency Weightk of the data set { X1, X2, X3, X4} is {7,4,1,0.33} respectively according to the formula four
And calculating the privacy evaluation risk of the family data according to the formula five, wherein the calculation result is 4.49.
As described above, compared with the prior art, the system and the method for determining the group privacy disclosure risk provided by the present application have the following advantages:
1. It is first proposed to determine privacy disclosure risk in units of a group (such as a household) and to consider the associations among the various privacy data. For privacy determination centered on group members, the invention first proposes determining privacy disclosure risk with the group as the unit; meanwhile, to reduce privacy disclosure caused by the associations among privacy data, an association coefficient matrix among the privacy data items is determined and a corresponding determination method is provided, so that the associations among privacy data are fully considered in privacy protection.
2. In combination with the actual application scenario, the openness of the privacy data is graded, and the degree of openness to the relevant privacy data items is determined for each group member to form an individual privacy openness vector. For multi-user scenarios such as groups, a method for determining the openness matrix of the group privacy data sets is provided in combination with the associations among the group members, so that the personalized requirements of privacy protection can be met and the characteristics of the group can be taken into account in the privacy determination process.
3. The application also provides a triplet representation (disclosure severity Si, disclosure difficulty Bi, data openness Fpi) of a privacy data set: a privacy data set is evaluated according to the severity of disclosure, the difficulty of disclosure and the degree of data openness, and the privacy disclosure risk coefficient is determined by formula, so that the disclosure risk of the privacy data is better quantified.
4. The group privacy disclosure risk determination method of the application has strong extensibility and generalizability: it can be extended to other scenarios requiring group privacy determination, places no limit on the number of users in the group, and can easily be extended to user groups sharing a certain similar characteristic. With the development of the Internet of Things and 5G technology, more and more IoT devices will be connected to the network and more and more organizations will be concerned about the hidden danger of privacy disclosure, so the method has a very broad application prospect.
5. The privacy risk determination system and method of the present application are very versatile, have a long technology application life cycle, and can be embedded in any application or product that requires the use of the method.
By adopting the system and the method, the privacy personalized requirements among different members of the group are considered, the characteristics of the privacy data set and the correlation degree among the data sets are also considered, and the risk of group privacy disclosure is comprehensively determined, so that the privacy disclosure taking the group as a unit is more accurately evaluated.
It is to be understood that the specific order or hierarchy of steps in the methods disclosed is an illustration of exemplary processes. Based upon design preferences, it is understood that the specific order or hierarchy of steps in the methods or methodologies described herein may be rearranged. The accompanying method claims present elements of the various steps in a sample order, and are not meant to be limited to the specific order or hierarchy presented unless specifically recited herein.
The previous description is provided to enable any person skilled in the art to practice the various aspects described herein. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects. Thus, the claims are not intended to be limited to the aspects shown herein, but are to be accorded the full scope consistent with the language of the claims, wherein reference to an element in the singular is not intended to mean "one and only one" (unless specifically so stated) but rather "one or more". The term "some" means one or more unless specifically stated otherwise. A phrase referring to "at least one of" a list of items refers to any combination of those items, including a single member. By way of example, "at least one of a, b, or c" is intended to encompass: at least one a; at least one b; at least one c; at least one a and at least one b; at least one a and at least one c; at least one b and at least one c; and at least one a, at least one b, and at least one c. All structural and functional equivalents to the elements of the various aspects described throughout this disclosure that are known or later come to be known to those of ordinary skill in the art are expressly incorporated herein by reference and are intended to be encompassed by the claims. Moreover, nothing disclosed herein is intended to be dedicated to the public regardless of whether such disclosure is explicitly recited in the claims.

Claims (10)

1. A system for determining group privacy exposure risk, comprising:
a privacy data set preprocessing module for preprocessing the privacy data set relating to the group;
a privacy database for storing the pre-processed privacy data set;
the group information preprocessing module is used for determining the openness ai of the group to each type of privacy data set based on the preprocessed privacy data sets and forming a corresponding privacy open matrix;
the privacy disclosure degree calculation module is used for determining the degree of association of the privacy data sets and forming an association matrix R, determining the triplet information { disclosure severity Si, disclosure difficulty Bi, openness degree Fpi of the group } of each type of privacy data set, and determining the privacy disclosure degree of each type of privacy data set based on the association matrix R and the triplet information; and
a privacy disclosure risk determination module to determine a privacy disclosure risk quantification value for the group based on the privacy disclosure degree to determine the privacy disclosure risk for the group.
2. The system of claim 1, wherein pre-processing the private data set comprises:
the private data set is cleaned, formatted, filtered of useless data, normalized by repeated data, and normalized to generate a feature vector of the private data set.
3. The system of claim 1,
determining the openness ai comprises dividing the openness of the members of the group to each type of private data set into a plurality of levels, the maximum value of the levels not exceeding 1, the openness of the group to each type of private data set being the minimum value of the levels, and
wherein Si ≥ 1, Bi ≥ 1, and Fpi ≤ 1.
4. The system of claim 1, wherein determining the degree of privacy disclosure comprises:
defining a privacy disclosure risk coefficient θ = (Si × Fpi)/Bi;
defining a privacy leakage vector T = (θ1, θ2, …) for each type of privacy data set; and
determining the privacy disclosure risk vector R_VALUE = R * T.
5. The system of claim 4, wherein determining the group's privacy-exposure risk quantified value comprises weighting the privacy-exposure risk vector according to a frequency of use of each type of privacy data set to obtain the privacy-exposure risk quantified value.
6. A method for determining group privacy exposure risk, comprising:
pre-processing a set of privacy data about the group;
determining the degree of association of the private data sets and forming an association matrix R;
the method comprises the steps of determining the openness ai of the group to each type of privacy data set based on the preprocessed privacy data sets, and forming a corresponding privacy open matrix;
for each type of private data set, determining three-tuple information { leakage severity Si, leakage difficulty Bi, openness Fpi of the group };
determining a privacy disclosure degree of each type of privacy data set based on the incidence matrix R and the triplet information; and
determining a privacy exposure risk quantification value for the group based on the privacy exposure degree to determine a privacy exposure risk for the group.
7. The method of claim 6, wherein preprocessing the private data set comprises: the private data set is cleaned, formatted, filtered of useless data, normalized by repeated data, and normalized to generate a feature vector of the private data set.
8. The method of claim 6,
determining the openness ai comprises dividing the openness of the members of the group to each type of private data set into a plurality of levels, the maximum value of the levels not exceeding 1, the openness of the group to each type of private data set being the minimum value of the levels, and
wherein Si ≥ 1, Bi ≥ 1, and Fpi ≤ 1.
9. The method of claim 6, wherein determining the degree of privacy disclosure comprises:
defining a privacy disclosure risk coefficient θ = (Si × Fpi)/Bi;
defining a privacy leakage vector T = (θ1, θ2, …) for each type of privacy data set; and
determining the privacy disclosure risk vector R_VALUE = R * T.
10. The method of claim 9, wherein determining the group's privacy-exposure risk quantified value comprises weighting the privacy-exposure risk vector according to a frequency of use of each type of privacy data set to obtain the privacy-exposure risk quantified value.
CN202011103737.8A 2020-10-15 2020-10-15 System and method for determining group privacy disclosure risk Active CN113709090B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011103737.8A CN113709090B (en) 2020-10-15 2020-10-15 System and method for determining group privacy disclosure risk

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011103737.8A CN113709090B (en) 2020-10-15 2020-10-15 System and method for determining group privacy disclosure risk

Publications (2)

Publication Number Publication Date
CN113709090A true CN113709090A (en) 2021-11-26
CN113709090B CN113709090B (en) 2023-03-17

Family

ID=78646709

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011103737.8A Active CN113709090B (en) 2020-10-15 2020-10-15 System and method for determining group privacy disclosure risk

Country Status (1)

Country Link
CN (1) CN113709090B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113904874A (en) * 2021-11-30 2022-01-07 北京中超伟业信息安全技术股份有限公司 Unmanned aerial vehicle data secure transmission method
CN114139213A (en) * 2022-02-07 2022-03-04 广州海洁尔医疗设备有限公司 ICU ward monitoring data processing method and system

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105871891A (en) * 2016-05-17 2016-08-17 中国互联网络信息中心 DNS privacy leakage risk assessment method and system
US20170270318A1 (en) * 2016-03-15 2017-09-21 Stuart Ritchie Privacy impact assessment system and associated methods
CN109670342A (en) * 2018-12-30 2019-04-23 北京工业大学 The method and apparatus of information leakage risk measurement
CN109783614A (en) * 2019-01-25 2019-05-21 北京信息科技大学 A kind of the difference privacy leakage detection method and system of social networks text to be released
CN110222058A (en) * 2019-06-05 2019-09-10 深圳市优网科技有限公司 Multi-source data based on FP-growth is associated with privacy leakage risk evaluating system
US20200151351A1 (en) * 2018-11-13 2020-05-14 International Business Machines Corporation Verification of Privacy in a Shared Resource Environment

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170270318A1 (en) * 2016-03-15 2017-09-21 Stuart Ritchie Privacy impact assessment system and associated methods
CN105871891A (en) * 2016-05-17 2016-08-17 中国互联网络信息中心 DNS privacy leakage risk assessment method and system
US20200151351A1 (en) * 2018-11-13 2020-05-14 International Business Machines Corporation Verification of Privacy in a Shared Resource Environment
CN109670342A (en) * 2018-12-30 2019-04-23 北京工业大学 The method and apparatus of information leakage risk measurement
CN109783614A (en) * 2019-01-25 2019-05-21 北京信息科技大学 A kind of the difference privacy leakage detection method and system of social networks text to be released
CN110222058A (en) * 2019-06-05 2019-09-10 深圳市优网科技有限公司 Multi-source data based on FP-growth is associated with privacy leakage risk evaluating system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
AKSHIKA WIJESUNDARA: "Engineering Privacy-aware Smart Home Environments", EICS '20 Companion: Companion Proceedings of the 12th ACM SIGCHI Symposium on Engineering Interactive Computing Systems *
吴丁娟: "Privacy Concerns of Medical Data and Their Influencing Factors in the Context of Big Data", Journal of Henan Normal University *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113904874A (en) * 2021-11-30 2022-01-07 北京中超伟业信息安全技术股份有限公司 Unmanned aerial vehicle data secure transmission method
CN113904874B (en) * 2021-11-30 2022-03-04 北京中超伟业信息安全技术股份有限公司 Unmanned aerial vehicle data secure transmission method
CN114139213A (en) * 2022-02-07 2022-03-04 广州海洁尔医疗设备有限公司 ICU ward monitoring data processing method and system

Also Published As

Publication number Publication date
CN113709090B (en) 2023-03-17

Similar Documents

Publication Publication Date Title
JP6814017B2 (en) Computer implementation systems and methods that automatically identify attributes for anonymization
CN113709090B (en) System and method for determining group privacy disclosure risk
Javadi et al. Monitoring misuse for accountable'artificial intelligence as a service'
Rupa et al. A machine learning driven threat intelligence system for malicious URL detection
EP3567508A1 (en) Detection and prevention of privacy violation due to database release
Veena et al. C SVM classification and KNN techniques for cyber crime detection
Torra Privacy in data mining
Vishva et al. Phisher fighter: website phishing detection system based on url and term frequency-inverse document frequency values
Kasim Automatic detection of phishing pages with event-based request processing, deep-hybrid feature extraction and light gradient boosted machine model
Wang et al. Exploring topic models to discern cyber threats on Twitter: A case study on Log4Shell
Pandey et al. Text and data mining to detect phishing websites and spam emails
de Oliveira Silva et al. Privacy and data mining: Evaluating the impact of data anonymization on classification algorithms
US20200293590A1 (en) Computer-implemented Method and System for Age Classification of First Names
Phua et al. On the communal analysis suspicion scoring for identity crime in streaming credit applications
Schroeder et al. Crimelink explorer: Using domain knowledge to facilitate automated crime association analysis
CN112231746A (en) Joint data analysis method, device and system and computer readable storage medium
Park et al. Evaluating differentially private decision tree model over model inversion attack
Garfinkel et al. Detecting threatening insiders with lightweight media forensics
Sumalatha et al. Data collection and audit logs of digital forensics in cloud
Abid et al. Online testing of user profile resilience against inference attacks in social networks
Mavriki et al. Profiling with Big Data: Identifying Privacy Implication for Individuals, Groups and Society
Canelón et al. Unstructured data for cybersecurity and internal control
Kashid et al. Discrimination-aware data mining: a survey
Kabwe et al. Identity attributes metric modelling based on mathematical distance metrics models
Krüger An Approach to Profiler Detection of Cyber Attacks using Case-based Reasoning.

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right
TA01 Transfer of patent application right

Effective date of registration: 20220208

Address after: Room 1423, No. 1256 and 1258, Wanrong Road, Jing'an District, Shanghai 200072

Applicant after: Tianyi Digital Life Technology Co.,Ltd.

Address before: 201702 3rd floor, 158 Shuanglian Road, Qingpu District, Shanghai

Applicant before: Tianyi Smart Family Technology Co.,Ltd.

GR01 Patent grant
GR01 Patent grant