CN110852374A - Data detection method and device, electronic equipment and storage medium - Google Patents

Data detection method and device, electronic equipment and storage medium

Info

Publication number
CN110852374A
Authority
CN
China
Prior art keywords
data
cluster
clustering
target
detected
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911090328.6A
Other languages
Chinese (zh)
Other versions
CN110852374B (en)
Inventor
赵瑞辉
石维
苏晓东
陈婷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Cloud Computing Beijing Co Ltd
Original Assignee
Tencent Cloud Computing Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Cloud Computing Beijing Co Ltd filed Critical Tencent Cloud Computing Beijing Co Ltd
Priority to CN201911090328.6A priority Critical patent/CN110852374B/en
Publication of CN110852374A publication Critical patent/CN110852374A/en
Application granted granted Critical
Publication of CN110852374B publication Critical patent/CN110852374B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/2433 Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/008 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols involving homomorphic encryption

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the invention disclose a data detection method, a data detection device, electronic equipment and a storage medium. The data detection method comprises: encrypting data to be detected with a preset homomorphic encryption algorithm to obtain encrypted data; sending the encrypted data to a server, so that the server clusters the encrypted data based on data obtained from different terminals; receiving a first clustering result returned by the server and decrypting it with the homomorphic encryption algorithm; determining, from the decrypted first clustering result, a target clustering center and a membership value of the cluster to which the data to be detected belongs; obtaining a range interval value corresponding to the target clustering center; and detecting the data to be detected using the membership value, the range interval value and the target clustering center to obtain a detection result corresponding to the data to be detected.

Description

Data detection method and device, electronic equipment and storage medium
Technical Field
The invention relates to the technical field of computers, in particular to a data detection method, a data detection device, electronic equipment and a storage medium.
Background
With the development of communication technology, data security is receiving more and more attention. Taking enterprise data security as an example, data is usually inspected to prevent malicious actors from obtaining enterprise data for profit. Commonly used detection schemes at present include rule-based anomaly detection and anomaly detection based on traditional machine learning.
These schemes need sufficient data to support anomaly detection on a global scale and thereby find genuine anomalies. However, because of information silos (data islands), the anomaly detection algorithm often still has to be run when the amount of data is insufficient, which can make the detection result inaccurate.
Disclosure of Invention
The embodiment of the invention provides a data detection method, a data detection device, electronic equipment and a storage medium, which can improve the accuracy of data detection.
The embodiment of the invention provides a data detection method, which comprises the following steps:
encrypting the data to be detected by a preset homomorphic encryption algorithm to obtain encrypted data;
sending the encrypted data to a server so that the server can perform clustering processing on the encrypted data based on data acquired from different terminals;
receiving a first clustering result returned by the server, and decrypting the first clustering result by adopting the homomorphic encryption algorithm;
determining a target cluster center and a membership value of a cluster to which the data to be detected belongs according to the decrypted first cluster result;
and acquiring a range interval value corresponding to the target clustering center, and detecting the data to be detected through the membership value, the range interval value and the target clustering center to obtain a detection result corresponding to the data to be detected.
Correspondingly, an embodiment of the present invention further provides a data detection apparatus, including:
the encryption module is used for encrypting the data to be detected through a preset homomorphic encryption algorithm to obtain encrypted data;
the sending module is used for sending the encrypted data to a server so that the server can perform clustering processing on the encrypted data based on data obtained from different terminals;
the decryption module is used for receiving the first clustering result returned by the server and decrypting the first clustering result by adopting the homomorphic encryption algorithm;
the determining module is used for determining a target cluster center and a membership value of a cluster to which the data to be detected belongs according to the decrypted first cluster result;
the acquisition module is used for acquiring a range interval value corresponding to the target clustering center;
and the detection module is used for detecting the data to be detected through the membership value, the range interval value and the target clustering center to obtain a detection result corresponding to the data to be detected.
Optionally, in some embodiments of the present invention, the detection module includes:
the extraction unit is used for extracting all the cluster centers in the decrypted cluster result;
the calculating unit is used for calculating the distance between the target clustering center and each clustering center;
and the detection unit is used for detecting the data to be detected based on the membership value, the range interval value and the distance to obtain a detection result corresponding to the data to be detected.
Optionally, in some embodiments of the present invention, the detection unit is specifically configured to:
when the distance is less than or equal to a first threshold value, judging whether the membership value is within the range interval value of the target clustering center, and if the membership value is within the range interval value of the target clustering center, determining that the data to be detected is normal data.
Optionally, in some embodiments of the present invention, the detection unit is further specifically configured to:
when the distance between the target cluster center and the other cluster centers is larger than the first threshold value, determining that the data to be detected is abnormal data; or
when the distance between the target cluster center and the other cluster centers is smaller than or equal to the first threshold value and the membership value is not within the range interval value of the target cluster center, determining that the data to be detected is abnormal data.
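The decision rule described for the detection unit can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function name `detect`, the use of Euclidean distance, and the reading of "the distance between the target cluster center and other cluster centers is larger than the first threshold" as "even the nearest other center is farther away than the threshold" are all assumptions made here.

```python
import math


def detect(target_center, all_centers, membership, interval, first_threshold):
    """Return "normal" or "abnormal" for one item of data to be detected.

    All parameter names are hypothetical. `interval` is the saved
    (minimum, maximum) range interval value of the target cluster center.
    """
    t_min, t_max = interval
    others = [c for c in all_centers if c != target_center]
    # Assumption: the target cluster is treated as isolated when even the
    # nearest other center lies farther away than the first threshold.
    if others:
        nearest = min(math.dist(target_center, c) for c in others)
        if nearest > first_threshold:
            return "abnormal"
    # Otherwise the membership value must fall inside the range interval.
    if t_min <= membership <= t_max:
        return "normal"
    return "abnormal"
```

With centers `(0, 0)`, `(1, 0)` and `(10, 10)` and a threshold of 2, data assigned to `(0, 0)` is judged by its membership value, while data assigned to `(10, 10)` is flagged as abnormal regardless of membership.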
Optionally, in some embodiments of the present invention, the apparatus further includes a training module, where the training module is specifically configured to:
acquiring a sample data set, wherein the sample data set comprises a plurality of sample data with normal data condition marks;
encrypting the sample data of the sample data set by a preset homomorphic encryption algorithm to obtain an encrypted data set;
sending the encrypted data set to a server so that the server can perform clustering processing on the data in the encrypted data set based on the data acquired from different terminals;
receiving a second clustering result returned by the server, and decrypting the second clustering result by adopting the homomorphic encryption algorithm;
determining a cluster center of a cluster to which the sample data belongs and a membership value corresponding to the sample data according to the decrypted second clustering result;
calculating a range interval value of a clustering center of a cluster to which the sample data belongs according to the membership value corresponding to the sample data;
predicting the data condition of the sample data according to the membership value corresponding to the sample data, the clustering center of the cluster to which the sample data belongs and the range interval value of the clustering center of the cluster to which the sample data belongs to obtain the predicted data condition of the sample data;
according to the real data condition and the predicted data condition, adjusting the clustering center of the cluster to which the sample data belongs until the clustering center of the cluster to which the sample data belongs meets the preset condition;
saving a range interval value corresponding to the clustering center meeting the preset condition;
the obtaining module is specifically configured to obtain a range interval value corresponding to the target cluster center from the stored range interval values.
Optionally, in some embodiments of the present invention, the system further includes a building module, where the building module is specifically configured to:
extracting attribute information of a clustering center corresponding to the sample data;
and constructing a mapping relation between the attribute information and the range interval value.
Optionally, in some embodiments of the present invention, the determining module is specifically configured to:
extracting attribute information corresponding to the target clustering center;
and acquiring a range interval value corresponding to the target clustering center from the stored range interval values based on a preset mapping relation.
Optionally, in some embodiments of the present invention, the decryption module is specifically configured to:
acquiring a decryption function corresponding to the data to be detected based on a preset homomorphic encryption algorithm;
and decrypting the first clustering result through a decryption function.
According to the embodiment of the invention, data to be detected is encrypted with a preset homomorphic encryption algorithm to obtain encrypted data, and the encrypted data is sent to a server so that the server can cluster the encrypted data based on data obtained from different terminals. A first clustering result returned by the server is then received and decrypted with the homomorphic encryption algorithm, and a target clustering center and a membership value of the cluster to which the data to be detected belongs are determined from the decrypted first clustering result. Finally, a range interval value corresponding to the target clustering center is obtained, and the data to be detected is detected using the membership value, the range interval value and the target clustering center to obtain a detection result corresponding to the data to be detected. Therefore, the scheme can effectively improve the accuracy of data detection.
Drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1a is a schematic view of a scene of a data detection method according to an embodiment of the present invention;
FIG. 1b is a schematic flow chart of a data detection method according to an embodiment of the present invention;
FIG. 2a is a schematic flow chart of a data detection method according to an embodiment of the present invention;
FIG. 2b is a diagram illustrating a mapping relationship provided by an embodiment of the present invention;
FIG. 2c is a schematic diagram of another scenario of a data detection method according to an embodiment of the present invention;
FIG. 2d is a schematic interface diagram of the detection result provided by the embodiment of the present invention;
FIG. 3a is a schematic structural diagram of a first implementation of a data detection apparatus according to an embodiment of the present invention;
FIG. 3b is a schematic structural diagram of a second implementation of the data detection apparatus according to the embodiment of the present invention;
FIG. 3c is a schematic structural diagram of a third implementation of a data detection apparatus according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment of the invention provides a data detection system, which is hereinafter referred to as a detection system.
The detection system may include a user, a terminal and a server. The data detection apparatus may be integrated in the terminal, and the terminal may be a mobile phone, a tablet computer or a personal computer (PC).
For example, referring to FIG. 1a, the data detection apparatus is integrated in a personal computer. When the personal computer receives a data detection instruction triggered by a user, it may obtain the data to be detected corresponding to the instruction, encrypt that data with a preset homomorphic encryption algorithm to obtain encrypted data, and send the encrypted data to a server, so that the server clusters the encrypted data based on data obtained from different terminals. The personal computer then receives a first clustering result returned by the server, decrypts it with the homomorphic encryption algorithm, and determines, from the decrypted first clustering result, a target clustering center and a membership value of the cluster to which the data to be detected belongs. Finally, it obtains a range interval value corresponding to the target clustering center and detects the data to be detected using the membership value, the range interval value and the target clustering center, to obtain a detection result corresponding to the data to be detected.
In this scheme, the data to be detected is encrypted with a homomorphic encryption algorithm, so the server cannot obtain the actual values of the data to be detected, which improves the security of data detection. In addition, the server clusters the encrypted data based on data acquired from different terminals, which addresses the false alarms that arise when an existing data detection scheme, limited by data islands, has too little data, and thereby improves the accuracy of data detection.
The following are detailed below. It should be noted that the description sequence of the following embodiments is not intended to limit the priority sequence of the embodiments.
A method of data detection, comprising: encrypting data to be detected by a preset homomorphic encryption algorithm to obtain encrypted data, sending the encrypted data to a server so that the server can perform clustering processing on the encrypted data based on data obtained from different terminals, receiving a first clustering result returned by the server, decrypting the first clustering result by using the homomorphic encryption algorithm, determining a target clustering center and a membership value of a cluster to which the data to be detected belongs according to the decrypted first clustering result, obtaining a range interval value corresponding to the target clustering center, and detecting the data to be detected by using the membership value, the range interval value and the target clustering center to obtain a detection result corresponding to the data to be detected.
Referring to fig. 1b, fig. 1b is a schematic flow chart of a data detection method according to an embodiment of the invention. The specific flow of the data detection method can be as follows:
101. Encrypt the data to be detected with a preset homomorphic encryption algorithm to obtain encrypted data.
Specifically, the data to be detected, which needs to be detected, can be acquired from the local database, and then the data to be detected is encrypted through a preset homomorphic encryption algorithm, so that encrypted data is obtained.
Homomorphic encryption is an encryption method that allows mathematical operations to be performed on encrypted data rather than on the actual data itself. The ciphertext, an encrypted version of the input data (the plaintext), is operated on and then decrypted to obtain the desired output. The key property of homomorphic encryption is that decrypting the result of an operation on the ciphertext gives the same output as performing the operation directly on the original plaintext. In other words, homomorphic encryption allows computation on encrypted data without decrypting it first, and encrypting the original data with this technique does not significantly change the attributes of the original data.
Optionally, the embodiment of the present invention may adopt a lightweight homomorphic encryption algorithm. The lightweight homomorphic encryption algorithm supports addition, subtraction and multiplication. It protects the privacy of each data source and enlarges the scale of the data available for detection, and it also reduces the memory occupied on the terminal, thereby improving the efficiency of data detection.
102. Send the encrypted data to a server so that the server can cluster the encrypted data based on the data acquired from different terminals.
Because the encrypted data is the data to be detected encrypted with the preset homomorphic encryption algorithm, the server cannot obtain the true values of the data to be detected. The server can also acquire data from different terminals, and this data may likewise be encrypted. The server then clusters the encrypted data based on that data, which reduces false alarms caused by an insufficient amount of data on a single terminal and improves the accuracy of data detection.
In addition, the server may cluster the encrypted data with Fuzzy C-Means (FCM) clustering, or with K-Means clustering. The concept of fuzziness is introduced here: fuzziness means that the extension of a concept is uncertain or unclear. For example, the concept "young" is well understood, but its extension, that is, exactly which ages count as young, is hard to state, because there is no definite boundary between "young" and "not young"; it is a fuzzy concept. Under a deterministic (crisp) classification, one might say that a 20-year-old is "young" and a 21-year-old is "not young". Under a fuzzy view, both 20 and 21 fall within the category "young" to some degree; for instance, a 21-year-old may be regarded as 0.9 "young" and 0.1 "not young", where 0.9 and 0.1 are degrees of similarity. The degree of similarity of a sample to a result is called the membership degree of the sample; it indicates how similar the sample is to each of the different results.
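To make the membership degree concrete, the sketch below computes fuzzy memberships with the standard Fuzzy C-Means membership formula on plaintext values. It is only an illustration: the patent's server clusters encrypted data, and the function name `fcm_memberships`, the fuzzifier default `m = 2` and the Euclidean distance are assumptions made here rather than details from the source.

```python
import math


def fcm_memberships(point, centers, m=2.0):
    """Return the membership of `point` in each cluster (values sum to 1).

    Standard FCM formula: u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)),
    where d_i is the distance from the point to center i and m > 1.
    """
    dists = [math.dist(point, c) for c in centers]
    # A point lying exactly on a center belongs fully to that center.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
        for d_i in dists
    ]
```

For a point at distance 1 from one center and distance 3 from another, this gives memberships of about 0.9 and 0.1, which mirrors the "0.9 young, 0.1 not young" reading in the text.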
103. Receive a first clustering result returned by the server, and decrypt the first clustering result with the homomorphic encryption algorithm.
Because the first clustering result returned by the server is the clustering result for the encrypted data, it must be decrypted with the homomorphic encryption algorithm. For example, a corresponding decryption function may be obtained according to the homomorphic encryption algorithm, and the first clustering result is then decrypted with that decryption function. That is, in some embodiments, the step "decrypting the first clustering result by using the homomorphic encryption algorithm" includes:
(11) acquiring a decryption function corresponding to the data to be detected based on the preset homomorphic encryption algorithm;
(12) decrypting the first clustering result by using the decryption function.
For example, the data A to be detected is encrypted by the encryption function F to obtain encrypted data A', that is, $F(A) = A'$. The decryption function $F^{-1}$ corresponding to the encryption function F can then be obtained, and the first clustering result can be decrypted with $F^{-1}$. In general, if encrypted data D' and encrypted data E' are added together to obtain superimposed encrypted data H', decrypting H' with $F^{-1}$ typically yields meaningless garbage. If, however, the encryption function F is homomorphic, decrypting H' with $F^{-1}$ yields a result H with $H = D + E$. In this way the right to process the data is separated from ownership of the data: an enterprise can use the computing capacity of a cloud service to process its data while preventing that data from being leaked.
It should be noted that the encryption functions in homomorphic encryption algorithms can be divided into additively homomorphic and multiplicatively homomorphic functions. If $F(A) + F(B) = F(A + B)$, the encryption function is called additively homomorphic; if $F(A) \times F(B) = F(A \times B)$, it is called multiplicatively homomorphic; an encryption function that is both additively and multiplicatively homomorphic is called fully homomorphic encryption.
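The additive property can be illustrated with a toy version of the Paillier cryptosystem, a well-known additively homomorphic scheme. The patent does not name a concrete algorithm, so Paillier is swapped in here purely as an example. The primes are tiny and insecure, all function names are hypothetical, and the sketch needs Python 3.8 or later for `pow(x, -1, n)`. Note that in Paillier the operation applied to ciphertexts is modular multiplication, which corresponds to addition of the plaintexts.

```python
import math
import random


def keygen(p=1789, q=1861):
    """Build a toy Paillier key from two small primes (demo only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # inverse of lam mod n; valid for generator n + 1
    return n, lam, mu


def encrypt(n, m, rng):
    """Encrypt an integer 0 <= m < n as (n+1)^m * r^n mod n^2."""
    n2 = n * n
    r = rng.randrange(1, n)
    while math.gcd(r, n) != 1:  # r must be invertible modulo n
        r = rng.randrange(1, n)
    return pow(n + 1, m, n2) * pow(r, n, n2) % n2


def decrypt(n, lam, mu, c):
    """Recover m as L(c^lam mod n^2) * mu mod n, with L(x) = (x - 1) // n."""
    n2 = n * n
    return (pow(c, lam, n2) - 1) // n * mu % n


def add_encrypted(n, c1, c2):
    """Combine two ciphertexts so the result decrypts to the plaintext sum."""
    return c1 * c2 % (n * n)
```

A party holding only the ciphertexts of D and E can call `add_encrypted` without learning either value; only the key holder's `decrypt` call recovers D + E, provided the sum stays below n.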
104. Determine a target cluster center and a membership value of the cluster to which the data to be detected belongs according to the decrypted first clustering result.
The target cluster center of the cluster to which the data to be detected belongs can be extracted from the decrypted first cluster result, and the membership value of the data to be detected belonging to the target cluster center can be extracted from the decrypted first cluster result.
The concept of the membership value is introduced here. Membership is a concept from fuzzy evaluation functions: if, for every element x in the domain of discourse U, there is a corresponding number $A(x) \in [0, 1]$, then A is called a fuzzy set on U and A(x) is called the membership value of x in A. The closer A(x) is to 1, the higher the degree to which x belongs to A; the closer A(x) is to 0, the lower the degree to which x belongs to A. The membership function A(x), taking values in the interval [0, 1], thus represents the degree to which x belongs to A.
105. Acquire a range interval value corresponding to the target clustering center, and detect the data to be detected using the membership value, the range interval value and the target clustering center to obtain a detection result corresponding to the data to be detected.
The range interval value corresponding to the target clustering center may be computed in advance from a plurality of data clustered by the server. That is, in some embodiments, the step "obtaining the range interval value corresponding to the target clustering center" may specifically include:
(21) acquiring a sample data set, wherein the sample data set comprises a plurality of sample data with normal data condition marks;
(22) encrypting the sample data of the sample data set by a preset homomorphic encryption algorithm to obtain an encrypted data set;
(23) sending the encrypted data set to a server so that the server can perform clustering processing on the data in the encrypted data set based on the data acquired from different terminals;
(24) receiving a second clustering result returned by the server, and decrypting the second clustering result by adopting a homomorphic encryption algorithm;
(25) determining a cluster center of a cluster to which the sample data belongs and a membership value corresponding to the sample data according to the decrypted second clustering result;
(26) calculating a range interval value of a clustering center of a cluster to which the sample data belongs according to the membership value corresponding to the sample data;
(27) predicting the data condition of the sample data according to the membership value corresponding to the sample data, the clustering center of the cluster to which the sample data belongs and the range interval value of the clustering center of the cluster to which the sample data belongs to obtain the predicted data condition of the sample data;
(28) according to the real data condition and the predicted data condition, adjusting the clustering center of the cluster to which the sample data belongs until the clustering center of the cluster to which the sample data belongs meets the preset condition;
(29) saving a range interval value corresponding to the clustering center meeting the preset condition;
For example, a sample data set may be obtained that includes 10 sample data, the data conditions of which are all marked as normal. The 10 sample data are each encrypted with the homomorphic encryption algorithm to obtain an encrypted data set, and the encrypted data set is sent to the server, so that the server can cluster the data in the encrypted data set based on data obtained from different terminals. A second clustering result returned by the server is then received and decrypted with the homomorphic encryption algorithm; the decrypted second clustering result may include the cluster center of the cluster to which each sample data belongs and the membership value corresponding to the sample data. The range interval value of the cluster center of the cluster to which the sample data belongs is then calculated from the membership value corresponding to the sample data, for example using the following formula:
[Formula image BDA0002266655130000091 not reproduced in the source text]
Here $X_i$ is the membership value corresponding to the sample data, $T(T_1, T_2)$ denotes the range interval value of the cluster center, n is the number of cluster centers, $T_1$ denotes the minimum value of the range interval value, and $T_2$ denotes the maximum value of the range interval value.
Then, the data condition of the sample data is predicted from the membership value corresponding to the sample data, the cluster center of the cluster to which the sample data belongs and the range interval value of that cluster center, giving the predicted data condition of the sample data. When the true data condition is consistent with the predicted data condition, the cluster center of the cluster to which the sample data belongs is determined to meet the preset condition, and the range interval value corresponding to that cluster center can then be saved locally. When the true data condition is inconsistent with the predicted data condition, the cluster center of the cluster to which the sample data belongs is adjusted until it meets the preset condition. That is, in some embodiments, the step "obtaining a range interval value corresponding to a target cluster center" may specifically include: acquiring the range interval value corresponding to the target cluster center from the stored range interval values.
Optionally, in some embodiments, the method may further include:
(31) extracting attribute information of a clustering center corresponding to the sample data;
(32) and constructing a mapping relation between the attribute information and the range interval value.
For example, the sample data set includes two sample data: the attribute information of the cluster center corresponding to one sample data is "finance", and the attribute information of the cluster center corresponding to the other sample data is "medical". A mapping relationship between the attribute information and the range interval value may then be constructed to facilitate subsequent use. That is, in some embodiments, the step "obtaining the range interval value corresponding to the target cluster center" may specifically include:
(41) extracting attribute information corresponding to a target clustering center;
(42) and acquiring a range interval value corresponding to the target clustering center from the stored range interval values based on a preset mapping relation.
For example, when it is determined that the attribute information corresponding to the target cluster center is "finance", the range interval value corresponding to "finance" may be acquired from the stored range interval values through the preset mapping relationship.
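As a small illustration of steps (41) and (42), the mapping relationship can be held in a dictionary keyed by attribute information. The interval values and the names `RANGE_BY_ATTRIBUTE` and `lookup_range` are hypothetical and only show the shape of the lookup.

```python
# Hypothetical stored mapping: attribute information -> range interval value.
RANGE_BY_ATTRIBUTE = {
    "finance": (0.80, 0.95),
    "medical": (0.70, 0.90),
}

def lookup_range(attribute):
    """Return the stored range interval for the target cluster center's attribute."""
    if attribute not in RANGE_BY_ATTRIBUTE:
        raise KeyError(f"no stored range interval for attribute {attribute!r}")
    return RANGE_BY_ATTRIBUTE[attribute]

finance_range = lookup_range("finance")
```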
Optionally, in some embodiments, the step of detecting the data to be detected through the membership value, the range interval value, and the target clustering center to obtain a detection result corresponding to the data to be detected specifically may include:
(51) extracting all cluster centers in the decrypted cluster result;
(52) calculating the distance between the target clustering center and each clustering center;
(53) and detecting the data to be detected based on the membership value, the range interval value and the distance to obtain a detection result corresponding to the data to be detected.
Specifically, whether all data of the target cluster center are normal can be judged from the distance between the target cluster center and each cluster center, and whether the data to be detected is normal can be judged from the membership value and the range interval value. When the distance between the target cluster center and each cluster center is smaller than the maximum value of the range interval value, it is judged whether the target membership value lies within the range interval value of the target cluster center; if it does, the current processing object is determined to be normal data. That is, in some embodiments, the step "detecting the data to be detected based on the membership value, the range interval value and the distance to obtain the detection result corresponding to the data to be detected" may specifically include:
when the distance is less than or equal to the first threshold value, judging whether the target membership value is within the range interval value of the target clustering center, and if the target membership value is within the range interval value of the target clustering center, determining that the object to be detected is normal data.
For example, in some embodiments, when the distance between the target cluster center and each cluster center is greater than the maximum value of the range interval value, all data corresponding to the target cluster center may be determined to be abnormal data. Alternatively, when the distance between the target cluster center and each cluster center is less than the maximum value of the range interval value and the target membership value is not within the range interval value of the target cluster center, the current processing object is determined to be abnormal data. That is, the step "detecting the data to be detected based on the membership value, the range interval value, and the distance to obtain the detection result corresponding to the data to be detected" may specifically include:
(61) when the distance between the target cluster center and other cluster centers is larger than a first threshold value, determining the current processing object as abnormal data; or
(62) when the distance between the target cluster center and other cluster centers is smaller than or equal to the first threshold value and the target membership value is not within the range interval value of the target cluster center, determining the object to be detected as abnormal data.
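The decision rule described above can be sketched as a single function. Two points are assumptions made for illustration: the distance is taken to be Euclidean, and "the distance between the target cluster center and other cluster centers is larger than a first threshold value" is read as every other center being farther away than the threshold. The function name `detect` and its parameters are hypothetical.

```python
import math

def detect(target_center, other_centers, membership, interval, first_threshold):
    """Return "normal" or "abnormal" for one piece of data to be detected."""
    low, high = interval
    # Assumed reading: the target cluster is isolated when every other
    # cluster center lies farther away than the first threshold.
    isolated = bool(other_centers) and all(
        math.dist(target_center, center) > first_threshold
        for center in other_centers
    )
    if isolated:
        return "abnormal"  # case (61)
    if low <= membership <= high:
        return "normal"    # distance within threshold, membership in range
    return "abnormal"      # case (62)

near = detect((0.0, 0.0), [(0.3, 0.4)], 0.90, (0.80, 0.95), first_threshold=1.0)
far = detect((0.0, 0.0), [(3.0, 4.0)], 0.90, (0.80, 0.95), first_threshold=1.0)
low_membership = detect((0.0, 0.0), [(0.3, 0.4)], 0.50, (0.80, 0.95), first_threshold=1.0)
```

Elsewhere the text compares the distance against the maximum value of the range interval value, so passing the interval's upper bound as `first_threshold` is one way to match that wording.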
According to the embodiment of the invention, the data to be detected is encrypted by a preset homomorphic encryption algorithm to obtain encrypted data, and the encrypted data is sent to a server so that the server can cluster the encrypted data based on the data obtained from different terminals. A first clustering result returned by the server is then received and decrypted by the homomorphic encryption algorithm, and a target clustering center and a membership value of the cluster to which the data to be detected belongs are determined according to the decrypted first clustering result. Finally, a range interval value corresponding to the target clustering center is obtained, and the data to be detected is detected through the membership value, the range interval value and the target clustering center to obtain a detection result corresponding to the data to be detected. Compared with the existing data detection scheme, the data detection method encrypts the data to be detected through a homomorphic encryption algorithm, so that the server cannot acquire the specific numerical value of the data to be detected, which improves the security of data detection. In addition, the server performs clustering processing on the encrypted data based on the data acquired from different terminals, which addresses the false alarms that arise when an existing data detection scheme is confined to a data island and therefore has too little data, and thus improves the accuracy of data detection.
The method according to the examples is further described in detail below by way of example.
In the present embodiment, the data detection apparatus is specifically integrated in a terminal as an example.
Referring to fig. 2a, a specific process of a data detection method may be as follows:
201. The terminal encrypts the data to be detected through a preset homomorphic encryption algorithm to obtain encrypted data.
Specifically, the terminal can generate a key according to the data to be detected. The key can include a public key and a private key; if the public key is used to encrypt data, the data can be decrypted only with the corresponding private key. The terminal then divides the data to be detected into a plurality of random values, constructs a multi-dimensional vector corresponding to the data to be detected from these random values, and encodes the multi-dimensional vector through the private key to obtain the encrypted data.
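The text does not name a concrete homomorphic encryption algorithm. As an illustration only, the sketch below swaps in a toy version of the Paillier cryptosystem, which is additively homomorphic: multiplying two ciphertexts yields a ciphertext of the sum of the plaintexts. It splits a value into random shares, encrypts each share into a vector, and shows that the sum can be recovered from the combined ciphertext. In Paillier the public key encrypts and the private key decrypts; the tiny primes, the share count, and all function names are assumptions and are not secure for real use.

```python
import math
import random

def keygen(p=1009, q=1013):
    """Toy Paillier key generation; real use needs very large primes."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse, valid for generator g = n + 1
    return (n, n + 1), (lam, mu)

def encrypt(public_key, m, rng):
    n, g = public_key
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)

def decrypt(public_key, private_key, c):
    n, _ = public_key
    lam, mu = private_key
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n

def add_encrypted(public_key, c1, c2):
    """Homomorphic addition: the product of ciphertexts encrypts the sum."""
    n, _ = public_key
    return (c1 * c2) % (n * n)

def split_into_shares(value, parts, modulus, rng):
    """Split `value` into random shares that sum to it modulo `modulus`."""
    shares = [rng.randrange(modulus) for _ in range(parts - 1)]
    shares.append((value - sum(shares)) % modulus)
    return shares

rng = random.Random(0)
public_key, private_key = keygen()
value = 4242  # hypothetical numeric feature of the data to be detected
shares = split_into_shares(value, 3, public_key[0], rng)
encrypted_vector = [encrypt(public_key, s, rng) for s in shares]

# A server could combine the ciphertexts without seeing any plaintext share.
combined = encrypted_vector[0]
for c in encrypted_vector[1:]:
    combined = add_encrypted(public_key, combined, c)
recovered = decrypt(public_key, private_key, combined)
```

Only the holder of the private key can turn `combined` back into the original value, which mirrors the property the method relies on: the server computes on ciphertexts and the terminal decrypts the result.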
202. The terminal sends the encrypted data to the server so that the server can perform clustering processing on the encrypted data based on the data acquired from different terminals.
For example, after the terminal sends the encrypted data to the server, the server may perform clustering processing on the encrypted data through a fuzzy clustering algorithm based on the data obtained from different terminals; a K-means clustering algorithm may of course also be used.
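To make the notion of a membership value concrete, the following sketch computes the standard fuzzy c-means membership of one point with respect to a set of cluster centers. It works on plaintext coordinates purely for readability; how the server evaluates the clustering over encrypted data is not detailed in this passage and is not modeled here. The function name `fcm_memberships` and the fuzzifier default `m=2.0` are assumptions.

```python
import math

def fcm_memberships(point, centers, m=2.0):
    """Fuzzy c-means memberships of `point` to each center (they sum to 1)."""
    dists = [math.dist(point, center) for center in centers]
    # If the point coincides with one or more centers, split full membership
    # among those centers to avoid dividing by zero.
    if any(d == 0 for d in dists):
        hits = [1.0 if d == 0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
        for d_i in dists
    ]

# A point at distance 1 from the first center and 3 from the second.
memberships = fcm_memberships((1.0, 0.0), [(0.0, 0.0), (4.0, 0.0)])
```

Unlike K-means, which assigns each point to exactly one cluster, fuzzy clustering yields a graded membership per cluster, which is what the later range interval check operates on.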
203. The terminal receives the first clustering result returned by the server and decrypts the first clustering result by adopting the homomorphic encryption algorithm.
For example, the terminal may obtain the private key corresponding to the public key used in step 201 and use it to decrypt the first clustering result.
204. The terminal determines a target cluster center and a membership value of the cluster to which the data to be detected belongs according to the decrypted first clustering result.
For example, the terminal may extract a target cluster center of a cluster to which the data to be detected belongs from the decrypted first cluster result, and extract a membership value of the data to be detected belonging to the target cluster center from the decrypted first cluster result.
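A minimal sketch of this extraction step follows. The layout of the decrypted first clustering result is not specified in the text, so the dictionary with `"centers"` and `"memberships"` keys is hypothetical, as is the assumption that the cluster to which the data belongs is the one with the highest membership value.

```python
def target_cluster(decrypted_result, data_id):
    """Return (target cluster center, membership value) for one data item."""
    centers = decrypted_result["centers"]
    memberships = decrypted_result["memberships"][data_id]
    # Assumed rule: the data belongs to the cluster with the highest membership.
    best = max(range(len(memberships)), key=lambda i: memberships[i])
    return centers[best], memberships[best]

# Hypothetical decrypted first clustering result.
result = {
    "centers": [(0.0, 0.0), (4.0, 0.0)],
    "memberships": {"record-1": [0.9, 0.1], "record-2": [0.2, 0.8]},
}
center, membership = target_cluster(result, "record-1")
```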
205. The terminal acquires a range interval value corresponding to the target clustering center, and detects the data to be detected through the membership value, the range interval value and the target clustering center to obtain a detection result corresponding to the data to be detected.
It should be noted that the server may obtain data of different terminals in advance and cluster it to obtain a plurality of cluster sets. Each cluster set includes at least one piece of data and corresponds to one cluster center, and the data conditions of the data are all labeled as normal. The server may then calculate a range interval value corresponding to each cluster center. In the terminal according to the embodiment of the present invention, a sample data set may first be obtained, the sample data set including a plurality of sample data whose data condition is labeled as normal. The terminal may then encrypt the sample data of the sample data set by using the preset homomorphic encryption algorithm to obtain an encrypted data set, and send the encrypted data set to the server, so that the server performs clustering processing on the data in the encrypted data set based on the data obtained from different terminals. The terminal receives a second clustering result returned by the server. Finally, the terminal adjusts the cluster center of the cluster to which the sample data belongs according to the real data condition and the predicted data condition until that cluster center meets the preset condition, and may also store the range interval value corresponding to the cluster center meeting the preset condition.
In order to improve the efficiency of data detection, the terminal may construct a mapping relationship between the attribute information of the cluster center and the range interval value. For example, the sample data set includes two sample data, where the attribute information of the cluster center corresponding to one sample data is "finance" and the attribute information of the cluster center corresponding to the other sample data is "medical". The terminal may then construct a mapping relationship between the attribute information and the range interval value to facilitate subsequent use, as shown in fig. 2b.
In the actual data detection process, the terminal can judge whether all data of the target cluster center are normal according to the distance between the target cluster center and each cluster center, and judge whether the data to be detected is normal according to the membership value and the range interval value. When the distance between the target cluster center and each cluster center is smaller than the maximum value of the range interval value, the terminal judges whether the target membership value is within the range interval value of the target cluster center: if it is, the current processing object is determined to be normal data; if it is not, the current processing object is determined to be abnormal data. When the distance between the target cluster center and each cluster center is larger than the maximum value of the range interval value, all data corresponding to the target cluster center can be considered abnormal data.
According to the embodiment of the invention, the terminal encrypts the data to be detected by a preset homomorphic encryption algorithm to obtain encrypted data, and sends the encrypted data to the server so that the server can cluster the encrypted data based on the data obtained from different terminals. The terminal then receives a first clustering result returned by the server, decrypts the first clustering result by adopting the homomorphic encryption algorithm, and determines a target clustering center and a membership value of the cluster to which the data to be detected belongs according to the decrypted first clustering result. Finally, the terminal obtains a range interval value corresponding to the target clustering center and detects the data to be detected through the membership value, the range interval value and the target clustering center to obtain a detection result corresponding to the data to be detected. Compared with the existing data detection scheme, the terminal encrypts the data to be detected through a homomorphic encryption algorithm, so that the server cannot acquire the specific numerical value of the data to be detected, which improves the security of data detection. In addition, the server performs clustering processing on the encrypted data based on the data acquired from different terminals, which addresses the false alarms that arise when an existing data detection scheme is confined to a data island and therefore has too little data, and thus improves the accuracy of data detection.
To facilitate understanding of the data detection method provided by the embodiment of the present invention, please refer to fig. 2c, which illustrates a scenario of data detection through a data detection platform. First, a user logs in to the data detection platform through a terminal. When the user identifier of the user passes verification, the user can upload the data to be detected through the terminal, and the terminal acquires the data to be detected. The terminal then encrypts the data to be detected through a preset homomorphic encryption algorithm to obtain encrypted data, and sends the encrypted data to the server, so that the server performs clustering processing on the encrypted data based on data acquired from different terminals. Next, the terminal receives a first clustering result returned by the server, decrypts the first clustering result by using the homomorphic encryption algorithm, and determines a target clustering center and a membership value of the cluster to which the data to be detected belongs according to the decrypted first clustering result. The terminal then obtains a range interval value corresponding to the target cluster center, and detects the data to be detected through the membership value, the range interval value and the target cluster center to obtain a detection result corresponding to the data to be detected. Finally, the terminal can display an interface corresponding to the detection result, as shown in fig. 2d. In this example the data to be detected is abnormal data, and information about the attacker, such as the network address, the attack time, and the number of ports involved in the attack event, is displayed on the interface.
In order to better implement the data detection method according to the embodiment of the present invention, an embodiment of the present invention further provides a data detection apparatus (referred to as a detection apparatus for short) based on the foregoing data detection method. The terms are the same as those in the above data detection method, and details of implementation can be referred to the description in the method embodiment.
Referring to fig. 3a, fig. 3a is a schematic structural diagram of a data detection apparatus according to an embodiment of the present invention, where the detection apparatus may include an encryption module 301, a sending module 302, a decryption module 303, a determining module 304, an obtaining module 305, and a detection module 306, and specifically may be as follows:
the encryption module 301 is configured to encrypt the data to be detected by using a preset homomorphic encryption algorithm, so as to obtain encrypted data.
The encryption module 301 may obtain the data to be detected from the local database, and then encrypt the data to be detected by using a preset homomorphic encryption algorithm to obtain encrypted data.
A sending module 302, configured to send the encrypted data to the server, so that the server performs clustering processing on the encrypted data based on data obtained from different terminals.
Because the encrypted data is the data to be detected after encryption by the preset homomorphic encryption algorithm, the server cannot acquire the true value of the data to be detected. The server can also acquire data of different terminals, and this data may likewise have been encrypted. The server then performs clustering processing on the encrypted data based on this data, which reduces the false alarms caused by an insufficient amount of terminal data and improves the accuracy of data detection.
And the decryption module 303 is configured to receive the first clustering result returned by the server, and decrypt the first clustering result by using a homomorphic encryption algorithm.
Optionally, in some embodiments, the decryption module 303 is specifically configured to: and acquiring a decryption function corresponding to the data to be detected based on a preset homomorphic encryption algorithm, and decrypting the first clustering result through the decryption function.
Since the first clustering result returned by the server is the clustering result corresponding to the encrypted data, the homomorphic encryption algorithm needs to be used to decrypt the first clustering result, for example, the decryption module 303 may obtain a corresponding decryption function according to the homomorphic encryption algorithm, and then decrypt the first clustering result through the decryption function.
And the determining module 304 is configured to determine a target cluster center and a membership value of a cluster to which the data to be detected belongs according to the decrypted first cluster result.
The determining module 304 may extract a target cluster center of a cluster to which the data to be detected belongs from the decrypted first cluster result, and extract a membership value of the data to be detected belonging to the target cluster center from the decrypted first cluster result.
An obtaining module 305, configured to obtain a range interval value corresponding to a target clustering center.
Optionally, in some embodiments, referring to fig. 3b, the detection apparatus may further include a training module 307, where the training module 307 is specifically configured to: acquire a sample data set, the sample data set including a plurality of sample data whose data condition is labeled as normal; encrypt the sample data of the sample data set by a preset homomorphic encryption algorithm to obtain an encrypted data set; send the encrypted data set to a server so that the server can perform clustering processing on the data in the encrypted data set based on the data acquired from different terminals; receive a second clustering result returned by the server, and decrypt the second clustering result by using the homomorphic encryption algorithm; determine, according to the decrypted second clustering result, the clustering center of the cluster to which the sample data belongs and the membership value corresponding to the sample data; calculate a range interval value of the clustering center to which the sample data belongs according to the membership value corresponding to the sample data; predict the data condition of the sample data through the membership value corresponding to the sample data, the clustering center to which the sample data belongs and the range interval value of that clustering center, to obtain the predicted data condition of the sample data; adjust the clustering center of the cluster to which the sample data belongs according to the real data condition and the predicted data condition until the clustering center meets the preset condition; and store the range interval value corresponding to the clustering center meeting the preset condition.
Optionally, in some embodiments, the obtaining module 305 is specifically configured to: acquire a range interval value corresponding to the target clustering center from the stored range interval values.
Optionally, in some embodiments, referring to fig. 3c, the detection apparatus further includes a constructing module 308, where the constructing module 308 is specifically configured to: and extracting attribute information of a clustering center corresponding to the sample data, and constructing a mapping relation between the attribute information and the range interval value.
Optionally, in some embodiments, the obtaining module 305 is specifically configured to: extract attribute information corresponding to the target clustering center, and acquire a range interval value corresponding to the target clustering center from the stored range interval values based on a preset mapping relation.
The detection module 306 is configured to detect the data to be detected through the membership value, the range interval value and the target clustering center, so as to obtain a detection result corresponding to the data to be detected.
Specifically, the detection module 306 may determine whether all data of the target cluster center are normal according to the distance between the target cluster center and each cluster center, and determine whether the data to be detected are normal according to the membership value and the range interval value.
Optionally, in some embodiments, the detection module 306 may be specifically configured to: when the distance is less than or equal to the first threshold value, judge whether the target membership value is within the range interval value of the target clustering center, and if the target membership value is within the range interval value of the target clustering center, determine that the object to be detected is normal data.
Optionally, in some embodiments, the detection module 306 may be further configured to: when the distance between the target cluster center and the other cluster centers is larger than a first threshold value, determine the object to be detected as abnormal data; or, when the distance between the target cluster center and the other cluster centers is smaller than or equal to the first threshold value and the target membership value is not within the range interval value of the target cluster center, determine the object to be detected as abnormal data.
In the embodiment of the present invention, the encryption module 301 encrypts the data to be detected by using the preset homomorphic encryption algorithm to obtain encrypted data, and the sending module 302 sends the encrypted data to the server, so that the server performs clustering processing on the encrypted data based on data acquired from different terminals. The decryption module 303 then receives the first clustering result returned by the server and decrypts it by using the homomorphic encryption algorithm, and the determining module 304 determines the target cluster center and the membership value of the cluster to which the data to be detected belongs according to the decrypted first clustering result. Finally, the obtaining module 305 obtains a range interval value corresponding to the target cluster center, and the detecting module 306 detects the data to be detected through the membership value, the range interval value and the target cluster center to obtain a detection result corresponding to the data to be detected. Compared with the existing data detection scheme, the encryption module 301 encrypts the data to be detected through a homomorphic encryption algorithm, so that the server cannot acquire the specific numerical value of the data to be detected, which improves the security of data detection. In addition, the server performs clustering processing on the encrypted data based on the data acquired from different terminals, which addresses the false alarms that arise when an existing data detection scheme is confined to a data island and therefore has too little data, and thus improves the accuracy of data detection.
In addition, an embodiment of the present invention further provides an electronic device, as shown in fig. 4, which shows a schematic structural diagram of the electronic device according to the embodiment of the present invention, specifically:
the electronic device may include components such as a processor 401 of one or more processing cores, memory 402 of one or more computer-readable storage media, a power supply 403, and an input unit 404. Those skilled in the art will appreciate that the electronic device configuration shown in fig. 4 does not constitute a limitation of the electronic device and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components. Wherein:
the processor 401 is a control center of the electronic device, connects various parts of the whole electronic device by various interfaces and lines, performs various functions of the electronic device and processes data by running or executing software programs and/or modules stored in the memory 402 and calling data stored in the memory 402, thereby performing overall monitoring of the electronic device. Optionally, processor 401 may include one or more processing cores; preferably, the processor 401 may integrate an application processor, which mainly handles operating systems, user interfaces, application programs, etc., and a modem processor, which mainly handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 401.
The memory 402 may be used to store software programs and modules, and the processor 401 executes various functional applications and data processing by operating the software programs and modules stored in the memory 402. The memory 402 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data created according to use of the electronic device, and the like. Further, the memory 402 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device. Accordingly, the memory 402 may also include a memory controller to provide the processor 401 access to the memory 402.
The electronic device further comprises a power supply 403 for supplying power to the various components, and preferably, the power supply 403 is logically connected to the processor 401 through a power management system, so that functions of managing charging, discharging, and power consumption are realized through the power management system. The power supply 403 may also include any component of one or more dc or ac power sources, recharging systems, power failure detection circuitry, power converters or inverters, power status indicators, and the like.
The electronic device may further include an input unit 404, and the input unit 404 may be used to receive input numeric or character information and generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control.
Although not shown, the electronic device may further include a display unit and the like, which are not described in detail herein. Specifically, in this embodiment, the processor 401 in the electronic device loads the executable file corresponding to the process of one or more application programs into the memory 402 according to the following instructions, and the processor 401 runs the application program stored in the memory 402, thereby implementing various functions as follows:
encrypting data to be detected by a preset homomorphic encryption algorithm to obtain encrypted data, sending the encrypted data to a server so that the server can perform clustering processing on the encrypted data based on data obtained from different terminals, receiving a first clustering result returned by the server, decrypting the first clustering result by using the homomorphic encryption algorithm, determining a target clustering center and a membership value of a cluster to which the data to be detected belongs according to the decrypted first clustering result, obtaining a range interval value corresponding to the target clustering center, and detecting the data to be detected by using the membership value, the range interval value and the target clustering center to obtain a detection result corresponding to the data to be detected.
The above operations can be implemented in the foregoing embodiments, and are not described in detail herein.
According to the embodiment of the invention, the data to be detected is encrypted by a preset homomorphic encryption algorithm to obtain encrypted data, and the encrypted data is sent to a server so that the server can cluster the encrypted data based on the data obtained from different terminals. A first clustering result returned by the server is then received and decrypted by the homomorphic encryption algorithm, and a target clustering center and a membership value of the cluster to which the data to be detected belongs are determined according to the decrypted first clustering result. Finally, a range interval value corresponding to the target clustering center is obtained, and the data to be detected is detected through the membership value, the range interval value and the target clustering center to obtain a detection result corresponding to the data to be detected. Compared with the existing data detection scheme, the data detection method encrypts the data to be detected through a homomorphic encryption algorithm, so that the server cannot acquire the specific numerical value of the data to be detected, which improves the security of data detection. In addition, the server performs clustering processing on the encrypted data based on the data acquired from different terminals, which addresses the false alarms that arise when an existing data detection scheme is confined to a data island and therefore has too little data, and thus improves the accuracy of data detection.
It will be understood by those skilled in the art that all or part of the steps of the methods of the above embodiments may be performed by instructions or by associated hardware controlled by the instructions, which may be stored in a computer readable storage medium and loaded and executed by a processor.
To this end, the present invention provides a storage medium, in which a plurality of instructions are stored, and the instructions can be loaded by a processor to execute the steps in any one of the data detection methods provided by the embodiments of the present invention. For example, the instructions may perform the steps of:
encrypting data to be detected by a preset homomorphic encryption algorithm to obtain encrypted data, sending the encrypted data to a server so that the server can perform clustering processing on the encrypted data based on data obtained from different terminals, receiving a first clustering result returned by the server, decrypting the first clustering result by using the homomorphic encryption algorithm, determining a target clustering center and a membership value of a cluster to which the data to be detected belongs according to the decrypted first clustering result, obtaining a range interval value corresponding to the target clustering center, and detecting the data to be detected by using the membership value, the range interval value and the target clustering center to obtain a detection result corresponding to the data to be detected.
The above operations can be implemented in the foregoing embodiments, and are not described in detail herein.
The storage medium may include: Read-Only Memory (ROM), Random Access Memory (RAM), magnetic or optical disks, and the like.
Since the instructions stored in the storage medium can execute the steps in any data detection method provided in the embodiments of the present invention, they can achieve the beneficial effects of any such method. These effects are detailed in the foregoing embodiments and are not described here again.
The data detection method, the data detection device, the electronic device, and the storage medium according to the embodiments of the present invention are described in detail above, and a specific example is applied in the description to explain the principles and the embodiments of the present invention, and the description of the embodiments is only used to help understanding the method and the core concept of the present invention; meanwhile, for those skilled in the art, according to the idea of the present invention, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present invention.

Claims (15)

1. A method for data detection, comprising:
encrypting the data to be detected by a preset homomorphic encryption algorithm to obtain encrypted data;
sending the encrypted data to a server so that the server can perform clustering processing on the encrypted data based on data acquired from different terminals;
receiving a first clustering result returned by the server, and decrypting the first clustering result by adopting the homomorphic encryption algorithm;
determining a target clustering center and a membership value of a cluster to which the data to be detected belongs according to the decrypted first clustering result;
and acquiring a range interval value corresponding to the target clustering center, and detecting the data to be detected through the membership value, the range interval value and the target clustering center to obtain a detection result corresponding to the data to be detected.
2. The method according to claim 1, wherein the detecting the data to be detected through the membership value, the range interval value and the target clustering center to obtain a detection result corresponding to the data to be detected comprises:
extracting all clustering centers in the decrypted first clustering result;
calculating the distance between the target clustering center and each clustering center;
and detecting the data to be detected based on the membership value, the range interval value and the distance to obtain a detection result corresponding to the data to be detected.
3. The method according to claim 2, wherein the detecting the data to be detected based on the membership value, the range interval value and the distance to obtain a detection result corresponding to the data to be detected comprises:
when the distance is less than or equal to a first threshold value, judging whether the target membership value is within the range interval value of the target clustering center, and if the target membership value is within the range interval value of the target clustering center, determining that the data to be detected is normal data.
4. The method of claim 3, further comprising:
when the distance between the target clustering center and other clustering centers is larger than the first threshold value, determining that the data to be detected is abnormal data; or
when the distance between the target clustering center and other clustering centers is smaller than or equal to the first threshold value and the target membership value is not within the range interval value of the target clustering center, determining that the data to be detected is abnormal data.
5. The method according to any one of claims 1 to 4, wherein before obtaining the range interval value corresponding to the target clustering center, the method further comprises:
acquiring a sample data set, wherein the sample data set comprises a plurality of sample data with normal data condition marks;
encrypting the sample data of the sample data set by a preset homomorphic encryption algorithm to obtain an encrypted data set;
sending the encrypted data set to a server so that the server can perform clustering processing on the data in the encrypted data set based on the data acquired from different terminals;
receiving a second clustering result returned by the server, and decrypting the second clustering result by adopting the homomorphic encryption algorithm;
determining a cluster center of a cluster to which the sample data belongs and a membership value corresponding to the sample data according to the decrypted second clustering result;
calculating a range interval value of a clustering center of a cluster to which the sample data belongs according to the membership value corresponding to the sample data;
predicting the data condition of the sample data according to the membership value corresponding to the sample data, the clustering center of the cluster to which the sample data belongs and the range interval value of the clustering center of the cluster to which the sample data belongs to obtain the predicted data condition of the sample data;
according to the real data condition and the predicted data condition, adjusting the clustering center of the cluster to which the sample data belongs until the clustering center of the cluster to which the sample data belongs meets the preset condition;
saving a range interval value corresponding to the clustering center meeting the preset condition;
the obtaining of the range interval value corresponding to the target clustering center includes: and acquiring a range interval value corresponding to the target clustering center from the stored range interval values.
6. The method of claim 5, further comprising:
extracting attribute information of a clustering center corresponding to the sample data;
and constructing a mapping relation between the attribute information and the range interval value.
7. The method according to claim 6, wherein the obtaining of the range interval value corresponding to the target clustering center comprises:
extracting attribute information corresponding to the target clustering center;
and acquiring a range interval value corresponding to the target clustering center from the stored range interval values based on a preset mapping relation.
8. The method according to any one of claims 1 to 4, wherein the decrypting the first clustering result by using the homomorphic encryption algorithm comprises:
acquiring a decryption function corresponding to the data to be detected based on a preset homomorphic encryption algorithm;
and decrypting the first clustering result through a decryption function.
9. A data detection apparatus, comprising:
the encryption module is used for encrypting the data to be detected through a preset homomorphic encryption algorithm to obtain encrypted data;
the sending module is used for sending the encrypted data to a server so that the server can perform clustering processing on the encrypted data based on data obtained from different terminals;
the decryption module is used for receiving the first clustering result returned by the server and decrypting the first clustering result by adopting the homomorphic encryption algorithm;
the determining module is used for determining a target clustering center and a membership value of a cluster to which the data to be detected belongs according to the decrypted first clustering result;
the acquisition module is used for acquiring a range interval value corresponding to the target clustering center;
and the detection module is used for detecting the data to be detected through the membership value, the range interval value and the target clustering center to obtain a detection result corresponding to the data to be detected.
10. The apparatus of claim 9, wherein the detection module comprises:
the extraction unit is used for extracting all the clustering centers in the decrypted first clustering result;
the calculating unit is used for calculating the distance between the target clustering center and each clustering center;
and the detection unit is used for detecting the data to be detected based on the membership value, the range interval value and the distance to obtain a detection result corresponding to the data to be detected.
11. The apparatus according to claim 10, wherein the detection unit is specifically configured to:
when the distance is less than or equal to a first threshold value, judging whether the target membership value is within the range interval value of the target clustering center, and if the target membership value is within the range interval value of the target clustering center, determining that the data to be detected is normal data.
12. The apparatus according to claim 10, wherein the detection unit is further configured to:
when the distance between the target clustering center and other clustering centers is larger than a first threshold value, determining that the data to be detected is abnormal data; or
when the distance between the target clustering center and other clustering centers is smaller than or equal to the first threshold value and the target membership value is not within the range interval value of the target clustering center, determining that the data to be detected is abnormal data.
13. The apparatus according to any one of claims 9 to 12, further comprising a training module, the training module being configured to:
acquiring a sample data set, wherein the sample data set comprises a plurality of sample data with normal data condition marks;
encrypting the sample data of the sample data set by a preset homomorphic encryption algorithm to obtain an encrypted data set;
sending the encrypted data set to a server so that the server can perform clustering processing on the data in the encrypted data set based on the data acquired from different terminals;
receiving a second clustering result returned by the server, and decrypting the second clustering result by adopting the homomorphic encryption algorithm;
determining a cluster center of a cluster to which the sample data belongs and a membership value corresponding to the sample data according to the decrypted second clustering result;
calculating a range interval value of a clustering center of a cluster to which the sample data belongs according to the membership value corresponding to the sample data;
predicting the data condition of the sample data according to the membership value corresponding to the sample data, the clustering center of the cluster to which the sample data belongs and the range interval value of the clustering center of the cluster to which the sample data belongs to obtain the predicted data condition of the sample data;
according to the real data condition and the predicted data condition, adjusting the clustering center of the cluster to which the sample data belongs until the clustering center of the cluster to which the sample data belongs meets the preset condition;
saving a range interval value corresponding to the clustering center meeting the preset condition;
the acquisition module is specifically configured to obtain a range interval value corresponding to the target clustering center from the stored range interval values.
14. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the steps of the data detection method according to any of claims 1-8 are implemented when the program is executed by the processor.
15. A computer-readable storage medium, on which a computer program is stored, wherein the computer program, when being executed by a processor, carries out the steps of the data detection method according to any one of claims 1 to 8.
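The training-side bookkeeping described in claims 5 to 7 (computing a range interval value per clustering center from sample membership values, storing it against attribute information of the center, and looking it up later) might be sketched as below. The claims do not state how the interval is computed, so taking the minimum and maximum sample membership per cluster is an assumption, as are the function names and the use of a plain string as the attribute information. The iterative adjustment of clustering centers in claim 5 is not modeled here.

```python
from collections import defaultdict


def build_interval_table(samples):
    """Build a mapping from center attribute information to a range interval.

    samples -- iterable of (center_attribute, membership) pairs taken from
               the decrypted second clustering result (hypothetical layout)
    """
    grouped = defaultdict(list)
    for center_attribute, membership in samples:
        grouped[center_attribute].append(membership)
    # Assumption: the interval spans the observed sample memberships.
    return {attr: (min(vals), max(vals)) for attr, vals in grouped.items()}


def lookup_interval(table, center_attribute):
    """Fetch the stored interval for a target center, or None if unknown."""
    return table.get(center_attribute)


samples = [("center-0", 0.9), ("center-0", 0.7), ("center-1", 0.55)]
table = build_interval_table(samples)
print(lookup_interval(table, "center-0"))
print(lookup_interval(table, "center-2"))
```

Keying the table on attribute information rather than on raw coordinates means the lookup at detection time does not depend on exact floating-point equality of clustering centers.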
CN201911090328.6A 2019-11-08 2019-11-08 Data detection method, device, electronic equipment and storage medium Active CN110852374B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911090328.6A CN110852374B (en) 2019-11-08 2019-11-08 Data detection method, device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110852374A (en) 2020-02-28
CN110852374B (en) 2023-05-02

Family

ID=69600140

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911090328.6A Active CN110852374B (en) 2019-11-08 2019-11-08 Data detection method, device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110852374B (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030078946A1 (en) * 2001-06-05 2003-04-24 Laurie Costello Clustered filesystem
CN105069469A (en) * 2015-07-30 2015-11-18 天津师范大学 Data flow detection method based on fuzzy C-means clustering algorithm and entropy theory
CN108985361A (en) * 2018-07-02 2018-12-11 北京金睛云华科技有限公司 A kind of malicious traffic stream detection implementation method and device based on deep learning
CN109347834A (en) * 2018-10-24 2019-02-15 广东工业大学 Detection method, device and the equipment of abnormal data in Internet of Things edge calculations environment
CN109684118A (en) * 2018-12-10 2019-04-26 深圳前海微众银行股份有限公司 Detection method, device, equipment and the computer readable storage medium of abnormal data
CN109615021A (en) * 2018-12-20 2019-04-12 暨南大学 A kind of method for protecting privacy based on k mean cluster

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
SHAJINA ANAND ET AL: "EECDH to prevent MITM attack in cloud computing", 《DIGITAL COMMUNICATIONS AND NETWORKS》 *
陈利军等: "基于粒子滤波的网络入侵相频特征提取算法", 《科技通报》 *

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111931221A (en) * 2020-09-25 2020-11-13 支付宝(杭州)信息技术有限公司 Data processing method and device and server
CN114362973A (en) * 2020-09-27 2022-04-15 中国科学院软件研究所 K-means and FCM clustering combined flow detection method and electronic device
CN114362973B (en) * 2020-09-27 2023-02-28 中国科学院软件研究所 K-means and FCM clustering combined flow detection method and electronic device
CN112035671A (en) * 2020-11-05 2020-12-04 腾讯科技(深圳)有限公司 State detection method and device, computer equipment and storage medium
CN112035671B (en) * 2020-11-05 2021-02-26 腾讯科技(深圳)有限公司 State detection method and device, computer equipment and storage medium
CN112468452A (en) * 2020-11-10 2021-03-09 深圳市欢太科技有限公司 Flow detection method and device, electronic equipment and computer readable storage medium
CN113063461A (en) * 2021-03-17 2021-07-02 中旭京坤(北京)科技有限公司 Detection method based on intelligent sensor
CN113436027A (en) * 2021-06-30 2021-09-24 山大地纬软件股份有限公司 Medical insurance reimbursement abnormal data detection method and system
CN113747492A (en) * 2021-09-03 2021-12-03 四川英得赛克科技有限公司 Identification method of wireless communication equipment
CN113810493A (en) * 2021-09-16 2021-12-17 中国电信股份有限公司 Translation method, system, device and storage medium
CN117688502A (en) * 2024-02-04 2024-03-12 山东大学 Safe outsourcing calculation method and system for detecting local abnormal factors
CN117688502B (en) * 2024-02-04 2024-04-30 山东大学 Safe outsourcing calculation method and system for detecting local abnormal factors

Also Published As

Publication number Publication date
CN110852374B (en) 2023-05-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40021717; Country of ref document: HK)
GR01 Patent grant