CN112800045A - Big data-based data information analysis system - Google Patents

Big data-based data information analysis system

Info

Publication number
CN112800045A
Authority
CN
China
Prior art keywords
data
module
network
identifier
program
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110203473.1A
Other languages
Chinese (zh)
Inventor
孙昊
张文鹏
王媛媛
吕志文
王凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qingdao Haike Virtual Reality Research Institute
Original Assignee
Qingdao Haike Virtual Reality Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qingdao Haike Virtual Reality Research Institute
Priority to CN202110203473.1A
Publication of CN112800045A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/602 Providing cryptographic facilities or services
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Bioethics (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Quality & Reliability (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention belongs to the technical field of data analysis and discloses a big data-based data information analysis system comprising a data acquisition module, a data encryption module, a data transmission module, a data receiving module, a central control module, a data decryption module, a data normalization module, a data analysis module, a disordered data rejection module and a data storage module. Through the data encryption module and the data decryption module, the system encrypts the data to be analyzed before transmission and checks the credibility of the access network before transmitting, which secures data transmission and reduces the risk of data leakage. The data are normalized before analysis, which makes them easier to analyze and improves analysis efficiency. After the data analysis result is obtained, disordered data are removed, so that the analysis result is easier to apply and the resulting optimized data can be used directly.

Description

Big data-based data information analysis system
Technical Field
The invention belongs to the technical field of data analysis, and particularly relates to a data information analysis system based on big data.
Background
Big data technology is currently a focus of research across industries in China and abroad. As big data poses technical challenges worldwide, China is paying increasing attention to its practical application. In recent years, management priorities have shifted from centralization and unification toward refinement and efficiency; together with the rapid development of information technology and the wide use of digital technologies in the "Internet +" era, this has made the combination of data information analysis with digital technology a trend. Big data enables data to be integrated, analyzed and processed, and supports retrieval over massive data sets. However, existing data information analysis is complex to operate, analyzes data inefficiently, carries a risk of data leakage, and yields results that are difficult to apply directly.
In summary, the problems and defects of the prior art are: existing data information analysis is complex to operate, inefficient, and exposed to the risk of data leakage, and its results are difficult to apply directly.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a data information analysis system based on big data.
The invention is implemented as a big data-based data information analysis system comprising:
the data acquisition module is connected with the central control module and is used for acquiring the data to be analyzed through a data acquisition program;
the data encryption module is connected with the central control module and used for encrypting the data to be analyzed through a data encryption program to obtain encrypted data;
the data transmission module is connected with the central control module and is used for transmitting the encrypted data through a data transmission program;
the transmission of the encrypted data by the data transmission program includes:
carrying out data preprocessing and feature extraction on different types of network connection data;
according to the extracted features, iteration and training are carried out through a generalized regression neural network in combination with a fuzzy clustering algorithm to obtain a clustering result;
obtaining the clustering result by iteration and training through the generalized regression neural network combined with the fuzzy clustering algorithm comprises the following steps:
classifying the data according to the fuzzy clustering algorithm, and calculating the cluster center of each class;
fuzzy c-means (FCM) divides the n vectors x_k into c fuzzy classes and calculates the cluster center c_i of each class so as to minimize the fuzzy objective function;
the objective function of fuzzy clustering is:
[Formula image BDA0002949209110000021, not reproduced in this text]
where d_ij = ||c_i - x_j|| is the Euclidean distance between the sample vector and the center point, c_i is the center of the i-th class, m is the number of samples, and j is the attribute column; the calculation formula of each cluster center is:
[Formula image BDA0002949209110000022, not reproduced in this text]
calculating membership values through a membership function to form a fuzzy matrix;
the membership function is:
[Formula image BDA0002949209110000023, not reproduced in this text]
selecting training samples from the fuzzy matrix as the training input of the generalized neural network;
selecting the m samples in the fuzzy matrix with the smallest distance from the center value as training samples, and using the n x m groups of data as the training input of the generalized neural network, where n is the number of classes into which the fuzzy clustering algorithm divides the intrusion data, and m is a value between 1 and 5;
predicting and outputting the type of the intrusion data according to the training input of the generalized neural network;
subdividing the data into n classes again, and finding the samples closest to the center value of each class to use as training samples; obtaining the clustering result;
calculating credibility estimated values of corresponding classifications by setting credibility weight vectors and a network connection credibility algorithm according to the clustering result;
calculating the reliability of the network intrusion rule through an improved associated attribute judgment algorithm, and using the reliability as a basis for dynamically adjusting a rule base in an intrusion detection system;
determining whether to establish connection between the data analysis terminal and the internet according to the adjusted credibility;
the data receiving module is connected with the central control module and is used for receiving the encrypted data through a data receiving program;
and the central control module is connected with the data acquisition module, the data encryption module, the data transmission module and the data receiving module, and is used for controlling the operation of each connected module through a main control computer to ensure that each module operates normally.
Further, the big data-based data information analysis system further includes:
the data decryption module is connected with the central control module and used for decrypting the received encrypted data through a data decryption program to obtain decrypted data;
the data normalization module is connected with the central control module and used for normalizing the decrypted data through a data normalization program to obtain normalized data;
the data analysis module is connected with the central control module and used for analyzing the normalized data through a data analysis program to obtain a data analysis result;
the disordered data removing module is connected with the central control module and used for removing disordered data according to a data analysis result through a disordered data removing program to obtain optimized data;
and the data storage module is connected with the central control module and is used for storing the optimized data in a data memory.
Further, encrypting the data to be analyzed through the data encryption program to obtain the encrypted data includes: receiving a first network data frame transmitted by a first network device; acquiring a first payload from the first network data frame; acquiring an encryption identifier according to a first preset rule and judging its state; if the encryption identifier is a first identifier, encrypting the first payload to obtain ciphertext data; if the encryption identifier is a second identifier, performing first preset processing on the first network data frame according to a first configuration parameter; and transmitting a second network data frame that includes the encrypted data.
Further, obtaining the encryption identifier according to the first preset rule includes: judging whether the quintuple of the first network data frame matches a preset quintuple list of the encryption device; if the preset quintuple list contains a first quintuple that matches the quintuple of the first network data frame, setting the encryption identifier to the first identifier; otherwise, setting the encryption identifier to the second identifier.
Further, the generalized neural network has a four-layer structure consisting of an input layer, a pattern (mode) layer, a summation layer and an output layer.
Further, decrypting the received encrypted data through the data decryption program to obtain the decrypted data includes: receiving a third network data frame; acquiring a second payload from the third network data frame; acquiring a decryption identifier according to a third preset rule and judging its state; if the decryption identifier is a third identifier, decrypting the second payload to obtain plaintext data; if the decryption identifier is a fourth identifier, performing second preset processing on the third network data frame according to a second configuration parameter; and sending a fourth network data frame that includes the decrypted data to a second network device.
Further, the normalizing the decrypted data by the data normalization program to obtain normalized data includes:
taking the decrypted data as an input sample, and dividing the input sample into an original data training set and an original data testing set;
normalizing the original data by a data normalization method to form a normalized training set and a normalized test set, and labeling the two sets respectively to obtain training set labels and test set labels;
optimizing the SVM parameters on the normalized training set with an adaptive-mutation bird swarm algorithm fused with data normalization to obtain the optimal parameter pair (c, gamma), and building an SVM model with this parameter pair;
and feeding the normalized test set into the SVM model to obtain the normalized data.
Further, the normalization processing of the raw data by using the data normalization method includes:
let any data sample in the original data training set and the original data test set be y_i; after normalization, the corresponding data sample in the normalized training set and the normalized test set is:
[Formula image BDA0002949209110000051, not reproduced in this text]
where y_min and y_max denote the minimum and maximum of y_i, respectively.
It is another object of the present invention to provide a computer program product stored on a computer-readable medium, comprising a computer-readable program that, when executed on an electronic device, provides a user input interface for implementing the big data-based data information analysis system.
It is another object of the present invention to provide a computer-readable storage medium storing instructions that, when executed on a computer, cause the computer to implement the big data-based data information analysis system.
Taken together, the technical solutions above give the invention the following advantages and positive effects: through the data encryption module and the data decryption module, the system encrypts the data to be analyzed before transmission and checks the credibility of the access network before transmitting, which secures data transmission and reduces the risk of data leakage; the data are normalized before analysis, which makes them easier to analyze and improves analysis efficiency; and after the data analysis result is obtained, disordered data are removed, so that the analysis result is easier to apply and the resulting optimized data can be used directly.
Drawings
To illustrate the technical solutions of the embodiments of the present application more clearly, the drawings used in the embodiments are briefly described below. The drawings described below show only some embodiments of the present application; those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a block diagram of a big data-based data information analysis system according to an embodiment of the present invention.
Fig. 2 is a flowchart of a method for analyzing data information based on big data according to an embodiment of the present invention.
Fig. 3 is a flow chart of transmission of encrypted data by a data transmission program according to an embodiment of the present invention.
Fig. 4 is a block diagram of a generalized neural network structure provided in an embodiment of the present invention.
Fig. 5 is a flowchart of normalizing decrypted data by a data normalization program to obtain normalized data according to an embodiment of the present invention.
In the figures: 1. data acquisition module; 2. data encryption module; 3. data transmission module; 4. data receiving module; 5. central control module; 6. data decryption module; 7. data normalization module; 8. data analysis module; 9. disordered data rejection module; 10. data storage module; 11. input layer; 12. pattern (mode) layer; 13. summation layer; 14. output layer.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
In view of the problems in the prior art, the present invention provides a data information analysis system based on big data, and the present invention is described in detail below with reference to the accompanying drawings.
As shown in fig. 1, a big data-based data information analysis system provided in an embodiment of the present invention includes:
the data acquisition module 1 is connected with the central control module 5 and is used for acquiring the data to be analyzed through a data acquisition program;
the data encryption module 2 is connected with the central control module 5 and used for encrypting the data to be analyzed through a data encryption program to obtain encrypted data;
the data transmission module 3 is connected with the central control module 5 and is used for transmitting the encrypted data through a data transmission program;
the data receiving module 4 is connected with the central control module 5 and is used for receiving the encrypted data through a data receiving program;
the central control module 5 is connected with the data acquisition module 1, the data encryption module 2, the data transmission module 3, the data receiving module 4, the data decryption module 6, the data normalization module 7, the data analysis module 8, the disordered data rejection module 9 and the data storage module 10, and is used for controlling the operation of each connection module through a main control computer and ensuring the normal operation of each module;
the data decryption module 6 is connected with the central control module 5 and used for decrypting the received encrypted data through a data decryption program to obtain decrypted data;
the data normalization module 7 is connected with the central control module 5 and used for performing normalization processing on the decrypted data through a data normalization program to obtain normalized data;
the data analysis module 8 is connected with the central control module 5 and used for analyzing the normalized data through a data analysis program to obtain a data analysis result;
the disordered data removing module 9 is connected with the central control module 5 and used for removing disordered data according to a data analysis result through a disordered data removing program to obtain optimized data;
and the data storage module 10 is connected with the central control module 5 and is used for storing the optimized data in a data memory.
As shown in fig. 2, the method for analyzing data information based on big data according to the embodiment of the present invention includes the following steps:
S101, acquiring the data to be analyzed by the data acquisition module using a data acquisition program; encrypting the data to be analyzed by the data encryption module using a data encryption program to obtain encrypted data;
S102, transmitting the encrypted data by the data transmission module using a data transmission program; receiving the encrypted data by the data receiving module using a data receiving program;
S103, controlling the operation of each connected module by the central control module using a main control computer to ensure that each module operates normally; decrypting the received encrypted data by the data decryption module using a data decryption program to obtain decrypted data;
S104, normalizing the decrypted data by the data normalization module using a data normalization program to obtain normalized data;
S105, analyzing the normalized data by the data analysis module using a data analysis program to obtain a data analysis result;
S106, removing disordered data according to the data analysis result by the disordered data rejection module using a disordered data rejection program to obtain optimized data; and storing the optimized data in a data memory by the data storage module.
In the embodiment of the invention, encrypting the data to be analyzed through the data encryption program to obtain the encrypted data includes: receiving a first network data frame transmitted by a first network device; acquiring a first payload from the first network data frame; acquiring an encryption identifier according to a first preset rule and judging its state; if the encryption identifier is a first identifier, encrypting the first payload to obtain ciphertext data; if the encryption identifier is a second identifier, performing first preset processing on the first network data frame according to a first configuration parameter; and transmitting a second network data frame that includes the encrypted data.
In the embodiment of the invention, obtaining the encryption identifier according to the first preset rule includes: judging whether the quintuple of the first network data frame matches a preset quintuple list of the encryption device; if the preset quintuple list contains a first quintuple that matches the quintuple of the first network data frame, setting the encryption identifier to the first identifier; otherwise, setting the encryption identifier to the second identifier.
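The first preset rule can be illustrated with a minimal Python sketch. The names `FiveTuple`, `ENCRYPT`, `PASS_THROUGH` and `encryption_identifier` are hypothetical, and the sketch assumes the quintuple is the conventional (source IP, source port, destination IP, destination port, protocol) tuple, which the description does not spell out.

```python
from typing import NamedTuple, Set


class FiveTuple(NamedTuple):
    """Assumed quintuple layout; the patent text does not define the fields."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str


# Hypothetical labels standing in for the "first identifier" and "second identifier".
ENCRYPT = "first_identifier"
PASS_THROUGH = "second_identifier"


def encryption_identifier(frame_tuple: FiveTuple, preset: Set[FiveTuple]) -> str:
    """Return ENCRYPT if the frame's quintuple is in the preset list, else PASS_THROUGH."""
    return ENCRYPT if frame_tuple in preset else PASS_THROUGH


# Illustrative data only: one listed flow and one unlisted flow.
preset_list = {FiveTuple("10.0.0.5", 40000, "10.0.1.9", 443, "tcp")}
listed = FiveTuple("10.0.0.5", 40000, "10.0.1.9", 443, "tcp")
unlisted = FiveTuple("10.0.0.6", 40001, "10.0.1.9", 80, "tcp")
```

Storing the preset list as a set of hashable tuples keeps the match a constant-time membership test; a real device might instead need wildcard or prefix matching, which this sketch does not attempt.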
As shown in fig. 3, the transmission of encrypted data by a data transmission program according to an embodiment of the present invention includes:
S201, performing data preprocessing and feature extraction on different types of network connection data;
S202, according to the extracted features, performing iteration and training through a generalized regression neural network combined with a fuzzy clustering algorithm to obtain a clustering result;
S203, according to the clustering result, calculating the credibility estimate of each corresponding class by setting credibility weight vectors and applying a network connection credibility algorithm;
S204, calculating the reliability of the network intrusion rules through an improved associated attribute judgment algorithm, and using it as the basis for dynamically adjusting the rule base of the intrusion detection system;
S205, determining whether to establish a connection between the data analysis terminal and the Internet according to the adjusted credibility.
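The description names a credibility weight vector and a network connection credibility algorithm but does not define either. The sketch below is therefore purely illustrative: it assumes the credibility estimate is the sum of each class's share of observed connections multiplied by a per-class trust weight, and that the connection decision is a simple threshold. The function names, the weights and the 0.8 threshold are all hypothetical.

```python
from typing import Sequence


def credibility_estimate(class_shares: Sequence[float],
                         trust_weights: Sequence[float]) -> float:
    """Assumed stand-in formula: sum of share_i * weight_i over all classes.

    class_shares[i]  - fraction of observed connections assigned to class i
    trust_weights[i] - assumed trust in class i, from 0.0 (attack) to 1.0 (normal)
    """
    if len(class_shares) != len(trust_weights):
        raise ValueError("one trust weight is needed per class")
    if abs(sum(class_shares) - 1.0) > 1e-6:
        raise ValueError("class shares must sum to 1")
    return sum(s * w for s, w in zip(class_shares, trust_weights))


def should_connect(credibility: float, threshold: float = 0.8) -> bool:
    """Allow the connection only when credibility reaches the (assumed) threshold."""
    return credibility >= threshold


# Illustrative values: class 0 is treated as normal traffic, class 1 as intrusion.
mostly_normal = credibility_estimate([0.9, 0.1], [1.0, 0.0])
half_intrusion = credibility_estimate([0.5, 0.5], [1.0, 0.0])
```

In this reading, step S204 would adjust the trust weights or the threshold as the rule base changes; how that adjustment is computed is not recoverable from the text.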
In the embodiment of the invention, obtaining the clustering result by iteration and training through the generalized regression neural network combined with the fuzzy clustering algorithm comprises the following steps:
classifying the data according to the fuzzy clustering algorithm, and calculating the cluster center of each class;
fuzzy c-means (FCM) divides the n vectors x_k into c fuzzy classes and calculates the cluster center c_i of each class so as to minimize the fuzzy objective function;
the objective function of fuzzy clustering is:
[Formula image BDA0002949209110000091, not reproduced in this text]
where d_ij = ||c_i - x_j|| is the Euclidean distance between the sample vector and the center point, c_i is the center of the i-th class, m is the number of samples, and j is the attribute column; the calculation formula of each cluster center is:
[Formula image BDA0002949209110000092, not reproduced in this text]
calculating membership values through a membership function to form a fuzzy matrix;
the membership function is:
[Formula image BDA0002949209110000093, not reproduced in this text]
selecting training samples from the fuzzy matrix as the training input of the generalized neural network;
selecting the m samples in the fuzzy matrix with the smallest distance from the center value as training samples, and using the n x m groups of data as the training input of the generalized neural network, where n is the number of classes into which the fuzzy clustering algorithm divides the intrusion data, and m is a value between 1 and 5;
predicting and outputting the type of the intrusion data according to the training input of the generalized neural network;
subdividing the data into n classes again, and finding the samples closest to the center value of each class to use as training samples; and obtaining the clustering result.
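Because the patent's formulas survive only as image references, the following Python sketch uses the textbook fuzzy c-means updates rather than the patent's exact expressions: the centers are membership-weighted means, c_i = sum_j(u_ij^q * x_j) / sum_j(u_ij^q), and the memberships are u_ij = 1 / sum_k((d_ij / d_kj)^(2/(q-1))). The exponent is named `fuzzifier` (q) here to avoid clashing with the description's use of m; the function names, the default values and the sample points are assumptions.

```python
import math
import random
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def fcm(data: List[Vector], c: int, fuzzifier: float = 2.0,
        iters: int = 100, seed: int = 0) -> Tuple[List[List[float]], List[List[float]]]:
    """Textbook fuzzy c-means; returns (centers, membership matrix u[i][j])."""
    rng = random.Random(seed)
    n, dim = len(data), len(data[0])
    # Random initial memberships, normalised so each sample's memberships sum to 1.
    u = [[rng.random() + 1e-9 for _ in range(n)] for _ in range(c)]
    for j in range(n):
        col = sum(u[i][j] for i in range(c))
        for i in range(c):
            u[i][j] /= col
    centers: List[List[float]] = []
    exponent = 2.0 / (fuzzifier - 1.0)
    for _ in range(iters):
        # Cluster centers: membership-weighted means of the samples.
        centers = []
        for i in range(c):
            w = [u[i][j] ** fuzzifier for j in range(n)]
            total = sum(w)
            centers.append([sum(w[j] * data[j][k] for j in range(n)) / total
                            for k in range(dim)])
        # Memberships from the Euclidean distances d_ij = ||c_i - x_j||.
        for j in range(n):
            d = [max(math.dist(centers[i], data[j]), 1e-12) for i in range(c)]
            for i in range(c):
                u[i][j] = 1.0 / sum((d[i] / d[k]) ** exponent for k in range(c))
    return centers, u


def nearest_samples(data: List[Vector], center: Vector, m: int) -> List[int]:
    """Indices of the m samples closest to one cluster center."""
    order = sorted(range(len(data)), key=lambda j: math.dist(center, data[j]))
    return order[:m]


# Illustrative data: two well-separated groups of three points each.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centers, u = fcm(points, c=2)
training_idx = [nearest_samples(points, ctr, m=2) for ctr in centers]
```

Here `training_idx` holds, for each class, the indices of the m samples nearest its center, which corresponds to the n x m training groups passed on to the generalized neural network. A fixed iteration count is used for brevity; a tolerance on the change in the objective would be the more usual stopping rule.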
As shown in fig. 4, the generalized neural network provided by the embodiment of the invention has a four-layer structure consisting of an input layer 11, a pattern (mode) layer 12, a summation layer 13 and an output layer 14.
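The four layers can be made concrete with a short sketch of the standard generalized regression neural network forward pass, written in plain Python. The patent does not give its kernel or parameters, so the Gaussian kernel, the `spread` smoothing factor, the function name and the sample data are assumptions.

```python
import math
from typing import List, Sequence


def grnn_predict(x: Sequence[float], train_x: List[Sequence[float]],
                 train_y: List[float], spread: float = 0.5) -> float:
    """Standard GRNN forward pass; `spread` is an assumed smoothing factor."""
    # Input layer: the query vector x is passed through unchanged.
    # Pattern layer: one Gaussian unit per stored training sample.
    pattern = [math.exp(-math.dist(x, xi) ** 2 / (2.0 * spread ** 2))
               for xi in train_x]
    # Summation layer: a plain sum and a target-weighted sum of the pattern outputs.
    denominator = sum(pattern)
    numerator = sum(p * y for p, y in zip(pattern, train_y))
    # Output layer: the ratio of the two sums.
    return numerator / denominator if denominator > 0.0 else float("nan")


# Illustrative training data with hypothetical numeric class labels 1 and 2.
train_x = [(0.0, 0.1), (0.2, 0.0), (5.0, 5.1), (5.2, 4.9)]
train_y = [1.0, 1.0, 2.0, 2.0]
near_first = grnn_predict((0.1, 0.1), train_x, train_y)
near_second = grnn_predict((5.1, 5.0), train_x, train_y)
```

A GRNN has no iterative weight training: the training samples are stored in the pattern layer, so the only tunable quantity is the spread. Rounding the output to the nearest label gives a class prediction in this toy setup.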
In the embodiment of the invention, decrypting the received encrypted data through the data decryption program to obtain the decrypted data includes: receiving a third network data frame; acquiring a second payload from the third network data frame; acquiring a decryption identifier according to a third preset rule and judging its state; if the decryption identifier is a third identifier, decrypting the second payload to obtain plaintext data; if the decryption identifier is a fourth identifier, performing second preset processing on the third network data frame according to a second configuration parameter; and sending a fourth network data frame that includes the decrypted data to a second network device.
As shown in fig. 5, the normalization processing on the decrypted data by the data normalization program according to the embodiment of the present invention to obtain normalized data includes:
S301, taking the decrypted data as input samples, and dividing the input samples into an original data training set and an original data test set;
S302, normalizing the original data by a data normalization method to form a normalized training set and a normalized test set, and labeling the two sets respectively to obtain training set labels and test set labels;
S303, optimizing the SVM parameters on the normalized training set with an adaptive-mutation bird swarm algorithm fused with data normalization to obtain the optimal parameter pair (c, gamma), and building an SVM model with this parameter pair;
S304, feeding the normalized test set into the SVM model to obtain the normalized data.
In the embodiment of the invention, normalizing the original data by the data normalization method includes:
let any data sample in the original data training set and the original data test set be y_i; after normalization, the corresponding data sample in the normalized training set and the normalized test set is:
[Formula image BDA0002949209110000111, not reproduced in this text]
where y_min and y_max denote the minimum and maximum of y_i, respectively.
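The formula itself is available only as an image, but the reference to y_min and y_max suggests ordinary min-max scaling, y_i' = (y_i - y_min) / (y_max - y_min). The Python sketch below assumes that form and a target range of [0, 1]; it also assumes that y_min and y_max are taken from the training set and reused for the test set, which the description does not state. The function names and sample values are hypothetical.

```python
from typing import List, Tuple


def fit_min_max(values: List[float]) -> Tuple[float, float]:
    """Return (y_min, y_max) for a list of samples."""
    return min(values), max(values)


def min_max_scale(values: List[float], y_min: float, y_max: float) -> List[float]:
    """Assumed min-max form: map y to (y - y_min) / (y_max - y_min)."""
    span = y_max - y_min
    if span == 0:
        # All samples identical: map everything to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(y - y_min) / span for y in values]


# Illustrative single-attribute data.
train = [2.0, 4.0, 6.0, 10.0]
test = [3.0, 8.0]
y_min, y_max = fit_min_max(train)
train_scaled = min_max_scale(train, y_min, y_max)
test_scaled = min_max_scale(test, y_min, y_max)
```

Reusing the training-set bounds for the test set keeps both sets on the same scale, at the cost that test values outside the training range fall outside [0, 1].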
In the description of the present invention, "a plurality" means two or more unless otherwise specified; the terms "upper", "lower", "left", "right", "inner", "outer", "front", "rear", "head", "tail", and the like, indicate orientations or positional relationships based on the orientations or positional relationships shown in the drawings, are only for convenience in describing and simplifying the description, and do not indicate or imply that the device or element referred to must have a particular orientation, be constructed in a particular orientation, and be operated, and thus, should not be construed as limiting the invention. Furthermore, the terms "first," "second," "third," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
The above description covers only specific embodiments of the present invention and does not limit its scope of protection; any modification, equivalent replacement or improvement made within the spirit and principle of the invention falls within the scope defined by the appended claims.

Claims (10)

1. A big-data-based data information analysis system, comprising:
the data acquisition module is connected with the central control module and is used for acquiring the data to be analyzed through a data acquisition program;
the data encryption module is connected with the central control module and used for encrypting the data to be analyzed through a data encryption program to obtain encrypted data;
the data transmission module is connected with the central control module and is used for transmitting the encrypted data through a data transmission program;
the transmission of the encrypted data by the data transmission program includes:
carrying out data preprocessing and feature extraction on different types of network connection data;
according to the extracted features, iteration and training are carried out through a generalized regression neural network in combination with a fuzzy clustering algorithm to obtain a clustering result;
the clustering result obtained by iteration and training through the generalized regression neural network and the fuzzy clustering algorithm comprises the following steps:
classifying the data according to a fuzzy clustering algorithm, and calculating a clustering center of each class;
FCM divides the n vectors x_k into c fuzzy classes and calculates the cluster center c_i of each class so as to minimize the fuzzy objective function;
the objective function of fuzzy clustering is:
Figure FDA0002949209100000011
wherein d_ij = ||c_i - x_j|| is the Euclidean distance of the sample vector from the center point, c_i is the center of class i, m is the number of samples, and j is an attribute column; the calculation formula of each cluster center is as follows:
Figure FDA0002949209100000012
calculating a membership value through a membership function to form a fuzzy matrix;
the membership function is:
Figure FDA0002949209100000021
selecting a training sample from the fuzzy matrix as the training input of the generalized neural network;
selecting the m samples with the minimum distance from the central value in the fuzzy matrix as training samples, and using the resulting n x m groups of data as the training input of the generalized neural network; n is the number of classes into which the intrusion data are divided by the fuzzy clustering algorithm, and m is a value between 1 and 5;
predicting and outputting the type of intrusion data according to the training input of the generalized neural network;
the data are re-divided into n classes, and the samples closest to the central value of each class are found and used as training samples, thereby obtaining the clustering result;
calculating credibility estimated values of corresponding classifications by setting credibility weight vectors and a network connection credibility algorithm according to the clustering result;
calculating the reliability of the network intrusion rule through an improved associated attribute judgment algorithm, and using the reliability as a basis for dynamically adjusting a rule base in an intrusion detection system;
determining whether to establish connection between the data analysis terminal and the internet according to the adjusted credibility;
the data receiving module is connected with the central control module and is used for receiving the encrypted data through a data receiving program;
and the central control module is connected with the data acquisition module, the data encryption module, the data transmission module and the data receiving module and is used for controlling the operation of each connected module through the main control computer and ensuring the normal operation of each module.
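By way of illustration only, the fuzzy clustering and sample selection steps recited in claim 1 might be sketched as follows. The formula images are not reproduced in this text, so the sketch assumes the textbook fuzzy c-means updates: each center is the membership-weighted mean of the samples, and each membership is u_ij = 1 / sum_k (d_ij / d_kj)^(2/(q-1)). The fuzziness exponent name q, the function names fcm and nearest_samples, and all default values are hypothetical and are not taken from the claims.

```python
import math
import random


def fcm(points, c, q=2.0, iters=100, tol=1e-6, seed=0):
    """Textbook fuzzy c-means on a list of equal-length numeric vectors.

    q is the fuzziness exponent (q > 1), a hypothetical name chosen so that it
    does not clash with the claim's m. Returns (centers, u), where u[i][j] is
    the membership of sample j in class i (the "fuzzy matrix").
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, normalised so each column sums to 1.
    u = [[rng.random() + 1e-9 for _ in range(n)] for _ in range(c)]
    for j in range(n):
        col = sum(u[i][j] for i in range(c))
        for i in range(c):
            u[i][j] /= col
    centers = []
    for _ in range(iters):
        # Cluster centers: membership-weighted means of the samples.
        centers = []
        for i in range(c):
            w = [u[i][j] ** q for j in range(n)]
            total = sum(w)
            centers.append([sum(w[j] * points[j][d] for j in range(n)) / total
                            for d in range(dim)])
        # Membership update from the Euclidean distances d_ij = ||c_i - x_j||.
        shift = 0.0
        for j in range(n):
            dist = [math.dist(centers[i], points[j]) for i in range(c)]
            if min(dist) == 0.0:
                # Sample sits exactly on a center: give it full membership there.
                hits = [1.0 if d == 0.0 else 0.0 for d in dist]
                new = [h / sum(hits) for h in hits]
            else:
                new = [1.0 / sum((dist[i] / dist[k]) ** (2.0 / (q - 1.0))
                                 for k in range(c)) for i in range(c)]
            for i in range(c):
                shift = max(shift, abs(new[i] - u[i][j]))
                u[i][j] = new[i]
        if shift < tol:
            break
    return centers, u


def nearest_samples(points, centers, u, m):
    """For each class, return the indices of its m samples closest to the center."""
    c, n = len(centers), len(points)
    picked = []
    for i in range(c):
        # A sample belongs to the class in which its membership is largest.
        members = [j for j in range(n)
                   if max(range(c), key=lambda k: u[k][j]) == i]
        members.sort(key=lambda j: math.dist(centers[i], points[j]))
        picked.append(members[:m])
    return picked
```

Under these assumptions, the n x m indices returned by nearest_samples would identify the training input for the generalized neural network, with n classes and m samples per class.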
2. The big-data-based data information analysis system according to claim 1, further comprising:
the data decryption module is connected with the central control module and used for decrypting the received encrypted data through a data decryption program to obtain decrypted data;
the data normalization module is connected with the central control module and used for normalizing the decrypted data through a data normalization program to obtain normalized data;
the data analysis module is connected with the central control module and used for analyzing the normalized data through a data analysis program to obtain a data analysis result;
the disordered data removing module is connected with the central control module and used for removing disordered data according to a data analysis result through a disordered data removing program to obtain optimized data;
and the data storage module is connected with the central control module and is used for storing the optimized data through the data storage.
3. The big data-based data information analysis system according to claim 1, wherein encrypting the data to be analyzed through the data encryption program to obtain encrypted data comprises: receiving a first network data frame transmitted by a first network device; acquiring a first payload from the first network data frame; acquiring an encryption identifier according to a first preset rule and judging the state of the encryption identifier; if the encryption identifier is a first identifier, encrypting the first payload to obtain ciphertext data, and if the encryption identifier is a second identifier, performing first preset processing on the first network data frame according to a first configuration parameter; and transmitting a second network data frame, the second network data frame including the encrypted data.
4. The big data-based data information analysis system according to claim 3, wherein the obtaining of the encrypted identifier according to the first preset rule comprises: judging whether the quintuple of the first network data frame is matched with a preset quintuple list of the encryption equipment or not, and if a first quintuple matched with the quintuple of the first network data frame exists in the preset quintuple list of the encryption equipment, setting the encryption identifier as the first identifier; and if the first quintuple matched with the quintuple of the first network data frame does not exist in the preset quintuple list of the encryption equipment, setting the encryption identifier as the second identifier.
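By way of illustration only, the quintuple matching of claim 4 and the branch of claim 3 might be sketched as follows. The claims do not name the fields of the quintuple, the cipher, or the first preset processing, so the FiveTuple field names, the constant values, and the encrypt and preset_processing callables are all hypothetical placeholders supplied by the caller.

```python
from typing import Callable, NamedTuple, Set


class FiveTuple(NamedTuple):
    """Conventional five-tuple of a network flow (field names are assumed)."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str


FIRST_IDENTIFIER = 1   # a match was found: the payload is to be encrypted
SECOND_IDENTIFIER = 2  # no match: the frame receives the preset processing


def encryption_identifier(frame_tuple: FiveTuple,
                          preset_list: Set[FiveTuple]) -> int:
    """Set the identifier by looking the frame's quintuple up in the preset list."""
    if frame_tuple in preset_list:
        return FIRST_IDENTIFIER
    return SECOND_IDENTIFIER


def handle_frame(frame_tuple: FiveTuple,
                 payload: bytes,
                 preset_list: Set[FiveTuple],
                 encrypt: Callable[[bytes], bytes],
                 preset_processing: Callable[[bytes], bytes]) -> bytes:
    """Return the payload of the outgoing (second) network data frame."""
    if encryption_identifier(frame_tuple, preset_list) == FIRST_IDENTIFIER:
        return encrypt(payload)          # ciphertext data
    return preset_processing(payload)    # e.g. forwarding unchanged
```

Holding the preset list as a set of tuples keeps each lookup at constant expected cost, which is one reasonable choice when every received frame must be checked.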
5. The big-data-based data information analysis system according to claim 1, wherein the generalized neural network has a four-layer structure consisting of an input layer, a pattern layer, a summation layer, and an output layer.
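By way of illustration only, the four-layer structure of claim 5 might be sketched as a textbook generalized regression neural network. The claims give no kernel or smoothing parameter, so the Gaussian pattern units, the parameter name sigma, and the function name grnn_predict are assumptions.

```python
import math


def grnn_predict(train_x, train_y, x, sigma=1.0):
    """Predict an output vector for x from stored training pairs.

    train_x: list of input vectors; train_y: list of output vectors.
    sigma is the smoothing width of the Gaussian pattern units (assumed).
    """
    # Input layer: x is passed to every pattern unit unchanged.
    # Pattern layer: one Gaussian unit per stored training sample.
    pattern = [math.exp(-math.dist(x, xi) ** 2 / (2.0 * sigma ** 2))
               for xi in train_x]
    # Summation layer: one plain sum and one weighted sum per output dimension.
    denom = sum(pattern)
    out_dim = len(train_y[0])
    numer = [sum(p * yi[k] for p, yi in zip(pattern, train_y))
             for k in range(out_dim)]
    # Output layer: ratio of the weighted sums to the plain sum.
    if denom == 0.0:
        # Every kernel underflowed: fall back to the nearest stored sample.
        nearest = min(range(len(train_x)),
                      key=lambda i: math.dist(x, train_x[i]))
        return list(train_y[nearest])
    return [v / denom for v in numer]
```

In such a network, training amounts to storing the selected samples, which is consistent with using a small number of samples per class as the training input.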
6. The big data-based data information analysis system according to claim 2, wherein decrypting the received encrypted data through the data decryption program to obtain decrypted data comprises: receiving a third network data frame; acquiring a second payload from the third network data frame; acquiring a decryption identifier according to a third preset rule and judging the state of the decryption identifier; if the decryption identifier is a third identifier, decrypting the second payload to obtain plaintext data, and if the decryption identifier is a fourth identifier, performing second preset processing on the third network data frame according to a second configuration parameter; and sending a fourth network data frame to a second network device, the fourth network data frame including the decrypted data.
7. The big-data-based data information analysis system according to claim 2, wherein the normalizing the decrypted data by the data normalization program to obtain normalized data comprises:
taking the decrypted data as an input sample, and dividing the input sample into an original data training set and an original data testing set;
performing normalization processing on original data by adopting a data normalization method to form a normalized training set and a normalized test set, and respectively labeling the two sets to obtain a training set label and a test set label;
optimizing the SVM parameters on the normalized training set by adopting an adaptive-mutation bird swarm algorithm fused with data normalization to obtain an optimal parameter group (c, gamma), and establishing an SVM model by using the parameter group;
and substituting the normalized test set into the SVM model to obtain normalized data.
8. The big-data-based data information analysis system according to claim 7, wherein the normalization processing of the raw data by using the data normalization method comprises:
let any data sample in the original data training set and the original data testing set be y_i; after normalization, the corresponding data sample in the normalized training set and the normalized test set is:
Figure FDA0002949209100000041
wherein y_min and y_max represent the minimum and maximum of y_i, respectively.
9. A computer program product stored on a computer readable medium, comprising a computer readable program for providing a user input interface for applying the big data based data information analyzing system as claimed in any one of claims 1 to 8 when executed on an electronic device.
10. A computer-readable storage medium storing instructions that, when executed on a computer, cause the computer to apply the big data based data information analyzing system according to any one of claims 1 to 8.
CN202110203473.1A 2021-02-23 2021-02-23 Big data-based data information analysis system Pending CN112800045A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110203473.1A CN112800045A (en) 2021-02-23 2021-02-23 Big data-based data information analysis system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110203473.1A CN112800045A (en) 2021-02-23 2021-02-23 Big data-based data information analysis system

Publications (1)

Publication Number Publication Date
CN112800045A true CN112800045A (en) 2021-05-14

Family

ID=75815377

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110203473.1A Pending CN112800045A (en) 2021-02-23 2021-02-23 Big data-based data information analysis system

Country Status (1)

Country Link
CN (1) CN112800045A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113381995A (en) * 2021-06-08 2021-09-10 珠海格力电器股份有限公司 Data processing method and device, electronic equipment and storage medium
CN113949576A (en) * 2021-10-19 2022-01-18 中国电子科技集团公司第三十研究所 Zero network communication flow detection method and device based on mixed leakage information
CN114283910A (en) * 2022-03-04 2022-04-05 广州科犁医学研究有限公司 Clinical data acquisition and analysis system based on multi-channel information

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104539484A (en) * 2014-12-31 2015-04-22 深圳先进技术研究院 Method and system for dynamically estimating network connection reliability
CN104915560A (en) * 2015-06-11 2015-09-16 万达信息股份有限公司 Method for disease diagnosis and treatment scheme based on generalized neural network clustering
CN108270625A (en) * 2018-01-30 2018-07-10 河南质量工程职业学院 A kind of data calculating control system based on cloud service platform
CN110008914A (en) * 2019-04-11 2019-07-12 杨勇 A kind of pattern recognition system neural network based and recognition methods
CN110162968A (en) * 2019-05-20 2019-08-23 西安募格网络科技有限公司 A kind of Network Intrusion Detection System based on machine learning
CN110691074A (en) * 2019-09-20 2020-01-14 西安瑞思凯微电子科技有限公司 IPv6 data encryption method and IPv6 data decryption method
CN111628858A (en) * 2020-05-29 2020-09-04 厘壮信息科技(苏州)有限公司 Encryption and decryption system and encryption and decryption method of network security algorithm

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113381995A (en) * 2021-06-08 2021-09-10 珠海格力电器股份有限公司 Data processing method and device, electronic equipment and storage medium
CN113381995B (en) * 2021-06-08 2023-07-07 珠海格力电器股份有限公司 Data processing method and device, electronic equipment and storage medium
CN113949576A (en) * 2021-10-19 2022-01-18 中国电子科技集团公司第三十研究所 Zero network communication flow detection method and device based on mixed leakage information
CN113949576B (en) * 2021-10-19 2023-05-12 中国电子科技集团公司第三十研究所 Zero network communication flow detection method and device based on mixed leakage information
CN114283910A (en) * 2022-03-04 2022-04-05 广州科犁医学研究有限公司 Clinical data acquisition and analysis system based on multi-channel information

Similar Documents

Publication Publication Date Title
CN112800045A (en) Big data-based data information analysis system
EP3525388A2 (en) Privatized machine learning using generative adversarial networks
CN109886290B (en) User request detection method and device, computer equipment and storage medium
CN113989583A (en) Method and system for detecting malicious traffic of internet
CN110245714B (en) Image recognition method and device and electronic equipment
US11706236B2 (en) Autonomous application of security measures to IoT devices
CN115412370B (en) Vehicle communication data detection method and device, electronic equipment and readable medium
CN107729924A (en) Picture review probability interval generation method and picture review decision method
CN112785303A (en) Verification processing method and verification processing system based on block chain offline payment
CN114612011A (en) Risk prevention and control decision method and device
Elhaloui et al. Machine learning for internet of things classification using network traffic parameters
Wang et al. Statistical network protocol identification with unknown pattern extraction
CN112492591A (en) Method and device for accessing power Internet of things terminal to network
CN116847335A (en) Communication message encryption compression system based on Beidou third generation
WO2023051455A1 (en) Method and apparatus for training trust model
Bhattacharya et al. Anomalies Detection on Contemporary Industrial Internet of Things Data for Securing Crucial Devices
CN114329127B (en) Feature binning method, device and storage medium
CN112861160A (en) Data privacy protection system and protection method
CN115348198A (en) Unknown encryption protocol identification and classification method, device and medium based on feature retrieval
CN111814051B (en) Resource type determining method and device
CN113992419A (en) User abnormal behavior detection and processing system and method thereof
Farhaoui et al. Big Data and Smart Digital Environment
CN107742140B (en) Intelligent identity information identification method based on RFID technology
CN113301011A (en) Information security management system based on cloud service
Han et al. Security analysis of intelligent system based on edge computing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210514