CN113010500B - Processing method and processing system for DPI data - Google Patents

Processing method and processing system for DPI data

Info

Publication number
CN113010500B
Authority
CN
China
Prior art keywords
data
dpi
time period
dpi data
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911305426.7A
Other languages
Chinese (zh)
Other versions
CN113010500A (en)
Inventor
安翔宇
闫健儒
马奕凡
朱晨曦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianyi Cloud Technology Co Ltd
Original Assignee
Tianyi Cloud Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianyi Cloud Technology Co Ltd
Priority to CN201911305426.7A
Publication of CN113010500A
Application granted
Publication of CN113010500B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Quality & Reliability (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Complex Calculations (AREA)

Abstract

The disclosure provides a processing method and a processing system for DPI data, and relates to the field of data processing. The processing method comprises the following steps: detecting a first time period in which DPI data are missing; obtaining DPI data of a second time period adjacent to the first time period; inputting the DPI data of the second time period to a DPI data complement model unit; and generating, by the DPI data complement model unit, the missing DPI data for the first time period based on the DPI data of the second time period. The method and system complement the missing DPI data and reduce the impact of missing data on users of the data.

Description

Processing method and processing system for DPI data
Technical Field
The present disclosure relates to the field of data processing, and in particular, to a processing method and a processing system for DPI data.
Background
With the rapid growth of internet technology and data technology, large internet companies already hold data stores on the PB (petabyte) scale, with daily increments on the order of hundreds of TB (terabytes). As the raw material of data service products, data is an important asset for large companies. Ensuring data stability and availability is therefore a core task of data operations. DPI (Deep Packet Inspection) data is one very large class of such data. During data transmission, DPI data may be lost because of uncontrollable factors such as network fluctuation, resource load, or abnormal source data, which makes subsequent use of the data difficult.
Disclosure of Invention
One technical problem solved by the present disclosure is: a processing method for DPI data is provided to complement missing DPI data.
According to one aspect of the present disclosure, there is provided a processing method for deep packet inspection, DPI, data, comprising: detecting a first time period in which DPI data are missing; obtaining DPI data of a second time period adjacent to the first time period; inputting the DPI data of the second time period to a DPI data complement model unit; and the DPI data complement model unit generating missing DPI data for the first time period based on the DPI data for the second time period.
In some embodiments, prior to detecting the first period of time in which DPI data is missing, the processing method further comprises: acquiring sample DPI data of a sample time period; and inputting the sample DPI data to the DPI data complement model unit to train the DPI data complement model unit.
In some embodiments, the step of training the DPI data complement model unit includes: preprocessing the sample DPI data, and sequentially inputting the preprocessed sample DPI data into a convolution layer, a rectified linear unit layer, a pooling layer and a fully connected layer for processing to obtain feature data of the sample DPI data; inputting the feature data of the sample DPI data into a discriminator of a generative adversarial network (GAN); inputting a random value into a generator of the GAN; the generator performing a calculation on the random value to generate random feature data, and inputting the random feature data into the discriminator; the discriminator comparing the feature data of the sample DPI data with the random feature data and making a judgment to obtain a determination result; when the determination result is not within a predetermined range, the discriminator determining that the current DPI data complement model unit has not reached an optimal state and returning the determination result to the generator, so that the generator generates the next random feature data; and when the determination result is within the predetermined range, the discriminator determining that the current DPI data complement model unit has reached the optimal state.
In some embodiments, the predetermined range is 0.45 to 0.55.
In some embodiments, the step of generating random feature data by the generator comprises: the generator generating a data sequence for an initial time period based on the random value and, taking a preset time period as the incremental time period, extending the data sequence step by step until its time span equals the length of the sample time period, the resulting data sequence being the random feature data; and using a forget gate to acquire time information of the random feature data.
In some embodiments, the preprocessing comprises at least one of: removing missing values, dimensionality reduction, normalization, and vector encoding.
According to another aspect of the present disclosure, there is provided a processing system for DPI data, comprising: an acquisition unit configured to detect a first time period in which DPI data are missing, acquire DPI data of a second time period adjacent to the first time period, and input the DPI data of the second time period to a DPI data complement model unit; and the DPI data complement model unit, configured to generate the missing DPI data for the first time period based on the DPI data of the second time period.
In some embodiments, the acquisition unit is further configured to acquire sample DPI data of a sample time period and input the sample DPI data to the DPI data complement model unit; the DPI data complement model unit is further configured to be trained based on the sample DPI data.
In some embodiments, the DPI data complement model unit includes a data processing module and a generative adversarial network (GAN). The data processing module is configured to preprocess the sample DPI data, sequentially input the preprocessed sample DPI data into a convolution layer, a rectified linear unit layer, a pooling layer and a fully connected layer for processing to obtain feature data of the sample DPI data, and input the feature data of the sample DPI data into a discriminator of the GAN. The GAN comprises a generator and the discriminator. The generator is configured to receive a random value, perform a calculation on the random value to generate random feature data, and input the random feature data into the discriminator. The discriminator is configured to compare the feature data of the sample DPI data with the random feature data and make a judgment to obtain a determination result; when the determination result is not within a predetermined range, to determine that the current DPI data complement model unit has not reached an optimal state and return the determination result to the generator so that the generator generates the next random feature data; and when the determination result is within the predetermined range, to determine that the current DPI data complement model unit has reached the optimal state.
In some embodiments, the predetermined range is 0.45 to 0.55.
In some embodiments, the generator is configured to generate a data sequence for an initial time period based on the random value and, taking a preset time period as the incremental time period, to extend the data sequence step by step until its time span equals the length of the sample time period, the resulting data sequence being the random feature data, and to acquire time information of the random feature data using a forget gate.
In some embodiments, the preprocessing comprises at least one of: removing missing values, dimensionality reduction, normalization, and vector encoding.
According to another aspect of the present disclosure there is provided a processing system for DPI data comprising: a memory; and a processor coupled to the memory, the processor configured to perform the method as described above based on instructions stored in the memory.
According to another aspect of the present disclosure, there is provided a computer readable storage medium having stored thereon computer program instructions which when executed by a processor implement the steps of the method as previously described.
In the processing method, a first time period in which DPI data are missing is detected; DPI data of a second time period adjacent to the first time period are obtained; the DPI data of the second time period are input to a DPI data complement model unit; and the DPI data complement model unit generates the missing DPI data for the first time period based on the DPI data of the second time period. The processing method thereby complements the missing DPI data and reduces the impact of missing data on users of the data.
Other features of the present disclosure and its advantages will become apparent from the following detailed description of exemplary embodiments of the disclosure, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description, serve to explain the principles of the disclosure.
The disclosure may be more clearly understood from the following detailed description taken in conjunction with the accompanying drawings in which:
Fig. 1 is a flow chart illustrating a method of processing DPI data according to some embodiments of the present disclosure;
Fig. 2 is a schematic diagram illustrating missing DPI data according to some embodiments of the present disclosure;
Fig. 3 is a flow chart illustrating a method of training a DPI data complement model unit according to some embodiments of the present disclosure;
Fig. 4 is a schematic diagram illustrating a structure of a processing system for DPI data according to some embodiments of the present disclosure;
Fig. 5 is a schematic diagram illustrating a configuration of a processing system for DPI data according to further embodiments of the present disclosure;
Fig. 6 is a schematic diagram illustrating a configuration of a processing system for DPI data according to further embodiments of the present disclosure.
Detailed Description
Various exemplary embodiments of the present disclosure will now be described in detail with reference to the accompanying drawings. It should be noted that: the relative arrangement of the components and steps, numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present disclosure unless it is specifically stated otherwise.
Meanwhile, it should be understood that the sizes of the respective parts shown in the drawings are not drawn in actual scale for convenience of description.
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the disclosure, its application, or uses.
Techniques, methods, and apparatus known to one of ordinary skill in the relevant art may not be discussed in detail, but are intended to be part of the specification where appropriate.
In all examples shown and discussed herein, any specific values should be construed as merely illustrative, and not a limitation. Thus, other examples of the exemplary embodiments may have different values.
It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further discussion thereof is necessary in subsequent figures.
Fig. 1 is a flow chart illustrating a method of processing DPI data according to some embodiments of the present disclosure. As shown in fig. 1, the processing method may include steps S102 to S108.
In step S102, a first period of time in which DPI data is missing is detected.
Fig. 2 is a schematic diagram illustrating missing DPI data according to some embodiments of the present disclosure. For example, as shown in fig. 2, within the DPI data for a span of time, the DPI data for a first time period is missing, and this first time period can be detected.
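The disclosure does not say how the missing period is detected. As a minimal sketch, assuming DPI data arrives in daily partitions and that a day with no partition counts as missing, the detection in step S102 could look like the following. The function name `find_missing_periods` and the daily granularity are assumptions made for illustration only.

```python
from datetime import date, timedelta


def find_missing_periods(present_days, start, end):
    """Return (first_day, last_day) pairs for every gap in [start, end], inclusive.

    `present_days` is any iterable of dates for which DPI data exists
    (hypothetical input; the patent does not define the data layout).
    """
    present = set(present_days)
    gaps = []
    gap_start = None
    day = start
    while day <= end:
        if day not in present:
            if gap_start is None:
                gap_start = day          # a new gap begins here
        elif gap_start is not None:
            gaps.append((gap_start, day - timedelta(days=1)))
            gap_start = None             # the gap closed on the previous day
        day += timedelta(days=1)
    if gap_start is not None:            # gap runs to the end of the range
        gaps.append((gap_start, end))
    return gaps


# Ten days of data in which 5 and 6 January are missing.
days = [date(2020, 1, d) for d in range(1, 11) if d not in (5, 6)]
gaps = find_missing_periods(days, date(2020, 1, 1), date(2020, 1, 10))
```

Each returned pair corresponds to one "first time period" in the sense of the method.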
Returning to fig. 1, in step S104, DPI data for a second time period adjacent to the first time period is acquired.
For example, as shown in fig. 2, the second time period with DPI data is adjacent to the first time period. In some embodiments, as shown in fig. 2, the second time period may precede the first time period. In other embodiments, the second time period may be subsequent to the first time period. In other embodiments, the second time period may be on both sides of the first time period, i.e., the second time period may be divided into two portions: one part is before the first period of time and the other part is after the first period of time. In either case, the second time period is adjacent to the first time period. In this step, DPI data for the second period of time may be acquired.
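The three placements of the second time period (before, after, or on both sides of the gap) can be expressed with one helper. This is a sketch under the assumption of daily granularity; `adjacent_period`, `days_before`, and `days_after` are hypothetical names, not terms from the disclosure.

```python
from datetime import date, timedelta


def adjacent_period(gap_start, gap_end, days_before=30, days_after=0):
    """Return the days of the second time period as (before, after) lists.

    days_after=0 gives a window that only precedes the gap,
    days_before=0 gives one that only follows it,
    and non-zero values for both give a window on both sides.
    """
    before = [gap_start - timedelta(days=i) for i in range(days_before, 0, -1)]
    after = [gap_end + timedelta(days=i) for i in range(1, days_after + 1)]
    return before, after


# A one-day gap on 31 January with 30 days before and 2 days after it.
before, after = adjacent_period(date(2020, 1, 31), date(2020, 1, 31),
                                days_before=30, days_after=2)
```

In every case the returned days touch the gap directly, which is what "adjacent" requires.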
In step S106, DPI data of the second period is input to the DPI data complement model unit. The DPI data complement model unit is a model that has been trained using sample data.
In step S108, the DPI data complement model unit generates missing DPI data for the first period based on the DPI data for the second period.
For example, where the first time period is one day and the second time period is the 30 days preceding that day, the DPI data complement model unit may generate the missing DPI data for that day based on the DPI data of the preceding 30 days.
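Steps S102 to S108 can be sketched end to end as follows. The model here is deliberately a placeholder: `mean_model` simply averages the window and stands in for the trained DPI data complement model unit, which the disclosure builds from a GAN. Representing each day as a single float and a missing day as `None` is also an assumption made to keep the sketch short.

```python
def complement_missing(series, model, window=30):
    """Fill None entries of a daily series using the preceding `window` days.

    `model` is any callable mapping a list of known values to one value.
    """
    filled = list(series)
    for i, value in enumerate(filled):
        if value is None:                                   # S102: missing day
            start = max(0, i - window)
            history = [v for v in filled[start:i] if v is not None]  # S104
            if history:
                filled[i] = model(history)                  # S106 and S108
    return filled


def mean_model(history):
    """Placeholder for the trained model: the average of the window."""
    return sum(history) / len(history)


series = [10.0, 12.0, None, 14.0]
filled = complement_missing(series, mean_model)
```

Note that a day filled earlier in the loop becomes part of the history for later gaps, and a gap with no preceding data is left as `None`; both are choices of this sketch rather than requirements of the method.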
Thus far, a processing method for DPI data according to some embodiments of the present disclosure is provided. The processing method comprises the following steps: detecting a first time period in which DPI data are missing; obtaining DPI data for a second time period adjacent to the first time period; inputting DPI data of a second time period to a DPI data complement model unit; and the DPI data complement model unit generating missing DPI data for the first time period based on the DPI data for the second time period. The processing method realizes the complement of the missing DPI data and reduces the influence of the data missing when the user uses the data.
The processing method makes DPI data more resistant to fluctuation and provides technical data for back-end users. The method is based on mining and analysis of a large amount of telecom access data, complements data in multiple dimensions, and reduces the impact of missing data on users of the data.
In some embodiments, before step S102, the processing method may further include: acquiring sample DPI data of a sample time period; and inputting the sample DPI data to the DPI data complement model unit to train the DPI data complement model unit. Through training, a DPI data complement model unit that has reached the optimal state can be obtained, which facilitates complementing the missing DPI data.
Fig. 3 is a flow chart illustrating a method of training a DPI data complement model unit according to some embodiments of the disclosure. The process of training the DPI data complement model unit is described in detail below in connection with FIG. 3. As shown in fig. 3, the method may include steps S302 to S314.
In step S302, the sample DPI data is preprocessed, and the preprocessed sample DPI data is sequentially input to a convolution layer, a rectified linear unit (ReLU) layer, a pooling layer, and a fully connected layer for processing, so as to obtain feature data of the sample DPI data. For example, the sample DPI data may be DPI data for 30 days (the sample time period). In some embodiments, the sample DPI data may be embodied in the form of a data matrix.
In some embodiments, the preprocessing may include at least one of: removing missing values, dimensionality reduction, normalization, and vector encoding. These preprocessing steps can be performed in ways known to those skilled in the art and are therefore not described in detail here.
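Three of the four preprocessing operations can be illustrated in a few lines. This is a sketch only: the record fields `bytes` and `protocol` are hypothetical, min-max scaling is one common form of normalization, one-hot encoding is one common form of vector encoding, and dimensionality reduction is omitted because the disclosure names no specific technique.

```python
def drop_missing(rows):
    """Remove records in which any field is missing (None)."""
    return [r for r in rows if all(v is not None for v in r.values())]


def min_max_normalize(values):
    """Scale values linearly into [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(categories):
    """Encode each category as a 0/1 vector over the sorted vocabulary."""
    vocab = sorted(set(categories))
    index = {c: i for i, c in enumerate(vocab)}
    vectors = [[1 if index[c] == j else 0 for j in range(len(vocab))]
               for c in categories]
    return vectors, vocab


rows = [
    {"bytes": 100, "protocol": "http"},
    {"bytes": None, "protocol": "dns"},   # dropped: missing value
    {"bytes": 300, "protocol": "dns"},
    {"bytes": 200, "protocol": "http"},
]
clean = drop_missing(rows)
scaled = min_max_normalize([r["bytes"] for r in clean])
encoded, vocab = one_hot([r["protocol"] for r in clean])
```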
The convolution layer, the ReLU layer, the pooling layer and the fully connected layer are described below, respectively.
Convolution layer: the parameters of a convolution layer consist of a set of learnable filters, each of which is small spatially (e.g., in width and height) but has a depth equal to the depth of the input data.
ReLU layer: the ReLU layer applies an activation function that increases the nonlinear separation capability of the network.
Pooling layer: a pooling layer is typically inserted periodically between convolution layers. It gradually reduces the spatial size of the data volume, which reduces the number of parameters in the network, lowers the consumption of computational resources, and helps control overfitting.
Fully connected layer: each neuron of the fully connected layer is connected to all neurons of the previous layer, whereas a neuron in a convolution layer of a convolutional neural network (CNN) is connected only to a local region of the input data, and the neurons in each depth slice of its output share parameters.
The above-described convolution, ReLU, pooling and fully connected layers are known to those skilled in the art, and thus their specific functions or operations are not described in detail herein.
Through this step S302, feature data of the sample DPI data can be obtained. The characteristic data may represent the primary information of the sample DPI data. For example, the characteristic data may be embodied in the form of a data matrix.
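The order of the four layers in step S302 can be shown with a small pure-Python forward pass. This is a sketch under simplifying assumptions: the input is a one-dimensional series rather than a data matrix, and the kernel, weights, and bias are fixed illustrative numbers, whereas in a real network they would be learned.

```python
def conv1d(x, kernel):
    """Valid 1-D convolution (cross-correlation, as in most CNN libraries)."""
    k = len(kernel)
    return [sum(x[i + j] * kernel[j] for j in range(k))
            for i in range(len(x) - k + 1)]


def relu(x):
    """Rectified linear unit: negative values become zero."""
    return [max(0.0, v) for v in x]


def max_pool(x, size=2):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(x[i:i + size]) for i in range(0, len(x) - size + 1, size)]


def dense(x, weights, bias):
    """Fully connected layer: every output sees every input."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def extract_features(x, kernel, weights, bias):
    """Convolution, then ReLU, then pooling, then the fully connected layer."""
    return dense(max_pool(relu(conv1d(x, kernel))), weights, bias)


x = [1.0, 2.0, 4.0, 3.0, 1.0]          # illustrative preprocessed input
kernel = [1.0, -1.0]                    # illustrative filter
weights = [[1.0, 1.0], [1.0, -1.0]]     # illustrative dense weights
bias = [0.0, 0.5]
features = extract_features(x, kernel, weights, bias)
```

Here the convolution yields `[-1, -2, 1, 2]`, ReLU zeroes the negatives, pooling keeps `[0, 2]`, and the dense layer maps that to the two-element feature vector.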
In step S304, the feature data of the sample DPI data is input to the discriminator of a GAN (Generative Adversarial Network).
The GAN may include a discriminator D and a generator G. For example, the generator G and the discriminator D may each be implemented as a network composed of LSTM (Long Short-Term Memory) units. In this step, the feature data of the sample DPI data is input into the discriminator D.
In step S306, a random value is input into the generator of GAN.
For example, a random value z may be generated using a known algorithm and input into the generator G of GAN.
In step S308, the generator generates random feature data, which is input into the discriminator.
For example, the generator G may perform a calculation on the random value z to generate random feature data, which is input into the discriminator D.
In some embodiments, the step of generating random feature data by the generator may comprise: the generator generating a data sequence for an initial time period based on the random value and, taking a preset time period as the incremental time period, extending the data sequence step by step until its time span equals the length of the sample time period, the resulting data sequence being the random feature data; and acquiring the time information of the random feature data using a forget gate.
For example, the generator G first generates a data sequence for day 1. Taking one day as the preset incremental time period, the generator then extends the sequence one day at a time, for example to 2 days, 3 days, and so on, using a known algorithm, until the data sequence covers 30 days (the sample time period). The 30-day data sequence is the random feature data, and its time information is acquired using a known forget-gate technique.
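The day-by-day growth of the sequence can be sketched with a single scalar LSTM cell, in which the forget gate decides how much of the previous cell state carries over to the next day. The disclosure only states that the generator may be built from LSTM units and uses a forget gate; the scalar state, the fixed weights, and the choice to feed each generated day back in as the next input are all assumptions of this sketch.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def lstm_step(x, h, c, w):
    """One step of a scalar LSTM cell; `w` maps gate name to (w_x, w_h, bias)."""
    def gate(name, squash):
        w_x, w_h, b = w[name]
        return squash(w_x * x + w_h * h + b)

    f = gate("forget", sigmoid)       # share of the old cell state to keep
    i = gate("input", sigmoid)        # share of the new candidate to admit
    g = gate("candidate", math.tanh)  # new candidate content
    o = gate("output", sigmoid)       # share of the cell state to expose
    c = f * c + i * g
    h = o * math.tanh(c)
    return h, c


def generate_sequence(z, length=30, weights=None):
    """Grow a sequence one day at a time, starting from the random value z."""
    names = ("forget", "input", "candidate", "output")
    w = weights or {name: (0.5, 0.5, 0.0) for name in names}  # illustrative
    h, c = 0.0, 0.0
    x = z
    sequence = []
    while len(sequence) < length:     # one incremental time period per pass
        h, c = lstm_step(x, h, c, w)
        sequence.append(h)
        x = h                         # the newest day feeds the next step
    return sequence


rng = random.Random(0)
z = rng.gauss(0.0, 1.0)
fake = generate_sequence(z)           # 30 values, one per day
```

With trained weights in place of the illustrative ones, the same loop would yield the random feature data that is passed to the discriminator.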
In step S310, the discriminator compares and determines the feature data of the sample DPI data with the random feature data to obtain a determination result, and determines whether the determination result is within a predetermined range.
For example, the discriminator may compare the characteristic data of the sample DPI data with the random characteristic data, and may determine by a known determination method, thereby obtaining a determination result, and determine whether the determination result is within a predetermined range. If yes, the process advances to step S314; otherwise the process advances to step S312.
In some embodiments, the predetermined range may be 0.45 to 0.55.
In step S312, when the determination result is not within the predetermined range, the discriminator determines that the current DPI data complement model unit has not reached the optimal state, and returns the determination result to the generator. This causes the generator to generate the next random feature data (e.g., random feature data generated from another random value). The generator inputs the next random feature data into the discriminator; the discriminator then compares the feature data of the sample DPI data with the next random feature data to obtain the next determination result, and this repeats until the determination result is within the predetermined range.
In step S314, when the determination result is within the predetermined range, the discriminator determines that the current DPI data complement model unit has reached the optimal state.
Here, the optimal state means that the DPI data complement model unit can be used to perform a complement operation on the missing DPI data and that the complement DPI data is very close to the missing real DPI data (i.e. the difference is within an acceptable range).
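The control flow of steps S306 to S314 can be sketched as a loop that stops once the determination result falls within 0.45 to 0.55. Everything inside the loop is a toy stand-in: `discriminator` scores candidates by how close their mean is to the real mean, and `ToyGenerator` adjusts a single number in response to the returned score. Neither is the LSTM-based network of the disclosure; only the stopping rule and the feedback path are taken from the text.

```python
import math
import random

LOW, HIGH = 0.45, 0.55   # predetermined range from the text


def discriminator(real, fake):
    """Toy score: 0.5 when the means match, falling toward 0 as they diverge."""
    gap = abs(sum(real) / len(real) - sum(fake) / len(fake))
    return 0.5 * math.exp(-gap)


class ToyGenerator:
    def __init__(self, rng):
        self.rng = rng
        self.level = 0.0
        self.step = 1.0
        self.last_score = None

    def generate(self, n):
        # S306/S308: draw random values and turn them into candidate features.
        return [self.level + 0.01 * self.rng.uniform(-1.0, 1.0)
                for _ in range(n)]

    def feedback(self, score):
        # S312: use the returned result to prepare the next attempt.
        if self.last_score is not None and score < self.last_score:
            self.step = -self.step / 2.0   # got worse: reverse, smaller step
        self.last_score = score
        self.level += self.step


def train(real, generator, max_rounds=1000):
    for round_no in range(1, max_rounds + 1):
        fake = generator.generate(len(real))
        score = discriminator(real, fake)      # S310: compare and judge
        if LOW <= score <= HIGH:               # S314: optimal state reached
            return round_no, score
        generator.feedback(score)              # S312: return the result
    raise RuntimeError("did not reach the predetermined range")


rounds, score = train([4.0, 5.0, 6.0], ToyGenerator(random.Random(0)))
```

The `max_rounds` cap is an addition of this sketch; the disclosure describes the loop as continuing until the result is within the range.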
Thus, a method of training a DPI data complement model unit in accordance with some embodiments of the present disclosure is provided. The method comprises the following steps: preprocessing the sample DPI data, and sequentially inputting the preprocessed sample DPI data into a convolution layer, a rectified linear unit layer, a pooling layer and a fully connected layer for processing to obtain feature data of the sample DPI data; inputting the feature data of the sample DPI data into the discriminator of the GAN; inputting a random value into the generator of the GAN; the generator performing a calculation on the random value to generate random feature data, and inputting the random feature data into the discriminator; the discriminator comparing the feature data of the sample DPI data with the random feature data and making a judgment to obtain a determination result; when the determination result is not within the predetermined range, the discriminator determining that the current DPI data complement model unit has not reached the optimal state and returning the determination result to the generator, so that the generator generates the next random feature data; and when the determination result is within the predetermined range, the discriminator determining that the current DPI data complement model unit has reached the optimal state.
For example, the discriminator D compares the feature data of the processed sample DPI data with the random feature data generated by the generator G. When the determination result D(G(z)) is approximately 0.5, the model has reached the optimal state, that is, the data generated by the generator differs little from the real data. The generator can then be used to generate the missing DPI data for a given time period and so complete the data, which makes the data resistant to fluctuation.
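The value 0.5 is consistent with the standard GAN formulation, which the disclosure does not write out. Assuming that standard minimax objective, with p_data the distribution of real feature data, p_z the distribution of the random value z, and p_g the distribution of the generator's output, the optimal discriminator outputs 1/2 everywhere exactly when the generated distribution matches the real one:

```latex
\min_G \max_D V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_z}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]

D^{*}(x) = \frac{p_{\text{data}}(x)}{p_{\text{data}}(x) + p_g(x)},
\qquad
p_g = p_{\text{data}} \;\Longrightarrow\; D^{*}(x) = \tfrac{1}{2}
```

Under this reading, the range 0.45 to 0.55 acts as a tolerance band around the equilibrium value.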
Through training the DPI data complement model unit, the DPI data complement model unit can realize the complement operation of the missing DPI data.
The DPI data complement model unit differs from some existing algorithm models, such as the K-means clustering algorithm. The model of the embodiments of the present disclosure is closer to actual application scenarios: it adds the ability to extract hidden features, to capture the timing information of long-sequence dependencies, and to generate missing data adversarially. The model is applied in a multi-path stream-processing cleaning platform for big data, where it serves as the platform's core algorithm model and gives the platform system resistance to data fluctuation.
In some embodiments, the DPI data complement model unit may be continuously trained using the sample DPI data of a sample period (e.g., 30 days) before the current day, so that the calculation result of the DPI data complement model unit may be kept as close to the real data as possible.
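Selecting the rolling sample for continuous training can be sketched as follows, assuming daily records keyed by date. The name `training_window` and the decision to skip days with no record are assumptions of this sketch.

```python
from datetime import date, timedelta


def training_window(records, today, sample_days=30):
    """Return (day, record) pairs for the `sample_days` days before `today`.

    Results are ordered oldest first; days with no record are skipped.
    """
    window = []
    for offset in range(sample_days, 0, -1):
        day = today - timedelta(days=offset)
        if day in records:
            window.append((day, records[day]))
    return window


# Forty days of illustrative records starting on 1 January 2020.
records = {date(2020, 1, 1) + timedelta(days=i): float(i) for i in range(40)}
sample = training_window(records, date(2020, 2, 5))
```

Calling the helper once per day with the new current date moves the window forward, so the model is always retrained on the most recent sample period.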
Figure 4 is a schematic diagram illustrating a structure of a processing system for DPI data according to some embodiments of the present disclosure. As shown in fig. 4, the processing system may include an acquisition unit 410 and a DPI data complement model unit 420.
The acquisition unit 410 is configured to detect a first time period in which DPI data is missing, acquire DPI data of a second time period adjacent to the first time period, and input the DPI data of the second time period to the DPI data complement model unit 420.
The DPI data complement model unit 420 is configured to generate missing DPI data for the first time period based on DPI data for the second time period.
Thus far, a processing system for DPI data according to some embodiments of the present disclosure is provided. In the processing system, an acquisition unit is used for detecting a first time period in which DPI data is missing, acquiring DPI data of a second time period adjacent to the first time period, and inputting the DPI data of the second time period to a DPI data complement model unit; the DPI data complement model unit is configured to generate missing DPI data for the first time period based on the DPI data for the second time period. The processing system realizes the complement of the missing DPI data and reduces the influence of the data missing when the user uses the data.
In some embodiments, the acquisition unit 410 may also be configured to acquire sample DPI data for a sample time period and input the sample DPI data to the DPI data complement model unit 420. The DPI data complement model unit 420 may also be configured to be trained based on the sample DPI data.
In some embodiments, as shown in fig. 4, DPI data complement model unit 420 may include a data processing module 421 and a GAN 422.
The data processing module 421 is configured to preprocess the sample DPI data, sequentially input the preprocessed sample DPI data to a convolution layer, a rectified linear unit layer, a pooling layer, and a fully connected layer for processing so as to obtain feature data of the sample DPI data, and input the feature data of the sample DPI data into the discriminator 4222 of the GAN 422. For example, the preprocessing may include at least one of: removing missing values, dimensionality reduction, normalization, and vector encoding.
GAN 422 may include a generator 4221 and a discriminator 4222.
The generator 4221 is configured to receive the random value, perform a calculation on the random value to generate random feature data, and input the random feature data into the discriminator 4222.
The discriminator 4222 is configured to compare and determine the feature data of the sample DPI data with the random feature data to obtain a determination result; when the determination result is not within the predetermined range, it is determined that the current DPI data complement model unit 420 does not reach the optimal state, and the determination result is returned to the generator 4221, so that the generator 4221 generates the next random feature data; when the determination result is within the predetermined range, it is determined that the current DPI data complement model unit 420 reaches an optimal state.
In some embodiments, the predetermined range may be 0.45 to 0.55.
In some embodiments, the generator 4221 may be configured to generate a data sequence for an initial time period based on the random value and, taking a preset time period as the incremental time period, to extend the data sequence step by step until its time span equals the length of the sample time period, the resulting data sequence being the random feature data, and to acquire the time information of the random feature data using a forget gate.
Figure 5 is a schematic diagram illustrating a configuration of a processing system for DPI data according to further embodiments of the present disclosure. The processing system includes a memory 510 and a processor 520. Wherein:
Memory 510 may be a magnetic disk, flash memory, or any other non-volatile storage medium. The memory is used to store instructions in the embodiments corresponding to fig. 1 and/or 3.
Processor 520 is coupled to memory 510 and may be implemented as one or more integrated circuits, such as a microprocessor or microcontroller. The processor 520 is configured to execute the instructions stored in the memory, thereby completing the missing DPI data and reducing the impact of missing data on users of the data.
In some embodiments, the processing system 600 may also include a memory 610 and a processor 620, as shown in FIG. 6. Processor 620 is coupled to memory 610 through BUS 630. The processing system 600 may also be coupled to external storage 650 via a storage interface 640 for invoking external data, and may also be coupled to a network or another computer system (not shown) via a network interface 660, not described in detail herein.
In this embodiment, the memory stores the instructions and the processor executes them, so that the missing DPI data is completed and the impact of missing data on users of the data is reduced.
In other embodiments, the present disclosure also provides a computer-readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the steps of the methods of the corresponding embodiments of fig. 1 and/or 3. It will be apparent to those skilled in the art that embodiments of the present disclosure may be provided as a method, apparatus, or computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable non-transitory storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
The present disclosure is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Thus far, the present disclosure has been described in detail. In order to avoid obscuring the concepts of the present disclosure, some details known in the art are not described. How to implement the solutions disclosed herein will be fully apparent to those skilled in the art from the above description.
The methods and systems of the present disclosure may be implemented in a number of ways. For example, the methods and systems of the present disclosure may be implemented by software, hardware, firmware, or any combination of software, hardware, firmware. The above-described sequence of steps for the method is for illustration only, and the steps of the method of the present disclosure are not limited to the sequence specifically described above unless specifically stated otherwise. Furthermore, in some embodiments, the present disclosure may also be implemented as programs recorded in a recording medium, the programs including machine-readable instructions for implementing the methods according to the present disclosure. Thus, the present disclosure also covers a recording medium storing a program for executing the method according to the present disclosure.
Although some specific embodiments of the present disclosure have been described in detail by way of example, it should be understood by those skilled in the art that the above examples are for illustration only and are not intended to limit the scope of the present disclosure. It will be appreciated by those skilled in the art that modifications may be made to the above embodiments without departing from the scope and spirit of the disclosure. The scope of the present disclosure is defined by the appended claims.

Claims (10)

1. A processing method for deep packet inspection, DPI, data, comprising:
detecting a first time period in which DPI data are missing;
obtaining DPI data of a second time period adjacent to the first time period;
inputting the DPI data of the second time period to a DPI data complement model unit; and
generating, by the DPI data complement model unit, missing DPI data of the first time period based on the DPI data of the second time period;
the processing method further comprising:
before the first time period in which DPI data are missing is detected, obtaining sample DPI data of a sample time period; and
inputting the sample DPI data to the DPI data complement model unit to train the DPI data complement model unit;
wherein the step of training the DPI data complement model unit comprises:
preprocessing the sample DPI data, and sequentially inputting the preprocessed sample DPI data into a convolution layer, a rectified linear unit (ReLU) layer, a pooling layer, and a fully connected layer for processing to obtain feature data of the sample DPI data;
inputting the feature data of the sample DPI data into a discriminator of a generative adversarial network (GAN);
inputting a random value into a generator of the GAN;
performing, by the generator, calculation on the random value to generate random feature data, and inputting the random feature data into the discriminator; and
comparing and judging, by the discriminator, the feature data of the sample DPI data and the random feature data to obtain a determination result;
when the determination result is not within a predetermined range, determining, by the discriminator, that the current DPI data complement model unit has not reached an optimal state, and returning the determination result to the generator so that the generator generates next random feature data;
and when the determination result is within the predetermined range, determining, by the discriminator, that the current DPI data complement model unit has reached the optimal state.
2. The processing method according to claim 1, wherein,
the predetermined range is 0.45 to 0.55.
3. The processing method of claim 1, wherein the step of generating random feature data by the generator comprises:
generating, by the generator, a data sequence of an initial time period based on the random value, extending the data sequence step by step in increments of a preset time period until the time span of the data sequence equals the length of the sample time period, the resulting data sequence being the random feature data, and using a forget gate to acquire time information of the random feature data.
4. The processing method according to claim 1, wherein,
the preprocessing comprises at least one of missing-value removal, dimension reduction, normalization, and vector encoding.
5. A processing system for DPI data, comprising:
a DPI data acquisition unit configured to detect a first time period in which DPI data are missing, acquire DPI data of a second time period adjacent to the first time period, and input the DPI data of the second time period to a DPI data complement model unit; and
the DPI data complement model unit, configured to generate missing DPI data of the first time period based on the DPI data of the second time period;
wherein the DPI data acquisition unit is further configured to acquire sample DPI data of a sample time period and input the sample DPI data to the DPI data complement model unit;
the DPI data complement model unit is further configured to train based on the sample DPI data;
the DPI data complement model unit comprises:
a data processing module configured to preprocess the sample DPI data, input the preprocessed sample DPI data sequentially into a convolution layer, a rectified linear unit (ReLU) layer, a pooling layer, and a fully connected layer for processing to obtain feature data of the sample DPI data, and input the feature data of the sample DPI data into a discriminator of a generative adversarial network (GAN); and
the GAN, comprising a generator and the discriminator; wherein,
the generator is configured to receive a random value, perform calculation on the random value to generate random feature data, and input the random feature data into the discriminator;
the discriminator is configured to compare and judge the feature data of the sample DPI data and the random feature data to obtain a determination result; when the determination result is not within a predetermined range, determine that the current DPI data complement model unit has not reached an optimal state, and return the determination result to the generator so that the generator generates next random feature data; and when the determination result is within the predetermined range, determine that the current DPI data complement model unit has reached the optimal state.
6. The processing system of claim 5, wherein,
the predetermined range is 0.45 to 0.55.
7. The processing system of claim 5, wherein,
the generator is configured to generate a data sequence of an initial time period based on the random value, extend the data sequence step by step in increments of a preset time period until the time span of the data sequence equals the length of the sample time period, the resulting data sequence being the random feature data, and use a forget gate to acquire time information of the random feature data.
8. The processing system of claim 5, wherein,
the preprocessing comprises at least one of missing-value removal, dimension reduction, normalization, and vector encoding.
9. A processing system for DPI data, comprising:
a memory; and
a processor coupled to the memory, the processor configured to perform the method of any of claims 1-4 based on instructions stored in the memory.
10. A computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the steps of the method of any of claims 1 to 4.
CN201911305426.7A 2019-12-18 2019-12-18 Processing method and processing system for DPI data Active CN113010500B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911305426.7A CN113010500B (en) 2019-12-18 2019-12-18 Processing method and processing system for DPI data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911305426.7A CN113010500B (en) 2019-12-18 2019-12-18 Processing method and processing system for DPI data

Publications (2)

Publication Number Publication Date
CN113010500A CN113010500A (en) 2021-06-22
CN113010500B true CN113010500B (en) 2024-06-14

Family

ID=76381114

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911305426.7A Active CN113010500B (en) 2019-12-18 2019-12-18 Processing method and processing system for DPI data

Country Status (1)

Country Link
CN (1) CN113010500B (en)

Also Published As

Publication number Publication date
CN113010500A (en) 2021-06-22

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20220126

Address after: 100007 room 205-32, floor 2, building 2, No. 1 and No. 3, qinglonghutong a, Dongcheng District, Beijing

Applicant after: Tianyiyun Technology Co.,Ltd.

Address before: No.31, Financial Street, Xicheng District, Beijing, 100033

Applicant before: CHINA TELECOM Corp.,Ltd.

GR01 Patent grant