WO2022257436A1 - Method, system, device and medium for building a data warehouse based on a wireless communication network - Google Patents

Method, system, device and medium for building a data warehouse based on a wireless communication network

Info

Publication number
WO2022257436A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
association
fields
data table
key performance
Prior art date
Application number
PCT/CN2021/142266
Other languages
English (en)
French (fr)
Inventor
张秉致
何世文
易云山
王良鹏
张祥伍
黄永明
尤肖虎
Original Assignee
网络通信与安全紫金山实验室
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 网络通信与安全紫金山实验室
Publication of WO2022257436A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W24/00 Supervisory, monitoring or testing arrangements
    • H04W24/02 Arrangements for optimising operational condition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/283 Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/02 Knowledge representation; Symbolic representation
    • G06N5/022 Knowledge engineering; Knowledge acquisition
    • G06N5/025 Extracting rules from data
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0631 Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design

Definitions

  • The present application relates to the field of intelligent wireless communication network technology, and in particular to a method, system, device and medium for building a data warehouse based on a wireless communication network.
  • Wireless communication refers to long-distance communication among multiple nodes that does not propagate through conductors or cables.
  • Commercial wireless communication has developed from the initial 1G to the current 5G and the future 6G.
  • The bandwidth available for communication traffic keeps growing, and network capabilities keep improving.
  • A wireless communication network involves a large amount of complex data, from the terminal through the access network to the core network, with thousands of data fields and indicators spanning different hardware, software, functions, and protocol stacks.
  • Effectively collecting and rationally using the data generated during operation of the wireless communication network can maximize the network's service potential and further develop its technical advantages.
  • The continuous progress of big data and artificial intelligence technology has pushed wireless communication towards intelligence, and the prerequisite for this is wireless big data.
  • The collection of wireless communication data is mainly carried out by telecom operators, telecom equipment providers, and application service providers.
  • The collection nodes include smartphones and various sensors on the terminal side, macro/micro base stations on the access side, and dedicated data collection units on the core network side. Collection methods include raw data recording and Deep Packet Inspection (DPI), among others.
  • A data warehouse is a data collection that integrates, classifies, analyzes and uses the collected raw data for specific analysis needs.
  • Traditional data warehouse construction relies on existing domain knowledge for data modeling. Faced with relatively complex wireless communication network data, it cannot completely and accurately extract the data that meets the analysis requirements, which affects the accuracy of the analysis results.
  • A method, system, device and medium for building a data warehouse based on a wireless communication network are provided.
  • A method for building a data warehouse based on a wireless communication network includes: preprocessing raw data to generate a raw data table, and summarizing key performance indicators from the raw data based on different time granularities and dimensions to generate a key performance indicator data table; performing knowledge extraction on the raw data table and the key performance indicator data table, constructing association rules and generating a knowledge graph, and obtaining an initial data classification model through endogenous association reasoning; splitting the raw data table and the key performance indicator data table according to the initial data classification model to construct initially classified lightly aggregated data tables, which include different types of raw data sub-tables and key performance indicator data sub-tables; performing association reasoning on the initial data classification model according to the requirement field input by the user to output associated fields, calculating and sorting the weights of the associations between the associated fields, and outputting an optimal association model; and extracting, transforming, and loading data from the lightly aggregated data tables according to the optimal association model to generate a data warehouse for the requirement field, in which the information associated with the requirement field is summarized.
  • The dimensions include user, cell, and process.
  • The raw data includes access network data and core network data of the wireless communication network. The raw data is collected by collection software and stored on a data platform built on Hive; after null values and invalid values are eliminated, it is partitioned and stored by time range.
  • Performing knowledge extraction on the preprocessed data includes: performing knowledge extraction by using the associations that exist between the fields of the raw data table and the key performance indicator fields of the key performance indicator data table; and extracting, summarizing and integrating the fields of the preprocessed raw data table and the key performance indicator fields of the key performance indicator data table into several vector matrices, and initializing the weights in each vector matrix.
  • Constructing the association rules and generating the knowledge graph includes: determining the association rules based on the wireless communication network protocol, using different weights to define the strength of each association according to the association rules, and assigning the weights to the several vector matrices generated by knowledge extraction; and splitting the several vector matrices into several triplets, each of which contains two associated fields and the weight between them in the vector matrix, and storing the triplets in the form of a graph to generate a knowledge graph of the associations among the fields.
  • The weights are assigned either by input through a visual interface or by batch loading from text files.
  • Obtaining the initial data classification model through endogenous association reasoning includes:
  • Performing association reasoning on the initial data classification model according to the requirement field input by the user to output associated fields, calculating and sorting the weights of the associations between the associated fields, and outputting the optimal association model includes: performing association reasoning between the requirement field input by the user and the initial data classification model to obtain several association classes in the initial data classification model that are associated with the requirement field, and several associated fields in each association class that are associated with the requirement field; calculating the association weights of the associated fields that are associated with the requirement field, where the associated fields include associated raw data table fields and associated key performance indicator fields; and sorting the associated fields in each association class by association weight, extracting the associated fields with the largest weights together with the lightly aggregated data tables in which they are located, storing the field names and table names of those top-ranked associated fields according to a predetermined data structure, and outputting the optimal association model.
  • The requirement field includes a data field, a time granularity, and a field threshold.
  • Extracting, transforming, and loading data from the lightly aggregated data tables according to the output optimal association model and generating a data warehouse for the requirement field includes: writing a corresponding data extract-transform-load (ETL) program according to the output optimal association model; and using the ETL program to extract the associated data that meets the requirements from the lightly aggregated data tables and store it by association class in the form of associated key performance indicator sub-tables and associated data sub-tables, where the associated key performance indicator sub-tables and the associated data sub-tables constitute the data warehouse for the requirement field.
  • A system for building a data warehouse based on a wireless communication network includes: a data detail processing unit, an endogenous association modeling unit, a requirement association reasoning unit, and a data warehouse construction unit;
  • the data detail processing unit includes a preprocessing module and a key performance indicator summary module; the preprocessing module is used to preprocess the raw data to generate a raw data table; the key performance indicator summary module is used to summarize key performance indicators from the raw data based on different time granularities and dimensions to generate a key performance indicator data table;
  • the endogenous association modeling unit is used to perform knowledge extraction on the raw data table and the key performance indicator data table obtained by the data detail processing unit, construct association rules and generate a knowledge graph, perform endogenous association reasoning to generate an initial data classification model, construct initially classified lightly aggregated data tables according to the initial data classification model, and output the lightly aggregated data tables to the data warehouse construction unit;
  • the requirement association reasoning unit is used to perform association reasoning on the initial data classification model according to the requirement field input by the user to output associated fields, calculate and sort the weights of the associations between the associated fields, and output the optimal association model;
  • the data warehouse construction unit is used to extract, transform, and load data from the lightly aggregated data tables according to the output optimal association model and generate a data warehouse for the requirement field, in which the information associated with the requirement field is summarized.
  • The dimensions include user, cell, and process.
  • The raw data includes access network data and core network data of the wireless communication network. The raw data is collected by collection software and stored on a data platform built on Hive; after null values and invalid values are eliminated, it is partitioned and stored by time range.
  • The endogenous association modeling unit includes a knowledge extraction module, an association rule module, a knowledge graph construction module and an endogenous association reasoning module;
  • the knowledge extraction module is used to perform knowledge extraction on the preprocessed raw data table and key performance indicator data table, summarize and integrate the fields of the preprocessed raw data table and the key performance indicator fields of the key performance indicator data table into several vector matrices, and initialize the weights in each vector matrix;
  • the association rule module is used to construct slowly changing association rules based on the wireless communication network protocol, assign values to the weights in the several vector matrices formed by the knowledge extraction module according to the association rules, and save the assigned vector matrices in real time;
  • the knowledge graph construction module is used to split the vector matrices stored by the association rule module into several triplets, each of which includes two associated fields and the weight between them in the vector matrix, and store the triplets in the form of a graph to generate a knowledge graph of the associations among the fields;
  • the endogenous association reasoning module is used to apply a preset association reasoning algorithm to the knowledge graph provided by the knowledge graph construction module to classify the fields in the raw data table and the key performance indicator data table and generate an initial data classification model, split the raw data table and the key performance indicator data table according to the initial data classification model to construct the initially classified lightly aggregated data tables, and output the lightly aggregated data tables to the data warehouse construction unit through a back-end program.
  • The preset association reasoning algorithm is an algorithm based on a Markov logic network model.
  • The weights are assigned either by input through a visual interface or by batch loading from text files.
  • The requirement association reasoning unit includes a specific requirement input module, an association field reasoning module, a weight sorting module and an association model output module;
  • the specific requirement input module is used to input the user's specific requirement field for the data warehouse, where the requirement field includes a data field, a time granularity, and a field threshold;
  • the association field reasoning module is used to, after receiving the requirement field transmitted by the specific requirement input module, perform association reasoning between the requirement field and the initial data classification model generated by the endogenous association modeling unit, to obtain several association classes in the initial data classification model that are associated with the requirement field and several associated fields in each association class that are associated with the requirement field, and to calculate the association weights of those associated fields; the associated fields include associated raw data table fields and associated key performance indicator fields;
  • the weight sorting module is used to sort the associated fields output by the association field reasoning module by weight, extract the several top-weighted associated fields, and output them to the association model output module in two types, raw data table fields and key performance indicator fields;
  • the association model output module is used to combine the two types of associated fields output by the weight sorting module with the requirement field input through the specific requirement input module to generate an optimal association model that meets the requirements, and transmit it to the data warehouse construction unit.
  • The data warehouse construction unit includes a model sub-table ETL module and an associated data extraction ETL module;
  • the model sub-table ETL module is used to receive the initial data classification model transmitted by the endogenous association modeling unit and split the preprocessed raw data table and the summarized key performance indicator data table into several lightly aggregated data tables;
  • the associated data extraction ETL module is used to receive the optimal association model transmitted by the requirement association reasoning unit, generate several associated data sub-tables from the lightly aggregated data tables, and construct a data warehouse for the requirement field.
  • An electronic device includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements any of the above methods for building a data warehouse based on a wireless communication network when executing the program.
  • A computer-readable storage medium stores computer-executable instructions which, when executed by a processor, implement any of the above methods for building a data warehouse based on a wireless communication network.
  • FIG. 1 is a flow chart of a method according to an embodiment of the present application;
  • FIG. 2 is a schematic structural diagram of a device according to an embodiment of the present application.
  • FIG. 1 shows an exemplary flowchart of a method for building a data warehouse based on a wireless communication network according to an embodiment of the present application.
  • The method comprises the following steps:
  • Step S01: preprocess the raw data to generate a raw data table, and summarize key performance indicators (KPIs) from the raw data to generate a key performance indicator data table.
  • The key performance indicators can be summarized at different time granularities and along multiple dimensions, such as users, cells, and processes.
  • The raw data includes two parts: access network data and core network data of the wireless communication network.
  • The raw data is collected by various collection software and stored on a data platform built on Hive; after a preliminary elimination of null and invalid values, it is partitioned and stored by time range. The key performance indicators of the raw data are then calculated at different time granularities, and the corresponding key performance indicator data tables are generated.
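  • The cleaning and partitioning described above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the record layout, the field names (`ts`, `user_id`, `procedure`, `result`), the validity rule, and the choice of one partition per calendar day are all hypothetical, and an in-memory dictionary stands in for Hive partitions.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical raw records; the field names are illustrative only.
RAW_RECORDS = [
    {"ts": "2021-06-01 10:03:12", "user_id": "u1", "procedure": "registration", "result": "success"},
    {"ts": "2021-06-01 10:17:40", "user_id": None, "procedure": "registration", "result": "failure"},
    {"ts": "2021-06-02 08:00:05", "user_id": "u2", "procedure": "registration", "result": "unknown"},
    {"ts": "2021-06-02 09:30:00", "user_id": "u3", "procedure": "deregistration", "result": "success"},
]

VALID_RESULTS = {"success", "failure"}  # assumed validity rule


def is_clean(record):
    """Keep a record only if no field is null or empty and the result value is valid."""
    if any(value is None or value == "" for value in record.values()):
        return False
    return record["result"] in VALID_RESULTS


def partition_by_day(records):
    """Group cleaned records by calendar day, mimicking a time-range partition."""
    partitions = defaultdict(list)
    for record in records:
        if is_clean(record):
            moment = datetime.strptime(record["ts"], "%Y-%m-%d %H:%M:%S")
            partitions[moment.strftime("%Y-%m-%d")].append(record)
    return dict(partitions)


partitions = partition_by_day(RAW_RECORDS)
```

  • In this sketch the record with a null `user_id` and the record with an unrecognized `result` are dropped before partitioning, so each day keeps one record.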
  • The data of the N1 interface of the core network is used as the raw data, and the signaling procedure classification contained in the N1 data is shown in Table 1 below:
  • Dirty data in the N1 wide data table is cleaned and redundant fields are removed.
  • Each single type of signaling is then counted, for example the numbers of successes and failures of the registration procedure at 15-minute, one-hour, and one-day time granularities, to form key performance indicator statistics at different time granularities, which are imported into the corresponding key performance indicator data table.
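  • A minimal sketch of counting registration successes and failures at the three granularities mentioned above is given here. The event list and the function names `bucket_start` and `summarize` are hypothetical, and the bucketing (flooring each timestamp to the start of its interval within the day) is one reasonable reading of "time granularity", not a detail taken from the application.

```python
from collections import Counter
from datetime import datetime

# Bucket widths in seconds for the three granularities named in the text.
GRANULARITY_SECONDS = {"15min": 900, "1h": 3600, "1d": 86400}

# Hypothetical registration-procedure events: (timestamp, outcome).
EVENTS = [
    ("2021-06-01 10:03:12", "success"),
    ("2021-06-01 10:07:40", "failure"),
    ("2021-06-01 10:20:00", "success"),
    ("2021-06-01 11:05:00", "success"),
]


def bucket_start(ts, granularity):
    """Floor a timestamp to the start of its bucket within the same day."""
    moment = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
    seconds = moment.hour * 3600 + moment.minute * 60 + moment.second
    floored = seconds - seconds % GRANULARITY_SECONDS[granularity]
    hours, minutes = floored // 3600, floored % 3600 // 60
    return f"{moment:%Y-%m-%d} {hours:02d}:{minutes:02d}"


def summarize(events, granularity):
    """Count events per (bucket start, outcome) pair."""
    counts = Counter()
    for ts, result in events:
        counts[(bucket_start(ts, granularity), result)] += 1
    return dict(counts)


kpi_15min = summarize(EVENTS, "15min")
kpi_1h = summarize(EVENTS, "1h")
kpi_1d = summarize(EVENTS, "1d")
```

  • Each resulting dictionary corresponds to one key performance indicator table at one granularity; in a Hive deployment the same aggregation would more likely be expressed as a grouped query.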
  • Depending on the source of the raw data, the method for building a data warehouse based on a wireless communication network can be applied to different network protocols, including wireless communication data. It can be applied not only to data above the network layer, but also to physical layer and data link layer data.
  • Step S02: perform knowledge extraction on the preprocessed raw data table and key performance indicator data table, construct association rules and generate a knowledge graph, and obtain an initial data classification model through endogenous association reasoning.
  • Endogenous association refers to the hidden associations among the internal elements of a thing; here it includes the hidden associations between the fields of the raw data table and the key performance indicator fields of the key performance indicator data table.
  • Endogenous association analysis refers to building data and graph-structure analysis models to mine the hidden associations among the protocol-specified internal manifestations of the wireless communication network and the data and indicators that affect service data flows and network performance.
  • The fields of the preprocessed raw data table and the key performance indicator fields of the key performance indicator data table are all regarded as knowledge of the wireless communication network. These fields are associated with one another to a greater or lesser degree, and these associations can be used to perform knowledge extraction. For example, a change in the value of one raw data field affects the values of other fields.
  • A key performance indicator field is obtained by summarizing the information of some raw data fields, so a change in the values of those raw data fields affects the value of the key performance indicator field. Key performance indicator fields also influence one another: a change in the value of one key performance indicator field causes changes in the values of other key performance indicator fields.
  • The fields of the preprocessed raw data table and the key performance indicator fields of the key performance indicator data table are summarized and integrated into several vector matrices, and the weights in each vector matrix are initialized, for example with the initial weight set to 0.
  • The association rules are determined based on the wireless communication network protocol. According to the association rules, different weights can be used to define the strength of each association, and the weights are assigned to the several vector matrices generated by knowledge extraction. The weights assigned under slowly changing association rules are shown in Table 2 below, where w represents the weight between two fields:
  • The vector matrices can be split into several triplets, each containing two associated fields and the weight between them in the matrix.
  • For example, the triplet between field 1 and field 2 can be expressed as (field 1, weight w12, field 2).
  • Triplets can be stored in the form of a graph. Combined with different algorithms, such as the K-means algorithm, a knowledge graph of the associations among the fields can be generated.
  • The graph triplets can be stored in a Neo4j graph database.
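  • The path from a zero-initialized weight matrix to triplets and then to a graph can be sketched as follows. Several things here are assumptions made for illustration: the field names, the weight values, the treatment of weights as symmetric, and the use of an in-memory adjacency dictionary in place of a Neo4j database.

```python
FIELDS = ["field_1", "field_2", "field_3"]  # hypothetical field names


def init_weight_matrix(fields):
    """Create a square matrix of association weights, all initialized to 0."""
    return [[0.0 for _ in fields] for _ in fields]


def to_triplets(fields, matrix):
    """Emit (field_a, weight, field_b) for every non-zero weight above the diagonal."""
    triplets = []
    for i in range(len(fields)):
        for j in range(i + 1, len(fields)):
            if matrix[i][j] != 0:
                triplets.append((fields[i], matrix[i][j], fields[j]))
    return triplets


def to_graph(triplets):
    """Store triplets as an undirected adjacency map: field -> {neighbour: weight}."""
    graph = {}
    for field_a, weight, field_b in triplets:
        graph.setdefault(field_a, {})[field_b] = weight
        graph.setdefault(field_b, {})[field_a] = weight
    return graph


matrix = init_weight_matrix(FIELDS)
matrix[0][1] = matrix[1][0] = 0.8  # w12, illustrative value
matrix[1][2] = matrix[2][1] = 0.3  # w23, illustrative value

triplets = to_triplets(FIELDS, matrix)
graph = to_graph(triplets)
```

  • Pairs whose weight is still 0 produce no triplet, so only associations that a rule has actually assigned end up as edges in the graph.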
  • This application effectively clarifies the complex relationships within the wireless communication network.
  • The relationships among the various data fields in the wireless communication network are represented in the form of a knowledge graph.
  • The fields in the raw data table and the key performance indicator data table can be divided into several categories by a preset association reasoning algorithm, such as an association reasoning algorithm based on the Markov logic network model.
  • These categories form an initial data classification model for the preprocessed raw data table and key performance indicator data table.
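  • To make the idea of dividing fields into categories from the knowledge graph concrete, the sketch below uses a deliberately simple stand-in: it is not a Markov logic network, but a connected-components grouping over edges whose weight reaches an assumed threshold. The triplets, field names, and threshold value are hypothetical.

```python
def classify_fields(triplets, threshold):
    """Group fields into classes: connected components over edges with weight >= threshold."""
    neighbours = {}
    for field_a, weight, field_b in triplets:
        neighbours.setdefault(field_a, set())
        neighbours.setdefault(field_b, set())
        if weight >= threshold:
            neighbours[field_a].add(field_b)
            neighbours[field_b].add(field_a)

    classes, seen = [], set()
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Depth-first traversal collects every field reachable from `start`.
        component, stack = [], [start]
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        classes.append(sorted(component))
    return classes


# Hypothetical (field, weight, field) triplets taken from a knowledge graph.
TRIPLETS = [
    ("reg_type", 0.9, "reg_success_count"),
    ("reg_success_count", 0.7, "reg_failure_count"),
    ("session_id", 0.8, "session_setup_count"),
    ("reg_type", 0.1, "session_id"),
]

model = classify_fields(TRIPLETS, threshold=0.5)
```

  • Here the weak 0.1 edge is ignored, so the registration-related fields and the session-related fields fall into two separate classes; a probabilistic model such as the one named in the text would weigh the evidence rather than apply a hard cut-off.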
  • For example, the data of the N1 interface of the core network is used as the raw data, and the raw data table and the key performance indicator data table are generated by preprocessing it.
  • The raw data table includes the fields of the N1 interface data, and the key performance indicator data table includes the key performance indicator fields.
  • The N1 interface data fields and key performance indicator fields total more than 100 fields. Endogenous association reasoning is performed on these fields to obtain the hidden associations between the N1 interface data fields and the key performance indicator fields and to generate the initial data classification model.
  • The N1 interface data fields and key performance indicator fields are divided into several categories by the Markov logic network model association reasoning algorithm to generate the initial data classification model. Part of the generated initial data classification model is shown in Table 3 below:
  • Step S03: according to the initial data classification model generated by endogenous association reasoning, construct the initially classified lightly aggregated data tables.
  • According to the initial data classification model, the preprocessed raw data table and key performance indicator data table can be split into different types of raw data sub-tables and key performance indicator data sub-tables, which are defined as the initially classified lightly aggregated data tables and serve as the base data for subsequent requirement association reasoning.
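  • Splitting a wide table into per-class sub-tables can be sketched as a column projection. The rows, class names, and the `ts` key column are hypothetical, and for brevity a single table mixing raw fields and key performance indicator fields is used, whereas the text keeps raw data sub-tables and key performance indicator sub-tables separate.

```python
# Hypothetical wide table rows; field names are illustrative only.
ROWS = [
    {"ts": "2021-06-01 10:00", "reg_type": "initial", "reg_success_count": 12, "session_setup_count": 7},
    {"ts": "2021-06-01 10:15", "reg_type": "mobility", "reg_success_count": 9, "session_setup_count": 4},
]

# Hypothetical classification model: class name -> fields in that class.
CLASSIFICATION = {
    "class_1": ["reg_type", "reg_success_count"],
    "class_2": ["session_setup_count"],
}


def split_by_class(rows, classification, key_fields=("ts",)):
    """Project every row onto the fields of each class, always keeping the key fields."""
    sub_tables = {}
    for class_name, fields in classification.items():
        columns = list(key_fields) + [f for f in fields if f not in key_fields]
        sub_tables[class_name] = [{c: row[c] for c in columns} for row in rows]
    return sub_tables


light_tables = split_by_class(ROWS, CLASSIFICATION)
```

  • Keeping the key column in every sub-table is a design choice of this sketch so that the sub-tables can later be joined or filtered by time.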
  • Step S04: perform association reasoning on the initial data classification model according to the requirement fields input by the user to output associated fields, calculate and sort the weights of the associations between the associated fields, and output an optimal association model.
  • Lightly aggregated data tables cannot be used directly as analysis data for specific applications; they must be further processed according to specific application requirements before use.
  • The data user puts forward specific requirement fields for the data warehouse based on traditional communication knowledge and inputs them.
  • Requirement fields include data fields, time granularity, and field thresholds. By associating these requirement fields with the initial data classification model, it is possible to determine which associated fields in which data classes are associated with the requirement fields.
  • The associated fields here include raw data table fields and key performance indicator fields.
  • Step S04 includes: performing association reasoning between the requirement fields input by the user and the initial data classification model to obtain several association classes in the initial data classification model that are associated with the requirement fields, and several associated fields in each association class that are associated with the requirement fields; calculating the association weights of all associated fields associated with the requirement fields, where the associated fields include raw data table fields and key performance indicator fields; and sorting the associated fields in each association class by association weight, extracting the several associated fields with the largest weights together with the lightly aggregated data tables in which they are located, storing the field names and table names of those associated fields according to a predetermined data structure, and outputting the optimal association model.
  • For example, M data classes are associated with the requirement fields; they are called association class 1, association class 2, ..., association class M, and the weight of each association can be calculated.
  • The associated fields in each association class are sorted by association weight and the top-ranked fields are selected, for example the 10 fields with the highest weights.
  • These 10 fields may include both raw data fields and key performance indicator fields.
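  • The ranking and selection step can be sketched as below. The candidate fields, table names, weights, and the dictionary layout of the resulting model are hypothetical; the text only says that field names and table names are stored in a predetermined data structure. `top_k` is set to 2 to keep the example small, whereas the text uses 10 as its example.

```python
# Hypothetical output of association reasoning: for each association class,
# the candidate fields, the lightly aggregated table holding each field,
# and the association weight with the requirement field.
ASSOCIATION_CLASSES = {
    "association_class_1": [
        {"field": "reg_type", "table": "raw_class_1", "weight": 0.4},
        {"field": "reg_failure_count", "table": "kpi_class_1", "weight": 0.9},
        {"field": "cause_code", "table": "raw_class_1", "weight": 0.7},
    ],
    "association_class_2": [
        {"field": "session_setup_count", "table": "kpi_class_2", "weight": 0.6},
    ],
}


def build_association_model(requirement_field, association_classes, top_k):
    """For each association class, keep the top_k fields by descending weight."""
    model = {"requirement_field": requirement_field, "classes": {}}
    for class_name, candidates in association_classes.items():
        ranked = sorted(candidates, key=lambda c: c["weight"], reverse=True)
        model["classes"][class_name] = [
            {"field": c["field"], "table": c["table"], "weight": c["weight"]}
            for c in ranked[:top_k]
        ]
    return model


model = build_association_model("reg_success_count", ASSOCIATION_CLASSES, top_k=2)
```

  • The selected entries mix raw data fields and key performance indicator fields, and each entry carries its table name so that a later extraction step knows where to read the data.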
  • Step S05: according to the output optimal association model, extract, transform, and load data from the lightly aggregated data tables to generate the corresponding data warehouse for the requirement fields.
  • The generated data warehouse summarizes the information associated with the requirement fields, for example all such information, so that data analysts can analyze and apply the data more accurately and directly.
  • The corresponding data can be extracted from the lightly aggregated data tables.
  • The associated data that meets the application requirements is stored in the form of associated key performance indicator sub-tables and associated data sub-tables. These sub-tables, organized by association class, constitute the data warehouse for the requirement fields, which lets data analysts analyze the data more accurately and directly.
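  • A minimal sketch of the extraction step is shown here, assuming an association model that lists, per association class, each selected field and the lightly aggregated table that holds it. The model layout, table names, rows, and the `ts` key column are hypothetical, and handling of time granularity and field thresholds is omitted.

```python
# Hypothetical association model and lightly aggregated tables.
MODEL = {
    "requirement_field": "reg_success_count",
    "classes": {
        "association_class_1": [
            {"field": "reg_failure_count", "table": "kpi_class_1"},
            {"field": "cause_code", "table": "raw_class_1"},
        ],
    },
}

LIGHT_TABLES = {
    "kpi_class_1": [{"ts": "2021-06-01 10:00", "reg_failure_count": 3, "reg_success_count": 12}],
    "raw_class_1": [{"ts": "2021-06-01 10:03", "cause_code": 22, "reg_type": "initial"}],
}


def run_etl(model, light_tables, key="ts"):
    """Extract each associated field (plus the key column) from its source table."""
    warehouse = {}
    for class_name, entries in model["classes"].items():
        sub_tables = {}
        for entry in entries:
            source_rows = light_tables[entry["table"]]
            sub_tables[entry["field"]] = [
                {key: row[key], entry["field"]: row[entry["field"]]}
                for row in source_rows
                if entry["field"] in row
            ]
        warehouse[class_name] = sub_tables
    return warehouse


warehouse = run_etl(MODEL, LIGHT_TABLES)
```

  • The result is keyed by association class, which mirrors the description of storing the extracted data as sub-tables per association class.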
  • This application analyzes the fields associated with different requirements through association reasoning, effectively increases the useful information in the subject-oriented data warehouse of the wireless communication network, generates a data warehouse for the requirement fields, and summarizes all the information associated with the requirement fields in that data warehouse, thereby providing stronger support for later enhancement and optimization research.
  • The present application benefits performance optimization of the wireless network. For example, in a fault detection scenario, the data warehouse constructed by the present application can provide more targeted, comprehensive and accurate data analysis for fault detection.
  • Referring further to FIG. 2, it shows a structural block diagram of a wireless communication network-based data warehouse construction system provided by this embodiment.
  • The system includes a data detail processing unit, an endogenous association modeling unit, a requirement association reasoning unit, and a data warehouse construction unit.
  • The data detail processing unit includes a preprocessing module and a key performance indicator summary module.
  • The preprocessing module is used to preprocess the raw data and generate the raw data table.
  • The raw data includes two parts: access network data and core network data of the wireless communication network.
  • The raw data is collected by various collection software and stored on a data platform built on Hive.
  • In the preprocessing module, the Hive execution scripts are written in shell and run on a schedule by a scheduling tool, so that the processing is completed periodically and the results are stored on the Hive data platform.
  • The key performance indicator summary module is used to summarize key performance indicators from the raw data and generate the key performance indicator data table.
  • The endogenous association modeling unit includes a knowledge extraction module, an association rule module, a knowledge graph construction module, and an endogenous association reasoning module.
  • The endogenous association modeling unit is used to perform knowledge extraction on the raw data table and the key performance indicator data table produced by the data detail processing unit, store the corresponding association rules in the form of a graph, build a knowledge graph, and finally perform endogenous association reasoning, thereby generating and outputting the initial data classification model.
  • The knowledge extraction module is used to consolidate the fields of the preprocessed raw data tables and the key performance indicator fields of the key performance indicator tables into several vector matrices according to conventional communications domain knowledge, and to initialize the weights in each vector matrix; that is, before the association rules are built, all weights in the vector matrices are set to 0.
  • The association rule module is used to construct slowly changing association rules, which includes assigning the weights in the vector matrices formed by the knowledge extraction module on the basis of wireless communication network protocols, and saving the assigned vector matrices in real time.
  • The weights are assigned by entering them through a visual interface or by batch loading from a text file.
  • The knowledge graph construction module is used to split the vector matrices stored by the association rule module into several triples, each containing two associated fields and the weight from the vector matrix. It stores the relationships among the fields of the raw data table and the key performance indicator fields as a graph in graph database software and, in combination with different data algorithms, generates triple information for key performance indicators and algorithm types, expressed as (attribute field, effect relation, statistical indicator) and (statistical indicator, algorithm relation, algorithm-type data indicator). The effect relations and algorithm relations in the triples are expressed as weights and stored in the graph database, constructing the knowledge graph required for endogenous association reasoning.
  • Taking signaling procedures as an example, the graph triple of an association rule is expressed as (procedure type, procedure relation, attribute field), and a procedure can be stored as multiple triples according to the attribute fields it involves.
  • The endogenous association reasoning module is used to reason over the knowledge graph provided by the knowledge graph construction module using a preset association reasoning algorithm.
  • The preset association reasoning algorithm can be based on a Markov logic network model.
  • The fields in the raw data table and the key performance indicator data table are classified accordingly to generate the initial data classification model.
  • The raw data table and the key performance indicator data table are split according to the initial data classification model to construct the initially classified lightly aggregated data tables, which are output to the data warehouse construction unit through the back-end program.
  • The requirement association reasoning unit is used, after receiving the specific requirement fields input by the user, to perform association reasoning between the requirement fields and the initial data classification model generated by the endogenous association modeling unit, obtain the corresponding optimal association model, and output it to the data warehouse construction unit. It includes a specific requirement input module, an association field reasoning module, a weight ranking and selection module, and an association model output module.
  • The specific requirement input module is a front-end software module for entering the user's requirement fields for the data warehouse; the requirement fields include but are not limited to data fields, time granularity, and field thresholds.
  • After receiving the requirement fields from the specific requirement input module, the association field reasoning module uses a preset algorithm, such as one based on a Markov logic network model, to perform association reasoning between the requirement fields and the initial data classification model generated by the endogenous association modeling unit. This yields the association classes of the requirement fields in the initial data classification model, together with the weights of the associated raw data table fields and the associated key performance indicator fields.
  • The weight ranking and selection module is used to sort the association fields output by the association field reasoning module by weight, select the top-ranked association fields, and output the selected fields to the association model output module in two types: associated raw data table fields and associated key performance indicator fields.
  • The association model output module combines the two types of association fields output by the weight ranking and selection module with the requirement fields entered in the specific requirement input module, under conditions such as time granularity and field thresholds, to generate an optimal association model that meets the requirements, and transmits it to the data warehouse construction unit.
  • The data warehouse construction unit includes a model table-splitting ETL module and an associated data extraction ETL module, which receive the data models transmitted by the endogenous association modeling unit and the requirement association reasoning unit respectively, process the data in two stages, and finally generate the data warehouse.
  • The model table-splitting ETL module is used to receive the initial data classification model transmitted by the endogenous association modeling unit and split the preprocessed raw data table and the summarized key performance indicator data table into several lightly aggregated data tables.
  • The associated data extraction ETL module is used to receive the optimal association model transmitted by the requirement association reasoning unit, operate on the lightly aggregated data tables, generate several associated data sub-tables, and build a data warehouse for the requirement fields.
  • The ETL scripts are generated by the back-end program according to the data model; the execution period is then configured through the front end, and the scripts are run periodically by the scheduling software.
  • Each module in the above wireless communication network-based data warehouse construction system can be implemented fully or partially in software, hardware, or a combination thereof.
  • The above modules can be embedded in or independent of the processor of a server in hardware form, or stored in the memory of the server in software form, so that the processor can invoke and execute the operations corresponding to each module.
  • An electronic device includes a memory, a processor, and a computer program stored in the memory and executable on the processor.
  • When the processor executes the program, it implements the wireless communication network-based data warehouse construction method described in any of the above embodiments.
  • The memory can be any of various types of memory, such as random access memory, read-only memory, or flash memory.
  • The processor may be any of various types of processors, for example a central processing unit, a microprocessor, a digital signal processor, or an image processor.
  • A computer-readable storage medium stores computer-executable instructions which, when executed by a processor, implement the wireless communication network-based data warehouse construction method described in any of the above embodiments.
  • The storage medium includes a USB flash drive, a removable hard disk, ROM, RAM, a magnetic disk, an optical disc, and other media that can store program code.
  • As used in this application, terms such as "component", "module", and "system" are intended to denote a computer-related entity, which may be hardware, a combination of hardware and software, software, or software in execution. For example, a component may be, but is not limited to being, a process running on a processor, a processor, an object, an executable, a thread of execution, a program, and/or a computer.
  • By way of illustration, both an application running on a server and the server itself can be a component.
  • One or more components can reside within a process and/or thread of execution, and a component can be localized on one computer and/or distributed between two or more computers.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Fuzzy Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Mobile Radio Communication Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

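A method for constructing a data warehouse based on a wireless communication network includes: preprocessing raw data to generate a raw data table, and summarizing key performance indicators from the raw data at different time granularities and dimensions to generate a key performance indicator data table; performing knowledge extraction on the raw data table and the key performance indicator data table, constructing association rules and generating a knowledge graph, and obtaining an initial data classification model through endogenous association reasoning; splitting the raw data table and the key performance indicator data table according to the initial data classification model to construct initially classified lightly aggregated data tables; performing association reasoning on the initial data classification model according to requirement fields input by a user to output association fields, calculating and ranking the weights of the associations among the association fields, and outputting an optimal association model; and, according to the optimal association model, extracting, transforming, and loading data from the lightly aggregated data tables to generate a data warehouse for the requirement fields.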

Description

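Method, system, device, and medium for constructing a data warehouse based on a wireless communication network
Cross-reference to related applications
This application claims priority to Chinese patent application No. 202110634448.9, filed with the Chinese Patent Office on June 8, 2021 and entitled "Method, system, device, and medium for constructing a data warehouse based on a wireless communication network", the entire contents of which are incorporated herein by reference.
Technical field
This application relates to the technical field of intelligent wireless communication networks, and in particular to a method, system, device, and medium for constructing a data warehouse based on a wireless communication network.
Background
Wireless communication refers to long-distance transmission between multiple nodes without propagation through conductors or cables. Commercial wireless communication has evolved from the original 1G to today's 5G and the future 6G, with ever-increasing traffic bandwidth and ever more powerful functions. A wireless communication network involves a large amount of complex data from user terminals through the access network to the core network, with thousands of data fields and indicators spanning different software, hardware, functions, and protocol stacks. Effectively collecting and properly using the various kinds of data generated during operation of a wireless communication network can maximize the network's service potential and further leverage its technical advantages.
Continued progress in big data and artificial intelligence is driving wireless communication toward intelligence, and the prerequisite for this is wireless big data. Wireless communication data is collected mainly by telecom operators, telecom equipment vendors, and application service providers. Collection nodes include smartphones and various sensors on the terminal side, macro/micro base stations on the access side, and dedicated data collection units on the core network side. Collection methods include raw data recording and deep packet inspection (DPI).
A data warehouse is a collection of data in which collected raw data is consolidated, classified, analyzed, and used for a specific analysis requirement. Traditional data warehouse construction models data on the basis of existing domain knowledge. Faced with wireless communication network data, whose association relationships are relatively complex, it cannot completely and precisely extract the data that meets the analysis requirement, which affects the accuracy of the analysis results.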
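Summary
According to various embodiments of the present application, a method, system, device, and medium for constructing a data warehouse based on a wireless communication network are provided.
In one aspect, a method for constructing a data warehouse based on a wireless communication network is provided, including: preprocessing raw data to generate a raw data table, and summarizing key performance indicators from the raw data at different time granularities and dimensions to generate a key performance indicator data table; performing knowledge extraction on the raw data table and the key performance indicator data table, constructing association rules and generating a knowledge graph, and obtaining an initial data classification model through endogenous association reasoning; splitting the raw data table and the key performance indicator data table according to the initial data classification model to construct initially classified lightly aggregated data tables, the lightly aggregated data tables including raw data sub-tables and key performance indicator data sub-tables of different classes; performing association reasoning on the initial data classification model according to requirement fields input by a user to output association fields, calculating and ranking the weights of the associations among the association fields, and outputting an optimal association model; and, according to the optimal association model, extracting, transforming, and loading data from the lightly aggregated data tables to generate a data warehouse for the requirement fields, the data warehouse summarizing information associated with the requirement fields.
In one embodiment, the dimensions include user, cell, and procedure.
In one embodiment, the raw data includes access network data and core network data of the wireless communication network; the raw data is collected by collection software and stored on a data platform built on Hive, null and invalid values are removed, and the data is stored in partitions by time range.
In one embodiment, performing knowledge extraction on the preprocessed data includes: performing knowledge extraction by using the associations that exist between the fields of the raw data table and the key performance indicator fields of the key performance indicator data table, consolidating the fields of the preprocessed raw data table and the key performance indicator fields of the key performance indicator data table into several vector matrices, and initializing the weights in each vector matrix.
In one embodiment, constructing association rules and generating a knowledge graph includes: determining association rules on the basis of wireless communication network protocols, using different weights to define the strength of the associations according to the association rules, and assigning the weights to the several vector matrices generated by knowledge extraction; and splitting the several vector matrices into several triples, each triple containing two associated fields and the weight from the vector matrix, and storing the triples in the form of a graph to generate a knowledge graph of the associations among the fields.
In one embodiment, the weights are assigned by entering them through a visual interface or by batch loading from a text file.
In one embodiment, obtaining the initial data classification model through endogenous association reasoning includes:
classifying the fields in the raw data table and the key performance indicator data table by a preset Markov logic network association reasoning algorithm to form the initial data classification model.
In one embodiment, performing association reasoning on the initial data classification model according to the requirement fields input by the user to output association fields, calculating and ranking the weights of the associations among the association fields, and outputting the optimal association model includes: performing association reasoning between the requirement fields input by the user and the initial data classification model to determine the association classes in the initial data classification model that are associated with the requirement fields, and the association fields in each association class that are associated with the requirement fields; calculating the weights of the associations among the association fields that are associated with the requirement fields, the association fields including associated fields of the raw data table and associated key performance indicator fields; and sorting the association fields in each association class by association weight, extracting the several association fields with the largest weights together with the lightly aggregated data tables in which they reside, storing the field names and table names of those association fields in a predetermined data structure, and outputting the optimal association model.
In one embodiment, the requirement fields include data fields, time granularity, and field thresholds.
In one embodiment, extracting, transforming, and loading data from the lightly aggregated data tables according to the output optimal association model to generate the data warehouse for the requirement fields includes: writing a corresponding extract-transform-load program according to the output optimal association model; the extract-transform-load program is used to extract the associated data that meets the requirements from the lightly aggregated data tables and store it as association-class key performance indicator sub-tables and association-class data sub-tables, which together constitute the data warehouse for the requirement fields.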
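In another aspect, a system for constructing a data warehouse based on a wireless communication network is provided, including: a data detail processing unit, an endogenous association modeling unit, a requirement association reasoning unit, and a data warehouse construction unit;
the data detail processing unit includes a preprocessing module and a key performance indicator summary module; the preprocessing module is used to preprocess raw data and generate a raw data table; the key performance indicator summary module is used to summarize key performance indicators from the raw data at different time granularities and dimensions and generate a key performance indicator data table;
the endogenous association modeling unit is used to perform knowledge extraction on the raw data table and the key performance indicator data table produced by the data detail processing unit, construct association rules and generate a knowledge graph, perform endogenous association reasoning to generate an initial data classification model, construct initially classified lightly aggregated data tables according to the initial data classification model, and output the lightly aggregated data tables to the data warehouse construction unit;
the requirement association reasoning unit is used to perform association reasoning on the initial data classification model according to requirement fields input by a user to output association fields, calculate and rank the weights of the associations among the association fields, and output an optimal association model; and
the data warehouse construction unit is used to extract, transform, and load data from the lightly aggregated data tables according to the output optimal association model to generate a data warehouse for the requirement fields, the data warehouse summarizing information associated with the requirement fields.
In one embodiment, the dimensions include user, cell, and procedure.
In one embodiment, the raw data includes access network data and core network data of the wireless communication network; the raw data is collected by collection software and stored on a data platform built on Hive, null and invalid values are removed, and the data is stored in partitions by time range.
In one embodiment, the endogenous association modeling unit includes a knowledge extraction module, an association rule module, a knowledge graph construction module, and an endogenous association reasoning module;
the knowledge extraction module is used to perform knowledge extraction on the preprocessed raw data table and key performance indicator data table, consolidate the fields of the preprocessed raw data table and the key performance indicator fields of the key performance indicator data table into several vector matrices, and initialize the weights in each vector matrix;
the association rule module is used to construct slowly changing association rules on the basis of wireless communication network protocols, assign the weights in the several vector matrices formed by the knowledge extraction module according to the association rules, and save the assigned vector matrices in real time;
the knowledge graph construction module is used to split the several vector matrices stored by the association rule module into several triples, each triple containing two associated fields and the weight from the vector matrix, and to store the triples in the form of a graph to generate a knowledge graph of the associations among the fields;
the endogenous association reasoning module is used to classify, by a preset association reasoning algorithm applied to the knowledge graph provided by the knowledge graph construction module, the fields in the raw data table and the key performance indicator data table to generate the initial data classification model, split the raw data table and the key performance indicator data table according to the initial data classification model to construct the initially classified lightly aggregated data tables, and output the lightly aggregated data tables to the data warehouse construction unit through a back-end program.
In one embodiment, the preset association reasoning algorithm is based on a Markov logic network model.
In one embodiment, the weights are assigned by entering them through a visual interface or by batch loading from a text file.
In one embodiment, the requirement association reasoning unit includes a specific requirement input module, an association field reasoning module, a weight ranking and selection module, and an association model output module;
the specific requirement input module is used to enter the user's specific requirement fields for the data warehouse, the requirement fields including data fields, time granularity, and field thresholds;
the association field reasoning module is used, after receiving the requirement fields from the specific requirement input module, to perform association reasoning between the requirement fields and the initial data classification model generated by the endogenous association modeling unit, obtain the association classes in the initial data classification model that are associated with the requirement fields and the association fields in each association class that are associated with the requirement fields, and calculate the weights of the associations among those association fields; the association fields include associated fields of the raw data table and associated key performance indicator fields;
the weight ranking and selection module is used to sort the association fields output by the association field reasoning module by weight, extract the top-ranked association fields, and output them to the association model output module in two types: raw data table fields and key performance indicator fields;
the association model output module is used to combine the two types of association fields output by the weight ranking and selection module with the requirement fields entered in the specific requirement input module to generate an optimal association model that meets the requirements, and transmit it to the data warehouse construction unit.
In one embodiment, the data warehouse construction unit includes a model table-splitting ETL module and an associated data extraction ETL module; the model table-splitting ETL module is used to receive the initial data classification model transmitted by the endogenous association modeling unit and split the preprocessed raw data table and the summarized key performance indicator data table into several lightly aggregated data tables; the associated data extraction ETL module is used to receive the optimal association model transmitted by the requirement association reasoning unit, generate several associated data sub-tables from the lightly aggregated data tables, and build a data warehouse for the requirement fields.
In another aspect, an electronic device is provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the program, it implements any of the methods described above for constructing a data warehouse based on a wireless communication network.
In another aspect, a computer-readable storage medium is provided, storing computer-executable instructions which, when executed by a processor, implement any of the methods described above for constructing a data warehouse based on a wireless communication network.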
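Details of one or more embodiments of the present application are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of the present application will become apparent from the description, the drawings, and the claims.
Brief description of the drawings
FIG. 1 is a flowchart of a method according to an embodiment of the present application;
FIG. 2 is a schematic structural diagram of an apparatus according to an embodiment of the present application.
Detailed description
The method, system, device, and medium of the present application for constructing a data warehouse based on a wireless communication network are further described and explained below with reference to the accompanying drawings.
FIG. 1 shows an exemplary flowchart of a method for constructing a data warehouse based on a wireless communication network according to an embodiment of the present application.
The method specifically includes the following steps:
Step S01: preprocess the raw data to generate a raw data table, and summarize key performance indicators (KPIs) from the raw data to generate a key performance indicator data table.
The key performance indicators can be summarized at different time granularities and dimensions; the dimensions include user, cell, procedure, and others. The raw data includes two parts: access network data and core network data of the wireless communication network. The raw data is collected by various collection software and stored on a data platform built on Hive; after initial removal of null and invalid values, it is stored in partitions by time range. The key performance indicators of each kind of raw data are then calculated at different time granularities to generate the corresponding key performance indicator data tables.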
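The cleaning and partitioning part of step S01 can be illustrated with a minimal in-memory Python sketch. It is not the Hive-based pipeline of the source: the record layout, the `ts` field name, the set of values treated as invalid, and the choice of one partition per day are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime

# Assumed markers of null or invalid values; the source does not list them.
INVALID = {None, "", "NULL", "N/A"}

def preprocess(records, time_key="ts"):
    """Drop records containing null/invalid values, then partition by day."""
    partitions = defaultdict(list)
    for rec in records:
        if any(value in INVALID for value in rec.values()):
            continue  # remove rows that contain a null or invalid value
        day = datetime.fromisoformat(rec[time_key]).strftime("%Y-%m-%d")
        partitions[day].append(rec)  # partition key = calendar day (assumed)
    return dict(partitions)

raw = [
    {"ts": "2021-06-08T10:05:00", "procedure": "registration", "result": "success"},
    {"ts": "2021-06-08T10:07:00", "procedure": "registration", "result": None},
    {"ts": "2021-06-09T00:01:00", "procedure": "ue_authentication", "result": "failure"},
]
partitioned = preprocess(raw)
```

In a real deployment the partitions would presumably map to time-range partitions of a table on the data platform rather than to dictionary keys.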
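Taking the communication data in the raw data table as an example, the numbers of successes and failures of the different communication procedures per unit time are counted from the raw data table at a unit-time granularity, and the key performance indicators are summarized, including: number of successful registrations, number of failed registrations, number of successful UE authentications, number of failed UE authentications, number of successful PDU_Session resource setup requests, number of failed PDU_Session resource setup requests, number of successful 5G handovers out, and number of failed 5G handovers out.
In one embodiment, data from the N1 interface of the core network is used as the raw data, and the signaling procedures contained in the N1 data are classified as shown in Table 1 below:
Table 1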
Figure PCTCN2021142266-appb-000001
Figure PCTCN2021142266-appb-000002
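As shown in Table 1 above, the wide table of N1 data is cleaned of dirty data and redundant fields are removed. At the same time, each individual type of signaling is counted separately, for example the numbers of successes and failures of the registration procedure at time granularities of 15 minutes, one hour, and one day, to form key performance indicator statistics at different time granularities, which are imported into the corresponding key performance indicator data tables.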
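The per-granularity success and failure counts described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the record fields (`ts`, `procedure`, `result`) are hypothetical, and windows are aligned to midnight so that 15, 60, and 1440 minutes give the 15-minute, hourly, and daily granularities.

```python
from collections import Counter
from datetime import datetime

def bucket(ts, minutes):
    """Floor an ISO timestamp to the start of its window of `minutes` minutes."""
    t = datetime.fromisoformat(ts)
    floored = (t.hour * 60 + t.minute) // minutes * minutes
    return t.replace(hour=floored // 60, minute=floored % 60,
                     second=0, microsecond=0)

def summarize_kpis(records, minutes):
    """Count records per (window start, procedure, result)."""
    counts = Counter()
    for rec in records:
        counts[(bucket(rec["ts"], minutes), rec["procedure"], rec["result"])] += 1
    return counts

records = [
    {"ts": "2021-06-08T10:05:00", "procedure": "registration", "result": "success"},
    {"ts": "2021-06-08T10:12:00", "procedure": "registration", "result": "success"},
    {"ts": "2021-06-08T10:20:00", "procedure": "registration", "result": "failure"},
]
kpi_15min = summarize_kpis(records, 15)
kpi_hour = summarize_kpis(records, 60)
kpi_day = summarize_kpis(records, 1440)
```

Each counter entry corresponds to one cell of a KPI table, for example the number of successful registrations in a given 15-minute window.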
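Depending on the source of the raw data, the method provided by the present application can be applied to different network protocols, including wireless communication data; it can be applied to data above the network layer as well as to data of the physical layer and the data link layer.
Step S02: perform knowledge extraction on the preprocessed raw data table and key performance indicator data table, construct association rules and generate a knowledge graph, and obtain an initial data classification model through endogenous association reasoning.
Endogenous association refers to the hidden association relationships among the elements inside a thing, here including the hidden associations among the fields of the raw data table and the key performance indicator fields of the key performance indicator data table. Endogenous association analysis refers to mining, by building data and graph-structure analysis models and similar methods, the hidden association relationships among data and indicators inside a protocol-defined wireless communication network that reflect or affect service data flows and network performance.
The fields of the preprocessed raw data table and the key performance indicator fields of the key performance indicator data table are all treated as knowledge of the wireless communication network. These fields are associated with one another to a greater or lesser degree, and these associations can be used to perform knowledge extraction. For example, a change in the value of one raw data field affects the values of certain other fields. Key performance indicator fields are obtained by summarizing the information in some raw data fields, so changes in the values of those raw data fields affect the values of the key performance indicator fields. Key performance indicator fields also influence one another: a change in the value of one key performance indicator field causes the values of some other key performance indicator fields to change as well.
Through knowledge extraction, the fields of the preprocessed raw data table and the key performance indicator fields of the key performance indicator data table are consolidated into several vector matrices, and the weights in each vector matrix are initialized, for example with all initial weights set to 0.
Association rules are determined on the basis of wireless communication network protocols, including an understanding of the 3GPP protocols and industry specifications. According to the association rules, different weights can be used to define the strength of each association, and the weights are assigned to the vector matrices generated by knowledge extraction; that is, the weights are assigned after some slowly changing association rule is adopted, as shown in Table 2 below, where w denotes the weight between two fields: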
Table 2
          Field 1   Field 2   Field 3   ...   Field N
Field 1   0         w12       w13       ...   w1n
Field 2   w12       0         w23       ...   w2n
Field 3   w13       w23       0         ...   w3n
...       ...       ...       ...       0     ...
Field N   w1n       w2n       w3n       ...   0
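These vector matrices can be split into several triples, each containing two associated fields and the weight from the matrix. For example, the triple between field 1 and field 2 can be expressed as (field 1, weight w12, field 2). Triples can be stored in the form of a graph. Combined with different algorithms, for example the K-means algorithm, several knowledge graphs of the associations among fields can be generated.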
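The path from a zero-initialized weight matrix to triples can be sketched in Python as below. The field names and weight values are made up for illustration, and the matrix is treated as symmetric, as the layout of Table 2 suggests; only the upper triangle is emitted so that each field pair yields one triple.

```python
def init_matrix(fields):
    """Create a field-by-field weight matrix with every weight set to 0."""
    return {a: {b: 0.0 for b in fields} for a in fields}

def assign(matrix, a, b, weight):
    """Assign an association weight symmetrically (w_ab == w_ba)."""
    matrix[a][b] = matrix[b][a] = weight

def to_triples(matrix):
    """Split the matrix into (field, weight, field) triples, skipping zero weights."""
    fields = list(matrix)
    return [(a, matrix[a][b], b)
            for i, a in enumerate(fields)
            for b in fields[i + 1:]
            if matrix[a][b] != 0]

matrix = init_matrix(["field_1", "field_2", "field_3"])
assign(matrix, "field_1", "field_2", 0.8)   # illustrative weights only
assign(matrix, "field_2", "field_3", 0.3)
triples = to_triples(matrix)
```

Skipping zero weights is a design choice of this sketch: a weight that was never assigned is read as "no known association" rather than stored as an edge.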
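In one embodiment of the present application, graph triples (triples stored in the form of a graph) can be stored using the Neo4j graph database. The present application effectively untangles the complex relationships of a wireless communication network: by mining the endogenous associations among the fields hidden behind the data, it represents the relationships among the various data fields in the wireless communication network as a knowledge graph.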
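One possible way to hand such triples to a graph database is to turn each one into a parameterized Cypher statement. The Python sketch below only builds the statement text and its parameters; it does not connect to Neo4j. The node label `Field`, the relationship type `ASSOCIATED_WITH`, and the property names are assumptions, since the source does not describe its graph schema.

```python
# Parameterized Cypher: create both field nodes if missing, then the weighted edge.
CYPHER = (
    "MERGE (a:Field {name: $head}) "
    "MERGE (b:Field {name: $tail}) "
    "MERGE (a)-[r:ASSOCIATED_WITH]->(b) "
    "SET r.weight = $weight"
)

def to_cypher(triples):
    """Pair the Cypher template with one parameter dict per (head, weight, tail) triple."""
    return [(CYPHER, {"head": head, "weight": weight, "tail": tail})
            for head, weight, tail in triples]

statements = to_cypher([("field_1", 0.8, "field_2")])
```

Using `MERGE` rather than `CREATE` keeps the load idempotent, which suits association rules that change slowly and are saved repeatedly.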
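After the knowledge graph is generated, a preset association reasoning algorithm, for example a Markov logic network association reasoning algorithm, can be used to divide the fields in the raw data table and the key performance indicator data table into several classes. These classes form an initial data classification model for the preprocessed raw data table and key performance indicator data table.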
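To make the idea of grouping fields by association concrete, the Python sketch below uses a deliberately simple stand-in: fields joined by an edge whose weight reaches a threshold are placed in the same class (connected components via union-find). This is not a Markov logic network and is not the algorithm of the source; the threshold and the field names are assumptions.

```python
def classify(triples, threshold=0.5):
    """Group fields into classes: strongly associated fields share a class."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for head, weight, tail in triples:
        find(head)
        find(tail)  # register both fields even if the edge is weak
        if weight >= threshold:
            parent[find(head)] = find(tail)

    classes = {}
    for field in parent:
        classes.setdefault(find(field), set()).add(field)
    return sorted(sorted(members) for members in classes.values())

field_classes = classify([
    ("reg_result", 0.9, "reg_success_count"),
    ("reg_success_count", 0.7, "reg_failure_count"),
    ("reg_result", 0.1, "pdu_session_id"),
    ("pdu_session_id", 0.8, "pdu_setup_success_count"),
])
```

A probabilistic model such as the one named in the source would weigh evidence across many rules instead of applying a hard cut-off, so the classes produced here should be read only as an illustration of the output shape.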
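In one embodiment, data from the N1 interface of the core network is used as the raw data, and the raw data table and the key performance indicator data table are generated by preprocessing that data. The raw data table contains the fields of the N1 interface data, and the key performance indicator data table contains the key performance indicator fields. Together they comprise more than 100 fields. Endogenous association reasoning is performed on these fields to obtain the hidden associations between the N1 interface data fields and the key performance indicator fields and to generate the initial data classification model. In this embodiment, a Markov logic network association reasoning algorithm divides the N1 interface data fields and the key performance indicator fields into several classes, from which the initial data classification model is generated. Part of the initial data classification model generated in this embodiment is shown in Table 3 below:
Table 3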
Figure PCTCN2021142266-appb-000003
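Step S03: construct the initially classified lightly aggregated data tables according to the initial data classification model generated by endogenous association reasoning.
Once the initial data classification model has been obtained, the preprocessed raw data table and key performance indicator table can be split into raw data sub-tables and key performance indicator data sub-tables of different classes. These are defined as the initially classified lightly aggregated data tables and serve as the base data for the subsequent requirement association reasoning.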
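Splitting a table by class amounts to projecting it onto each class's fields. The following Python sketch illustrates this under assumptions: rows are dictionaries, every sub-table keeps a `ts` time key, and the sub-table naming scheme (`<table>_<class>`) is hypothetical.

```python
def split_tables(table_name, rows, classes, key="ts"):
    """Project one table onto each class's fields, keeping the time key."""
    sub_tables = {}
    for class_name, fields in classes.items():
        cols = [f for f in fields if rows and f in rows[0]]
        if not cols:
            continue  # this class has no field in this table
        sub_tables[f"{table_name}_{class_name}"] = [
            {key: row[key], **{c: row[c] for c in cols}} for row in rows
        ]
    return sub_tables

raw_rows = [{"ts": "2021-06-08T10:05:00", "reg_result": "success", "pdu_session_id": 3}]
classes = {
    "class_1": ["reg_result", "reg_failure_count"],
    "class_2": ["pdu_session_id"],
}
light_tables = split_tables("raw", raw_rows, classes)
```

Running the same function over the KPI table with the same classes would give the matching KPI sub-tables, so each class ends up with one raw sub-table and one KPI sub-table.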
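Step S04: perform association reasoning on the initial data classification model according to the requirement fields input by the user to output association fields, calculate and rank the weights of the associations among the association fields, and output the optimal association model.
The lightly aggregated data tables cannot be used directly as analysis data for a specific application; they need further processing in light of the specific application requirements before they can be used. The data user proposes specific requirement fields for the data warehouse on the basis of conventional communications knowledge and enters them. The requirement fields include data fields, time granularity, and field thresholds. Association reasoning between these requirement fields and the initial data classification model determines which association fields in which data classes are associated with the requirement fields. The association fields here include fields of the raw data table and key performance indicator fields.
In one embodiment, step S04 includes: performing association reasoning between the requirement fields input by the user and the initial data classification model to determine the association classes in the initial data classification model that are associated with the requirement fields, and the association fields in each association class that are associated with the requirement fields; calculating the weights of the associations among all association fields that are associated with the requirement fields, the association fields including fields of the raw data table and key performance indicator fields; and sorting the association fields in each association class by association weight, extracting the several association fields with the largest weights together with the lightly aggregated data tables in which they reside, storing the field names and table names of those association fields in a predetermined data structure, and outputting the optimal association model.
As an example, with data from the N1 interface of the core network as the raw data and an analysis requirement on the N1 data, analysis yields M data classes that are associated with the requirement fields, referred to as association class 1, association class 2, ..., association class M. Each association class in turn contains several association fields that are associated with the requirement fields, and the weight of each such association can be calculated. The association fields in each association class are sorted by association weight, and the top-ranked fields are selected, for example the 10 fields with the highest weights. These 10 fields may include both raw data fields and key performance indicator fields. Extracting these two kinds of fields together with the lightly aggregated data tables in which they reside, and storing them in a given data structure, yields an optimal association model that meets the requirements.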
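The ranking and selection step can be sketched in Python as follows. The input format, a list of (field, table, kind, weight) tuples per association class, and the output layout are assumptions; the source only says that field names and table names are stored "in a predetermined data structure".

```python
def select_optimal(associations, top_k=10):
    """Keep the top_k highest-weight fields per class, split by field type."""
    model = {}
    for class_name, items in associations.items():
        ranked = sorted(items, key=lambda item: item[3], reverse=True)[:top_k]
        model[class_name] = {
            "raw_fields": [(f, t) for f, t, kind, _ in ranked if kind == "raw"],
            "kpi_fields": [(f, t) for f, t, kind, _ in ranked if kind == "kpi"],
        }
    return model

associations = {
    "association_class_1": [
        # (field name, lightly aggregated table, field type, weight): made-up values
        ("reg_result", "raw_class_1", "raw", 0.9),
        ("cell_id", "raw_class_1", "raw", 0.2),
        ("reg_failure_count", "kpi_class_1", "kpi", 0.7),
    ],
}
optimal_model = select_optimal(associations, top_k=2)
```

With `top_k=2` the low-weight `cell_id` field is dropped, and the remaining fields are recorded with the table each one lives in, which is the information a later ETL stage needs.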
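Step S05: according to the output optimal association model, extract, transform, and load data from the lightly aggregated data tables to generate the corresponding data warehouse for the requirement fields.
The generated data warehouse summarizes information associated with the requirement fields (for example, all such information), so that data analysts can analyze and apply the data more accurately and directly on the basis of the data warehouse.
In one embodiment, after an optimal association model that meets the requirements is obtained, a corresponding extract-transform-load (ETL) program is written to extract the associated data that meets the application requirements from the lightly aggregated data tables and store it as association-class key performance indicator sub-tables and association-class data sub-tables. These association-class sub-tables constitute a data warehouse for the requirement fields, which makes it easier for data analysts to analyze and apply the data accurately and directly.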
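A minimal in-memory Python sketch of such an ETL pass is shown below. It assumes the optimal association model maps each association class to lists of (field, table) pairs under the hypothetical keys `raw_fields` and `kpi_fields`, treats a field threshold as a minimum value, and emits one output row per selected field and source row; none of these details come from the source.

```python
def run_etl(model, tables, thresholds=None, key="ts"):
    """Extract selected fields, filter by threshold, load into per-class sub-tables."""
    thresholds = thresholds or {}
    warehouse = {}
    for class_name, groups in model.items():
        for kind, pairs in groups.items():
            rows_out = []
            for field, table in pairs:
                for row in tables[table]:  # extract
                    if field in thresholds and row[field] < thresholds[field]:
                        continue  # transform: drop rows below the field threshold
                    rows_out.append({key: row[key], field: row[field]})
            warehouse[f"{class_name}_{kind}"] = rows_out  # load
    return warehouse

model = {"association_class_1": {
    "raw_fields": [("reg_result", "raw_class_1")],
    "kpi_fields": [("reg_failure_count", "kpi_class_1")],
}}
tables = {
    "raw_class_1": [{"ts": "t1", "reg_result": "failure", "cell_id": 7}],
    "kpi_class_1": [{"ts": "t1", "reg_failure_count": 5},
                    {"ts": "t2", "reg_failure_count": 0}],
}
warehouse = run_etl(model, tables, thresholds={"reg_failure_count": 1})
```

The result holds one data sub-table and one KPI sub-table per association class, mirroring the two sub-table forms named in the text.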
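The present application uses association reasoning to identify the fields associated with different requirements, which increases the useful information in the subject-oriented data warehouse of the wireless communication network. It generates a data warehouse for the requirement fields that summarizes all information associated with them, thereby improving the accuracy of later data processing, giving researchers more valuable reference fields, and avoiding time and effort wasted on irrelevant information. This makes more targeted data analysis and research easier and provides stronger support for research on improving and tuning wireless communication network performance. In addition, the present application benefits wireless network performance optimization: in a fault detection scenario, for example, the data warehouse constructed by the present application supports more targeted, comprehensive, and accurate data analysis.
Referring further to FIG. 2, it shows a structural block diagram of a system for constructing a data warehouse based on a wireless communication network provided by this embodiment. The system includes a data detail processing unit, an endogenous association modeling unit, a requirement association reasoning unit, and a data warehouse construction unit.
The data detail processing unit includes a preprocessing module and a key performance indicator summary module. The preprocessing module is used to preprocess the raw data and generate the raw data table. The raw data includes two parts: access network data and core network data of the wireless communication network. The raw data is collected by various collection software and stored on a data platform built on Hive. In the preprocessing module, the Hive execution scripts are written in shell and run on a schedule by a scheduling tool, so that the processing is completed periodically and the results are stored on the Hive data platform. The key performance indicator summary module is used to summarize key performance indicators from the raw data and generate the key performance indicator data table.
The endogenous association modeling unit includes a knowledge extraction module, an association rule module, a knowledge graph construction module, and an endogenous association reasoning module. The endogenous association modeling unit is used to perform knowledge extraction on the raw data table and the key performance indicator data table produced by the data detail processing unit, store the corresponding association rules in the form of a graph, build a knowledge graph, and finally perform endogenous association reasoning, thereby generating and outputting the initial data classification model.
In one embodiment, the knowledge extraction module is used to consolidate the fields of the preprocessed raw data tables and the key performance indicator fields of the key performance indicator tables into several vector matrices according to conventional communications domain knowledge, and to initialize the weights in each vector matrix; that is, before the association rules are built, all weights in the vector matrices are set to 0.
The association rule module is used to construct slowly changing association rules, which includes assigning the weights in the vector matrices formed by the knowledge extraction module on the basis of wireless communication network protocols, and saving the assigned vector matrices in real time.
In one embodiment, the weights are assigned by entering them through a visual interface or by batch loading from a text file.
The knowledge graph construction module is used to split the vector matrices stored by the association rule module into several triples, each containing two associated fields and the weight from the vector matrix. It stores the association relationships among the fields of the raw data table and the key performance indicator fields as a graph in graph database software and, in combination with different data algorithms, generates triple information for key performance indicators and algorithm types, expressed as (attribute field, effect relation, statistical indicator) and (statistical indicator, algorithm relation, algorithm-type data indicator). The effect relations and algorithm relations in the triples are expressed as weights and stored in the graph database, constructing the knowledge graph required for endogenous association reasoning.
In one embodiment, taking signaling procedures as an example, the graph triple of an association rule is expressed as (procedure type, procedure relation, attribute field), and a procedure can be stored as multiple triples according to the attribute fields it involves.
The endogenous association reasoning module is used to reason over the knowledge graph provided by the knowledge graph construction module using a preset association reasoning algorithm, which can be based on a Markov logic network model. It classifies the fields in the raw data table and the key performance indicator data table accordingly to generate the initial data classification model, splits the raw data table and the key performance indicator data table according to the initial data classification model to construct the initially classified lightly aggregated data tables, and outputs the lightly aggregated data tables to the data warehouse construction unit through the back-end program.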
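The requirement association reasoning unit is used, after receiving the specific requirement fields input by the user, to perform association reasoning between the requirement fields and the initial data classification model generated by the endogenous association modeling unit, obtain the corresponding optimal association model, and output it to the data warehouse construction unit. It includes a specific requirement input module, an association field reasoning module, a weight ranking and selection module, and an association model output module.
The specific requirement input module is a front-end software module for entering the user's requirement fields for the data warehouse; the requirement fields include but are not limited to data fields, time granularity, and field thresholds.
After receiving the requirement fields from the specific requirement input module, the association field reasoning module uses a preset algorithm, such as one based on a Markov logic network model, to perform association reasoning between the requirement fields and the initial data classification model generated by the endogenous association modeling unit. This yields the association classes of the requirement fields in the initial data classification model, together with the weights of the associated raw data table fields and the associated key performance indicator fields.
The weight ranking and selection module is used to sort the association fields output by the association field reasoning module by weight, select the top-ranked association fields, and output the selected fields to the association model output module in two types: associated raw data table fields and associated key performance indicator fields.
The association model output module combines the two types of association fields output by the weight ranking and selection module with the requirement fields entered in the specific requirement input module, under conditions such as time granularity and field thresholds, to generate an optimal association model that meets the requirements, and transmits it to the data warehouse construction unit.
The data warehouse construction unit includes a model table-splitting ETL module and an associated data extraction ETL module, which receive the data models transmitted by the endogenous association modeling unit and the requirement association reasoning unit respectively, process the data in two stages, and finally generate the data warehouse.
The model table-splitting ETL module is used to receive the initial data classification model transmitted by the endogenous association modeling unit and split the preprocessed raw data table and the summarized key performance indicator data table into several lightly aggregated data tables.
The associated data extraction ETL module is used to receive the optimal association model transmitted by the requirement association reasoning unit, operate on the lightly aggregated data tables, generate several associated data sub-tables, and build a data warehouse for the requirement fields.
In one embodiment, the ETL processing scripts are generated by the back-end program according to the data model; the execution period is then configured through the front end, and the scripts are run periodically by the scheduling software.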
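Generating a processing script from a data model can be pictured as simple text templating. The Python sketch below emits SQL-style statements from a model that maps each association class to (field, table) pairs; the model layout, the `dw` table prefix, the `ts` key column, and the exact statement form are all assumptions, and the dialect actually accepted by the target platform is not specified in the source.

```python
def generate_etl_sql(model, prefix="dw", key="ts"):
    """Emit one INSERT ... SELECT statement per (class, field type, source table)."""
    statements = []
    for class_name in sorted(model):
        for kind in sorted(model[class_name]):
            by_table = {}
            for field, table in model[class_name][kind]:
                by_table.setdefault(table, []).append(field)
            for table in sorted(by_table):
                columns = ", ".join([key] + by_table[table])
                statements.append(
                    f"INSERT INTO {prefix}_{class_name}_{kind} "
                    f"SELECT {columns} FROM {table};"
                )
    return statements

etl_script = generate_etl_sql({"association_class_1": {
    "raw_fields": [("reg_result", "raw_class_1")],
    "kpi_fields": [("reg_failure_count", "kpi_class_1")],
}})
```

The generated text could then be written to a file and handed to whatever scheduler runs the periodic jobs; sorting the keys keeps the output stable between runs, which makes regenerated scripts easy to compare.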
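Each module in the above system for constructing a data warehouse based on a wireless communication network can be implemented fully or partially in software, hardware, or a combination thereof. The above modules can be embedded in or independent of the processor of a server in hardware form, or stored in the memory of the server in software form, so that the processor can invoke and execute the operations corresponding to each module.
An electronic device includes a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the program, it implements the method for constructing a data warehouse based on a wireless communication network described in any of the above embodiments. The memory can be any of various types of memory, such as random access memory, read-only memory, or flash memory. The processor can be any of various types of processors, for example a central processing unit, a microprocessor, a digital signal processor, or an image processor.
A computer-readable storage medium stores computer-executable instructions which, when executed by a processor, implement the method for constructing a data warehouse based on a wireless communication network described in any of the above embodiments. The storage medium includes a USB flash drive, a removable hard disk, ROM, RAM, a magnetic disk, an optical disc, and other media that can store program code.
As used in this application, terms such as "component", "module", and "system" are intended to denote a computer-related entity, which may be hardware, a combination of hardware and software, software, or software in execution. For example, a component may be, but is not limited to being, a process running on a processor, a processor, an object, an executable, a thread of execution, a program, and/or a computer. By way of illustration, both an application running on a server and the server itself can be a component. One or more components can reside within a process and/or thread of execution, and a component can be located on one computer and/or distributed between two or more computers.
The above are only preferred embodiments of the present application. It should be noted that a person of ordinary skill in the art can make several improvements and refinements without departing from the principles of the present application, and these improvements and refinements shall also be regarded as falling within the scope of protection of the present application.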

Claims (20)

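  1. A method for constructing a data warehouse based on a wireless communication network, including:
    preprocessing raw data to generate a raw data table, and summarizing key performance indicators from the raw data at different time granularities and dimensions to generate a key performance indicator data table;
    performing knowledge extraction on the raw data table and the key performance indicator data table, constructing association rules and generating a knowledge graph, and obtaining an initial data classification model through endogenous association reasoning;
    splitting the raw data table and the key performance indicator data table according to the initial data classification model to construct initially classified lightly aggregated data tables, the lightly aggregated data tables including raw data sub-tables and key performance indicator data sub-tables of different classes;
    performing association reasoning on the initial data classification model according to requirement fields input by a user to output association fields, calculating and ranking the weights of the associations among the association fields, and outputting an optimal association model; and
    according to the optimal association model, extracting, transforming, and loading data from the lightly aggregated data tables to generate a data warehouse for the requirement fields, the data warehouse summarizing information associated with the requirement fields.
  2. The method according to claim 1, wherein the dimensions include user, cell, and procedure.
  3. The method according to claim 1, wherein the raw data includes access network data and core network data of the wireless communication network; the raw data is collected by collection software and stored on a data platform built on Hive, null and invalid values are removed, and the data is stored in partitions by time range.
  4. The method according to claim 1, wherein performing knowledge extraction on the preprocessed data includes:
    performing knowledge extraction by using the associations that exist between the fields of the raw data table and the key performance indicator fields of the key performance indicator data table, consolidating the fields of the preprocessed raw data table and the key performance indicator fields of the key performance indicator data table into several vector matrices, and initializing the weights in each vector matrix.
  5. The method according to claim 4, wherein constructing association rules and generating a knowledge graph includes:
    determining association rules on the basis of wireless communication network protocols, using different weights to define the strength of the associations according to the association rules, and assigning the weights to the several vector matrices generated by knowledge extraction; and
    splitting the several vector matrices into several triples, each triple containing two associated fields and the weight from the vector matrix, and storing the triples in the form of a graph to generate a knowledge graph of the associations among the fields.
  6. The method according to claim 5, the weights being assigned by entering them through a visual interface or by batch loading from a text file.
  7. The method according to claim 1, wherein obtaining the initial data classification model through endogenous association reasoning includes:
    classifying the fields in the raw data table and the key performance indicator data table by a preset Markov logic network association reasoning algorithm to form the initial data classification model.
  8. The method according to claim 1, wherein performing association reasoning on the initial data classification model according to the requirement fields input by the user to output association fields, calculating and ranking the weights of the associations among the association fields, and outputting the optimal association model includes:
    performing association reasoning between the requirement fields input by the user and the initial data classification model to determine the association classes in the initial data classification model that are associated with the requirement fields, and the association fields in each association class that are associated with the requirement fields;
    calculating the weights of the associations among the association fields that are associated with the requirement fields, the association fields including associated fields of the raw data table and associated key performance indicator fields; and
    sorting the association fields in each association class by association weight, extracting the several association fields with the largest weights together with the lightly aggregated data tables in which they reside, storing the field names and table names of those association fields in a predetermined data structure, and outputting the optimal association model.
  9. The method according to claim 1, wherein the requirement fields include data fields, time granularity, and field thresholds.
  10. The method according to claim 1, wherein extracting, transforming, and loading data from the lightly aggregated data tables according to the output optimal association model to generate the data warehouse for the requirement fields includes:
    writing a corresponding extract-transform-load program according to the output optimal association model; the extract-transform-load program is used to extract the associated data that meets the requirements from the lightly aggregated data tables and store it as association-class key performance indicator sub-tables and association-class data sub-tables, the association-class key performance indicator sub-tables and the association-class data sub-tables constituting the data warehouse for the requirement fields.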
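  11. A system for constructing a data warehouse based on a wireless communication network, including: a data detail processing unit, an endogenous association modeling unit, a requirement association reasoning unit, and a data warehouse construction unit;
    the data detail processing unit includes a preprocessing module and a key performance indicator summary module; the preprocessing module is used to preprocess raw data and generate a raw data table; the key performance indicator summary module is used to summarize key performance indicators from the raw data at different time granularities and dimensions and generate a key performance indicator data table;
    the endogenous association modeling unit is used to perform knowledge extraction on the raw data table and the key performance indicator data table produced by the data detail processing unit, construct association rules and generate a knowledge graph, perform endogenous association reasoning to generate an initial data classification model, construct initially classified lightly aggregated data tables according to the initial data classification model, and output the lightly aggregated data tables to the data warehouse construction unit;
    the requirement association reasoning unit is used to perform association reasoning on the initial data classification model according to requirement fields input by a user to output association fields, calculate and rank the weights of the associations among the association fields, and output an optimal association model; and
    the data warehouse construction unit is used to extract, transform, and load data from the lightly aggregated data tables according to the output optimal association model to generate a data warehouse for the requirement fields, the data warehouse summarizing information associated with the requirement fields.
  12. The system according to claim 11, wherein the dimensions include user, cell, and procedure.
  13. The system according to claim 11, wherein the raw data includes access network data and core network data of the wireless communication network; the raw data is collected by collection software and stored on a data platform built on Hive, null and invalid values are removed, and the data is stored in partitions by time range.
  14. The system according to claim 11, wherein the endogenous association modeling unit includes a knowledge extraction module, an association rule module, a knowledge graph construction module, and an endogenous association reasoning module;
    the knowledge extraction module is used to perform knowledge extraction on the preprocessed raw data table and key performance indicator data table, consolidate the fields of the preprocessed raw data table and the key performance indicator fields of the key performance indicator data table into several vector matrices, and initialize the weights in each vector matrix;
    the association rule module is used to construct slowly changing association rules on the basis of wireless communication network protocols, assign the weights in the several vector matrices formed by the knowledge extraction module according to the association rules, and save the assigned vector matrices in real time;
    the knowledge graph construction module is used to split the several vector matrices stored by the association rule module into several triples, each triple containing two associated fields and the weight from the vector matrix, and to store the triples in the form of a graph to generate a knowledge graph of the associations among the fields;
    the endogenous association reasoning module is used to classify, by a preset association reasoning algorithm applied to the knowledge graph provided by the knowledge graph construction module, the fields in the raw data table and the key performance indicator data table to generate the initial data classification model, split the raw data table and the key performance indicator data table according to the initial data classification model to construct the initially classified lightly aggregated data tables, and output the lightly aggregated data tables to the data warehouse construction unit through a back-end program.
  15. The system according to claim 14, wherein the preset association reasoning algorithm is based on a Markov logic network model.
  16. The system according to claim 14, wherein the weights are assigned by entering them through a visual interface or by batch loading from a text file.
  17. The system according to claim 12, wherein the requirement association reasoning unit includes a specific requirement input module, an association field reasoning module, a weight ranking and selection module, and an association model output module;
    the specific requirement input module is used to enter the user's requirement fields for the data warehouse, the requirement fields including data fields, time granularity, and field thresholds;
    the association field reasoning module is used, after receiving the requirement fields from the specific requirement input module, to perform association reasoning between the requirement fields and the initial data classification model generated by the endogenous association modeling unit, obtain the association classes in the initial data classification model that are associated with the requirement fields and the association fields in each association class that are associated with the requirement fields, and calculate the weights of the associations among those association fields; the association fields include associated fields of the raw data table and associated key performance indicator fields;
    the weight ranking and selection module is used to sort the association fields output by the association field reasoning module by the weights, extract the top-ranked association fields, and output them to the association model output module in two types: raw data table fields and key performance indicator fields;
    the association model output module is used to combine the two types of association fields output by the weight ranking and selection module with the requirement fields entered in the specific requirement input module to generate an optimal association model that meets the requirements, and transmit it to the data warehouse construction unit.
  18. The system according to claim 12, wherein the data warehouse construction unit includes a model table-splitting ETL module and an associated data extraction ETL module; the model table-splitting ETL module is used to receive the initial data classification model transmitted by the endogenous association modeling unit and split the preprocessed raw data table and the summarized key performance indicator data table into several lightly aggregated data tables; the associated data extraction ETL module is used to receive the optimal association model transmitted by the requirement association reasoning unit, generate several associated data sub-tables from the lightly aggregated data tables, and build a data warehouse for the requirement fields.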
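  19. An electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein when the processor executes the program, it implements the method for constructing a data warehouse based on a wireless communication network according to any one of claims 1-11.
  20. A computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the method for constructing a data warehouse based on a wireless communication network according to any one of claims 1-11.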
PCT/CN2021/142266 2021-06-08 2021-12-29 基于无线通信网络数据仓库构建方法、***、设备及介质 WO2022257436A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110634448.9 2021-06-08
CN202110634448.9A CN113259972B (zh) 2021-06-08 2021-06-08 基于无线通信网络数据仓库构建方法、***、设备及介质

Publications (1)

Publication Number Publication Date
WO2022257436A1 true WO2022257436A1 (zh) 2022-12-15

Family

ID=77186983

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/142266 WO2022257436A1 (zh) 2021-06-08 2021-12-29 基于无线通信网络数据仓库构建方法、***、设备及介质

Country Status (2)

Country Link
CN (1) CN113259972B (zh)
WO (1) WO2022257436A1 (zh)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115858699A (zh) * 2023-02-28 2023-03-28 北京仁科互动网络技术有限公司 数据仓库的构建方法、装置、电子设备和可读存储介质
CN116244386A (zh) * 2023-02-10 2023-06-09 北京友友天宇***技术有限公司 应用于多源异构数据存储***的实体关联关系的识别方法
CN116975043A (zh) * 2023-09-21 2023-10-31 国网信息通信产业集团有限公司 一种基于流式框架的数据实时传输构建方法
CN117033460A (zh) * 2023-08-07 2023-11-10 南京中新赛克科技有限责任公司 一种基于总线矩阵的数据模型自动构建***及方法
CN117609289A (zh) * 2024-01-22 2024-02-27 山东浪潮数据库技术有限公司 一种基于宽表的能源数据处理***

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113259972B (zh) * 2021-06-08 2021-09-28 网络通信与安全紫金山实验室 基于无线通信网络数据仓库构建方法、***、设备及介质
CN114205852B (zh) * 2022-02-17 2022-05-03 网络通信与安全紫金山实验室 无线通信网络知识图谱的智能分析与应用***及方法
CN114845323A (zh) * 2022-04-06 2022-08-02 湖南华诺科技有限公司 一种基于数字孪生的无线网络优化平台及方法

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180082183A1 (en) * 2011-02-22 2018-03-22 Thomson Reuters Global Resources Machine learning-based relationship association and related discovery and search engines
CN111008253A (zh) * 2018-10-08 2020-04-14 阿里巴巴集团控股有限公司 数据模型生成方法和数据仓库生成方法、装置及电子设备
CN111241185A (zh) * 2020-04-26 2020-06-05 浙江网商银行股份有限公司 数据处理方法以及装置
CN112714032A (zh) * 2021-03-29 2021-04-27 网络通信与安全紫金山实验室 无线网络协议知识图谱构建分析方法、***、设备及介质
CN113259972A (zh) * 2021-06-08 2021-08-13 网络通信与安全紫金山实验室 基于无线通信网络数据仓库构建方法、***、设备及介质

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110856186B (zh) * 2019-11-19 2023-04-07 北京联合大学 一种无线网络知识图谱的构建方法及***
CN110972174B (zh) * 2019-12-02 2022-12-30 东南大学 一种基于稀疏自编码器的无线网络中断检测方法

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180082183A1 (en) * 2011-02-22 2018-03-22 Thomson Reuters Global Resources Machine learning-based relationship association and related discovery and search engines
CN111008253A (zh) * 2018-10-08 2020-04-14 阿里巴巴集团控股有限公司 数据模型生成方法和数据仓库生成方法、装置及电子设备
CN111241185A (zh) * 2020-04-26 2020-06-05 浙江网商银行股份有限公司 数据处理方法以及装置
CN112714032A (zh) * 2021-03-29 2021-04-27 网络通信与安全紫金山实验室 无线网络协议知识图谱构建分析方法、***、设备及介质
CN113259972A (zh) * 2021-06-08 2021-08-13 网络通信与安全紫金山实验室 基于无线通信网络数据仓库构建方法、***、设备及介质

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LIU QIAO LI YANG; DUAN HONG; LIU YAO; QIN ZHIGUANG: "Knowledge Graph Construction Techniques", JOURNAL OF COMPUTER RESEARCH AND DEVELOPMENT, 31 December 2016 (2016-12-31), pages 1 - 19, XP055945884 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116244386A (zh) * 2023-02-10 2023-06-09 北京友友天宇***技术有限公司 应用于多源异构数据存储***的实体关联关系的识别方法
CN116244386B (zh) * 2023-02-10 2023-12-12 北京友友天宇***技术有限公司 应用于多源异构数据存储***的实体关联关系的识别方法
CN115858699A (zh) * 2023-02-28 2023-03-28 北京仁科互动网络技术有限公司 数据仓库的构建方法、装置、电子设备和可读存储介质
CN115858699B (zh) * 2023-02-28 2023-05-09 北京仁科互动网络技术有限公司 数据仓库的构建方法、装置、电子设备和可读存储介质
CN117033460A (zh) * 2023-08-07 2023-11-10 南京中新赛克科技有限责任公司 一种基于总线矩阵的数据模型自动构建***及方法
CN117033460B (zh) * 2023-08-07 2024-04-30 南京中新赛克科技有限责任公司 一种基于总线矩阵的数据模型自动构建***及方法
CN116975043A (zh) * 2023-09-21 2023-10-31 国网信息通信产业集团有限公司 一种基于流式框架的数据实时传输构建方法
CN116975043B (zh) * 2023-09-21 2023-12-08 国网信息通信产业集团有限公司 一种基于流式框架的数据实时传输构建方法
CN117609289A (zh) * 2024-01-22 2024-02-27 山东浪潮数据库技术有限公司 一种基于宽表的能源数据处理***

Also Published As

Publication number Publication date
CN113259972A (zh) 2021-08-13
CN113259972B (zh) 2021-09-28

Similar Documents

Publication Publication Date Title
WO2022257436A1 (zh) 基于无线通信网络数据仓库构建方法、***、设备及介质
US20240163684A1 (en) Method and System for Constructing and Analyzing Knowledge Graph of Wireless Communication Network Protocol, and Device and Medium
WO2019184836A1 (zh) 数据分析设备、多模型共决策***及方法
CN109388565B (zh) 基于生成式对抗网络的软件***性能优化方法
WO2022001918A1 (zh) 构建预测模型的方法、装置、计算设备和存储介质
CN111431819A (zh) 一种基于序列化的协议流特征的网络流量分类方法和装置
CN107704868A (zh) 基于移动应用使用行为的用户分群聚类方法
CN111062431A (zh) 图像聚类方法、图像聚类装置、电子设备及存储介质
CN114330469A (zh) 一种快速、准确的加密流量分类方法及***
CN114492601A (zh) 资源分类模型的训练方法、装置、电子设备及存储介质
Graham et al. Finding and visualizing graph clusters using pagerank optimization
CN116186759A (zh) 一种面向隐私计算的敏感数据识别与脱敏方法
CN114095447B (zh) 一种基于知识蒸馏与自蒸馏的通信网络加密流量分类方法
CN113282433B (zh) 集群异常检测方法、装置和相关设备
CN108830302B (zh) 一种图像分类方法、训练方法、分类预测方法及相关装置
CN113722711A (zh) 基于大数据安全漏洞挖掘的数据添加方法及人工智能***
CN116127400B (zh) 基于异构计算的敏感数据识别***、方法及存储介质
CN115348198B (zh) 基于特征检索的未知加密协议识别分类方法、设备及介质
CN115510331B (zh) 一种基于闲置量聚合的共享资源匹配方法
Ke et al. Spark-based feature selection algorithm of network traffic classification
CN111784402A (zh) 基于多通路的下单率预测方法、设备及可读存储介质
CN114979017B (zh) 基于工控***原始流量的深度学习协议识别方法及***
CN113918577B (zh) 数据表识别方法、装置、电子设备及存储介质
CN112711678A (zh) 数据解析方法、装置、设备及存储介质
CN115629883A (zh) 资源预测方法、装置、计算机设备及存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21944932

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 18568274

Country of ref document: US

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21944932

Country of ref document: EP

Kind code of ref document: A1