WO2020177366A1 - Data processing method and apparatus based on time sequence data, and computer device - Google Patents

Data processing method and apparatus based on time sequence data, and computer device

Info

Publication number
WO2020177366A1
WO2020177366A1 PCT/CN2019/116234 CN2019116234W WO2020177366A1 WO 2020177366 A1 WO2020177366 A1 WO 2020177366A1 CN 2019116234 W CN2019116234 W CN 2019116234W WO 2020177366 A1 WO2020177366 A1 WO 2020177366A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
time series
feature
visualization
variables
Application number
PCT/CN2019/116234
Other languages
French (fr)
Chinese (zh)
Inventor
陈娴娴
阮晓雯
徐亮
Original Assignee
平安科技(深圳)有限公司
Application filed by 平安科技(深圳)有限公司
Publication of WO2020177366A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2474 Sequence data queries, e.g. querying versioned data
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • This application relates to a data processing method, device and computer equipment based on time series data.
  • A time series is an important high-dimensional data type: a sequence of sampled values of a physical quantity of an object at different time points, arranged in chronological order. It has a wide range of applications in economic management and engineering. Time series data mining obtains the useful time-related information contained in the data and thereby extracts knowledge.
  • Time series data is inherently high-dimensional, complex, dynamic, and noisy, and it easily reaches large scale.
  • Visualization of data over time is widely used, but visualizations based on time series data are usually confined to line graphs or scatter graphs with time as the coordinate. This limits the information that can be captured and leads to low efficiency and accuracy in mining time series data.
  • a data processing method, device, and computer equipment based on time series data are provided.
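  • A data processing method based on time series data includes: receiving a resource acquisition request sent by a terminal, the resource acquisition request including a request type and request information; acquiring a plurality of visualization data according to the resource acquisition request and the request information, the visualization data including a category identifier; extracting the time series distribution data in the visualization data according to the category identifier; performing time series feature processing on the time series distribution data to obtain the feature variables and dimensional feature values corresponding to the time series distribution data; performing feature extraction on the feature variables to extract the feature variables that reach the threshold and the corresponding dimensional feature values; obtaining a preset time series data mining model according to the request type, and analyzing the feature variables and corresponding dimensional feature values through the time series data mining model to obtain analysis result data; and generating corresponding view resource data in a preset manner according to the analysis result data, and pushing the view resource data to the terminal.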
  • a data processing device based on time series data includes:
  • the request receiving module is configured to receive a resource acquisition request sent by the terminal, where the resource acquisition request includes the request type and request information;
  • a data acquisition module configured to acquire a plurality of visualization data according to the resource acquisition request and request information, the visualization data includes a category identifier; and extract the time series distribution data in the visualization data according to the category identifier;
  • the feature processing module is configured to perform time series feature processing on the time series distribution data to obtain the feature variables and dimensional feature values corresponding to the time series distribution data, and to perform feature extraction on the feature variables to extract the feature variables that reach the threshold and the corresponding dimensional feature values;
  • the data mining module is configured to obtain a preset time series data mining model according to the request type, and analyze the characteristic variables and corresponding dimensional characteristic values through the time series data mining model to obtain analysis result data;
  • the view generation module is configured to generate corresponding view resource data in a preset manner according to the analysis result data, and push the view resource data to the terminal.
  • A computer device includes a memory and one or more processors. The memory stores computer-readable instructions which, when executed by the one or more processors, cause the one or more processors to execute the steps of the data processing method based on time series data.
  • One or more non-volatile computer-readable storage media store computer-readable instructions which, when executed by one or more processors, cause the one or more processors to execute the steps of the data processing method based on time series data.
  • Fig. 1 is an application scenario diagram of a data processing method based on time series data according to one or more embodiments.
  • Fig. 2 is a schematic flowchart of a data processing method based on time series data according to one or more embodiments.
  • Fig. 3 is a schematic flowchart of the steps of acquiring time series data according to one or more embodiments.
  • Fig. 4 is a schematic flowchart of the steps of analyzing time series data through a time series data mining model according to one or more embodiments.
  • Fig. 5 is a block diagram of a data processing device based on time series data according to one or more embodiments.
  • Fig. 6 is a block diagram of a computer device according to one or more embodiments.
  • the data processing method based on time series data provided in this application can be applied to the application environment as shown in FIG. 1.
  • The terminal 102 communicates with the server 104 through the network.
  • the terminal 102 may be, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices.
  • the server 104 may be implemented by an independent server or a server cluster composed of multiple servers.
  • the user can send a resource acquisition request to the server 104 through the corresponding terminal 102, and the resource acquisition request includes the request type and request information.
  • After the server 104 receives the resource acquisition request sent by the terminal 102, it acquires a plurality of visualization data according to the resource acquisition request and the request information; the visualization data includes a category identifier.
  • the server 104 further extracts the time series distribution data in the visualization data according to the category identifier.
  • the server 104 further performs time series feature processing on the multiple time series data to obtain feature variables and dimensional feature values corresponding to the multiple time series data; performs feature extraction on the feature variables to extract feature variables and corresponding dimensional feature values that reach the threshold.
  • The server 104 then generates corresponding view resource data in a preset manner according to the analysis result data and pushes the view resource data to the terminal 102.
  • A data processing method based on time series data is provided. Taking the method applied to the server in FIG. 1 as an example for description, the method includes the following steps:
  • Step 202 Receive a resource acquisition request sent by a terminal, where the resource acquisition request includes the request type and request information.
  • The user can send a resource acquisition request to the server through the corresponding terminal, and the resource acquisition request includes the request type and request information.
  • the request type may be the type of the target data resource obtained, for example, it may be a request type such as time series data prediction and data mining.
  • the request information may be parameter information input by the user, such as time dimension parameters.
  • Step 204 Obtain multiple visualization data according to the resource acquisition request and the request information, where the visualization data includes a category identifier.
  • the visualization data may be data that is integrated with time series data through a specific integration function and presented in a visual form.
  • Each visualization data includes a category identifier.
  • the category identifier may indicate the type of visualization graph corresponding to the visualization data.
  • the visualization data may include weather data, public opinion data, medical data, etc.
  • the visualization data may be in the form of various view data and tabular data. For example, statistical graphs, distribution graphs, heat maps, scatter plots, etc.
  • The server can obtain multiple visualization data from the local database according to the resource acquisition request and request information, or obtain multiple visualization data from a third-party database.
  • Step 206 Extract the time series distribution data in the visualization data according to the category identifier.
  • Time series data refers to a data sequence recorded for the same indicator in chronological order.
  • the time series data may include dimensional features corresponding to multiple dimensions.
  • After the server obtains multiple visualization data according to the resource acquisition request, it obtains the base table data and the objective function corresponding to the visualization data according to the category identifier. The base table data represents the underlying data required by the visualization data, and the objective function is the integration function required for integrating the visualization data.
  • The server then obtains the distribution data in the visualization data according to the base table data and the objective function, and converts the distribution data into corresponding time series distribution data in a preset manner.
  • Step 208 Perform time series feature processing on the time series distribution data to obtain feature variables and dimension feature values corresponding to the time series distribution data.
  • Step 210 Perform feature extraction on the feature variable, and extract the feature variable that reaches the threshold and the corresponding dimensional feature value.
  • After the server obtains multiple time series data from the visualization data, it performs time series feature processing on them: it converts the multiple time series data into corresponding feature vectors in time order, and converts the feature vectors into multiple feature variables and corresponding dimensional feature values. A dimensional feature value can be expressed as the feature dimension to which the feature variable belongs. The feature variables and dimensional feature values corresponding to the multiple time series data are thereby obtained.
  • Specifically, the server can preprocess the feature vectors corresponding to the multiple time series data by means of mean filling, custom filling, and book model filling, and then process the feature vectors through the data mean, variance, and standard deviation to extract the feature variables and dimensional feature values corresponding to the multiple time series data.
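  • As a rough illustration of the preprocessing described above, the following sketch fills missing samples with the mean of the observed values (or with a caller-supplied value) and then summarizes a series by its mean, variance, and standard deviation. The function names `fill_missing` and `summary_features`, the use of `None` for missing samples, and the choice of population variance are assumptions made for illustration; the source does not specify them.

```python
from statistics import mean, pstdev, pvariance


def fill_missing(series, fill_value=None):
    """Replace None entries with fill_value, or with the mean of the observed values."""
    observed = [x for x in series if x is not None]
    if fill_value is None:
        fill_value = mean(observed)  # mean filling
    return [fill_value if x is None else x for x in series]


def summary_features(series):
    """Summarize one series by its mean, population variance, and standard deviation."""
    return {
        "mean": mean(series),
        "variance": pvariance(series),
        "std": pstdev(series),
    }


raw = [2.0, None, 4.0, 6.0]                  # hypothetical series with one gap
filled = fill_missing(raw)                   # mean filling
custom = fill_missing(raw, fill_value=0.0)   # custom filling
features = summary_features(filled)
```

  • Model-based filling is omitted here because the source gives no detail on the model involved.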
  • the server then performs feature extraction on multiple feature variables corresponding to multiple time series data, and extracts feature variables that reach the threshold and corresponding dimensional feature values.
  • After the server performs time series feature processing on the feature vectors corresponding to the time series data, it reduces the dimensionality of the multiple time series feature vectors according to a preset feature dimensionality reduction algorithm and extracts the feature variables that reach the threshold.
  • algorithms such as singular value decomposition and principal component analysis can be used to reduce the overall dimensionality of feature variables, thereby effectively extracting features from time series data and extracting valuable feature variables and corresponding dimensional feature values.
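  • To make the dimensionality reduction step concrete, the sketch below finds the first principal component of a small feature matrix and projects each row onto it. It uses power iteration on the covariance matrix rather than a library SVD routine, purely to stay dependency-free; the function names, the sample data, and the iteration count are assumptions for illustration and are not taken from the source.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (column means, leading eigenvector of the covariance matrix)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d
    for _ in range(iterations):
        # Power iteration: multiply by the covariance matrix, then renormalize.
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return means, v


def project(rows, means, component):
    """Project each centered row onto the component (one score per row)."""
    return [sum((r[j] - means[j]) * component[j] for j in range(len(means)))
            for r in rows]


# Hypothetical feature matrix: the second column is twice the first,
# and the third column is small noise.
rows = [[1.0, 2.0, 0.0], [2.0, 4.0, 0.1], [3.0, 6.0, -0.1], [4.0, 8.0, 0.0]]
means, component = first_principal_component(rows)
scores = project(rows, means, component)
```

  • Three correlated columns collapse into one score per row here; in practice one would keep as many components as needed to reach a chosen explained-variance threshold.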
  • Step 212 Obtain a preset time series data mining model according to the request type, and analyze the characteristic variables and corresponding dimension characteristic values through the time series data mining model to obtain analysis result data.
  • After the server extracts the feature variables that reach the threshold and the corresponding dimensional feature values, it obtains the preset time series data mining model, inputs the extracted feature variables and corresponding dimensional feature values into the model, and analyzes the time series data through the model.
  • the server can calculate the weights of feature variables through the time series data mining model.
  • Through the time series data mining model, the server can calculate, according to the dimensional feature values and weights of the feature variables, the predicted values of the multiple feature variables corresponding to multiple time series parameters. Analysis result data corresponding to the request type is then generated from the time series parameters and the corresponding predicted values.
  • Step 214 Generate corresponding view resource data in a preset manner according to the analysis result data, and push the view resource data to the terminal.
  • After the server generates the analysis result data, it can further generate the corresponding view resource data from the analysis result data in a preset manner. Specifically, the server can obtain the corresponding integration function according to the request type in the resource acquisition request, integrate the multiple time series parameters and corresponding predicted values in the analysis result data into the corresponding view resource data through the integration function, and then push the view resource data to the terminal.
  • Analyzing the time series data with the time series data mining model effectively mines the valuable information in the time series data for further analysis, thereby improving the efficiency and accuracy of time series data mining.
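  • One way to picture the selection of an integration function by request type is a simple lookup table, sketched below. The request type labels, the two view builders, and the JSON payload shape are all hypothetical; the source only states that an integration function is chosen according to the request type and that the resulting view resource data is pushed to the terminal.

```python
import json


def line_chart_view(points):
    """Integrate (time, predicted value) pairs into a line-chart description."""
    return {"chart": "line",
            "x": [t for t, _ in points],
            "y": [v for _, v in points]}


def table_view(points):
    """Integrate (time, predicted value) pairs into a table description."""
    return {"chart": "table",
            "rows": [{"time": t, "predicted": v} for t, v in points]}


# Hypothetical registry: request type -> integration function.
INTEGRATION_FUNCTIONS = {
    "prediction": line_chart_view,
    "data_mining": table_view,
}


def build_view_resource(request_type, analysis_result):
    """Build the payload that would be pushed to the terminal."""
    points = sorted(analysis_result.items())  # ISO dates sort chronologically
    view = INTEGRATION_FUNCTIONS[request_type](points)
    return json.dumps(view)


payload = build_view_resource(
    "prediction", {"2019-11-02": 28.0, "2019-11-01": 30.0})
```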
  • The resource acquisition request includes the request type and request information. The server acquires multiple visualization data according to the resource acquisition request and request information, and the visualization data includes the category identifier.
  • the server further obtains a plurality of time series data from the visualization data according to the category identifier, so that the time series data in the visualization data can be effectively obtained.
  • the server further performs time series feature processing on multiple time series data to obtain feature variables and dimensional feature values corresponding to the multiple time series data; performs feature extraction on the feature variables to extract feature variables that reach the threshold and corresponding dimensional feature values.
  • The visualization data includes a category identifier and a data identifier.
  • The step of extracting the time series distribution data in the visualization data according to the category identifier specifically includes the following:
  • Step 302 Obtain the base table data and the objective function corresponding to the visualization data according to the category identifier and the data identifier.
  • the base table data may be a data table storing physical records corresponding to the visualization data, for example, may be a data relationship mapping table.
  • the category identifier may indicate the type of visualization data
  • the data identifier may indicate the identification code of each visualization data
  • an association mapping table may be pre-established between the data identifier and the base table data.
  • Step 304 Obtain distribution data in the visualization data according to the base table data and the objective function.
  • Step 306 Convert the distribution data into time series distribution data in a preset manner.
  • After receiving the resource acquisition request sent by the terminal, the server acquires a plurality of visualization data according to the request type and request information in the resource acquisition request; the visualization data includes a category identifier and a data identifier. The server further obtains multiple time series distribution data from the visualization data according to the category identifier.
  • the visualization data may be different types of visualization data generated in advance based on the base table data and the preset integration function, and each type of visualization data may correspond to the same target integration function.
  • the server obtains the objective function corresponding to the visualization data according to the category identifier, and the objective function may be the objective integration function for integrating the visualization data.
  • the server obtains the preset association relationship mapping table according to the data identifier, and then obtains the base table data corresponding to the data identifier.
  • the server may then obtain distribution data in the visualization data through the objective function according to the base table data, for example, it may be function distribution data based on the objective function.
  • The server then converts the acquired distribution data into time series distribution data in a preset manner.
  • For example, visualization data such as histograms and distribution density plots can be generated by embedding Python visualization functions.
  • The corresponding Python visualization function and the base table data can then be analyzed to obtain the corresponding time series distribution data, which effectively captures the time series distribution data contained in the visualization data.
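  • Steps 302 to 306 can be sketched as follows, under the assumption that a chart is fully determined by its base table records and by the integration function that aggregated them. The base table layout, the category identifiers `bar_total` and `bar_count`, and the function `extract_time_series` are hypothetical names introduced for illustration only.

```python
from collections import defaultdict

# Hypothetical base table: (date, value) records behind one chart.
BASE_TABLE = [
    ("2019-01-02", 3.0),
    ("2019-01-01", 1.0),
    ("2019-01-01", 2.0),
    ("2019-01-02", 5.0),
]

# Hypothetical registry: category identifier -> objective (integration) function.
OBJECTIVE_FUNCTIONS = {
    "bar_total": sum,
    "bar_count": len,
}


def extract_time_series(category_id, base_table):
    """Re-apply the chart's integration function per time point, in time order."""
    objective = OBJECTIVE_FUNCTIONS[category_id]
    buckets = defaultdict(list)
    for timestamp, value in base_table:
        buckets[timestamp].append(value)
    # ISO dates sort chronologically as plain strings.
    return [(t, objective(buckets[t])) for t in sorted(buckets)]


series = extract_time_series("bar_total", BASE_TABLE)
```

  • The result is a chronologically ordered list of (time, value) pairs, which is the form the later feature-processing steps operate on.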
  • Obtaining the distribution data in the visualization data according to the base table data and the objective function includes: obtaining coordinate matrix data and corresponding parameter weights according to the base table data and the objective function; and obtaining the corresponding distribution data according to the coordinate matrix data and the corresponding parameter weights.
  • After receiving the resource acquisition request sent by the terminal, the server acquires a plurality of visualization data according to the request type and request information in the resource acquisition request; the visualization data includes a category identifier and a data identifier. The server further obtains multiple time series data from the visualization data according to the category identifier.
  • the visualization data includes coordinate matrix data, and the coordinate matrix data may include feature vectors of multiple dimensions and corresponding parameter weights, for example, may include feature vectors corresponding to abscissa and ordinate and corresponding parameter weights.
  • the server further obtains the coordinate matrix data and the corresponding parameter weights according to the base table data and the objective function, and then obtains the corresponding time series distribution data in the visualization data according to the coordinate matrix data and the corresponding parameter weights.
  • the time series distribution data can be based on the time dimension.
  • The time series distribution data can also include time series distribution information in multiple dimensions.
  • The step of performing feature extraction on feature variables corresponding to multiple multi-dimensional time series data according to a preset algorithm includes: performing cluster analysis on the feature variables corresponding to the multiple multi-dimensional time series data to obtain multiple clustering results; combining the feature variables in the multiple clustering results to obtain multiple combined feature variables, and calculating the correlation between the multiple feature variables; and performing feature selection according to the correlation between the multiple feature variables to extract the feature variables that reach the preset threshold and the corresponding dimensional features.
  • After receiving the resource acquisition request sent by the terminal, the server acquires a plurality of visualization data according to the request type and request information in the resource acquisition request; the visualization data includes a category identifier and a data identifier. The server further obtains multiple time series data from the visualization data according to the category identifier.
  • After the server obtains multiple time series data from the visualization data, it performs time series feature processing on them: it converts the multiple time series data into corresponding feature vectors in time order, and converts the feature vectors into multiple feature variables and corresponding dimensional feature values, thereby obtaining the feature variables and dimensional feature values corresponding to the multiple time series data.
  • Specifically, the server can preprocess the feature vectors corresponding to the multiple time series data by means of mean filling, custom filling, and book model filling, and then process the feature vectors through the data mean, variance, and standard deviation to extract the feature variables and dimensional feature values corresponding to the multiple time series data.
  • The server then performs feature extraction on the multiple feature variables corresponding to the multiple time series data, and extracts the feature variables that reach the threshold and the corresponding dimensional feature values. Specifically, after extracting the feature variables and corresponding dimensional feature values, the server uses a preset clustering algorithm to perform cluster analysis on them; for example, the k-means clustering algorithm can be used. The server obtains multiple clustering results by clustering the feature variables and corresponding dimensional feature values multiple times.
  • The server further combines the feature variables in the multiple clustering results to obtain multiple combined feature variables. It then obtains a target variable and uses the target variable to test the correlation of the multiple combined feature variables. When the test passes, an interaction label is added to the combined feature variable, and the labeled combined feature variable is used to analyze the corresponding feature variables.
  • The combined feature variable to which the interaction label has been added may be a feature variable that reaches the preset threshold, and the server extracts the corresponding dimensional feature value that reaches the preset threshold.
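  • The clustering, combination, and correlation test described above might look like the following sketch. It assumes that each feature variable is a numeric vector over time, that features in one cluster are combined by element-wise averaging, and that the correlation test is the absolute Pearson correlation with the target variable compared against a threshold. None of these choices is stated in the source, and all names and sample values are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    spread_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    spread_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return cov / (spread_a * spread_b)


def kmeans_labels(vectors, k, iterations=10):
    """Plain k-means; centroids start at the first k vectors (deterministic)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: math.dist(v, centroids[c]))
                  for v in vectors]
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def select_combined_features(features, target, k, threshold):
    """Cluster features, average each cluster, keep those correlated with target."""
    names = list(features)
    labels = kmeans_labels([features[n] for n in names], k)
    selected = {}
    for c in range(k):
        members = [n for n, label in zip(names, labels) if label == c]
        if not members:
            continue
        combined = [sum(vals) / len(members)
                    for vals in zip(*(features[n] for n in members))]
        r = pearson(combined, target)
        if abs(r) >= threshold:
            # "interaction" stands in for the interaction label in the text.
            selected["+".join(members)] = {
                "values": combined, "interaction": True, "correlation": r}
    return selected


features = {  # hypothetical feature variables, one value per time point
    "temperature": [1.0, 2.0, 3.0, 4.0],
    "noise": [5.0, 1.0, 4.0, 2.0],
    "temperature_b": [1.1, 2.1, 2.9, 4.2],
}
target = [2.0, 4.1, 6.0, 8.1]  # hypothetical target variable
selected = select_combined_features(features, target, k=2, threshold=0.8)
```

  • With this data the two temperature series fall into one cluster and pass the correlation test, while the noise series is dropped.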
  • the server then obtains a preset time series data mining model according to the request type, and analyzes the characteristic variables and corresponding dimensional characteristic values through the time series data mining model to obtain corresponding analysis result data.
  • corresponding view resource data is generated in a preset manner, and the view resource data is pushed to the terminal.
  • Before analyzing the feature variables and corresponding dimensional feature values through the time series data mining model, the method further includes the step of pre-training the time series data mining model: acquiring multiple historical time series data, and using the historical time series data to generate training set data and validation set data; inputting the training set data into a preset neural network model for training to obtain an initial time series data mining model; using the validation set data to further train and verify the initial time series data mining model; and obtaining the trained time series data mining model once the amount of validation set data that meets the condition reaches the preset threshold.
  • Before the server obtains the preset time series data mining model, it also needs to construct and train the model in advance.
  • the server may obtain multiple time series data from a local or third-party database, and the server then performs cluster analysis on the multiple time series data. Specifically, the server performs feature extraction on multiple time series data, and extracts corresponding feature variables and corresponding dimensional feature values. After the server extracts multiple time series feature variables and dimension feature values, it uses a preset clustering algorithm to perform cluster analysis on the feature variables, and the server obtains multiple clustering results by clustering the feature variables multiple times. The server further combines the feature variables in the multiple clustering results to obtain multiple combined feature variables, and calculates the correlation between the multiple combined feature variables. The server extracts feature variables that reach the preset threshold and the corresponding dimensional feature values.
  • After the server extracts the multiple feature variables and corresponding dimensional feature values, it constructs a time series data mining model from them according to a preset algorithm.
  • the time series data mining model can be a model based on a decision tree or a neural network.
  • the server can also obtain a large amount of time series data, and use a large amount of time series data to generate training set data and verification set data.
  • the server inputs a large amount of time series data in the training set into the time series data mining model for training, and obtains a preliminary time series data mining model.
  • the server uses a large amount of time series data in the verification set to further train and verify the preliminary time series data mining model.
  • The trained time series data mining model is thereby obtained. Performing big data analysis on a large amount of time series data, building the time series data mining model from the extracted feature variables and dimensional feature values, and training it in a preset manner effectively produces a time series data mining model with high analysis accuracy.
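  • The train-then-validate loop described above can be illustrated with a deliberately small stand-in: a single linear unit trained by stochastic gradient descent instead of the neural network or decision tree named in the text. Training stops once the share of validation samples predicted within a tolerance reaches a threshold. The synthetic data, the chronological split, the tolerance, the threshold, and the learning rate are all assumptions made for illustration.

```python
def train_until_valid(train, valid, tolerance=0.5, threshold=0.9,
                      learning_rate=0.01, max_epochs=5000):
    """Train y ~ w*x + b; stop when enough validation samples are within tolerance."""
    w, b = 0.0, 0.0
    for epoch in range(1, max_epochs + 1):
        for x, y in train:
            error = (w * x + b) - y
            w -= learning_rate * error * x   # gradient step for the weight
            b -= learning_rate * error       # gradient step for the bias
        hits = sum(abs((w * x + b) - y) <= tolerance for x, y in valid)
        if hits / len(valid) >= threshold:
            return w, b, epoch
    raise RuntimeError("validation threshold not reached")


# Synthetic historical series: value = 2 * t + 1 at time step t.
history = [(float(t), 2.0 * t + 1.0) for t in range(10)]
train, valid = history[:7], history[7:]      # chronological split
w, b, epochs = train_until_valid(train, valid)
```

  • Splitting chronologically, rather than at random, keeps the validation samples strictly later than the training samples, which is a common precaution for time series.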
  • the step of analyzing time series data through the time series data mining model specifically includes the following contents:
  • Step 402 Calculate the weight of the feature variable through the time series data mining model.
  • Step 404 Calculate corresponding predicted values of multiple feature variables corresponding to multiple time series parameters according to the dimensional feature value and weight of the feature variable through the time series data mining model.
  • Step 406 Generate analysis result data corresponding to the request type according to the multiple time sequence parameters and corresponding predicted values.
  • the server further obtains the preset time series data mining model according to the request type, and inputs the extracted feature variables and corresponding dimensional feature values into the time series data mining model, calculates the weight of the feature variables through the time series data mining model, and uses time series data mining
  • the model calculates the corresponding predicted values of multiple feature variables corresponding to multiple time series parameters according to the dimensional feature values and weights of feature variables, and generates analysis result data corresponding to the request type according to the multiple time series parameters and corresponding predicted values.
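Steps 402 to 406 can be illustrated with a minimal Python sketch. The source does not say how the model derives weights or predictions, so the normalized-score weighting, the weighted-sum predictor, and every name used here (`feature_weights`, `predict_parameters`, `build_analysis_result`, the sample feature names) are assumptions made for illustration only, not the patented model.

```python
from typing import Dict, List


def feature_weights(scores: Dict[str, float]) -> Dict[str, float]:
    """Step 402 (assumed form): normalize importance scores so the weights sum to 1."""
    total = sum(abs(score) for score in scores.values())
    if total == 0:
        raise ValueError("at least one feature score must be non-zero")
    return {name: abs(score) / total for name, score in scores.items()}


def predict_parameters(
    feature_values: Dict[str, List[float]],
    weights: Dict[str, float],
) -> List[float]:
    """Step 404 (assumed form): weighted sum of dimensional feature values per time step."""
    steps = len(next(iter(feature_values.values())))
    return [
        sum(weights[name] * values[t] for name, values in feature_values.items())
        for t in range(steps)
    ]


def build_analysis_result(
    request_type: str, time_params: List[str], predictions: List[float]
) -> dict:
    """Step 406: pair each time series parameter with its predicted value."""
    return {
        "request_type": request_type,
        "predictions": dict(zip(time_params, predictions)),
    }


# Hypothetical usage: two feature variables observed over two time steps.
weights = feature_weights({"temperature": 3.0, "humidity": 1.0})
predicted = predict_parameters(
    {"temperature": [0.2, 0.4], "humidity": [0.6, 0.8]}, weights
)
result = build_analysis_result("time_series_prediction", ["day_1", "day_2"], predicted)
```

A real deployment would replace the weighted sum with the trained model's own inference call; the sketch only shows how weights, dimensional feature values, and time series parameters fit together.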
  • After generating the analysis result data, the server can further generate corresponding view resource data from it in a preset manner and push the view resource data to the terminal.
  • Analyzing the time series data with the time series data mining model makes it possible to mine the valuable information in the time series data for further analysis, which improves the efficiency and accuracy of time series data mining.
  • The server can obtain multiple visualization data, such as weather visualization data for a week, a large amount of disease data, public opinion data, and medical data.
  • The server then extracts the time series data in the visualization data, extracts the feature variables and corresponding dimensional feature values in the time series data, and analyzes them through the preset time series data analysis model to determine the correlation between the incidence probability and each feature variable. It thereby obtains the predicted values of multiple time series parameters such as incidence probability, number of cases, and incidence distribution.
  • In this way, time series data can be used to mine and analyze the incidence probability of a certain epidemic within a specific time period, so that the epidemic can be effectively prevented.
  • The analysis result data includes multiple time series parameters and corresponding predicted values.
  • The method further includes: obtaining a preset integration function according to the request type; integrating corresponding view resource data through the integration function according to the multiple time series parameters and corresponding predicted values in the analysis result data;
  • and adding an event type identifier and corresponding interface call parameters to the view resource data, the interface call parameters being used to call the generated view resource data according to the event type identifier.
  • After the server receives the resource acquisition request sent by the terminal (the request includes the request type and request information), it acquires multiple visualization data according to the resource acquisition request and the request information; the visualization data includes the category identifier.
  • The server then obtains multiple time series data from the visualization data according to the category identifier, performs time series feature processing on them to obtain the corresponding feature variables and dimensional feature values, and performs feature extraction on the feature variables to extract those that reach the threshold together with their dimensional feature values.
  • The server then obtains the preset time series data mining model according to the request type, analyzes the feature variables and the corresponding dimensional feature values through the model to obtain the analysis result data, and generates the corresponding view resource data from the analysis result data in the preset manner.
  • Specifically, the server can obtain a preset integration function according to the request type, integrate corresponding view resource data through the integration function according to the multiple time series parameters and corresponding predicted values in the analysis result data, and add the event type identifier and the corresponding interface call parameters to the view resource data.
  • The analysis result data may include the predicted values of multiple time series parameters.
  • For example, it may include parameters such as the incidence probability and incidence distribution along the time dimension, together with their predicted values.
  • The time unit of the time dimension can be every 3 hours, every 12 hours, every day, or every week.
  • The server can integrate the predicted values of the multiple time series parameters into corresponding view data by obtaining a preset time series distribution integration function, such as a Python visualization function.
  • For example, it can embed visualization functions such as histogram, distribution density, and heat map functions, integrate the corresponding view data, and draw the corresponding visual images through nested functions.
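Before predicted values are handed to a histogram or heat map function, they have to be grouped by the chosen time unit. The following standard-library sketch shows one way to do that for the four time units named above. The function name `bucket_predictions`, the `TIME_UNITS` labels, and the choice of averaging within each window are assumptions; the source only names the time units and the kinds of charts.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean
from typing import Dict, List, Tuple

# Time units named in the text; the dictionary keys are hypothetical labels.
TIME_UNITS = {
    "3h": timedelta(hours=3),
    "12h": timedelta(hours=12),
    "day": timedelta(days=1),
    "week": timedelta(weeks=1),
}


def bucket_predictions(
    points: List[Tuple[datetime, float]], unit: str, origin: datetime
) -> Dict[datetime, float]:
    """Average predicted values inside consecutive windows of the chosen time unit."""
    width = TIME_UNITS[unit]
    buckets: Dict[datetime, List[float]] = defaultdict(list)
    for timestamp, value in points:
        index = (timestamp - origin) // width  # timedelta // timedelta yields an int
        buckets[origin + index * width].append(value)
    return {start: mean(values) for start, values in sorted(buckets.items())}


# Hypothetical usage: three predicted incidence probabilities on one morning.
origin = datetime(2019, 3, 7)
points = [
    (origin + timedelta(hours=1), 0.2),
    (origin + timedelta(hours=2), 0.4),
    (origin + timedelta(hours=4), 0.9),
]
by_three_hours = bucket_predictions(points, "3h", origin)
```

The resulting mapping from window start to aggregated value is the shape most plotting libraries expect for a bar chart or for one row of a heat map.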
  • After the server integrates the corresponding view resource data through the integration function according to the multiple time series parameters and corresponding predicted values in the analysis result data, it adds the event type identifier and the corresponding interface call parameters to the view resource data and integrates them into a corresponding class for storage.
  • This makes it convenient for the server or terminal to call the generated view resource data: when the server or terminal obtains the associated time series data or view data again, it can directly call the already mined and analyzed view resource data according to the event type identifier and the corresponding interface call parameters, thereby improving the analysis efficiency and utilization value of the time series data.
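The storage-and-reuse idea described above can be sketched as a small in-memory store keyed by the event type identifier and the interface call parameters. The class name `ViewResourceStore`, its methods, and the sample identifiers are hypothetical; the source does not describe the storage class in any detail.

```python
from typing import Any, Dict, Optional, Tuple

Key = Tuple[str, Tuple[Tuple[str, Any], ...]]


class ViewResourceStore:
    """Keeps generated view resource data so that it can be called again later."""

    def __init__(self) -> None:
        self._items: Dict[Key, dict] = {}

    @staticmethod
    def _key(event_type_id: str, call_params: Dict[str, Any]) -> Key:
        # Sort by parameter name so the key does not depend on insertion order.
        # Parameter values must be hashable for this to work.
        return event_type_id, tuple(sorted(call_params.items()))

    def save(self, event_type_id: str, call_params: Dict[str, Any], view_data: dict) -> None:
        self._items[self._key(event_type_id, call_params)] = view_data

    def load(self, event_type_id: str, call_params: Dict[str, Any]) -> Optional[dict]:
        """Return previously generated view data, or None if nothing matches."""
        return self._items.get(self._key(event_type_id, call_params))


# Hypothetical usage: store a view once, then fetch it with the same identifiers.
store = ViewResourceStore()
store.save("flu_incidence", {"unit": "day", "region": "A"}, {"chart": "heat_map"})
cached = store.load("flu_incidence", {"region": "A", "unit": "day"})
```

A lookup that returns `None` would signal that the full mining pipeline has to run again for that combination of identifiers.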
  • After generating the corresponding view resource data, the server pushes it to the terminal.
  • Analyzing the time series data with the time series data mining model makes it possible to mine the valuable information in the time series data for further analysis, which improves both the mining efficiency and the analysis efficiency for time series data.
  • A data processing device based on time series data includes a request receiving module 502, a data acquisition module 504, a feature processing module 506, a data mining module 508, and a view generation module 510, wherein:
  • The request receiving module 502 is configured to receive a resource acquisition request sent by the terminal, the resource acquisition request including the request type and request information.
  • The data acquisition module 504 is configured to acquire multiple visualization data according to the resource acquisition request and request information, the visualization data including a category identifier, and to extract the time series distribution data in the visualization data according to the category identifier.
  • The feature processing module 506 is configured to perform time series feature processing on the time series distribution data to obtain the corresponding feature variables and dimensional feature values, and to perform feature extraction on the feature variables to extract those that reach the threshold together with their dimensional feature values.
  • The data mining module 508 is configured to obtain a preset time series data mining model according to the request type, and to analyze the feature variables and corresponding dimensional feature values through the model to obtain the analysis result data.
  • The view generation module 510 is configured to generate corresponding view resource data in a preset manner according to the analysis result data, and to push the view resource data to the terminal.
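The division of labor between modules 502 to 510 can be pictured as a chain of callables. The sketch below is only a structural illustration: the class name `TimeSeriesProcessingDevice`, the callable signatures, and the dictionary keys `request_type` and `request_info` are assumptions, and each stage is a placeholder supplied by the caller.

```python
from typing import Any, Callable, Dict, List

Request = Dict[str, Any]
Features = Dict[str, List[float]]


class TimeSeriesProcessingDevice:
    """Chains stand-ins for modules 504 to 510; handle() plays the role of module 502."""

    def __init__(
        self,
        acquire: Callable[[Request], List[dict]],
        process: Callable[[List[dict]], Features],
        models: Dict[str, Callable[[Features], dict]],
        render: Callable[[dict], dict],
    ) -> None:
        self.acquire = acquire
        self.process = process
        self.models = models
        self.render = render

    def handle(self, request: Request) -> dict:
        series = self.acquire(request)                # data acquisition module 504
        features = self.process(series)               # feature processing module 506
        model = self.models[request["request_type"]]  # model chosen by request type (508)
        result = model(features)
        return self.render(result)                    # view generation module 510


# Hypothetical usage with trivial placeholder stages.
device = TimeSeriesProcessingDevice(
    acquire=lambda req: [{"category": "time_series", "values": [1.0, 3.0]}],
    process=lambda series: {"mean": [sum(s["values"]) / len(s["values"]) for s in series]},
    models={"prediction": lambda feats: {"predicted": feats["mean"][0] * 2}},
    render=lambda result: {"view": "table", "rows": [result]},
)
view = device.handle({"request_type": "prediction", "request_info": {}})
```

Keeping the model lookup keyed by request type mirrors the text's statement that the preset model is obtained according to the request type.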
  • The visualization data includes a category identifier and a data identifier.
  • The data acquisition module 504 is further configured to acquire the base table data and the objective function corresponding to the visualization data according to the category identifier and the data identifier; obtain distribution data in the visualization data according to the base table data and the objective function; and convert the distribution data into time series distribution data in a preset manner.
  • The data acquisition module 504 is further configured to obtain coordinate matrix data and corresponding parameter weights according to the base table data and the objective function, and to obtain the corresponding time series distribution data according to the coordinate matrix data and corresponding parameter weights.
  • The feature processing module 506 is further configured to perform cluster analysis on the feature variables corresponding to multiple time series data to obtain multiple clustering results; calculate the correlation between the feature variables based on the clustering results; and perform feature selection based on that correlation, extracting the feature variables that reach a preset threshold together with their dimensional feature values.
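The threshold-based selection performed by the feature processing module can be sketched as follows. The source does not name a correlation measure or a clustering algorithm, so this example assumes Pearson correlation against a reference series and omits the clustering step entirely; `pearson`, `select_features`, and the sample data are hypothetical.

```python
from math import sqrt
from typing import Dict, List


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant series carries no linear relationship
    return cov / (spread_x * spread_y)


def select_features(
    features: Dict[str, List[float]], reference: List[float], threshold: float
) -> Dict[str, List[float]]:
    """Keep the feature variables whose absolute correlation reaches the threshold."""
    return {
        name: values
        for name, values in features.items()
        if abs(pearson(values, reference)) >= threshold
    }


# Hypothetical usage: only "temperature" tracks the reference series closely.
selected = select_features(
    {
        "temperature": [1.0, 2.0, 3.0, 4.0],
        "noise": [1.0, -1.0, 1.0, -1.0],
        "flat": [5.0, 5.0, 5.0, 5.0],
    },
    reference=[2.0, 4.0, 6.0, 8.0],
    threshold=0.8,
)
```

In a fuller version, the reference could be a cluster centroid from the preceding cluster analysis, which would tie the selection to the clustering results as the text describes.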
  • The device further includes a model training module configured to obtain multiple historical time series data and use them to generate training set data and validation set data; input the training set data into the preset neural network model for training to obtain the initial time series data mining model; use the validation set data to further train and verify the initial model; and, once the amount of validation set data that meets the conditions reaches the preset threshold, obtain the trained time series data mining model.
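The train-then-validate loop described for the model training module can be illustrated with a deliberately small stand-in. The source specifies a preset neural network model without giving its architecture, so this sketch assumes a single logistic unit trained by stochastic gradient descent, a fixed hold-out split, and validation accuracy as the stopping condition; all names and hyperparameters are hypothetical.

```python
import math
from typing import List, Tuple

Sample = Tuple[List[float], int]  # (feature values, binary label)


def split(samples: List[Sample], every: int = 5) -> Tuple[List[Sample], List[Sample]]:
    """Hold out every `every`-th sample for validation (deterministic, assumed scheme)."""
    train = [s for i, s in enumerate(samples) if i % every != every - 1]
    val = [s for i, s in enumerate(samples) if i % every == every - 1]
    return train, val


def predict(weights: List[float], x: List[float]) -> float:
    """Logistic unit: weights[0] is the bias, the rest multiply the features."""
    z = weights[0] + sum(w * v for w, v in zip(weights[1:], x))
    return 1.0 / (1.0 + math.exp(-z))


def accuracy(weights: List[float], data: List[Sample]) -> float:
    hits = sum((predict(weights, x) >= 0.5) == bool(y) for x, y in data)
    return hits / len(data)


def train(
    train_set: List[Sample],
    val_set: List[Sample],
    threshold: float = 0.9,
    lr: float = 0.5,
    max_epochs: int = 500,
) -> List[float]:
    """Train until validation accuracy reaches the threshold or epochs run out."""
    weights = [0.0] * (len(train_set[0][0]) + 1)
    for _ in range(max_epochs):
        for x, y in train_set:
            error = predict(weights, x) - y
            weights[0] -= lr * error
            for i, v in enumerate(x):
                weights[i + 1] -= lr * error * v
        if accuracy(weights, val_set) >= threshold:
            break
    return weights
```

A production model would use a proper neural network library and historical time series features; the point here is only the control flow of training on one set while checking a threshold on the other.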
  • The data mining module 508 is further configured to calculate the weights of the feature variables through the time series data mining model; calculate, through the model, the predicted values of the multiple time series parameters corresponding to the feature variables according to the dimensional feature values and weights of the feature variables; and generate analysis result data corresponding to the request type according to the multiple time series parameters and their predicted values.
  • The analysis result data includes multiple time series parameters and corresponding predicted values.
  • The view generation module 510 is further configured to obtain a preset integration function according to the request type; integrate the corresponding view resource data through the integration function according to the multiple time series parameters and corresponding predicted values in the analysis result data;
  • and add the event type identifier and the corresponding interface call parameters to the view resource data, the interface call parameters being used to call the generated view resource data according to the event type identifier.
  • Each module in the above data processing device based on time series data can be implemented in whole or in part by software, hardware, or a combination thereof.
  • The foregoing modules may be embedded in, or independent of, the processor of the computer device in the form of hardware, or may be stored in the memory of the computer device in the form of software, so that the processor can call and execute the operations corresponding to each module.
  • A computer device is provided.
  • The computer device may be a server, and its internal structure may be as shown in FIG. 6.
  • The computer device includes a processor, a memory, a network interface, and a database connected through a system bus. The processor of the computer device provides calculation and control capabilities.
  • The memory of the computer device includes a non-volatile storage medium and an internal memory.
  • The non-volatile storage medium stores an operating system, computer-readable instructions, and a database.
  • The internal memory provides an environment for running the operating system and the computer-readable instructions stored in the non-volatile storage medium.
  • The database of the computer device is used to store visualization data, time series data, and view resource data.
  • The network interface of the computer device is used to communicate with an external terminal through a network connection.
  • The computer-readable instructions, when executed by the processor, implement a data processing method based on time series data.
  • FIG. 6 is only a block diagram of the part of the structure related to the solution of the present application, and does not limit the computer device to which the solution is applied.
  • A specific computer device may include more or fewer parts than shown in the figure, combine some parts, or have a different arrangement of parts.
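  • A computer device includes a memory and one or more processors.
  • The memory stores computer-readable instructions.
  • When the computer-readable instructions are executed by the processor, the one or more processors execute the following steps:
  • receiving a resource acquisition request sent by the terminal, the resource acquisition request including the request type and request information;
  • acquiring multiple visualization data according to the resource acquisition request and request information, the visualization data including a category identifier; extracting the time series distribution data in the visualization data according to the category identifier;
  • performing time series feature processing on the time series distribution data to obtain the corresponding feature variables and dimensional feature values; performing feature extraction on the feature variables to extract those that reach the threshold together with their dimensional feature values;
  • obtaining a preset time series data mining model according to the request type, and analyzing the feature variables and corresponding dimensional feature values through the model to obtain analysis result data; and
  • generating corresponding view resource data in a preset manner according to the analysis result data, and pushing the view resource data to the terminal.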
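  • One or more non-volatile computer-readable storage media storing computer-readable instructions are provided.
  • When the computer-readable instructions are executed by one or more processors, the one or more processors execute the following steps:
  • receiving a resource acquisition request sent by the terminal, the resource acquisition request including the request type and request information;
  • acquiring multiple visualization data according to the resource acquisition request and request information, the visualization data including a category identifier; extracting the time series distribution data in the visualization data according to the category identifier;
  • performing time series feature processing on the time series distribution data to obtain the corresponding feature variables and dimensional feature values; performing feature extraction on the feature variables to extract those that reach the threshold together with their dimensional feature values;
  • obtaining a preset time series data mining model according to the request type, and analyzing the feature variables and corresponding dimensional feature values through the model to obtain analysis result data; and
  • generating corresponding view resource data in a preset manner according to the analysis result data, and pushing the view resource data to the terminal.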
  • Non-volatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link (Synchlink) DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Fuzzy Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A data processing method based on time sequence data, comprising: receiving a resource acquisition request sent by a terminal, wherein the resource acquisition request comprises a request type and request information; acquiring a plurality of pieces of visual data according to the resource acquisition request and the request information, wherein the visual data comprises a category identifier; extracting time sequence distribution data in the visual data according to the category identifier; performing time sequence feature processing on the time sequence distribution data to obtain a feature variable and a dimension feature value corresponding to the time sequence distribution data; performing feature extraction on the feature variable to extract a feature variable reaching a threshold value and a corresponding dimension feature value; acquiring a preset time sequence data mining model according to the request type, and analyzing the feature variable and the corresponding dimension feature value by means of the time sequence data mining model to obtain analysis result data; and according to the analysis result data, generating corresponding view resource data according to a preset mode, and pushing the view resource data to the terminal.

Description

Data processing method, device, and computer device based on time series data
Cross-reference to related applications:
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on March 7, 2019, with application number 2019101719236 and titled "Data processing method, device, and computer device based on time series data", the entire contents of which are incorporated herein by reference.
Technical field
This application relates to a data processing method, device, and computer device based on time series data.
Background
With the rapid development of computer technology, data mining has become increasingly important. A time series is an important high-dimensional data type: a sequence formed by arranging the sampled values of some physical quantity of an objective object at different points in time in chronological order. It is widely used in economic management and engineering. Time series data mining can uncover useful time-related information contained in the data and thereby extract knowledge.
However, most current approaches to mining time series data process the data along a single dimension only, whereas time series data is inherently high-dimensional, complex, dynamic, noisy, and prone to growing very large. At the visualization level, rendering images that evolve over time is already widespread, but visualizations based on time series data are usually confined to line charts or scatter plots with time as the axis. This limits the information that can be captured and leads to low efficiency and accuracy in mining time series data.
Summary of the invention
According to various embodiments disclosed in this application, a data processing method, device, and computer device based on time series data are provided.
A data processing method based on time series data includes:
receiving a resource acquisition request sent by a terminal, the resource acquisition request including a request type and request information;
acquiring multiple visualization data according to the resource acquisition request and the request information, the visualization data including a category identifier;
extracting the time series distribution data in the visualization data according to the category identifier;
performing time series feature processing on the time series distribution data to obtain the feature variables and dimensional feature values corresponding to the time series distribution data;
performing feature extraction on the feature variables, and extracting the feature variables that reach a threshold and the corresponding dimensional feature values;
obtaining a preset time series data mining model according to the request type, and analyzing the feature variables and corresponding dimensional feature values through the time series data mining model to obtain analysis result data; and
generating corresponding view resource data in a preset manner according to the analysis result data, and pushing the view resource data to the terminal.
A data processing device based on time series data includes:
a request receiving module, configured to receive a resource acquisition request sent by a terminal, the resource acquisition request including a request type and request information;
a data acquisition module, configured to acquire multiple visualization data according to the resource acquisition request and the request information, the visualization data including a category identifier, and to extract the time series distribution data in the visualization data according to the category identifier;
a feature processing module, configured to perform time series feature processing on the time series distribution data to obtain the corresponding feature variables and dimensional feature values, and to perform feature extraction on the feature variables to extract the feature variables that reach a threshold and the corresponding dimensional feature values;
a data mining module, configured to obtain a preset time series data mining model according to the request type, and to analyze the feature variables and corresponding dimensional feature values through the time series data mining model to obtain analysis result data; and
a view generation module, configured to generate corresponding view resource data in a preset manner according to the analysis result data, and to push the view resource data to the terminal.
A computer device includes a memory and one or more processors. The memory stores computer-readable instructions which, when executed by the processor, cause the one or more processors to execute the following steps:
receiving a resource acquisition request sent by a terminal, the resource acquisition request including a request type and request information;
acquiring multiple visualization data according to the resource acquisition request and the request information, the visualization data including a category identifier;
extracting the time series distribution data in the visualization data according to the category identifier;
performing time series feature processing on the time series distribution data to obtain the feature variables and dimensional feature values corresponding to the time series distribution data;
performing feature extraction on the feature variables, and extracting the feature variables that reach a threshold and the corresponding dimensional feature values;
obtaining a preset time series data mining model according to the request type, and analyzing the feature variables and corresponding dimensional feature values through the time series data mining model to obtain analysis result data; and
generating corresponding view resource data in a preset manner according to the analysis result data, and pushing the view resource data to the terminal.
One or more non-volatile computer-readable storage media store computer-readable instructions which, when executed by one or more processors, cause the one or more processors to execute the following steps:
receiving a resource acquisition request sent by a terminal, the resource acquisition request including a request type and request information;
acquiring multiple visualization data according to the resource acquisition request and the request information, the visualization data including a category identifier;
extracting the time series distribution data in the visualization data according to the category identifier;
performing time series feature processing on the time series distribution data to obtain the feature variables and dimensional feature values corresponding to the time series distribution data;
performing feature extraction on the feature variables, and extracting the feature variables that reach a threshold and the corresponding dimensional feature values;
obtaining a preset time series data mining model according to the request type, and analyzing the feature variables and corresponding dimensional feature values through the time series data mining model to obtain analysis result data; and
generating corresponding view resource data in a preset manner according to the analysis result data, and pushing the view resource data to the terminal.
The details of one or more embodiments of this application are set forth in the drawings and description below. Other features and advantages of this application will become apparent from the description, the drawings, and the claims.
Description of the drawings
To describe the technical solutions in the embodiments of this application more clearly, the drawings used in the embodiments are briefly introduced below. The drawings described below are only some embodiments of this application; those of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is an application scenario diagram of a data processing method based on time series data according to one or more embodiments.
FIG. 2 is a schematic flowchart of a data processing method based on time series data according to one or more embodiments.
FIG. 3 is a schematic flowchart of the step of acquiring time series data according to one or more embodiments.
FIG. 4 is a schematic flowchart of the step of analyzing time series data through a time series data mining model according to one or more embodiments.
FIG. 5 is a block diagram of a data processing device based on time series data according to one or more embodiments.
FIG. 6 is a block diagram of a computer device according to one or more embodiments.
Detailed description
To make the technical solutions and advantages of this application clearer, the application is described in further detail below with reference to the drawings and embodiments. The specific embodiments described here are intended only to explain this application, not to limit it.
The data processing method based on time series data provided in this application can be applied in the application environment shown in FIG. 1, in which the terminal 102 communicates with the server 104 through a network. The terminal 102 may be, but is not limited to, a personal computer, notebook computer, smartphone, tablet computer, or portable wearable device. The server 104 may be implemented as an independent server or as a server cluster composed of multiple servers. A user can send a resource acquisition request to the server 104 through the corresponding terminal 102; the resource acquisition request includes a request type and request information. After receiving the resource acquisition request sent by the terminal 102, the server 104 acquires multiple visualization data according to the resource acquisition request and the request information; the visualization data includes a category identifier. The server 104 then extracts the time series distribution data in the visualization data according to the category identifier. The server 104 further performs time series feature processing on the multiple time series data to obtain the corresponding feature variables and dimensional feature values, and performs feature extraction on the feature variables to extract those that reach the threshold together with their dimensional feature values. It obtains a preset time series data mining model according to the request type and analyzes the feature variables and corresponding dimensional feature values through the model to obtain analysis result data. The server 104 then generates corresponding view resource data in a preset manner according to the analysis result data and pushes the view resource data to the terminal 102. By extracting the time series data from the visualization data and analyzing it through the time series data mining model, valuable information in the time series data can be mined for further analysis, which improves the efficiency and accuracy of time series data mining.
在其中一个实施例中,如图2所示,提供了一种给予时序数据的数据处理方法,以该方法应用于图1中的服务器为例进行说明,包括以下步骤:In one of the embodiments, as shown in FIG. 2, a data processing method for giving time series data is provided. Taking the method applied to the server in FIG. 1 as an example for description, the method includes the following steps:
Step 202: Receive a resource acquisition request sent by a terminal, where the resource acquisition request includes a request type and request information.
The user can send a resource acquisition request to the server through the corresponding terminal, and the request includes the request type. The request type may be the type of target data resource to be obtained, for example time series data prediction or data mining. The request information may be parameter information input by the user, such as a time dimension parameter.
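As an illustration only, the request described in step 202 could be modeled as a small record with a request type and a dictionary of request information. The class name `ResourceAcquisitionRequest`, the helper `parse_request`, and the field values are hypothetical and are not taken from this application.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceAcquisitionRequest:
    """Hypothetical shape of the request a terminal sends to the server."""
    request_type: str                                  # e.g. "prediction" or "data_mining"
    request_info: dict = field(default_factory=dict)   # user-supplied parameters


def parse_request(payload: dict) -> ResourceAcquisitionRequest:
    """Validate a raw payload and build the request object."""
    if "request_type" not in payload:
        raise ValueError("resource acquisition request must carry a request type")
    return ResourceAcquisitionRequest(
        request_type=payload["request_type"],
        request_info=dict(payload.get("request_info", {})),
    )


request = parse_request(
    {"request_type": "prediction", "request_info": {"time_dimension": "week"}}
)
```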
Step 204: Obtain multiple pieces of visualization data according to the resource acquisition request and the request information, where the visualization data includes a category identifier.
The visualization data may be data in which time series data has been integrated through a specific integration function and presented in visual form. Each piece of visualization data includes a category identifier, which may indicate the type of visualization graph the data corresponds to. For example, the visualization data may include weather data, public opinion data, and medical data, and may take the form of various view data and tabular data, such as statistical graphs, distribution graphs, heat maps, and scatter plots.
The server can obtain the multiple pieces of visualization data from a local database according to the resource acquisition request and the request information, or it can obtain them from a third-party database.
Step 206: Extract the time series distribution data from the visualization data according to the category identifier.
Time series data refers to data for the same unified indicator recorded in chronological order. Time series data may include dimensional features corresponding to multiple dimensions. After obtaining the multiple pieces of visualization data according to the resource acquisition request, the server obtains the base table data and the objective function corresponding to the visualization data according to the category identifier. The base table data may represent the basic data required to generate the visualization data, and the objective function may be the integration function required to integrate the visualization data. The server then obtains the distribution data in the visualization data according to the base table data and the objective function, and converts the distribution data into corresponding time series distribution data in a preset manner.
Step 208: Perform time series feature processing on the time series distribution data to obtain the feature variables and dimensional feature values corresponding to the time series distribution data.
Step 210: Perform feature extraction on the feature variables to extract the feature variables that reach the threshold and their corresponding dimensional feature values.
After obtaining the multiple time series data from the visualization data, the server performs time series feature processing on them: it converts the time series data into corresponding feature vectors in time order, and converts the feature vectors into multiple feature variables and corresponding dimensional feature values, where a dimensional feature value may indicate the feature dimension to which a feature variable belongs. The feature variables and dimensional feature values corresponding to the multiple time series data are thereby obtained. For example, the server can preprocess the feature vectors by mean filling, custom filling, tree model filling, and similar methods, and can perform time series graph feature processing on the time series data using the mean, variance, standard deviation, and similar statistics to extract the corresponding feature variables and dimensional feature values.
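A minimal sketch of the preprocessing named above, assuming missing points are represented as `None` and that mean filling is followed by the mean, variance, and standard deviation as feature variables. The function names `mean_fill` and `summary_features` are illustrative assumptions, and population statistics are an arbitrary choice made for the sketch.

```python
from statistics import fmean, pstdev, pvariance


def mean_fill(series):
    """Replace missing points (None) with the mean of the observed points."""
    observed = [v for v in series if v is not None]
    if not observed:
        raise ValueError("series has no observed values")
    fill = fmean(observed)
    return [fill if v is None else float(v) for v in series]


def summary_features(series):
    """Return feature variables keyed by the statistic they describe."""
    filled = mean_fill(series)
    return {
        "mean": fmean(filled),
        "variance": pvariance(filled),   # population variance
        "std": pstdev(filled),           # population standard deviation
    }


features = summary_features([2.0, None, 4.0, 6.0])
```

Custom filling or tree model filling would replace `mean_fill` with a different imputation rule; the rest of the sketch would stay the same.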
The server then performs feature extraction on the multiple feature variables corresponding to the time series data and extracts the feature variables that reach the threshold together with their corresponding dimensional feature values. After performing time series feature processing on the feature vectors, the server applies a preset feature dimensionality reduction algorithm to the time series feature vectors and extracts the feature variables that reach the threshold. For example, algorithms such as singular value decomposition and principal component analysis can be used to reduce the dimensionality of the feature variables as a whole, so that features can be extracted from the time series data effectively and the valuable feature variables and corresponding dimensional feature values obtained.
Step 212: Obtain a preset time series data mining model according to the request type, and analyze the feature variables and corresponding dimensional feature values through the time series data mining model to obtain analysis result data.
After extracting the feature variables that reach the threshold and their corresponding dimensional feature values, the server obtains the preset time series data mining model, inputs the extracted feature variables and dimensional feature values into the model, and analyzes the time series data through the model. The server can calculate the weights of the feature variables through the time series data mining model, and the model then calculates, from the dimensional feature values and weights of the feature variables, the predicted values of multiple time series parameters. The server generates analysis result data corresponding to the request type from the time series parameters and their predicted values.
Step 214: Generate corresponding view resource data from the analysis result data in a preset manner, and push the view resource data to the terminal.
After generating the analysis result data, the server can further generate corresponding view resource data from it in a preset manner. Specifically, the server can obtain the corresponding integration function according to the request type in the resource acquisition request, integrate the time series parameters and predicted values in the analysis result data into view resource data through the integration function, and then push the view resource data to the terminal. By extracting the time series data from the visualization data, performing feature extraction on it, and analyzing it with the time series data mining model, valuable information in the time series data can be mined for further analysis, which improves the efficiency and accuracy of time series data mining.
In the above data processing method based on time series data, the server receives the resource acquisition request sent by the terminal, where the request includes the request type and request information, and acquires multiple pieces of visualization data according to the request and the request information, where the visualization data includes a category identifier. The server then obtains multiple time series data from the visualization data according to the category identifier, so that the time series data in the visualization data can be obtained effectively. The server further performs time series feature processing on the time series data to obtain the corresponding feature variables and dimensional feature values, and performs feature extraction on the feature variables to extract those that reach the threshold and their corresponding dimensional feature values. The server obtains the preset time series data mining model according to the request type, analyzes the feature variables and dimensional feature values through the model to obtain the corresponding analysis result data, generates corresponding view resource data from the analysis result data in a preset manner, and pushes the view resource data to the terminal. By extracting the time series data from the visualization data, performing feature extraction on it, and analyzing it with the time series data mining model, valuable information in the time series data can be mined for further analysis, which improves the efficiency and accuracy of time series data mining.
In one of the embodiments, the visualization data includes a category identifier and a data identifier. As shown in FIG. 3, the step of extracting the time series distribution data from the visualization data according to the category identifier specifically includes the following:
Step 302: Obtain the base table data and the objective function corresponding to the visualization data according to the category identifier and the data identifier.
The base table data may be a data table that stores the physical records corresponding to the visualization data, for example a data relationship mapping table. The category identifier may indicate the type of the visualization data, and the data identifier may be the identification code of each piece of visualization data. An association mapping table may be established in advance between data identifiers and base table data.
Step 304: Obtain the distribution data in the visualization data according to the base table data and the objective function.
Step 306: Convert the distribution data into time series distribution data in a preset manner.
After receiving the resource acquisition request sent by the terminal, the server acquires multiple pieces of visualization data according to the request type and request information in the request, where the visualization data includes a category identifier and a data identifier. The server then obtains multiple time series distribution data from the visualization data according to the category identifier. The visualization data may be visualization data of different categories generated in advance from the base table data and preset integration functions, and visualization data of a given type may correspond to the same target integration function.
Specifically, the server obtains the objective function corresponding to the visualization data according to the category identifier; the objective function may be the target integration function used to integrate the visualization data. The server also obtains the preset association mapping table according to the data identifier, and from it the base table data corresponding to the data identifier. The server can then obtain the distribution data in the visualization data from the base table data through the objective function, for example function distribution data based on the objective function, and converts the obtained distribution data into time series distribution data in a preset manner. For example, visualization data such as histograms and distribution density plots can be generated by embedding Python visualization functions. When extracting time series data, the corresponding Python visualization function and the base table data can then be parsed to obtain the corresponding time series distribution data, so that the time series distribution data contained in the visualization data can be captured effectively.
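Steps 302 to 306 can be sketched as two lookups followed by a conversion, under stated assumptions: the tables `BASE_TABLES` and `TARGET_FUNCTIONS`, the identifiers `"vis-001"` and `"histogram"`, and the per-day counting function are all hypothetical stand-ins, and the "preset manner" of conversion is assumed here to be sorting into (time, value) pairs.

```python
from collections import Counter

# Hypothetical association mapping: data identifier -> base table records.
BASE_TABLES = {
    "vis-001": [
        {"day": "2019-01-01", "temp": 3},
        {"day": "2019-01-01", "temp": 5},
        {"day": "2019-01-02", "temp": 8},
    ],
}


def histogram_by_day(rows):
    """Stand-in integration function: count records per day."""
    return Counter(row["day"] for row in rows)


# Hypothetical mapping: category identifier -> target integration function.
TARGET_FUNCTIONS = {"histogram": histogram_by_day}


def extract_time_series_distribution(category_id, data_id):
    rows = BASE_TABLES[data_id]               # step 302: base table by data identifier
    target = TARGET_FUNCTIONS[category_id]    # step 302: function by category identifier
    distribution = target(rows)               # step 304: distribution data
    # Step 306: convert to (time, value) pairs in chronological order.
    return sorted(distribution.items())


series = extract_time_series_distribution("histogram", "vis-001")
```

Because the ISO-formatted dates sort lexicographically in time order, a plain `sorted` call is enough for this toy data.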
In one of the embodiments, obtaining the distribution data in the visualization data according to the base table data and the objective function includes: obtaining coordinate matrix data and corresponding parameter weights according to the base table data and the objective function; and obtaining the corresponding distribution data according to the coordinate matrix data and the corresponding parameter weights.
After receiving the resource acquisition request sent by the terminal, the server acquires multiple pieces of visualization data according to the request type and request information in the request, where the visualization data includes a category identifier and a data identifier. The server then obtains multiple time series data from the visualization data according to the category identifier. The visualization data may be visualization data of different categories generated in advance from the base table data and preset integration functions, and visualization data of a given type may correspond to the same target integration function.
Specifically, the server obtains the objective function corresponding to the visualization data according to the category identifier; the objective function may be the target integration function used to integrate the visualization data. The server also obtains the preset association mapping table according to the data identifier, and from it the base table data corresponding to the data identifier. The visualization data includes coordinate matrix data, which may include feature vectors of multiple dimensions and their corresponding parameter weights, for example the feature vectors corresponding to the abscissa and the ordinate and their corresponding parameter weights.
The server further obtains the coordinate matrix data and the corresponding parameter weights according to the base table data and the objective function, and then obtains the corresponding distribution data in the visualization data according to the coordinate matrix data and the parameter weights. The resulting time series distribution data may be time series distribution information based on the time dimension, and may also include time series distribution information for multiple dimensions. After obtaining the distribution data from the visualization data, the server converts it into time series distribution data in a preset manner. The base table data and the objective function make it possible to obtain the coordinate matrix data of the visualization data effectively, and thus to capture the time series distribution data in the visualization data.
In one of the embodiments, the step of performing feature extraction on the feature variables corresponding to multiple multi-dimensional time series data according to a preset algorithm includes: performing cluster analysis on the feature variables corresponding to the multi-dimensional time series data to obtain multiple clustering results; combining the feature variables within each clustering result to obtain multiple combined feature variables, and calculating the correlation between the multiple feature variables; and performing feature selection according to the correlation between the multiple feature variables to extract the feature variables that reach a preset threshold and their corresponding dimensional features.
After receiving the resource acquisition request sent by the terminal, the server acquires multiple pieces of visualization data according to the request type and request information in the request, where the visualization data includes a category identifier and a data identifier. The server then obtains multiple time series data from the visualization data according to the category identifier.
After obtaining the multiple time series data from the visualization data, the server performs time series feature processing on them: it converts the time series data into corresponding feature vectors in time order, and converts the feature vectors into multiple feature variables and corresponding dimensional feature values, thereby obtaining the feature variables and dimensional feature values corresponding to the time series data. For example, the server can preprocess the feature vectors by mean filling, custom filling, tree model filling, and similar methods, and can perform time series graph feature processing on the time series data using the mean, variance, standard deviation, and similar statistics to extract the corresponding feature variables and dimensional feature values.
The server then performs feature extraction on the multiple feature variables corresponding to the time series data and extracts the feature variables that reach the threshold together with their corresponding dimensional feature values. Specifically, after extracting the feature variables and corresponding dimensional feature values, the server applies a preset clustering algorithm to perform cluster analysis on them; for example, k-means clustering can be used. The server obtains multiple clustering results after clustering the feature variables and corresponding dimensional feature values multiple times.
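For illustration, a plain k-means pass over feature variables might look like the following sketch. It assumes each feature variable is summarized as a tuple of floats and, to stay deterministic, seeds the centroids with the first `k` points; both choices are assumptions made for the example and are not specified by this application.

```python
def kmeans(points, k, iterations=20):
    """Plain k-means on tuples of floats; the first k points seed the centroids."""
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids


# Each point is the (mean, std) of one hypothetical feature variable.
points = [(1.0, 0.1), (9.0, 0.9), (1.2, 0.2), (8.8, 1.0)]
labels, centroids = kmeans(points, k=2)
```

Running the clustering several times with different seeds or values of `k` would yield the multiple clustering results the embodiment refers to.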
The server further combines the feature variables within each clustering result to obtain multiple combined feature variables. It obtains a target variable and uses it to run a correlation test on the combined feature variables. When the test passes, an interaction label is added to the combined feature variable, and the labeled combined feature variable is used to resolve the corresponding feature variables. A combined feature variable with an interaction label may be a feature variable that reaches the preset threshold, and the server extracts the dimensional feature values corresponding to the feature variables that reach the preset threshold.
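The combination and correlation test can be sketched as follows, assuming that feature variables are combined pairwise by element-wise product and that the test is an absolute Pearson correlation against the target variable compared with a threshold. The product rule, the threshold of 0.8, and all names are assumptions for the sketch; the application does not fix them.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def select_combined_features(cluster, target, threshold=0.8):
    """Combine feature variables pairwise and keep those correlated with the target."""
    selected = {}
    for (name_a, a), (name_b, b) in combinations(cluster.items(), 2):
        combined = [x * y for x, y in zip(a, b)]   # product as the combination rule
        r = pearson(combined, target)
        if abs(r) >= threshold:                    # correlation test passed
            selected[f"{name_a}*{name_b}"] = {"interaction": True, "r": r}
    return selected


# Hypothetical feature variables belonging to one clustering result.
cluster = {
    "temperature": [1.0, 2.0, 3.0, 4.0],
    "humidity": [2.0, 2.0, 2.0, 2.0],
    "noise": [1.0, -1.0, 1.0, -1.0],
}
target = [2.0, 4.0, 6.0, 8.0]
selected = select_combined_features(cluster, target)
```

Here only the `temperature*humidity` combination tracks the target closely enough to receive the interaction label.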
The server then obtains the preset time series data mining model according to the request type and analyzes the feature variables and corresponding dimensional feature values through the model to obtain the corresponding analysis result data. It generates corresponding view resource data from the analysis result data in a preset manner and pushes the view resource data to the terminal. Big data analysis of the time series data makes it possible to extract the valuable feature variables and corresponding dimensional feature values, and thus to mine the time series data effectively.
In one of the embodiments, before the feature variables and corresponding dimensional feature values are analyzed through the time series data mining model, the method further includes a step of pre-training the time series data mining model: obtaining multiple historical time series data and using them to generate training set data and validation set data; inputting the training set data into a preset neural network model for training to obtain an initial time series data mining model; using the validation set data to continue training and validating the initial time series data mining model; and, once the amount of validation set data that satisfies the condition reaches a preset threshold, obtaining the trained time series data mining model.
Before obtaining the preset time series data mining model, the server needs to construct and train the model in advance. Specifically, the server may obtain multiple time series data from a local or third-party database and perform cluster analysis on them. The server performs feature extraction on the time series data to extract the corresponding feature variables and dimensional feature values, and then applies a preset clustering algorithm to the feature variables, obtaining multiple clustering results after clustering the feature variables multiple times. The server further combines the feature variables within each clustering result to obtain multiple combined feature variables and calculates the correlation between the combined feature variables. The server then extracts the feature variables that reach the preset threshold and their corresponding dimensional feature values.
After extracting the multiple feature variables and corresponding dimensional feature values, the server constructs the time series data mining model from them according to a preset algorithm. The time series data mining model may be a model based on a decision tree or on a neural network.
Further, the server can obtain a large amount of time series data and use it to generate training set data and validation set data. The server inputs the time series data in the training set into the time series data mining model for training to obtain a preliminary time series data mining model. The server then uses the time series data in the validation set to further train and validate the preliminary model; when the proportion of validation set data that meets the preset evaluation value reaches a preset ratio, the trained time series data mining model is obtained. By performing big data analysis on a large amount of time series data and then building and training the time series data mining model in a preset manner from the extracted feature variables and dimensional feature values, a time series data mining model with high analysis accuracy can be constructed.
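The train-then-validate loop can be illustrated with a deliberately tiny stand-in: a single linear unit trained by stochastic gradient descent replaces the neural network or decision tree, and training stops once the share of validation samples predicted within a tolerance reaches a preset ratio. The data, learning rate, tolerance of 0.5, and ratio of 0.9 are all assumptions chosen for the sketch.

```python
def split(samples, train_ratio=0.75):
    """Split historical samples into a training set and a validation set."""
    cut = int(len(samples) * train_ratio)
    return samples[:cut], samples[cut:]


def train_epoch(w, b, data, lr=0.01):
    """One pass of stochastic gradient descent on squared error."""
    for x, y in data:
        err = (w * x + b) - y
        w -= lr * err * x
        b -= lr * err
    return w, b


def pass_ratio(w, b, data, tolerance=0.5):
    """Share of validation samples whose prediction error is within tolerance."""
    hits = sum(abs((w * x + b) - y) <= tolerance for x, y in data)
    return hits / len(data)


# Hypothetical history: y = 2x + 1 sampled at x = 0..7.
history = [(float(x), 2.0 * x + 1.0) for x in range(8)]
train_set, validation_set = split(history)

w, b = 0.0, 0.0
for epoch in range(5000):
    w, b = train_epoch(w, b, train_set)
    if pass_ratio(w, b, validation_set) >= 0.9:   # preset ratio reached
        break
```

A real model would use a held-out set large enough for the ratio to be meaningful; two validation points are used here only to keep the example short.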
In one of the embodiments, as shown in FIG. 4, the step of analyzing the time series data through the time series data mining model specifically includes the following:
Step 402: Calculate the weights of the feature variables through the time series data mining model.
Step 404: Through the time series data mining model, calculate the predicted values of multiple time series parameters from the dimensional feature values and weights of the feature variables.
Step 406: Generate analysis result data corresponding to the request type from the multiple time series parameters and their predicted values.
After receiving the resource acquisition request sent by the terminal, the server acquires multiple pieces of visualization data according to the request type and request information in the request, where the visualization data includes a category identifier and a data identifier. The server then obtains multiple time series data from the visualization data according to the category identifier.
After obtaining the multiple time series data from the visualization data, the server performs time series feature processing on them: it converts the time series data into corresponding feature vectors in time order, and converts the feature vectors into multiple feature variables and corresponding dimensional feature values, thereby obtaining the feature variables and dimensional feature values corresponding to the time series data. The server then performs feature extraction on the feature variables and extracts those that reach the threshold together with their corresponding dimensional feature values.
The server further obtains the preset time series data mining model according to the request type and inputs the extracted feature variables and corresponding dimensional feature values into the model. The model calculates the weights of the feature variables, calculates the predicted values of multiple time series parameters from the dimensional feature values and weights of the feature variables, and generates analysis result data corresponding to the request type from the time series parameters and their predicted values. After generating the analysis result data, the server can further generate corresponding view resource data from it in a preset manner and push the view resource data to the terminal. By extracting the time series data from the visualization data, performing feature extraction on it, and analyzing it with the time series data mining model, valuable information in the time series data can be mined for further analysis, which improves the efficiency and accuracy of time series data mining.
For example, the server can obtain multiple pieces of visualization data, such as a week of weather visualization data, a large amount of disease incidence data, public opinion data, and medical data. The server further extracts the time series data from the visualization data, extracts the feature variables and corresponding dimensional feature values from the time series data, and analyzes them through the preset time series data mining model. The analysis identifies the correlation between the incidence probability and each feature variable in the time series data, and yields the predicted values of multiple time series parameters, such as incidence probability, number of cases, and incidence distribution. In this way, time series data can be used to mine and analyze the incidence probability of a given epidemic within a specific period, which supports effective prevention of that epidemic.
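Steps 402 to 406 can be sketched as a weighted sum per time series parameter, assuming the model has already produced one set of weights for each parameter. The weights, the feature values, and the parameter names `incidence_probability` and `case_count` are invented for illustration and carry no empirical meaning.

```python
def predict(feature_values, weights, bias=0.0):
    """Weighted sum of dimensional feature values for one time series parameter."""
    return bias + sum(weights[name] * value for name, value in feature_values.items())


# Hypothetical dimensional feature values of the extracted feature variables.
feature_values = {"temperature": 0.5, "humidity": 2.0}

# Hypothetical learned weights, one set per time series parameter (step 402).
weights_by_parameter = {
    "incidence_probability": {"temperature": 0.25, "humidity": 0.125},
    "case_count": {"temperature": 10.0, "humidity": 4.0},
}

# Steps 404 and 406: predicted values gathered into analysis result data.
analysis_result = {
    "request_type": "prediction",
    "predictions": {
        parameter: predict(feature_values, weights)
        for parameter, weights in weights_by_parameter.items()
    },
}
```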
In one of the embodiments, the analysis result data includes multiple time series parameters and corresponding predicted values, and the method further includes: obtaining a preset integration function according to the request type; integrating corresponding view resource data through the integration function according to the multiple time series parameters and corresponding predicted values in the analysis result data; and adding an event type identifier and corresponding interface call parameters to the view resource data, the interface call parameters being used to call the generated view resource data according to the event type identifier.
After the server receives the resource acquisition request sent by the terminal, the resource acquisition request including a request type and request information, the server acquires multiple pieces of visualization data according to the resource acquisition request and the request information, the visualization data including a category identifier. The server then obtains multiple time series data from the visualization data according to the category identifier and performs time series feature processing on the multiple time series data to obtain the corresponding feature variables and dimensional feature values. The server then performs feature extraction on the feature variables to extract the feature variables that reach a threshold and the corresponding dimensional feature values.
The server then obtains a preset time series data mining model according to the request type, analyzes the feature variables and corresponding dimensional feature values through the time series data mining model to obtain corresponding analysis result data, and generates corresponding view resource data from the analysis result data in a preset manner. Specifically, the server may obtain a preset integration function according to the request type, integrate corresponding view resource data through the integration function according to the multiple time series parameters and corresponding predicted values in the analysis result data, and add an event type identifier and corresponding interface call parameters to the view resource data.
The analysis result data may include the predicted values corresponding to multiple time series parameters. For example, it may include parameters such as incidence probability and incidence distribution along a time dimension, together with the corresponding predicted values. The time dimension may use, for example, every 3 hours, every 12 hours, every day, or every week as the time unit. The server may obtain a preset time series distribution integration function, such as a Python visualization function, and use it to integrate the predicted values of the multiple time series parameters into corresponding view data. For example, visualization functions such as histogram, distribution density, and heat map functions may be embedded to integrate the corresponding view data, and the corresponding visual image can be drawn through nested functions.
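As an illustration only, grouping predicted values by a time unit such as every 3 hours, before handing them to a visualization function, might look like the sketch below. The function name `bucket_by_hours`, the use of an average per window, and the sample values are assumptions; the sketch also assumes the window length divides 24 evenly.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, List, Tuple


def bucket_by_hours(
    samples: List[Tuple[datetime, float]], hours: int
) -> Dict[datetime, float]:
    """Average predicted values inside fixed windows of `hours` hours."""
    grouped = defaultdict(list)
    for ts, value in samples:
        # Snap the timestamp down to the start of its window within the day.
        start = ts.replace(
            hour=ts.hour - ts.hour % hours, minute=0, second=0, microsecond=0
        )
        grouped[start].append(value)
    # One averaged value per window, ordered by window start time.
    return {
        start: sum(vals) / len(vals) for start, vals in sorted(grouped.items())
    }


# Hypothetical predicted incidence probabilities with timestamps.
samples = [
    (datetime(2019, 3, 1, 0, 30), 0.25),
    (datetime(2019, 3, 1, 2, 15), 0.75),
    (datetime(2019, 3, 1, 4, 0), 0.125),
]
view_series = bucket_by_hours(samples, hours=3)
```

The resulting mapping from window start to averaged value is the kind of input a histogram or heat map function would consume; the choice of plotting library is left open here.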
After the server integrates the corresponding view resource data through the integration function according to the multiple time series parameters and corresponding predicted values in the analysis result data, it further adds an event type identifier and corresponding interface call parameters to the view resource data and integrates them into a corresponding class for storage. This makes it convenient for the server or the terminal to call the generated view resource data. When the server or the terminal later obtains the associated time series data or view data again, it can directly call the mined and analyzed view resource data according to the event type identifier and the corresponding interface call parameters, which improves both the analysis efficiency and the utilization value of the time series data.
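As an illustration only, storing view resource data in a class together with an event type identifier and interface call parameters, so that it can be called again later, could be sketched as follows. The class names `ViewResource` and `ViewResourceStore`, the parameter-matching rule, and the example identifiers are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class ViewResource:
    event_type: str              # event type identifier
    call_params: Dict[str, Any]  # interface call parameters
    payload: Dict[str, Any]      # the generated view resource data


@dataclass
class ViewResourceStore:
    _items: Dict[str, ViewResource] = field(default_factory=dict)

    def save(self, resource: ViewResource) -> None:
        # Index the stored resource by its event type identifier.
        self._items[resource.event_type] = resource

    def call(self, event_type: str, **params: Any) -> Dict[str, Any]:
        """Return the stored view data if the caller's parameters match."""
        resource = self._items[event_type]
        if params != resource.call_params:
            raise ValueError("interface call parameters do not match")
        return resource.payload


store = ViewResourceStore()
store.save(ViewResource("flu_weekly", {"region": "north"}, {"chart": "heatmap"}))
cached_view = store.call("flu_weekly", region="north")
```

A later request for the same event type can then reuse `cached_view` without repeating the mining and analysis steps.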
After the server generates the corresponding view resource data, it pushes the view resource data to the terminal. By extracting the time series data from the visualization data, performing feature extraction on the time series data, and then analyzing the result through the time series data mining model, valuable information in the time series data can be mined for further analysis, which improves the mining efficiency and analysis efficiency for time series data.
It should be understood that although the steps in the flowcharts of FIGS. 2-4 are displayed in sequence as indicated by the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, the execution of these steps is not strictly limited in order, and the steps may be executed in other orders. Moreover, at least some of the steps in FIGS. 2-4 may include multiple sub-steps or multiple stages. These sub-steps or stages are not necessarily completed at the same moment but may be executed at different moments, and they are not necessarily executed sequentially but may be executed in turn or alternately with other steps, or with at least part of the sub-steps or stages of other steps.
In one of the embodiments, as shown in FIG. 5, a data processing apparatus based on time series data is provided, including a request receiving module 502, a data acquisition module 504, a feature processing module 506, a data mining module 508, and a view generation module 510, wherein:
The request receiving module 502 is configured to receive a resource acquisition request sent by a terminal, the resource acquisition request including a request type and request information;
The data acquisition module 504 is configured to acquire multiple pieces of visualization data according to the resource acquisition request and the request information, the visualization data including a category identifier, and to extract time series distribution data from the visualization data according to the category identifier;
The feature processing module 506 is configured to perform time series feature processing on the time series distribution data to obtain feature variables and dimensional feature values corresponding to the time series distribution data, and to perform feature extraction on the feature variables to extract the feature variables that reach a threshold and the corresponding dimensional feature values;
The data mining module 508 is configured to obtain a preset time series data mining model according to the request type, and to analyze the feature variables and corresponding dimensional feature values through the time series data mining model to obtain corresponding analysis result data;
The view generation module 510 is configured to generate corresponding view resource data from the analysis result data in a preset manner, and to push the view resource data to the terminal.
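As an illustration only, the five modules above can be pictured as stages of a simple pipeline. In the sketch below every stage body is a placeholder: the function names, the toy series, and the "model" that merely echoes a feature are hypothetical and stand in for the processing described in the embodiments.

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]
Step = Callable[[State], State]


def receive_request(state: State) -> State:
    # Request receiving module: the request must carry a type and information.
    if "request_type" not in state or "request_info" not in state:
        raise ValueError("request needs request_type and request_info")
    return state


def acquire_data(state: State) -> State:
    # Data acquisition module: placeholder series standing in for extracted data.
    return {**state, "series": [1.0, 2.0, 3.0]}


def process_features(state: State) -> State:
    # Feature processing module: a single toy feature (the mean).
    series = state["series"]
    return {**state, "features": {"mean": sum(series) / len(series)}}


def mine(state: State) -> State:
    # Data mining module: placeholder "model" that echoes the feature.
    return {**state, "analysis": {"predicted": state["features"]["mean"]}}


def generate_view(state: State) -> State:
    # View generation module: wrap the analysis result for the terminal.
    view = {"type": state["request_type"], "data": state["analysis"]}
    return {**state, "view": view}


def run_pipeline(request: State, steps: List[Step]) -> State:
    state = dict(request)
    for step in steps:
        state = step(state)
    return state


result = run_pipeline(
    {"request_type": "forecast", "request_info": {"region": "north"}},
    [receive_request, acquire_data, process_features, mine, generate_view],
)
```

Keeping each module as a separate stage mirrors the statement that the modules may be implemented individually in software, hardware, or a combination of both.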
In one of the embodiments, the visualization data includes a category identifier and a data identifier, and the data acquisition module 504 is further configured to: acquire base table data and an objective function corresponding to the visualization data according to the category identifier and the data identifier; obtain distribution data in the visualization data according to the base table data and the objective function; and convert the distribution data into time series distribution data in a preset manner.
In one of the embodiments, the data acquisition module 504 is further configured to: obtain coordinate matrix data and corresponding parameter weights according to the base table data and the objective function; and obtain the corresponding time series distribution data according to the coordinate matrix data and the corresponding parameter weights.
In one of the embodiments, the feature processing module 506 is further configured to: perform cluster analysis on the feature variables corresponding to multiple time series data to obtain multiple clustering results; calculate the correlation between the multiple feature variables according to the multiple clustering results; and perform feature selection according to the correlation between the multiple feature variables to extract the feature variables that reach a preset threshold and the corresponding dimensional features.
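As an illustration only, the correlation-threshold part of this feature selection can be sketched as below. The sketch omits the cluster analysis step and instead measures each feature variable against a single reference series using the Pearson correlation coefficient; the choice of Pearson correlation, the reference series, the function names, and the data are all assumptions not stated in the description.

```python
from math import sqrt
from typing import Dict, List


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    # A constant series has no defined correlation; treat it as 0.
    return cov / (sx * sy) if sx and sy else 0.0


def select_features(
    features: Dict[str, List[float]], reference: List[float], threshold: float
) -> List[str]:
    """Keep feature variables whose |correlation| reaches the threshold."""
    return [
        name
        for name, series in features.items()
        if abs(pearson(series, reference)) >= threshold
    ]


# Hypothetical feature variables and reference series.
reference = [1.0, 2.0, 3.0, 4.0]
features = {
    "temperature": [2.0, 4.0, 6.0, 8.0],  # rises with the reference
    "humidity": [4.0, 3.0, 2.0, 1.0],     # falls as the reference rises
    "noise": [1.0, -1.0, -1.0, 1.0],      # uncorrelated with the reference
}
selected = select_features(features, reference, threshold=0.8)
```

Using the absolute value keeps strongly negative relationships as well as positive ones, which is one reasonable reading of "reaching a preset threshold".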
In one of the embodiments, the apparatus further includes a model training module configured to: obtain multiple historical time series data and use the historical time series data to generate training set data and validation set data; input the training set data into a preset neural network model for training to obtain an initial time series data mining model; use the validation set data to continuously train and validate the initial time series data mining model; and, when the data in the validation set data that satisfies a condition reaches a preset threshold, obtain the trained time series data mining model.
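As an illustration only, the train-then-validate loop can be sketched with a deliberately tiny stand-in model. The sketch replaces the neural network with a single linear unit trained by stochastic gradient descent, and it interprets "data that satisfies a condition" as validation samples whose prediction error is within a tolerance; both choices, along with every name and hyperparameter, are assumptions.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (input value, target value)


def split(history: List[Sample], ratio: float = 0.75):
    """Split historical data into training and validation sets."""
    cut = int(len(history) * ratio)
    return history[:cut], history[cut:]


def train_until_valid(
    history: List[Sample],
    tolerance: float = 0.1,
    threshold: float = 0.9,
    lr: float = 0.01,
    max_epochs: int = 5000,
) -> Tuple[float, float, int]:
    train, valid = split(history)
    w, b = 0.0, 0.0  # parameters of the stand-in linear model
    for epoch in range(1, max_epochs + 1):
        for x, y in train:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
        # Count validation samples whose error is within the tolerance.
        ok = sum(abs((w * x + b) - y) <= tolerance for x, y in valid)
        if ok / len(valid) >= threshold:
            return w, b, epoch
    raise RuntimeError("validation threshold not reached")


# Hypothetical noise-free history following y = 2x + 1.
history = [(float(x), 2.0 * x + 1.0) for x in range(8)]
w, b, epochs_used = train_until_valid(history)
```

Training stops as soon as the share of acceptable validation samples reaches the threshold, which corresponds to obtaining the trained model in the embodiment.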
In one of the embodiments, the data mining module 508 is further configured to: calculate the weights of the feature variables through the time series data mining model; calculate, through the time series data mining model and according to the dimensional feature values and weights of the feature variables, the predicted values of multiple time series parameters corresponding to the multiple feature variables; and generate analysis result data corresponding to the request type according to the multiple time series parameters and the corresponding predicted values.
In one of the embodiments, the analysis result data includes multiple time series parameters and corresponding predicted values, and the view generation module 510 is further configured to: obtain a preset integration function according to the request type; integrate corresponding view resource data through the integration function according to the multiple time series parameters and corresponding predicted values in the analysis result data; and add an event type identifier and corresponding interface call parameters to the view resource data, the interface call parameters being used to call the generated view resource data according to the event type identifier.
For specific limitations on the data processing apparatus based on time series data, reference may be made to the limitations on the data processing method based on time series data above, which are not repeated here. Each module in the above data processing apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor in a computer device in the form of hardware, or may be stored in a memory of the computer device in the form of software, so that the processor can call and execute the operations corresponding to each module.
In one embodiment, a computer device is provided. The computer device may be a server, and its internal structure may be as shown in FIG. 6. The computer device includes a processor, a memory, a network interface, and a database connected through a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer-readable instructions, and a database. The internal memory provides an environment for running the operating system and the computer-readable instructions stored in the non-volatile storage medium. The database of the computer device is used to store visualization data, time series data, view resource data, and the like. The network interface of the computer device is used to communicate with an external terminal through a network connection. When executed by the processor, the computer-readable instructions implement a data processing method based on time series data.
Those skilled in the art will understand that the structure shown in FIG. 6 is only a block diagram of part of the structure related to the solution of the present application and does not limit the computer device to which the solution is applied. A specific computer device may include more or fewer components than shown in the figure, may combine certain components, or may have a different arrangement of components.
A computer device includes a memory and one or more processors. The memory stores computer-readable instructions which, when executed by the one or more processors, cause the one or more processors to perform the following steps:
receiving a resource acquisition request sent by a terminal, the resource acquisition request including a request type and request information;
acquiring multiple pieces of visualization data according to the resource acquisition request and the request information, the visualization data including a category identifier; extracting time series distribution data from the visualization data according to the category identifier;
performing time series feature processing on the time series distribution data to obtain feature variables and dimensional feature values corresponding to the time series distribution data; performing feature extraction on the feature variables to extract the feature variables that reach a threshold and the corresponding dimensional feature values;
obtaining a preset time series data mining model according to the request type, and analyzing the feature variables and corresponding dimensional feature values through the time series data mining model to obtain analysis result data; and
generating corresponding view resource data from the analysis result data in a preset manner, and pushing the view resource data to the terminal.
One or more non-volatile computer-readable storage media store computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps:
receiving a resource acquisition request sent by a terminal, the resource acquisition request including a request type and request information;
acquiring multiple pieces of visualization data according to the resource acquisition request and the request information, the visualization data including a category identifier; extracting time series distribution data from the visualization data according to the category identifier;
performing time series feature processing on the time series distribution data to obtain feature variables and dimensional feature values corresponding to the time series distribution data; performing feature extraction on the feature variables to extract the feature variables that reach a threshold and the corresponding dimensional feature values;
obtaining a preset time series data mining model according to the request type, and analyzing the feature variables and corresponding dimensional feature values through the time series data mining model to obtain analysis result data; and
generating corresponding view resource data from the analysis result data in a preset manner, and pushing the view resource data to the terminal.
A person of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by computer-readable instructions that instruct the relevant hardware. The computer-readable instructions may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. Any reference to memory, storage, database, or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this specification.
The above embodiments express only several implementations of the present application. Their description is relatively specific and detailed, but it should not be construed as limiting the scope of the invention patent. It should be noted that a person of ordinary skill in the art may make several modifications and improvements without departing from the concept of the present application, all of which fall within the protection scope of the present application. Therefore, the protection scope of this patent application shall be subject to the appended claims.

Claims (20)

  1. A data processing method based on time series data, the method comprising:
    receiving a resource acquisition request sent by a terminal, the resource acquisition request including a request type and request information;
    acquiring multiple pieces of visualization data according to the resource acquisition request and the request information, the visualization data including a category identifier;
    extracting time series distribution data from the visualization data according to the category identifier;
    performing time series feature processing on the time series distribution data to obtain feature variables and dimensional feature values corresponding to the time series distribution data;
    performing feature extraction on the feature variables to extract the feature variables that reach a threshold and the corresponding dimensional feature values;
    obtaining a preset time series data mining model according to the request type, and analyzing the feature variables and corresponding dimensional feature values through the time series data mining model to obtain analysis result data; and
    generating corresponding view resource data from the analysis result data in a preset manner, and pushing the view resource data to the terminal.
  2. The method according to claim 1, wherein the visualization data includes a category identifier and a data identifier, and the step of extracting time series distribution data from the visualization data according to the category identifier comprises:
    acquiring base table data and an objective function corresponding to the visualization data according to the category identifier and the data identifier;
    obtaining distribution data in the visualization data according to the base table data and the objective function; and
    converting the distribution data into time series distribution data in a preset manner.
  3. The method according to claim 2, wherein the step of obtaining distribution data in the visualization data according to the base table data and the objective function comprises:
    obtaining coordinate matrix data and corresponding parameter weights according to the base table data and the objective function;
    and obtaining corresponding distribution data according to the coordinate matrix data and the corresponding parameter weights.
  4. The method according to claim 1, wherein the step of performing feature extraction on feature variables corresponding to multiple multi-dimensional time series data according to a preset algorithm comprises:
    performing cluster analysis on the feature variables corresponding to the multiple time series data to obtain multiple clustering results;
    combining the feature variables within each of the multiple clustering results to obtain multiple combined feature variables, and calculating the correlation between the multiple feature variables; and
    performing feature selection according to the correlation between the multiple feature variables to extract the feature variables that reach a preset threshold and the corresponding dimensional features.
  5. The method according to claim 1, wherein before the analyzing of the feature variables and corresponding dimensional feature values through the time series data mining model, the method further comprises:
    obtaining multiple historical time series data, and using the historical time series data to generate training set data and validation set data;
    inputting the training set data into a preset neural network model for training to obtain an initial time series data mining model;
    using the validation set data to continuously train and validate the initial time series data mining model;
    and, when the data in the validation set data that satisfies a condition reaches a preset threshold, obtaining the trained time series data mining model.
  6. The method according to claim 1, wherein the step of analyzing the feature variables and corresponding dimensional feature values through the time series data mining model comprises:
    calculating the weights of the feature variables through the time series data mining model;
    calculating, through the time series data mining model and according to the dimensional feature values and weights of the feature variables, the predicted values of multiple time series parameters corresponding to the multiple feature variables; and
    generating analysis result data corresponding to the request type according to the multiple time series parameters and the corresponding predicted values.
  7. The method according to claim 1, wherein the analysis result data includes multiple time series parameters and corresponding predicted values, and the method further comprises:
    obtaining a preset integration function according to the request type;
    integrating corresponding view resource data through the integration function according to the multiple time series parameters and corresponding predicted values in the analysis result data; and
    adding an event type identifier and corresponding interface call parameters to the view resource data, the interface call parameters being used to call the generated view resource data according to the event type identifier.
  8. A data processing apparatus based on time series data, the apparatus comprising:
    a request receiving module, configured to receive a resource acquisition request sent by a terminal, the resource acquisition request including a request type and request information;
    a data acquisition module, configured to acquire multiple pieces of visualization data according to the resource acquisition request and the request information, the visualization data including a category identifier, and to extract time series distribution data from the visualization data according to the category identifier;
    a feature processing module, configured to perform time series feature processing on the time series distribution data to obtain feature variables and dimensional feature values corresponding to the time series distribution data, and to perform feature extraction on the feature variables to extract the feature variables that reach a threshold and the corresponding dimensional feature values;
    a data mining module, configured to obtain a preset time series data mining model according to the request type, and to analyze the feature variables and corresponding dimensional feature values through the time series data mining model to obtain analysis result data; and
    a view generation module, configured to generate corresponding view resource data from the analysis result data in a preset manner, and to push the view resource data to the terminal.
  9. The apparatus according to claim 8, wherein the visualization data includes a category identifier and a data identifier, and the data acquisition module is further configured to: acquire base table data and an objective function corresponding to the visualization data according to the category identifier and the data identifier; obtain distribution data in the visualization data according to the base table data and the objective function; and convert the time series distribution data into time series distribution data in a preset manner.
  10. The apparatus according to claim 9, wherein the data acquisition module is further configured to: obtain coordinate matrix data and corresponding parameter weights according to the base table data and the objective function; and obtain corresponding distribution data according to the coordinate matrix data and the corresponding parameter weights.
  11. The apparatus according to claim 8, wherein the feature processing module is further configured to: perform cluster analysis on the feature variables corresponding to the multiple time series data to obtain multiple clustering results; combine the feature variables within each of the multiple clustering results to obtain multiple combined feature variables, and calculate the correlation between the multiple feature variables; and perform feature selection according to the correlation between the multiple feature variables to extract the feature variables that reach a preset threshold and the corresponding dimensional features.
  12. The apparatus according to claim 8, further comprising a model training module configured to: acquire a plurality of historical time series data, and generate training set data and validation set data from the historical time series data; input the training set data into a preset neural network model for training to obtain an initial time series data mining model; continuously train and validate the initial time series data mining model using the validation set data; and, once the data in the validation set data that satisfies the condition reaches a preset threshold, obtain the trained time series data mining model.
  13. The apparatus according to claim 8, wherein the data mining module is further configured to: calculate weights of the feature variables through the time series data mining model; calculate, through the time series data mining model and according to the dimensional feature values and the weights of the feature variables, the predicted values corresponding to the plurality of time series parameters to which the plurality of feature variables correspond; and generate analysis result data corresponding to the request type according to the plurality of time series parameters and the corresponding predicted values.
  14. The apparatus according to claim 8, wherein the analysis result data includes a plurality of time series parameters and corresponding predicted values, and the view generation module is further configured to: obtain a preset integration function according to the request type; integrate corresponding view resource data through the integration function according to the plurality of time series parameters and the corresponding predicted values in the analysis result data; and add an event type identifier and corresponding interface call parameters to the view resource data, the interface call parameters being used to call the generated view resource data according to the event type identifier.
  15. A computer device, comprising a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the one or more processors, cause the one or more processors to perform the following steps:
    receiving a resource acquisition request sent by a terminal, the resource acquisition request including a request type and request information;
    acquiring a plurality of visualization data according to the resource acquisition request and the request information, the visualization data including a category identifier; extracting time series distribution data from the visualization data according to the category identifier;
    performing time series feature processing on the time series distribution data to obtain feature variables and dimensional feature values corresponding to the time series distribution data; performing feature extraction on the feature variables to extract the feature variables that reach a threshold and the corresponding dimensional feature values;
    obtaining a preset time series data mining model according to the request type, and analyzing the feature variables and the corresponding dimensional feature values through the time series data mining model to obtain analysis result data; and
    generating corresponding view resource data in a preset manner according to the analysis result data, and pushing the view resource data to the terminal.
  16. The computer device according to claim 15, wherein the processor, when executing the computer-readable instructions, further performs the following steps: acquiring base table data and an objective function corresponding to the visualization data according to the category identifier and the data identifier; obtaining distribution data in the visualization data according to the base table data and the objective function; and converting the distribution data into time series distribution data in a preset manner.
  17. The computer device according to claim 15, wherein the processor, when executing the computer-readable instructions, further performs the following steps: obtaining a preset integration function according to the request type; integrating corresponding view resource data through the integration function according to the plurality of time series parameters and the corresponding predicted values in the analysis result data; and adding an event type identifier and corresponding interface call parameters to the view resource data, the interface call parameters being used to call the generated view resource data according to the event type identifier.
  18. One or more non-volatile computer-readable storage media storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps:
    receiving a resource acquisition request sent by a terminal, the resource acquisition request including a request type and request information;
    acquiring a plurality of visualization data according to the resource acquisition request and the request information, the visualization data including a category identifier; extracting time series distribution data from the visualization data according to the category identifier;
    performing time series feature processing on the time series distribution data to obtain feature variables and dimensional feature values corresponding to the time series distribution data; performing feature extraction on the feature variables to extract the feature variables that reach a threshold and the corresponding dimensional feature values;
    obtaining a preset time series data mining model according to the request type, and analyzing the feature variables and the corresponding dimensional feature values through the time series data mining model to obtain analysis result data; and
    generating corresponding view resource data in a preset manner according to the analysis result data, and pushing the view resource data to the terminal.
  19. The storage medium according to claim 17, wherein the computer-readable instructions, when executed by the processor, further cause the following steps to be performed: acquiring base table data and an objective function corresponding to the visualization data according to the category identifier and the data identifier; obtaining distribution data in the visualization data according to the base table data and the objective function; and converting the distribution data into time series distribution data in a preset manner.
  20. The storage medium according to claim 17, wherein the computer-readable instructions, when executed by the processor, further cause the following steps to be performed: obtaining a preset integration function according to the request type; integrating corresponding view resource data through the integration function according to the plurality of time series parameters and the corresponding predicted values in the analysis result data; and adding an event type identifier and corresponding interface call parameters to the view resource data, the interface call parameters being used to call the generated view resource data according to the event type identifier.
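
As an illustration only, the feature-selection steps recited in claim 11 (cluster the feature variables, combine the variables within each cluster, then keep the variables whose correlation reaches a preset threshold) might be sketched as follows. The claims do not specify the clustering algorithm, the combination rule, or what the correlation is measured against, so everything here is an assumption: the greedy correlation-based grouping, the element-wise averaging, the use of a target series, and the names `cluster_features`, `combine`, `select_features`, `cluster_cutoff` and `threshold` are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    # A constant series has no defined correlation; treat it as 0.
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)


def cluster_features(features, cutoff):
    """Greedy grouping (assumed stand-in for the claimed cluster analysis).

    A feature joins the first cluster whose seed (first member) it
    correlates with at |r| >= cutoff; otherwise it starts a new cluster.
    """
    clusters = []
    for name, series in features.items():
        for cluster in clusters:
            if abs(pearson(series, features[cluster[0]])) >= cutoff:
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters


def combine(features, cluster):
    """Combine a cluster's features by element-wise averaging (assumed rule)."""
    columns = zip(*(features[name] for name in cluster))
    return [sum(col) / len(cluster) for col in columns]


def select_features(features, target, cluster_cutoff=0.9, threshold=0.5):
    """Keep combined features whose |correlation| with target reaches threshold."""
    selected = {}
    for cluster in cluster_features(features, cluster_cutoff):
        score = abs(pearson(combine(features, cluster), target))
        if score >= threshold:
            selected["+".join(cluster)] = score
    return selected


# Hypothetical toy data: two near-duplicate series and one unrelated series.
features = {
    "temp": [1.0, 2.0, 3.0, 4.0, 5.0],
    "temp_lag": [1.1, 2.1, 2.9, 4.2, 5.0],
    "noise": [3.0, 1.0, 4.0, 1.0, 5.0],
}
target = [2.0, 4.1, 6.0, 8.2, 9.9]
print(select_features(features, target))
```

With this toy data the two near-duplicate series fall into one cluster and are kept as a single combined variable, while the unrelated series is dropped. A production system would more likely use an established clustering method and a correlation measure chosen for the data at hand.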
PCT/CN2019/116234 2019-03-07 2019-11-07 Data processing method and apparatus based on time sequence data, and computer device WO2020177366A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910171923.6 2019-03-07
CN201910171923.6A CN110008251B (en) 2019-03-07 2019-03-07 Data processing method and device based on time sequence data and computer equipment

Publications (1)

Publication Number Publication Date
WO2020177366A1 (en) 2020-09-10

Family

ID=67166718

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/116234 WO2020177366A1 (en) 2019-03-07 2019-11-07 Data processing method and apparatus based on time sequence data, and computer device

Country Status (2)

Country Link
CN (1) CN110008251B (en)
WO (1) WO2020177366A1 (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112561337A (en) * 2020-12-16 2021-03-26 北京明略软件***有限公司 Method and device for determining object type
CN113157788A (en) * 2021-04-13 2021-07-23 福州外语外贸学院 Big data mining method and system
CN114936581A (en) * 2022-06-01 2022-08-23 中国人民解放军63796部队 Multi-parameter association mining method based on time sequence data segmentation
CN115442243A (en) * 2022-08-31 2022-12-06 西南大学 Time sequence network node centrality evaluation method and device based on time sequence path tree
CN116804993A (en) * 2023-08-22 2023-09-26 北京龙德缘电力科技发展有限公司 Visual expression method with time sequence data characteristics
CN117493444A (en) * 2024-01-02 2024-02-02 广州海洋地质调查局三亚南海地质研究所 Data extraction and loading method and device, electronic equipment and storage medium

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110008251B (en) * 2019-03-07 2023-07-04 平安科技(深圳)有限公司 Data processing method and device based on time sequence data and computer equipment
CN110489613B (en) * 2019-07-29 2022-04-26 北京航空航天大学 Collaborative visual data recommendation method and device
CN110716926B (en) * 2019-09-06 2022-09-20 未鲲(上海)科技服务有限公司 Periodic view data generation method and device, computer equipment and storage medium
CN115114345B (en) * 2022-04-02 2024-04-09 腾讯科技(深圳)有限公司 Feature representation extraction method, device, equipment, storage medium and program product

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060010142A1 (en) * 2004-07-09 2006-01-12 Microsoft Corporation Modeling sequence and time series data in predictive analytics
CN106651200A (en) * 2016-12-29 2017-05-10 中国西电电气股份有限公司 Electrical load management method and system for industrial enterprise aggregate user
CN108462605A (en) * 2018-02-06 2018-08-28 国家电网公司 A kind of prediction technique and device of data
CN110008251A (en) * 2019-03-07 2019-07-12 平安科技(深圳)有限公司 Data processing method, device and computer equipment based on time series data

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5742811A (en) * 1995-10-10 1998-04-21 International Business Machines Corporation Method and system for mining generalized sequential patterns in a large database
CN101853277A (en) * 2010-05-14 2010-10-06 南京信息工程大学 Vulnerability data mining method based on classification and association analysis
CN104679827A (en) * 2015-01-14 2015-06-03 北京得大信息技术有限公司 Big data-based public information association method and mining engine

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112561337A (en) * 2020-12-16 2021-03-26 北京明略软件***有限公司 Method and device for determining object type
CN113157788A (en) * 2021-04-13 2021-07-23 福州外语外贸学院 Big data mining method and system
CN113157788B (en) * 2021-04-13 2024-02-13 福州外语外贸学院 Big data mining method and system
CN114936581A (en) * 2022-06-01 2022-08-23 中国人民解放军63796部队 Multi-parameter association mining method based on time sequence data segmentation
CN114936581B (en) * 2022-06-01 2024-04-26 中国人民解放军63796部队 Multi-parameter association mining method based on time sequence data segmentation
CN115442243A (en) * 2022-08-31 2022-12-06 西南大学 Time sequence network node centrality evaluation method and device based on time sequence path tree
CN115442243B (en) * 2022-08-31 2024-04-16 西南大学 Sequential network node centrality evaluation method and device based on sequential path tree
CN116804993A (en) * 2023-08-22 2023-09-26 北京龙德缘电力科技发展有限公司 Visual expression method with time sequence data characteristics
CN116804993B (en) * 2023-08-22 2023-12-08 北京龙德缘电力科技发展有限公司 Visual expression method with time sequence data characteristics
CN117493444A (en) * 2024-01-02 2024-02-02 广州海洋地质调查局三亚南海地质研究所 Data extraction and loading method and device, electronic equipment and storage medium
CN117493444B (en) * 2024-01-02 2024-04-09 广州海洋地质调查局三亚南海地质研究所 Data extraction and loading method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN110008251A (en) 2019-07-12
CN110008251B (en) 2023-07-04

Similar Documents

Publication Publication Date Title
WO2020177366A1 (en) Data processing method and apparatus based on time sequence data, and computer device
WO2020177377A1 (en) Machine learning-based data prediction processing method and apparatus, and computer device
WO2020253358A1 (en) Service data risk control analysis processing method, apparatus and computer device
WO2021027553A1 (en) Micro-expression classification model generation method, image recognition method, apparatus, devices, and mediums
WO2020177365A1 (en) Data mining-based social insurance data processing method and apparatus, and computer device
WO2020140678A1 (en) Abnormal application detection method and apparatus, and computer device and storage medium
JP2021532499A (en) Machine learning-based medical data classification methods, devices, computer devices and storage media
CN109325118B (en) Unbalanced sample data preprocessing method and device and computer equipment
CN112395500B (en) Content data recommendation method, device, computer equipment and storage medium
WO2020253381A1 (en) Data monitoring method and apparatus, computer device and storage medium
WO2021114612A1 (en) Target re-identification method and apparatus, computer device, and storage medium
CN111145910A (en) Abnormal case identification method and device based on artificial intelligence and computer equipment
WO2022252454A1 (en) Abnormal data detection method and apparatus, computer device, and readable storage medium
CN109886719B (en) Data mining processing method and device based on grid and computer equipment
CN111178949B (en) Service resource matching reference data determining method, device, equipment and storage medium
CN112035611B (en) Target user recommendation method, device, computer equipment and storage medium
WO2020034801A1 (en) Medical feature screening method and apparatus, computer device, and storage medium
CN113743607A (en) Training method of anomaly detection model, anomaly detection method and device
CN114328942A (en) Relationship extraction method, apparatus, device, storage medium and computer program product
CN111898035B (en) Data processing strategy configuration method and device based on Internet of things and computer equipment
CN109471717B (en) Sample library splitting method, device, computer equipment and storage medium
CN116884636A (en) Infectious disease data analysis method, infectious disease data analysis device, computer equipment and storage medium
CN114495137B (en) Bill abnormity detection model generation method and bill abnormity detection method
WO2021139480A1 (en) Gis service aggregation method and apparatus, and computer device and storage medium
CN110489592B (en) Video classification method, apparatus, computer device and storage medium

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application. Ref document number: 19917820; country of ref document: EP; kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
122 EP: PCT application non-entry in European phase. Ref document number: 19917820; country of ref document: EP; kind code of ref document: A1