CN113792931B - Data prediction method and device, logistics cargo amount prediction method, medium and equipment - Google Patents

Data prediction method and device, logistics cargo amount prediction method, medium and equipment

Info

Publication number
CN113792931B
Authority
CN
China
Prior art keywords
sequence data
prediction
data
prediction result
sub
Prior art date
Legal status
Active
Application number
CN202111100746.6A
Other languages
Chinese (zh)
Other versions
CN113792931A (en)
Inventor
吴盛楠
庄晓天
韩国帅
佟路
Current Assignee
Beijing Jingdong Zhenshi Information Technology Co Ltd
Original Assignee
Beijing Jingdong Zhenshi Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Jingdong Zhenshi Information Technology Co Ltd
Priority to CN202111100746.6A
Publication of CN113792931A
Application granted
Publication of CN113792931B

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 - Administration; Management
    • G06Q10/04 - Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 - Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 - Querying
    • G06F16/245 - Query processing
    • G06F16/2458 - Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2474 - Sequence data queries, e.g. querying versioned data
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/044 - Recurrent networks, e.g. Hopfield networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 - Administration; Management
    • G06Q10/08 - Logistics, e.g. warehousing, loading or distribution; Inventory or stock management
    • G06Q10/083 - Shipping

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • Software Systems (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Quality & Reliability (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Operations Research (AREA)
  • Marketing (AREA)
  • Development Economics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Fuzzy Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Game Theory and Decision Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The disclosure relates to the field of computer technology, and in particular to a data prediction method and device, a logistics cargo amount prediction method and device, a storage medium, and electronic equipment. The method comprises the following steps: acquiring historical time series data and performing time series decomposition on it to obtain trend sequence data, periodic sequence data and error sequence data; performing time series decomposition on the error sequence data, performing fitting prediction on the resulting decomposed sub-sequences, and determining a first prediction result corresponding to the error sequence data according to the fitting prediction results; fitting and predicting the trend sequence data and the periodic sequence data with the prediction models corresponding to each of them to obtain a second prediction result and a third prediction result; and fusing the first prediction result, the second prediction result and the third prediction result to obtain a target prediction result. The method and device improve the prediction accuracy of data with periodic and trend characteristics to a certain extent.

Description

Data prediction method and device, logistics cargo amount prediction method, medium and equipment
Technical Field
The present disclosure relates to the field of computer technology, and more particularly, to a data prediction method, a data prediction apparatus, a logistics cargo amount prediction method, a logistics cargo amount prediction apparatus, a computer storage medium, and an electronic device.
Background
With the development of computer technology, machine learning is being applied in an ever wider range of fields. In many application scenarios, future data must be predicted from the patterns in historical data, for example forecasting financial market quotations or retail demand; likewise, for the healthy and sustainable development of the logistics industry, accurately predicting logistics cargo volume is important to avoid a decline in logistics service quality and service rate.
In the related art, the prediction process ignores the periodicity and trend characteristics of the data to be predicted: a single prediction model is adopted, which affects the prediction accuracy for the data as a whole and leaves a large deviation between the model's prediction result and the real data. Also in the related art, the generated random error is used directly as an input parameter to the prediction model for prediction fitting, and the direct introduction of this error term reduces the prediction accuracy of the model.
It should be noted that the information disclosed in the above background section is only for enhancing understanding of the background of the present disclosure, and therefore may include information that does not constitute prior art already known to those of ordinary skill in the art.
Disclosure of Invention
The disclosure aims to provide a data prediction method and device based on a time sequence, a logistics cargo amount prediction method and device based on a time sequence, a computer storage medium and electronic equipment, so as to improve the prediction accuracy of periodic and trending data at least to a certain extent.
Other features and advantages of the present disclosure will be apparent from the following detailed description, or may be learned in part by the practice of the disclosure.
According to one aspect of the present disclosure, there is provided a data prediction method based on a time series, including: acquiring historical time sequence data, and performing time sequence decomposition on the historical time sequence data to obtain trend sequence data, periodic sequence data and error sequence data; performing time sequence decomposition on the error sequence data, and performing fitting prediction on the obtained multiple decomposition subsequences to determine a first prediction result corresponding to the error sequence data according to the fitting prediction result; fitting and predicting the trend sequence data and the periodic sequence data by adopting a prediction model corresponding to the trend sequence data and the periodic sequence data to obtain a second prediction result and a third prediction result; and carrying out fusion processing on the first prediction result, the second prediction result and the third prediction result to obtain a target prediction result.
In an exemplary embodiment of the disclosure, the performing time-series decomposition on the error sequence data and performing fitting prediction on the obtained multiple decomposed sub-sequences to determine a first prediction result corresponding to the error sequence data according to the fitting prediction result includes: performing time sequence decomposition on the error sequence data to obtain sub-trend sequence data, sub-period sequence data and sub-random error sequence data; discarding the sub-random error sequence data; carrying out fusion processing on the sub-trend sequence data and the sub-period sequence data to obtain sub-fusion sequence data; and carrying out fitting prediction on the sub-fusion sequence data by adopting a prediction model corresponding to the sub-fusion sequence data to obtain the first prediction result.
In an exemplary embodiment of the present disclosure, before performing fitting prediction on the sub-fusion sequence data by using the prediction model corresponding to the sub-fusion sequence data to obtain the first prediction result, the method further includes: performing standardization and normalization on the sub-fusion sequence data, and determining training set data and test set data from the standardized and normalized sub-fusion sequence data.
In an exemplary embodiment of the present disclosure, the prediction model corresponding to the sub-fusion sequence data is a composite model of a sine wave sequence characterization network and a long short-term memory network; performing fitting prediction on the sub-fusion sequence data by using the prediction model corresponding to the sub-fusion sequence data to obtain the first prediction result includes the following steps: inputting the training set data into the sine wave sequence characterization network for sample amplification, and inputting the sample-amplified training set data into the long short-term memory network to train the composite model of the sine wave sequence characterization network and the long short-term memory network; and inputting the test set data into the trained composite model to obtain the first prediction result.
In an exemplary embodiment of the present disclosure, performing fitting prediction on the trend sequence data and the periodic sequence data by using the prediction models corresponding to the trend sequence data and the periodic sequence data to obtain the second prediction result and the third prediction result includes: respectively determining training set data and test set data corresponding to the trend sequence data and the periodic sequence data; inputting the training set data corresponding to the trend sequence data into a corresponding first prediction model to train the first prediction model, and inputting the test set data corresponding to the trend sequence data into the trained first prediction model to obtain the second prediction result; and inputting the training set data corresponding to the periodic sequence data into a corresponding second prediction model to train the second prediction model, and inputting the test set data corresponding to the periodic sequence data into the trained second prediction model to obtain the third prediction result.
In an exemplary embodiment of the present disclosure, before determining the corresponding training set data and test set data from the periodic sequence data, the method further includes: performing standardization and normalization on the periodic sequence data.
In an exemplary embodiment of the present disclosure, the first prediction model is one of an autoregressive integrated moving average (ARIMA) model, an exponential smoothing model, or a Theta model; the second prediction model is a composite model of a sine wave sequence characterization network and a long short-term memory network.
In an exemplary embodiment of the present disclosure, fusing the first prediction result, the second prediction result, and the third prediction result to obtain the target prediction result includes: performing inverse normalization and inverse standardization on the first prediction result and the third prediction result; and adding the second prediction result, the processed first prediction result, and the processed third prediction result to obtain the target prediction result.
According to one aspect of the present disclosure, there is provided a logistics cargo amount prediction method based on time series, including: acquiring historical cargo volume time sequence data, and performing time sequence decomposition on the historical cargo volume time sequence data to obtain trend sequence data, periodic sequence data and error sequence data; performing time sequence decomposition on the error sequence data, and performing fitting prediction on the obtained multiple decomposition subsequences to determine a first prediction result corresponding to the error sequence data according to the fitting prediction result; fitting and predicting the trend sequence data and the periodic sequence data by adopting a prediction model corresponding to the trend sequence data and the periodic sequence data to obtain a second prediction result and a third prediction result; and carrying out fusion processing on the first prediction result, the second prediction result and the third prediction result to obtain a target prediction result of the cargo quantity.
In an exemplary embodiment of the present disclosure, performing time-series decomposition on the error sequence data, and performing fitting prediction on the obtained multiple decomposed sub-sequences, so as to determine a first prediction result corresponding to the error sequence data according to the fitting prediction result, where the method includes: performing time sequence decomposition on the error sequence data to obtain sub-trend sequence data, sub-period sequence data and sub-random error sequence data; discarding the sub-random error sequence data; carrying out fusion processing on the sub-trend sequence data and the sub-period sequence data to obtain sub-fusion sequence data; and carrying out fitting prediction on the sub-fusion sequence data by adopting a prediction model corresponding to the sub-fusion sequence data to obtain the first prediction result.
In an exemplary embodiment of the disclosure, the prediction model corresponding to the sub-fusion sequence data is a composite model of a sine wave sequence characterization network and a long short-term memory network; performing fitting prediction on the sub-fusion sequence data by using the prediction model corresponding to the sub-fusion sequence data to obtain the first prediction result includes the following steps: inputting the training set data into the sine wave sequence characterization network for sample amplification, and inputting the amplified training set data into the long short-term memory network to train the composite model of the sine wave sequence characterization network and the long short-term memory network; and inputting the test set data into the trained composite model to obtain the first prediction result.
In an exemplary embodiment of the present disclosure, the fusing the first prediction result, the second prediction result, and the third prediction result to obtain a target prediction result of the cargo quantity includes: and adding the first prediction result, the second prediction result and the third prediction result to obtain a target prediction result of the cargo quantity.
According to one aspect of the present disclosure, there is provided a time-series based data prediction apparatus, the apparatus comprising: the data acquisition module is used for acquiring historical time sequence data, and performing time sequence decomposition on the historical time sequence data to obtain trend sequence data, periodic sequence data and error sequence data; the sequence decomposition module is used for carrying out time sequence decomposition on the error sequence data and carrying out fitting prediction on the obtained multiple decomposition subsequences so as to determine a first prediction result corresponding to the error sequence data according to the fitting prediction result; the fitting prediction module is used for respectively carrying out fitting prediction on the trend sequence data and the periodic sequence data by adopting prediction models corresponding to the trend sequence data and the periodic sequence data to obtain a second prediction result and a third prediction result; and the fusion processing module is used for carrying out fusion processing on the first prediction result, the second prediction result and the third prediction result to obtain a target prediction result.
According to one aspect of the present disclosure, there is provided a time-series-based logistics cargo amount prediction apparatus, the apparatus comprising: the time sequence data acquisition module is used for acquiring historical cargo quantity time sequence data, and performing time sequence decomposition on the historical cargo quantity time sequence data to obtain trend sequence data, periodic sequence data and error sequence data; the time sequence decomposition module is used for carrying out time sequence decomposition on the error sequence data and carrying out fitting prediction on the obtained multiple decomposition subsequences so as to determine a first prediction result corresponding to the error sequence data according to the fitting prediction result; the fitting prediction module is used for respectively carrying out fitting prediction on the trend sequence data and the periodic sequence data by adopting prediction models corresponding to the trend sequence data and the periodic sequence data to obtain a second prediction result and a third prediction result; and the prediction result determining module is used for carrying out fusion processing on the first prediction result, the second prediction result and the third prediction result to obtain a target prediction result of the cargo quantity.
According to an aspect of the present disclosure, there is provided a computer storage medium having stored thereon a computer program which, when executed by a processor, implements the time-series based data prediction method of any one of the above or the time-series based logistics cargo amount prediction method of any one of the above.
According to one aspect of the present disclosure, there is provided an electronic device including: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to perform the time-series based data prediction method of any one of the above or the time-series based logistics cargo amount prediction method of any one of the above via execution of the executable instructions.
According to the data prediction method based on the time sequence in the exemplary embodiment of the disclosure, trend sequence data, periodic sequence data and error sequence data are obtained by performing time sequence decomposition on the time sequence data, prediction fitting is performed on each sequence data by adopting different prediction models, wherein the error sequence data is subjected to time sequence decomposition again, a plurality of obtained decomposition subsequences are subjected to fitting processing to obtain a first prediction result, and the first prediction result is fused with a second prediction result and a third prediction result which are obtained by fitting the trend sequence data and the periodic sequence data, so that a final prediction result is obtained. On one hand, a time sequence decomposition method is adopted to decompose time sequence data into a plurality of sequence data, different prediction models are adopted to respectively conduct fitting prediction on each sequence data, and finally a plurality of fitting prediction results are fused, so that the accuracy of prediction is improved by a combined prediction method; on the other hand, the time sequence decomposition is carried out on the error sequence data again, fitting prediction and fusion of fitting results are also carried out respectively, so that error items are prevented from being directly introduced into a prediction model, and the model prediction accuracy is improved to a certain extent.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The above, as well as additional purposes, features, and advantages of exemplary embodiments of the present disclosure will become readily apparent from the following detailed description when read in conjunction with the accompanying drawings. Several embodiments of the present disclosure are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings, in which:
FIG. 1 illustrates a flowchart of time series based data prediction according to an exemplary embodiment of the present disclosure;
FIG. 2 illustrates a flowchart for obtaining a first prediction result according to an exemplary embodiment of the present disclosure;
FIG. 3 illustrates a flowchart for determining a first prediction result based on a composite model according to an exemplary embodiment of the present disclosure;
FIG. 4 illustrates a flow chart for fitting predictions of trend sequence data and periodic sequence data in accordance with an exemplary embodiment of the present disclosure;
FIG. 5 illustrates a flow chart of fitting predictions to trend sequence data using ARIMA in accordance with an exemplary embodiment of the present disclosure;
FIG. 6 illustrates a flow chart of a time series based logistics cargo amount prediction method in accordance with an exemplary embodiment of the present disclosure;
FIG. 7 illustrates a flowchart of a first forecast result acquisition in a time-series logistics volume forecast method in accordance with an exemplary embodiment of the present disclosure;
FIG. 8 illustrates a flowchart of fitting predictions to sub-fusion sequence data to obtain a first prediction result, according to an exemplary embodiment of the present disclosure;
FIG. 9 illustrates a flow diagram of a time series based logistics volume prediction method in accordance with an exemplary embodiment of the present disclosure;
FIG. 10 illustrates a schematic diagram of the results of STL decomposition of the training set data in logistics cargo volume prediction in accordance with an exemplary embodiment of the present disclosure;
FIG. 11 illustrates a schematic diagram of the result of performing STL decomposition again on the error sequence data R1 according to an example embodiment of the present disclosure;
FIG. 12 illustrates a schematic diagram of the results of fitting predictions to the sub-fusion sequence data CS2 using the composite model of the sine wave sequence characterization network and the long short-term memory network, according to an example embodiment of the present disclosure;
FIG. 13 illustrates a schematic diagram of the results of fitting predictions to the trend sequence data C1 using an ARIMA model in accordance with an exemplary embodiment of the present disclosure;
FIG. 14 illustrates a schematic diagram of adding the first prediction result, the second prediction result, and the third prediction result to obtain the final target prediction result according to an exemplary embodiment of the present disclosure;
FIG. 15 shows a schematic diagram of the results of fitting predictions to the periodic sequence data S1 by the LSTM model alone and by the composite model in accordance with an exemplary embodiment of the present disclosure;
FIG. 16 illustrates an autocorrelation and partial autocorrelation plot of trend sequence data in accordance with an exemplary embodiment of the present disclosure;
FIG. 17 is a schematic diagram showing a structure of a data prediction apparatus based on time series according to an exemplary embodiment of the present disclosure;
FIG. 18 is a schematic diagram showing a structure of a time-series-based logistics traffic prediction apparatus in accordance with an exemplary embodiment of the present disclosure;
FIG. 19 illustrates a schematic diagram of a storage medium according to an exemplary embodiment of the present disclosure; and
Fig. 20 shows a block diagram of an electronic device according to an exemplary embodiment of the present disclosure.
In the drawings, the same or corresponding reference numerals indicate the same or corresponding parts.
Detailed Description
Exemplary embodiments will now be described more fully with reference to the accompanying drawings. However, the exemplary embodiments may be embodied in many forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of the exemplary embodiments to those skilled in the art. The same reference numerals in the drawings denote the same or similar structures, and thus detailed descriptions thereof will be omitted.
Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the disclosure. One skilled in the relevant art will recognize, however, that the disclosed aspects may be practiced without one or more of the specific details, or with other methods, components, devices, steps, etc. In other instances, well-known structures, methods, devices, implementations, or operations are not shown or described in detail to avoid obscuring aspects of the disclosure.
The block diagrams depicted in the figures are merely functional entities and do not necessarily correspond to physically separate entities. That is, these functional entities may be implemented in software, in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
Time series prediction plays a very important role in process automation across many application fields, such as financial market prediction, retail demand prediction, and logistics cargo volume prediction. For example, to guarantee service quality, cargo capacity is reserved in advance and prepared according to the prediction result; on some online shopping websites, the sales volume of each type of goods over a future period is a variable considered in a series of decisions such as stocking and promotion. Prediction capability can therefore have an important influence on stocking, transportation scheduling, sales revenue, inventory cost, and the like, which places higher requirements on time series prediction technology and on how to improve the accuracy of data prediction.
In the related art, a single prediction model is often adopted for data with periodic and trend characteristics, so the prediction accuracy is poor and such a time series prediction model is not well suited to these application scenarios.
A time series, also called a dynamic series, is a sequence formed by arranging the values of the same statistical indicator in the chronological order in which they occur. Time series analysis compiles and analyzes the series and, based on the development process, direction, and trend it reflects, makes analogies or extrapolations to predict the level that may be reached at the next time point or in the next period. Time series methods can be classified into short-term, medium-term, and long-term prediction according to the prediction time span. Classified by data analysis method, they include the simple time-series average method, the weighted time-series average method, the simple moving average method, the weighted moving average method, the exponential smoothing method, and so on.
However, in the related art, a time series prediction model is generally adopted, random errors are directly input into the prediction model to perform prediction, and the prediction accuracy of the model is reduced due to the direct introduction of error terms.
Based on this, in an exemplary embodiment of the present disclosure, a data prediction method based on a time series is provided first. Referring to fig. 1, the data prediction method based on time series includes the steps of:
Step S110: acquiring historical time sequence data, and performing time sequence decomposition on the historical time sequence data to obtain trend sequence data, periodic sequence data and error sequence data;
step S120: performing time sequence decomposition on the error sequence data, and performing fitting prediction on the obtained multiple decomposition subsequences to determine a first prediction result corresponding to the error sequence data according to the fitting prediction result;
Step S130: fitting and predicting trend sequence data and periodic sequence data by adopting a prediction model corresponding to the trend sequence data and the periodic sequence data to obtain a second prediction result and a third prediction result;
Step S140: and carrying out fusion processing on the first prediction result, the second prediction result and the third prediction result to obtain a target prediction result.
According to the data prediction method based on the time sequence in the present exemplary embodiment, on one hand, a time sequence decomposition method is adopted to decompose the time sequence data into a plurality of sequence data, different prediction models are adopted to respectively conduct fitting prediction on each sequence data, and finally a plurality of fitting prediction results are fused, so that the accuracy of the prediction is improved by the combined prediction method; on the other hand, the time sequence decomposition is carried out on the error sequence data again, fitting prediction and fusion of fitting results are also carried out respectively, so that error items are prevented from being directly introduced into a prediction model, and the model prediction accuracy is improved to a certain extent.
The method of predicting time-series-based data in the exemplary embodiments of the present disclosure is further described below.
In step S110, historical time series data is acquired, and time series decomposition is performed on the historical time series data to obtain trend series data, periodic series data and error series data.
In an exemplary embodiment of the present disclosure, the time in the historical time series data may take the form of a year, quarter, month, day, or any other unit, and the granularity at which the historical time series data is acquired may be determined according to the actual observation and prediction horizon. The acquired historical data are sorted in chronological order, with the specific time as the index and the corresponding data as the value, forming the historical time series data. STL (Seasonal and Trend decomposition using Loess) is a time series decomposition method that uses robust locally weighted regression as its smoothing method; it is grounded in stochastic processes and mathematical statistics and studies the statistical laws followed by a random data sequence in order to solve practical problems. The present method performs time series decomposition on the historical time series data using STL to obtain trend sequence data, periodic sequence data, and error sequence data: the trend sequence data reflects how the data changes over time, the periodic sequence data reflects the periodic fluctuation of the data over time, and the error sequence data reflects the portion that the trend and periodic sequence data cannot explain. The STL decomposition may be performed with any period; the decomposition period may be selected according to the type of data actually processed, the application scenario, and the like, and is not particularly limited in the present disclosure.
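As one concrete reading of this decomposition step, the sketch below applies the STL implementation in statsmodels to a date-indexed series; the library choice, the period value of 7, and the variable names are illustrative assumptions rather than requirements of the disclosure.

```python
import pandas as pd
from statsmodels.tsa.seasonal import STL

def decompose(series: pd.Series, period: int = 7):
    """Split a date-indexed series into trend, periodic (seasonal) and error (residual) parts."""
    result = STL(series, period=period, robust=True).fit()
    return result.trend, result.seasonal, result.resid

# Illustrative usage with a daily series:
# history = pd.Series(values, index=pd.date_range("2021-04-01", periods=len(values), freq="D"))
# trend, periodic, error = decompose(history, period=7)
```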
In step S120, time-series decomposition is performed on the error sequence data, and fitting prediction is performed on the obtained plurality of decomposed sub-sequences, so as to determine a first prediction result corresponding to the error sequence data according to the fitting prediction result.
In an exemplary embodiment of the present disclosure, to avoid the random error being fed directly into the prediction model as an input parameter, the present disclosure again performs time series decomposition on the error sequence data and performs fitting prediction on the resulting decomposed sub-sequences.
Specifically, fig. 2 shows a flowchart of acquiring a first prediction result according to an exemplary embodiment of the present disclosure, and as shown in fig. 2, the process includes the steps of:
In step S210, time-series decomposition is performed on the error sequence data to obtain sub-trend sequence data, sub-cycle sequence data, and sub-random error sequence data.
In the exemplary embodiment of the present disclosure, STL decomposition may be performed on error sequence data using any period, and the decomposition period may be selected according to the type of actually processed data, the application scenario, and the like; alternatively, the period of STL decomposition of the error sequence data is the same as the period of STL decomposition of the historical time sequence data, for example, t=7 days; alternatively, the period of STL decomposition of the error sequence data may be different from the period of STL decomposition of the historical time sequence data, and the present disclosure may determine whether the period of STL decomposition is the same for two times according to the actual prediction requirement, which is not particularly limited in the present disclosure.
In step S220, the sub-random error sequence data is discarded.
In an exemplary embodiment of the present disclosure, the sub-random error sequence data is random in nature. To avoid feeding these sub-random errors into the prediction model, the present disclosure discards the sub-random error sequence and no longer uses it for model fitting and prediction, thereby preventing this component from entering the model and lowering the model's prediction accuracy.
In step S230, the sub-trend sequence data and the sub-period sequence data are fused to obtain sub-fusion sequence data.
In an exemplary embodiment of the present disclosure, the data values at corresponding times in the sub-trend sequence data and the sub-period sequence data are summed to obtain the sub-fusion sequence data.
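A minimal sketch of steps S210 through S230 under the same assumptions as the earlier sketch (statsmodels STL, period 7): the error sequence is decomposed again, the new sub-random error is dropped, and the remaining two components are summed point-wise.

```python
from statsmodels.tsa.seasonal import STL

def fuse_error_components(error_series, period: int = 7):
    """Decompose the error sequence again and return the sub-fusion sequence (step S230)."""
    sub = STL(error_series, period=period, robust=True).fit()
    # sub.resid is the sub-random error sequence; it is discarded (step S220) so that
    # pure noise is never fed into the downstream prediction model
    return sub.trend + sub.seasonal  # summed at corresponding time points
```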
In step S240, a prediction model corresponding to the sub-fusion sequence data is adopted to perform fitting prediction on the sub-fusion sequence data, so as to obtain the first prediction result.
In an exemplary embodiment of the present disclosure, to facilitate data processing and model fitting prediction, before fitting prediction is performed on the sub-fusion sequence data with the prediction model corresponding to it, standardization and normalization are performed on the sub-fusion sequence data, and training set data and test set data are determined from the standardized and normalized sub-fusion sequence data.
Standardization and normalization scale the data into a specific interval and convert it into dimensionless pure values, which facilitates comparing and weighting indicators with different units or orders of magnitude. Standardization methods include, but are not limited to, linear methods (such as the extremum method and the standard deviation method), polyline methods (such as the three-segment polyline method), and curve methods (such as the half-normal distribution); normalization performs a linear transformation on the original data so that the result falls into the [0, 1] interval. The standardization and normalization procedure of the present disclosure is described below.
For example, let X denote the data in the sub-fusion sequence data. The standardization formula (1) is:
x′_i = (x_i - μ) / σ    (1)
where μ and σ are the mean and standard deviation of X, respectively, and x_1, x_2, x_3, …, x_n are the sequence values of the data.
The standardized sequence set is denoted X′, and the standardized data is then normalized using formula (2):
x″_i = (x′_i - min(X′)) / (max(X′) - min(X′))    (2)
where max(X′) is the maximum value and min(X′) is the minimum value in the standardized sequence set X′.
Further, training set data and test set data are determined from the standardized and normalized sub-fusion sequence data. In the present disclosure, the standardized and normalized sub-fusion sequence data is divided into training set data and test set data according to a preset ratio. For example, for data from April 1 to May 30, 2021, April 1 to May 23 is divided into the training set data and May 24 to May 30 into the test set data. The dividing ratio may be determined according to the actual fitting and prediction situation, for example 8:1, 9:1, and so on, and is not particularly limited in the present disclosure.
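The following sketch implements formulas (1) and (2) together with a chronological train/test split; the use of NumPy and the last-7-points split are illustrative assumptions, and the statistics are returned so that predictions can later be mapped back to the original scale.

```python
import numpy as np

def standardize_and_normalize(x: np.ndarray):
    mu, sigma = x.mean(), x.std()
    z = (x - mu) / sigma                          # formula (1): standardization
    scaled = (z - z.min()) / (z.max() - z.min())  # formula (2): min-max normalization
    return scaled, (mu, sigma, z.min(), z.max())  # stats are needed later for the inverse step

# Illustrative chronological split (e.g. the last 7 points as the test set); the split
# ratio should stay consistent across the decomposed components.
# scaled, stats = standardize_and_normalize(sub_fusion.values)
# train, test = scaled[:-7], scaled[-7:]
```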
In an exemplary embodiment of the present disclosure, a prediction model corresponding to sub-fusion sequence data is a sine wave sequence characterization network and long-short-term memory network composite model, fig. 3 shows a flowchart for determining a first prediction result based on the composite model according to an exemplary embodiment of the present disclosure, and as shown in fig. 3, a process of performing fitting prediction on sub-fusion sequence data by using the prediction model corresponding to sub-fusion sequence data to obtain the first prediction result includes the following steps:
in step S310, the training set data is input to the sine wave sequence characterization network for sample amplification, and the training set data after sample amplification is input to the long-short-term memory network for training the sine wave sequence characterization network and long-short-term memory network composite model.
In an exemplary embodiment of the present disclosure, the training set data is characterized by the sine wave sequence characterization network, which corresponds to an embedding layer for the time sequence; the sine wave sequence characterization network adopted in the present disclosure is defined by formula (3), in which τ is the input sub-fusion sequence data, the parameters h_j and w_j are obtained through model training and learning, and k is the output dimension of the sine wave representation layer, which can be set according to the actual fitting and prediction situation, for example 16, 32, 64, and the like.
A long short-term memory network (LSTM) is a type of recurrent neural network (RNN) that can learn long-term dependency information, comprising an input layer, a hidden layer, and an output layer.
In the model training process, the output of the sine wave sequence characterization network is used as the input of the LSTM, the output dimension of the LSTM is 1, and the data at the next time point is predicted from the data observed at the preceding time points, namely:
y_t = LSTM(x_{t-1}, x_{t-2}, x_{t-3}, …, x_{t-T})    (4)
where y_t is the output of the LSTM and x_i is the sample-amplified training set data input to the LSTM.
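The text does not reproduce formula (3) itself, so the sketch below assumes a Time2Vec-style sinusoidal embedding for the sine wave characterization layer; the PyTorch framing, the sizes k and hidden, and the window length T are all illustrative assumptions, while the single-value output follows formula (4).

```python
import torch
import torch.nn as nn

class SineWaveLSTM(nn.Module):
    """Assumed form of the composite model: sinusoidal embedding followed by an LSTM."""
    def __init__(self, k: int = 32, hidden: int = 64):
        super().__init__()
        self.w = nn.Parameter(torch.randn(k))   # frequency parameters, learned (assumed form)
        self.h = nn.Parameter(torch.randn(k))   # phase/bias parameters, learned (assumed form)
        self.lstm = nn.LSTM(input_size=k, hidden_size=hidden, batch_first=True)
        self.out = nn.Linear(hidden, 1)         # output dimension 1, matching formula (4)

    def forward(self, x):                       # x: (batch, T, 1) window of past values
        z = torch.sin(x * self.w + self.h)      # sinusoidal embedding ("sample amplification") to k features
        h, _ = self.lstm(z)
        return self.out(h[:, -1])               # predict the value at the next time point
```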
In step S320, the test set data is input into the trained composite model to obtain the first prediction result.
When the number of observations in the time series data is small, dividing a training/test data set is difficult: too little training set data makes the model hard to train or even prone to overfitting, and too little validation set data makes the model selection result unreliable; the sample amplification performed by the sine wave sequence characterization network helps to mitigate this difficulty.
In step S130, a prediction model corresponding to the trend sequence data and the periodic sequence data is adopted to perform fitting prediction on the trend sequence data and the periodic sequence data, so as to obtain a second prediction result and a third prediction result.
In an exemplary embodiment of the present disclosure, fitting predictions are made for trend sequence data and periodic sequence data using different prediction models, and fig. 4 shows a flowchart of fitting predictions for trend sequence data and periodic sequence data according to an exemplary embodiment of the present disclosure, as shown in fig. 4, the process includes the steps of:
In step S410, training set data and test set data corresponding to the trend sequence data and the periodic sequence data are determined respectively. Specifically, the trend sequence data and the periodic sequence data are each divided into training set data and test set data according to a preset proportion; the division proportion can be determined according to the actual fitting and prediction situation, but it must be consistent with the division proportion used for the error sequence data.
In step S420, training set data corresponding to the trend sequence data is input into the corresponding first prediction model to train the first prediction model, and test set data corresponding to the trend sequence data is input into the trained first prediction model to obtain the second prediction result. The first prediction model may be an ARIMA (autoregressive integrated moving average) model, an exponential smoothing (ETS) model, a Theta model, or the like; the present disclosure may select the corresponding prediction model according to the actual fitting and prediction needs. The following describes an example of the fitting prediction process for the trend sequence data using ARIMA, referring to FIG. 5, which includes the following steps:
In step S510, stationarity is tested and the series is differenced until stationary.
A unit root test is used to check the stationarity of the trend sequence data. If the trend sequence data is found to be a stationary sequence, ARIMA(p, 0, q) is used as the prediction model; if the test indicates a non-stationary sequence, differencing is applied to the trend sequence data until the test indicates stationarity, and ARIMA(p, d, q) is used as the prediction model, where p is the order of the autoregressive part, q is the order of the moving average part, and d is the number of differencing operations. Unit root tests include the ADF test, PP test, NP test, KPSS test, and ERS test; the type of unit root test is not particularly limited in the present disclosure.
In step S520, parameters p and q are determined.
The value of d has been determined in step S510. Optionally, the autocorrelation function (ACF) plot and partial autocorrelation function (PACF) plot of the trend sequence data may be drawn, and the values of p and q determined by observing where the ACF and PACF cut off or tail off; alternatively, an information criterion such as the AIC (Akaike Information Criterion) may be used. The methods of determining the parameters p and q are not limited to those described above.
In step S530, ARIMA (p, d, q) fits the predictions.
And inputting training set data corresponding to the trend sequence data after the stabilization treatment into a corresponding first prediction model so as to train the first prediction model, and inputting test set data corresponding to the trend sequence data after the stabilization treatment into the trained first prediction model so as to obtain a second prediction result.
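A sketch of steps S510 to S530, assuming the statsmodels ADF test and ARIMA implementation; the default orders p=1 and q=3 echo the worked example later in the text, and the 0.05 significance threshold and 7-step horizon are assumptions.

```python
from statsmodels.tsa.stattools import adfuller
from statsmodels.tsa.arima.model import ARIMA

def fit_trend_model(train, p: int = 1, q: int = 3, horizon: int = 7):
    """Difference until the ADF unit-root test indicates stationarity, then fit ARIMA(p, d, q)."""
    d, series = 0, train
    while adfuller(series.dropna())[1] > 0.05:  # p-value above 0.05: still non-stationary
        series = series.diff()
        d += 1
    model = ARIMA(train, order=(p, d, q)).fit()
    return model.forecast(steps=horizon)        # second prediction result over the test window
```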
In step S430, training set data corresponding to the periodic sequence data is input into the corresponding second prediction model to train the second prediction model, and test set data corresponding to the periodic sequence data is input into the trained second prediction model to obtain the third prediction result. Before the training set data corresponding to the periodic sequence data is input into the corresponding second prediction model, the periodic sequence data is standardized and normalized; the standardization and normalization process is the same as that described for step S240 and is not repeated here.
In an exemplary embodiment of the present disclosure, the second prediction model is a sine wave sequence representation network and long-short-term memory network composite model, and a fitting prediction process of the sine wave sequence representation network and long-short-term memory network composite model on the periodic sequence is the same as the above step S310, and is not described herein again.
In step S140, the first prediction result, the second prediction result, and the third prediction result are fused to obtain a target prediction result.
In an exemplary embodiment of the present disclosure, since the error sequence data and the periodic sequence data underwent standardization and normalization before fitting prediction, the first prediction result and the third prediction result are first subjected to inverse normalization and inverse standardization in order to fuse the first, second, and third prediction results; the second prediction result is then added to the processed first and third prediction results to obtain the target prediction result.
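A sketch of this fusion step, assuming the statistics saved during standardization and normalization (as in the earlier sketch) are available for the scaled components; the trend-based second prediction result was never scaled, so it is added as-is.

```python
def inverse_transform(y, stats):
    """Undo formula (2) and then formula (1) to return a prediction to the original scale."""
    mu, sigma, zmin, zmax = stats
    return (y * (zmax - zmin) + zmin) * sigma + mu

def fuse_predictions(first, second, third, first_stats, third_stats):
    # target prediction = de-scaled error-component forecast + trend forecast + de-scaled periodic forecast
    return inverse_transform(first, first_stats) + second + inverse_transform(third, third_stats)
```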
According to the present method, trend sequence data, periodic sequence data and error sequence data are obtained by performing time series decomposition on the time series data, and different prediction models are used to fit and predict each sequence. The error sequence data is decomposed again, the resulting decomposed sub-sequences are fitted to obtain a first prediction result, and the first prediction result is fused with the second and third prediction results obtained by fitting the trend sequence data and the periodic sequence data, yielding the final prediction result. Decomposing the time series data into multiple sequences, fitting each with a different prediction model, and finally fusing the fitting prediction results improves prediction accuracy through combined prediction; in addition, performing time series decomposition on the error sequence data again, with its own fitting prediction and fusion of the fitting results, prevents the error term from being introduced directly into the prediction model and improves the model's prediction accuracy to a certain extent.
In an exemplary embodiment of the present disclosure, there is also provided a time-series based logistics traffic prediction method, and fig. 6 shows a flowchart of the time-series based logistics traffic prediction method according to an exemplary embodiment of the present disclosure, as shown in fig. 6, the process including the steps of:
in step S610, historical cargo amount time series data is obtained, and time series decomposition is performed on the historical cargo amount time series data to obtain trend series data, periodic series data and error series data;
in step S620, time-series decomposition is performed on the error sequence data, and fitting prediction is performed on the obtained multiple decomposed sub-sequences, so as to determine a first prediction result corresponding to the error sequence data according to the fitting prediction result;
in step S630, fitting and predicting the trend sequence data and the periodic sequence data by adopting prediction models corresponding to the trend sequence data and the periodic sequence data, so as to obtain a second prediction result and a third prediction result;
In step S640, the first prediction result, the second prediction result, and the third prediction result are fused to obtain a target prediction result of the cargo quantity.
In the example embodiment schematically shown in fig. 6, on one hand, a time sequence decomposition method is adopted to decompose the time sequence data of the historical cargo quantity into a plurality of sequence data, different prediction models are adopted to respectively carry out fitting prediction on each sequence data, and finally a plurality of fitting prediction results are fused, so that the accuracy of prediction is improved by the combined prediction method according to the periodicity and trend characteristics of the cargo quantity; in addition, the time sequence decomposition is carried out on the error sequence data again, fitting prediction and fusion of fitting results are also carried out respectively, so that error items are prevented from being directly introduced into a prediction model, and the model prediction accuracy is improved to a certain extent.
It should be further noted that, in the above-mentioned method for predicting data based on time series, the portions related to defining and explaining each method from step S110 to step S140 are also applicable to step S610 to step S640, and redundant contents are not repeated here to avoid excessive contents.
Further, referring to fig. 7, the process of performing time-series decomposition on error sequence data and performing fitting prediction on the obtained plurality of decomposed sub-sequences to determine a first prediction result corresponding to the error sequence data according to the fitting prediction result includes: step S710, performing time sequence decomposition on the error sequence data to obtain sub-trend sequence data, sub-period sequence data and sub-random error sequence data; step S720, discarding the sub-random error sequence data; step S730, carrying out fusion processing on the sub-trend sequence data and the sub-period sequence data to obtain sub-fusion sequence data; and step 740, adopting a prediction model corresponding to the sub-fusion sequence data to carry out fitting prediction on the sub-fusion sequence data so as to obtain a first prediction result.
It should be noted that, the portions related to defining and explaining the methods of step S210 to step S240 in the above-mentioned data prediction method based on time series are also applicable to step S710 to step S740, and are not repeated here to avoid redundant contents.
Still further, in the present disclosure the prediction model corresponding to the sub-fusion sequence data is a composite model of a sine wave sequence characterization network and a long short-term memory network. Referring to FIG. 8, the process of performing fitting prediction on the sub-fusion sequence data by using the prediction model corresponding to the sub-fusion sequence data to obtain the first prediction result includes the following steps: step S810, inputting the training set data into the sine wave sequence characterization network for sample amplification, and inputting the amplified training set data into the long short-term memory network to train the composite model of the sine wave sequence characterization network and the long short-term memory network; step S820, inputting the test set data into the trained composite model to obtain the first prediction result.
It should be noted that, the portions related to defining and explaining the methods of step S310 to step S320 in the above-mentioned data prediction method based on time sequence are also applicable to step S810 to step S820, and are not repeated here to avoid redundant contents.
The time-series based logistics cargo amount prediction method of the present disclosure is described below using the daily order quantity data of a trunk logistics line of a logistics company as an example. The cargo volumes belonging to the same day are aggregated into one data point, all cargo volume data are sorted by date, the actual departure day is used as the index, and the actual transported cargo volume is used as the value, forming the historical cargo volume time series data. Historical cargo volume data from April 1 to May 30, 2021 is selected, of which April 1 to May 23 serves as training set data and May 24 to May 30 as test set data (both the training set data and the test set data are standardized and normalized).
Fig. 9 shows a flowchart of a time-series-based logistics cargo amount prediction method according to an exemplary embodiment of the present disclosure, as shown in fig. 9, the method includes the following processes:
First, STL decomposition with period T=7 is performed on the training set data (logistics cargo volume order data from April 1 to May 23) to obtain trend sequence data C1, periodic sequence data S1, and error sequence data R1 (see FIG. 10). After STL decomposition there are still discrete points in the error sequence data R1 that deviate from the baseline, that is, R1 still carries information; therefore STL decomposition is performed again on R1 to obtain sub-trend sequence data C2, sub-period sequence data S2 (see FIG. 11), and sub-random error sequence data R2.
The sub-random error sequence data R2 is then discarded, and the cargo volumes at corresponding time points in the sub-trend sequence data C2 and the sub-period sequence data S2 are added to obtain sub-fusion sequence data CS2. The sub-fusion sequence data CS2 is fitted and predicted using the composite model of the sine wave sequence characterization network and the long short-term memory network, and the first prediction result is output (see FIG. 12).
Next, the trend sequence data C1 is fitted and predicted using an ARIMA model to output the second prediction result (see FIG. 13), and the periodic sequence data S1 is fitted and predicted using the composite model of the sine wave sequence characterization network and the long short-term memory network to output the third prediction result.
Finally, the first prediction result, the second prediction result and the third prediction result are added to obtain the final target prediction result (see FIG. 14).
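Tying the worked example together, the sketch below chains the helper functions from the earlier sketches; composite_model_predict is a hypothetical wrapper standing in for training and running the sine wave/LSTM composite model, and all names and the 7-day horizon are illustrative.

```python
def predict_cargo_volume(history, horizon: int = 7):
    trend, periodic, error = decompose(history, period=7)    # C1, S1, R1
    sub_fusion = fuse_error_components(error, period=7)      # CS2 (R2 discarded)
    # composite_model_predict (hypothetical) is assumed to return forecasts already
    # mapped back to the original scale via the inverse transform sketched above
    first = composite_model_predict(sub_fusion, horizon)     # sine wave + LSTM on CS2
    second = fit_trend_model(trend, horizon=horizon)         # ARIMA(1, 0, 3) on C1
    third = composite_model_predict(periodic, horizon)       # sine wave + LSTM on S1
    return first + second + third                            # target cargo volume forecast
```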
Fig. 15 shows the result of fitting and predicting the periodic sequence data S1 with an LSTM model alone (upper diagram) and with the composite model of the sine wave sequence characterization network and the long short-term memory network (lower diagram). Comparing the two diagrams shows that the composite model in the lower diagram of fig. 15 captures the fluctuation pattern of the data more accurately (the dotted line is the predicted value and the solid line is the actual value). For the trend sequence data C1, prediction is performed with an ARIMA model: the autocorrelation plot and the partial autocorrelation plot of the trend sequence data are drawn (shown in fig. 16), the parameters p=1, d=0 and q=3 are determined from them, and the trend sequence data is therefore predicted with ARIMA(1,0,3) to obtain the second prediction result. Finally, the first prediction result, the second prediction result and the third prediction result are fused to obtain the target prediction result of the cargo amount.
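Under the same assumptions, the ARIMA step and the additive fusion can be sketched as follows; `C1` and `test_norm` come from the earlier sketches, and `first_prediction`/`third_prediction` are hypothetical placeholders standing in for the composite-model outputs.

```python
# ARIMA(1, 0, 3) for the trend component and the additive fusion of the results.
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from statsmodels.tsa.arima.model import ARIMA

plot_acf(C1)                             # autocorrelation plot of the trend sequence
plot_pacf(C1)                            # partial autocorrelation plot

arima = ARIMA(C1, order=(1, 0, 3)).fit()             # p=1, d=0, q=3 from the plots
second_prediction = arima.forecast(steps=len(test_norm))

# first_prediction and third_prediction stand for the composite-model outputs
# sketched earlier; the target result is simply their sum:
#   target = first_prediction + second_prediction + third_prediction
```

If the components are predicted on the standardized scale, the summed result would additionally be passed through the inverse of the standardization applied during data preparation before being reported as the cargo amount.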
In an exemplary embodiment of the present disclosure, a time-series-based data prediction apparatus 1700 is also provided. Referring to fig. 17, it may include a data acquisition module 1710, a sequence decomposition module 1720, a fitting prediction module 1730 and a fusion processing module 1740. Specifically:
The data acquisition module 1710 is configured to acquire historical time series data, perform time series decomposition on the historical time series data, and obtain trend series data, periodic series data, and error series data;
The sequence decomposition module 1720 is configured to perform time sequence decomposition on the error sequence data, and perform fitting prediction on the obtained multiple decomposition sub-sequences, so as to determine a first prediction result corresponding to the error sequence data according to the fitting prediction result;
The fitting prediction module 1730 is configured to perform fitting prediction on the trend sequence data and the periodic sequence data by using prediction models corresponding to the trend sequence data and the periodic sequence data, so as to obtain a second prediction result and a third prediction result;
The fusion processing module 1740 is configured to perform fusion processing on the first prediction result, the second prediction result and the third prediction result, so as to obtain a target prediction result.
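Purely as an illustration of how these four functional modules fit together (the disclosure defines them as modules of an apparatus, not as a concrete class), one possible composition is sketched below; the class and method names are hypothetical and merely mirror the module responsibilities.

```python
# Hypothetical composition of the four modules; names are not taken from the disclosure.
class TimeSeriesDataPredictor:
    def acquire(self, history):
        """Data acquisition module: decompose the history into C1, S1 and R1."""
        ...

    def decompose_error(self, R1):
        """Sequence decomposition module: decompose R1 again and predict the
        fused sub-series to obtain the first prediction result."""
        ...

    def fit_components(self, C1, S1):
        """Fitting prediction module: ARIMA for the trend, composite model for
        the periodic component (second and third prediction results)."""
        ...

    def fuse(self, first, second, third):
        """Fusion processing module: combine the three prediction results."""
        return first + second + third
```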
In addition, in an exemplary embodiment of the present disclosure, a time-series-based logistics cargo amount prediction apparatus is also provided. Referring to fig. 18, the time-series-based logistics cargo amount prediction apparatus 1800 may include a time-series data acquisition module 1810, a time-series decomposition module 1820, a fitting prediction module 1830 and a prediction result determination module 1840. Specifically:
The time sequence data acquisition module 1810 is configured to acquire historical cargo amount time sequence data, and perform time sequence decomposition on the historical cargo amount time sequence data to obtain trend sequence data, periodic sequence data and error sequence data;
The time sequence decomposition module 1820 is configured to perform time sequence decomposition on the error sequence data, and perform fitting prediction on the obtained multiple decomposition subsequences, so as to determine a first prediction result corresponding to the error sequence data according to the fitting prediction result;
The fitting prediction module 1830 is configured to perform fitting prediction on the trend sequence data and the periodic sequence data by using prediction models corresponding to the trend sequence data and the periodic sequence data, so as to obtain a second prediction result and a third prediction result;
The prediction result determination module 1840 is configured to perform fusion processing on the first prediction result, the second prediction result and the third prediction result, so as to obtain a target prediction result of the cargo amount.
Since each functional module of the time-series-based data prediction apparatus and of the time-series-based logistics cargo amount prediction apparatus of the exemplary embodiments of the present disclosure corresponds to the steps of the respective method embodiments described above, a detailed description thereof is omitted here.
It should be noted that although several modules or units of the time-series based data prediction device and the time-series based logistics volume prediction device are mentioned in the above detailed description, such a division is not mandatory. Indeed, the features and functionality of two or more modules or units described above may be embodied in one module or unit in accordance with embodiments of the present disclosure. Conversely, the features and functions of one module or unit described above may be further divided into a plurality of modules or units to be embodied.
Furthermore, in exemplary embodiments of the present disclosure, a computer storage medium capable of implementing the above-described method is also provided, on which a program product implementing the method described above in this specification is stored. In some possible embodiments, the various aspects of the present disclosure may also be implemented in the form of a program product comprising program code which, when the program product is run on a terminal device, causes the terminal device to carry out the steps according to the various exemplary embodiments of the present disclosure described in the "exemplary methods" section of this specification.
Referring to fig. 19, a program product 1900 for implementing the above-described method according to an exemplary embodiment of the present disclosure is described, which may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be run on a terminal device, such as a personal computer. However, the program product of the present disclosure is not limited thereto, and in this document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include the following: an electrical connection having one or more wires, a portable disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The computer readable signal medium may include a data signal propagated in baseband or as part of a carrier wave with readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of remote computing devices, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., connected via the Internet using an Internet service provider).
In addition, in an exemplary embodiment of the present disclosure, an electronic device capable of implementing the above method is also provided. Those skilled in the art will appreciate that the various aspects of the present disclosure may be implemented as a system, method, or program product. Accordingly, various aspects of the disclosure may be embodied in the following forms: an entirely hardware embodiment, an entirely software embodiment (including firmware, micro-code, etc.), or an embodiment combining hardware and software aspects, which may be referred to herein as a "circuit," "module" or "system."
An electronic device 2000 according to such an embodiment of the present disclosure is described below with reference to fig. 20. The electronic device 2000 illustrated in fig. 20 is merely an example, and should not be construed to limit the functionality and scope of use of embodiments of the present disclosure in any way.
As shown in fig. 20, the electronic device 2000 is embodied in the form of a general purpose computing device. Components of the electronic device 2000 may include, but are not limited to: the at least one processing unit 2010, the at least one storage unit 2020, a bus 2030 connecting the different system components (including the storage unit 2020 and the processing unit 2010), and a display unit 2040.
Wherein the storage unit stores program code that is executable by the processing unit 2010 such that the processing unit 2010 performs steps according to various exemplary embodiments of the present disclosure described in the "exemplary methods" section of the present specification.
The storage unit 2020 may include readable media in the form of volatile storage units such as random access memory unit (RAM) 2021 and/or cache memory unit 2022, and may further include read only memory unit (ROM) 2023.
The storage unit 2020 may also include a program/utility 2024 having a set (at least one) of program modules 2025, such program modules 2025 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment.
The bus 2030 may be one or more of several types of bus structures including a memory unit bus or memory unit controller, a peripheral bus, a graphics accelerator port, a processing unit, or a local bus using any of a variety of bus architectures.
The electronic device 2000 may also be in communication with one or more external devices 2100 (e.g., keyboard, pointing device, bluetooth device, etc.), one or more devices that enable a user to interact with the electronic device 2000, and/or any device (e.g., router, modem, etc.) that enables the electronic device 2000 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 2050. Also, the electronic device 2000 may communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN) and/or a public network, such as the Internet, through a network adapter 2060. As shown, the network adapter 2060 communicates with other modules of the electronic device 2000 via the bus 2030. It should be appreciated that although not shown, other hardware and/or software modules may be used in connection with the electronic device 2000, including, but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.
From the above description of embodiments, those skilled in the art will readily appreciate that the example embodiments described herein may be implemented in software, or in combination with the necessary hardware. Thus, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (may be a CD-ROM, a U-disk, a mobile hard disk, etc.) or on a network, and includes several instructions to cause a computing device (may be a personal computer, a server, a terminal device, or a network device, etc.) to perform the method according to the embodiments of the present disclosure.
Furthermore, the above-described figures are only schematic illustrations of processes included in the method according to the exemplary embodiments of the present disclosure, and are not intended to be limiting. It will be readily appreciated that the processes shown in the above figures do not indicate or limit the temporal order of these processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, for example, among a plurality of modules.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following the general principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It is to be understood that the present disclosure is not limited to the precise arrangements and instrumentalities shown in the drawings, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (13)

1. A time-series-based data prediction method applied to logistics cargo volume prediction, comprising the following steps:
acquiring historical cargo amount time series data, and performing time series decomposition on the historical cargo amount time series data to obtain trend sequence data, periodic sequence data and error sequence data, wherein the actual departure day is selected as the index and the actual transported cargo amount as the value to form the historical cargo amount time series data;
performing time sequence decomposition on the error sequence data, and performing fitting prediction on the obtained multiple decomposition subsequences to determine a first prediction result corresponding to the error sequence data according to the fitting prediction result;
respectively determining training set data and testing set data corresponding to the trend sequence data and the periodic sequence data; inputting training set data corresponding to the trend sequence data into a corresponding first prediction model to train the first prediction model, and inputting test set data corresponding to the trend sequence data into the trained first prediction model to obtain a second prediction result; inputting training set data corresponding to the periodic sequence data into a corresponding second prediction model to train the second prediction model, and inputting test set data corresponding to the periodic sequence data into the trained second prediction model to obtain a third prediction result;
Fusion processing is carried out on the first prediction result, the second prediction result and the third prediction result, so that a target prediction result is obtained;
the performing time sequence decomposition on the error sequence data, and performing fitting prediction on the obtained multiple decomposition subsequences, so as to determine a first prediction result corresponding to the error sequence data according to the fitting prediction result, including:
performing time sequence decomposition on the error sequence data to obtain sub-trend sequence data, sub-period sequence data and sub-random error sequence data;
Discarding the sub-random error sequence data;
carrying out fusion processing on the sub-trend sequence data and the sub-period sequence data to obtain sub-fusion sequence data;
and carrying out fitting prediction on the sub-fusion sequence data by adopting a prediction model corresponding to the sub-fusion sequence data to obtain the first prediction result.
2. The method of claim 1, wherein prior to performing a fit prediction on the sub-fusion sequence data using a prediction model corresponding to the sub-fusion sequence data to obtain the first prediction result, the method further comprises:
Labeling and normalizing the sub-fusion sequence data, and determining training set data and testing set data according to the sub-fusion sequence data after standard normalization.
3. The method according to claim 2, wherein the prediction model corresponding to the sub-fusion sequence data is a composite model of a sine wave sequence characterization network and a long short-term memory network;
The step of performing fitting prediction on the sub-fusion sequence data by adopting a prediction model corresponding to the sub-fusion sequence data to obtain the first prediction result comprises the following steps:
inputting the training set data into the sine wave sequence characterization network for sample enhancement, and inputting the enhanced training set data into the long short-term memory network to train the composite model of the sine wave sequence characterization network and the long short-term memory network;
and inputting the test set data into the trained composite model to obtain the first prediction result.
4. The method of claim 1, wherein prior to determining corresponding training set data and test set data from the periodic sequence data, the method further comprises:
carrying out standardization and normalization processing on the periodic sequence data.
5. The method of claim 1, wherein the first prediction model is one of an autoregressive integrated moving average (ARIMA) model, an exponential smoothing model, or a Theta model; the second prediction model is a composite model of a sine wave sequence characterization network and a long short-term memory network.
6. The method of claim 4, wherein the fusing the first, second, and third predictors to obtain a target predictor comprises:
performing inverse standardization and inverse normalization on the first prediction result and the third prediction result;
and adding the second prediction result, the processed first prediction result and the processed third prediction result to obtain the target prediction result.
7. A time series-based logistics cargo amount prediction method, comprising:
acquiring historical cargo amount time series data, and performing time series decomposition on the historical cargo amount time series data to obtain trend sequence data, periodic sequence data and error sequence data, wherein the actual departure day is selected as the index and the actual transported cargo amount as the value to form the historical cargo amount time series data;
performing time sequence decomposition on the error sequence data, and performing fitting prediction on the obtained multiple decomposition subsequences to determine a first prediction result corresponding to the error sequence data according to the fitting prediction result;
respectively determining training set data and testing set data corresponding to the trend sequence data and the periodic sequence data; inputting training set data corresponding to the trend sequence data into a corresponding first prediction model to train the first prediction model, and inputting test set data corresponding to the trend sequence data into the trained first prediction model to obtain a second prediction result; inputting training set data corresponding to the periodic sequence data into a corresponding second prediction model to train the second prediction model, and inputting test set data corresponding to the periodic sequence data into the trained second prediction model to obtain a third prediction result;
The first prediction result, the second prediction result and the third prediction result are fused to obtain a target prediction result of the cargo quantity;
the performing time sequence decomposition on the error sequence data, and performing fitting prediction on the obtained multiple decomposition subsequences, so as to determine a first prediction result corresponding to the error sequence data according to the fitting prediction result, including:
performing time sequence decomposition on the error sequence data to obtain sub-trend sequence data, sub-period sequence data and sub-random error sequence data;
Discarding the sub-random error sequence data;
carrying out fusion processing on the sub-trend sequence data and the sub-period sequence data to obtain sub-fusion sequence data;
and carrying out fitting prediction on the sub-fusion sequence data by adopting a prediction model corresponding to the sub-fusion sequence data to obtain the first prediction result.
8. The method of claim 7, wherein the prediction model corresponding to the sub-fusion sequence data is a composite model of a sine wave sequence characterization network and a long short-term memory network;
The step of performing fitting prediction on the sub-fusion sequence data by adopting a prediction model corresponding to the sub-fusion sequence data to obtain the first prediction result comprises the following steps:
inputting the training set data into the sine wave sequence characterization network for sample enhancement, and inputting the enhanced training set data into the long short-term memory network to train the composite model of the sine wave sequence characterization network and the long short-term memory network;
and inputting the test set data into the trained composite model to obtain the first prediction result.
9. The method of claim 7, wherein the fusing the first, second, and third predictions to obtain a target prediction of the cargo quantity comprises:
and adding the first prediction result, the second prediction result and the third prediction result to obtain a target prediction result of the cargo quantity.
10. A time-series-based data prediction apparatus for use in logistics volume prediction, comprising:
The data acquisition module is used for acquiring historical time series data, and performing time series decomposition on the historical time series data to obtain trend sequence data, periodic sequence data and error sequence data, wherein the actual departure day is selected as the index and the actual transported cargo amount as the value to form the historical cargo amount time series data;
the sequence decomposition module is used for carrying out time sequence decomposition on the error sequence data and carrying out fitting prediction on the obtained multiple decomposition subsequences so as to determine a first prediction result corresponding to the error sequence data according to the fitting prediction result;
The fitting prediction module is used for determining training set data and testing set data corresponding to the trend sequence data and the periodic sequence data respectively; inputting training set data corresponding to the trend sequence data into a corresponding first prediction model to train the first prediction model, and inputting test set data corresponding to the trend sequence data into the trained first prediction model to obtain a second prediction result; inputting training set data corresponding to the periodic sequence data into a corresponding second prediction model to train the second prediction model, and inputting test set data corresponding to the periodic sequence data into the trained second prediction model to obtain a third prediction result;
the fusion processing module is used for carrying out fusion processing on the first prediction result, the second prediction result and the third prediction result to obtain a target prediction result;
the performing time sequence decomposition on the error sequence data, and performing fitting prediction on the obtained multiple decomposition subsequences, so as to determine a first prediction result corresponding to the error sequence data according to the fitting prediction result, including:
performing time sequence decomposition on the error sequence data to obtain sub-trend sequence data, sub-period sequence data and sub-random error sequence data;
Discarding the sub-random error sequence data;
carrying out fusion processing on the sub-trend sequence data and the sub-period sequence data to obtain sub-fusion sequence data;
and carrying out fitting prediction on the sub-fusion sequence data by adopting a prediction model corresponding to the sub-fusion sequence data to obtain the first prediction result.
11. A time series-based logistics cargo amount prediction apparatus, comprising:
The time series data acquisition module is used for acquiring historical cargo amount time series data, and performing time series decomposition on the historical cargo amount time series data to obtain trend sequence data, periodic sequence data and error sequence data, wherein the actual departure day is selected as the index and the actual transported cargo amount as the value to form the historical cargo amount time series data;
the time sequence decomposition module is used for carrying out time sequence decomposition on the error sequence data and carrying out fitting prediction on the obtained multiple decomposition subsequences so as to determine a first prediction result corresponding to the error sequence data according to the fitting prediction result;
The fitting prediction module is used for determining training set data and testing set data corresponding to the trend sequence data and the periodic sequence data respectively; inputting training set data corresponding to the trend sequence data into a corresponding first prediction model to train the first prediction model, and inputting test set data corresponding to the trend sequence data into the trained first prediction model to obtain a second prediction result; inputting training set data corresponding to the periodic sequence data into a corresponding second prediction model to train the second prediction model, and inputting test set data corresponding to the periodic sequence data into the trained second prediction model to obtain a third prediction result;
The prediction result determining module is used for carrying out fusion processing on the first prediction result, the second prediction result and the third prediction result to obtain a target prediction result of the cargo quantity;
the performing time sequence decomposition on the error sequence data, and performing fitting prediction on the obtained multiple decomposition subsequences, so as to determine a first prediction result corresponding to the error sequence data according to the fitting prediction result, including:
performing time sequence decomposition on the error sequence data to obtain sub-trend sequence data, sub-period sequence data and sub-random error sequence data;
Discarding the sub-random error sequence data;
carrying out fusion processing on the sub-trend sequence data and the sub-period sequence data to obtain sub-fusion sequence data;
and carrying out fitting prediction on the sub-fusion sequence data by adopting a prediction model corresponding to the sub-fusion sequence data to obtain the first prediction result.
12. A storage medium having stored thereon a computer program which, when executed by a processor, implements the time-series-based data prediction method of any one of claims 1 to 6 or the time-series-based logistics cargo amount prediction method of any one of claims 7 to 9.
13. An electronic device, comprising:
a processor; and
A memory for storing executable instructions of the processor;
wherein the processor is configured to perform the time-series-based data prediction method of any one of claims 1 to 6 or the time-series-based logistics cargo amount prediction method of any one of claims 7 to 9 via execution of the executable instructions.
CN202111100746.6A 2021-09-18 2021-09-18 Data prediction method and device, logistics cargo amount prediction method, medium and equipment Active CN113792931B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111100746.6A CN113792931B (en) 2021-09-18 2021-09-18 Data prediction method and device, logistics cargo amount prediction method, medium and equipment

Publications (2)

Publication Number Publication Date
CN113792931A CN113792931A (en) 2021-12-14
CN113792931B true CN113792931B (en) 2024-06-18

Family

ID=79184130

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106203732A (en) * 2016-07-26 2016-12-07 国网重庆市电力公司 Error in dipping computational methods based on ITD and time series analysis
CN106408341A (en) * 2016-09-21 2017-02-15 北京小米移动软件有限公司 Goods sales volume prediction method and device, and electronic equipment

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11146445B2 (en) * 2019-12-02 2021-10-12 Alibaba Group Holding Limited Time series decomposition
CN113037531A (en) * 2019-12-25 2021-06-25 中兴通讯股份有限公司 Flow prediction method, device and storage medium
CN111160651B (en) * 2019-12-31 2022-07-08 福州大学 STL-LSTM-based subway passenger flow prediction method
CN111161538B (en) * 2020-01-06 2021-07-02 东南大学 Short-term traffic flow prediction method based on time series decomposition
CN113379168B (en) * 2021-08-11 2021-12-17 云智慧(北京)科技有限公司 Time series prediction processing method, device and equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant