CN108829718A - Method and apparatus for data processing - Google Patents

Method and apparatus for data processing

Info

Publication number
CN108829718A
Authority
CN
China
Prior art keywords
data
target
historical data
actual value
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810426890.0A
Other languages
Chinese (zh)
Other versions
CN108829718B (en)
Inventor
周葳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing QIYI Century Science and Technology Co Ltd
Original Assignee
Beijing QIYI Century Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing QIYI Century Science and Technology Co Ltd
Priority to CN201810426890.0A
Publication of CN108829718A
Application granted
Publication of CN108829718B
Legal status: Active

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

Embodiments of the present invention provide a method and an apparatus for data processing. The method includes: determining a data indicator to be processed; obtaining historical data of the data indicator; determining a predicted value of the data indicator according to the historical data; when an actual value of the data indicator is collected, calculating the difference between the actual value and the predicted value; and, if the difference is less than a preset threshold, determining that the actual value is valid data. The embodiment uses the historical data of a data indicator to predict the indicator's future value, so that once the actual value has been collected, its validity can be determined by checking whether it is consistent with the trend of the historical data. This adds a data verification step to the data processing pipeline, so that data users do not receive erroneous data and the accuracy of subsequent data analysis results is ensured.

Description

Method and apparatus for data processing
Technical field
The present invention relates to the field of computer technology, and in particular to a data processing method and a data processing apparatus.
Background
In computer science, data is the general term for all symbols that can be input into a computer and processed by computer programs. It covers the numbers, letters, symbols, and analog quantities that carry a definite meaning and are fed into an electronic computer for processing. The range of objects that computers now store and process is very broad, and the data representing those objects has become correspondingly more complex.
Take the video industry as an example. While providing functions such as video playback, a video website also continuously collects the various kinds of data generated when users search for and watch videos. After processing, this data yields figures such as click volume, search volume, and exposure volume, which analysts examine to provide a basis for the company's subsequent operating decisions.
However, during production, differences in data collection methods or failures during transmission may introduce errors of varying degree into the final data. If analysts obtain and use erroneous data directly, the data analysis results will be inaccurate. Worse, the company's operating decisions may be misled.
Summary of the invention
In view of the above problems, embodiments of the present invention are proposed to provide a data processing method and a corresponding data processing apparatus that overcome the above problems or at least partially solve them.
To solve the above problems, an embodiment of the present invention discloses a data processing method, including:
determining a data indicator to be processed;
obtaining historical data of the data indicator;
determining a predicted value of the data indicator according to the historical data;
when an actual value of the data indicator is collected, calculating the difference between the actual value and the predicted value;
if the difference is less than a preset threshold, determining that the actual value is valid data.
Optionally, the step of obtaining the historical data of the data indicator includes:
determining a collection period for the historical data;
obtaining the historical data of the data indicator within the collection period.
Optionally, the step of determining the predicted value of the data indicator according to the historical data includes:
generating a prediction model for the data indicator according to the historical data;
calculating the predicted value of the data indicator using the prediction model.
Optionally, the step of generating the prediction model for the data indicator according to the historical data includes:
setting default regression parameter values;
training a preset autoregressive model by changing the default regression parameter values until the prediction error on the historical data is less than a second preset threshold, thereby generating the prediction model.
Optionally, the method further includes:
if the difference exceeds the preset threshold, performing error correction on the data calculation script;
recalculating the actual value of the data indicator using the corrected data calculation script.
To solve the above problems, an embodiment of the present invention also discloses a data processing apparatus, including:
a data indicator determining module, configured to determine a data indicator to be processed;
a historical data obtaining module, configured to obtain historical data of the data indicator;
a predicted value determining module, configured to determine a predicted value of the data indicator according to the historical data;
a difference calculating module, configured to calculate, when an actual value of the data indicator is collected, the difference between the actual value and the predicted value;
a valid data determination module, configured to determine that the actual value is valid data if the difference is less than a preset threshold.
Optionally, the historical data obtaining module includes:
a collection period determining submodule, configured to determine a collection period for the historical data;
a historical data obtaining submodule, configured to obtain the historical data of the data indicator within the collection period.
Optionally, the predicted value determining module includes:
a prediction model generating submodule, configured to generate a prediction model for the data indicator according to the historical data;
a predicted value calculating submodule, configured to calculate the predicted value of the data indicator using the prediction model.
Optionally, the prediction model generating submodule includes:
a regression parameter value setting unit, configured to set default regression parameter values;
an autoregressive model training unit, configured to train a preset autoregressive model by changing the default regression parameter values until the prediction error on the historical data is less than a second preset threshold, thereby generating the prediction model.
Optionally, the apparatus further includes:
an error correction module, configured to perform error correction on the data calculation script if the difference exceeds the preset threshold;
an actual value calculating module, configured to recalculate the actual value of the data indicator using the corrected data calculation script.
Compared with the background art, the embodiments of the present invention have the following advantages:
The embodiments of the present invention determine a data indicator to be processed, obtain the historical data of that indicator, and determine a predicted value of the indicator from the historical data. When the actual value of the indicator is collected, the difference between the actual value and the predicted value is calculated; if the difference is less than a preset threshold, the actual value is determined to be valid data. The embodiment uses the historical data of a data indicator to predict the indicator's future value, so that once the actual value has been collected, its validity can be determined by checking whether it is consistent with the trend of the historical data. This adds a data verification step to the data processing pipeline, so that data users do not receive erroneous data and the accuracy of subsequent data analysis results is ensured.
Brief description of the drawings
Fig. 1 is a flowchart of the steps of a data processing method according to an embodiment of the present invention;
Fig. 2 is a flowchart of the steps of another data processing method according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of the business flow of a data processing method according to an embodiment of the present invention;
Fig. 4 is a schematic structural block diagram of a data processing apparatus embodiment according to an embodiment of the present invention.
Detailed description of the embodiments
To make the above objects, features, and advantages of the present invention clearer and easier to understand, the present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
Referring to Fig. 1, a flowchart of the steps of a data processing method according to an embodiment of the present invention is shown. The method may specifically include the following steps:
Step 101, determine a data indicator to be processed;
In the embodiment of the present invention, a data indicator may refer to a kind of preliminarily processed or computed data obtained by processing or aggregating the collected raw data in some way. For example, the data indicator may be ad exposure volume, video program search volume, click volume, or advertising revenue. The present embodiment does not limit the specific type of data indicator.
In the embodiment of the present invention, the data indicator that currently needs to be processed can be determined first, and that indicator is then processed to judge whether the data currently collected or calculated is true and valid.
For example, to determine whether the ad exposure data obtained from statistics is accurate, ad exposure volume can be taken as the current data indicator to be processed.
Step 102, obtain the historical data of the data indicator;
In the embodiment of the present invention, historical data may refer to data from a past period of time, for example the past day, week, month, or year. Those skilled in the art can determine the collection period of the historical data according to actual needs; the present embodiment does not limit this.
In general, the future value of a data indicator can be predicted from its historical data. To keep the prediction accurate, historical data covering as long a period as possible should therefore be obtained. Balancing computational efficiency against accuracy, the past year can be used as the collection period for the historical data.
Step 103, determine the predicted value of the data indicator according to the historical data;
In the embodiment of the present invention, the predicted value of a data indicator may refer to the indicator's value at a future time point or over a future period for which data has not yet been generated, for example the predicted ad exposure volume for today.
In a specific implementation, after the historical data of the data indicator has been obtained, a data prediction model can be built from that historical data, and the current predicted value of the data indicator can be calculated with the model.
Take ad exposure volume as an example. After the daily ad exposure data for the past year (365 days) has been collected, a prediction model can be generated from that data, and the model can then be used to predict today's ad exposure volume.
Of course, building a prediction model from historical data and then calculating the predicted value of the data indicator is only one example. Those skilled in the art may determine the predicted value of the data indicator in other ways according to actual needs; the present embodiment does not limit this.
Step 104, when the actual value of the data indicator is collected, calculate the difference between the actual value and the predicted value;
In the embodiment of the present invention, the actual value of a data indicator may refer to the data actually collected. In general, this is the value obtained after a data calculation script has processed or aggregated the raw data.
For example, the actual value of the ad exposure data may refer to the ad exposure volume for the current day obtained from statistics.
In the embodiment of the present invention, to judge whether the collected actual value of the data indicator is valid data, the actual value can first be compared with the predicted value of the data indicator determined in step 103.
In a specific implementation, the comparison can be made by calculating the difference between the actual value and the predicted value. For convenience in subsequent calculation, the difference may further be taken as the absolute value of the difference.
Step 105, if the difference is less than a preset threshold, determine that the actual value is valid data.
In the embodiment of the present invention, if the difference between the actual value and the predicted value calculated in step 104 is within the preset threshold range, the actual value can be considered to lie within the error range of the predicted value and to be consistent with the trend of the historical data, so it is very likely valid. The actual value can then be marked as valid data and passed on to the business department for use.
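The comparison in steps 104 and 105 can be sketched as follows. This is a minimal illustration rather than the patented implementation; the function name `is_valid` and the sample numbers are hypothetical.

```python
def is_valid(actual: float, predicted: float, threshold: float) -> bool:
    """Return True when the actual value lies within `threshold` of the prediction."""
    # Step 104: use the absolute value of the difference.
    difference = abs(actual - predicted)
    # Step 105: a difference below the preset threshold marks the value as valid.
    return difference < threshold


# Hypothetical ad exposure figures for one day.
print(is_valid(actual=10_250.0, predicted=10_000.0, threshold=500.0))  # True
print(is_valid(actual=4_000.0, predicted=10_000.0, threshold=500.0))   # False
```

How the threshold is chosen is not specified in the text; in practice it would likely depend on how much the indicator normally fluctuates.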
In the embodiment of the present invention, a data indicator to be processed is determined, the historical data of that indicator is obtained, and a predicted value of the indicator is determined from the historical data. When the actual value of the indicator is collected, the difference between the actual value and the predicted value is calculated; if the difference is less than a preset threshold, the actual value is determined to be valid data. The embodiment uses the historical data of a data indicator to predict the indicator's future value, so that once the actual value has been collected, its validity can be determined by checking whether it is consistent with the trend of the historical data. This adds a data verification step to the data processing pipeline, so that data users do not receive erroneous data and the accuracy of subsequent data analysis results is ensured.
Referring to Fig. 2, a flowchart of the steps of another data processing method according to an embodiment of the present invention is shown. The method may specifically include the following steps:
Step 201, determine a data indicator to be processed;
In the embodiment of the present invention, a data indicator may refer to a kind of preliminarily processed or computed data obtained by processing or aggregating the collected raw data in some way. For example, the data indicator may be ad exposure volume, video program search volume, click volume, or advertising revenue. The present embodiment does not limit the specific type of data indicator.
In the embodiment of the present invention, the data indicator that currently needs to be processed can be determined first, and that indicator is then processed to judge whether the data currently collected or calculated is true and valid.
For example, to determine whether the ad exposure data obtained from statistics is accurate, ad exposure volume can be taken as the current data indicator to be processed.
Step 202, determine the collection period of the historical data;
In the embodiment of the present invention, historical data may refer to data from a past period of time, and that period of time is the collection period of the historical data. For example, the collection period may be the past day, the past month, or the past year.
For ease of understanding, the present embodiment is described with the past year as the collection period of the historical data.
For example, if the current time point is January 1, 2018, the collection period of the historical data for a data indicator is the year counted back from January 1, 2018, that is, January 1, 2017 to December 31, 2017.
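The date arithmetic in this example can be sketched in a few lines. The helper name `collection_period` and its 365-day default are assumptions made for illustration.

```python
from datetime import date, timedelta
from typing import Tuple


def collection_period(today: date, days: int = 365) -> Tuple[date, date]:
    """Return the (start, end) dates of the window ending the day before `today`."""
    start = today - timedelta(days=days)  # first day of the window
    end = today - timedelta(days=1)       # last fully collected day
    return start, end


start, end = collection_period(date(2018, 1, 1))
print(start, end)  # 2017-01-01 2017-12-31
```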
Step 203, obtain the historical data of the data indicator within the collection period;
In the embodiment of the present invention, after the collection period of the historical data has been determined, the data for the corresponding indicator within that period can be extracted.
In a specific implementation, the per-day summary results for the indicator can be looked up in the data statistics table, and the following SQL statement can then be used to retrieve all the data for the indicator over the past 365 days:
SELECT dt, SUM(revenue) FROM table WHERE dt BETWEEN DATE_SUB(DATE_FORMAT(NOW(), '%Y-%m-%d'), INTERVAL 1 YEAR) AND DATE_SUB(DATE_FORMAT(NOW(), '%Y-%m-%d'), INTERVAL 1 DAY) GROUP BY dt
In the embodiment of the present invention, for convenience in subsequent calculation, the historical data can also be converted to a vector once it has been obtained. Specifically, the summary results for the past 365 days can be converted to a vector of length 365, (x1, x2, ..., x365).
It should be noted that the length of the resulting vector depends on the length of the collection period. For example, if the past 90 days are used as the collection period, the conversion yields a vector of length 90, that is, (x1, x2, ..., x90). The present embodiment does not limit this.
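One way to turn per-day query results into such a vector is sketched below. The function name `to_vector` is hypothetical, and filling days that have no row with 0.0 is an assumption; the source does not say how gaps are handled.

```python
from datetime import date, timedelta
from typing import Dict, List


def to_vector(daily_totals: Dict[date, float], start: date, end: date) -> List[float]:
    """Order daily totals by date from `start` to `end` inclusive.

    Days with no row are filled with 0.0 (an assumption, not from the source).
    """
    n_days = (end - start).days + 1
    return [daily_totals.get(start + timedelta(days=i), 0.0) for i in range(n_days)]


# Hypothetical rows as they might come back from the per-day query.
rows = {date(2017, 1, 1): 120.0, date(2017, 1, 2): 135.0, date(2017, 1, 4): 128.0}
vector = to_vector(rows, date(2017, 1, 1), date(2017, 1, 4))
print(vector)  # [120.0, 135.0, 0.0, 128.0]
```

The vector length follows the collection period: a full year from January 1, 2017 to December 31, 2017 gives 365 entries.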
Step 204, generate a prediction model for the data indicator according to the historical data;
In the embodiment of the present invention, an autoregressive model can be trained on the historical data to obtain the prediction model for the data indicator.
An autoregressive model (AR model) is a statistical method for handling time series. It uses earlier values of the same variable, x1 to xt-1, to predict the current value xt, and assumes a linear relationship among them. Autoregressive models are widely used in economics, informatics, and the prediction of natural phenomena.
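For reference, an autoregressive model of order p is conventionally written as shown below. The symbols c (constant term), φ_i (regression parameters), and ε_t (noise term) follow common textbook notation and do not appear in the source text.

```latex
x_t = c + \sum_{i=1}^{p} \varphi_i \, x_{t-i} + \varepsilon_t
```

Under this notation, the regression parameters adjusted during training correspond to c and φ_1, ..., φ_p.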
In a specific implementation, default regression parameter values can be set first. The preset autoregressive model is then trained by changing those values until the prediction error on the historical data is less than a set threshold, which yields the prediction model.
Take ad exposure volume as the data indicator. Because ad data has a weekly (7-day) periodicity, a seventh-order autoregressive model can be chosen for training. Using the 365 days of historical data, a set of default regression parameters is set and then repeatedly updated by gradient descent until the model's prediction error on the historical values is minimized, which yields the prediction model.
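The training procedure described above might look roughly like the following pure-Python sketch. Everything here is illustrative: the function names, the zero initial parameters, the scaling step, the learning rate, the epoch limit, and the value of the error threshold are assumptions, and the synthetic series merely imitates a weekly pattern.

```python
from typing import List, Sequence, Tuple


def predict(coeffs: Sequence[float], bias: float, window: Sequence[float]) -> float:
    """Linear AR prediction; window[0] is the oldest value, window[-1] the newest."""
    return bias + sum(c * x for c, x in zip(coeffs, window))


def mean_squared_error(coeffs, bias, data, order) -> float:
    errors = [predict(coeffs, bias, data[t - order:t]) - data[t]
              for t in range(order, len(data))]
    return sum(e * e for e in errors) / len(errors)


def train_ar(series: Sequence[float], order: int = 7, learning_rate: float = 0.05,
             max_epochs: int = 200, error_threshold: float = 1e-4
             ) -> Tuple[List[float], float, float]:
    """Fit AR(order) parameters by batch gradient descent on scaled data."""
    scale = max(abs(x) for x in series) or 1.0   # scaling keeps the step size stable
    data = [x / scale for x in series]
    coeffs, bias = [0.0] * order, 0.0            # default regression parameter values
    n = len(data) - order
    for _ in range(max_epochs):
        # Stop once the error on the history drops below the (second) threshold.
        if mean_squared_error(coeffs, bias, data, order) < error_threshold:
            break
        grad_bias, grad_coeffs = 0.0, [0.0] * order
        for t in range(order, len(data)):
            window = data[t - order:t]
            err = predict(coeffs, bias, window) - data[t]
            grad_bias += 2.0 * err / n
            for i, x in enumerate(window):
                grad_coeffs[i] += 2.0 * err * x / n
        bias -= learning_rate * grad_bias
        coeffs = [c - learning_rate * g for c, g in zip(coeffs, grad_coeffs)]
    return coeffs, bias, scale


# Synthetic 365-day series with higher values on two days of each week.
series = [1000.0 + 200.0 * (day % 7 in (5, 6)) for day in range(365)]
coeffs, bias, scale = train_ar(series)
scaled = [x / scale for x in series]
print(mean_squared_error(coeffs, bias, scaled, 7))
```

Because the model is trained on scaled data, a prediction made with these parameters has to be multiplied by `scale` to return to the original units.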
Step 205, calculate the predicted value of the data indicator using the prediction model;
In a specific implementation, after a data prediction model has been built from the historical data, the model can be used to calculate the current predicted value of the data indicator. That is, once the model has been trained on the historical data, it can predict the next data point.
For example, to predict the ad exposure volume for January 1, 2018, the model can be trained on the ad exposure history from January 1 to December 31, 2017. Once the corresponding prediction model has been obtained, it can be used to predict the ad exposure data for January 1, 2018.
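Applying a trained autoregressive model to forecast the next point can be sketched as follows. The function `predict_next` and the coefficient values are hypothetical; the coefficients below simply encode "same value as seven days ago" to keep the arithmetic easy to follow.

```python
from typing import Sequence


def predict_next(history: Sequence[float], coeffs: Sequence[float],
                 bias: float = 0.0) -> float:
    """Predict the value that follows `history` with an AR model.

    coeffs[0] weights the oldest value in the window and coeffs[-1] the newest.
    """
    order = len(coeffs)
    if len(history) < order:
        raise ValueError("history is shorter than the model order")
    window = history[-order:]
    return bias + sum(c * x for c, x in zip(coeffs, window))


# Hypothetical seventh-order model that repeats the value from seven days earlier.
weekly_coeffs = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
last_week = [100.0, 110.0, 120.0, 130.0, 140.0, 150.0, 160.0]
print(predict_next(last_week, weekly_coeffs))  # 100.0
```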
Step 206, when the actual value of the data indicator is collected, calculate the difference between the actual value and the predicted value;
In the embodiment of the present invention, the actual value of a data indicator may refer to the data actually collected. In general, this is the value obtained after a data calculation script has processed or aggregated the raw data.
For example, the actual value of the ad exposure data may refer to the ad exposure volume for January 1, 2018 obtained from statistics.
In the embodiment of the present invention, to judge whether the collected actual value of the data indicator is valid data, the actual value can first be compared with the predicted value of the data indicator determined in step 205.
In a specific implementation, the comparison can be made by calculating the difference between the actual value and the predicted value. For convenience in subsequent calculation, the difference may further be taken as the absolute value of the difference.
Step 207, if the difference is less than a preset threshold, determine that the actual value is valid data;
In the embodiment of the present invention, if the difference between the actual value and the predicted value calculated in step 206 is within the preset threshold range, the actual value can be considered to lie within the error range of the predicted value and to be consistent with the trend of the historical data, so it is very likely valid. The actual value can then be marked as valid data and passed on to the business department for use.
Step 208, if the difference exceeds the preset threshold, perform error correction on the data calculation script;
In the embodiment of the present invention, if the difference between the calculated actual value and the predicted value is too large and falls outside the preset threshold range, the actual value can be considered to lie outside the error range of the predicted value and to be inconsistent with the trend of the historical data, so it is very likely invalid.
In that case, error correction can be performed on the data calculation script to eliminate the fault that caused the error during data calculation. Step 209 is then executed, and the actual value of the data indicator is recalculated with the corrected data calculation script.
Step 209, recalculate the actual value of the data indicator using the corrected data calculation script.
In the embodiment of the present invention, after the actual value of the data indicator has been recalculated, the method can return to step 206 and again determine the validity of the newly calculated actual value.
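The loop through steps 206 to 209 can be sketched as below. The callbacks `compute_actual` and `fix_script` stand in for running and correcting the data calculation script and are hypothetical; the `max_attempts` limit is also an assumption, since the source does not bound the number of retries.

```python
from typing import Callable, Tuple


def validate_with_correction(compute_actual: Callable[[], float],
                             fix_script: Callable[[], None],
                             predicted: float, threshold: float,
                             max_attempts: int = 3) -> Tuple[float, bool]:
    """Recompute the actual value after each correction until it passes the check."""
    actual = compute_actual()
    for _ in range(max_attempts):
        if abs(actual - predicted) < threshold:   # steps 206 and 207
            return actual, True
        fix_script()                              # step 208: correct the script
        actual = compute_actual()                 # step 209: recalculate
    return actual, abs(actual - predicted) < threshold


# Hypothetical script that under-counts until it has been corrected once.
state = {"fixed": False}


def compute_actual() -> float:
    return 10_100.0 if state["fixed"] else 4_000.0


def fix_script() -> None:
    state["fixed"] = True


value, ok = validate_with_correction(compute_actual, fix_script,
                                     predicted=10_000.0, threshold=500.0)
print(value, ok)  # 10100.0 True
```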
In the embodiment of the present invention, a data verification step is added to the data processing flow to judge whether the actual value obtained from calculation is consistent with the trend of the historical data, and thereby to determine whether the actual value is valid data. If the judgment confirms that the actual value is unusable invalid data, the data calculation script can be corrected manually and the calculation run again. This safeguards the accuracy of the data calculation, so that data users do not receive erroneous data and the accuracy of subsequent data analysis results is ensured.
For ease of understanding, the data processing method of the present embodiment is described below with a complete example.
Fig. 3 is a schematic diagram of the business flow of a data processing method according to an embodiment of the present invention. In Fig. 3, when a data analyst runs a data calculation task, a prediction model for the data indicator can be built from historical data and used to predict the value for the corresponding date. After the actual value of the indicator for the same date has been calculated, the predicted value and the actual value can be compared. If the difference between the actual value and the predicted value meets the preset requirement, the actual value can be marked as valid data and passed to the party requesting the data for analysis. If the difference between the actual value and the predicted value does not meet the preset requirement, the actual value can be considered unusable invalid data. In that case, the data calculation script can be corrected and the indicator's actual value for that day recalculated, which guarantees the validity of the data.
It should be noted that, for simplicity of description, the method embodiments are expressed as a series of combined actions. Those skilled in the art should understand, however, that the embodiments of the present invention are not limited by the described order of actions, because according to the embodiments of the present invention some steps may be performed in another order or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions involved are not necessarily required by the embodiments of the present invention.
Referring to Fig. 4, a schematic structural block diagram of a data processing apparatus embodiment according to an embodiment of the present invention is shown. The apparatus may specifically include the following modules:
a data indicator determining module 401, configured to determine a data indicator to be processed;
a historical data obtaining module 402, configured to obtain historical data of the data indicator;
a predicted value determining module 403, configured to determine a predicted value of the data indicator according to the historical data;
a difference calculating module 404, configured to calculate, when an actual value of the data indicator is collected, the difference between the actual value and the predicted value;
a valid data determination module 405, configured to determine that the actual value is valid data if the difference is less than a preset threshold.
In the embodiment of the present invention, the data indicator determining module determines the data indicator to be processed, the historical data obtaining module obtains the historical data of that indicator, and the predicted value determining module determines the predicted value of the indicator from the historical data. When the actual value of the indicator is collected, the difference calculating module calculates the difference between the actual value and the predicted value; if the difference is less than a preset threshold, the valid data determination module determines that the actual value is valid data. The embodiment uses the historical data of a data indicator to predict the indicator's future value, so that once the actual value has been collected, its validity can be determined by checking whether it is consistent with the trend of the historical data. This adds a data verification step to the data processing pipeline, so that data users do not receive erroneous data and the accuracy of subsequent data analysis results is ensured.
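As a rough software analogue of this module structure, the sketch below wires the modules together as plain callables. The class name `DataProcessingDevice`, its fields, and the mean-of-history predictor are all hypothetical stand-ins chosen for brevity; the source does not prescribe any particular code structure.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class DataProcessingDevice:
    get_history: Callable[[str], List[float]]   # role of module 402
    predict: Callable[[List[float]], float]     # role of module 403
    threshold: float                            # preset threshold used by module 405

    def check(self, indicator: str, actual: float) -> bool:
        """Return True when `actual` is judged valid for `indicator` (module 401 input)."""
        history = self.get_history(indicator)
        predicted = self.predict(history)
        difference = abs(actual - predicted)    # role of module 404
        return difference < self.threshold      # role of module 405


# Stand-in history source and predictor (the mean of the history), for illustration only.
device = DataProcessingDevice(
    get_history=lambda name: [100.0, 102.0, 98.0],
    predict=lambda history: sum(history) / len(history),
    threshold=10.0,
)
print(device.check("ad_exposure", 105.0))  # True
print(device.check("ad_exposure", 150.0))  # False
```

Keeping the predictor as an injected callable means a trained autoregressive model could be substituted for the simple mean without changing the validity check.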
In the embodiment of the present invention, the historical data obtaining module 402 may specifically include the following submodules:
a collection period determining submodule, configured to determine a collection period for the historical data;
a historical data obtaining submodule, configured to obtain the historical data of the data indicator within the collection period.
In the embodiment of the present invention, the predicted value determining module 403 may specifically include the following submodules:
a prediction model generating submodule, configured to generate a prediction model for the data indicator according to the historical data;
a predicted value calculating submodule, configured to calculate the predicted value of the data indicator using the prediction model.
In the embodiment of the present invention, the prediction model generating submodule may specifically include the following units:
a regression parameter value setting unit, configured to set default regression parameter values;
an autoregressive model training unit, configured to train a preset autoregressive model by changing the default regression parameter values until the prediction error on the historical data is less than a second preset threshold, thereby generating the prediction model.
In the embodiment of the present invention, the apparatus may further include the following modules:
an error correction module, configured to perform error correction on the data calculation script if the difference exceeds the preset threshold;
an actual value calculating module, configured to recalculate the actual value of the data indicator using the corrected data calculation script.
In the embodiment of the present invention, a data verification step is added to the data processing flow: the difference calculating module calculates the difference between the actual value and the predicted value, and the valid data determination module then judges whether the actual value obtained from calculation is consistent with the trend of the historical data, thereby determining whether the actual value is valid data. If the judgment confirms that the actual value is unusable invalid data, the error correction module can correct the data calculation script and the calculation can be run again. This safeguards the accuracy of the data calculation, so that data users do not receive erroneous data and the accuracy of subsequent data analysis results is ensured.
Since the device embodiment is basically similar to the method embodiment, it is described relatively briefly; for relevant details, refer to the corresponding description of the method embodiment.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for the same or similar parts the embodiments may be referred to one another.
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a device, or a computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, embodiments of the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
Embodiments of the present invention are described with reference to flowcharts and/or block diagrams of the method, terminal device (system), and computer program product according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing terminal device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing terminal device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing terminal device to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal device, so that a series of operational steps is executed on the computer or other programmable terminal device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable terminal device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present invention have been described, those skilled in the art can make additional changes and modifications to these embodiments once they grasp the basic inventive concept. The appended claims are therefore intended to be interpreted as including the preferred embodiments and all changes and modifications that fall within the scope of the embodiments of the present invention.
Finally, it should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or terminal device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or terminal device. Without further limitation, an element introduced by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article, or terminal device that includes that element.
The data processing method and the data processing device provided by the present invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the invention, and the description of the above embodiments is intended only to help in understanding the method of the invention and its core idea. At the same time, those skilled in the art may, following the idea of the invention, make changes to the specific implementation and the scope of application. In conclusion, the content of this specification should not be construed as limiting the present invention.

Claims (10)

1. A data processing method, characterized by comprising:
determining a data target to be processed;
obtaining historical data of the data target;
determining a predicted value of the data target according to the historical data;
when an actual value of the data target is collected, calculating the difference between the actual value and the predicted value;
if the difference is less than a preset threshold, determining that the actual value is valid data.
2. The method according to claim 1, characterized in that the step of obtaining the historical data of the data target comprises:
determining a collection period of the historical data;
obtaining the historical data of the data target within the collection period.
3. The method according to claim 1, characterized in that the step of determining the predicted value of the data target according to the historical data comprises:
generating a prediction model for the data target according to the historical data;
calculating the predicted value of the data target using the prediction model.
4. The method according to claim 3, characterized in that the step of generating the prediction model for the data target according to the historical data comprises:
setting a default regression parameter value;
training a preset autoregressive model by varying the default regression parameter value, so that the prediction error on the historical data is less than a second preset threshold, thereby generating the prediction model.
5. The method according to claim 1, characterized by further comprising:
if the difference exceeds the preset threshold, performing error correction on the data calculation script;
recalculating the actual value of the data target using the corrected data calculation script.
6. A data processing device, characterized by comprising:
a data target determination module, configured to determine a data target to be processed;
a historical data obtaining module, configured to obtain historical data of the data target;
a predicted value determination module, configured to determine a predicted value of the data target according to the historical data;
a difference calculation module, configured to calculate, when an actual value of the data target is collected, the difference between the actual value and the predicted value;
a valid data determination module, configured to determine that the actual value is valid data if the difference is less than a preset threshold.
7. The device according to claim 6, characterized in that the historical data obtaining module comprises:
a collection period determination submodule, configured to determine a collection period of the historical data;
a historical data obtaining submodule, configured to obtain the historical data of the data target within the collection period.
8. The device according to claim 6, characterized in that the predicted value determination module comprises:
a prediction model generation submodule, configured to generate a prediction model for the data target according to the historical data;
a predicted value calculation submodule, configured to calculate the predicted value of the data target using the prediction model.
9. The device according to claim 8, characterized in that the prediction model generation submodule comprises:
a regression parameter value setting unit, configured to set a default regression parameter value;
an autoregressive model training unit, configured to train a preset autoregressive model by varying the default regression parameter value, so that the prediction error on the historical data is less than a second preset threshold, thereby generating the prediction model.
10. The device according to claim 6, characterized by further comprising:
an error correction module, configured to perform error correction on the data calculation script if the difference exceeds the preset threshold;
an actual value calculation module, configured to recalculate the actual value of the data target using the corrected data calculation script.
CN201810426890.0A 2018-05-07 2018-05-07 Data processing method and device Active CN108829718B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810426890.0A CN108829718B (en) 2018-05-07 2018-05-07 Data processing method and device

Publications (2)

Publication Number Publication Date
CN108829718A true CN108829718A (en) 2018-11-16
CN108829718B CN108829718B (en) 2021-04-06

Family

ID=64147515

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810426890.0A Active CN108829718B (en) 2018-05-07 2018-05-07 Data processing method and device

Country Status (1)

Country Link
CN (1) CN108829718B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080126144A1 (en) * 2006-07-21 2008-05-29 Alex Elkin Method and system for improving the accuracy of a business forecast
CN102111284A (en) * 2009-12-28 2011-06-29 北京亿阳信通软件研究院有限公司 Method and device for predicting telecom traffic
CN103745279A (en) * 2014-01-24 2014-04-23 广东工业大学 Method and device for monitoring energy consumption abnormity
CN103886018A (en) * 2014-02-21 2014-06-25 车智互联(北京)科技有限公司 Data predication device, data predication method and electronic equipment
CN105676670A (en) * 2014-11-18 2016-06-15 北京翼虎能源科技有限公司 Method and system for processing energy data
CN107958297A (en) * 2016-10-17 2018-04-24 华为技术有限公司 A kind of product demand forecasting method and product demand prediction meanss

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110008386A (en) * 2019-01-17 2019-07-12 阿里巴巴集团控股有限公司 A kind of data generation, processing, evaluation method, device, equipment and medium
CN111311086A (en) * 2020-02-11 2020-06-19 ***股份有限公司 Capacity monitoring method and device and computer readable storage medium
CN111311086B (en) * 2020-02-11 2024-02-09 ***股份有限公司 Capacity monitoring method, device and computer readable storage medium
CN113590989A (en) * 2020-04-30 2021-11-02 北京金山云网络技术有限公司 Data processing method and device for real-time computing abnormality and electronic equipment
CN113590705A (en) * 2020-04-30 2021-11-02 北京金山云网络技术有限公司 Real-time data processing method and device and electronic equipment
CN113689078A (en) * 2021-07-27 2021-11-23 中国科学院地理科学与资源研究所 Survey data verification method and device
CN113744890A (en) * 2021-11-03 2021-12-03 北京融信数联科技有限公司 Reworking and production-resuming analysis method, system and storage medium

Also Published As

Publication number Publication date
CN108829718B (en) 2021-04-06

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant