CN117556223A - Multi-factor similarity-based snow melt runoff forecasting method - Google Patents

Info

Publication number
CN117556223A
Authority
CN
China
Prior art keywords
data
forecasting
runoff
historical
rainfall
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410045133.4A
Other languages
Chinese (zh)
Inventor
陈然
牟时宇
王建华
李佳
杨杉
施颖
谭乔凤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hohai University HHU
Guodian Dadu River Hydropower Development Co Ltd
Original Assignee
Hohai University HHU
Guodian Dadu River Hydropower Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hohai University HHU, Guodian Dadu River Hydropower Development Co Ltd filed Critical Hohai University HHU
Priority to CN202410045133.4A priority Critical patent/CN117556223A/en
Publication of CN117556223A publication Critical patent/CN117556223A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01WMETEOROLOGY
    • G01W1/00Meteorology
    • G01W1/10Devices for predicting weather conditions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Environmental & Geological Engineering (AREA)
  • Atmospheric Sciences (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Ecology (AREA)
  • Environmental Sciences (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a snowmelt runoff forecasting method based on multi-factor similarity, which relates to the field of data processing and comprises the following steps: acquiring historical measured data of the basin upstream of the forecast section and preprocessing the data; determining a plurality of forecasting schemes; establishing and training a snowmelt runoff forecasting model based on multi-factor similarity; determining a plurality of target historical samples based on the preprocessed historical measured data; determining an optimal forecasting scheme according to forecasting performance indices; and using the snowmelt runoff forecasting model to perform rolling forecasting of the snowmelt runoff at the forecast section over a plurality of future time periods within the future forecast period, based on the optimal forecasting scheme, the plurality of target historical samples and the preprocessed historical measured data. The method has the advantage of improving both the accuracy and the lead time of snowmelt runoff forecasts.

Description

Multi-factor similarity-based snow melt runoff forecasting method
Technical Field
The invention relates to the field of data processing, in particular to a method for forecasting snow-melting runoff based on multi-factor similarity.
Background
For a long time, owing to factors such as a lack of environmental monitoring data and an insufficient understanding of how runoff forms, runoff forecasting during the snowmelt period has suffered from low accuracy and insufficient reliability. In addition, global climate change alters not only the rain-to-snow ratio and the amount of precipitation in the western cold region, but also the timing of snow accumulation and snowmelt, which profoundly affects the runoff process of the basin and further increases forecast uncertainty. The snowmelt period is a key stage of reservoir water-level fluctuation, so improving the accuracy and reliability of runoff forecasting in this period is of great significance for power supply security and for the comprehensive utilization of water resources in the basin.
Snowmelt runoff forecasting has mainly relied on simulation models based on physical mechanisms, which fall into two classes. One class comprises conceptual models based on the temperature-index method, such as the Snowmelt Runoff Model (SRM) and the HBV model of the Swedish Meteorological and Hydrological Institute. The other class comprises models based on the energy balance principle, such as the Soil and Water Assessment Tool (SWAT) and the Variable Infiltration Capacity (VIC) model. In recent years, with the growth of environmental monitoring data from remote sensing, observation and model output, and with the development of technologies such as big data analysis and artificial intelligence, data-driven runoff forecasting has become a research hotspot in hydrology.
Both physically driven and data-driven snowmelt runoff forecasting methods have produced substantial research results. Physically driven methods describe the sub-processes and physical mechanisms of the hydrological cycle, but they can describe runoff generation and routing in a basin accurately only when reliable hydrological, meteorological and underlying-surface data are available; their model parameters are difficult to set, their computation is complex, and they tend to neglect the spatial variability of variables and the random character of the rainfall-runoff process. Data-driven methods do not require a deep study of the physical mechanism of rainfall-runoff formation, but their forecasting performance depends entirely on large volumes of reliable measured data and on the chosen data-driven model. Existing snowmelt data suffer from short record lengths and low reliability, and current data-driven models still face bottlenecks in forecast accuracy and lead time.
Therefore, it is necessary to provide a snowmelt runoff forecasting method based on multi-factor similarity that improves both the accuracy and the lead time of snowmelt runoff forecasts.
Disclosure of Invention
The invention provides a snowmelt runoff forecasting method based on multi-factor similarity, which comprises the following steps: acquiring historical measured data of the basin upstream of the forecast section, wherein the historical measured data at least comprise historical measured rainfall data, historical measured runoff data and historical measured temperature data; preprocessing the historical measured data to generate preprocessed historical measured data of the basin upstream of the forecast section; determining a plurality of forecasting schemes, wherein each forecasting scheme at least comprises a runoff factor, a precipitation factor, a positive accumulated temperature factor, a soil water content factor and a time-lag combination, and the time-lag combination comprises a runoff time lag, a precipitation time lag, a positive accumulated temperature time lag and a soil water content time lag; establishing and training a snowmelt runoff forecasting model based on multi-factor similarity; determining a plurality of target historical samples based on the preprocessed historical measured data; screening the plurality of forecasting schemes according to forecasting performance indices and determining an optimal forecasting scheme among them; and performing, through the snowmelt runoff forecasting model, rolling forecasting of the snowmelt runoff at the forecast section over a plurality of future time periods within the future forecast period, based on the optimal forecasting scheme, the plurality of target historical samples and the preprocessed historical measured data.
Further, the preprocessing of the historical measured data of the basin upstream of the forecast section to generate the preprocessed historical measured data includes: performing time scale normalization on the historical measured rainfall data, the historical measured runoff data and the historical measured temperature data to generate time-scale-normalized historical measured rainfall data, runoff data and temperature data; performing missing value and/or abnormal value processing on the time-scale-normalized historical measured runoff data to generate processed historical measured runoff data; performing spatial aggregation on the time-scale-normalized historical measured rainfall data to generate spatially aggregated historical measured rainfall data; performing spatial aggregation on the time-scale-normalized historical measured temperature data to generate spatially aggregated historical measured temperature data; converting the time-scale-normalized historical measured temperature data into historical positive accumulated temperature data; and calculating the soil water content from the time-scale-normalized historical measured rainfall data.
Further, the spatial aggregation processing is performed on the historical measured rainfall data after the time scale normalization to generate the historical measured rainfall data after the spatial aggregation processing, including: dividing the upstream of the drainage basin where the forecast section is located into a plurality of subareas, and determining the average surface rainfall of each subarea according to the following formula based on the historical actual measurement rainfall data normalized by the time scale;
$$\bar{P} = \sum_{i=1}^{n} w_i P_i$$
where $\bar{P}$ is the average surface rainfall of a sub-area, $n$ is the total number of stations contained in the sub-area, $P_i$ is the point rainfall of the $i$-th station in the sub-area, and $w_i$ is the weight with which the rainfall of the $i$-th station is converted into the average surface rainfall.
Further, the spatial aggregation processing is performed on the historical measured temperature data after the time scale normalization to generate the historical measured temperature data after the spatial aggregation processing, including: dividing the upstream of the river basin where the forecast section is located into a plurality of subareas, and determining the average surface air temperature of each subarea through the following formula based on the historical measured temperature data normalized by the time scale;
$$\bar{T} = \sum_{i=1}^{n} w_i T_i$$
where $\bar{T}$ is the average surface air temperature of a sub-area, $n$ is the number of stations contained in the sub-area, $T_i$ is the point air temperature of the $i$-th station in the sub-area, and $w_i$ is the weight with which the point air temperature of the $i$-th station is converted into the average surface air temperature.
Further, the converting of the time-scale-normalized historical measured temperature data into historical positive accumulated temperature data includes: converting the time-scale-normalized historical measured temperature data into historical positive accumulated temperature data by the following formula:
$$T^{+} = \sum_{t} H_t\,T_t$$
where $T^{+}$ is the positive accumulated temperature of the station; $T_t$ is the daily average temperature of the station on day $t$; and $H_t$ is a logical variable, with $H_t = 1$ when $T_t \geq 0$ and $H_t = 0$ when $T_t < 0$.
Further, the calculating of the soil water content according to the historical measured rainfall data normalized by the time scale includes: calculating the water content of the soil according to the historical actual measurement rainfall data normalized by the time scale based on the following formula;
$$P_{a,t+1} = \lambda\,(P_{a,t} + P_t)$$
where $P_{a,t}$ is the soil water content on day $t$, $P_{a,t+1}$ is the soil water content on day $t+1$, $P_t$ is the rainfall on day $t$, and $\lambda$ is the daily recession coefficient of the soil water content.
Further, the determining of a plurality of forecasting schemes includes: calculating rainfall correlation coefficients, runoff autocorrelation coefficients and runoff partial correlation coefficients; and determining the plurality of forecasting schemes based on the rainfall correlation coefficients, the runoff autocorrelation coefficients and the runoff partial correlation coefficients.
Further, the determining of a plurality of target historical samples based on the preprocessed historical measured data of the basin upstream of the forecast section includes: calculating the optimal similarity between the preprocessed historical measured data and each historical sample; and determining the target historical samples based on that optimal similarity.
Further, the screening of the plurality of forecasting schemes according to the forecasting performance indices and the determining of an optimal forecasting scheme among them includes: determining the optimal forecasting scheme from the plurality of forecasting schemes based on the Nash efficiency coefficient, root mean square error, mean absolute error and mean relative error of the snowmelt runoff forecasting model under each forecasting scheme.
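By way of illustration only, the four performance indices named above might be computed for one forecasting scheme as in the following minimal sketch. The function name `forecast_metrics` is hypothetical, and the mean relative error is taken here as the mean of absolute relative errors, which is an assumption because the text does not define it.

```python
import math

def forecast_metrics(observed, simulated):
    """Return Nash efficiency, RMSE, MAE and mean relative error.

    Assumes equal-length sequences and non-zero observed values
    (the relative error is undefined for an observed value of 0).
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    errors = [s - o for o, s in zip(observed, simulated)]
    sse = sum(e * e for e in errors)
    return {
        # Nash efficiency: 1 is a perfect fit, 0 matches the observed mean.
        "nse": 1.0 - sse / sum((o - mean_obs) ** 2 for o in observed),
        "rmse": math.sqrt(sse / n),
        "mae": sum(abs(e) for e in errors) / n,
        # Assumed definition: mean of |error| / observed.
        "mre": sum(abs(e) / o for e, o in zip(errors, observed)) / n,
    }

metrics = forecast_metrics([10.0, 20.0, 30.0], [12.0, 18.0, 33.0])
```

A scheme with a higher Nash efficiency and lower error values would be preferred; how the four indices are traded off against each other is not specified here.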
Further, the function of the snow melt runoff prediction model is expressed as:
$$\hat{Q}_{r} = \sum_{p=1}^{K} W_{p}\,Q_{p,r}$$
where $\hat{Q}_{r}$ is the forecast snowmelt runoff of the forecast section in the $r$-th future time period, $Q_{p,r}$ is the runoff of the $p$-th sampled sample in the $r$-th time period, the sampled samples being determined based on the plurality of target historical samples, $W_{p}$ is the weight corresponding to the $p$-th sampled sample, and $K$ is the total number of sampled samples.
Compared with the prior art, the snow melt runoff forecasting method based on multi-factor similarity has the following beneficial effects:
1. The advantages of physical cause analysis and data mining are fully integrated. The snowmelt runoff forecasting model, built on multi-factor similarity, jointly considers multiple influencing factors such as antecedent runoff, precipitation, positive accumulated temperature and soil water content, which greatly improves forecast accuracy. In particular, introducing positive accumulated temperature as an important predictor further improves the accuracy and reliability of the model's snowmelt runoff forecasts, so that better forecasting results can be obtained in alpine cold regions.
2. The snowmelt runoff forecasting model quantifies the "forecast by reference to the past" experience that front-line operational staff have accumulated: it forecasts the future runoff process from historical data and, while improving forecast accuracy and lead time, quantitatively identifies the most similar runoff processes in the historical record. This yields interpretable runoff forecasts and markedly extends the effective lead time of rolling forecasts.
3. An optimal similarity coefficient is introduced, which performs better than earlier similarity measures. Because it incorporates shape similarity, the samples it finds reflect similarity in both space and time. The optimal similarity coefficient takes the similarity of historical data into account and can therefore adapt better to changes in future conditions, and comparing the forecast with the matched historical data improves both forecast accuracy and interpretability. By searching for similarity in historical data, the coefficient can better capture future trends and patterns, so that forecasts remain reliable and effective over a longer time range.
Drawings
The present specification will be further elucidated by way of example embodiments, which will be described in detail by means of the accompanying drawings. The embodiments are not limiting, in which like numerals represent like structures, wherein:
FIG. 1 is a flow chart of a method of forecasting flow of snow melt runoff based on multi-factor similarity according to some embodiments of the present disclosure;
fig. 2 is a schematic flow chart of data preprocessing of historical measured data upstream of a drainage basin where a predicted section is located according to some embodiments of the present disclosure.
Detailed Description
In order to more clearly illustrate the technical solutions of the embodiments of the present specification, the drawings that are required to be used in the description of the embodiments will be briefly described below. It is apparent that the drawings in the following description are only some examples or embodiments of the present specification, and it is possible for those of ordinary skill in the art to apply the present specification to other similar situations according to the drawings without inventive effort. Unless otherwise apparent from the context of the language or otherwise specified, like reference numerals in the figures refer to like structures or operations.
FIG. 1 is a flow chart of a snowmelt runoff forecasting method based on multi-factor similarity according to some embodiments of the present disclosure. As shown in FIG. 1, the method may include the following steps.
Step 110, obtaining historical actual measurement data of the upstream of the river basin where the forecast section is located.
The historical measured data at least comprises historical measured rainfall data, historical measured runoff data and historical measured temperature data. Historical measured data of the upstream of the river basin where the forecast section is located can be obtained from an external data source.
When the rainfall data and the air temperature data are processed, the data must cover the entire basin upstream of the forecast section. This means collecting data from a sufficient number of weather stations with a suitable spatial layout, so that the data are representative and accurate. Data covering the entire upstream basin provide more comprehensive information for accurate analysis and model building.
Step 120, preprocessing the historical measured data of the basin upstream of the forecast section to generate preprocessed historical measured data of the basin upstream of the forecast section.
FIG. 2 is a schematic flow chart of the data preprocessing of the historical measured data of the basin upstream of the forecast section according to some embodiments of the present disclosure. As shown in FIG. 2, in some embodiments the data preprocessing may include the following steps.
Step 210, performing time scale normalization on the historical measured rainfall data, the historical measured runoff data and the historical measured temperature data, and generating the historical measured rainfall data, the historical measured runoff data and the historical measured temperature data after the time scale normalization.
Specifically, the historical measured rainfall data are on an hourly scale while the other data are on a daily scale. To give rainfall and the other data a consistent time scale over the study area, the rainfall data are converted from the hourly scale to the daily scale so that they can be compared and analyzed together with the other data. The daily rainfall of a station is obtained by summing that station's hourly rainfall over the day, which reduces the dimension on the time scale and makes the data easier to process and analyze.
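By way of illustration only, the hourly-to-daily accumulation described above might look like the following minimal sketch. The function name `hourly_to_daily` and the sample values are hypothetical, and a calendar-day boundary is assumed; an operational hydrological day may start at a different hour.

```python
from collections import defaultdict
from datetime import datetime

def hourly_to_daily(records):
    """Sum hourly rainfall (mm) of one station into daily totals.

    records: iterable of (datetime, rainfall_mm) pairs.
    Returns a dict mapping each calendar date to its accumulated rainfall.
    """
    daily = defaultdict(float)
    for timestamp, rain_mm in records:
        daily[timestamp.date()] += rain_mm
    return dict(sorted(daily.items()))

# Hypothetical hourly observations for one station.
hourly = [
    (datetime(2020, 4, 1, 8), 1.5),
    (datetime(2020, 4, 1, 9), 0.5),
    (datetime(2020, 4, 2, 3), 2.0),
]
daily_rain = hourly_to_daily(hourly)
```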
Step 220, performing missing value and/or abnormal value processing on the time-scale-normalized historical measured runoff data to generate processed historical measured runoff data.
Specifically, when more than five consecutive values in the historical measured runoff data are missing or abnormal, the whole run of consecutive missing or abnormal values is removed from the historical measured runoff data; when fewer than five consecutive values are missing or abnormal, linear interpolation is used to replace the abnormal values or fill in the missing values.
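By way of illustration only, the gap-handling rule above might be sketched as follows. The function name `clean_runoff` is hypothetical. Two points are assumptions not stated in the text: a run of exactly five values is interpolated, and a run at the very start or end of the record, which has no valid neighbor on one side, is dropped.

```python
def clean_runoff(values, max_gap=5):
    """Return (index, value) pairs after gap handling.

    None marks a missing or abnormal value. Runs longer than max_gap are
    removed; shorter runs bounded by valid values on both sides are
    filled by linear interpolation.
    """
    cleaned = []
    n = len(values)
    i = 0
    while i < n:
        if values[i] is not None:
            cleaned.append((i, float(values[i])))
            i += 1
            continue
        j = i
        while j < n and values[j] is None:
            j += 1  # j ends at the first valid index after the run
        if (j - i) <= max_gap and i > 0 and j < n:
            left, right = values[i - 1], values[j]
            for k in range(i, j):
                frac = (k - (i - 1)) / (j - (i - 1))
                cleaned.append((k, left + frac * (right - left)))
        i = j
    return cleaned

short_gap = clean_runoff([10.0, None, 14.0])
long_gap = clean_runoff([1.0] + [None] * 6 + [2.0])
```

Keeping the original indices in the output makes it possible to see where a long run was removed, so that lagged features are not built across the break.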
Step 230, performing spatial aggregation on the time-scale-normalized historical measured rainfall data to generate spatially aggregated historical measured rainfall data.
Specifically, because there are many rainfall stations, the basin upstream of the forecast section can be divided into a plurality of sub-areas by analyzing the correlation among the rainfall stations, and the average surface rainfall of each sub-area can be obtained by weighted averaging. This reduces the dimension on the spatial scale and makes analysis and forecasting more efficient and accurate.
In some embodiments, the spatial aggregation processing is performed on the historical measured rainfall data after the time scale normalization, and the generating the historical measured rainfall data after the spatial aggregation processing includes:
dividing the upstream of the river basin where the forecast section is located into a plurality of subareas, and determining the average surface rainfall of each subarea based on the historical actual measurement rainfall data normalized by a time scale according to the following formula;
$$\bar{P} = \sum_{i=1}^{n} w_i P_i$$
where $\bar{P}$ is the average surface rainfall of a sub-area, $n$ is the total number of stations contained in the sub-area, $P_i$ is the point rainfall of the $i$-th station in the sub-area, and $w_i$ is the weight with which the rainfall of the $i$-th station is converted into the average surface rainfall.
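By way of illustration only, the weighted average over the stations of one sub-area might be computed as in the following minimal sketch. The function name `areal_mean` and the station values are hypothetical, and the weights are assumed to sum to 1; the text does not state how the weights are derived.

```python
def areal_mean(point_values, weights):
    """Weighted sum of station values for one sub-area.

    point_values: point rainfall (or point air temperature) per station.
    weights: one weight per station, assumed here to sum to 1.
    """
    if len(point_values) != len(weights):
        raise ValueError("one weight is required per station")
    return sum(w * p for w, p in zip(weights, point_values))

# Three hypothetical stations in one sub-area.
sub_area_rain = areal_mean([10.0, 20.0, 30.0], [0.5, 0.25, 0.25])
```

The same function applies unchanged to the average surface air temperature of a sub-area.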
Step 240, performing spatial aggregation on the time-scale-normalized historical measured temperature data to generate spatially aggregated historical measured temperature data.
Specifically, because there are many air temperature stations, the basin upstream of the forecast section can be divided into a plurality of sub-areas by analyzing the correlation among the air temperature stations, and the average surface air temperature of each sub-area can be obtained by weighted averaging. This reduces the dimension on the spatial scale and makes analysis and forecasting more efficient and accurate.
In some embodiments, the spatial aggregation processing is performed on the historical measured temperature data after the time scale normalization to generate the historical measured temperature data after the spatial aggregation processing, including:
dividing the upstream of the river basin where the forecast section is located into a plurality of subareas, and determining the average surface air temperature of each subarea based on the historical measured temperature data normalized by a time scale through the following formula;
$$\bar{T} = \sum_{i=1}^{n} w_i T_i$$
where $\bar{T}$ is the average surface air temperature of a sub-area, $n$ is the number of stations contained in the sub-area, $T_i$ is the point air temperature of the $i$-th station in the sub-area, and $w_i$ is the weight with which the point air temperature of the $i$-th station is converted into the average surface air temperature.
Step 250, converting the time-scale-normalized historical measured temperature data into historical positive accumulated temperature data.
Specifically, in alpine cold regions the positive accumulated temperature has an important influence on snowmelt runoff generation, so it should be added as a predictor. Following research on ice and snow ablation, the positive accumulated temperature of a station is calculated on the principle used by the degree-day factor method to compute glacier ablation, and the positive accumulated temperature is used in modeling as the characteristic value of air temperature.
In some embodiments, converting the time-scale-normalized historical measured temperature data into historical positive accumulated temperature data includes:
converting the time-scale-normalized historical measured temperature data into historical positive accumulated temperature data by the following formula:
$$T^{+} = \sum_{t} H_t\,T_t$$
where $T^{+}$ is the positive accumulated temperature of the station; $T_t$ is the daily average temperature of the station on day $t$; and $H_t$ is a logical variable, with $H_t = 1$ when $T_t \geq 0$ and $H_t = 0$ when $T_t < 0$.
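By way of illustration only, the positive accumulated temperature of a station might be computed as in the following minimal sketch, in which the logical variable simply keeps days whose daily average temperature is not below 0. The function name and the sample temperatures are hypothetical, and the length of the accumulation window is an assumption left to the caller.

```python
def positive_accumulated_temperature(daily_mean_temps):
    """Sum the daily average temperatures that are >= 0.

    daily_mean_temps: daily average temperatures of one station over the
    chosen accumulation window (the window length is an assumption).
    """
    return sum(t for t in daily_mean_temps if t >= 0)

# Hypothetical five-day window; the two sub-zero days contribute nothing.
pat = positive_accumulated_temperature([-3.0, 0.0, 2.5, -1.0, 4.0])
```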
Step 260, calculating the soil water content from the time-scale-normalized historical measured rainfall data.
The soil water content is also one of the important factors affecting snowmelt runoff. When snow melts, part of the water is absorbed by the soil or evaporates, and the rest flows into underground storage or surface water bodies. Knowing the magnitude and variation of the soil water content is therefore important for forecasting snowmelt runoff accurately.
In some embodiments, the calculation of the soil moisture content is performed based on the time scale normalized historical measured rainfall data, including:
calculating the water content of the soil according to the historical actual measurement rainfall data normalized by the time scale based on the following formula;
$$P_{a,t+1} = \lambda\,(P_{a,t} + P_t)$$
where $P_{a,t}$ is the soil water content on day $t$, $P_{a,t+1}$ is the soil water content on day $t+1$, $P_t$ is the rainfall on day $t$, and $\lambda$ is the daily recession coefficient of the soil water content.
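By way of illustration only, a day-by-day soil water content series might be computed as in the following minimal sketch. It assumes the common antecedent-precipitation form, in which the content on day t+1 equals the recession coefficient multiplied by the sum of the content and the rainfall on day t. The function name, the initial value and the coefficient value are hypothetical, and no upper bound on the soil water content is applied.

```python
def soil_water_series(rainfall, recession, initial=0.0):
    """Return soil water content for day 0 through day len(rainfall).

    rainfall: daily rainfall on days 0, 1, 2, ...
    recession: daily recession coefficient (assumed between 0 and 1).
    initial: assumed soil water content on day 0.
    """
    content = [initial]
    for rain in rainfall:
        content.append(recession * (content[-1] + rain))
    return content

# Hypothetical three-day rainfall record and coefficient.
soil = soil_water_series([10.0, 0.0, 5.0], recession=0.5)
```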
Step 130, determining a plurality of forecasting schemes.
In some embodiments, determining a plurality of forecasting schemes includes:
calculating rainfall correlation coefficients, runoff autocorrelation coefficients and runoff partial correlation coefficients;
determining a plurality of forecasting schemes based on the rainfall correlation coefficients, the runoff autocorrelation coefficients and the runoff partial correlation coefficients.
Precipitation is an important influencing factor in short-term runoff forecasting and one of the main driving forces of runoff generation; the spatio-temporal distribution of rainfall strongly affects how the runoff process forms and changes. The rainfall lag time is obtained through correlation analysis. In alpine cold regions the positive accumulated temperature has an important influence on snowmelt runoff generation, and the air temperature lag time is likewise obtained through correlation analysis.
For example, the rainfall correlation coefficient may be calculated based on the following formula:
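$$r = \frac{\sum_{i=1}^{n}\left(P_i - \bar{P}\right)\left(Q_i - \bar{Q}\right)}{\sqrt{\sum_{i=1}^{n}\left(P_i - \bar{P}\right)^{2}\,\sum_{i=1}^{n}\left(Q_i - \bar{Q}\right)^{2}}}$$
where $r$ is the rainfall correlation coefficient; $n$ is the total length of the rainfall and runoff sequences; $P_i$ is the daily rainfall on day $i$; $\bar{P}$ is the average rainfall; $Q_i$ is the daily runoff on day $i$; and $\bar{Q}$ is the average runoff.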
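By way of illustration only, the rainfall lag time might be chosen by computing the correlation coefficient between runoff and rainfall shifted by several candidate lags, as in the following minimal sketch. The function names, the candidate lag range and the synthetic series are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("sequences must have the same length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

def lagged_correlation(rain, runoff, lag):
    """Correlate runoff on day t with rainfall on day t - lag (lag >= 0)."""
    if lag == 0:
        return pearson(rain, runoff)
    return pearson(rain[:-lag], runoff[lag:])

# Synthetic example in which runoff responds to rainfall one day later.
rain = [0.0, 5.0, 0.0, 0.0, 10.0, 0.0, 0.0]
runoff = [1.0, 1.0, 6.0, 1.0, 1.0, 11.0, 1.0]
by_lag = {lag: lagged_correlation(rain, runoff, lag) for lag in range(4)}
best_lag = max(by_lag, key=by_lag.get)
```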
The runoff sequence is taken as a typical variable with time sequence characteristics, key information of the runoff sequence can be identified by using an autocorrelation and partial correlation analysis method, and the early runoff lag time is obtained by analyzing the autocorrelation and partial correlation of the runoff sequence.
For example, the runoff autocorrelation coefficients may be calculated based on the following formula:
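$$r_k = \frac{\sum_{i=1}^{n-k}\left(Q_i - \bar{Q}\right)\left(Q_{i+k} - \bar{Q}\right)}{\sum_{i=1}^{n}\left(Q_i - \bar{Q}\right)^{2}}$$
$$\varphi_{ab\cdot c} = \frac{r_{ab} - r_{ac}\,r_{bc}}{\sqrt{\left(1 - r_{ac}^{2}\right)\left(1 - r_{bc}^{2}\right)}}$$
where $r_k$ is the runoff autocorrelation coefficient; $n$ is the total length of the runoff sequence; $Q_i$ is the runoff on day $i$; $\bar{Q}$ is the average runoff; $Q_{i+k}$ is the runoff on day $i+k$; $\varphi_{ab\cdot c}$ is the runoff partial correlation coefficient; $r_{ab}$ is the autocorrelation coefficient of the runoff on day $a$ and day $b$; $r_{ac}$ is the autocorrelation coefficient of the runoff on day $a$ and day $c$; and $r_{bc}$ is the autocorrelation coefficient of the runoff on day $b$ and day $c$.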
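By way of illustration only, the runoff autocorrelation coefficient at a given lag might be computed as in the following minimal sketch, using the usual sample estimator with the full-series mean and variance. The function name and the sample series are hypothetical; the partial correlation is not covered here.

```python
def autocorrelation(runoff, lag):
    """Sample autocorrelation of a runoff sequence at a non-negative lag."""
    n = len(runoff)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(runoff)")
    mean = sum(runoff) / n
    denominator = sum((q - mean) ** 2 for q in runoff)
    if denominator == 0:
        raise ValueError("autocorrelation is undefined for a constant sequence")
    numerator = sum(
        (runoff[i] - mean) * (runoff[i + lag] - mean) for i in range(n - lag)
    )
    return numerator / denominator

# Hypothetical short runoff record.
flow = [1.0, 2.0, 3.0, 4.0, 5.0]
acf = [autocorrelation(flow, k) for k in range(3)]
```

Lags at which the coefficient remains clearly above zero are candidates for the antecedent runoff lag time.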
Once the four predictors (runoff, precipitation, positive accumulated temperature and soil water content) have been determined, a plurality of forecasting schemes are established by combining different variables, time lags and output forms, and these schemes are then tested and optimized to achieve more accurate and reliable forecasts.
By way of example only, various forecasting schemes may be as shown in table 1.
Table 1
Scheme No. | Forecast scheme
Scheme 1 | Regional areal rainfall (t-1) + local station flow (t-1, t-2) + soil moisture content (t-1) + positive accumulated temperature (t-1)
Scheme 2 | Regional areal rainfall (t-1) + local station flow (t-1, t-2) + soil moisture content (t-1) + positive accumulated temperature (t-1)
Scheme 3 | Regional areal rainfall (t-1) + local station flow (t-1, t-2, t-3) + soil moisture content (t-1) + positive accumulated temperature (t-1)
Scheme 4 | Regional areal rainfall (t-1) + local station flow (t-1, t-2, t-3) + soil moisture content (t-1) + positive accumulated temperature (t-1)
Scheme 5 | Regional areal rainfall (t-1, t-2) + local station flow (t-1, t-2, t-3) + soil moisture content (t-2) + positive accumulated temperature (t-2)
Scheme 6 | Regional areal rainfall (t-1, t-2) + local station flow (t-1, t-2, t-3) + soil moisture content (t-2) + positive accumulated temperature (t-2)
Scheme 7 | Regional areal rainfall (t-1, t-2, t-3) + local station flow (t-1, t-2) + soil moisture content (t-3) + positive accumulated temperature (t-3)
Scheme 8 | Regional areal rainfall (t-1, t-2, t-3) + local station flow (t-1, t-2) + soil moisture content (t-3) + positive accumulated temperature (t-3)
Scheme 9 | Regional areal rainfall (t-1, t-2, t-3) + local station flow (t-1, t-2, t-3) + soil moisture content (t-3) + positive accumulated temperature (t-3)
Scheme 10 | Regional areal rainfall (t-1, t-2, t-3) + local station flow (t-1, t-2, t-3) + soil moisture content (t-3) + positive accumulated temperature (t-3)
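As an illustration of how one row of Table 1 could be turned into a model input, the sketch below encodes a scheme as a mapping from factor to lags and collects the lagged values for a target day. The dictionary keys, the function name `build_feature_vector`, and the sample numbers are assumptions made for this example and do not come from the specification.

```python
from typing import Dict, List, Sequence

# Hypothetical encoding of Scheme 3 in Table 1: factor name -> lags in days.
# The key names are illustrative assumptions, not identifiers from the patent.
SCHEME_3: Dict[str, List[int]] = {
    "rainfall": [1],
    "flow": [1, 2, 3],
    "soil_moisture": [1],
    "positive_accumulated_temperature": [1],
}


def build_feature_vector(
    series: Dict[str, Sequence[float]],
    scheme: Dict[str, List[int]],
    t: int,
) -> List[float]:
    """Collect the lagged values named by `scheme` for target day index `t`."""
    max_lag = max(lag for lags in scheme.values() for lag in lags)
    if t < max_lag:
        raise ValueError("not enough history before index t for this scheme")
    features: List[float] = []
    for factor, lags in scheme.items():
        for lag in lags:
            features.append(series[factor][t - lag])
    return features


# Made-up daily values, index 0 being the oldest day.
example_series = {
    "rainfall": [0.0, 2.0, 5.0, 1.0],
    "flow": [10.0, 11.0, 13.0, 12.0],
    "soil_moisture": [20.0, 21.0, 24.0, 23.0],
    "positive_accumulated_temperature": [3.0, 4.0, 6.0, 5.0],
}
features_day_3 = build_feature_vector(example_series, SCHEME_3, 3)
```

Keeping the lags in data rather than in code makes it straightforward to loop over all ten schemes when they are compared later.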
Step 140: establish and train a snowmelt runoff forecasting model based on multi-factor similarity.
The snowmelt runoff forecasting model may be a nearest neighbor sampling regression (NNBR) model. The model assumes that the occurrence and development of phenomena in the objective world follow certain rules, that future development trends resemble historical ones, and that known historical processes can therefore be searched to infer future trends. NNBR is a data-driven forecasting model that takes physical causes into account; it requires no assumption about the dependence form or the probability distribution of the research object, and each of its parameters has a clear meaning.
Nearest neighbor sampling regression (NNBR) is a classical data-driven model. Suppose a known time series $X_t$ is affected by $m$ influencing factors. The number of influence lag periods may differ from factor to factor; let the number of lag periods of the $j$-th factor be $P_j$. Let $x_{j,i}$ denote the $i$-th element of the $j$-th factor, which in the calculation represents the magnitude of that factor on a specific day, and let $D_{j,t}$ denote the feature vector of the $j$-th factor. The feature vectors of the $m$ influencing factors form a feature vector group $D_t$, so that $D_t$ and $X_t$ are in one-to-one correspondence:
$$D_t = (D_{1,t}, D_{2,t}, \dots, D_{m,t}) \longleftrightarrow X_t, \qquad D_{j,t} = (x_{j,t-1}, x_{j,t-2}, \dots, x_{j,t-P_j})$$
Given the current feature vector group $D_r$, how is the subsequent value $X_r$ of the research object forecast? The basic idea of NNBR is as follows: among the existing feature vector groups $D_t$, there are always $K$ groups nearest to $D_r$, denoted $D_1, D_2, \dots, D_K$, with corresponding subsequent values $X_1, X_2, \dots, X_K$. The similarity between $D_r$ and each $D_p$ is measured by the optimal similarity coefficient, which takes values in [0, 1]; the larger the coefficient, the more similar the two.
Let the optimal similarity coefficients between $D_r$ and the nearest-neighbor feature vector groups $D_p$ obtained from the historical samples be $R_p$ ($p = 1, \dots, K$). The larger $R_p$ is, the nearer $D_p$ is to $D_r$ and the more likely $X_r$ is to resemble $X_p$, so a sampling weight $W_p(D_r)$ that grows with $R_p$ can be assigned to $X_p$. The function of the snowmelt runoff forecasting model is therefore expressed as:
$$\hat{X}_r = \sum_{p=1}^{K} W_p(D_r)\, X_p$$
where $\hat{X}_r$ is the forecast snowmelt runoff of the forecast section in the $r$-th future forecast period, $D_r$ is the $r$-th target forecast sample, $W_p(D_r)$ is the sampling weight of the $p$-th sample with respect to $D_r$, $X_p$ is the $p$-th sample determined based on the plurality of target historical samples, and $K$ is the total number of sampled samples.
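The weighted-neighbor idea described above can be sketched in a few lines of Python. This is a minimal illustration rather than the patented implementation: it assumes the similarity of every historical sample to the current forecast sample has already been computed, and it assumes the sampling weights are the similarities of the K nearest samples normalized to sum to 1, a choice the text does not confirm. The function name `nnbr_forecast` is hypothetical.

```python
from typing import List, Sequence


def nnbr_forecast(
    similarities: Sequence[float],
    successors: Sequence[float],
    k: int,
) -> float:
    """Weighted forecast from the k most similar historical samples.

    similarities: similarity (0..1) of each historical sample to the
        current forecast sample.
    successors: runoff value that followed each historical sample.
    Assumption: weights are the similarities normalized to sum to 1.
    """
    if len(similarities) != len(successors):
        raise ValueError("similarities and successors must have equal length")
    if not 1 <= k <= len(similarities):
        raise ValueError("k must be between 1 and the number of samples")
    # Indices of the historical samples, most similar first.
    order = sorted(range(len(similarities)),
                   key=lambda i: similarities[i], reverse=True)
    nearest: List[int] = order[:k]
    total = sum(similarities[i] for i in nearest)
    if total == 0.0:
        weights = [1.0 / k] * k  # fall back to equal weights
    else:
        weights = [similarities[i] / total for i in nearest]
    return sum(w * successors[i] for w, i in zip(weights, nearest))


# Two nearest samples have similarity 0.9 and 0.6, so weights 0.6 and 0.4.
example_forecast = nnbr_forecast([0.9, 0.1, 0.6, 0.3],
                                 [100.0, 500.0, 200.0, 400.0], k=2)
```

With these made-up numbers the forecast is 0.6 × 100 + 0.4 × 200 = 140.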
Step 150: determine a plurality of target historical samples based on the preprocessed historical measured data upstream of the basin where the forecast section is located.
In some embodiments, determining the plurality of target historical samples based on the preprocessed historical measured data upstream of the basin where the forecast section is located comprises: calculating the optimal similarity between the preprocessed historical measured data upstream of the basin where the forecast section is located and each historical sample; and determining the plurality of target historical samples based on that optimal similarity.
Taking antecedent runoff, precipitation, positive accumulated temperature, and soil moisture content into account together, the optimal similarity coefficient is adopted as the index that measures the similarity between the current sample and a historical sample. The optimal similarity coefficient $R_{ij}$ is composed of a shape coefficient and a value coefficient. $R_{ij}$ takes values in [0, 1]; when it equals 1, the two samples coincide completely.
Let $S_{ij}$ denote the shape coefficient of the optimal similarity coefficient $R_{ij}$; it is expressed as follows:
where $x_{ik}$ is the value of the $k$-th predictor in the forecast sample, $x_{jk}$ is the value of the $k$-th predictor in the historical sample, $\bar{x}_i$ is the mean of the values of all predictors in the forecast sample, $\bar{x}_j$ is the mean of the values of all predictors in the historical sample, and $m$ is the total number of predictors.
Let $V_{ij}$ denote the value coefficient of the optimal similarity coefficient $R_{ij}$; it is expressed as follows:
where $x_{ik}$ is the value of the $k$-th predictor in the forecast sample, $x_{jk}$ is the value of the $k$-th predictor in the historical sample, $\bar{x}_i$ is the mean of the values of all predictors in the forecast sample, $\bar{x}_j$ is the mean of the values of all predictors in the historical sample, $m$ is the total number of predictors, $i$ denotes the forecast sample, $j$ denotes the historical sample, and $k$ denotes the predictor index.
Considering the influence of both shape and value, the optimal similarity coefficient is taken as the product of the shape coefficient and the value coefficient, so $R_{ij}$ is:
$$R_{ij} = S_{ij} \cdot V_{ij}$$
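The product structure of the optimal similarity coefficient can be sketched as follows. The exact shape and value formulas are not reproduced in the text above, so the two helper functions below are stand-in assumptions chosen only to stay within [0, 1] and to return 1 for identical samples: a Pearson correlation rescaled to [0, 1] for shape, and one minus a normalized absolute difference for value. Only the final product mirrors what the text states; all function names are hypothetical.

```python
import math
from typing import Sequence


def shape_coefficient(x: Sequence[float], y: Sequence[float]) -> float:
    """Stand-in shape term: Pearson correlation rescaled from [-1, 1] to [0, 1]."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    norm = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if norm == 0.0:
        # At least one sample is flat; two flat samples share the same shape.
        both_flat = all(a == 0.0 for a in dx) and all(b == 0.0 for b in dy)
        return 1.0 if both_flat else 0.0
    r = sum(a * b for a, b in zip(dx, dy)) / norm
    return min(1.0, max(0.0, (r + 1.0) / 2.0))


def value_coefficient(x: Sequence[float], y: Sequence[float]) -> float:
    """Stand-in value term: 1 minus the normalized absolute difference."""
    denom = sum(abs(a) + abs(b) for a, b in zip(x, y))
    if denom == 0.0:
        return 1.0
    return 1.0 - sum(abs(a - b) for a, b in zip(x, y)) / denom


def optimal_similarity(x: Sequence[float], y: Sequence[float]) -> float:
    """Product of the shape and value terms, as the text describes."""
    return shape_coefficient(x, y) * value_coefficient(x, y)


same = optimal_similarity([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
scaled = optimal_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

In this sketch a historical sample with the same shape but twice the magnitude keeps a shape term of 1 and is penalized only through the value term, which is the behavior the shape/value split is meant to capture.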
Step 160: screen the plurality of forecasting schemes according to forecast performance indices and determine the optimal forecasting scheme from among them.
In some embodiments, the optimal forecasting scheme is determined from the plurality of forecasting schemes based on the Nash coefficient (NS), root mean square error (RMSE), mean absolute error (MAE), and mean relative error (MARE) of the snowmelt runoff forecasting model under each forecasting scheme.
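The Nash coefficient (NS) is calculated as follows:
$$NS = 1 - \frac{\sum_{i=1}^{n} (Q_{obs,i} - Q_{sim,i})^2}{\sum_{i=1}^{n} (Q_{obs,i} - \bar{Q}_{obs})^2}$$
The root mean square error (RMSE) is calculated as follows:
$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (Q_{obs,i} - Q_{sim,i})^2}$$
The mean absolute error (MAE) is calculated as follows:
$$MAE = \frac{1}{n} \sum_{i=1}^{n} \left| Q_{obs,i} - Q_{sim,i} \right|$$
The mean relative error (MARE) is calculated as follows:
$$MARE = \frac{1}{n} \sum_{i=1}^{n} \frac{\left| Q_{obs,i} - Q_{sim,i} \right|}{Q_{obs,i}}$$
where $Q_{obs,i}$ and $Q_{sim,i}$ are, respectively, the observed flow at time $i$ and the forecast flow output by the snowmelt runoff forecasting model at time $i$; $\bar{Q}_{obs}$ is the mean of the observed flow; and $n$ is the length of the sample data.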
The NS coefficient ranges from negative infinity to 1. The closer NS is to 1, the closer the runoff forecast by the snowmelt runoff forecasting model is to the measured runoff, and the better the model performs. An NS close to 0 indicates that the forecast is about as good as the mean of the measured values: the overall result is credible, but the errors in the simulated process are large. An NS below 0 indicates that the model has low reliability. RMSE, MAE, and MARE range from 0 to positive infinity and equal 0 when the forecast runoff matches the measured runoff exactly, so the closer they are to 0, the better the model performs. For forecasting schemes with a forecast period of 1 day, the performance of the model under each scheme is computed and compared, and the scheme with the smallest mean relative error is selected as the optimal forecasting scheme. The optimal scheme is then used for rolling forecasting of the snowmelt runoff of the forecast section over multiple future periods within the future forecast period.
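The four indices and the selection rule can be sketched as below, using the standard definitions of NS, RMSE, MAE, and MARE. The function names and the toy flow values are assumptions for illustration; the sketch also assumes the observed flows are non-zero and not all equal, so that the divisions are defined.

```python
import math
from typing import Dict, Sequence


def forecast_metrics(observed: Sequence[float],
                     predicted: Sequence[float]) -> Dict[str, float]:
    """Return NS, RMSE, MAE and MARE for one forecasting scheme."""
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("observed and predicted must be non-empty and equal length")
    mean_obs = sum(observed) / n
    sq_err = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    return {
        "NS": 1.0 - sq_err / var_obs,
        "RMSE": math.sqrt(sq_err / n),
        "MAE": sum(abs(o - p) for o, p in zip(observed, predicted)) / n,
        "MARE": sum(abs(o - p) / o for o, p in zip(observed, predicted)) / n,
    }


def select_best_scheme(observed: Sequence[float],
                       forecasts_by_scheme: Dict[str, Sequence[float]]) -> str:
    """Pick the scheme whose forecasts have the smallest MARE."""
    return min(
        forecasts_by_scheme,
        key=lambda name: forecast_metrics(observed, forecasts_by_scheme[name])["MARE"],
    )


observed_flow = [10.0, 20.0, 30.0]
candidates = {"A": [12.0, 18.0, 33.0], "B": [10.0, 21.0, 30.0]}
metrics_a = forecast_metrics(observed_flow, candidates["A"])
best = select_best_scheme(observed_flow, candidates)
```

Selecting on MARE alone follows the text; the other three indices are still returned so they can be reported alongside the chosen scheme.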
Step 170: based on the optimal forecasting scheme, the plurality of target historical samples, and the preprocessed historical measured data upstream of the basin where the forecast section is located, perform rolling forecasting of the snowmelt runoff of the forecast section for multiple future periods within the future forecast period using the snowmelt runoff forecasting model.
Because runoff forecast accuracy drops markedly as the forecast period lengthens, improving accuracy while extending the forecast period is a major difficulty in runoff forecasting. To preserve forecast effectiveness over the forecast period, the invention extends the forecast period by rolling forecasting. In essence, rolling forecasting continually uses forecast meteorological information and forecast runoff to update the antecedent runoff, precipitation, positive accumulated temperature, and soil moisture content data in the forecast sample, thereby updating the forecast sample. To minimize the propagation and accumulation of errors during rolling forecasting, the invention updates the similarity of the forecast sample on a rolling basis using the forecast positive accumulated temperature and rainfall information, and continually adds the forecast runoff and soil moisture content as inputs to update the similarity and forecast the runoff of the next period. Taking the forecast of the runoff process over the forecast horizon as an example:
where the symbols denote, in order: the influence lags of antecedent runoff, precipitation, positive accumulated temperature, and soil moisture content; the measured antecedent runoff process that has already occurred; the measured antecedent rainfall process that has already occurred; the measured antecedent positive accumulated temperature process that has already occurred; the measured antecedent soil moisture content process that has already occurred; the runoff forecast result of a forecast period; the forecast precipitation; the forecast positive accumulated temperature; the forecast soil moisture content; and the forecast precipitation of a further period. The remaining symbols follow by analogy.
To forecast the runoff of a given period within the forecast horizon, the runoff forecast result, forecast precipitation, forecast positive accumulated temperature, and soil moisture content of the preceding period are added to the model input. The forecast runoff, forecast precipitation, forecast positive accumulated temperature, and soil moisture content of that period are then fed to the model to forecast the runoff of the next period. Similarly, each subsequent forecast updates the antecedent rainfall and runoff information at the same time, which realizes period-by-period rolling forecasting within the forecast period.
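The feed-back loop of rolling forecasting can be sketched as below. The one-step model is passed in as a callable, so the sketch makes no claim about how the similarity search itself is carried out; `rolling_forecast`, `toy_model`, and the forcing key `precipitation` are hypothetical names introduced for this example.

```python
from typing import Callable, Dict, List, Sequence


def rolling_forecast(
    runoff_history: Sequence[float],
    forcing_forecasts: Sequence[Dict[str, float]],
    one_step_model: Callable[[List[float], Dict[str, float]], float],
) -> List[float]:
    """Forecast one period at a time, feeding each forecast back as input.

    runoff_history: measured runoff up to the start of the forecast.
    forcing_forecasts: one dict per future period holding the forecast
        meteorological inputs for that period.
    one_step_model: hypothetical callable that returns the next-period
        runoff from the runoff sequence so far and that period's forcing.
    """
    runoff = list(runoff_history)
    forecasts: List[float] = []
    for forcing in forcing_forecasts:
        q_next = one_step_model(runoff, forcing)
        forecasts.append(q_next)
        runoff.append(q_next)  # the forecast becomes antecedent runoff
    return forecasts


def toy_model(runoff: List[float], forcing: Dict[str, float]) -> float:
    """Toy stand-in for the trained model: recession plus forecast rain."""
    return 0.5 * runoff[-1] + forcing["precipitation"]


rolled = rolling_forecast(
    [10.0],
    [{"precipitation": 1.0}, {"precipitation": 2.0}, {"precipitation": 0.0}],
    toy_model,
)
```

Because every step consumes the previous step's output, an error made early is carried forward, which is why the text stresses limiting error accumulation over the forecast period.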
Finally, it should be understood that the embodiments described in this specification are merely illustrative of the principles of the embodiments of this specification. Other variations are possible within the scope of this description. Thus, by way of example, and not limitation, alternative configurations of embodiments of the present specification may be considered as consistent with the teachings of the present specification. Accordingly, the embodiments of the present specification are not limited to only the embodiments explicitly described and depicted in the present specification.

Claims (10)

1. A snowmelt runoff forecasting method based on multi-factor similarity, characterized by comprising the following steps:
acquiring historical actual measurement data of the upstream of a river basin where a forecast section is located, wherein the historical actual measurement data at least comprise historical actual measurement rainfall data, historical actual measurement runoff data and historical actual measurement temperature data;
performing data preprocessing on the historical measured data upstream of the basin where the forecast section is located to generate preprocessed historical measured data upstream of the basin where the forecast section is located;
determining a plurality of forecasting schemes, wherein each forecasting scheme comprises at least a runoff factor, a precipitation factor, a positive accumulated temperature factor, a soil moisture content factor, and a time-lag combination, the time-lag combination comprising a runoff time lag, a precipitation time lag, a positive accumulated temperature time lag, and a soil moisture content time lag;
establishing and training a snow-melting runoff forecasting model based on multi-factor similarity;
determining a plurality of target history samples based on history measured data of the upstream of the drainage basin where the preprocessed forecast section is located;
screening the multiple forecasting schemes according to the forecasting performance indexes, and determining an optimal forecasting scheme from the multiple forecasting schemes;
performing, by the snowmelt runoff forecasting model, rolling forecasting of the snowmelt runoff of the forecast section for a plurality of future periods within a future forecast period, based on the optimal forecasting scheme, the plurality of target historical samples, and the preprocessed historical measured data upstream of the basin where the forecast section is located.
2. The method for predicting snowmelt runoff based on multi-factor similarity according to claim 1, wherein the step of performing data preprocessing on the historical measured data of the upstream of the river basin where the predicted section is located, and generating the preprocessed historical measured data of the upstream of the river basin where the predicted section is located comprises the steps of:
performing time scale normalization on the historical measured rainfall data, the historical measured runoff data and the historical measured temperature data to generate time scale normalized historical measured rainfall data, historical measured runoff data and historical measured temperature data;
carrying out missing value and/or abnormal value processing on the historical measured runoff data after time scale normalization to generate processed historical measured runoff data;
performing spatial aggregation treatment on the historical measured rainfall data subjected to time scale normalization to generate the historical measured rainfall data subjected to the spatial aggregation treatment;
performing space aggregation treatment on the historical measured temperature data subjected to time scale normalization to generate the historical measured temperature data subjected to space aggregation treatment;
converting the historical measured temperature data subjected to time scale normalization into historical positive accumulated temperature data;
and calculating the water content of the soil according to the historical actual measurement rainfall data normalized by the time scale.
3. The method for forecasting the snowmelt runoff based on the multi-factor similarity according to claim 2, wherein the step of performing spatial aggregation processing on the historical measured rainfall data normalized by the time scale to generate the historical measured rainfall data subjected to the spatial aggregation processing comprises the following steps:
dividing the area upstream of the basin where the forecast section is located into a plurality of sub-areas, and determining the average areal rainfall of each sub-area from the time-scale-normalized historical measured rainfall data by the following formula:
$$\bar{P} = \sum_{i=1}^{n} w_i P_i$$
wherein $\bar{P}$ is the average areal rainfall of a sub-area, $n$ is the total number of sites in the sub-area, $P_i$ is the point rainfall of the $i$-th site in the sub-area, and $w_i$ is the weight with which the rainfall of the $i$-th site is converted into the average areal rainfall.
4. The method for predicting the runoff of snow melt based on multi-factor similarity according to claim 2, wherein the spatial aggregation processing is performed on the historical measured temperature data normalized by the time scale, and the generating of the historical measured temperature data after the spatial aggregation processing includes:
dividing the area upstream of the basin where the forecast section is located into a plurality of sub-areas, and determining the average areal air temperature of each sub-area from the time-scale-normalized historical measured temperature data by the following formula:
$$\bar{T} = \sum_{i=1}^{n} w_i T_i$$
wherein $\bar{T}$ is the average areal air temperature of a sub-area, $n$ is the number of sites in the sub-area, $T_i$ is the point air temperature of the $i$-th site in the sub-area, and $w_i$ is the weight with which the point air temperature of the $i$-th site is converted into the average areal air temperature.
5. The method for predicting snowmelt runoff based on multi-factor similarity according to claim 2, wherein the converting the time-scale normalized historical measured temperature data into the historical positive accumulated temperature data comprises:
converting the time-scale-normalized historical measured temperature data into historical positive accumulated temperature data by the following formula:
$$T^{+} = \sum_{i} a_i T_i$$
wherein $T^{+}$ is the positive accumulated temperature of the site; $T_i$ is the daily mean temperature of the site on day $i$; and $a_i$ is a logical variable, with $a_i = 1$ when $T_i \ge 0$ and $a_i = 0$ when $T_i < 0$.
6. The method for forecasting the snowmelt runoff based on the multi-factor similarity according to claim 2, wherein the calculating of the soil water content according to the historical measured rainfall data normalized by the time scale comprises the following steps:
calculating the soil moisture content from the time-scale-normalized historical measured rainfall data based on the following formula:
$$W_{a,t+1} = K \left( W_{a,t} + P_{a,t} \right)$$
wherein $W_{a,t}$ is the soil moisture content of site $a$ on day $t$, $W_{a,t+1}$ is the soil moisture content of site $a$ on day $t+1$, $P_{a,t}$ is the rainfall of site $a$ on day $t$, and $K$ is the daily recession coefficient of the soil moisture content.
7. A method of predicting snowmelt runoff based on multifactor similarity as claimed in any one of claims 1 to 6 wherein said determining a plurality of prediction schemes comprises:
calculating rainfall correlation coefficients, runoff autocorrelation coefficients, and runoff partial autocorrelation coefficients;
and determining the plurality of forecasting schemes based on the rainfall correlation coefficients, the runoff autocorrelation coefficients, and the runoff partial autocorrelation coefficients.
8. A method of predicting snowmelt runoff based on multifactor similarity as claimed in any one of claims 1 to 6 wherein said determining a plurality of target historical samples based on historical measured data of the upstream of the basin of said preprocessed predicted section comprises:
calculating the optimal similarity between the preprocessed historical measured data upstream of the basin where the forecast section is located and each historical sample;
and determining the plurality of target historical samples based on the optimal similarity between the preprocessed historical measured data upstream of the basin where the forecast section is located and each historical sample.
9. A method for forecasting snow melt runoff based on multi-factor similarity according to any one of claims 1-6, wherein said screening said plurality of forecasting solutions according to the forecasting performance index, determining an optimal forecasting solution from said plurality of forecasting solutions, comprises:
and determining an optimal forecasting scheme from the plurality of forecasting schemes based on Nash coefficients, root mean square errors, average absolute errors and average relative errors of the snowmelt runoff forecasting model in each forecasting scheme.
10. A method of predicting snowmelt runoff based on multi-factor similarity according to any one of claims 1-6, wherein the function of the snowmelt runoff forecasting model is expressed as:
$$\hat{X}_r = \sum_{p=1}^{K} W_p(D_r)\, X_p$$
wherein $\hat{X}_r$ is the forecast snowmelt runoff of the forecast section in the $r$-th future forecast period, $D_r$ is the $r$-th target forecast sample, $W_p(D_r)$ is the sampling weight of the $p$-th sample with respect to $D_r$, $X_p$ is the $p$-th sample determined based on the plurality of target historical samples, and $K$ is the total number of sampled samples.
CN202410045133.4A 2024-01-12 2024-01-12 Multi-factor similarity-based snow melt runoff forecasting method Pending CN117556223A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410045133.4A CN117556223A (en) 2024-01-12 2024-01-12 Multi-factor similarity-based snow melt runoff forecasting method


Publications (1)

Publication Number Publication Date
CN117556223A true CN117556223A (en) 2024-02-13

Family

ID=89813338

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410045133.4A Pending CN117556223A (en) 2024-01-12 2024-01-12 Multi-factor similarity-based snow melt runoff forecasting method

Country Status (1)

Country Link
CN (1) CN117556223A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102867106A (en) * 2012-08-14 2013-01-09 贵州乌江水电开发有限责任公司 Method and system for predicting short-term running water
CN108416049A (en) * 2018-03-19 2018-08-17 河海大学 A kind of high and cold mountain area basin sleet mixing Runoff calculation method
CN112801342A (en) * 2020-12-31 2021-05-14 国电大渡河流域水电开发有限公司 Adaptive runoff forecasting method based on rainfall runoff similarity
CN115423146A (en) * 2022-07-27 2022-12-02 贵州乌江水电开发有限责任公司 Self-adaptive runoff forecasting method based on multi-factor nearest neighbor sampling regression and support vector machine


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
刘建军等: "降雨径流预报方法在两江水电站的应用", 《东北水利水电》, vol. 33, no. 05, 15 May 2015 (2015-05-15), pages 2 *
罗阳等: "相似性度量研究及最优相似系数", 《中国气象学会2008年年会天气预报准确率与公共气象服务分会场论文集》, 1 November 2008 (2008-11-01), pages 1 *
闻昕等: "基于多因素相似性的融雪径流预报方法研究", 《水力发电学报》, vol. 41, no. 03, 31 March 2022 (2022-03-31), pages 2 - 5 *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination