CN117910512A

CN117910512A - Rail potential prediction method based on attention mechanism and OVPD intelligent control device

Info

Publication number: CN117910512A
Application number: CN202410247837.XA
Authority: CN
Inventors: 林珊; 农兴中; 陈霞; 贺利工; 蔡智超; 崖尚松; 唐智玺
Original assignee: East China Jiaotong University; Guangzhou Metro Design and Research Institute Co Ltd
Current assignee: East China Jiaotong University; Guangzhou Metro Design and Research Institute Co Ltd
Priority date: 2024-03-05
Filing date: 2024-03-05
Publication date: 2024-04-19

Abstract

The invention provides a rail potential prediction method based on an attention mechanism and an OVPD intelligent control device. Compared with the prior art, the method builds a GRU-MA-GRU neural network model. The multi-head attention (MA) layer in the model calculates the attention score between each item of first prediction data and the historical rail potential and assigns a new weight to the first prediction data. The multi-head attention mechanism allows the model to attend to several important parts of the first prediction data at the same time; by calculating the attention scores of different heads, it captures different points of interest in the first prediction data, which improves the model's ability to learn from different parts of the data and improves prediction accuracy.

Description

Rail potential prediction method based on attention mechanism and OVPD intelligent control device

Technical Field

The invention belongs to the technical field of rail traffic safety control, and particularly relates to a rail potential prediction method based on an attention mechanism and an OVPD intelligent control device.

Background

In recent years, with the rapid development of China's economy, the traffic volume of urban rail transit has increased year by year, and urban rail transit is in an unprecedented period of high-speed development. In electrified railways, the running rails serve as part of the traction current return path. An excessive rail potential not only interferes with signal equipment along the line but can also cause abnormal traction current return and even threaten personal safety. Therefore, the rail potential limiting device (Over Voltage Protection Device, OVPD) is widely adopted at home and abroad to provide grounding protection and prevent the consequences of excessive rail voltage.

When the rail potential exceeds the safe allowable contact voltage, the OVPD triggers automatically and rapidly shorts the rail to ground; the OVPD operating time needs to be about 0.2 s. The instant the rail is shorted to ground, current leaks and creates stray currents. Especially when one OVPD in the rail transit system is triggered, these stray currents are likely to cause OVPDs at other locations to trigger successively or remain off for long periods. The resulting large stray currents pose significant problems for the safe operation and maintenance of urban rail transit systems. It is therefore very important to study the actions of OVPDs in urban rail transit systems and to analyze the optimal OVPD action strategy.

In the prior art, the rail potential distribution is obtained mainly from a mechanism model of the return-current system. Based on this model, real OVPD data are calculated and analyzed through high-precision digital-twin simulation of the rail potential distribution, yielding the real-time rail potential distribution at all positions along the whole line. However, the rail conditions over a future period cannot be predicted, so the OVPD cannot be controlled in advance according to predicted data and may not trigger in time, leading to safety accidents.

Disclosure of Invention

Aiming at the defects of the prior art, the invention provides a rail potential prediction method based on an attention mechanism and an OVPD intelligent control device, which are used for accurately predicting the rail potential of a target rail transit system.

In a first aspect, the object of the present invention is achieved by the following technical solution:

The rail potential prediction method based on the attention mechanism is characterized by comprising the following steps of:

S1: acquiring rail side running state data of a target rail transit system through SCADA (Supervisory Control And Data Acquisition) as sample data;

S2: preprocessing the sample data to obtain preprocessed sample data;

S3: setting a multi-step prediction label, and setting a sliding window according to the multi-step prediction label; the sliding window is used for capturing local characteristics of the preprocessed sample data;

S4: dividing the preprocessed sample data into a test set and a training set according to the sliding window;

S5: constructing a GRU-MA-GRU neural network model with an attention mechanism built in;

S6: optimizing a plurality of hyper-parameters of the GRU-MA-GRU neural network model;

S7: placing the training set into the GRU-MA-GRU neural network model for training to obtain a depth model;

S8: the test set is placed into the depth model for calculation, and prediction data are obtained;

S9: evaluating the predicted data to obtain an evaluation result;

S10: repeating the steps S6-S9, and if the evaluation result is within a preset threshold range, taking the hyper-parameters corresponding to the current evaluation result as the prediction configuration parameters of the GRU-MA-GRU neural network model to obtain a prediction model based on attention;

S11: obtaining a rail potential predicted value in a preset time period by adopting the attention-based prediction model.

Preferably, the GRU-MA-GRU neural network model comprises:

a linear input layer, used for carrying out linear transformation and dimension reduction on input data and outputting a first transformation value;

a GRU network layer, used for carrying out feature extraction and prediction on the first transformation value to obtain first prediction data;

a residual layer for preserving original characteristics of the input data;

a multi-head attention (MA) layer, which calculates the attention score between each item of first prediction data and the historical rail potential by a multi-head attention mechanism and assigns a corresponding weight to each item of first prediction data;

a ReLU activation layer with a built-in nonlinear activation function, used for learning nonlinear patterns or features of the input data to obtain activation data;

and a linear output layer, used for carrying out linear transformation on the weighted first prediction data and the weighted activation data and outputting a prediction result.

Preferably, the step S2 includes:

Normalizing a plurality of groups of sample data to the [0,1] interval by using a Min-Max-Scaler function to obtain a plurality of groups of normalized data, wherein the Min-Max-Scaler function has the formula:

$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{1}$$

wherein $x$ is the sample data, $x_{\max}$ is the maximum value of the sample data, $x_{\min}$ is the minimum value of the sample data, and $x'$ is the normalized data.

Preferably, the step S4 includes:

Dividing the multiple groups of normalized data into a training set and a testing set according to a preset dividing proportion.

Preferably, the step S8 includes:

and the test set is put into a loss function in the depth model for calculation, so that predicted data of the test set are obtained, wherein the loss function is as follows:

(2)

wherein $n$ is the number of samples, $y_i$ is the true value of the test-set data, and $\hat{y}_i$ is the predicted data for the test set.

Preferably, the step S9 includes:

S91: performing inverse normalization processing on the predicted data to obtain inverse normalized predicted data;

S92: evaluating the predicted data after the inverse normalization processing by adopting the root mean square error and the coefficient of determination as evaluation functions to obtain the evaluation result.

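Preferably, the calculation formulas of step S92 are as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2} \tag{3}$$

$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2} \tag{4}$$

wherein $n$ is the number of samples of the predicted data after the inverse normalization process, $\hat{y}_i$ is the inverse-normalized predicted data at the i-th moment, $y_i$ is the actual value of the rail potential at the i-th moment, and $\bar{y}$ is the average of the actual values of the rail potential.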

Preferably, the prediction configuration parameters at least include training round number, learning rate, sliding window size, attention head number and GRU hidden layer number.

Preferably, after step S11, the method further comprises:

If the rail potential predicted value is higher than the safe allowable contact voltage, sending a pre-action command to the corresponding OVPD.

In a second aspect, the object of the present invention is achieved by the following technical solution:

the intelligent OVPD control device based on the attention mechanism is internally provided with the rail potential prediction method based on the attention mechanism, and the device comprises the following components:

the data acquisition module is used for acquiring rail side running state data of the target rail transit system through the SCADA as sample data;

the preprocessing module is used for preprocessing the sample data to obtain preprocessed sample data;

the setting module is used for setting a multi-step prediction label and setting a sliding window according to the multi-step prediction label; the sliding window is used for capturing local characteristics of the preprocessed sample data;

The data dividing module is used for dividing the preprocessed sample data into a test set and a training set according to the sliding window;

the model building module is used for building a GRU-MA-GRU neural network model with an attention mechanism built in;

The parameter optimization module is used for optimizing a plurality of hyper-parameters of the GRU-MA-GRU neural network model;

The training module is used for placing the training set into the GRU-MA-GRU neural network model for training to obtain a depth model;

the depth model calculation module is used for placing the test set into the depth model for calculation to obtain prediction data;

the evaluation module is used for evaluating the prediction data to obtain an evaluation result;

the prediction model acquisition module is used for repeating the steps S6-S9, and if the evaluation result is within a preset threshold range, taking the hyper-parameters corresponding to the current evaluation result as the prediction configuration parameters of the GRU-MA-GRU neural network model to obtain a prediction model based on attention;

The prediction module is used for obtaining a rail potential predicted value in a preset time period by adopting the attention-based prediction model;

And the control module is used for sending a pre-action instruction to the corresponding OVPD if the steel rail potential predicted value is higher than the safety allowable contact voltage.

Compared with the prior art, the invention has at least the following advantages:

1) The method predicts the rail potential through the GRU-MA-GRU neural network model, so that accurate potential information can be obtained, filling a gap in the field of action prediction for the rail potential limiting device (OVPD).

2) The method builds a GRU-MA-GRU neural network model. The multi-head attention (MA) layer in the model calculates the attention score between each item of first prediction data and the historical rail potential and thereby assigns a new weight to the first prediction data. The multi-head attention mechanism allows the model to attend to several important parts of the first prediction data at the same time; by calculating the attention scores of different heads, it captures different points of interest in the first prediction data, which strengthens the model's ability to learn from different parts of the data and improves prediction accuracy.

3) Using the obtained predicted value, the system can send the action command to the rail potential limiting device (OVPD) in advance, which can effectively suppress the generation of stray current, greatly facilitates the safe operation and maintenance of urban rail transit, and contributes to the realization of intelligent transportation and smart cities.

Drawings

In order to illustrate the embodiments of the invention or the technical solutions in the prior art more clearly, the drawings required for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the invention, and a person skilled in the art can obtain other drawings from them without inventive effort.

FIG. 1 is a flow chart of a rail potential prediction method based on an attention mechanism in a first embodiment of the invention;

FIG. 2 is a schematic diagram of a GRU-MA-GRU neural network model in accordance with an embodiment of the invention;

FIG. 3 is a chart of the prediction result for the 1st second in step S8 in the first embodiment of the present invention;

FIG. 4 is a scaled view of FIG. 3;

FIG. 5 is a schematic structural diagram of an OVPD intelligent control device based on an attention mechanism in a second embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by those skilled in the art based on the embodiments of the invention without inventive effort fall within the scope of protection of the invention.

The embodiment of the invention discloses a rail potential prediction method and an OVPD intelligent control device based on an attention mechanism, which are used for accurately predicting the rail potential of a target rail transit system.

Example 1

As shown in fig. 1, the rail potential prediction method based on the attention mechanism comprises the following steps:

S1: acquiring rail side running state data of the target rail transit system through SCADA as sample data.

In this embodiment, SCADA (Supervisory Control and Data Acquisition) is a technology for monitoring and controlling a system and is widely used in the field of urban rail transit. Sensors or other detection equipment installed along the rail line are communicatively connected to the SCADA system, so that the operation state data of the rail-side electrical equipment can be transmitted and collected.

It should be noted that the operation state data include, but are not limited to, multiple sets of voltage and current data at the positions of the circuit breakers, multiple sets of voltage and current data at the positions of the disconnectors, and the voltage and current data of the rail itself. Typically, if the acquisition period is 0:00:00 to 23:59:59 and data are acquired once per second, there are 86,400 sets of sample data to be processed. As required, the acquisition frequency can be increased to obtain more and denser sample data, and data samples spanning multiple days can also be used to improve the generalization and accuracy of the model.

S2: preprocessing the sample data to obtain preprocessed sample data.

Wherein, step S2 includes:

normalizing a plurality of groups of sample data to the [0,1] interval by using a Min-Max-Scaler function to obtain a plurality of groups of normalized data, wherein the Min-Max-Scaler function has the following formula:

$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{1}$$

Specifically, step S2 adjusts the scale of the data and provides normalized data for modeling, so that the data values are on the same order of magnitude and differences in scale between the data do not interfere with model training.
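The normalization of step S2 and its inverse (used later in step S91) can be sketched in a few lines of Python. This is a minimal illustration of formula (1), not the embodiment's implementation: the function names, the handling of a constant series, and the sample voltages are assumptions made for the example.

```python
def min_max_scale(values):
    """Scale a list of numbers to [0, 1]; returns (scaled, x_min, x_max)."""
    x_min, x_max = min(values), max(values)
    span = x_max - x_min
    if span == 0:
        # Assumption: a constant series is mapped to 0.0 to avoid division by zero
        return [0.0 for _ in values], x_min, x_max
    return [(v - x_min) / span for v in values], x_min, x_max


def inverse_min_max(scaled, x_min, x_max):
    """Undo min_max_scale, i.e. the inverse normalization of step S91."""
    return [s * (x_max - x_min) + x_min for s in scaled]


# Made-up rail potentials in volts, for illustration only
rail_potential = [12.0, 35.5, -4.0, 60.0, 20.0]
scaled, lo, hi = min_max_scale(rail_potential)
restored = inverse_min_max(scaled, lo, hi)
```

Keeping the minimum and maximum alongside the scaled data is what makes the later inverse normalization possible.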

S3: setting a multi-step prediction label, and setting a sliding window according to the multi-step prediction label.

The preprocessed data are time-series data. Multi-step prediction refers to predicting the values at a plurality of future time points.

In the present embodiment, the size of the sliding window is set to 40.

In this embodiment, the input and output data formats after the processing of step S3 are shown in Table 1. With the sliding window set to 40, the input feature matrix X is constructed from the preceding 40 pieces of data, and the model outputs the predicted values Y1 to Y10 for 10 steps.

TABLE 1

Specifically, in step S3 the labels can be set by changing the multi-step prediction horizon as required, thereby constructing the input and output data formats of the model. The sliding window divides the preprocessed sample data into multiple subsequences so that local characteristics of the data are captured, which strengthens the subsequent model's ability to understand and predict the time series.
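The windowing described for step S3 can be sketched as follows, using the window size of 40 and the 10-step horizon of this embodiment. The function name, the stride of one sample, and the univariate input are assumptions for illustration; the embodiment does not state them.

```python
def make_windows(series, window=40, horizon=10):
    """Return (X, Y): X[i] holds `window` past values, Y[i] the next `horizon` values."""
    X, Y = [], []
    last_start = len(series) - window - horizon
    for start in range(last_start + 1):  # assumed stride of one sample
        X.append(series[start:start + window])
        Y.append(series[start + window:start + window + horizon])
    return X, Y


# A made-up normalized series of 100 samples
series = [i / 99 for i in range(100)]
X, Y = make_windows(series)
```

With 100 samples, a window of 40, and a horizon of 10, this yields 51 input/label pairs, and the last label ends at the last sample of the series.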

S4: dividing the preprocessed sample data into a test set and a training set according to the sliding window.

Wherein, step S4 includes:

dividing the multiple groups of normalized data into a training set and a test set according to a preset division ratio.

In this embodiment, the preset division ratio is preferably 22:2, that is, the data of the future 2 h are predicted from the data of the preceding 22 h.

In other embodiments, the preset division ratio may be adjusted according to the data set and the specific site conditions.

Specifically, step S4 divides the data into a training set and a test set to provide the data basis for training and evaluating the subsequent model, so that the model's performance and generalization capability can be evaluated and optimized.
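A 22:2 division of one day of one-second samples can be sketched as below. The split is assumed here to be chronological (no shuffling), which matches predicting the last 2 h from the first 22 h; the function name is hypothetical.

```python
def chronological_split(samples, train_parts=22, test_parts=2):
    """Split time-ordered samples into (train, test) without shuffling."""
    cut = len(samples) * train_parts // (train_parts + test_parts)
    return samples[:cut], samples[cut:]


# One day of one-second samples, represented by their indices
samples = list(range(86400))
train, test = chronological_split(samples)
```

For 86,400 samples this gives 79,200 training samples (22 h) and 7,200 test samples (2 h), with every test sample later in time than every training sample.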

S5: constructing a GRU-MA-GRU neural network model with a built-in attention mechanism.

Specifically, as shown in FIG. 2, the GRU-MA-GRU neural network model includes a linear input layer, a GRU network layer, a residual layer, a multi-head attention (MA) layer, a ReLU activation layer, and a linear output layer.

The linear input layer is used for carrying out linear transformation and dimension reduction on input data and outputting a first transformation value, so as to adapt to the input requirement of the subsequent layer.

The GRU network layer is used for carrying out feature extraction and prediction on the first transformation value to obtain first prediction data. It should be noted that the GRU (Gated Recurrent Unit) is a variant of the recurrent neural network (RNN). It has memory and gating mechanisms, can capture long-term dependencies in time series data, and models strongly time-correlated data well.
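The gating mechanism mentioned above can be illustrated with a minimal NumPy sketch of a single GRU step. It follows a commonly used GRU formulation (update gate z, reset gate r, candidate state); the parameter names, the hidden size of 8, and the random weights are assumptions for illustration and are not taken from the embodiment.

```python
import numpy as np


def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))


def gru_step(x, h, p):
    """One GRU time step: x is the input vector, h the previous hidden state."""
    z = sigmoid(p["Wz"] @ x + p["Uz"] @ h + p["bz"])              # update gate
    r = sigmoid(p["Wr"] @ x + p["Ur"] @ h + p["br"])              # reset gate
    h_tilde = np.tanh(p["Wh"] @ x + p["Uh"] @ (r * h) + p["bh"])  # candidate state
    return (1.0 - z) * h + z * h_tilde                            # blend old and new


def init_params(input_size, hidden_size, rng):
    """Random stand-ins for trained weights (illustration only)."""
    p = {}
    for gate in "zrh":
        p["W" + gate] = rng.standard_normal((hidden_size, input_size)) * 0.1
        p["U" + gate] = rng.standard_normal((hidden_size, hidden_size)) * 0.1
        p["b" + gate] = np.zeros(hidden_size)
    return p


rng = np.random.default_rng(0)
params = init_params(input_size=1, hidden_size=8, rng=rng)
h = np.zeros(8)
window = np.linspace(0.0, 1.0, 40)  # a made-up normalized 40-sample window
for value in window:
    h = gru_step(np.array([value]), h, params)
```

After the loop, `h` summarizes the whole window; the update gate decides at each step how much of the earlier state is carried forward, which is how the layer retains longer-term dependencies.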

The residual layer is used to preserve the original characteristics of the input data. In this embodiment, the residual layer directly adds the input data and the output data, so that the network can learn the residual signal, thereby retaining the original characteristics of the input data. Setting the residual layer helps to improve the representation ability and learning effect of the network.

The multi-head attention (MA) layer calculates the attention score between each item of first prediction data and the historical rail potential by a multi-head attention mechanism and assigns a corresponding weight to each item of first prediction data. In this embodiment, the multi-head attention mechanism allows the model to attend to several important parts of the data at the same time and captures different points of interest in the data by calculating the attention scores of different heads, which strengthens the model's ability to learn from different parts of the data and improves prediction accuracy.
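The weighting performed by the MA layer can be sketched with standard scaled dot-product multi-head attention in NumPy. This is a generic formulation, not the embodiment's exact computation: treating the GRU output as the query and the historical window as keys and values is one reading of the description, and the dimensions, function names, and random projection matrices are assumptions.

```python
import numpy as np


def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_attention(query, memory, num_heads, rng):
    """query: (Tq, d), memory: (Tk, d) -> output (Tq, d), weights (heads, Tq, Tk)."""
    d_model = query.shape[-1]
    assert d_model % num_heads == 0, "d_model must be divisible by num_heads"
    d_head = d_model // num_heads
    # Random projections stand in for learned parameters (illustration only)
    w_q, w_k, w_v, w_o = (
        rng.standard_normal((d_model, d_model)) / np.sqrt(d_model) for _ in range(4)
    )

    def split(x):  # (T, d_model) -> (heads, T, d_head)
        return x.reshape(x.shape[0], num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(query @ w_q), split(memory @ w_k), split(memory @ w_v)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # attention scores per head
    weights = softmax(scores, axis=-1)                   # each row sums to 1
    heads = weights @ v                                  # (heads, Tq, d_head)
    merged = heads.transpose(1, 0, 2).reshape(query.shape[0], d_model)
    return merged @ w_o, weights


rng = np.random.default_rng(0)
gru_out = rng.standard_normal((10, 16))  # made-up first prediction data, 10 steps
history = rng.standard_normal((40, 16))  # made-up encoded historical window
out, attn = multi_head_attention(gru_out, history, num_heads=4, rng=rng)
```

Each of the four heads produces its own 10 x 40 weight matrix, so different heads can emphasize different parts of the historical window for the same prediction step.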

The ReLU activation layer uses ReLU (Rectified Linear Unit) as a nonlinear activation function, which introduces nonlinear relationships so that the network can learn complex nonlinear patterns and features.

The linear output layer is used for carrying out linear transformation on the weighted first prediction data and the weighted activation data and outputting a final prediction result.

It should be explained that the ReLU activation layer is a layer commonly used in neural networks. Its specific effect is to add nonlinearity to the data mapping: in the data mapped into the higher-dimensional space, negative values are set to 0 and zeros remain 0, which reduces the weight of useless features. It can be understood simply as zeroing out the useless parts of the data while leaving the useful parts unchanged.
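The effect described above amounts to an element-wise maximum with zero, as in this small sketch (the feature values are made up):

```python
def relu(values):
    """Set negative entries to 0.0 and leave non-negative entries unchanged."""
    return [max(0.0, v) for v in values]


features = [-1.5, 0.0, 0.3, 2.0]
activated = relu(features)
```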

Compared with the prior art, the method builds a GRU-MA-GRU neural network model. The multi-head attention (MA) layer in the model calculates the attention score between each item of first prediction data and the historical rail potential and thereby assigns a new weight to the first prediction data. The multi-head attention mechanism allows the model to attend to several important parts of the first prediction data at the same time; by calculating the attention scores of different heads, it captures different points of interest in the first prediction data, which strengthens the model's ability to learn from different parts of the data and improves prediction accuracy.

S6: optimizing a plurality of hyper-parameters of the GRU-MA-GRU neural network model.

Specifically, step S6 mainly optimizes, on an empirical basis, four hyper-parameters of the GRU-MA-GRU neural network model, namely the learning rate, the sliding window size, the number of attention heads, and the number of GRU hidden layers, so that the model maintains a relatively fast prediction speed at high accuracy. The optimal model is saved by saving the optimal learning strategy.

S7: placing the training set into the GRU-MA-GRU neural network model for training to obtain a depth model.

S8: placing the test set into the depth model for calculation to obtain prediction data.

Wherein, step S8 includes:

placing the test set into the loss function in the depth model for calculation to obtain the prediction data, wherein the loss function is as follows:

(2)

Specifically, as shown in FIGS. 3-4, the trained depth model is used to predict the rail potential on the divided test set, giving predicted values for the next 10 seconds that have not yet been inverse-normalized.

S9: evaluating the predicted data to obtain an evaluation result.

Wherein, step S9 includes:

S91: carrying out inverse normalization processing on the predicted data to obtain the predicted data after the inverse normalization processing.

S92: evaluating the predicted data after the inverse normalization processing by adopting the root mean square error and the coefficient of determination as evaluation functions to obtain an evaluation result.

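Preferably, the calculation formulas in step S92 are:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2} \tag{3}$$

$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2} \tag{4}$$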

In this embodiment, the results shown in table 2 can be obtained by combining the above-described calculation formulas (3) and (4) with the actual sample data:

TABLE 2

It should be noted that Table 2 shows the evaluation results of step S9. RMSE characterizes the fitting ability of the model, and R² characterizes the degree of correlation between the predicted and true values. RMSE is a loss-type index and R² is a gain-type index: the smaller the RMSE and the closer R² is to 1, the better the model fits and the more accurate the prediction result. As can be seen from Table 2, when predicting the next 10 seconds, R² is very close to 1 and the RMSE is small, meaning that the predictions are very close to the true values. For the first 5 seconds in particular, the prediction accuracy is essentially above 95%. The model can therefore be considered to accurately predict the rail potential at the OVPD at future times.

Specifically, step S9 evaluates the accuracy of the predicted rail potential using the root mean square error (RMSE) and the coefficient of determination (R²) as evaluation functions. These two indexes allow the performance of the GRU-MA-GRU neural network model in the rail potential prediction task to be evaluated objectively. Lower RMSE values and higher R² values indicate that the model predicts more accurately and fits better. The evaluation results help judge whether the model meets the prediction requirement and can also serve as a basis for improving its performance, for example by adjusting the model structure or optimizing the training strategy.
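The two evaluation indexes of step S92 can be computed as in the following sketch, which uses the standard definitions of RMSE and the coefficient of determination. The function names and the four sample values are made up for illustration and are not results from the embodiment.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between actual and predicted values."""
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / n)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up inverse-normalized values in volts
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
score_rmse = rmse(actual, predicted)
score_r2 = r_squared(actual, predicted)
```

Here the squared errors sum to 18, so the RMSE is the square root of 4.5 and R² is 1 - 18/500 = 0.964. Both indexes are computed on inverse-normalized data so that the RMSE is expressed in the physical unit of the rail potential.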

S10: repeating steps S6-S9, and if the evaluation result is within the preset threshold range, taking the hyper-parameters corresponding to the current evaluation result as the prediction configuration parameters of the GRU-MA-GRU neural network model to obtain the attention-based prediction model.

The prediction configuration parameters at least comprise the number of training rounds, the learning rate, the sliding window size, the number of attention heads, and the number of GRU hidden layers.

It should be noted that the preset threshold range in step S10, that is, the range of prediction accuracy acceptable to the user, may be set according to the actual situation.
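The repeat-until-acceptable loop of step S10 can be sketched as a search over candidate configurations. The exhaustive grid order, the threshold values, and the placeholder scoring function are all assumptions for illustration; the embodiment tunes the hyper-parameters empirically and does not prescribe a search strategy.

```python
import itertools


def search_configuration(grid, train_and_evaluate, rmse_max, r2_min):
    """Try hyper-parameter combinations until one meets the preset thresholds."""
    names = list(grid)
    for values in itertools.product(*(grid[name] for name in names)):
        config = dict(zip(names, values))       # S6: choose hyper-parameters
        rmse, r2 = train_and_evaluate(config)   # S7-S9: train, predict, evaluate
        if rmse <= rmse_max and r2 >= r2_min:   # S10: within the preset range?
            return config, rmse, r2
    return None, None, None


def toy_train_and_evaluate(config):
    """Placeholder for S7-S9; returns made-up scores, not real model results."""
    rmse = abs(config["window"] - 40) / 10 + 1.0 / config["attention_heads"]
    return rmse, 1.0 - rmse / 10


grid = {
    "learning_rate": [1e-2, 1e-3],
    "window": [20, 40],
    "attention_heads": [2, 4],
    "gru_layers": [1, 2],
}
best, best_rmse, best_r2 = search_configuration(
    grid, toy_train_and_evaluate, rmse_max=0.3, r2_min=0.95
)
```

In a real run, `toy_train_and_evaluate` would be replaced by a function that trains the model with the given configuration and returns its RMSE and R² on the test set.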

S11: obtaining a rail potential predicted value in a preset time period by adopting the attention-based prediction model.

Specifically, step S11 predicts the actual rail potential to obtain rail potential predicted values covering 10 seconds.

Compared with the prior art, the method constructs, trains, and optimizes a GRU-MA-GRU neural network model. Step S6 yields the optimal model configuration parameters, such as the learning rate, the sliding window size, the number of attention heads, and the number of GRU hidden layers, which improves the performance of the model. Steps S7-S9 then train and evaluate the model with the optimized configuration parameters to demonstrate its prediction effect, so that more accurate prediction results can be obtained in future actual prediction tasks.

After step S11, the method of the present invention further comprises:

Specifically, if the predicted rail potential is higher than the safe allowable contact voltage, the OVPD is controlled to enter a pre-action state so that it shorts the rail to ground at the future moment at which the high potential is predicted, implementing grounding protection and avoiding potential safety hazards. When the predicted rail potential drops to the safe allowable contact voltage, the OVPD acts to cancel the short circuit.
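The control decision described above can be sketched as a threshold check over the predicted values. The command names, the function name, the forecast values, and the 120 V limit are hypothetical; the actual safe allowable contact voltage depends on the applicable standard and the site.

```python
def ovpd_commands(predicted_potentials, safe_limit):
    """Return (step, command) pairs whenever the commanded OVPD state changes."""
    commands = []
    shorted = False
    for step, value in enumerate(predicted_potentials, start=1):
        if value > safe_limit and not shorted:
            commands.append((step, "PRE_ACTION"))  # short the rail to ground
            shorted = True
        elif value <= safe_limit and shorted:
            commands.append((step, "RELEASE"))     # cancel the short circuit
            shorted = False
    return commands


# Made-up predicted rail potentials in volts for the next six seconds
forecast = [40.0, 75.0, 130.0, 125.0, 90.0, 60.0]
commands = ovpd_commands(forecast, safe_limit=120.0)
```

For this forecast, a pre-action command is issued for the third step, when the prediction first exceeds the limit, and a release for the fifth step, when it has dropped back to or below the limit.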

Compared with the prior art, the method uses the predicted value to send the action command to the rail potential limiting device (OVPD) in advance, which can effectively suppress the generation of stray current, greatly facilitates the safe operation and maintenance of urban rail transit, and contributes to the realization of intelligent transportation and smart cities.

Example two

As shown in fig. 5, on the basis of the first embodiment, this embodiment discloses an OVPD intelligent control device based on an attention mechanism, in which the above rail potential prediction method based on an attention mechanism is built, and the device includes:

The prediction model acquisition module is used for repeating the steps S6-S9, and if the evaluation result is within the preset threshold range, taking the hyper-parameters corresponding to the current evaluation result as the prediction configuration parameters of the GRU-MA-GRU neural network model to obtain a prediction model based on the attention;

the prediction module is used for obtaining a rail potential predicted value in a preset time period by adopting a prediction model based on attention;

and the control module is used for sending a pre-action instruction to the corresponding OVPD if the potential predicted value of the steel rail is higher than the safety allowable contact voltage.

Wherein, the preprocessing module includes:

The first calculation unit is used for normalizing the plurality of groups of sample data to the [0,1] interval by using a Min-Max-Scaler function to obtain a plurality of groups of normalized data.

Wherein, the data partitioning module includes:

The dividing unit is used for dividing the plurality of groups of normalized data into a training set and a testing set according to a preset dividing proportion.

Wherein, the depth model calculation module includes:

and the second calculation unit is used for calculating the loss function of the test set placed in the depth model to obtain the prediction data of the test set.

Wherein the evaluation module comprises:

The inverse normalization processing unit is used for performing inverse normalization processing on the predicted data to obtain the predicted data after the inverse normalization processing;

And the evaluation unit is used for evaluating the predicted data after the inverse normalization processing by adopting the root mean square error and the coefficient of determination as evaluation functions to obtain an evaluation result.

Compared with the prior art, the device fills a gap in the field of action prediction for the rail potential limiting device (OVPD).

Those of ordinary skill in the art will appreciate that: all or part of the steps for implementing the above embodiments may be implemented by hardware associated with program instructions, and the foregoing program may be stored in a computer readable storage medium, which when executed, performs steps including the above method embodiments; and the aforementioned storage medium includes: various media that can store program code, such as ROM, RAM, magnetic or optical disks.

It will be apparent to those skilled in the art from this disclosure that various other changes and modifications can be made which are within the scope of the invention as defined in the appended claims.

Claims

1. The rail potential prediction method based on the attention mechanism is characterized by comprising the following steps of:

S1: acquiring rail side running state data of a target rail transit system through SCADA as sample data;

S2: preprocessing the sample data to obtain preprocessed sample data;

S9: evaluating the predicted data to obtain an evaluation result;

2. The attention mechanism based rail potential prediction method of claim 1, wherein the GRU-MA-GRU neural network model comprises:

a residual layer for preserving original characteristics of the input data;

3. The method for predicting the potential of a steel rail based on an attention mechanism according to claim 1, wherein the step S2 comprises:

$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{1}$$

4. The method for predicting the potential of a steel rail based on an attention mechanism according to claim 1, wherein the step S4 comprises:

5. The method for predicting the potential of a steel rail based on an attention mechanism according to claim 1, wherein the step S8 comprises:

(2)

6. The method for predicting the potential of a rail based on an attention mechanism according to claim 5, wherein the step S9 comprises:

7. The method for predicting the potential of a steel rail based on an attention mechanism as set forth in claim 5, wherein the calculation formula in step S92 is:

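$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2} \tag{3}$$

$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2} \tag{4}$$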

8. The method for predicting the potential of a steel rail based on an attention mechanism according to claim 1, wherein the prediction configuration parameters at least comprise the number of training rounds, the learning rate, the sliding window size, the number of attention heads, and the number of GRU hidden layers.

9. The attention mechanism-based rail potential prediction method according to claim 1, characterized in that after step S11, the method further comprises:

10. An OVPD intelligent control device based on an attention mechanism, characterized by comprising: