CN115062713A

CN115062713A - GCN-GRU-based open-pit mine truck stay area activity identification method

Info

Publication number: CN115062713A
Application number: CN202210728962.3A
Authority: CN
Inventors: 张磊; 卢传钊; 刘佰龙; 梁志贞
Original assignee: China University of Mining and Technology CUMT
Current assignee: China University of Mining and Technology CUMT
Priority date: 2022-06-24
Filing date: 2022-06-24
Publication date: 2022-09-16

Abstract

A GCN-GRU-based method for identifying activities in the stay areas of open-pit mine trucks, belonging to the field of trajectory data mining. The method comprises the following steps: filtering the GPS trajectory data based on the vehicle's unique identification ID and the time, that is, screening the data to eliminate invalid records; based on the filtered GPS trajectory data, setting a suitable threshold according to the actual stay times of the vehicles and identifying the stay areas in the trajectory data; extracting basic features of the identified truck stay areas, connecting the stay areas in time order to construct a graph, and obtaining an adjacency matrix; based on the extracted features and the adjacency matrix, using a GCN neural network to embed the stay area features, converting them into feature vectors; and training on the stay area sequence with a GRU neural network based on the obtained feature vectors, finally obtaining a classification result. The method constructs, in time order, an adjacency matrix whose vertices are the stay areas, and combines a GCN neural network with a GRU neural network model: the GCN can extract the features of the stay areas immediately before and after the current stay area, and the GRU can capture the features of all preceding stay areas, so that the identification accuracy is improved.

Description

GCN-GRU-based open-pit mine truck stay area activity identification method

Technical Field

The invention relates to the field of trajectory data mining, in particular to the identification of stay areas in the GPS trajectories of open-pit mine trucks and to the recognition of stay area activities by combining a GCN neural network with a GRU neural network; specifically, it relates to a GCN-GRU-based method for identifying activities in the stay areas of open-pit mine trucks.

Background

In recent years, with the rapid development and wide application of global positioning system devices, together with positioning technologies such as Wi-Fi, video surveillance and WSN (wireless sensor network), it has become possible to track the trajectories of moving objects around the world. These trajectory data include traffic trajectories, human movement trajectories, animal movement trajectories, trajectories of natural phenomena, and the like. Research on trajectory data mining is also developing rapidly; trajectory data provide rich information and knowledge and can be applied in many fields, such as location services, traffic management and city planning. Spatiotemporal trajectory data typically consist of trajectory points arranged in time order. These points differ greatly in importance, however: some trajectory points are merely places that the moving object passes through momentarily, while some sets of trajectory points indicate that the moving object stayed in a certain place for a certain period of time. The points in such a set are stationary or move slowly within a small range, and such sets are called stay areas. A stay area reflects the behavior of the moving object over a period of time and carries important semantic features.

GPS trajectory data are typically represented as a time-ordered sequence of trajectory points $\{p_0, \langle x_0, y_0 \rangle, t_0\}, \ldots, \{p_n, \langle x_n, y_n \rangle, t_n\}$, where $\langle x_n, y_n \rangle$ is the coordinate information of trajectory point $p_n$, $t_n$ is the time at which it was recorded, and $t_{n-1} < t_n$. Because GPS trajectories are sampled at a high frequency, a large amount of trajectory data is generated. The trajectory points are not equally important: some are merely places the moving object passes through momentarily, e.g. a bus stop that a user drives past, while others indicate that the moving object stayed in a place for some time, e.g. a user shopping in a mall or resting at home. A set of trajectory points reflecting that a moving object stayed in a certain place for a certain period of time is called a stay area, and identifying and analyzing the stay areas in GPS trajectories is one of the tasks of trajectory data mining.

With the wide application of deep learning to trajectory data, trajectory recognition based on artificial intelligence methods has become a research hotspot. The GCN has a strong ability to mine spatiotemporal features from complex topological road networks, but it has so far seen relatively little use in trajectory data mining.

In an open-pit mine, a stay area is typically a place where the truck driver stops to perform a task such as loading, unloading, waiting or resting. An open-pit mine is a large production system characterized by mining, loading, transportation and dumping; loading and unloading are among its core operations, and stay areas are an important part of the transportation task. Recognizing the activities in stay areas is therefore of great significance for supervising and improving truck operations in the open-pit mining area.

Disclosure of Invention

The invention aims to provide a GCN-GRU-based method for identifying activities in the stay areas of open-pit mine trucks, which solves the problems that traditional machine learning methods cannot make full use of trajectory data features and that the supervision of truck operations in open-pit mines is inefficient.

The purpose of the invention is achieved as follows. The GCN-GRU-based method for identifying activities in the stay areas of open-pit mine trucks comprises: filtering the GPS trajectory data based on the vehicle's unique identification ID and the time, that is, screening the data to eliminate invalid records; based on the filtered GPS trajectory data, setting a suitable threshold according to the stay time required by the study and identifying the stay areas in the trajectory data; extracting basic features of the identified truck stay areas, connecting the stay areas in time order to construct a graph, and obtaining an adjacency matrix as input to the model; based on the extracted features and the adjacency matrix, using a GCN neural network to embed the stay area features, converting them into feature vectors; and training on the stay area sequence with a GRU neural network based on the obtained stay area feature vectors, finally obtaining a classification result.

The method comprises the following specific steps:

Step 1: based on the acquired truck GPS trajectory data, filtering the data, that is, preprocessing it;

Step 2: based on the truck GPS trajectory data filtered in step 1, selecting a suitable threshold to identify the stay areas in the trajectory data, finally obtaining stay area data;

Step 3: extracting the features of the stay areas based on the stay area data obtained in step 2, and constructing, in time order, an adjacency matrix with the stay areas as vertices;

Step 4: based on the stay area features and the adjacency matrix obtained in step 3, embedding the stay area features with the graph convolutional neural network in the GCN-GRU and converting them into vector form; here GCN denotes the graph convolutional neural network and GRU denotes the gated recurrent unit network;

Step 5: based on the stay area feature embedding vectors obtained in step 4, training with the gated recurrent network in the GCN-GRU to finally obtain the activity labels of the different types.

Further, the step 1 specifically includes:

Step 1A: writing a filter program and setting screening conditions for the truck GPS trajectory data to remove repeated and invalid GPS records;

Step 1B: filtering the truck GPS trajectory data with the filter program written in step 1A;

Step 1C: storing the truck GPS trajectory data filtered in step 1B locally for stay area identification.

Further, the step 2 specifically includes:

Step 2A: determining the typical stop activities of the vehicle based on the vehicle type attribute and the GPS sampling frequency in the truck GPS trajectory data, and determining the thresholds for screening stay areas and the labels for activity classification;

Step 2B: writing a distance calculation program to calculate the distance between two trajectory points; the distance is calculated as follows:

given two adjacent trajectory points $D_1(x_1, y_1)$ and $D_2(x_2, y_2)$, where $D_1$ and $D_2$ are GPS trajectory points and $x_1, y_1$ and $x_2, y_2$ are the longitude and latitude of $D_1$ and $D_2$ respectively, the distance $D_{12}$ between $D_1$ and $D_2$ is the Euclidean distance:

$$D_{12} = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$$

Step 2C: writing a stay area detection program according to the thresholds of step 2A and the distance calculation program of step 2B: an area in which the trajectory remains within the spatial threshold for longer than the time threshold is detected as a stay area.

Stay area rule: with a time threshold of 10 minutes and a spatial threshold of 50 meters, a stay area is one in which the trajectory remains within 50 meters for more than 10 minutes.

Step 2D: obtaining the stay areas with the stay area detection program of step 2C and saving the results to a file, which stores the trajectory sequence of each stay area and the label of the cluster to which it belongs.
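The detection rule of steps 2B and 2C can be sketched in Python as follows. This is a minimal illustration, not the program used in the invention: the text does not specify the scanning strategy, so an anchor-based scan is assumed here, and the names `Point`, `distance` and `detect_stay_areas` are hypothetical. The sketch also assumes that the coordinates have already been projected to planar meters so that the Euclidean distance can be compared directly with a threshold in meters.

```python
import math
from typing import List, Tuple

# A trajectory point: (x, y, t). ASSUMPTION: x and y are planar coordinates in
# meters (already projected from longitude/latitude) and t is in seconds.
Point = Tuple[float, float, float]


def distance(p: Point, q: Point) -> float:
    """Euclidean distance between two trajectory points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def detect_stay_areas(
    track: List[Point],
    space_threshold: float = 50.0,   # meters
    time_threshold: float = 600.0,   # seconds (10 minutes)
) -> List[Tuple[int, int]]:
    """Return (first_index, last_index) pairs of detected stay areas.

    Anchor-based scan (an assumed strategy): starting from point i, extend j
    while the next point stays within space_threshold of point i, then keep
    the run if its duration exceeds time_threshold.
    """
    stays = []
    i, n = 0, len(track)
    while i < n:
        j = i
        while j + 1 < n and distance(track[i], track[j + 1]) <= space_threshold:
            j += 1
        if track[j][2] - track[i][2] > time_threshold:
            stays.append((i, j))
            i = j + 1          # continue scanning after the stay area
        else:
            i += 1             # no stay anchored here; move the anchor forward
    return stays


# Hypothetical track: the truck lingers near the origin, then drives away.
track = [(0, 0, 0), (5, 0, 300), (10, 0, 700), (500, 0, 800), (1000, 0, 900)]
stays = detect_stay_areas(track)
```

Other thresholds, such as a distance threshold of 20 meters and a time threshold of 15 seconds, can be passed through the same two parameters.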

Further, the step 3 specifically includes:

Step 3A: selecting the features of truck stay areas to be used for subsequent activity identification, based on how trucks stop in an open-pit mine;

Step 3B: selecting the stay time as a training feature; the stay time of a stay area is calculated as the difference between the time of the first recorded trajectory point and the time of the last recorded trajectory point in the stay area;

Step 3C: selecting the stay distance as a training feature; the stay distance of a stay area is calculated by using the distance calculation program to compute the distance between the first recorded trajectory point and the last recorded trajectory point in the stay area;

Step 3D: selecting the average speed of the stay area as a training feature; the average speed is calculated by computing the speed between successive trajectory points in the stay area and taking the mean;

Step 3E: calculating the center point of each stay area; the center point is a longitude and latitude coordinate whose longitude is the mean of all longitudes in the stay area and whose latitude is the mean of all latitudes in the stay area;

Step 3F: calculating the start time and end time of each stay area, where the start time is the time of the first recorded trajectory point in the stay cluster and the end time is the time of the last recorded trajectory point in the stay cluster;

Step 3G: based on the start times obtained in step 3F, sorting the stay areas by start time and establishing an edge between each pair of adjacent stay areas, thereby forming the adjacency matrix $A$; the adjacency matrix serves as a subsequent input to the model;

the elements of the adjacency matrix take the values:

$$A_{ij} = \begin{cases} 1, & (v_i, v_j) \in E \\ 0, & \text{otherwise} \end{cases}$$

where $A \in R^{k \times k}$, $A_{ij}$ is the entry of the adjacency matrix for stay areas $i$ and $j$, $v_i$ and $v_j$ denote nodes $i$ and $j$, $E$ denotes the set of edges, and $k$ denotes the total number of stay areas.
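The construction of step 3G can be illustrated with a short Python sketch. It assumes only what the step states (sort by start time, connect neighbors); the function name `build_adjacency` and the use of nested lists are illustrative choices, and the matrix is made symmetric because the graph is described as undirected.

```python
from typing import List


def build_adjacency(start_times: List[float]) -> List[List[int]]:
    """Adjacency matrix linking stay areas that are consecutive in time.

    Row/column m of the result refers to start_times[m]; two stay areas are
    connected when they are neighbors after sorting by start time.
    """
    k = len(start_times)
    order = sorted(range(k), key=lambda m: start_times[m])  # indices by start time
    a = [[0] * k for _ in range(k)]
    for prev, nxt in zip(order, order[1:]):
        a[prev][nxt] = 1
        a[nxt][prev] = 1   # undirected graph, so A is symmetric
    return a


# Hypothetical start times (seconds) of three stay areas, not yet sorted.
A = build_adjacency([30.0, 10.0, 20.0])
```

Here the time order is area 1, area 2, area 0, so the edges are (1, 2) and (2, 0); for stay areas that are already sorted the result is the familiar tridiagonal chain.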

Further, the step 4 specifically includes:

feature embedding is performed with a GCN model, whose layer-wise propagation is:

$$X^{(l+1)} = \sigma\left(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} X^{(l)} W^{(l)}\right)$$

$$\tilde{A} = A + I$$

where $A$ is the adjacency matrix, $I$ is the identity matrix, $\tilde{D}$ is the degree matrix of the undirected graph $G$, with $\tilde{D}_{ii} = \sum_j \tilde{A}_{ij}$, $\sigma(\cdot)$ is an activation function, $W^{(l)}$ is a weight matrix, and $X^{(l)}$ and $X^{(l+1)}$ are the features of the $l$-th and $(l+1)$-th layers respectively. The final output is the matrix $X \in R^{k \times d}$, the features of the last layer, where $k$ is the number of stay areas and $d$ is the dimension of the embedded representation. $X_t \in R^d$ is the embedded representation of the $t$-th stay area in time order.
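A single graph-convolution layer of the kind described in step 4 might look as follows in NumPy. This is a sketch under stated assumptions rather than the trained model: the propagation rule is the standard symmetric-normalized GCN layer, ReLU is assumed for the unspecified activation $\sigma$, and the weights, feature values and the name `gcn_layer` are hypothetical.

```python
import numpy as np


def gcn_layer(a: np.ndarray, x: np.ndarray, w: np.ndarray) -> np.ndarray:
    """One graph-convolution layer: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_tilde = a + np.eye(a.shape[0])            # add self-loops
    deg = a_tilde.sum(axis=1)                   # node degrees incl. self-loop
    d_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
    a_hat = d_inv_sqrt @ a_tilde @ d_inv_sqrt   # symmetric normalization
    return np.maximum(a_hat @ x @ w, 0.0)       # ReLU is an assumed choice of sigma


# Hypothetical example: 3 stay areas in a chain, 3 input features, d = 2.
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
X0 = rng.normal(size=(3, 3))    # e.g. stay time, stay distance, average speed
W0 = rng.normal(size=(3, 2))    # random stand-in for learned weights
X1 = gcn_layer(A, X0, W0)       # one embedding row per stay area, shape (3, 2)
```

Because each row of the normalized matrix mixes a node with its direct neighbors only, one layer lets a stay area see the stay areas immediately before and after it in time.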

Further, the step 5 specifically includes:

Step (51): at time step $t$, the update gate $z_t$ is calculated using the following formula:

$$z_t = \sigma(W_z [h_{t-1}, X_t]) \qquad (6)$$

where $t$ denotes the position of the current stay area in the sequence, $h_{t-1}$ carries the information of the previous stay area, and $X_t$ is the input vector of the GRU for the $t$-th stay area. $X_t$ is concatenated with $h_{t-1}$ and linearly transformed, i.e. multiplied by the weight matrix $W_z$, and the result is fed into the activation function of the update gate. $\sigma$ denotes the activation function used, calculated as:

$$\sigma(x) = \frac{1}{1 + e^{-x}} \qquad (7)$$

Step (52): the reset gate $r_t$ is calculated using the following formula:

$$r_t = \sigma(W_r [h_{t-1}, X_t]) \qquad (8)$$

where $W_r$ is the weight matrix of the reset gate;

Step (53): the new memory content uses the reset gate to store relevant information from the past; it is calculated as:

$$\tilde{h}_t = \tanh(W_{\tilde{h}} [r_t \odot h_{t-1}, X_t]) \qquad (9)$$

That is, the element-wise product of the reset gate $r_t$ and $h_{t-1}$ is calculated first; it is then concatenated with $X_t$, linearly transformed by the weight matrix $W_{\tilde{h}}$ of the new memory content, and fed into the activation function. $\tanh$ denotes the activation function used, calculated as:

$$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \qquad (10)$$

Step (54): calculating the final feature $h_t$ of the $t$-th stay area. This vector retains the information of the current unit and passes it on to the next unit. This requires the update gate, which determines what information is collected from the current memory content $\tilde{h}_t$ and from the previous stay area feature $h_{t-1}$; the process is expressed as:

$$h_t = z_t \odot h_{t-1} + (1 - z_t) \odot \tilde{h}_t \qquad (11)$$

where $z_t$ is the activation result of the update gate, which likewise controls the inflow of information in gated form; the product of $z_t$ and $h_{t-1}$ represents the information from the previous time step that is retained in the final memory, and this, plus the information from the current memory that is retained in the final memory, equals the output of the gated recurrent unit.

Step (55): classifying the current stay area using a fully connected network layer and a softmax layer, as follows:

$$y = FC(h_t) \qquad (12)$$

$$\mathrm{softmax}(y_i) = \frac{e^{y_i}}{\sum_{j=1}^{n} e^{y_j}} \qquad (13)$$

where $FC(\cdot)$ is the fully connected network, $y$ is the output of the fully connected layer, $y_i$ is the output value of the $i$-th node of the fully connected layer, and $n$ is the number of output nodes, i.e. the number of classification categories.
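The recurrence of steps (51) to (54) can be sketched in NumPy as below. It is an illustration only: the weights are random stand-ins for trained parameters, the candidate-state weight `w_h` and the function names are hypothetical, and bias terms are omitted because the formulas in the text show none.

```python
import numpy as np


def sigmoid(x: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-x))


def gru_step(h_prev, x_t, w_z, w_r, w_h):
    """One GRU step on the concatenation [h_{t-1}, X_t].

    Each weight matrix has shape (hidden, hidden + d). Bias terms are
    omitted, matching the formulas in the text.
    """
    hx = np.concatenate([h_prev, x_t])
    z_t = sigmoid(w_z @ hx)                                      # update gate
    r_t = sigmoid(w_r @ hx)                                      # reset gate
    h_cand = np.tanh(w_h @ np.concatenate([r_t * h_prev, x_t]))  # new memory
    return z_t * h_prev + (1.0 - z_t) * h_cand                   # final feature h_t


# Hypothetical sizes: embedding dimension d = 2, hidden size 4, 5 stay areas.
rng = np.random.default_rng(0)
d, hidden = 2, 4
w_z, w_r, w_h = (rng.normal(size=(hidden, hidden + d)) for _ in range(3))
h = np.zeros(hidden)
for x_t in rng.normal(size=(5, d)):    # embeddings X_1..X_5 in time order
    h = gru_step(h, x_t, w_z, w_r, w_h)
```

After the loop, `h` summarizes every stay area seen so far, which is what allows the classifier of step (55) to use the whole preceding sequence rather than the current stay area alone.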

The beneficial effects are as follows. With the above scheme, stay areas are identified from temporal and spatial factors; based on the identified stay areas, an adjacency matrix with the stay areas as vertices is constructed in time order and the stay area features are extracted; the adjacency matrix and the node features are embedded with the GCN, and the resulting feature embedding vectors are used as input to train a GRU, yielding the classification result.

Associations are established between the otherwise independent stay areas to form a graph structure, and the GCN neural network extracts the features of adjacent nodes. The GCN and the gated recurrent unit (GRU) are placed in a single framework, with the GRU capturing long-term temporal dependence, so that the model can make full use of the stay area features and achieve more accurate and effective activity prediction.

The stay area sequence is converted into an adjacency matrix with the stay areas as vertices, and the stay areas are associated in time order, so that the features of adjacent nodes can be learned during subsequent model training. The relationships between stay areas are thus fully exploited, allowing the model to extract features thoroughly.

In view of the characteristics of truck trajectory data in open-pit mines, the GCN model is combined with the GRU model so that the strengths of both are used to exploit the spatiotemporal features: the GCN extracts features from a complex graph structure, and the GRU analyzes the features extracted by the GCN, yielding a recognition model with long-term dependence. Compared with traditional machine learning models, recognition accuracy is improved; using trajectory data to identify what a truck is doing while it is stopped improves the efficiency of supervising truck operations in the open-pit mine.

This solves the problems that traditional machine learning methods cannot make full use of trajectory data features and that the supervision of truck operations in open-pit mines is inefficient, thereby achieving the purpose of the invention.

Advantages: the method constructs an adjacency matrix with the stay areas as vertices, so that the model can learn the features of adjacent nodes and the connection between stay areas is strengthened; spatiotemporal features are mined by the GCN, making full use of the characteristics of the trajectory data; the GCN is combined with the GRU model to recognize stay area activities, combining the strengths of both models; and using trajectory data to identify what a truck is doing while it is stopped improves the efficiency of supervising truck operations in the open-pit mine. In production use in open-pit mines, the method works well and improves production efficiency.

Drawings

FIG. 1 is a flow chart of the GCN-GRU-based method for identifying activities in the stay areas of open-pit mine trucks according to the invention.

FIG. 2 is a structural diagram of the GCN-GRU model used in the GCN-GRU-based method for identifying activities in the stay areas of open-pit mine trucks.

Detailed Description

A GCN-GRU-based method for identifying activities in the stay areas of open-pit mine trucks comprises the following steps: filtering the GPS trajectory data based on the vehicle's unique identification ID and the time, that is, screening the data to eliminate invalid records; based on the filtered GPS trajectory data, setting a suitable threshold according to the stay time required by the study and identifying the stay areas in the trajectory data; extracting basic features of the identified truck stay areas, connecting the stay areas in time order to construct a graph, and obtaining an adjacency matrix; based on the extracted features and the adjacency matrix, using a GCN neural network to embed the stay area features, converting them into feature vectors; and training on the stay area sequence with a GRU neural network based on the obtained feature vectors, finally obtaining a classification result.

The technical scheme of the invention is described in further detail below with reference to the drawings and a specific embodiment:

Example 1:

As shown in FIG. 1, an embodiment of the invention provides a GCN-GRU-based method for identifying activities in the stay areas of open-pit mine trucks. It is based on GPS trajectory data of trucks operating over one day shift plus one night shift at a mine in Inner Mongolia, and the processing comprises the following steps:

(1) Based on the acquired truck GPS trajectory data, the data are filtered, that is, preprocessed.

The GPS trajectory data of trucks operating over one day shift plus one night shift at a mine in Inner Mongolia comprise 922560 GPS records in total; each record contains the truck ID, time, longitude and latitude, and heading.

The acquired truck GPS trajectory data, which carry the vehicle's unique identification ID and are not in time order, are grouped by vehicle ID and saved locally, with the vehicle ID as the file name.

In this embodiment, a filter (screening) program is written in the Python programming language to set the screening conditions for the truck GPS trajectory data, so as to eliminate repeated, invalid and erroneous GPS records.

GPS records with incomplete attribute information are deleted.

If several GPS records have the same time attribute, only the last record is kept and the rest are deleted.

The truck GPS trajectory data are grouped by vehicle ID and re-sorted by time, yielding a complete time-ordered GPS trajectory for each truck.
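The filtering rules just listed can be sketched in Python as follows. The record layout and the name `filter_records` are assumptions made for illustration (the actual file format is not given in the text), and "incomplete" is taken to mean that a field is missing (`None`).

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# Hypothetical record layout: (truck_id, time, longitude, latitude, heading).
Record = Tuple[Optional[str], Optional[float], Optional[float],
               Optional[float], Optional[float]]


def filter_records(records: List[Record]) -> Dict[str, List[Record]]:
    """Group records by truck, drop incomplete ones, keep the last record per
    timestamp, and return each truck's records sorted by time."""
    latest: Dict[str, Dict[float, Record]] = defaultdict(dict)
    for rec in records:
        if any(field is None for field in rec):
            continue                     # incomplete attribute information
        truck_id, t = rec[0], rec[1]
        latest[truck_id][t] = rec        # later duplicates overwrite earlier ones
    return {
        truck_id: [by_time[t] for t in sorted(by_time)]
        for truck_id, by_time in latest.items()
    }


# Hypothetical raw data: out of order, one duplicate time, one missing field.
raw = [
    ("T1", 10.0, 1.0, 2.0, 90.0),
    ("T1", 5.0, 1.0, 2.0, 90.0),
    ("T1", 10.0, 1.5, 2.5, 80.0),
    ("T2", 7.0, None, 2.0, 0.0),
    ("T2", 3.0, 4.0, 5.0, 0.0),
]
clean = filter_records(raw)
```

Each value of `clean` corresponds to the per-truck file described above; writing it to disk under the vehicle ID is left out of the sketch.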

(2) A suitable threshold is selected to identify the stay areas in the trajectory data, finally obtaining the stay area data.

The typical stop activities of the vehicle are determined based on the vehicle type attribute in the GPS trajectory data. In an open-pit mine truck operation scenario, a stop is typically for loading or unloading.

A stay area detection program is written: an area in which the trajectory remains within the spatial threshold for longer than the time threshold is detected as a stay area. Given that the GPS trajectory data are sampled every 5 seconds, the distance threshold is set to 20 meters and the time threshold to 15 seconds in order to obtain as many stay areas as possible.

(3) The features of the stay areas are extracted, and an adjacency matrix with the stay areas as vertices is constructed in time order.

Stay time is a key feature for identifying stop locations and their purpose. The stay time is the time elapsed between the first recorded trajectory point and the last recorded trajectory point of the stay area. Stay time is selected as a training feature.

A program is written to calculate the time difference between the first recorded trajectory point and the last recorded trajectory point in the stay cluster.

The stay distance of the stay area is calculated and used as a feature; it is the Euclidean distance between the longitude and latitude of the first recorded trajectory point and those of the last recorded trajectory point in the stay cluster:

$$D_{12} = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$$

where $D_1(x_1, y_1)$ and $D_2(x_2, y_2)$ denote the two trajectory points.

Different moving vehicles travel at different speeds, so the average speed of the stay area is calculated and used as a feature; it is the mean of the speeds between trajectory points in the stay cluster.

A program calculates the speed between successive points in the stay area and takes the mean.

To construct the adjacency matrix with the stay areas as vertices, the center point, start time and end time of each stay area are then calculated.

A program is written to average all longitudes in the stay area and all latitudes in the stay area; the resulting longitude and latitude point is the center point of the stay area.

A program is written to calculate the start time and end time of each stay area, where the start time is the time of the first recorded trajectory point in the stay area and the end time is the time of the last recorded trajectory point in the stay cluster.

The stay areas are sorted by start time and an edge is established between each pair of adjacent stay areas, thereby forming the adjacency matrix $A$, which serves as a subsequent input to the model.
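The feature calculations described in this step can be sketched in Python as follows. The point layout, the dictionary keys and the name `stay_features` are hypothetical; speeds are computed between successive points, which is one reading of "the speed between the points", and segments with a zero time gap are skipped to avoid division by zero.

```python
import math
from statistics import mean
from typing import Dict, List, Tuple

# ASSUMPTION: each point is (x, y, t), with x and y the longitude and latitude
# (or projected coordinates) and t a timestamp in seconds.
Point = Tuple[float, float, float]


def stay_features(points: List[Point]) -> Dict[str, float]:
    """Features of one stay area, given its time-ordered trajectory points."""
    first, last = points[0], points[-1]
    # Speed on each segment between successive points (skip zero time gaps).
    speeds = [
        math.hypot(q[0] - p[0], q[1] - p[1]) / (q[2] - p[2])
        for p, q in zip(points, points[1:])
        if q[2] > p[2]
    ]
    return {
        "stay_time": last[2] - first[2],
        "stay_distance": math.hypot(last[0] - first[0], last[1] - first[1]),
        "avg_speed": mean(speeds) if speeds else 0.0,
        "center_x": mean(p[0] for p in points),
        "center_y": mean(p[1] for p in points),
        "start_time": first[2],
        "end_time": last[2],
    }


# Hypothetical stay area with three points.
features = stay_features([(0.0, 0.0, 0.0), (3.0, 4.0, 5.0), (6.0, 8.0, 10.0)])
```

The stay time, stay distance and average speed would form the node feature row of the stay area, while the start time is what orders the vertices when the adjacency matrix is built.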

(4) The features are embedded using the GCN-GRU and converted into vector form.

Before model training, the training data are labeled according to engineering experience and thread data, and the stay area activities are divided into 4 classes: loading, unloading, waiting, and other. The stay area features and the adjacency matrix are each saved to a file.

The complete data are divided according to a preset proportion into a training set and a test set.
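A split of this kind might be written as below. The text gives neither the proportion nor the splitting strategy, so the 0.8 ratio, the seeded random shuffle and the name `split_dataset` are all assumptions; each sample is taken to be one whole stay area sequence so that shuffling does not break the time order inside a sequence.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_dataset(samples: Sequence[T], train_ratio: float = 0.8,
                  seed: int = 0) -> Tuple[List[T], List[T]]:
    """Shuffle the samples reproducibly and split them into train and test."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)     # seeded, so the split is repeatable
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical dataset of 10 labeled stay area sequences, referred to by index.
train_set, test_set = split_dataset(list(range(10)))
```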

The training parameters are initialized and the features of the data are embedded; the feature extraction process is:

$$H^{(l+1)} = \sigma\left(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{(l)} W^{(l)}\right)$$

$$\tilde{A} = A + I$$

where $A$ is the adjacency matrix, $I$ is the identity matrix, $\tilde{D}$ is the degree matrix of the undirected graph $G$, with $\tilde{D}_{ii} = \sum_j \tilde{A}_{ij}$, $\sigma(\cdot)$ is an activation function, $W^{(l)}$ is a weight matrix, and $H^{(l)}$ and $H^{(l+1)}$ are the features of the $l$-th and $(l+1)$-th layers respectively. The final output is the matrix $H \in R^{k \times d}$, the features of the last layer, where $k$ is the number of rows of the matrix and $d$ is the dimension of the embedded representation. $X_t \in R^d$ is the embedded representation of the $t$-th stay area in time order.
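As a small worked illustration of the normalization, consider a hypothetical case with $k = 3$ stay areas in time order, so that the only edges are between areas 1 and 2 and between areas 2 and 3. Assuming the standard GCN normalization with self-loops, the matrices are:

```latex
A = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \qquad
\tilde{A} = A + I = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 1 & 1 \\ 0 & 1 & 1 \end{pmatrix}, \qquad
\tilde{D} = \operatorname{diag}(2, 3, 2)

\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} =
\begin{pmatrix}
\frac{1}{2} & \frac{1}{\sqrt{6}} & 0 \\
\frac{1}{\sqrt{6}} & \frac{1}{3} & \frac{1}{\sqrt{6}} \\
0 & \frac{1}{\sqrt{6}} & \frac{1}{2}
\end{pmatrix}
```

Each entry is $\tilde{A}_{ij} / \sqrt{\tilde{D}_{ii} \tilde{D}_{jj}}$, so the middle stay area averages over itself and both neighbors, while the first and last stay areas each mix with a single neighbor.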

(5) Feature training is performed with the GRU network, finally obtaining the classification result.

The training parameters are initialized and the model is trained on the features of the data.

At time step $t$, the update gate $z_t$ is calculated using the following formula:

$$z_t = \sigma(W_z [h_{t-1}, X_t]) \qquad (6)$$

where $t$ denotes the position of the current stay area in the sequence, $h_{t-1}$ carries the information of the previous stay area, and $X_t$ is the input vector of the $t$-th stay area. $X_t$ is concatenated with $h_{t-1}$ and linearly transformed, i.e. multiplied by the weight matrix $W_z$, and the result is fed into the activation function of the update gate. $\sigma$ denotes the activation function used, calculated as:

$$\sigma(x) = \frac{1}{1 + e^{-x}} \qquad (7)$$

The reset gate $r_t$ is calculated using the following formula:

$$r_t = \sigma(W_r [h_{t-1}, X_t]) \qquad (8)$$

where $W_r$ is the weight matrix of the reset gate. The new memory content uses the reset gate to store relevant information from the past; it is calculated as:

$$\tilde{h}_t = \tanh(W_{\tilde{h}} [r_t \odot h_{t-1}, X_t]) \qquad (9)$$

That is, the element-wise product of the reset gate $r_t$ and $h_{t-1}$ is calculated first; it is then concatenated with $X_t$, linearly transformed by the weight matrix $W_{\tilde{h}}$ of the new memory content, and fed into the activation function. $\tanh$ denotes the activation function used, calculated as:

$$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \qquad (10)$$

Finally, the final feature $h_t$ of the t-th stay area is calculated. This vector retains the information of the current unit and passes it on to the next unit. The step uses the update gate, which decides what is collected from the current memory content $\tilde{h}_t$ and from the feature $h_{t-1}$ of the previous stay area. The expression for this process is:

$$h_t = z_t \odot h_{t-1} + (1 - z_t) \odot \tilde{h}_t \qquad (11)$$

where $z_t$ is the activation result of the update gate, which controls the inflow of information in gated form. The product of $z_t$ and $h_{t-1}$ is the information retained from the previous time step; adding the information retained from the current memory content gives the output of the gated recurrent unit.
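The gate computations above can be sketched as a single GRU step in NumPy. This is an illustrative sketch, not the trained model: it assumes the final feature is combined as $h_t = z_t \odot h_{t-1} + (1 - z_t) \odot \tilde{h}_t$, which matches the description of the update gate, and it omits bias terms as the formulas do. The names `gru_step` and `W_h`, the sizes, and the random weights are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(h_prev, x_t, W_z, W_r, W_h):
    """One GRU step: update gate, reset gate, current memory, final feature."""
    hx = np.concatenate([h_prev, x_t])                # [h_{t-1}, X_t]
    z_t = sigmoid(W_z @ hx)                           # update gate
    r_t = sigmoid(W_r @ hx)                           # reset gate
    h_cand = np.tanh(W_h @ np.concatenate([r_t * h_prev, x_t]))  # current memory content
    return z_t * h_prev + (1.0 - z_t) * h_cand        # final feature h_t

# Hypothetical sizes: embedding dimension d = 2, hidden size m = 3.
d, m = 2, 3
rng = np.random.default_rng(0)
W_z, W_r, W_h = (rng.normal(size=(m, m + d)) for _ in range(3))

h = np.zeros(m)
for x_t in rng.normal(size=(5, d)):   # a made-up sequence of 5 stay-area embeddings
    h = gru_step(h, x_t, W_z, W_r, W_h)
```

After the loop, `h` summarises the whole sequence seen so far, so the feature of each stay area depends on every stay area before it.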

The current stay area is classified using a fully connected network layer followed by a softmax layer. The formulas are as follows:

$$y = FC(h_t) \qquad (12)$$

$$\mathrm{softmax}(y_i) = \frac{e^{y_i}}{\sum_{j=1}^{n} e^{y_j}} \qquad (13)$$

where FC(·) is a fully connected network, y is the output of the fully connected layer, $y_i$ is the output value of the i-th node of the fully connected layer, and n is the number of nodes, i.e. the number of classification categories.
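The classification step can be illustrated with a short NumPy sketch: a single fully connected layer followed by softmax, with the predicted activity taken as the most probable class. The layer shape, the zero bias, the random weights, and the function name `classify` are assumptions made for the example.

```python
import numpy as np

def softmax(y):
    e = np.exp(y - np.max(y))   # subtract the max for numerical stability
    return e / e.sum()

def classify(h_t, W_fc, b_fc):
    """Fully connected layer plus softmax; returns (class index, probabilities)."""
    y = W_fc @ h_t + b_fc       # one output node per activity class
    p = softmax(y)
    return int(np.argmax(p)), p

rng = np.random.default_rng(0)
n, m = 4, 3                     # hypothetical: 4 activity classes, hidden size 3
W_fc, b_fc = rng.normal(size=(n, m)), np.zeros(n)
label, probs = classify(rng.normal(size=m), W_fc, b_fc)
```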

In summary, compared with the prior art, the method has the following beneficial effects:

The embodiment of the invention addresses the current situation in open-pit mines, where the monitoring of truck production operations is inefficient and trajectory data is not fully used, by applying trajectory data mining to the open-pit mine. Traditional machine learning methods are poorly suited to open-pit mines. The method constructs an adjacency matrix with the stay areas as vertices and uses a GCN network to aggregate the features of adjacent nodes, obtaining local features. A GRU neural network then captures long-term temporal dependence, i.e. the features of all preceding stay areas, which improves the identification accuracy.

Claims

1. A GCN-GRU-based open-pit mine truck stay area activity identification method, characterized by comprising the following steps: filtering the GPS trajectory data based on the unique identification ID of the vehicle and the time, that is, screening the data to eliminate invalid data; based on the filtered GPS trajectory data, setting a suitable threshold according to the actual research requirement on the stay time of a stay area, and identifying the stay areas in the trajectory data; based on the obtained truck stay areas, extracting basic features of the stay areas, connecting the stay areas in time order to construct a graph, and obtaining an adjacency matrix as input to the model; based on the extracted features and the adjacency matrix, using a GCN neural network to embed the features of the stay areas, converting them into feature vectors; and based on the obtained feature vectors of the stay areas, training on the stay area sequence with a GRU neural network to finally obtain the activity type of each stay area.

2. The GCN-GRU-based open-pit mine truck stay area activity identification method according to claim 1, characterized in that the specific steps are as follows:

step 2: based on the GPS trajectory data filtered in step 1, selecting a suitable threshold to identify the stay areas in the trajectory data, finally obtaining stay area data;

step 3: extracting the features of the stay areas based on the stay area data obtained in step 2; constructing, in time order, an adjacency matrix with the stay areas as vertices;

3. The GCN-GRU-based open-pit mine truck stay area activity identification method according to claim 2, characterized in that step 1 specifically comprises:

step 1A: writing a filter program and setting the screening conditions for truck GPS filtering, so as to eliminate repeated and invalid GPS data;

step 1B: filtering the truck GPS trajectory data with the filter program written in step 1A;

step 1C: storing the truck GPS data filtered in step 1B locally for stay area identification.
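A filter of the kind described in steps 1A and 1B might look like the sketch below. The record layout (`vehicle_id`, `timestamp`, `lat`, `lon`) and the specific validity rules are assumptions chosen for illustration; the screening conditions actually used are not specified here.

```python
def filter_gps(records):
    """Drop invalid and duplicate GPS records; return the rest sorted by vehicle and time."""
    seen = set()
    kept = []
    for rec in records:
        vid, ts = rec.get("vehicle_id"), rec.get("timestamp")
        lat, lon = rec.get("lat"), rec.get("lon")
        if vid is None or ts is None or lat is None or lon is None:
            continue                      # missing field: treated as invalid
        if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
            continue                      # impossible coordinate
        if (vid, ts) in seen:
            continue                      # repeated record for the same vehicle and time
        seen.add((vid, ts))
        kept.append(rec)
    return sorted(kept, key=lambda r: (r["vehicle_id"], r["timestamp"]))

raw = [
    {"vehicle_id": "T01", "timestamp": 20, "lat": 39.10, "lon": 110.20},
    {"vehicle_id": "T01", "timestamp": 10, "lat": 39.10, "lon": 110.20},
    {"vehicle_id": "T01", "timestamp": 10, "lat": 39.10, "lon": 110.20},  # duplicate
    {"vehicle_id": "T01", "timestamp": 30, "lat": 139.0, "lon": 110.20},  # invalid latitude
    {"vehicle_id": None, "timestamp": 40, "lat": 39.10, "lon": 110.20},   # missing ID
]
clean = filter_gps(raw)
```

Sorting by vehicle and time at this stage means the later stay area detection can assume each truck's points arrive in chronological order.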

4. The GCN-GRU-based open-pit mine truck stay area activity identification method according to claim 2, characterized in that step 2 specifically comprises:

step 2A: determining the time threshold and the space threshold for screening stay areas, and the labels for activity classification, based on the vehicle type attribute and the GPS sampling frequency in the truck GPS trajectory data, combined with the activities of the truck during stays;

step 2C: writing a stay area detection program according to the thresholds of step 2A and the distance calculation program of step 2B, wherein a region in which the trajectory remains within the space threshold and the stay time exceeds the time threshold is detected as a stay area, one stay area being formed by one or more track points;

step 2D: obtaining the stay areas using the stay area detection program of step 2C and storing the result in a file, the file containing the track sequence of each stay area and the label of the stay area.
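The threshold rule of step 2C can be sketched as follows. This is one common way to implement stay detection, not necessarily the program used in the method: it assumes the haversine formula as the distance calculation, measures distance from the first point of a candidate stay, and takes points as `(timestamp_s, lat, lon)` tuples. The thresholds in the example are made up.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    r = 6371000.0                                   # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = p2 - p1, math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def detect_stay_areas(points, dist_thresh_m, time_thresh_s):
    """points: time-ordered (timestamp_s, lat, lon). Returns (start, end) index pairs."""
    stays, i, n = [], 0, len(points)
    while i < n:
        j = i + 1
        # extend the candidate while points stay within the space threshold of point i
        while j < n and haversine_m(points[i][1], points[i][2],
                                    points[j][1], points[j][2]) <= dist_thresh_m:
            j += 1
        if points[j - 1][0] - points[i][0] > time_thresh_s:
            stays.append((i, j - 1))                # stay time exceeds the time threshold
            i = j
        else:
            i += 1
    return stays

track = [
    (0, 39.1000, 110.2000), (60, 39.1000, 110.2001), (120, 39.1001, 110.2000),
    (180, 39.1000, 110.2000), (240, 39.1001, 110.2001),   # truck is stationary
    (300, 39.1100, 110.2000), (360, 39.1200, 110.2000), (420, 39.1300, 110.2000),
]
stays = detect_stay_areas(track, dist_thresh_m=50.0, time_thresh_s=180.0)
```

In this made-up track the first five points lie within a few metres of each other for 240 s, so they form a single stay area, while the last three points are more than a kilometre apart and are treated as movement.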

5. The GCN-GRU-based open-pit mine truck stay area activity identification method according to claim 2, characterized in that step 3 specifically comprises:

step 3C: selecting the stay distance as a training feature and calculating the stay distance of each stay area, wherein the stay distance is calculated by using the distance calculation program to compute the distance between the first recorded track point and the last recorded track point in the stay area;

step 3D: selecting the average speed of the stay area as a training feature, wherein the average speed is calculated by computing the speed between successive track points in the stay area and taking the mean;

step 3F: calculating the start time and the end time of each stay area, wherein the start time is the time of the first recorded track point in the stay area and the end time is the time of the last recorded track point in the stay area;
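The features of steps 3C, 3D and 3F can be computed along the following lines. To keep the sketch short it assumes the track points have already been projected to planar coordinates in metres, `(timestamp_s, x_m, y_m)`, and uses Euclidean distance in place of the distance calculation program; the function and key names are hypothetical.

```python
import math

def stay_features(points):
    """points: time-ordered (timestamp_s, x_m, y_m) track points of one stay area."""
    first, last = points[0], points[-1]
    # stay distance: first recorded point to last recorded point
    stay_distance = math.hypot(last[1] - first[1], last[2] - first[2])
    speeds = []
    for (t0, x0, y0), (t1, x1, y1) in zip(points, points[1:]):
        if t1 > t0:                                   # skip zero-length intervals
            speeds.append(math.hypot(x1 - x0, y1 - y0) / (t1 - t0))
    avg_speed = sum(speeds) / len(speeds) if speeds else 0.0
    return {"stay_distance_m": stay_distance, "avg_speed_mps": avg_speed,
            "start_time": first[0], "end_time": last[0]}

feats = stay_features([(0, 0.0, 0.0), (10, 3.0, 4.0), (20, 3.0, 4.0)])
```

For this made-up stay area the truck moves 5 m in the first 10 s and then stands still, so the stay distance is 5 m and the average of the two segment speeds is 0.25 m/s.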

the elements of the adjacency matrix take the values:

$$A_{ij} = \begin{cases} 1, & (v_i, v_j) \in E \\ 0, & \text{otherwise} \end{cases}$$

where $A_{ij}$ is the value for stay areas i and j in the adjacency matrix $A \in R^{k \times k}$, $v_i$ and $v_j$ denote nodes i and j, E is the edge set, and k is the total number of stay areas.
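One way to build such a matrix is shown below. The sketch assumes that "connecting the stay areas in time order" means adding an undirected edge between each pair of consecutive stay areas, so the graph is a simple chain; the function name and the sample start times are hypothetical.

```python
import numpy as np

def build_adjacency(start_times):
    """Adjacency matrix of an undirected chain linking stay areas in time order."""
    k = len(start_times)
    order = np.argsort(start_times)          # stay-area indices sorted by start time
    A = np.zeros((k, k), dtype=int)
    for a, b in zip(order[:-1], order[1:]):
        A[a, b] = A[b, a] = 1                # edge between consecutive stay areas
    return A

# Four stay areas given out of order; time order is index 1, 2, 0, 3.
A = build_adjacency([300, 0, 120, 900])
```

Because the matrix is indexed by the original stay-area positions, it can be paired directly with a feature matrix whose rows are in the same order.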

6. The GCN-GRU-based open-pit mine truck stay area activity identification method according to claim 2, characterized in that step 4 specifically comprises:

performing feature embedding using a GCN model, comprising:

$$X^{(l+1)} = \sigma\left(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} X^{(l)} W^{(l)}\right)$$

where $\tilde{A} = A + I$, A is the adjacency matrix, I is the identity matrix, $\tilde{D}$ is the degree matrix of the undirected graph G (computed from $\tilde{A}$), σ(·) is an activation function, $W^{(l)}$ is the weight matrix, and $X^{(l)}$ and $X^{(l+1)}$ are the features of layer l and layer l+1 respectively. The final output is a matrix $X \in R^{k \times d}$ holding the features of the last layer, where k is the number of stay areas and d is the dimension of the embedded representation. $X_t \in R^{d}$ is the embedded representation of the t-th stay area in time order.

7. The GCN-GRU-based open-pit mine truck stay area activity identification method according to claim 2, characterized in that step 5 specifically comprises:

$$z_t = \sigma(W_z [h_{t-1}, X_t]) \qquad (6)$$

where t denotes the position of the current stay area in the sequence, $h_{t-1}$ carries the information of the previous stay area, and $X_t$ is the input vector of the GRU for the t-th stay area. $X_t$ is concatenated with $h_{t-1}$ and linearly transformed, i.e. multiplied by the weight matrix $W_z$, and the result is passed to the activation function of the update gate. σ denotes the sigmoid activation function, calculated as follows:

$$\sigma(x) = \frac{1}{1 + e^{-x}} \qquad (7)$$

step (52): calculating the reset gate $r_t$ using the following formula:

$$r_t = \sigma(W_r [h_{t-1}, X_t]) \qquad (8)$$

where $W_r$ is the weight matrix of the reset gate;

step (54): calculating the final feature $h_t$ of the t-th stay area. This vector retains the information of the current unit and passes it on to the next unit. The step uses the update gate, which decides what is collected from the current memory content $\tilde{h}_t$ and from the feature $h_{t-1}$ of the previous stay area:

$$h_t = z_t \odot h_{t-1} + (1 - z_t) \odot \tilde{h}_t \qquad (11)$$

where $z_t$ is the activation result of the update gate, which controls the inflow of information in gated form. The product of $z_t$ and $h_{t-1}$ is the information retained from the previous time step; adding the information retained from the current memory content gives the output of the gated recurrent unit.

$$y = FC(h_t) \qquad (12)$$