CN114723784B - Pedestrian motion trail prediction method based on domain adaptation technology - Google Patents


Info

Publication number
CN114723784B
CN114723784B (application CN202210364770.9A)
Authority
CN
China
Prior art keywords
domain
training
matrix
input sample
pedestrian
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210364770.9A
Other languages
Chinese (zh)
Other versions
CN114723784A (en)
Inventor
张小恒
刘书君
李勇明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University
Original Assignee
Chongqing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University filed Critical Chongqing University
Priority to CN202210364770.9A priority Critical patent/CN114723784B/en
Publication of CN114723784A publication Critical patent/CN114723784A/en
Application granted granted Critical
Publication of CN114723784B publication Critical patent/CN114723784B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06T 7/246: Image analysis; analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06F 18/214: Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06N 3/049: Neural networks; temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G06N 3/08: Neural networks; learning methods
    • G06T 2207/20081: Image analysis indexing scheme; training, learning
    • G06T 2207/20084: Image analysis indexing scheme; artificial neural networks [ANN]
    • G06T 2207/30241: Image analysis indexing scheme; trajectory
    • Y02T 10/40: Climate change mitigation technologies related to transportation; engine management systems


Abstract

The invention relates to the technical field of automatic driving, and particularly discloses a pedestrian motion trajectory prediction method based on domain adaptation technology. The method first collects a training data set and a test data set and preprocesses them to obtain a training input sample set X, a training output reference set X', and a test input sample set Y; it then performs domain adaptation on X and Y with r different domain adaptation parameters to obtain the domain-adapted data sets X̃ and Ỹ; based on X̃ and X', r time-series convolutional networks are constructed and trained to obtain r prediction models; the r sets Ỹ are then input into the r prediction models to obtain r predicted paths for each of the N pedestrians; finally, path fusion yields the optimal predicted trajectory of each pedestrian. Through domain adaptation processing, the training-scene data distribution is made approximately consistent with that of the actual application scene, which enhances the generalization capability of the prediction models; by varying the domain adaptation parameter, the observed trajectory is mapped into training input trajectories of different scales, so that deeper information of the observed trajectory is utilized and the predicted trajectory is more accurate.

Description

Pedestrian motion trail prediction method based on domain adaptation technology
Technical Field
The invention relates to the technical field of automatic driving, in particular to a pedestrian motion trail prediction method based on a domain adaptation technology.
Background
With the popularization of intelligent transportation, predicting pedestrian motion trajectories has become an increasingly important subject. In unmanned or assisted driving, accurate prediction of pedestrian trajectories helps the driving system plan the motion of vehicles in the traffic environment in advance. Research in this field has mainly focused on the following: ① early modeling of the social attributes of pedestrian motion using various social force models; ② in recent years, modeling combined with deep learning methods, such as the Social long short-term memory network (Social-LSTM), the Social generative adversarial network (Social-GAN), and graph neural networks. These approaches face the following problems:
(1) The construction of the social force model is subjective, and complex actual motion rules of pedestrians are difficult to describe accurately;
(2) The deep learning related method relies on massive training data, but the training data scene is not matched with the actual application scene, so that the model generalization capability is poor;
(3) The observation track is generally sampled at fixed time intervals, and multi-scale and multi-resolution depth information of sampling points at different time intervals is difficult to use.
Therefore, by applying domain adaptation so that the training-scene and test-scene distributions become approximately consistent, and by fusing sequences of sampling points taken at different time intervals, pedestrian trajectories can be predicted more accurately.
Disclosure of Invention
The invention provides a pedestrian motion trajectory prediction method based on domain adaptation technology, which addresses the technical problem of how to predict pedestrian trajectories in crowded environments more accurately.
In order to solve the technical problems, the invention provides a pedestrian motion trail prediction method based on a domain adaptation technology, which comprises the following steps:
S1, acquiring a training scene and a motion trail of a pedestrian under a test scene, generating a corresponding training data set and a corresponding test data set, preprocessing the training data set to generate a training input sample set X and a training output reference set X', and preprocessing the test data set to generate a test input sample set Y;
S2, perform domain adaptation processing on the training input sample set X and the test input sample set Y with r different domain adaptation parameters, obtaining r different domain-adapted training input sample sets X̃ and r different domain-adapted test input sample sets Ỹ;
S3, based on the r domain-adapted training input sample sets X̃ and the training output reference set X', construct r time-series convolutional networks and train them, obtaining r corresponding prediction models;
S4, input the r domain-adapted test input sample sets Ỹ correspondingly into the r prediction models for prediction, obtaining r predicted paths for each pedestrian;
S5, fuse the r predicted paths of each pedestrian to obtain the predicted trajectory of each pedestrian.
Further, the step S2 specifically includes the steps of:
S21, extract the abscissa component X_x of the training input sample set X and the abscissa component Y_x of the test input sample set Y, treat them respectively as the source domain and target domain for domain adaptation, and obtain, for r different numbers d of principal eigenvectors, r groups of domain-adapted abscissa source-domain matrices X̃_x and domain-adapted abscissa target-domain matrices Ỹ_x;
S22, extract the ordinate component X_y of the training input sample set X and the ordinate component Y_y of the test input sample set Y, treat them respectively as the source domain and target domain for domain adaptation, and obtain, for the same r numbers d of principal eigenvectors, r groups of domain-adapted ordinate source-domain matrices X̃_y and domain-adapted ordinate target-domain matrices Ỹ_y;
S23, merge each group's domain-adapted abscissa source-domain matrix X̃_x and ordinate source-domain matrix X̃_y to obtain the r domain-adapted training input sample sets X̃; merge each group's domain-adapted abscissa target-domain matrix Ỹ_x and ordinate target-domain matrix Ỹ_y to obtain the r domain-adapted test input sample sets Ỹ.
Steps S21 and S22 may be performed in either order.
The domain adaptation processing procedure in step S21 specifically includes the steps of:
S211, stack the source-domain matrix X_x and the target-domain matrix Y_x and form the kernel matrix K = [X_x; Y_x][X_x; Y_x]^T, where the superscript T denotes the matrix transpose;
S212, construct the maximum mean discrepancy measure matrix M = ΛΛ^T, where Λ is the (MN+KN)-dimensional column vector whose first MN entries equal 1/(MN) and whose last KN entries equal −1/(KN), so that the squared maximum mean discrepancy between the projected source and target domains can be measured; M is the number of samples of the training input sample set X (and of the training output reference set X'), N is the total number of pedestrians, and K is the number of samples of the test input sample set Y;
S213, construct the centering matrix H = E − (1/(MN+KN)) 1^T 1, where E is the (MN+KN)-dimensional identity matrix and 1 is the (MN+KN)-dimensional all-ones row vector;
S214, perform eigenvalue decomposition of the matrix (KMK + μE)^{−1} KHK and extract the first d principal eigenvectors to construct the transfer matrix W, where μ is a balance factor;
S215, take the first MN column vectors of W^T K to form the MN × d domain-adapted abscissa source-domain matrix X̃_x, and the last KN column vectors of W^T K to form the KN × d domain-adapted abscissa target-domain matrix Ỹ_x;
S216, change the number d of principal eigenvectors r−1 times; after each change, rebuild the transfer matrix W according to steps S214 and S215 and construct a new domain-adapted abscissa source-domain matrix X̃_x and target-domain matrix Ỹ_x from it, thereby obtaining, for the r different values of d, r groups of different domain-adapted abscissa source-domain matrices X̃_x and target-domain matrices Ỹ_x.
The domain adaptation processing in step S22 is the same as in steps S211 to S216.
Further, in step S1, the training data set S consists of F frames of position coordinates formed by sampling the motion trajectories of N pedestrians at equal time intervals over a long period, i.e. S = {S_1, S_2, …, S_F}, where one frame of data S_f records the spatial positions of the N pedestrians at one instant, S_f = {s_f^1, s_f^2, …, s_f^N}, and the spatial position of pedestrian n in frame f is s_f^n = (x_f^n, y_f^n); 1 ≤ f ≤ F is the frame number, 1 ≤ n ≤ N is the pedestrian number, and (x, y) is a two-dimensional plane coordinate point of the pedestrian;
The time period [ΔT, T+ΔT] contains T_0+T_1 frames of consecutive position coordinates of the N pedestrians; the first T_0 frames of data construct a multidimensional training input sample, and the last T_1 frames of data construct a multidimensional training output reference sample, where the training input sample of the n-th pedestrian is ((x_{n,1}, y_{n,1}), …, (x_{n,T_0}, y_{n,T_0})) and the training output reference sample of the n-th pedestrian is ((x_{n,T_0+1}, y_{n,T_0+1}), …, (x_{n,T_0+T_1}, y_{n,T_0+T_1})); (x_{n,i}, y_{n,i}) denotes the position coordinates of the n-th pedestrian at the i-th frame, i = 1, 2, …, T_0+T_1;
Varying the start time ΔT, i.e. ΔT = mT_fra with 0 ≤ m ≤ M−1, where T_fra is the time interval between adjacent frames, yields a total of M multidimensional training input samples forming the training input sample set X and M multidimensional training output reference samples forming the training output reference set X';
The test data set is preprocessed to generate the test input sample set Y, containing K samples, in the same way that the training data set S is preprocessed to generate the training input sample set X.
Further, in step S211, the abscissa component X_x of the training input sample set X, i.e. the source-domain matrix X_x, is the MN × T_0 matrix whose rows are the abscissa sequences (x_{n,1}, …, x_{n,T_0}) of the N pedestrians in each of the M training input samples; the abscissa component Y_x of the test input sample set Y, i.e. the target-domain matrix Y_x, is the corresponding KN × T_0 matrix built from the K test input samples.
In step S215, the corresponding domain-adapted abscissa source-domain matrix X̃_x has dimension MN × d and the domain-adapted abscissa target-domain matrix Ỹ_x has dimension KN × d; similarly, step S22 yields the MN × d domain-adapted ordinate source-domain matrix X̃_y and the KN × d domain-adapted ordinate target-domain matrix Ỹ_y.
In step S23, merging X̃_x and X̃_y yields the domain-adapted training input sample set X̃, each of whose samples x̃ pairs the adapted abscissa row and adapted ordinate row of one pedestrian; likewise, merging Ỹ_x and Ỹ_y yields the domain-adapted test input sample set Ỹ with samples ỹ.
Further, step S3 specifically includes the steps of:
S31, rearrange the samples x̃ of the first domain-adapted training input sample set X̃ to build a rearranged set, in which each sample is a sequence of d matrices of dimension 2 × N, one matrix per adapted time index (1 ≤ t ≤ d) holding the adapted x- and y-coordinates of the N pedestrians;
S32, input the rearranged set to train the L-layer time-series convolutional network, taking the average displacement error between the network output and the training output reference set X' as the loss function for computing the training error during training; the weight values are learned with a stochastic gradient descent algorithm, and the trained prediction model is saved once the training termination condition, reaching the maximum number of training cycles, is met. In the convolution process of the time-series convolutional network, the first-layer output (with time index 1 ≤ t' ≤ T_1, coordinate index 1 ≤ i ≤ 2, and pedestrian index 1 ≤ j ≤ N) is obtained by convolving the network input with the first-layer convolution kernel, and each higher-layer output (1 ≤ l < L) is obtained by convolving the output of layer l with the kernel of layer l+1, where the weights of the different layers' convolution kernels are learned and η is the convolution kernel scale;
S33, repeat steps S31 to S32 for the remaining r−1 domain-adapted training input sample sets X̃, finally obtaining r trained prediction models.
Further, the step S4 specifically includes the steps of:
S41, rearrange the samples ỹ of the r domain-adapted test input sample sets Ỹ in the same way as step S31, constructing r rearranged sets;
S42, input the r rearranged sets respectively into the r prediction models trained in step S33, obtaining r different predicted paths Tra(1), Tra(2), …, Tra(r) for each pedestrian.
Further, in step S5 the r different predicted paths Tra(1), Tra(2), …, Tra(r) of pedestrian n are fused according to Tra = α_1 Tra(1) + α_2 Tra(2) + … + α_r Tra(r) to obtain the predicted trajectory Tra of pedestrian n, with α_1 + α_2 + … + α_r = 1, where α_j denotes the weight of the j-th predicted path Tra(j), j = 1, 2, …, r.
The invention provides a pedestrian motion trajectory prediction method based on domain adaptation technology. The method first collects a training data set and a test data set and preprocesses them to obtain a training input sample set X, a training output reference set X', and a test input sample set Y; it then performs domain adaptation on the two input sample sets X and Y with r different domain adaptation parameters to obtain the domain-adapted data sets X̃ and Ỹ; based on X̃ and X', r time-series convolutional networks are constructed and trained to obtain r prediction models; the r sets Ỹ are then input into the r prediction models for prediction, obtaining r predicted paths for each of the N pedestrians; finally, the r predicted paths of each pedestrian are fused to obtain the optimal predicted trajectory of each pedestrian. The method takes account of the statistical requirement that training-scene data and actual-application-scene data be identically distributed: through domain adaptation processing their distributions are made approximately consistent, which enhances the generalization capability of the prediction models. The method also exploits the multi-scale, multi-resolution depth information of the observed trajectory: by varying the domain adaptation parameter, the observed trajectory is mapped into training input trajectories of different scales, so that deeper information of the observed trajectory is utilized and the final predicted trajectory is more accurate overall.
Drawings
Fig. 1 is a flowchart of a pedestrian motion trail prediction method based on a domain adaptation technology provided by an embodiment of the present invention.
Detailed Description
The following embodiments are given for the purpose of illustration only and are not to be construed as limiting the invention. The accompanying drawings are for reference and description only and likewise do not limit the scope of the invention, since many variations are possible without departing from its spirit and scope.
In order to more accurately predict a pedestrian trajectory, referring to the flowchart of fig. 1, an embodiment of the present invention provides a pedestrian motion trajectory prediction method based on a domain adaptation technique, generally comprising the steps of:
S1, acquiring a training scene and a motion trail of a pedestrian under a test scene, generating a corresponding training data set and a corresponding test data set, preprocessing the training data set to generate a training input sample set X and a training output reference set X', and preprocessing the test data set to generate a test input sample set Y;
S2, perform domain adaptation processing on the training input sample set X and the test input sample set Y, obtaining r different domain-adapted training input sample sets X̃ and r different domain-adapted test input sample sets Ỹ;
S3, based on the r domain-adapted training input sample sets X̃ and the training output reference set X', construct r time-series convolutional networks and train them, obtaining r corresponding prediction models;
S4, input the r domain-adapted test input sample sets Ỹ correspondingly into the r prediction models for prediction, obtaining r predicted paths for each pedestrian;
S5, fuse the r predicted paths of each pedestrian to obtain the predicted trajectory of each pedestrian.
To elaborate, in step S1, the training data set S consists of F frames of position coordinates formed by sampling the motion trajectories of N pedestrians at equal time intervals over a long period, i.e. S = {S_1, S_2, …, S_F}, where one frame of data S_f records the spatial positions of the N pedestrians at one instant, S_f = {s_f^1, s_f^2, …, s_f^N}, and the spatial position of pedestrian n in frame f is s_f^n = (x_f^n, y_f^n); 1 ≤ f ≤ F is the frame number, 1 ≤ n ≤ N is the pedestrian number, and (x, y) is a two-dimensional plane coordinate point of the pedestrian;
The time period [ΔT, T+ΔT] contains T_0+T_1 frames of consecutive position coordinates of the N pedestrians; the first T_0 frames of data construct a multidimensional training input sample, and the last T_1 frames of data construct a multidimensional training output reference sample, where the training input sample of the n-th pedestrian is ((x_{n,1}, y_{n,1}), …, (x_{n,T_0}, y_{n,T_0})) and the training output reference sample of the n-th pedestrian is ((x_{n,T_0+1}, y_{n,T_0+1}), …, (x_{n,T_0+T_1}, y_{n,T_0+T_1})); (x_{n,i}, y_{n,i}) denotes the position coordinates of the n-th pedestrian at the i-th frame, i = 1, 2, …, T_0+T_1;
Varying the start time ΔT, i.e. ΔT = mT_fra with 0 ≤ m ≤ M−1, where T_fra is the time interval between adjacent frames, yields a total of M multidimensional training input samples forming the training input sample set X and M multidimensional training output reference samples forming the training output reference set X';
The test data set is preprocessed to generate the test input sample set Y, containing K samples, in the same way that the training data set S is preprocessed to generate the training input sample set X.
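The sliding-window preprocessing above can be sketched as follows. This is an illustrative sketch only: the function name, array layout (frames × pedestrians × coordinates), and the random stand-in data are our assumptions, not the patent's.

```python
import numpy as np

def make_samples(S, T0, T1, M):
    """Slide the start frame over S (F x N x 2) to build M input/reference sample pairs.

    Sketch of the step-S1 preprocessing: for each start offset m, the first T0
    frames form a training input sample and the next T1 frames form the
    training output reference sample.
    """
    F, N, _ = S.shape
    assert M + T0 + T1 - 1 <= F, "not enough frames for M windows"
    X = np.stack([S[m:m + T0] for m in range(M)])             # (M, T0, N, 2) observed tracks
    Xp = np.stack([S[m + T0:m + T0 + T1] for m in range(M)])  # (M, T1, N, 2) reference futures
    return X, Xp

S = np.random.rand(30, 4, 2)   # F=30 frames, N=4 pedestrians (synthetic stand-in)
X, Xp = make_samples(S, T0=8, T1=12, M=10)
print(X.shape, Xp.shape)  # (10, 8, 4, 2) (10, 12, 4, 2)
```

With the experiment's frame interval of 0.4 s, T0 = 8 and T1 = 12 correspond to observing 3.2 s and predicting 4.8 s per sample.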
The step S2 specifically comprises the steps of:
S21, extract the abscissa component X_x of the training input sample set X and the abscissa component Y_x of the test input sample set Y, treat them respectively as the source domain and target domain for domain adaptation, and obtain, for r different numbers d of principal eigenvectors, r groups of domain-adapted abscissa source-domain matrices X̃_x and domain-adapted abscissa target-domain matrices Ỹ_x;
S22, extract the ordinate component X_y of the training input sample set X and the ordinate component Y_y of the test input sample set Y, treat them respectively as the source domain and target domain for domain adaptation, and obtain, for the same r numbers d of principal eigenvectors, r groups of domain-adapted ordinate source-domain matrices X̃_y and domain-adapted ordinate target-domain matrices Ỹ_y;
S23, merge each group's domain-adapted abscissa source-domain matrix X̃_x and ordinate source-domain matrix X̃_y to obtain the r domain-adapted training input sample sets X̃; merge each group's domain-adapted abscissa target-domain matrix Ỹ_x and ordinate target-domain matrix Ỹ_y to obtain the r domain-adapted test input sample sets Ỹ.
Steps S21 and S22 may be performed in either order.
The domain adaptation processing procedure in step S21 specifically includes the steps of:
S211, stack the source-domain matrix X_x and the target-domain matrix Y_x and form the kernel matrix K = [X_x; Y_x][X_x; Y_x]^T, where the superscript T denotes the matrix transpose;
S212, construct the maximum mean discrepancy measure matrix M = ΛΛ^T, where Λ is the (MN+KN)-dimensional column vector whose first MN entries equal 1/(MN) and whose last KN entries equal −1/(KN), so that the squared maximum mean discrepancy between the projected source and target domains can be measured; M is the number of samples of the training input sample set X (and of the training output reference set X'), N is the total number of pedestrians, and K is the number of samples of the test input sample set Y;
S213, construct the centering matrix H = E − (1/(MN+KN)) 1^T 1, where E is the (MN+KN)-dimensional identity matrix and 1 is the (MN+KN)-dimensional all-ones row vector;
S214, perform eigenvalue decomposition of the matrix (KMK + μE)^{−1} KHK and extract the first d principal eigenvectors to construct the transfer matrix W, where μ is a balance factor;
S215, take the first MN column vectors of W^T K to form the MN × d domain-adapted abscissa source-domain matrix X̃_x, and the last KN column vectors of W^T K to form the KN × d domain-adapted abscissa target-domain matrix Ỹ_x;
S216, change the number d of principal eigenvectors r−1 times; after each change, rebuild the transfer matrix W according to steps S214 and S215 and construct a new domain-adapted abscissa source-domain matrix X̃_x and target-domain matrix Ỹ_x from it, thereby obtaining, for the r different values of d, r groups of different domain-adapted abscissa source-domain matrices X̃_x and target-domain matrices Ỹ_x.
The domain adaptation processing in step S22 is the same as in steps S211 to S216.
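Steps S211 to S216 follow the standard transfer component analysis recipe. A minimal linear-kernel sketch follows; the function and variable names, the linear kernel, and the random stand-in data are our assumptions, and the eigenvector selection (sorting by real part) is a simplification.

```python
import numpy as np

def tca(Xs, Xt, d, mu=0.01):
    """Linear-kernel transfer component analysis, as in steps S211-S215 (a sketch).

    Xs: (ns, T0) source-domain rows, Xt: (nt, T0) target-domain rows.
    Returns d-dimensional embeddings of source and target samples.
    """
    Z = np.vstack([Xs, Xt])
    ns, nt = len(Xs), len(Xt)
    n = ns + nt
    K = Z @ Z.T                                    # kernel matrix (S211)
    lam = np.r_[np.full(ns, 1.0 / ns), np.full(nt, -1.0 / nt)]
    Mmd = np.outer(lam, lam)                       # MMD measure matrix (S212)
    H = np.eye(n) - np.full((n, n), 1.0 / n)       # centering matrix (S213)
    # (S214): eigendecompose (K M K + mu E)^{-1} K H K, keep d leading eigenvectors
    A = np.linalg.solve(K @ Mmd @ K + mu * np.eye(n), K @ H @ K)
    w, V = np.linalg.eig(A)
    W = np.real(V[:, np.argsort(-np.real(w))[:d]])
    emb = (W.T @ K).T                              # (S215): rows are projected samples
    return emb[:ns], emb[ns:]

Xs = np.random.rand(20, 8)   # e.g. MN = 20 source rows of length T0 = 8
Xt = np.random.rand(12, 8)   # e.g. KN = 12 target rows
Es, Et = tca(Xs, Xt, d=4)
print(Es.shape, Et.shape)  # (20, 4) (12, 4)
```

The same routine is applied once to the abscissa components (S21) and once to the ordinate components (S22), and rerun for each of the r values of d.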
Further developed, in step S211, the abscissa component X_x of the training input sample set X, i.e. the source-domain matrix X_x, is the MN × T_0 matrix whose rows are the abscissa sequences (x_{n,1}, …, x_{n,T_0}) of the N pedestrians in each of the M training input samples; the abscissa component Y_x of the test input sample set Y, i.e. the target-domain matrix Y_x, is the corresponding KN × T_0 matrix built from the K test input samples.
In step S215, the corresponding domain-adapted abscissa source-domain matrix X̃_x has dimension MN × d and the domain-adapted abscissa target-domain matrix Ỹ_x has dimension KN × d; similarly, step S22 yields the MN × d domain-adapted ordinate source-domain matrix X̃_y and the KN × d domain-adapted ordinate target-domain matrix Ỹ_y.
In step S23, merging X̃_x and X̃_y yields the domain-adapted training input sample set X̃, each of whose samples x̃ pairs the adapted abscissa row and adapted ordinate row of one pedestrian; likewise, merging Ỹ_x and Ỹ_y yields the domain-adapted test input sample set Ỹ with samples ỹ.
based on this, step S3 specifically includes the steps of:
S31, rearrange the samples x̃ of the first domain-adapted training input sample set X̃ to build a rearranged set, in which each sample is a sequence of d matrices of dimension 2 × N, one matrix per adapted time index (1 ≤ t ≤ d) holding the adapted x- and y-coordinates of the N pedestrians;
S32, input the rearranged set to train the L-layer time-series convolutional network, taking the average displacement error between the network output and the training output reference set X' as the loss function for computing the training error during training; the weight values are learned with a stochastic gradient descent algorithm, and the trained prediction model is saved once the training termination condition, reaching the maximum number of training cycles, is met. In the convolution process of the time-series convolutional network, the first-layer output (with time index 1 ≤ t' ≤ T_1, coordinate index 1 ≤ i ≤ 2, and pedestrian index 1 ≤ j ≤ N) is obtained by convolving the network input with the first-layer convolution kernel, and each higher-layer output (1 ≤ l < L) is obtained by convolving the output of layer l with the kernel of layer l+1, where the weights of the different layers' convolution kernels are learned and η is the convolution kernel scale;
S33, repeat steps S31 to S32 for the remaining r−1 domain-adapted training input sample sets X̃, finally obtaining r trained prediction models.
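As a rough illustration of the layered convolution in step S32, the forward pass below convolves a (d, 2, N) input along the time axis layer by layer. This is not the patent's exact network: the per-layer 1-D kernels and random weights are our simplification, and the activation f(x) = max(0, x) + a·min(0, x) is the one stated later in the experiments.

```python
import numpy as np

def prelu(x, a=0.25):
    # activation used in the experiments: f(x) = max(0, x) + a * min(0, x)
    return np.maximum(0, x) + a * np.minimum(0, x)

def tcn_forward(g, weights, a=0.25):
    """Forward pass of an L-layer temporal convolution over a (d, 2, N) input.

    Each layer convolves the time axis with its kernel of scale eta (here a
    1-D filter per layer, an assumption of ours) and applies the activation.
    """
    c = g
    for w in weights:                      # one kernel per layer
        eta = len(w)
        T = c.shape[0] - eta + 1           # valid positions along the time axis
        c = prelu(np.stack([sum(w[k] * c[t + k] for k in range(eta))
                            for t in range(T)]), a)
    return c

g = np.random.rand(12, 2, 5)                      # d=12 steps, 2 coords, N=5 pedestrians
weights = [np.random.rand(2) for _ in range(3)]   # L=3 layers, kernel scale eta=2
out = tcn_forward(g, weights)
print(out.shape)  # (9, 2, 5): each eta=2 layer shortens the time axis by 1
```

In the patent's setting the network output length is matched to the T_1 reference frames and the kernels are learned by SGD against the average-displacement-error loss.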
The step S4 specifically comprises the steps of:
S41, rearrange the samples ỹ of the r domain-adapted test input sample sets Ỹ in the same way as step S31, constructing r rearranged sets;
S42, input the r rearranged sets respectively into the r prediction models trained in step S33, obtaining r different predicted paths Tra(1), Tra(2), …, Tra(r) for each pedestrian.
Then, for the r different predicted paths Tra(1), Tra(2), …, Tra(r) of pedestrian n, step S5 fuses them according to Tra = α_1 Tra(1) + α_2 Tra(2) + … + α_r Tra(r) to obtain the predicted trajectory Tra of pedestrian n, with α_1 + α_2 + … + α_r = 1, where α_j denotes the weight of the j-th predicted path Tra(j), j = 1, 2, …, r. In the same way, the predicted trajectories of all N pedestrians are obtained.
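The step-S5 fusion is a convex combination of the r predicted paths. A sketch with the experiment's weights (the array shape for a path, T_1 frames of (x, y), is our convention):

```python
import numpy as np

def fuse_paths(paths, alphas):
    """Step-S5 fusion: Tra = sum_j alpha_j * Tra(j), with the alphas summing to 1."""
    alphas = np.asarray(alphas)
    assert np.isclose(alphas.sum(), 1.0), "fusion weights must sum to 1"
    return sum(a * p for a, p in zip(alphas, paths))

# r = 3 predicted paths of one pedestrian, T1 = 12 frames of (x, y)
paths = [np.full((12, 2), v) for v in (1.0, 2.0, 3.0)]
tra = fuse_paths(paths, [0.2, 0.3, 0.5])
print(tra[0])  # [2.3 2.3]: 0.2*1 + 0.3*2 + 0.5*3
```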
The embodiment of the invention provides a pedestrian motion trajectory prediction method based on domain adaptation technology. The method first collects a training data set and a test data set and preprocesses them to obtain a training input sample set X, a training output reference set X', and a test input sample set Y; it then performs domain adaptation on the two input sample sets X and Y with r different domain adaptation parameters to obtain the domain-adapted data sets X̃ and Ỹ; based on X̃ and X', r time-series convolutional networks are constructed and trained to obtain r prediction models; the r sets Ỹ are then input into the r prediction models for prediction, obtaining r predicted paths for each of the N pedestrians; finally, the r predicted paths of each pedestrian are fused to obtain the optimal predicted trajectory of each pedestrian. The method takes account of the statistical requirement that training-scene data and actual-application-scene data be identically distributed: through domain adaptation processing their distributions are made approximately consistent, which enhances the generalization capability of the prediction models. The method also exploits the multi-scale, multi-resolution depth information of the observed trajectory: by varying the domain adaptation parameter, the observed trajectory is mapped into training input trajectories of different scales, so that deeper information of the observed trajectory is utilized and the final predicted trajectory is more accurate overall.
In a specific experiment, the training data set in step S1 comprises 5 different scene data sets: the two ETH data sets (eth, hotel) and the three UCY data sets (univ, zara1, zara2). Table 1 is an example of a single-frame data fragment; the training data set consists of multi-frame data, and each frame of data is marked with a pedestrian number, a frame number, and the pedestrian's x and y coordinates. The frame interval is 0.4 seconds; the observed sample is a 3.2-second pedestrian track, corresponding to T_0 = 8 frames, and the following 4.8-second trajectory is predicted, corresponding to T_1 = 12 frames.
Table 1 single frame data fragment example
Frame number | Pedestrian number | Pedestrian x coordinate | Pedestrian y coordinate
10 | 1.0 | 10.7867577985 | 3.67631555479
10 | 2.0 | 10.9587077931 | 3.15460523261
10 | 3.0 | 10.9993275592 | 2.64673717882
The balance factor in the domain adaptation algorithm of step S2 is μ = 0.01; the number of principal eigenvectors is d = 12 for the first domain adaptation, d = 10 for the second, and d = 6 for the third, i.e. the parameter is changed to give r = 3 settings.
In step S3, the time-series convolutional network uses the leaky-rectifier activation f(x) = x for x ≥ 0 and f(x) = ax for x < 0, where a = 0.25; the convolution kernel size is 2×2 and the total number of layers is L = 5. The model is implemented with the PyTorch library: each convolutional layer uses the two-dimensional convolution function Conv2d, the loss function is the average displacement error (ADE), each training batch holds 128 samples, and the model is trained for 100 epochs with stochastic gradient descent (SGD) at a learning rate of 0.01.
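A minimal PyTorch sketch of this training setup. The channel count, padding, and output head are illustrative assumptions (the patent does not specify them), and the activation is taken to be the leaky rectifier with slope a = 0.25:

```python
import torch
import torch.nn as nn

class TCN(nn.Module):
    """L = 5 layers of 2x2 Conv2d, each followed by the a = 0.25 leaky rectifier."""
    def __init__(self, layers=5, channels=16, a=0.25, t1=12):
        super().__init__()
        blocks, in_ch = [], 1
        for _ in range(layers):
            blocks += [nn.Conv2d(in_ch, channels, kernel_size=2, padding=1),
                       nn.LeakyReLU(a)]            # f(x) = x if x >= 0 else a*x
            in_ch = channels
        self.body = nn.Sequential(*blocks)
        # Assumed head: collapse channels, resample to (x, y) for T1 = 12 frames
        self.head = nn.Sequential(nn.Conv2d(channels, 1, kernel_size=1),
                                  nn.AdaptiveAvgPool2d((2, t1)))

    def forward(self, z):                # z: (batch, 1, 2, T0) coordinates
        return self.head(self.body(z))   # -> (batch, 1, 2, T1)

def ade(pred, ref):
    # average displacement error: mean Euclidean distance per frame
    return ((pred - ref) ** 2).sum(dim=2).sqrt().mean()

model = TCN()
opt = torch.optim.SGD(model.parameters(), lr=0.01)   # SGD, learning rate 0.01
x = torch.randn(128, 1, 2, 8)                        # one batch of 128 samples
loss = ade(model(x), torch.randn(128, 1, 2, 12))
loss.backward(); opt.step()                          # one of 100 training epochs
```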
In step S5, the fusion parameters are α_1 = 0.2, α_2 = 0.3, α_3 = 0.5.
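The weighted fusion of the r = 3 predicted paths with these parameters can be sketched as follows; `fuse` is a hypothetical helper name:

```python
import numpy as np

def fuse(paths, weights=(0.2, 0.3, 0.5)):
    """Fuse r predicted paths, each of shape (T1, 2), with weights summing to 1."""
    return sum(w * np.asarray(p, dtype=float) for w, p in zip(weights, paths))
```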
In step S1, the test data set is also taken from scenes 1 to 5. To avoid overlap between the training set and the test set, 5 alternating train/test splits are used: 1. training set (hotel, univ, zara1, zara2), test set (eth); 2. training set (eth, univ, zara1, zara2), test set (hotel); 3. training set (eth, hotel, zara1, zara2), test set (univ); 4. training set (eth, hotel, univ, zara2), test set (zara1); 5. training set (eth, hotel, univ, zara1), test set (zara2).
Finally, the average displacement error ADE and the final displacement error FDE between the optimal predicted trajectory and the actual trajectory in the test set are computed to evaluate the prediction performance. ADE and FDE are defined as:

    ADE = (1/(N·T_1)) Σ_{n=1}^{N} Σ_{t=1}^{T_1} ‖(x̂_{n,t}, ŷ_{n,t}) − (x_{n,t}, y_{n,t})‖_2

    FDE = (1/N) Σ_{n=1}^{N} ‖(x̂_{n,T_1}, ŷ_{n,T_1}) − (x_{n,T_1}, y_{n,T_1})‖_2

where (x̂_{n,t}, ŷ_{n,t}) and (x_{n,t}, y_{n,t}) are the predicted and actual coordinates of pedestrian n at frame t, and (x̂_{n,T_1}, ŷ_{n,T_1}) and (x_{n,T_1}, y_{n,T_1}) are the predicted and actual coordinates at the last frame of pedestrian n's trajectory.
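The two metrics can be computed directly; `ade_fde` is a hypothetical helper operating on arrays of shape (N, T_1, 2):

```python
import numpy as np

def ade_fde(pred, true):
    """pred, true: arrays of shape (N, T1, 2) of predicted / actual coordinates."""
    dist = np.linalg.norm(np.asarray(pred, float) - np.asarray(true, float),
                          axis=-1)           # per-pedestrian, per-frame distance
    return dist.mean(), dist[:, -1].mean()   # ADE over all frames, FDE on last
```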
The average ADE and FDE values of the present invention over the 5 different scenes were compared with other mainstream methods (Linear, S-LSTM, S-GAN-P, SoPhie); the results are shown in Table 2.
TABLE 2  Comparison of ADE/FDE metrics with existing mainstream methods

    Method      eth         hotel       univ        zara1       zara2       Average
    Linear      1.33/2.94   0.39/0.72   0.82/1.59   0.62/1.21   0.77/1.48   0.79/1.59
    S-LSTM      1.09/2.35   0.79/1.76   0.67/1.40   0.47/1.00   0.56/1.17   0.72/1.54
    S-GAN-P     0.87/1.62   0.67/1.37   0.76/1.52   0.35/0.68   0.42/0.84   0.61/1.21
    SoPhie      0.70/1.43   0.76/1.67   0.54/1.24   0.30/0.63   0.38/0.78   0.54/1.15
    Proposed    0.70/1.28   0.39/0.63   0.49/0.90   0.39/0.66   0.32/0.50   0.45/0.79
As Table 2 shows, the average ADE and FDE values of the present invention over the 5 different scenes are significantly better than those of the 4 mainstream methods.
The above examples are preferred embodiments of the present invention, but the embodiments of the present invention are not limited thereto; any change, modification, substitution, combination, or simplification that does not depart from the spirit and principle of the present invention is an equivalent replacement and falls within the protection scope of the present invention.

Claims (8)

1. A pedestrian motion trajectory prediction method based on domain adaptation technology, characterized by comprising the following steps:
S1, acquiring the motion trajectories of pedestrians in a training scene and a test scene, generating a corresponding training data set and test data set, preprocessing the training data set to generate a training input sample set X and a training output reference set X', and preprocessing the test data set to generate a test input sample set Y;
S2, performing domain adaptation processing on the training input sample set X and the test input sample set Y based on r different domain adaptation parameters to obtain r different domain-adapted training input sample sets X̂ and r different domain-adapted test input sample sets Ŷ;
S3, constructing r time-series convolutional networks based on the r domain-adapted training input sample sets X̂ and the training output reference set X', and training them to obtain r corresponding different prediction models;
S4, inputting the r domain-adapted test input sample sets Ŷ into the corresponding r prediction models to predict, obtaining r predicted paths for each pedestrian;
S5, fusing the r predicted paths of each pedestrian to obtain the predicted trajectory of each pedestrian.
2. The method for predicting the motion trajectory of a pedestrian based on the domain adaptation technique according to claim 1, wherein the step S2 specifically comprises the steps of:
S21, extracting the abscissa component X_x of the training input sample set X and the abscissa component Y_x of the test input sample set Y, treating them respectively as the source domain and target domain for domain adaptation, and obtaining, based on r different numbers d of principal eigenvectors, r groups of domain-adapted abscissa source domain matrices X̂_x and domain-adapted abscissa target domain matrices Ŷ_x;
S22, extracting the ordinate component X_y of the training input sample set X and the ordinate component Y_y of the test input sample set Y, treating them respectively as the source domain and target domain for domain adaptation, and obtaining, based on r different numbers d of principal eigenvectors, r groups of domain-adapted ordinate source domain matrices X̂_y and domain-adapted ordinate target domain matrices Ŷ_y;
S23, merging each group's domain-adapted abscissa source domain matrix X̂_x and domain-adapted ordinate source domain matrix X̂_y to obtain the r domain-adapted training input sample sets X̂, and merging each group's domain-adapted abscissa target domain matrix Ŷ_x and domain-adapted ordinate target domain matrix Ŷ_y to obtain the r groups of domain-adapted test input sample sets Ŷ;
Steps S21 and S22 may be performed in either order.
3. The method for predicting the motion trajectory of a pedestrian based on the domain adaptation technique according to claim 2, wherein the domain adaptation processing in step S21 specifically comprises the steps of:
S211, splicing the source domain matrix X_x and the target domain matrix Y_x into Z = [X_x; Y_x] and computing the kernel matrix K = ZZ^T, where the superscript T denotes the matrix transpose;
S212, constructing the maximum mean discrepancy measure matrix M = vv^T/‖vv^T‖_F from the (MN+KN)-dimensional vector v whose first MN entries equal 1/(MN) and whose last KN entries equal −1/(KN); here M (as a scalar) is the number of samples of the training input sample set X, i.e. of the training output reference set X', N is the total number of pedestrians, K is the number of samples of the test input sample set Y, and ‖·‖_F denotes the Frobenius norm of a matrix;
S213, constructing the centering matrix H = E − (1/(MN+KN))·1^T·1, where E is the (MN+KN)-dimensional identity matrix and 1 is the (MN+KN)-dimensional all-ones row vector;
S214, performing eigenvalue decomposition on the matrix (KMK + μE)^{-1}KHK and extracting the first d principal eigenvectors to construct the transfer matrix W, where μ is the balance factor;
S215, extracting the first MN column vectors of W^T K to form the MN×d domain-adapted abscissa source domain matrix X̂_x, and extracting the last KN column vectors of W^T K to form the KN×d domain-adapted abscissa target domain matrix Ŷ_x;
S216, changing the number d of principal eigenvectors r−1 times, reconstructing the transfer matrix W according to steps S214 and S215 after each change, and constructing from the reconstructed W a new domain-adapted abscissa source domain matrix X̂_x and target domain matrix Ŷ_x, thereby obtaining, for the r different numbers d of principal eigenvectors, r groups of different domain-adapted abscissa source domain matrices X̂_x and domain-adapted abscissa target domain matrices Ŷ_x;
The domain adaptation processing in step S22 is the same as in steps S211 to S216.
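Steps S211 to S216 amount to a transfer-component-analysis style projection. The sketch below assumes a linear kernel K = ZZ^T and normalizes the MMD matrix by its Frobenius norm; `domain_adapt` is a hypothetical helper name, not from the patent text:

```python
import numpy as np

def domain_adapt(Xs, Xt, d, mu=0.01):
    """Project source rows Xs (ms, p) and target rows Xt (mt, p) into a
    shared d-dimensional space (steps S211-S216, linear-kernel assumption)."""
    ms, mt = len(Xs), len(Xt)
    Z = np.vstack([Xs, Xt])
    K = Z @ Z.T                                   # kernel matrix (S211)
    v = np.r_[np.full(ms, 1.0 / ms), np.full(mt, -1.0 / mt)]
    M = np.outer(v, v)
    M = M / np.linalg.norm(M)                     # MMD measure matrix (S212)
    n = ms + mt
    H = np.eye(n) - np.ones((n, n)) / n           # centering matrix (S213)
    # eigen-decompose (KMK + mu*E)^(-1) KHK and keep the d principal vectors (S214)
    A = np.linalg.solve(K @ M @ K + mu * np.eye(n), K @ H @ K)
    vals, vecs = np.linalg.eig(A)
    W = np.real(vecs[:, np.argsort(-np.real(vals))[:d]])
    P = K @ W          # rows of K W = columns of W^T K, one adapted sample each (S215)
    return P[:ms], P[ms:]
```

Running this once per value of d (here d = 12, 10, 6) yields the r = 3 groups of adapted source and target matrices described in step S216.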
4. The method for predicting motion trajectories of pedestrians based on domain adaptation technology as set forth in claim 3, wherein in step S1 the training data set S comprises F frames of position coordinates formed by sampling the motion trajectories of N pedestrians at equal time intervals over a long period, i.e. S = {S_1, S_2, …, S_F}, where one frame of data S_f records the spatial positions of the N pedestrians at one instant, i.e. S_f = {p_{f,1}, p_{f,2}, …, p_{f,N}}, the spatial position of pedestrian n in frame f being p_{f,n} = (x, y); 1 ≤ f ≤ F is the frame number, 1 ≤ n ≤ N is the pedestrian number, and (x, y) is the pedestrian's two-dimensional plane coordinate point;
The time period [ΔT, T+ΔT] contains T_0+T_1 frames of consecutive position coordinates of the N pedestrians; the first T_0 frames of data construct a multidimensional training input sample X_m = {X_m^1, X_m^2, …, X_m^N}, and the last T_1 frames of data construct a multidimensional training output reference sample X'_m = {X'^1_m, X'^2_m, …, X'^N_m}, where the training input sample of the n-th pedestrian is X_m^n = {(x_{n,1}, y_{n,1}), …, (x_{n,T_0}, y_{n,T_0})}, the training output reference sample of the n-th pedestrian is X'^n_m = {(x_{n,T_0+1}, y_{n,T_0+1}), …, (x_{n,T_0+T_1}, y_{n,T_0+T_1})}, and (x_{n,i}, y_{n,i}) denotes the position coordinates of the n-th pedestrian at the i-th frame, i = 1, 2, …, T_0+T_1;
Varying the start time ΔT, i.e. ΔT = mT_fra with 0 ≤ m ≤ M−1, where T_fra is the time interval between adjacent frames, a total of M multidimensional training input samples form the training input sample set X = {X_0, X_1, …, X_{M−1}}, and M multidimensional training output reference samples form the training output reference set X' = {X'_0, X'_1, …, X'_{M−1}};
Preprocessing the test data set to generate the test input sample set Y = {Y_0, Y_1, …, Y_{K−1}} proceeds in the same way as preprocessing the training data set S to generate the training input sample set X.
5. The method for predicting a pedestrian motion trajectory based on the domain adaptation technique according to claim 4, wherein in step S211 the abscissa component X_x of the training input sample set X, i.e. the source domain matrix X_x, is the MN×T_0 matrix whose rows are the abscissa sequences (x_{n,1}, …, x_{n,T_0}) of each pedestrian n in each of the M training input samples, and the abscissa component Y_x of the test input sample set Y, i.e. the target domain matrix Y_x, is the KN×T_0 matrix formed in the same way from the K test input samples;
In step S215, the corresponding domain-adapted abscissa source domain matrix X̂_x is the MN×d matrix formed by the first MN column vectors of W^T K, and the corresponding domain-adapted abscissa target domain matrix Ŷ_x is the KN×d matrix formed by the last KN column vectors of W^T K;
Similarly, step S22 yields the domain-adapted ordinate source domain matrix X̂_y and the corresponding domain-adapted ordinate target domain matrix Ŷ_y, constructed from the ordinate components X_y and Y_y in the same way;
In step S23, merging X̂_x and X̂_y yields the domain-adapted training input sample set X̂, each of whose samples pairs the corresponding rows of X̂_x and X̂_y; likewise, merging Ŷ_x and Ŷ_y yields the domain-adapted test input sample set Ŷ, each of whose samples pairs the corresponding rows of Ŷ_x and Ŷ_y.
6. The method for predicting the motion trajectory of a pedestrian based on the domain adaptation technique according to claim 5, wherein step S3 specifically comprises the following steps:
S31, rearranging the samples of the first domain-adapted training input sample set X̂ to build a set Z = {Z_0, Z_1, …, Z_{M−1}}, where each sample Z_m stacks the domain-adapted abscissa and ordinate components of the N pedestrians;
S32, inputting Z into the L-layer time-series convolutional network for training; during training, the average displacement error between the output of the L-layer time-series convolutional network and the training output reference set X' is used as the loss function for computing the training error, the weights are learned with a stochastic gradient descent algorithm, and the trained prediction model is saved once the training termination condition, reaching the maximum number of training cycles, is satisfied. The convolution process of the time-series convolutional network is described as follows: with the input denoted z_0^{(t)}, a 2×N-dimensional matrix per step t, the first-layer output is z_1^{(t)} = f(w_1 ∗ [z_0^{(t−η+1)}, …, z_0^{(t)}]) and, correspondingly, the higher-layer outputs are z_l^{(t)} = f(w_l ∗ [z_{l−1}^{(t−η+1)}, …, z_{l−1}^{(t)}]), l = 2, …, L, where w_l denotes the convolution kernel weight of layer l, η is the convolution kernel scale, and f(·) is the activation function;
S33, repeating steps S31 to S32 for the remaining r−1 domain-adapted training input sample sets X̂, finally obtaining r trained prediction models.
7. The method for predicting the motion trajectory of a pedestrian based on the domain adaptation technique according to claim 6, wherein the step S4 specifically comprises the steps of:
S41, rearranging the samples of the r domain-adapted test input sample sets Ŷ, in the manner of step S31, to construct r rearranged sets;
S42, inputting the r rearranged sets into the r prediction models trained in step S33 respectively, obtaining r different predicted paths Tra(1), Tra(2), …, Tra(r) for each pedestrian.
8. The pedestrian motion trajectory prediction method based on domain adaptation technology of claim 7, wherein, for the r different predicted paths Tra(1), Tra(2), …, Tra(r) of pedestrian n, step S5 fuses them according to Tra = α_1Tra(1) + α_2Tra(2) + … + α_rTra(r) to obtain the predicted trajectory Tra of pedestrian n, where α_1 + α_2 + … + α_r = 1 and α_j denotes the weight of the j-th predicted path Tra(j), j = 1, 2, …, r.
CN202210364770.9A 2022-04-08 2022-04-08 Pedestrian motion trail prediction method based on domain adaptation technology Active CN114723784B (en)

Publications (2)

Publication Number Publication Date
CN114723784A CN114723784A (en) 2022-07-08
CN114723784B true CN114723784B (en) 2024-06-14
