CN114723784A - Pedestrian motion trajectory prediction method based on domain adaptation technology - Google Patents

Pedestrian motion trajectory prediction method based on domain adaptation technology

Info

Publication number
CN114723784A
Authority
CN
China
Prior art keywords
domain
training
matrix
pedestrian
input sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210364770.9A
Other languages
Chinese (zh)
Other versions
CN114723784B (en)
Inventor
张小恒 (Zhang Xiaoheng)
刘书君 (Liu Shujun)
李勇明 (Li Yongming)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University
Original Assignee
Chongqing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University filed Critical Chongqing University
Priority to CN202210364770.9A priority Critical patent/CN114723784B/en
Publication of CN114723784A publication Critical patent/CN114723784A/en
Application granted granted Critical
Publication of CN114723784B publication Critical patent/CN114723784B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/20: Analysis of motion
    • G06T7/246: Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/049: Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/20: Special algorithmic details
    • G06T2207/20081: Training; Learning
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/20: Special algorithmic details
    • G06T2207/20084: Artificial neural networks [ANN]
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/30: Subject of image; Context of image processing
    • G06T2207/30241: Trajectory
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00: Road transport of goods or passengers
    • Y02T10/10: Internal combustion engine [ICE] based vehicles
    • Y02T10/40: Engine management systems


Abstract

The invention relates to the technical field of automatic driving, and particularly discloses a pedestrian motion trajectory prediction method based on a domain adaptation technology. The method first acquires and preprocesses a training data set and a test data set to obtain a training input sample set X, a training output reference set X′ and a test input sample set Y, and performs domain adaptation on X and Y with r different domain adaptation parameters to obtain the domain-adapted sets X~(1), …, X~(r) and Y~(1), …, Y~(r). It then constructs r time series convolution networks trained on X~(1), …, X~(r) and X′ to obtain r prediction models, inputs Y~(1), …, Y~(r) into the r prediction models to predict r predicted paths for each of the N pedestrians, and finally performs path fusion to obtain the optimal predicted trajectory of each pedestrian. The domain adaptation processing makes the data distributions of the training scene and the actual application scene approximately consistent, which enhances the generalization capability of the prediction model; by changing the domain adaptation parameters, the observation trajectory is mapped into training input trajectories of different scales, so that deeper information of the observation trajectory is exploited and the predicted trajectory is more accurate.

Description

Pedestrian motion trajectory prediction method based on domain adaptation technology
Technical Field
The invention relates to the technical field of automatic driving, in particular to a pedestrian motion trail prediction method based on a domain adaptation technology.
Background
With the popularization of intelligent transportation, predicting the motion trajectory of pedestrians becomes an increasingly important issue. In unmanned or assisted driving, accurate prediction of pedestrian trajectories helps the driving system plan the motion trajectory of a vehicle in a traffic environment in advance. Current research in this field mainly falls into two lines: early work modeled the social attributes of pedestrian motion with various social force models; in recent years, modeling has been combined with deep learning methods such as the Social long short-term memory network (Social LSTM), the Social generative adversarial network (Social GAN) and graph neural networks. The following problems remain:
(1) the construction of a social force model is subjective, and it is difficult to accurately describe the complex actual motion rules of pedestrians;
(2) deep-learning-based methods depend on massive training data, but when the training scenes do not match the actual application scenes, the generalization capability of the model is poor;
(3) the observation trajectory is generally sampled at a fixed time interval, and it is difficult to exploit the multi-scale, multi-resolution depth information of sampling points at different time intervals.
Therefore, the invention performs domain adaptation processing on the training scene and the test scene so that their distributions are approximately consistent, and fuses sequences of sampling points at different time intervals, thereby predicting the pedestrian trajectory more accurately.
Disclosure of Invention
The invention provides a pedestrian motion trajectory prediction method based on a domain adaptation technology, which addresses the technical problem of how to predict pedestrian trajectories more accurately in crowded environments.
In order to solve the above technical problems, the present invention provides a method for predicting a pedestrian motion trajectory based on a domain adaptation technique, comprising the steps of:
S1, acquiring pedestrian motion trajectories in a training scene and a test scene to generate a corresponding training data set and test data set, preprocessing the training data set to generate a training input sample set X and a training output reference set X′, and preprocessing the test data set to generate a test input sample set Y;
S2, based on r different domain adaptation parameters, performing domain adaptation processing on the training input sample set X and the test input sample set Y to obtain r different domain-adapted training input sample sets X~(1), …, X~(r) and r different domain-adapted test input sample sets Y~(1), …, Y~(r);
S3, constructing r time series convolution networks from the r domain-adapted training input sample sets X~(1), …, X~(r) and the training output reference set X′ and training them, to obtain r corresponding prediction models;
S4, inputting the r domain-adapted test input sample sets Y~(1), …, Y~(r) correspondingly into the r prediction models for prediction, to obtain r predicted paths for each pedestrian;
and S5, fusing the r predicted paths of each pedestrian to obtain the predicted trajectory of each pedestrian.
Further, the step S2 specifically includes the steps of:
S21, extracting the abscissa component X_x of the training input sample set X and the abscissa component Y_x of the test input sample set Y, regarding them respectively as the source domain and the target domain for domain adaptation processing, and obtaining, for r different numbers d of principal eigenvectors, r groups of domain-adapted abscissa source domain matrices X~x(1), …, X~x(r) and domain-adapted abscissa target domain matrices Y~x(1), …, Y~x(r);
S22, extracting the ordinate component X_y of the training input sample set X and the ordinate component Y_y of the test input sample set Y, regarding them respectively as the source domain and the target domain for domain adaptation processing, and obtaining, for the same r numbers d of principal eigenvectors, r groups of domain-adapted ordinate source domain matrices X~y(1), …, X~y(r) and domain-adapted ordinate target domain matrices Y~y(1), …, Y~y(r);
S23, merging the domain-adapted abscissa source domain matrix X~x(k) and the domain-adapted ordinate source domain matrix X~y(k) of the same group k to obtain the r domain-adapted training input sample sets X~(1), …, X~(r), and merging the domain-adapted abscissa target domain matrix Y~x(k) and the domain-adapted ordinate target domain matrix Y~y(k) of the same group k to obtain the r domain-adapted test input sample sets Y~(1), …, Y~(r).
Step S21 need not precede step S22.
The domain adaptation processing procedure in step S21 specifically includes the steps of:
S211, splicing the source domain matrix X_x and the target domain matrix Y_x and computing the kernel matrix K = ZZ^T, where Z = [X_x; Y_x] and the superscript T denotes matrix transposition;
S212, constructing the maximum mean discrepancy measure matrix M = m^T m from the (MN+KN)-dimensional row vector m whose first MN entries equal 1/(MN) and whose last KN entries equal -1/(KN), where M is the number of samples in the training input sample set X (equivalently, in the training output reference set X′), N is the total number of pedestrians, K is the number of samples in the test input sample set Y, and ||·||_F denotes the F norm of a matrix;
S213, constructing the centering matrix H = E - (1/(MN+KN))·1^T·1, where E is the (MN+KN)-dimensional identity matrix and 1 is the (MN+KN)-dimensional all-ones row vector;
S214, performing eigenvalue decomposition on the matrix (KMK + μE)^(-1)·KHK and extracting the first d principal eigenvectors to construct the transfer matrix W, where μ is a balance factor (here K, M and H denote the kernel, measure and centering matrices, not the sample counts);
S215, extracting the first MN column vectors of W^T·K to form, as rows, the MN×d domain-adapted abscissa source domain matrix X~x, and extracting the last KN column vectors of W^T·K to form, as rows, the KN×d domain-adapted abscissa target domain matrix Y~x;
S216, changing the number d of principal eigenvectors r-1 times, after each change reconstructing the transfer matrix W according to steps S214 and S215 and constructing a new domain-adapted abscissa source domain matrix and target domain matrix from it, thereby obtaining, for the r different numbers d of principal eigenvectors, the r groups of domain-adapted abscissa source domain matrices X~x(1), …, X~x(r) and target domain matrices Y~x(1), …, Y~x(r).
The domain adaptation processing procedure in step S22 is the same as steps S211 to S216.
Further, in step S1, the training data set S includes F frames of position coordinates obtained by sampling the motion trajectories of N pedestrians at equal time intervals over a long period, i.e. S = {P_1, P_2, …, P_F}, where one frame of data P_f = {p_f^1, p_f^2, …, p_f^N} gives the spatial positions of the N pedestrians at a certain moment, the spatial position of pedestrian n in frame f being denoted p_f^n = (x_f^n, y_f^n); 1 ≤ f ≤ F is the frame number, 1 ≤ n ≤ N is the pedestrian number, and (x, y) is a two-dimensional plane coordinate point of a pedestrian;
the time period [Δt, T+Δt] contains T_0+T_1 frames of sequential position coordinates of the N pedestrians; the front T_0 frames of data construct a multi-dimensional training input sample X_m = {X_m^1, …, X_m^N} and the rear T_1 frames construct a multi-dimensional training output reference sample X′_m = {X′_m^1, …, X′_m^N}, where the training input sample of the nth pedestrian is X_m^n = ((x_{n,1}, y_{n,1}), …, (x_{n,T_0}, y_{n,T_0})), the training output reference sample of the nth pedestrian is X′_m^n = ((x_{n,T_0+1}, y_{n,T_0+1}), …, (x_{n,T_0+T_1}, y_{n,T_0+T_1})), and (x_{n,i}, y_{n,i}) is the position coordinate of the nth pedestrian in the ith frame, i = 1, 2, …, T_0+T_1;
by varying the starting time Δt, i.e. Δt = mT_fra with 0 ≤ m ≤ (M-1), where T_fra is the adjacent-frame time interval, M multi-dimensional training input samples are obtained, forming the training input sample set X = {X_1, …, X_M}, and M multi-dimensional training output reference samples form the training output reference set X′ = {X′_1, …, X′_M};
the process of preprocessing the test data set to generate the test input sample set Y = {Y_1, …, Y_K} is the same as the process of preprocessing the training data set S to generate the training input sample set X.
Further, in step S211, the abscissa component X_x of the training input sample set X, i.e. the source domain matrix X_x, is the MN×T_0 matrix whose rows are the abscissa sequences (x_{n,1}, …, x_{n,T_0}) of every pedestrian n = 1, …, N in every sample m = 1, …, M; the abscissa component Y_x of the test input sample set Y, i.e. the target domain matrix Y_x, is the corresponding KN×T_0 matrix built from the test samples.
In step S215, the corresponding domain-adapted abscissa source domain matrix X~x is the MN×d matrix whose rows are the adapted abscissa sequences (x~_{n,1}, …, x~_{n,d}), and the corresponding domain-adapted abscissa target domain matrix Y~x is the KN×d matrix formed in the same way from the last KN columns of W^T·K.
Similarly, in step S22 the domain-adapted ordinate source domain matrix X~y is the MN×d matrix of adapted ordinate sequences (y~_{n,1}, …, y~_{n,d}), and the corresponding domain-adapted ordinate target domain matrix Y~y is the KN×d matrix built from the test samples.
In step S23, merging X~x and X~y yields the domain-adapted training input sample set X~ = {X~_1, …, X~_M}, whose sample X~_m = {X~_m^1, …, X~_m^N} with X~_m^n = ((x~_{n,1}, y~_{n,1}), …, (x~_{n,d}, y~_{n,d})); in the same way, merging Y~x and Y~y yields the domain-adapted test input sample set Y~ = {Y~_1, …, Y~_K}, whose samples are formed from the adapted coordinate pairs of the K test samples.
further, the step S3 specifically includes the steps of:
s31, adapting the first domain to the training input sample set
Figure BDA00035866051100000517
Sample of (1)
Figure BDA00035866051100000518
Rearranging construct sets
Figure BDA0003586605110000061
Wherein the sample
Figure BDA0003586605110000062
S32, input
Figure BDA0003586605110000063
Training the L-layer time sequence convolution network, taking the average displacement error between the output of the L-layer time sequence convolution network and a training output reference set X' as a loss function for calculating a training error in the training process, learning a weight value by using a random gradient descent algorithm, and storing a trained prediction model when the training termination condition is that the maximum training period number is met; wherein the convolution process of the time series convolution network is described as follows:
Figure BDA0003586605110000064
in
Figure BDA0003586605110000065
Is a2 XN dimensional matrix, will
Figure BDA0003586605110000066
Inputting L-layer time sequence convolution network to obtain first-layer output
Figure BDA0003586605110000067
1≤t≤d,1≤t′≤T1I is more than or equal to 1 and less than or equal to 2, j is more than or equal to 1 and less than or equal to N and higher layer output
Figure BDA0003586605110000068
1≤t′<T1L is more than or equal to 1 and less than L, and correspondingly obtained
Figure BDA0003586605110000069
First layer output of (2)
Figure BDA00035866051100000610
And higher layer output
Figure BDA00035866051100000611
The weights of the convolution kernels of the different layers, η the scale of the convolution kernel,
Figure BDA00035866051100000612
representing the input of the L-layer time series convolution network;
s33, remaining r-1 field adaptation training input sample set
Figure BDA00035866051100000613
And repeating the steps S31 to S32 to finally obtain r trained prediction models.
Further, the step S4 specifically includes the steps of:
S41, rearranging the samples Y~_k of the r domain-adapted test input sample sets Y~(1), …, Y~(r) to construct r sets Q(1), …, Q(r) in the same way as in step S31, where each Q_k(t) is a 2×N dimensional matrix;
S42, inputting the r sets Q(1), …, Q(r) correspondingly into the r prediction models trained in step S33, obtaining r different predicted paths Tra(1), Tra(2), …, Tra(r) for each pedestrian.
Further, for the r different predicted paths Tra(1), Tra(2), …, Tra(r) of pedestrian n, step S5 obtains the predicted trajectory Tra of pedestrian n according to Tra = α_1·Tra(1) + α_2·Tra(2) + … + α_r·Tra(r), where the weights satisfy α_1 + α_2 + … + α_r = 1 and α_j denotes the weight of the jth predicted path Tra(j), j = 1, 2, …, r.
The invention provides a pedestrian motion trajectory prediction method based on a domain adaptation technology. A training data set and a test data set are first acquired and preprocessed to obtain a training input sample set X, a training output reference set X′ and a test input sample set Y; domain adaptation is then performed on the two input sample sets X and Y with r different domain adaptation parameters to obtain the domain-adapted sets X~(1), …, X~(r) and Y~(1), …, Y~(r); r time series convolution networks are then constructed from X~(1), …, X~(r) and X′ and trained to obtain r prediction models; Y~(1), …, Y~(r) are then input into the r prediction models to obtain r predicted paths for the N pedestrians, and finally the r predicted paths of each pedestrian are fused to obtain the optimal predicted trajectory of each pedestrian. Training scene data and actual application scene data do not satisfy the independent-and-identically-distributed statistical property; the domain adaptation processing makes them approximately consistent and thereby enhances the generalization capability of the prediction model. The invention also takes the multi-scale, multi-resolution depth information of the observation trajectory into account: by changing the domain adaptation parameters, the observation trajectory is mapped into training input trajectories of different scales, so that deeper information of the observation trajectory is exploited and the final predicted trajectory is more accurate as a whole.
Drawings
Fig. 1 is a flowchart of a method for predicting a pedestrian motion trajectory based on a domain adaptation technology according to an embodiment of the present invention.
Detailed Description
The embodiments of the present invention are described in detail below with reference to the accompanying drawings. The embodiments and drawings are given solely for the purpose of illustration and are not to be construed as limiting the invention; many variations are possible without departing from the spirit and scope of the invention.
In order to predict the pedestrian trajectory more accurately, referring to the flowchart in fig. 1, an embodiment of the present invention provides a method for predicting a pedestrian motion trajectory based on a domain adaptation technique, which generally includes the steps of:
S1, acquiring pedestrian motion trajectories in a training scene and a test scene to generate a corresponding training data set and test data set, preprocessing the training data set to generate a training input sample set X and a training output reference set X′, and preprocessing the test data set to generate a test input sample set Y;
S2, based on r different domain adaptation parameters, performing domain adaptation processing on the training input sample set X and the test input sample set Y to obtain r different domain-adapted training input sample sets X~(1), …, X~(r) and r different domain-adapted test input sample sets Y~(1), …, Y~(r);
S3, constructing r time series convolution networks from the r domain-adapted training input sample sets X~(1), …, X~(r) and the training output reference set X′ and training them, to obtain r corresponding prediction models;
S4, inputting the r domain-adapted test input sample sets Y~(1), …, Y~(r) correspondingly into the r prediction models for prediction, to obtain r predicted paths for each pedestrian;
and S5, fusing the r predicted paths of each pedestrian to obtain the predicted trajectory of each pedestrian.
In detail, in step S1, the training data set S includes F frames of position coordinates obtained by sampling the motion trajectories of N pedestrians at equal time intervals over a long period, i.e. S = {P_1, P_2, …, P_F}, where one frame of data P_f = {p_f^1, p_f^2, …, p_f^N} gives the spatial positions of the N pedestrians at a certain moment, the spatial position of pedestrian n in frame f being denoted p_f^n = (x_f^n, y_f^n); 1 ≤ f ≤ F is the frame number, 1 ≤ n ≤ N is the pedestrian number, and (x, y) is a two-dimensional plane coordinate point of a pedestrian;
the time period [Δt, T+Δt] contains T_0+T_1 frames of sequential position coordinates of the N pedestrians; the front T_0 frames of data construct a multi-dimensional training input sample X_m = {X_m^1, …, X_m^N} and the rear T_1 frames construct a multi-dimensional training output reference sample X′_m = {X′_m^1, …, X′_m^N}, where the training input sample of the nth pedestrian is X_m^n = ((x_{n,1}, y_{n,1}), …, (x_{n,T_0}, y_{n,T_0})), the training output reference sample of the nth pedestrian is X′_m^n = ((x_{n,T_0+1}, y_{n,T_0+1}), …, (x_{n,T_0+T_1}, y_{n,T_0+T_1})), and (x_{n,i}, y_{n,i}) is the position coordinate of the nth pedestrian in the ith frame, i = 1, 2, …, T_0+T_1;
by varying the starting time Δt, i.e. Δt = mT_fra with 0 ≤ m ≤ (M-1), where T_fra is the adjacent-frame time interval, M multi-dimensional training input samples are obtained, forming the training input sample set X = {X_1, …, X_M}, and M multi-dimensional training output reference samples form the training output reference set X′ = {X′_1, …, X′_M};
the process of preprocessing the test data set to generate the test input sample set Y = {Y_1, …, Y_K} is the same as the process of preprocessing the training data set S to generate the training input sample set X.
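As an illustration of this preprocessing, the following Python sketch builds the sliding-window input and reference samples from an F×N×2 trajectory array; the array layout and the function name are assumptions for illustration, not the patent's own code.

    import numpy as np

    def build_samples(traj, T0=8, T1=12):
        # traj: array of shape (F, N, 2) holding the (x, y) position of each
        # of the N pedestrians in each of the F frames. Each window of T0+T1
        # consecutive frames yields one training input sample (first T0
        # frames) and one output reference sample (last T1 frames); shifting
        # the start frame by one step plays the role of varying the starting
        # time Δt by multiples of T_fra.
        F = traj.shape[0]
        M = F - (T0 + T1) + 1
        X = np.stack([traj[m:m + T0] for m in range(M)])                # (M, T0, N, 2)
        X_ref = np.stack([traj[m + T0:m + T0 + T1] for m in range(M)])  # (M, T1, N, 2)
        return X, X_ref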
Step S2 specifically includes the steps of:
S21, extracting the abscissa component X_x of the training input sample set X and the abscissa component Y_x of the test input sample set Y, regarding them respectively as the source domain and the target domain for domain adaptation processing, and obtaining, for r different numbers d of principal eigenvectors, r groups of domain-adapted abscissa source domain matrices X~x(1), …, X~x(r) and domain-adapted abscissa target domain matrices Y~x(1), …, Y~x(r);
S22, extracting the ordinate component X_y of the training input sample set X and the ordinate component Y_y of the test input sample set Y, regarding them respectively as the source domain and the target domain for domain adaptation processing, and obtaining, for the same r numbers d of principal eigenvectors, r groups of domain-adapted ordinate source domain matrices X~y(1), …, X~y(r) and domain-adapted ordinate target domain matrices Y~y(1), …, Y~y(r);
S23, merging the domain-adapted abscissa source domain matrix X~x(k) and the domain-adapted ordinate source domain matrix X~y(k) of the same group k to obtain the r domain-adapted training input sample sets X~(1), …, X~(r), and merging the domain-adapted abscissa target domain matrix Y~x(k) and the domain-adapted ordinate target domain matrix Y~y(k) of the same group k to obtain the r domain-adapted test input sample sets Y~(1), …, Y~(r).
Step S21 need not precede step S22.
The domain adaptation processing procedure in step S21 specifically includes the steps of:
S211, splicing the source domain matrix X_x and the target domain matrix Y_x and computing the kernel matrix K = ZZ^T, where Z = [X_x; Y_x] and the superscript T denotes matrix transposition;
S212, constructing the maximum mean discrepancy measure matrix M = m^T m from the (MN+KN)-dimensional row vector m whose first MN entries equal 1/(MN) and whose last KN entries equal -1/(KN), where M is the number of samples in the training input sample set X (equivalently, in the training output reference set X′), N is the total number of pedestrians, K is the number of samples in the test input sample set Y, and ||·||_F denotes the F norm of a matrix;
S213, constructing the centering matrix H = E - (1/(MN+KN))·1^T·1, where E is the (MN+KN)-dimensional identity matrix and 1 is the (MN+KN)-dimensional all-ones row vector;
S214, performing eigenvalue decomposition on the matrix (KMK + μE)^(-1)·KHK and extracting the first d principal eigenvectors to construct the transfer matrix W, where μ is a balance factor (here K, M and H denote the kernel, measure and centering matrices, not the sample counts);
S215, extracting the first MN column vectors of W^T·K to form, as rows, the MN×d domain-adapted abscissa source domain matrix X~x, and extracting the last KN column vectors of W^T·K to form, as rows, the KN×d domain-adapted abscissa target domain matrix Y~x;
S216, changing the number d of principal eigenvectors r-1 times, after each change reconstructing the transfer matrix W according to steps S214 and S215 and constructing a new domain-adapted abscissa source domain matrix and target domain matrix from it, thereby obtaining, for the r different numbers d of principal eigenvectors, the r groups of domain-adapted abscissa source domain matrices X~x(1), …, X~x(r) and target domain matrices Y~x(1), …, Y~x(r).
The domain adaptation processing procedure in step S22 is the same as steps S211 to S216.
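For concreteness, the following Python sketch follows steps S211 to S216 for one value of d, assuming the linear kernel implied by the splicing of S211; all names are illustrative rather than the patent's own code.

    import numpy as np

    def domain_adapt(Xx, Yx, d, mu=0.01):
        # Xx: (MN, T0) source domain matrix, Yx: (KN, T0) target domain matrix.
        Z = np.vstack([Xx, Yx])
        K = Z @ Z.T                                   # kernel matrix (S211)
        n_src, n_tgt = Xx.shape[0], Yx.shape[0]
        n = n_src + n_tgt
        m = np.hstack([np.full(n_src, 1.0 / n_src),   # MMD row vector (S212)
                       np.full(n_tgt, -1.0 / n_tgt)])
        M = np.outer(m, m)                            # MMD measure matrix
        H = np.eye(n) - np.ones((n, n)) / n           # centering matrix (S213)
        A = np.linalg.solve(K @ M @ K + mu * np.eye(n), K @ H @ K)  # (S214)
        w, V = np.linalg.eig(A)
        W = V[:, np.argsort(-w.real)[:d]].real        # transfer matrix W
        WK = W.T @ K                                  # d x (MN + KN)
        return WK[:, :n_src].T, WK[:, n_src:].T       # adapted source/target (S215)

Running this r times with different values of d (12, 10 and 6 in the embodiment below) yields the r groups of adapted matrices of step S216.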
Further illustratively, in step S211, the abscissa component X_x of the training input sample set X, i.e. the source domain matrix X_x, is the MN×T_0 matrix whose rows are the abscissa sequences (x_{n,1}, …, x_{n,T_0}) of every pedestrian n = 1, …, N in every sample m = 1, …, M; the abscissa component Y_x of the test input sample set Y, i.e. the target domain matrix Y_x, is the corresponding KN×T_0 matrix built from the test samples.
In step S215, the corresponding domain-adapted abscissa source domain matrix X~x is the MN×d matrix whose rows are the adapted abscissa sequences (x~_{n,1}, …, x~_{n,d}), and the corresponding domain-adapted abscissa target domain matrix Y~x is the KN×d matrix formed in the same way from the last KN columns of W^T·K.
Similarly, in step S22 the domain-adapted ordinate source domain matrix X~y is the MN×d matrix of adapted ordinate sequences (y~_{n,1}, …, y~_{n,d}), and the corresponding domain-adapted ordinate target domain matrix Y~y is the KN×d matrix built from the test samples.
In step S23, merging X~x and X~y yields the domain-adapted training input sample set X~ = {X~_1, …, X~_M}, whose sample X~_m = {X~_m^1, …, X~_m^N} with X~_m^n = ((x~_{n,1}, y~_{n,1}), …, (x~_{n,d}, y~_{n,d})); in the same way, merging Y~x and Y~y yields the domain-adapted test input sample set Y~ = {Y~_1, …, Y~_K}, whose samples are formed from the adapted coordinate pairs of the K test samples.
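A small sketch of the merging of step S23, assuming the adapted abscissa and ordinate matrices are NumPy arrays with the row layout described above (names hypothetical):

    import numpy as np

    def merge_adapted(Xx_ad, Xy_ad, M, N, d):
        # Xx_ad, Xy_ad: (M*N, d) adapted abscissas and ordinates; pair them
        # into one adapted sample of shape (N, d, 2) per time window.
        return np.stack([Xx_ad.reshape(M, N, d),
                         Xy_ad.reshape(M, N, d)], axis=-1)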
Based on this, step S3 specifically includes the steps of:
S31, rearranging the samples X~_m of the first domain-adapted training input sample set X~(1) to construct the set Z(1) = {Z_1, …, Z_M}, where each sample Z_m = (Z_m(1), …, Z_m(d)) and each Z_m(t), 1 ≤ t ≤ d, is a 2×N dimensional matrix holding the adapted (x, y) coordinates of the N pedestrians at step t;
S32, inputting Z(1) into an L-layer time series convolution network for training; during training, the average displacement error between the output of the L-layer time series convolution network and the training output reference set X′ is used as the loss function for computing the training error, the weights are learned with a stochastic gradient descent algorithm, the training termination condition is reaching the maximum number of training epochs, and the trained prediction model is then saved. In the convolution process of the time series convolution network, inputting Z_m into the L-layer network yields the first-layer output O^(1)(t′, i, j), 1 ≤ t ≤ d, 1 ≤ t′ ≤ T_1, 1 ≤ i ≤ 2, 1 ≤ j ≤ N, and the higher-layer outputs O^(l+1)(t′, i, j), 1 ≤ t′ < T_1, 1 ≤ l < L, each computed from the previous layer with the convolution kernel weights of the respective layer, where η denotes the scale of the convolution kernel;
S33, repeating steps S31 to S32 for the remaining r-1 domain-adapted training input sample sets X~(2), …, X~(r), finally obtaining r trained prediction models.
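The following PyTorch sketch is one plausible reading of the L-layer time series convolution network of step S32 (L = 5 layers and PReLU activation with a = 0.25, matching the embodiment below); since the patent gives the exact convolution formulas only as images, the Conv1d layout, channel widths and output head are assumptions.

    import torch
    import torch.nn as nn

    class TCNPredictor(nn.Module):
        def __init__(self, T1=12, layers=5, hidden=32):
            super().__init__()
            chans = [2] + [hidden] * (layers - 1)
            body = []
            for cin, cout in zip(chans[:-1], chans[1:]):
                body += [nn.Conv1d(cin, cout, kernel_size=2, padding=1),
                         nn.PReLU(init=0.25)]
            self.body = nn.Sequential(*body)
            self.pool = nn.AdaptiveAvgPool1d(T1)   # force a T1-frame horizon
            self.head = nn.Conv1d(hidden, 2, kernel_size=1)

        def forward(self, x):
            # x: (batch, 2, d) -- one pedestrian's adapted (x, y) sequence
            h = self.body(x)
            return self.head(self.pool(h))         # (batch, 2, T1)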
Step S4 specifically includes the steps of:
S41, rearranging the samples Y~_k of the r domain-adapted test input sample sets Y~(1), …, Y~(r) to construct r sets Q(1), …, Q(r) in the same way as in step S31, where each Q_k(t) is a 2×N dimensional matrix;
S42, inputting the r sets Q(1), …, Q(r) correspondingly into the r prediction models trained in step S33, obtaining r different predicted paths Tra(1), Tra(2), …, Tra(r) for each pedestrian.
Then, for the r different predicted paths Tra(1), Tra(2), …, Tra(r) of pedestrian n, step S5 obtains the predicted trajectory Tra of pedestrian n according to Tra = α_1·Tra(1) + α_2·Tra(2) + … + α_r·Tra(r), where the weights satisfy α_1 + α_2 + … + α_r = 1 and α_j denotes the weight of the jth predicted path Tra(j), j = 1, 2, …, r. The predicted trajectories of the N pedestrians are obtained in the same way.
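The fusion of step S5 is then a weighted sum; a minimal sketch follows, where the default weights are the embodiment's values given below:

    import numpy as np

    def fuse_paths(paths, alphas=(0.2, 0.3, 0.5)):
        # paths: list of r predicted paths, each an array of shape (T1, 2),
        # for one pedestrian; alphas: fusion weights summing to 1.
        return sum(a * p for a, p in zip(alphas, paths))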
The embodiment of the invention thus provides a pedestrian motion trajectory prediction method based on a domain adaptation technology: a training data set and a test data set are first acquired and preprocessed to obtain a training input sample set X, a training output reference set X′ and a test input sample set Y; domain adaptation is then performed on the two input sample sets X and Y with r different domain adaptation parameters to obtain the domain-adapted sets X~(1), …, X~(r) and Y~(1), …, Y~(r); r time series convolution networks are then constructed from X~(1), …, X~(r) and X′ and trained to obtain r prediction models; Y~(1), …, Y~(r) are then input into the r prediction models to obtain r predicted paths for the N pedestrians, and finally the r predicted paths of each pedestrian are fused to obtain the optimal predicted trajectory of each pedestrian. Training scene data and actual application scene data do not satisfy the independent-and-identically-distributed statistical property; the domain adaptation processing makes them approximately consistent and thereby enhances the generalization capability of the prediction model. The multi-scale, multi-resolution depth information of the observation trajectory is also taken into account: by changing the domain adaptation parameters, the observation trajectory is mapped into training input trajectories of different scales, so that deeper information of the observation trajectory is exploited and the final predicted trajectory is more accurate as a whole.
In a specific experiment, the training data in step S1 are taken from the two ETH data sets (eth, hotel) and the three UCY data sets (univ, zara1, zara2), i.e. five different scene data sets. Table 1 shows an example of a single-frame data segment; the training data set consists of multi-frame data, each frame carrying a pedestrian number, a frame number and the pedestrian's x and y coordinates. The frame interval is 0.4 seconds; an observation sample is a 3.2-second pedestrian trajectory corresponding to T_0 = 8 frames, and the trajectory of the next 4.8 seconds is predicted, corresponding to T_1 = 12 frames.
Table 1 Single-frame data segment example

Frame number | Pedestrian number | Pedestrian x coordinate | Pedestrian y coordinate
10 | 1.0 | 10.7867577985 | 3.67631555479
10 | 2.0 | 10.9587077931 | 3.15460523261
10 | 3.0 | 10.9993275592 | 2.64673717882
The balance factor μ in the domain adaptation algorithm of step S2 is 0.01; the number d of principal eigenvectors is 12 for the first domain adaptation, 10 for the second and 6 for the third, i.e. the parameter is changed so that r = 3.
The activation function of the time series convolution network in step S3 is of the leaky rectified linear type, f(x) = x for x ≥ 0 and f(x) = a·x for x < 0, where a = 0.25; the convolution kernel scale is 2×2 and the total number of layers is L = 5. The model implementation is based on the PyTorch library, each convolution layer being realized with the two-dimensional convolution function Conv2d; the loss function is the average displacement error ADE, each training batch holds 128 samples, the model is trained for 100 epochs with stochastic gradient descent (SGD), and the learning rate is 0.01.
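A minimal training loop consistent with these settings might look as follows, reusing the TCNPredictor sketched above; the data loader and batching are assumptions, not the patent's code.

    import torch

    def ade_loss(pred, ref):
        # average displacement error: mean Euclidean distance over all
        # pedestrians and predicted frames
        return torch.linalg.norm(pred - ref, dim=1).mean()

    model = TCNPredictor()
    opt = torch.optim.SGD(model.parameters(), lr=0.01)  # SGD, lr = 0.01
    # `loader` is assumed to yield (input, reference) tensor batches of 128
    loader = []  # placeholder: supply a torch.utils.data.DataLoader here
    for epoch in range(100):                            # 100 training epochs
        for xb, yb in loader:
            opt.zero_grad()
            loss = ade_loss(model(xb), yb)              # ADE as the loss
            loss.backward()
            opt.step()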
The fusion parameters in step S5 are α_1 = 0.2, α_2 = 0.3 and α_3 = 0.5.
In step S1 the test data set is likewise taken from scenes 1 to 5, and five alternating train/test partitions are adopted so that the training set and the test set never overlap:
1. training set (hotel, univ, zara1, zara2), test set (eth);
2. training set (eth, univ, zara1, zara2), test set (hotel);
3. training set (eth, hotel, zara1, zara2), test set (univ);
4. training set (eth, hotel, univ, zara2), test set (zara1);
5. training set (eth, hotel, univ, zara1), test set (zara2).
Finally, the average displacement error ADE and the final displacement error FDE between the optimally planned trajectory and the actual trajectory in the test set are computed to evaluate the prediction effect:

ADE = (1/(N·T_1)) · Σ_{n=1}^{N} Σ_{t=1}^{T_1} ||q_{n,t} - p_{n,t}||_2
FDE = (1/N) · Σ_{n=1}^{N} ||q_{n,T_1} - p_{n,T_1}||_2

where q_{n,t} and p_{n,t} are respectively the predicted and actual coordinates of pedestrian n in frame t, and q_{n,T_1} and p_{n,T_1} are respectively the predicted and actual coordinates of the last frame of pedestrian n's trajectory.
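Under the shapes used here, a direct NumPy rendering of these two metrics (array names illustrative):

    import numpy as np

    def ade_fde(pred, actual):
        # pred, actual: arrays of shape (N, T1, 2) with (x, y) per frame
        err = np.linalg.norm(pred - actual, axis=-1)  # (N, T1) distances
        ade = err.mean()          # average over all pedestrians and frames
        fde = err[:, -1].mean()   # average over the last frame only
        return ade, fde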
The average ADE/FDE values of the invention over the 5 different scenes are compared with other mainstream methods (Linear, S-LSTM, S-GAN-P, SoPhie); the results are shown in Table 2.
Table 2 Comparison of measured ADE/FDE with existing mainstream results

Method | eth | hotel | univ | zara1 | zara2 | Average
Linear | 1.33/2.94 | 0.39/0.72 | 0.82/1.59 | 0.62/1.21 | 0.77/1.48 | 0.79/1.59
S-LSTM | 1.09/2.35 | 0.79/1.76 | 0.67/1.40 | 0.47/1.00 | 0.56/1.17 | 0.72/1.54
S-GAN-P | 0.87/1.62 | 0.67/1.37 | 0.76/1.52 | 0.35/0.68 | 0.42/0.84 | 0.61/1.21
SoPhie | 0.70/1.43 | 0.76/1.67 | 0.54/1.24 | 0.30/0.63 | 0.38/0.78 | 0.54/1.15
The invention | 0.70/1.28 | 0.39/0.63 | 0.49/0.90 | 0.39/0.66 | 0.32/0.50 | 0.45/0.79
As can be seen from Table 2, the average ADE/FDE values of the invention are significantly better than those of the 4 mainstream methods across the 5 different scenes.
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited thereto; any change, modification, substitution, combination or simplification that does not depart from the spirit and principle of the present invention should be regarded as an equivalent replacement and is included in the protection scope of the present invention.

Claims (8)

1. A pedestrian motion trajectory prediction method based on a domain adaptation technology, characterized by comprising the steps of:
S1, acquiring pedestrian motion trajectories in a training scene and a test scene to generate a corresponding training data set and test data set, preprocessing the training data set to generate a training input sample set X and a training output reference set X′, and preprocessing the test data set to generate a test input sample set Y;
S2, based on r different domain adaptation parameters, performing domain adaptation processing on the training input sample set X and the test input sample set Y to obtain r different domain-adapted training input sample sets X~(1), …, X~(r) and r different domain-adapted test input sample sets Y~(1), …, Y~(r);
S3, constructing r time series convolution networks from the r domain-adapted training input sample sets X~(1), …, X~(r) and the training output reference set X′ and training them, to obtain r corresponding prediction models;
S4, inputting the r domain-adapted test input sample sets Y~(1), …, Y~(r) correspondingly into the r prediction models for prediction, to obtain r predicted paths for each pedestrian;
and S5, fusing the r predicted paths of each pedestrian to obtain the predicted trajectory of each pedestrian.
2. The pedestrian motion trajectory prediction method based on a domain adaptation technology according to claim 1, characterized in that the step S2 specifically comprises the steps of:
S21, extracting the abscissa component X_x of the training input sample set X and the abscissa component Y_x of the test input sample set Y, regarding them respectively as the source domain and the target domain for domain adaptation processing, and obtaining, for r different numbers d of principal eigenvectors, r groups of domain-adapted abscissa source domain matrices X~x(1), …, X~x(r) and domain-adapted abscissa target domain matrices Y~x(1), …, Y~x(r);
S22, extracting the ordinate component X_y of the training input sample set X and the ordinate component Y_y of the test input sample set Y, regarding them respectively as the source domain and the target domain for domain adaptation processing, and obtaining, for the same r numbers d of principal eigenvectors, r groups of domain-adapted ordinate source domain matrices X~y(1), …, X~y(r) and domain-adapted ordinate target domain matrices Y~y(1), …, Y~y(r);
S23, merging the domain-adapted abscissa source domain matrix X~x(k) and the domain-adapted ordinate source domain matrix X~y(k) of the same group k to obtain the r domain-adapted training input sample sets X~(1), …, X~(r), and merging the domain-adapted abscissa target domain matrix Y~x(k) and the domain-adapted ordinate target domain matrix Y~y(k) of the same group k to obtain the r domain-adapted test input sample sets Y~(1), …, Y~(r);
step S21 need not precede step S22.
3. The pedestrian motion trajectory prediction method based on a domain adaptation technology according to claim 2, characterized in that the domain adaptation processing procedure in step S21 specifically comprises the steps of:
S211, splicing the source domain matrix X_x and the target domain matrix Y_x and computing the kernel matrix K = ZZ^T, where Z = [X_x; Y_x] and the superscript T denotes matrix transposition;
S212, constructing the maximum mean discrepancy measure matrix M = m^T m from the (MN+KN)-dimensional row vector m whose first MN entries equal 1/(MN) and whose last KN entries equal -1/(KN), where M is the number of samples in the training input sample set X (equivalently, in the training output reference set X′), N is the total number of pedestrians, K is the number of samples in the test input sample set Y, and ||·||_F denotes the F norm of a matrix;
S213, constructing the centering matrix H = E - (1/(MN+KN))·1^T·1, where E is the (MN+KN)-dimensional identity matrix and 1 is the (MN+KN)-dimensional all-ones row vector;
S214, performing eigenvalue decomposition on the matrix (KMK + μE)^(-1)·KHK and extracting the first d principal eigenvectors to construct the transfer matrix W, where μ is a balance factor (here K, M and H denote the kernel, measure and centering matrices, not the sample counts);
S215, extracting the first MN column vectors of W^T·K to form, as rows, the MN×d domain-adapted abscissa source domain matrix X~x, and extracting the last KN column vectors of W^T·K to form, as rows, the KN×d domain-adapted abscissa target domain matrix Y~x;
S216, changing the number d of principal eigenvectors r-1 times, after each change reconstructing the transfer matrix W according to steps S214 and S215 and constructing a new domain-adapted abscissa source domain matrix and target domain matrix from it, thereby obtaining, for the r different numbers d of principal eigenvectors, the r groups of domain-adapted abscissa source domain matrices X~x(1), …, X~x(r) and target domain matrices Y~x(1), …, Y~x(r);
the domain adaptation processing procedure in step S22 is the same as steps S211 to S216.
4. The pedestrian motion trajectory prediction method based on a domain adaptation technology according to claim 3, characterized in that in step S1 the training data set S comprises F frames of position coordinates obtained by sampling the motion trajectories of N pedestrians at equal time intervals over a long period, i.e. S = {P_1, P_2, …, P_F}, where one frame of data P_f = {p_f^1, p_f^2, …, p_f^N} gives the spatial positions of the N pedestrians at a certain moment, the spatial position of pedestrian n in frame f being denoted p_f^n = (x_f^n, y_f^n); 1 ≤ f ≤ F is the frame number, 1 ≤ n ≤ N is the pedestrian number, and (x, y) is a two-dimensional plane coordinate point of a pedestrian;
the time period [Δt, T+Δt] contains T_0+T_1 frames of sequential position coordinates of the N pedestrians; the front T_0 frames of data construct a multi-dimensional training input sample X_m = {X_m^1, …, X_m^N} and the rear T_1 frames construct a multi-dimensional training output reference sample X′_m = {X′_m^1, …, X′_m^N}, where the training input sample of the nth pedestrian is X_m^n = ((x_{n,1}, y_{n,1}), …, (x_{n,T_0}, y_{n,T_0})), the training output reference sample of the nth pedestrian is X′_m^n = ((x_{n,T_0+1}, y_{n,T_0+1}), …, (x_{n,T_0+T_1}, y_{n,T_0+T_1})), and (x_{n,i}, y_{n,i}) is the position coordinate of the nth pedestrian in the ith frame, i = 1, 2, …, T_0+T_1;
by varying the starting time Δt, i.e. Δt = mT_fra with 0 ≤ m ≤ (M-1), where T_fra is the adjacent-frame time interval, M multi-dimensional training input samples are obtained, forming the training input sample set X = {X_1, …, X_M}, and M multi-dimensional training output reference samples form the training output reference set X′ = {X′_1, …, X′_M};
the process of preprocessing the test data set to generate the test input sample set Y = {Y_1, …, Y_K} is the same as the process of preprocessing the training data set S to generate the training input sample set X.
5. The pedestrian motion trajectory prediction method based on a domain adaptation technology according to claim 4, characterized in that in step S211 the abscissa component X_x of the training input sample set X, i.e. the source domain matrix X_x, is the MN×T_0 matrix whose rows are the abscissa sequences (x_{n,1}, …, x_{n,T_0}) of every pedestrian n = 1, …, N in every sample m = 1, …, M, and the abscissa component Y_x of the test input sample set Y, i.e. the target domain matrix Y_x, is the corresponding KN×T_0 matrix built from the test samples;
in step S215 the corresponding domain-adapted abscissa source domain matrix X~x is the MN×d matrix whose rows are the adapted abscissa sequences (x~_{n,1}, …, x~_{n,d}), and the corresponding domain-adapted abscissa target domain matrix Y~x is the KN×d matrix formed in the same way from the last KN columns of W^T·K;
similarly, in step S22 the domain-adapted ordinate source domain matrix X~y is the MN×d matrix of adapted ordinate sequences (y~_{n,1}, …, y~_{n,d}), and the corresponding domain-adapted ordinate target domain matrix Y~y is the KN×d matrix built from the test samples;
in step S23, merging X~x and X~y yields the domain-adapted training input sample set X~ = {X~_1, …, X~_M}, whose sample X~_m = {X~_m^1, …, X~_m^N} with X~_m^n = ((x~_{n,1}, y~_{n,1}), …, (x~_{n,d}, y~_{n,d})); in the same way, merging Y~x and Y~y yields the domain-adapted test input sample set Y~ = {Y~_1, …, Y~_K}, whose samples are formed from the adapted coordinate pairs of the K test samples.
6. the method for predicting the motion trail of the pedestrian based on the domain adaptation technology as claimed in claim 5, wherein the step S3 specifically comprises the steps of:
s31, adapting the first domain to the training input sample set
Figure FDA00035866051000000511
Sample of (1)
Figure FDA00035866051000000512
Rearranging construct sets
Figure FDA00035866051000000513
In which the sample
Figure FDA00035866051000000514
S32, input
Figure FDA00035866051000000515
Training the L-layer time sequence convolution network, taking the average displacement error between the output of the L-layer time sequence convolution network and a training output reference set X' as a loss function for calculating a training error in the training process, learning a weight value by using a random gradient descent algorithm, and storing a trained prediction model when the training termination condition is that the maximum training period number is met; wherein the convolution process of the time series convolution network is described as follows:
Figure FDA00035866051000000516
in
Figure FDA00035866051000000517
Is a2 XN dimensional matrix, will
Figure FDA00035866051000000518
Inputting L-layer time sequence convolution network to obtain first-layer output
Figure FDA00035866051000000519
And higher layer output
Figure FDA00035866051000000520
Correspond to obtain
Figure FDA0003586605100000061
First layer output of (2)
Figure FDA0003586605100000062
And higher layer output
Figure FDA0003586605100000063
Figure FDA0003586605100000064
The weights of the convolution kernels of the different layers, η the scale of the convolution kernel,
Figure FDA0003586605100000065
representing the input of the L-layer time series convolution network;
s33, remaining r-1 field adaptation training input sample set
Figure FDA0003586605100000066
And repeating the steps S31 to S32 to finally obtain r trained prediction models.
7. The pedestrian motion trajectory prediction method based on a domain adaptation technology according to claim 6, characterized in that the step S4 specifically comprises the steps of:
S41, rearranging the samples Y~_k of the r domain-adapted test input sample sets Y~(1), …, Y~(r) to construct r sets Q(1), …, Q(r) in the same way as in step S31, where each Q_k(t) is a 2×N dimensional matrix;
S42, inputting the r sets Q(1), …, Q(r) correspondingly into the r prediction models trained in step S33, obtaining r different predicted paths Tra(1), Tra(2), …, Tra(r) for each pedestrian.
8. The pedestrian motion trajectory prediction method based on a domain adaptation technology according to claim 7, characterized in that: for the r different predicted paths Tra(1), Tra(2), …, Tra(r) of pedestrian n, step S5 obtains the predicted trajectory Tra of pedestrian n according to Tra = α_1·Tra(1) + α_2·Tra(2) + … + α_r·Tra(r), where α_1 + α_2 + … + α_r = 1 and α_j denotes the weight of the jth predicted path Tra(j), j = 1, 2, …, r.
CN202210364770.9A 2022-04-08 2022-04-08 Pedestrian motion trajectory prediction method based on domain adaptation technology Active CN114723784B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210364770.9A CN114723784B (en) Pedestrian motion trajectory prediction method based on domain adaptation technology

Publications (2)

Publication Number Publication Date
CN114723784A (en) 2022-07-08
CN114723784B (en) 2024-06-14

Family

ID=82242090

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210364770.9A Active CN114723784B (en) Pedestrian motion trajectory prediction method based on domain adaptation technology

Country Status (1)

Country Link
CN (1) CN114723784B (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101930072A (en) * 2010-07-28 2010-12-29 重庆大学 Multi-feature fusion based infrared small dim moving target track starting method
CN113316788A (en) * 2021-04-20 2021-08-27 深圳市锐明技术股份有限公司 Method and device for predicting pedestrian motion trail, electronic equipment and storage medium
CN113689470A (en) * 2021-09-02 2021-11-23 重庆大学 Pedestrian motion trajectory prediction method under multi-scene fusion
CN113920170A (en) * 2021-11-24 2022-01-11 中山大学 Pedestrian trajectory prediction method and system combining scene context and pedestrian social relationship and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Cao Jianao, "Multi-UAVs Cooperative Target Tracking", 2021 33rd Chinese Control and Decision Conference (CCDC), 30 November 2021 *
Gao Ming, "Research on Target Tracking and Trajectory Prediction in Complex Traffic Environments Based on Deep Learning", China Doctoral Dissertations Full-text Database, Engineering Science & Technology II, 15 August 2020 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115272395A (en) * 2022-07-11 2022-11-01 Chongqing Research Institute of Harbin Institute of Technology: Cross-domain migratable pedestrian trajectory prediction method based on depth map convolutional network
CN115345390A (en) * 2022-10-19 2022-11-15 Wuhan Big Data Industry Development Co., Ltd.: Behavior trajectory prediction method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN114723784B (en) 2024-06-14


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant