CN107315994B - Clustering method based on Spectral Clustering space trajectory - Google Patents

Clustering method based on Spectral Clustering space trajectory

Info

Publication number
CN107315994B
CN107315994B (application CN201710334850.9A)
Authority
CN
China
Prior art keywords
track
coordinate system
image
axis
formula
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710334850.9A
Other languages
Chinese (zh)
Other versions
CN107315994A (en)
Inventor
宋焕生
李婵
崔华
王璇
关琦
孙士杰
武非凡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changan University
Original Assignee
Changan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changan University filed Critical Changan University
Priority to CN201710334850.9A priority Critical patent/CN107315994B/en
Publication of CN107315994A publication Critical patent/CN107315994A/en
Application granted granted Critical
Publication of CN107315994B publication Critical patent/CN107315994B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/41 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G06V20/42 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items of sport video content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a clustering method based on spectral-clustering spatial trajectories. A camera collects video images of a road; an ORB algorithm extracts feature points of all moving targets in each frame of the video, and a KLT tracking algorithm based on a bidirectional weighted invertibility constraint tracks these feature points, yielding the motion tracks of all moving targets and the coordinate values of each track point on each motion track in the image coordinate system. A similarity matrix is then constructed over the n motion tracks of all moving targets using a rigid-motion constraint, and spectral clustering yields different classes of motion tracks; the classes are merged between classes to obtain the inter-class merged motion tracks. The method is not affected or limited by the varied environments of engineering applications, is easy to implement, and can detect vehicles effectively and accurately in real time, so it has wide application prospects.

Description

Clustering method based on Spectral Clustering space trajectory
Technical Field
The invention belongs to the technical field of video detection, and particularly relates to a Clustering method based on a Spectral Clustering space track.
Background
With the rapid development of the economy and society, living standards have risen and the number of motor vehicles has increased markedly, while the traffic capacity of roads has comparatively declined, causing a series of problems such as traffic congestion and road blockage. Detecting and counting road traffic flow and sending the statistics to supervision departments allows effective measures to be taken to relieve congestion and control traffic. Long-term traffic flow statistics also provide an important basis for the design and maintenance of urban roads in the future.
Vehicle detection and traffic flow statistics based on traffic-scene video are attracting more and more attention for their real-time detection performance, low cost, and ease of installation and use. However, the vehicle detection and traffic flow statistics software in common use is limited by traffic volume, scene complexity and similar factors, so it cannot reach high accuracy and fails to achieve the expected effect in actual scenes.
Disclosure of Invention
In view of the above problems or disadvantages in the prior art, it is an object of the present invention to provide a clustering method based on spectrum clustering spatial trajectories.
In order to achieve the purpose, the invention adopts the following technical scheme:
the Clustering method based on the Spectral Clustering space trajectory comprises the following steps:
step 1, carrying out video image acquisition on a road by using a camera to obtain a plurality of motion tracks of all moving targets in each frame of image in a video image and a coordinate value of each track point on each motion track in an image coordinate system;
setting the number of motion tracks of all the moving targets as n, wherein each motion track is provided with r continuous track points; wherein n is a natural number greater than or equal to 1, and r is a natural number greater than or equal to 1;
the image coordinate system takes any corner of each frame of image in the video image as the origin, the horizontal direction of each frame of image as the u-axis, and the vertical direction of each frame of image as the v-axis;
step 2, constructing a similarity matrix over the n motion tracks of all the moving targets by using a rigid motion constraint and performing spectral clustering to obtain different classes of motion tracks;
the method comprises the following steps:
step 21, establishing a world coordinate system by taking the direction parallel to the road lane markings as the Y-axis and the direction perpendicular to the road lane markings as the X-axis, where the X-axis and the Y-axis both lie in the road plane, their intersection point is the origin O, and the direction of the shortest distance between the camera and the road is the Z-axis;
step 22, selecting any two motion tracks from the motion tracks of all the moving targets and recording them as M and N; the height $h_M$ of track M is taken along the vertical direction of the horizontal plane, $h_M$ ranges over 1–3 m, and the value interval of $h_M$ is 0.1 m; the height $h_N$ of track N is taken from the horizontal plane, $h_N$ ranges over 0–4 m, and the value interval of $h_N$ is 0.01 m;
the $\Delta D_1$ surface is obtained by formula (1):
[formula (1) — equation image not reproduced]
in formula (1), $N(P_i)$ denotes the $i$-th track point on track N, $M(P_i)$ the $i$-th track point on track M, $M(P_{i+1})$ the $(i+1)$-th track point on track M, and $N(P_{i+1})$ the $(i+1)$-th track point on track N, where $i = 1, 2, \ldots$; $M(P_i)N(P_i)$ denotes the distance in the world coordinate system between track point $P_i$ on track M and track point $P_i$ on track N at the same moment;
the $\Delta D_2$ surface is obtained by formula (2):
[formula (2) — equation image not reproduced]
in formula (2), $N(P_i)$ denotes the $i$-th track point on track N, $M(P_i)$ the $i$-th track point on track M, $M(P_{i+1})$ the $(i+1)$-th track point on track M, and $N(P_{i+1})$ the $(i+1)$-th track point on track N, where $i = 1, 2, \ldots$; $M(P_{i+1})M(P_i)$ denotes the distance in the world coordinate system between points $P_i$ and $P_{i+1}$ on track M; $N(P_{i+1})N(P_i)$ denotes the distance in the world coordinate system between points $P_i$ and $P_{i+1}$ on track N;
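The exact expressions of formulas (1) and (2) survive only as equation images, so the sketch below encodes an assumed rigid-motion reading of the surrounding definitions — $\Delta D_1$ as the average frame-to-frame change of the cross-track distance $M(P_i)N(P_i)$, and $\Delta D_2$ as the average mismatch between the within-track step lengths $M(P_{i+1})M(P_i)$ and $N(P_{i+1})N(P_i)$ — not the patent's literal formulas:

```python
# Hedged sketch of formulas (1) and (2); the exact patent expressions are in
# unreproduced equation images, so these rigid-motion forms are assumptions.
import numpy as np

def delta_d1(M, N):
    """M, N: (r, 3) arrays of world coordinates of r corresponding track points."""
    cross = np.linalg.norm(M - N, axis=1)          # |M(P_i)N(P_i)| at each moment
    return float(np.mean(np.abs(np.diff(cross))))  # average change over i

def delta_d2(M, N):
    step_m = np.linalg.norm(np.diff(M, axis=0), axis=1)  # |M(P_{i+1})M(P_i)|
    step_n = np.linalg.norm(np.diff(N, axis=0), axis=1)  # |N(P_{i+1})N(P_i)|
    return float(np.mean(np.abs(step_m - step_n)))       # step-length mismatch
```

For two tracks on the same rigid body both quantities stay near zero, which is the property the similarity matrix of step 24 exploits.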
the coordinate values $(u_i, v_i)$ of track point $P_i$ in the image coordinate system are converted into coordinate values $(x_i, y_i, z_i)$ in the world coordinate system by formula (3):
$P_i = C_{3\times4}^{-1}\lambda_i p_i$ (3)
in formula (3), $p_i = [u_i, v_i, 1]^T$, $P_i = [x_i, y_i, z_i, 1]^T$, and $\lambda_i$ is a scale factor with $0 \le \lambda_i \le 1$; $C_{3\times4}^{-1}$ denotes the inverse matrix of the perspective projection matrix of the camera;
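Since a 3×4 projection matrix has no ordinary inverse, formula (3) can be realized by intersecting the pixel ray with the candidate height plane $z = h$. The sketch below is a minimal illustration of that reading under the pinhole model, not the patent's own implementation:

```python
# Minimal sketch of formula (3): back-project image point (u, v) to the world
# plane z = h with the 3x4 pinhole matrix C, solving C @ [x, y, h, 1]^T = lambda * p.
import numpy as np

def image_to_world(C, u, v, h):
    p = np.array([u, v, 1.0])
    A = np.column_stack((C[:, 0], C[:, 1], -p))  # unknowns: x, y, lambda
    b = -(C[:, 2] * h + C[:, 3])
    x, y, lam = np.linalg.solve(A, b)
    return np.array([x, y, h]), lam              # world point and scale factor
```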
step 23, constructing a plane with $h_M$ and $h_N$ as its two axes and taking it as the zero plane; the $\Delta D_1$ surface and the $\Delta D_2$ surface are then two longitudinal surfaces perpendicular to the zero plane;
the $\Delta D_1$ surface and the $\Delta D_2$ surface each intersect the zero plane in a line, denoted the $\Delta D_1$ intersection line and the $\Delta D_2$ intersection line; let the distance between the $\Delta D_1$ intersection line and the $\Delta D_2$ intersection line be $\Delta\mathrm{Diff}_{12}$, the angle between them be $\theta$, the slope of the $\Delta D_1$ intersection line be $k_1$, and the slope of the $\Delta D_2$ intersection line be $k_2$;
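One way step 23 could be computed is sketched below; sampling each surface over the $(h_M, h_N)$ grid, taking its near-zero locus as the intersection line, and the tolerance eps and line-fitting approach are all assumptions, as is the surface function signature:

```python
# Hedged sketch of step 23: locate each Delta-D surface's near-zero locus over
# the (h_M, h_N) grid, fit a line h_N = k*h_M + b to it, then compare the lines.
import numpy as np

def intersection_line(surface, eps=0.05):
    """surface(h_m, h_n) -> Delta-D value; returns slope k and intercept b."""
    hs_m = np.arange(1.0, 3.0 + 1e-9, 0.1)       # h_M: 1-3 m, step 0.1 m
    hs_n = np.arange(0.0, 4.0 + 1e-9, 0.01)      # h_N: 0-4 m, step 0.01 m
    pts = []
    for hm in hs_m:
        vals = np.array([surface(hm, hn) for hn in hs_n])
        j = int(np.argmin(np.abs(vals)))
        if abs(vals[j]) < eps:                   # near-zero crossing found
            pts.append((hm, hs_n[j]))
    hm, hn = np.array(pts).T
    k, b = np.polyfit(hm, hn, 1)                 # fitted intersection line
    return k, b

def compare_lines(k1, b1, k2, b2):
    theta = np.degrees(np.arctan(abs((k1 - k2) / (1.0 + k1 * k2))))
    delta_diff12 = abs(b1 - b2)                  # offset between the two lines
    return delta_diff12, theta
```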
step 24, obtaining the element $w_{oq}$ of the similarity matrix A through formula (4):
[formula (4) — equation image not reproduced]
in formula (4), $o = 1, 2, \ldots, n$ and $q = 1, 2, \ldots, n$; $\Delta D_{1IJ}$ denotes the value of $\Delta D_1$ when the height of track M is $h_{MI}$ and the height of track N is $h_{NJ}$; $\Delta D_{2IJ}$ denotes the value of $\Delta D_2$ when the height of track M is $h_{MI}$ and the height of track N is $h_{NJ}$; $h_{MI} = 1, 1.1, 1.2, \ldots, 3$, in m; $h_{NJ} = 0, 0.01, 0.02, \ldots, 3.99, 4$, in m;
step 25, repeating steps 22 to 24 until every pair of the n tracks has served as track M and track N, obtaining the n×n similarity matrix A, and executing step 26;
step 26, arranging the eigenvalues of the similarity matrix A in descending order, the sum of all eigenvalues being $S_n$;
the minimum value of k is selected such that

$S_k / S_n \geq \varepsilon$

where $S_k$ is the sum of the first k eigenvalues, k is a natural number greater than or equal to 1, and $\varepsilon$ is the ratio threshold of the sum of the first k eigenvalues to the sum of the n eigenvalues; the clustering number of the n motion tracks is then k, and step 27 is executed;
step 27, constructing an n×k-dimensional eigenvector space from the eigenvectors corresponding to the first k eigenvalues, clustering the n×k-dimensional eigenvector space by the k-means algorithm, and thereby clustering the n motion tracks into k classes of motion tracks.
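A minimal sketch of steps 26–27, assuming A is symmetric positive semidefinite (so the cumulative eigenvalue ratio is monotone) and using the 95% ratio threshold of the embodiment below:

```python
# Minimal sketch of steps 26-27: pick the smallest k with S_k / S_n >= threshold,
# then run k-means in the space of the leading k eigenvectors of A.
import numpy as np
from sklearn.cluster import KMeans

def spectral_cluster(A, threshold=0.95):
    w, V = np.linalg.eigh(A)                     # eigendecomposition of symmetric A
    order = np.argsort(w)[::-1]                  # sort eigenvalues descending
    w, V = w[order], V[:, order]
    ratios = np.cumsum(w) / np.sum(w)            # S_k / S_n for k = 1..n
    k = int(np.searchsorted(ratios, threshold) + 1)
    F = V[:, :k]                                 # n x k eigenvector space
    return KMeans(n_clusters=k, n_init=10).fit_predict(F)
```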
Further, the method also includes:
step 3, performing inter-class combination on the k classes of motion tracks obtained in the step 2 to obtain inter-class combined motion tracks;
the method comprises the following steps:
step 31, selecting two classes from the k classes of motion tracks clustered in step 2, recorded as $C_a$ and $C_b$; let the minimum back-projection velocity of track $p_a$ in class $C_a$ be $v_a$ and the minimum back-projection velocity of track $p_b$ in class $C_b$ be $v_b$, where $v_a \ge 0$ and $v_b \ge 0$;
step 32, if $v_a < v_b$, taking $C_a$ as the reference class, track $p_a$ as the feature point of the reference class, $C_b$ as the class to be merged, and track $p_b$ as the feature point of the class to be merged; if $v_b < v_a$, taking $C_b$ as the reference class, track $p_b$ as the feature point of the reference class, $C_a$ as the class to be merged, and track $p_a$ as the feature point of the class to be merged;
obtaining the height $H_p$ of the feature point of the class to be merged by formula (5):
[formula (5) — equation image not reproduced]
in formula (5), $v$ is the speed of the moving object, $v = \min(v_a, v_b)$; $v_p$ is the minimum back-projection speed of the class to be merged; $H_c$ is the height of the camera in the world coordinate system;
obtaining coordinate values of the feature points of the categories to be merged in a world coordinate system through an equation (6):
$P' = C_{3\times4}^{-1}\lambda_i p'$ (6)
in formula (6), $p' = [u'_i, v'_i, 1]^T$; $P' = [X'_i, Y'_i, Z'_i, 1]^T$; $u'_i, v'_i$ are the coordinate values of the feature point of the class to be merged in the image coordinate system; $X'_i, Y'_i, Z'_i$ are the coordinate values of the feature point of the class to be merged in the world coordinate system; $\lambda_i$ is a scale factor with $0 \le \lambda_i \le 1$; $C_{3\times4}^{-1}$ denotes the inverse matrix of the perspective projection matrix of the camera;
obtaining coordinate values of the feature points of the reference category in a world coordinate system by the following formula (7):
$P'' = C_{3\times4}^{-1}\lambda_i p''$ (7)
in formula (7), $p'' = [u''_i, v''_i, 1]^T$; $P'' = [X''_i, Y''_i, 0, 1]^T$; $u''_i, v''_i$ are the coordinate values of the feature point of the reference class in the image coordinate system; $X''_i, Y''_i, 0$ are the coordinate values of the feature point of the reference class in the world coordinate system; $\lambda_i$ is a scale factor with $0 \le \lambda_i \le 1$; $C_{3\times4}^{-1}$ denotes the inverse matrix of the perspective projection matrix of the camera;
step 33, obtaining absolute distances Δ X, Δ Y, Δ Z between the feature points of the to-be-merged category and the feature points of the reference category in the world coordinate system by using formula (8):
[formula (8) — equation image not reproduced]
step 34, if ΔX, ΔY and ΔZ each fall within their preset thresholds, merging the class to be merged and the reference class into one class; otherwise, executing step 35;
step 35, repeating steps 31 to 34 until all k classes of motion tracks have served as the reference class and the class to be merged, obtaining the inter-class merged motion tracks.
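A sketch of the merging test of steps 33–34 follows. The exact merging condition of formula (8) and step 34 is only partially legible in the source, so the per-axis thresholds tx, ty, tz below are hypothetical placeholders:

```python
# Hedged sketch of steps 33-34: merge two classes when the world coordinates of
# their feature points agree axis-by-axis within hypothetical thresholds.
import numpy as np

def should_merge(P_merge, P_ref, tx=0.5, ty=0.5, tz=0.5):
    dX, dY, dZ = np.abs(np.asarray(P_merge) - np.asarray(P_ref))  # formula (8)
    return (dX <= tx) and (dY <= ty) and (dZ <= tz)
```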
Further, the inverse matrix $C_{3\times4}^{-1}$ of the perspective projection matrix of the camera is obtained by formula (9):
$C_{3\times4}^{-1} = \{K[R_{3\times3} \mid t_{3\times1}]\}^{-1}$ (9)
in formula (9), K represents the intrinsic parameter matrix of the camera, $R_{3\times3}$ represents the rotation matrix between the camera coordinate system and the world coordinate system, and $t_{3\times1}$ represents the translation matrix between the camera coordinate system and the world coordinate system;
the camera coordinate system takes the optical center $O_C$ of the camera as its origin, with $X_C$ aligned with the u-axis direction of the image coordinate system, $Y_C$ aligned with the v-axis direction of the image coordinate system, and the $Z_C$ axis perpendicular to the plane formed by the image coordinate system; the intersection of the $Z_C$ axis with the image plane is called the principal point of the camera.
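Formula (9) writes the back-projection operator as the inverse of $K[R_{3\times3} \mid t_{3\times1}]$; since that product is 3×4, the Moore–Penrose pseudo-inverse is the natural reading. A minimal sketch:

```python
# Minimal sketch of formula (9): assemble the 3x4 perspective projection matrix
# C = K [R | t] and take its Moore-Penrose pseudo-inverse (a non-square matrix
# has no ordinary inverse).
import numpy as np

def projection_and_pinv(K, R, t):
    C = K @ np.hstack((R, t.reshape(3, 1)))      # 3x4 projection matrix
    return C, np.linalg.pinv(C)                  # and its 4x3 pseudo-inverse
```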
Further, the image coordinate system takes the upper left corner of each frame of image in the video image as an origin, the horizontal direction of each frame of image as a u-axis, and the vertical direction of each frame of image as a v-axis.
Compared with the prior art, the invention has the following technical effects:
the method provided by the invention is not influenced and limited by various environments in engineering application, is easy to realize, and can effectively and accurately detect the vehicle in real time, thereby having wide application prospect.
Drawings
FIG. 1 is a frame image of a video image in example 1;
FIG. 2 is a schematic view of an image coordinate system in example 1;
FIG. 3 is a result diagram of feature point extraction of a moving object in example 1;
FIG. 4 shows the result of tracking the movement locus of the vehicle in embodiment 1;
FIG. 5 is a schematic view of a world coordinate system in example 1;
FIG. 6(a) shows the $\Delta D_1$ surface and the $\Delta D_2$ surface plotted for track No. 2 and track No. 3 in Example 1; FIG. 6(b) shows the $\Delta D_1$ intersection line and the $\Delta D_2$ intersection line;
FIG. 7(a) shows the $\Delta D_1$ surface and the $\Delta D_2$ surface plotted for track No. 0 and track No. 2 in Example 1; FIG. 7(b) shows the $\Delta D_1$ intersection line and the $\Delta D_2$ intersection line;
FIG. 8 shows 4 motion profiles selected in example 1;
FIG. 9 is a graph showing a clustering result of a part of the motion trajectories in example 1;
FIG. 10 shows the relationship between the camera imaging model and three coordinate systems.
Detailed Description
The invention is further illustrated by the figures and examples.
Example 1
The embodiment provides a Clustering method based on a Spectral Clustering space track, which comprises the following steps:
step 1, video image acquisition is carried out on a road by using a camera, feature points are extracted from all moving targets in each frame of image in the video image by adopting an ORB algorithm, and then the feature points are tracked by using a KLT tracking algorithm based on bidirectional weighted invertibility constraint to obtain a plurality of moving tracks of all the moving targets and coordinate values of each track point on each moving track in an image coordinate system;
wherein the ORB algorithm is from Rublee E., Rabaud V., Konolige K., Bradski G. ORB: an efficient alternative to SIFT or SURF [C]. Proc. of the IEEE International Conference on Computer Vision, 2011: 2564-2571.
The KLT tracking algorithm is from Song Lin, Cheng Yinmei, Liu Nan, et al. KLT visual tracking algorithm with multiple constraints for UAV navigation [J]. Infrared and Laser Engineering, 2013, 42(10): 2828-.
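A minimal OpenCV sketch of step 1 follows. The plain forward–backward consistency check stands in for the patent's bidirectional weighted invertibility constraint, whose exact weighting is not reproduced here; the threshold fb_thresh is a hypothetical parameter:

```python
# Sketch of step 1: ORB feature extraction plus pyramidal Lucas-Kanade (KLT)
# tracking with a forward-backward consistency check (a simplification of the
# bidirectional weighted invertibility constraint).
import cv2
import numpy as np

def track_features(prev_gray, cur_gray, fb_thresh=1.0):
    orb = cv2.ORB_create(nfeatures=500)
    pts = np.float32([kp.pt for kp in orb.detect(prev_gray, None)]).reshape(-1, 1, 2)
    nxt, st, _ = cv2.calcOpticalFlowPyrLK(prev_gray, cur_gray, pts, None)    # forward
    back, st2, _ = cv2.calcOpticalFlowPyrLK(cur_gray, prev_gray, nxt, None)  # backward
    fb_err = np.linalg.norm(pts - back, axis=2).ravel()   # round-trip error
    good = (st.ravel() == 1) & (st2.ravel() == 1) & (fb_err < fb_thresh)
    return pts[good], nxt[good]   # matched point pairs extending the tracks
```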
Setting the number of motion tracks of all the moving targets as n, wherein each motion track is provided with r continuous track points; wherein n is a natural number greater than or equal to 1, and r is a natural number greater than or equal to 1;
the image coordinate system takes any corner of each frame of image in the video image as the origin, the horizontal direction of each frame of image as the u-axis, and the vertical direction of each frame of image as the v-axis;
the image coordinate system uses the lower left corner of each frame of image in the video image as the origin, the horizontal direction of each frame of image as the u-axis, and the vertical direction of each frame of image as the v-axis, as shown in fig. 2.
The traffic videos used in this embodiment are all 720 × 288 grayscale images; FIG. 1 shows one frame. FIG. 3 shows the feature points extracted by the ORB algorithm from the moving objects in the image of FIG. 1. FIG. 4 shows the motion tracks in the image of FIG. 1, each track with a corresponding number, 33 tracks in total, where tracks 0 to 1 belong to one vehicle, tracks 2 to 9 to a second vehicle, tracks 10 to 16 and 19 to 32 to a third vehicle, and tracks 17 to 18 to a fourth vehicle.
Step 2, constructing a similarity matrix over the n motion tracks of all the moving targets by using a rigid motion constraint and performing spectral clustering to obtain different classes of motion tracks;
the method comprises the following steps:
step 21, as shown in fig. 5, establishing a world coordinate system by taking the direction parallel to the road lane markings as the Y-axis and the direction perpendicular to the road lane markings as the X-axis, where the X-axis and the Y-axis both lie in the road plane, their intersection point is the origin O, and the direction of the shortest distance between the camera and the road is the Z-axis;
step 22, the clustering method of this embodiment adopts the following principle: a rigid body has two characteristics during motion: 1. during translation of the rigid body, the line segments connecting any two points on it remain parallel and of equal length; 2. although the position vectors of different points on the rigid body differ from one another, the displacement, velocity and acceleration of every point are exactly the same.
Two motion tracks are selected from the motion tracks of all the moving targets and recorded as M and N, each track having r continuous track points; the height $h_M$ of track M is taken along the vertical direction of the horizontal plane, $h_M$ ranges over 1–3 m, and the value interval of $h_M$ is 0.1 m; the height $h_N$ of track N is taken from the horizontal plane, $h_N$ ranges over 0–4 m, and the value interval of $h_N$ is 0.01 m;
the $\Delta D_1$ surface is obtained by formula (1):
[formula (1) — equation image not reproduced]
in formula (1), $N(P_i)$ denotes the $i$-th track point on track N, $M(P_i)$ the $i$-th track point on track M, $M(P_{i+1})$ the $(i+1)$-th track point on track M, and $N(P_{i+1})$ the $(i+1)$-th track point on track N, where $i = 1, 2, \ldots$; $M(P_i)N(P_i)$ denotes the distance in the world coordinate system between track point $P_i$ on track M and track point $P_i$ on track N at the same moment;
the $\Delta D_2$ surface is obtained by formula (2):
[formula (2) — equation image not reproduced]
in formula (2), $N(P_i)$ denotes the $i$-th track point on track N, $M(P_i)$ the $i$-th track point on track M, $M(P_{i+1})$ the $(i+1)$-th track point on track M, and $N(P_{i+1})$ the $(i+1)$-th track point on track N, where $i = 1, 2, \ldots$; $M(P_{i+1})M(P_i)$ denotes the distance in the world coordinate system between points $P_i$ and $P_{i+1}$ on track M; $N(P_{i+1})N(P_i)$ denotes the distance in the world coordinate system between points $P_i$ and $P_{i+1}$ on track N;
the coordinate values $(u_i, v_i)$ of track point $P_i$ in the image coordinate system are converted into coordinate values $(x_i, y_i, z_i)$ in the world coordinate system by formula (3):
$P = C_{3\times4}^{-1}\lambda_i p$ (3)
in formula (3), $p = [u_i, v_i, 1]^T$, $P = [X_i, Y_i, Z_i, 1]^T$, and $\lambda_i$ is a scale factor with $0 \le \lambda_i \le 1$; $C_{3\times4}^{-1}$ denotes the inverse matrix of the perspective projection matrix of the camera;
step 23, constructing a plane with $h_M$ and $h_N$ as its two axes and taking it as the zero plane; $\Delta D_1$ and $\Delta D_2$ then form two longitudinal surfaces perpendicular to the zero plane, called the $\Delta D_1$ surface and the $\Delta D_2$ surface;
the $\Delta D_1$ surface and the $\Delta D_2$ surface each intersect the zero plane in a line, denoted the $\Delta D_1$ intersection line and the $\Delta D_2$ intersection line; let the distance between the $\Delta D_1$ intersection line and the $\Delta D_2$ intersection line be $\Delta\mathrm{Diff}_{12}$, the angle between them be $\theta$, the slope of the $\Delta D_1$ intersection line be $k_1$, and the slope of the $\Delta D_2$ intersection line be $k_2$;
step 24, obtaining the element $w_{oq}$ of the similarity matrix A through formula (4):
[formula (4) — equation image not reproduced]
in formula (4), $o = 1, 2, \ldots, n$ and $q = 1, 2, \ldots, n$; $\Delta D_{1IJ}$ denotes the value of $\Delta D_1$ when the height of track M is $h_{MI}$ and the height of track N is $h_{NJ}$; $\Delta D_{2IJ}$ denotes the value of $\Delta D_2$ when the height of track M is $h_{MI}$ and the height of track N is $h_{NJ}$; $h_{MI} = 1, 1.1, 1.2, \ldots, 3$, in m; $h_{NJ} = 0, 0.01, 0.02, \ldots, 3.99, 4$, in m;
step 25, repeating steps 22 to 24 until every pair of the n tracks has served as track M and track N, obtaining the n×n similarity matrix A, and executing step 26;
step 26, arranging the eigenvalues of the similarity matrix A in descending order, the sum of all eigenvalues being $S_n$;
the minimum value of k is selected such that

$S_k / S_n \geq \varepsilon$

where $S_k$ is the sum of the first k eigenvalues, k is a natural number greater than or equal to 1, and $\varepsilon$ is the ratio threshold of the sum of the first k eigenvalues to the sum of the n eigenvalues; the clustering number of the n motion tracks is then k, and step 27 is executed;
in this example, the ratio threshold is 95%;
step 27, constructing an n×k-dimensional eigenvector space from the eigenvectors corresponding to the first k eigenvalues, clustering the n×k-dimensional eigenvector space by the k-means algorithm, and clustering the n motion tracks into k classes of motion tracks;
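As a purely illustrative calculation with hypothetical eigenvalues (not values measured in this embodiment): if the descending eigenvalues of A were 3.2, 0.5, 0.2 and 0.1, then $S_n = 4.0$; $S_1/S_n = 0.80$ and $S_2/S_n = 0.925$ fall short of 95%, while $S_3/S_n = 0.975$ reaches it, so k = 3 and the tracks would be clustered into 3 classes.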
in this embodiment, FIG. 6(a) shows the $\Delta D_1$ surface and the $\Delta D_2$ surface plotted for track No. 2 and track No. 3, and FIG. 6(b) shows the $\Delta D_1$ intersection line and the $\Delta D_2$ intersection line; as can be seen from FIGS. 6(a) and 6(b), the height difference $\Delta\mathrm{Diff}_{12}$ between the two intersection lines is very small and the included angle $\theta$ between the two lines is very small, both within the threshold ranges, so the two tracks belong to the motion track of the same vehicle and are placed in one class;
as shown in FIG. 7(a), which plots the $\Delta D_1$ surface and the $\Delta D_2$ surface for track No. 0 and track No. 2, and FIG. 7(b), which shows the $\Delta D_1$ intersection line and the $\Delta D_2$ intersection line, the height difference $\Delta\mathrm{Diff}_{12}$ between the two intersection lines and the included angle $\theta$ between them exceed the threshold ranges, so the tracks do not belong to the motion track of the same vehicle and fall into two classes;
FIG. 8 is a graph of 4 motion trajectories taken from FIG. 4, respectively labeled as motion trajectories 0, 1,2, and 3;
Table 1 shows the $\Delta\mathrm{Diff}_{12}$ and $\theta$ results calculated by pairwise comparison of the 4 tracks in FIG. 8;
TABLE 1
[Table 1 — table image not reproduced]
From the data in Table 1, the similarity matrix A can be obtained as
[similarity matrix A — matrix image not reproduced]
FIG. 9 shows the clustering result of part of the motion tracks in this embodiment.
Example 2
In this embodiment, on the basis of embodiment 1, in order to improve the accuracy of clustering, the method further includes:
and 3, performing inter-class combination on the k classes of motion tracks obtained in the step 2 to obtain inter-class combined motion tracks.
The method comprises the following steps:
step 31, selecting two classes from the k classes of motion tracks clustered in step 2, recorded as $C_a$ and $C_b$; let the minimum back-projection velocity of track $p_a$ in class $C_a$ be $v_a$ and the minimum back-projection velocity of track $p_b$ in class $C_b$ be $v_b$;
step 32, if $v_a < v_b$, $C_a$ is the reference class, track $p_a$ is the feature point of the reference class, $C_b$ is the class to be merged, and track $p_b$ is the feature point of the class to be merged; if $v_b < v_a$, $C_b$ is the reference class, track $p_b$ is the feature point of the reference class, $C_a$ is the class to be merged, and track $p_a$ is the feature point of the class to be merged;
obtaining the height $H_p$ of the feature point of the class to be merged by formula (5):
[formula (5) — equation image not reproduced]
in formula (5), $v$ is the speed of the moving object, $v = \min(v_a, v_b)$; $v_p$ is the minimum back-projection speed of the class to be merged; $H_c$ is the height of the camera in the world coordinate system;
obtaining coordinate values of the feature points of the categories to be merged in a world coordinate system through an equation (6):
$P' = C_{3\times4}^{-1}\lambda_i p'$ (6)
in formula (6), $p' = [u'_i, v'_i, 1]^T$; $P' = [X'_i, Y'_i, Z'_i, 1]^T$; $u'_i, v'_i$ are the coordinate values of the feature point of the class to be merged in the image coordinate system; $X'_i, Y'_i, Z'_i$ are the coordinate values of the feature point of the class to be merged in the world coordinate system; $\lambda_i$ is a scale factor with $0 \le \lambda_i \le 1$; $C_{3\times4}^{-1}$ denotes the inverse matrix of the perspective projection matrix of the camera;
in this example, $C_{3\times4}^{-1} = \{K[R_{3\times3} \mid t_{3\times1}]\}^{-1}$, where K denotes the intrinsic parameter matrix of the camera, whose parameters form a fixed 3 × 3 matrix, $R_{3\times3}$ represents the rotation matrix between the camera coordinate system and the world coordinate system, and $t_{3\times1}$ represents the translation matrix between the camera coordinate system and the world coordinate system;
the camera coordinate system takes the optical center $O_C$ of the camera as its origin, with $X_C$ aligned with the u-axis direction of the image coordinate system, $Y_C$ aligned with the v-axis direction of the image coordinate system, and the $Z_C$ axis perpendicular to the plane formed by the image coordinate system; the intersection of the $Z_C$ axis with the image plane is called the principal point of the camera, as shown in fig. 10.
Obtaining coordinate values of the feature points of the reference category in a world coordinate system by the following formula (7):
$P'' = C_{3\times4}^{-1}\lambda_i p''$ (7)
in formula (7), $p'' = [u''_i, v''_i, 1]^T$; $P'' = [X''_i, Y''_i, 0, 1]^T$; $u''_i, v''_i$ are the coordinate values of the feature point of the reference class in the image coordinate system; $X''_i, Y''_i, 0$ are the coordinate values of the feature point of the reference class in the world coordinate system; $\lambda_i$ is a scale factor with $0 \le \lambda_i \le 1$; $C_{3\times4}^{-1}$ denotes the inverse matrix of the perspective projection matrix of the camera;
in this example, $C_{3\times4}^{-1} = \{K[R_{3\times3} \mid t_{3\times1}]\}^{-1}$, where K denotes the intrinsic parameter matrix of the camera, whose parameters form a fixed 3 × 3 matrix, $R_{3\times3}$ represents the rotation matrix between the camera coordinate system and the world coordinate system, and $t_{3\times1}$ represents the translation matrix between the camera coordinate system and the world coordinate system;
the camera coordinate system takes the optical center $O_C$ of the camera as its origin, with the $X_C$ and $Y_C$ axes parallel to the two-dimensional image plane, $X_C$ aligned with the u-axis direction of the image coordinate system, $Y_C$ aligned with the v-axis direction of the image coordinate system, and the $Z_C$ axis perpendicular to the plane formed by the image coordinate system; the intersection of the $Z_C$ axis with the image plane is called the principal point of the camera, as shown in fig. 10.
Step 33, obtaining absolute distances Δ X, Δ Y, Δ Z between the feature points of the to-be-merged category and the feature points of the reference category in the world coordinate system by using formula (8):
[formula (8) — equation image not reproduced]
step 34, if ΔX, ΔY and ΔZ each fall within their preset thresholds, merging the class to be merged and the reference class into one class; otherwise, executing step 35;
step 35, repeating steps 31 to 34 until all k classes of motion tracks have served as the reference class and the class to be merged, obtaining the inter-class merged motion tracks.

Claims (4)

1. The Clustering method based on the Spectral Clustering space trajectory is characterized by comprising the following steps of:
step 1, carrying out video image acquisition on a road by using a camera to obtain a plurality of motion tracks of all moving targets in each frame of image in a video image and a coordinate value of each track point on each motion track in an image coordinate system;
setting the number of motion tracks of all the moving targets as n, wherein each motion track is provided with r continuous track points; wherein n is a natural number greater than or equal to 1, and r is a natural number greater than or equal to 1;
the image coordinate system takes any corner of each frame of image in the video image as the origin, the horizontal direction of each frame of image as the u-axis, and the vertical direction of each frame of image as the v-axis;
step 2, constructing a similarity matrix over the n motion tracks of all the moving targets by using a rigid motion constraint and performing spectral clustering to obtain different classes of motion tracks;
the method comprises the following steps:
step 21, establishing a world coordinate system by taking the direction parallel to the road lane markings as the Y-axis and the direction perpendicular to the road lane markings as the X-axis, where the X-axis and the Y-axis both lie in the road plane, their intersection point is the origin O, and the direction of the shortest distance between the camera and the road is the Z-axis;
step 22, selecting any two motion tracks from the motion tracks of all the moving targets and recording them as M and N; the height $h_M$ of track M is taken along the vertical direction of the horizontal plane, $h_M$ ranges over 1–3 m, and the value interval of $h_M$ is 0.1 m; the height $h_N$ of track N is taken from the horizontal plane, $h_N$ ranges over 0–4 m, and the value interval of $h_N$ is 0.01 m;
the $\Delta D_1$ surface is obtained by formula (1):
[formula (1) — equation image not reproduced]
in formula (1), $N(P_i)$ denotes the $i$-th track point on track N, $M(P_i)$ the $i$-th track point on track M, $M(P_{i+1})$ the $(i+1)$-th track point on track M, and $N(P_{i+1})$ the $(i+1)$-th track point on track N, where $i = 1, 2, \ldots$; $M(P_i)N(P_i)$ denotes the distance in the world coordinate system between track point $P_i$ on track M and track point $P_i$ on track N at the same moment;
the $\Delta D_2$ surface is obtained by formula (2):
[formula (2) — equation image not reproduced]
in formula (2), $N(P_i)$ denotes the $i$-th track point on track N, $M(P_i)$ the $i$-th track point on track M, $M(P_{i+1})$ the $(i+1)$-th track point on track M, and $N(P_{i+1})$ the $(i+1)$-th track point on track N, where $i = 1, 2, \ldots$; $M(P_{i+1})M(P_i)$ denotes the distance in the world coordinate system between points $P_i$ and $P_{i+1}$ on track M; $N(P_{i+1})N(P_i)$ denotes the distance in the world coordinate system between points $P_i$ and $P_{i+1}$ on track N;
the coordinate values $(u_i, v_i)$ of track point $P_i$ in the image coordinate system are converted into coordinate values $(x_i, y_i, z_i)$ in the world coordinate system by formula (3):
$P_i = C_{3\times4}^{-1}\lambda_i p_i$ (3)
in formula (3), $p_i = [u_i, v_i, 1]^T$, $P_i = [x_i, y_i, z_i, 1]^T$, and $\lambda_i$ is a scale factor with $0 \le \lambda_i \le 1$; $C_{3\times4}^{-1}$ denotes the inverse matrix of the perspective projection matrix of the camera;
step 23, constructing a plane with $h_M$ and $h_N$ as its two axes and taking it as the zero plane; the $\Delta D_1$ surface and the $\Delta D_2$ surface are then two longitudinal surfaces perpendicular to the zero plane;
the $\Delta D_1$ surface and the $\Delta D_2$ surface each intersect the zero plane in a line, denoted the $\Delta D_1$ intersection line and the $\Delta D_2$ intersection line; let the distance between the $\Delta D_1$ intersection line and the $\Delta D_2$ intersection line be $\Delta\mathrm{Diff}_{12}$, the angle between them be $\theta$, the slope of the $\Delta D_1$ intersection line be $k_1$, and the slope of the $\Delta D_2$ intersection line be $k_2$;
step 24, obtaining the element $w_{oq}$ of the similarity matrix A through formula (4):
[formula (4) — equation image not reproduced]
in formula (4), $o = 1, 2, \ldots, n$ and $q = 1, 2, \ldots, n$; $\Delta D_{1IJ}$ denotes the value of $\Delta D_1$ when the height of track M is $h_{MI}$ and the height of track N is $h_{NJ}$; $\Delta D_{2IJ}$ denotes the value of $\Delta D_2$ when the height of track M is $h_{MI}$ and the height of track N is $h_{NJ}$; $h_{MI} = 1, 1.1, 1.2, \ldots, 3$, in m; $h_{NJ} = 0, 0.01, 0.02, \ldots, 3.99, 4$, in m;
step 25, repeating steps 22 to 24 until every pair of the n tracks has served as track M and track N, obtaining the n×n similarity matrix A, and executing step 26;
step 26, arranging the eigenvalues of the similarity matrix A in descending order, the sum of all eigenvalues being $S_n$;
the minimum value of k is selected such that

$S_k / S_n \geq \varepsilon$

the clustering number of the n motion tracks is then k, and step 27 is executed; wherein $S_k$ is the sum of the first k eigenvalues, k is a natural number greater than or equal to 1, and $\varepsilon$ is the ratio threshold of the sum of the first k eigenvalues to the sum of the n eigenvalues;
step 27, constructing an n×k-dimensional eigenvector space from the eigenvectors corresponding to the first k eigenvalues, clustering the n×k-dimensional eigenvector space by the k-means algorithm, and thereby clustering the n motion tracks into k classes of motion tracks.
2. The Spectral Clustering spatial trajectory-based Clustering method of claim 1, further comprising:
step 3, performing inter-class combination on the k classes of motion tracks obtained in the step 2 to obtain inter-class combined motion tracks;
the method comprises the following steps:
step 31, selecting two classes from the k classes of motion tracks clustered in step 2, recorded as $C_a$ and $C_b$; let the minimum back-projection velocity of track $p_a$ in class $C_a$ be $v_a$ and the minimum back-projection velocity of track $p_b$ in class $C_b$ be $v_b$, where $v_a \ge 0$ and $v_b \ge 0$;
step 32, if $v_a < v_b$, taking $C_a$ as the reference class, track $p_a$ as the feature point of the reference class, $C_b$ as the class to be merged, and track $p_b$ as the feature point of the class to be merged; if $v_b < v_a$, taking $C_b$ as the reference class, track $p_b$ as the feature point of the reference class, $C_a$ as the class to be merged, and track $p_a$ as the feature point of the class to be merged;
obtaining the height $H_p$ of the feature point of the class to be merged by formula (5):
[formula (5) — equation image not reproduced]
in formula (5), $v$ is the speed of the moving object, $v = \min(v_a, v_b)$; $v_p$ is the minimum back-projection speed of the class to be merged; $H_c$ is the height of the camera in the world coordinate system;
obtaining coordinate values of the feature points of the categories to be merged in a world coordinate system through an equation (6):
$P' = C_{3\times4}^{-1}\lambda_i p'$ (6)
in formula (6), $p' = [u'_i, v'_i, 1]^T$; $P' = [X'_i, Y'_i, Z'_i, 1]^T$; $u'_i, v'_i$ are the coordinate values of the feature point of the class to be merged in the image coordinate system; $X'_i, Y'_i, Z'_i$ are the coordinate values of the feature point of the class to be merged in the world coordinate system; $\lambda_i$ is a scale factor with $0 \le \lambda_i \le 1$; $C_{3\times4}^{-1}$ denotes the inverse matrix of the perspective projection matrix of the camera;
obtaining coordinate values of the feature points of the reference category in a world coordinate system by the following formula (7):
$P'' = C_{3\times4}^{-1}\lambda_i p''$ (7)
in formula (7), $p'' = [u''_i, v''_i, 1]^T$; $P'' = [X''_i, Y''_i, 0, 1]^T$; $u''_i, v''_i$ are the coordinate values of the feature point of the reference class in the image coordinate system; $X''_i, Y''_i, 0$ are the coordinate values of the feature point of the reference class in the world coordinate system; $\lambda_i$ is a scale factor with $0 \le \lambda_i \le 1$; $C_{3\times4}^{-1}$ denotes the inverse matrix of the perspective projection matrix of the camera;
step 33, obtaining absolute distances Δ X, Δ Y, Δ Z between the feature points of the to-be-merged category and the feature points of the reference category in the world coordinate system by using formula (8):
[formula (8) — equation image not reproduced]
step 34, if ΔX, ΔY and ΔZ each fall within their preset thresholds, merging the class to be merged and the reference class into one class; otherwise, executing step 35;
step 35, repeating steps 31 to 34 until all k classes of motion tracks have served as the reference class and the class to be merged, obtaining the inter-class merged motion tracks.
3. The Spectral Clustering spatial trajectory-based Clustering method according to claim 1 or 2, wherein the inverse matrix $C_{3\times4}^{-1}$ of the perspective projection matrix of the camera is obtained by formula (9):
$C_{3\times4}^{-1} = \{K[R_{3\times3} \mid t_{3\times1}]\}^{-1}$ (9)
in formula (9), K represents the intrinsic parameter matrix of the camera, $R_{3\times3}$ represents the rotation matrix between the camera coordinate system and the world coordinate system, and $t_{3\times1}$ represents the translation matrix between the camera coordinate system and the world coordinate system;
the camera coordinate system takes the optical center $O_C$ of the camera as its origin, with $X_C$ aligned with the u-axis direction of the image coordinate system, $Y_C$ aligned with the v-axis direction of the image coordinate system, and the $Z_C$ axis perpendicular to the plane formed by the image coordinate system; the intersection of the $Z_C$ axis with the image plane is called the principal point of the camera.
4. The Spectral Clustering spatial trajectory-based Clustering method according to claim 1 or 2, wherein the image coordinate system uses the upper left corner of each frame of image in the video image as an origin, the horizontal direction of each frame of image as a u-axis, and the vertical direction of each frame of image as a v-axis.
CN201710334850.9A 2017-05-12 2017-05-12 Clustering method based on Spectral Clustering space trajectory Active CN107315994B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710334850.9A CN107315994B (en) 2017-05-12 2017-05-12 Clustering method based on Spectral Clustering space trajectory

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710334850.9A CN107315994B (en) 2017-05-12 2017-05-12 Clustering method based on Spectral Clustering space trajectory

Publications (2)

Publication Number Publication Date
CN107315994A CN107315994A (en) 2017-11-03
CN107315994B true CN107315994B (en) 2020-08-18

Family

ID=60181424

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710334850.9A Active CN107315994B (en) 2017-05-12 2017-05-12 Clustering method based on Spectral Clustering space trajectory

Country Status (1)

Country Link
CN (1) CN107315994B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108922172B (en) * 2018-06-19 2021-03-05 上海理工大学 Road congestion monitoring system based on vehicle characteristic matrix sequence change analysis
CN109737976A (en) * 2019-01-07 2019-05-10 上海极奥网络科技有限公司 Map road section and lane line automatic Generation
CN110189363B (en) * 2019-05-30 2023-05-05 南京林业大学 Airport scene moving target low-visual-angle video speed measuring method
CN112634320A (en) * 2019-09-24 2021-04-09 成都通甲优博科技有限责任公司 Method and system for identifying object motion direction at intersection
CN111311010B (en) * 2020-02-22 2023-07-28 中国平安财产保险股份有限公司 Vehicle risk prediction method, device, electronic equipment and readable storage medium
CN112686941B (en) * 2020-12-24 2023-09-19 北京英泰智科技股份有限公司 Method and device for recognizing rationality of movement track of vehicle and electronic equipment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102855638A (en) * 2012-08-13 2013-01-02 苏州大学 Detection method for abnormal behavior of vehicle based on spectrum clustering
CN103456192A (en) * 2013-09-01 2013-12-18 中国民航大学 Terminal area prevailing traffic flow recognizing method based on track spectral clusters
CN104504897A (en) * 2014-09-28 2015-04-08 北京工业大学 Intersection traffic flow characteristic analysis and vehicle moving prediction method based on trajectory data

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130136298A1 (en) * 2011-11-29 2013-05-30 General Electric Company System and method for tracking and recognizing people

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102855638A (en) * 2012-08-13 2013-01-02 苏州大学 Detection method for abnormal behavior of vehicle based on spectrum clustering
CN103456192A (en) * 2013-09-01 2013-12-18 中国民航大学 Terminal area prevailing traffic flow recognizing method based on track spectral clusters
CN104504897A (en) * 2014-09-28 2015-04-08 北京工业大学 Intersection traffic flow characteristic analysis and vehicle moving prediction method based on trajectory data

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
"Similarity based vehicle trajectory clustering and anomaly detection";Zhouyu Fu et al.;《IEEE International Conference on Image Processing 2005》;20051114;第1-4页 *
"Traffic flow characteristic analysis at intersections from multi-layer spectral clustering of motion patterns using raw vehicle trajectory";Le Xin et al.;《2011 14th International IEEE Conference on Intelligent Transportation Systems (ITSC)》;20111007;第513-519页 *
"交通监控中运动目标轨迹的距离计算和聚类";李明之 等;《计算机工程与设计》;20120630;第33卷(第6期);第2417-2427页 *
"基于视频车辆运动轨迹场的交通事件检测方法";李倩丽 等;《电视技术》;20151231;第39卷(第13期);第50-52页 *

Also Published As

Publication number Publication date
CN107315994A (en) 2017-11-03

Similar Documents

Publication Publication Date Title
CN107315994B (en) Clustering method based on Spectral Clustering space trajectory
Gurghian et al. Deeplanes: End-to-end lane position estimation using deep neural networks
Sidla et al. Pedestrian detection and tracking for counting applications in crowded situations
Zhang et al. Joint human detection and head pose estimation via multistream networks for RGB-D videos
Zhu et al. Object tracking in structured environments for video surveillance applications
Pang et al. A novel method for resolving vehicle occlusion in a monocular traffic-image sequence
CN104899590B (en) A kind of unmanned plane sensation target follower method and system
Perrollaz et al. Probabilistic representation of the uncertainty of stereo-vision and application to obstacle detection
CN104378582A (en) Intelligent video analysis system and method based on PTZ video camera cruising
CN102447835A (en) Non-blind-area multi-target cooperative tracking method and system
CN104794737A (en) Depth-information-aided particle filter tracking method
CN106570471B (en) Dimension self-adaption multi-pose Face tracking based on compression track algorithm
CN113516853B (en) Multi-lane traffic flow detection method for complex monitoring scene
CN105740804A (en) Automatic vehicle tracking and driving method based on image processing
CN107480646A (en) A kind of Vehicular video abnormal motion detection method based on binocular vision
CN112862858A (en) Multi-target tracking method based on scene motion information
CN106327528A (en) Moving object tracking method and operation method of unmanned aerial vehicle
Afonso et al. Automatic estimation of multiple motion fields from video sequences using a region matching based approach
Cheng et al. Structure-aware network for lane marker extraction with dynamic vision sensor
Liu et al. Effective road lane detection and tracking method using line segment detector
CN114648557A (en) Multi-target cooperative tracking method based on high-altitude visual angle and ground visual angle
Wang et al. A new algorithm for robust pedestrian tracking based on manifold learning and feature selection
CN111260696A (en) Method for edge-end-oriented pedestrian tracking and accurate people counting
Cheng et al. Semantic segmentation for pedestrian detection from motion in temporal domain
Fangfang et al. Real-time lane detection for intelligent vehicles based on monocular vision

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20171103

Assignee: Jiangsu Baisheng Engineering Consulting Co.,Ltd.

Assignor: CHANG'AN University

Contract record no.: X2022980013572

Denomination of invention: Clustering Method Based on Spectral Clustering Spatial Trajectory

Granted publication date: 20200818

License type: Common License

Record date: 20220831