CN103886293A - Human body behavior recognition method based on history motion graph and R transformation - Google Patents

Human body behavior recognition method based on history motion graph and R transformation

Info

Publication number
CN103886293A
CN103886293A (application CN201410106957.4A)
Authority
CN
China
Prior art keywords
pixel
depth video
bounding rectangle
depth video clip
frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410106957.4A
Other languages
Chinese (zh)
Other versions
CN103886293B (en)
Inventor
肖俊
李潘
庄越挺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN201410106957.4A priority Critical patent/CN103886293B/en
Publication of CN103886293A publication Critical patent/CN103886293A/en
Application granted granted Critical
Publication of CN103886293B publication Critical patent/CN103886293B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a human body behavior recognition method based on the motion history image and the R transform. The method takes depth video as its recognition basis. First, the minimum enclosing rectangle of the human motion is computed with a foreground segmentation technique; motion history images are then extracted within the depth video region bounded by that rectangle, and a motion intensity constraint is applied to them to obtain a motion energy map; finally, the R transform of the energy map yields the feature vector used for behavior recognition. A support vector machine is used for both training and recognition. Preprocessing with the minimum enclosing rectangle of the human motion accelerates behavior feature extraction; the motion history image sequences reduce the influence of noise in the depth maps; and extracting features by applying the R transform to the energy map keeps the computation fast.

Description

Human body behavior recognition method based on the motion history image and the R transform
Technical field
The present invention relates to the fields of computer vision and image processing, and in particular to a human body behavior recognition method based on the motion history image and the R transform.
Background art
Video surveillance is a focus of current computer vision research and an important practical problem. Fields such as security and human-computer interaction continuously produce enormous amounts of video data, easily measured in gigabytes, and analyzing it purely by hand would consume a huge amount of labor. Video content is rich, yet most of the time only certain parts of it are of interest, such as human behavior; if these could be recognized automatically and efficiently, a large amount of manpower could be freed. Existing research on behavior recognition concentrates mainly on RGB video.
RGB video is the most common form of video: sources are abundant and years of research results exist. Current behavior recognition methods based on RGB video fall mainly into three categories: space-time approaches, sequential approaches and hierarchical approaches. After years of development, however, the research bottlenecks of RGB-based human behavior recognition have become increasingly apparent, because background interference is hard to remove when RGB video is used as the data source. More importantly, RGB video carries only two-dimensional plane information, and describing three-dimensional human behavior with two-dimensional information obviously loses many key cues.
With the progress of technology, inexpensive cameras equipped with depth sensors, such as Microsoft's Kinect, have appeared in recent years. The Kinect can capture depth information of acceptable quality alongside the ordinary RGB image, and its built-in skeleton-learning algorithm can recover the skeleton of a normal human body in a three-dimensional scene. Feature extraction on depth maps currently still borrows heavily from earlier experience with RGB features, and several public data sets have greatly eased research on depth-map features. Zicheng Liu et al. proposed a method based on three-dimensional contours (a bag of 3D words): the depth map is treated as three-dimensional data and projected onto three directions of Cartesian space to obtain projected contours, from which a fixed number of points are down-sampled as features and fed into an Action Graph model for recognition. Bingbing Ni independently collected a depth data set called RGBD-HuDaAct and was the first to apply the idea of 3D-MHIs to feature extraction from depth map sequences. Each of these methods has its own limitation: the bag-of-3D-words method achieves high recognition accuracy, but because it requires uniform sampling on the human contour it needs very clean depth data and cannot be used for behavior recognition in real scenes; applying 3D-MHIs directly is fast enough, but its recognition accuracy is insufficient; DMM-HOG is relatively effective against complex backgrounds while maintaining accuracy, but it is too time-consuming to achieve real-time human behavior recognition.
Summary of the invention
The present invention addresses the deficiencies of the prior art by proposing a human body behavior recognition method based on the motion history image and the R transform. The method uses depth video as its recognition basis, applies the concepts of the motion history image and the R transform to the behavior feature extraction process, and uses a support vector machine for the training and recognition stages of behavior recognition.
The method comprises an off-line training stage and an on-line recognition stage; the concrete steps are as follows:
Step (1). Off-line training stage
The objective of the off-line training stage is to obtain a human behavior recognition model; its steps are as follows:
Step 1-1. The depth video S to be trained is cut into multiple depth video clips of equal time length, and each clip is given a behavior label according to its behavior class, yielding the training set T for human behavior recognition.
The training set T is the set of depth video clips with their behavior labels;
The time length is the time length of the clip to be recognized as defined in the on-line recognition stage;
Step 1-2. A foreground segmentation technique is used to obtain the minimum enclosing rectangle of the human motion in each depth video clip, and the video content bounded by the minimum enclosing rectangle in the clip is scaled to a unified size.
The foreground segmentation technique operates as follows:
a) For a given depth video clip V of the training set T, consisting of depth frames {P_1, P_2, ..., P_i}, where i indexes the i-th depth frame: the pixels of any depth frame P_i are clustered into two classes by k-means according to the depth value at each pixel position, yielding a foreground pixel set and a background pixel set; the foreground pixels are those whose mean depth value is smaller than that of the background pixels.
b) On depth frame P_i, find a rectangle R_i that contains all foreground pixels obtained in step a). R_i is defined by R_i^left, R_i^right, R_i^up and R_i^down, the pixel coordinates of its left, right, upper and lower boundaries respectively. R_i is then split horizontally into two halves of equal width. If the left half of R_i contains more foreground pixels than the right half, and if after moving R_i^right left by K pixels (K is a constant that can be tuned to the application scene) the new rectangle still contains more than η% (50 < η < 100, tunable to the application scene) of the pixels of the original rectangle R_i, then R_i^right is shifted left by K pixels; as soon as a shift would leave the new rectangle with fewer than η% of the pixels of the original rectangle R_i, the right-boundary adjustment is complete. If the right half of R_i contains more foreground pixels than the left half, and if after moving R_i^left right by K pixels the new rectangle still contains more than η% of the pixels of the original rectangle R_i, then R_i^left is shifted right by K pixels; as soon as a shift would leave the new rectangle with fewer than η% of the pixels of the original rectangle R_i, the left-boundary adjustment is complete. If the numbers of pixels in the left and right halves of R_i differ by no more than ε (a threshold parameter), check whether the rectangle obtained by moving the left and right boundaries K/2 pixels each towards the centre still contains more than η% of all pixels of the original rectangle R_i; if so, shrink R_i by K/2 pixels on each of the left and right boundaries and repeat step b), until the pixels remaining in the new rectangle drop below η% of all pixels of the original rectangle R_i. The upper and lower boundaries of R_i are adjusted in the same way.
c) The depth video clip V is a volume described by three dimensions: the horizontal coordinate x, the vertical coordinate y and the time coordinate t. After the adjustment of step b), the foreground pixels of every frame P_i of V have been separated out, and their extent is described by R_i. The four boundaries of the minimum enclosing rectangle R of the human behavior in the depth video S, namely the upper boundary R_up, the lower boundary R_down, the left boundary R_left and the right boundary R_right, are computed according to formula (1):
R_up = min_i R_i^up,   R_down = max_i R_i^down,   R_left = min_i R_i^left,   R_right = max_i R_i^right        formula (1);
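As a reading aid (not part of the patent text), the following Python sketch illustrates the per-frame binary k-means foreground segmentation of step a) and the combination of the per-frame rectangles into the clip-level rectangle of formula (1); the iterative boundary shrinking governed by K and η is omitted, and the helper names `segment_frame` and `clip_bounding_box` are our own.

```python
import numpy as np
from sklearn.cluster import KMeans

def segment_frame(depth_frame):
    """Binary k-means segmentation of one depth frame (step a).

    Pixels are clustered by depth value; the cluster with the smaller mean
    depth is taken as foreground.  Returns the tight rectangle
    (left, right, up, down) around the foreground pixels, i.e. the initial
    R_i of step b) before any boundary adjustment.
    """
    h, w = depth_frame.shape
    depths = depth_frame.reshape(-1, 1).astype(np.float64)
    labels = KMeans(n_clusters=2, n_init=10).fit_predict(depths)
    fg_label = np.argmin([depths[labels == k].mean() for k in (0, 1)])
    fg_mask = (labels == fg_label).reshape(h, w)
    ys, xs = np.nonzero(fg_mask)
    return xs.min(), xs.max(), ys.min(), ys.max()

def clip_bounding_box(frames):
    """Minimum enclosing rectangle of a whole depth clip, per formula (1)."""
    boxes = np.array([segment_frame(f) for f in frames])
    return boxes[:, 0].min(), boxes[:, 1].max(), boxes[:, 2].min(), boxes[:, 3].max()
```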
Step 1-3. For a subsequence S_j of the depth video clip V that starts at moment j and has an arbitrary time window length τ, a motion history image MHI_τ^I can be obtained, computed as follows:
MHI_τ^I(x, y, t) = τ                                  if |I(x, y, t) − I(x, y, t−1)| > δ_I^th,
MHI_τ^I(x, y, t) = max(0, MHI_τ^I(x, y, t−1) − 1)     otherwise        formula (2);
where I(x, y, t) is the depth value captured at pixel (x, y) at moment t; t ranges over [j, j+τ−1]; δ_I^th is a constant threshold; and j and τ are natural numbers.
The present invention takes three time window lengths τ_s, τ_m and τ_l and obtains the corresponding motion history images, where s, m and l are natural numbers, m = 2s, l = 4s, and s is proportional to the time length of the depth video clip V.
After the processing of step 1-3 the depth video clip has been converted into motion history image sequences: for each window length o = s, m, l, the motion history images obtained above, extended along the time dimension, form a motion history image sequence of the clip V denoted MHIs_o.
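A minimal sketch of the motion history image recurrence of formula (2), assuming the clip is available as a (T, H, W) NumPy array of depth values; the function and argument names below are ours, not the patent's.

```python
import numpy as np

def motion_history_images(clip, tau, delta_th):
    """Motion history image recurrence of formula (2).

    clip: array of shape (T, H, W) holding the depth values I(x, y, t).
    A pixel whose depth change between consecutive frames exceeds delta_th
    is set to tau; otherwise its previous MHI value decays by 1 (floored at 0).
    Returns an array of shape (T, H, W): the MHI at every frame, i.e. the
    motion history image sequence for one window length.
    """
    clip = clip.astype(np.float32)
    mhi = np.zeros_like(clip)
    for t in range(1, clip.shape[0]):
        moved = np.abs(clip[t] - clip[t - 1]) > delta_th
        mhi[t] = np.where(moved, tau, np.maximum(mhi[t - 1] - 1.0, 0.0))
    return mhi
```

Running this once per window length τ_s, τ_m and τ_l yields the three sequences MHIs_s, MHIs_m and MHIs_l used in the following steps.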
Step 1-4. For any motion history image sequence MHIs_o obtained in step 1-3, let H_o(x, y, t) denote the intensity at pixel (x, y) of the t-th frame of MHIs_o. To exclude the interference of noise in the depth maps, a further intensity constraint is applied to MHIs_o: an energy map D_o is computed from MHIs_o, where the value D_o(x, y) at each position (x, y) is given by formula (3):
D_o(x, y) = Σ_{i=1..N−1} μ( |H_o(x, y, i+1) − H_o(x, y, i)| − ε )        formula (3);
where μ(θ) is the unit step function, equal to 1 when θ ≥ 0 and to 0 when θ < 0; ε is a threshold constant that can be tuned to the application scene; and N is the time length of the depth video clip V.
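Formula (3) simply counts, per pixel, how many consecutive-frame MHI changes reach the threshold ε. A direct vectorized sketch (our naming) is:

```python
import numpy as np

def energy_map(mhi_seq, eps):
    """Energy map D_o of formula (3).

    mhi_seq: array of shape (N, H, W), the motion history image sequence MHIs_o
    with H_o(x, y, t) stored at mhi_seq[t].  For every pixel, counts how many
    consecutive-frame intensity changes are at least eps (the unit step mu
    applied to |difference| - eps).
    """
    diffs = np.abs(np.diff(mhi_seq, axis=0))      # shape (N-1, H, W)
    return (diffs >= eps).sum(axis=0).astype(np.float32)
```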
Step 1-5. For each energy map D_o, compute its R transform to obtain the behavior feature of the depth video clip V, as follows:
First compute the Radon transform of the energy map D_o according to formula (4):
p_o(ρ, θ) = ∫∫ D_o(x, y) δ(x·cosθ + y·sinθ − ρ) dx dy        formula (4);
Then integrate its square over the whole range of ρ to obtain the R transform, formula (5):
R_o(θ) = ∫ p_o²(ρ, θ) dρ        formula (5);
To prevent scale effects, R_o(θ) is normalized; the normalized R transforms obtained for the three window lengths o = s, m, l are concatenated to form the behavior feature of the depth video clip V.
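The Radon and R-transform step can be sketched with scikit-image's `radon` function; the choice of library, the number of sampled angles and the max-normalization are our assumptions, since the patent only states that the R transform is computed and then normalized against scale.

```python
import numpy as np
from skimage.transform import radon

def r_transform_feature(energy, n_angles=180):
    """Normalized R transform of one energy map.

    Formula (4): Radon transform of D_o; formula (5): integral of its square
    over rho, leaving a 1-D function of theta; finally a max-normalization
    (one common choice) to remove scale effects.
    """
    thetas = np.linspace(0.0, 180.0, n_angles, endpoint=False)
    sinogram = radon(energy, theta=thetas, circle=False)   # rows: rho, columns: theta
    r = (sinogram ** 2).sum(axis=0)                        # integrate over rho
    return r / (r.max() + 1e-12)

def clip_feature(energy_maps):
    """Concatenate the normalized R transforms of the three energy maps
    (window lengths s, m, l) into the behavior feature of the clip."""
    return np.concatenate([r_transform_feature(e) for e in energy_maps])
```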
Step 1-6. From the behavior features of the depth video clips and the behavior labels obtained in step 1-1, a recognition model M is trained with a support vector machine.
Step (2). On-line recognition stage
The objective of the on-line recognition stage is to perform behavior recognition with the recognition model M obtained in the off-line training stage; its steps are as follows:
Step 2-1. The behavior feature of the video to be recognized is extracted with the same method as steps 1-1 to 1-6 of the off-line training stage.
The recognition granularity of the on-line recognition stage is kept consistent with that used during off-line training.
Step 2-2. Based on the behavior feature of the video to be recognized, the support vector machine with the trained model M is used to recognize the behavior in that video.
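Training (step 1-6) and recognition (step 2-2) use a standard support vector machine. A hedged sketch with scikit-learn follows; the kernel and regularization constant are illustrative choices, not values from the patent.

```python
from sklearn.svm import SVC

def train_model(features, labels):
    """Off-line stage (step 1-6): fit a multi-class SVM on clip features and behavior labels."""
    model = SVC(kernel="rbf", C=1.0)   # kernel and C are illustrative, not from the patent
    model.fit(features, labels)
    return model

def recognize(model, feature):
    """On-line stage (step 2-2): predict the behavior label of one clip feature."""
    return model.predict([feature])[0]
```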
Compared with traditional human behavior recognition methods, the proposed method has the following beneficial effects:
1. The preprocessing step of computing the minimum enclosing rectangle of the human motion, used during feature extraction in both the off-line training stage and the on-line recognition stage, accelerates behavior feature extraction and at the same time excludes interference from complex backgrounds.
2. The motion history image sequences preserve the key information of the human motion. Because depth maps naturally carry three-dimensional motion information, they describe the human body better than RGB-based behavior recognition does, so the retained key information is correspondingly more descriptive; the subsequent intensity constraint along the time dimension reduces the influence of noise in the depth maps.
3. The final feature extraction step applies the R transform to the energy map. The energy map fully captures the intensity and contour information of the motion and is a good refinement and description of the original behavior, while the R transform retains the advantage of fast computation; the method can therefore perform behavior recognition in real time while maintaining recognition accuracy.
Based on these three characteristics, the invention provides a fast and effective human behavior feature and a human behavior recognition method based on it.
Brief description of the drawings
Fig. 1 is the flow chart of the behavior feature extraction process of the method, in which panel (a) shows the concrete flow and panel (b) shows the image preview corresponding to panel (a);
Fig. 2 is the overall flow chart of the method.
Detailed description of the embodiments
The present invention is further described below with reference to the drawings and a specific embodiment.
As shown in Fig. 1 and Fig. 2, the present invention comprises an off-line training stage and an on-line recognition stage.
Step (1). Off-line training stage
The objective of the off-line training stage is to obtain a human behavior recognition model; its steps are as follows:
Step 1-1. The depth video S to be trained is cut into multiple depth video clips of equal time length, and each clip is given a behavior label according to its behavior class, yielding the training set T for human behavior recognition.
The time length is the time length of the clip to be recognized as defined in the on-line recognition stage;
Step 1-2. A foreground segmentation technique is used to obtain the minimum enclosing rectangle of the human motion in each depth video clip, and the video content bounded by the minimum enclosing rectangle in the clip is scaled to a unified size of 320x240.
The foreground segmentation technique is described as follows:
a) For a given depth video clip V of the training set T, consisting of depth frames {P_1, P_2, ..., P_i}, where i is a natural number: the pixels of any depth frame P_i are clustered into two classes by k-means according to the depth value at each pixel position, yielding two sets that contain the foreground pixels and the background pixels respectively; the foreground pixels are those whose mean depth value is smaller than that of the background pixels.
b) On depth frame P_i, find a rectangle R_i that contains all foreground pixels obtained in step a). R_i is defined by R_i^left, R_i^right, R_i^up and R_i^down, the pixel coordinates of its left, right, upper and lower boundaries respectively. R_i is then split horizontally into two halves of equal width. If the left half of R_i contains more foreground pixels than the right half, and if after moving R_i^right left by K pixels (K is a constant that can be tuned to the application scene) the new rectangle still contains more than 90% (a recommended value that can be tuned to the application scene) of the pixels of the original rectangle R_i, then R_i^right is shifted left by K pixels; as soon as a shift would leave the rectangle with fewer than 90% of the pixels of the original rectangle R_i, the right-boundary adjustment is complete. If the right half of R_i contains more foreground pixels than the left half, and if after moving R_i^left right by K pixels the new rectangle still contains more than 90% of the pixels of the original rectangle R_i, then R_i^left is shifted right by K pixels; as soon as a shift would leave the rectangle with fewer than 90% of the pixels of the original rectangle R_i, the left-boundary adjustment is complete. If the numbers of pixels in the left and right halves of R_i differ by no more than ε (a threshold parameter), check whether the rectangle obtained by moving the left and right boundaries K/2 pixels each towards the centre still contains more than 90% of all pixels of the original rectangle R_i; if so, shrink R_i by K/2 pixels on each of the left and right boundaries and repeat step b), until the pixels remaining in the new rectangle drop below 90% of all pixels of the original rectangle R_i. The upper and lower boundaries of R_i are adjusted in the same way.
c) The depth video clip V is a volume described by three dimensions: the horizontal coordinate x, the vertical coordinate y and the time coordinate t. After step b), the foreground pixels of every frame P_i of V have been separated out, and their extent is described by R_i. The four boundaries of the minimum enclosing rectangle R of the human behavior in the depth video S, namely the upper boundary R_up, the lower boundary R_down, the left boundary R_left and the right boundary R_right, are computed according to formula (1):
R_up = min_i R_i^up,   R_down = max_i R_i^down,   R_left = min_i R_i^left,   R_right = max_i R_i^right        formula (1);
Step 1-3. For a subsequence S_j of the depth video clip V that starts at moment j and has an arbitrary time window length τ, a motion history image MHI_τ^I can be obtained, computed as follows:
MHI_τ^I(x, y, t) = τ                                  if |I(x, y, t) − I(x, y, t−1)| > δ_I^th,
MHI_τ^I(x, y, t) = max(0, MHI_τ^I(x, y, t−1) − 1)     otherwise        formula (2);
where I(x, y, t) is the depth value captured at pixel (x, y) at moment t; t ranges over [j, j+τ−1]; δ_I^th is a constant threshold; and j and τ are natural numbers.
Starting from any moment t, the present invention takes the consecutive time window lengths τ_s = 4, τ_m = 8 and τ_l = 16 and obtains the corresponding motion history image sequences, where s, m and l are natural numbers, m = 2s, l = 4s, and s is proportional to the time length of the depth video clip V.
After the processing of step 1-3 the depth video clip has been converted into motion history image sequences: for each window length o = s, m, l, the motion history images obtained above, extended along the time dimension, form a motion history image sequence of the clip V denoted MHIs_o.
Step 1-4. For any motion history image sequence MHIs_o obtained in step 1-3, where o = s, m, l, let H_o(x, y, t) denote the intensity at pixel (x, y) of the t-th frame of MHIs_o. To exclude the interference of noise in the depth maps, a further intensity constraint is applied to MHIs_o: an energy map D_o is computed from MHIs_o, where the value D_o(x, y) at each position (x, y) is given by formula (3):
D_o(x, y) = Σ_{i=1..N−1} μ( |H_o(x, y, i+1) − H_o(x, y, i)| − ε )        formula (3);
where μ(θ) is the unit step function, equal to 1 when θ ≥ 0 and to 0 when θ < 0; ε is a threshold constant that can be tuned to the application scene; and N is the time length of the depth video clip V.
Step 1-5. For each energy map D_o, compute its R transform to obtain the behavior feature of the depth video clip V, as follows:
First compute the Radon transform of the energy map D_o according to formula (4):
p_o(ρ, θ) = ∫∫ D_o(x, y) δ(x·cosθ + y·sinθ − ρ) dx dy        formula (4);
Then integrate its square over the whole range of ρ to obtain the R transform, formula (5):
R_o(θ) = ∫ p_o²(ρ, θ) dρ        formula (5);
To prevent scale effects, R_o(θ) is normalized; the normalized R transforms obtained for the three window lengths are concatenated to form the behavior feature of the depth video clip V.
Step 1-6. From the behavior features of the depth video clips and the behavior labels obtained in step 1-1, a recognition model M is trained with a support vector machine.
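Tying the embodiment's concrete settings together (crop to the minimum enclosing rectangle, rescale to 320x240, window lengths 4, 8 and 16), a possible end-to-end feature extractor, reusing the helper sketches above and OpenCV's `cv2.resize` as an assumed tool, could look like this:

```python
import cv2
import numpy as np

def extract_clip_feature(clip, delta_th, eps):
    """Feature extraction for one depth clip, following steps 1-2 to 1-5 of the
    embodiment: crop to the clip's minimum enclosing rectangle, rescale each
    frame to 320x240, build MHIs for window lengths 4, 8 and 16, convert them
    to energy maps, and concatenate their normalized R transforms.
    delta_th and eps are the thresholds of formulas (2) and (3); their values
    are application dependent.
    """
    left, right, up, down = clip_bounding_box(clip)                    # step 1-2
    cropped = np.stack([cv2.resize(f[up:down + 1, left:right + 1].astype(np.float32),
                                   (320, 240)) for f in clip])
    energies = [energy_map(motion_history_images(cropped, tau, delta_th), eps)
                for tau in (4, 8, 16)]                                 # steps 1-3 and 1-4
    return clip_feature(energies)                                      # step 1-5
```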
Step (2). On-line recognition stage
The objective of the on-line recognition stage is to perform behavior recognition with the recognition model M obtained in the off-line training stage; its steps are as follows:
Step 2-1. The behavior feature of the video to be recognized is extracted with the same method as steps 1-1 to 1-5 of the off-line training stage.
The recognition granularity of the on-line recognition stage is kept consistent with that used during off-line training.
Step 2-2. Based on the behavior feature of the video to be recognized, the support vector machine with the trained model M is used to recognize the behavior in that video.
The above embodiment does not limit the present invention, and the present invention is not restricted to it; anything that satisfies the requirements of the present invention falls within its protection scope.

Claims (1)

1. A human body behavior recognition method based on the motion history image and the R transform, characterized in that the method comprises an off-line training stage and an on-line recognition stage, with the following concrete steps:
Step (1). Off-line training stage:
Step 1-1. The depth video S to be trained is cut into multiple depth video clips of equal time length, and each clip is given a behavior label according to its behavior class, yielding the training set T for human behavior recognition;
The training set T is the set of depth video clips with their behavior labels;
Step 1-2. A foreground segmentation technique is used to obtain the minimum enclosing rectangle of the human motion in each depth video clip, and the video content bounded by the minimum enclosing rectangle in the clip is scaled to a unified size;
The foreground segmentation technique operates as follows:
a) For a given depth video clip V of the training set T, consisting of depth frames {P_1, P_2, ..., P_i}, where i indexes the i-th depth frame: the pixels of any depth frame P_i are clustered into two classes by k-means according to the depth value at each pixel position, yielding a foreground pixel set and a background pixel set; the foreground pixels are those whose mean depth value is smaller than that of the background pixels;
b) On depth frame P_i, find a rectangle R_i that contains all foreground pixels obtained in step a); R_i is defined by R_i^left, R_i^right, R_i^up and R_i^down, the pixel coordinates of its left, right, upper and lower boundaries respectively; R_i is then split horizontally into two halves of equal width; if the left half of R_i contains more foreground pixels than the right half, and if after moving R_i^right left by K pixels the new rectangle still contains more than η% of the pixels of the original rectangle R_i, where K is a constant and 50 < η < 100, then R_i^right is shifted left by K pixels; as soon as a shift would leave the new rectangle with fewer than η% of the pixels of the original rectangle R_i, the right-boundary adjustment is complete; if the right half of R_i contains more foreground pixels than the left half, and if after moving R_i^left right by K pixels the new rectangle still contains more than η% of the pixels of the original rectangle R_i, then R_i^left is shifted right by K pixels; as soon as a shift would leave the new rectangle with fewer than η% of the pixels of the original rectangle R_i, the left-boundary adjustment is complete; if the numbers of pixels in the left and right halves of R_i differ by no more than ε, where ε is a threshold parameter, check whether the rectangle obtained by moving the left and right boundaries K/2 pixels each towards the centre still contains more than η% of all pixels of the original rectangle R_i; if so, shrink R_i by K/2 pixels on each of the left and right boundaries and repeat step b), until the pixels remaining in the new rectangle drop below η% of all pixels of the original rectangle R_i; the upper and lower boundaries of R_i are adjusted in the same way;
c) The depth video clip V is a volume described by three dimensions: the horizontal coordinate x, the vertical coordinate y and the time coordinate t; after the adjustment of step b), the foreground pixels of every frame P_i of V have been separated out, and their extent is described by R_i; the four boundaries of the minimum enclosing rectangle R of the human behavior in the depth video S, namely the upper boundary R_up, the lower boundary R_down, the left boundary R_left and the right boundary R_right, are computed according to formula (1):
R_up = min_i R_i^up,   R_down = max_i R_i^down,   R_left = min_i R_i^left,   R_right = max_i R_i^right        formula (1);
Step 1-3. For a subsequence S_j of the depth video clip V that starts at moment j and has an arbitrary time window length τ, a motion history image MHI_τ^I can be obtained, computed as follows:
MHI_τ^I(x, y, t) = τ                                  if |I(x, y, t) − I(x, y, t−1)| > δ_I^th,
MHI_τ^I(x, y, t) = max(0, MHI_τ^I(x, y, t−1) − 1)     otherwise        formula (2);
where I(x, y, t) is the depth value captured at pixel (x, y) at moment t; t ranges over [j, j+τ−1]; δ_I^th is a constant threshold; and j and τ are natural numbers;
Three time window lengths τ_s, τ_m and τ_l are taken and the corresponding motion history images are obtained, where s, m and l are natural numbers, m = 2s, l = 4s, and s is proportional to the time length of the depth video clip V;
After the processing of step 1-3 the depth video clip has been converted into motion history image sequences: for each window length o = s, m, l, the motion history images obtained above, extended along the time dimension, form a motion history image sequence of the clip V denoted MHIs_o;
Step 1-4. For any motion history image sequence MHIs_o obtained in step 1-3, let H_o(x, y, t) denote the intensity at pixel (x, y) of the t-th frame of MHIs_o; an energy map D_o is computed from MHIs_o, where the value D_o(x, y) at each position (x, y) is given by formula (3):
D_o(x, y) = Σ_{i=1..N−1} μ( |H_o(x, y, i+1) − H_o(x, y, i)| − ε )        formula (3);
where μ(θ) is the unit step function, equal to 1 when θ ≥ 0 and to 0 when θ < 0; ε is a threshold constant; and N is the time length of the depth video clip V;
Step 1-5. For each energy map D_o, compute its R transform to obtain the behavior feature of the depth video clip V, as follows:
First compute the Radon transform of the energy map D_o according to formula (4):
p_o(ρ, θ) = ∫∫ D_o(x, y) δ(x·cosθ + y·sinθ − ρ) dx dy        formula (4);
Then integrate its square over the whole range of ρ to obtain the R transform, formula (5):
R_o(θ) = ∫ p_o²(ρ, θ) dρ        formula (5);
R_o(θ) is normalized, and the normalized R transforms obtained for the three window lengths are concatenated to form the behavior feature of the depth video clip V;
Step 1-6. From the behavior features of the depth video clips and the behavior labels obtained in step 1-1, a recognition model M is trained with a support vector machine;
Step (2). On-line recognition stage:
Step 2-1. The behavior feature of the video to be recognized is extracted with the same method as steps 1-1 to 1-6 of the off-line training stage;
The recognition granularity of the on-line recognition stage is kept consistent with that used during off-line training;
Step 2-2. Based on the behavior feature of the video to be recognized, the support vector machine with the trained model M is used to recognize the behavior in that video.
CN201410106957.4A 2014-03-21 2014-03-21 Human body behavior recognition method based on history motion graph and R transformation Active CN103886293B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410106957.4A CN103886293B (en) 2014-03-21 2014-03-21 Human body behavior recognition method based on history motion graph and R transformation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410106957.4A CN103886293B (en) 2014-03-21 2014-03-21 Human body behavior recognition method based on history motion graph and R transformation

Publications (2)

Publication Number Publication Date
CN103886293A true CN103886293A (en) 2014-06-25
CN103886293B CN103886293B (en) 2017-04-26

Family

ID=50955176

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410106957.4A Active CN103886293B (en) 2014-03-21 2014-03-21 Human body behavior recognition method based on history motion graph and R transformation

Country Status (1)

Country Link
CN (1) CN103886293B (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070122009A1 (en) * 2005-11-26 2007-05-31 Hyung Keun Jee Face recognition method and apparatus
US20110087677A1 (en) * 2008-04-30 2011-04-14 Panasonic Corporation Apparatus for displaying result of analogous image retrieval and method for displaying result of analogous image retrieval
CN102043967A (en) * 2010-12-08 2011-05-04 中国科学院自动化研究所 Effective modeling and identification method of moving object behaviors
CN103544466A (en) * 2012-07-09 2014-01-29 西安秦码软件科技有限公司 Vector field model based behavior analysis method
CN103295016A (en) * 2013-06-26 2013-09-11 天津理工大学 Behavior recognition method based on depth and RGB information and multi-scale and multidirectional rank and level characteristics
CN103577841A (en) * 2013-11-11 2014-02-12 浙江大学 Human body behavior identification method adopting non-supervision multiple-view feature selection

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Yin Yong et al., "Abnormal human behavior recognition in quasi-periodic motion", Computer Engineering and Applications (《计算机工程与应用》) *
Ouyang Han et al., "Human behavior recognition based on a hierarchical model of the normalized R transform", Computer Engineering and Design (《计算机工程与设计》) *
Zhao Haiyong et al., "Moving human behavior recognition based on multi-feature fusion", Application Research of Computers (《计算机应用研究》) *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106204633A (en) * 2016-06-22 2016-12-07 广州市保伦电子有限公司 A kind of student trace method and apparatus based on computer vision
CN106204633B (en) * 2016-06-22 2020-02-07 广州市保伦电子有限公司 Student tracking method and device based on computer vision
CN106778576A (en) * 2016-12-06 2017-05-31 中山大学 A kind of action identification method based on SEHM feature graphic sequences
CN106778576B (en) * 2016-12-06 2020-05-26 中山大学 Motion recognition method based on SEHM characteristic diagram sequence

Also Published As

Publication number Publication date
CN103886293B (en) 2017-04-26

Similar Documents

Publication Publication Date Title
Wu et al. Helmet detection based on improved YOLO V3 deep model
Johnson-Roberson et al. Driving in the matrix: Can virtual worlds replace human-generated annotations for real world tasks?
CN107679502A (en) A kind of Population size estimation method based on the segmentation of deep learning image, semantic
CN106845415B (en) Pedestrian fine identification method and device based on deep learning
CN108245384B (en) Binocular vision apparatus for guiding blind based on enhancing study
CN103208123B (en) Image partition method and system
CN107506722A (en) One kind is based on depth sparse convolution neutral net face emotion identification method
CN110852182B (en) Depth video human body behavior recognition method based on three-dimensional space time sequence modeling
CN102880865B (en) Dynamic gesture recognition method based on complexion and morphological characteristics
CN102968643B (en) A kind of multi-modal emotion identification method based on the theory of Lie groups
CN106296653A (en) Brain CT image hemorrhagic areas dividing method based on semi-supervised learning and system
CN103310194A (en) Method for detecting head and shoulders of pedestrian in video based on overhead pixel gradient direction
CN105096311A (en) Technology for restoring depth image and combining virtual and real scenes based on GPU (Graphic Processing Unit)
CN103020614B (en) Based on the human motion identification method that space-time interest points detects
CN106056631A (en) Pedestrian detection method based on motion region
CN103198330B (en) Real-time human face attitude estimation method based on deep video stream
CN106960181A A kind of pedestrian's attribute recognition approach based on RGBD data
CN102567716A (en) Face synthetic system and implementation method
CN105095857A (en) Face data enhancement method based on key point disturbance technology
CN105956552A (en) Face black list monitoring method
CN104200505A (en) Cartoon-type animation generation method for human face video image
CN110533026A (en) The competing image digitization of electricity based on computer vision and icon information acquisition methods
CN105069745A (en) face-changing system based on common image sensor and enhanced augmented reality technology and method
CN107609509A (en) A kind of action identification method based on motion salient region detection
CN107944437A (en) A kind of Face detection method based on neutral net and integral image

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20140625

Assignee: CCI (CHINA) Co.,Ltd.

Assignor: ZHEJIANG University

Contract record no.: X2021980001760

Denomination of invention: A human behavior recognition method based on motion history map and r-transform

Granted publication date: 20170426

License type: Common License

Record date: 20210316

EE01 Entry into force of recordation of patent licensing contract