CN107330390B - People counting method based on image analysis and deep learning - Google Patents


Info

Publication number
CN107330390B
CN107330390B (application CN201710492597.XA)
Authority
CN
China
Prior art keywords
image
head
window
shoulder
hog
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710492597.XA
Other languages
Chinese (zh)
Other versions
CN107330390A (en)
Inventor
黄建华
俞启尧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Nuclear Furstate Software Technology Co ltd
Original Assignee
Shanghai Nuclear Furstate Software Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Nuclear Furstate Software Technology Co ltd filed Critical Shanghai Nuclear Furstate Software Technology Co ltd
Priority to CN201710492597.XA priority Critical patent/CN107330390B/en
Publication of CN107330390A publication Critical patent/CN107330390A/en
Application granted granted Critical
Publication of CN107330390B publication Critical patent/CN107330390B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V20/53 Recognition of crowd images, e.g. recognition of crowd congestion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06M COUNTING MECHANISMS; COUNTING OF OBJECTS NOT OTHERWISE PROVIDED FOR
    • G06M11/00 Counting of objects distributed at random, e.g. on a surface
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a people counting method based on image analysis and deep learning, which comprises the following steps: A. performing pyramid model calculation on an input image to generate images at multiple resolutions and sizes; B. sliding a window over each layer of the pyramid, computing the HOG feature of the window area, classifying it with a linear SVM classifier, and judging whether the window is a head-shoulder region; C. for each head-shoulder region given by step B, extracting the corresponding image, normalizing it to a set common size, and inputting it into a deep neural network to obtain a classification output; D. performing non-maximum suppression on all head-shoulder windows output by step C to merge detections that overlap across adjacent positions and scales. The invention overcomes the defects of the prior art and achieves high people-counting performance at high speed.

Description

People counting method based on image analysis and deep learning
Technical Field
The invention relates to the technical field of image target recognition and deep learning, in particular to a people counting method based on image analysis and deep learning.
Background
Using computer vision to count people in surveillance images or videos can be widely applied in scenarios such as stampede early warning, traffic dispersion, shop pedestrian-flow assessment and attendance counting. However, existing people counting systems have large errors in crowded environments. This is because there is usually a great deal of occlusion in a crowded environment, so the features of the body below the shoulders can hardly be used reliably and efficiently. On the other hand, if only head-shoulder features are extracted and located, the relatively simple head-shoulder contour means that the features produced by traditional hand-designed extractors such as HOG, LBP and HAAR are easily confused with features of other body parts or of background textures, resulting in a large number of false detections, as described in "Histograms of oriented gradients for human detection" (N. Dalal and B. Triggs, IEEE Conference on Computer Vision and Pattern Recognition, 2005). Feature extraction based on deep learning, as described in "Rich feature hierarchies for accurate object detection and semantic segmentation" (R. Girshick, J. Donahue, T. Darrell, et al., CVPR, 2014) and "Faster R-CNN: Towards real-time object detection with region proposal networks" (S. Ren, K. He, R. Girshick, et al., NIPS, 2015), has surpassed hand-crafted features in many image analysis areas. However, because of its large amount of computation and low speed, it has not been widely applied to monitoring scenarios with high real-time requirements.
Disclosure of Invention
The invention aims to provide a people counting method based on image analysis and deep learning, which overcomes the defects of the prior art and achieves high people-counting performance at high speed.
In order to solve the technical problems, the technical scheme adopted by the invention is as follows.
A people counting method based on image analysis and deep learning comprises the following steps:
A. performing pyramid model calculation on an input image to generate images with a plurality of resolutions and sizes;
B. performing window sliding on each layer of the pyramid, calculating the HOG characteristic value of a window area, classifying through a linear SVM classifier, and judging whether the window is a head-shoulder area or not;
C. for each head and shoulder area given in the step B, extracting a corresponding image, normalizing to a set same size, and inputting the image into a deep neural network to obtain classified output;
D. performing non-maximum suppression on all head-shoulder windows output by step C to merge detections that overlap across adjacent positions and scales.
Preferably, in step A, the original image is Gaussian-smoothed and an image with the resolution reduced by 10% is generated; this process is repeated on the newly generated low-resolution image until a pyramid model with the given number of layers is generated.
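As an illustrative sketch of this pyramid construction (assuming OpenCV; the smoothing kernel, sigma and number of layers below are arbitrary choices, since the text only fixes the 10% reduction per level):

```python
import cv2

def build_pyramid(image, num_levels=8, scale=0.9):
    """Gaussian-smooth, then shrink resolution by 10%, repeated per pyramid level."""
    levels = [image]
    current = image
    for _ in range(num_levels - 1):
        smoothed = cv2.GaussianBlur(current, (5, 5), sigmaX=1.0)
        h, w = smoothed.shape[:2]
        current = cv2.resize(smoothed, (int(w * scale), int(h * scale)),
                             interpolation=cv2.INTER_AREA)
        levels.append(current)
    return levels
```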
Preferably, in step B,
target detection is carried out on each layer of the pyramid; in the detection process, a window with a fixed size of W×H slides over the image space, the HOG feature is computed on the image area under the window and is input into a linear SVM classifier, and a judgment of whether the window is a head-shoulder target is obtained; in the HOG computation, the horizontal and vertical gradients of each pixel point (x, y) are respectively
Gx(x,y)=I(x+1,y)-I(x-1,y)
Gy(x,y)=I(x,y+1)-I(x,y-1)
In the formula, I (x, y) represents the pixel value at (x, y), and the gradient amplitude and direction of the pixel point (x, y) are respectively
G(x,y) = sqrt(Gx(x,y)² + Gy(x,y)²)
α(x,y) = tan⁻¹(Gy(x,y)/Gx(x,y))
The calculation of HOG divides the window into many cells; each cell is 4×4 pixels and there is no overlap between cells. For each cell, the corresponding feature is generated by the formula
Ho(m,n) = Σ(4m≤x<4m+4, 4n≤y<4n+4) G(x,y)·Δo(x,y)/Z
Δo(x,y) = 1 if the gradient direction α(x,y) falls into orientation bin o, and 0 otherwise
where Ho(m,n) is the feature value of cell (m,n) for gradient direction bin o (0 ≤ o < 9) and Z is a normalization parameter. HOG then computes the feature of each block and concatenates them; here each block comprises 2×2 adjacent cells, and blocks may overlap. The feature of each block consists of the normalized 9-bin gradient direction histograms of its cells, forming a 36-dimensional feature; the features of all blocks together form the HOG feature, of dimension 36×(W/4−1)×(H/4−1).
Preferably, in step C, for each head and shoulder region obtained in step B, the intra-region image is extracted, enlarged or reduced to a size of 48 × 48, and sent to the deep neural network to obtain a judgment whether the intra-region image is a head and shoulder target.
Preferably, the deep neural network comprises 3 sets of convolutional layers and sampling layers, 2 fully-connected layers and 1 output layer.
Preferably, in the 3D convolution operation of a convolutional layer, for each pixel (x, y) of each output channel On of the layer,
On(x,y) = Σ(m=1..M) αm,n · Σ(i,j) Hm,n(i,j) · Im(x+i, y+j)
where Im is an input channel, M is the number of input channels, Hm,n is a two-dimensional 5×5 filter and αm,n is the channel weight; Hm,n and αm,n together form a 3-dimensional filter. By applying PCA to the 5×5 neighborhoods of Im, Im can be represented as a weighted sum of several principal components,
Im(x+i, y+j) ≈ Σ(k=1..K) βm,k(x,y) · Uk(i,j)
where (i, j) ranges over the 5×5 neighborhood of (x, y), βm,k(x,y) is the k-th PCA projection coefficient, Uk is the k-th PCA principal component and K is the number of principal components. Then
On(x,y) ≈ Σ(m=1..M) Σ(k=1..K) βm,k(x,y) · wk,m,n
where
wk,m,n = αm,n · Σ(i,j) Hm,n(i,j) · Uk(i,j)
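The decomposition above can be checked numerically. The NumPy sketch below (one output channel) assumes a single PCA basis shared by all input channels and estimated from the input patches themselves; with K equal to the full patch dimension the two computation paths agree exactly, while the K = 6 used later trades a small approximation error for speed:

```python
import numpy as np

def conv_valid(img, filt):
    """Plain 2-D 'valid' correlation with a k x k filter."""
    k = filt.shape[0]
    out = np.zeros((img.shape[0] - k + 1, img.shape[1] - k + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(img[y:y + k, x:x + k] * filt)
    return out

M, k = 3, 5                         # input channels, filter size
K = k * k                           # all 25 components -> exact check; the patent later uses K = 6
rng = np.random.default_rng(0)
I = rng.normal(size=(M, 32, 32))    # input channels I_m
Hf = rng.normal(size=(M, k, k))     # 2-D filters H_{m,n} for one output channel n
alpha = rng.normal(size=M)          # channel weights alpha_{m,n}

# PCA basis of 5x5 patches (assumption: one basis shared across channels)
patches = np.stack([I[m, y:y + k, x:x + k].ravel()
                    for m in range(M) for y in range(0, 28, 2) for x in range(0, 28, 2)])
U = np.linalg.svd(patches, full_matrices=False)[2][:K]          # K x 25, orthonormal rows

# Standard path: O_n = sum_m alpha_{m,n} (I_m convolved with H_{m,n})
O_std = sum(alpha[m] * conv_valid(I[m], Hf[m]) for m in range(M))

# Decomposed path: K small convolutions give beta_{m,k}(x,y),
# then a 1x1 combination with w_{k,m,n} = alpha_{m,n} <H_{m,n}, U_k>
w = np.einsum('m,mij,kij->km', alpha, Hf, U.reshape(K, k, k))
beta = np.stack([[conv_valid(I[m], u.reshape(k, k)) for u in U] for m in range(M)])
O_dec = np.einsum('km,mkyx->yx', w, beta)

print(np.abs(O_std - O_dec).max())  # ~1e-12: both paths give the same output channel
```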
adopt the beneficial effect that above-mentioned technical scheme brought to lie in: the deep neural network designed by the invention adopts a brand-new structure, greatly reduces model parameters and improves the operation speed. Different from the general target detector based on deep learning, the invention abandons the selective search, RPN (region protocol network) waiting for selecting the region extraction method commonly adopted by deep learning, and adopts the output of the traditional HOG detector as a candidate region, thereby having certain superiority for the head and shoulder scene and the small target scene in the crowded environment. And the artificial field depth calibration is carried out on the scene, so that the scale space search range of the HOG detection is greatly reduced. The invention has good universality and good detection performance for both crowded environment and uncongested environment; because a simpler deep neural network, PCA decomposition acceleration, HOG pre-screening mechanism and artificial depth of field calibration are adopted, the speed is higher.
Drawings
FIG. 1 is a flow chart of an embodiment of the present invention.
FIG. 2 is a block diagram of a deep neural network in accordance with one embodiment of the present invention.
Fig. 3 is a schematic view of an artificial depth of field calibration according to an embodiment of the present invention.
Detailed Description
Referring to fig. 1, one embodiment of the present invention includes the steps of:
A. performing pyramid model calculation on an input image to generate images with a plurality of resolutions and sizes;
B. performing window sliding on each layer of the pyramid, calculating the HOG characteristic value of a window area, classifying through a linear SVM classifier, and judging whether the window is a head-shoulder area or not;
C. for each head and shoulder area given in the step B, extracting a corresponding image, normalizing to a set same size, and inputting the image into a deep neural network to obtain classified output;
D. performing non-maximum suppression on all head-shoulder windows output by step C to merge detections that overlap across adjacent positions and scales (a sketch of this suppression step follows this list).
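A minimal sketch of the non-maximum suppression in step D; the (x1, y1, x2, y2) box format, the per-box confidence score from the deep network, and the 0.3 overlap threshold are assumptions made for illustration:

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.3):
    """Keep the highest-scoring box, drop neighbours that overlap it too much, repeat."""
    boxes, scores = np.asarray(boxes, float), np.asarray(scores, float)
    order = scores.argsort()[::-1]
    keep = []
    while order.size:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        xx1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        yy1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        xx2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        yy2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.maximum(0, xx2 - xx1) * np.maximum(0, yy2 - yy1)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter)
        order = rest[iou <= iou_thresh]
    return keep                       # the person count is len(keep)
```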
In step A, Gaussian smoothing is carried out on the original image, an image with the resolution reduced by 10% is generated, and the process is repeated on the newly generated low-resolution image until a pyramid model with a given layer number is generated.
In step B,
target detection is carried out on each layer of the pyramid; in the detection process, a window with a fixed size of W×H slides over the image space, the HOG feature is computed on the image area under the window and is input into a linear SVM classifier, and a judgment of whether the window is a head-shoulder target is obtained; in the HOG computation, the horizontal and vertical gradients of each pixel point (x, y) are respectively
Gx(x,y)=I(x+1,y)-I(x-1,y)
Gy(x,y)=I(x,y+1)-I(x,y-1)
In the formula, I (x, y) represents the pixel value at (x, y), and the gradient amplitude and direction of the pixel point (x, y) are respectively
G(x,y) = sqrt(Gx(x,y)² + Gy(x,y)²)
α(x,y) = tan⁻¹(Gy(x,y)/Gx(x,y))
The calculation of HOG divides the window into many cells; each cell is 4×4 pixels and there is no overlap between cells. For each cell, the corresponding feature is generated by the formula
Ho(m,n) = Σ(4m≤x<4m+4, 4n≤y<4n+4) G(x,y)·Δo(x,y)/Z
Δo(x,y) = 1 if the gradient direction α(x,y) falls into orientation bin o, and 0 otherwise
where Ho(m,n) is the feature value of cell (m,n) for gradient direction bin o (0 ≤ o < 9) and Z is a normalization parameter. HOG then computes the feature of each block and concatenates them; here each block comprises 2×2 adjacent cells, and blocks may overlap. The feature of each block consists of the normalized 9-bin gradient direction histograms of its cells, forming a 36-dimensional feature; the features of all blocks together form the HOG feature, of dimension 36×(W/4−1)×(H/4−1).
In step C, for each head-shoulder region obtained in step B, the image inside the region is extracted, enlarged or reduced to a size of 48×48, and fed into the deep neural network to judge whether it is a head-shoulder target.
The deep neural network comprises 3 groups of convolutional layers and sampling layers, 2 full-connection layers and 1 output layer.
In the 3D convolution operation of a convolutional layer, for each pixel (x, y) of each output channel On of the layer,
On(x,y) = Σ(m=1..M) αm,n · Σ(i,j) Hm,n(i,j) · Im(x+i, y+j)
where Im is an input channel, M is the number of input channels, Hm,n is a two-dimensional 5×5 filter and αm,n is the channel weight; Hm,n and αm,n together form a 3-dimensional filter. By applying PCA to the 5×5 neighborhoods of Im, Im can be represented as a weighted sum of several principal components,
Im(x+i, y+j) ≈ Σ(k=1..K) βm,k(x,y) · Uk(i,j)
where (i, j) ranges over the 5×5 neighborhood of (x, y), βm,k(x,y) is the k-th PCA projection coefficient, Uk is the k-th PCA principal component and K is the number of principal components. Then
On(x,y) ≈ Σ(m=1..M) Σ(k=1..K) βm,k(x,y) · wk,m,n
where
wk,m,n = αm,n · Σ(i,j) Hm,n(i,j) · Uk(i,j)
referring to fig. 2, wherein C1, C3, C5 and C7 are convolutional layers, S2, S4 and S6 are sampling layers (including nonlinear activation operation), and F8 and F9 are full-link layers. All convolutional layers use a filter length of 5. The filling length of the C1 and C3 layers is 2, the filling length of the C5 layer is 1, and the C7 layer is not filled. The number of nodes of F8 is 128. The number of nodes of F9 is 2.
To further increase its speed, we decompose the 3D convolution in each convolutional layer (C1, C3, C5, C7) into a number of 2-dimensional convolutions followed by a 1×1 convolution.
Standard convolution operation: with M input channels, N output channels and 5×5 filters, 5×5×M×N multiplications are performed for each pixel position.
PCA projection: for each pixel position of each input channel, the 5×5 neighborhood is projected onto 6 principal component directions, which is equivalent to six 5×5 convolutions, so 5×5×6×M multiplications are performed for each pixel position.
1×1 convolution: for each pixel position, the resulting 6M-dimensional vector is combined by weighted summation, which is equivalent to a standard 1×1 convolution, so 6×M×N multiplications are performed for each pixel position.
Referring to fig. 3, a head near the camera and a head far from the camera are selected, and two square boxes are drawn according to their sizes. The head size at any position in the scene can then be obtained by linear interpolation from the sizes and vertical positions of these two boxes. This calibration and estimation method assumes that all people stand on the same principal plane and that the horizontal direction of the camera image is parallel to that plane in the scene; this precondition can usually be met. Estimating the head size at any position of the scene greatly reduces the scale-space search range and speeds up the image analysis.
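A sketch of that interpolation; the calibration squares are assumed to be given as (vertical center, side length) pairs, and the example coordinates are illustrative:

```python
def head_size_at(y, near_box, far_box):
    """Expected head size (pixels) at image row y, interpolated between the two calibration squares."""
    (y_near, s_near), (y_far, s_far) = near_box, far_box
    t = (y - y_far) / (y_near - y_far)
    return s_far + t * (s_near - s_far)

# e.g. a 64 px head drawn at row 600 (near) and a 24 px head at row 150 (far)
print(head_size_at(375, (600, 64), (150, 24)))   # -> 44.0 px expected head size mid-scene
```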
The embodiment has been verified in the classroom monitoring systems of several universities, achieving an average people-counting accuracy of over 89%.
The foregoing shows and describes the general principles, main features and advantages of the present invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiment described above; the embodiment and the description merely illustrate the principle of the invention, and various changes and modifications may be made without departing from the spirit and scope of the invention, all of which fall within the scope of the claimed invention. The scope of the invention is defined by the appended claims and their equivalents.

Claims (1)

1. A people counting method based on image analysis and deep learning is characterized by comprising the following steps:
A. performing pyramid model calculation on the input image to generate images with multiple resolutions and sizes, specifically including,
performing Gaussian smoothing on an original image, generating an image with resolution reduced by 10%, and repeating the process on a newly generated low-resolution image until a pyramid model with a given layer number is generated;
B. performing window sliding on each layer of the pyramid, calculating HOG characteristic value of a window region, classifying by a linear SVM classifier, judging whether the window is a head-shoulder region, specifically comprising,
carrying out target detection on each layer of the pyramid; in the detection process, a window with a fixed size of W×H slides over the image space, the HOG feature is computed on the image area under the window and is input into a linear SVM classifier, and a judgment of whether the window is a head-shoulder target is obtained; in the HOG computation, the horizontal and vertical gradients of each pixel point (x, y) are respectively
Gx(x,y)=I(x+1,y)-I(x-1,y)
Gy(x,y)=I(x,y+1)-I(x,y-1)
In the formula, I (x, y) represents the pixel value at (x, y), and the gradient amplitude and direction of the pixel point (x, y) are respectively
G(x,y) = sqrt(Gx(x,y)² + Gy(x,y)²)
α(x,y) = tan⁻¹(Gy(x,y)/Gx(x,y))
The calculation of HOG divides the window into many cells; each cell is 4×4 pixels and there is no overlap between cells; for each cell, the corresponding feature is generated by the formula
Ho(m,n) = Σ(4m≤x<4m+4, 4n≤y<4n+4) G(x,y)·Δo(x,y)/Z
Δo(x,y) = 1 if the gradient direction α(x,y) falls into orientation bin o, and 0 otherwise
wherein Ho(m,n) is the feature value of cell (m,n) for gradient direction bin o (0 ≤ o < 9) and Z is a normalization parameter; HOG computes the feature of each block and concatenates them; here each block comprises 2×2 adjacent cells, and blocks may overlap; the feature of each block consists of the normalized 9-bin gradient direction histograms of its cells, forming a 36-dimensional feature; the features of all blocks form the HOG feature, of dimension 36×(W/4−1)×(H/4−1);
C. for each head-shoulder region given by step B, extracting the corresponding image, normalizing it to a set common size, and inputting it into a deep neural network to obtain a classification output; specifically, for each head-shoulder region obtained in step B, the image inside the region is extracted, enlarged or reduced to a size of 48×48 and fed into the deep neural network to obtain a judgment of whether it is a head-shoulder target; the deep neural network comprises 3 groups of convolutional layers and sampling layers, 2 fully-connected layers and 1 output layer;
in the 3D convolution operation of a convolutional layer, for each pixel (x, y) of each output channel On of the layer,
On(x,y) = Σ(m=1..M) αm,n · Σ(i,j) Hm,n(i,j) · Im(x+i, y+j)
wherein Im is an input channel, M is the number of input channels, Hm,n is a two-dimensional 5×5 filter and αm,n is the channel weight; Hm,n and αm,n together form a 3-dimensional filter; by applying PCA to the 5×5 neighborhoods of Im, Im can be represented as a weighted sum of several principal components,
Im(x+i, y+j) ≈ Σ(k=1..K) βm,k(x,y) · Uk(i,j)
wherein (i, j) ranges over the 5×5 neighborhood of (x, y), βm,k(x,y) is the k-th PCA projection coefficient, Uk is the k-th PCA principal component and K is the number of principal components; then
On(x,y) ≈ Σ(m=1..M) Σ(k=1..K) βm,k(x,y) · wk,m,n
wherein
wk,m,n = αm,n · Σ(i,j) Hm,n(i,j) · Uk(i,j);
D. performing non-maximum suppression on all head-shoulder windows output by step C to merge detections that overlap across adjacent positions and scales.
CN201710492597.XA 2017-06-26 2017-06-26 People counting method based on image analysis and deep learning Active CN107330390B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710492597.XA CN107330390B (en) 2017-06-26 2017-06-26 People counting method based on image analysis and deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710492597.XA CN107330390B (en) 2017-06-26 2017-06-26 People counting method based on image analysis and deep learning

Publications (2)

Publication Number Publication Date
CN107330390A CN107330390A (en) 2017-11-07
CN107330390B true CN107330390B (en) 2020-12-01

Family

ID=60196088

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710492597.XA Active CN107330390B (en) 2017-06-26 2017-06-26 People counting method based on image analysis and deep learning

Country Status (1)

Country Link
CN (1) CN107330390B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109918969B (en) * 2017-12-12 2021-03-05 深圳云天励飞技术有限公司 Face detection method and device, computer device and computer readable storage medium
CN108197579B (en) * 2018-01-09 2022-05-20 杭州智诺科技股份有限公司 Method for detecting number of people in protection cabin
CN108563998A (en) * 2018-03-16 2018-09-21 新智认知数据服务有限公司 Vivo identification model training method, biopsy method and device
CN110659550A (en) * 2018-06-29 2020-01-07 比亚迪股份有限公司 Traffic sign recognition method, traffic sign recognition device, computer equipment and storage medium
CN109359577B (en) * 2018-10-08 2021-06-29 福州大学 System for detecting number of people under complex background based on machine learning
CN109472291A (en) * 2018-10-11 2019-03-15 浙江工业大学 A kind of demographics classification method based on DNN algorithm
CN109376637B (en) * 2018-10-15 2021-03-02 齐鲁工业大学 People counting system based on video monitoring image processing
CN110674704A (en) * 2019-09-05 2020-01-10 同济大学 Crowd density estimation method and device based on multi-scale expansion convolutional network
CN110929756B (en) * 2019-10-23 2022-09-06 广物智钢数据服务(广州)有限公司 Steel size and quantity identification method based on deep learning, intelligent equipment and storage medium
CN113190795A (en) * 2021-02-23 2021-07-30 深圳市大数据资源管理中心 Method, device, medium and equipment for counting actual management population data

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105160313A (en) * 2014-09-15 2015-12-16 中国科学院重庆绿色智能技术研究院 Method and apparatus for crowd behavior analysis in video monitoring
EP3173983A1 (en) * 2015-11-26 2017-05-31 Siemens Aktiengesellschaft A method and apparatus for providing automatically recommendations concerning an industrial system
CN105844234B (en) * 2016-03-21 2020-07-31 商汤集团有限公司 Method and equipment for counting people based on head and shoulder detection
CN106845406A (en) * 2017-01-20 2017-06-13 深圳英飞拓科技股份有限公司 Head and shoulder detection method and device based on multitask concatenated convolutional neutral net

Also Published As

Publication number Publication date
CN107330390A (en) 2017-11-07

Similar Documents

Publication Publication Date Title
CN107330390B (en) People counting method based on image analysis and deep learning
CN106874894B (en) Human body target detection method based on regional full convolution neural network
Xu et al. Inter/intra-category discriminative features for aerial image classification: A quality-aware selection model
CN107274419B (en) Deep learning significance detection method based on global prior and local context
CN107016357B (en) Video pedestrian detection method based on time domain convolutional neural network
CN103530599B (en) The detection method and system of a kind of real human face and picture face
CN103824070B (en) A kind of rapid pedestrian detection method based on computer vision
CN104933414B (en) A kind of living body faces detection method based on WLD-TOP
CN108960404B (en) Image-based crowd counting method and device
CN106952274B (en) Pedestrian detection and distance measuring method based on stereoscopic vision
TWI441096B (en) Motion detection method for comples scenes
CN103020985B (en) A kind of video image conspicuousness detection method based on field-quantity analysis
JP2017191501A (en) Information processing apparatus, information processing method, and program
CN108804992B (en) Crowd counting method based on deep learning
CN113536972B (en) Self-supervision cross-domain crowd counting method based on target domain pseudo label
CN105741319B (en) Improvement visual background extracting method based on blindly more new strategy and foreground model
CN110532959B (en) Real-time violent behavior detection system based on two-channel three-dimensional convolutional neural network
CN110837786B (en) Density map generation method and device based on spatial channel, electronic terminal and medium
WO2023159898A1 (en) Action recognition system, method, and apparatus, model training method and apparatus, computer device, and computer readable storage medium
Su et al. A new local-main-gradient-orientation HOG and contour differences based algorithm for object classification
Ahuja et al. A survey of recent advances in crowd density estimation using image processing
CN104200455B (en) A kind of key poses extracting method based on movement statistics signature analysis
CN106446832B (en) Video-based pedestrian real-time detection method
CN111241943B (en) Scene recognition and loopback detection method based on background target and triple loss
CN117409476A (en) Gait recognition method based on event camera

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant