CN105023000A - Human brain visual memory principle-based human body action identification method and system - Google Patents

Human brain visual memory principle-based human body action identification method and system

Info

Publication number
CN105023000A
CN105023000A (application CN201510407799.0A; granted as CN105023000B)
Authority
CN
China
Prior art keywords
video
identified
vector
training
coding
Prior art date
Legal status
Granted
Application number
CN201510407799.0A
Other languages
Chinese (zh)
Other versions
CN105023000B (en)
Inventor
谌先敢
刘海华
高智勇
李旭
Current Assignee
South Central Minzu University
Original Assignee
South Central University for Nationalities
Priority date
Filing date
Publication date
Application filed by South Central University for Nationalities
Priority to CN201510407799.0A
Publication of CN105023000A
Application granted
Publication of CN105023000B
Legal status: Expired - Fee Related
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2155 Generating training patterns; Bootstrap methods, e.g. bagging or boosting, characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches, based on the proximity to a decision surface, e.g. support vector machines

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a human body action recognition method and system based on the human brain visual memory principle, and relates to the fields of computer vision and video surveillance. Inspired by the human brain visual memory principle, the invention proposes the following technical scheme for the first time: in a training stage, feature codings of local features are used to train a classifier model and to build a visual memory bank; in a recognition stage, the feature coding of the local features of a video to be recognized is used to search the visual memory bank, part of the local features of a retrieved video are used to replace the occluded information in the video to be recognized, feature encoding is performed on the local features of the replaced video, and the coding is fed into the trained model for testing, thereby obtaining the category of the human body action in the video. The method and system can effectively solve the occlusion problem in human body action recognition.

Description

Human body action recognition method and system based on the human brain visual memory principle
Technical field
The present invention relates to the fields of computer vision and video surveillance, and in particular to a human body action recognition method and system based on the human brain visual memory principle.
Background art
Video-based human action recognition is an important problem with applications in video surveillance, video retrieval, and human-computer interaction. Human action recognition refers to using a computer to distinguish the category of a human action from a video sequence.
Video-based human action recognition can be divided into two parts: action representation and action classification. The videos are divided into a training set and a test set. Action representation refers to extracting suitable feature data from the video sequences containing human actions to describe the motion of the human body. Action classification refers to learning a classifier model from the feature data of the training set and using it to classify the feature data of the test set.
Many real-world videos contain some degree of occlusion, including self-occlusion and occlusion by other targets. As a result, the acting subject is not fully visible and effective motion features are difficult to extract, which poses a great challenge to human action recognition.
Among current action recognition methods, the following perform acceptably under occlusion: part-based methods, probability-based methods, and pose-based methods, but each has limitations. The interest-point detectors used by part-based methods may mistakenly select local patches that do not lie on the foreground target. Probability-based methods such as Bayesian networks and hidden Markov models are flat models; they represent simple actions effectively but cannot describe the hierarchy and shared structure in compound actions. Pose-based methods require detectors trained for each body part on manually annotated training images, which limits their applicability to action recognition. Therefore, an effective method for the occlusion problem in human action recognition is still needed.
Summary of the invention
The object of the present invention is to overcome the deficiencies of the above background art by providing a human body action recognition method and system, based on the human brain visual memory principle, that can effectively solve the occlusion problem in human action recognition.
The invention provides a human body action recognition method based on the human brain visual memory principle, comprising the following steps:
A. Training stage:
A1. Collect multiple training videos and densely sample each of them, taking the Histogram of Oriented Gradients (HOG) feature of each sampling block as a local feature, to obtain the HOG feature set of the training videos;
A2. Using the expectation-maximization algorithm, learn a set of "overcomplete" basis vectors from the HOG feature set of the training videos obtained in step A1;
A3. Using the "overcomplete" basis vectors obtained in step A2, perform sparse-coding feature encoding on the HOG feature set of a training video obtained in step A1 to obtain a first set of sparse vectors, each with the same dimension as the "overcomplete" basis vectors; sum all sparse vectors in the first set and normalize the result, obtaining a single vector of the same dimension as the "overcomplete" basis vectors, which serves as the coding result of the training video and expresses the human action in it;
A4. Feed the coding results of all training videos obtained in step A3 into a support vector machine (SVM) classifier for training, generating a trained model;
A5. Build the visual memory bank from the coding results of all training videos obtained in step A3;
B. Recognition stage:
B1. Input the video to be recognized and densely sample it, taking the HOG feature of each sampling block as a local feature, to obtain the HOG feature set of the video to be recognized;
B2. Using the "overcomplete" basis vectors obtained in step A2, perform sparse-coding feature encoding on the HOG feature set of the video to be recognized obtained in step B1 to obtain a second set of sparse vectors, each with the same dimension as the "overcomplete" basis vectors; sum all sparse vectors in the second set and normalize the result, obtaining a single sparse vector of the same dimension as the "overcomplete" basis vectors;
B3. Determine the occluded regions in the video to be recognized and replace them using the retrieval result from the visual memory bank, obtaining the coding result of the video to be recognized:
Using the sparse vector obtained in step B2 as the index, search the visual memory bank built in step A5 and take the retrieved video as the retrieval result; replace the features of the occluded regions in the video to be recognized with the local features of the retrieved video, obtaining the HOG feature set of the replaced video as the new local features; perform feature encoding on these new local features with the "overcomplete" basis vectors obtained in step A2 to obtain a new sparse vector, which serves as the coding result of the video to be recognized and expresses the human action in it;
B4. Feed the coding result obtained in step B3 into the trained model generated in step A4 for testing, obtaining the category of the human action in the video to be recognized.
On the basis of the above technical scheme, the dense sampling of each training video in step A1 proceeds as follows: for a single training video, find multiple local sampling blocks of that video, each centered on a dense sampling point.
On the basis of the above technical scheme, the local sampling blocks may be of any size smaller than the training video.
On the basis of the above technical scheme, the local sampling blocks have a size of 16 × 16 × 4 pixels.
On the basis of the above technical scheme, a content-based video retrieval (CBVR) system is used in step A5 to simulate the visual memory bank.
On the basis of the above technical scheme, the occluded regions in step B3 are determined as follows: compute the image entropy of each local sampling block in the video to be recognized; the regions whose sampling blocks have an entropy below a predetermined threshold are the occluded regions, the threshold being determined experimentally.
The present invention also provides a human body action recognition system based on the human brain visual memory principle, comprising a first HOG feature set acquisition unit, an "overcomplete" basis vector acquisition unit, a first coding unit, a trained model generation unit, a visual memory bank construction unit, a second HOG feature set acquisition unit, a sparse vector acquisition unit, a second coding unit, and a human action category acquisition unit, wherein:
The first HOG feature set acquisition unit is configured to: collect multiple training videos and densely sample each of them, taking the Histogram of Oriented Gradients feature of each sampling block as a local feature, to obtain the HOG feature set of the training videos;
The "overcomplete" basis vector acquisition unit is configured to: use the expectation-maximization algorithm to learn a set of "overcomplete" basis vectors from the HOG feature set obtained by the first HOG feature set acquisition unit;
The first coding unit is configured to: using the "overcomplete" basis vectors, perform sparse-coding feature encoding on the HOG feature set of a training video obtained by the first HOG feature set acquisition unit to obtain a first set of sparse vectors, each with the same dimension as the "overcomplete" basis vectors; sum all sparse vectors in the first set and normalize the result, obtaining a single vector of the same dimension as the "overcomplete" basis vectors, which serves as the coding result of the training video and expresses the human action in it;
The trained model generation unit is configured to: feed the coding results of all training videos obtained by the first coding unit into an SVM classifier for training, generating a trained model;
The visual memory bank construction unit is configured to: build the visual memory bank from the coding results of all training videos obtained by the first coding unit;
The second HOG feature set acquisition unit is configured to: densely sample the input video to be recognized, taking the HOG feature of each sampling block as a local feature, to obtain the HOG feature set of the video to be recognized;
The sparse vector acquisition unit is configured to: using the "overcomplete" basis vectors obtained by the "overcomplete" basis vector acquisition unit, perform sparse-coding feature encoding on the HOG feature set obtained by the second HOG feature set acquisition unit to obtain a second set of sparse vectors, each with the same dimension as the "overcomplete" basis vectors; sum all sparse vectors in the second set and normalize the result, obtaining a single sparse vector of the same dimension as the "overcomplete" basis vectors;
The second coding unit is configured to: determine the occluded regions in the video to be recognized and replace them using the retrieval result from the visual memory bank, obtaining the coding result of the video to be recognized; that is, using the sparse vector obtained by the sparse vector acquisition unit as the index, search the visual memory bank and take the retrieved video as the retrieval result; replace the features of the occluded regions in the video to be recognized with the local features of the retrieved video, obtaining the HOG feature set of the replaced video as the new local features; perform feature encoding on these new local features with the "overcomplete" basis vectors to obtain a new sparse vector, which serves as the coding result of the video to be recognized and expresses the human action in it;
The human action category acquisition unit is configured to: feed the coding result of the video to be recognized obtained by the second coding unit into the trained model for testing, obtaining the category of the human action in the video to be recognized.
On the basis of the above technical scheme, the first HOG feature set acquisition unit densely samples each training video as follows: for a single training video, find multiple local sampling blocks of that video, each centered on a dense sampling point.
On the basis of the above technical scheme, the visual memory bank construction unit uses a content-based video retrieval (CBVR) system to simulate the visual memory bank.
On the basis of the above technical scheme, the second coding unit determines the occluded regions in the video to be recognized as follows: compute the image entropy of each local sampling block in the video to be recognized; the regions whose sampling blocks have an entropy below a predetermined threshold are the occluded regions, the threshold being determined experimentally.
Compared with the prior art, the advantages of the present invention are as follows:
Inspired by the human brain visual memory principle, the present invention proposes for the first time the following technical scheme: in the training stage, the feature codings of local features are used to train a classifier model and to build a visual memory bank; in the recognition stage, the feature coding of the local features of the video to be recognized is used to search the visual memory bank, part of the local features of a retrieved video replace the occluded information in the video to be recognized, feature encoding is performed on the local features of the replaced video, and the coding is fed into the trained model for testing, obtaining the category of the human action in the video. The invention can distinguish the category of a human action from video and effectively solves the occlusion problem in human action recognition.
Brief description of the drawings
Fig. 1 is a flowchart of the human body action recognition method based on the human brain visual memory principle in an embodiment of the present invention.
Fig. 2 shows the video retrieval process simulating visual memory bank retrieval in an embodiment of the present invention.
Fig. 3 shows the process of replacing occluded information with the retrieval result from the visual memory bank to obtain a new sparse vector in an embodiment of the present invention.
Detailed description of the embodiments
The present invention is described in further detail below with reference to the drawings and specific embodiments.
Referring to Fig. 1, an embodiment of the present invention provides a human body action recognition method based on the human brain visual memory principle, comprising the following steps:
A. Training stage:
A1. Collect multiple training videos and densely sample each of them, taking the HOG (Histogram of Oriented Gradients) feature of each sampling block as a local feature, to obtain the HOG feature set of the training videos.
Each training video is densely sampled as follows: for a single training video, find multiple local sampling blocks of that video, each centered on a dense sampling point. The local sampling blocks may be of any size smaller than the training video, for example 16 × 16 × 4 pixels.
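The dense sampling and per-block HOG extraction described above can be sketched as follows. This is a minimal illustrative sketch only: the stride, the single-histogram HOG simplification (a full HOG descriptor uses cells and block normalization), and all array sizes are assumptions, not the patent's actual implementation.

```python
import numpy as np

def dense_sample_blocks(video, block=(16, 16, 4), stride=(8, 8, 2)):
    """Slide a 16x16x4 (h, w, t) window over a video of shape (T, H, W)
    with a fixed stride, yielding each local sampling block."""
    T, H, W = video.shape
    bh, bw, bt = block
    sh, sw, st = stride
    for t in range(0, T - bt + 1, st):
        for y in range(0, H - bh + 1, sh):
            for x in range(0, W - bw + 1, sw):
                yield video[t:t + bt, y:y + bh, x:x + bw]

def hog_feature(block, n_bins=9):
    """Orientation histogram of spatial gradients, pooled over the whole
    block (a simplified stand-in for a full cell/block HOG descriptor)."""
    gy, gx = np.gradient(block.astype(float), axis=(1, 2))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)            # unsigned orientation
    bins = np.minimum((ang / np.pi * n_bins).astype(int), n_bins - 1)
    hist = np.bincount(bins.ravel(), weights=mag.ravel(), minlength=n_bins)
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist

video = np.random.rand(8, 64, 48)                      # toy (T, H, W) video
features = np.array([hog_feature(b) for b in dense_sample_blocks(video)])
```

The collection `features` plays the role of the HOG feature set of one video in the steps that follow.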
A2. Using the EM (Expectation Maximization) algorithm, which is well known to those skilled in the art, learn a set of "overcomplete" basis vectors from the HOG feature set of the training videos obtained in step A1. "Overcomplete" basis vectors are prior art and are not described further here.
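The patent learns the "overcomplete" basis with EM but gives no algorithmic details. As an EM-like stand-in, the sketch below alternates an inference step (keep only each sample's strongest basis responses) with a least-squares basis re-fit; the atom count, sparsity level, and thresholding rule are all illustrative assumptions.

```python
import numpy as np

def learn_overcomplete_basis(X, n_atoms=32, n_iter=20, k=3, seed=0):
    """Alternating scheme: infer sparse codes for a fixed basis (E-like
    step), then re-fit the basis by least squares (M-like step).
    X has shape (n, d); n_atoms > d makes the basis overcomplete."""
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((n_atoms, X.shape[1]))
    D /= np.linalg.norm(D, axis=1, keepdims=True)
    for _ in range(n_iter):
        # E-like step: keep only the k strongest correlations per sample.
        C = X @ D.T                                    # (n, n_atoms)
        thresh = -np.sort(-np.abs(C), axis=1)[:, k - 1:k]
        C[np.abs(C) < thresh] = 0.0
        # M-like step: least-squares basis update, then renormalize rows.
        D = np.linalg.lstsq(C, X, rcond=None)[0]
        norms = np.linalg.norm(D, axis=1, keepdims=True)
        D /= np.where(norms > 0, norms, 1.0)
    return D

X = np.random.default_rng(1).standard_normal((200, 9))  # toy HOG features
basis = learn_overcomplete_basis(X, n_atoms=32)         # 32 > 9: overcomplete
```

Here 32 atoms over 9-dimensional features stand in for whatever basis size the method actually uses.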
A3. Using the "overcomplete" basis vectors obtained in step A2, perform sparse-coding feature encoding on the HOG feature set of a training video obtained in step A1 to obtain a first set of sparse vectors, each with the same dimension as the "overcomplete" basis vectors; sum all sparse vectors in the first set and normalize the result, obtaining a single vector of the same dimension as the "overcomplete" basis vectors, which serves as the coding result of the training video and expresses the human action in it.
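Step A3's sparse encoding, sum pooling, and normalization might look like the sketch below. The ISTA solver, its step counts, and the L1 weight are illustrative choices; the patent does not specify which sparse-coding algorithm is used.

```python
import numpy as np

def sparse_code(x, D, n_steps=50, lam=0.1):
    """ISTA: minimize 0.5*||x - c @ D||^2 + lam*||c||_1 over the code c,
    giving a sparse vector whose dimension equals the number of basis
    vectors (rows of D)."""
    L = np.linalg.norm(D @ D.T, 2)                     # Lipschitz constant
    c = np.zeros(D.shape[0])
    for _ in range(n_steps):
        g = (c @ D - x) @ D.T                          # gradient of fit term
        c = c - g / L
        c = np.sign(c) * np.maximum(np.abs(c) - lam / L, 0.0)  # soft-threshold
    return c

def encode_video(features, D):
    """Sparse-code every local HOG feature, sum the codes, and L2-normalize:
    the result is one coding vector for the whole video."""
    codes = np.array([sparse_code(f, D) for f in features])
    pooled = codes.sum(axis=0)
    return pooled / max(np.linalg.norm(pooled), 1e-12)

rng = np.random.default_rng(0)
D = rng.standard_normal((32, 9))                       # toy overcomplete basis
D /= np.linalg.norm(D, axis=1, keepdims=True)
features = np.abs(rng.standard_normal((50, 9)))        # toy HOG features
coding = encode_video(features, D)                     # 32-dim video coding
```

The same `encode_video` routine would serve both the training stage (step A3) and the recognition stage (step B2).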
A4. Feed the coding results of all training videos obtained in step A3 into an SVM (Support Vector Machine) classifier for training, generating a trained model.
A5. Build the visual memory bank from the coding results of all training videos obtained in step A3.
Referring to Fig. 2, a CBVR (Content-Based Video Retrieval) system can be used in step A5 to simulate the visual memory bank. CBVR systems are prior art and are not described further here.
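A toy stand-in for the CBVR-simulated memory bank can be sketched as storing one coding vector per training video and retrieving the most similar stored video for a query coding. Cosine similarity is an assumption made for illustration; the patent does not specify the similarity measure of the retrieval system.

```python
import numpy as np

class VisualMemoryBank:
    """Toy memory bank: store() plays the role of the memory function,
    retrieve() the association function, with the video coding as index."""
    def __init__(self):
        self.ids, self.codings = [], []

    def store(self, video_id, coding):
        self.ids.append(video_id)
        self.codings.append(coding / np.linalg.norm(coding))

    def retrieve(self, query):
        q = query / np.linalg.norm(query)
        sims = np.array(self.codings) @ q              # cosine similarities
        return self.ids[int(np.argmax(sims))]

bank = VisualMemoryBank()
bank.store("walk_01", np.array([1.0, 0.0, 0.0]))       # hypothetical codings
bank.store("run_01", np.array([0.0, 1.0, 0.0]))
best = bank.retrieve(np.array([0.9, 0.1, 0.0]))        # -> "walk_01"
```

The video IDs and three-dimensional codings are made up; in the method the codings would be the sum-pooled sparse vectors of step A3.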
First, the human brain visual memory principle is briefly introduced:
The main visual memory functions of the human brain include storage and association. Storage means that seen information is stored in the brain; association means that currently seen information recalls previously seen information stored in the brain.
Inspired by this principle, in the task of human action recognition the visual memory bank can be thought of as storing the full information of the videos, the association function can be realized by searching this bank, and the feature coding of a video can serve as the retrieval index. The videos stored in the visual memory bank are assumed to be generally unoccluded.
When a CBVR system is used to simulate the visual memory bank, the memory and association functions of human vision correspond respectively to the two stages of the CBVR system: construction of the feature database and video retrieval.
B. Recognition stage:
B1. Input the video to be recognized and densely sample it, taking the HOG feature of each sampling block as a local feature, to obtain the HOG feature set of the video to be recognized.
B2. Using the "overcomplete" basis vectors obtained in step A2, perform sparse-coding feature encoding on the HOG feature set of the video to be recognized obtained in step B1 to obtain a second set of sparse vectors, each with the same dimension as the "overcomplete" basis vectors; sum all sparse vectors in the second set and normalize the result, obtaining a single sparse vector of the same dimension as the "overcomplete" basis vectors.
B3. Determine the occluded regions in the video to be recognized and replace them using the retrieval result from the visual memory bank, obtaining the coding result of the video to be recognized:
The occluded regions are determined as follows: compute the image entropy of each local sampling block in the video to be recognized; the regions whose sampling blocks have an entropy below a predetermined threshold are the occluded regions, the threshold being determined experimentally.
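The entropy test for occluded blocks can be sketched as follows. The histogram bin count and the threshold value are illustrative assumptions, since the patent only states that the threshold is chosen experimentally; the intuition is that a block covered by a uniform occluder has a nearly flat intensity distribution and hence low entropy.

```python
import numpy as np

def block_entropy(block, n_levels=32):
    """Shannon entropy (bits) of the gray-level histogram of one sampling
    block; flat, occluder-covered blocks score low."""
    hist, _ = np.histogram(block, bins=n_levels, range=(0.0, 1.0))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def occluded_mask(blocks, threshold):
    """Mark blocks whose entropy falls below the threshold as occluded."""
    return np.array([block_entropy(b) < threshold for b in blocks])

rng = np.random.default_rng(0)
textured = rng.random((16, 16, 4))                     # varied gray levels
flat = np.full((16, 16, 4), 0.5)                       # uniform occluder
mask = occluded_mask([textured, flat], threshold=1.0)  # -> [False, True]
```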
Referring to Fig. 3, using the sparse vector obtained in step B2 as the index, search the visual memory bank built in step A5 and take the retrieved video as the retrieval result; replace the features of the occluded regions in the video to be recognized with the local features of the retrieved video, obtaining the HOG feature set of the replaced video as the new local features; perform feature encoding on these new local features with the "overcomplete" basis vectors obtained in step A2 to obtain a new sparse vector, which serves as the coding result of the video to be recognized and expresses the human action in it.
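The replacement of occluded local features by those of the retrieved video can be sketched as below, assuming (purely for illustration) that both videos are densely sampled on the same grid so that blocks correspond by index. The repaired feature set would then be re-encoded with the sparse-coding pipeline of step A3.

```python
import numpy as np

def repair_features(query_feats, retrieved_feats, occluded):
    """Replace the local features at occluded positions in the video to be
    recognized with the corresponding local features of the retrieved
    video; other positions keep their original features."""
    repaired = query_feats.copy()
    repaired[occluded] = retrieved_feats[occluded]
    return repaired

rng = np.random.default_rng(3)
query_feats = rng.random((6, 9))                       # 6 blocks, 9-dim HOG
retrieved_feats = rng.random((6, 9))                   # retrieved video
occluded = np.array([False, True, False, False, True, False])
repaired = repair_features(query_feats, retrieved_feats, occluded)
```

The block count, feature dimension, and occlusion mask are toy values standing in for the outputs of the sampling and entropy steps.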
B4. Feed the coding result obtained in step B3 into the trained model generated in step A4 for testing, obtaining the category of the human action in the video to be recognized.
An embodiment of the present invention also provides a human body action recognition system based on the human brain visual memory principle, comprising a first HOG feature set acquisition unit, an "overcomplete" basis vector acquisition unit, a first coding unit, a trained model generation unit, a visual memory bank construction unit, a second HOG feature set acquisition unit, a sparse vector acquisition unit, a second coding unit, and a human action category acquisition unit, wherein:
The first HOG feature set acquisition unit is configured to: collect multiple training videos and densely sample each of them, taking the HOG (Histogram of Oriented Gradients) feature of each sampling block as a local feature, to obtain the HOG feature set of the training videos; for a single training video, multiple local sampling blocks are found, each centered on a dense sampling point; the blocks may be of any size smaller than the training video, for example 16 × 16 × 4 pixels.
The "overcomplete" basis vector acquisition unit is configured to: use the EM (Expectation Maximization) algorithm, well known to those skilled in the art, to learn a set of "overcomplete" basis vectors from the HOG feature set obtained by the first HOG feature set acquisition unit; "overcomplete" basis vectors are prior art and are not described further here.
The first coding unit is configured to: using the "overcomplete" basis vectors, perform sparse-coding feature encoding on the HOG feature set of a training video obtained by the first HOG feature set acquisition unit to obtain a first set of sparse vectors, each with the same dimension as the "overcomplete" basis vectors; sum all sparse vectors in the first set and normalize the result, obtaining a single vector of the same dimension as the "overcomplete" basis vectors, which serves as the coding result of the training video and expresses the human action in it.
The trained model generation unit is configured to: feed the coding results of all training videos obtained by the first coding unit into an SVM (Support Vector Machine) classifier for training, generating a trained model.
The visual memory bank construction unit is configured to: build the visual memory bank from the coding results of all training videos obtained by the first coding unit.
Referring to Fig. 2, the visual memory bank construction unit can use a CBVR (Content-Based Video Retrieval) system to simulate the visual memory bank; CBVR systems are prior art and are not described further here.
The second HOG feature set acquisition unit is configured to: densely sample the input video to be recognized, taking the HOG feature of each sampling block as a local feature, to obtain the HOG feature set of the video to be recognized.
The sparse vector acquisition unit is configured to: using the "overcomplete" basis vectors obtained by the "overcomplete" basis vector acquisition unit, perform sparse-coding feature encoding on the HOG feature set obtained by the second HOG feature set acquisition unit to obtain a second set of sparse vectors, each with the same dimension as the "overcomplete" basis vectors; sum all sparse vectors in the second set and normalize the result, obtaining a single sparse vector of the same dimension as the "overcomplete" basis vectors.
The second coding unit is configured to: determine the occluded regions in the video to be recognized and replace them using the retrieval result from the visual memory bank, obtaining the coding result of the video to be recognized.
The occluded regions are determined as follows: compute the image entropy of each local sampling block in the video to be recognized; the regions whose sampling blocks have an entropy below a predetermined threshold are the occluded regions, the threshold being determined experimentally.
Referring to Fig. 3, using the sparse vector obtained by the sparse vector acquisition unit as the index, the second coding unit searches the visual memory bank and takes the retrieved video as the retrieval result; it replaces the features of the occluded regions in the video to be recognized with the local features of the retrieved video, obtaining the HOG feature set of the replaced video as the new local features; it then performs feature encoding on these new local features with the "overcomplete" basis vectors to obtain a new sparse vector, which serves as the coding result of the video to be recognized and expresses the human action in it.
The human action category acquisition unit is configured to: feed the coding result of the video to be recognized obtained by the second coding unit into the trained model for testing, obtaining the category of the human action in the video to be recognized.
Those skilled in the art can make various modifications and variations to the embodiments of the present invention; if these modifications and variations fall within the scope of the claims of the present invention and their technical equivalents, they also fall within the protection scope of the present invention.
Content not described in detail in this specification belongs to the prior art known to those skilled in the art.

Claims (10)

1. A human action recognition method based on the human brain visual memory principle, characterized by comprising the following steps:
A. Training stage:
A1. Collect multiple training videos, perform dense sampling on each training video, and take the histogram of oriented gradients (HOG) feature on each sampling block as a local feature, to obtain the HOG feature set of the training videos;
A2. Use the expectation-maximization algorithm to learn from the HOG feature set of the training videos obtained in step A1, and obtain a group of "over-complete" basis vectors;
A3. Combined with the "over-complete" basis vectors obtained in step A2, perform feature coding on the HOG feature set of the training video obtained in step A1 by means of sparse coding, to obtain a first sparse vector set in which the dimension of each vector is identical to the dimension of the "over-complete" basis vectors; sum all the sparse vectors in the first sparse vector set and then normalize the result, to obtain a vector whose dimension is identical to the dimension of the "over-complete" basis vectors, as the coding result of the training video; the human action in the training video is expressed by the coding result of the training video;
A4. Send the coding results of all the training videos obtained in step A3 into a support vector machine (SVM) classifier for training, and generate a training model;
A5. Build a visual memory library using the coding results of all the training videos obtained in step A3;
B. Recognition stage:
B1. Input a video to be identified, perform dense sampling on the video to be identified, and take the HOG feature on each sampling block as a local feature, to obtain the HOG feature set of the video to be identified;
B2. Combined with the "over-complete" basis vectors obtained in step A2, perform feature coding on the HOG feature set of the video to be identified obtained in step B1 by means of sparse coding, to obtain a second sparse vector set in which the dimension of each vector is identical to the dimension of the "over-complete" basis vectors; sum all the sparse vectors in the second sparse vector set and then normalize the result, to obtain a sparse vector whose dimension is identical to the dimension of the "over-complete" basis vectors;
B3. Determine the occluded region in the video to be identified, replace the occluded region in the video to be identified with the retrieval result from the visual memory library, and obtain the coding result of the video to be identified:
Using the sparse vector obtained in step B2 as an index, retrieve from the visual memory library built in step A5, and take the retrieved video as the retrieval result; replace the feature of the occluded region in the video to be identified with the local feature of the video in the retrieval result, to obtain the HOG feature set of the replaced video as a new local feature; perform feature coding on this new local feature with the "over-complete" basis vectors obtained in step A2, to obtain a new sparse vector as the coding result of the video to be identified; the human action in the video to be identified is expressed by the coding result of the video to be identified;
B4. Send the coding result of the video to be identified obtained in step B3 into the training model generated in step A4 for testing, and obtain the human action category in the video to be identified.
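The pooling described in steps A3 and B2 (sum all per-block sparse codes, then normalize, to obtain one video-level vector of the same dimension as the basis) can be sketched in a few lines. This is an illustrative sketch only, not the patented implementation; the input vectors and L2 normalization are assumptions.

```python
import math

def pool_sparse_codes(sparse_vectors):
    """Sum a set of per-block sparse codes and L2-normalize the result,
    yielding one video-level vector whose dimension equals that of the
    over-complete basis (as in steps A3/B2)."""
    dim = len(sparse_vectors[0])
    pooled = [0.0] * dim
    for vec in sparse_vectors:
        for i, v in enumerate(vec):
            pooled[i] += v
    norm = math.sqrt(sum(v * v for v in pooled)) or 1.0
    return [v / norm for v in pooled]

# Three hypothetical 4-dimensional sparse codes from one video
codes = [[0.0, 1.0, 0.0, 0.0],
         [0.0, 1.0, 0.0, 2.0],
         [0.0, 2.0, 0.0, 2.0]]
video_code = pool_sparse_codes(codes)  # unit-length, same dimension as basis
```

Sum pooling keeps the video-level code the same length regardless of how many sampling blocks a video yields, which is what lets one SVM model handle videos of different durations.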
2. The human action recognition method based on the human brain visual memory principle according to claim 1, characterized in that: in step A1, the process of performing dense sampling on each training video is: for a single training video, finding multiple local sampling blocks of the training video centered on dense sampling points.
3. The human action recognition method based on the human brain visual memory principle according to claim 2, characterized in that: the size of each local sampling block is any size smaller than the size of the training video.
4. The human action recognition method based on the human brain visual memory principle according to claim 3, characterized in that: the size of each local sampling block is 16 × 16 × 4 pixels.
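Dense sampling with the 16 × 16 × 4-pixel blocks of claims 2–4 amounts to tiling the video volume with fixed-size spatio-temporal blocks. The sketch below only enumerates block origins; the stride values are assumptions not given in the claims.

```python
def dense_sampling_blocks(width, height, frames,
                          block=(16, 16, 4), stride=(8, 8, 2)):
    """Enumerate the (x, y, t) origins of local sampling blocks placed
    densely over a video volume; each block lies fully inside the video.
    The stride is a hypothetical choice, not specified in the claims."""
    bw, bh, bt = block
    sx, sy, st = stride
    origins = []
    for t in range(0, frames - bt + 1, st):
        for y in range(0, height - bh + 1, sy):
            for x in range(0, width - bw + 1, sx):
                origins.append((x, y, t))
    return origins

# A tiny 32x32-pixel, 8-frame video volume yields a 3x3x3 grid of blocks
blocks = dense_sampling_blocks(32, 32, 8)
```

Each origin would then index one 16 × 16 × 4 sub-volume from which a HOG descriptor is computed, forming the HOG feature set of the video.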
5. The human action recognition method based on the human brain visual memory principle according to claim 1, characterized in that: in step A5, a content-based video retrieval system is adopted to simulate the visual memory library.
6. The human action recognition method based on the human brain visual memory principle according to any one of claims 1 to 5, characterized in that: the specific process of determining the occluded region in the video to be identified in step B3 is: calculating the image entropy of each local sampling block in the video to be identified, where a region whose local sampling block has an entropy lower than a predetermined threshold is an occluded region, the predetermined threshold being determined experimentally.
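The image-entropy test of claim 6 can be illustrated as the Shannon entropy of a block's grey-level histogram: a block covered by a uniform occluder has a near-degenerate histogram and hence low entropy. This is a minimal sketch; the threshold value is a hypothetical placeholder for the experimentally determined one.

```python
import math
from collections import Counter

def block_entropy(pixels):
    """Shannon entropy (bits) of a block's grey-level histogram; a
    uniform block (e.g. a fully occluded one) gives low entropy."""
    counts = Counter(pixels)
    total = len(pixels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def occluded(pixels, threshold=1.0):
    """Flag a block as occluded when its entropy falls below the
    threshold; 1.0 is an illustrative value, not the patent's."""
    return block_entropy(pixels) < threshold

flat  = [128] * 256      # one grey level only: entropy 0 bits
mixed = [0, 255] * 128   # two equally likely levels: entropy 1 bit
```

Under this rule `flat` is flagged as occluded while `mixed` is not, matching the intuition that occluders tend to be texture-poor.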
7. A human action recognition system based on the human brain visual memory principle, characterized by comprising a first HOG feature set acquiring unit, an "over-complete" basis vector acquiring unit, a first coding unit, a training model generation unit, a visual memory library construction unit, a second HOG feature set acquiring unit, a sparse vector acquiring unit, a second coding unit, and a human action category acquiring unit, wherein:
The first HOG feature set acquiring unit is used to: collect multiple training videos, perform dense sampling on each training video, and take the histogram of oriented gradients (HOG) feature on each sampling block as a local feature, to obtain the HOG feature set of the training videos;
The "over-complete" basis vector acquiring unit is used to: use the expectation-maximization algorithm to learn from the HOG feature set of the training videos obtained by the first HOG feature set acquiring unit, and obtain a group of "over-complete" basis vectors;
The first coding unit is used to: combined with the "over-complete" basis vectors, perform feature coding on the HOG feature set of the training video obtained by the first HOG feature set acquiring unit by means of sparse coding, to obtain a first sparse vector set in which the dimension of each vector is identical to the dimension of the "over-complete" basis vectors; sum all the sparse vectors in the first sparse vector set and then normalize the result, to obtain a vector whose dimension is identical to the dimension of the "over-complete" basis vectors, as the coding result of the training video; the human action in the training video is expressed by the coding result of the training video;
The training model generation unit is used to: send the coding results of all the training videos obtained by the first coding unit into a support vector machine (SVM) classifier for training, and generate a training model;
The visual memory library construction unit is used to: build a visual memory library using the coding results of all the training videos obtained by the first coding unit;
The second HOG feature set acquiring unit is used to: perform dense sampling on an input video to be identified, and take the HOG feature on each sampling block as a local feature, to obtain the HOG feature set of the video to be identified;
The sparse vector acquiring unit is used to: combined with the "over-complete" basis vectors obtained by the "over-complete" basis vector acquiring unit, perform feature coding on the HOG feature set of the video to be identified obtained by the second HOG feature set acquiring unit by means of sparse coding, to obtain a second sparse vector set in which the dimension of each vector is identical to the dimension of the "over-complete" basis vectors; sum all the sparse vectors in the second sparse vector set and then normalize the result, to obtain a sparse vector whose dimension is identical to the dimension of the "over-complete" basis vectors;
The second coding unit is used to: determine the occluded region in the video to be identified, replace the occluded region in the video to be identified with the retrieval result from the visual memory library, and obtain the coding result of the video to be identified: using the sparse vector obtained by the sparse vector acquiring unit as an index, retrieve from the visual memory library, and take the retrieved video as the retrieval result; replace the feature of the occluded region in the video to be identified with the local feature of the video in the retrieval result, to obtain the HOG feature set of the replaced video as a new local feature; perform feature coding on this new local feature with the "over-complete" basis vectors, to obtain a new sparse vector as the coding result of the video to be identified; the human action in the video to be identified is expressed by the coding result of the video to be identified;
The human action category acquiring unit is used to: send the coding result of the video to be identified, obtained by the second coding unit, into the training model for testing, and obtain the human action category in the video to be identified.
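The retrieval step performed by the second coding unit (use the query's pooled sparse vector as an index into the visual memory library and take the most similar stored video as the retrieval result) can be sketched as a nearest-neighbour lookup. Cosine similarity, the dictionary layout, and the toy vectors are all assumptions for illustration; the patent leaves the retrieval metric to the content-based video retrieval system.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    num = sum(x * y for x, y in zip(a, b))
    den = (math.sqrt(sum(x * x for x in a)) *
           math.sqrt(sum(y * y for y in b))) or 1.0
    return num / den

def retrieve(memory, query):
    """Return the key of the stored coding result most similar to the
    query sparse vector (the index used when repairing occlusions)."""
    return max(memory, key=lambda k: cosine(memory[k], query))

# Hypothetical visual memory library: video label -> pooled coding result
memory = {"walk": [0.9, 0.1, 0.0],
          "run":  [0.1, 0.9, 0.0],
          "wave": [0.0, 0.1, 0.9]}
best = retrieve(memory, [0.8, 0.2, 0.0])  # nearest stored video
```

The retrieved video's local features then stand in for the occluded blocks before the video is re-encoded and classified.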
8. The human action recognition system based on the human brain visual memory principle according to claim 7, characterized in that: the process by which the first HOG feature set acquiring unit performs dense sampling on each training video is: for a single training video, finding multiple local sampling blocks of the training video centered on dense sampling points.
9. The human action recognition system based on the human brain visual memory principle according to claim 7, characterized in that: the visual memory library construction unit adopts a content-based video retrieval system to simulate the visual memory library.
10. The human action recognition system based on the human brain visual memory principle according to any one of claims 7 to 9, characterized in that: the specific process by which the second coding unit determines the occluded region in the video to be identified is: calculating the image entropy of each local sampling block in the video to be identified, where a region whose local sampling block has an entropy lower than a predetermined threshold is an occluded region, the predetermined threshold being determined experimentally.
CN201510407799.0A 2015-07-13 2015-07-13 Human motion recognition method and system based on human brain visual memory principle Expired - Fee Related CN105023000B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510407799.0A CN105023000B (en) 2015-07-13 2015-07-13 Human motion recognition method and system based on human brain visual memory principle


Publications (2)

Publication Number Publication Date
CN105023000A true CN105023000A (en) 2015-11-04
CN105023000B CN105023000B (en) 2018-05-01

Family

ID=54412955

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510407799.0A Expired - Fee Related CN105023000B (en) 2015-07-13 2015-07-13 Human motion recognition method and system based on human brain visual memory principle

Country Status (1)

Country Link
CN (1) CN105023000B (en)


Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102609686A (en) * 2012-01-19 2012-07-25 宁波大学 Pedestrian detection method
CN102945375A (en) * 2012-11-20 2013-02-27 天津理工大学 Multi-view monitoring video behavior detection and recognition method under multiple constraints
CN103605986A (en) * 2013-11-27 2014-02-26 天津大学 Human motion recognition method based on local features
CN103793054A (en) * 2014-01-17 2014-05-14 中南民族大学 Motion recognition method for simulating declarative memory process
CN103955951A (en) * 2014-05-09 2014-07-30 合肥工业大学 Fast target tracking method based on regularization templates and reconstruction error decomposition
CN104268568A (en) * 2014-09-17 2015-01-07 电子科技大学 Behavior recognition method based on intelligent sub-space networks


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
PHILIP GEISMANN et al.: "A Two-staged Approach to Vision-based Pedestrian Recognition Using Haar and HOG Features", 2008 IEEE Intelligent Vehicles Symposium *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108491751A (en) * 2018-01-11 2018-09-04 华南理工大学 Complex action recognition method exploring privileged information based on simple actions
CN108491751B (en) * 2018-01-11 2021-08-10 华南理工大学 Complex action identification method for exploring privilege information based on simple action

Also Published As

Publication number Publication date
CN105023000B (en) 2018-05-01


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20180501

Termination date: 20200713