CN111582220A - Skeleton point behavior identification system based on shift diagram convolution neural network and identification method thereof - Google Patents
- Publication number: CN111582220A (application CN202010419839.4A)
- Authority
- CN
- China
- Prior art keywords: image, points, joint, vector, point
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06V40/20—Movements or behaviour, e.g. gesture recognition
- G06F18/22—Matching criteria, e.g. proximity measures
- G06N3/045—Combinations of networks
- G06V10/454—Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a skeleton point behavior identification system based on a shift graph convolutional neural network, comprising an image acquisition module, an image processing module, an extraction module and a behavior recognition module. The image acquisition module is used for acquiring behavior images; the image processing module is used for processing the behavior images acquired by the image acquisition module; the extraction module is used for extracting the skeleton points of the images processed by the image processing module; the behavior identification module is used for identifying the skeleton point behavior features extracted by the extraction module. The invention designs a behavior recognition module that recognizes skeleton point behavior with a novel graph convolution that reduces the computation of graph convolution. Unlike traditional graph convolution, the shift graph convolution does not expand the receptive field by enlarging the convolution kernel; instead, it shift-splices the graph features through a novel shift operation, thereby achieving the same or even higher recognition accuracy while markedly reducing the amount of computation and improving the calculation speed, and avoiding the growth in computation that traditional graph convolution suffers as the convolution kernel grows.
Description
Technical Field
The invention relates to a skeleton point behavior identification system based on a shift graph convolutional neural network, which belongs to the field of general image data processing or generation (G06T), in particular to the field of motion analysis (G06T 7/20).
Background
In behavior recognition tasks, owing to the constraints of data volume and algorithms, behavior recognition models based on RGB images are often disturbed by viewing-angle changes and complex backgrounds, so their generalization performance is insufficient and their robustness in practical applications is poor. Behavior recognition based on skeletal point data can solve this problem well.
In skeletal point data, the human body is represented by the coordinates of several predefined key joint points in the camera coordinate system. These can be conveniently obtained from a depth camera or from various pose estimation algorithms.
However, in the conventional graph convolution method, the modeled convolution kernel covers only the neighborhood of one point, whereas in the skeletal point behavior recognition task some behaviors (such as clapping) require modeling the positional relationship of points that are physically far apart (such as the two hands). This requires increasing the convolution kernel size of the graph convolution model. But the computation of graph convolution grows with the convolution kernel, so conventional graph convolution is computationally expensive.
Disclosure of Invention
The purpose of the invention is as follows: a skeleton point behavior recognition system based on a shift graph convolutional neural network is provided to solve the problems existing in the prior art.
The technical scheme is as follows: a skeleton point behavior identification system based on a shift graph convolutional neural network, comprising:
an image acquisition module, used for acquiring behavior images;
an image processing module, used for processing the behavior images acquired by the image acquisition module;
a skeleton point extraction module, used for extracting skeleton points from the images processed by the image processing module;
and a behavior recognition module, used for identifying the skeleton point behavior features extracted by the extraction module.
In a further embodiment, the image acquisition module is based on an image acquisition device. The image acquisition device comprises cameras placed in an equilateral triangle and a rotating device arranged at the tail of each camera; the rotating device comprises a rotating shaft fixedly connected with the camera and a rotating motor sleeved on the rotating shaft.
In a further embodiment, the image acquisition module photographs human behavior through three groups of cameras arranged in an equilateral triangle; the behavior images acquired by the three groups of cameras, mounted at the front, rear and side, are then presented on the computer terminal respectively, so that the image processing module can compare and process the images.
In a further embodiment, the image processing module mainly processes the human behavior image acquired by the image acquisition module into a human body edge map. When detecting image edges with the Kirsch edge detection operator, a 3 x 3 convolution template traverses the pixels of the image, the gray values of the eight pixels surrounding each pixel are examined in turn, and the difference between the weighted sum of the gray values of three adjacent pixels and the weighted sum of the remaining five pixels is calculated. Eight such templates are used, numbered 1 through 8, one for each of the eight compass directions.
All pixels in the original image are processed in turn with the eight convolution templates, the edge intensity of each pixel is calculated, a threshold is applied for detection, and the final edge points are extracted, completing edge detection;
the Kirsch operator detects image edges by the following steps:
Step 4: repeat step 3, applying the remaining six templates in turn and performing the calculation, finally storing the larger of the gray values of the resulting image 1 and image 2 in buffer image 1;
Step 5: copy the processed image 1 back into the original image data; edge processing of the image is then realized in the program.
In a further embodiment, the extraction module is configured to extract the skeleton points of the image processed by the image processing module. After the image processing module finishes processing the image acquired by the image acquisition module, pre-recorded skeleton points are matched to the human body edge map according to the closest body type of the person in the acquired image, and the matched skeleton points are then displayed on the human body edge map.
In a further embodiment, the extraction module further includes a correction module. When the image acquisition module acquires human behavior images, people of different body types performing the same group of actions produce different three-dimensional skeleton point coordinates because their skeleton sizes differ, so the skeleton sizes need to be normalized to the same size;
first, one person's skeleton is selected as the reference skeleton. For a certain frame of skeleton data, the body center point is selected as the root node, and all vectors from the points directly connected with the root node to the root node are calculated. Each vector is divided by its modulus to obtain its direction vector (of modulus 1); the direction vector is multiplied by the length of the corresponding vector in the reference skeleton, and the result is added to the coordinate of the root node to obtain the corrected coordinate of a point directly connected with the root node. The coordinates of the connected points are recorded as the normalized coordinate values of the corresponding skeleton points; the root node is then updated in breadth-first search order, and the above steps are repeated until the values of all skeleton points have been corrected. The algorithm comprises the following steps:
Input: the limb lengths l_1, ..., l_m in the reference skeleton, and the skeleton point coordinate values to be normalized;
The seventh step: return to the third step, until all limbs in the skeleton have been traversed;
Output: the skeleton point coordinates stored in set A are the corrected coordinates;
wherein m represents the number of limbs, l_i represents the length of the i-th limb in the reference skeleton, and p_i^start and p_i^end represent the coordinate values of the starting node and the ending node of the i-th limb in the reference estimate. Calculating the values of all skeleton points in this way yields all corrected skeleton point coordinates, scaling the estimated size while keeping the included angles between the limbs unchanged;
when the included angle between limbs changes, the included angle between vectors is chosen to describe the skeleton points, so as to avoid skeleton point deviation caused by the change of the included angle;
the steps for solving the included angle of the human joint vector are as follows:
to obtain the angle at a certain joint point, first obtain the three joint points used in the angle calculation, capture their three-dimensional coordinate values with the Kinect, construct the structure vectors between the three joint points, and then obtain the joint vector included angle by taking the arccosine of the normalized dot product of the vectors;
select the two other joint points connected with the first joint and obtain the three-dimensional coordinate values of the joint points captured by the Kinect; denote the two other joint points as B and C, and the first joint point as A;
construct the inter-joint structure vectors: the vector from the first joint point to point B is AB = B - A, the vector from the first joint point to point C is AC = C - A, and the vector from point B to point C is BC = C - B;
wherein the included angle θ = arccos(AB · AC / (|AB||AC|)) ranges between 0 and 180 degrees. To make the representation based on joint vector included angles more accurate, representative joint angles are selected according to the importance ranking of the joint angles during the action; the skeleton point positions are then corrected through size normalization and angle correction.
In a further embodiment, the behavior identification module is mainly used for identifying the extracted skeleton point behavior features: adjacent behavior features are shifted and spliced according to the adjacency relation of the graph, and after splicing only a single 1 x 1 convolution is needed to obtain the computed behavior features. For a graph of N nodes, let the feature dimension be C, so that the feature map size is N x C; a node v_i has n_i nodes adjacent to it, with adjacent node set B(v_i). The shift map module divides the features of the i-th node equally into n_i + 1 parts: the first part retains the node's own features, and the latter n_i parts are shifted from its neighbor nodes' features. Mathematically:
F'_i = F_i[0:c] || F_{B(v_i)_1}[c:2c] || ... || F_{B(v_i)_{n_i}}[n_i·c:C], where c = ⌊C/(n_i+1)⌋,
wherein F_i denotes the feature vector of node i, the subscript [a:b] denotes a Python-style slice, and the double vertical lines || denote splicing along the feature dimension.
A recognition method of the skeleton point behavior recognition system based on a shift graph convolutional neural network, comprising the following steps:
all pixels in the original image are processed in turn with the eight convolution templates, the edge intensity of each pixel is calculated, a threshold is applied for detection, and the final edge points are extracted, completing edge detection;
the Kirsch operator detects image edges by the following steps:
Step 4: repeat step 3, applying the remaining six templates in turn and performing the calculation, finally storing the larger of the gray values of the resulting image 1 and image 2 in buffer image 1;
Step 4: after the human behavior feature image is processed, the extraction module extracts the skeleton points of the image processed by the image processing module. After the image processing module processes the image acquired by the image acquisition module, pre-recorded skeleton points are matched to the human body edge map according to the closest body type of the person in the acquired image, and the matched skeleton points are then displayed on the human body edge map;
Step 5: after the skeleton points are extracted, their positions are corrected by the correction module. When the image acquisition module acquires human behavior images, people of different body types performing the same group of actions produce different three-dimensional skeleton point coordinates because their skeleton sizes differ, so the skeleton sizes need to be normalized to the same size. First, one person's skeleton is selected as the reference skeleton. For a certain frame of skeleton data, the body center point is selected as the root node, and all vectors from the points directly connected with the root node to the root node are calculated. Each vector is divided by its modulus to obtain its direction vector (of modulus 1); the direction vector is multiplied by the length of the corresponding vector in the reference skeleton, and the result is added to the coordinate of the root node to obtain the corrected coordinate of a point directly connected with the root node. The coordinates of the connected points are recorded as the normalized coordinate values of the corresponding skeleton points; the root node is then updated in breadth-first search order, and the above steps are repeated until all skeleton point values have been corrected. The correction method scales the estimated size while keeping the included angles between the limbs unchanged;
when the included angle between limbs changes, the included angle between vectors is chosen to describe the skeleton points, so as to avoid skeleton point deviation caused by the change of the included angle;
the steps for solving the included angle of the human joint vectors are as follows:
to obtain the angle at a certain joint point, first obtain the three joint points used in the angle calculation, capture their three-dimensional coordinate values with the Kinect, construct the structure vectors between the three joint points, and then obtain the joint vector included angle by taking the arccosine of the normalized dot product of the vectors;
select the two other joint points connected with the first joint and obtain the three-dimensional coordinate values of the joint points captured by the Kinect; denote the two other joint points as B and C, and the first joint point as A;
construct the inter-joint structure vectors: the vector from the first joint point to point B is AB = B - A, the vector from the first joint point to point C is AC = C - A, and the vector from point B to point C is BC = C - B;
wherein the included angle θ = arccos(AB · AC / (|AB||AC|)) ranges between 0 and 180 degrees. To make the representation based on joint vector included angles more accurate, representative joint angles are selected according to the importance ranking of the joint angles during the action; the skeleton point positions are then corrected through size normalization and angle correction;
wherein F_i denotes the feature vector of node i, the subscript [a:b] denotes a Python-style slice, and the double vertical lines || denote splicing along the feature dimension; the skeleton point behavior features are thereby identified.
Has the advantages that: the invention discloses a skeleton point behavior identification system based on a shift graph convolutional neural network, in which a behavior identification module is designed to identify skeleton point behavior through a novel graph convolution that markedly reduces the computation of graph convolution and differs from traditional graph convolution.
Drawings
FIG. 1 is a schematic diagram of the shift map convolution for skeleton point behavior recognition of the present invention.
FIG. 2 is a schematic diagram of the local design of the present invention.
FIG. 3 is a schematic diagram of the non-local design of the present invention.
FIG. 4 is a schematic diagram of conventional graph convolution for skeleton point behavior recognition.
FIG. 5 is a table comparing the accuracy and computational complexity of shift graph convolution with conventional graph convolution methods.
Detailed Description
Through the applicant's research and analysis, the reason for this problem (the large computation of traditional graph convolution) is that in the traditional graph convolution method the modeled convolution kernel can only cover the neighborhood of one point. In the skeletal point behavior recognition task, however, some behaviors (such as clapping) require modeling the positional relationship of points that are physically far apart (such as the two hands), which requires increasing the convolution kernel size of the graph convolution model; but the computation of graph convolution grows with the convolution kernel, making traditional graph convolution expensive. The behavior recognition module is therefore designed to recognize skeleton point behavior with a novel graph convolution that markedly reduces the computation of graph convolution. Unlike traditional graph convolution, the shift graph convolution does not expand the receptive field by enlarging the convolution kernel; instead, it shift-splices the graph features through a novel shift operation, achieving the same or even higher recognition accuracy while markedly reducing the computation and improving the calculation speed, and avoiding the growth in computation that traditional graph convolution suffers as the convolution kernel grows.
A skeleton point behavior identification system based on a shift graph convolutional neural network comprises: an image acquisition module, used for acquiring behavior images; an image processing module, used for processing the behavior images acquired by the image acquisition module; a skeleton point extraction module, used for extracting skeleton points from the images processed by the image processing module; and a behavior recognition module, used for identifying the skeleton point behavior features extracted by the extraction module;
the present invention does not specify a method of skeletal point extraction. There are many methods for extracting human bone points, for example: shooting from a camera, and then obtaining the human skeleton points by an algorithm. And directly obtaining the data from the Kinect camera. The human body wears an acceleration sensor, so that the position of the skeleton is directly obtained; the present invention is concerned with how to perform behavior recognition in the case where bone points have been acquired. However, the present invention is not limited to the method for extracting the bone points, and any method for extracting the bone points is within the scope of the present invention, but in this embodiment, a correction module is provided to perform the identification and correction of the image, and the image acquisition device is correspondingly changed to increase the multi-angle of the image acquisition.
The image acquisition module is based on an image acquisition device. The image acquisition device comprises cameras placed in an equilateral triangle and a rotating device arranged at the tail of each camera; the rotating device comprises a rotating shaft fixedly connected with the camera and a rotating motor sleeved on the rotating shaft.
The image acquisition module photographs human behavior through the three groups of cameras arranged in an equilateral triangle; the behavior images acquired by the three groups of cameras, mounted at the front, rear and side, are then presented on the computer terminal respectively, so that the image processing module can compare and process the images.
The image processing module mainly processes the human behavior image acquired by the image acquisition module into a human body edge map. When detecting image edges with the Kirsch edge detection operator, a 3 x 3 convolution template traverses the pixels of the image, the gray values of the eight pixels surrounding each pixel are examined in turn, and the difference between the weighted sum of the gray values of three adjacent pixels and the weighted sum of the remaining five pixels is calculated. Eight such templates are used, numbered 1 through 8, one for each of the eight compass directions.
All pixels in the original image are processed in turn with the eight convolution templates, the edge intensity of each pixel is calculated, a threshold is applied for detection, and the final edge points are extracted, completing edge detection. The Kirsch operator detects image edges by the following steps:
Step 1: acquire the data area pointer of the original image;
Step 4: repeat step 3, applying the remaining six templates in turn and performing the calculation, finally storing the larger of the gray values of the resulting image 1 and image 2 in buffer image 1;
Step 5: copy the processed image 1 back into the original image data; edge processing of the image is then realized in the program.
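The Kirsch filtering described above can be sketched in Python. The description does not reproduce the coefficients of the eight templates, so the standard Kirsch coefficients are assumed (+5 on three adjacent neighbors and -3 on the remaining five, matching the "three adjacent pixels versus remaining five" weighting above); `kirsch_edges` and its threshold value are illustrative names, not the patent's implementation.

```python
# The eight standard Kirsch 3x3 masks (assumed values; the patent numbers the
# templates 1-8 but does not reproduce the coefficients). Each mask weights
# three consecutive neighbours +5 and the remaining five -3.
KIRSCH_MASKS = []
base = [5, 5, 5, -3, -3, -3, -3, -3]  # ring of 8 neighbours, clockwise from top-left
for r in range(8):
    ring = base[-r:] + base[:-r]  # rotate the +5 run to the next direction
    # lay the rotated ring around the centre pixel (centre weight is 0)
    m = [[ring[0], ring[1], ring[2]],
         [ring[7], 0,       ring[3]],
         [ring[6], ring[5], ring[4]]]
    KIRSCH_MASKS.append(m)

def kirsch_edges(img, threshold=255):
    """Binary edge map: max response over the 8 masks, then thresholded."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            best = 0
            for m in KIRSCH_MASKS:
                s = sum(m[dy][dx] * img[y + dy - 1][x + dx - 1]
                        for dy in range(3) for dx in range(3))
                best = max(best, s)
            out[y][x] = 1 if best >= threshold else 0
    return out
```

A pixel's edge intensity is the maximum of its eight template responses, and the threshold decides whether it is kept as an edge point.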
The extraction module is used for extracting the skeleton points of the image processed by the image processing module. When the image processing module finishes processing the image acquired by the image acquisition module, the positions of the pre-recorded skeleton points are matched on the human body edge map according to the closest body type of the person in the acquired image, and the matched skeleton points are then displayed on the human body edge map.
The extraction module also comprises a correction module. When the image acquisition module acquires human behavior images, people of different body types performing the same group of actions produce different three-dimensional skeleton point coordinates because their skeleton sizes differ, so the skeleton sizes need to be normalized to the same size;
first, one person's skeleton is selected as the reference skeleton. For a certain frame of skeleton data, the body center point is selected as the root node, and all vectors from the points directly connected with the root node to the root node are calculated. Each vector is divided by its modulus to obtain its direction vector (of modulus 1); the direction vector is multiplied by the length of the corresponding vector in the reference skeleton, and the result is added to the coordinate of the root node to obtain the corrected coordinate of a point directly connected with the root node. The coordinates of the connected points are recorded as the normalized coordinate values of the corresponding skeleton points; the root node is then updated in breadth-first search order, and the above steps are repeated until the values of all skeleton points have been corrected. The algorithm comprises the following steps:
Input: the limb lengths l_1, ..., l_m in the reference skeleton, and the skeleton point coordinate values to be normalized;
The seventh step: return to the third step, until all limbs in the skeleton have been traversed;
Output: the skeleton point coordinates stored in set A are the corrected coordinates;
wherein m represents the number of limbs, l_i represents the length of the i-th limb in the reference skeleton, and p_i^start and p_i^end represent the coordinate values of the starting node and the ending node of the i-th limb in the reference estimate. Calculating the values of all skeleton points in this way yields all corrected skeleton point coordinates, scaling the estimated size while keeping the included angles between the limbs unchanged;
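The size-normalization procedure above can be sketched as follows, assuming a tree-structured skeleton given as parent links; the function name and the dictionary-based data layout (`coords`, `parents`, `ref_lengths`) are illustrative assumptions, not the patent's notation.

```python
from collections import deque

def normalize_skeleton(coords, parents, ref_lengths, root=0):
    """Rescale every bone to its reference length while keeping its direction.

    coords:      {joint_id: (x, y, z)} estimated joint coordinates
    parents:     {joint_id: parent_id} tree edges (the root has no entry)
    ref_lengths: {joint_id: float} reference length of the bone ending at joint
    """
    corrected = {root: coords[root]}
    # children lookup so we can traverse breadth-first from the root
    children = {}
    for j, p in parents.items():
        children.setdefault(p, []).append(j)
    queue = deque([root])
    while queue:
        p = queue.popleft()
        for j in children.get(p, []):
            # direction of the estimated bone (unit vector, modulus 1)
            v = [coords[j][k] - coords[p][k] for k in range(3)]
            norm = sum(c * c for c in v) ** 0.5
            u = [c / norm for c in v]
            # corrected child = corrected parent + reference length * direction
            corrected[j] = tuple(corrected[p][k] + ref_lengths[j] * u[k]
                                 for k in range(3))
            queue.append(j)
    return corrected
```

Because each bone is only rescaled along its own direction, the included angles between limbs are unchanged, which is exactly the invariant the correction step requires.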
when the included angle between limbs changes, the included angle between vectors is chosen to describe the skeleton points, so as to avoid skeleton point deviation caused by the change of the included angle;
the steps for solving the included angle of the human joint vector are as follows:
to obtain the angle at a certain joint point, first obtain the three joint points used in the angle calculation, capture their three-dimensional coordinate values with the Kinect, construct the structure vectors between the three joint points, and then obtain the joint vector included angle by taking the arccosine of the normalized dot product of the vectors;
select the two other joint points connected with the first joint and obtain the three-dimensional coordinate values of the joint points captured by the Kinect; denote the two other joint points as B and C, and the first joint point as A;
construct the inter-joint structure vectors: the vector from the first joint point to point B is AB = B - A, the vector from the first joint point to point C is AC = C - A, and the vector from point B to point C is BC = C - B;
wherein the included angle θ = arccos(AB · AC / (|AB||AC|)) ranges between 0 and 180 degrees. To make the representation based on joint vector included angles more accurate, representative joint angles are selected according to the importance ranking of the joint angles during the action; the skeleton point positions are then corrected through size normalization and angle correction.
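The structure-vector and arccosine computation above can be sketched as follows; the function name and argument order are illustrative assumptions.

```python
import math

def joint_angle(a, b, c):
    """Angle at joint a formed by the vectors a->b and a->c, in degrees (0-180)."""
    ab = [b[i] - a[i] for i in range(3)]  # structure vector to the first neighbour
    ac = [c[i] - a[i] for i in range(3)]  # structure vector to the second neighbour
    dot = sum(x * y for x, y in zip(ab, ac))
    nab = math.sqrt(sum(x * x for x in ab))
    nac = math.sqrt(sum(x * x for x in ac))
    # clamp for floating-point safety before taking the arccosine
    cos_t = max(-1.0, min(1.0, dot / (nab * nac)))
    return math.degrees(math.acos(cos_t))
```

Since arccosine maps [-1, 1] to [0°, 180°], the result always falls in the range stated above.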
The behavior recognition module is mainly used for recognizing the bone point behavior features extracted by the extraction module: adjacent behavior features are shift-spliced according to the adjacency relation of the graph, and after splicing only a single 1 × 1 convolution is needed to obtain the computed behavior features. For a graph with n nodes, let the feature dimension be c, so that the feature F has size n × c; suppose node v has m adjacent nodes, with the set of adjacent nodes B_v = {B_1, B_2, …, B_m}. For the v-th node, the shift-graph module divides its features equally into m + 1 parts: the first part retains the node's own features, and the latter m parts are shifted in from its neighbors' features, expressed mathematically as follows:
F̃_v = F_v[0 : c/(m+1)] ∥ F_(B_1)[c/(m+1) : 2c/(m+1)] ∥ … ∥ F_(B_m)[mc/(m+1) : c]

wherein the subscripts in square brackets follow Python slicing notation, and the double vertical lines ∥ denote concatenation along the feature dimension. For an intuitive understanding of the above formula, we take a graph of 7 nodes with 20-dimensional features as an example, as shown in FIG. 2 and FIG. 3. Here we discuss two cases:
1. the neighborhood of each point contains only the physically adjacent joints; we call this the local design, shown in FIG. 2;
2. the neighborhood of each point contains the entire human skeleton graph; we call this the non-local design, shown in FIG. 3;
For both designs, we take node 1 and node 2 as examples; a detailed explanation follows.
In FIG. 2, node 1 has 1 adjacent node (node 2), so its features are divided equally into 1 + 1 = 2 parts: the first part retains its own features, and the second part is shifted in from node 2. Likewise, node 2 has 3 adjacent nodes (nodes 1, 3, and 4), so its features are divided equally into 3 + 1 = 4 parts: the first part retains its own features, and the last 3 parts are shifted in from nodes 1, 3, and 4, respectively.
In FIG. 3, every other node is adjacent to any given node, so the current node's remaining feature parts are shifted in from all other nodes; node 1 and node 2 are shown as examples in FIG. 3. After the shift, the resulting features look like a spiral, which is the result of the intensive mixing of features from different nodes. Experiments show that of the two shift graph convolution designs, the non-local design achieves higher accuracy on the behavior recognition task, because it fuses the features of different nodes more effectively and performs efficient feature fusion even between distant nodes.
It is worth noting that, at the same recognition accuracy, the proposed shift graph convolution costs more than 3 times less computation than conventional graph convolution, which is very important for fast recognition; the saved convolution operations make the method faster (compare FIG. 1 and FIG. 4). Furthermore, the shift operation can be implemented with pointers in the C++ or CUDA languages, and can therefore be deployed very efficiently on a CPU or GPU.
Our main experiments are shown in FIG. 5. ST-GCN, Adaptive-GCN, and Adaptive-NL GCN are three typical conventional GCN methods. Our shift graph convolution (Shift GCN) includes both the Local Shift GCN and Non-Local Shift GCN designs. As the table shows, the FLOPs (floating-point operations, a measure of computational complexity) of our method are more than 3 times lower than those of conventional graph convolution, which is very important for fast recognition. Moreover, the accuracy of our method is higher than that of conventional graph convolution methods.
In addition, we also compare conventional graph convolutions with a reduced adjacency matrix, i.e., the models suffixed "(1 A)": their computation is comparable to ours, but their accuracy drops significantly. This means that reducing the computation of conventional graph convolution significantly degrades accuracy, whereas our shift graph convolution (Shift GCN) achieves accuracy exceeding all previous algorithms with a small amount of computation.
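The claimed "more than 3 times" saving can be checked with a back-of-envelope count, under the assumption that a conventional spatial graph-convolution layer applies one 1 × 1 convolution per each of K = 3 adjacency partitions while the shift design needs only a single 1 × 1 convolution after the multiplication-free shift; the joint and channel counts below are illustrative, not the patent's measured FLOPs.

```python
# Rough multiplication count for one spatial layer (illustrative numbers):
# 25 joints, 64 input and 64 output channels, K = 3 adjacency partitions.
n, c_in, c_out, K = 25, 64, 64, 3

gcn_mults = K * n * c_in * c_out    # one weight multiply per partition
shift_mults = n * c_in * c_out      # single 1x1 convolution after shifting

print(gcn_mults / shift_mults)      # -> 3.0
```

The adjacency-matrix aggregation that conventional graph convolution also performs adds further cost on top of this ratio, which is why the measured saving exceeds 3×.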
Description of the working principle: first, the cameras are controlled to rotate through the image acquisition module so as to acquire human behavior feature images; the rotating motor drives the rotating shaft, which in turn rotates the camera, thereby adjusting the camera's position. The image acquisition module captures human behavior through three groups of cameras placed in an equilateral triangle, and the behavior images acquired by the three groups of cameras mounted at the front, rear, and side are respectively displayed on a computer terminal, so that the image processing module can compare and process the images. The image processing module mainly processes the human behavior image acquired by the image acquisition module into a human body edge map: when detecting image edges with the Kirsch edge detection operator, a 3 × 3 convolution template traverses the pixel points in the image, the pixel gray values in the neighborhood around each pixel are examined one by one, and the difference between the weighted sum of the gray values of three adjacent pixels and the weighted sum of the gray values of the remaining five pixels is calculated; all pixels of the original image are processed with the eight convolution templates in turn, the edge intensity of each pixel is computed, a threshold is applied for detection, and the final edge points are extracted to complete edge detection. The method for detecting image edges with the Kirsch operator comprises the following steps:
step 1, acquire the data area pointer of the original image;
step 2, establish two buffer areas of the same size as the original image, mainly used for storing the original image and an original image copy; initialize both buffers to the original image copy, denoted image 1 and image 2 respectively;
step 3, set a Kirsch template for the convolution operation separately in each buffer area, traverse the pixels of the copy image in the two areas, perform the convolution operation pixel by pixel, compare the computed results, store the larger value into image 1, and copy image 1 into buffer image 2;
step 4, repeat step 3, applying the remaining six templates in turn, performing the calculation, and finally storing the larger gray values of the resulting image 1 and image 2 in buffer image 1;
step 5, copy the processed image 1 back into the original image data, completing the edge processing of the image.
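A minimal sketch of the Kirsch procedure above: the standard Kirsch compass masks are assumed here, since the patent's own template values are given only as figures, and the simple per-pixel loop (with the running "larger gray value" buffer) stands in for an optimized implementation.

```python
import numpy as np

# The eight 3x3 Kirsch compass templates (standard values, assumed).
KIRSCH = [np.array(k) for k in (
    [[ 5,  5,  5], [-3, 0, -3], [-3, -3, -3]],
    [[ 5,  5, -3], [ 5, 0, -3], [-3, -3, -3]],
    [[ 5, -3, -3], [ 5, 0, -3], [ 5, -3, -3]],
    [[-3, -3, -3], [ 5, 0, -3], [ 5,  5, -3]],
    [[-3, -3, -3], [-3, 0, -3], [ 5,  5,  5]],
    [[-3, -3, -3], [-3, 0,  5], [-3,  5,  5]],
    [[-3, -3,  5], [-3, 0,  5], [-3, -3,  5]],
    [[-3,  5,  5], [-3, 0,  5], [-3, -3, -3]],
)]

def kirsch_edges(img, threshold):
    """Convolve with each of the eight templates, keep the maximum
    response per pixel (the 'larger gray value' buffer), then threshold
    to extract the final edge points."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    strength = np.zeros((h, w))
    for k in KIRSCH:
        resp = np.zeros((h, w))
        for y in range(1, h - 1):          # skip the 1-pixel border
            for x in range(1, w - 1):
                resp[y, x] = abs((img[y-1:y+2, x-1:x+2] * k).sum())
        strength = np.maximum(strength, resp)   # keep the larger gray value
    return strength >= threshold
```

On a synthetic image with a vertical step edge, the pixels flanking the step exceed a modest threshold while the flat regions do not.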
After the human behavior feature image has been processed, the extraction module extracts bone points from the image processed by the image processing module: pre-entered bone points are matched on the human body edge map according to the closest body type in the acquired image, and the matched bone points are then displayed on the human body edge map. After the extraction of the bone points is completed, the correction module corrects the positions of the bone points. When the image acquisition module acquires human behavior images, people of different body types performing the same set of actions have skeletons of different sizes, so the three-dimensional coordinates of their bone points differ; the skeleton sizes must therefore be normalized to the same size. First, a person's skeleton is selected as the reference skeleton. For a given frame of skeleton data, the body center point is selected as the root node, and all vectors from the points directly connected to the root node are computed; each vector is divided by its modulus to obtain a direction vector of unit length, which is multiplied by the length of the corresponding vector in the reference skeleton; adding the resulting vector to the root node's coordinate gives the corrected coordinate of a point directly connected to the root node. The coordinates of the connected points are recorded as the normalized coordinate values of the corresponding bone points, the root node coordinate values are updated in the order of a breadth-first search, and the steps are repeated until the values of all bone points have been corrected; the correction method scales the estimated size while ensuring that the included angles between the limbs remain unchanged. When the included angle between limbs changes, the angles between vectors are chosen to describe the bone points, avoiding bone point deviation when the limb angles change. The steps for solving the human joint vector angle are as follows: to obtain the angle at a given joint point, first obtain the three joint points used in the angle calculation, capture their three-dimensional coordinate values with the Kinect, construct the structure vectors between the three joint points, and then obtain the magnitude of the joint vector angle using the arccosine. To make the representation based on joint vector angles more accurate, representative joint angles are selected according to the importance ranking of joint angles during the action, and the bone point positions are then corrected through size normalization and angle correction. After the bone point correction is completed, the behavior recognition module performs bone point behavior recognition: adjacent behavior features are shift-spliced according to the adjacency relation of the graph, and after splicing only a single 1 × 1 convolution is needed to obtain the computed behavior features.
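The breadth-first size normalization described above can be sketched as follows, assuming the skeleton is given as a rooted tree; the function name, the dictionary-based data layout, and the `(parent, child)` keying of reference limb lengths are illustrative assumptions.

```python
from collections import deque

def normalize_skeleton(coords, children, ref_lengths, root=0):
    """Breadth-first size normalization: each limb's unit direction vector
    (modulus 1) is multiplied by the reference skeleton's limb length and
    added to the parent's already-corrected coordinate.
    `coords` maps joint -> (x, y, z); `children` maps joint -> directly
    connected child joints; `ref_lengths[(p, c)]` is the reference length
    of the limb from p to c."""
    corrected = {root: coords[root]}        # root node keeps its coordinate
    queue = deque([root])
    while queue:                            # breadth-first search order
        p = queue.popleft()
        for ch in children.get(p, []):
            v = tuple(coords[ch][i] - coords[p][i] for i in range(3))
            norm = sum(x * x for x in v) ** 0.5
            u = tuple(x / norm for x in v)  # direction vector, modulus 1
            L = ref_lengths[(p, ch)]
            corrected[ch] = tuple(corrected[p][i] + L * u[i] for i in range(3))
            queue.append(ch)
    return corrected                        # the corrected bone point set
```

Because only the limb lengths are rescaled and the direction vectors are preserved, the included angles between limbs are unchanged, exactly as the correction method requires.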
The preferred embodiments of the present invention have been described in detail with reference to the accompanying drawings, however, the present invention is not limited to the specific details of the embodiments, and various equivalent changes can be made to the technical solution of the present invention within the technical idea of the present invention, and these equivalent changes are within the protection scope of the present invention.
Claims (7)
1. A skeleton point behavior recognition system based on a shift graph convolutional neural network, characterized by comprising:
a behavior recognition module for recognizing the bone point behavior features extracted by the extraction module;
the behavior recognition module is mainly used for recognizing the bone point behavior features extracted by the extraction module: adjacent behavior features are shift-spliced according to the adjacency relation of the graph, and after splicing only a single 1 × 1 convolution is needed to obtain the computed behavior features; for a graph with n nodes, let the feature dimension be c, so that the feature F has size n × c; suppose node v has m adjacent nodes, with the set of adjacent nodes B_v = {B_1, B_2, …, B_m}; for the v-th node, the shift-graph module divides its features equally into m + 1 parts, the first part retaining the node's own features and the latter m parts being shifted in from its neighbors' features, expressed mathematically as follows:

F̃_v = F_v[0 : c/(m+1)] ∥ F_(B_1)[c/(m+1) : 2c/(m+1)] ∥ … ∥ F_(B_m)[mc/(m+1) : c]

wherein the subscripts in square brackets follow Python slicing notation, and the double vertical lines ∥ denote concatenation along the feature dimension.
2. The system of claim 1, characterized in that: it further comprises an image acquisition module for acquiring behavior images;
the image acquisition module is based on an image acquisition device comprising cameras placed in an equilateral triangle and a rotating device arranged at the rear of each camera, the rotating device comprising a rotating shaft fixedly connected with the camera and a rotating motor sleeved on the rotating shaft.
3. The system of claim 2, characterized in that: the image acquisition module captures human behavior through the three groups of cameras placed in an equilateral triangle, and the behavior images acquired by the three groups of cameras mounted at the front, rear, and side are respectively displayed on a computer terminal, so that the image processing module can compare and process the images.
4. The system of claim 1, characterized in that: it further comprises an image processing module for processing the acquired behavior images;
the image processing module is mainly used for processing the human behavior image acquired by the image acquisition module into a human body edge map: when detecting image edges with the Kirsch edge detection operator, a 3 × 3 convolution template traverses the pixel points in the image, the pixel gray values in the neighborhood around each pixel are examined one by one, and the difference between the weighted sum of the gray values of three adjacent pixels and the weighted sum of the gray values of the remaining five pixels is calculated; the convolution templates are as follows:
Template 1: [5 5 5; −3 0 −3; −3 −3 −3]   Template 2: [5 5 −3; 5 0 −3; −3 −3 −3]   Template 3: [5 −3 −3; 5 0 −3; 5 −3 −3]   Template 4: [−3 −3 −3; 5 0 −3; 5 5 −3]
Template 5: [−3 −3 −3; −3 0 −3; 5 5 5]   Template 6: [−3 −3 −3; −3 0 5; −3 5 5]   Template 7: [−3 −3 5; −3 0 5; −3 −3 5]   Template 8: [−3 5 5; −3 0 5; −3 −3 −3]
sequentially processing all pixels in the original image by using eight convolution templates, calculating to obtain the edge intensity of the pixels, detecting by using a threshold value, extracting the final edge point, and finishing edge detection;
the method for detecting image edges with the Kirsch operator comprises the following steps:
step 1, acquiring a data area pointer of an original image;
step 2, establishing two buffer areas, wherein the size of the buffer areas is the same as that of the original image, the buffer areas are mainly used for storing the original image and an original image copy, and the two buffer areas are initialized into the original image copy and are respectively marked as an image 1 and an image 2;
step 3, set a Kirsch template for the convolution operation separately in each buffer area, traverse the pixels of the copy image in the two areas, perform the convolution operation pixel by pixel, compare the computed results, store the larger value into image 1, and copy image 1 into buffer image 2;
step 4, repeat step 3, applying the remaining six templates in turn, performing the calculation, and finally storing the larger gray values of the resulting image 1 and image 2 in buffer image 1;
and 5, copying the processed image 1 into original image data, and programming to realize edge processing of the image.
5. The system of claim 1, characterized in that: it further comprises a bone point extraction module for extracting bone points from the image processed by the image processing module;
when the image processing module finishes processing the image acquired by the image acquisition module, pre-entered bone points are matched on the human body edge map according to the closest body type in the acquired image, and the matched bone points are then displayed on the human body edge map.
6. The system of claim 5, characterized in that: the extraction module further comprises a correction module; when the image acquisition module acquires human behavior images, people of different body types performing the same set of actions have skeletons of different sizes, so the three-dimensional coordinates of their bone points differ, and the skeleton sizes must therefore be normalized to the same size;
first, a person's skeleton is selected as the reference skeleton; for a given frame of skeleton data, the body center point is selected as the root node, and all vectors from the points directly connected to the root node are computed; each vector is divided by its modulus to obtain a direction vector of unit length, which is multiplied by the length of the corresponding vector in the reference skeleton; adding the resulting vector to the root node's coordinate gives the corrected coordinate of a point directly connected to the root node; the coordinates of the connected points are recorded as the normalized coordinate values of the corresponding bone points, the root node coordinate values are updated in the order of a breadth-first search, and the steps are repeated until the values of all bone points have been corrected, wherein the algorithm comprises the following steps:
Input: the limb lengths in the reference skeleton, and the bone point coordinate values to be normalized;
the seventh step: return to the third step, until all limbs in the skeleton have been traversed;
Output: the bone point coordinates stored in set A are the corrected coordinates;
wherein the value of i indexes the limbs of the body, l_i represents the length of the i-th limb in the reference skeleton, and p_i^start and p_i^end respectively represent the coordinate values of the starting node and ending node of the i-th limb in the reference skeleton; by calculating the values of all skeleton points in this way, all corrected skeleton point coordinates are obtained, scaling the estimated size while ensuring that the included angles between the limbs remain unchanged;
when the included angle between the limbs changes, the included angle between the vectors is selected to describe the skeleton points so as to avoid the skeleton point deviation when the included angle between the limbs changes;
the steps for solving the included angle of the human joint vector are as follows:
to obtain the angle at a given joint point, first obtain the three joint points used in the angle calculation and capture their three-dimensional coordinate values with the Kinect, construct the structure vectors between the three joint points, and then obtain the magnitude of the joint vector included angle using the arccosine;
select the other two joint points connected to the first joint and obtain the three-dimensional coordinate values of the joint points captured by the Kinect; denote the other two joint points as A and B, and the first joint point as O;
construct the inter-joint structure vectors: the vector from the first joint point O to point A is OA = A − O, the vector from the first joint point O to point B is OB = B − O, and the vector from point A to point B is AB = B − A;
wherein the joint angle θ = arccos( (OA · OB) / (|OA| |OB|) ) lies in the range 0° to 180°. To make the representation based on joint vector angles more accurate, representative joint angles are selected according to the importance ranking of the joint angles during the action, and the bone point positions are then corrected through size normalization and angle correction.
7. A recognition method of the skeleton point behavior recognition system based on the shift graph convolutional neural network according to claim 1, comprising:
step 1, first control the cameras to rotate through the image acquisition module so as to acquire human behavior feature images; the rotating motor drives the rotating shaft, which in turn rotates the camera, thereby adjusting the camera's position;
step 2, the image acquisition module captures human behavior through three groups of cameras placed in an equilateral triangle, and the behavior images acquired by the three groups of cameras mounted at the front, rear, and side are respectively displayed on a computer terminal, so that the image processing module can compare and process the images;
step 3, the image processing module mainly processes the human behavior image acquired by the image acquisition module into a human body edge map: when detecting image edges with the Kirsch edge detection operator, a 3 × 3 convolution template traverses the pixel points in the image, the pixel gray values in the neighborhood around each pixel are examined one by one, and the difference between the weighted sum of the gray values of three adjacent pixels and the weighted sum of the gray values of the remaining five pixels is calculated;
sequentially processing all pixels in the original image by using eight convolution templates, calculating to obtain the edge intensity of the pixels, detecting by using a threshold value, extracting the final edge point, and finishing edge detection;
the method for detecting image edges with the Kirsch operator comprises the following steps:
step 1, acquiring a data area pointer of an original image;
step 2, establishing two buffer areas, wherein the size of the buffer areas is the same as that of the original image, the buffer areas are mainly used for storing the original image and an original image copy, and the two buffer areas are initialized into the original image copy and are respectively marked as an image 1 and an image 2;
step 3, set a Kirsch template for the convolution operation separately in each buffer area, traverse the pixels of the copy image in the two areas, perform the convolution operation pixel by pixel, compare the computed results, store the larger value into image 1, and copy image 1 into buffer image 2;
step 4, repeat step 3, applying the remaining six templates in turn, performing the calculation, and finally storing the larger gray values of the resulting image 1 and image 2 in buffer image 1;
step 5, copying the processed image 1 into original image data, and programming to realize edge processing of the image;
step 4, after the human behavior feature image has been processed, the extraction module extracts bone points from the image processed by the image processing module; after the image processing module processes the image acquired by the image acquisition module, pre-entered bone points are matched on the human body edge map according to the closest body type in the acquired image, and the matched bone points are then displayed on the human body edge map;
step 5, after the extraction of the bone points is completed, the correction module corrects the positions of the bone points; when the image acquisition module acquires human behavior images, people of different body types performing the same set of actions have skeletons of different sizes, so the three-dimensional coordinates of their bone points differ, and the skeleton sizes must therefore be normalized to the same size; first, a person's skeleton is selected as the reference skeleton; for a given frame of skeleton data, the body center point is selected as the root node, and all vectors from the points directly connected to the root node are computed; each vector is divided by its modulus to obtain a direction vector of unit length, which is multiplied by the length of the corresponding vector in the reference skeleton; adding the resulting vector to the root node's coordinate gives the corrected coordinate of a point directly connected to the root node; the coordinates of the connected points are recorded as the normalized coordinate values of the corresponding bone points, the root node coordinate values are updated in the order of a breadth-first search, and the steps are repeated until the values of all bone points have been corrected; the correction method scales the estimated size while ensuring that the included angles between the limbs remain unchanged;
when the included angle between the limbs changes, the included angle between the vectors is selected to describe the skeleton points so as to avoid the skeleton point deviation when the included angle between the limbs changes;
the steps for solving the included angle of the human joint vector are as follows:
to obtain the angle at a given joint point, first obtain the three joint points used in the angle calculation and capture their three-dimensional coordinate values with the Kinect, construct the structure vectors between the three joint points, and then obtain the magnitude of the joint vector included angle using the arccosine;
select the other two joint points connected to the first joint and obtain the three-dimensional coordinate values of the joint points captured by the Kinect; denote the other two joint points as A and B, and the first joint point as O;
construct the inter-joint structure vectors: the vector from the first joint point O to point A is OA = A − O, the vector from the first joint point O to point B is OB = B − O, and the vector from point A to point B is AB = B − A;
wherein the joint angle θ = arccos( (OA · OB) / (|OA| |OB|) ) lies in the range 0° to 180°; to make the representation based on joint vector angles more accurate, representative joint angles are selected according to the importance ranking of the joint angles during the action, and the bone point positions are then corrected through size normalization and angle correction;
step 6, after the bone point correction is completed, the behavior recognition module performs bone point behavior recognition: adjacent behavior features are shift-spliced according to the adjacency relation of the graph, and after splicing only a single 1 × 1 convolution is needed to obtain the computed behavior features; for a graph with n nodes, let the feature dimension be c, so that the feature F has size n × c; suppose node v has m adjacent nodes, with the set of adjacent nodes B_v = {B_1, B_2, …, B_m}; for the v-th node, the shift-graph module divides its features equally into m + 1 parts, the first part retaining the node's own features and the latter m parts being shifted in from its neighbors' features, expressed mathematically as follows:

F̃_v = F_v[0 : c/(m+1)] ∥ F_(B_1)[c/(m+1) : 2c/(m+1)] ∥ … ∥ F_(B_m)[mc/(m+1) : c]

wherein the subscripts in square brackets follow Python slicing notation, and the double vertical lines ∥ denote concatenation along the feature dimension.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010419839.4A CN111582220B (en) | 2020-05-18 | 2020-05-18 | Bone point behavior recognition system based on shift map convolution neural network and recognition method thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111582220A true CN111582220A (en) | 2020-08-25 |
CN111582220B CN111582220B (en) | 2023-05-26 |
Family
ID=72123047
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010419839.4A Active CN111582220B (en) | 2020-05-18 | 2020-05-18 | Bone point behavior recognition system based on shift map convolution neural network and recognition method thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111582220B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2017133009A1 (en) * | 2016-02-04 | 2017-08-10 | 广州新节奏智能科技有限公司 | Method for positioning human joint using depth image of convolutional neural network |
CN109522793A (en) * | 2018-10-10 | 2019-03-26 | 华南理工大学 | More people's unusual checkings and recognition methods based on machine vision |
CN111340011A (en) * | 2020-05-18 | 2020-06-26 | 中国科学院自动化研究所南京人工智能芯片创新研究院 | Self-adaptive time sequence shift neural network time sequence behavior identification method and system |
Non-Patent Citations (1)
Title |
---|
HAN Minjie: "Multimodal action recognition based on a deep learning framework", Computer and Modernization (计算机与现代化) *
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112009717A (en) * | 2020-08-31 | 2020-12-01 | 南京迪沃航空技术有限公司 | Airport bulk cargo loader, machine leaning anti-collision system for bulk cargo loader and anti-collision method of machine leaning anti-collision system |
CN112009717B (en) * | 2020-08-31 | 2022-08-02 | 南京迪沃航空技术有限公司 | Airport bulk cargo loader, machine leaning anti-collision system for bulk cargo loader and anti-collision method of machine leaning anti-collision system |
CN113158782A (en) * | 2021-03-10 | 2021-07-23 | 浙江工业大学 | Multi-person concurrent interaction behavior understanding method based on single-frame image |
CN113158782B (en) * | 2021-03-10 | 2024-03-26 | 浙江工业大学 | Multi-person concurrent interaction behavior understanding method based on single-frame image |
CN113627409A (en) * | 2021-10-13 | 2021-11-09 | 南通力人健身器材有限公司 | Body-building action recognition monitoring method and system |
CN114463840A (en) * | 2021-12-31 | 2022-05-10 | 北京工业大学 | Skeleton-based method for recognizing human body behaviors through shift graph convolution network |
JP7485154B1 (en) | 2023-05-19 | 2024-05-16 | トヨタ自動車株式会社 | Video Processing System |
Also Published As
Publication number | Publication date |
---|---|
CN111582220B (en) | 2023-05-26 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||