CN109977834A - Method and apparatus for segmenting a human hand and an interacting object from a depth image - Google Patents
- Publication number
- CN109977834A CN109977834A CN201910207311.8A CN201910207311A CN109977834A CN 109977834 A CN109977834 A CN 109977834A CN 201910207311 A CN201910207311 A CN 201910207311A CN 109977834 A CN109977834 A CN 109977834A
- Authority
- CN
- China
- Prior art keywords
- depth image
- human hand
- pixel
- image
- data set
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
- G06V40/28—Recognition of hand or arm movements, e.g. recognition of deaf sign language
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Health & Medical Sciences (AREA)
- Psychiatry (AREA)
- Social Psychology (AREA)
- Human Computer Interaction (AREA)
- Image Analysis (AREA)
Abstract
The present application proposes a method and apparatus for segmenting a human hand and an interacting object from a depth image. The method includes: constructing a depth-image-based hand segmentation dataset using a color-image-based segmentation method; training a segmentation model on the depth-image-based hand segmentation dataset, the segmentation model consisting of an encoder, an attention transfer model, and a decoder; and segmenting a depth image to be processed with the segmentation model to obtain a classification label map corresponding to that depth image, in which the value of each pixel is the class value of that pixel. By training the segmentation model on the depth-image-based hand segmentation dataset and applying it to the depth image to be processed, the method achieves pixel-level segmentation of the hand and the object, improves environmental robustness, attains higher segmentation accuracy, and can handle hand-object segmentation in complex interaction scenarios.
Description
Technical field
This application relates to the technical field of computer vision, and in particular to a method and apparatus for segmenting a human hand and an interacting object from a depth image.
Background technique
Hand segmentation is a fundamental problem in research fields such as gesture recognition, hand tracking, and hand reconstruction. Compared with the motion of an isolated hand, hand motion while interacting with an object is far more important for human-computer interaction and virtual reality. Neural-network-based semantic segmentation models have matured considerably in recent years, but existing models still suffer from low environmental robustness and poor segmentation accuracy, and cannot handle hand segmentation in complex interaction scenarios.
Summary of the invention
The present application proposes a method and apparatus for segmenting a human hand and an interacting object from a depth image, to address the problems in the related art that existing hand segmentation models have low environmental robustness and poor segmentation accuracy and cannot handle hand segmentation in complex interaction scenarios.
An embodiment of one aspect of the application proposes a method for segmenting a human hand and an interacting object from a depth image, including:
constructing a depth-image-based hand segmentation dataset using a color-image-based segmentation method;
training a segmentation model on the depth-image-based hand segmentation dataset, the segmentation model consisting of an encoder, an attention transfer model, and a decoder;
segmenting a depth image to be processed with the segmentation model to obtain a classification label map corresponding to the depth image to be processed, where the value of each pixel in the classification label map is the class value of that pixel, and the class value characterizes the category to which the pixel belongs in the depth image to be processed.
In the method for segmenting a human hand and an interacting object from a depth image according to this embodiment, a depth-image-based hand segmentation dataset is constructed using a color-image-based segmentation method; a segmentation model consisting of an encoder, an attention transfer model, and a decoder is trained on this dataset; and the depth image to be processed is segmented with the model to obtain a corresponding classification label map, in which the value of each pixel is its class value, so that the category of each pixel can be determined from its class value. By training the segmentation model on the depth-image-based hand segmentation dataset and using it to segment the depth image to be processed, pixel-level segmentation of hand and object is realized, environmental robustness is improved, segmentation accuracy is higher, and hand-object segmentation in complex interaction scenarios can be handled.
An embodiment of another aspect of the application proposes an apparatus for segmenting a human hand and an interacting object from a depth image, including:
a construction module, configured to construct a depth-image-based hand segmentation dataset using a color-image-based segmentation method;
a training module, configured to train a segmentation model on the depth-image-based hand segmentation dataset, the segmentation model consisting of an encoder, an attention transfer model, and a decoder;
a recognition module, configured to segment a depth image to be processed with the segmentation model to obtain a classification label map corresponding to the depth image to be processed, where the value of each pixel in the classification label map is the class value of that pixel, and the class value characterizes the category to which the pixel belongs in the depth image to be processed.
In the apparatus for segmenting a human hand and an interacting object from a depth image according to this embodiment, a depth-image-based hand segmentation dataset is constructed using a color-image-based segmentation method; a segmentation model consisting of an encoder, an attention transfer model, and a decoder is trained on this dataset; and the depth image to be processed is segmented with the model to obtain a corresponding classification label map, in which the value of each pixel is its class value, so that the category of each pixel can be determined from its class value. By training the segmentation model on the depth-image-based hand segmentation dataset and using it to segment the depth image to be processed, pixel-level segmentation of hand and object is realized, environmental robustness is improved, segmentation accuracy is higher, and hand-object segmentation in complex interaction scenarios can be handled.
Additional aspects and advantages of the application will be set forth in part in the following description, and in part will become apparent from the description or be learned through practice of the application.
Detailed description of the invention
The above and/or additional aspects and advantages of the application will become apparent and readily understood from the following description of the embodiments taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a flow diagram of a method for segmenting a human hand and an interacting object from a depth image provided by an embodiment of the application;
Fig. 2 is a structural diagram of a segmentation model provided by an embodiment of the application;
Fig. 3 is a structural diagram of an attention mechanism model provided by an embodiment of the application;
Fig. 4 is a flow diagram of another method for segmenting a human hand and an interacting object from a depth image provided by an embodiment of the application;
Fig. 5 is a diagram of the training process of a segmentation model provided by an embodiment of the application;
Fig. 6 illustrates the effect of using the contour error, provided by an embodiment of the application;
Fig. 7 is a structural diagram of an apparatus for segmenting a human hand and an interacting object from a depth image provided by an embodiment of the application.
Specific embodiment
Embodiments of the application are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar elements, or elements with the same or similar functions. The embodiments described below with reference to the drawings are exemplary; they are intended to explain the application and should not be construed as limiting it.
The method and apparatus for segmenting a human hand and an interacting object from a depth image according to embodiments of the application are described below with reference to the drawings.
Fig. 1 is a flow diagram of a method for segmenting a human hand and an interacting object from a depth image provided by an embodiment of the application.
As shown in Fig. 1, the method for segmenting a human hand and an interacting object from a depth image includes:
Step 101: construct a depth-image-based hand segmentation dataset using a color-image-based segmentation method.
Since a depth camera can capture color and depth images simultaneously, color and depth images of the hand interacting with an object are captured with a depth camera, yielding multiple color-depth image pairs. The depth images are then processed based on the color images, producing a hand segmentation dataset on depth images.
To improve segmentation accuracy, in this embodiment the images may be captured under a fixed light source of constant brightness and color temperature, using objects whose color differs strongly from skin color. For example, images of a hand holding a blue pen may be captured under constant brightness and lighting.
Step 102: train a segmentation model on the depth-image-based hand segmentation dataset.
After the depth-image-based hand segmentation dataset is obtained, an initial neural network model is trained on it until a segmentation model meeting the requirements is obtained. During training, a loss function can be used to measure the predictive performance of the segmentation model.
In this embodiment, the segmentation model consists of an encoder, an attention transfer model, and a decoder. The encoder uses a large-scale convolutional network, and the decoder uses deconvolution (transposed convolution) layers to restore high-level information to the pixel scale of the image.
Fig. 2 is a structural diagram of a segmentation model provided by an embodiment of the application. As shown in Fig. 2, the segmentation model consists of an encoder, an attention transfer model, and a decoder. In this embodiment, an attention mechanism is added between the encoder and the decoder to strengthen the same-layer connections between them; by fusing multi-scale image features into attention feature maps, the accuracy and effectiveness of information transfer between the two can be improved.
Fig. 3 is a structural diagram of an attention mechanism model provided by an embodiment of the application. In Fig. 3, the feature maps of layers 1, 2, ..., i-1 are multiplied together to obtain the low-level attention map (FineAtt); each of layers 1 to i-1 includes a scale normalization network (SqueezeNet, SN) and a bilinear down-sampling layer (Bilinear down-sampling, DS), where SN normalizes the feature-map dimensions. The feature maps of layers i+1, i+2, ..., n are multiplied together to obtain the high-level attention map (CoarseAtt); each of layers i+1 to n includes an SN and an up-sampling layer (US). DS and US reduce and enlarge the feature-map scale, respectively. The obtained FineAtt and CoarseAtt attention maps are concatenated with the feature map of layer i and fed into the decoder. The attention mechanism is applied in this way at every feature-map scale from layer 1 to layer n in Fig. 3.
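The multi-scale fusion described above can be sketched as follows. This is a minimal NumPy illustration, not the trained network: the SN block is approximated by a simple per-map normalization, the bilinear resampling is replaced by nearest-neighbour resampling to keep the sketch short, and all layers are assumed to share the same channel count so the elementwise products are defined.

```python
import numpy as np

def resize_nn(fmap, out_h, out_w):
    """Nearest-neighbour resize of an (H, W, C) feature map.
    (The patent uses bilinear down-sampling; nearest keeps this short.)"""
    h, w, _ = fmap.shape
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return fmap[rows][:, cols]

def attention_maps(features, i):
    """Fuse multi-scale feature maps into FineAtt / CoarseAtt at the
    scale of layer i, then concatenate them with layer i's features.
    `features` is a list of (H_k, W_k, C) arrays, layer 1 first.
    Hypothetical sketch: SN is approximated by per-map normalization
    rather than a learned block."""
    h, w, c = features[i].shape

    def normalized(f):  # stand-in for the SN (scale normalization) step
        return f / (np.abs(f).max() + 1e-8)

    fine = np.ones((h, w, c))
    for f in features[:i]:                 # layers 1 .. i-1, down-sampled
        fine = fine * resize_nn(normalized(f), h, w)
    coarse = np.ones((h, w, c))
    for f in features[i + 1:]:             # layers i+1 .. n, up-sampled
        coarse = coarse * resize_nn(normalized(f), h, w)

    # FineAtt and CoarseAtt are concatenated with layer i's feature map
    return np.concatenate([fine, coarse, features[i]], axis=-1)
```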
Step 103: segment the depth image to be processed with the segmentation model to obtain a classification label map corresponding to the depth image to be processed.
In this embodiment, the depth image to be processed may be acquired by a depth camera before recognition.
After the segmentation model is obtained, the depth image to be processed is input into the trained segmentation model, which outputs a classification label map corresponding to it. The classification label map has the same size as the depth image to be processed, and the value of each pixel in the label map is the class value of that pixel. The class value characterizes the category to which the pixel belongs in the depth image to be processed. The pixel coordinates are implicit in the pixel arrangement of the image, and the value of each pixel in the input depth image is its depth value.
The category to which a pixel belongs in the depth image to be processed may include hand, object, and background. In a specific implementation, these three categories may be represented by different class values, e.g. 0 for background, 1 for hand, and 2 for object.
In this embodiment, from the class value of each pixel and the category that value encodes, the segmentation of hand and object in the depth image to be processed can be obtained, realizing the segmentation of the hand and the interacting object.
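As a small illustration of how the class values are consumed downstream, the label map can be split into per-category masks, using the class values 0/1/2 suggested in the text:

```python
import numpy as np

BACKGROUND, HAND, OBJECT = 0, 1, 2  # class values as suggested in the text

def split_masks(label_map):
    """Turn a classification label map (H, W) of class values into
    boolean masks for the hand and the interacting object."""
    return label_map == HAND, label_map == OBJECT

label_map = np.array([[0, 1, 1],
                      [0, 1, 2],
                      [0, 2, 2]])
hand, obj = split_masks(label_map)
# hand marks 3 hand pixels, obj marks 3 object pixels
```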
As shown in Fig. 2, the depth image to be processed is input into the deep network model, passing first through the encoder, then through the attention transfer model, and finally through the decoder, which outputs the classification label map of the depth image to be processed. From the class value of each pixel, the positions of the hand and the object are obtained, realizing hand-object segmentation.
In this embodiment, from the class values of the pixels in the label map output by the segmentation model and the categories they encode, the pixels belonging to the hand and those belonging to the object can be determined, so that the interacting hand and object in the image to be processed are separated. Pixel-level segmentation of hand and object is realized with high accuracy, and the interacting hand and object can be segmented even in complex cases.
In one embodiment of the application, the depth-image-based hand segmentation training dataset can be constructed from color images. This is described in detail below with reference to Fig. 4, which is a flow diagram of another method for segmenting a human hand and an interacting object from a depth image provided by an embodiment of the application.
As shown in Fig. 4, the method for constructing the depth-image-based hand segmentation dataset includes:
Step 301: acquire multiple pairs of color images and depth images of the hand interacting with objects.
In this embodiment, some objects whose color differs strongly from skin color are first collected. Then images of the hand interacting with each object are captured with a depth camera, yielding multiple color-depth image pairs. In addition, to increase the amount of data, images of different interaction poses between the hand and the same object may be captured.
When capturing images with the depth camera, the illumination is kept fixed, e.g. a fixed light source of constant brightness and color temperature, to ensure that the captured color images are clear and shadow-free.
Step 302: perform HSV-color-space-based object segmentation on all color images to obtain the class value of each pixel in every color image.
In this embodiment, the background in all color and depth images may first be removed by a depth threshold, retaining only the hand and the object. Then, using the standard conversion from the RGB color space to the HSV color space, all acquired color images are converted to HSV. The parameters of the HSV color space are hue (H), saturation (S), and value (V).
The HSV representation of each color image is then segmented to obtain the class value of each pixel in the image. Specifically, the distributions of pixels from multiple bare-hand samples and interaction samples are analyzed in HSV space; the overlap region of the samples is the region of hand pixels, and several linear constraints are fitted to it. All color images are then analyzed: pixels inside the constraints are labeled as hand, and pixels outside them as object.
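A rough sketch of this step, assuming float RGB images in [0, 1]. The hand/object rule used here is a single hypothetical hue-saturation band, whereas the method fits several linear constraints to real hand samples:

```python
import numpy as np

def rgb_to_hsv(img):
    """Vectorized RGB -> HSV for an (H, W, 3) float image in [0, 1].
    Hue is returned in [0, 1) (multiply by 360 for degrees)."""
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    v = img.max(-1)                       # value = max channel
    c = v - img.min(-1)                   # chroma
    s = np.where(v > 0, c / np.where(v > 0, v, 1), 0)
    safe_c = np.where(c > 0, c, 1)        # avoid division by zero
    h = np.select(
        [c == 0, v == r, v == g],
        [0, (g - b) / safe_c, (b - r) / safe_c + 2],
        (r - g) / safe_c + 4,
    )
    return np.stack([(h / 6) % 1.0, s, v], axis=-1)

def label_pixels(img):
    """Label each foreground pixel hand (1) or object (2) with an
    illustrative constraint in HSV; the real constraints are linear
    boundaries fitted to the overlap region of hand samples."""
    hsv = rgb_to_hsv(img)
    h, s = hsv[..., 0], hsv[..., 1]
    is_hand = (h < 50 / 360) & (s > 0.2)  # hypothetical skin-tone band
    return np.where(is_hand, 1, 2).astype(np.uint8)
```

With this rule, a reddish skin-tone pixel falls inside the band (hand) while a blue pen pixel falls outside it (object), which is why the text recommends objects whose color differs strongly from skin.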
Step 303: for each pair of color and depth images, map each pixel in the color image to the corresponding pixel in the depth image, and construct the depth-image-based hand segmentation training dataset.
For each pair of color and depth images, the color image and the depth image are aligned pixel by pixel: the intrinsic and extrinsic camera parameters of the depth and color sensors are estimated, and the depth point cloud is transformed by an affine transformation into the color camera space. An automated annotation method is used to generate the ground-truth classification label map based on the color image, which is also the ground-truth label map of the depth image corresponding to that color image. In the ground-truth label map, the class value of each pixel may be 0 for background, 1 for hand, and 2 for object.
In this embodiment, all depth images together with their ground-truth label maps constitute the depth-image-based hand segmentation training dataset.
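The alignment can be sketched as the usual back-project / transform / re-project chain. Here K_d and K_c stand for the estimated depth and color intrinsic matrices and (R, t) for the estimated extrinsic rigid transform; all of these are inputs the method estimates, not values given in the text:

```python
import numpy as np

def depth_to_color(depth, K_d, K_c, R, t):
    """Map each depth pixel into the color camera, as in the alignment
    step: back-project with the depth intrinsics K_d, apply the
    extrinsic transform (R, t), re-project with the color intrinsics
    K_c. Returns an (H, W, 2) array of color-image pixel coordinates."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T
    pts = np.linalg.inv(K_d) @ pix * depth.reshape(-1)  # 3D points, depth frame
    pts = R @ pts + t[:, None]                          # into the color frame
    proj = K_c @ pts
    uv = proj[:2] / proj[2]                             # perspective divide
    return uv.T.reshape(h, w, 2)
```

With identical intrinsics and an identity transform, every pixel maps back to itself, which is a convenient sanity check for the calibration code.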
Further, in one embodiment of the application, the depth images may be preprocessed before the mapping in order to improve segmentation accuracy: they are denoised with morphological operations and contour filtering, and the background in the depth images is analyzed so that only the hand and the object interacting with it are retained.
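A minimal version of this preprocessing, assuming depth values in meters and using SciPy's morphology; the depth thresholds and the structuring-element size are hypothetical choices for the sketch:

```python
import numpy as np
from scipy import ndimage

def preprocess_depth(depth, near=0.2, far=1.2):
    """Illustrative depth-image preprocessing: threshold out the
    background by depth, then morphologically open the foreground mask
    to remove small speckle noise (the text also mentions contour
    filtering, omitted here)."""
    fg = (depth > near) & (depth < far)   # depth-threshold background removal
    fg = ndimage.binary_opening(fg, structure=np.ones((3, 3)))
    return np.where(fg, depth, 0.0)
```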
After the dataset for training the segmentation model is obtained, the depth-image-based hand segmentation training dataset may first be divided into a training dataset and a test dataset, where the number of depth images in the training dataset is far greater than that in the test dataset; the training dataset is used for training, and the test dataset is used to test the trained model.
Then the initial segmentation model is trained on the training dataset, and a first loss function is computed. The first loss function is the softmax cross-entropy loss, as in formula (1):

L = -Σ_i y_i · log(e^{x_i} / Σ_j e^{x_j})        (1)

where y_i denotes the ground truth, x_i denotes the predicted value output by the segmentation model, and the subscripts i and j index the different categories. For example, with three pixel categories, the loss for class i = 0 is computed first as L_0 = -y_0 · log(e^{x_0} / Σ_j e^{x_j}); likewise L_1 = -y_1 · log(e^{x_1} / Σ_j e^{x_j}) for class i = 1 and L_2 = -y_2 · log(e^{x_2} / Σ_j e^{x_j}) for class i = 2. The loss of the model is then L = L_0 + L_1 + L_2.
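The softmax cross-entropy loss can be written directly in NumPy; the max-shift before the log-sum-exp is a standard numerical-stability trick, not something stated in the text:

```python
import numpy as np

def softmax_cross_entropy(logits, onehot):
    """Per-pixel softmax cross-entropy over the classes
    (background / hand / object): -sum_i y_i * log softmax(x)_i.
    `logits` and `onehot` have shape (..., num_classes)."""
    z = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    log_softmax = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))
    return -(onehot * log_softmax).sum(axis=-1)
```

For uniform logits over three classes the loss is log 3, the entropy of a uniform three-way prediction.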
It should be noted that the first loss function may also be any other loss function capable of realizing the segmentation task.
Specifically, the depth images in the training dataset are input into the initial neural network model, which outputs predicted classification label maps for them. Then, according to the gap between the predicted label map and the ground-truth label map of the depth image, a gradient descent algorithm back-propagates to all parameters of the network and updates them accordingly. The next time a depth image is input, the predicted label map output by the network will be closer to the ground-truth label map.
When the value of the first loss function no longer decreases during training, that is, when the performance of the model under that loss function is optimal, training continues with the contour error as the loss function. The contour error is given by formula (2):

L_contour = || B(S(M_labels)) - B(S(M_logits)) ||        (2)

where B is a blurring operation, e.g. a Gaussian blur with a 5 × 5 kernel and σ = 2.121; S is contour extraction, e.g. with the Sobel operator; M_labels is the ground-truth classification label map; and M_logits is the network output, namely the predicted class value of each pixel.
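A sketch of the contour error using SciPy's Sobel and Gaussian filters; the composition order B(S(·)) and the choice of the L2 norm are one reading of the description, not confirmed by the text:

```python
import numpy as np
from scipy import ndimage

def contour_error(labels, logits):
    """Contour error sketch: extract contours with the Sobel operator
    (S), soften them with a Gaussian blur (B, sigma = 2.121 as in the
    text), and take the L2 norm of the difference."""
    def blurred_contour(m):
        m = m.astype(float)
        s = np.hypot(ndimage.sobel(m, axis=0),   # gradient magnitude
                     ndimage.sobel(m, axis=1))   # as the contour map
        return ndimage.gaussian_filter(s, sigma=2.121)
    return np.linalg.norm(blurred_contour(labels) - blurred_contour(logits))
```

Blurring the contour maps makes the loss tolerant of small contour offsets instead of penalizing only exact overlap, which matches the sharper boundaries reported in Fig. 6.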
When the value of the contour error stabilizes and no longer decreases, training can be stopped and the segmentation model is obtained. The model is then tested on the test dataset: the depth images in the test dataset are input into the segmentation model, and the intersection-over-union (IoU) score over all test depth images is computed; the IoU score is used to judge whether the segmentation model meets the requirements.
IoU is the ratio of intersection to union; in this embodiment it is the ratio of the intersection of the model prediction and the ground truth to the union of the model prediction and the ground truth.
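The IoU score can be computed per class as follows:

```python
import numpy as np

def iou(pred, truth, cls):
    """IoU for one class: |prediction AND ground truth| divided by
    |prediction OR ground truth|, on the pixel masks of that class."""
    p, t = pred == cls, truth == cls
    union = np.logical_or(p, t).sum()
    return np.logical_and(p, t).sum() / union if union else 1.0

pred  = np.array([[1, 1, 0], [1, 2, 2], [0, 2, 0]])
truth = np.array([[1, 1, 0], [1, 1, 2], [0, 2, 2]])
# hand (class 1): intersection 3, union 4 -> IoU 0.75
```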
Fig. 5 is a diagram of the training process of a segmentation model provided by an embodiment of the application. The left side of Fig. 5 shows the data construction process, and the right side shows the model training process. During data construction, the color and depth images captured by the depth camera are aligned, and an automated annotation method generates the ground-truth classification label map based on the color image, which is also the ground-truth label map of the aligned depth image. All depth images and their ground-truth label maps constitute the depth-image-based hand segmentation training dataset.
During model training, the depth images in the dataset are input into the attention segmentation network, the label maps predicted by the network model are compared with the ground-truth label maps to compute the loss, and the network parameters are updated iteratively, step by step.
Fig. 6 illustrates the effect of using the contour error, provided by an embodiment of the application. In Fig. 6, the left column shows the ground-truth labels of object and hand, the middle column shows the network output without the contour error, and the right column shows the network output after using the contour error.
In this embodiment, when training the segmentation model, a general loss function is used first; when its value stabilizes, i.e. the model is optimal under that loss function, training continues with the contour error as the loss function. Together with the attention mechanism model added to the segmentation model, this greatly improves the segmentation accuracy of the model.
Further, to enhance the generalization ability of the segmentation model, a data augmentation operation may be applied to the training dataset before training, and the depth images produced by the augmentation are added to the training dataset. The data augmentation operation includes at least one of freely rotating the depth image, adding random noise, and randomly flipping the depth image.
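The three augmentations named above can be sketched as follows; applying each with probability 1/2 and restricting rotation to multiples of 90° (to avoid interpolation) are choices made for the sketch, not taken from the text:

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(depth):
    """Apply the three augmentations from the text to a depth image:
    rotation, additive random noise, and a random flip."""
    out = depth.copy()
    if rng.random() < 0.5:
        out = np.rot90(out, k=rng.integers(1, 4))   # rotation
    if rng.random() < 0.5:
        out = out + rng.normal(0, 0.01, out.shape)  # random noise
    if rng.random() < 0.5:
        out = np.flip(out, axis=rng.integers(0, 2)) # random flip
    return out
```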
To realize the above embodiments, an embodiment of the application also proposes an apparatus for segmenting a human hand and an interacting object from a depth image. Fig. 7 is a structural diagram of an apparatus for segmenting a human hand and an interacting object from a depth image provided by an embodiment of the application.
As shown in Fig. 7, the apparatus for segmenting a human hand and an interacting object from a depth image includes: a construction module 610, a training module 620, and a recognition module 630.
The construction module 610 is configured to construct a depth-image-based hand segmentation dataset using a color-image-based segmentation method.
The training module 620 is configured to train a segmentation model on the depth-image-based hand segmentation dataset, the segmentation model consisting of an encoder, an attention transfer model, and a decoder.
The recognition module 630 is configured to segment a depth image to be processed with the segmentation model to obtain a classification label map corresponding to the depth image to be processed, where the value of each pixel in the label map is the class value of that pixel, and the class value characterizes the category to which the pixel belongs in the depth image to be processed.
In one possible implementation of the embodiment, the construction module 610 is specifically configured to:
acquire multiple pairs of color images and depth images of the hand interacting with an object;
perform HSV-color-space-based object segmentation on all color images to obtain the class value of each pixel in every color image;
for each pair of color and depth images, map each pixel in the color image to the corresponding pixel in the depth image, and construct the depth-image-based hand segmentation training dataset.
In one possible implementation of the embodiment, the depth images are preprocessed, including noise and background removal.
In one possible implementation of the embodiment, the depth-image-based hand segmentation dataset includes a training dataset and a test dataset, and the training module 620 is specifically configured to:
train the initial neural network model on the training dataset and compute a first loss function, where the first loss function is the softmax cross-entropy loss;
when the value of the first loss function no longer decreases, continue training with the contour error as the loss function.
In one possible implementation of the embodiment, the apparatus further includes:
a processing module, configured to apply a data augmentation operation to the training dataset, the data augmentation operation including at least one of freely rotating the depth image, adding random noise, and randomly flipping the depth image.
It should be noted that the above explanation of the method embodiments for segmenting a human hand and an interacting object from a depth image also applies to the apparatus of these embodiments, and is therefore not repeated here.
In the apparatus for segmenting a human hand and an interacting object from a depth image according to this embodiment, a depth-image-based hand segmentation dataset is constructed using a color-image-based segmentation method; a segmentation model consisting of an encoder, an attention transfer model, and a decoder is trained on this dataset; and the depth image to be processed is segmented with the model to obtain a corresponding classification label map, in which the value of each pixel is its class value, so that the category of each pixel can be determined from its class value. By training the segmentation model on the depth-image-based hand segmentation dataset and using it to segment the depth image to be processed, pixel-level segmentation of hand and object is realized, environmental robustness is improved, segmentation accuracy is higher, and hand-object segmentation in complex interaction scenarios can be handled.
Claims (10)
1. A method for segmenting a human hand and an interacting object from a depth image, characterized by comprising:
constructing a depth-image-based hand segmentation dataset using a color-image-based segmentation method;
training a segmentation model on the depth-image-based hand segmentation dataset, the segmentation model consisting of an encoder, an attention transfer model, and a decoder;
segmenting a depth image to be processed with the segmentation model to obtain a classification label map corresponding to the depth image to be processed, wherein the value of each pixel in the classification label map is the class value of that pixel, and the class value characterizes the category to which the pixel belongs in the depth image to be processed.
2. The method according to claim 1, characterized in that constructing the depth-image-based hand segmentation dataset using the color-image-based segmentation method comprises:
acquiring multiple pairs of color images and depth images of the hand interacting with an object;
performing HSV-color-space-based object segmentation on all color images to obtain the class value of each pixel in every color image;
for each pair of color and depth images, mapping each pixel in the color image to the corresponding pixel in the depth image, and constructing the depth-image-based hand segmentation training dataset.
3. The method according to claim 2, characterized in that after mapping each pixel in the color image to the corresponding pixel in the depth image, the method further comprises:
preprocessing the depth image, including noise and background removal.
4. The method according to claim 2, wherein the hand segmentation data set based on depth images includes a training data set and a test data set, and training the segmentation model using the hand segmentation data set based on depth images comprises:
training an initial neural network model using the training data set and computing a first loss function, wherein the first loss function is a softmax cross-entropy loss function;
when the value of the first loss function no longer decreases, continuing training with a contour error as the loss function.
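The two-stage schedule of claim 4 can be sketched as a softmax cross-entropy loss plus a plateau detector that switches to the contour-error stage. The patience and tolerance values, and the plateau criterion itself, are assumptions; the contour error is only named here, not implemented:

```python
import numpy as np

def softmax_xent(logits, labels):
    """Mean softmax cross-entropy. logits: (N, C), labels: (N,)."""
    z = logits - logits.max(axis=1, keepdims=True)
    log_p = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_p[np.arange(len(labels)), labels].mean()

def pick_loss(history, patience=3, tol=1e-4):
    """Stage-switch sketch: once the cross-entropy loss has not improved
    by more than `tol` for `patience` epochs, continue training on the
    contour error (names and thresholds assumed, not the patent's)."""
    if len(history) >= patience:
        recent = history[-patience:]
        if max(recent) - min(recent) < tol:
            return "contour"
    return "cross_entropy"

logits = np.array([[2.0, 0.1, 0.1], [0.1, 2.0, 0.1]])
labels = np.array([0, 1])
loss = softmax_xent(logits, labels)
stage = pick_loss([0.50, 0.41, 0.40, 0.40, 0.40])
```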
5. The method according to claim 4, wherein before training the segmentation model using the training data set, the method further comprises:
performing a data augmentation operation on the training data set, the data augmentation operation including at least one of freely rotating a depth image, adding random noise, and randomly flipping a depth image.
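The three augmentations of claim 5 can be sketched as below. To stay dependency-free, a 90-degree rotation stands in for the free (arbitrary-angle) rotation, and the noise standard deviation and flip probability are assumed values, not the patent's:

```python
import numpy as np

def augment(depth, rng):
    """Data-augmentation sketch: rotation (90-degree stand-in for free
    rotation), additive Gaussian noise, and a random horizontal flip."""
    out = depth.astype(np.float32)
    out = np.rot90(out, k=rng.integers(0, 4))          # rotation stand-in
    out = out + rng.normal(0.0, 2.0, size=out.shape)   # random noise
    if rng.random() < 0.5:
        out = out[:, ::-1]                             # random flip
    return out

rng = np.random.default_rng(0)
depth = np.full((8, 8), 600.0)   # toy square depth patch
aug = augment(depth, rng)
```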
6. An apparatus for segmenting a human hand and an interacting object from a depth image, comprising:
a construction module, configured to construct a hand segmentation data set based on depth images by using a segmentation method based on color images;
a training module, configured to train a segmentation model using the hand segmentation data set based on depth images, the segmentation model being composed of an encoder, an attention transfer model, and a decoder;
a recognition module, configured to segment a depth image to be processed using the segmentation model to obtain a classification label map corresponding to the depth image to be processed, wherein the value of each pixel in the classification label map is the class value of that pixel, and the class value characterizes the class to which the pixel belongs in the depth image to be processed.
7. The apparatus according to claim 6, wherein the construction module is specifically configured to:
acquire multiple pairs of color images and depth images under a hand-object interaction scenario;
perform object segmentation based on the HSV color space on all color images to obtain the class value of each pixel in every color image;
for each pair of color image and depth image, map each pixel in the color image to the corresponding pixel in the depth image, so as to construct a hand segmentation training data set based on depth images.
8. The apparatus according to claim 7, further comprising:
a preprocessing module, configured to preprocess the depth image, including noise and background removal.
9. The apparatus according to claim 7, wherein the hand segmentation data set based on depth images includes a training data set and a test data set, and the training module is specifically configured to:
train an initial neural network model using the training data set and compute a first loss function, wherein the first loss function is a softmax cross-entropy loss function;
when the value of the first loss function no longer decreases, continue training with a contour error as the loss function.
10. The apparatus according to claim 9, further comprising:
a processing module, configured to perform a data augmentation operation on the training data set, the data augmentation operation including at least one of freely rotating a depth image, adding random noise, and randomly flipping a depth image.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910207311.8A CN109977834B (en) | 2019-03-19 | 2019-03-19 | Method and device for segmenting human hand and interactive object from depth image |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910207311.8A CN109977834B (en) | 2019-03-19 | 2019-03-19 | Method and device for segmenting human hand and interactive object from depth image |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109977834A true CN109977834A (en) | 2019-07-05 |
CN109977834B CN109977834B (en) | 2021-04-06 |
Family
ID=67079395
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910207311.8A Active CN109977834B (en) | 2019-03-19 | 2019-03-19 | Method and device for segmenting human hand and interactive object from depth image |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109977834B (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106469446A (en) * | 2015-08-21 | 2017-03-01 | Xiaomi Technology Co., Ltd. | Depth image segmentation method and segmentation device
WO2017116879A1 (en) * | 2015-12-31 | 2017-07-06 | Microsoft Technology Licensing, Llc | Recognition of hand poses by classification using discrete values
CN107729326A (en) * | 2017-09-25 | 2018-02-23 | Shenyang Aerospace University | Neural machine translation method based on Multi-BiRNN encoding
CN108647214A (en) * | 2018-03-29 | 2018-10-12 | Institute of Automation, Chinese Academy of Sciences | Decoding method based on a deep neural network translation model
CN108898142A (en) * | 2018-06-15 | 2018-11-27 | Ningbo Yunjiang Internet Technology Co., Ltd. | Handwritten formula recognition method and computing device
CN109272513A (en) * | 2018-09-30 | 2019-01-25 | Tsinghua University | Hand and object interactive segmentation method and device based on a depth camera
CN109448006A (en) * | 2018-11-01 | 2019-03-08 | Jiangxi University of Science and Technology | U-shaped densely connected retinal vessel segmentation method with attention mechanism
- 2019-03-19: application CN201910207311.8A granted as CN109977834B (status: Active)
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111127535A (en) * | 2019-11-22 | 2020-05-08 | 北京华捷艾米科技有限公司 | Hand depth image processing method and device |
CN111568197A (en) * | 2020-02-28 | 2020-08-25 | 佛山市云米电器科技有限公司 | Intelligent detection method, system and storage medium |
CN112396137A (en) * | 2020-12-14 | 2021-02-23 | 南京信息工程大学 | Point cloud semantic segmentation method fusing context semantics |
CN112396137B (en) * | 2020-12-14 | 2023-12-15 | 南京信息工程大学 | Point cloud semantic segmentation method integrating context semantics |
CN113158774A (en) * | 2021-03-05 | 2021-07-23 | 北京华捷艾米科技有限公司 | Hand segmentation method, device, storage medium and equipment |
CN113158774B (en) * | 2021-03-05 | 2023-12-29 | 北京华捷艾米科技有限公司 | Hand segmentation method, device, storage medium and equipment |
Also Published As
Publication number | Publication date |
---|---|
CN109977834B (en) | 2021-04-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108304765B (en) | Multi-task detection device for face key point positioning and semantic segmentation | |
CN109977834A (en) | Method and apparatus for segmenting a human hand and an interacting object from a depth image | |
CN106682633B (en) | Machine-vision-based classification and identification method for visible components in stool examination images | |
CN106127108B (en) | Hand image region detection method based on convolutional neural networks | |
Hou et al. | Classification of tongue color based on CNN | |
CN109558832A (en) | Human body posture detection method, apparatus, device and storage medium | |
CN104463250B (en) | Sign language recognition and translation method based on Davinci technology | |
CN106687989A (en) | Method and system of facial expression recognition using linear relationships within landmark subsets | |
CN109325395A (en) | Image recognition method, convolutional neural network model training method and device | |
Geetha et al. | A vision based dynamic gesture recognition of indian sign language on kinect based depth images | |
CN104484886B (en) | MR image segmentation method and device | |
CN110246181B (en) | Anchor-point-based pose estimation model training method, pose estimation method and system | |
CN113256637A (en) | Urine visible component detection method based on deep learning and context correlation | |
CN114998934B (en) | Clothes-changing pedestrian re-identification and retrieval method based on multi-modal intelligent perception and fusion | |
CN109145836A (en) | Ship target video detection method based on deep learning network and Kalman filtering | |
CN109598249A (en) | Clothing detection method and apparatus, electronic device, and storage medium | |
CN110008961A (en) | Real-time text recognition method, apparatus, computer device, and storage medium | |
CN110555830A (en) | Deep neural network skin detection method based on DeepLabv3+ | |
CN112700461A (en) | System for pulmonary nodule detection and characterization class identification | |
CN110334566A (en) | Internal and external fingerprint extraction method for OCT based on three-dimensional fully convolutional neural networks | |
CN109949200A (en) | Steganalysis framework construction method based on filter subset selection and CNN | |
CN106529486A (en) | Race recognition method based on a three-dimensional deformable face model | |
CN117036288A (en) | Tumor subtype diagnosis method for whole-slide pathological images | |
CN111862031A (en) | Face synthetic image detection method and apparatus, electronic device, and storage medium | |
CN113033305B (en) | Living body detection method, apparatus, terminal device, and storage medium | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||