CN109977834A - Method and apparatus for segmenting a human hand and an interacting object from a depth image - Google Patents

Method and apparatus for segmenting a human hand and an interacting object from a depth image

Info

Publication number
CN109977834A
CN109977834A (application CN201910207311.8A; granted publication CN109977834B)
Authority
CN
China
Prior art keywords
depth image
human hand
pixel
image
data set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910207311.8A
Other languages
Chinese (zh)
Other versions
CN109977834B (en)
Inventor
徐枫
薄子豪
雍俊海
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to CN201910207311.8A
Publication of CN109977834A
Application granted
Publication of CN109977834B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field by performing operations on regions, e.g. growing, shrinking or watersheds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G06V40/28 Recognition of hand or arm movements, e.g. recognition of deaf sign language

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

The application proposes a method and apparatus for segmenting a human hand and an interacting object from a depth image. The method includes: constructing a hand-segmentation data set based on depth images by using a segmentation method based on color images; training a segmentation model on the depth-image-based hand-segmentation data set, the segmentation model consisting of an encoder, an attention transfer model, and a decoder; and segmenting a depth image to be processed with the segmentation model to obtain a classification label map corresponding to the depth image to be processed, in which the value of each pixel is the class value of that pixel. By training the segmentation model on the depth-image-based hand-segmentation data set and applying it to the depth image to be processed, the method achieves pixel-level segmentation of hand and object, improves robustness to the environment, attains higher segmentation accuracy, and can handle hand/object segmentation in complex interaction scenarios.

Description

Method and apparatus for segmenting a human hand and an interacting object from a depth image
Technical field
This application relates to the technical field of computer vision, and in particular to a method and apparatus for segmenting a human hand and an interacting object from a depth image.
Background art
Hand segmentation is a fundamental problem in research fields such as gesture recognition, hand tracking, and hand reconstruction. Compared with the motion of an isolated hand, research on hands interacting with objects is more important for human-computer interaction and virtual reality.
Although semantic segmentation models based on neural networks have become more and more mature in recent years, existing models have low robustness to the environment and poor segmentation accuracy, and cannot handle hand segmentation in complex interaction scenarios.
Summary of the invention
The application proposes a method and apparatus for segmenting a human hand and an interacting object from a depth image, to solve the problems in the related art that existing hand segmentation models have low environmental robustness and poor segmentation accuracy and cannot handle hand segmentation in complex interaction scenarios.
An embodiment of one aspect of the application proposes a method for segmenting a human hand and an interacting object from a depth image, including:
constructing a hand-segmentation data set based on depth images by using a segmentation method based on color images;
training a segmentation model on the depth-image-based hand-segmentation data set, the segmentation model consisting of an encoder, an attention transfer model, and a decoder;
segmenting a depth image to be processed with the segmentation model to obtain a classification label map corresponding to the depth image to be processed, where the value of each pixel in the classification label map is the class value of that pixel, and the class value characterizes the class to which the pixel belongs in the depth image to be processed.
With the method for segmenting a human hand and an interacting object from a depth image of the embodiment of the application, a hand-segmentation data set based on depth images is constructed by using a segmentation method based on color images; a segmentation model consisting of an encoder, an attention transfer model, and a decoder is trained on the depth-image-based hand-segmentation data set; the depth image to be processed is segmented with the segmentation model, obtaining a classification label map corresponding to the depth image to be processed, in which the value of each pixel is the class value of that pixel, so that the class to which each pixel belongs can be determined from its class value. By training the segmentation model on the depth-image-based hand-segmentation data set and applying it to the depth image to be processed, pixel-level segmentation of hand and object is achieved, the robustness to the environment is improved, the segmentation accuracy is higher, and hand/object segmentation in complex interaction scenarios can be handled.
An embodiment of another aspect of the application proposes an apparatus for segmenting a human hand and an interacting object from a depth image, including:
a construction module, configured to construct a hand-segmentation data set based on depth images by using a segmentation method based on color images;
a training module, configured to train a segmentation model on the depth-image-based hand-segmentation data set, the segmentation model consisting of an encoder, an attention transfer model, and a decoder;
a recognition module, configured to segment a depth image to be processed with the segmentation model, obtaining a classification label map corresponding to the depth image to be processed, where the value of each pixel in the classification label map is the class value of that pixel, and the class value characterizes the class to which the pixel belongs in the depth image to be processed.
With the apparatus for segmenting a human hand and an interacting object from a depth image of the embodiment of the application, a hand-segmentation data set based on depth images is constructed by using a segmentation method based on color images; a segmentation model consisting of an encoder, an attention transfer model, and a decoder is trained on the depth-image-based hand-segmentation data set; the depth image to be processed is segmented with the segmentation model, obtaining a classification label map corresponding to the depth image to be processed, in which the value of each pixel is the class value of that pixel, so that the class to which each pixel belongs can be determined from its class value. By training the segmentation model on the depth-image-based hand-segmentation data set and applying it to the depth image to be processed, pixel-level segmentation of hand and object is achieved, the robustness to the environment is improved, the segmentation accuracy is higher, and hand/object segmentation in complex interaction scenarios can be handled.
Additional aspects and advantages of the application will be set forth in part in the following description, and will in part become apparent from the description or be learned through practice of the application.
Brief description of the drawings
The above and/or additional aspects and advantages of the application will become apparent and readily understood from the following description of embodiments with reference to the accompanying drawings, in which:
Fig. 1 is a flow diagram of a method for segmenting a human hand and an interacting object from a depth image provided by an embodiment of the application;
Fig. 2 is a structural diagram of a segmentation model provided by an embodiment of the application;
Fig. 3 is a structural diagram of an attention mechanism model provided by an embodiment of the application;
Fig. 4 is a flow diagram of another method for segmenting a human hand and an interacting object from a depth image provided by an embodiment of the application;
Fig. 5 is a diagram of the training process of a segmentation model provided by an embodiment of the application;
Fig. 6 is a diagram of the effect of using the contour error provided by an embodiment of the application;
Fig. 7 is a structural diagram of an apparatus for segmenting a human hand and an interacting object from a depth image provided by an embodiment of the application.
Detailed description of the embodiments
Embodiments of the application are described in detail below; examples of the embodiments are shown in the accompanying drawings, in which identical or similar reference numbers throughout denote identical or similar elements or elements with identical or similar functions. The embodiments described below with reference to the drawings are exemplary and intended to explain the application; they should not be construed as limiting it.
The method and apparatus for segmenting a human hand and an interacting object from a depth image according to embodiments of the application are described below with reference to the drawings.
Fig. 1 is a flow diagram of a method for segmenting a human hand and an interacting object from a depth image provided by an embodiment of the application.
As shown in Fig. 1, the method for segmenting a human hand and an interacting object from a depth image includes:
Step 101: constructing a hand-segmentation data set based on depth images by using a segmentation method based on color images.
Since a depth camera can capture color and depth images simultaneously, color and depth images of a hand interacting with an object are captured with a depth camera, yielding multiple pairs of color and depth images. The depth images are then processed on the basis of the color images, producing a hand-segmentation data set on depth images.
To improve segmentation accuracy, in this embodiment, objects whose colors differ substantially from the skin of the hand can be captured under a fixed light source of constant brightness and color temperature. For example, images of a hand holding a blue pen are captured under the same brightness and light source.
Step 102: training a segmentation model on the depth-image-based hand-segmentation data set.
After the depth-image-based hand-segmentation data set is obtained, an initial neural network model is trained on the data set until a segmentation model meeting the requirements is obtained.
During training, a loss function can be used to measure the predictive performance of the segmentation model.
In this embodiment, the segmentation model consists of an encoder, an attention transfer model, and a decoder. The encoder uses a large-scale convolutional network, and the decoder restores the high-level information to the pixel level of the image using deconvolution (transposed convolution) layers.
Fig. 2 is a structural diagram of a segmentation model provided by an embodiment of the application. As shown in Fig. 2, the segmentation model consists of an encoder, an attention transfer model, and a decoder. In this embodiment, an attention mechanism is added between the encoder and the decoder to strengthen the same-level connections between them; by fusing multi-scale image features to construct attention feature maps, the accuracy and effectiveness of the information transferred between the two can be improved.
Fig. 3 is a structural diagram of an attention mechanism model provided by an embodiment of the application. In Fig. 3, the feature maps of layers 1, 2, ..., i-1 are multiplied together to obtain the low-level attention map (FineAtt); each of layers 1 to i-1 includes a scale-scaling network (SqueezeNet, abbreviated SN) and a bilinear down-sampling layer (abbreviated DS), where SN normalizes the feature-map dimensions. The feature maps of layers i+1, i+2, ..., n are multiplied together to obtain the high-level attention map (CoarseAtt); each of layers i+1 to n includes an SN and an up-sampling layer (abbreviated US). DS and US reduce and enlarge the feature-map scale, respectively. The obtained FineAtt and CoarseAtt attention maps are concatenated with the layer-i feature map and fed into the decoder. The attention mechanism is applied at every feature-map scale from layer 1 to layer n in Fig. 3.
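The multi-scale fusion described for FineAtt and CoarseAtt can be sketched in a few lines of NumPy. The sketch below is illustrative only: it uses nearest-neighbor resizing in place of the bilinear DS/US layers, a min-max normalization as a stand-in for the SN blocks, and function names of our own choosing, not from the patent.

```python
import numpy as np

def resize_nearest(fmap, h, w):
    """Resize a (H, W) feature map to (h, w) by nearest-neighbor sampling,
    standing in for the bilinear down-/up-sampling layers (DS/US)."""
    H, W = fmap.shape
    rows = np.arange(h) * H // h
    cols = np.arange(w) * W // w
    return fmap[rows][:, cols]

def normalize(fmap):
    """Stand-in for the SN block: scale a feature map into [0, 1]."""
    rng = fmap.max() - fmap.min()
    return (fmap - fmap.min()) / rng if rng > 0 else np.zeros_like(fmap)

def attention_features(feature_maps, i):
    """Fuse multi-scale feature maps around layer i (0-based):
    layers before i are resized to layer i's scale and multiplied into the
    low-level attention map (FineAtt), layers after i into the high-level
    attention map (CoarseAtt); both are stacked with the layer-i map,
    mimicking the concatenation fed into the decoder."""
    h, w = feature_maps[i].shape
    fine = np.ones((h, w))
    for f in feature_maps[:i]:
        fine *= normalize(resize_nearest(f, h, w))
    coarse = np.ones((h, w))
    for f in feature_maps[i + 1:]:
        coarse *= normalize(resize_nearest(f, h, w))
    return np.stack([fine, feature_maps[i], coarse])
```

Because every factor is normalized into [0, 1] before the product, the fused attention maps stay in [0, 1] regardless of how many layers contribute.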
Step 103: segmenting the depth image to be processed with the segmentation model, obtaining a classification label map corresponding to the depth image to be processed.
In this embodiment, before recognition, the depth image to be processed can be obtained with a depth camera.
Once the segmentation model has been obtained, the depth image to be processed is fed into the trained segmentation model, which outputs a classification label map corresponding to the depth image to be processed. The classification label map has the same size as the depth image to be processed, and the value of each pixel in the label map is the class value of that pixel. The class value characterizes the class to which the pixel belongs in the depth image to be processed. In addition, the pixel coordinates are implicit in the arrangement of the image pixels, and the value of each pixel in the input depth image is its depth value.
The class to which a pixel in the depth image to be processed belongs may be hand, object, or background. In a concrete implementation, these three classes can be represented by different class values, e.g. 0 for background, 1 for hand, and 2 for object.
In this embodiment, from the class value of each pixel and the class each value corresponds to, the segmentation of the hand and the object in the depth image to be processed can be obtained, realizing the segmentation of the interacting hand and object.
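As a small illustration of how the label map is consumed downstream, boolean hand and object masks can be read off directly, assuming the example class values 0/1/2 from the text:

```python
import numpy as np

BACKGROUND, HAND, OBJECT = 0, 1, 2  # assumed class values from the embodiment

def split_label_map(label_map):
    """Turn a classification label map into boolean hand/object masks."""
    return label_map == HAND, label_map == OBJECT

# Toy 3x3 label map: background on the left, hand top-right, object bottom-right.
label_map = np.array([[0, 1, 1],
                      [0, 2, 2],
                      [0, 0, 2]])
hand_mask, object_mask = split_label_map(label_map)
```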
As shown in Fig. 2, the depth image to be processed is fed into the deep network model, passing first through the encoder, then through the attention transfer model, and finally through the decoder, which outputs the classification label map of the depth image to be processed; from the class value of each pixel, the positions of the hand and the object are obtained, realizing hand/object segmentation.
In the embodiment of the application, the pixels belonging to the hand and the pixels belonging to the object can be determined from the class value of each pixel in the label map output by the segmentation model and the class each value corresponds to, thereby separating the interacting hand from the object in the image to be processed. Pixel-level segmentation of hand and object is achieved with high segmentation accuracy, and the interacting hand and object can be segmented even in complex cases.
In one embodiment of the application, the depth-image-based hand-segmentation training data set can be constructed from color images. This is described in detail below with reference to Fig. 4, which is a flow diagram of another method for segmenting a human hand and an interacting object from a depth image provided by an embodiment of the application.
As shown in Fig. 4, constructing the depth-image-based hand-segmentation data set includes:
Step 301: acquiring multiple pairs of color and depth images of hand/object interaction scenarios.
In this embodiment, some objects whose colors differ substantially from hand skin are first collected. Then, images of the hand interacting with each object are captured with a depth camera, yielding multiple pairs of color and depth images. In addition, to increase the amount of data, images of the hand in different interaction poses with the same object can be captured.
When capturing images with the depth camera, the illumination is fixed, e.g. using a fixed light source of constant brightness and color temperature, to guarantee that the captured color images are clear and free of shadows.
Step 302: performing object segmentation based on the HSV color space on all color images, obtaining the class value of each pixel in every color image.
In this embodiment, the background in all color and depth images can first be rejected with a depth threshold, retaining only the image of the hand and the object. Then, all acquired color images are converted into the HSV color space using the standard conversion from the RGB color space, where the parameters of the HSV color space are hue (H), saturation (S), and value (V).
Afterwards, the HSV representation of every color image is segmented to obtain the class value of each pixel. Specifically, the distribution of the pixels of several pure-hand samples and interaction samples is analyzed in HSV space; the overlap region of the samples is the region corresponding to hand pixels, and several linear constraints are fitted to it. All color images are then analyzed: pixels inside the constraints are marked as hand, and pixels outside as object.
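A minimal NumPy sketch of this step is given below. The RGB-to-HSV conversion follows the standard formulas; the linear constraints are illustrative placeholders (the patent fits them to the overlap region of pure-hand samples in HSV space), and the function names are ours.

```python
import numpy as np

def rgb_to_hsv(img):
    """Vectorized RGB (floats in [0, 1]) to HSV, all channels in [0, 1]."""
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    v = img.max(axis=-1)
    c = v - img.min(axis=-1)                       # chroma
    s = np.where(v > 0, c / np.where(v > 0, v, 1), 0)
    h = np.zeros_like(v)
    mask = c > 0
    idx = mask & (v == r)
    h[idx] = ((g - b)[idx] / c[idx]) % 6
    idx = mask & (v == g) & (v != r)
    h[idx] = (b - r)[idx] / c[idx] + 2
    idx = mask & (v == b) & (v != r) & (v != g)
    h[idx] = (r - g)[idx] / c[idx] + 4
    return np.stack([h / 6.0, s, v], axis=-1)

def classify_pixels(hsv, constraints):
    """Mark pixels satisfying all linear constraints a*h + b*s + c*v <= d
    as hand (1) and the rest as object (2), mirroring step 302."""
    h, s, v = hsv[..., 0], hsv[..., 1], hsv[..., 2]
    inside = np.ones(h.shape, dtype=bool)
    for a, b, c, d in constraints:
        inside &= (a * h + b * s + c * v) <= d
    return np.where(inside, 1, 2)
```

In practice the constraint coefficients would be fitted to the annotated pure-hand samples rather than chosen by hand.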
Step 303: for each pair of color and depth images, mapping each pixel in the color image to the corresponding pixel in the depth image, constructing the depth-image-based hand-segmentation training data set.
For each pair of color and depth images, the color image and the depth image are aligned at the pixel level: the intrinsic and extrinsic camera parameters of the depth and color sensors are estimated separately, the depth point cloud is transformed affinely into the color camera space, and an automated labeling method is used to generate the ground-truth classification label image based on the color image, which is also the ground-truth classification label map of the depth image corresponding to that color image. In the ground-truth classification label images, the class value of each pixel can be 0 for background, 1 for hand, and 2 for object.
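The pixel alignment step rests on the standard pinhole projection between the two calibrated sensors. The following sketch maps a single depth pixel into the color image; the function name and all matrix values used in the usage note are illustrative, and in practice the intrinsics and extrinsics come from calibrating the two sensors.

```python
import numpy as np

def map_depth_pixel_to_color(u, v, depth, K_depth, R, t, K_color):
    """Back-project a depth pixel (u, v) with depth value `depth` into 3D,
    transform it into the color-camera frame with extrinsics (R, t),
    and project it with the color intrinsics K_color."""
    # back-project: pixel -> 3D point in depth-camera coordinates
    p_depth = depth * (np.linalg.inv(K_depth) @ np.array([u, v, 1.0]))
    # rigid transform into the color-camera frame
    p_color = R @ p_depth + t
    # perspective projection with the color intrinsics
    uvw = K_color @ p_color
    return uvw[0] / uvw[2], uvw[1] / uvw[2]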
In this embodiment, all depth images together with their ground-truth classification label maps constitute the depth-image-based hand-segmentation training data set.
Further, in one embodiment of the application, to improve segmentation accuracy, the depth images can be preprocessed before the mapping: they are denoised with morphological operations and contour filtering, and the background in the depth image is analyzed so that only the hand and the object interacting with the hand are retained.
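A morphological opening is one common way to implement the denoising mentioned above. The sketch below is a plain NumPy stand-in for what a library such as scikit-image or OpenCV would provide (the function names are ours); it removes speckles smaller than the structuring element while preserving larger regions:

```python
import numpy as np

def erode(mask, k=3):
    """Binary erosion with a k x k square structuring element."""
    H, W = mask.shape
    pad = k // 2
    padded = np.pad(mask, pad, constant_values=False)
    out = np.ones_like(mask)
    for di in range(k):
        for dj in range(k):
            out &= padded[di:di + H, dj:dj + W]  # AND over the neighborhood
    return out

def dilate(mask, k=3):
    """Binary dilation with a k x k square structuring element."""
    H, W = mask.shape
    pad = k // 2
    padded = np.pad(mask, pad, constant_values=False)
    out = np.zeros_like(mask)
    for di in range(k):
        for dj in range(k):
            out |= padded[di:di + H, dj:dj + W]  # OR over the neighborhood
    return out

def denoise(mask, k=3):
    """Morphological opening (erosion then dilation): speckles smaller than
    the k x k structuring element are removed, larger regions survive."""
    return dilate(erode(mask, k), k)
```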
After the data set for training the segmentation model has been obtained, when training the model, the depth-image-based hand-segmentation training data set can first be divided into a training data set and a test data set, where the number of depth images in the training data set is far greater than the number in the test data set; the training data set is used for training, and the test data set is used to test the trained model.
Then, the initial segmentation model is trained on the training data set and the first loss function is computed. The first loss function uses the softmax cross-entropy loss, as in formula (1):

L = -Σ_i y_i · log( e^{x_i} / Σ_j e^{x_j} )    (1)

where y_i denotes the ground truth, x_i the prediction output by the segmentation model, and the subscripts i and j index the classes. For example, if a pixel has three possible classes, the loss for class value i=0 is -y_0 · log( e^{x_0} / Σ_j e^{x_j} ); the loss for i=1 is -y_1 · log( e^{x_1} / Σ_j e^{x_j} ); the loss for i=2 is -y_2 · log( e^{x_2} / Σ_j e^{x_j} ); and the loss of the model is the sum of the three.
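Formula (1) can be written out directly for a single pixel. The sketch below is a plain NumPy version; the max-shift before exponentiation is the usual numerical-stability trick and is our addition, not stated in the patent.

```python
import numpy as np

def softmax_cross_entropy(x, y):
    """Per-pixel softmax cross-entropy of formula (1): x holds the model's
    class scores, y the one-hot ground truth; returns -sum_i y_i*log(p_i)."""
    e = np.exp(x - x.max())   # shift by the max for numerical stability
    p = e / e.sum()           # softmax probabilities
    return -np.sum(y * np.log(p))
```

With uniform scores over three classes the loss is log 3, and predicting confidently for the wrong class increases it, as expected.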
It should be noted that the first loss function may also be any other loss function capable of handling the segmentation task.
Specifically, the depth images in the training data set are fed into the initial neural network model, which outputs predicted classification label maps of the depth images. Then, according to the gap between a predicted label map and the ground-truth label map of the depth image, all parameters of the network are updated by gradient descent. When a depth image is fed in the next time, the predicted classification label map output by the network will be closer to the ground-truth label map.
When the value of the first loss function no longer decreases during training, that is, when the performance of the model under that loss function is optimal, a contour error is used as the loss function to continue training. The contour error is given by formula (2):

E = || S(B(M_labels)) - S(B(M_logits)) ||²    (2)

where B is a blurring operation, e.g. a Gaussian blur with a 5×5 Gaussian kernel with σ = 2.121; S is contour extraction, e.g. with the Sobel operator; M_labels is the ground-truth classification label map; and M_logits is the output of the network, namely the per-pixel class predictions.
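The contour error composes a blur B with a contour extractor S. The NumPy sketch below uses the 5x5 Gaussian kernel with sigma = 2.121 and Sobel gradients named in the text; the valid-padding convolution and the use of a mean squared difference are our simplifications, and the function names are ours.

```python
import numpy as np

def conv2d(img, kernel):
    """Valid-padding 2D convolution, enough for the small kernels here."""
    kh, kw = kernel.shape
    H, W = img.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def gaussian_kernel(size=5, sigma=2.121):
    """Normalized 5x5 Gaussian kernel: the blur operation B."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return k / k.sum()

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)

def contour_error(m_labels, m_logits):
    """Blur both maps (B), extract contours as the Sobel gradient
    magnitude (S), and return the mean squared difference."""
    b1 = conv2d(m_labels, gaussian_kernel())
    b2 = conv2d(m_logits, gaussian_kernel())
    def contours(m):
        gx = conv2d(m, SOBEL_X)
        gy = conv2d(m, SOBEL_X.T)
        return np.hypot(gx, gy)
    return np.mean((contours(b1) - contours(b2)) ** 2)
```

Blurring before contour extraction widens the contour bands, so predictions whose boundaries are merely shifted still receive a smooth, informative gradient signal.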
When the value of the contour error is stable and no longer decreases, training can be stopped, yielding the segmentation model. The segmentation model is then tested with the test set: specifically, the depth images in the test set can be fed into the segmentation model for recognition, and the Intersection-over-Union (IOU) score over all depth images in the test set is computed; the IOU score is used to judge whether the segmentation model meets the requirements.
IOU is the ratio of intersection to union; in this embodiment, it is the ratio of the intersection of the model prediction and the ground truth to the union of the model prediction and the ground truth.
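Per-class IOU over a pair of label maps is a one-liner. The sketch below treats an empty union as a perfect score, a convention the patent does not specify:

```python
import numpy as np

def iou(pred, truth, cls):
    """Intersection-over-Union for one class between a predicted and a
    ground-truth classification label map."""
    p, t = pred == cls, truth == cls
    union = np.logical_or(p, t).sum()
    return np.logical_and(p, t).sum() / union if union else 1.0
```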
Fig. 5 is a diagram of the training process of a segmentation model provided by an embodiment of the application. In Fig. 5, the left side shows the data construction process and the right side the model training process. During data construction, the color images captured by the depth camera are aligned with the depth images, and an automated labeling method generates the ground-truth classification label maps based on the color images, which are at the same time the ground-truth classification labels of the aligned corresponding depth images. All depth images together with their ground-truth classification label images constitute the depth-image-based hand-segmentation training data set.
During model training, the depth images in the data set are fed into the attention segmentation network, the classification label maps predicted by the network model are compared with the ground-truth label maps to compute the loss, and the network parameters are updated iteratively step by step.
Fig. 6 is a diagram of the effect of using the contour error provided by an embodiment of the application. In Fig. 6, the objects and hands in the left column are the ground-truth labels, the middle column is the network output without the contour error, and the right column shows the network output after the contour error has been used.
In the embodiment of the application, when training the segmentation model, a general loss function is used first; when its value is stable, i.e. the model is optimal under that loss function, the contour error is used as the loss function for training, and the attention mechanism model is added to the segmentation model; as a result, the segmentation accuracy of the model is greatly improved.
Further, in order to enhance the generalization ability of the segmentation model, before the segmentation model is trained on the training data set, data augmentation operations can be applied to the training data set, and the depth images produced by the augmentation are added to the training data set.
The data augmentation operations include at least one of freely rotating the depth image, adding random noise, and randomly flipping the depth image.
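The three augmentation operations can be sketched as follows. For simplicity the sketch restricts rotation to multiples of 90 degrees (free rotation would require interpolation and label resampling), applies the same geometric transform to the depth image and its label map so the pair stays aligned, and adds Gaussian noise to the depth image only; the noise scale and function name are assumed values, not from the patent.

```python
import numpy as np

def augment(depth, label, rng):
    """One random augmentation of a (depth image, label map) pair:
    a random multiple-of-90-degree rotation, a random horizontal flip,
    and additive Gaussian noise on the depth values."""
    k = rng.integers(0, 4)                       # 0, 90, 180 or 270 degrees
    depth, label = np.rot90(depth, k), np.rot90(label, k)
    if rng.random() < 0.5:                       # random horizontal flip
        depth, label = np.fliplr(depth), np.fliplr(label)
    depth = depth + rng.normal(0.0, 0.01, depth.shape)  # assumed noise scale
    return depth, label
```

Because rotations and flips only permute pixels, the class counts in the label map are preserved, which is a quick way to check the transform left the labels intact.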
To realize the above embodiments, an embodiment of the application further proposes an apparatus for segmenting a human hand and an interacting object from a depth image. Fig. 7 is a structural diagram of an apparatus for segmenting a human hand and an interacting object from a depth image provided by an embodiment of the application.
As shown in Fig. 7, the apparatus for segmenting a human hand and an interacting object from a depth image includes: a construction module 610, a training module 620, and a recognition module 630.
The construction module 610 is configured to construct the hand-segmentation data set based on depth images by using the segmentation method based on color images.
The training module 620 is configured to train the segmentation model on the depth-image-based hand-segmentation data set, the segmentation model consisting of an encoder, an attention transfer model, and a decoder.
The recognition module 630 is configured to segment the depth image to be processed with the segmentation model, obtaining the classification label map corresponding to the depth image to be processed, where the value of each pixel in the classification label map is the class value of that pixel, and the class value characterizes the class to which the pixel belongs in the depth image to be processed.
In one possible implementation of the embodiment of the application, the construction module 610 is specifically configured to:
acquire multiple pairs of color and depth images of hand/object interaction scenarios;
perform object segmentation based on the HSV color space on all color images, obtaining the class value of each pixel in every color image;
for each pair of color and depth images, map each pixel in the color image to the corresponding pixel in the depth image, constructing the depth-image-based hand-segmentation training data set.
In one possible implementation of the embodiment of the application, the depth images are preprocessed, including noise and background removal.
In one possible implementation of the embodiment of the application, the depth-image-based hand-segmentation data set includes a training data set and a test data set, and the training module 620 is specifically configured to:
train the initial neural network model on the training data set and compute the first loss function, where the first loss function uses the softmax cross-entropy loss;
when the value of the first loss function no longer decreases, continue training with the contour error as the loss function.
In one possible implementation of the embodiment of the application, the apparatus further includes:
a processing module, configured to apply data augmentation operations to the training data set, the data augmentation operations including at least one of freely rotating the depth image, adding random noise, and randomly flipping the depth image.
It should be noted that the above explanation of the embodiments of the method for segmenting a human hand and an interacting object from a depth image also applies to the apparatus for segmenting a human hand and an interacting object from a depth image of the embodiment, and is therefore not repeated here.
With the apparatus for segmenting a human hand and an interacting object from a depth image of the embodiment of the application, a hand-segmentation data set based on depth images is constructed by using a segmentation method based on color images; a segmentation model consisting of an encoder, an attention transfer model, and a decoder is trained on the depth-image-based hand-segmentation data set; the depth image to be processed is segmented with the segmentation model, obtaining a classification label map corresponding to the depth image to be processed, in which the value of each pixel is the class value of that pixel, so that the class to which each pixel belongs can be determined from its class value. By training the segmentation model on the depth-image-based hand-segmentation data set and applying it to the depth image to be processed, pixel-level segmentation of hand and object is achieved, the robustness to the environment is improved, the segmentation accuracy is higher, and hand/object segmentation in complex interaction scenarios can be handled.

Claims (10)

1. a kind of method divided manpower from depth image and interact object characterized by comprising
Using the dividing method based on color image, the manpower partitioned data set based on depth image is constructed;
Using the manpower partitioned data set based on depth image, training obtains parted pattern, and the parted pattern is by encoding Device, attention TRANSFER MODEL and decoder are constituted;
Depth image to be processed is split using the parted pattern, is obtained corresponding with the depth image to be processed Tag along sort figure, the value of each pixel is the types value of each pixel, the type in the tag along sort figure Value is for characterizing pixel type affiliated in the depth image to be processed.
2. The method according to claim 1, characterized in that constructing the depth-image-based human-hand segmentation data set using the color-image-based segmentation method comprises:
acquiring multiple pairs of color images and depth images in a scenario where a human hand interacts with an object;
performing object segmentation based on the HSV color space on all of the color images to obtain the type value of each pixel in every color image; and
for each pair of color image and depth image, mapping each pixel in the color image to the corresponding pixel in the depth image, so as to construct the depth-image-based human-hand segmentation training data set.
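A minimal sketch of the HSV-thresholding step described in this claim, assuming the color image has already been converted to HSV (e.g. with OpenCV's `cv2.cvtColor`); the hue/saturation/value ranges below are illustrative assumptions, since the claims do not give concrete thresholds:

```python
import numpy as np

# Illustrative type values and color ranges; both are assumptions, not patent values.
BACKGROUND, HAND, OBJECT = 0, 1, 2

def hsv_type_map(hsv: np.ndarray) -> np.ndarray:
    """Assign a type value to every pixel of an HSV image (H in [0,180), OpenCV convention)."""
    h, s, v = hsv[..., 0], hsv[..., 1], hsv[..., 2]
    types = np.full(hsv.shape[:2], BACKGROUND, dtype=np.uint8)
    skin = (h < 20) & (s > 40) & (v > 60)   # rough skin-tone range
    obj = (h > 100) & (h < 130) & (s > 40)  # e.g. a saturated blue object
    types[skin] = HAND
    types[obj] = OBJECT
    return types

# One "skin" pixel, one "blue object" pixel, one dark background pixel.
hsv = np.array([[[10, 120, 200], [115, 150, 180], [0, 0, 0]]], dtype=np.uint8)
print(hsv_type_map(hsv).tolist())  # [[1, 2, 0]]
```

Per the claim, each pixel's type value would then be copied to the corresponding pixel of the registered depth image to form the depth-image training labels.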
3. The method according to claim 2, characterized in that, after mapping each pixel in the color image to the corresponding pixel in the depth image, the method further comprises:
preprocessing the depth image, including noise removal and background removal.
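One common way to realize the noise and background removal mentioned in this claim (a sketch under assumed distance thresholds, not the patent's specific procedure) is to zero out invalid depth readings and everything beyond a working distance:

```python
import numpy as np

def preprocess_depth(depth: np.ndarray, near: float = 200.0, far: float = 1500.0) -> np.ndarray:
    """Zero out noisy readings (below `near`) and background (beyond `far`), in millimeters."""
    cleaned = depth.astype(np.float32).copy()
    cleaned[(cleaned < near) | (cleaned > far)] = 0.0  # 0 marks invalid/background pixels
    return cleaned

depth = np.array([[50.0, 600.0, 2000.0]])
print(preprocess_depth(depth).tolist())  # [[0.0, 600.0, 0.0]]
```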
4. The method according to claim 2, characterized in that the depth-image-based human-hand segmentation data set comprises a training data set and a test data set, and training the segmentation model using the depth-image-based human-hand segmentation data set comprises:
training an initial neural network model using the training data set and computing a first loss function, wherein the first loss function is a softmax cross-entropy loss function; and
when the value of the first loss function no longer decreases, continuing the training using a contour error as the loss function.
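The softmax cross-entropy loss named in this claim can be written out directly; the sketch below is a plain NumPy version over per-pixel logits (batch and spatial dimensions flattened for brevity):

```python
import numpy as np

def softmax_cross_entropy(logits: np.ndarray, labels: np.ndarray) -> float:
    """Mean cross-entropy between softmax(logits) and integer class labels.

    logits: (N, C) raw scores for N pixels and C classes.
    labels: (N,) integer type values in [0, C).
    """
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())

logits = np.array([[10.0, 0.0, 0.0],   # confidently class 0
                   [0.0, 10.0, 0.0]])  # confidently class 1
labels = np.array([0, 1])
print(softmax_cross_entropy(logits, labels) < 0.01)  # True: near-perfect predictions
```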
5. The method according to claim 4, characterized in that, before training the segmentation model using the training data set, the method further comprises:
performing data augmentation on the training data set, the data augmentation including at least one of freely rotating the depth images, adding random noise, and randomly flipping the depth images.
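The three augmentation operations listed in this claim (free rotation, random noise, random flipping) can each be sketched in a few lines of NumPy; the probabilities, noise scale, and 90-degree rotation steps below are illustrative choices, not values from the patent:

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(depth: np.ndarray) -> np.ndarray:
    """Apply a random subset of the three augmentations named in the claim."""
    out = depth.copy()
    if rng.random() < 0.5:               # random horizontal flip
        out = np.fliplr(out)
    if rng.random() < 0.5:               # additive random noise (assumed Gaussian here)
        out = out + rng.normal(0.0, 5.0, out.shape)
    k = rng.integers(0, 4)               # rotation, here restricted to multiples of 90°
    out = np.rot90(out, k)               # (arbitrary-angle rotation would need resampling)
    return out

depth = np.arange(9, dtype=np.float32).reshape(3, 3)
aug = augment(depth)
print(aug.shape == (3, 3))  # True: shape preserved for square images
```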
6. A device for segmenting a human hand and an interacted object from a depth image, characterized by comprising:
a construction module, configured to construct a depth-image-based human-hand segmentation data set using a color-image-based segmentation method;
a training module, configured to train and obtain a segmentation model using the depth-image-based human-hand segmentation data set, the segmentation model being composed of an encoder, an attention transfer model, and a decoder; and
a recognition module, configured to segment a depth image to be processed using the segmentation model to obtain a classification label map corresponding to the depth image to be processed, wherein the value of each pixel in the classification label map is the type value of that pixel, and the type value characterizes the type to which the pixel belongs in the depth image to be processed.
7. The device according to claim 6, characterized in that the construction module is specifically configured to:
acquire multiple pairs of color images and depth images in a scenario where a human hand interacts with an object;
perform object segmentation based on the HSV color space on all of the color images to obtain the type value of each pixel in every color image; and
for each pair of color image and depth image, map each pixel in the color image to the corresponding pixel in the depth image, so as to construct the depth-image-based human-hand segmentation training data set.
8. The device according to claim 7, characterized by further comprising:
a preprocessing module, configured to preprocess the depth image, including noise removal and background removal.
9. The device according to claim 7, characterized in that the depth-image-based human-hand segmentation data set comprises a training data set and a test data set, and the training module is specifically configured to:
train an initial neural network model using the training data set and compute a first loss function, wherein the first loss function is a softmax cross-entropy loss function; and
when the value of the first loss function no longer decreases, continue the training using a contour error as the loss function.
10. The device according to claim 9, characterized by further comprising:
a processing module, configured to perform data augmentation on the training data set, the data augmentation including at least one of freely rotating the depth images, adding random noise, and randomly flipping the depth images.
CN201910207311.8A 2019-03-19 2019-03-19 Method and device for segmenting human hand and interactive object from depth image Active CN109977834B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910207311.8A CN109977834B (en) 2019-03-19 2019-03-19 Method and device for segmenting human hand and interactive object from depth image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910207311.8A CN109977834B (en) 2019-03-19 2019-03-19 Method and device for segmenting human hand and interactive object from depth image

Publications (2)

Publication Number Publication Date
CN109977834A true CN109977834A (en) 2019-07-05
CN109977834B CN109977834B (en) 2021-04-06

Family

ID=67079395

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910207311.8A Active CN109977834B (en) 2019-03-19 2019-03-19 Method and device for segmenting human hand and interactive object from depth image

Country Status (1)

Country Link
CN (1) CN109977834B (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106469446A * 2015-08-21 2017-03-01 小米科技有限责任公司 Depth image segmentation method and segmentation device
WO2017116879A1 * 2015-12-31 2017-07-06 Microsoft Technology Licensing, Llc Recognition of hand poses by classification using discrete values
CN107729326A * 2017-09-25 2018-02-23 沈阳航空航天大学 Neural machine translation method based on multi-BiRNN encoding
CN108647214A * 2018-03-29 2018-10-12 中国科学院自动化研究所 Decoding method based on a deep neural network translation model
CN108898142A * 2018-06-15 2018-11-27 宁波云江互联网科技有限公司 Handwritten formula recognition method and computing device
CN109272513A * 2018-09-30 2019-01-25 清华大学 Hand and object interactive segmentation method and device based on depth camera
CN109448006A * 2018-11-01 2019-03-08 江西理工大学 Attention-mechanism U-shaped densely connected retinal blood vessel segmentation method

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111127535A (en) * 2019-11-22 2020-05-08 北京华捷艾米科技有限公司 Hand depth image processing method and device
CN111568197A (en) * 2020-02-28 2020-08-25 佛山市云米电器科技有限公司 Intelligent detection method, system and storage medium
CN112396137A (en) * 2020-12-14 2021-02-23 南京信息工程大学 Point cloud semantic segmentation method fusing context semantics
CN112396137B (en) * 2020-12-14 2023-12-15 南京信息工程大学 Point cloud semantic segmentation method integrating context semantics
CN113158774A (en) * 2021-03-05 2021-07-23 北京华捷艾米科技有限公司 Hand segmentation method, device, storage medium and equipment
CN113158774B (en) * 2021-03-05 2023-12-29 北京华捷艾米科技有限公司 Hand segmentation method, device, storage medium and equipment

Also Published As

Publication number Publication date
CN109977834B (en) 2021-04-06

Similar Documents

Publication Publication Date Title
CN108304765B (en) Multi-task detection device for face key point positioning and semantic segmentation
CN109977834A (en) Method and device for segmenting a human hand and an interacted object from a depth image
CN106682633B (en) Machine-vision-based classification and identification method for visible components in stool examination images
CN106127108B (en) Human hand image region detection method based on convolutional neural networks
Hou et al. Classification of tongue color based on CNN
CN109558832A (en) Human body posture detection method, device, equipment and storage medium
CN104463250B (en) Sign language recognition and translation method based on DaVinci technology
CN106687989A (en) Method and system of facial expression recognition using linear relationships within landmark subsets
CN109325395A (en) Image recognition method, and convolutional neural network model training method and device
Geetha et al. A vision based dynamic gesture recognition of indian sign language on kinect based depth images
CN104484886B (en) MR image segmentation method and device
CN110246181B (en) Anchor point-based attitude estimation model training method, attitude estimation method and system
CN113256637A (en) Urine visible component detection method based on deep learning and context correlation
CN114998934B (en) Clothes-changing pedestrian re-identification and retrieval method based on multi-mode intelligent perception and fusion
CN109145836A (en) Ship target video detection method based on deep learning network and Kalman filtering
CN109598249A (en) Clothing detection method and device, electronic equipment, storage medium
CN110008961A (en) Real-time text recognition method, device, computer equipment and storage medium
CN110555830A (en) Deep neural network skin detection method based on DeepLabv3+
CN112700461A (en) System for pulmonary nodule detection and characterization class identification
CN110334566A (en) Internal and external OCT fingerprint extraction method based on three-dimensional fully convolutional neural networks
CN109949200A (en) Steganalysis framework establishment method based on filter subset selection and CNN
CN106529486A (en) Race recognition method based on a three-dimensional deformable face model
CN117036288A (en) Tumor subtype diagnosis method for whole-slide pathological images
CN111862031A (en) Synthetic face image detection method and device, electronic equipment and storage medium
CN113033305B (en) Living body detection method, living body detection device, terminal equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant