WO2020215565A1 - Hand image segmentation method, apparatus, and computer device - Google Patents

Hand image segmentation method, apparatus, and computer device

Info

Publication number
WO2020215565A1
WO2020215565A1 (application PCT/CN2019/103140; CN2019103140W)
Authority
WO
WIPO (PCT)
Prior art keywords
image
hand
detection
recognition model
recognized
Prior art date
Application number
PCT/CN2019/103140
Other languages
English (en)
French (fr)
Inventor
侯丽
王福晴
Original Assignee
平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology (Shenzhen) Co., Ltd. (平安科技(深圳)有限公司)
Publication of WO2020215565A1 publication Critical patent/WO2020215565A1/zh

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/20 Movements or behaviour, e.g. gesture recognition
    • G06V 40/28 Recognition of hand or arm movements, e.g. recognition of deaf sign language

Definitions

  • Gesture is a very user-friendly way of human-computer interaction that needs no intermediary, and gesture recognition has become an important topic and research hotspot in human-computer interaction. A vision-based gesture recognition system usually includes processes such as hand segmentation, gesture modeling, and gesture shape feature extraction. Hand segmentation aims to separate the hand from the captured gesture image; it is the first and also a key step of the vision-based gesture recognition process. The accuracy and real-time performance of segmentation directly affect the later recognition effect and the performance of the entire interactive system, so in-depth research into machine learning for hand segmentation, to further improve its segmentation quality and speed, is of great significance.
  • At present, the most common method is to use skin color for hand segmentation. Using a skin color model does not require considering the changeable geometric characteristics of the hand, but it introduces interference from the illumination component, and the complexity of the background and differences in skin tone across subjects also affect the detection result; how to overcome these two problems has become one of the main research directions of hand segmentation.
  • In view of this, this application provides a hand image segmentation method, apparatus, and computer device, whose main purpose is to solve the problem that skin color confusion, illumination, and deformation interfere with hand image detection when the hand image is segmented, making the segmentation result insufficiently accurate.
  • According to one aspect, a hand image segmentation method is provided, including: collecting sample images containing complete hand images; labeling the coordinate position of the hand area in the sample images; using the labeled sample images as a training set and training, based on the Faster R-CNN algorithm, a hand recognition model whose training result meets a preset standard; using the hand recognition model to detect whether an image to be recognized contains a hand image; and outputting the hand image segmentation result according to the detection result.
  • According to another aspect, a hand image segmentation apparatus is provided, which includes:
  • an acquisition module, used to collect sample images containing complete hand images;
  • a labeling module, used to label the coordinate position of the hand area in the sample images;
  • a training module, configured to use the sample images labeled with the coordinate position as a training set and train, based on the Faster R-CNN algorithm, a hand recognition model whose training result meets a preset standard;
  • a detection module, configured to use the hand recognition model to detect whether the image to be recognized includes a hand image;
  • an output module, used to output the hand image segmentation result of the image to be recognized according to the detection result.
  • A non-volatile readable storage medium is also provided, having computer-readable instructions stored thereon; when the computer-readable instructions are executed by a processor, the above hand image segmentation method is realized.
  • A computer device is also provided, including a non-volatile readable storage medium, a processor, and computer-readable instructions that are stored on the non-volatile readable storage medium and can run on the processor; when the processor executes the computer-readable instructions, the aforementioned hand image segmentation method is implemented.
  • Compared with the current practice of segmenting the hand image with a skin color model, the hand image segmentation method, apparatus, and computer device provided by this application create a hand recognition model, train it with the Faster R-CNN algorithm, and perform continuous machine learning and correction on the judgment of the hand image position in the sample images, so that the training result meets the preset standard. Finally, the successfully trained hand recognition model is used to determine whether the image to be recognized contains a hand image, and the corresponding hand image segmentation result is output. The whole technical solution performs detection and analysis through the recognition model, which in non-extreme environments can effectively resist the influence of illumination and of the tilt of the hand; it can therefore accurately judge whether the detected image contains a hand image and accurately locate the area to which the hand belongs, enhancing the accuracy and scientific soundness of the analysis result.
  • FIG. 1 shows a schematic flowchart of a hand image segmentation method provided by an embodiment of this application;
  • FIG. 2 shows a schematic flowchart of another hand image segmentation method provided by an embodiment of this application;
  • FIG. 3 shows a schematic diagram of intercepting the rectangular palm area in the optimal detection image, provided by an embodiment of this application;
  • FIG. 4 shows a schematic diagram of cutting off the part of the rectangular palm area that overlaps the elliptical to-be-cut region, provided by an embodiment of this application;
  • FIG. 5 shows a schematic structural diagram of a hand image segmentation apparatus provided by an embodiment of this application;
  • FIG. 6 shows a schematic structural diagram of another hand image segmentation apparatus provided by an embodiment of this application.
  • This embodiment provides a hand image segmentation method; as shown in FIG. 1, the method includes the steps below.
  • Since the sample image may contain a complex background of hands, faces, arms, and the surrounding environment, the position where the hand appears in the complex image needs to be labeled in order to exclude irrelevant interference; this makes the recognition area more prominent, makes deviations in size values easier to calculate, and allows the position of the hand to be extracted accurately.
  • The hand recognition model can be used for detection and positioning, that is, determining whether there is a hand in the image and finding the area where the hand is; it can also be used for hand segmentation, extracting the hand area from the picture to remove background interference.
  • Once the hand recognition model reaches the training standard, the input image can be detected. When it is detected that the user has triggered the palmprint detection button, i.e., the user has performed the palmprint detection gesture operation, the camera is started to acquire the image data in the image recognition area, and the hand recognition model is used to determine whether the image to be recognized contains a hand image.
  • The hand image segmentation result covers two situations: if the image to be recognized is detected to contain a hand image, the segmented hand image can be output as the hand image segmentation result; if it is detected not to contain a hand image, corresponding prompt content can be output, and the prompt content indicating that no hand image was detected serves as the final output segmentation result.
  • With this method, sample images containing complete hand images can be used; to better train the accuracy of the model and simulate the recognition of hand images during training, the coordinate position of the hand area in the sample images is labeled, and the hand recognition model is trained based on the Faster R-CNN algorithm. After it is determined that the hand recognition model meets the preset standard, it can be put into use: an image to be recognized is input, the model detects whether it contains a hand image, and the hand image segmentation result of the image to be recognized is output according to the detection result. The whole technical solution performs detection and analysis through the recognition model, which in non-extreme environments can effectively resist the influence of illumination and of the hand tilting up, down, left, or right, so it accurately determines whether the detected image contains a hand image and locates the area to which the hand belongs; the whole process is very smart and fast, and it also enhances the accuracy and scientific soundness of the analysis result.
  • As a refinement and extension of the above embodiment, another hand image segmentation method is provided; the method includes the steps below.
  • The actual situation should be considered when selecting sample images: too few make the training result accidental, while too many make data analysis difficult. A corresponding preset threshold should be set according to the specific application scenario, so that the number of selected sample images better meets the analysis needs of users.
  • Step 202 may specifically include: creating an image coordinate system with the upper left corner of the sample image as the origin; in the image coordinate system, determining the four-point coordinate positions of the thumb tip, little finger tip, and middle finger tip of the stretched hand and the base of the palm; determining the rectangular frame where the hand is located from the abscissas of the thumb tip and little finger tip and the ordinates of the middle finger tip and the palm base; and marking the coordinate positions of the upper left and lower right corners of the rectangular frame.
  • For example, 1000 clear pictures with complete hand image data are selected from the picture database as sample images; for each of these 1000 images, the four-point coordinate positions of the thumb tip, little finger tip, middle finger tip, and palm base are determined. The width of the hand's rectangular frame is fixed by the abscissas of the thumb tip and little finger tip, and its height by the ordinates of the middle finger tip and the palm base; the coordinate positions of the upper left and lower right corners of each rectangular frame are then marked for position recording, recognition, and correction. A bounding-box sketch of this rule follows.
  • The initial hand recognition model is created in advance according to design needs.
  • It differs from the hand recognition model in that the initial hand recognition model has only been created initially: it has not passed model training and does not yet meet the preset standard, whereas the hand recognition model refers to the recognition model that, through model training, has reached the preset standard and can be applied to the detection of images to be recognized.
  • The model is created on the basis of the Faster R-CNN algorithm.
  • The hand recognition model consists of two major modules, namely the RPN candidate-frame extraction module and the Faster R-CNN detection module.
  • RPN is a fully convolutional neural network used to extract rectangular detection frames; Faster R-CNN detects and recognizes the target in each proposal extracted by the RPN.
  • During training, the input of the hand recognition model is the sample images with labeled coordinate positions, and the output is a preset number of hand image proposal windows with evaluation scores. A hedged sketch of such a training step follows.
  • Based on the hand image features, the Region Proposal Network (RPN) generates a predetermined number of proposal windows (proposals), one set per picture.
  • Faster R-CNN creatively uses a convolutional network to generate the proposal boxes itself, and shares that convolutional network with the target detection network; the number of generated proposal boxes is 300. The input of the RPN is an image and its output is a series of rectangular detection boxes (proposals), each with its own objectness score, which indicates the confidence that the center of the patch contains an object; the highest confidence score is 100%.
  • The confidence reflects the probability that the hand image appears in the proposal box: the higher the objectness score, the more complete the hand image included in the proposal window.
  • The RoI pooling layer can be used to generate a fixed-size feature map for each RoI.
  • The RoI pooling layer is responsible for collecting all candidate frames, computing the feature map of each candidate frame, and then sending it to the subsequent network. Once the network is trained, the input image size must be a fixed value, and the network output is likewise of fixed size.
  • RoI pooling therefore pools feature maps of different sizes into feature maps of the same size, which is convenient for output to the next layer of the network; a small sketch follows.
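To get a concrete sense of that fixed-size pooling, the sketch below uses torchvision's roi_pool; the tensor shapes and the stride-16 scale are illustrative assumptions.

```python
import torch
from torchvision.ops import roi_pool

# A backbone feature map: batch 1, 256 channels, 50x50 cells.
features = torch.randn(1, 256, 50, 50)

# Two candidate frames as (batch_index, x1, y1, x2, y2), in input-image pixels.
proposals = torch.tensor([[0, 40.0, 60.0, 210.0, 300.0],
                          [0, 10.0, 10.0, 120.0, 200.0]])

# spatial_scale maps image pixels to feature-map cells (stride 16 -> 1/16).
pooled = roi_pool(features, proposals, output_size=(7, 7), spatial_scale=1 / 16)
print(pooled.shape)  # torch.Size([2, 256, 7, 7]): one fixed-size map per proposal
```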
  • The detection classification probability is the probability that the system correctly detects the target when the target exists, and can be computed as: detection classification probability = number of correctly detected target images / total number of sample images participating in detection.
  • In this solution, the detection classification probability is the probability that the hand recognition model judges a hand image to be present when sample image data with a complete hand image is input to the model. The detection frame regression is applied when the image in a candidate frame is not positioned accurately (Intersection over Union (IoU) < a predetermined value): the image in the candidate frame is fine-tuned so that the fine-tuned window is closer to the annotated image data and the positioning is more accurate. IoU evaluates positioning accuracy and represents the overlap of two bounding boxes, IoU = (A ∩ B) / (A ∪ B), i.e., the overlapping area of rectangular frames A and B as a proportion of the area of their union; a direct implementation of this ratio follows.
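The IoU ratio itself is only a few lines; a minimal sketch with our own helper name:

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

print(round(iou((40, 60, 210, 300), (50, 80, 200, 310)), 3))  # value in [0, 1]
```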
  • Step 207 may specifically include: matching the detection image against the hand image in the rectangular frame with labeled coordinate positions for degree of overlap; taking the overlap value as the corresponding evaluation score; if the number of evaluation scores in the proposal windows that are greater than or equal to the first preset threshold meets the first predetermined number condition, determining that the training result of the initial hand recognition model reaches the preset standard; and if it does not, correcting the proposal windows whose evaluation score is below the first preset threshold according to the actually labeled coordinate position of the hand image.
  • The first preset threshold is the numerical basis for judging whether the recognition result of a single proposal window meets the preset standard: when the evaluation score of a proposal window is greater than or equal to the first preset threshold, the window's recognition result meets the expected standard; otherwise it does not.
  • The first predetermined number condition is the criterion for judging whether the hand recognition model training passes verification.
  • The first predetermined number condition sets a minimum score ratio: when the ratio of proposal windows at or above the first preset threshold to the total number of windows is greater than or equal to that minimum ratio, the condition is met; otherwise it is not. The check itself reduces to a few lines, sketched below.
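Read literally, the first double-threshold check is the following (function and variable names are ours):

```python
def training_passes(scores, first_threshold, min_ratio):
    """First-predetermined-number condition: the model passes when the share
    of proposal windows whose evaluation score (overlap with the annotated
    box) reaches `first_threshold` is at least `min_ratio`."""
    hits = sum(1 for s in scores if s >= first_threshold)
    return hits / len(scores) >= min_ratio

# 200 of 300 windows scoring >= 0.95, against a 1/2 minimum ratio -> passes
scores = [0.97] * 200 + [0.50] * 100
print(training_passes(scores, 0.95, 0.5))  # True
```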
  • When the hand recognition model reaches the preset standard, it can be put to use in hand image detection of images to be recognized: any image to be recognized, for which it is unknown whether it contains a hand image, is uploaded to the hand recognition model, and hand image detection and recognition can then proceed.
  • The detection probability is the judging datum for whether a proposal window contains a hand image: the higher the detection probability, the greater the probability that the corresponding proposal window contains a hand image.
  • The second preset threshold is the minimum detection probability from which it can be determined that a proposal window contains a hand image.
  • The second predetermined number condition is the criterion for judging whether the image to be recognized contains a hand image, i.e., it sets a minimum score ratio.
  • If the ratio of proposal windows with detection probabilities greater than or equal to the second preset threshold to the total number of proposal windows is greater than or equal to the minimum score ratio in the second predetermined number condition, the image to be recognized contains a hand image; otherwise it does not.
  • For example, if the second preset threshold is set to 90%, 100 of the 300 proposal windows are found to have detection probabilities greater than or equal to 90%, and the minimum score ratio in the preset second predetermined number condition is 1/3, then 100/300 = 1/3 meets the ratio and the image to be recognized is judged to contain a hand image.
  • If instead the minimum score ratio in the second predetermined number condition is 1/2, then 100/300 = 1/3 falls short of 1/2 and the image to be recognized is judged not to contain a hand image. Both cases are reproduced in the sketch below.
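Both worked examples can be reproduced with the same kind of check (again, the names are ours):

```python
def contains_hand(probs, second_threshold, min_ratio):
    """Second double-threshold check: decide whether the image to be
    recognized contains a hand image from per-window detection probabilities."""
    hits = sum(1 for p in probs if p >= second_threshold)
    return hits / len(probs) >= min_ratio

probs = [0.95] * 100 + [0.10] * 200       # 100 of 300 windows reach 90%
print(contains_hand(probs, 0.90, 1 / 3))  # True:  100/300 >= 1/3
print(contains_hand(probs, 0.90, 1 / 2))  # False: 100/300 <  1/2
```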
  • If the image to be recognized contains a hand image, the optimal detection image with the most completely recognized hand is determined as the final segmented hand image, and the segmented hand image is further displayed on the display page.
  • Otherwise, prompt information is output; it may include text prompts, picture prompts, audio prompts, video prompts, light prompts, vibration prompts, and the like on the display page.
  • This embodiment may further include: using a key point detection algorithm to extract the fingertip, finger-root, and palm-center positions of the hand in the optimal detection image; intercepting the rectangular palm area of the optimal detection image according to the finger-root and palm-center positions; determining the elliptical to-be-cut region of the thumb using the fingertip positions; cutting off the part of the rectangular palm area that overlaps the elliptical to-be-cut region; comparing the palmprint similarity of the cut rectangular palm area against pre-stored user palmprint images; and determining the user identity corresponding to the image to be recognized according to the user identity of the user palmprint image whose palmprint similarity is greater than or equal to a preset threshold.
  • The method of intercepting the rectangular palm area of the optimal detection image according to the finger-root and palm-center positions can be as follows (as shown in FIG. 3): the angle between the line connecting the index-finger root A and the little-finger root B and the intersecting horizontal line is determined as the deviation angle, and the palm area is rotated accordingly so that AB becomes horizontal.
  • The palm-center point C in the rotated palm is taken as the midpoint (x, y) of the rectangular palm area, with length equal to half of |AB|, and the rectangular palm area is intercepted as the rectangle with vertices p1(x-length, y-length), p2(x+length, y-length), p3(x-length, y+length), and p4(x+length, y+length); a sketch of this rotate-and-crop step follows.
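One way to realize the rotate-and-crop step with OpenCV; this is our reading of FIG. 3, and border clipping and sub-pixel details are not handled.

```python
import math

import cv2

def crop_palm(image, index_root, pinky_root, palm_center):
    """Rotate so the finger-root line AB is horizontal, then crop the square
    palm region centered on palm point C with half-side length = |AB| / 2."""
    (ax, ay), (bx, by) = index_root, pinky_root
    angle = math.degrees(math.atan2(by - ay, bx - ax))  # deviation angle of AB

    h, w = image.shape[:2]
    # Rotating about C keeps the palm center at the same pixel coordinates.
    rot = cv2.getRotationMatrix2D(palm_center, angle, 1.0)
    rotated = cv2.warpAffine(image, rot, (w, h))

    length = int(math.hypot(bx - ax, by - ay) / 2)      # length = 1/2 * |AB|
    x, y = int(palm_center[0]), int(palm_center[1])
    return rotated[y - length:y + length, x - length:x + length]
```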
  • The method of using the fingertip positions to determine the elliptical to-be-cut region of the thumb can be as follows (as shown in FIG. 4): determine the left/right-hand attribute of the hand from the key point detection results. Specifically, the positions of the little-finger root point B and the middle-finger tip point T are determined from key point detection, and the left/right attribute of the hand is determined from points B and T. In the image coordinate system, if the ordinate Ty of point T is greater than the ordinate By of point B, the hand points downward and the is_up flag is False; otherwise the hand points upward and the is_up flag is True. If is_up is True and the abscissa Tx of T is less than the abscissa Bx of B, or is_up is False and Tx is greater than Bx, the hand is a left hand; otherwise it is a right hand.
  • The semi-major axis of the ellipse is 2/5 of the height of the rectangular palm area to be segmented, and the semi-minor axis is 1/4 of that height.
  • The position of the center point of the ellipse and the lengths of its axes determine the position of the image to be cut, i.e., the thumb part, so they depend on the numbers of rows and columns of the palm image.
  • The thumb positions of the left and right hands differ, so the position of the ellipse center differs as well.
  • For a right hand, the x-coordinate of the ellipse center is the number of columns of the image and the y-coordinate is 4/5 of the number of rows; for a left hand, the x-coordinate of the ellipse center is 0 and the y-coordinate is 4/5 of the number of rows.
  • The method of cutting off the portion of the rectangular palm area that overlaps the ellipse can then be: the pixels of the intersection of the ellipse with the rectangular palm area are set to 0, i.e., the thumb is cut off. One way to realize this with an ellipse mask is sketched below.
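A sketch of that masking step; the ellipse-axis orientation (long axis vertical along the thumb-side border) is our assumption about the geometry described above.

```python
import cv2
import numpy as np

def remove_thumb(palm, is_right_hand):
    """Zero out the elliptical thumb region of a cropped palm image."""
    rows, cols = palm.shape[:2]
    center = (cols if is_right_hand else 0, int(rows * 4 / 5))
    semi_major = int(rows * 2 / 5)   # 2/5 of the palm-region height
    semi_minor = int(rows * 1 / 4)   # 1/4 of the palm-region height

    mask = np.ones(palm.shape[:2], dtype=np.uint8)
    # axes = (half-width, half-height): long axis vertical along the border.
    cv2.ellipse(mask, center, (semi_minor, semi_major), 0, 0, 360,
                color=0, thickness=-1)
    return palm * (mask if palm.ndim == 2 else mask[..., None])
```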
  • The method of determining the user identity corresponding to the image to be recognized can be: obtain, through machine learning, the features of the palmprint image with the thumb removed, and compare them against the palmprint images previously entered by users.
  • MobileNet is used to extract the features of the palmprints.
  • MobileNet extracts the palmprint feature vectors, and whether the two palmprint images to be compared match is judged by computing the cosine similarity of their feature vectors (the Euclidean distance can also be used to judge the distance between the two). If the cosine similarity of the two feature vectors reaches the preset threshold they match; otherwise they do not, thereby realizing the verification of the user's identity. A small embedding sketch follows.
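The patent names MobileNet but no variant or framework, so as one hedged rendering, the sketch below extracts a palmprint feature vector with torchvision's MobileNetV2; the weights choice and the global-average pooling are our assumptions.

```python
import torch
from torchvision.models import mobilenet_v2

backbone = mobilenet_v2(weights="IMAGENET1K_V1").features.eval()

def palm_embedding(img):
    """img: float tensor (3, H, W), already resized and normalized; returns a
    global-average-pooled feature vector of shape (1280,)."""
    with torch.no_grad():
        fmap = backbone(img.unsqueeze(0))       # (1, 1280, h, w)
        return fmap.mean(dim=(2, 3)).squeeze(0)
```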
  • Computing the cosine similarity of the feature vectors of two palmprint images judges the similarity of the two palmprint images by the cosine of the angle between the vectors; the closer the cosine value is to 1, the more similar the two palmprint images. The cosine value is computed as
  • $$\cos(x, y) = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^2}\,\sqrt{\sum_{i=1}^{n} y_i^2}}$$
  • where x and y are the feature vector of the palmprint image with the thumb removed and the feature vector of the palmprint image entered by the user, and n is the number of feature-vector components of the two palmprint images. Comparing the similarity of two palmprint images can also be achieved by comparing the Euclidean distances between multiple pairs of points; for example, if the two points on the two-dimensional plane corresponding to the palmprint images are a(x1, y1) and b(x2, y2), the Euclidean distance between a and b can be computed as
  • $$d(a, b) = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$$
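Both comparisons are one-liners over the extracted vectors; a minimal numpy sketch:

```python
import numpy as np

def cosine_similarity(x, y):
    """Cosine of the angle between two palmprint feature vectors."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))

def euclidean(a, b):
    """Distance between two 2-D points a = (x1, y1), b = (x2, y2)."""
    return float(np.hypot(a[0] - b[0], a[1] - b[1]))

print(cosine_similarity([1.0, 0.5, 0.2], [0.9, 0.6, 0.1]))  # near 1 -> similar
print(euclidean((0, 0), (3, 4)))                            # 5.0
```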
  • With the above method, the hand recognition model can be created based on the Faster R-CNN algorithm; during model training with the sample images labeled with coordinate positions, the first preset threshold is used to judge whether the recognition result of a single proposal window meets the preset standard, and the first predetermined number condition is used to further judge whether the hand recognition model training passes verification.
  • This double-threshold approach makes the model training process more precise. When model training falls short of the expected standard, the proposal windows whose evaluation scores are below the first preset threshold are corrected according to the actually labeled coordinate positions of the hand images, so that the model better meets users' needs. When the hand recognition model is put into use, the detection result is again constrained by the double thresholds of the second preset threshold and the second predetermined number condition, making the final judgment of whether the image to be recognized contains a hand image more convincing and accurate; finally, the detected hand image segmentation result is displayed intuitively.
  • The entire technical solution performs detection and analysis through the recognition model, so the area to which the hand belongs can be accurately located and precisely segmented, which enhances the accuracy and scientific soundness of the hand image segmentation result.
  • An embodiment of the present application provides a hand image segmentation apparatus.
  • The apparatus includes: an acquisition module 31, a labeling module 32, a training module 33, a detection module 34, and an output module 35.
  • The acquisition module 31 can be used to collect sample images including complete hand images;
  • the labeling module 32 can be used to label the coordinate position of the hand area in the sample images;
  • the training module 33 can be used to use the sample images with labeled coordinate positions as a training set and train, based on the Faster R-CNN algorithm, a hand recognition model whose training results meet preset standards;
  • the detection module 34 can be used to use the hand recognition model to detect whether the image to be recognized contains a hand image;
  • the output module 35 can be used to output the hand image segmentation result of the image to be recognized according to the detection result.
  • The labeling module 32 can also be used to create an image coordinate system with the upper left corner of the sample image as the origin; in the image coordinate system, determine the four-point coordinate positions of the thumb tip, little finger tip, and middle finger tip of the stretched hand and the base of the palm; determine the rectangular frame where the hand is located from the abscissas of the thumb and little finger tips and the ordinates of the middle finger tip and the palm base; and mark the coordinates of the upper left and lower right corners of the rectangular frame.
  • The training module 33 can also be used to input the sample images with labeled coordinate positions into the initial hand recognition model created in advance based on the Faster R-CNN algorithm;
  • use the deep convolutional neural network CNN of the initial hand recognition model to extract the hand image features of the sample images; have the region proposal network RPN generate a predetermined number of proposal windows (proposals); map each proposal window onto the last convolutional feature map of the CNN to generate a fixed size;
  • and use the detection classification probability (Softmax Loss) and the detection frame regression (Smooth L1 Loss) to train and correct the detection images in each proposal window, so that the initial hand recognition model meets the preset standard.
  • The training module 33 can also be used to match the detection images against the hand images in the rectangular frames with labeled coordinate positions for degree of overlap; take the overlap value as the corresponding evaluation score; if the number of evaluation scores greater than or equal to the first preset threshold in the proposal windows meets the first preset number ratio, determine that the initial hand recognition model training result reaches the preset standard; and if it does not, correct the proposal windows whose evaluation score is below the first preset threshold according to the actually labeled coordinate position of the hand image.
  • The detection module 34 can also be used to upload the image to be recognized into the hand recognition model; use the Faster R-CNN algorithm to determine the predetermined number of proposal windows corresponding to the image to be recognized and their respective detection probabilities, where the detection probability is the judging datum for whether a proposal window contains a hand image; if the number of proposal windows whose detection probability is greater than or equal to the second preset threshold meets the second preset number ratio, determine that the image to be recognized contains a hand image; and if it does not, determine that the image to be recognized does not contain a hand image.
  • The output module 35 can also be used to select, if it is determined that the image to be recognized contains a hand image, the proposal window with the highest detection score as the optimal detection image;
  • output the optimal detection image as the result of hand image segmentation; and, if it is determined that the image to be recognized does not contain a hand image, output on the display page a prompt message that no hand image was detected.
  • The apparatus further includes: an extraction module 36, an interception module 37, a determination module 38, a cutting module 39, a comparison module 310, and a judgment module 311.
  • The extraction module 36 can be used to extract the fingertip, finger-root, and palm-center positions of the hand in the optimal detection image using the key point detection algorithm;
  • the interception module 37 can be used to intercept the rectangular palm area of the optimal detection image according to the finger-root and palm-center positions;
  • the determination module 38 can be used to determine the elliptical to-be-cut region of the thumb using the fingertip positions;
  • the cutting module 39 can be used to cut off the part of the rectangular palm area that overlaps the elliptical to-be-cut region;
  • the comparison module 310 can be used to compare the palmprint similarity of the cut rectangular palm area against pre-stored user palmprint images;
  • the judgment module 311 can be used to determine the user identity corresponding to the image to be recognized according to the user identity of the user palmprint image whose palmprint similarity is greater than or equal to a preset threshold.
  • An embodiment of the present application also provides a non-volatile readable storage medium on which computer-readable instructions are stored; when the computer-readable instructions are executed, the hand image segmentation method shown in FIG. 1 and FIG. 2 is realized.
  • The technical solution of this application can be embodied in the form of a software product.
  • The software product can be stored in a non-volatile storage medium (which can be a CD-ROM, USB flash drive, removable hard disk, etc.) and includes several instructions that cause a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods in each implementation scenario of the present application.
  • An embodiment of the present application also provides a computer device, which may specifically be a personal computer, a server, a network device, etc.
  • The physical device includes a non-volatile readable storage medium and a processor: the non-volatile readable storage medium is used to store computer-readable instructions, and the processor is used to execute the computer-readable instructions to realize the hand image segmentation method shown in FIG. 1 and FIG. 2.
  • the computer device may also include a user interface, a network interface, a camera, a radio frequency (RF) circuit, a sensor, an audio circuit, a WI-FI module, and so on.
  • the user interface may include a display screen (Display), an input unit such as a keyboard (Keyboard), etc., and the optional user interface may also include a USB interface, a card reader interface, and the like.
  • the network interface can optionally include a standard wired interface, a wireless interface (such as a Bluetooth interface, a WI-FI interface), etc.
  • the computer device structure provided in this embodiment does not constitute a limitation on the physical device, and may include more or fewer components, or combine certain components, or arrange different components.
  • the non-volatile readable storage medium may also include an operating system and a network communication module.
  • The operating system is a program that manages the hardware and software resources of the physical device executing the hand image segmentation method, and it supports the running of the information processing program and other software and/or programs.
  • the network communication module is used to implement communication between various components in the non-volatile readable storage medium and communication with other hardware and software in the physical device.
  • This application can be implemented by means of software plus a necessary general-purpose hardware platform, or by hardware.
  • This application can create a hand recognition model based on the Faster R-CNN algorithm and use the sample images marked with coordinate positions for model training.
  • The first preset threshold determines whether the recognition result of a single proposal window meets the preset criteria, and the first predetermined number condition is used to further determine whether the hand recognition model training passes verification.
  • The double-threshold approach makes the model training process more precise. When training falls short of the expected standard, the proposal windows whose evaluation scores are below the first preset threshold are corrected according to the actually labeled coordinate positions of the hand images, so that the model better meets users' needs; when the hand recognition model is put into use, the detection result again adopts the double-threshold form of the second preset threshold and the second predetermined number condition, which makes the final judgment of whether the image to be recognized contains the hand image more convincing and accurate.
  • Finally, the result of hand image segmentation is displayed intuitively. The entire technical solution performs detection and analysis through the recognition model, which in non-extreme environments can effectively resist the influence of illumination and of the tilt of the hand; it can therefore accurately judge whether the detected image contains a hand image, accurately locate the area to which the hand belongs, and perform precise segmentation, enhancing the accuracy and scientific soundness of the hand image segmentation result.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

A hand image segmentation method, apparatus, and computer device, relating to the field of computer technology and applicable to detecting and segmenting hand images in pictures; when a hand image is segmented, the solution effectively avoids the problem of skin color confusion, illumination, and deformation interfering with hand image detection. The method includes: collecting sample images containing complete hand images (101); labeling the coordinate position of the hand area in the sample images (102); using the sample images labeled with the coordinate position as a training set and training, based on the Faster R-CNN algorithm, a hand recognition model whose training result meets a preset standard (103); using the hand recognition model to detect whether an image to be recognized contains a hand image (104); and outputting the hand image segmentation result of the image to be recognized according to the detection result (105).

Description

Hand image segmentation method, apparatus, and computer device. Technical Field.
This application claims priority to the Chinese patent application filed with the China National Intellectual Property Administration on April 26, 2019 under application number 201910345761.3 and entitled "Hand image segmentation method, apparatus, and computer device", the entire contents of which are incorporated herein by reference.
Background
Gesture is a very user-friendly way of human-computer interaction that needs no intermediary, and gesture recognition has become an important topic and research hotspot in human-computer interaction. A vision-based gesture recognition system usually includes processes such as hand segmentation, gesture modeling, and gesture shape feature extraction. Hand segmentation aims to separate the hand from the captured gesture image; it is the first and also a key step of the vision-based gesture recognition process. The accuracy and real-time performance of segmentation directly affect the later recognition effect and the performance of the entire interactive system, so in-depth research into machine learning for hand segmentation, to further improve its segmentation quality and speed, is of great significance.
At present, the most common approach is to use skin color for hand segmentation. A skin color model avoids having to consider the changeable geometric characteristics of the hand, but it introduces interference from the illumination component, and the complexity of the background and differences in skin tone across subjects also affect the detection result. How to overcome these two problems has become one of the main research directions of hand segmentation.
Summary
In view of this, this application provides a hand image segmentation method, apparatus, and computer device, whose main purpose is to solve the problem that, when a hand image is segmented, skin color confusion, illumination, and deformation interfere with hand image detection, making the segmentation result insufficiently accurate.
According to one aspect of this application, a hand image segmentation method is provided, including:
collecting sample images containing complete hand images;
labeling the coordinate position of the hand area in the sample images;
using the sample images labeled with the coordinate position as a training set, and training, based on the Faster R-CNN algorithm, a hand recognition model whose training result meets a preset standard;
using the hand recognition model to detect whether an image to be recognized contains a hand image;
outputting the hand image segmentation result of the image to be recognized according to the detection result.
According to another aspect of this application, a hand image segmentation apparatus is provided, including:
an acquisition module, configured to collect sample images containing complete hand images;
a labeling module, configured to label the coordinate position of the hand area in the sample images;
a training module, configured to use the sample images labeled with the coordinate position as a training set and train, based on the Faster R-CNN algorithm, a hand recognition model whose training result meets a preset standard;
a detection module, configured to use the hand recognition model to detect whether an image to be recognized contains a hand image;
an output module, configured to output the hand image segmentation result of the image to be recognized according to the detection result.
According to yet another aspect of this application, a non-volatile readable storage medium is provided, on which computer-readable instructions are stored; when the computer-readable instructions are executed by a processor, the above hand image segmentation method is implemented.
According to still another aspect of this application, a computer device is provided, including a non-volatile readable storage medium, a processor, and computer-readable instructions stored on the non-volatile readable storage medium and executable on the processor; when the processor executes the computer-readable instructions, the above hand image segmentation method is implemented.
By means of the above technical solution, and compared with the current practice of segmenting the hand image with a skin color model, the hand image segmentation method, apparatus, and computer device provided by this application create a hand recognition model, train it with the Faster R-CNN algorithm, and continuously apply machine learning and correction to the judgment of the hand image position in the sample images so that the training result meets the preset standard; finally, the successfully trained hand recognition model is used to judge whether an image to be recognized contains a hand image, and the corresponding hand image segmentation result is output. The whole technical solution performs detection and analysis through the recognition model; in non-extreme environments it can effectively resist the influence of illumination and of the hand tilting up, down, left, or right, so that it accurately judges whether the detected image contains a hand image and accurately locates the area to which the hand belongs, enhancing the accuracy and scientific soundness of the analysis result.
The above description is only an overview of the technical solution of this application. To make the technical means of this application clear enough to be implemented according to the contents of the specification, and to make the above and other objects, features, and advantages of this application more apparent and understandable, specific embodiments of this application are set out below.
Brief Description of the Drawings
The drawings described here are provided for a further understanding of this application and constitute a part of this application; the illustrative embodiments of this application and their descriptions are used to explain this application and do not constitute an improper limitation of it. In the drawings:
FIG. 1 shows a schematic flowchart of a hand image segmentation method provided by an embodiment of this application;
FIG. 2 shows a schematic flowchart of another hand image segmentation method provided by an embodiment of this application;
FIG. 3 shows a schematic diagram of intercepting the rectangular palm area in the optimal detection image, provided by an embodiment of this application;
FIG. 4 shows a schematic diagram of cutting off the part of the rectangular palm area that overlaps the elliptical to-be-cut region, provided by an embodiment of this application;
FIG. 5 shows a schematic structural diagram of a hand image segmentation apparatus provided by an embodiment of this application;
FIG. 6 shows a schematic structural diagram of another hand image segmentation apparatus provided by an embodiment of this application.
Detailed Description
This application is described in detail below with reference to the drawings and in combination with the embodiments. It should be noted that, as long as they do not conflict, the embodiments of this application and the features in the embodiments can be combined with each other.
To address the problem that skin color confusion, illumination, and deformation interfere with hand image detection when a hand image is segmented, making the segmentation result insufficiently accurate, this embodiment provides a hand image segmentation method. As shown in FIG. 1, the method includes:
101. Collect sample images containing complete hand images.
In a specific application scenario, in order to better achieve the purpose of machine learning and make the analysis results more uniform and targeted, the selected sample images must satisfy the precondition that each image contains a complete hand image.
102. Label the coordinate position of the hand area in the sample images.
In a specific application scenario, because a sample image may contain a complex background of hands, faces, arms, and the surrounding environment, the position where the hand appears in the complex image needs to be labeled in order to exclude irrelevant interference, making the recognition area more prominent and deviations in size values easier to calculate, so that the position of the hand can be extracted accurately.
103. Use the sample images labeled with the coordinate position as a training set, and train, based on the Faster R-CNN algorithm, a hand recognition model whose training result meets a preset standard.
The hand recognition model can be used for detection and positioning, i.e., determining whether a hand appears in the image and finding the area where the hand is; it can also be used for hand segmentation, extracting the hand area from the picture to remove background interference.
104. Use the hand recognition model to detect whether the image to be recognized contains a hand image.
In a specific application scenario, once the hand recognition model reaches the training standard, input images can be detected. When it is detected that the user has triggered the palmprint detection button, i.e., the user has performed the palmprint detection gesture operation, the camera is started to acquire the image data in the image recognition area, and the hand recognition model is used to determine whether the image to be recognized contains a hand image.
105. Output the hand image segmentation result of the image to be recognized according to the detection result.
The hand image segmentation result covers two situations: if the image to be recognized is detected to contain a hand image, the segmented hand image can be output as the hand image segmentation result; if the image to be recognized is detected not to contain a hand image, the corresponding prompt content can be output, and the prompt content indicating that no hand image was detected serves as the final output segmentation result.
With the hand image segmentation method of this embodiment, sample images containing complete hand images can be used; to better train the accuracy of the model and simulate the recognition of hand images during training, the coordinate position of the hand area in the sample images is labeled, and the hand recognition model is trained based on the Faster R-CNN algorithm. Once the hand recognition model is judged to meet the preset standard, it can be put into use: an image to be recognized is input, the model detects whether it contains a hand image, and the hand image segmentation result of the image to be recognized is output according to the detection result. The whole technical solution performs detection and analysis through the recognition model; in non-extreme environments it can effectively resist the influence of illumination and of the hand tilting up, down, left, or right, so it accurately judges whether the detected image contains a hand image and locates the area to which the hand belongs. The entire process is very smart and fast, and it also enhances the accuracy and scientific soundness of the analysis result.
Further, as a refinement and extension of the specific implementation of the above embodiment, and to fully explain the specific implementation process of this embodiment, another hand image segmentation method is provided. As shown in FIG. 2, the method includes:
201. Collect sample images containing complete hand images.
In a specific application scenario, to avoid selecting too few sample images, which would make the training result accidental, or too many, which would make data analysis difficult, the actual situation should be considered when selecting sample images: a corresponding preset threshold should be set according to the specific application scenario so that the number of selected sample images better meets the analysis needs of users.
202. Label the coordinate position of the hand area in the sample images.
In an optional implementation, step 202 may specifically include: creating an image coordinate system with the upper left corner of the sample image as the origin; in the image coordinate system, determining the four-point coordinate positions of the thumb tip, little finger tip, and middle finger tip of the stretched hand and the base of the palm; determining the rectangular frame where the hand is located from the abscissas of the thumb tip and little finger tip and the ordinates of the middle finger tip and the palm base; and marking the coordinate positions of the upper left and lower right corners of the rectangular frame.
For example, according to development needs, 1000 clear pictures containing complete hand image data are selected from the picture database as sample images. For each of these 1000 images, the four-point coordinate positions of the thumb tip, little finger tip, middle finger tip, and palm base are determined; the width of the rectangular frame where the hand is located is determined from the abscissas of the thumb tip and little finger tip, and its height from the ordinates of the middle finger tip and the palm base. This fixes the specific position of the hand's rectangular frame in each of the 1000 images, and finally the coordinate positions of the upper left and lower right corners of each rectangular frame are marked for position recording, recognition, and correction.
203. Input the sample images labeled with the coordinate position into an initial hand recognition model created in advance based on the Faster R-CNN algorithm.
The initial hand recognition model is created in advance according to design needs. It differs from the hand recognition model in that the initial hand recognition model has only been created initially; it has not passed model training and does not yet meet the preset standard, whereas the hand recognition model refers to the recognition model that, through model training, has reached the preset standard and can be applied to the detection of images to be recognized. The model is created on the basis of the Faster R-CNN algorithm and consists of two major modules: the RPN candidate-frame extraction module and the Faster R-CNN detection module. RPN is a fully convolutional neural network used to extract rectangular detection frames; Faster R-CNN detects and recognizes the target in each proposal extracted by the RPN. During training, the input of the hand recognition model is the sample images labeled with coordinate positions, and the output is a preset number of hand image proposal windows with evaluation scores.
204. Use the deep convolutional neural network CNN of the initial hand recognition model to extract the hand image features of the sample images.
In a specific application scenario, the entire picture needs to be input into the CNN for repeated machine learning of the hand feature images.
205. Based on the hand image features, the region proposal network RPN generates a predetermined number of proposal windows (proposals).
In a specific application scenario, a Region Proposal Network (RPN) is used to generate proposal windows (proposals), a predetermined number per picture. Faster R-CNN creatively uses a convolutional network to produce the proposal boxes itself and shares that convolutional network with the target detection network; the number of generated proposal boxes is 300. The input of the RPN is an image and its output is a series of rectangular detection boxes (proposals), each with its own objectness score, which expresses the confidence that the center of the patch contains an object; the highest confidence score is 100%. The confidence reflects the probability that a hand image appears in the proposal box: the higher the objectness score, the more complete the hand image contained in the proposal window.
206. Map each proposal window onto the last convolutional feature map of the CNN to generate a fixed size.
In a specific application scenario, the RoI pooling layer can be used to generate a fixed-size feature map for each RoI. The RoI pooling layer is responsible for collecting all candidate frames, computing the feature map of each candidate frame, and sending it to the subsequent network. Once the network is trained, the input image size must be a fixed value, and the network output is likewise of fixed size. RoI pooling therefore pools feature maps of different sizes into feature maps of the same size, which is convenient for output to the next layer of the network.
207. Use the detection classification probability (Softmax Loss) and the detection bounding-box regression (Smooth L1 Loss) to train and correct the detection images in each proposal window, so that the initial hand recognition model meets the preset standard.
The detection classification probability is the probability that the system correctly detects the target when the target exists, and can be computed as: detection classification probability = number of correctly detected target images / total number of sample images participating in detection. In this solution it is the probability that the hand recognition model judges a hand image to be present when sample image data with a complete hand image is input to the model. The detection bounding-box regression fine-tunes the image in a candidate frame when it is not positioned accurately (Intersection over Union (IoU) < a predetermined value), making the fine-tuned window closer to the annotated image data and the positioning more accurate. IoU is used to evaluate positioning accuracy; it represents the overlap of two bounding boxes and is computed as IoU = (A ∩ B) / (A ∪ B), i.e., the overlapping area of rectangular frames A and B as a proportion of the area of their union.
In an optional implementation, step 207 may specifically include: matching the detection image against the hand image inside the rectangular frame with labeled coordinate positions for degree of overlap; taking the overlap value as the corresponding evaluation score; if the number of proposal windows whose evaluation score is greater than or equal to the first preset threshold meets the first predetermined number condition, determining that the training result of the initial hand recognition model reaches the preset standard; and if it does not, correcting the proposal windows whose evaluation score is below the first preset threshold according to the actually labeled coordinate position of the hand image.
The first preset threshold is the numerical basis for judging whether the recognition result of a single proposal window meets the preset standard: when the evaluation score of a proposal window is greater than or equal to the first preset threshold, the window's recognition result meets the expected standard; otherwise it does not. The first predetermined number condition is the criterion for judging whether the hand recognition model training passes verification: it sets a minimum score ratio, and when the ratio of proposal windows at or above the first preset threshold to the total number of windows is greater than or equal to that minimum ratio, the condition is met; otherwise it is not. The higher the minimum score ratio set in the first predetermined number condition, the stricter the training and the more precise the conclusions drawn by the model. When the number of proposal windows meeting the expected standard satisfies the first predetermined number condition, the hand recognition model passes training; otherwise it does not, and recognition and correction of the hand images must continue.
For example, suppose the threshold for this training check is set to 95%, 200 of the 300 proposal windows are judged to score above 95%, and the minimum score ratio in the predetermined number condition is 1/2. The ratio of proposal windows meeting the expected standard to the total number of proposal windows is first computed as 200/300 = 2/3; numerical comparison shows that 2/3 is greater than 1/2, so the hand recognition model is further judged to pass training.
208. Upload the image to be recognized into the hand recognition model.
When the hand recognition model reaches the preset standard, it can be put to use in hand image detection of images to be recognized: any image to be recognized, for which it is unknown whether it contains a hand image, is uploaded to the hand recognition model, and hand image detection and recognition can then proceed.
209. Use the Faster R-CNN algorithm to determine the predetermined number of proposal windows corresponding to the image to be recognized and their respective detection probabilities.
The detection probability is the judging datum for whether a proposal window contains a hand image: the higher the detection probability, the greater the probability that the corresponding proposal window contains a hand image.
210. If the number of proposal windows whose detection probability is greater than or equal to the second preset threshold meets the second predetermined number condition, determine that the image to be recognized contains a hand image.
The second preset threshold is the minimum detection probability from which it can be judged that a proposal window contains a hand image; the second predetermined number condition is the criterion for judging whether the image to be recognized contains a hand image, i.e., a minimum score ratio is set. When the ratio of proposal windows whose detection probability exceeds the second preset threshold to the total number of proposal windows is greater than or equal to the minimum score ratio in the second predetermined number condition, the image to be recognized contains a hand image; otherwise it does not. For example, if the second preset threshold is set to 90%, 100 of the 300 proposal windows are judged to have detection probabilities greater than or equal to 90%, and the preset minimum score ratio in the second predetermined number condition is 1/3, the ratio of such proposal windows to the total is first computed as 100/300 = 1/3; numerical comparison shows that the computed 1/3 equals the minimum ratio in the second predetermined number condition, so it can further be judged that the image to be recognized contains a hand image.
211. If the number of proposal windows whose detection probability is greater than or equal to the second preset threshold does not meet the second predetermined number condition, determine that the image to be recognized does not contain a hand image.
For example, if the second preset threshold is set to 90%, 100 of the 300 proposal windows are judged to have detection probabilities greater than or equal to 90%, and the preset minimum score ratio in the second predetermined number condition is 1/2, the ratio is first computed as 100/300 = 1/3; numerical comparison shows that 1/3 is less than 1/2, so it can further be judged that the image to be recognized does not contain a hand image.
212. If it is determined that the image to be recognized contains a hand image, select the proposal window with the highest detection score as the optimal detection image.
In a specific application scenario, after it is judged that the image to be recognized contains a hand image, the detection probability values of those of the 300 proposal windows whose detection probability is greater than or equal to the second preset threshold are further compared, and the proposal window with the highest detection score is selected as the optimal detection image, i.e., the most completely recognized hand detection image.
213. Output the optimal detection image as the hand image segmentation result.
In a specific application scenario, the optimal detection image with the most completely recognized hand is finally determined as the segmented hand image, which is then displayed on the display page.
214. If it is determined that the image to be recognized does not contain a hand image, output, on the display page, prompt information indicating that no hand image was detected.
The prompt information may include text prompts, picture prompts, audio prompts, video prompts, light prompts, vibration prompts, and the like on the display page.
Correspondingly, to better apply the segmented hand image, as an optional application scenario this embodiment may further include: using a key point detection algorithm to extract the fingertip, finger-root, and palm-center positions of the hand in the optimal detection image; intercepting the rectangular palm area of the optimal detection image according to the finger-root and palm-center positions; determining the elliptical to-be-cut region of the thumb using the fingertip positions; cutting off the part of the rectangular palm area that overlaps the elliptical to-be-cut region; comparing the palmprint similarity of the cut rectangular palm area against pre-stored user palmprint images; and determining the user identity corresponding to the image to be recognized according to the user identity of the user palmprint image whose palmprint similarity is greater than or equal to a preset threshold.
In a specific application scenario, the method of intercepting the rectangular palm area of the optimal detection image according to the finger-root and palm-center positions may be as follows (as shown in FIG. 3): the angle between the line connecting the index-finger root A and the little-finger root B and the intersecting horizontal line is determined as the deviation angle, and the palm area is rotated accordingly so that AB becomes horizontal. The midpoint E of segment AB and the palm-center point C are selected such that the length of CE equals 1/2 × |AB| and CE is perpendicular to AB; an image coordinate system is established, the point C in the rotated palm is taken as the midpoint (x, y) of the rectangular palm area, and the rectangular palm area is intercepted as the rectangle with vertices p1(x-length, y-length), p2(x+length, y-length), p3(x-length, y+length), and p4(x+length, y+length).
Correspondingly, the method of determining the elliptical to-be-cut region of the thumb from the fingertip positions may be as follows (as shown in FIG. 4): the left/right-hand attribute of the hand is determined from the key point detection results. Specifically, the positions of the little-finger root point B and the middle-finger tip point T are determined from key point detection, and the left/right attribute of the hand is determined from points B and T. In the image coordinate system, if the ordinate Ty of point T is greater than the ordinate By of point B, the hand points downward and the is_up flag is False; otherwise the hand points upward and the is_up flag is True. That is, the vertical direction of the hand is first determined from the ordinates of these two points. If is_up = True and the abscissa Tx of point T is less than the abscissa Bx of point B, or is_up = False and Tx > Bx, the hand is a left hand; otherwise it is a right hand. The direction in which the thumb lies is determined from the left/right-hand detection result, and the origin and long axis of the ellipse are fixed on the rectangular border on the thumb side of the rectangular palm area. The semi-major axis of the ellipse is 2/5 of the height of the rectangular palm area to be segmented, and the semi-minor axis is 1/4 of that height. The position of the ellipse center and the lengths of its axes determine the position of the image to be cut, i.e., the thumb part, and so depend on the numbers of rows and columns of the palm image; the thumb positions of the left and right hands differ, so the ellipse center position differs as well. For a right hand, the x-coordinate of the ellipse center is the number of columns of the image and the y-coordinate is 4/5 of the number of rows; for a left hand, the x-coordinate of the ellipse center is 0 and the y-coordinate is 4/5 of the number of rows.
In a specific application scenario, after the ellipse corresponding to the thumb is determined, in order to cut off the part of the rectangular palm area with disordered texture, the method of cutting off the part of the rectangular palm area that overlaps the elliptical to-be-cut region may be: the pixels of the intersection of this ellipse with the rectangular palm area are set to 0, i.e., the thumb part is cut off.
Correspondingly, the method of determining the user identity corresponding to the image to be recognized, according to the user identity of the user palmprint image whose palmprint similarity is greater than or equal to the preset threshold, may be: the features of the palmprint image with the thumb removed are obtained by machine learning and compared against the palmprint images previously entered by users. The features of the two palmprint images are matched using MobileNet to extract the palmprint features. MobileNet extracts the palmprint feature vectors, and whether the two palmprint images to be compared match is judged by computing the cosine similarity of their feature vectors (the Euclidean distance can also be used to judge the distance between the two). If the cosine similarity of the two feature vectors reaches the preset threshold, they match; otherwise they do not, thereby realizing verification of the user's identity.
Computing the cosine similarity of the feature vectors of two palmprint images judges the degree of similarity of the two palmprint images by computing the cosine of the angle between the vectors; the closer the cosine value is to 1, the more similar the two palmprint images. The cosine value is computed as:
$$\cos(x, y) = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^2}\,\sqrt{\sum_{i=1}^{n} y_i^2}}$$
where x and y are the feature vector of the palmprint image with the thumb removed and the feature vector of the palmprint image entered by the user, and n is the number of feature-vector components of the two palmprint images. Comparing the similarity of two palmprint images can also be achieved by comparing the Euclidean distances between multiple pairs of points; for example, if two points on the two-dimensional plane corresponding to the palmprint images have coordinates a(x1, y1) and b(x2, y2), the Euclidean distance between a and b can be computed as:
$$d(a, b) = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$$
With the above hand image segmentation method, the hand recognition model can be created based on the Faster R-CNN algorithm. During model training with the sample images labeled with coordinate positions, the first preset threshold is used to judge whether the recognition result of a single proposal window meets the preset standard, and the first predetermined number condition is used to further judge whether the hand recognition model training passes verification; this double-threshold approach makes the model training process more precise. When model training falls short of the expected standard, proposal windows whose evaluation scores are below the first preset threshold are corrected according to the actually labeled coordinate positions of the hand images, so that the model better meets users' needs. When the hand recognition model is put into use, the detection result is again constrained by the double thresholds of the second preset threshold and the second predetermined number condition, making the final conclusion as to whether the image to be recognized contains a hand image more convincing and accurate; finally, the detected hand image segmentation result is displayed intuitively. The whole technical solution performs detection and analysis through the recognition model; in non-extreme environments it can effectively resist the influence of illumination and of the hand tilting up, down, left, or right, so it accurately judges whether the detected image contains a hand image, accurately locates the area to which the hand belongs, and performs precise segmentation, enhancing the accuracy and scientific soundness of the hand image segmentation result.
Further, as a specific embodiment of the methods shown in FIG. 1 and FIG. 2, an embodiment of this application provides a hand image segmentation apparatus. As shown in FIG. 5, the apparatus includes: an acquisition module 31, a labeling module 32, a training module 33, a detection module 34, and an output module 35.
The acquisition module 31 can be used to collect sample images containing complete hand images;
the labeling module 32 can be used to label the coordinate position of the hand area in the sample images;
the training module 33 can be used to use the sample images labeled with the coordinate position as a training set and train, based on the Faster R-CNN algorithm, a hand recognition model whose training result meets a preset standard;
the detection module 34 can be used to use the hand recognition model to detect whether the image to be recognized contains a hand image;
the output module 35 can be used to output the hand image segmentation result of the image to be recognized according to the detection result.
In a specific application scenario, to accurately label the coordinate position of the hand area in the sample images, the labeling module 32 can further be used to create an image coordinate system with the upper left corner of the sample image as the origin; in the image coordinate system, determine the four-point coordinate positions of the thumb tip, little finger tip, and middle finger tip of the stretched hand and the base of the palm; determine the rectangular frame where the hand is located from the abscissas of the thumb tip and little finger tip and the ordinates of the middle finger tip and the palm base; and mark the coordinate positions of the upper left and lower right corners of the rectangular frame.
Correspondingly, to train a hand recognition model whose training result meets the preset standard, the training module 33 can further be used to input the sample images labeled with the coordinate position into the initial hand recognition model created in advance based on the Faster R-CNN algorithm; use the deep convolutional neural network CNN of the initial hand recognition model to extract the hand image features of the sample images; have the region proposal network RPN generate a predetermined number of proposal windows (proposals) based on the hand image features; map each proposal window onto the last convolutional feature map of the CNN to generate a fixed size; and use the detection classification probability (Softmax Loss) and the detection bounding-box regression (Smooth L1 Loss) to train and correct the detection images in each proposal window so that the initial hand recognition model meets the preset standard.
In a specific application scenario, to train and correct the detection images in each proposal window, the training module 33 can further be used to match the detection image against the hand image in the rectangular frame with labeled coordinate positions for degree of overlap; take the overlap value as the corresponding evaluation score; if the number of proposal windows whose evaluation score is greater than or equal to the first preset threshold meets the first preset number ratio, determine that the training result of the initial hand recognition model reaches the preset standard; and if it does not, correct the proposal windows whose evaluation score is below the first preset threshold according to the actually labeled coordinate position of the hand image.
Correspondingly, to use the trained hand recognition model to detect whether the image to be recognized contains a hand image, the detection module 34 can further be used to upload the image to be recognized into the hand recognition model; use the Faster R-CNN algorithm to determine the predetermined number of proposal windows corresponding to the image to be recognized and their respective detection probabilities, where the detection probability is the judging datum for whether a proposal window contains a hand image; if the number of proposal windows whose detection probability is greater than or equal to the second preset threshold meets the second preset number ratio, determine that the image to be recognized contains a hand image; and if it does not, determine that the image to be recognized does not contain a hand image.
In a specific application scenario, to output the hand image segmentation result of the image to be recognized, the output module 35 can further be used to select, if it is determined that the image to be recognized contains a hand image, the proposal window with the highest detection score as the optimal detection image; output the optimal detection image as the hand image segmentation result; and, if it is determined that the image to be recognized does not contain a hand image, output on the display page prompt information indicating that no hand image was detected.
In a specific application scenario, to realize the application of the segmented hand image, as shown in FIG. 6 the apparatus further includes: an extraction module 36, an interception module 37, a determination module 38, a cutting module 39, a comparison module 310, and a judgment module 311.
The extraction module 36 can be used to extract, with a key point detection algorithm, the fingertip, finger-root, and palm-center positions of the hand in the optimal detection image;
the interception module 37 can be used to intercept the rectangular palm area of the optimal detection image according to the finger-root and palm-center positions;
the determination module 38 can be used to determine the elliptical to-be-cut region of the thumb using the fingertip positions;
the cutting module 39 can be used to cut off the part of the rectangular palm area that overlaps the elliptical to-be-cut region;
the comparison module 310 can be used to compare the palmprint similarity of the cut rectangular palm area against pre-stored user palmprint images;
the judgment module 311 can be used to determine the user identity corresponding to the image to be recognized according to the user identity of the user palmprint image whose palmprint similarity is greater than or equal to a preset threshold.
It should be noted that, for other corresponding descriptions of the functional units involved in the hand image segmentation apparatus provided by this embodiment, reference may be made to the corresponding descriptions for FIG. 1 and FIG. 2, which are not repeated here.
Based on the methods shown in FIG. 1 and FIG. 2, an embodiment of this application correspondingly also provides a non-volatile readable storage medium on which computer-readable instructions are stored; when executed by a processor, the computer-readable instructions implement the hand image segmentation method shown in FIG. 1 and FIG. 2.
Based on this understanding, the technical solution of this application can be embodied in the form of a software product, which can be stored on a non-volatile storage medium (which may be a CD-ROM, USB flash drive, removable hard disk, or the like) and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods of the various implementation scenarios of this application.
Based on the methods shown in FIG. 1 and FIG. 2 and the virtual apparatus embodiments shown in FIG. 5 and FIG. 6, and to achieve the above purpose, an embodiment of this application also provides a computer device, which may specifically be a personal computer, a server, a network device, or the like. The physical device includes a non-volatile readable storage medium and a processor: the non-volatile readable storage medium stores computer-readable instructions, and the processor executes the computer-readable instructions to implement the hand image segmentation method shown in FIG. 1 and FIG. 2.
Optionally, the computer device may also include a user interface, a network interface, a camera, a radio frequency (RF) circuit, sensors, an audio circuit, a WI-FI module, and so on. The user interface may include a display screen (Display), an input unit such as a keyboard (Keyboard), and the like; an optional user interface may also include a USB interface, a card reader interface, and the like. The network interface may optionally include a standard wired interface, a wireless interface (such as a Bluetooth interface or a WI-FI interface), and the like.
Those skilled in the art will understand that the computer device structure provided in this embodiment does not constitute a limitation of the physical device, which may include more or fewer components, combine certain components, or arrange the components differently.
The non-volatile readable storage medium may also include an operating system and a network communication module. The operating system is a program that manages the hardware and software resources of the physical device executing the hand image segmentation method, and it supports the running of the information processing program and other software and/or programs. The network communication module is used to implement communication between the components inside the non-volatile readable storage medium as well as communication with other hardware and software in the physical device.
From the description of the above embodiments, those skilled in the art will clearly understand that this application can be implemented by means of software plus a necessary general-purpose hardware platform, or by hardware. By applying the technical solution of this application, and compared with the current prior art, this application can create the hand recognition model based on the Faster R-CNN algorithm, and during model training with the sample images labeled with coordinate positions, use the first preset threshold to judge whether the recognition result of a single proposal window meets the preset standard and the first predetermined number condition to further judge whether the hand recognition model training passes verification. The double-threshold approach makes the model training process more precise; when training falls short of the expected standard, proposal windows whose evaluation scores are below the first preset threshold are corrected according to the actually labeled coordinate positions of the hand images, so that the model better meets users' needs. When the hand recognition model is put into use, the detection result is again constrained by the double thresholds of the second preset threshold and the second predetermined number condition, making the final conclusion as to whether the image to be recognized contains a hand image more convincing and accurate; finally, the detected hand image segmentation result is displayed intuitively. The whole technical solution performs detection and analysis through the recognition model; in non-extreme environments it can effectively resist the influence of illumination and of the hand tilting up, down, left, or right, so it accurately judges whether the detected image contains a hand image, accurately locates the area to which the hand belongs, and performs precise segmentation, enhancing the accuracy and scientific soundness of the hand image segmentation result.
Those skilled in the art will understand that the drawings are only schematic diagrams of a preferred implementation scenario, and that the modules or flows in the drawings are not necessarily required for implementing this application. Those skilled in the art will also understand that the modules in the apparatus of an implementation scenario may be distributed in the apparatus as described, or may be located, with corresponding changes, in one or more apparatuses different from that of this implementation scenario. The modules of the above implementation scenario may be combined into one module or further split into multiple sub-modules.
The above serial numbers of this application are for description only and do not represent the superiority or inferiority of the implementation scenarios. The above disclosure is only a few specific implementation scenarios of this application; however, this application is not limited thereto, and any variation conceivable by those skilled in the art shall fall within the protection scope of this application.

Claims (20)

  1. A hand image segmentation method, comprising:
    collecting sample images containing complete hand images;
    labeling the coordinate position of the hand area in the sample images;
    using the sample images labeled with the coordinate position as a training set, and training, based on the Faster R-CNN algorithm, a hand recognition model whose training result meets a preset standard;
    using the hand recognition model to detect whether an image to be recognized contains a hand image;
    outputting the hand image segmentation result of the image to be recognized according to the detection result.
  2. The method according to claim 1, wherein labeling the coordinate position of the hand area in the sample images specifically comprises:
    creating an image coordinate system with the upper left corner of the sample image as the origin;
    in the image coordinate system, determining the four-point coordinate positions of the thumb tip, little finger tip, and middle finger tip of the stretched hand and the base of the palm;
    determining the rectangular frame where the hand is located from the abscissas of the thumb tip and the little finger tip and the ordinates of the middle finger tip and the palm base;
    marking the coordinate positions of the upper left and lower right corners of the rectangular frame.
  3. The method according to claim 2, wherein using the sample images labeled with the coordinate position as a training set and training, based on the Faster R-CNN algorithm, a hand recognition model whose training result meets a preset standard specifically comprises:
    inputting the sample images labeled with the coordinate position into an initial hand recognition model created in advance based on the Faster R-CNN algorithm;
    using the deep convolutional neural network CNN of the initial hand recognition model to extract the hand image features of the sample images;
    based on the hand image features, generating, by the region proposal network RPN, a predetermined number of proposal windows (proposals);
    mapping each proposal window onto the last convolutional feature map of the CNN to generate a fixed size;
    using the detection classification probability (Softmax Loss) and the detection bounding-box regression (Smooth L1 Loss) to train and correct the detection images in each proposal window, so that the initial hand recognition model meets the preset standard.
  4. The method according to claim 3, wherein using the detection classification probability (Softmax Loss) and the detection bounding-box regression (Smooth L1 Loss) to train and correct the detection images in each proposal window, so that the initial hand recognition model meets the preset standard, specifically comprises:
    matching the detection image against the hand image in the rectangular frame with labeled coordinate positions for degree of overlap;
    taking the overlap value as the corresponding evaluation score;
    if the number of evaluation scores in the proposal windows that are greater than or equal to a first preset threshold meets a first predetermined number condition, determining that the training result of the initial hand recognition model reaches the preset standard;
    if the number of evaluation scores in the proposal windows that are greater than or equal to the first preset threshold does not meet the first predetermined number condition, correcting the proposal windows whose evaluation score is below the first preset threshold according to the actually labeled coordinate position of the hand image.
  5. The method according to claim 4, wherein using the hand recognition model to detect whether the image to be recognized contains a hand image specifically comprises:
    uploading the image to be recognized into the hand recognition model;
    using the Faster R-CNN algorithm to determine the predetermined number of proposal windows corresponding to the image to be recognized and their respective detection probabilities, wherein the detection probability is the judging datum for whether a proposal window contains a hand image;
    if the number of proposal windows whose detection probability is greater than or equal to a second preset threshold meets a second predetermined number condition, determining that the image to be recognized contains a hand image;
    if the number of proposal windows whose detection probability is greater than or equal to the second preset threshold does not meet the second predetermined number condition, determining that the image to be recognized does not contain a hand image.
  6. The method according to claim 5, wherein outputting the hand image segmentation result of the image to be recognized according to the detection result specifically comprises:
    if it is determined that the image to be recognized contains the hand image, selecting the proposal window with the highest detection probability as the optimal detection image;
    outputting the optimal detection image as the hand image segmentation result;
    if it is determined that the image to be recognized does not contain the hand image, outputting, on the display page, prompt information indicating that the hand image was not detected.
  7. The method according to claim 1, wherein, if it is determined that the image to be recognized contains a hand image, after outputting the hand image segmentation result of the image to be recognized according to the detection result, the method further comprises:
    extracting, with a key point detection algorithm, the fingertip, finger-root, and palm-center positions of the hand in the optimal detection image;
    intercepting the rectangular palm area in the optimal detection image according to the finger-root and palm-center positions;
    determining the elliptical to-be-cut region of the thumb using the fingertip positions;
    cutting off the part of the rectangular palm area that overlaps the elliptical to-be-cut region;
    comparing the palmprint similarity of the cut rectangular palm area against pre-stored user palmprint images;
    determining the user identity corresponding to the image to be recognized according to the user identity of the user palmprint image whose palmprint similarity is greater than or equal to a preset threshold.
  8. A hand image segmentation apparatus, comprising:
    an acquisition module, configured to collect sample images containing complete hand images;
    a labeling module, configured to label the coordinate position of the hand area in the sample images;
    a training module, configured to use the sample images labeled with the coordinate position as a training set and train, based on the Faster R-CNN algorithm, a hand recognition model whose training result meets a preset standard;
    a detection module, configured to use the hand recognition model to detect whether an image to be recognized contains a hand image;
    an output module, configured to output the hand image segmentation result of the image to be recognized according to the detection result.
  9. The apparatus according to claim 8, wherein the labeling module is specifically configured to create an image coordinate system with the upper left corner of the sample image as the origin; in the image coordinate system, determine the four-point coordinate positions of the thumb tip, little finger tip, and middle finger tip of the stretched hand and the base of the palm; determine the rectangular frame where the hand is located from the abscissas of the thumb tip and the little finger tip and the ordinates of the middle finger tip and the palm base; and mark the coordinate positions of the upper left and lower right corners of the rectangular frame.
  10. The apparatus according to claim 9, wherein the training module is specifically configured to input the sample images labeled with the coordinate position into an initial hand recognition model created in advance based on the Faster R-CNN algorithm; use the deep convolutional neural network CNN of the initial hand recognition model to extract the hand image features of the sample images; based on the hand image features, generate, by the region proposal network RPN, a predetermined number of proposal windows (proposals); map each proposal window onto the last convolutional feature map of the CNN to generate a fixed size; and use the detection classification probability (Softmax Loss) and the detection bounding-box regression (Smooth L1 Loss) to train and correct the detection images in each proposal window, so that the initial hand recognition model meets the preset standard.
  11. The apparatus according to claim 10, wherein the training module is specifically configured to match the detection image against the hand image in the rectangular frame with labeled coordinate positions for degree of overlap; take the overlap value as the corresponding evaluation score; if the number of evaluation scores in the proposal windows that are greater than or equal to a first preset threshold meets a first predetermined number condition, determine that the training result of the initial hand recognition model reaches the preset standard; and if it does not, correct the proposal windows whose evaluation score is below the first preset threshold according to the actually labeled coordinate position of the hand image.
  12. The apparatus according to claim 11, wherein the detection module is specifically configured to upload the image to be recognized into the hand recognition model; use the Faster R-CNN algorithm to determine the predetermined number of proposal windows corresponding to the image to be recognized and their respective detection probabilities, wherein the detection probability is the judging datum for whether a proposal window contains a hand image; if the number of proposal windows whose detection probability is greater than or equal to a second preset threshold meets a second predetermined number condition, determine that the image to be recognized contains a hand image; and if it does not, determine that the image to be recognized does not contain a hand image.
  13. The apparatus according to claim 12, wherein the output module is specifically configured to select, if it is determined that the image to be recognized contains the hand image, the proposal window with the highest detection probability as the optimal detection image; output the optimal detection image as the hand image segmentation result; and, if it is determined that the image to be recognized does not contain the hand image, output, on the display page, prompt information indicating that the hand image was not detected.
  14. The apparatus according to claim 8, further comprising: an extraction module, an interception module, a determination module, a cutting module, a comparison module, and a judgment module;
    the extraction module is configured to extract, with a key point detection algorithm, the fingertip, finger-root, and palm-center positions of the hand in the optimal detection image;
    the interception module is configured to intercept the rectangular palm area in the optimal detection image according to the finger-root and palm-center positions;
    the determination module is configured to determine the elliptical to-be-cut region of the thumb using the fingertip positions;
    the cutting module is configured to cut off the part of the rectangular palm area that overlaps the elliptical to-be-cut region;
    the comparison module is configured to compare the palmprint similarity of the cut rectangular palm area against pre-stored user palmprint images;
    the judgment module is configured to determine the user identity corresponding to the image to be recognized according to the user identity of the user palmprint image whose palmprint similarity is greater than or equal to a preset threshold.
  15. A non-volatile readable storage medium on which computer-readable instructions are stored, wherein, when executed by a processor, the computer-readable instructions implement a hand image segmentation method, comprising:
    collecting sample images containing complete hand images; labeling the coordinate position of the hand area in the sample images; using the sample images labeled with the coordinate position as a training set, and training, based on the Faster R-CNN algorithm, a hand recognition model whose training result meets a preset standard; using the hand recognition model to detect whether an image to be recognized contains a hand image; outputting the hand image segmentation result of the image to be recognized according to the detection result.
  16. The non-volatile readable storage medium according to claim 15, wherein, when executed by a processor, the computer-readable instructions implement labeling the coordinate position of the hand area in the sample images by: creating an image coordinate system with the upper left corner of the sample image as the origin; in the image coordinate system, determining the four-point coordinate positions of the thumb tip, little finger tip, and middle finger tip of the stretched hand and the base of the palm; determining the rectangular frame where the hand is located from the abscissas of the thumb tip and the little finger tip and the ordinates of the middle finger tip and the palm base; and marking the coordinate positions of the upper left and lower right corners of the rectangular frame.
  17. The non-volatile readable storage medium according to claim 16, wherein, when executed by a processor, the computer-readable instructions implement using the sample images labeled with the coordinate position as a training set and training, based on the Faster R-CNN algorithm, a hand recognition model whose training result meets a preset standard by: inputting the sample images labeled with the coordinate position into an initial hand recognition model created in advance based on the Faster R-CNN algorithm; using the deep convolutional neural network CNN of the initial hand recognition model to extract the hand image features of the sample images; based on the hand image features, generating, by the region proposal network RPN, a predetermined number of proposal windows (proposals); mapping each proposal window onto the last convolutional feature map of the CNN to generate a fixed size; and using the detection classification probability (Softmax Loss) and the detection bounding-box regression (Smooth L1 Loss) to train and correct the detection images in each proposal window, so that the initial hand recognition model meets the preset standard.
  18. A computer device, comprising a non-volatile readable storage medium, a processor, and computer-readable instructions stored on the non-volatile readable storage medium and executable on the processor, wherein, when executing the computer-readable instructions, the processor implements a hand image segmentation method, comprising:
    collecting sample images containing complete hand images; labeling the coordinate position of the hand area in the sample images; using the sample images labeled with the coordinate position as a training set, and training, based on the Faster R-CNN algorithm, a hand recognition model whose training result meets a preset standard; using the hand recognition model to detect whether an image to be recognized contains a hand image; outputting the hand image segmentation result of the image to be recognized according to the detection result.
  19. The computer device according to claim 18, wherein, when executed by the processor, the computer-readable instructions implement labeling the coordinate position of the hand area in the sample images by: creating an image coordinate system with the upper left corner of the sample image as the origin; in the image coordinate system, determining the four-point coordinate positions of the thumb tip, little finger tip, and middle finger tip of the stretched hand and the base of the palm; determining the rectangular frame where the hand is located from the abscissas of the thumb tip and the little finger tip and the ordinates of the middle finger tip and the palm base; and marking the coordinate positions of the upper left and lower right corners of the rectangular frame.
  20. The computer device according to claim 19, wherein, when executed by the processor, the computer-readable instructions implement using the sample images labeled with the coordinate position as a training set and training, based on the Faster R-CNN algorithm, a hand recognition model whose training result meets a preset standard by: inputting the sample images labeled with the coordinate position into an initial hand recognition model created in advance based on the Faster R-CNN algorithm; using the deep convolutional neural network CNN of the initial hand recognition model to extract the hand image features of the sample images; based on the hand image features, generating, by the region proposal network RPN, a predetermined number of proposal windows (proposals); mapping each proposal window onto the last convolutional feature map of the CNN to generate a fixed size; and using the detection classification probability (Softmax Loss) and the detection bounding-box regression (Smooth L1 Loss) to train and correct the detection images in each proposal window, so that the initial hand recognition model meets the preset standard.
PCT/CN2019/103140 2019-04-26 2019-08-28 Hand image segmentation method, apparatus, and computer device WO2020215565A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910345761.3A CN110232311B (zh) 2019-04-26 2019-04-26 Hand image segmentation method, apparatus, and computer device
CN201910345761.3 2019-04-26

Publications (1)

Publication Number Publication Date
WO2020215565A1 true WO2020215565A1 (zh) 2020-10-29

Family

ID=67860929

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/103140 WO2020215565A1 (zh) 2019-04-26 2019-08-28 Hand image segmentation method, apparatus, and computer device

Country Status (2)

Country Link
CN (1) CN110232311B (zh)
WO (1) WO2020215565A1 (zh)



Also Published As

Publication number Publication date
CN110232311B (zh) 2023-11-14
CN110232311A (zh) 2019-09-13

Legal Events

  • 121 (EP): the EPO has been informed by WIPO that EP was designated in this application. Ref document: 19925660; country: EP; kind code: A1.
  • NENP: non-entry into the national phase. Ref country code: DE.
  • 32PN (EP): public notification in the EP bulletin as the address of the addressee cannot be established. Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 25.02.2022).
  • 122 (EP): PCT application non-entry in the European phase. Ref document: 19925660; country: EP; kind code: A1.