CN110287891B - Gesture control method and device based on human body key points and electronic equipment

Info

Publication number
CN110287891B
CN110287891B
Authority
CN
China
Prior art keywords
target object
key point
video file
gesture
key points
Prior art date
Legal status
Active
Application number
CN201910563241.XA
Other languages
Chinese (zh)
Other versions
CN110287891A (en)
Inventor
黄佳斌
Current Assignee
Douyin Vision Co Ltd
Douyin Vision Beijing Co Ltd
Original Assignee
Beijing ByteDance Network Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing ByteDance Network Technology Co Ltd filed Critical Beijing ByteDance Network Technology Co Ltd
Priority to CN201910563241.XA priority Critical patent/CN110287891B/en
Publication of CN110287891A publication Critical patent/CN110287891A/en
Application granted granted Critical
Publication of CN110287891B publication Critical patent/CN110287891B/en

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 - Scenes; Scene-specific elements
    • G06V 20/40 - Scenes; Scene-specific elements in video content
    • G06V 40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/20 - Movements or behaviour, e.g. gesture recognition
    • G06V 40/28 - Recognition of hand or arm movements, e.g. recognition of deaf sign language

Abstract

The embodiments of the present disclosure provide a gesture control method and apparatus based on human body key points, and an electronic device, belonging to the technical field of image processing. The method comprises the following steps: acquiring a video file generated by a camera device for a target object, wherein the video file records one or more actions of the target object; performing key point detection on the target object in the video file to obtain a key point set of the target object; when a plurality of necessary key points exist in the key point set, further acquiring a gesture action of the target object at a preset position; and determining a gesture instruction issued by the target object based on the result of parsing the gesture action. The processing scheme of the present disclosure improves the accuracy of gesture control.

Description

Gesture control method and device based on human body key points and electronic equipment
Technical Field
The present disclosure relates to the field of image processing technologies, and in particular, to a gesture control method and apparatus based on human body key points, and an electronic device.
Background
Image processing, also known as picture processing, is a technique that uses a computer to transform an image into a desired result. Digital image processing has been in common use since the 20th century. The main content of image processing technology comprises three parts: image compression; enhancement and restoration; and matching, description, and recognition. Common processes include image digitization, image coding, image enhancement, image restoration, image segmentation, and image analysis. Image processing uses a computer to process image information so as to satisfy human visual psychology or the needs of an application. It is widely applied, mainly in surveying and mapping, atmospheric science, astronomy, image beautification, and image recognition.
One application of image processing is gesture recognition. A gesture interaction technology based on computer vision uses machine vision to process and recognize a sequence of gesture images collected by a camera so as to interact with a computer: the camera collects gesture information, a skin color model segments the human hand region to realize gesture detection and recognition, and an interframe difference method then tracks the moving gesture. The effectiveness of this approach depends on the accuracy of the skin color model; because skin color varies from person to person, a universal and efficient skin color model is difficult to obtain. In addition, when the hand moves at a non-uniform speed, tracking the gesture with the interframe difference method can be interrupted, so that the tracked gesture is lost.
Disclosure of Invention
In view of this, embodiments of the present disclosure provide a method and an apparatus for controlling a gesture based on a human body key point, and an electronic device, which at least partially solve the problems in the prior art.
In a first aspect, an embodiment of the present disclosure provides a gesture control method based on human body key points, including:
acquiring a video file generated by a camera device for a target object, wherein the video file records one or more actions of the target object;
performing key point detection on a target object in the video file to obtain a key point set of the target object;
when a plurality of necessary key points exist in the key point set, further acquiring gesture actions of the target object at a preset position;
and determining a gesture instruction sent by the target object based on the analysis result aiming at the gesture action.
According to a specific implementation manner of the embodiment of the present disclosure, the acquiring a video file generated by an image capturing apparatus for a target object includes:
performing subtraction operation on pixel values of pixel points corresponding to adjacent frame images formed in the camera equipment to obtain a pixel difference matrix;
judging whether the average value of the pixel difference matrix is larger than a preset threshold value or not;
and if so, storing the adjacent frame image as a video frame in the video file.
According to a specific implementation manner of the embodiment of the present disclosure, when there are a plurality of necessary key points in the key point set, before further acquiring a gesture action of a target object at a preset position, the method further includes:
detecting the position state of the target object based on the detected key point set;
and performing prompt operation on the target object based on the detected position state of the target object.
According to a specific implementation manner of the embodiment of the present disclosure, the performing a prompt operation on the target object based on the detected position state of the target object includes:
judging whether shoulder key points and hand key points of a target object exist in video frames of the video file;
if not, prompting the target object to change the current position of the target object until the shoulder key point and the hand key point of the target object appear in the video frame of the video file.
According to a specific implementation manner of the embodiment of the present disclosure, the performing a prompt operation on the target object based on the detected position state of the target object includes:
judging whether the target object is located at a preset position in the middle of a video frame or not based on a shoulder key point of the target object detected in the video frame of the video file;
if so, prompting the target object to execute a preset action after the upper body key point of the target object appears in the video frame.
According to a specific implementation manner of the embodiment of the present disclosure, when there are a plurality of necessary key points in the key point set, further acquiring a gesture action of a target object at a preset position, includes:
judging whether a preset position area has a hand key point or not;
if yes, further acquiring the moving distance of the hand key point in a preset time period;
and judging the gesture action of the target object at the preset position based on the moving distance.
According to a specific implementation manner of the embodiment of the present disclosure, the performing of the key point detection on the target object in the video file includes:
converting the video frame image in the video file into a gray level image;
performing edge detection on the gray level image to obtain an edge contour of the target object;
based on the edge profile, a set of keypoints for the target object is determined.
According to a specific implementation manner of the embodiment of the present disclosure, the determining the set of key points of the target object based on the edge contour includes:
selecting a plurality of structural elements with different orientations;
performing detail matching on the gray level image by using each structural element in a plurality of structural elements to obtain a filtered image;
determining a gray scale edge of the filtered image to obtain a number of pixels present in each of a plurality of gray scale levels in the filtered image;
weighting the number of pixels in each gray level, and taking the weighted gray average value as a threshold value;
carrying out binarization processing on the filtered image based on the threshold value;
and taking the image after the binarization processing as an edge image of the target object.
In a second aspect, an embodiment of the present disclosure provides a gesture control device based on human body key points, including:
the device comprises a first acquisition module, a second acquisition module and a third acquisition module, wherein the first acquisition module is used for acquiring a video file generated by the camera equipment aiming at a target object, and the video file records one or more actions of the target object;
the detection module is used for executing key point detection on the target object in the video file to obtain a key point set of the target object;
the second acquisition module is used for further acquiring the gesture action of the target object at the preset position when a plurality of necessary key points exist in the key point set;
and the determining module is used for determining a gesture instruction sent by the target object based on the analysis result aiming at the gesture action.
In a third aspect, an embodiment of the present disclosure further provides an electronic device, where the electronic device includes:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method in the foregoing first aspect or any implementation manner of the first aspect.
In a fourth aspect, the embodiments of the present disclosure further provide a non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute the human keypoint-based gesture control method in the foregoing first aspect or any implementation manner of the first aspect.
In a fifth aspect, the present disclosure also provides a computer program product, which includes a computer program stored on a non-transitory computer-readable storage medium, where the computer program includes program instructions, and when the program instructions are executed by a computer, the computer executes the method for controlling a gesture based on a human body key point in the foregoing first aspect or any implementation manner of the first aspect.
The gesture control scheme based on the human key points in the embodiment of the disclosure comprises the steps of obtaining a video file generated by a camera device for a target object, wherein one or more actions of the target object are recorded in the video file; performing key point detection on a target object in the video file to obtain a key point set of the target object; when a plurality of necessary key points exist in the key point set, further acquiring gesture actions of the target object at a preset position; and determining a gesture instruction sent by the target object based on the analysis result aiming at the gesture action. Through the scheme disclosed by the invention, the accuracy of gesture control based on the human key points is improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings needed to be used in the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present disclosure, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a schematic diagram of a gesture control process based on human key points according to an embodiment of the present disclosure;
fig. 2 is a schematic diagram of key points based on a human body according to an embodiment of the present disclosure;
fig. 3 is a schematic diagram illustrating another gesture control process based on human key points according to an embodiment of the present disclosure;
fig. 4 is a schematic view illustrating another gesture control process based on human key points according to an embodiment of the present disclosure;
fig. 5 is a schematic structural diagram of a gesture control device based on human body key points according to an embodiment of the present disclosure;
fig. 6 is a schematic diagram of an electronic device provided in an embodiment of the present disclosure.
Detailed Description
The embodiments of the present disclosure are described in detail below with reference to the accompanying drawings.
The embodiments of the present disclosure are described below with specific examples, and other advantages and effects of the present disclosure will be readily apparent to those skilled in the art from the disclosure in the specification. It is to be understood that the described embodiments are merely illustrative of some, and not restrictive, of the embodiments of the disclosure. The disclosure may be embodied or carried out in various other specific embodiments, and various modifications and changes may be made in the details within the description without departing from the spirit of the disclosure. It is to be noted that the features in the following embodiments and examples may be combined with each other without conflict. All other embodiments, which can be derived by a person skilled in the art from the embodiments disclosed herein without making any creative effort, shall fall within the protection scope of the present disclosure.
It is noted that various aspects of the embodiments are described below within the scope of the appended claims. It should be apparent that the aspects described herein may be embodied in a wide variety of forms and that any specific structure and/or function described herein is merely illustrative. Based on the disclosure, one skilled in the art should appreciate that one aspect described herein may be implemented independently of any other aspects and that two or more of these aspects may be combined in various ways. For example, an apparatus may be implemented and/or a method practiced using any number of the aspects set forth herein. Additionally, such an apparatus may be implemented and/or such a method may be practiced using other structure and/or functionality in addition to one or more of the aspects set forth herein.
It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present disclosure, and the drawings only show the components related to the present disclosure rather than the number, shape and size of the components in actual implementation, and the type, amount and ratio of the components in actual implementation may be changed arbitrarily, and the layout of the components may be more complicated.
In addition, in the following description, specific details are provided to facilitate a thorough understanding of the examples. However, it will be understood by those skilled in the art that the aspects may be practiced without these specific details.
The embodiment of the disclosure provides a gesture control method based on human key points. The gesture control method based on human body key points provided by the embodiment can be executed by a computing device, the computing device can be implemented as software, or implemented as a combination of software and hardware, and the computing device can be integrally arranged in a server, a terminal device and the like.
Referring to fig. 1, a gesture control method based on human key points provided by the embodiments of the present disclosure includes:
s101, acquiring a video file generated by an image pickup device aiming at a target object, wherein the video file records one or more actions of the target object.
The image pickup apparatus is used to capture various motions of a target object, form the motions of the target object into a picture or a video image, and thereby form a video file. The camera device may be various devices including a camera, for example, the camera device may be a mobile phone, or the camera device may be another electronic device having a camera function.
Besides being capable of performing video recording or photographing operations, the image pickup apparatus can also run various application programs as an electronic apparatus, and through the application programs, the image pickup apparatus can recognize specific meanings of various actions contained in a video file so as to analyze specific action instructions.
The target object is the object photographed by the camera device, and may be a person or another animal or object capable of producing motion. The target object expresses a specific action instruction by producing a specific action. As an example, a target object may express a response to an operation in an application program (e.g., a game) through a finger-flick action. Through one or more actions recorded by the camera device in the video file, the target object can complete an interaction with the application program.
The video file is a recording file formed by the image pickup apparatus for one or more actions of the target object, and as a case may be, the image pickup apparatus acquires the one or more actions of the target object in real time, and the video file also records the one or more actions of the target object in real time at the same time.
The video file can be composed of a plurality of video frames. As one application scenario, to save system resources, target detection can be performed in real time on each newly generated video frame. By performing target detection on a newly generated video frame, it can be determined whether the target object exists in that frame, and thus whether further data processing needs to be performed on it. When target detection finds that the target object does not exist in a video frame, that frame need not be stored in the video file, thereby further reducing the video file's occupation of system resources.
S102, performing key point detection on the target object in the video file to obtain a key point set of the target object.
The video file comprises a plurality of video frames, and the key point set of the target object in the video file can be further obtained by performing key point detection on the video frames in the video file.
Referring to fig. 2, taking a human body as an example, in order to describe various motions of a target object (a human body), the target object may be represented by a plurality of human body key points. Describing the human body through its key points determines the basic shape of a motion, so that motions of the target object can be recognized based on the different motion shapes of the human body.
As an example, referring to fig. 2, the key point set may contain key points of different parts of the human body; for example, it may include hip key points P12, P3 and P16, and shoulder key points P4, P2 and P8.
The target object presents different human body regions in the video frames of the video file. After a video frame is formed, key point detection can be performed in real time on the video frame acquired by the camera device, so as to obtain a plurality of key points in different regions of the target object and form the key point set. Key point detection of the target object on a video frame can be performed using methods such as CPM (Convolutional Pose Machines) and PAF (Part Affinity Fields). The method of detecting the key points is not limited here.
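Neither CPM nor PAF is mandated by the disclosure; as one hedged illustration of producing a key point set from a video frame, the Python sketch below uses the off-the-shelf MediaPipe Pose estimator (an assumption for illustration, not the method of this patent; the frame path is hypothetical):
```python
import cv2
import mediapipe as mp

mp_pose = mp.solutions.pose

# Run a general-purpose pose estimator on one frame to obtain a key point set.
with mp_pose.Pose(static_image_mode=False) as pose:
    frame = cv2.imread("frame.jpg")  # hypothetical frame from the video file
    results = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    keypoints = {}
    if results.pose_landmarks:
        for idx, lm in enumerate(results.pose_landmarks.landmark):
            # Normalized coordinates plus a visibility score per key point.
            keypoints[mp_pose.PoseLandmark(idx).name] = {
                "xy": (lm.x, lm.y), "score": lm.visibility
            }
    print(sorted(keypoints))  # e.g. LEFT_SHOULDER, RIGHT_WRIST, ...
```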
S103, when a plurality of necessary key points exist in the key point set, further acquiring gesture actions of the target object at a preset position.
After the key point set of the target object in the video frame is acquired in real time, the position and the posture of the target object relative to the camera device can be judged through the acquired key point set. For example, when the upper body of the target object appears in the picture captured by the imaging device, the head key point and the shoulder key point of the target object appear in the key point set. Thus, the posture of the target object can be determined by defining the necessary key points.
The essential key points refer to one or more key points which are set to be required to appear on the video frames of the video file according to the requirements of the image posture of the target object in the video frames. For example, for a scene that requires the target object to appear entirely on a video frame, the essential keypoints may be all human keypoints of the target object. For scenes that only require the upper half of the target object to appear on the video frame, the essential keypoints may be the set of head and shoulder keypoints of the target object. Based on different scene needs, necessary key points can be specifically set. As an application scenario, the necessary key points may include a head key point, a shoulder key point, and a hand key point of the target object, in which case, the gesture motion of the user may be effectively recognized.
Once the necessary key points of the target object appear in the video frame, the position and posture of the target object meet the shooting requirements of the camera device, and the gesture action of the target object at the preset position can be further acquired.
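As a minimal sketch of this necessary-key-point gate, the Python snippet below checks a detected key point set against a required set before gesture acquisition proceeds; the key point names and the confidence threshold are illustrative assumptions, not values prescribed by the disclosure.
```python
REQUIRED_KEYPOINTS = {"head", "left_shoulder", "right_shoulder",
                      "left_hand", "right_hand"}

def necessary_keypoints_present(detected: dict) -> bool:
    """Return True when every required key point was detected
    with sufficient confidence in the current video frame."""
    return all(
        name in detected and detected[name]["score"] > 0.5
        for name in REQUIRED_KEYPOINTS
    )

# Example: proceed to gesture acquisition only once the gate passes.
frame_keypoints = {
    "head": {"xy": (320, 80), "score": 0.93},
    "left_shoulder": {"xy": (250, 180), "score": 0.88},
    "right_shoulder": {"xy": (390, 182), "score": 0.90},
    "left_hand": {"xy": (120, 400), "score": 0.71},
    "right_hand": {"xy": (520, 398), "score": 0.69},
}
if necessary_keypoints_present(frame_keypoints):
    print("necessary key points present - acquire gesture at preset position")
```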
The preset position is an interaction area set by a specific application program for interaction, and as an example, the preset position may be a position at a lower left corner and a lower right corner in a video frame (i.e., a shooting field of view of a camera), and a left-hand motion of the target object is acquired in a region at the lower left corner, and a right-hand motion of the target object is acquired in a region at the lower right corner.
The action of the target object at the preset position can be any type of action. For example, the application program may be a pinball-style game: the user's finger makes a flicking gesture at the preset position, and the application program responds accordingly by acquiring the user's gesture action.
And S104, determining a gesture instruction sent by the target object based on the analysis result of the gesture action.
By analyzing the gesture action of the user captured in the video frames of the video file, the user's gesture can be parsed to obtain the user's gesture instruction. The analysis of the user's gesture action may adopt a common image recognition method, determining the gesture instruction issued by the target object by recognizing the shape of the user's gesture. In addition, the trajectory of the user's gesture can be acquired across a plurality of video frames, and the gesture instruction issued by the target object determined by analyzing the pattern of that trajectory. The scheme of the present disclosure does not limit the recognition method of the gesture instruction.
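As a hedged sketch of the trajectory-based variant, the snippet below collects the hand key point across several frames and classifies the dominant motion direction; the minimum-movement threshold and the direction labels are assumptions for illustration.
```python
def classify_trajectory(points, min_move=40.0):
    """points: [(x, y), ...] hand key point positions across consecutive frames."""
    if len(points) < 2:
        return None
    dx = points[-1][0] - points[0][0]
    dy = points[-1][1] - points[0][1]
    if (dx * dx + dy * dy) ** 0.5 < min_move:
        return None  # too little motion to count as a gesture
    if abs(dx) >= abs(dy):
        return "swipe_right" if dx > 0 else "swipe_left"
    return "swipe_down" if dy > 0 else "swipe_up"

print(classify_trajectory([(100, 300), (160, 302), (230, 305)]))  # swipe_right
```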
By the scheme, the action of the target object can be identified based on the key point of the target object, so that the accuracy of action identification of the target object is improved.
In order to further improve the processing efficiency of the video file, as an optional implementation manner of the embodiment of the present disclosure, referring to fig. 3, acquiring the video file generated by the image capturing apparatus for the target object may include the following steps:
and S301, performing subtraction operation on pixel values of pixel points corresponding to adjacent frame images formed in the image pickup equipment to obtain a pixel difference matrix.
Any two adjacent frames formed in the camera device can be selected and represented as two pixel matrices; the pixel difference matrix of the adjacent frame images is obtained by computing the difference of these two matrices.
S302, judging whether the average value of the pixel difference matrix is larger than a preset threshold value.
The pixel difference matrix can be used to represent the change in action between any two adjacent frames. When the content of the two adjacent frames has not changed, the mean of the pixel difference matrix is close to 0; otherwise, the values in the pixel difference matrix change to a certain extent. By comparing the mean of the pixel difference matrix with a preset threshold, whether there is a motion change between adjacent frame images can be judged.
And S303, if so, storing the adjacent frame image as a video frame in the video file.
When the mean value of the pixel difference matrix is larger than the preset threshold value, it can be considered that motion changes exist on the adjacent video frames, and at this time, the video frames with the motion changes can be stored in a video file for subsequent key point detection.
Through steps S301-S303, video frames without motion change can be kept out of the video file, reducing the burden of the subsequent key point detection on video frames.
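A minimal sketch of steps S301-S303, assuming OpenCV video I/O and grayscale conversion; the threshold value and the input path are illustrative assumptions.
```python
import cv2

MOTION_THRESHOLD = 4.0  # assumed preset threshold for the mean of the difference matrix

cap = cv2.VideoCapture("input.mp4")  # hypothetical input; a camera index also works
kept_frames = []                     # stands in for the stored video file

ok, prev = cap.read()
while ok:
    ok, frame = cap.read()
    if not ok:
        break
    # S301: subtract pixel values of adjacent frames to get the pixel difference matrix
    diff = cv2.absdiff(
        cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY),
        cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY),
    )
    # S302/S303: store the frame only when the mean difference exceeds the threshold
    if float(diff.mean()) > MOTION_THRESHOLD:
        kept_frames.append(frame)
    prev = frame
cap.release()
print(f"{len(kept_frames)} motion-bearing frames kept")
```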
As an optional implementation manner of the embodiment of the present disclosure, referring to fig. 4, when there are multiple necessary key points in the key point set, before further acquiring a gesture action of a target object at a preset position, the method further includes:
s401, detecting the position state of the target object based on the detected key point set.
By detecting the types of key points present in the key point set, the position state of the target object can be determined. For example, when the whole-body key points of the target object are included in the key point set, the whole body of the target object is within the shooting field of the camera device. When only the upper-body key points of the target object are included, the upper body of the target object is within the shooting field. Based on this, the position state of the target object relative to the camera device can be determined.
S402, based on the detected position state of the target object, prompting operation is carried out on the target object.
Based on the detected position state of the target object, a prompt operation can be performed on the target object so that it can adjust its position relative to the camera device.
As one case, it may be determined whether shoulder key points and hand key points of the target object exist in the video frames of the video file. If not, the target object is prompted to change its current position until its shoulder key points and hand key points appear in the video frames of the video file, as in the sketch that follows.
Alternatively, whether the target object is at a preset position in the middle of the video frame may be determined based on the shoulder key points of the target object detected in the video frames of the video file; if so, the target object is prompted to execute a preset action after its upper-body key points appear in the video frame.
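A sketch covering both prompt variants, assuming pixel-coordinate key points in the format of the earlier snippets; the frame size, centering tolerance, and prompt texts are illustrative assumptions.
```python
FRAME_W = 640       # assumed frame width in pixels
CENTER_TOL = 0.15   # assumed tolerance around the middle of the frame

def position_prompt(keypoints: dict):
    """Return a prompt string for the target object, or None when no prompt is needed."""
    needed = ("left_shoulder", "right_shoulder", "left_hand", "right_hand")
    if any(name not in keypoints for name in needed):
        # Case 1: shoulder/hand key points missing from the video frame.
        return "Please adjust your position until your shoulders and hands are in view."
    mid_x = (keypoints["left_shoulder"]["xy"][0] +
             keypoints["right_shoulder"]["xy"][0]) / 2.0
    if abs(mid_x - FRAME_W / 2.0) > CENTER_TOL * FRAME_W:
        # Case 2: not at the preset position in the middle of the frame.
        return "Please move toward the middle of the frame."
    return None  # upper body present and centered: prompt the preset action next
```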
As an optional implementation manner of the embodiment of the present disclosure, when a plurality of necessary key points exist in the key point set, the gesture action of the target object at the preset position is further acquired as follows: first, judge whether a hand key point exists in the preset position area; if so, further acquire the moving distance of the hand key point within a preset time period; and, based on the moving distance, judge the gesture action of the target object at the preset position. For example, when the moving distance is greater than a preset distance value, it can be determined that the gesture of the target object carries an action instruction.
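A hedged sketch of this preset-region / moving-distance judgment; the region coordinates, time window, and distance value are assumptions for illustration.
```python
import math
import time

PRESET_REGION = (0, 320, 200, 480)   # x0, y0, x1, y1: assumed lower-left corner area
WINDOW_S = 0.5                        # assumed preset time period in seconds
MIN_DISTANCE = 60.0                   # assumed preset distance value in pixels

history = []                          # (timestamp, (x, y)) samples of the hand key point

def in_region(pt, region=PRESET_REGION):
    x0, y0, x1, y1 = region
    return x0 <= pt[0] <= x1 and y0 <= pt[1] <= y1

def gesture_triggered(hand_pt, now=None):
    """Return True when the hand key point moved far enough inside the preset region."""
    now = time.monotonic() if now is None else now
    if not in_region(hand_pt):
        history.clear()               # hand left the preset area: reset the window
        return False
    history.append((now, hand_pt))
    # Keep only samples that fall inside the preset time period.
    while history and now - history[0][0] > WINDOW_S:
        history.pop(0)
    moved = math.dist(history[0][1], history[-1][1]) if len(history) > 1 else 0.0
    return moved > MIN_DISTANCE
```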
As an optional implementation manner of the embodiment of the present disclosure, the performing key point detection on the target object in the video file includes: converting the video frame image in the video file into a gray level image; performing edge detection on the gray level image to obtain an edge contour of the target object; based on the edge profile, a set of keypoints for the target object is determined.
Wherein, based on the edge contour, determining the set of key points of the target object may comprise the following steps:
first, a plurality of structural elements with different orientations are selected.
The target object can be detected through an edge detection operator. If the edge detection operator adopts only one structural element, the output image contains only one type of geometric information, which is not conducive to preserving image details. To ensure the accuracy of image detection, an edge detection operator containing multiple structural elements is selected.
Then, detail matching is performed on the gray level image using each structural element of the plurality of structural elements to obtain a filtered image.
By using multiple structural elements in different orientations, each serving as a scale for matching image details, the various details of the image can be adequately preserved while noise of different types and sizes is filtered out.
Next, the gray edge of the filtered image is computed to obtain the number of pixels present in each of a plurality of gray levels in the filtered image.
After filtering, to further reduce the amount of calculation, the filtered image may be converted into a gray scale image; by dividing the gray scale image into a plurality of gray levels, the number of pixels present in each gray level can be counted.
Next, the number of pixels in each gray level is weighted, and the weighted gray average value is used as a threshold.
For example, a larger weight is given to gray level values with a larger number of pixels, and a smaller weight to gray level values with a smaller number of pixels; the average of the weighted gray values is then computed, and this weighted mean gray value is taken as the threshold for the subsequent binarization of the gray image.
Next, binarization processing is performed on the filtered image based on the threshold value.
Based on the threshold, the filtered image may be binarized, for example by setting pixels larger than the threshold to 1 and pixels smaller than the threshold to 0.
And finally, taking the image after the binarization processing as an edge image of the target object.
The edge image of the target object is obtained by assigning corresponding colors to the binarized data, for example assigning pixels with a binarization value of 1 to black and pixels with a binarization value of 0 to white.
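Putting these steps together, the sketch below is a hedged Python rendering of the edge-image pipeline; the kernel shapes, the number of gray levels, and the count-proportional weighting are illustrative assumptions rather than values fixed by the disclosure.
```python
import cv2
import numpy as np

def edge_image(gray: np.ndarray, levels: int = 16) -> np.ndarray:
    # Structuring elements in several orientations (here: horizontal,
    # vertical, and two assumed diagonal line elements).
    kernels = [
        cv2.getStructuringElement(cv2.MORPH_RECT, (5, 1)),
        cv2.getStructuringElement(cv2.MORPH_RECT, (1, 5)),
        np.eye(5, dtype=np.uint8),
        np.fliplr(np.eye(5, dtype=np.uint8)),
    ]
    # Detail matching: morphological filtering with each element, averaged.
    filtered = np.mean(
        [cv2.morphologyEx(gray, cv2.MORPH_OPEN, k) for k in kernels],
        axis=0,
    ).astype(np.uint8)
    # Gray edge of the filtered image (morphological gradient).
    edge = cv2.morphologyEx(filtered, cv2.MORPH_GRADIENT,
                            cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3)))
    # Count pixels in each gray level and take the count-weighted gray mean
    # as the binarization threshold.
    counts, bin_edges = np.histogram(edge, bins=levels, range=(0, 256))
    centers = (bin_edges[:-1] + bin_edges[1:]) / 2.0
    threshold = float((counts * centers).sum() / max(counts.sum(), 1))
    # Binarize: pixels above the threshold become 255, the rest 0.
    _, binary = cv2.threshold(edge, threshold, 255, cv2.THRESH_BINARY)
    return binary
```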
Corresponding to the above method embodiment, referring to fig. 5, the present disclosure also provides a human body key point-based gesture control device 50, including:
a first obtaining module 501, configured to obtain a video file generated by an image capturing apparatus for a target object, where the video file records one or more actions of the target object;
a detection module 502, configured to perform keypoint detection on a target object in the video file to obtain a keypoint set of the target object;
a second obtaining module 503, configured to further obtain a gesture action of the target object at a preset position when a plurality of necessary key points exist in the key point set;
a determining module 504, configured to determine, based on a result of the parsing for the gesture action, a gesture instruction issued by the target object.
The apparatus shown in fig. 5 may correspondingly execute the content in the above method embodiment, and details of the part not described in detail in this embodiment refer to the content described in the above method embodiment, which is not described again here.
Referring to fig. 6, an embodiment of the present disclosure also provides an electronic device 60, including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method for controlling gestures based on human key points in the above method embodiments.
The disclosed embodiments also provide a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the gesture control method based on human body key points in the foregoing method embodiments.
The disclosed embodiments also provide a computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to perform the human keypoint-based gesture control method in the aforementioned method embodiments.
Referring now to FIG. 6, a schematic diagram of an electronic device 60 suitable for use in implementing embodiments of the present disclosure is shown. The electronic devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), in-vehicle terminals (e.g., car navigation terminals), and the like, and fixed terminals such as digital TVs, desktop computers, and the like. The electronic device shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 6, the electronic device 60 may include a processing means (e.g., a central processing unit, a graphics processor, etc.) 601 that may perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM)602 or a program loaded from a storage means 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data necessary for the operation of the electronic apparatus 60 are also stored. The processing device 601, the ROM 602, and the RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.
Generally, the following devices may be connected to the I/O interface 605: input devices 606 including, for example, a touch screen, touch pad, keyboard, mouse, image sensor, microphone, accelerometer, gyroscope, etc.; output devices 607 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 608 including, for example, tape, hard disk, etc.; and a communication device 609. The communication means 609 may allow the electronic device 60 to communicate with other devices wirelessly or by wire to exchange data. While the figures illustrate an electronic device 60 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means 609, or may be installed from the storage means 608, or may be installed from the ROM 602. The computer program, when executed by the processing device 601, performs the above-described functions defined in the methods of the embodiments of the present disclosure.
It should be noted that the computer readable medium in the present disclosure can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device.
The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquiring at least two internet protocol addresses; sending a node evaluation request comprising the at least two internet protocol addresses to node evaluation equipment, wherein the node evaluation equipment selects the internet protocol addresses from the at least two internet protocol addresses and returns the internet protocol addresses; receiving an internet protocol address returned by the node evaluation equipment; wherein the obtained internet protocol address indicates an edge node in the content distribution network.
Alternatively, the computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: receiving a node evaluation request comprising at least two internet protocol addresses; selecting an internet protocol address from the at least two internet protocol addresses; returning the selected internet protocol address; wherein the received internet protocol address indicates an edge node in the content distribution network.
Computer program code for carrying out operations for aspects of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or hardware. Where the name of a unit does not in some cases constitute a limitation of the unit itself, for example, the first retrieving unit may also be described as a "unit for retrieving at least two internet protocol addresses".
It should be understood that portions of the present disclosure may be implemented in hardware, software, firmware, or a combination thereof.
The above description is only for the specific embodiments of the present disclosure, but the scope of the present disclosure is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present disclosure should be covered within the scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (11)

1. A gesture control method based on human key points is characterized by comprising the following steps:
acquiring a video file generated by a camera device for a target object, wherein the video file records one or more actions of the target object;
performing key point detection on a target object in the video file to obtain a key point set of the target object;
when a plurality of necessary key points exist in the key point set, further acquiring gesture actions of the target object at a preset position, wherein the necessary key points comprise shoulder key points and hand key points of the target object;
and determining a gesture instruction sent by the target object based on the analysis result aiming at the gesture action.
2. The method according to claim 1, wherein the acquiring a video file generated by an image capturing apparatus for a target object comprises:
performing subtraction operation on pixel values of pixel points corresponding to adjacent frame images formed in the camera equipment to obtain a pixel difference matrix;
judging whether the average value of the pixel difference matrix is larger than a preset threshold value or not;
and if so, storing the adjacent frame image as a video frame in the video file.
3. The method according to claim 1, wherein when there are a plurality of essential key points in the key point set, before the gesture action of the target object at the preset position is further acquired, the method further comprises:
detecting the position state of the target object based on the detected key point set;
and performing prompt operation on the target object based on the detected position state of the target object.
4. The method of claim 3, wherein performing a prompt operation on the target object based on the detected position state of the target object comprises:
judging whether shoulder key points and hand key points of a target object exist in video frames of the video file;
if not, prompting the target object to change the current position of the target object until the shoulder key point and the hand key point of the target object appear in the video frame of the video file.
5. The method of claim 3, wherein performing a prompt operation on the target object based on the detected position state of the target object comprises:
judging whether the target object is located at a preset position in the middle of a video frame or not based on a shoulder key point of the target object detected in the video frame of the video file;
if so, prompting the target object to execute a preset action after the upper body key point of the target object appears in the video frame.
6. The method according to claim 1, wherein when there are a plurality of essential key points in the key point set, further acquiring a gesture action of a target object at a preset position, including:
judging whether a preset position area has a hand key point or not;
if yes, further acquiring the moving distance of the hand key point in a preset time period;
and judging the gesture action of the target object at the preset position based on the moving distance.
7. The method of claim 1, wherein performing keypoint detection on target objects in the video file comprises:
converting the video frame image in the video file into a gray level image;
performing edge detection on the gray level image to obtain an edge contour of the target object;
based on the edge profile, a set of keypoints for the target object is determined.
8. The method of claim 7, wherein determining the set of keypoints for the target object based on the edge contour comprises:
selecting a plurality of structural elements with different orientations;
carrying out detail matching on the gray level image by utilizing each structural element in a plurality of structural elements to obtain a filtered image;
determining a gray scale edge of the filtered image to obtain a number of pixels present in each of a plurality of gray scale levels in the filtered image;
weighting the number of pixels in each gray level, and taking the weighted gray average value as a threshold value;
carrying out binarization processing on the filtered image based on the threshold value;
and taking the image after the binarization processing as an edge image of the target object.
9. A gesture control device based on human key points is characterized by comprising:
the device comprises a first acquisition module, a second acquisition module and a third acquisition module, wherein the first acquisition module is used for acquiring a video file generated by the camera equipment aiming at a target object, and the video file records one or more actions of the target object;
the detection module is used for executing key point detection on the target object in the video file to obtain a key point set of the target object;
the second acquisition module is used for further acquiring the gesture action of the target object at the preset position when a plurality of necessary key points exist in the key point set, wherein the necessary key points comprise shoulder key points and hand key points of the target object;
and the determining module is used for determining a gesture instruction sent by the target object based on the analysis result aiming at the gesture action.
10. An electronic device, characterized in that the electronic device comprises:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the gesture control method based on human body key points of any one of claims 1 to 8.
11. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the gesture control method based on human body key points of any one of claims 1 to 8.
CN201910563241.XA 2019-06-26 2019-06-26 Gesture control method and device based on human body key points and electronic equipment Active CN110287891B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910563241.XA CN110287891B (en) 2019-06-26 2019-06-26 Gesture control method and device based on human body key points and electronic equipment

Publications (2)

Publication Number Publication Date
CN110287891A CN110287891A (en) 2019-09-27
CN110287891B (en) 2021-11-09

Family

ID=68006279

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910563241.XA Active CN110287891B (en) 2019-06-26 2019-06-26 Gesture control method and device based on human body key points and electronic equipment

Country Status (1)

Country Link
CN (1) CN110287891B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110991235B (en) * 2019-10-29 2023-09-01 京东科技信息技术有限公司 State monitoring method and device, electronic equipment and storage medium
CN112166435A (en) * 2019-12-23 2021-01-01 商汤国际私人有限公司 Target tracking method and device, electronic equipment and storage medium
CN112262393A (en) * 2019-12-23 2021-01-22 商汤国际私人有限公司 Gesture recognition method and device, electronic equipment and storage medium
CN113177472B (en) * 2021-04-28 2024-03-29 北京百度网讯科技有限公司 Dynamic gesture recognition method, device, equipment and storage medium
CN113096152B (en) * 2021-04-29 2022-04-01 北京百度网讯科技有限公司 Multi-object motion analysis method, device, equipment and medium
CN113505735B (en) * 2021-05-26 2023-05-02 电子科技大学 Human body key point stabilization method based on hierarchical filtering
CN113946216A (en) * 2021-10-18 2022-01-18 阿里云计算有限公司 Man-machine interaction method, intelligent device, storage medium and program product
CN115474076A (en) * 2022-08-15 2022-12-13 珠海视熙科技有限公司 Video stream image output method and device and camera equipment

Family Cites Families (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101246551A (en) * 2008-03-07 2008-08-20 北京航空航天大学 Fast license plate locating method
CN103047938B (en) * 2013-01-05 2015-08-12 山西省电力公司大同供电分公司 Electric power line ice-covering thickness detection method and pick-up unit
CN103501446B (en) * 2013-10-12 2016-05-18 青岛旲天下智能科技有限公司 Internet television system based on gesture human-computer interaction technology and its implementation
CN105045399B (en) * 2015-09-07 2018-08-14 哈尔滨市一舍科技有限公司 A kind of electronic equipment with 3D camera assemblies
CN105425964B (en) * 2015-11-30 2018-07-13 青岛海信电器股份有限公司 A kind of gesture identification method and system
CN106971131A (en) * 2016-01-14 2017-07-21 芋头科技(杭州)有限公司 A kind of gesture identification method based on center
CN107340852A (en) * 2016-08-19 2017-11-10 北京市商汤科技开发有限公司 Gestural control method, device and terminal device
CN107272899B (en) * 2017-06-21 2020-10-30 北京奇艺世纪科技有限公司 VR (virtual reality) interaction method and device based on dynamic gestures and electronic equipment
CN108227912B (en) * 2017-11-30 2021-05-11 北京市商汤科技开发有限公司 Device control method and apparatus, electronic device, computer storage medium
CN109063653A (en) * 2018-08-07 2018-12-21 北京字节跳动网络技术有限公司 Image processing method and device
CN109299743B (en) * 2018-10-18 2021-08-10 京东方科技集团股份有限公司 Gesture recognition method and device and terminal
CN109446994B (en) * 2018-10-30 2020-10-30 北京达佳互联信息技术有限公司 Gesture key point detection method and device, electronic equipment and storage medium
CN109657537A (en) * 2018-11-05 2019-04-19 北京达佳互联信息技术有限公司 Image-recognizing method, system and electronic equipment based on target detection
CN109712129A (en) * 2018-12-25 2019-05-03 河北工业大学 A kind of arc image processing method based on mathematical morphology
CN109710071B (en) * 2018-12-26 2022-05-17 青岛小鸟看看科技有限公司 Screen control method and device
CN109618184A (en) * 2018-12-29 2019-04-12 北京市商汤科技开发有限公司 Method for processing video frequency and device, electronic equipment and storage medium
CN109725724B (en) * 2018-12-29 2022-03-04 百度在线网络技术(北京)有限公司 Gesture control method and device for screen equipment
CN109862274A (en) * 2019-03-18 2019-06-07 北京字节跳动网络技术有限公司 Earphone with camera function, the method and apparatus for exporting control signal

Also Published As

Publication number Publication date
CN110287891A (en) 2019-09-27

Legal Events

PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.
Patentee after: Tiktok vision (Beijing) Co.,Ltd.
Address before: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.
Patentee before: BEIJING BYTEDANCE NETWORK TECHNOLOGY Co.,Ltd.

Address after: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.
Patentee after: Douyin Vision Co.,Ltd.
Address before: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.
Patentee before: Tiktok vision (Beijing) Co.,Ltd.