CN111796673A - Multi-finger gesture recognition method and storage medium for head-mounted device - Google Patents

Info

Publication number
CN111796673A
Authority
CN
China
Prior art keywords
light
rgb values
time length
rgb
finger
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010442065.7A
Other languages
Chinese (zh)
Other versions
CN111796673B (en)
Inventor
刘德建
陈丛亮
郭玉湖
陈宏�
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujian TQ Digital Co Ltd
Original Assignee
Fujian TQ Digital Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujian TQ Digital Co Ltd filed Critical Fujian TQ Digital Co Ltd
Priority to CN202010442065.7A priority Critical patent/CN111796673B/en
Publication of CN111796673A publication Critical patent/CN111796673A/en
Application granted granted Critical
Publication of CN111796673B publication Critical patent/CN111796673B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/017Gesture based interaction, e.g. based on a set of recognized hand gestures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/90Determination of colour characteristics

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The invention provides a multi-finger gesture recognition method and a storage medium for a head-mounted device, wherein the method comprises the following steps: presetting an instruction corresponding to each multi-finger gesture; within a preset first time length, obtaining an unused RGB value from the RGB values of the pixels of each frame of image and sending it to the light-emitting devices; within a preset second time length, identifying in the display the coordinate positions of the light output by each of two or more light-emitting devices according to the RGB values they received; timing the first time length and the second time length alternately and seamlessly; and, if the changes of the coordinate positions corresponding to the light-emitting devices match a preset multi-finger gesture within a third time length, triggering the corresponding instruction. The invention improves the accuracy and efficiency of single-camera gesture recognition, supports recognition of multi-finger gestures, makes operation more convenient, enriches the control modes and functions, and greatly improves the usability of the product.

Description

Multi-finger gesture recognition method and storage medium for head-mounted device
Technical Field
The invention relates to the field of gesture recognition, in particular to a multi-finger gesture recognition method and a storage medium of head-mounted equipment.
Background
A head-mounted device in the prior art is worn on the head, so it is difficult to operate in the way a mobile phone is operated, for example through a touch screen. Some head-mounted devices can already support gesture control. However, a head-mounted device is generally equipped with only a single camera, and single-camera gesture recognition typically suffers from complicated calculation, a low recognition rate and low operating sensitivity, and cannot support multi-finger recognition, so the gesture control function is not comprehensive enough.
Therefore, it is necessary to provide a gesture recognition method and a storage medium for a head-mounted device that can simultaneously overcome the above problems.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a multi-finger gesture recognition method and a storage medium for a head-mounted device that improve the accuracy and efficiency of single-camera gesture recognition, support recognition of multi-finger gestures and make operation more convenient.
In order to solve the technical problems, the invention adopts the technical scheme that:
the multi-finger gesture recognition method of the head-mounted device comprises the following steps:
presetting instructions corresponding to all multi-finger gestures;
within a preset first time length, obtaining an unused RGB value from the RGB values of the pixels of each frame of image, and sending the unused RGB value to the light-emitting devices;
within a preset second time length, identifying in the display the coordinate positions of the light output by each of two or more light-emitting devices according to the RGB values they received;
timing the first time length and the second time length alternately and seamlessly, each starting as the other ends;
and, if the changes of the coordinate positions corresponding to the light-emitting devices match a preset multi-finger gesture within a third time length, triggering the corresponding instruction, wherein the third time length is greater than the sum of the first time length and the second time length.
The invention provides another technical scheme as follows:
A computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, implements the steps of the above multi-finger gesture recognition method for a head-mounted device.
The invention has the following beneficial effects: by obtaining an unused RGB value, the invention controls two or more light-emitting devices to output light of that unused RGB value; the movement track of the multi-finger gesture is then obtained by identifying the track on the display corresponding to the light of each light-emitting device. This replaces the existing approach, in which the operation track of the user's real hand can be recognized only by complex calculation over all image pixels, with a recognition approach in which the user's gesture is obtained quickly by simple analysis of the pixels with specific RGB values. This greatly reduces the computational complexity of recognition and improves its efficiency and accuracy, especially for a head-mounted device with a single camera. Moreover, the product can support multi-finger gesture control, enriching the control modes and functions and greatly improving its usability.
Drawings
FIG. 1 is a schematic flow chart of a multi-finger gesture recognition method according to an embodiment of the present invention;
FIG. 2 is a schematic flowchart of the multi-finger gesture recognition method according to an embodiment of the present invention.
Detailed Description
In order to explain technical contents, achieved objects, and effects of the present invention in detail, the following description is made with reference to the accompanying drawings in combination with the embodiments.
The key concept of the invention is as follows: the user's multi-finger gesture can be recognized quickly by simply analyzing the pixels with specific RGB values, and the corresponding operation then executed.
Referring to fig. 1, the present invention provides a multi-finger gesture recognition method for a head-mounted device, including:
presetting instructions corresponding to all multi-finger gestures;
within a preset first time length, obtaining an unused RGB value from the RGB values of the pixels of each frame of image, and sending the unused RGB value to the light-emitting devices;
within a preset second time length, identifying in the display the coordinate positions of the light output by each of two or more light-emitting devices according to the RGB values they received;
timing the first time length and the second time length alternately and seamlessly, each starting as the other ends;
and, if the changes of the coordinate positions corresponding to the light-emitting devices match a preset multi-finger gesture within a third time length, triggering the corresponding instruction, wherein the third time length is greater than the sum of the first time length and the second time length.
Thus, the speed and precision of gesture recognition can be effectively improved; a multi-touch effect can be realized simply, existing application gestures remain compatible, and the functions and usability of the product are enriched.
Further, the obtaining of the unused RGB values according to the RGB values of each pixel of each frame image specifically includes:
presetting more than two groups respectively corresponding to different RGB value ranges;
acquiring the RGB value of each pixel of each frame of image;
dividing each pixel into a corresponding group according to the RGB value;
calculating the number of pixel points of each group to obtain the group with the least number of pixel points;
and determining the RGB value in the RGB value range corresponding to the group with the least number of pixel points as an unused RGB value.
As can be seen from the above description, dividing the RGB value ranges of the groups according to color values helps concentrate the unused pixel colors of the image captured in the first time length into one or a few groups rather than dispersing them across many groups, which improves the accuracy and efficiency of the subsequent analysis and calculation.
Further, if the unused RGB values correspond to two or more groups, the sending to the light-emitting devices is specifically:
calculating, for each of the two or more groups corresponding to the unused RGB values, the RGB difference from the other groups;
obtaining the RGB value range of the group with the largest difference from the other groups;
and sending that RGB value range to the light-emitting devices.
As can be seen from the above description, if the unused RGB values are dispersed across two or more groups, the group differing most from the other groups is further selected and its RGB value range used as the standard for the light output by the light-emitting devices. This further improves the recognizability of that light in the display of the head-mounted device, and thus the recognition accuracy.
Further, the RGB values of the light output by the light emitting device are randomly chosen from the range of received RGB values.
As can be seen from the above description, letting the light-emitting device choose freely within a given range improves compatibility with the light-emitting device while ensuring that it outputs light whose RGB value meets the requirements.
Further, the different RGB value ranges are RGB value ranges corresponding to respective colors.
As can be seen from the above description, grouping directly by the color values corresponding to each color improves the usefulness and intuitiveness of the pixel grouping result.
Further, the identifying in the display of the coordinate positions of the light output by each of the two or more light-emitting devices according to the received RGB values is specifically:
controlling the two or more light-emitting devices to simultaneously emit light corresponding to the received RGB values;
searching the current frame of image for the set of pixel points corresponding to the RGB values sent to the light-emitting devices, and obtaining the coordinate position of each, the number of points in the set corresponding to the number of light-emitting devices;
and obtaining the movement track on the display of each of the two or more light-emitting devices within the second time length from the coordinate positions in each frame of image within the second time length and the light-emitting device corresponding to each coordinate position.
As can be seen from the above description, by positioning the specific RGB values in the image and combining the RGB values in accordance with the time sequence, the control gestures made by the user through two or more light-emitting devices are respectively obtained, so as to obtain the multi-finger gesture instruction. Compared with the prior art, the method not only obviously improves the recognition efficiency and the recognition precision, but also enriches the control function.
Further, the multi-finger gesture comprises two-finger distance increase, two-finger distance decrease, and two-finger or three-finger simultaneous pull-down or pull-up.
Furthermore, a two-finger distance increase corresponds to a zoom-in instruction, a two-finger distance decrease corresponds to a zoom-out instruction, and a simultaneous three-finger pull-down or pull-up corresponds to a screenshot instruction.
As can be seen from the above description, a variety of common gesture controls are supported, as is customization; more convenient and efficient control can be realized.
Further, the head-mounted device and the light-emitting devices communicate via a Bluetooth communication link;
the first duration is equal to the second duration.
As can be seen from the above description, the light-emitting device and the head-mounted device are wirelessly connected, which is more convenient for the user to control; the head-mounted device and the light-emitting device adopt the same frequency for analysis processing, and the accuracy of the calculation result of the head-mounted device and the accuracy of the output of the light-emitting device can be ensured at the same time.
The invention provides another technical scheme as follows:
A computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, implements the steps of the multi-finger gesture recognition method for a head-mounted device:
presetting instructions corresponding to all multi-finger gestures;
within a preset first time length, obtaining an unused RGB value from the RGB values of the pixels of each frame of image, and sending the unused RGB value to the light-emitting devices;
within a preset second time length, identifying in the display the coordinate positions of the light output by each of two or more light-emitting devices according to the RGB values they received;
timing the first time length and the second time length alternately and seamlessly, each starting as the other ends;
and, if the changes of the coordinate positions corresponding to the light-emitting devices match a preset multi-finger gesture within a third time length, triggering the corresponding instruction, wherein the third time length is greater than the sum of the first time length and the second time length.
Further, the obtaining of the unused RGB values according to the RGB values of each pixel of each frame image specifically includes:
presetting more than two groups respectively corresponding to different RGB value ranges;
acquiring the RGB value of each pixel of each frame of image;
dividing each pixel into a corresponding group according to the RGB value;
calculating the number of pixel points of each group to obtain the group with the least number of pixel points;
and determining the RGB value in the RGB value range corresponding to the group with the least number of pixel points as an unused RGB value.
Further, if the unused RGB values correspond to two or more groups, the sending to the light-emitting devices is specifically:
calculating, for each of the two or more groups corresponding to the unused RGB values, the RGB difference from the other groups;
obtaining the RGB value range of the group with the largest difference from the other groups;
and sending that RGB value range to the light-emitting devices.
Further, the RGB values of the light output by the light emitting device are randomly chosen from the range of received RGB values.
Further, the different RGB value ranges are RGB value ranges corresponding to respective colors.
Further, the identifying in the display of the coordinate positions of the light output by each of the two or more light-emitting devices according to the received RGB values is specifically:
controlling the two or more light-emitting devices to simultaneously emit light corresponding to the received RGB values;
searching the current frame of image for the set of pixel points corresponding to the RGB values sent to the light-emitting devices, and obtaining the coordinate position of each, the number of points in the set corresponding to the number of light-emitting devices;
and obtaining the movement track on the display of each of the two or more light-emitting devices within the second time length from the coordinate positions in each frame of image within the second time length and the light-emitting device corresponding to each coordinate position.
Further, the multi-finger gesture comprises two-finger distance increase, two-finger distance decrease, and two-finger or three-finger simultaneous pull-down or pull-up.
Furthermore, a two-finger distance increase corresponds to a zoom-in instruction, a two-finger distance decrease corresponds to a zoom-out instruction, and a simultaneous three-finger pull-down or pull-up corresponds to a screenshot instruction.
Further, the head-mounted device and the light-emitting devices communicate via a Bluetooth communication link;
the first duration is equal to the second duration.
Those skilled in the art will understand that all or part of the processes in the above technical solutions can be implemented by a computer program instructing the relevant hardware; the program can be stored in a computer-readable storage medium and, when executed, can include the processes of the above methods. After execution by a processor, the program also achieves the beneficial effects of the corresponding methods.
The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
Example one
As shown in fig. 2, this embodiment provides a multi-finger gesture recognition method for a head-mounted device that not only significantly improves gesture recognition efficiency and accuracy but also supports multi-finger gesture control, enriching the product's functions. The light-emitting device can be any device able to emit light of a specified RGB value, such as a ring or bracelet fitted with LEDs; in this embodiment, the user performs multi-finger gestures while wearing a light-emitting device on each of several fingers.
The method comprises the following steps:
s1: presetting instructions corresponding to all multi-finger gestures;
the multi-finger gesture may be a two-finger distance increase, two distance decrease, two or three fingers simultaneously pull down or up, or other multi-finger gestures. For example, the two-finger distance increasing corresponding to the zoom-in operation instruction, the two-finger distance decreasing corresponding to the zoom-out operation instruction, and the three-finger simultaneous pull-down or pull-up corresponding to the screenshot operation instruction are commonly conformed to the operation habit of the user.
S2: presetting more than two groups respectively corresponding to different RGB value ranges;
That is, one group corresponds to one RGB value range, preferably a range of color values. For example, the pixels may be divided into 9 groups: red, orange, yellow, green, cyan, blue, purple, black and white, where the "red" group corresponds to the rectangular RGB space from (200, 50, 50) to (255, 0, 0), i.e. a cuboid about 55 units long in R and 50 units in each of G and B.
Of course, the grouping may also be finer, for example one group per small RGB value interval.
Red: (200, 50, 50) to (255, 0, 0);
orange: (200, 100, 50) to (255, 50, 0);
yellow: (200, 150, 50) to (255, 100, 0);
green: (0, 255, 0) to (50, 200, 50);
cyan: (0, 255, 50) to (50, 200, 100);
blue: (0, 0, 255) to (50, 50, 200);
purple: (50, 0, 255) to (100, 50, 200);
preferably, the partition of red, orange, yellow, green, blue-violet may also be a range classified in an LAB-wise manner in polar coordinates and then converted into RGB.
S3: presetting a first time length, a second time length and a third time length;
the third duration is greater than the sum of the first duration and the second duration, and corresponds to an effective recognition period for recognizing the multi-finger gesture, so that a gesture recognition period is required to be included at least.
Preferably, the first and second time periods are equal, such as 100 ms.
S4: in the first time period, the head-mounted device obtains unused RGB values according to the RGB values of the pixels of each frame of image and sends the unused RGB values to the light-emitting device.
Specifically, the step includes:
s41: and acquiring the RGB value of each pixel of each frame of image. Namely, the RGB value of each pixel in each frame of image shot by the camera in the first duration is obtained.
S42: divide each pixel into the corresponding group according to its RGB value. That is, each pixel acquired in step S41 is assigned, according to its RGB value, to the group whose RGB value range contains it.
S43: count the number of pixel points in each group and obtain the group with the fewest. That is, the number of pixels contained in each group is counted and the group with the smallest count is obtained; if only one group results, it can be considered the group differing most from the others.
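Steps S41 to S43 amount to a histogram over the groups. A minimal sketch follows; the box representation of groups is an assumption carried over from the grouping illustration, not the patent's data structure:

```python
from collections import Counter

def unused_groups(pixels, groups):
    """Tally each pixel into the first RGB box that contains it (S42), count per
    group (S43), and return the group name(s) with the fewest pixels, i.e. the
    candidate unused RGB ranges. `groups` maps name -> ((r0,r1),(g0,g1),(b0,b1))."""
    counts = Counter({name: 0 for name in groups})
    for r, g, b in pixels:
        for name, ((r0, r1), (g0, g1), (b0, b1)) in groups.items():
            if r0 <= r <= r1 and g0 <= g <= g1 and b0 <= b <= b1:
                counts[name] += 1
                break
    fewest = min(counts.values())
    return [name for name, n in counts.items() if n == fewest]
```

On a frame dominated by green foliage, for instance, the red box would come back empty and be reported as unused.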
In another specific example, step S43 yields two or more groups with the fewest pixels; the recognizability of the light output by the light-emitting devices can then be improved by further computing the single group that differs most from all the other groups.
For example, if statistics show that the "red", "orange" and "yellow" groups each contain 0 (or close to 0) pixel points, the RGB value ranges corresponding to these three groups are the unused RGB values.
In accordance with another embodiment, the most distinct group can be determined as follows:
S44: calculate, for each of the two or more groups of unused RGB values obtained in S43, the RGB difference from all the other groups.
Taking the above 9 groups (red, orange, yellow, green, cyan, blue, purple, black and white) as an example, suppose step S43 determines that the "red", "orange" and "yellow" groups contain the fewest (and equal numbers of) pixel points, so the unused RGB values correspond to these three groups. To further improve the recognizability of the light emitted by the light-emitting devices, the differences between the three candidate groups and the populated groups (the 6 groups green, cyan, blue, purple, black and white) can be calculated: subtract the RGB values of the "red", "orange" and "yellow" groups from those of the other groups to find the candidate whose three components (R, G, B) differ most. The difference between group 1 and group 2 is d12^2 = (R1 - R2)^2 + (G1 - G2)^2 + (B1 - B2)^2. A difference value dij is computed between each of the 3 candidate groups (red, orange, yellow) and each of the 4 groups green, cyan, blue and purple, with red, orange, yellow, green, cyan, blue and purple numbered 1 to 7 respectively. The most distinct candidate is the one attaining max(min(d14, d15, d16, d17), min(d24, d25, d26, d27), min(d34, d35, d36, d37)).
If the maximum is attained by the first candidate, for example through d14, the "red" group is selected.
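The max-min selection of S44 can be sketched as follows, assuming (an assumption for illustration; the patent does not say) that each group is represented by a single representative RGB triple such as the centre of its box:

```python
import math

def pick_most_distinct(candidates, used):
    """Among candidate (unused) groups, return the one whose *nearest* used group
    is farthest away: argmax over candidates of the min distance to used groups,
    mirroring max(min(d14..d17), min(d24..d27), min(d34..d37)) in the text.
    Both dicts map a group name to a representative (R, G, B) triple."""
    def dist(c1, c2):
        # Euclidean colour difference: d^2 = (R1-R2)^2 + (G1-G2)^2 + (B1-B2)^2
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(c1, c2)))
    return max(candidates,
               key=lambda name: min(dist(candidates[name], c) for c in used.values()))
```

Taking the minimum first guards against picking a candidate that is far from most used colours but very close to one of them, which would be easy to confuse in the image.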
S45: if the group corresponding to the unused RGB value is only one group, the RGB values within the range of RGB values corresponding to the group can be directly sent to the light emitting device.
If the group with the least number of pixels corresponds to more than two groups, the corresponding RGB value range can be directly sent to the light-emitting equipment, one RGB value is selected by the light-emitting equipment to carry out light sending accidents, and the group with the highest identification degree can be screened from the more than two groups and then the RGB value range is sent to the light-emitting equipment.
That is, regardless of the number of groups to which unused RGB values correspond, it is preferable to transmit the RGB values having the largest difference to the light emitting devices; of course, it is also possible to choose to send all unused RGB values to the light emitting device.
It should be noted that the head-mounted device sends the RGB value range over its Bluetooth connection to the light-emitting devices.
S5: within the second time length, the head-mounted device identifies in the display the coordinate positions of the light output by each of the two or more light-emitting devices according to the received RGB value ranges.
The method specifically comprises the following steps:
s51: and controlling more than two light-emitting devices to emit corresponding light according to the received RGB value range in the second time length.
Preferably, if the unused RGB values correspond to two or more color values (i.e. two or more groups' RGB value ranges), the light-emitting device may randomly select an RGB value from among them for output.
In a specific example, the group with the largest difference is found to be the "red" group, so only that group's RGB values are obtained. If the "red" group corresponds to the cuboid RGB space from (200, 50, 50) to (255, 0, 0), the light-emitting device may output a randomly chosen color in the range (200, 50, 50) to (255, 0, 0).
S52: from the image currently captured by the camera (the camera keeps shooting in real time during the second time length), the head-mounted device searches the current frame for the set of pixel points matching the RGB values sent to the light-emitting devices and obtains their coordinate positions. The number of points in the set corresponds to the number of light-emitting devices; that is, as many designated pixel points as there are light-emitting devices are identified in one frame of image.
S53: obtain the movement track on the display of each light-emitting device within the second time length from the coordinate positions determined in the previous step for each frame of image and the correspondence between each coordinate position and its light-emitting device.
The exact correspondence need not be known; it is only necessary to distinguish the different movement tracks. For example, the pixel points can be clustered by a k-means algorithm (with as many clusters as there are light-emitting devices); once the initial coordinate positions are determined, the individual tracks can then be traced forward in time from their distinct starting positions.
That is, the head-mounted device only needs to identify, in the images shot in real time, the two or more pixel points whose RGB values match those sent to the light-emitting devices and locate their coordinates in the image; connecting in time order the coordinate positions traced from the distinct starting positions then yields the multi-finger gesture on the head-mounted device's screen corresponding to the light output by the several light-emitting devices within the second time length, i.e. the multi-finger gesture made by the user through the light-emitting devices.
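Steps S52-S53 can be sketched as follows. The frame format and the greedy nearest-neighbour track assignment are illustrative assumptions (the patent only suggests k-means-style clustering); each frame is given as a list of ((x, y), (r, g, b)) samples:

```python
def track_lights(frames, target, n_devices, tol=30):
    """Keep, per frame, the pixels whose colour lies within `tol` of the target
    RGB (the value sent to the light-emitting devices), then append each hit to
    the track whose last point is nearest - a greedy stand-in for k-means.
    Returns one coordinate list (movement track) per light-emitting device."""
    tracks = []
    for frame in frames:
        hits = [(x, y) for (x, y), colour in frame
                if all(abs(c - t) <= tol for c, t in zip(colour, target))]
        if not tracks:                      # first frame fixes the starting positions
            tracks = [[p] for p in hits[:n_devices]]
            continue
        for p in hits:
            nearest = min(tracks,
                          key=lambda tr: (tr[-1][0] - p[0]) ** 2 + (tr[-1][1] - p[1]) ** 2)
            nearest.append(p)
    return tracks
```

Each returned list, read in order, is the movement track of one light-emitting device over the second time length.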
S6: the first time length and the second time length are timed alternately and seamlessly, each starting as the other ends.
Correspondingly, S4 and S5 are executed alternately, beginning with the first time length. Specifically, during the first time length the light-emitting devices output no light; they output the corresponding light only upon receiving the RGB value range sent by the head-mounted device.
That is, during the first time length the head-mounted device performs its calculation and transmits the result to the light-emitting devices; during the second time length, the light-emitting devices output the corresponding light; then the first time length starts again, the light-emitting devices stop outputting light, and the head-mounted device repeats the calculation and transmission.
S7: if the changes of the coordinate positions corresponding to the light-emitting devices match a preset multi-finger gesture within a third time length, trigger the corresponding instruction, the third time length being greater than the sum of the first and second time lengths.
Preferably, the third time length consists of the first and second time lengths alternating N times, where N is greater than or equal to 2 and preferably 5.
In this step, the multi-finger gesture that the user performs through the light-emitting devices, as obtained by recognition, is matched against the preset multi-finger gestures; on a successful match the corresponding operation instruction is triggered, completing recognition and control of the multi-finger gesture.
Throughout the process, the user uses the light-emitting devices in place of a bare hand, mouse or other control device to make gestures, and the head-mounted device derives the user's control gesture by identifying the screen positions corresponding to the light they emit.
In a specific example, various multi-finger gestures and their corresponding control methods can be preset in the head-mounted device, and after a gesture is recognized the matching control method is executed directly. For example, an increase in the distance between 2 LED lamps can be mapped to a zoom-in operation; a decrease in that distance can be mapped to a zoom-out operation; and pulling 3 LEDs down simultaneously can be mapped to a screenshot operation.
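The preset gesture-to-instruction mapping described here can be sketched as a plain lookup table (the gesture and action names below are illustrative stand-ins, not names taken from the patent):

```python
# Hypothetical gesture -> instruction table mirroring the examples above.
GESTURE_ACTIONS = {
    "two_finger_spread":      "zoom_in",     # distance between 2 LEDs grows
    "two_finger_pinch":       "zoom_out",    # distance between 2 LEDs shrinks
    "three_finger_pull_down": "screenshot",  # 3 LEDs move down together
}

def dispatch(gesture):
    """Return the instruction preset for a recognized gesture, or None."""
    return GESTURE_ACTIONS.get(gesture)
```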
This embodiment shows that multi-finger gesture control of a head-mounted device can be achieved by wearing one light-emitting device on each of several fingers, which markedly improves the gesture recognition speed and accuracy of the head-mounted device, especially one equipped with only a single camera. On this basis, multi-finger gesture control enriches the product's functions, improves its usability, and makes it easier to operate.
Example two
The invention provides a specific application scenario corresponding to the first embodiment:
1. The RGB values of the pixels of each frame of the camera image are obtained. First, the RGB values are grouped by a manual preset, each group covering one RGB range. For example, the palette can be divided into nine groups such as red, orange, yellow, green, cyan, blue, purple, black and white (alternatively, each smaller RGB interval can form its own group). The number of pixels falling into each group is then counted for the captured image. For example, the statistics may show that the red, orange and yellow groups contain 0 pixels (or close to 0); these are then the unused RGB values.
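Step 1's grouping and counting can be sketched as follows (a minimal sketch: the three groups and their ranges here are illustrative stand-ins for the nine preset groups):

```python
from collections import Counter

# Hypothetical coarse colour groups (name -> inclusive RGB lo/hi bounds);
# the embodiment uses nine such groups.
GROUPS = {
    "red":   ((200, 0, 0), (255, 80, 80)),
    "green": ((0, 200, 0), (80, 255, 80)),
    "blue":  ((0, 0, 200), (80, 80, 255)),
}

def group_of(pixel):
    """Return the name of the group whose RGB range contains the pixel."""
    for name, (lo, hi) in GROUPS.items():
        if all(lo[i] <= pixel[i] <= hi[i] for i in range(3)):
            return name
    return None

def least_used_group(frame):
    """Count pixels per group and return the group with the fewest hits."""
    counts = Counter({name: 0 for name in GROUPS})
    for px in frame:
        g = group_of(px)
        if g is not None:
            counts[g] += 1
    return min(counts, key=counts.get)
```

The least-populated group's RGB range is what gets sent to the LED rings, since a colour absent from the scene is the easiest to segment later.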
2. More than two LED rings worn on different fingers of the hand output the most distinctive color, switching on and off at a fixed frequency, for example every 100 milliseconds. Among the unused red, orange and yellow groups, the group with the maximum difference from the others is found, here the red group. For example, if the red RGB range is (200, 0, 0) to (255, 0, 0), each LED ring randomly outputs a color within (200, 0, 0) to (255, 0, 0).
3. Working at the same frequency, during the LED-off period the head-mounted device likewise computes the RGB range with the maximum difference, here red, (200, 0, 0) to (255, 0, 0); the LEDs then output a color in that range, and the LED positions are obtained while the LEDs are on. In other words, while the LEDs are off the color to display is calculated; after the LEDs turn on, the camera searches for pixels within that color range. For example, if the camera's shooting range spans the coordinates (0, 0) to (1920, 1080) and the current LED count is 3, then 3 coordinates are randomly initialized, the matching pixels are partitioned into 3 classes by the k-means algorithm, and the 3 class centers are found to be (100, 200), (150, 200) and (200, 200).
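The clustering in step 3 can be sketched with a minimal k-means (Lloyd iterations). One deviation from the step as written: instead of random initialization, this sketch seeds the centres farthest-first so the result is deterministic for well-separated LED blobs.

```python
def kmeans_centers(points, k, iters=10):
    """Tiny k-means over 2-D pixel coordinates; returns the k cluster centres.

    Centres are seeded farthest-first (rather than randomly, as in the
    embodiment) so separated LED blobs each reliably get their own cluster.
    """
    def dist2(a, b):
        return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

    centers = [points[0]]
    while len(centers) < k:  # farthest-first seeding
        centers.append(max(points,
                           key=lambda p: min(dist2(p, c) for c in centers)))
    for _ in range(iters):   # Lloyd iterations: assign, then recompute means
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[min(range(k), key=lambda i: dist2(p, centers[i]))].append(p)
        centers = [(sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c))
                   if c else centers[i] for i, c in enumerate(clusters)]
    return centers
```

In practice `points` would be the pixel coordinates that fell inside the LED colour range, and `k` the number of LED rings currently paired.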
4. Then, assuming the head-mounted display is also 1920 × 1080 pixels, touch points are displayed floating at the coordinates (100, 200), (150, 200) and (200, 200).
5. Steps 1 to 4 are repeated. As the LEDs move downward, the three coordinates gradually move from (100, 200), (150, 200), (200, 200) to (100, 100), (150, 100), (200, 100). By detecting that several points moved down by 100 or more within 0.5 second, a 3-point pull-down gesture is recognized and the screenshot operation is triggered.
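The pull-down check in step 5 can be sketched as follows (an assumption carried over from the worked example: a decreasing y value counts as downward movement, since the points move from y = 200 to y = 100):

```python
def is_multi_pull_down(start_pts, end_pts, min_points=3, min_drop=100):
    """Pull-down check matching the worked example.

    True when at least `min_points` tracked points each saw their y value
    shrink by at least `min_drop` pixels over the observation window.
    """
    drops = [sy - ey for (_, sy), (_, ey) in zip(start_pts, end_pts)]
    return sum(d >= min_drop for d in drops) >= min_points
```

A full implementation would also enforce the 0.5-second window by timestamping the start and end frames; that bookkeeping is omitted here.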
6. If the current LED count in step 3 is 2, then 2 coordinates are randomly initialized, the pixels are partitioned into 2 classes by the k-means algorithm, and the 2 class centers are found to be (100, 200) and (150, 200). By the distance formula for points (x1, y1) and (x2, y2), distance = ((x2 - x1)^2 + (y2 - y1)^2)^(1/2), the distance between the 2 points is ((150 - 100)^2 + (200 - 200)^2)^(1/2) = 50.
7. Repeating steps 1, 2, 6 and 4, the distance between the 2 LED coordinates gradually decreases; at (121, 200) and (125, 200) it is ((125 - 121)^2 + (200 - 200)^2)^(1/2) = 4, and the shrinking distance triggers the zoom-out event.
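The two-LED distance computation and pinch decision of steps 6 and 7 can be sketched as (function and event names are illustrative):

```python
import math

def led_distance(p1, p2):
    """Euclidean distance between two LED centres: sqrt(dx^2 + dy^2)."""
    return math.hypot(p2[0] - p1[0], p2[1] - p1[1])

def pinch_event(old_pts, new_pts):
    """'zoom_out' when the two-LED distance shrinks, 'zoom_in' when it grows."""
    before = led_distance(*old_pts)
    after = led_distance(*new_pts)
    if after < before:
        return "zoom_out"
    if after > before:
        return "zoom_in"
    return None
```

With the centres from step 6, `led_distance((100, 200), (150, 200))` gives 50; after the fingers close to (121, 200) and (125, 200) the distance is 4, so the comparison yields the zoom-out event.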
EXAMPLE III
Corresponding to the first and second embodiments, this embodiment provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the multi-finger gesture recognition method for a head-mounted device described in the first or second embodiment. The specific steps are not repeated here; refer to the description of the first or second embodiment for details.
In summary, the multi-finger gesture recognition method and storage medium for a head-mounted device provided by the invention improve the accuracy and efficiency of single-camera gesture recognition, support recognition of multi-finger gestures, improve operating convenience, enrich the control modes and functions, and greatly improve product usability. The effect is especially notable when the method is applied to a head-mounted device with a single camera. Furthermore, the required companion device (the light-emitting device) is structurally simple, light and portable, so the scheme is also practical and easy to implement.
The above description is only an embodiment of the present invention, and not intended to limit the scope of the present invention, and all equivalent changes made by using the contents of the present specification and the drawings, or applied directly or indirectly to the related technical fields, are included in the scope of the present invention.

Claims (10)

1. The multi-finger gesture recognition method of the head-mounted device is characterized by comprising the following steps:
presetting instructions corresponding to all multi-finger gestures;
within a preset first time length, acquiring unused RGB values according to the RGB values of pixels of each frame of image, and sending the unused RGB values to light-emitting equipment;
within a preset second time length, identifying the coordinate positions of the light output by more than two light-emitting devices according to the received RGB values in the display respectively;
starting timing in turn seamlessly between the first time length and the second time length;
and if the change of the coordinate position corresponding to the light-emitting equipment is matched with the preset multi-finger gesture within a third time length, triggering a corresponding instruction, wherein the third time length is greater than the sum of the first time length and the second time length.
2. The method for recognizing a multi-finger gesture of a head-mounted device according to claim 1, wherein the obtaining of the unused RGB values according to the RGB values of the pixels of each frame of image comprises:
presetting more than two groups respectively corresponding to different RGB value ranges;
acquiring the RGB value of each pixel of each frame of image;
dividing each pixel into a corresponding group according to the RGB value;
calculating the number of pixel points of each group to obtain the group with the least number of pixel points;
and determining the RGB value in the RGB value range corresponding to the group with the least number of pixel points as an unused RGB value.
3. The method according to claim 1, wherein, if the unused RGB values correspond to two or more groups, the sending of the RGB values to the light-emitting devices specifically comprises:
respectively calculating RGB difference values of more than two groups corresponding to the unused RGB values and other groups;
acquiring an RGB value range corresponding to the group with the largest difference value with other groups;
and sending the RGB value range to a light-emitting device.
4. The method of claim 1, wherein the RGB values of the light output by the light emitting device are randomly selected from a range of received RGB values.
5. The method of claim 4, wherein the different RGB value ranges are RGB value ranges corresponding to respective colors.
6. The method for recognizing a multi-finger gesture on a head-mounted device according to claim 1, wherein the recognizing the coordinate positions of the lights outputted by the two or more light-emitting devices according to the received RGB values in the display respectively comprises:
controlling more than two light-emitting devices to simultaneously emit light corresponding to the received RGB values;
searching a pixel point set corresponding to the RGB value sent to the light-emitting equipment in the current frame image, and respectively obtaining the coordinate positions of the pixel point set, wherein the number of the pixel points in the pixel point set corresponds to the number of the light-emitting equipment;
and acquiring the movement tracks of the display corresponding to the more than two light-emitting devices in the second time length according to the coordinate positions of the frames of images in the second time length and the light-emitting devices corresponding to the coordinate positions.
7. The method for multi-finger gesture recognition of a headset of claim 1, wherein the multi-finger gesture comprises a two-finger distance increase, a two-finger distance decrease, or a simultaneous two-finger or three-finger pull-down or pull-up.
8. The method for recognizing the multi-finger gesture of the head-mounted device according to claim 7, wherein the two-finger distance increase corresponds to a zoom-in operation instruction, the two-finger distance decrease corresponds to a zoom-out operation instruction, and the simultaneous three-finger pull-down or pull-up corresponds to a screenshot operation instruction.
9. The multi-finger gesture recognition method of the head-mounted device according to claim 1, wherein the head-mounted device and the light-emitting device are in communication transmission through a bluetooth communication link;
the first duration is equal to the second duration.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, is able to carry out the steps included in the method for multi-finger gesture recognition of a head-mounted device according to any one of the claims 1 to 9.
CN202010442065.7A 2020-05-22 2020-05-22 Multi-finger gesture recognition method of head-mounted equipment and storage medium Active CN111796673B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010442065.7A CN111796673B (en) 2020-05-22 2020-05-22 Multi-finger gesture recognition method of head-mounted equipment and storage medium


Publications (2)

Publication Number Publication Date
CN111796673A true CN111796673A (en) 2020-10-20
CN111796673B CN111796673B (en) 2023-04-28

Family

ID=72806572

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010442065.7A Active CN111796673B (en) 2020-05-22 2020-05-22 Multi-finger gesture recognition method of head-mounted equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111796673B (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102103409A (en) * 2011-01-20 2011-06-22 桂林理工大学 Man-machine interaction method and device based on motion trail identification
TW201124878A (en) * 2010-01-13 2011-07-16 Chao-Lieh Chen Device for operation and control of motion modes of electrical equipment
JP2011181360A (en) * 2010-03-02 2011-09-15 Takahiro Kido Ornament with built-in led light-emitting device
CN104134070A (en) * 2013-05-03 2014-11-05 仁宝电脑工业股份有限公司 Interactive object tracking system and interactive object tracking method thereof
CN104199550A (en) * 2014-08-29 2014-12-10 福州瑞芯微电子有限公司 Man-machine interactive type virtual touch device, system and method
US20160092726A1 (en) * 2014-09-30 2016-03-31 Xerox Corporation Using gestures to train hand detection in ego-centric video
US20160117860A1 (en) * 2014-10-24 2016-04-28 Usens, Inc. System and method for immersive and interactive multimedia generation
CN105898289A (en) * 2016-06-29 2016-08-24 北京小米移动软件有限公司 Intelligent wearing equipment and control method of intelligent wearing equipment
CN106293081A (en) * 2016-08-03 2017-01-04 北京奇虎科技有限公司 Wearable device
US20170086712A1 (en) * 2014-03-20 2017-03-30 Telecom Italia S.P.A. System and Method for Motion Capture
CN107340871A (en) * 2017-07-25 2017-11-10 深识全球创新科技(北京)有限公司 The devices and methods therefor and purposes of integrated gesture identification and ultrasonic wave touch feedback


Also Published As

Publication number Publication date
CN111796673B (en) 2023-04-28

Similar Documents

Publication Publication Date Title
US9176599B2 (en) Display device, display system, and data supply method for display device
CN107077258B (en) Projection type image display device and image display method
US10672187B2 (en) Information processing apparatus and information processing method for displaying virtual objects in a virtual space corresponding to real objects
US9632592B1 (en) Gesture recognition from depth and distortion analysis
WO2014137806A2 (en) Visual language for human computer interfaces
WO2017203815A1 (en) Information processing device, information processing method, and recording medium
CN110084204A (en) Image processing method, device and electronic equipment based on target object posture
Störring et al. Computer vision-based gesture recognition for an augmented reality interface
EP3478029B1 (en) Dimming processing system for led lamps
CN112131414B (en) Method and device for labeling image of signal lamp, electronic equipment and road side equipment
KR101791603B1 (en) Detecting method for color object in image using noise and detecting system for light emitting apparatus using noise
JP2003035515A (en) Method, device and marker for detecting three- dimensional positions
CN114867170A (en) Method and device for adjusting color temperature of table lamp, electronic equipment and storage medium
CN111796673A (en) Multi-finger gesture recognition method and storage medium for head-mounted device
CN111796675B (en) Gesture recognition control method of head-mounted equipment and storage medium
CN111596766B (en) Gesture recognition method of head-mounted device and storage medium
US20170168592A1 (en) System and method for optical tracking
JP2002262180A (en) Image processing method and image processor
CN111796674B (en) Gesture touch sensitivity adjusting method based on head-mounted equipment and storage medium
CN115909482A (en) Control method and device of desk lamp, electronic equipment and storage medium
CN115884472A (en) Lamp effect control method, device, product, medium and lamp effect control equipment
CN111796672B (en) Gesture recognition method based on head-mounted equipment and storage medium
CN108351684B (en) Operation detection device, operation detection method and image display system
CN111796671B (en) Gesture recognition and control method of head-mounted equipment and storage medium
CN112702586A (en) Projector virtual touch tracking method, device and system based on visible light

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant