CN109101108B - Method and system for optimizing human-computer interaction interface of intelligent cabin based on three decisions - Google Patents


Info

Publication number: CN109101108B
Application number: CN201810823980.3A
Authority: CN (China)
Prior art keywords: gesture; granularity; area image; image; interaction interface
Legal status: Active (the status listed is an assumption and is not a legal conclusion)
Other languages: Chinese (zh)
Other versions: CN109101108A
Inventors: 刘群, 张刚强, 王如琪
Current and original assignee: Chongqing University of Post and Telecommunications
Application filed by Chongqing University of Post and Telecommunications
Priority to CN201810823980.3A; granted and published as CN109101108B

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01: Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/017: Gesture based interaction, e.g. based on a set of recognized hand gestures
    • G06F 3/011: Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; body parts, e.g. hands
    • G06V 40/107: Static hand or arm
    • G06V 40/113: Recognition of static hand signs
    • G06V 40/20: Movements or behaviour, e.g. gesture recognition
    • G06V 40/28: Recognition of hand or arm movements, e.g. recognition of deaf sign language


Abstract

The invention belongs to the field of intelligent driving and discloses a method and a system for optimizing the human-computer interaction interface of an intelligent cockpit based on three-way decisions. The method comprises: collecting gesture video in the cockpit and preprocessing it to obtain gesture images; segmenting the gesture from the background to obtain a gesture-area image; expressing the gesture-area image at multiple granularities and extracting its multi-granularity features with a convolutional neural network; computing, from coarse granularity to fine granularity, the conditional probability of classifying the gesture-area image at each granularity into each category, and completing gesture recognition sequentially with three-way decisions; converting the recognized gesture into semantics and operating the human-computer interaction interface according to the conversion result; and obtaining the optimal granularity by weighted summation and using it as the finest granularity. The method and system not only accurately recognize gestures in the cockpit and execute gesture commands, but also reduce the interaction time of the cockpit human-computer interaction interface and provide users with a more comfortable interaction experience.

Description

Method and system for optimizing human-computer interaction interface of intelligent cabin based on three decisions
Technical Field
The invention belongs to the field of intelligent driving, and particularly relates to a method and a system for optimizing a human-computer interaction interface of an intelligent cabin based on three decisions.
Background
With the development of artificial intelligence and deep learning techniques, intelligent driving has attracted wide attention. Gesture recognition, one of the typical human-machine interaction modes in intelligent driving, is very important for the optimal design of the human-machine interaction (HMI) interface in a cockpit. Accurate and fast gesture recognition not only provides a more comfortable interactive experience but also improves driver safety.
Current gesture recognition methods fall into two main categories: those based on sensor devices and those based on computer vision. Although the former achieve better recognition rates, they are costly and their interaction experience cannot meet current requirements. The latter acquire gesture images more easily and include methods based on template matching, geometric feature extraction, hidden Markov models and neural networks; these still suffer from low recognition accuracy or low recognition speed and cannot satisfy the current demand for accurate, real-time gesture recognition. Low accuracy arises mainly because gesture features are not extracted well, while low speed arises mainly because the models are too complex, and existing methods cannot solve both problems at once.
Disclosure of Invention
To address these problems, the invention exploits the feature extraction capability of deep neural networks and combines a multi-granularity information expression with the idea of three-way decisions, selecting an appropriate granularity to resolve the trade-off between low gesture recognition accuracy and low recognition speed.
The invention provides a method for optimizing a man-machine interaction interface of an intelligent cabin based on three decisions, which comprises the following steps:
s1, acquiring a gesture video in the cabin, and preprocessing the gesture video to obtain a static gesture image;
s2, performing segmentation processing on the gestures and the background in the gesture image to obtain a gesture area image;
s3, performing multi-granularity expression on the gesture area image from coarse granularity to fine granularity; extracting multi-granularity characteristics of the gesture area image by using a convolutional neural network;
s4, calculating the conditional probability of classifying each granularity gesture area image into each category from coarse granularity to fine granularity, and sequentially completing gesture recognition by utilizing three decisions;
s5, performing semantic conversion on the recognized gesture, and performing corresponding operation on the human-computer interaction interface according to the gesture recognition result after the semantic conversion;
s6, obtaining the best granularity by adopting a weighted summation mode, and repeatedly executing the steps S3-S5 by taking the best granularity as the finest granularity.
Further, the gesture-area image is expressed at multiple granularities from coarse to fine; for the same gesture-area image, the multi-granularity information expression is:

A1 ⊆ A2 ⊆ … ⊆ An

where Ai denotes the information of the gesture-area image at the ith granularity, A1 the information at the coarsest granularity and An the information at the finest granularity, i.e. finer granularities contain the coarser ones; i = 1, 2, …, n, where n is the number of granularity levels.
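As a purely illustrative reading of this expression (the patent does not prescribe a concrete scheme, and all names below are hypothetical), the chain A1 ⊆ … ⊆ An can be realised as an image pyramid in which each coarser level is a block-averaged view of the same image:

```python
import numpy as np

def multi_granularity_pyramid(img, n=3):
    """Build a coarse-to-fine chain A_1, ..., A_n of a gesture-area image
    by block-averaging: A_1 is the coarsest view, A_n the original image.
    Illustrative sketch only; assumes a square-ish grayscale array."""
    levels = []
    h, w = img.shape
    for i in range(1, n + 1):
        f = 2 ** (n - i)  # downsampling factor: largest at i = 1 (coarsest)
        # average-pool over f x f blocks (crop so dimensions divide evenly)
        cropped = img[:h - h % f, :w - w % f]
        coarse = cropped.reshape(h // f, f, w // f, f).mean(axis=(1, 3))
        levels.append(coarse)
    return levels  # levels[0] coarsest ... levels[-1] finest (= original)

img = np.arange(64, dtype=float).reshape(8, 8)
A = multi_granularity_pyramid(img, n=3)
print([a.shape for a in A])  # [(2, 2), (4, 4), (8, 8)]
```

By construction every coarser level is computable from the finer ones, which is one way to realise "the fine granularity comprises the coarse granularity".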
Further, the extracting the multi-granularity features of the gesture area image by using the convolutional neural network comprises extracting the multi-granularity image features of the gesture image by using different convolutional kernels in the convolutional neural network.
Further, step S4 comprises making a three-way decision on the coarse-grained features extracted from the gesture-area image: if the classification category of the gesture can be determined, no finer-grained feature extraction or further three-way decision is performed; otherwise, finer-grained features are extracted and the three-way decision is repeated until the classification category of the gesture-area image is determined.
Further, the step S6 includes obtaining a final human-computer interaction interface optimization result for each granularity by a weighted summation method, so as to determine a granularity at which the gesture has the best human-computer interaction interface optimization effect;
Result=w×Acc+(1-w)×Time
Time=T1+T2
where Result is the optimization score of the gesture-area image at a given granularity (the granularity with the best score is taken as optimal), Acc denotes gesture recognition accuracy, Time denotes the time spent in the gesture recognition process, w is a weight, T1 is the time to extract the multi-granularity features of the gesture-area image, and T2 is the time to recognize the gesture.
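As an illustrative sketch of this weighted selection, assuming Time has been normalised so that larger is better (e.g. 1 − time/max_time, a detail the patent leaves open), the best granularity could be chosen as follows; all names and numbers are hypothetical:

```python
def granularity_score(acc, time_score, w=0.7):
    """Result = w * Acc + (1 - w) * Time for one granularity.
    `time_score` is assumed already normalised so that larger is better."""
    return w * acc + (1 - w) * time_score

def best_granularity(stats, w=0.7):
    """stats: {granularity: (acc, time_score)}; returns the granularity
    with the highest Result score."""
    return max(stats, key=lambda g: granularity_score(*stats[g], w))

# hypothetical measurements: 5 granularities are a bit more accurate
# but much slower than 3 granularities
stats = {3: (0.92, 0.8), 5: (0.95, 0.2)}
print(best_granularity(stats))  # 3
```

With w = 0.7 the score of 3 granularities (0.884) beats that of 5 (0.725), matching the document's point that a slightly less accurate but much faster granularity can be the better practical choice.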
The invention provides a system for optimizing the human-computer interaction interface of an intelligent cockpit based on three-way decisions, comprising a camera and, electrically connected, a cockpit gesture acquisition module, a gesture image segmentation module, a multi-granularity feature extraction module, a three-way-decision gesture recognition module, a gesture semantic conversion module and an optimal granularity acquisition module;
the cockpit gesture acquisition module acquires a gesture video in the cockpit through a camera and converts a video frame into a series of static gesture images;
the gesture image segmentation module is used for segmenting the gesture and the background of the gesture image to obtain a gesture area image;
the gesture multi-granularity feature extraction module is used for extracting multi-granularity features of the gesture area image from coarse granularity to fine granularity;
the three-decision gesture recognition module is used for carrying out three-decision on the gesture area image in each granularity according to the extracted multi-granularity features so as to classify the gestures;
the gesture semantic conversion module is used for performing semantic conversion on the classified gestures;
the optimal granularity acquisition module is used for acquiring optimal granularity and sending the optimal granularity to the multi-granularity feature extraction module.
Further, the gesture multi-granularity feature extraction module comprises a convolutional neural network unit and extracts multi-granularity image features of the gesture-area image using different convolution kernels in that unit; the multi-granularity information expression is:

A1 ⊆ A2 ⊆ … ⊆ An

where Ai denotes the information of the gesture-area image at the ith granularity, A1 the information at the coarsest granularity and An the information at the finest granularity, i.e. finer granularities contain the coarser ones; i = 1, 2, …, n, where n is the number of granularity levels.
Further, the three-branch decision gesture recognition module performs three-branch decision on coarse-grained features of the gesture area image, if the classification category of the gesture can be determined, fine-grained feature extraction and further three-branch decision are not continued, otherwise, finer-grained features are extracted to perform three-branch decision until the classification category of the gesture area image is determined.
Further, the optimal granularity acquisition module acquires a final human-computer interaction interface optimization result of each granularity by adopting a weighted summation mode so as to determine the optimal granularity of the gesture area image;
Result=w×Acc+(1-w)×Time
Time=T1+T2
where Result is the optimization score of the gesture-area image at a given granularity (the granularity with the best score is taken as optimal), Acc denotes gesture recognition accuracy, Time denotes the time spent in the gesture recognition process, w is a weight, T1 is the time to extract the multi-granularity features of the gesture-area image, and T2 is the time to recognize the gesture.
The invention has the beneficial effects that:
the invention utilizes the thought of 'gradual calculation' in particle calculation to construct a multi-granularity information expression mode for a gesture image, utilizes a convolutional neural network to extract the characteristics of the multi-granularity gesture image, and uses a three-decision method to identify the gesture in each granularity from coarse granularity to fine granularity, then carries out corresponding semantic conversion on the identified gesture, and applies the gesture identification result to HMI interface optimization in a cabin.
The method and the device can utilize the characteristics of the acquired gestures with different granularities and combine three decision-making ideas, the gestures are recognized more accurately, and corresponding semantic operations are executed more quickly, so that the interaction time of the HMI interface of the cockpit can be reduced, and more comfortable interaction experience can be provided for users.
Drawings
FIG. 1 is a schematic flow diagram of the present invention;
FIG. 2 is a schematic diagram of multi-granularity feature extraction employed in the present invention;
FIG. 3 is a flow chart of three-way-decision gesture recognition employed in the present invention;
FIG. 4 is the HMI interface optimization design method employed by the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer and more complete, the technical solutions in the embodiments of the present invention are described below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention.
To better illustrate the specific implementation steps of the method, the following is illustrated by way of example in conjunction with fig. 1:
example 1
The invention comprises the following steps:
s1, acquiring a gesture video in the cabin, and preprocessing the gesture video to obtain a static gesture image;
s2, performing segmentation processing on the gestures and the background in the gesture image to obtain a gesture area image;
s3, performing multi-granularity expression on the gesture area image from coarse granularity to fine granularity; extracting multi-granularity characteristics of the gesture area image by using a convolutional neural network;
s4, calculating the conditional probability of classifying each granularity gesture area image into each category from coarse granularity to fine granularity, and sequentially completing gesture recognition by utilizing three decisions;
s5, performing semantic conversion on the recognized gesture area image, and operating the human-computer interaction interface according to the gesture recognition result after the semantic conversion;
the gesture area image is expressed in a multi-granularity mode from coarse granularity to fine granularity, and for the same gesture area image, the multi-granularity information expression mode is as follows:
Figure BDA0001742025310000051
wherein A isiInformation representing different granularities of images of gesture areas, A1Information indicating the coarse granularity of the image of the gesture area, AnInformation indicating that the gesture area image is in a fine granularity, namely the fine granularity comprises a coarse granularity; i 1,2, n, n represents the particle size.
Extracting the multi-granularity features of the gesture-area image with the convolutional neural network comprises extracting multi-granularity image features of the gesture image using different convolution kernels in the network. As shown in fig. 2, the convolutional neural network (CNN) extracts features of the gesture-area image at n granularities, from coarse-grained to fine-grained.
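A minimal sketch of the idea that different kernel sizes yield responses of different granularities, here with fixed averaging kernels standing in for the trained kernels of the patent's CNN (function names are hypothetical):

```python
import numpy as np

def conv2d_valid(img, kernel):
    """Plain 'valid' 2-D correlation (no padding), for illustration only."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def multi_granularity_features(img, kernel_sizes=(7, 5, 3)):
    """Larger kernels produce coarser response maps, smaller kernels finer
    ones. Simple averaging kernels are used here; a trained CNN would
    learn its kernels instead."""
    return [conv2d_valid(img, np.ones((k, k)) / (k * k))
            for k in kernel_sizes]

img = np.random.default_rng(0).random((16, 16))
feats = multi_granularity_features(img)
print([f.shape for f in feats])  # [(10, 10), (12, 12), (14, 14)]
```

The list is ordered coarse to fine, mirroring the n granularity levels of fig. 2.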
Further, step S4 comprises making a three-way decision on the coarse-grained features of the gesture-area image: if the classification category of the gesture can be determined, no finer-grained feature extraction or further three-way decision is performed; otherwise, finer-grained features are extracted and the three-way decision is repeated until the classification category of the gesture-area image is determined.
The flow chart of the three-branch decision is shown in fig. 3, and the input data set is used for extracting the multi-granularity features of the gesture area image, calculating the conditional probability and performing the three-branch decision.
A softmax function is selected to compute the conditional probability; the conditional probability of classifying gesture x into category j is:

p(j | x) = exp(θj·x) / Σl=1..k exp(θl·x)

where l = 1, 2, …, k, k is the total number of categories of gesture-area images, and θ is the parameter vector.
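The softmax computation above can be sketched as follows (a numerically stabilised variant; the parameter shapes and toy values are assumptions, not from the patent):

```python
import numpy as np

def softmax_probs(theta, x):
    """Conditional probabilities p(j|x) = exp(theta_j . x) / sum_l exp(theta_l . x)
    for l = 1..k. The max-shift leaves the probabilities unchanged but
    avoids overflow for large logits."""
    logits = theta @ x       # theta: (k, d) row-per-category parameter vectors
    logits -= logits.max()   # numerical stability shift
    e = np.exp(logits)
    return e / e.sum()

theta = np.array([[1.0, 0.0],   # k = 3 toy categories, d = 2 features
                  [0.0, 1.0],
                  [0.5, 0.5]])
x = np.array([2.0, 1.0])
p = softmax_probs(theta, x)
print(round(p.sum(), 6))  # 1.0
```

The resulting vector p supplies the conditional probabilities that the three-way decision thresholds are compared against.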
The three-way decision model uses a group of decision thresholds α, β and γ to divide gesture objects into a positive region (POS), a boundary region (BND) and a negative region (NEG). For the positive and negative regions, acceptance and rejection rules directly yield the gesture recognition result; for the boundary region, the decision is deferred and the three-way decision is applied again when more information becomes available at a finer granularity.
The expressions for the positive, boundary and negative domains are as follows:
POS(α,β)={x∈U|p(X|[x])≥α}
BND(α,β)={x∈U|β<p(X|[x])<α}
NEG(α,β)={x∈U|p(X|[x])≤β}
where p(X|[x]) is the conditional probability of classification and [x] is the equivalence class containing x.
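The three region definitions translate directly into a small decision function (a sketch; names are hypothetical):

```python
def three_way_decide(p, alpha, beta):
    """Assign an object to POS/BND/NEG from its conditional probability p,
    following the region definitions above. Assumes beta < alpha."""
    if p >= alpha:
        return "POS"   # accept: classify the gesture now
    if p <= beta:
        return "NEG"   # reject this category
    return "BND"       # defer: wait for finer-grained features

print(three_way_decide(0.5, alpha=0.8, beta=0.3))  # BND
```

Only objects landing in BND trigger the extraction of finer-grained features, which is what saves time on easy cases.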
The thresholds αi, βi, γi of the three-way decision are computed as follows:

αi = (λPN(i) − λBN(i)) / ((λPN(i) − λBN(i)) + (λBP(i) − λPP(i)))
βi = (λBN(i) − λNN(i)) / ((λBN(i) − λNN(i)) + (λNP(i) − λBP(i)))
γi = (λPN(i) − λNN(i)) / ((λPN(i) − λNN(i)) + (λNP(i) − λPP(i)))

where λPP(i), λBP(i) and λNP(i) denote the losses of taking the accept, delay and reject decisions, respectively, when the ith-granularity gesture x belongs to category X, and λPN(i), λBN(i) and λNN(i) denote the corresponding losses when x does not belong to category X; the loss functions at each granularity are given by experts according to experience.
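Assuming the standard three-way decision threshold formulas, the computation from a set of expert-given losses can be sketched as follows (variable names are hypothetical and the loss values are made-up examples, not from the patent):

```python
def three_way_thresholds(l_pp, l_bp, l_np_, l_pn, l_bn, l_nn):
    """alpha, beta, gamma from the six decision losses at one granularity.
    Assumes the usual ordering l_pp <= l_bp < l_np_ and l_nn <= l_bn < l_pn,
    which guarantees 0 <= beta < gamma < alpha <= 1."""
    alpha = (l_pn - l_bn) / ((l_pn - l_bn) + (l_bp - l_pp))
    beta  = (l_bn - l_nn) / ((l_bn - l_nn) + (l_np_ - l_bp))
    gamma = (l_pn - l_nn) / ((l_pn - l_nn) + (l_np_ - l_pp))
    return alpha, beta, gamma

# hypothetical losses: accepting a wrong gesture (l_pn = 10) is costly,
# deferring is cheap (l_bp = 2, l_bn = 3)
a, b, g = three_way_thresholds(0, 2, 8, 10, 3, 0)
print(a > g > b)  # True
```

Because deferral losses sit between the accept/reject losses, beta < gamma < alpha holds and the boundary region is non-empty.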
The multi-granularity three-way decision thresholds are set so that finer-grained decisions are made only when necessary or beneficial. This provides the basis for setting the thresholds at the different granularities: coarser granularities select a larger acceptance threshold and a smaller rejection threshold. With i = 1, 2, …, n−1 denoting the sequence from coarse to fine, the thresholds at different granularities satisfy:

0 ≤ βi < αi ≤ 1, 1 ≤ i < n,
β1 ≤ β2 ≤ … ≤ βi < αi ≤ … ≤ α2 ≤ α1
At the finest granularity i = n, the three-way decision reduces to a two-way decision, and the decision threshold is computed as:

αn = βn = γn = (λPN(n) − λNN(n)) / ((λPN(n) − λNN(n)) + (λNP(n) − λPP(n)))
the three-branch decision is a decision mode conforming to human thinking, and compared with the traditional two-branch decision, a choice of no commitment is added, namely, a third delay decision is adopted when the information is not enough to be accepted or rejected. The two-branch decision making process is quick and simple, but the three-branch decision making is more suitable when the obtained information is insufficient or the obtained information needs a certain cost. The purpose of selecting three decisions for gesture recognition is that time spent for acquiring gesture features of different granularities is different, and for HMI interface operation with high real-time requirement, it is very necessary to consider time cost. In three-branch decision gesture recognition, the key steps are extracting multi-granularity features, and calculating threshold value pairs and conditional probabilities of three-branch decisions.
Example 2
On the basis of steps S1-S5, this embodiment adds step S6: the optimal granularity is obtained by weighted summation, and steps S3-S5 are repeated with the optimal granularity as the finest granularity.
The HMI interface optimization design method is shown in fig. 4. The final human-computer interaction interface optimization result for each granularity is obtained by weighted summation so as to determine the optimal granularity of the gesture-area image; taking the optimal granularity as the finest granularity, the convolutional neural network then extracts the multi-granularity features of a new gesture and the three-way decisions are made in sequence;
Result=w×Acc+(1-w)×Time
Time=T1+T2
where Result is the optimization score of the gesture-area image at a given granularity (the granularity with the best score is taken as optimal), Acc denotes gesture recognition accuracy, Time denotes the time spent in the gesture recognition process, w is a weight, T1 is the time to extract the multi-granularity features of the gesture-area image, and T2 is the time to recognize the gesture.
Compared with Embodiment 1, this embodiment saves time and has lower computational complexity. For example, suppose that in Embodiment 1, without using the optimal granularity, extracting features at 5 granularities takes 100 units of time; if recognition with 3 granularities is known to be only slightly worse than with 5 granularities but takes only 40 units of time, then, weighing accuracy and time comprehensively, 3 granularities is more suitable for practical application than 5.
After the optimal granularity is computed, it is used as the finest granularity for subsequent gesture image processing. Features extracted at different granularities carry different amounts of information and therefore yield different recognition results, and fine-grained feature extraction takes more time than coarse-grained extraction. By weighting gesture recognition accuracy against recognition time, the most appropriate granularity can be selected for gesture feature extraction so as to meet the gesture-based HMI interface optimization design target in the cockpit.
The invention provides a system for optimizing the human-computer interaction interface of an intelligent cockpit based on three-way decisions, comprising a camera and, electrically connected, a cockpit gesture acquisition module, a gesture image segmentation module, a multi-granularity feature extraction module, a three-way-decision gesture recognition module, a gesture semantic conversion module and an optimal granularity acquisition module;
the cockpit gesture acquisition module acquires a gesture video in the cockpit through a camera and converts a video frame into a series of static gesture images;
the gesture image segmentation module is used for segmenting the gesture and the background of the gesture image to obtain a gesture area image;
the gesture multi-granularity feature extraction module is used for extracting multi-granularity features of the gesture area image from coarse granularity to fine granularity;
the three-decision gesture recognition module is used for carrying out three-decision on the gesture area image in each granularity according to the extracted multi-granularity features so as to classify the gestures;
the gesture semantic conversion module is used for performing semantic conversion on the classified gestures;
the optimal granularity acquisition module is used for acquiring optimal granularity and sending the optimal granularity to the multi-granularity feature extraction module.
Further, the gesture multi-granularity feature extraction module comprises a convolutional neural network unit and extracts multi-granularity image features of the gesture-area image using different convolution kernels in that unit; the multi-granularity information expression is:

A1 ⊆ A2 ⊆ … ⊆ An

where Ai denotes the information of the gesture-area image at the ith granularity, A1 the information at the coarsest granularity and An the information at the finest granularity, i.e. finer granularities contain the coarser ones; i = 1, 2, …, n, where n is the number of granularity levels.
Further, the three-branch decision gesture recognition module performs three-branch decision on coarse-grained features of the gesture area image, if the classification type of the gesture can be determined, fine-grained feature extraction and further three-branch decision are not continued, otherwise, more fine-grained features are extracted to perform three-branch decision until the classification type of the gesture area image is determined.
Further, the optimal granularity acquisition module acquires a final human-computer interaction interface optimization result of each granularity by adopting a weighted summation mode so as to determine the optimal granularity of the gesture area image;
Result=w×Acc+(1-w)×Time
Time=T1+T2
where Result is the optimization score of the gesture-area image at a given granularity (the granularity with the best score is taken as optimal), Acc denotes gesture recognition accuracy, Time denotes the time spent in the gesture recognition process, w is a weight, T1 is the time to extract the multi-granularity features of the gesture-area image, and T2 is the time to recognize the gesture.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be performed by hardware associated with program instructions, and that the program may be stored in a computer-readable storage medium, which may include ROM, RAM, magnetic disks, optical disks, and the like.
The above embodiments further illustrate the objects, technical solutions and advantages of the present invention. It should be understood that they are only preferred embodiments and are not intended to limit the invention; any modifications, equivalents, improvements, etc. made within the spirit and principles of the present invention shall be included within its scope of protection.

Claims (9)

1. A method for optimizing a human-computer interaction interface of an intelligent cabin based on three decisions is characterized by comprising the following steps:
s1, acquiring gesture video in the cabin, and preprocessing it to obtain a series of static gesture images;
s2, performing segmentation processing on the gestures and the background in the gesture image to obtain a gesture area image;
s3, performing multi-granularity expression on the gesture area image from coarse granularity to fine granularity; extracting multi-granularity characteristics of the gesture area image by using a convolutional neural network;
s4, calculating the conditional probability of classifying each granularity gesture area image into each category from coarse granularity to fine granularity, and sequentially completing gesture recognition by utilizing three decisions;
s5, performing semantic conversion on the recognized gesture, and performing corresponding operation on the human-computer interaction interface according to the gesture recognition result after the semantic conversion;
s6, obtaining the best granularity by adopting a weighted summation mode, and repeatedly executing the steps S3-S5 by taking the best granularity as the finest granularity.
2. The method for optimizing the human-computer interaction interface of the intelligent cockpit based on three-way decisions according to claim 1, wherein the gesture-area image is expressed at multiple granularities from coarse to fine, and for the same gesture-area image the multi-granularity information expression is:

A1 ⊆ A2 ⊆ … ⊆ An

where Ai denotes the information of the gesture-area image at the ith granularity, A1 the information at the coarsest granularity and An the information at the finest granularity, i.e. finer granularities contain the coarser ones; i = 1, 2, …, n, where n is the number of granularity levels.
3. The method for optimizing the human-computer interaction interface of the intelligent cockpit based on the three-branch decision as claimed in claim 2, wherein the extracting the multi-granularity features of the image of the gesture area by using the convolutional neural network comprises extracting the multi-granularity features of the image of the gesture area by using different convolutional kernels in the convolutional neural network.
4. The method for optimizing the human-computer interaction interface of the intelligent cockpit based on the three-branch decision as claimed in claim 1, wherein the step S4 includes performing three-branch decision from coarse-grained features of the gesture area image, if the classification category of the gesture can be determined, the fine-grained feature extraction and the further three-branch decision are not continued, otherwise, the finer-grained features are extracted to perform the three-branch decision until the classification category of the gesture area image is determined.
5. The method for optimizing the human-computer interaction interface of the intelligent cockpit according to claim 1, wherein the step S6 includes obtaining the final human-computer interaction interface optimization result for each granularity by means of weighted summation, so as to determine the optimal granularity of the gesture area image;
Result=w×Acc+(1-w)×Time
Time=T1+T2
wherein Result is the optimization result at a given granularity, used to determine the optimal granularity of the gesture area image; Acc represents the gesture recognition accuracy; Time represents the time spent in the gesture recognition process; w represents the weight; T1 represents the time to extract the multi-granularity features of the gesture area image; and T2 represents the time to recognize the gesture.
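The weighted-sum selection of the optimal granularity can be sketched as below. One assumption is added that the claim does not specify: the raw elapsed time T1 + T2 is normalized into a "faster is better" score in [0, 1] before entering the sum, so that a larger Result is better at every granularity.

```python
def level_result(acc, time_score, w=0.7):
    # Literal form of the claim's formula: Result = w*Acc + (1-w)*Time.
    # Assumption: Time enters as a normalized speed score in [0, 1].
    return w * acc + (1 - w) * time_score

def optimal_granularity(levels, w=0.7):
    """levels: list of (Acc, T1, T2) tuples, one per granularity, coarse to fine.
    Returns the 1-based index of the granularity with the best Result."""
    raw_times = [t1 + t2 for _, t1, t2 in levels]    # Time = T1 + T2 per the claim
    t_max = max(raw_times)
    scores = [level_result(acc, 1.0 - t / t_max, w)  # normalization is an assumption
              for (acc, _, _), t in zip(levels, raw_times)]
    return scores.index(max(scores)) + 1
```

With this scoring, a mid-level granularity can beat the finest one when the accuracy gain of finer features no longer justifies their extraction time.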
6. A system for optimizing a human-computer interaction interface of an intelligent cabin based on three decisions is characterized by comprising a camera, a cabin gesture acquisition module, a gesture image segmentation module, a multi-granularity feature extraction module, a three-decision gesture recognition module, a gesture semantic conversion module and an optimal granularity acquisition module which are electrically connected;
the cockpit gesture acquisition module acquires a gesture video in the cockpit through a camera and converts a video frame into a series of static gesture images;
the gesture image segmentation module is used for segmenting the gesture and the background of the gesture image to obtain a gesture area image;
the gesture multi-granularity feature extraction module is used for extracting multi-granularity features of the gesture area image from coarse granularity to fine granularity;
the three-way decision gesture recognition module is used for performing a three-way decision on the gesture area image at each granularity according to the extracted multi-granularity features, so as to classify the gesture;
the gesture semantic conversion module is used for performing semantic conversion on the classified gestures;
the optimal granularity acquisition module is used for acquiring optimal granularity and sending the optimal granularity to the multi-granularity feature extraction module.
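The module chain of this system claim can be wired as a single pipeline, sketched below. All class and function names are invented for illustration; the patent does not prescribe an implementation.

```python
class CockpitGesturePipeline:
    """Minimal sketch of the claim-6 module chain for one video frame."""

    def __init__(self, segment, extractors, decide, to_command):
        self.segment = segment        # gesture image segmentation module
        self.extractors = extractors  # multi-granularity feature extraction, coarse -> fine
        self.decide = decide          # three-way decision gesture recognition module
        self.to_command = to_command  # gesture semantic conversion module

    def process_frame(self, frame):
        region = self.segment(frame)
        for extract in self.extractors:           # coarse to fine
            label = self.decide(extract(region))  # None means "defer to finer granularity"
            if label is not None:
                return self.to_command(label)
        return None  # no confident decision even at the finest granularity
```

The optimal granularity acquisition module would then trim `extractors` so that later frames stop at the level selected by the weighted-sum criterion.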
7. The system for optimizing the human-computer interaction interface of the intelligent cabin based on the three-branch decision as claimed in claim 6, wherein the gesture multi-granularity feature extraction module comprises a convolutional neural network unit, and extracts multi-granularity image features of a gesture area image by using different convolutional kernels in the convolutional neural network unit; the multi-granularity information representation mode is specifically
A1 ⊆ A2 ⊆ … ⊆ An
wherein Ai represents the information of the gesture area image at granularity i, A1 represents the information of the gesture area image at the coarsest granularity, and An represents the information of the gesture area image at the finest granularity; that is, the fine granularity contains the coarse granularity; i = 1, 2, …, n, where n represents the number of granularity levels.
8. The system for optimizing the human-computer interaction interface of the intelligent cockpit according to claim 6, wherein the three-way decision gesture recognition module performs a three-way decision starting from the coarse-grained features of the gesture area image: if the classification category of the gesture can be determined, no finer-grained feature extraction or further three-way decision is performed; otherwise, finer-grained features are extracted for a further three-way decision, until the classification category of the gesture area image is determined.
9. The system for optimizing the human-computer interaction interface of the intelligent cockpit according to claim 6, wherein the optimal granularity obtaining module obtains the final human-computer interaction interface optimization result of each granularity by adopting a weighted summation mode, so as to determine the optimal granularity of the gesture area image;
Result=w×Acc+(1-w)×Time
Time=T1+T2
wherein Result is the optimization result at a given granularity, used to determine the optimal granularity of the gesture area image; Acc represents the gesture recognition accuracy; Time represents the time spent in the gesture recognition process; w represents the weight; T1 represents the time to extract the multi-granularity features of the gesture area image; and T2 represents the time to recognize the gesture.
CN201810823980.3A 2018-07-25 2018-07-25 Method and system for optimizing human-computer interaction interface of intelligent cabin based on three decisions Active CN109101108B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810823980.3A CN109101108B (en) 2018-07-25 2018-07-25 Method and system for optimizing human-computer interaction interface of intelligent cabin based on three decisions

Publications (2)

Publication Number Publication Date
CN109101108A CN109101108A (en) 2018-12-28
CN109101108B true CN109101108B (en) 2021-06-18

Family

ID=64847467

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810823980.3A Active CN109101108B (en) 2018-07-25 2018-07-25 Method and system for optimizing human-computer interaction interface of intelligent cabin based on three decisions

Country Status (1)

Country Link
CN (1) CN109101108B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109820479B (en) * 2019-01-08 2021-08-27 西北大学 Fluorescence molecular tomography feasible region optimization method
CN109816022A (en) * 2019-01-29 2019-05-28 重庆市地理信息中心 A kind of image-recognizing method based on three decisions and CNN
CN110298263B (en) * 2019-06-10 2023-05-30 中南大学 Real-time accurate and contactless gesture recognition method and system based on RFID system
CN110458233B (en) * 2019-08-13 2024-02-13 腾讯云计算(北京)有限责任公司 Mixed granularity object recognition model training and recognition method, device and storage medium
CN111046732B (en) * 2019-11-11 2023-11-28 华中师范大学 Pedestrian re-recognition method based on multi-granularity semantic analysis and storage medium
CN111104339B (en) * 2019-12-31 2023-06-16 上海艺赛旗软件股份有限公司 Software interface element detection method, system, computer equipment and storage medium based on multi-granularity learning
CN111814737B (en) * 2020-07-27 2022-02-18 西北工业大学 Target intention identification method based on three sequential decisions
CN112580785B (en) * 2020-12-18 2022-04-05 河北工业大学 Neural network topological structure optimization method based on three-branch decision

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101639351B1 (en) * 2015-01-15 2016-07-13 주식회사 엔씨소프트 Wearable input system and method for recognizing motion
CN107578023A (en) * 2017-09-13 2018-01-12 华中师范大学 Man-machine interaction gesture identification method, apparatus and system
CN107958255A (en) * 2017-11-21 2018-04-24 中国科学院微电子研究所 Target detection method and device based on image

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"Research on Tactile Gesture Recognition for Robots Based on Three-Way Decisions"; Shi Yubin; China Master's Theses Full-text Database, Information Science and Technology; No. 04, 15 Apr 2018; pp. 9-11 of the text *

Also Published As

Publication number Publication date
CN109101108A (en) 2018-12-28

Similar Documents

Publication Publication Date Title
CN109101108B (en) Method and system for optimizing human-computer interaction interface of intelligent cabin based on three decisions
Zhang et al. Pedestrian detection method based on Faster R-CNN
CN104361313B (en) A kind of gesture identification method merged based on Multiple Kernel Learning heterogeneous characteristic
CN110796199B (en) Image processing method and device and electronic medical equipment
CN108829677A (en) A kind of image header automatic generation method based on multi-modal attention
CN110781829A (en) Light-weight deep learning intelligent business hall face recognition method
CN104504383B (en) A kind of method for detecting human face based on the colour of skin and Adaboost algorithm
Sajanraj et al. Indian sign language numeral recognition using region of interest convolutional neural network
CN111160407A (en) Deep learning target detection method and system
Yasir et al. Two-handed hand gesture recognition for Bangla sign language using LDA and ANN
Patil et al. Indian sign language recognition using convolutional neural network
CN113255602A (en) Dynamic gesture recognition method based on multi-modal data
WO2024016812A1 (en) Microscopic image processing method and apparatus, computer device, and storage medium
CN104156690A (en) Gesture recognition method based on image space pyramid bag of features
Rathi et al. Development of full duplex intelligent communication system for deaf and dumb people
WO2011096010A1 (en) Pattern recognition device
CN114492634A (en) Fine-grained equipment image classification and identification method and system
CN103942572A (en) Method and device for extracting facial expression features based on bidirectional compressed data space dimension reduction
Huang et al. Receptive field fusion RetinaNet for object detection
CN114898464B (en) Lightweight accurate finger language intelligent algorithm identification method based on machine vision
CN114067359B (en) Pedestrian detection method integrating human body key points and visible part attention characteristics
CN113705489B (en) Remote sensing image fine-granularity airplane identification method based on priori regional knowledge guidance
Heer et al. An improved hand gesture recognition system based on optimized msvm and sift feature extraction algorithm
CN110688880A (en) License plate identification method based on simplified ResNet residual error network
Yuan et al. Research on vehicle detection algorithm of driver assistance system based on vision

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant