CN109101108A - Method and system for optimizing an intelligent cockpit human-computer interaction interface based on three-way decisions - Google Patents

Method and system for optimizing an intelligent cockpit human-computer interaction interface based on three-way decisions

Info

Publication number
CN109101108A
CN109101108A (application CN201810823980.3A); granted as CN109101108B
Authority
CN
China
Prior art keywords
gesture
granularity
area image
computer interaction
gesture area
Prior art date
Legal status
Granted
Application number
CN201810823980.3A
Other languages
Chinese (zh)
Other versions
CN109101108B (en)
Inventor
刘群
张刚强
王如琪
Current Assignee
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications
Priority to CN201810823980.3A
Publication of CN109101108A
Application granted
Publication of CN109101108B
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06F3/017 Gesture based interaction, e.g. based on a set of recognized hand gestures
    • G06F3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06V40/113 Recognition of static hand signs
    • G06V40/28 Recognition of hand or arm movements, e.g. recognition of deaf sign language

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention belongs to the field of intelligent driving and provides a method and system for optimizing an intelligent cockpit human-computer interaction interface based on three-way decisions. The method comprises: acquiring gesture video in the cockpit and pre-processing it to obtain gesture images; segmenting the gesture from the background in the gesture images to obtain gesture region images; building a multi-granularity representation of the gesture region image and extracting its multi-granularity features with a convolutional neural network; computing, from coarse granularity to fine granularity, the conditional probability that the gesture region image at each granularity belongs to each class, and completing gesture recognition through sequential three-way decisions; converting the recognized gesture into semantics and operating the human-computer interaction interface according to the result of the semantic conversion; and obtaining the optimal granularity by weighted summation, which is then used as the finest granularity. The invention not only recognizes in-cockpit gestures more accurately and executes gesture commands, but also reduces the interaction time of the cockpit human-computer interaction interface, providing the user with a more comfortable interactive experience.

Description

Method and system for optimizing an intelligent cockpit human-computer interaction interface based on three-way decisions
Technical field
The invention belongs to the field of intelligent driving, and more particularly relates to a method and system for optimizing an intelligent cockpit human-computer interaction interface based on three-way decisions.
Background art
With the development of artificial intelligence and deep learning technology, intelligent driving has attracted wide attention. Gesture recognition, as one of the typical human-computer interaction modes in intelligent driving, is crucial to the optimized design of the human-computer interaction (HMI) interface in the cockpit. Accurate and fast gesture recognition not only provides a more comfortable interactive experience but also improves driver safety.
Current gesture recognition methods fall mainly into two categories: sensor-based and computer-vision-based. Although the former achieves a good recognition rate, its cost is high and its interactive experience does not satisfy current demand. Although the latter makes gesture image acquisition easier, existing approaches (based on template matching, geometric feature extraction, hidden Markov models, or neural networks) still suffer from low recognition accuracy or slow recognition speed, and cannot adapt well to the current demand for accurate, real-time gesture recognition. Low model accuracy is mainly caused by failing to extract good gesture features, while slow recognition is mainly caused by overly complex models; existing methods usually cannot solve both problems at the same time.
Summary of the invention
In view of the above problems, the present invention combines the feature extraction ability of deep neural networks with multi-granularity information representation and the idea of three-way decisions; by selecting a suitable granularity, the problems of low gesture recognition accuracy and slow recognition speed can be solved at the same time.
The present invention provides a method for optimizing an intelligent cockpit human-computer interaction interface based on three-way decisions, comprising the following steps:
S1, acquiring gesture video in the cockpit and pre-processing it to obtain static gesture images;
S2, segmenting the gesture from the background in the gesture images to obtain gesture region images;
S3, building a multi-granularity representation of the gesture region image from coarse to fine granularity, and extracting the multi-granularity features of the gesture region image with a convolutional neural network;
S4, computing, from coarse granularity to fine granularity, the conditional probability that the gesture region image at each granularity belongs to each class, and completing gesture recognition through sequential three-way decisions;
S5, converting the recognized gesture into semantics, and performing the corresponding operation on the human-computer interaction interface according to the gesture recognition result after semantic conversion;
S6, obtaining the optimal granularity by weighted summation, taking the optimal granularity as the finest granularity, and repeating steps S3~S5.
Further, the gesture region image is given a multi-granularity representation from coarse to fine granularity; for the same gesture region image, the multi-granularity information representation is:
A1 ⊆ A2 ⊆ … ⊆ An
where Ai denotes the information of the gesture region image at the i-th granularity, A1 denotes the information of the gesture region image at the coarsest granularity, and An denotes the information of the gesture region image at the finest granularity; that is, fine granularity contains coarse granularity. i = 1, 2, …, n, and n denotes the number of granularities.
Further, extracting the multi-granularity features of the gesture region image with a convolutional neural network includes using different convolution kernels in the convolutional neural network to extract the multi-granularity image features of the gesture image.
Further, step S4 includes performing a three-way decision on the coarse-granularity features extracted from the gesture region image: if the class of the gesture can be determined, no finer-granularity feature extraction or further three-way decision is performed; otherwise, finer-granularity features are extracted and a further three-way decision is made, until the class of the gesture region image is determined.
Further, step S6 includes obtaining the final human-computer interaction interface optimization result of each granularity by weighted summation, thereby determining the optimal granularity for the gesture's optimization effect on the human-computer interaction interface:
Result = w × Acc + (1-w) × Time
Time = T1 + T2
where Result is the score used to determine the optimal granularity of the gesture region image, Acc denotes the gesture recognition accuracy, Time denotes the time spent in the gesture recognition process, w denotes the weight, T1 denotes the time to extract the multi-granularity features of the gesture region image, and T2 denotes the time to recognize the gesture.
The present invention further provides a system for optimizing an intelligent cockpit human-computer interaction interface based on three-way decisions, comprising an electrically connected camera, a cockpit gesture acquisition module, a gesture image segmentation module, a multi-granularity feature extraction module, a three-way decision gesture recognition module, a gesture semantic conversion module, and an optimal granularity acquisition module;
the cockpit gesture acquisition module acquires gesture video in the cockpit through the camera and converts the video frames into a series of static gesture images;
the gesture image segmentation module segments the gesture from the background in the gesture images to obtain gesture region images;
the multi-granularity feature extraction module extracts the multi-granularity features of the gesture region image from coarse to fine granularity;
the three-way decision gesture recognition module performs a three-way decision on the gesture region image at each granularity according to the extracted multi-granularity features, thereby classifying the gesture;
the gesture semantic conversion module converts the classified gesture into semantics;
the optimal granularity acquisition module obtains the optimal granularity and sends the optimal granularity to the multi-granularity feature extraction module.
Further, the multi-granularity feature extraction module includes a convolutional neural network unit, and different convolution kernels in the convolutional neural network unit are used to extract the multi-granularity image features of the gesture region image; the multi-granularity information representation is A1 ⊆ A2 ⊆ … ⊆ An, where Ai denotes the information of the gesture region image at the i-th granularity, A1 denotes the information at the coarsest granularity, and An denotes the information at the finest granularity; that is, fine granularity contains coarse granularity. i = 1, 2, …, n, and n denotes the number of granularities.
Further, the three-way decision gesture recognition module includes performing a three-way decision on the coarse-granularity features of the gesture region image: if the class of the gesture can be determined, no finer-granularity feature extraction or further three-way decision is performed; otherwise, finer-granularity features are extracted and a further three-way decision is made, until the class of the gesture region image is determined.
Further, the optimal granularity acquisition module includes obtaining the final human-computer interaction interface optimization result of each granularity by weighted summation, thereby determining the optimal granularity of the gesture region image:
Result = w × Acc + (1-w) × Time
Time = T1 + T2
where Result is the score used to determine the optimal granularity of the gesture region image, Acc denotes the gesture recognition accuracy, Time denotes the time spent in the gesture recognition process, w denotes the weight, T1 denotes the time to extract the multi-granularity features of the gesture region image, and T2 denotes the time to recognize the gesture.
Beneficial effects of the present invention:
Using the idea of step-by-step computation in granular computing, the present invention constructs a multi-granularity information representation for gesture images, extracts the multi-granularity gesture image features with a convolutional neural network, applies the three-way decision method at each granularity from coarse to fine to perform gesture recognition, then performs the corresponding semantic conversion on the recognized gesture, and applies the gesture recognition result to HMI interface optimization in the cockpit.
By combining the features of the gesture at different granularities with the idea of three-way decisions, the present invention recognizes gestures more accurately and executes the corresponding semantic operation faster, which not only reduces the interaction time of the cockpit HMI interface but also provides the user with a more comfortable interactive experience.
Description of the drawings
Fig. 1 is a flow diagram of the method of the present invention;
Fig. 2 is a schematic diagram of the multi-granularity feature extraction used by the present invention;
Fig. 3 is a flow chart of the three-way decision gesture recognition used by the present invention;
Fig. 4 shows the HMI interface optimization design method used by the present invention.
Specific embodiments
In order to make the objectives, technical solutions, and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings; obviously, the described embodiments are only a part of the embodiments of the present invention, not all of them.
In order to better illustrate the specific implementation steps of the method, it is described below in conjunction with Fig. 1 and an example:
Embodiment 1
The present invention comprises the following steps:
S1, acquiring gesture video in the cockpit and pre-processing it to obtain static gesture images;
S2, segmenting the gesture from the background in the gesture images to obtain gesture region images;
S3, building a multi-granularity representation of the gesture region image from coarse to fine granularity, and extracting the multi-granularity features of the gesture region image with a convolutional neural network;
S4, computing, from coarse granularity to fine granularity, the conditional probability that the gesture region image at each granularity belongs to each class, and completing gesture recognition through sequential three-way decisions;
S5, converting the recognized gesture region image into semantics, and operating the human-computer interaction interface according to the gesture recognition result after semantic conversion.
The gesture region image is given a multi-granularity representation from coarse to fine granularity; for the same gesture region image, the multi-granularity information representation is:
A1 ⊆ A2 ⊆ … ⊆ An
where Ai denotes the information of the gesture region image at the i-th granularity, A1 denotes the information of the gesture region image at the coarsest granularity, and An denotes the information of the gesture region image at the finest granularity; that is, fine granularity contains coarse granularity. i = 1, 2, …, n, and n denotes the number of granularities.
The multi-granularity features of the gesture region image are extracted with a convolutional neural network, including using different convolution kernels in the convolutional neural network to extract the multi-granularity image features of the gesture image. As shown in Fig. 2, the features of the n granularities of the gesture region image (from coarse-granularity to fine-granularity features in turn) are extracted with a convolutional neural network (CNN).
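As a concrete illustration of the coarse-to-fine representation above, the following minimal sketch builds multi-granularity views of a gesture region image. The patent does not disclose the CNN architecture, so average pooling stands in for the varigrained convolution kernels here; the function names and pooling factors are illustrative assumptions.

```python
import numpy as np

def avg_pool(img, factor):
    """Average-pool a 2-D image by an integer factor (larger factor = coarser view)."""
    h, w = img.shape
    img = img[:h - h % factor, :w - w % factor]
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def multi_granularity(img, factors=(8, 4, 2, 1)):
    """Return views of the gesture region image from coarse (A1) to fine (An).
    Each finer view carries the information of the coarser ones (A1 ⊆ ... ⊆ An)."""
    return [avg_pool(img, f) for f in factors]

img = np.arange(64 * 64, dtype=float).reshape(64, 64)  # stand-in gesture region image
views = multi_granularity(img)
print([v.shape for v in views])  # coarse 8x8 view up to the fine 64x64 view
```

In a full implementation, each of these views (or the corresponding kernel sizes) would feed the CNN branch for one granularity.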
Further, step S4 includes performing a three-way decision on the coarse-granularity features of the gesture region image: if the class of the gesture can be determined, no finer-granularity feature extraction or further three-way decision is performed; otherwise, finer-granularity features are extracted and a further three-way decision is made, until the class of the gesture region image is determined.
The flow of the three-way decision is shown in Fig. 3: from the input data set, the multi-granularity features of the gesture region image are extracted, the conditional probabilities are computed, and the three-way decisions are made.
The softmax function is selected to compute the conditional probability. The conditional probability that gesture x is classified as class j is:
p(y = j | x) = exp(θj^T x) / Σ_{l=1}^{k} exp(θl^T x)
where l = 1, 2, …, k, k denotes the total number of gesture region image classes, and θ is the parameter vector.
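The softmax conditional probability above can be sketched in a few lines of plain Python; the parameter matrix theta and input x below are illustrative values, not the patent's trained parameters.

```python
import math

def softmax_probs(theta, x):
    """p(y = j | x) for each class j, with one parameter vector theta[j]
    per class (softmax regression over gesture classes)."""
    logits = [sum(t_i * x_i for t_i, x_i in zip(t, x)) for t in theta]
    m = max(logits)                      # subtract max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

theta = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # k = 3 classes, 2-D feature
x = [2.0, 0.0]                                 # stand-in gesture feature vector
p = softmax_probs(theta, x)
print(sum(p))  # probabilities sum to 1 (up to floating point)
```

These per-class probabilities are the p(X | [x]) values fed into the three-way decision below.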
The three-way decision model uses a group of decision thresholds (α, β, γ) to divide gesture objects into a positive region (POS), a boundary region (BND), and a negative region (NEG). The positive and negative regions apply the acceptance and rejection rules respectively and directly yield the gesture recognition result, while the boundary region applies delayed decision, and the three-way decision is continued at a finer granularity once more information has been obtained.
The expressions of the positive, boundary, and negative regions are as follows:
POS(α,β) = {x ∈ U | p(X | [x]) ≥ α}
BND(α,β) = {x ∈ U | β < p(X | [x]) < α}
NEG(α,β) = {x ∈ U | p(X | [x]) ≤ β}
where p(X | [x]) is the conditional probability of the class, and [x] is the equivalence class containing x.
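The acceptance/delay/rejection rule defined by the three regions can be sketched as a single function; it assumes one class probability p and a threshold pair (alpha, beta), and the function name is an illustrative choice.

```python
def three_way_decide(p, alpha, beta):
    """Classic three-way decision rule: accept (POS) when p >= alpha,
    reject (NEG) when p <= beta, otherwise defer (BND)."""
    assert 0 <= beta < alpha <= 1
    if p >= alpha:
        return "POS"   # accept: recognized as this gesture class
    if p <= beta:
        return "NEG"   # reject: not this gesture class
    return "BND"       # boundary: delay, wait for finer-granularity features

print(three_way_decide(0.9, 0.8, 0.2))  # POS
```

A BND outcome is what triggers extraction of the next, finer granularity in the recognition flow of Fig. 3.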
The thresholds αi, βi, γi of the three-way decision at each granularity are computed as follows:
αi = (λPN^i - λBN^i) / ((λPN^i - λBN^i) + (λBP^i - λPP^i))
βi = (λBN^i - λNN^i) / ((λBN^i - λNN^i) + (λNP^i - λBP^i))
γi = (λPN^i - λNN^i) / ((λPN^i - λNN^i) + (λNP^i - λPP^i))
where λPP^i, λBP^i, and λNP^i respectively denote the loss functions for taking the acceptance, delay, and rejection decisions when the gesture x at the i-th granularity belongs to class X, and λPN^i, λBN^i, and λNN^i respectively denote the loss functions for taking the acceptance, delay, and rejection decisions when the gesture x at the i-th granularity does not belong to class X. The loss functions of each granularity are given empirically by experts.
The principle for setting the multi-granularity three-way decision thresholds is that a finer-granularity decision is made only when it is necessary or beneficial to the decision. This provides the basis for setting the three-way decision thresholds at the different granularities: the coarser the granularity, the larger the acceptance threshold and the smaller the rejection threshold. With i = 1, 2, …, n-1 denoting the order from coarse to fine granularity, the thresholds of the different granularities satisfy:
0 ≤ βi < αi ≤ 1, 1 ≤ i < n,
β1 ≤ β2 ≤ … ≤ βi < αi ≤ … ≤ α2 ≤ α1.
At the finest granularity i = n, the three-way decision degenerates into a two-way decision, and the decision thresholds are computed as αn = βn = γn, so that the boundary region vanishes and every gesture is either accepted or rejected.
Three-way decision is a decision mode that conforms to human thinking: compared with the traditional two-way decision it adds a non-commitment option, i.e. the third, delayed decision is taken when the information is insufficient to accept or reject. A two-way decision process is fast and concise, but a three-way decision is more suitable when the information obtained is insufficient or obtaining information incurs a certain cost. The purpose of using three-way decisions for gesture recognition is precisely that the time required to obtain gesture features at different granularities differs; for HMI interface operations with very high real-time requirements, considering the time cost is necessary. In three-way decision gesture recognition, the key steps are extracting the multi-granularity features, and computing the three-way decision threshold pairs and the conditional probabilities.
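The sequential coarse-to-fine procedure described above can be sketched as follows. The class names, probabilities, and thresholds are illustrative, and the last granularity uses alpha == beta so the decision degenerates into a two-way decision, as stated in the text.

```python
def sequential_recognition(probs_per_granularity, thresholds):
    """Sequential three-way decisions from coarse to fine granularity.
    probs_per_granularity: one {class: conditional probability} dict per
    granularity; thresholds: one (alpha, beta) pair per granularity, with
    alpha == beta at the last granularity (two-way decision).
    Returns the accepted class, or None if the gesture is rejected."""
    for probs, (alpha, beta) in zip(probs_per_granularity, thresholds):
        best = max(probs, key=probs.get)
        p = probs[best]
        if p >= alpha:
            return best        # positive region: accept at this granularity
        if p <= beta:
            return None        # negative region: reject
        # beta < p < alpha: boundary region, delay and move to finer features
    return None                # exhausted all granularities without acceptance

# Illustrative run: the coarse granularity defers, the fine granularity accepts.
coarse_fine_probs = [{"swipe": 0.6, "pinch": 0.4}, {"swipe": 0.95, "pinch": 0.05}]
thresholds = [(0.9, 0.1), (0.5, 0.5)]
print(sequential_recognition(coarse_fine_probs, thresholds))  # swipe
```

Because the coarse granularity only resolves easy gestures, expensive fine-granularity feature extraction is paid for only when the coarse decision falls in the boundary region.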
Embodiment 2
On the basis of steps S1~S5, this embodiment adds step S6: the optimal granularity is obtained by weighted summation and then used as the finest granularity, and steps S3~S5 are repeated.
As shown in Fig. 4, the HMI interface optimization design method using weighted summation obtains the final human-computer interaction interface optimization result of each granularity, thereby determining the optimal granularity of the gesture region image; with this optimal granularity as the finest granularity, the multi-granularity features are extracted with the convolutional neural network, and sequential three-way decisions are performed on new gestures;
Result = w × Acc + (1-w) × Time
Time = T1 + T2
where Result is the score used to determine the optimal granularity of the gesture region image, Acc denotes the gesture recognition accuracy, Time denotes the time spent in the gesture recognition process, w denotes the weight, T1 denotes the time to extract the multi-granularity features of the gesture region image, and T2 denotes the time to recognize the gesture.
Compared with Embodiment 1, this embodiment saves more time and has lower computational complexity. For example, suppose that in Embodiment 1, without using the optimal granularity, features of 5 granularities are extracted at a time cost of 100, while it is known that the recognition effect of 3 granularities is only slightly worse than that of 5 granularities but the time cost is 40; then, considering both factors, a finest granularity of 3 is more suitable for practical application than 5.
After the optimal granularity is found by the human-computer interaction interface optimization design, it serves as the finest granularity for subsequent gesture image processing. Since the information content of the features extracted at different granularities differs, different recognition results are obtained, and fine-granularity feature extraction costs more time than coarse-granularity extraction. By weighting gesture recognition accuracy against recognition time, a most suitable granularity can be selected for gesture feature extraction, so as to meet the gesture-based HMI interface optimization design objective in the cockpit.
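The weighted-sum selection of the optimal granularity can be sketched as below. The text does not specify how Acc and Time are made commensurable in Result = w × Acc + (1-w) × Time, so this sketch assumes the time term is first normalized into a benefit score (shorter total time scores higher); the accuracy and time values are illustrative, loosely mirroring the 5-granularity (time 100) versus 3-granularity (time 40) example above.

```python
def optimal_granularity(acc, times, w=0.8):
    """Score each candidate finest granularity by Result = w*Acc + (1-w)*TimeScore
    and return the index of the best one. The time term is normalized into a
    benefit score in [0, 1] (shorter total time T1 + T2 scores higher); this
    normalization is an assumption, since the text leaves the scaling unspecified."""
    t_max = max(times)
    results = [w * a + (1 - w) * (1 - t / t_max) for a, t in zip(acc, times)]
    return max(range(len(results)), key=results.__getitem__)

# Illustrative values: extracting 5 granularities costs time 100 for slightly
# better accuracy, while 3 granularities cost time 40.
acc = [0.80, 0.85, 0.93, 0.94, 0.95]
times = [20, 30, 40, 70, 100]
print(optimal_granularity(acc, times) + 1)  # 3 (1-based: three granularities win)
```

With w = 0.8 the small accuracy gain of granularities 4 and 5 does not offset their time cost, so granularity 3 is chosen as the finest granularity for subsequent processing.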
The present invention further provides a system for optimizing an intelligent cockpit human-computer interaction interface based on three-way decisions, comprising an electrically connected camera, a cockpit gesture acquisition module, a gesture image segmentation module, a multi-granularity feature extraction module, a three-way decision gesture recognition module, a gesture semantic conversion module, and an optimal granularity acquisition module;
the cockpit gesture acquisition module acquires gesture video in the cockpit through the camera and converts the video frames into a series of static gesture images;
the gesture image segmentation module segments the gesture from the background in the gesture images to obtain gesture region images;
the multi-granularity feature extraction module extracts the multi-granularity features of the gesture region image from coarse to fine granularity;
the three-way decision gesture recognition module performs a three-way decision on the gesture region image at each granularity according to the extracted multi-granularity features, thereby classifying the gesture;
the gesture semantic conversion module converts the classified gesture into semantics;
the optimal granularity acquisition module obtains the optimal granularity and sends the optimal granularity to the multi-granularity feature extraction module.
Further, the multi-granularity feature extraction module includes a convolutional neural network unit, and different convolution kernels in the convolutional neural network unit are used to extract the multi-granularity image features of the gesture region image; the multi-granularity information representation is A1 ⊆ A2 ⊆ … ⊆ An, where Ai denotes the information of the gesture region image at the i-th granularity, A1 denotes the information at the coarsest granularity, and An denotes the information at the finest granularity; that is, fine granularity contains coarse granularity. i = 1, 2, …, n, and n denotes the number of granularities.
Further, the three-way decision gesture recognition module includes performing a three-way decision on the coarse-granularity features of the gesture region image: if the class of the gesture can be determined, no finer-granularity feature extraction or further three-way decision is performed; otherwise, finer-granularity features are extracted and a further three-way decision is made, until the class of the gesture region image is determined.
Further, the optimal granularity acquisition module includes obtaining the final human-computer interaction interface optimization result of each granularity by weighted summation, thereby determining the optimal granularity of the gesture region image:
Result = w × Acc + (1-w) × Time
Time = T1 + T2
where Result is the score used to determine the optimal granularity of the gesture region image, Acc denotes the gesture recognition accuracy, Time denotes the time spent in the gesture recognition process, w denotes the weight, T1 denotes the time to extract the multi-granularity features of the gesture region image, and T2 denotes the time to recognize the gesture.
Those of ordinary skill in the art will appreciate that all or part of the steps in the various methods of the above embodiments can be completed by instructing the relevant hardware through a program, and the program can be stored in a computer-readable storage medium; the storage medium may include ROM, RAM, a magnetic disk, an optical disc, and the like.
The embodiments provided above further describe the objectives, technical solutions, and advantages of the present invention in detail. It should be understood that the embodiments provided above are only preferred embodiments of the present invention and are not intended to limit the invention; any modification, equivalent substitution, improvement, and the like made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.

Claims (9)

1. A method for optimizing an intelligent cockpit human-computer interaction interface based on three-way decisions, characterized in that the method comprises:
S1, acquiring gesture video in the cockpit and pre-processing it to obtain a series of static gesture images;
S2, segmenting the gesture from the background in the gesture images to obtain gesture region images;
S3, building a multi-granularity representation of the gesture region image from coarse to fine granularity, and extracting the multi-granularity features of the gesture region image with a convolutional neural network;
S4, computing, from coarse granularity to fine granularity, the conditional probability that the gesture region image at each granularity belongs to each class, and completing gesture recognition through sequential three-way decisions;
S5, converting the recognized gesture into semantics, and performing the corresponding operation on the human-computer interaction interface according to the gesture recognition result after semantic conversion;
S6, obtaining the optimal granularity by weighted summation, taking the optimal granularity as the finest granularity, and repeating steps S3~S5.
2. The method for optimizing an intelligent cockpit human-computer interaction interface based on three-way decisions according to claim 1, characterized in that the gesture region image is given a multi-granularity representation from coarse to fine granularity; for the same gesture region image, its multi-granularity information representation is:
A1 ⊆ A2 ⊆ … ⊆ An
where Ai denotes the information of the gesture region image at the i-th granularity, A1 denotes the information of the gesture region image at the coarsest granularity, and An denotes the information of the gesture region image at the finest granularity; that is, fine granularity contains coarse granularity. i = 1, 2, …, n, and n denotes the number of granularities.
3. The method for optimizing an intelligent cockpit human-computer interaction interface based on three-way decisions according to claim 2, characterized in that extracting the multi-granularity features of the gesture region image with a convolutional neural network includes using different convolution kernels in the convolutional neural network to extract the multi-granularity image features of the gesture region image.
4. The method for optimizing an intelligent cockpit human-computer interaction interface based on three-way decisions according to claim 1, characterized in that step S4 includes performing a three-way decision on the coarse-granularity features of the gesture region image: if the class of the gesture can be determined, no finer-granularity feature extraction or further three-way decision is performed; otherwise, finer-granularity features are extracted and a further three-way decision is made, until the class of the gesture region image is determined.
5. The method for optimizing an intelligent cockpit human-computer interaction interface based on three-way decisions according to claim 1, characterized in that step S6 includes obtaining the final human-computer interaction interface optimization result of each granularity by weighted summation, thereby determining the optimal granularity of the gesture region image:
Result = w × Acc + (1-w) × Time
Time = T1 + T2
where Result is the score used to determine the optimal granularity of the gesture region image, Acc denotes the gesture recognition accuracy, Time denotes the time spent in the gesture recognition process, w denotes the weight, T1 denotes the time to extract the multi-granularity features of the gesture region image, and T2 denotes the time to recognize the gesture.
6. A system for optimizing an intelligent cockpit human-computer interaction interface based on three-way decisions, characterized in that the system comprises an electrically connected camera, a cockpit gesture acquisition module, a gesture image segmentation module, a multi-granularity feature extraction module, a three-way decision gesture recognition module, a gesture semantic conversion module, and an optimal granularity acquisition module;
the cockpit gesture acquisition module acquires gesture video in the cockpit through the camera and converts the video frames into a series of static gesture images;
the gesture image segmentation module segments the gesture from the background in the gesture images to obtain gesture region images;
the multi-granularity feature extraction module extracts the multi-granularity features of the gesture region image from coarse to fine granularity;
the three-way decision gesture recognition module performs a three-way decision on the gesture region image at each granularity according to the extracted multi-granularity features, thereby classifying the gesture;
the gesture semantic conversion module converts the classified gesture into semantics;
the optimal granularity acquisition module obtains the optimal granularity and sends the optimal granularity to the multi-granularity feature extraction module.
7. The system for optimizing an intelligent cockpit human-computer interaction interface based on three-way decisions according to claim 6, characterized in that the gesture multi-granularity feature extraction module comprises a convolutional neural network unit, and multi-granularity image features of the gesture area image are extracted by using different convolution kernels in the convolutional neural network unit; the multi-granularity information is specifically represented as A1 ⊆ A2 ⊆ … ⊆ An, wherein Ai denotes the information of the gesture area image at different granularities, A1 denotes the information of the gesture area image at the coarsest granularity, and An denotes the information of the gesture area image at the finest granularity, i.e. the fine granularity contains the coarse granularity; i = 1, 2, …, n, where n denotes the number of granularities.
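The idea in claim 7 that different convolution kernels yield features at different granularities can be sketched with plain mean-filter kernels standing in for learned CNN kernels: a large kernel smooths the gesture area image into a coarse representation (A1), a small kernel preserves fine detail (An). The kernel sizes and the use of uniform kernels are illustrative assumptions only.

```python
import numpy as np

# Hypothetical sketch: convolution kernels of different sizes applied to the
# same gesture area image produce a coarse-to-fine sequence of feature maps.
def conv2d_valid(image, kernel):
    """Naive 'valid' 2-D convolution (no padding), for illustration only."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def multi_granularity_features(image, kernel_sizes=(7, 5, 3)):
    """Return feature maps A1..An, ordered coarse -> fine (n = len(kernel_sizes))."""
    features = []
    for k in kernel_sizes:                     # largest kernel first -> coarsest
        kernel = np.ones((k, k)) / (k * k)     # mean filter stands in for a learned kernel
        features.append(conv2d_valid(image, kernel))
    return features
```

The finer maps retain strictly more spatial detail than the coarse ones, mirroring the nested-information relation A1 ⊆ … ⊆ An stated in the claim.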
8. The system for optimizing an intelligent cockpit human-computer interaction interface based on three-way decisions according to claim 6, characterized in that the three-way decision gesture recognition module performs three-way decisions on the gesture area image starting from the coarse-granularity features; if the classification category of the gesture can be determined, no finer-granularity feature extraction or further three-way decision is carried out; otherwise, finer-granularity features are extracted for further three-way decisions, until the classification category of the gesture area image is determined.
9. The system for optimizing an intelligent cockpit human-computer interaction interface based on three-way decisions according to claim 6, characterized in that the optimal granularity acquisition module obtains the final human-computer interaction interface optimization result of each granularity by means of a weighted sum, thereby determining the optimal granularity of the gesture area image:

Result = w × Acc + (1 − w) × Time

Time = T1 + T2

wherein Result is the optimization result from which the optimal granularity of the gesture area image is determined, Acc denotes the gesture recognition accuracy, Time denotes the time spent in the gesture recognition process, w denotes the weight, T1 denotes the time taken to extract the multi-granularity features of the gesture area image, and T2 denotes the time taken to recognize the gesture.
CN201810823980.3A 2018-07-25 2018-07-25 Method and system for optimizing human-computer interaction interface of intelligent cabin based on three decisions Active CN109101108B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810823980.3A CN109101108B (en) 2018-07-25 2018-07-25 Method and system for optimizing human-computer interaction interface of intelligent cabin based on three decisions


Publications (2)

Publication Number Publication Date
CN109101108A true CN109101108A (en) 2018-12-28
CN109101108B CN109101108B (en) 2021-06-18

Family

ID=64847467

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810823980.3A Active CN109101108B (en) 2018-07-25 2018-07-25 Method and system for optimizing human-computer interaction interface of intelligent cabin based on three decisions

Country Status (1)

Country Link
CN (1) CN109101108B (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109816022A (en) * 2019-01-29 2019-05-28 重庆市地理信息中心 A kind of image-recognizing method based on three decisions and CNN
CN109820479A (en) * 2019-01-08 2019-05-31 西北大学 A kind of fluorescent molecular tomography feasible zone optimization method
CN110298263A (en) * 2019-06-10 2019-10-01 中南大学 Real-time accurate and contactless gesture identification method and system based on RFID system
CN110458233A (en) * 2019-08-13 2019-11-15 腾讯云计算(北京)有限责任公司 Combination grain object identification model training and recognition methods, device and storage medium
CN111046732A (en) * 2019-11-11 2020-04-21 华中师范大学 Pedestrian re-identification method based on multi-granularity semantic analysis and storage medium
CN111104339A (en) * 2019-12-31 2020-05-05 上海艺赛旗软件股份有限公司 Software interface element detection method and system based on multi-granularity learning, computer equipment and storage medium
CN111814737A (en) * 2020-07-27 2020-10-23 西北工业大学 Target intention identification method based on three sequential decisions
CN112580785A (en) * 2020-12-18 2021-03-30 河北工业大学 Neural network topological structure optimization method based on three-branch decision

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101639351B1 (en) * 2015-01-15 2016-07-13 주식회사 엔씨소프트 Wearable input system and method for recognizing motion
CN107578023A (en) * 2017-09-13 2018-01-12 华中师范大学 Man-machine interaction gesture identification method, apparatus and system
CN107958255A (en) * 2017-11-21 2018-04-24 中国科学院微电子研究所 A kind of object detection method and device based on image


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
时煜斌 (Shi Yubin): "Research on Tactile Gesture Recognition for Robots Based on Three-Way Decisions", China Master's Theses Full-text Database, Information Science and Technology *

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109820479A (en) * 2019-01-08 2019-05-31 西北大学 A kind of fluorescent molecular tomography feasible zone optimization method
CN109820479B (en) * 2019-01-08 2021-08-27 西北大学 Fluorescence molecular tomography feasible region optimization method
CN109816022A (en) * 2019-01-29 2019-05-28 重庆市地理信息中心 A kind of image-recognizing method based on three decisions and CNN
CN110298263A (en) * 2019-06-10 2019-10-01 中南大学 Real-time accurate and contactless gesture identification method and system based on RFID system
CN110458233A (en) * 2019-08-13 2019-11-15 腾讯云计算(北京)有限责任公司 Combination grain object identification model training and recognition methods, device and storage medium
CN110458233B (en) * 2019-08-13 2024-02-13 腾讯云计算(北京)有限责任公司 Mixed granularity object recognition model training and recognition method, device and storage medium
CN111046732A (en) * 2019-11-11 2020-04-21 华中师范大学 Pedestrian re-identification method based on multi-granularity semantic analysis and storage medium
CN111046732B (en) * 2019-11-11 2023-11-28 华中师范大学 Pedestrian re-recognition method based on multi-granularity semantic analysis and storage medium
CN111104339B (en) * 2019-12-31 2023-06-16 上海艺赛旗软件股份有限公司 Software interface element detection method, system, computer equipment and storage medium based on multi-granularity learning
CN111104339A (en) * 2019-12-31 2020-05-05 上海艺赛旗软件股份有限公司 Software interface element detection method and system based on multi-granularity learning, computer equipment and storage medium
CN111814737A (en) * 2020-07-27 2020-10-23 西北工业大学 Target intention identification method based on three sequential decisions
CN111814737B (en) * 2020-07-27 2022-02-18 西北工业大学 Target intention identification method based on three sequential decisions
CN112580785B (en) * 2020-12-18 2022-04-05 河北工业大学 Neural network topological structure optimization method based on three-branch decision
CN112580785A (en) * 2020-12-18 2021-03-30 河北工业大学 Neural network topological structure optimization method based on three-branch decision

Also Published As

Publication number Publication date
CN109101108B (en) 2021-06-18

Similar Documents

Publication Publication Date Title
CN109101108A Method and system for optimizing an intelligent cockpit human-computer interaction interface based on three-way decisions
US11379711B2 (en) Video action detection method based on convolutional neural network
CN108875674B (en) Driver behavior identification method based on multi-column fusion convolutional neural network
CN107273845B (en) Facial expression recognition method based on confidence region and multi-feature weighted fusion
Alani et al. Hand gesture recognition using an adapted convolutional neural network with data augmentation
CN112507777A (en) Optical remote sensing image ship detection and segmentation method based on deep learning
WO2018028255A1 (en) Image saliency detection method based on adversarial network
CN113158862B (en) Multitasking-based lightweight real-time face detection method
CN112150821A (en) Lightweight vehicle detection model construction method, system and device
Yamashita et al. Hand posture recognition based on bottom-up structured deep convolutional neural network with curriculum learning
CN110796018A (en) Hand motion recognition method based on depth image and color image
Ali et al. Facial emotion detection using neural network
CN113487610B (en) Herpes image recognition method and device, computer equipment and storage medium
CN113743505A (en) Improved SSD target detection method based on self-attention and feature fusion
CN112381030A (en) Satellite optical remote sensing image target detection method based on feature fusion
CN113255602A (en) Dynamic gesture recognition method based on multi-modal data
CN114723010B (en) Automatic learning enhancement method and system for asynchronous event data
CN114492634B (en) Fine granularity equipment picture classification and identification method and system
CN114241458B (en) Driver behavior recognition method based on attitude estimation feature fusion
CN111126155A (en) Pedestrian re-identification method for generating confrontation network based on semantic constraint
Li et al. Gadet: A geometry-aware x-ray prohibited items detector
CN117710841A (en) Small target detection method and device for aerial image of unmanned aerial vehicle
CN117079095A (en) Deep learning-based high-altitude parabolic detection method, system, medium and equipment
Gupta et al. Progression modelling for online and early gesture detection
WO2022126367A1 (en) Sequence processing for a dataset with frame dropping

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant