WO2018145316A1 - Method and apparatus for recognizing mouse gestures - Google Patents

Method and apparatus for recognizing mouse gestures

Info

Publication number
WO2018145316A1
WO2018145316A1 (PCT/CN2017/073382)
Authority
WO
WIPO (PCT)
Prior art keywords
image
neural network
mouse
gesture
recognized
Prior art date
Application number
PCT/CN2017/073382
Other languages
English (en)
French (fr)
Inventor
张小敏
罗健
韩荣华
唐婵娟
Original Assignee
深圳市华第时代科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市华第时代科技有限公司 filed Critical 深圳市华第时代科技有限公司
Priority to PCT/CN2017/073382 priority Critical patent/WO2018145316A1/zh
Publication of WO2018145316A1 publication Critical patent/WO2018145316A1/zh

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer

Definitions

  • the present invention relates to the field of communications technologies, and in particular, to a method and an apparatus for identifying a mouse gesture.
  • In existing Windows computers, the user often draws a specific trajectory on a browser webpage with the mouse, and the trajectory triggers a corresponding operation; for example, predetermined operations such as forward, back, refresh, and close window can be performed.
  • The process is as follows: a graphic is input with the mouse, the graphic input with the mouse is then compared against a template, features are extracted, and the recognition result is output on the webpage.
  • the main object of the present invention is to solve the technical problem that the prior art has limited application range and low recognition accuracy in mouse gesture recognition.
  • the present invention provides a method for identifying a mouse gesture, the method comprising:
  • detecting a mouse gesture input by a user through a mouse operation and acquiring a mouse gesture image to obtain an image to be recognized;
  • preprocessing the image to be recognized to obtain a feature vector of the image to be recognized; and
  • inputting the feature vector into a pre-trained BP neural network to obtain a recognized character.
  • the pre-processing the image to be identified to obtain the feature vector of the image to be identified specifically includes:
  • the extracting the feature vector of the image to be identified from the pixel matrix of the normalized pixel bitmap includes:
  • The method further includes: training a preset BP neural network by using preset gesture template images to obtain the pre-trained BP neural network.
  • the training the preset BP neural network by using the preset gesture template image to obtain the pre-trained BP neural network specifically includes:
  • The present invention further provides a device for recognizing a mouse gesture, the device comprising:
  • a detecting module configured to detect a mouse gesture input by a user through a mouse operation, and obtain a mouse gesture image to obtain an image to be recognized;
  • a processing module configured to preprocess the image to be identified to obtain a feature vector of the image to be identified
  • an identification module configured to input the feature vector into a pre-trained BP neural network to obtain an identification character.
  • the processing module specifically includes:
  • a grayscale unit configured to perform grayscale processing on the image to be identified
  • a binarization unit configured to perform binarization processing on the image to be recognized after the grayscale processing and perform de-discrete noise processing
  • a normalization unit configured to normalize the image to be recognized after the de-discrete noise processing to obtain a normalized pixel dot pattern of the image to be recognized
  • an extracting unit configured to extract a feature vector of the image to be recognized from a pixel matrix of the normalized pixel bitmap.
  • the extracting unit specifically includes:
  • a matrix subunit configured to extract each pixel value from the normalized image bitmap to obtain a pixel matrix corresponding to the image to be identified
  • a conversion subunit configured to convert the pixel matrix into a column matrix to obtain a feature column vector of the image to be identified.
  • the device further includes: a training module, configured to train the preset BP neural network by using the preset gesture template image to obtain the pre-trained BP neural network.
  • the training module specifically includes:
  • a pre-processing unit configured to pre-process each preset gesture template image to obtain a feature vector of each gesture template image
  • a setting unit configured to set a desired output of the pre-designed BP neural network as a gesture corresponding to each gesture template image
  • An input unit configured to input a feature vector of each gesture template image into the BP neural network, and train the BP neural network according to a desired output;
  • the judging unit is configured to stop training and save each parameter of the BP neural network at this time to obtain a pre-trained BP neural network when the number of times of training of the BP neural network reaches a preset number or the error value is less than an expected value.
  • The method and device for recognizing a mouse gesture detect a mouse gesture input by a user through a mouse operation, acquire a mouse gesture image to obtain an image to be recognized, preprocess the image to be recognized to obtain a feature vector of the image to be recognized, and input the feature vector into a pre-trained BP neural network to obtain a recognized character.
  • Using the artificial intelligence of the BP neural network, the feature vector of the preprocessed mouse gesture image can be input into the trained BP neural network to obtain the recognition result.
  • The BP neural network's advantages of distributed information storage, parallel information processing, self-organization, and self-learning allow the mouse gestures input by the user to be processed promptly, and a large amount of training also ensures recognition accuracy; at the same time, the method can be applied to various types of applications, greatly improving the user experience.
  • FIG. 1 is a schematic flow chart of a first embodiment of a method for recognizing a mouse gesture according to the present invention
  • FIG. 2 is a schematic diagram of the detailed flow of step S20 in FIG. 1;
  • FIG. 3 is a schematic diagram of the detailed flow of step S24 in FIG. 2;
  • FIG. 4 is a schematic diagram of a refinement process of BP neural network training in a first embodiment of a mouse gesture recognition method according to the present invention
  • FIG. 5 is a schematic diagram of functional modules of a first embodiment of a mouse gesture recognition apparatus according to the present invention.
  • FIG. 6 is a schematic diagram of a refinement function module of the processing module 70 of FIG. 5;
  • FIG. 7 is a schematic diagram of a refinement function module of the extraction unit 74 of FIG. 6;
  • FIG. 8 is a schematic diagram of a refinement function module of the training module 90 in the first embodiment of the mouse gesture recognition device of the present invention.
  • Figure 9 is a graph showing the learning error curve of BP neural network training in the present invention.
  • It should be noted that, in the embodiments of the present invention, the terminal includes, but is not limited to, any electronic device running a Windows system, such as a personal computer, a mobile terminal, or an iPad.
  • the invention provides a method for recognizing a mouse gesture.
  • 1 is a schematic flowchart of a first embodiment of a method for recognizing a mouse gesture according to the present invention.
  • the method for identifying the mouse gesture includes:
  • Step S10 detecting a mouse gesture input by the user through a mouse operation, and acquiring a mouse gesture image to obtain an image to be recognized;
  • Specifically, in this embodiment, the terminal detects a mouse gesture input by the user through a mouse operation, for example a drag trajectory performed with the right mouse button on the current page of the terminal; the terminal records the trajectory traversed by the mouse cursor and, after the user finishes the input, generates a graphic containing the trajectory, that is, a mouse gesture image.
  • the mouse gesture image is the image to be identified that needs to be identified later.
  • It should be understood that current mouse gestures already cover many common English letters and digits.
  • In this embodiment, the mouse gestures to be recognized default to the ten numeric characters 0 to 9.
  • Step S20 preprocessing the image to be identified to obtain a feature vector of the image to be identified
  • Specifically, before a mouse gesture image is recognized, a series of transformations is performed on the image to be recognized to obtain a feature vector that meets the requirements; the transformations include, but are not limited to, preprocessing such as grayscale conversion, binarization, discrete-noise removal, and normalization adjustment.
  • Finally, the image to be recognized can be converted into a normalized M*N pixel bitmap, from which the feature vector A of the image to be recognized is obtained; the feature vector A is an (N*M)*1 column matrix whose numeric type is double precision.
  • The values of M and N can be set as required; in this embodiment, M can be 20 and N can be 36.
  • step S20 specifically includes:
  • Step S21 performing grayscale processing on the image to be identified
  • Step S22 performing binarization processing on the image to be recognized after the grayscale processing and performing de-discrete noise processing
  • Step S23 normalizing the image to be recognized after the discrete noise processing to obtain a normalized pixel dot pattern of the image to be recognized;
  • Step S24 extracting feature vectors of the image to be recognized from the pixel matrix of the normalized pixel bitmap.
  • Specifically, the terminal first performs grayscale processing on the image to be recognized.
  • In this embodiment, the specific processing manner is not limited.
  • The maximum-value method, the average-value method, or the weighted-average method may be used.
  • In this embodiment the weighted-average method is preferred; for example, grayscale conversion can be achieved directly with the rgb2gray function in MATLAB.
  • the terminal further performs binarization processing and denoising processing on the grayscale image to be recognized.
  • In this embodiment this is again implemented with MATLAB as follows: assuming the grayscaled image to be recognized is fig_gray, the global threshold of fig_gray is first obtained with the graythresh function using the maximum between-class variance method, and the im2bw function is then called with this global threshold to convert fig_gray into a binary image fig_bool.
  • the implementation process is as follows:
  • threshold = graythresh(fig_gray);
  • fig_bool = im2bw(fig_gray, threshold);
  • where fig_gray and fig_bool are the storage matrix of the grayscale image and the storage matrix of the binarized image, respectively.
  • Because the binarized image generally suffers from stroke blurring, gradient sharpening is generally required before discrete noise is removed; the sharpening also removes some of the noise.
  • In this embodiment, the Roberts operator, the Sobel operator, the Prewitt operator, or the Laplacian operator can be used to sharpen the binarized image to be recognized so that salient edges become clear; choosing an appropriate threshold weakens and eliminates fine noise, and the discrete-noise removal can likewise be implemented with MATLAB.
  • Existing discrete-noise removal methods are very mature and are not described in detail here; only one approach is given as an example: the whole image is scanned, and whenever a black pixel is found, the number of black pixels directly or indirectly connected to it is counted; if that number is greater than a certain value (the specific value depends on the image), the pixel is judged to be a non-discrete point, otherwise it is considered a discrete point and is removed from the image.
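  • As an illustration only (not code from the patent), the scan-and-count rule above can be approximated in MATLAB with a connected-component area filter; the stroke polarity and the 8-pixel threshold are assumptions:
  • fig_fg = ~fig_bool; % assumption: strokes are 0 (black) in fig_bool, so invert to make them the foreground
  • fig_fg = bwareaopen(fig_fg, 8); % drop connected groups of fewer than 8 pixels (illustrative threshold)
  • fig_clean = ~fig_fg; % restore the original polarity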
  • Because the area covered by a mouse gesture varies from input to input, the size of the acquired mouse gesture image also varies; therefore the input mouse gesture image, that is, the image to be recognized, needs to be normalized.
  • After discrete-noise removal, the image to be recognized is uniformly normalized into an M*N, i.e. 20×36, pixel bitmap, from which the feature vector of the image to be recognized can be extracted.
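  • A minimal MATLAB sketch of this normalization step, assuming the Image Processing Toolbox; fig_clean is an assumed variable name for the denoised binary image, [36 20] is rows by columns to match the 36*20 Boolean matrix described below, and nearest-neighbour interpolation keeps the image binary:
  • fig_norm = logical(imresize(fig_clean, [36 20], 'nearest')); % 36x20 normalized pixel bitmap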
  • step S24 specifically includes:
  • Step S241 extracting each pixel value from the normalized image dot map to obtain a pixel matrix corresponding to the image to be identified;
  • Step S242 converting the pixel matrix into a column matrix to obtain a feature vector of the image to be identified.
  • After the above series of processing, the image to be recognized is a normalized binarized image in which each pixel corresponds to one pixel value; after extraction, a corresponding 36*20 Boolean matrix is obtained.
  • This 36*20 Boolean matrix is then converted into a 720*1 double-precision column matrix, which is the feature vector of the image to be recognized.
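  • A one-line MATLAB sketch of this conversion, under the assumptions above (fig_norm is the assumed name of the 36*20 normalized Boolean bitmap):
  • A = double(reshape(fig_norm, 720, 1)); % 720*1 double-precision feature column vector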
  • step S30 the feature vector is input into the pre-trained BP neural network to obtain the recognized character.
  • Specifically, for a BP neural network, the more hidden layers there are, the more slowly the network learns; according to the Kolmogorov theorem, with a reasonable structure and appropriate weights a 3-layer network can approximate any continuous function. Therefore, in this embodiment a 3-layer network with a relatively simple structure is selected as an example for description.
  • The number of input-layer neurons equals the dimension of the feature vector, i.e. M*N = 20*36 = 720 input neurons; the number of output-layer neurons equals the number of mouse gestures recognizable by default in this embodiment, i.e. the ten numeric characters 0 to 9, so there are 10 output neurons.
  • The number of neurons in the hidden layer is determined according to the network's convergence performance. From a summary of a large number of network structures, the empirical formula is obtained:
  • s = sqrt(0.43mn + 0.12m² + 2.54n + 0.77m + 0.35) + 0.51
  • where n is the number of neurons in the input layer and m is the number of neurons in the output layer. According to this formula, the number of hidden-layer neurons can be taken as 70.
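  • A quick numeric check of this empirical formula in MATLAB (illustrative only, not part of the patent text):
  • n = 720; m = 10; % input-layer and output-layer neuron counts
  • s = sqrt(0.43*m*n + 0.12*m^2 + 2.54*n + 0.77*m + 0.35) + 0.51; % evaluates to about 70.8
  • hidden = floor(s); % the embodiment takes 70 hidden-layer neurons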
  • When the feature vector corresponding to a numeric character 0 to 9 is input into the BP neural network, the output neuron corresponding to that character is 1 and the other positions are 0; for example, when the digit 0 is input, the first output neuron is 1 and the others are 0; when the digit 1 is input, the second output neuron is 1 and the others are 0, and so on.
  • In this embodiment, for the BP neural network that has been trained in advance, when the feature vector of the image to be recognized, that is, the 720*1 column matrix, is input, the BP neural network outputs the corresponding character.
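  • A minimal recognition sketch (an illustration, not the patent's own code), assuming net is the pre-trained network created with newff as described below and A is the 720*1 feature vector from the preprocessing steps:
  • y = sim(net, A); % 10*1 output vector of the BP network
  • [~, idx] = max(y); % winner-take-all over the output neurons
  • recognized_char = idx - 1; % neuron 1 corresponds to digit 0, neuron 2 to digit 1, and so on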
  • the method for identifying the mouse gesture further includes:
  • Step S40 The preset BP neural network is trained by using the preset gesture template image to obtain the pre-trained BP neural network.
  • Specifically, a preset BP neural network may be constructed in the initialization phase, a large number of preset gesture template images are input into the preset BP neural network, and the preset BP neural network is trained according to the expected output until the error between the actual output and the desired output is less than the target value.
  • step S40 specifically includes:
  • Step S41 pre-processing each preset gesture template image to obtain a feature vector of each gesture template image
  • Step S42 setting a desired output of the preset BP neural network as a character corresponding to each gesture template image
  • Step S43 input a feature vector of each gesture template image into the BP neural network, and train the BP neural network according to a desired output;
  • Step S44 when the number of trainings of the BP neural network reaches a preset number of times or the error value is less than the expected value, the training is stopped and the parameters of the BP neural network at this time are saved to obtain a pre-trained BP neural network.
  • In this embodiment, once the normalized size of the image to be recognized, i.e. the value of M*N, has been determined in advance, the structure of the BP neural network can be determined.
  • At this point, a feed-forward neural network, i.e. the pre-designed BP neural network, can be established with the newff function, as follows:
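  • The corresponding call, as given in the full description later in this document:
  • net = newff(minmax(P), [720,70,10], {'logsig','logsig','logsig'}, 'traincgb');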
  • minmax(P) is the BP neural network's limit on the maximum and minimum values of its 720 input elements.
  • P is a collection of training samples.
  • [720, 70, 10] is the layer structure of the neural network.
  • {'logsig', 'logsig', 'logsig'} are the transfer functions of the layers of the neural network, all set to the logarithmic sigmoid activation function.
  • The training function is traincgb, i.e. training with the Powell-Beale conjugate gradient method.
  • One gesture template image is one numeric character and corresponds to one desired output.
  • The target vector corresponding to the input vector requires that, after each digit is input into the BP neural network, the output neuron corresponding to that digit is 1 and the other positions are 0.
  • For the ten classes 0 to 9, each class takes 20 training samples, that is, 20 groups of input vectors constitute the training sample set.
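  • A condensed MATLAB sketch of how these sets are assembled (based on the full description below; samples_group is an assumed 720x10x20 logical array holding the feature vectors of 20 template images per digit 0 to 9):
  • targets = eye(10); % one-hot desired outputs for the digits 0 to 9
  • P = double(reshape(samples_group, 720, 200)); % 720x200 training inputs, converted from Boolean to double
  • T = repmat(targets, 1, 20); % 10x200 target vectors matching the column order of P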
  • the target performance function of the BP neural network training is SSE
  • the error performance target value is set to 0.01
  • When the number of BP neural network training iterations reaches the preset maximum (taken as 1000), or the BP neural network's sum of squared errors SSE (the error value) falls below 0.01 (the expected value), the training target is considered reached and training is terminated.
  • net.layers{1}.initFcn = 'initwb'; % the network-layer initialization function is set to 'initwb', which makes the input-layer initialization statement 'randnr' below take effect
  • In [net,tr], net is the BP neural network with updated weights and tr is the training record (the number of iterations and the error of each training pass).
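  • The full description later in this document sets up this training in MATLAB roughly as follows (statements reproduced from the description, with spelling cleaned up):
  • net.trainParam.epochs = 1000; % maximum number of training iterations
  • net.trainParam.show = 20; % display interval
  • net.trainParam.min_grad = 1e-10; % minimum performance gradient
  • net.performFcn = 'sse'; % target performance function
  • net.trainParam.goal = 0.01; % performance goal (expected value)
  • net.layers{1}.initFcn = 'initwb'; % layer initialization function
  • net.inputWeights{1,1}.initFcn = 'randnr'; % input-layer weight initialization
  • net.layerWeights{2,1}.initFcn = 'randnr'; % layer-1 to layer-2 weight initialization
  • net = init(net); % initialize the network
  • [net, tr] = train(net, P, T); % train the network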
  • FIG. 9 is a schematic diagram of the learning error curve of the BP neural network during training; when the level shown in FIG. 9 is reached, the sum of squared errors SSE of the neural network is 0.0088, which is less than the performance goal of 0.01.
  • The training can then be ended and the pre-trained BP neural network is obtained.
  • The method for recognizing a mouse gesture provided by the above embodiment detects a mouse gesture input through a mouse operation and acquires a mouse gesture image to obtain an image to be recognized; preprocesses the image to be recognized to obtain the feature vector of the image to be recognized; and inputs the feature vector into a pre-trained BP neural network to obtain a recognized character.
  • the feature vector of the pre-processed mouse gesture image can be input into the trained BP neural network to obtain the recognition result.
  • The BP neural network's advantages of distributed information storage, parallel information processing, self-organization, and self-learning allow the mouse gestures input by the user to be processed promptly, and a large amount of training also ensures recognition accuracy; at the same time, the method can be applied to various types of applications, greatly improving the user experience.
  • FIG. 5 is a schematic diagram of the functional modules of a first embodiment of the apparatus for recognizing a mouse gesture according to the present invention.
  • the recognition device 100 for the mouse gesture is applied to the terminal, and includes:
  • the detecting module 60 is configured to detect a mouse gesture input by a user through a mouse operation, and acquire a mouse gesture image to obtain an image to be recognized;
  • Specifically, in this embodiment, the detecting module 60 detects a mouse gesture input by the user through a mouse operation, for example a drag trajectory performed with the right mouse button on the current page of the terminal; the detecting module 60 records the trajectory traversed by the mouse cursor and, after the user finishes the input, generates a graphic containing the trajectory, that is, a mouse gesture image.
  • the mouse gesture image is the image to be identified that needs to be identified later.
  • It should be understood that current mouse gestures already cover many common English letters and digits.
  • In this embodiment, the mouse gestures to be recognized default to the ten numeric characters 0 to 9.
  • the processing module 70 is configured to preprocess the image to be identified to obtain a feature vector of the image to be identified;
  • Specifically, before a mouse gesture image is recognized, a series of transformations is performed on the image to be recognized to obtain a feature vector that meets the requirements; the transformations include, but are not limited to, preprocessing such as grayscale conversion, binarization, discrete-noise removal, and normalization adjustment.
  • Finally, the processing module 70 can convert the image to be recognized into a normalized M*N pixel bitmap and thereby obtain the feature vector A of the image to be recognized.
  • The feature vector A is an (N*M)*1 column matrix whose numeric type is double precision.
  • the value of M and N can be set according to requirements. In this embodiment, M can be 20 and N is 36.
  • FIG. 6 is a schematic diagram of the refinement function module of the processing module 70.
  • the processing module 70 specifically includes:
  • a gradation unit 71 configured to perform gradation processing on the image to be identified
  • the binarization unit 72 is configured to perform binarization processing on the image to be recognized after the grayscale processing and perform de-discrete noise processing;
  • a normalization unit 73 configured to normalize the image to be recognized after the de-discrete noise processing to obtain a normalized pixel dot pattern of the image to be recognized;
  • the extracting unit 74 is configured to extract a feature vector of the image to be recognized from a pixel matrix of the normalized pixel bitmap.
  • the gradation unit 71 performs the gradation processing on the image to be recognized first.
  • the specific processing manner is not limited.
  • the maximum value method, the average value method, or the weighted average method may be used.
  • the weighted average method is preferably used.
  • the gray scale of the image can be directly realized by the rgb2gray function in MATLAB.
  • the binarization unit 72 further performs binarization processing and denoising processing on the grayscale image to be recognized.
  • In this embodiment this is again implemented with MATLAB as follows: assuming the grayscaled image to be recognized is fig_gray, the global threshold of fig_gray is first obtained with the graythresh function using the maximum between-class variance method, and the im2bw function is then called with this global threshold to convert fig_gray into a binary image fig_bool.
  • the implementation process is as follows:
  • threshold = graythresh(fig_gray);
  • fig_bool = im2bw(fig_gray, threshold);
  • where fig_gray and fig_bool are the storage matrix of the grayscale image and the storage matrix of the binarized image, respectively.
  • Because the binarized image generally suffers from stroke blurring, gradient sharpening is generally required before discrete noise is removed; the sharpening also removes some of the noise.
  • In this embodiment, the Roberts operator, the Sobel operator, the Prewitt operator, or the Laplacian operator can be used to sharpen the binarized image to be recognized so that salient edges become clear; choosing an appropriate threshold weakens and eliminates fine noise, and the discrete-noise removal can likewise be implemented with MATLAB.
  • Existing discrete-noise removal methods are very mature and are not described in detail here; only one approach is given as an example: the whole image is scanned, and whenever a black pixel is found, the number of black pixels directly or indirectly connected to it is counted; if that number is greater than a certain value (the specific value depends on the image), the pixel is judged to be a non-discrete point, otherwise it is considered a discrete point and is removed from the image.
  • Because the area covered by a mouse gesture varies from input to input, the size of the mouse gesture image acquired by the detecting module 60 also varies; the normalization unit 73 therefore needs to normalize the input mouse gesture image, that is, the image to be recognized.
  • After the image to be recognized, with discrete noise removed, has been uniformly normalized into an M*N, i.e. 20×36, pixel bitmap, the extracting unit 74 can extract the feature vector of the image to be recognized from the pixel bitmap.
  • the extracting unit 74 specifically includes:
  • a matrix sub-unit 741 configured to extract each pixel value from the normalized image bitmap to obtain a pixel matrix corresponding to the image to be identified;
  • the conversion subunit 742 is configured to convert the pixel matrix into a column matrix to obtain a feature vector of the image to be identified.
  • After the above series of processing, the image to be recognized is a normalized binarized image in which each pixel corresponds to one pixel value; after extraction, a corresponding 36*20 Boolean matrix is obtained.
  • This 36*20 Boolean matrix is then converted into a 720*1 double-precision column matrix, which is the feature vector of the image to be recognized.
  • the identification module 80 is configured to input the feature vector into the pre-trained BP neural network to obtain the recognized character.
  • Specifically, for a BP neural network, the more hidden layers there are, the more slowly the network learns; according to the Kolmogorov theorem, with a reasonable structure and appropriate weights a 3-layer network can approximate any continuous function. Therefore, in this embodiment a 3-layer network with a relatively simple structure is selected as an example for description.
  • The number of input-layer neurons equals the dimension of the feature vector, i.e. M*N = 20*36 = 720 input neurons; the number of output-layer neurons equals the number of mouse gestures recognizable by default in this embodiment, i.e. the ten numeric characters 0 to 9, so there are 10 output neurons.
  • The number of neurons in the hidden layer is determined according to the network's convergence performance. From a summary of a large number of network structures, the empirical formula is obtained:
  • s = sqrt(0.43mn + 0.12m² + 2.54n + 0.77m + 0.35) + 0.51
  • where n is the number of neurons in the input layer and m is the number of neurons in the output layer. According to this formula, the number of hidden-layer neurons can be taken as 70.
  • When the feature vector corresponding to a numeric character 0 to 9 is input into the BP neural network, the output neuron corresponding to that character is 1 and the other positions are 0; for example, when the digit 0 is input, the first output neuron is 1 and the others are 0; when the digit 1 is input, the second output neuron is 1 and the others are 0, and so on.
  • In this embodiment, for the BP neural network that has been trained in advance, when the recognition module 80 inputs the feature vector of the image to be recognized, that is, the 720*1 column matrix, the BP neural network outputs the corresponding character.
  • the mouse gesture recognition apparatus 100 further includes:
  • the training module 90 is configured to train the preset BP neural network by using the preset gesture template image to obtain the pre-trained BP neural network.
  • Specifically, for a BP neural network, the weights of the hidden-layer neurons are adjusted through a large amount of training so that the output of the BP neural network approaches the expected output.
  • In this embodiment, a preset BP neural network may be constructed in the initialization phase; the training module 90 inputs a large number of preset gesture template images into the preset BP neural network and trains the preset BP neural network according to the expected output until the error between the actual output and the desired output is less than the target value.
  • the training module 90 specifically includes:
  • the pre-processing unit 91 is configured to pre-process each preset gesture template image to obtain a feature vector of each gesture template image.
  • a setting unit 92 configured to set a desired output of the BP neural network designed in advance as a gesture corresponding to each gesture template image
  • the input unit 93 is configured to input feature vectors of the gesture template images into the BP neural network, and train the BP neural network according to a desired output;
  • the determining unit 94 is configured to stop training and save each parameter of the BP neural network at this time to obtain a pre-trained BP neural network when the number of trainings of the BP neural network reaches a preset number of times or the error value is less than the expected value.
  • In this embodiment, once the normalized size of the image to be recognized, i.e. the value of M*N, has been determined in advance, the structure of the BP neural network can be determined.
  • At this point, a feed-forward neural network, i.e. the pre-designed BP neural network, can be established with the newff function, as follows:
  • minmax(P) is the BP neural network's limit on the maximum and minimum values of its 720 input elements.
  • P is a collection of training samples.
  • [720, 70, 10] is the layer structure of the neural network.
  • {'logsig', 'logsig', 'logsig'} are the transfer functions of the layers of the neural network, all set to the logarithmic sigmoid activation function.
  • The training function is traincgb, i.e. training with the Powell-Beale conjugate gradient method.
  • One gesture template image is one numeric character and corresponds to one desired output.
  • The target vector corresponding to the input vector requires that, after each digit is input into the BP neural network, the output neuron corresponding to that digit is 1 and the other positions are 0.
  • For the ten classes 0 to 9, each class takes 20 training samples, that is, 20 groups of input vectors constitute the training sample set.
  • the target performance function of the BP neural network training is SSE
  • the error performance target value is set to 0.01
  • When the number of BP neural network training iterations reaches the preset maximum (taken as 1000), or the BP neural network's sum of squared errors SSE (the error value) falls below 0.01 (the expected value), the training target is considered reached and training is terminated.
  • net.layers{1}.initFcn = 'initwb'; % the network-layer initialization function is set to 'initwb', which makes the input-layer initialization statement 'randnr' below take effect
  • In [net,tr], net is the BP neural network with updated weights and tr is the training record (the number of iterations and the error of each training pass).
  • FIG. 9 is a schematic diagram of the learning error curve of the BP neural network during training; when the level shown in FIG. 9 is reached, the sum of squared errors SSE of the neural network is 0.0088, which is less than the performance goal of 0.01.
  • The training can then be ended and the pre-trained BP neural network is obtained.
  • The mouse gesture recognition device provided by the above embodiment detects a mouse gesture input by a user, acquires a mouse gesture image to obtain an image to be recognized, preprocesses the image to be recognized to obtain the feature vector of the image to be recognized, and inputs the feature vector into a pre-trained BP neural network to obtain a recognized character.
  • the feature vector of the pre-processed mouse gesture image can be input into the trained BP neural network to obtain the recognition result.
  • The BP neural network's advantages of distributed information storage, parallel information processing, self-organization, and self-learning allow the mouse gestures input by the user to be processed promptly, and a large amount of training also ensures recognition accuracy; at the same time, the device can be applied to various types of applications, greatly improving the user experience.

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Image Analysis (AREA)

Abstract

The present invention discloses a method for recognizing a mouse gesture, comprising: detecting a mouse gesture input by a user through a mouse operation and acquiring a mouse gesture image to obtain an image to be recognized; preprocessing the image to be recognized to obtain a feature vector of the image to be recognized; and inputting the feature vector into a pre-trained BP neural network to obtain a recognized character. The present invention further provides an apparatus for recognizing a mouse gesture. With the method and apparatus for recognizing a mouse gesture provided by the present invention, the artificial intelligence of a BP neural network is used: the feature vector of the preprocessed mouse gesture image is input into the trained BP neural network to obtain the recognition result. The BP neural network's advantages of distributed information storage, parallel information processing, self-organization and self-learning allow the mouse information input by the user to be processed promptly, and a large amount of training also ensures recognition accuracy; at the same time, the method can be applied to various types of applications, greatly improving the user experience.

Description

Method and apparatus for recognizing mouse gestures
Technical Field
The present invention relates to the field of communication technologies, and in particular to a method and apparatus for recognizing mouse gestures.
Background Art
In existing Windows computers, the user often draws a specific trajectory on a browser webpage with the mouse, and the trajectory triggers a corresponding operation; for example, predetermined common operations such as forward, back, refresh and close window can be performed. The process is specifically as follows: a graphic is input with the mouse, the graphic input with the mouse is then compared against a template, features are extracted, and the recognition result is output on the webpage.
However, mouse gestures are at present mostly limited to applications in browsers and are not applicable to other Windows client applications. Moreover, because the accuracy with which computers recognize mouse trajectories still needs to be improved, current mouse gestures remain rather simple, so users cannot yet experience mouse gestures as more convenient and quicker than keyboard shortcuts. In addition, mouse gestures are based on character recognition, and character recognition is handled by recognition methods based on character structure and by fuzzy matching, so the recognition accuracy is not high.
On this basis, it is necessary to provide a method for recognizing mouse gestures, to solve the current problems that the application range of mouse gesture recognition is limited and the recognition accuracy is not high.
Technical Problem
The main object of the present invention is to solve the technical problems of the prior art that the application range of mouse gesture recognition is limited and the recognition accuracy is not high.
Technical Solution
To achieve the above object, the present invention provides a method for recognizing a mouse gesture, the method comprising:
detecting a mouse gesture input by a user through a mouse operation, and acquiring a mouse gesture image to obtain an image to be recognized;
preprocessing the image to be recognized to obtain a feature vector of the image to be recognized;
inputting the feature vector into a pre-trained BP neural network to obtain a recognized character.
Preferably, the preprocessing the image to be recognized to obtain a feature vector of the image to be recognized specifically comprises:
performing grayscale processing on the image to be recognized;
performing binarization processing on the grayscaled image to be recognized and performing discrete-noise removal;
normalizing the image to be recognized after discrete-noise removal to obtain a normalized pixel bitmap of the image to be recognized;
extracting the feature vector of the image to be recognized from the pixel matrix of the normalized pixel bitmap.
Preferably, the extracting the feature vector of the image to be recognized from the pixel matrix of the normalized pixel bitmap specifically comprises:
extracting each pixel value from the normalized image bitmap to obtain a pixel matrix corresponding to the image to be recognized;
converting the pixel matrix into a column matrix to obtain a feature column vector of the image to be recognized.
Preferably, before the detecting a mouse gesture input by a user through a mouse operation and acquiring a mouse gesture image to obtain an image to be recognized, the method further comprises: training a preset BP neural network with preset gesture template images to obtain the pre-trained BP neural network.
Preferably, the training a preset BP neural network with preset gesture template images to obtain the pre-trained BP neural network specifically comprises:
preprocessing each preset gesture template image to obtain a feature vector of each gesture template image;
setting the desired output of the preset BP neural network as the character corresponding to each gesture template image;
inputting the feature vector of each gesture template image into the BP neural network, and training the BP neural network according to the desired output;
when the number of training iterations of the BP neural network reaches a preset number or the error value is less than an expected value, stopping the training and saving the parameters of the BP neural network at that time to obtain the pre-trained BP neural network.
In addition, to achieve the above object, the present invention further provides an apparatus for recognizing a mouse gesture, the apparatus comprising:
a detecting module, configured to detect a mouse gesture input by a user through a mouse operation and acquire a mouse gesture image to obtain an image to be recognized;
a processing module, configured to preprocess the image to be recognized to obtain a feature vector of the image to be recognized;
a recognition module, configured to input the feature vector into a pre-trained BP neural network to obtain a recognized character.
Preferably, the processing module specifically comprises:
a grayscale unit, configured to perform grayscale processing on the image to be recognized;
a binarization unit, configured to perform binarization processing on the grayscaled image to be recognized and perform discrete-noise removal;
a normalization unit, configured to normalize the image to be recognized after discrete-noise removal to obtain a normalized pixel bitmap of the image to be recognized;
an extracting unit, configured to extract the feature vector of the image to be recognized from the pixel matrix of the normalized pixel bitmap.
Preferably, the extracting unit specifically comprises:
a matrix subunit, configured to extract each pixel value from the normalized image bitmap to obtain a pixel matrix corresponding to the image to be recognized;
a conversion subunit, configured to convert the pixel matrix into a column matrix to obtain a feature column vector of the image to be recognized.
Preferably, the apparatus further comprises: a training module, configured to train a preset BP neural network with preset gesture template images to obtain the pre-trained BP neural network.
Preferably, the training module specifically comprises:
a preprocessing unit, configured to preprocess each preset gesture template image to obtain a feature vector of each gesture template image;
a setting unit, configured to set the desired output of the pre-designed BP neural network as the gesture corresponding to each gesture template image;
an input unit, configured to input the feature vector of each gesture template image into the BP neural network and train the BP neural network according to the desired output;
a judging unit, configured to, when the number of training iterations of the BP neural network reaches a preset number or the error value is less than an expected value, stop the training and save the parameters of the BP neural network at that time to obtain the pre-trained BP neural network.
Beneficial Effects
With the method and apparatus for recognizing a mouse gesture provided by the present invention, a mouse gesture input by a user through a mouse operation is detected and a mouse gesture image is acquired to obtain an image to be recognized; the image to be recognized is preprocessed to obtain a feature vector of the image to be recognized; and the feature vector is input into a pre-trained BP neural network to obtain a recognized character. Using the artificial intelligence of the BP neural network, the feature vector of the preprocessed mouse gesture image is input into the trained BP neural network to obtain the recognition result. Thus, the BP neural network's advantages of distributed information storage, parallel information processing, self-organization and self-learning allow the mouse gestures input by the user to be processed promptly, and a large amount of training also ensures recognition accuracy; at the same time, the method can be applied to various types of applications, greatly improving the user experience.
Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a first embodiment of the method for recognizing a mouse gesture according to the present invention;
FIG. 2 is a schematic diagram of the detailed flow of step S20 in FIG. 1;
FIG. 3 is a schematic diagram of the detailed flow of step S24 in FIG. 2;
FIG. 4 is a schematic diagram of the detailed flow of the BP neural network training in the first embodiment of the method for recognizing a mouse gesture according to the present invention;
FIG. 5 is a schematic diagram of the functional modules of a first embodiment of the apparatus for recognizing a mouse gesture according to the present invention;
FIG. 6 is a schematic diagram of the detailed functional modules of the processing module 70 in FIG. 5;
FIG. 7 is a schematic diagram of the detailed functional modules of the extracting unit 74 in FIG. 6;
FIG. 8 is a schematic diagram of the detailed functional modules of the training module 90 in the first embodiment of the apparatus for recognizing a mouse gesture according to the present invention;
FIG. 9 is a learning error curve of the BP neural network during training according to the present invention.
The realization of the objects, functional features and advantages of the present invention will be further described with reference to the embodiments and the accompanying drawings.
Embodiments of the Present Invention
It should be understood that the specific embodiments described here are only intended to explain the present invention and are not intended to limit the present invention.
It should be noted that, in the embodiments of the present invention, the terminal includes, but is not limited to, any electronic device running a Windows system, such as a personal computer, a mobile terminal, or an iPad.
The present invention provides a method for recognizing a mouse gesture. Referring to FIG. 1, FIG. 1 is a schematic flowchart of a first embodiment of the method for recognizing a mouse gesture according to the present invention. In this embodiment, the method for recognizing a mouse gesture comprises:
Step S10: detecting a mouse gesture input by a user through a mouse operation, and acquiring a mouse gesture image to obtain an image to be recognized;
Specifically, in this embodiment, the terminal detects a mouse gesture input by the user through a mouse operation, for example a drag trajectory performed with the right mouse button on the current page of the terminal; the terminal records the trajectory traversed by the mouse cursor and, after the user finishes the input, generates a graphic containing the trajectory, i.e. a mouse gesture image. The mouse gesture image is the image to be recognized in the subsequent recognition.
It should be understood that current mouse gestures already cover many common English letters and digits; in this embodiment, the mouse gestures to be recognized default to the ten numeric characters 0 to 9.
Step S20: preprocessing the image to be recognized to obtain a feature vector of the image to be recognized;
Specifically, before a mouse gesture image is recognized, the image to be recognized must undergo a series of transformations to obtain a feature vector that meets the requirements. The transformations include, but are not limited to, preprocessing such as grayscale conversion, binarization, discrete-noise removal and normalization adjustment. Finally, the image to be recognized can be converted into a normalized M*N pixel bitmap, from which the feature vector A of the image to be recognized is obtained; the feature vector A is an (N*M)*1 column matrix whose numeric type is double precision. The values of M and N can be set as required; in this embodiment, M may be 20 and N may be 36.
Further, referring to FIG. 2, which shows the detailed flow of step S20, in this embodiment step S20 specifically comprises:
Step S21: performing grayscale processing on the image to be recognized;
Step S22: performing binarization processing on the grayscaled image to be recognized and performing discrete-noise removal;
Step S23: normalizing the image to be recognized after discrete-noise removal to obtain a normalized pixel bitmap of the image to be recognized;
Step S24: extracting the feature vector of the image to be recognized from the pixel matrix of the normalized pixel bitmap.
Specifically, the terminal first performs grayscale processing on the image to be recognized. In this embodiment the specific processing method is not limited: for example, the maximum-value method, the mean-value method, or the weighted-average method can be used; the weighted-average method is preferred in this embodiment, and the grayscale conversion can, for example, be achieved directly with the rgb2gray function in MATLAB. After the grayscaled image to be recognized is obtained, the terminal further performs binarization and denoising on the grayscaled image to be recognized. In this embodiment this is again implemented with MATLAB as follows: assuming the grayscaled image to be recognized is fig_gray, the global threshold of fig_gray is first obtained with the graythresh function using the maximum between-class variance method, and the im2bw function is then called with this global threshold to convert fig_gray into a binary image fig_bool. The implementation is as follows:
threshold = graythresh(fig_gray);
fig_bool = im2bw(fig_gray, threshold);
where fig_gray and fig_bool are the storage matrix of the grayscale image and the storage matrix of the binarized image, respectively.
Because the binarized image generally suffers from stroke blurring, gradient sharpening is generally also required before discrete noise is removed, and it also removes some of the noise. In this embodiment, processing methods such as the Roberts operator, the Sobel operator, the Prewitt operator, or the Laplacian operator can be used to sharpen the binarized image to be recognized so that salient edges become clear, and choosing an appropriate threshold can weaken and eliminate fine noise. Discrete-noise removal can likewise be implemented with MATLAB. Existing discrete-noise removal methods are very mature and are therefore not described in detail here; only one approach is given as an example: the whole image is scanned, and whenever a black pixel is found, the number of black pixels directly or indirectly connected to it is examined; if that number is greater than a certain value (the specific value depends on the image), the pixel can be judged to be a non-discrete point; otherwise it is considered a discrete point and is removed from the image.
Because the area covered by the mouse gesture differs after the user inputs it with the mouse, the size of the mouse gesture image acquired by the terminal also differs; therefore the input mouse gesture image, i.e. the image to be recognized, needs to be normalized. The image to be recognized after discrete-noise removal is uniformly normalized into an M*N, i.e. 20×36, pixel bitmap, from which the feature vector of the image to be recognized can be extracted.
Further, referring to FIG. 3, in this embodiment step S24 specifically comprises:
Step S241: extracting each pixel value from the normalized image bitmap to obtain a pixel matrix corresponding to the image to be recognized;
Step S242: converting the pixel matrix into a column matrix to obtain the feature vector of the image to be recognized.
After the above series of processing, the image to be recognized is a normalized binarized image in which each pixel corresponds to one pixel value; after extraction, a corresponding 36*20 Boolean matrix is obtained. This 36*20 Boolean matrix is then converted into a 720*1 double-precision column matrix, and this column matrix is the feature vector of the image to be recognized.
Step S30: inputting the feature vector into the pre-trained BP neural network to obtain a recognized character.
Specifically, for a BP neural network, the more hidden layers there are, the more slowly the neural network learns; according to the Kolmogorov theorem, with a reasonable structure and appropriate weights a 3-layer network can approximate any continuous function. Therefore, in this embodiment a structurally simple 3-layer network is taken as an example for description. The number of input-layer neurons equals the dimension of the feature vector, i.e. M*N = 20*36 = 720 input neurons; the number of output-layer neurons equals the number of mouse gestures recognizable by default in this embodiment, i.e. the ten numeric characters 0 to 9, so the number of output-layer neurons is 10. The number of hidden-layer neurons is determined by how well the network converges; from a summary of a large number of network structures, the following empirical formula is obtained:
s = sqrt(0.43mn + 0.12m² + 2.54n + 0.77m + 0.35) + 0.51;
where n is the number of neurons in the input layer and m is the number of neurons in the output layer; according to the above formula, the number of hidden-layer neurons can be taken as 70.
After the feature vector corresponding to a numeric character 0 to 9 is input into the BP neural network, the position of the corresponding output neuron is 1 and the other positions are 0. For example, when the digit 0 is input, the first output neuron is 1 and the others are 0; when the digit 1 is input, the second output neuron is 1 and the others are 0, and so on. In this embodiment, for the BP neural network that has been trained in advance, when the feature vector of the image to be recognized, i.e. the 720*1 column matrix, is input, the BP neural network outputs the corresponding character.
Further, in this embodiment, before step S10 the method for recognizing a mouse gesture further comprises:
Step S40: training a preset BP neural network with preset gesture template images to obtain the pre-trained BP neural network.
Specifically, for any BP neural network, the weights of the hidden-layer neurons must be adjusted through a large amount of training so that the output of the BP neural network approaches the desired output. In this embodiment, a preset BP neural network can be constructed in the initialization phase, a large number of preset gesture template images are input into the preset BP neural network, and the preset BP neural network is trained according to the desired output until the error between the actual output and the desired output is less than the target value.
Further, referring to FIG. 4, in this embodiment step S40 specifically comprises:
Step S41: preprocessing each preset gesture template image to obtain a feature vector of each gesture template image;
Step S42: setting the desired output of the preset BP neural network as the character corresponding to each gesture template image;
Step S43: inputting the feature vector of each gesture template image into the BP neural network, and training the BP neural network according to the desired output;
Step S44: when the number of training iterations of the BP neural network reaches a preset number or the error value is less than the expected value, stopping the training and saving the parameters of the BP neural network at that time to obtain the pre-trained BP neural network.
In this embodiment, once the normalized size of the image to be recognized, i.e. the value of M*N, has been determined in advance, the structure of the BP neural network can be determined. At this point, a feed-forward neural network, i.e. the pre-designed BP neural network, can be created with the newff function, as follows:
net=newff(minmax(P),[720,70,10],{'logsig','logsig','logsig'},'traincgb');
where minmax(P) is the constraint on the maximum and minimum values of the 720 input elements of the BP neural network; P is the training sample set; [720,70,10] is the layer structure of the neural network; {'logsig','logsig','logsig'} are the transfer functions of the layers of the neural network, all set to the logarithmic sigmoid activation function; and the training function is traincgb, i.e. training with the Powell-Beale conjugate gradient method.
One gesture template image is one numeric character and corresponds to one desired output. For the training sample set, there are 10 classes of data, 0 to 9, in the present invention. After each numeric character undergoes the same series of processing described above, the normalized image is a 36×20 Boolean matrix; these 36×20 = 720 elements form a 720*1 column matrix for the numeric character, i.e. the feature vector of the numeric character. The feature vectors of the ten digits 0 to 9 form a 720×10 input vector, denoted sample_group = [0, 1, 2, …, 9], where 0, 1, …, 9 represent the feature vectors of the numeric characters. The target vector corresponding to this input vector requires that, after each digit is input into the BP neural network, the output neuron corresponding to that digit is 1 and the other positions are 0. For this purpose the target vector is taken as a 10×10 identity matrix with ones on the diagonal, implemented with the MATLAB command targets=eye(10).
For the 10 classes of data 0 to 9, 20 training samples are taken for each class, i.e. 20 groups of input vectors constitute the training sample set, which is: P = [samples_group1, samples_group2, …, samples_group20]. Because the feature vectors of the numeric characters consist of Boolean elements, the training sample set is of Boolean type, and the BP neural network cannot be trained on Boolean values, so the training sample set needs to be converted to double precision, which can be done directly with P = double(P).
The target value set corresponding to the training sample set consists of 20 groups of target vectors: T = [targets1, targets2, …, targets19, targets20].
In this embodiment, the target performance function used for BP neural network training is SSE, and the error performance target value is set to 0.01. When the number of BP neural network training iterations reaches the preset maximum (taken as 1000) or the BP neural network's sum of squared errors SSE (the error value) falls below 0.01 (the expected value), the training target is considered reached and training is terminated. This can be implemented with MATLAB; part of the implementation code is as follows:
net.trainParam.epochs=1000; % maximum number of training iterations
net.trainParam.show=20; % display interval
net.trainParam.min_grad=1e-10; % minimum performance gradient
net.performFcn='sse'; % set the target performance function
net.trainParam.goal=0.01; % performance goal (expected value)
net.layers{1}.initFcn='initwb'; % the layer initialization function is set to 'initwb', making the input-layer initialization statement 'randnr' below take effect
net.inputWeights{1,1}.initFcn='randnr'; % initialize the input-layer weight vectors
net.layerWeights{2,1}.initFcn='randnr'; % initialize the weight vectors from layer 1 to layer 2
net=init(net); % initialize the network
[net,tr]=train(net,P,T); % train the network
where net in [net,tr] is the BP neural network with updated weights, and tr is the training record (the number of iterations and the error of each training pass).
Referring also to FIG. 9, which shows the learning error curve of the BP neural network during training: when the level shown in FIG. 9 is reached, the sum of squared errors SSE of the neural network is 0.0088, which is less than the performance goal of 0.01, so training can be ended and the pre-trained BP neural network is obtained.
With the method for recognizing a mouse gesture provided by the above embodiment, a mouse gesture input by a user through a mouse operation is detected and a mouse gesture image is acquired to obtain an image to be recognized; the image to be recognized is preprocessed to obtain a feature vector of the image to be recognized; and the feature vector is input into a pre-trained BP neural network to obtain a recognized character. Using the artificial intelligence of the BP neural network, the feature vector of the preprocessed mouse gesture image is input into the trained BP neural network to obtain the recognition result. Thus, the BP neural network's advantages of distributed information storage, parallel information processing, self-organization and self-learning allow the mouse gestures input by the user to be processed promptly, and a large amount of training also ensures recognition accuracy; at the same time, the method can be applied to various types of applications, greatly improving the user experience.
The present invention further provides an apparatus for recognizing a mouse gesture. Referring to FIG. 5, FIG. 5 is a schematic diagram of the functional modules of a first embodiment of the apparatus for recognizing a mouse gesture according to the present invention. In this embodiment, the apparatus 100 for recognizing a mouse gesture is applied in a terminal and comprises:
a detecting module 60, configured to detect a mouse gesture input by a user through a mouse operation and acquire a mouse gesture image to obtain an image to be recognized;
Specifically, in this embodiment, the detecting module 60 detects a mouse gesture input by the user through a mouse operation, for example a drag trajectory performed with the right mouse button on the current page of the terminal; the detecting module 60 records the trajectory traversed by the mouse cursor and, after the user finishes the input, generates a graphic containing the trajectory, i.e. a mouse gesture image. The mouse gesture image is the image to be recognized in the subsequent recognition.
It should be understood that current mouse gestures already cover many common English letters and digits; in this embodiment, the mouse gestures to be recognized default to the ten numeric characters 0 to 9.
a processing module 70, configured to preprocess the image to be recognized to obtain a feature vector of the image to be recognized;
Specifically, before a mouse gesture image is recognized, the image to be recognized must undergo a series of transformations to obtain a feature vector that meets the requirements. The transformations include, but are not limited to, preprocessing such as grayscale conversion, binarization, discrete-noise removal and normalization adjustment. Finally, the processing module 70 can convert the image to be recognized into a normalized M*N pixel bitmap and thereby obtain the feature vector A of the image to be recognized; the feature vector A is an (N*M)*1 column matrix whose numeric type is double precision. The values of M and N can be set as required; in this embodiment, M may be 20 and N may be 36.
Further, referring to FIG. 6, which is a schematic diagram of the detailed functional modules of the processing module 70, in this embodiment the processing module 70 specifically comprises:
a grayscale unit 71, configured to perform grayscale processing on the image to be recognized;
a binarization unit 72, configured to perform binarization processing on the grayscaled image to be recognized and perform discrete-noise removal;
a normalization unit 73, configured to normalize the image to be recognized after discrete-noise removal to obtain a normalized pixel bitmap of the image to be recognized;
an extracting unit 74, configured to extract the feature vector of the image to be recognized from the pixel matrix of the normalized pixel bitmap.
Specifically, the grayscale unit 71 first performs grayscale processing on the image to be recognized. In this embodiment the specific processing method is not limited: for example, the maximum-value method, the mean-value method, or the weighted-average method can be used; the weighted-average method is preferred in this embodiment, and the grayscale conversion can, for example, be achieved directly with the rgb2gray function in MATLAB. After the grayscaled image to be recognized is obtained, the binarization unit 72 further performs binarization and denoising on the grayscaled image to be recognized. In this embodiment this is again implemented with MATLAB as follows: assuming the grayscaled image to be recognized is fig_gray, the global threshold of fig_gray is first obtained with the graythresh function using the maximum between-class variance method, and the im2bw function is then called with this global threshold to convert fig_gray into a binary image fig_bool. The implementation is as follows:
threshold = graythresh(fig_gray);
fig_bool = im2bw(fig_gray, threshold);
where fig_gray and fig_bool are the storage matrix of the grayscale image and the storage matrix of the binarized image, respectively.
Because the binarized image generally suffers from stroke blurring, gradient sharpening is generally also required before discrete noise is removed, and it also removes some of the noise. In this embodiment, processing methods such as the Roberts operator, the Sobel operator, the Prewitt operator, or the Laplacian operator can be used to sharpen the binarized image to be recognized so that salient edges become clear, and choosing an appropriate threshold can weaken and eliminate fine noise. Discrete-noise removal can likewise be implemented with MATLAB. Existing discrete-noise removal methods are very mature and are therefore not described in detail here; only one approach is given as an example: the whole image is scanned, and whenever a black pixel is found, the number of black pixels directly or indirectly connected to it is examined; if that number is greater than a certain value (the specific value depends on the image), the pixel can be judged to be a non-discrete point; otherwise it is considered a discrete point and is removed from the image.
Because the area covered by the mouse gesture differs after the user inputs it with the mouse, the size of the mouse gesture image acquired by the detecting module 60 also differs; therefore the normalization unit 73 needs to normalize the input mouse gesture image, i.e. the image to be recognized. After the image to be recognized, with discrete noise removed, has been uniformly normalized into an M*N, i.e. 20×36, pixel bitmap, the extracting unit 74 can extract the feature vector of the image to be recognized from the pixel bitmap.
Further, referring to FIG. 7, in this embodiment the extracting unit 74 specifically comprises:
a matrix subunit 741, configured to extract each pixel value from the normalized image bitmap to obtain a pixel matrix corresponding to the image to be recognized;
a conversion subunit 742, configured to convert the pixel matrix into a column matrix to obtain the feature vector of the image to be recognized.
After the above series of processing, the image to be recognized is a normalized binarized image in which each pixel corresponds to one pixel value; after extraction, a corresponding 36*20 Boolean matrix is obtained. This 36*20 Boolean matrix is then converted into a 720*1 double-precision column matrix, and this column matrix is the feature vector of the image to be recognized.
a recognition module 80, configured to input the feature vector into the pre-trained BP neural network to obtain a recognized character.
Specifically, for a BP neural network, the more hidden layers there are, the more slowly the neural network learns; according to the Kolmogorov theorem, with a reasonable structure and appropriate weights a 3-layer network can approximate any continuous function. Therefore, in this embodiment a structurally simple 3-layer network is taken as an example for description. The number of input-layer neurons equals the dimension of the feature vector, i.e. M*N = 20*36 = 720 input neurons; the number of output-layer neurons equals the number of mouse gestures recognizable by default in this embodiment, i.e. the ten numeric characters 0 to 9, so the number of output-layer neurons is 10. The number of hidden-layer neurons is determined by how well the network converges; from a summary of a large number of network structures, the following empirical formula is obtained:
s = sqrt(0.43mn + 0.12m² + 2.54n + 0.77m + 0.35) + 0.51;
where n is the number of neurons in the input layer and m is the number of neurons in the output layer; according to the above formula, the number of hidden-layer neurons can be taken as 70.
After the feature vector corresponding to a numeric character 0 to 9 is input into the BP neural network, the position of the corresponding output neuron is 1 and the other positions are 0. For example, when the digit 0 is input, the first output neuron is 1 and the others are 0; when the digit 1 is input, the second output neuron is 1 and the others are 0, and so on. In this embodiment, for the BP neural network that has been trained in advance, when the recognition module 80 inputs the feature vector of the image to be recognized, i.e. the 720*1 column matrix, the BP neural network outputs the corresponding character.
Further, in this embodiment, the apparatus 100 for recognizing a mouse gesture further comprises:
a training module 90, configured to train a preset BP neural network with preset gesture template images to obtain the pre-trained BP neural network.
Specifically, for any BP neural network, the weights of the hidden-layer neurons must be adjusted through a large amount of training so that the output of the BP neural network approaches the desired output. In this embodiment, a preset BP neural network can be constructed in the initialization phase; the training module 90 inputs a large number of preset gesture template images into the preset BP neural network and trains the preset BP neural network according to the desired output until the error between the actual output and the desired output is less than the target value.
Further, referring to FIG. 8, in this embodiment the training module 90 specifically comprises:
a preprocessing unit 91, configured to preprocess each preset gesture template image to obtain a feature vector of each gesture template image;
a setting unit 92, configured to set the desired output of the pre-designed BP neural network as the gesture corresponding to each gesture template image;
an input unit 93, configured to input the feature vector of each gesture template image into the BP neural network and train the BP neural network according to the desired output;
a judging unit 94, configured to, when the number of training iterations of the BP neural network reaches a preset number or the error value is less than the expected value, stop the training and save the parameters of the BP neural network at that time to obtain the pre-trained BP neural network.
In this embodiment, once the normalized size of the image to be recognized, i.e. the value of M*N, has been determined in advance, the structure of the BP neural network can be determined. At this point, a feed-forward neural network, i.e. the pre-designed BP neural network, can be created with the newff function, as follows:
net=newff(minmax(P),[720,70,10],{'logsig','logsig','logsig'},'traincgb');
where minmax(P) is the constraint on the maximum and minimum values of the 720 input elements of the BP neural network; P is the training sample set; [720,70,10] is the layer structure of the neural network; {'logsig','logsig','logsig'} are the transfer functions of the layers of the neural network, all set to the logarithmic sigmoid activation function; and the training function is traincgb, i.e. training with the Powell-Beale conjugate gradient method.
One gesture template image is one numeric character and corresponds to one desired output. For the training sample set, there are 10 classes of data, 0 to 9, in the present invention. After each numeric character undergoes the same series of processing described above, the normalized image is a 36×20 Boolean matrix; these 36×20 = 720 elements form a 720*1 column matrix for the numeric character, i.e. the feature vector of the numeric character. The feature vectors of the ten digits 0 to 9 form a 720×10 input vector, denoted sample_group = [0, 1, 2, …, 9], where 0, 1, …, 9 represent the feature vectors of the numeric characters. The target vector corresponding to this input vector requires that, after each digit is input into the BP neural network, the output neuron corresponding to that digit is 1 and the other positions are 0. For this purpose the target vector is taken as a 10×10 identity matrix with ones on the diagonal, implemented with the MATLAB command targets=eye(10).
For the 10 classes of data 0 to 9, 20 training samples are taken for each class, i.e. 20 groups of input vectors constitute the training sample set, which is: P = [samples_group1, samples_group2, …, samples_group20]. Because the feature vectors of the numeric characters consist of Boolean elements, the training sample set is of Boolean type, and the BP neural network cannot be trained on Boolean values, so the training sample set needs to be converted to double precision, which can be done directly with P = double(P).
The target value set corresponding to the training sample set consists of 20 groups of target vectors: T = [targets1, targets2, …, targets19, targets20].
In this embodiment, the target performance function used for BP neural network training is SSE, and the error performance target value is set to 0.01. When the number of BP neural network training iterations reaches the preset maximum (taken as 1000) or the BP neural network's sum of squared errors SSE (the error value) falls below 0.01 (the expected value), the training target is considered reached and training is terminated. This can be implemented with MATLAB; part of the implementation code is as follows:
net.trainParam.epochs=1000; % maximum number of training iterations
net.trainParam.show=20; % display interval
net.trainParam.min_grad=1e-10; % minimum performance gradient
net.performFcn='sse'; % set the target performance function
net.trainParam.goal=0.01; % performance goal (expected value)
net.layers{1}.initFcn='initwb'; % the layer initialization function is set to 'initwb', making the input-layer initialization statement 'randnr' below take effect
net.inputWeights{1,1}.initFcn='randnr'; % initialize the input-layer weight vectors
net.layerWeights{2,1}.initFcn='randnr'; % initialize the weight vectors from layer 1 to layer 2
net=init(net); % initialize the network
[net,tr]=train(net,P,T); % train the network
where net in [net,tr] is the BP neural network with updated weights, and tr is the training record (the number of iterations and the error of each training pass).
Referring also to FIG. 9, which shows the learning error curve of the BP neural network during training: when the level shown in FIG. 9 is reached, the sum of squared errors SSE of the neural network is 0.0088, which is less than the performance goal of 0.01, so training can be ended and the pre-trained BP neural network is obtained.
With the apparatus for recognizing a mouse gesture provided by the above embodiment, a mouse gesture input by a user through a mouse operation is detected and a mouse gesture image is acquired to obtain an image to be recognized; the image to be recognized is preprocessed to obtain a feature vector of the image to be recognized; and the feature vector is input into a pre-trained BP neural network to obtain a recognized character. Using the artificial intelligence of the BP neural network, the feature vector of the preprocessed mouse gesture image is input into the trained BP neural network to obtain the recognition result. Thus, the BP neural network's advantages of distributed information storage, parallel information processing, self-organization and self-learning allow the mouse gestures input by the user to be processed promptly, and a large amount of training also ensures recognition accuracy; at the same time, the apparatus can be applied to various types of applications, greatly improving the user experience.
The above are only preferred embodiments of the present invention and do not therefore limit the patent scope of the present invention. Any equivalent structural or process transformation made using the contents of the description and drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included within the patent protection scope of the present invention.

Claims (10)

  1. A method for recognizing a mouse gesture, characterized in that the method comprises:
    detecting a mouse gesture input by a user through a mouse operation, and acquiring a mouse gesture image to obtain an image to be recognized;
    preprocessing the image to be recognized to obtain a feature vector of the image to be recognized;
    inputting the feature vector into a pre-trained BP neural network to obtain a recognized character.
  2. The method for recognizing a mouse gesture according to claim 1, characterized in that the preprocessing the image to be recognized to obtain a feature vector of the image to be recognized specifically comprises:
    performing grayscale processing on the image to be recognized;
    performing binarization processing on the grayscaled image to be recognized and performing discrete-noise removal;
    normalizing the image to be recognized after discrete-noise removal to obtain a normalized pixel bitmap of the image to be recognized;
    extracting the feature vector of the image to be recognized from the pixel matrix of the normalized pixel bitmap.
  3. The method for recognizing a mouse gesture according to claim 2, characterized in that the extracting the feature vector of the image to be recognized from the pixel matrix of the normalized pixel bitmap specifically comprises:
    extracting each pixel value from the normalized image bitmap to obtain a pixel matrix corresponding to the image to be recognized;
    converting the pixel matrix into a column matrix to obtain the feature vector of the image to be recognized.
  4. The method for recognizing a mouse gesture according to claim 1, characterized in that, before the detecting a mouse gesture input by a user through a mouse operation and acquiring a mouse gesture image to obtain an image to be recognized, the method further comprises:
    training a preset BP neural network with preset gesture template images to obtain the pre-trained BP neural network.
  5. The method for recognizing a mouse gesture according to claim 1, characterized in that the training a preset BP neural network with preset gesture template images to obtain the pre-trained BP neural network specifically comprises:
    preprocessing each preset gesture template image to obtain a feature vector of each gesture template image;
    setting the desired output of the preset BP neural network as the character corresponding to each gesture template image;
    inputting the feature vector of each gesture template image into the BP neural network, and training the BP neural network according to the desired output;
    when the number of training iterations of the BP neural network reaches a preset number or the error value is less than an expected value, stopping the training and saving the parameters of the BP neural network at that time to obtain the pre-trained BP neural network.
  6. An apparatus for recognizing a mouse gesture, characterized in that the apparatus comprises:
    a detecting module, configured to detect a mouse gesture input by a user through a mouse operation and acquire a mouse gesture image to obtain an image to be recognized;
    a processing module, configured to preprocess the image to be recognized to obtain a feature vector of the image to be recognized;
    a recognition module, configured to input the feature vector into a pre-trained BP neural network to obtain a recognized character.
  7. The apparatus for recognizing a mouse gesture according to claim 6, characterized in that the processing module specifically comprises:
    a grayscale unit, configured to perform grayscale processing on the image to be recognized;
    a binarization unit, configured to perform binarization processing on the grayscaled image to be recognized and perform discrete-noise removal;
    a normalization unit, configured to normalize the image to be recognized after discrete-noise removal to obtain a normalized pixel bitmap of the image to be recognized;
    an extracting unit, configured to extract the feature vector of the image to be recognized from the pixel matrix of the normalized pixel bitmap.
  8. The apparatus for recognizing a mouse gesture according to claim 7, characterized in that the extracting unit specifically comprises:
    a matrix subunit, configured to extract each pixel value from the normalized image bitmap to obtain a pixel matrix corresponding to the image to be recognized;
    a conversion subunit, configured to convert the pixel matrix into a column matrix to obtain the feature vector of the image to be recognized.
  9. The apparatus for recognizing a mouse gesture according to claim 6, characterized in that the apparatus further comprises:
    a training module, configured to train a preset BP neural network with preset gesture template images to obtain the pre-trained BP neural network.
  10. The apparatus for recognizing a mouse gesture according to claim 9, characterized in that the training module specifically comprises:
    a preprocessing unit, configured to preprocess each preset gesture template image to obtain a feature vector of each gesture template image;
    a setting unit, configured to set the desired output of the pre-designed BP neural network as the gesture corresponding to each gesture template image;
    an input unit, configured to input the feature vector of each gesture template image into the BP neural network and train the BP neural network according to the desired output;
    a judging unit, configured to, when the number of training iterations of the BP neural network reaches a preset number or the error value is less than the expected value, stop the training and save the parameters of the BP neural network at that time to obtain the pre-trained BP neural network.
PCT/CN2017/073382 2017-02-13 2017-02-13 Method and apparatus for recognizing mouse gestures WO2018145316A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/073382 WO2018145316A1 (zh) 2017-02-13 2017-02-13 Method and apparatus for recognizing mouse gestures

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/073382 WO2018145316A1 (zh) 2017-02-13 2017-02-13 Method and apparatus for recognizing mouse gestures

Publications (1)

Publication Number Publication Date
WO2018145316A1 true WO2018145316A1 (zh) 2018-08-16

Family

ID=63107130

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/073382 WO2018145316A1 (zh) 2017-02-13 2017-02-13 Method and apparatus for recognizing mouse gestures

Country Status (1)

Country Link
WO (1) WO2018145316A1 (zh)


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101408824A (zh) * 2008-11-18 2009-04-15 广东威创视讯科技股份有限公司 Mouse gesture recognition method
US20110304573A1 (en) * 2010-06-14 2011-12-15 Smith George C Gesture recognition using neural networks
CN102508547A (zh) * 2011-11-04 2012-06-20 哈尔滨工业大学深圳研究生院 Computer-vision-based method and system for constructing a gesture input method
CN102854982A (zh) * 2012-08-01 2013-01-02 华平信息技术(南昌)有限公司 Method for recognizing user-defined gesture trajectories
CN104484115A (zh) * 2014-12-01 2015-04-01 林志均 Mouse gesture recognition method
CN105138169A (zh) * 2015-08-26 2015-12-09 苏州市新瑞奇节电科技有限公司 Gesture-recognition-based touchpad control device

Similar Documents

Publication Publication Date Title
WO2021080103A1 (en) Method for learning and testing user learning network to be used for recognizing obfuscated data created by concealing original data to protect personal information and learning device and testing device using the same
WO2021132927A1 (en) Computing device and method of classifying category of data
WO2016204466A1 (en) User authentication method and electronic device supporting the same
  • WO2019000462A1 (zh) Face image processing method and apparatus, storage medium, and electronic device
  • WO2018088806A1 (ko) Image processing apparatus and image processing method
  • WO2017039287A1 (ko) Segment-based handwritten signature authentication system and method
  • WO2015137666A1 (ko) Object recognition apparatus and control method therefor
WO2021158085A1 (en) Neural network update method, classification method and electronic device
WO2015160207A1 (en) System and method for detecting region of interest
  • WO2017099555A1 (ko) Time-division segment-block-based handwritten signature authentication system and method
WO2021177758A1 (en) Methods and systems for denoising media using contextual information of the media
  • WO2022240030A1 (ko) Companion animal life management system and method therefor
WO2020017902A1 (en) Method and apparatus for performing user authentication
  • WO2020027519A1 (ko) Image processing apparatus and operating method thereof
  • WO2022240029A1 (ko) Companion animal identification system and method therefor
WO2020091519A1 (en) Electronic apparatus and controlling method thereof
  • WO2020207038A1 (zh) Face-recognition-based people counting method, apparatus, device, and storage medium
  • WO2018145316A1 (zh) Method and apparatus for recognizing mouse gestures
  • WO2024039058A1 (ko) Skin diagnosis apparatus, skin diagnosis system comprising same, and method therefor
  • WO2021060721A1 (ko) Electronic device and image processing method thereof
WO2021230680A1 (en) Method and device for detecting object in image
  • WO2022065604A1 (ko) Mobile device, method for determining user-related information using drawing data generated by the user, and computer program
  • WO2017171142A1 (ko) System and method for detecting facial feature points
Niaraki et al. Accuracy improvement of face recognition system based on co-occurrence matrix of local median binary pattern
WO2018008934A2 (en) Adaptive quantization method for iris image encoding

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17895884

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: 1205A 16.12.2019

122 Ep: pct application non-entry in european phase

Ref document number: 17895884

Country of ref document: EP

Kind code of ref document: A1