CN115471917A - Gesture detection and recognition system and method - Google Patents

Gesture detection and recognition system and method

Info

Publication number: CN115471917A (application CN202211194591.1A; granted as CN115471917B)
Authority: CN (China)
Language: Chinese (zh)
Prior art keywords: gesture, sensing, infrared detector, weighting matrix, detector array
Legal status: Granted; active
Inventors: 胡小燕, 操俊, 曹静, 刘松, 汪志强, 王伟平, 李慧津
Original and current assignee: CETC Information Science Research Institute
Application filed by CETC Information Science Research Institute, with priority to CN202211194591.1A

Classifications

    • G06V40/28: Recognition of hand or arm movements, e.g. recognition of deaf sign language (under G06V40/20, movements or behaviour, e.g. gesture recognition; G06V40/00, recognition of biometric, human-related or animal-related patterns in image or video data)
    • G01J5/20: Radiation pyrometry, e.g. infrared or optical thermometry, using electric radiation detectors such as resistors, thermistors or semiconductors sensitive to radiation, e.g. photoconductive devices
    • G01J5/48: Thermography; techniques using wholly visual means
    • G06V10/40: Extraction of image or video features
    • G06V10/82: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G01J2005/202: Arrays

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Psychiatry (AREA)
  • Photometry And Measurement Of Optical Pulse Characteristics (AREA)
  • Image Analysis (AREA)

Abstract

The disclosure relates to the field of gesture recognition and provides a gesture detection and recognition system and method. The system comprises: a sensing and computing integrated infrared detection module that extracts feature information based on a weighting matrix, performing a convolution operation on the gesture image to be recognized according to a dynamically configured weighting matrix of a pre-trained deep neural network model to obtain the feature information of the gesture image; a gesture recognition module that searches a preset gesture-feature correspondence table for the gesture semantics corresponding to the feature information under the current weighting matrix, updates the weighting matrix when no match is found (so that the infrared detection module re-extracts feature information based on the updated matrix), and determines the matched gesture semantics as the semantic information of the gesture image when a match is found; and a semantic interpretation and conversion module that converts the semantic information into information in a preset form and transmits it to the user. The system alleviates, to a certain extent, the high power consumption, large size and large chip count of conventional gesture recognition systems.

Description

Gesture detection and recognition system and method
Technical Field
The present disclosure relates to the field of gesture recognition technologies, and in particular, to a gesture detection and recognition system and method.
Background
Gesture recognition is a topic in computer science and language technology that aims to interpret human gestures by means of mathematical algorithms. In terms of technical implementation, common gesture recognition methods mainly include neural-network-based template matching and hidden Markov models. Neural-network-based template matching treats a gesture motion as a sequence of static gesture images and compares the gesture sequence to be recognized against known gesture template sequences in order to recognize the gesture.
The ordinary visible-light cameras widely used in current research work normally only under sufficient visible light. When light is insufficient, their images suffer from blur and low resolution, so image processing and recognition may produce wrong image segmentation and, in turn, false recognition. Moreover, in an environment entirely without visible light, an ordinary camera cannot work at all. Requiring sufficient visible light is therefore a major shortcoming that greatly limits the environments in which ordinary cameras are applicable.
Because infrared electromagnetic waves attenuate little when propagating through the atmosphere, a thermal infrared imager, which forms images by detecting infrared radiation, can observe a target object in an environment entirely without visible light, can detect targets through dense smoke screens, clouds or fog without being affected by the environment, and can even detect camouflaged targets and fast-moving targets. Infrared gesture recognition therefore avoids the illumination requirement of ordinary cameras, is suitable for a variety of complex environments, and operates normally with or without visible light.
Existing infrared gesture recognition technology has received relatively little research attention. One thermal-imager-based gesture recognition method combines a thermal imager with gesture recognition and proposes an operating pipeline for an infrared gesture recognition system comprising five parts: thermal infrared image data acquisition, infrared image grayscale expansion, image edge detection and segmentation, image feature extraction, and gesture classification and recognition, each comprising several steps. However, this method needs to acquire the whole infrared gesture image, so it suffers from large data acquisition and transmission volumes, many data processing steps and long processing times, as well as high system power consumption, large size and a large number of required processing chips.
Disclosure of Invention
The present disclosure aims to address at least one of the above problems of the prior art, and provides a gesture detection and recognition system and method.
In one aspect of the present disclosure, a gesture detection and recognition system is provided, including:
the sensing and calculating integrated infrared detection module is used for extracting characteristic information based on the weighting matrix: performing convolution operation on a gesture image to be recognized based on a pre-trained deep neural network model according to a dynamically configured weighting matrix to obtain characteristic information of the gesture image;
the gesture recognition module is used for searching a gesture semantic corresponding to the feature information under the weighting matrix from a preset gesture feature corresponding table, updating the weighting matrix when the gesture semantic is not searched, so that the sensing and calculation integrated infrared detection module extracts the feature information based on the updated weighting matrix, and determining the gesture semantic as the semantic information of the gesture image when the gesture semantic is searched; the gesture feature corresponding table comprises the corresponding relation between the weighting matrix and the feature information and the gesture semantics;
and the semantic interpretation processing conversion module is used for converting the semantic information into information in a preset form and transmitting the information in the preset form to a user.
Optionally, the sensing and computing integrated infrared detection module is further configured to acquire gesture data of a plurality of sample gesture images with known gesture semantics according to a dynamically configured edge extraction operator, so as to obtain the training sample set, wherein the sample gesture images differ in at least one of gesture semantics, position, viewing angle and distance.
Optionally, the sensing and calculating integrated infrared detection module includes an infrared detector array and a sensing and calculating integrated processing chip electrically connected to the infrared detector array;
the infrared detector array is used for outputting current signals under the action of the sensing and computing integrated processing chip based on the edge extraction operator or the weighting matrix;
the sensing and calculation integrated processing chip is used for dynamically configuring the edge extraction operator or the weighting matrix to the infrared detector array, and controlling the infrared detector array to output the current signal based on the edge extraction operator or the weighting matrix to obtain the training sample set or the characteristic information.
Optionally, the sensing and computation integrated processing chip includes a bias voltage generation module, a gating device array, a column amplification conversion module, and a control module, where:
first ends of the infrared detectors in each row are respectively and electrically connected with the bias voltage generation module and the control module through corresponding row output lines;
the first end of each column of the gating devices is electrically connected with the second end of the infrared detector of the corresponding column, the second end of each column of the gating devices is electrically connected with the corresponding amplification conversion module through the corresponding column output line, and the control end of each column of the gating devices is electrically connected with the control module; wherein:
the control module is used for controlling the bias voltage generation module to apply a bias voltage to the infrared detector array; and,
the control module is further configured to control the gating device array to gate according to a preset gating manner, adjust the bias voltage, dynamically configure the edge extraction operator or the weighting matrix to the infrared detector array, and control the infrared detector array to output the current signal based on the edge extraction operator or the weighting matrix;
the column amplification conversion module is configured to amplify the current signal, and convert the amplified current signal into a voltage signal to obtain the training sample set or the feature information.
In another aspect of the present disclosure, a gesture detection and recognition method is provided, which is applied to the gesture detection and recognition system described above. The gesture detection and recognition method comprises the following steps:
extracting characteristic information based on the weighting matrix: performing convolution operation on a gesture image to be recognized based on a pre-trained deep neural network model according to a dynamically configured weighting matrix to obtain characteristic information of the gesture image;
searching a gesture semantic corresponding to the feature information under the weighting matrix from a preset gesture feature corresponding table, updating the weighting matrix when the gesture semantic is not found, returning to the step of extracting the feature information based on the weighting matrix, and determining the gesture semantic as the semantic information of the gesture image when the gesture semantic is found; the gesture feature correspondence table comprises the corresponding relation between the weighting matrix and the feature information and the gesture semantics;
and converting the semantic information into information in a preset form, and transmitting the information in the preset form to a user.
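The method steps above amount to a match-or-reconfigure loop. The minimal Python sketch below illustrates that control flow only; the names (`recognize_gesture`, `toy_extract`) and the toy table are illustrative assumptions, not from the disclosure, and a simple dot product stands in for the in-sensor convolution:

```python
def recognize_gesture(image, weight_matrices, table, extract_features):
    """Try each candidate weighting matrix in turn; return the gesture
    semantics of the first (weighting matrix, feature) pair found in the
    gesture-feature correspondence table, or None if no matrix matches."""
    for w in weight_matrices:                    # dynamically (re)configure the matrix
        features = extract_features(image, w)    # stands in for the in-sensor convolution
        semantics = table.get((w, features))     # look up (matrix, features) in the table
        if semantics is not None:
            return semantics                     # found: this is the semantic information
    return None                                  # every candidate weighting matrix exhausted

# Toy stand-in for the detector module: "features" = dot product of image and w.
def toy_extract(image, w):
    return sum(p * wi for p, wi in zip(image, w))

w1, w2 = (1, 1, 1), (1, 0, 1)
table = {(w1, 6): "advance", (w2, 4): "retreat"}   # illustrative table entries
print(recognize_gesture((1, 2, 3), [w1, w2], table, toy_extract))  # prints advance
```

A real implementation would extract a feature vector per configured matrix in hardware and quantize it before the table lookup; the loop structure, however, matches the search-then-update behavior described above.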
Optionally, the pre-trained deep neural network model is obtained according to the following steps:
dynamically configuring a preset edge extraction operator to the sensing and computing integrated infrared detection module based on a preset application scene, and acquiring, through the module, gesture data of a plurality of sample gesture images with known gesture semantics to obtain the training sample set; wherein the sample gesture images differ in at least one of gesture semantics, position, viewing angle and distance;
establishing a deep neural network model comprising convolutional layers based on the preset application scene;
and training the deep neural network model by using the training sample set to obtain the pre-trained deep neural network model.
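As a concrete illustration of the training step, the sketch below substitutes a single perceptron-style linear layer for the deep neural network and two hand-written 2-D feature vectors for the edge-extracted infrared samples; everything here (`train`, `predict`, the toy data) is an illustrative assumption, not the disclosure's actual model:

```python
def predict(x, w, b):
    """Linear decision: 1 if w.x + b > 0, else 0."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

def train(samples, labels, epochs=20, lr=0.1):
    """Tiny stand-in for deep-network training: a perceptron over
    feature vectors, one weight per feature plus a bias term."""
    w, b = [0.0] * len(samples[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):        # y is 0 or 1
            err = y - predict(x, w, b)           # perceptron update rule
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b

# Two toy "gestures": feature vectors dominated by the first vs. second channel.
samples = [(3.0, 0.1), (2.5, 0.2), (0.2, 3.1), (0.1, 2.8)]
labels = [1, 1, 0, 0]
w, b = train(samples, labels)
```

The actual disclosure trains a convolutional network whose first-layer convolutions are executed by the detector array itself; the point of the sketch is only the fit-on-labelled-samples workflow.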
Optionally, when the sensing and arithmetic integrated infrared detection module comprises an infrared detector array and a sensing and arithmetic integrated processing chip electrically connected with the infrared detector array,
the method comprises the following steps that based on a preset application scene, a preset edge extraction operator is dynamically configured to the sensing and calculation integrated infrared detection module, gesture data of a plurality of sample gesture images with known gesture semantics are obtained through the sensing and calculation integrated infrared detection module, and a training sample set is obtained and comprises the following steps:
the sensing and computing integrated processing chip dynamically configures the preset edge extraction operator to the infrared detector array based on the preset application scene;
the infrared detector array acquires the gesture data based on the preset edge extraction operator and outputs a current signal under the control of the sensing and calculating integrated processing chip;
and the sensing and calculating integrated processing chip obtains the training sample set according to the current signal.
Optionally, the edge extraction operator includes an x-direction operator and a y-direction operator;
when the y-direction operator includes a negative number, the sensing and computing integrated processing chip dynamically configures the edge extraction operator to the infrared detector array based on the application scene, including:
the sensing and computing integrated processing chip configures a substitute operator corresponding to the y-direction operator to the infrared detector array based on the application scene; the substitute operator is obtained by replacing a negative number in the y-direction operator with a corresponding positive number;
when the x-direction operator includes a negative number, the sensing and computing integrated processing chip dynamically configures the edge extraction operator to the infrared detector array based on the application scene, including:
the sensing and calculating integrated processing chip respectively configures a preset number of adjusting operators corresponding to the x-direction operators to the infrared detector array based on the application scene; wherein the adjustment operator is a non-negative operator.
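The constraint motivating these substitute and adjusting operators is that a detector array modulated by bias voltages can realistically apply only non-negative weights. One standard way to handle an operator with negative entries, shown below under that assumption (the disclosure does not spell out its exact construction, so this decomposition is illustrative), is to split it into two non-negative operators whose convolution results are subtracted afterwards:

```python
def split_nonnegative(op):
    """Split an operator with negative entries into non-negative operators
    (pos, neg) with op = pos - neg, so hardware limited to non-negative
    weights can compute each part separately and subtract the results."""
    pos = [[max(v, 0) for v in row] for row in op]
    neg = [[max(-v, 0) for v in row] for row in op]
    return pos, neg

sobel_x = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # x-direction edge operator
pos, neg = split_nonnegative(sobel_x)
# pos = [[0, 0, 1], [0, 0, 2], [0, 0, 1]], neg = [[1, 0, 0], [2, 0, 0], [1, 0, 0]]
```

Convolving the image with `pos` and with `neg` and taking the difference of the two outputs reproduces the convolution with `sobel_x` exactly, since convolution is linear in the kernel.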
Optionally, when the sensing and computation integrated processing chip includes a bias voltage generation module, a gating device array, a column amplification conversion module and a control module,
the method for performing convolution operation on the gesture image to be recognized based on the pre-trained deep neural network model according to the dynamically configured weighting matrix to obtain the characteristic information of the gesture image comprises the following steps:
the control module controls the bias voltage generation module to apply bias voltage to the infrared detector array;
the control module controls the gating device array to gate according to a preset gating mode, adjusts the bias voltage and dynamically configures the weighting matrix to the infrared detector array;
the infrared detector array outputs corresponding current signals under the control of the control module based on the weighting matrix and the gesture image;
and the column amplification conversion module amplifies the current signal, converts the amplified current signal into a voltage signal and obtains the characteristic information.
Compared with the prior art, the present disclosure combines detection sensing with feature computation, integrating sensing, convolution and related steps into sensing and computing integrated detection. The feature information of the gesture image to be recognized is obtained directly by the sensing and computing integrated infrared detection module, and the gesture recognition module can then perform gesture recognition directly from that feature information via the gesture-feature correspondence table. This reduces the system's data transmission bandwidth requirements and data processing compute requirements, shortens gesture recognition processing time and lowers system power consumption, thereby alleviating, to a certain extent, the high power consumption, large size and large chip count of conventional gesture recognition systems. The disclosure can be applied to scenarios requiring close-range gesture detection and recognition at night, such as smart helmets, smart glasses and unmanned aerial vehicle clusters.
Drawings
One or more embodiments are illustrated by way of example in the accompanying drawings, which are not to be construed as limiting the embodiments, in which elements having the same reference numeral designations represent like elements throughout, and in which the drawings are not to be construed as limiting in scale unless otherwise specified.
Fig. 1 is a schematic structural diagram of a gesture detection and recognition system according to an embodiment of the present disclosure;
fig. 2 is a schematic structural diagram of a sensing and computing integrated infrared detection module according to another embodiment of the present disclosure;
fig. 3 is a schematic array structure diagram of a sensing and computing integrated infrared detection module according to another embodiment of the present disclosure;
fig. 4 is a schematic structural diagram of a bias voltage generation module according to another embodiment of the present disclosure;
FIG. 5 is a schematic circuit diagram of a sensing and computing integrated infrared detection module provided in another embodiment of the present disclosure;
FIG. 6 is a flowchart of a gesture detection and recognition method according to another embodiment of the disclosure;
fig. 7 is a flowchart of a gesture detection and recognition method according to another embodiment of the present disclosure;
fig. 8 is a flowchart of a gesture recognition method based on deep learning in the prior art.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present disclosure clearer, the embodiments are described in detail below with reference to the accompanying drawings. Those of ordinary skill in the art will appreciate that numerous technical details are set forth in the various embodiments to aid understanding of the disclosure; however, the technical solutions claimed herein can be implemented without these details, and with various changes and modifications based on the following embodiments. The division into embodiments below is for convenience of description only, should not limit specific implementations of the disclosure, and the embodiments may be combined with and refer to one another where there is no contradiction.
One embodiment of the present disclosure relates to a gesture detection and recognition system, as shown in fig. 1, including a sensory-computational integrated infrared detection module 101, a gesture recognition module 102, and a semantic interpretation processing conversion module 103.
A sensing and calculating integrated infrared detection module 101, configured to extract feature information based on the weighting matrix: and performing convolution operation on the gesture image to be recognized based on a pre-trained deep neural network model according to the dynamically configured weighting matrix to obtain the characteristic information of the gesture image.
The gesture recognition module 102 is configured to search a gesture semantic corresponding to the feature information under the weighting matrix from a preset gesture feature correspondence table, update the weighting matrix when the gesture semantic is not found, so that the sensing and computing integrated infrared detection module extracts the feature information based on the updated weighting matrix, and determine the gesture semantic as the semantic information of the gesture image when the gesture semantic is found. The gesture feature correspondence table comprises a weighting matrix, and a correspondence relation between feature information and gesture semantics.
And the semantic interpretation processing conversion module 103 is used for converting the semantic information into information in a preset form and transmitting the information in the preset form to the user.
Specifically, the deep neural network model may be a convolutional neural network model for gesture recognition, which is established based on the sensing and computation integrated infrared detection module 101 and a deep learning theory, and the model may include a convolutional layer, a pooling layer, a full-link layer, and the like. The convolution operation of the convolution layer can be realized by the sensing and calculating integrated infrared detection module 101 by using the weighting matrix. Pooling layers, full connectivity layers, etc. may be implemented by the gesture recognition module 102. The gesture recognition module 102 may be a Field Programmable Gate Array (FPGA) chip or an Artificial Intelligence (AI) chip.
In general, the number of gesture semantics in a particular application scenario is limited, and in a night-time close-range scenario the gesture is usually prominent and the background environment has little influence on recognition. A gesture-feature correspondence table for pattern matching can therefore be established from the correspondence between the feature information of gesture images and gesture semantics. Moreover, since the sensing and computing integrated infrared detection module 101 extracts feature information based on a dynamically configured weighting matrix, a more precise and richer gesture-feature correspondence table can be built by recording the weighting matrix alongside the feature information and the gesture semantics. For example, the gesture-feature correspondence table may include the following entries: when the feature information obtained by the sensing and computing integrated infrared detection module 101 based on weighting matrix W1 is C1-1, the gesture semantics of the gesture image is 1-1; when it is C1-2, the gesture semantics is 1-2; and when it is C1-3, the gesture semantics is 1-3.

Likewise, when the feature information obtained by the module 101 based on weighting matrix W2 is C2-1, the gesture semantics of the gesture image is 2-1; when it is C2-2, the gesture semantics is 2-2; and when it is C2-3, the gesture semantics is 2-3.

It should be noted that the gesture-feature correspondence table is not limited to the above. For example, it may further include entries in which the feature information obtained based on weighting matrix W1 is C1-4, C1-5, ..., C1-N, with corresponding gesture semantics 1-4, 1-5, ..., 1-N; or entries in which the feature information obtained based on weighting matrix W2 is C2-4, C2-5, ..., C2-N, with gesture semantics 2-4, 2-5, ..., 2-N; or, more generally, entries in which the feature information obtained based on a weighting matrix Wk is Ck-1, Ck-2, ..., Ck-N, with gesture semantics k-1, k-2, ..., k-N.
It should be further noted that, for ease of identification and distinction, the gesture-feature correspondence table represents gesture semantics in numbered form. The meaning behind each number can be a digit such as 1, 2, 3 and so on, an action such as advance or retreat, or any other meaning, and can be set by those skilled in the art according to actual needs. Besides numbers, other forms such as characters may also be used to represent gesture semantics, which is not limited in this embodiment.
The information in the preset form may be images, text, voice and the like; of course, other forms are also possible, which is not limited in this embodiment.
Compared with the prior art, this embodiment combines detection sensing with feature computation, integrating sensing, convolution and related steps into sensing and computing integrated detection. The feature information of the gesture image to be recognized is obtained by the sensing and computing integrated infrared detection module, and the gesture recognition module can perform gesture recognition directly from that feature information via the gesture-feature correspondence table. This reduces the system's data transmission bandwidth requirements and data processing compute requirements, shortens gesture recognition processing time and lowers power consumption, thereby alleviating, to a certain extent, the high power consumption, large size and large chip count of conventional gesture recognition systems, enabling long-term online operation, and making the system applicable to scenarios requiring close-range gesture detection and recognition at night, such as smart helmets, smart glasses and unmanned aerial vehicle clusters.
Illustratively, the sensing and computing integrated infrared detection module 101 is further configured to acquire gesture data of a plurality of sample gesture images with known gesture semantics according to a dynamically configured edge extraction operator, so as to obtain a training sample set. The sample gesture images differ in at least one of gesture semantics, position, viewing angle and distance; that is, the gestures in the sample images may correspond to different gesture semantics, positions, viewing angles, distances and so on. Preferably, the sample gesture images cover different gesture semantics, positions, viewing angles and distances, to enrich the training sample set.
Illustratively, the sensing and computing integrated infrared detection module 101 includes an infrared detector array and a sensing and computing integrated processing chip electrically connected to the infrared detector array.
With reference to figs. 2 and 3, the infrared detector array includes a plurality of infrared detectors arranged in an array and is configured to output current signals, under the control of the sensing and computing integrated processing chip, based on an edge extraction operator or a weighting matrix.
Illustratively, the infrared detector may comprise a thermosensitive or photosensitive resistor device. The infrared detector serves as the core component of the sensing and computing integrated infrared detection module: its resistance changes after it receives an infrared thermal radiation signal, so the thermo-/photo-resistive characteristics of the device realize sensing of the infrared thermal radiation signal, with the infrared thermal radiation power represented by the resistance change of the detector.
The resistance of an infrared detector can be characterized by its conductance. For example, as shown in figs. 2 and 3, when the infrared detector array has m rows and n columns, the conductances of the row-1 infrared detectors are denoted g_11, g_12, g_13, …, g_1n; those of row 2 are denoted g_21, g_22, g_23, …, g_2n; those of row 3 are denoted g_31, g_32, g_33, …, g_3n; and so on, up to row m with g_m1, g_m2, g_m3, …, g_mn.
And the sensing and computing integrated processing chip is used for dynamically configuring the edge extraction operator or the weighting matrix to the infrared detector array, and controlling the infrared detector array to output current signals based on the edge extraction operator or the weighting matrix to obtain a training sample set or characteristic information.
Specifically, the sensing and calculation integrated processing chip can configure an edge extraction operator or a weighting matrix to the infrared detector array in a voltage signal manner, so that the infrared detector array can output a current signal based on the edge extraction operator or the weighting matrix in the voltage signal form, thereby obtaining a training sample set or feature information of a gesture image to be recognized.
It should be noted that the sensing and computing integrated processing chip can be implemented with a standard complementary metal oxide semiconductor (CMOS) process. The infrared detector array and the sensing and computing integrated processing chip can realize monolithic interconnection and integration through a subsequent (post-CMOS) process.
Illustratively, as shown in fig. 2, the sensing and computing integrated processing chip comprises a bias voltage generating module, a gating device array, a column amplification conversion module and a control module.
First ends of the infrared detectors in each row are respectively and electrically connected with the bias voltage generation module and the control module through corresponding row output lines. That is, as shown in fig. 2, the first end of each infrared detector is electrically connected to the bias voltage generating module and the control module through the corresponding row output line, respectively.
The first end of each gating device is electrically connected with the second end of the corresponding infrared detector, the second end of each column of gating devices is electrically connected with the corresponding amplification conversion module through the corresponding column output line, and the control end of each gating device is electrically connected with the control module.
The bias voltage generating module is used for providing bias voltage for the infrared detector array, and the bias voltage can be configured into an edge extraction operator or a weighting matrix and is configured to the infrared detector array through the control module.
Illustratively, as shown in fig. 4, the bias voltage generating module may include a reference voltage generating module and a digital-to-analog converter (DAC). The reference voltage generating module generates reference voltage, the reference voltage is converted into programmable analog voltage through the digital-to-analog converter, and the programmable analog voltage is configured to the corresponding row output line under the control of the control module to provide corresponding bias voltage for each row of infrared detectors.
For example, as shown in fig. 4, when the infrared detector array has m rows and n columns, the infrared detectors of rows 1, 2, …, m correspond to the bias voltages V_sk1, V_sk2, …, V_skm, respectively. For another example, as shown in figs. 2 and 3, the bias voltage generating module may provide the bias voltages V_sk1, V_sk2, V_sk3, …, V_skm for the respective rows of infrared detectors, and the current signals on the column output lines corresponding to the respective columns of infrared detectors can be expressed as I_1, I_2, I_3, …, I_n, respectively.
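The DAC path can be pictured with a small numerical sketch. The bit width, reference voltage, weight range, and linear code-to-voltage mapping below are all assumptions for illustration; the patent does not specify them:

```python
# Hypothetical linear mapping from a non-negative operator weight to a DAC
# code and then to the analog row bias voltage (all parameters assumed).
V_REF = 1.0   # assumed DAC reference voltage, volts
BITS = 8      # assumed DAC resolution, bits
W_MAX = 4.0   # assumed largest weight the chip must represent

def weight_to_bias(w):
    """Quantize weight w to a DAC code, then return the DAC output voltage."""
    code = round(w / W_MAX * (2**BITS - 1))
    return code * V_REF / (2**BITS - 1)

# Bias voltages for a few illustrative weights.
biases = [weight_to_bias(w) for w in (0.0, 1.0, 2.0, 4.0)]
```

Under this assumed mapping, a zero weight yields 0 V and the largest weight yields the full reference voltage; intermediate weights are quantized to the nearest DAC step.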
As shown in fig. 2, the gating device array includes a plurality of gating devices arranged in an array form, and each gating device corresponds to one infrared detector in the infrared detector array. That is to say, the gating devices in the gating device array correspond to the infrared detectors in the infrared detector array one by one to form an array cross structure. In the array cross structure, a first end of each infrared detector is electrically connected with a corresponding row output line, a second end of each infrared detector is electrically connected with a first end of a corresponding gating device, a second end of each gating device is electrically connected with a corresponding column output line, and a control end of each gating device is electrically connected with a control module.
The gating device is used for gating according to a preset gating mode under the control of the control module, adjusting the bias voltage through bias and gating control, and configuring the adjusted bias voltage to the corresponding infrared detector, so that the adjusted bias voltage is configured to the corresponding infrared detector as an edge extraction operator or a weighting matrix. It should be noted that the preset gating manner may be set according to actual needs, and the present embodiment does not limit this.
Illustratively, as shown in fig. 3, the gating device may employ a MOS gating tube. The gating device array consisting of MOS gating tubes and the infrared detector array form a sensing-and-computing integrated cross-array structure. The drain of the MOS gating tube is the first end of the gating device and is electrically connected with the second end of the corresponding infrared detector. The source of the MOS gating tube is the second end of the gating device and is electrically connected with the corresponding amplification conversion module through the corresponding column output line. The gate of the MOS gating tube is the control end of the gating device and is electrically connected with the control module; under the control of the control module, the gating of the infrared detector can be realized and the current output by the infrared detector can be finely adjusted.
According to the embodiment, the MOS gate tube is used as the gating device, so that the size and the weight of the gating device array can be reduced, the energy consumption is further reduced, and the efficiency ratio is improved.
The control module is used for controlling the bias voltage generation module to apply bias voltage to the infrared detector array. Specifically, the control module may configure the bias voltage to the row output line by timing, so as to apply the bias voltage to the infrared detector array through the row output line.
The control module is further used for controlling the gating device array to gate according to a preset gating mode, adjusting the bias voltage, dynamically configuring the edge extraction operator or the weighting matrix to the infrared detector array, and controlling the infrared detector array to output current signals based on the edge extraction operator or the weighting matrix.
The column amplification conversion module is used for amplifying the current signals and converting the amplified current signals into voltage signals to obtain a training sample set or feature information. Specifically, as shown in figs. 2 and 3, the column amplification conversion module includes a plurality of amplification conversion modules; each amplification conversion module corresponds to one column of gating devices and is electrically connected to the second ends of the gating devices in the corresponding column through the corresponding column output line. Each amplification conversion module amplifies the current signal on the corresponding column output line and converts the amplified current signal into a voltage signal, so that the current signals I_1, I_2, I_3, …, I_n on the column output lines are amplified and converted into voltage signals to obtain a training sample set or feature information.
The embodiment further reduces the data transmission amount and the data processing amount of the system, saves the data transmission bandwidth, and accelerates the data transmission speed and the data processing speed of the system.
Illustratively, as shown in fig. 5, the amplification conversion module includes a capacitive transconductance amplifier and a sample-and-hold circuit, an input terminal of the capacitive transconductance amplifier is electrically connected to the corresponding column output line, and an output terminal of the capacitive transconductance amplifier is electrically connected to an input terminal of the sample-and-hold circuit.
Specifically, the capacitive transconductance amplifier is configured to amplify a current signal on the column output line and convert the amplified current signal into a voltage signal. As shown in fig. 5, the sample-and-hold circuit includes a sampling capacitor and a switch (not shown), and is used for maintaining the stability of the voltage signal output by the capacitor transconductance amplifier before performing analog-to-digital conversion on the voltage signal, so as to improve the accuracy of the analog-to-digital conversion.
Illustratively, in conjunction with fig. 2, the sensing and computing integrated infrared detection module further includes an analog-to-digital converter (ADC) and a buffer (not shown), and the ADC is electrically connected to the output terminal of the sample-and-hold circuit through the buffer.
Specifically, in conjunction with fig. 5, the sample-and-hold circuit outputs a voltage signal V_out; after passing through the buffer, V_out is input into the analog-to-digital converter and converted into a digital signal, realizing analog-to-digital conversion.
Arranging the buffer between the sample-and-hold circuit and the analog-to-digital converter increases the driving capability and improves the load-bearing capacity, thereby improving the signal quality and further improving the precision of the analog-to-digital conversion.
The circuit principle of the sensing and computing integrated infrared detection module shown in fig. 2 will be described below with reference to figs. 3 and 5.
The first end of each infrared detector in the infrared detector array is electrically connected with the bias voltage generation module and the control module through the corresponding row output line, so that the bias voltage generation module generates the bias voltages V_sk1, V_sk2, V_sk3, …, V_skm under the control of the control module and configures them, through the corresponding row output lines, to the infrared detectors of the corresponding rows.
When the gating devices in the gating device array are MOS gating tubes, the drains of the MOS gating tubes are electrically connected with the second ends of the corresponding infrared detectors, their sources are electrically connected with the corresponding column output lines, and their gates are electrically connected with the control module. The control module configures the gate voltages V_FID1, …, V_FIDm to the corresponding rows, thereby controlling the MOS gating tubes to gate according to the preset gating mode, adjusting the bias voltages, and configuring the adjusted bias voltages to the corresponding infrared detectors.
Each column output line is electrically connected with the input end of a corresponding capacitor transconductance amplifier in the amplification and conversion module, the output end of each capacitor transconductance amplifier is electrically connected with the input end of a corresponding sample-and-hold circuit, and the output end of the sample-and-hold circuit is electrically connected with the analog-to-digital converter through a buffer, so that current signals output by each column output line are amplified and converted into voltage signals through each capacitor transconductance amplifier, and digital quantization of the voltage signals is realized through the sample-and-hold circuit, the buffer and the analog-to-digital converter.
The calculation principle by which the sensing and computing integrated infrared detection module realizes the convolution operation is described below with reference to fig. 2.
Denote the conductance of the infrared detector in row i and column j of the infrared detector array as g_ij, where i = 1, 2, …, m indexes the rows, m is the number of rows, j = 1, 2, …, n indexes the columns, and n is the number of columns. That is, for an m×n infrared detector array, the conductances of the row-1 detectors are g_11, g_12, g_13, …, g_1n; those of row 2 are g_21, g_22, g_23, …, g_2n; those of row 3 are g_31, g_32, g_33, …, g_3n; and so on, up to row m with g_m1, g_m2, g_m3, …, g_mn. Here g_ij can be determined based on the infrared thermal radiation signal detected by the corresponding infrared detector.
Denote the voltage across the row-i infrared detectors as V_ini; the voltages across the detectors of rows 1, 2, …, m can then be expressed as V_in1, V_in2, …, V_inm, respectively. The voltage V_ini is the correspondingly adjusted bias voltage of the row-i infrared detectors.
Denote the current signal on the column output line corresponding to the column-j infrared detectors as I_j; the current signals on the column output lines of columns 1, 2, …, n can then be expressed as I_1, I_2, …, I_n, respectively. According to Kirchhoff's current law, each column current is the conductance-weighted sum of the row voltages, so the column current signals I_1, I_2, …, I_n can be respectively expressed as:

I_1 = g_11·V_in1 + g_21·V_in2 + … + g_m1·V_inm
I_2 = g_12·V_in1 + g_22·V_in2 + … + g_m2·V_inm
……
I_n = g_1n·V_in1 + g_2n·V_in2 + … + g_mn·V_inm
Converting the current signals obtained according to Kirchhoff's law into matrix form, they can be expressed as the following formula (1):

[I_1, I_2, …, I_n] = [V_in1, V_in2, …, V_inm] · [g_ij]_(m×n)    (1)

where [g_ij]_(m×n) is the m×n conductance matrix whose entry in row i and column j is g_ij.
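Formula (1) is an ordinary vector-matrix product, which can be checked numerically. The conductance and voltage values below are illustrative placeholders, not values from the text:

```python
import numpy as np

# Conductance matrix G of an m x n detector array (illustrative values, siemens).
m, n = 3, 4
G = np.array([[1.0, 2.0, 0.5, 1.5],
              [0.8, 1.2, 2.2, 0.4],
              [1.1, 0.9, 1.7, 2.0]])   # entry (i, j) is g_ij

# Adjusted bias voltages V_in1..V_inm applied to the m row lines (illustrative).
V = np.array([0.3, 0.5, 0.2])

# Kirchhoff's current law: each column line sums g_ij * V_ini over the rows,
# so the n column currents are the vector-matrix product of formula (1).
I = V @ G

# Cross-check against the explicit summation form I_j = sum_i g_ij * V_ini.
I_explicit = np.array([sum(G[i, j] * V[i] for i in range(m)) for j in range(n)])
assert np.allclose(I, I_explicit)
```

The crossbar thus computes all n column currents in parallel from a single set of row voltages.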
In particular, in order to realize matrix multiplication, which is the key step in the convolution operation, multiple groups of voltages across the infrared detector array, namely the adjusted bias voltages, can be configured.
Illustratively, let k = 1, 2, …, t, where t denotes a preset number of groups; denote the k-th group voltage across the row-i infrared detectors as V_inki, and the k-th group current signal on the column output line corresponding to the column-j infrared detectors as I_jk. Then the group-1 voltages across the detectors of rows 1, 2, …, m can be expressed as V_in11, V_in12, …, V_in1m; the group-2 voltages as V_in21, V_in22, …, V_in2m; …; and the group-t voltages as V_int1, V_int2, …, V_intm. Likewise, the group-1 current signals on the column output lines of columns 1, 2, …, n can be expressed as I_11, I_21, …, I_n1; the group-2 current signals as I_12, I_22, …, I_n2; …; and the group-t current signals as I_1t, I_2t, …, I_nt.
Each group of current signals can be obtained from the above formula (1) according to Kirchhoff's law; combining the groups together, they can be expressed as the following formula (2):

[ I_11  I_21  …  I_n1 ]   [ V_in11  V_in12  …  V_in1m ]
[ I_12  I_22  …  I_n2 ] = [ V_in21  V_in22  …  V_in2m ] · [g_ij]_(m×n)    (2)
[  …     …    …    …  ]   [   …       …     …     …   ]
[ I_1t  I_2t  …  I_nt ]   [ V_int1  V_int2  …  V_intm ]
By combining multiple groups of voltages across the detectors with multiple gating modes, multiple kinds of weight information involved in the convolution operation can be configured, so that the corresponding convolution operation is completed based on the configured weight information.
Another embodiment of the present disclosure relates to a gesture detection and recognition method, which is applied to the gesture detection and recognition system provided in the above embodiment of the present disclosure. For a specific structure of the gesture detection and recognition system, reference may be made to the above description of the implementation of the gesture detection and recognition system, and details are not repeated here.
As shown in fig. 6, the gesture detection and recognition method includes the following steps:
step 601, extracting characteristic information based on the weighting matrix: and performing convolution operation on the gesture image to be recognized based on a pre-trained deep neural network model according to the dynamically configured weighting matrix to obtain the characteristic information of the gesture image.
Step 602: find, in a preset gesture feature correspondence table, the gesture semantics corresponding to the feature information under the weighting matrix. When the gesture semantics are not found, update the weighting matrix and return to the step of extracting feature information based on the weighting matrix; when the gesture semantics are found, determine them as the semantic information of the gesture image. That is, when the gesture semantics corresponding to the feature information under the weighting matrix configured in step 601 are not found in the gesture feature correspondence table, step 602 adjusts the weighting matrix so that step 601 is reused to extract the feature information of the gesture image to be recognized based on the adjusted weighting matrix. When the gesture semantics are found in the table, step 602 may directly use them as the semantic information of the gesture image to be recognized. The gesture feature correspondence table records the correspondence among the weighting matrix, the feature information, and the gesture semantics.
Step 603, converting the semantic information into information in a preset form, and transmitting the information in the preset form to a user.
It should be noted that, for the specific contents of the gesture feature correspondence table and the specific forms of the preset-form information, reference may be made to the above description of the gesture detection and recognition system implementation, and details are not repeated here.
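The control flow of steps 601–603 can be sketched as follows. The feature extractor, the table contents, the quantization, and all names are hypothetical stand-ins for the sensing-and-computing hardware, intended only to show the look-up / update-weighting-matrix loop:

```python
import numpy as np

# Hypothetical gesture feature correspondence table:
# (weighting-matrix id, quantized feature signature) -> gesture semantics.
FEATURE_TABLE = {
    ("sobel_x", (1, 1, 1)): "stop",
    ("sobel_y", (1, 1, 1)): "advance",
}
MATRIX_IDS = ["sobel_x", "sobel_y"]   # weighting matrices the chip can configure

def extract_features(image, matrix_id):
    """Stand-in for step 601: coarse, quantized features of `image`
    under the currently configured weighting matrix."""
    kernel = {"sobel_x": np.array([-1.0, 0.0, 1.0]),
              "sobel_y": np.array([1.0, 0.0, -1.0])}[matrix_id]
    resp = np.convolve(image.ravel(), kernel, mode="valid")
    # Quantize three summary statistics into a hashable lookup key.
    return tuple(int(v > 0) for v in (resp.min(), resp.mean(), resp.max()))

def recognize(image):
    """Steps 601-602: try each configurable weighting matrix until the
    extracted features hit an entry in the correspondence table."""
    for matrix_id in MATRIX_IDS:          # "update the weighting matrix" on a miss
        features = extract_features(image, matrix_id)
        semantics = FEATURE_TABLE.get((matrix_id, features))
        if semantics is not None:
            return semantics              # step 603 would convert and transmit this
    return None                           # no table entry under any configured matrix
```

In this toy setting a monotonically increasing image misses under "sobel_x", triggering the matrix update, and then hits under "sobel_y".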
Compared with the prior art, detection sensing and feature calculation are combined, and steps such as sensing and convolution calculation are integrated, realizing sensing-and-computing integrated detection. Gesture recognition can be performed directly with the gesture feature correspondence table based on the feature information of the gesture image to be recognized. This reduces the data transmission bandwidth requirement and the data processing computing-power requirement of the system, and reduces the gesture recognition processing time and power consumption, thereby solving to a certain extent the problems of a traditional gesture recognition method such as high power consumption, large data transmission amount, and large data processing amount. The method can be applied to scenarios that require close-range gesture detection and recognition at night, such as smart helmets, smart glasses, and unmanned aerial vehicle clusters.
Illustratively, the pre-trained deep neural network model is obtained according to the following steps:
based on a preset application scene, dynamically configuring a preset edge extraction operator to a sensing and calculating integrated infrared detection module, and acquiring gesture data of a plurality of sample gesture images with known gesture semantics through the sensing and calculating integrated infrared detection module to obtain a training sample set. The training sample set may include gesture data for each sample gesture image and its corresponding gesture semantics.
And establishing a deep neural network model comprising convolutional layers based on a preset application scene.
And training the deep neural network model by utilizing the training sample set to obtain the pre-trained deep neural network model. In other words, in this step, the deep neural network model may be trained to recognize the corresponding gesture semantics from the gesture data in the training sample set, so as to determine the model parameters of the deep neural network model and obtain the trained model.
Specifically, the sample gesture images differ in at least one of gesture semantics, position, viewing angle, and distance. That is to say, the gestures in the sample gesture images may correspond to different gesture semantics, or to different positions, different viewing angles, different distances, and the like. Preferably, the gesture semantics, positions, viewing angles, and distances corresponding to the gestures in the plurality of sample gesture images are all different, which enhances the richness of the training sample set and further improves the training effect of the deep neural network model.
The preset application scene may be a night gesture recognition scene, a near distance gesture recognition scene, a night near distance gesture recognition scene, and the like.
Exemplarily, when the sensing and calculation integrated infrared detection module includes an infrared detector array and a sensing and calculation integrated processing chip electrically connected to the infrared detector array, based on a preset application scenario, dynamically configuring a preset edge extraction operator to the sensing and calculation integrated infrared detection module, and acquiring gesture data of a plurality of sample gesture images with known gesture semantics through the sensing and calculation integrated infrared detection module to obtain a training sample set, including:
the sensing and calculating integrated processing chip dynamically configures the edge extraction operator to the infrared detector array based on an application scene; the infrared detector array acquires gesture data based on an edge extraction operator and outputs a current signal under the control of the sensing and calculating integrated processing chip; and the sensing and calculating integrated processing chip obtains a training sample set according to the current signal.
Specifically, the infrared detector array can acquire data of sample gesture images with known gesture semantics based on the configured edge extraction operator, and output the acquired data in the form of current signals, so that the sensing and calculation integrated processing chip can determine gesture data of the sample gesture images with the known gesture semantics according to the current signals output by the infrared detector array, and form a training sample set by the gesture data and the corresponding gesture semantics.
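The acquisition of the training sample set described above can be sketched as follows, with the detector array replaced by a plain sliding-window correlation; the sample images, labels, and operator values are illustrative placeholders:

```python
import numpy as np

def apply_operator(image, op):
    """Simulate the detector array: sliding-window correlation of `image`
    with the configured edge extraction operator `op`."""
    H, W = image.shape
    k = op.shape[0]
    out = np.zeros((H - k + 1, W - k + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(image[r:r + k, c:c + k] * op)
    return out

# Hypothetical labelled sample gesture images (values are placeholders).
samples = [
    (np.eye(5), "stop"),                 # gesture image, known semantics
    (np.fliplr(np.eye(5)), "advance"),
]
sobel_x = np.array([[-1.0, 0.0, 1.0], [-2.0, 0.0, 2.0], [-1.0, 0.0, 1.0]])

# Training sample set: (gesture data, gesture semantics) pairs.
training_set = [(apply_operator(img, sobel_x), label) for img, label in samples]
```

Each pair of operator output and known semantics then feeds the deep neural network training described above.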
The embodiment can realize multiplication and addition operation by using the infrared detector array, thereby reducing data transmission quantity and reducing system power consumption. In addition, the infrared detector array can independently output each line of data, thereby realizing parallel operation and accelerating the data processing speed.
Illustratively, the edge extraction operators include an x-direction operator and a y-direction operator. The x-direction operator is an operator in the x direction in the edge extraction operator, and the y-direction operator is an operator in the y direction in the edge extraction operator.
When the y-direction operator includes negative numbers, the step in which the sensing and computing integrated processing chip dynamically configures the edge extraction operator to the infrared detector array based on the application scenario includes: the sensing and computing integrated processing chip configures a substitute operator corresponding to the y-direction operator to the infrared detector array based on the application scenario, where the substitute operator is obtained by replacing each negative number in the y-direction operator with the corresponding positive number.
When the x-direction operator includes negative numbers, the step in which the sensing and computing integrated processing chip dynamically configures the edge extraction operator to the infrared detector array based on the application scenario includes: the sensing and computing integrated processing chip configures a preset number of adjustment operators corresponding to the x-direction operator to the infrared detector array, respectively, based on the application scenario, where each adjustment operator is a non-negative operator.
Specifically, taking the Sobel operator as an example of the edge extraction operator, the x-direction operator W_x and the y-direction operator W_y in the Sobel operator are respectively expressed as follows:

W_x = [ -1  0  1 ]        W_y = [  1   2   1 ]
      [ -2  0  2 ]              [  0   0   0 ]
      [ -1  0  1 ]              [ -1  -2  -1 ]

Since the weights configured into the infrared detector array cannot be negative, the negative numbers in the y-direction operator W_y can be directly replaced by the corresponding positive numbers, yielding the substitute operator W'_y corresponding to W_y, i.e.

W'_y = [ 1  2  1 ]
       [ 0  0  0 ]
       [ 1  2  1 ]
The sensing and computing integrated processing chip configures the substitute operator W'_y to the infrared detector array, so that the array performs y-direction edge data acquisition on the sample gesture images with known gesture semantics based on W'_y; after analog-to-digital conversion, the acquired data can be converted into the gesture data corresponding to the y-direction operator W_y of the Sobel operator through subsequent calculation such as positive-negative conversion.
For the x-direction operator W_x in the Sobel operator, there may be 2 adjustment operators, denoted respectively as

W_x1 = [ 0  0  1 ]        W_x2 = [ 1  0  0 ]
       [ 0  0  2 ]               [ 2  0  0 ]
       [ 0  0  1 ]               [ 1  0  0 ]

so that W_x can be expressed as W_x = W_x1 − W_x2. The sensing and computing integrated processing chip configures the adjustment operators W_x1 and W_x2 to the infrared detector array respectively, so that the array performs x-direction edge data acquisition on the sample gesture images with known gesture semantics based on W_x1 and W_x2 respectively; after analog-to-digital conversion, the acquired data can be converted, according to W_x = W_x1 − W_x2, into the gesture data corresponding to the x-direction operator W_x of the Sobel operator.
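The decomposition into non-negative configurations can be verified numerically: correlating with two non-negative operators whose difference is W_x, and subtracting the two results in the digital domain, reproduces correlation with W_x exactly; the same positive/negative bookkeeping is one way to realize the positive-negative conversion for W_y. A sketch under those assumptions (NumPy, illustrative image):

```python
import numpy as np

def correlate(image, op):
    """Sliding-window correlation, standing in for the detector array."""
    H, W = image.shape
    k = op.shape[0]
    return np.array([[np.sum(image[r:r + k, c:c + k] * op)
                      for c in range(W - k + 1)] for r in range(H - k + 1)])

W_x = np.array([[-1.0, 0.0, 1.0], [-2.0, 0.0, 2.0], [-1.0, 0.0, 1.0]])
W_y = np.array([[1.0, 2.0, 1.0], [0.0, 0.0, 0.0], [-1.0, -2.0, -1.0]])

# Non-negative adjustment operators with W_x = W_x1 - W_x2.
W_x1 = np.where(W_x > 0, W_x, 0.0)
W_x2 = np.where(W_x < 0, -W_x, 0.0)

img = np.arange(25.0).reshape(5, 5)   # illustrative infrared image

# Two non-negative configurations, subtracted digitally after conversion.
gx = correlate(img, W_x1) - correlate(img, W_x2)
assert np.allclose(gx, correlate(img, W_x))

# One possible positive-negative bookkeeping for W_y: the positive and the
# negated-negative parts are acquired separately and subtracted afterwards.
gy = correlate(img, np.where(W_y > 0, W_y, 0.0)) \
     - correlate(img, np.where(W_y < 0, -W_y, 0.0))
assert np.allclose(gy, correlate(img, W_y))
```

On this linear-ramp image the x gradient is a constant positive value and the y gradient a constant negative one, as expected.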
It should be noted that the edge extraction operator may be a Sobel operator, or may be another operator determined according to actual needs, which is not limited in this embodiment.
According to the embodiment, the x-direction operator and the y-direction operator containing the negative numbers in the edge extraction operator are respectively adjusted to be non-negative operators, so that the edge extraction operators of different types can be configured to the infrared detector array, and the application range of the deep neural network model is expanded.
Illustratively, when the sensing and computation integrated processing chip includes a bias voltage generation module, a gating device array and a control module, the sensing and computation integrated processing chip dynamically configures an edge extraction operator to an infrared detector array based on an application scene, including:
the control module controls the bias voltage generation module to apply bias voltage to the infrared detector array based on an application scene; the control module controls the gating device array to gate according to a preset gating mode, adjusts the bias voltage and dynamically configures the x-direction operator and the y-direction operator to the infrared detector array respectively.
The embodiment further saves the data transmission bandwidth and accelerates the data transmission speed and the data processing speed.
Exemplarily, when the sensing and computing integrated processing chip includes a bias voltage generation module, a gating device array, a column amplification conversion module and a control module, performing convolution operation on a gesture image to be recognized based on a pre-trained deep neural network model according to a dynamically configured weighting matrix to obtain feature information of the gesture image, including:
the control module controls the bias voltage generation module to apply bias voltage to the infrared detector array;
the control module controls the gating device array to gate according to a preset gating mode, adjusts the bias voltage and dynamically configures the weighting matrix to the infrared detector array;
the infrared detector array outputs corresponding current signals under the control of the control module based on the weighting matrix and the gesture image;
the column amplification conversion module amplifies the current signals, converts the amplified current signals into voltage signals and obtains characteristic information.
The embodiment further saves the data transmission bandwidth and accelerates the data transmission speed and the data processing speed.
For example, when the column amplification and conversion module includes a plurality of amplification and conversion modules, each amplification and conversion module corresponds to each column gating device and is electrically connected to the second end of the corresponding column gating device through a corresponding column output line, the current signal output by each column output line may be amplified by each amplification and conversion module, and the amplified current signal may be converted into a voltage signal.
For example, when the amplification and conversion module includes a capacitive transconductance amplifier and a sample-and-hold circuit, the capacitive transconductance amplifier may amplify a current signal output by a corresponding column output line, and convert the amplified current signal into a voltage signal.
Illustratively, when the sensing and computing integrated infrared detection module further comprises an analog-to-digital converter and a buffer, the gesture detection and recognition method further comprises the following step:
the voltage signal output by the column amplification conversion module is input, via the buffer, into the analog-to-digital converter to obtain a digital signal corresponding to the voltage signal.
In order to enable those skilled in the art to better understand the above embodiments, a specific example is described below.
With reference to FIGS. 2, 3, 4, 5 and 7, a gesture detection and recognition method includes the following steps:
Step one: a gesture detection and recognition system is set up, and parameters of the gesture detection and recognition system are configured.
Specifically, the gesture detection and recognition system comprises a sensing and calculation integrated infrared detection module, an FPGA/AI gesture recognition module and a semantic interpretation processing conversion module.
The sensing and calculating integrated infrared detection module comprises an infrared detector array, a bias voltage generation module, an MOS gate tube array, a column amplification conversion module, a control module, an analog-to-digital converter and a buffer. Each amplification conversion module comprises a capacitive transconductance amplifier and a sample-and-hold circuit. Each infrared detector is a thermosensitive (thermistor-type) or photoresistor-type device.
The first end of each infrared detector in the infrared detector array is electrically connected with the bias voltage generation module and the control module through the corresponding row output line, so that the bias voltage generation module generates the bias voltages V_sk1, V_sk2, V_sk3, …, V_skm under the control of the control module and applies the bias voltages V_sk1, V_sk2, V_sk3, …, V_skm to the infrared detectors of the corresponding rows through the corresponding row output lines.
The MOS gate tube array and the infrared detector array form a sensing and calculating integrated cross array structure. The drain of each MOS gate tube is electrically connected with the second end of the corresponding infrared detector, the source of each MOS gate tube is electrically connected with the corresponding column output line, and the gate of each MOS gate tube is electrically connected with the control module, so that the control module applies the gate voltages V_FID1, …, V_FIDm to the corresponding rows respectively, thereby controlling the MOS gate tubes to gate according to a preset gating mode, adjusting the bias voltage, and configuring the adjusted bias voltage to the corresponding infrared detector.
Each column output line is electrically connected with the input end of the corresponding capacitive transconductance amplifier in the amplification conversion module, the output end of each capacitive transconductance amplifier is electrically connected with the input end of the corresponding sample-and-hold circuit, and the output end of each sample-and-hold circuit is electrically connected with the analog-to-digital converter through the buffer. The current signal output by each column output line is thus amplified and converted into a voltage signal by the corresponding capacitive transconductance amplifier, and digital quantization of the voltage signal is realized by the sample-and-hold circuit, the buffer and the analog-to-digital converter.
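The analog multiply-accumulate performed by this crossbar readout can be sketched numerically. The sketch below is an illustrative model only, not the patented circuit: it assumes each gated detector contributes a current proportional to the product of its configured weight and its response to the incident infrared image, and that each column output line sums those currents.

```python
import numpy as np

def column_currents(weights, pixel_signals, g0=1e-6):
    """Illustrative model of one crossbar readout step.

    weights       : (rows, cols) non-negative weighting matrix, assumed
                    to be encoded via the row bias voltages
    pixel_signals : (rows, cols) normalized detector responses to the
                    incident infrared gesture image
    g0            : assumed base conductance scale, in siemens

    Each column output line carries the sum of its detectors' currents,
    i.e. a per-column multiply-accumulate performed in the analog domain.
    """
    return (g0 * weights * pixel_signals).sum(axis=0)  # (cols,) amperes
```

With all weights and responses equal to 1 on a 3x3 array, each column current is simply 3*g0; a capacitive transconductance amplifier would then convert such a current into a voltage for digitization.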
The infrared image containing the gesture directly acquired by a common infrared camera can be represented as I_0(x, y). In order to identify the gesture in the infrared image, a Sobel operator is used to perform a convolution operation on the infrared image and extract the edge features of the gesture; the resulting convolved image can be expressed as I_1(x, y). The x-direction operator W_x and the y-direction operator W_y of the Sobel operator are respectively expressed as

W_x =
[ -1  0  1 ]
[ -2  0  2 ]
[ -1  0  1 ]

and

W_y =
[  1   2   1 ]
[  0   0   0 ]
[ -1  -2  -1 ]
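As a sanity check on the Sobel operators, the per-window multiply-accumulate that the detector array performs in the analog domain can be reproduced in software. The sketch below is illustrative only; it uses the standard 3x3 Sobel kernels and a plain "valid" cross-correlation.

```python
import numpy as np

# Standard 3x3 Sobel operators for x- and y-direction edges.
W_X = np.array([[-1, 0, 1],
                [-2, 0, 2],
                [-1, 0, 1]])
W_Y = np.array([[ 1,  2,  1],
                [ 0,  0,  0],
                [-1, -2, -1]])

def correlate3x3(img, kernel):
    """'valid' 3x3 cross-correlation: the per-window multiply-accumulate
    that the infrared detector array carries out in the analog domain."""
    h, w = img.shape
    out = np.empty((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = (img[i:i+3, j:j+3] * kernel).sum()
    return out

# A 4x4 image with a vertical step edge: left half dark, right half bright.
img = np.tile([0., 0., 1., 1.], (4, 1))
gx = correlate3x3(img, W_X)  # strong response to the vertical edge
gy = correlate3x3(img, W_Y)  # no response: there is no horizontal edge
```

On this vertical step edge, W_x responds with 4 at every valid position while W_y responds with 0, which is exactly the directional edge selectivity the gesture feature extraction relies on.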
However, the weights configured into the infrared detector array cannot be negative. For the y-direction operator W_y of the Sobel operator, the negative numbers are therefore directly replaced with the corresponding positive numbers to obtain the substitute operator W'_y corresponding to W_y, i.e.

W'_y =
[ 1  2  1 ]
[ 0  0  0 ]
[ 1  2  1 ]

The substitute operator W'_y is configured to the infrared detector array, so that the infrared detector array performs edge data acquisition in the y direction on the gesture image based on W'_y. After analog-to-digital conversion, the acquired data can be converted, through positive-negative conversion and other subsequent calculation, into the gesture data corresponding to the y-direction operator W_y of the Sobel operator.
For the x-direction operator W_x of the Sobel operator, two adjustment operators are set, respectively expressed as

W_x1 =
[ 0  0  1 ]
[ 0  0  2 ]
[ 0  0  1 ]

and

W_x2 =
[ 1  0  0 ]
[ 2  0  0 ]
[ 1  0  0 ]

so that W_x = W_x1 - W_x2. The sensing and calculating integrated processing chip configures the adjustment operators W_x1 and W_x2 to the infrared detector array respectively, so that the infrared detector array performs edge data acquisition in the x direction on sample gesture images of known gesture semantics based on W_x1 and W_x2 respectively. After analog-to-digital conversion, the acquired data are converted, according to W_x = W_x1 - W_x2, into the gesture data corresponding to the x-direction operator W_x of the Sobel operator.
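The split of W_x into non-negative adjustment operators can be verified digitally. The particular decomposition below is an assumption (the patent's figure images are not reproduced in this text), but it is the natural split of the Sobel x-kernel into its positive and negative parts: two acquisitions with non-negative weights followed by one digital subtraction recover the signed response.

```python
import numpy as np

# Sobel x-direction operator and an assumed split into two non-negative
# "adjustment operators" (only non-negative weights can be configured
# into the detector array).
W_X  = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])
W_X1 = np.array([[ 0, 0, 1], [ 0, 0, 2], [ 0, 0, 1]])
W_X2 = np.array([[ 1, 0, 0], [ 2, 0, 0], [ 1, 0, 0]])

assert np.array_equal(W_X, W_X1 - W_X2)

def mac(window, kernel):
    """One analog multiply-accumulate over a 3x3 pixel window."""
    return float((window * kernel).sum())

# Two in-sensor acquisitions with non-negative weights; one digital
# subtraction after analog-to-digital conversion recovers the signed
# Sobel response for the window.
window = np.array([[0., 0., 1.], [0., 0., 1.], [0., 0., 1.]])
signed = mac(window, W_X1) - mac(window, W_X2)
assert signed == mac(window, W_X)
```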
Step two: a deep neural network model including convolutional layers for gesture recognition is built.
As shown in fig. 8, a gesture recognition method based on deep learning in the prior art generally acquires an image of a gesture and then directly inputs the entire acquired gesture image into a convolutional neural network model for training and computation, which requires a large amount of analog-to-digital conversion. Moreover, the convolutional neural network model used in the prior art generally includes, in addition to an input layer, an output layer and a fully connected layer, a plurality of convolutional layers and a plurality of pooling layers, which further increases the data transmission amount and the data processing amount.
Therefore, in this step, on the basis of the gesture detection and recognition system built in step one, a lightweight deep convolutional neural network model is built. The sensing and computing integrated infrared detection module in the gesture detection and recognition system performs the convolution operation and extracts edge features from the gesture image, and the extracted feature information is transmitted to the FPGA/AI gesture recognition module. The FPGA/AI gesture recognition module can therefore directly use the feature information of the gesture image to carry out the processing of the deep convolutional neural network model other than the convolution operation, such as pooling and fully connected processing, which saves a large amount of data transmission time and bandwidth.
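Because the convolution layer runs inside the sensor, the digital model that remains is small. The sketch below is a hypothetical illustration of such a lightweight head: a pooling layer plus one fully connected layer with softmax, operating on an edge feature map already produced by the detector array (all shapes and weights here are placeholders, not values from the patent).

```python
import numpy as np

rng = np.random.default_rng(0)

def max_pool2x2(fm):
    """2x2 max pooling on a (H, W) feature map with even H and W."""
    h, w = fm.shape
    return fm.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def lightweight_head(feature_map, W_fc, b_fc):
    """Digital part of the model: the convolution has already been
    performed by the detector array, so only pooling and one fully
    connected layer remain. W_fc and b_fc stand in for parameters
    that would be learned during training."""
    pooled = max_pool2x2(feature_map)
    logits = pooled.ravel() @ W_fc + b_fc
    e = np.exp(logits - logits.max())      # numerically stable softmax
    return e / e.sum()                     # probabilities over classes

# Shapes only: an 8x8 edge feature map from the sensor, 5 gesture classes.
fm = rng.standard_normal((8, 8))
W_fc = rng.standard_normal((16, 5))
b_fc = np.zeros(5)
probs = lightweight_head(fm, W_fc, b_fc)
```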
Step three: gesture data of gesture images with known gesture semantics are acquired by the gesture detection and recognition system as a training sample set, the deep neural network model established in step two is trained, model parameters of the deep neural network model are determined, and the trained deep neural network model is obtained.
Specifically, under a night close-range scene, the gesture detection and recognition system built in the step one is used for carrying out image acquisition on a gesture image with known gesture semantics, and the acquired gesture data and the corresponding gesture semantics are used as a training sample set. The training sample set contains gesture data of different gesture semantics and gesture data of different positions, different visual angles and different distances of gestures in the image.
And substituting the training sample set into the deep neural network model established in the step two for training, determining model parameters of the deep neural network model, and obtaining the trained deep neural network model.
Step four: a lookup table relating the feature information of the gesture image to the gesture semantics is established for pattern matching.
Specifically, the number of gesture semantics in a given application scene is generally limited; moreover, in a night close-range scene the positions of the gestures are usually distinct and the background environment has little influence on gesture recognition. Therefore, a gesture feature correspondence table for pattern matching is established based on the correspondence between the feature information of the gesture image output by the gesture detection and recognition system and the gesture semantics; the gesture feature correspondence table comprises the correspondence between the weighting matrix, the feature information and the gesture semantics.
Step five: gesture recognition is performed on the gesture image to be detected in the night scene to determine the gesture semantics.
Specifically, a weighting matrix is dynamically configured to the infrared detector array, and the sensing and computing integrated infrared detection module extracts the feature information of the gesture image to be detected based on the weighting matrix. After the feature information output by the module is obtained, the gesture semantics corresponding to the feature information under the weighting matrix are searched from the gesture feature correspondence table based on pattern matching. If corresponding gesture semantics are matched, the search terminates and the found gesture semantics are taken as the semantic information of the gesture image to be detected. If no corresponding gesture semantics are matched, the weighting matrix is adjusted, the feature information of the gesture image to be detected is re-extracted based on the adjusted weighting matrix by the sensing and computing integrated infrared detection module, and pattern matching is performed again with the re-extracted feature information against the gesture feature correspondence table. This process is repeated, adjusting the weighting matrix each time no match is found, until corresponding gesture semantics are matched; the search then terminates and the found gesture semantics are taken as the semantic information of the gesture image to be detected.
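The step-five search loop can be summarized as follows. Every name here is hypothetical, standing in for the in-sensor feature extraction and the gesture feature correspondence table described above.

```python
def recognize(extract_features, lookup_table, matrices):
    """Sketch of the step-five matching loop (all names hypothetical).

    extract_features : callable(matrix) -> hashable feature signature,
                       standing in for the in-sensor extraction under a
                       given weighting matrix
    lookup_table     : {(matrix_id, signature): gesture_semantics},
                       the gesture feature correspondence table
    matrices         : ordered (matrix_id, matrix) candidates, tried in
                       turn until one yields a match
    """
    for matrix_id, matrix in matrices:
        signature = extract_features(matrix)
        semantics = lookup_table.get((matrix_id, signature))
        if semantics is not None:
            return semantics  # match found: search terminates
    return None  # no candidate weighting matrix produced a known signature
```

For example, with a one-entry table {("sobel", "edge-A"): "stop"}, a run that first tries a non-matching matrix and then the "sobel" matrix returns "stop".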
Step six: the semantic interpretation processing conversion module converts the semantic information obtained in step five into images, text, voice or the like, as required by the user, and transmits the converted information to the user.
It will be understood by those of ordinary skill in the art that the foregoing embodiments are specific embodiments for practicing the present disclosure, and that various changes in form and details may be made therein without departing from the spirit and scope of the present disclosure in practice.

Claims (10)

1. A gesture detection and recognition system, the gesture detection and recognition system comprising:
the sensing and calculating integrated infrared detection module is used for extracting characteristic information based on the weighting matrix: performing convolution operation on a gesture image to be recognized based on a pre-trained deep neural network model according to a dynamically configured weighting matrix to obtain characteristic information of the gesture image;
the gesture recognition module is used for searching a gesture semantic corresponding to the feature information under the weighting matrix from a preset gesture feature corresponding table, updating the weighting matrix when the gesture semantic is not searched, so that the sensing and computing integrated infrared detection module extracts the feature information based on the updated weighting matrix, and determining the gesture semantic as the semantic information of the gesture image when the gesture semantic is searched; the gesture feature corresponding table comprises the corresponding relation between the weighting matrix and the feature information and the gesture semantics;
and the semantic interpretation processing conversion module is used for converting the semantic information into information in a preset form and transmitting the information in the preset form to a user.
2. The gesture detection and recognition system according to claim 1, wherein the sensing and calculation integrated infrared detection module is further configured to obtain gesture data of a plurality of sample gesture images with known gesture semantics according to a dynamically configured edge extraction operator, so as to obtain the training sample set; wherein the sample gesture images correspond to at least one of different gesture semantics, positions, viewing angles and distances.
3. The gesture detection and recognition system according to claim 2, wherein the sensing and calculation integrated infrared detection module comprises an infrared detector array and a sensing and calculation integrated processing chip electrically connected with the infrared detector array;
the infrared detector array is used for outputting current signals under the action of the sensing and computing integrated processing chip based on the edge extraction operator or the weighting matrix;
the sensing and computing integrated processing chip is configured to dynamically configure the edge extraction operator or the weighting matrix to the infrared detector array, and control the infrared detector array to output the current signal based on the edge extraction operator or the weighting matrix, so as to obtain the training sample set or the feature information.
4. The gesture detection and recognition system of claim 3, wherein the sensing and calculation integrated processing chip comprises a bias voltage generation module, a gating device array, a column amplification conversion module and a control module, wherein:
the first ends of the infrared detectors in each row are respectively and electrically connected with the bias voltage generation module and the control module through corresponding row output lines;
the first end of each row of the gating device is electrically connected with the second end of the infrared detector in the corresponding row, the second end of each row of the gating device is electrically connected with the corresponding amplification conversion module through the corresponding row output line, and the control end of each row of the gating device is electrically connected with the control module; wherein, the first and the second end of the pipe are connected with each other,
the control module is used for controlling the bias voltage generation module to apply bias voltage to the infrared detector array; and the number of the first and second groups,
the control module is further configured to control the gating device array to gate according to a preset gating manner, adjust the bias voltage, dynamically configure the edge extraction operator or the weighting matrix to the infrared detector array, and control the infrared detector array to output the current signal based on the edge extraction operator or the weighting matrix;
the column amplification conversion module is configured to amplify the current signal, and convert the amplified current signal into a voltage signal to obtain the training sample set or the feature information.
5. A gesture detection and recognition method applied to the gesture detection and recognition system according to any one of claims 1 to 4, the gesture detection and recognition method comprising the steps of:
extracting characteristic information based on the weighting matrix: performing convolution operation on a gesture image to be recognized based on a pre-trained deep neural network model according to a dynamically configured weighting matrix to obtain characteristic information of the gesture image;
searching a gesture semantic corresponding to the feature information under the weighting matrix from a preset gesture feature corresponding table, updating the weighting matrix when the gesture semantic is not found, returning to the step of extracting the feature information based on the weighting matrix, and determining the gesture semantic as the semantic information of the gesture image when the gesture semantic is found; the gesture feature correspondence table comprises the corresponding relation between the weighting matrix and the feature information and the gesture semantics;
and converting the semantic information into information in a preset form, and transmitting the information in the preset form to a user.
6. The gesture detection and recognition method according to claim 5, wherein the pre-trained deep neural network model is obtained according to the following steps:
dynamically configuring a preset edge extraction operator to the sensing and calculating integrated infrared detection module based on a preset application scene, and acquiring gesture data of a plurality of sample gesture images with known gesture semantics through the sensing and calculating integrated infrared detection module to obtain the training sample set; wherein the sample gesture images correspond to at least one of different gesture semantics, positions, visual angles and distances;
establishing a deep neural network model comprising convolutional layers based on the preset application scene;
and training the deep neural network model by using the training sample set to obtain the pre-trained deep neural network model.
7. The gesture detection and recognition method according to claim 6, wherein, when the sensing and calculation integrated infrared detection module comprises an infrared detector array and a sensing and calculation integrated processing chip electrically connected with the infrared detector array,
the method comprises the following steps that based on a preset application scene, a preset edge extraction operator is dynamically configured to the sensing and calculation integrated infrared detection module, gesture data of a plurality of sample gesture images with known gesture semantics are obtained through the sensing and calculation integrated infrared detection module, and a training sample set is obtained and comprises the following steps:
the sensing and calculation integrated processing chip dynamically configures the edge extraction operator to the infrared detector array based on the application scene;
the infrared detector array acquires the gesture data based on the edge extraction operator and outputs a current signal under the control of the sensing and calculating integrated processing chip;
and the sensing and calculating integrated processing chip obtains the training sample set according to the current signal.
8. The gesture detection and recognition method according to claim 7, wherein the edge extraction operator comprises an x-direction operator and a y-direction operator;
when the y-direction operator includes a negative number, the sensing and computing integrated processing chip dynamically configures the edge extraction operator to the infrared detector array based on the application scene, including:
the sensing and computing integrated processing chip configures a substitute operator corresponding to the y-direction operator to the infrared detector array based on the application scene; the substitute operator is obtained by replacing a negative number in the y-direction operator with a corresponding positive number;
when the x-direction operator includes a negative number, the sensing and computing integrated processing chip dynamically configures the edge extraction operator to the infrared detector array based on the application scene, including:
the sensing and calculation integrated processing chip respectively configures a preset number of adjusting operators corresponding to the x-direction operators to the infrared detector array based on the application scene; wherein the adjustment operator is a non-negative operator.
9. The gesture detection and recognition method according to claim 8, wherein, when the sensing and computing integrated processing chip comprises a bias voltage generation module, a gating device array and a control module,
the sensing and computing integrated processing chip dynamically configures the edge extraction operator to the infrared detector array based on the application scene, and the method comprises the following steps:
the control module controls the bias voltage generation module to apply bias voltage to the infrared detector array based on the application scene;
and the control module controls the gating device array to gate according to a preset gating mode, adjusts the bias voltage, and dynamically configures the x-direction operator and the y-direction operator to the infrared detector array respectively.
10. The gesture detection and recognition method according to any one of claims 5 to 9, wherein, when the sensing and computing integrated processing chip comprises a bias voltage generation module, a gating device array, a column amplification conversion module and a control module,
the method for performing convolution operation on the gesture image to be recognized based on the pre-trained deep neural network model according to the dynamically configured weighting matrix to obtain the characteristic information of the gesture image comprises the following steps:
the control module controls the bias voltage generation module to apply bias voltage to the infrared detector array;
the control module controls the gating device array to gate according to a preset gating mode, adjusts the bias voltage and dynamically configures the weighting matrix to the infrared detector array;
the infrared detector array outputs corresponding current signals under the control of the control module based on the weighting matrix and the gesture image;
the column amplification conversion module amplifies the current signal and converts the amplified current signal into a voltage signal to obtain the characteristic information.
CN202211194591.1A 2022-09-29 2022-09-29 Gesture detection and recognition system and method Active CN115471917B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211194591.1A CN115471917B (en) 2022-09-29 2022-09-29 Gesture detection and recognition system and method

Publications (2)

Publication Number Publication Date
CN115471917A true CN115471917A (en) 2022-12-13
CN115471917B CN115471917B (en) 2024-02-27

Family

ID=84335336

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211194591.1A Active CN115471917B (en) 2022-09-29 2022-09-29 Gesture detection and recognition system and method

Country Status (1)

Country Link
CN (1) CN115471917B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108573190A (en) * 2017-03-08 2018-09-25 北京微美云息软件有限公司 A kind of 3D holographies gesture recognition system
CN109508670A (en) * 2018-11-12 2019-03-22 东南大学 A kind of static gesture identification method based on infrared camera
US20200042776A1 (en) * 2018-08-03 2020-02-06 Baidu Online Network Technology (Beijing) Co., Ltd. Method and apparatus for recognizing body movement
CN111368800A (en) * 2020-03-27 2020-07-03 中国工商银行股份有限公司 Gesture recognition method and device
CN112148128A (en) * 2020-10-16 2020-12-29 哈尔滨工业大学 Real-time gesture recognition method and device and man-machine interaction system
US20220036050A1 (en) * 2018-02-12 2022-02-03 Avodah, Inc. Real-time gesture recognition method and apparatus


Also Published As

Publication number Publication date
CN115471917B (en) 2024-02-27

Similar Documents

Publication Publication Date Title
CN109993160B (en) Image correction and text and position identification method and system
US20180186452A1 (en) Unmanned Aerial Vehicle Interactive Apparatus and Method Based on Deep Learning Posture Estimation
CN111814661B (en) Human body behavior recognition method based on residual error-circulating neural network
CN110619373B (en) Infrared multispectral weak target detection method based on BP neural network
CN110766041B (en) Deep learning-based pest detection method
WO2022078197A1 (en) Point cloud segmentation method and apparatus, device, and storage medium
CN107025440A (en) A kind of remote sensing images method for extracting roads based on new convolutional neural networks
CN110555408B (en) Single-camera real-time three-dimensional human body posture detection method based on self-adaptive mapping relation
Daroya et al. Alphabet sign language image classification using deep learning
CN112766229B (en) Human face point cloud image intelligent identification system and method based on attention mechanism
Abiyev et al. Reconstruction of convolutional neural network for sign language recognition
WO2020222391A1 (en) System and method for invertible wavelet layer for neural networks
Bin et al. Study of convolutional neural network in recognizing static American sign language
CN111950570A (en) Target image extraction method, neural network training method and device
CN111339976A (en) Indoor positioning method, device, terminal and storage medium
Makarov et al. Russian sign language dactyl recognition
CN115471917B (en) Gesture detection and recognition system and method
CN111104921A (en) Multi-mode pedestrian detection model and method based on Faster rcnn
CN110619288A (en) Gesture recognition method, control device and readable storage medium
CN116797938A (en) SAR ship classification method based on contrast learning pre-training
WO2018153112A1 (en) Apparatus for image processing
CN210899633U (en) Indoor positioning system based on deep neural network
Etienne-Cummings et al. A vision chip for color segmentation and pattern matching
CN115471710B (en) Infrared detection recognition system and method
Nijhawan et al. Real-time Object Detection for Visually Impaired with Optimal Combination of Scores

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant