CN113392904B - LTC-DNN-based visual inertial navigation combined navigation system and self-learning method - Google Patents

LTC-DNN-based visual inertial navigation combined navigation system and self-learning method

Info

Publication number
CN113392904B
CN113392904B
Authority
CN
China
Prior art keywords
ltc
layer
inertial navigation
visual
rnn
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110664888.9A
Other languages
Chinese (zh)
Other versions
CN113392904A (en)
Inventor
胡斌杰
丘金光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT filed Critical South China University of Technology SCUT
Priority to CN202110664888.9A priority Critical patent/CN113392904B/en
Publication of CN113392904A publication Critical patent/CN113392904A/en
Application granted granted Critical
Publication of CN113392904B publication Critical patent/CN113392904B/en
Priority to PCT/CN2022/112625 priority patent/WO2022262878A1/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/211 - Selection of the most significant subset of features
    • G - PHYSICS
    • G01 - MEASURING; TESTING
    • G01C - MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00 - Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/10 - Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration
    • G01C21/12 - Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning
    • G01C21/16 - Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation
    • G01C21/165 - Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation combined with non-inertial navigation instruments
    • G01C21/1656 - Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation combined with non-inertial navigation instruments with passive imaging devices, e.g. cameras
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/048 - Activation functions
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Remote Sensing (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Automation & Control Theory (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses an LTC-DNN-based visual inertial navigation combined navigation system and a self-learning method. The system comprises a deep learning network model consisting of a visual feature extraction module, an inertial navigation feature extraction module and a pose regression module. The visual feature extraction module extracts visual features from two adjacent RGB frames; the inertial navigation feature extraction module extracts inertial navigation features from the inertial navigation data; the pose regression module comprises an attention mechanism fusion submodule, a liquid time-constant recurrent neural network (LTC-RNN) and a fully connected regression submodule, and predicts the relative displacement and relative rotation. The disclosed self-learning method is used to train the visual inertial navigation combined navigation system and, compared with algorithms of the same type, reduces the dependence on real labels; the deep learning network model estimates the relative displacement and relative pose with high accuracy and is robust to data corruption.

Description

LTC-DNN-based visual inertial navigation combined navigation system and self-learning method
Technical Field
The invention relates to the technical field of sensor fusion and motion estimation, in particular to a visual inertial navigation combined navigation system based on LTC-DNN and a self-learning method.
Background
With the continuous development of autonomous driving and unmanned aerial vehicles, high-precision, highly robust positioning is an important prerequisite for tasks such as autonomous navigation and exploration of unknown areas. A purely visual odometry method acquires information about the surrounding environment with a visual sensor and estimates the motion state by analysing the visual data; however, once occlusions appear in the scene or visual data are lost during transmission, the motion-state estimate, whose error is already considerable, is severely disturbed. A visual inertial odometer adds Inertial Measurement Unit (IMU) information to a purely visual odometer and can improve the accuracy of motion-state estimation when visual data are lost.
In recent years, deep learning techniques have achieved great success in computer vision and are widely used in many fields. Visual inertial navigation combined navigation is a regression task and can likewise be trained with deep learning methods. However, existing deep-learning-based visual inertial combined navigation algorithms are limited by the number of real labels available during training and generalise poorly; they also require a large number of trainable parameters, which greatly restricts their practical application.
Disclosure of Invention
The invention aims to overcome the above defects of the prior art and provides an LTC-DNN-based visual inertial navigation combined navigation system and a self-learning method.
The first purpose of the invention can be achieved by adopting the following technical scheme:
An LTC-DNN-based visual inertial navigation combined navigation system, used for automatic driving and autonomous navigation of unmanned aerial vehicles, comprises a deep learning network model consisting of a visual feature extraction module, an inertial navigation feature extraction module and a pose regression module that are connected in sequence, wherein
the visual feature extraction module is used to extract 1024-dimensional visual features; its input is two adjacent RGB frames stacked along the channel dimension, and its output is the 1024-dimensional visual feature;
the inertial navigation feature extraction module comprises a first single-layer LTC-RNN with a 1024-dimensional hidden state; its input is the inertial navigation data between the two adjacent RGB frames, and its output is a 1024-dimensional inertial navigation feature;
the pose regression module comprises an attention mechanism fusion submodule, a second single-layer LTC-RNN with a 1000-dimensional hidden state and a fully connected regression submodule that are connected in sequence; the input of the attention mechanism fusion submodule is the concatenated feature obtained by concatenating the visual feature and the inertial navigation feature, and it weights the two features to obtain a weighted fusion feature; the input of the second single-layer LTC-RNN is the weighted fusion feature, and its output is a regression feature; the input of the fully connected regression submodule is the regression feature, and its output is the estimate of the relative displacement and relative rotation.
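For illustration only, the following is a minimal PyTorch-style sketch of how the three modules described above could be wired together; the class and attribute names (LTCVIO, visual_encoder, imu_ltc, fusion, pose_ltc, regressor) are hypothetical placeholders and are not part of the disclosure itself.

```python
import torch
import torch.nn as nn

class LTCVIO(nn.Module):
    """Hypothetical wiring of the three modules: vision CNN, IMU LTC-RNN, pose regression."""
    def __init__(self, visual_encoder, imu_ltc, fusion, pose_ltc, regressor):
        super().__init__()
        self.visual_encoder = visual_encoder  # stacked RGB pair -> 1024-d visual feature
        self.imu_ltc = imu_ltc                # IMU data between the frames -> 1024-d inertial feature
        self.fusion = fusion                  # attention weighting of the concatenated features
        self.pose_ltc = pose_ltc              # second single-layer LTC-RNN (1000-d hidden state)
        self.regressor = regressor            # fully connected regression submodule -> 6-DoF output

    def forward(self, rgb_pair, imu_seq):
        v = self.visual_encoder(rgb_pair)     # (B, 1024)
        i = self.imu_ltc(imu_seq)             # (B, 1024)
        fused = self.fusion(v, i)             # weighted fusion feature, (B, 2048)
        h0 = torch.zeros(fused.size(0), 1000, device=fused.device)  # assumed zero initial hidden state
        reg = self.pose_ltc(fused, h0)        # regression feature, (B, 1000)
        return self.regressor(reg)            # (B, 6): relative displacement and relative rotation
```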
Further, the visual feature extraction module is formed by stacking 10 convolutional layers. The kernel sizes of the first three layers are 7 × 7, 5 × 5 and 5 × 5 in turn, and the kernel sizes of the remaining seven layers are all 3 × 3; the convolution stride of the fourth, sixth and eighth layers is 1, and the stride of the remaining layers is 2. All 10 convolutional layers use the ReLU activation function.
Further, the RGB pictures are resized to 416 × 128 before being input to the visual feature extraction module.
Further, the calculation formula of the first single-layer LTC-RNN and the second single-layer LTC-RNN is as follows:
h(t + Δt) = [h(t) + Δt · f(h(t), x(t), t, θ)] / [1 + Δt · (1/τ + f(h(t), x(t), t, θ))]
where h(t) is the hidden state of the LTC-RNN at the current time, τ is a constant time constant, Δt is the time step, x(t) is the input data at the current time, f(h(t), x(t), t, θ) is a deep learning network, θ are its trainable parameters, and t is the current time. At the start of each computation, the first and second single-layer LTC-RNNs feed the data x(t) and h(t) into the above formula, then use the current output h(t + Δt) as the input h(t) of the next pass and continue the computation, repeating it 6 times; the output h(t + Δt) of the 6th pass is taken as the computation result of the first or second single-layer LTC-RNN.
Furthermore, the attention mechanism fusion submodule comprises two sub-networks with the same structure, each formed by stacking two fully connected layers: the first fully connected layer has dimension 2048 and is followed by a ReLU activation function, and the second fully connected layer has dimension 1024 and is followed by a Sigmoid activation function.
Further, the fully connected regression submodule consists of four fully connected layers whose dimensions are 512, 128, 64 and 6, respectively; each of the first three layers is followed by a ReLU activation function, and the fourth layer is not followed by any activation function.
The other purpose of the invention can be achieved by adopting the following technical scheme:
a self-learning method of an LTC-DNN based visual inertial navigation combination navigation system comprises the following steps:
S1, standardizing the real labels, which contain the real relative displacement and relative rotation, to a standard normal distribution to obtain real standard labels together with the corresponding statistics mean 1 and variance 1, and performing the first training of the deep learning network model with the real standard labels;
S2, using the deep learning network model after the first training to predict on unlabeled data, and applying a first inverse standardization to the predictions with mean 1 and variance 1 to obtain pseudo labels;
S3, randomly selecting a number of pseudo labels and real labels and mixing them at a ratio of 0.2:1 to obtain mixed labels;
S4, standardizing the mixed labels to a standard normal distribution to obtain mixed standard labels together with the corresponding statistics mean 2 and variance 2, and performing the second training of the deep learning network model with the mixed standard labels.
Further, the pseudo label, the real label and the mixed label comprise relative displacement and relative rotation on x, y and z axes.
Further, converting the real labels and the mixed labels to a standard normal distribution means standardizing the relative displacement and the relative rotation on the x, y and z axes separately.
Further, the deep learning network model is trained with an Adam optimizer whose momentum is set to (0.9, 0.99); the learning rate of the first and second single-layer LTC-RNNs is set to 0.001, and the learning rate of the remaining modules is set to 0.00001; the loss function is smooth_l1_loss.
Compared with the prior art, the invention has the following advantages and effects:
(1) The invention provides an LTC-DNN-based visual inertial navigation combined navigation system comprising a deep learning network model; the model introduces a first single-layer LTC-RNN and a second single-layer LTC-RNN, which reduces the number of trainable parameters of the deep learning network model and improves its robustness.
(2) The invention provides a self-learning method for the LTC-DNN-based visual inertial navigation combined navigation system which, compared with algorithms of the same type, reduces the dependence on real labels.
Drawings
Fig. 1 is a schematic structural diagram of the deep learning network model in the LTC-DNN-based visual inertial navigation combined navigation system disclosed in an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of an attention mechanism fusion submodule in an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a fully connected regression submodule in an embodiment of the present invention;
fig. 4 is a flowchart of a self-learning method of a visual inertial navigation integrated navigation system based on LTC-DNN disclosed in the embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.
Example one
This embodiment discloses an LTC-DNN-based visual inertial navigation combined navigation system; fig. 1 is a schematic structural diagram of the deep learning network model in this system.
Referring to fig. 1, the deep learning network model is composed of a visual feature extraction module, an inertial navigation feature extraction module and a pose regression module which are sequentially connected.
The visual feature extraction module is used to extract 1024-dimensional visual features; its input is two adjacent RGB frames stacked along the channel dimension, and its output is the 1024-dimensional visual feature.
The visual feature extraction module is formed by stacking 10 convolutional layers. The kernel sizes of the first three layers are 7 × 7, 5 × 5 and 5 × 5 in turn, and the kernel sizes of the remaining seven layers are all 3 × 3; the convolution stride of the fourth, sixth and eighth layers is 1, and the stride of the remaining layers is 2. All 10 convolutional layers use the ReLU activation function.
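By way of illustration, a minimal PyTorch-style sketch of such a 10-layer convolutional extractor is given below; the kernel sizes and strides follow the description above, while the channel widths, padding and the final pooling/projection to 1024 dimensions are assumptions that this paragraph does not specify.

```python
import torch
import torch.nn as nn

def conv_block(c_in, c_out, k, stride):
    # convolution + ReLU; padding chosen to roughly preserve spatial extent before striding
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel_size=k, stride=stride, padding=k // 2),
        nn.ReLU(inplace=True),
    )

class VisualEncoder(nn.Module):
    """10-layer CNN; kernel sizes and strides follow the text, channel widths are assumed."""
    def __init__(self, out_dim=1024):
        super().__init__()
        # (kernel, stride, out_channels); layers 4, 6 and 8 use stride 1, the rest stride 2
        cfg = [(7, 2, 64), (5, 2, 128), (5, 2, 256), (3, 1, 256), (3, 2, 512),
               (3, 1, 512), (3, 2, 512), (3, 1, 512), (3, 2, 1024), (3, 2, 1024)]
        layers, c_in = [], 6          # two RGB frames stacked along the channel axis
        for k, s, c_out in cfg:
            layers.append(conv_block(c_in, c_out, k, s))
            c_in = c_out
        self.cnn = nn.Sequential(*layers)
        self.pool = nn.AdaptiveAvgPool2d(1)   # assumed reduction of the feature map
        self.proj = nn.Linear(c_in, out_dim)  # assumed projection to the 1024-d visual feature

    def forward(self, x):                     # x: (B, 6, 128, 416)
        f = self.pool(self.cnn(x)).flatten(1)
        return self.proj(f)                   # (B, 1024)
```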
The inertial navigation feature extraction module comprises a first single-layer LTC-RNN (liquid time constant recurrent neural network) with 1024-dimensional hidden states; the input of the inertial navigation feature extraction module is inertial navigation data between the two adjacent frames of RGB pictures, and the output is 1024-dimensional inertial navigation features;
The pose regression module comprises an attention mechanism fusion submodule, a second single-layer LTC-RNN with a 1000-dimensional hidden state and a fully connected regression submodule that are connected in sequence; the input of the attention mechanism fusion submodule is the concatenated feature obtained by concatenating the visual feature and the inertial navigation feature, and it weights the two features to obtain a weighted fusion feature; the input of the second single-layer LTC-RNN is the weighted fusion feature, and its output is a regression feature; the input of the fully connected regression submodule is the regression feature, and its output is the estimate of the relative displacement and relative rotation.
The calculation formula of the first single-layer LTC-RNN and the second single-layer LTC-RNN is as follows:
h(t + Δt) = [h(t) + Δt · f(h(t), x(t), t, θ)] / [1 + Δt · (1/τ + f(h(t), x(t), t, θ))]
where h(t) is the hidden state of the LTC-RNN at the current time, τ is a constant time constant, Δt is the time step, x(t) is the input data at the current time, f(h(t), x(t), t, θ) is a deep learning network, θ are its trainable parameters, and t is the current time. At the start of each computation, the first and second single-layer LTC-RNNs feed the data x(t) and h(t) into the above formula, then use the current output h(t + Δt) as the input h(t) of the next pass and continue the computation, repeating it 6 times; the output h(t + Δt) of the 6th pass is taken as the computation result of the first or second single-layer LTC-RNN.
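The iterative computation can be sketched as follows (assuming PyTorch and a simple fully connected form for f(h(t), x(t), t, θ), which the text does not specify; only the fused update and its 6-fold repetition follow the description above):

```python
import torch
import torch.nn as nn

class LTCCell(nn.Module):
    """Single-layer LTC-RNN step, repeated 6 times as described above (sketch)."""
    def __init__(self, input_dim, hidden_dim, tau=1.0, dt=0.1, n_iter=6):
        super().__init__()
        # f(h(t), x(t), t, theta): the bounded MLP form and the tau/dt values are assumptions
        self.f = nn.Sequential(
            nn.Linear(input_dim + hidden_dim, hidden_dim),
            nn.Tanh(),
        )
        self.tau, self.dt, self.n_iter = tau, dt, n_iter

    def forward(self, x, h):
        for _ in range(self.n_iter):          # repeat the fused update 6 times
            f_val = self.f(torch.cat([x, h], dim=-1))
            h = (h + self.dt * f_val) / (1.0 + self.dt * (1.0 / self.tau + f_val))
        return h                              # h(t + dt) of the 6th pass
```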
Fig. 2 is a schematic structural diagram of the attention mechanism fusion submodule according to an embodiment of the present invention. Referring to fig. 2, the attention mechanism fusion submodule comprises two sub-networks with the same structure, each formed by stacking two fully connected layers: the first fully connected layer has dimension 2048 and is followed by a ReLU activation function, and the second fully connected layer has dimension 1024 and is followed by a Sigmoid activation function.
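A minimal sketch of such a fusion submodule is shown below (assuming PyTorch; the layer sizes follow the description, while applying the two sigmoid outputs as elementwise weights on the visual and inertial features is an assumption about how the weighting is carried out):

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    """Two identical sub-networks (2048 -> ReLU, 1024 -> Sigmoid) gating the two features."""
    def __init__(self, feat_dim=1024):
        super().__init__()
        def gate():
            return nn.Sequential(
                nn.Linear(2 * feat_dim, 2048), nn.ReLU(inplace=True),
                nn.Linear(2048, feat_dim), nn.Sigmoid(),
            )
        self.visual_gate = gate()
        self.imu_gate = gate()

    def forward(self, visual_feat, imu_feat):                    # each (B, 1024)
        concat = torch.cat([visual_feat, imu_feat], dim=-1)      # concatenated feature (B, 2048)
        w_v = self.visual_gate(concat)                           # weights for the visual feature
        w_i = self.imu_gate(concat)                              # weights for the inertial feature
        return torch.cat([w_v * visual_feat, w_i * imu_feat], dim=-1)  # weighted fusion feature (B, 2048)
```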
Fig. 3 is a schematic structural diagram of the fully connected regression submodule according to an embodiment of the present invention. Referring to fig. 3, the fully connected regression submodule consists of four fully connected layers whose dimensions are 512, 128, 64 and 6, respectively; each of the first three layers is followed by a ReLU activation function, and the fourth layer is not followed by any activation function.
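A corresponding sketch of the fully connected regression submodule (assuming PyTorch; the layer dimensions follow the description, and the 1000-dimensional input is taken from the regression feature produced by the second single-layer LTC-RNN):

```python
import torch.nn as nn

class Regressor(nn.Module):
    """Four fully connected layers (512, 128, 64, 6); ReLU after the first three only."""
    def __init__(self, in_dim=1000):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 512), nn.ReLU(inplace=True),
            nn.Linear(512, 128), nn.ReLU(inplace=True),
            nn.Linear(128, 64), nn.ReLU(inplace=True),
            nn.Linear(64, 6),   # 3-d relative displacement + 3-d relative rotation
        )

    def forward(self, regression_feature):
        return self.net(regression_feature)
```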
Example two
This embodiment discloses a self-learning method of the LTC-DNN-based visual inertial navigation combined navigation system disclosed in Example one. Fig. 4 is a flowchart of the self-learning method according to an embodiment of the present invention; referring to fig. 4, the self-learning method comprises the following four steps:
S1, standardizing the real labels, which contain the real relative displacement and relative rotation, to a standard normal distribution to obtain real standard labels together with the corresponding statistics mean 1 and variance 1, and performing the first training of the deep learning network model with the real standard labels;
S2, using the deep learning network model after the first training to predict on unlabeled data, and applying a first inverse standardization to the predictions with mean 1 and variance 1 to obtain pseudo labels;
S3, randomly selecting a number of pseudo labels and real labels and mixing them at a ratio of 0.2:1 to obtain mixed labels;
S4, standardizing the mixed labels to a standard normal distribution to obtain mixed standard labels together with the corresponding statistics mean 2 and variance 2, and performing the second training of the deep learning network model with the mixed standard labels.
The pseudo label, the real label and the mixed label comprise relative displacement and relative rotation on x, y and z axes.
Converting the real labels and the mixed labels to a standard normal distribution means standardizing the relative displacement and the relative rotation on the x, y and z axes separately.
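For illustration, steps S1 to S4 might be organized as in the following sketch; train_model and predict are hypothetical placeholders for the training and inference routines, and the standardization is applied per axis as described above.

```python
import numpy as np

def standardize(labels):
    """Standardize each of the 6 label columns (x/y/z displacement and rotation) separately."""
    mean, std = labels.mean(axis=0), labels.std(axis=0)
    return (labels - mean) / std, mean, std

def self_learning(model, train_model, predict,
                  labeled_inputs, real_labels, unlabeled_inputs):
    """S1-S4 of the self-learning method; train_model and predict are caller-supplied routines."""
    # S1: first training on standardized real labels
    real_std, mean1, std1 = standardize(real_labels)             # real_labels: (N, 6)
    train_model(model, labeled_inputs, real_std)

    # S2: predict on unlabeled data and invert the first standardization -> pseudo labels
    pseudo_labels = predict(model, unlabeled_inputs) * std1 + mean1

    # S3: randomly mix pseudo labels and real labels at a ratio of 0.2 : 1
    n_pseudo = int(0.2 * len(real_labels))
    idx = np.random.choice(len(pseudo_labels), n_pseudo, replace=False)
    mixed_inputs = np.concatenate([unlabeled_inputs[idx], labeled_inputs], axis=0)
    mixed_labels = np.concatenate([pseudo_labels[idx], real_labels], axis=0)

    # S4: second training on the re-standardized mixed labels
    mixed_std, mean2, std2 = standardize(mixed_labels)
    train_model(model, mixed_inputs, mixed_std)
    return mean2, std2          # kept for the second inverse standardization at inference time
```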
The deep learning network model is trained with an Adam optimizer whose momentum is set to (0.9, 0.99); the learning rate of the first and second single-layer LTC-RNNs is set to 0.001, and the learning rate of the remaining modules is set to 0.00001; the loss function is smooth_l1_loss.
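A possible PyTorch expression of this training configuration is sketched below; the module attribute names follow the earlier sketches, and the momentum pair (0.9, 0.99) is read here as Adam's betas, which is an interpretation rather than an explicit statement of the text.

```python
import torch
import torch.nn.functional as F

def build_optimizer(model):
    """Adam with betas (0.9, 0.99); the two LTC-RNN modules get lr 0.001, all other modules 0.00001."""
    ltc_params = list(model.imu_ltc.parameters()) + list(model.pose_ltc.parameters())
    ltc_ids = {id(p) for p in ltc_params}
    other_params = [p for p in model.parameters() if id(p) not in ltc_ids]
    return torch.optim.Adam(
        [{"params": ltc_params, "lr": 1e-3},
         {"params": other_params, "lr": 1e-5}],
        betas=(0.9, 0.99),
    )

def training_step(model, optimizer, rgb_pair, imu_seq, standardized_labels):
    optimizer.zero_grad()
    pred = model(rgb_pair, imu_seq)                       # 6-DoF prediction in standardized units
    loss = F.smooth_l1_loss(pred, standardized_labels)    # smooth L1 loss, as stated above
    loss.backward()
    optimizer.step()
    return loss.item()
```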
After the second training, two adjacent RGB frames stacked along the channel dimension are input to the visual feature extraction module of the deep learning network model to obtain the visual feature; at the same time, the inertial navigation data between the two adjacent RGB frames are input to the inertial navigation feature extraction module to obtain the inertial navigation feature. The visual feature and the inertial navigation feature are then concatenated along the row direction and input to the pose regression module, which outputs relative displacement 1 and relative rotation 1. Finally, relative displacement 1 and relative rotation 1 are inverse-standardized a second time with mean 2 and variance 2 to obtain relative displacement 2 and relative rotation 2.
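A short sketch of this second inverse standardization at inference time (assuming the hypothetical model wiring sketched earlier; mean 2 and variance 2 are the statistics saved from the second standardization, with std2 denoting the corresponding standard deviation):

```python
import torch

@torch.no_grad()
def predict_pose(model, rgb_pair, imu_seq, mean2, std2):
    """Undo the second standardization to recover metric-scale relative motion."""
    normalized = model(rgb_pair, imu_seq)     # relative displacement 1 / relative rotation 1, (B, 6)
    pose = normalized * std2 + mean2          # relative displacement 2 / relative rotation 2
    return pose[:, :3], pose[:, 3:]           # translation (x, y, z) and rotation (x, y, z)
```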
In summary, the self-learning method of this embodiment trains the deep learning network model with pseudo labels and real labels together, which reduces the required number of real labels, unlike other methods that need a large number of real labels for training. The method uses the first single-layer LTC-RNN and the second single-layer LTC-RNN to extract the inertial navigation features and to regress the pose, respectively; the iterative computation inside the two LTC-RNNs strengthens their feature-extraction capability, in contrast to other recurrent neural networks that extract features in a single pass.
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.

Claims (9)

1. An LTC-DNN-based visual inertial navigation combined navigation system, used for automatic driving and autonomous navigation of unmanned aerial vehicles, characterized by comprising a deep learning network model, wherein the deep learning network model consists of a visual feature extraction module, an inertial navigation feature extraction module and a pose regression module that are connected in sequence, wherein
the visual feature extraction module is used to extract 1024-dimensional visual features; its input is two adjacent RGB frames stacked along the channel dimension, and its output is the 1024-dimensional visual feature;
the inertial navigation feature extraction module comprises a first single-layer LTC-RNN with a 1024-dimensional hidden state; its input is the inertial navigation data between the two adjacent RGB frames, and its output is a 1024-dimensional inertial navigation feature;
the pose regression module comprises an attention mechanism fusion submodule, a second single-layer LTC-RNN with a 1000-dimensional hidden state and a fully connected regression submodule that are connected in sequence; the input of the attention mechanism fusion submodule is the concatenated feature obtained by concatenating the visual feature and the inertial navigation feature, and it weights the two features to obtain a weighted fusion feature; the input of the second single-layer LTC-RNN is the weighted fusion feature, and its output is a regression feature; the input of the fully connected regression submodule is the regression feature, and its output is the estimate of the relative displacement and relative rotation;
wherein the calculation formula of the first single-layer LTC-RNN and the second single-layer LTC-RNN is as follows:
h(t + Δt) = [h(t) + Δt · f(h(t), x(t), t, θ)] / [1 + Δt · (1/τ + f(h(t), x(t), t, θ))]
where h(t) is the hidden state of the LTC-RNN at the current time, τ is a constant time constant, Δt is the time step, x(t) is the input data at the current time, f(h(t), x(t), t, θ) is a deep learning network, θ are its trainable parameters, and t is the current time. At the start of each computation, the first and second single-layer LTC-RNNs feed the data x(t) and h(t) into the above formula, then use the current output h(t + Δt) as the input h(t) of the next pass and continue the computation, repeating it 6 times; the output h(t + Δt) of the 6th pass is taken as the computation result of the first or second single-layer LTC-RNN.
2. The LTC-DNN-based visual inertial navigation combined navigation system according to claim 1, wherein the visual feature extraction module is formed by stacking 10 convolutional layers; the kernel sizes of the first three layers are 7 × 7, 5 × 5 and 5 × 5 in turn, the kernel sizes of the remaining seven layers are all 3 × 3, the convolution stride of the fourth, sixth and eighth layers is 1, and the stride of the remaining layers is 2; all 10 convolutional layers use the ReLU activation function.
3. The LTC-DNN-based visual inertial navigation combined navigation system according to claim 1, wherein the RGB pictures are resized to 416 × 128 before being input to the visual feature extraction module.
4. The LTC-DNN-based visual inertial navigation combined navigation system according to claim 1, wherein the attention mechanism fusion submodule comprises two sub-networks with the same structure, each formed by stacking two fully connected layers: the first fully connected layer has dimension 2048 and is followed by a ReLU activation function, and the second fully connected layer has dimension 1024 and is followed by a Sigmoid activation function.
5. The LTC-DNN-based visual inertial navigation combined navigation system according to claim 1, wherein the fully connected regression submodule consists of four fully connected layers whose dimensions are 512, 128, 64 and 6, respectively; each of the first three layers is followed by a ReLU activation function, and the fourth layer is not followed by any activation function.
6. A self-learning method of the LTC-DNN-based visual inertial navigation combined navigation system according to any one of claims 1 to 5, characterized by comprising the following steps:
S1, standardizing the real labels, which contain the real relative displacement and relative rotation, to a standard normal distribution to obtain real standard labels together with the corresponding statistics mean 1 and variance 1, and performing the first training of the deep learning network model with the real standard labels;
S2, using the deep learning network model after the first training to predict on unlabeled data, and applying a first inverse standardization to the predictions with mean 1 and variance 1 to obtain pseudo labels;
S3, randomly selecting a number of pseudo labels and real labels and mixing them at a ratio of 0.2:1 to obtain mixed labels;
S4, standardizing the mixed labels to a standard normal distribution to obtain mixed standard labels together with the corresponding statistics mean 2 and variance 2, and performing the second training of the deep learning network model with the mixed standard labels.
7. The self-learning method of the LTC-DNN-based visual inertial navigation combined navigation system according to claim 6, wherein the pseudo labels, the real labels and the mixed labels contain the relative displacement and relative rotation on the x, y and z axes.
8. The self-learning method of the LTC-DNN-based visual inertial navigation combined navigation system according to claim 6, wherein converting the real labels and the mixed labels into a standard normal distribution comprises converting the relative displacement and the relative rotation on the x, y and z axes into standard normal distributions, respectively.
9. The self-learning method of the LTC-DNN-based visual inertial navigation combined navigation system according to claim 6, wherein the deep learning network model is trained with an Adam optimizer whose momentum is set to (0.9, 0.99); the learning rate of the first and second single-layer LTC-RNNs is set to 0.001, and the learning rate of the remaining modules is set to 0.00001; the loss function is smooth_l1_loss.
CN202110664888.9A 2021-06-16 2021-06-16 LTC-DNN-based visual inertial navigation combined navigation system and self-learning method Active CN113392904B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202110664888.9A CN113392904B (en) 2021-06-16 2021-06-16 LTC-DNN-based visual inertial navigation combined navigation system and self-learning method
PCT/CN2022/112625 WO2022262878A1 (en) 2021-06-16 2022-08-15 Ltc-dnn-based visual inertial navigation combined navigation system and self-learning method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110664888.9A CN113392904B (en) 2021-06-16 2021-06-16 LTC-DNN-based visual inertial navigation combined navigation system and self-learning method

Publications (2)

Publication Number Publication Date
CN113392904A CN113392904A (en) 2021-09-14
CN113392904B (en) 2022-07-26

Family

ID=77621376

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110664888.9A Active CN113392904B (en) 2021-06-16 2021-06-16 LTC-DNN-based visual inertial navigation combined navigation system and self-learning method

Country Status (2)

Country Link
CN (1) CN113392904B (en)
WO (1) WO2022262878A1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230213936A1 (en) * 2022-01-05 2023-07-06 Honeywell International Inc. Multiple inertial measurement unit sensor fusion using machine learning
CN115953839B (en) * 2022-12-26 2024-04-12 广州紫为云科技有限公司 Real-time 2D gesture estimation method based on loop architecture and key point regression
CN115793001B (en) * 2023-02-07 2023-05-16 立得空间信息技术股份有限公司 Vision, inertial navigation and defending fusion positioning method based on inertial navigation multiplexing
CN116704026A (en) * 2023-05-24 2023-09-05 国网江苏省电力有限公司南京供电分公司 Positioning method, positioning device, electronic equipment and storage medium
CN116989820B (en) * 2023-09-27 2023-12-05 厦门精图信息技术有限公司 Intelligent navigation system and method

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106940562B (en) * 2017-03-09 2023-04-28 华南理工大学 Mobile robot wireless cluster system and neural network visual navigation method
US10885659B2 (en) * 2018-01-15 2021-01-05 Samsung Electronics Co., Ltd. Object pose estimating method and apparatus
CN110084131A (en) * 2019-04-03 2019-08-02 华南理工大学 A kind of semi-supervised pedestrian detection method based on depth convolutional network
CN112556692B (en) * 2020-11-27 2023-01-31 绍兴市北大信息技术科创中心 Vision and inertia odometer method and system based on attention mechanism
CN112801201B (en) * 2021-02-08 2022-10-25 华南理工大学 Deep learning visual inertial navigation combined navigation design method based on standardization

Also Published As

Publication number Publication date
CN113392904A (en) 2021-09-14
WO2022262878A1 (en) 2022-12-22

Similar Documents

Publication Publication Date Title
CN113392904B (en) LTC-DNN-based visual inertial navigation combined navigation system and self-learning method
CN111738110A (en) Remote sensing image vehicle target detection method based on multi-scale attention mechanism
CN110781262B (en) Semantic map construction method based on visual SLAM
CN108520238B (en) Scene prediction method of night vision image based on depth prediction coding network
CN112733768B (en) Natural scene text recognition method and device based on bidirectional characteristic language model
CN111462324B (en) Online spatiotemporal semantic fusion method and system
CN113422952B (en) Video prediction method based on space-time propagation hierarchical coder-decoder
Sun et al. Unmanned surface vessel visual object detection under all-weather conditions with optimized feature fusion network in YOLOv4
CN113658189B (en) Cross-scale feature fusion real-time semantic segmentation method and system
CN114526728B (en) Monocular vision inertial navigation positioning method based on self-supervision deep learning
CN113838135A (en) Pose estimation method, system and medium based on LSTM double-current convolution neural network
CN116912485A (en) Scene semantic segmentation method based on feature fusion of thermal image and visible light image
CN112906549B (en) Video behavior detection method based on space-time capsule network
CN112268564B (en) Unmanned aerial vehicle landing space position and attitude end-to-end estimation method
Jo et al. Mixture density-PoseNet and its application to monocular camera-based global localization
CN112149496A (en) Real-time road scene segmentation method based on convolutional neural network
WO2020093210A1 (en) Scene segmentation method and system based on contenxtual information guidance
CN115797684A (en) Infrared small target detection method and system based on context information
CN113920317A (en) Semantic segmentation method based on visible light image and low-resolution depth image
Asghar et al. Allo-centric occupancy grid prediction for urban traffic scene using video prediction networks
CN114034312B (en) Light-weight multi-decoupling visual odometer implementation method
CN116486203B (en) Single-target tracking method based on twin network and online template updating
CN113837080B (en) Small target detection method based on information enhancement and receptive field enhancement
CN118155294B (en) Double-flow network classroom behavior identification method based on space-time attention
CN114170421B (en) Image detection method, device, equipment and storage medium

Legal Events

Code Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant