CN112509051A - Bionic-based autonomous mobile platform environment sensing and mapping method - Google Patents


Info

Publication number
CN112509051A
CN112509051A (application CN202011521855.0A)
Authority
CN
China
Prior art keywords: coordinate system, cell, map, information, IMU
Prior art date
Legal status: Pending
Application number
CN202011521855.0A
Other languages
Chinese (zh)
Inventor
王博
吴忻生
陈安
杨璞光
刘丞
陈纯玉
Current Assignee
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date
Filing date
Publication date
Application filed by South China University of Technology SCUT filed Critical South China University of Technology SCUT
Priority to CN202011521855.0A
Publication of CN112509051A

Classifications

    • G06T7/73 — Image analysis; determining position or orientation of objects or cameras using feature-based methods
    • G06F16/587 — Information retrieval of still image data; retrieval characterised by using metadata, e.g. geographical or spatial information
    • G06N3/002 — Computing arrangements based on biological models; biomolecular computers, i.e. using biomolecules, proteins, cells
    • G06N3/045 — Neural networks; combinations of networks
    • G06T7/10 — Image analysis; segmentation; edge detection
    • G06T2207/30252 — Indexing scheme for image analysis; vehicle exterior; vicinity of vehicle


Abstract

The invention discloses a bionics-based environment sensing and mapping method for an autonomous mobile platform, comprising the following steps: S1, collecting an automatic driving data set and feeding the pictures into an improved semantic segmentation network for training; S2, initializing the binocular vision and IMU combination; S3, fusing binocular vision and the IMU; S4, constructing a bionic cell model; and S5, constructing the cognitive map. The method builds an experience map on bionic principles, which greatly reduces the number of parameters in the map and allows scene map information covering a larger area to be stored. At the same time, modeling place information through semantic segmentation and fusing it with the position cells improves detection robustness. By fully exploiting the feature extraction and learning capability of convolutional neural networks, the method mitigates the excessive mapping computation and inaccurate recognition of traditional SLAM, and improves the accuracy of bionic environment perception and place recognition during map construction.

Description

Bionic-based autonomous mobile platform environment sensing and mapping method
Technical Field
The invention belongs to the technical field of autonomous mobile platform positioning and navigation, and particularly relates to a bionics-based environment sensing and mapping method for an autonomous mobile platform.
Background
Simultaneous Localization and Mapping (SLAM) refers to the technique by which a robot builds a map of an unknown environment while localizing itself on that map from estimates of its own state and of the map. Accurate positioning information helps an autonomous mobile platform complete tasks such as path planning and map drawing. Traditional SLAM algorithms are based on Bayesian probability estimation: they solve for velocity and pose from external input information, rely on complex nonlinear optimization to exploit prior information about the environment, and build a huge database as a retrieval source. However, modeling a real environment with complex characteristics consumes enormous computing power, and the resulting models cannot be used for deeper navigation tasks such as semantic analysis and cognitive understanding.
To meet the needs of environment modeling and intelligent navigation over unknown terrain, bionic environment sensing and mapping methods have been developed by imitating the way animals recognize their environment and navigate. Such methods play an important role in environment perception, map construction and navigation, and are a key topic for future research on intelligent navigation and cognitive science.
Disclosure of Invention
The invention aims to provide a bionics-based environment sensing and mapping method for an autonomous mobile platform that solves the problems set forth in the background above. The following presents a simplified summary of various aspects in order to provide a basic understanding of them. This summary does not detail all contemplated aspects; its sole purpose is to present concepts in a simplified form as a prelude to the detailed description.
Studies have found that the brains of lower mammals bear many similarities to the human brain, in which environmental information is represented in the form of a cognitive map. A cognitive map is the internal neural and cellular reproduction of the external environment. A mammal processes external spatial information through the brain's "where" pathway, in which specific cells accumulate discharges to encode the current path. Through the "what" visual pathway it recognizes the images seen by the eyes to obtain the object attributes of the current location, and the discharge driven by the location information activates specific position cells. The position-cell information is then combined with the object attribute information from the visual pathway to judge whether the current memory unit corresponds to a place that does not yet appear on the cognitive map; if so, the information is sent to a node linked into the cognitive map to extend it. A complete cognitive map is obtained after the whole area has been traversed.
Natural scenes contain considerable random noise, uneven illumination and indistinct environment boundaries. The scheme therefore combines binocular vision with an IMU: the pose information obtained from the IMU sensor is fed into head-direction cells and stripe cells, while semantic segmentation of the binocular images yields category information, feature information and depth information, and the resulting semantic feature vector is combined with the semantic category information. The method then judges whether the current place has already been stored and, if not, maps it to a position node of the cognitive map; the position node contains the depth information and semantic feature information of the objects at the current place. After comparing the performance of SURF, ORB and SIFT feature points in the object perception module, ORB feature descriptors are selected to extract matching points between the binocular views, and the image depth is recovered from the matches. At the same time, semantic features are extracted from the two binocular images, and an offline dictionary of semantic features and a query image database are established.
An autonomous mobile platform environment sensing and mapping method based on bionics comprises the following steps:
s1, collecting an automatic driving data set, sending the pictures into an improved semantic segmentation network, and training the improved semantic segmentation network;
s2, initializing binocular vision and IMU combination;
s3, fusing binocular vision and IMU;
s4, constructing a bionic cell model;
and S5, constructing the cognitive map.
Preferably, an improved convolutional neural network is used for semantic modeling and comprises a feature extraction layer, a feature fusion layer, a feature coding layer and a feature decoding layer. In the feature extraction stage, an improved DeepLab network (an encoder-decoder network with separable convolutions) extracts features of the input image through several convolutional layers; the features are further processed by atrous (dilated) convolution kernels of sizes c × c and b × b, and the result is passed through a c × c convolution to the feature decoding layer for pixel prediction. In the decoding stage, the feature map is first upsampled by a factor of 4, then concatenated with the conv3 low-level features and expanded by a factor of 4 with convolution kernels of size 3 to obtain the semantic segmentation result, from which the semantic category vector is extracted. A convolution kernel of size 1 is added to the middle convolutional layer to convert its feature map into a one-dimensional object feature vector, and the semantic segmentation vector and the object feature vector are merged to obtain the semantic place vector.
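As an illustration of the final fusion step only, the following minimal sketch (not taken from the patent; names such as `place_semantic_vector`, `class_logits` and `mid_features` are assumptions) concatenates a normalized class-histogram vector derived from the segmentation output with an L2-normalized object feature vector to form a single place semantic vector:

```python
import numpy as np

def place_semantic_vector(class_logits, mid_features, num_classes=19):
    """Illustrative fusion of a semantic-category vector and an object feature vector.

    class_logits : (C, H, W) per-pixel class scores from the decoder
    mid_features : (D,) one-dimensional object feature vector from the 1x1 convolution
                   applied to the middle convolutional layer
    """
    labels = class_logits.argmax(axis=0)                    # per-pixel class map
    # semantic category vector: normalized histogram of predicted classes
    class_vec = np.bincount(labels.ravel(), minlength=num_classes).astype(np.float64)
    class_vec /= max(class_vec.sum(), 1.0)
    # object feature vector: L2-normalized so both parts have comparable scale
    feat_vec = mid_features / (np.linalg.norm(mid_features) + 1e-12)
    return np.concatenate([class_vec, feat_vec])            # place semantic vector

# usage with random stand-in tensors
vec = place_semantic_vector(np.random.rand(19, 64, 64), np.random.rand(256))
```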
Preferably, step S1 specifically includes:
s1-1, collecting pictures of roads in the automatic driving data set, dividing the pictures into rectangular areas, and dividing a training set, a test set and a verification set according to a proportion;
and S1-2, sending the pictures in the training set into the improved semantic segmentation network, running a semantic segmentation test on the training set, and saving the set of network parameters that achieves the best segmentation performance on the data set.
Preferably, initializing binocular vision comprises the steps of:
s2-1, jointly calibrating the binocular camera modules; the correspondence from the camera coordinate system to the world coordinate system is represented by the matrix

$$\begin{bmatrix} X_c \\ Y_c \\ Z_c \\ 1 \end{bmatrix} = \begin{bmatrix} R & t \\ \mathbf{0}^T & 1 \end{bmatrix} \begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix} \qquad (1)$$

where (Xc, Yc, Zc) is a point in the camera coordinate system, (Xw, Yw, Zw) is the same point in the world coordinate system, R is the rotation matrix of the camera coordinate system relative to the world coordinate system, and t is the translation of the camera coordinate system relative to the world coordinate system;

the translation vector from the imaging point to the origin of the pixel coordinate system is [d_x, d_y]^T; with focal length f (in pixels), the pixel point [u, v]^T is related to the image coordinate system by

$$\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} 1/d_x & 0 & u_0 \\ 0 & 1/d_y & v_0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \qquad (2)$$

where x, y are the coordinates in the image coordinate system and u_0 and v_0 are half the length and width of the imaging plane;

the transformation between points in the camera coordinate system and points on the image plane is

$$Z_c \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} X_c \\ Y_c \\ Z_c \\ 1 \end{bmatrix} \qquad (3)$$

combining the above formulas gives the conversion from the image coordinate system to the world coordinate system, which is used to calibrate the binocular camera:

$$Z_c \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = M_1 M_2 \begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix} = M P_w \qquad (4)$$

where M_1 is the matrix of camera intrinsic parameters, M_2 is the matrix of camera extrinsic parameters (both are simplified representations of the homogeneous transformation matrices described above), M denotes the combined intrinsic-extrinsic parameter matrix, and P_w is the homogeneous three-dimensional coordinate of the spatial point; given the coordinates of several calibration-board points, substituting the image coordinates and world coordinates of the corresponding points yields the camera parameter matrix M, which is used to compute the spatial position of matched points of the binocular camera;
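The combined matrix M = M1·M2 of equation (4) can be exercised with a short numpy sketch; the intrinsic and extrinsic values below are arbitrary stand-ins, not calibration results from the patent:

```python
import numpy as np

# Assumed example intrinsics M1 (fx, fy, cx, cy in pixels) and extrinsics M2 = [R | t]
fx, fy, cx, cy = 700.0, 700.0, 320.0, 240.0
M1 = np.array([[fx, 0., cx, 0.],
               [0., fy, cy, 0.],
               [0., 0., 1., 0.]])                  # 3x4 intrinsic (projection) matrix
R = np.eye(3)                                      # camera aligned with world (example)
t = np.array([[0.1], [0.0], [0.0]])                # example translation
M2 = np.vstack([np.hstack([R, t]), [0, 0, 0, 1]])  # 4x4 homogeneous extrinsic matrix

M = M1 @ M2                                        # combined matrix of equation (4)

def project(Pw):
    """Project a 3-D world point Pw into pixel coordinates via Zc*[u, v, 1]^T = M*Pw."""
    Pw_h = np.append(Pw, 1.0)                      # homogeneous world coordinates
    uvw = M @ Pw_h
    return uvw[:2] / uvw[2]                        # divide by Zc to get (u, v)

print(project(np.array([0.5, 0.2, 4.0])))          # pixel coordinates of an example point
```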
s2-2, operating the binocular camera module and the IMU module, and initializing the inertial measurement unit (IMU) using the pose of each key frame measured by binocular visual simultaneous localization and mapping (SLAM); the parameters to be computed are the scale, the gravity direction, the velocity, the acceleration and the gyroscope bias. The IMU coordinate system is denoted B and the world coordinate system W; the IMU output is sampled at a fixed interval Δt, measuring the IMU acceleration ã_B(t) and angular velocity ω̃_WB(t), and the IMU measurement model is

$$\begin{aligned} {}_{B}\tilde{\omega}_{WB}(t) &= {}_{B}\omega'_{WB}(t) + b_g(t) + \eta_g(t) \\ {}_{B}\tilde{a}(t) &= R_{WB}^{T}(t)\,\big({}_{W}a(t) - g_W\big) + b_a(t) + \eta_a(t) \end{aligned} \qquad (5)$$

where ω'_WB(t) ∈ R³ is the true instantaneous angular velocity of the gyroscope in coordinate system B relative to the world coordinate system, b_g(t) is the slowly time-varying internal (bias) error of the IMU gyroscope, η_g(t) is gyroscope measurement white noise, a_W(t) ∈ R³ is the instantaneous acceleration of the IMU in the W coordinate system, g_W is the gravity acceleration vector in the world coordinate system W, a'_B(t) is the true instantaneous acceleration in coordinate system B relative to the world coordinate system, R_WB^T(t) is the transpose of the rotation matrix from B to the world coordinate system W, b_a(t) is the slowly time-varying acceleration bias of the IMU, and η_a(t) is accelerometer measurement white noise.
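A minimal simulation of the measurement model (5); the bias and noise values below are chosen purely for illustration and are not from the patent:

```python
import numpy as np

def imu_measurement(omega_true_B, a_true_W, R_WB, b_g, b_a,
                    sigma_g=1e-3, sigma_a=1e-2, g_W=np.array([0., 0., -9.81])):
    """Generate gyroscope/accelerometer readings following measurement model (5):
    measured angular rate = true rate + gyro bias + white noise; the measured
    specific force is the world-frame acceleration minus gravity, rotated into
    body frame B, plus accelerometer bias and white noise."""
    eta_g = np.random.normal(0.0, sigma_g, 3)          # gyro measurement white noise
    eta_a = np.random.normal(0.0, sigma_a, 3)          # accel measurement white noise
    omega_meas = omega_true_B + b_g + eta_g
    a_meas = R_WB.T @ (a_true_W - g_W) + b_a + eta_a
    return omega_meas, a_meas

# example call: a stationary platform (true angular rate and acceleration are zero)
w, a = imu_measurement(np.zeros(3), np.zeros(3), np.eye(3),
                       b_g=np.array([0.01, -0.02, 0.005]),
                       b_a=np.array([0.05, 0.0, -0.03]))
```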
Preferably, the binocular camera module is the vision module, is used to collect visual image information, and is fixed on the front of the autonomous mobile platform. The binocular system is initialized automatically: a homography matrix is used for planar scenes and a fundamental matrix for non-planar scenes; two image frames are selected as initial frames through matching, and the pose between the two frames is computed from the matches. The disparity of the binocular views at corresponding pixels is computed to form a disparity map, a global energy function associated with the disparity map is defined, the optimal disparity of each pixel is solved by minimizing this energy function, and depth information is obtained from the disparity between the left and right images, yielding the depth map.
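One possible way to realize the disparity/depth step with off-the-shelf tools is OpenCV's semi-global block matching, which approximately minimizes a global energy over the disparity map; the image paths, SGBM parameters, focal length and baseline below are placeholders, and the patent does not prescribe this particular implementation:

```python
import cv2
import numpy as np

# left/right rectified grayscale images (placeholder paths)
left = cv2.imread("left_rectified.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right_rectified.png", cv2.IMREAD_GRAYSCALE)

# P1/P2 penalize small/large disparity changes between neighbouring pixels
block = 7
sgbm = cv2.StereoSGBM_create(minDisparity=0,
                             numDisparities=96,            # must be divisible by 16
                             blockSize=block,
                             P1=8 * block * block,
                             P2=32 * block * block,
                             uniquenessRatio=10,
                             speckleWindowSize=100,
                             speckleRange=2)

disparity = sgbm.compute(left, right).astype(np.float32) / 16.0  # SGBM returns 16x fixed point

# depth from disparity: Z = f * b / d (f = focal length in pixels, b = baseline in metres)
f_px, baseline = 700.0, 0.12                                     # example calibration values
valid = disparity > 0
depth = np.zeros_like(disparity)
depth[valid] = f_px * baseline / disparity[valid]
```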
Preferably, the IMU module is an encoder module, is configured to acquire displacement information of the autonomous mobile platform, and is installed on a base of the autonomous mobile platform.
Preferably, initializing the IMU includes the steps of:
s2-2-1, estimating the gyroscope bias from several consecutive key frames: assuming a constant bias b_g plus an update δb_g, the update of the gyroscope bias is computed by minimizing the rotation error

$$\delta b_g = \arg\min_{\delta b_g}\ \sum_{i=1}^{N-1}\left\|\operatorname{Log}\!\Big(\big(\Delta R_{i,i+1}\operatorname{Exp}(J_{\Delta R}^{g}\,\delta b_g)\big)^{T}\, R_{BW}^{i}\, R_{WB}^{i+1}\Big)\right\|^{2} \qquad (6)$$

where N is the number of key frames, R_WB^i = R_WC^i R_CB is obtained from the camera rotation R_WC^i and the rigid-body-to-camera rotation transformation matrix R_CB, and ΔR_{i,i+1} is the pre-integrated rotation measurement between two consecutive key frames; with a zero initial value, the initial gyroscope bias that minimizes the objective is solved by Gauss-Newton iteration;
s2-2-2, solving a linear equation between consecutive frames to obtain the acceleration bias estimate, the scale and an accurate value of the gravity direction;
s2-2-3, with the scale and the gravity acceleration direction known, substituting into formula (5) to compute R_WB and the position and velocity measurements, completing the IMU initialization.
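A toy sketch of step S2-2-1 under simplifying assumptions: gyroscope measurements are simulated with a known constant bias, rotations are pre-integrated for a candidate bias, and the bias is recovered by nonlinear least squares on the rotation error between the pre-integrated and keyframe rotations (scipy's solver stands in for the Gauss-Newton iteration; all numbers are invented for the example):

```python
import numpy as np
from scipy.spatial.transform import Rotation
from scipy.optimize import least_squares

rng = np.random.default_rng(0)
dt, steps_per_kf, n_kf = 0.005, 40, 8
b_g_true = np.array([0.02, -0.015, 0.01])               # the bias we want to recover

# simulate body angular rates, the resulting keyframe rotations, and biased gyro readings
omegas = 0.5 * rng.standard_normal((n_kf * steps_per_kf, 3))
R_kf = [Rotation.identity()]
for k in range(n_kf):
    R = R_kf[-1]
    for w in omegas[k * steps_per_kf:(k + 1) * steps_per_kf]:
        R = R * Rotation.from_rotvec(w * dt)
    R_kf.append(R)                                       # "camera" keyframe rotations R_WB
gyro = omegas + b_g_true                                 # measured rates include the bias

def preintegrate(meas, b_g):
    """Pre-integrated rotation over one keyframe interval for a candidate bias."""
    dR = Rotation.identity()
    for w in meas:
        dR = dR * Rotation.from_rotvec((w - b_g) * dt)
    return dR

def residuals(b_g):
    res = []
    for k in range(n_kf):
        dR = preintegrate(gyro[k * steps_per_kf:(k + 1) * steps_per_kf], b_g)
        rel = R_kf[k].inv() * R_kf[k + 1]                # relative rotation from the keyframes
        res.append((dR.inv() * rel).as_rotvec())         # rotation error, vanishes at true bias
    return np.concatenate(res)

b_g_est = least_squares(residuals, np.zeros(3)).x        # zero initial value, as in the text
print(b_g_true, b_g_est)                                 # estimate should match the true bias
```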
Preferably, step S3 includes the following steps:
s3-1-1, two images shot at the same time are called a frame; a group of left and right images captured by the binocular camera while the IMU data are stable is selected as the initial frame, the IMU data from that moment on are stored, and the image frames are preprocessed by distortion correction;
s3-1-2, key points are extracted from the left and right images of the binocular camera, and point features are detected and described with the ORB operator; the ORB descriptor is a 256-bit binary descriptor, and an offline dictionary of key-point features is built by normalizing the feature descriptors;
s3-1-3, point features are matched frame to frame, and mismatched ORB feature points are rejected with the Random Sample Consensus (RANSAC) algorithm;
s3-1-4, matched feature points are selected to compute the disparity and form a disparity map; a global energy function associated with the disparity map is defined and minimized to solve for the optimal disparity of each pixel.
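A sketch of steps S3-1-2/S3-1-3 using OpenCV's ORB detector and RANSAC on the epipolar constraint to reject mismatches; the image paths and parameter values are placeholders:

```python
import cv2
import numpy as np

img_l = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)     # placeholder image paths
img_r = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=2000)                     # 256-bit binary ORB descriptors
kp_l, des_l = orb.detectAndCompute(img_l, None)
kp_r, des_r = orb.detectAndCompute(img_r, None)

# Hamming-distance brute-force matching for binary descriptors
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des_l, des_r), key=lambda m: m.distance)

pts_l = np.float32([kp_l[m.queryIdx].pt for m in matches])
pts_r = np.float32([kp_r[m.trainIdx].pt for m in matches])

# RANSAC on the epipolar constraint rejects mismatched ORB feature points
F, inlier_mask = cv2.findFundamentalMat(pts_l, pts_r, cv2.FM_RANSAC, 1.0, 0.999)
inliers = [m for m, keep in zip(matches, inlier_mask.ravel()) if keep]
print(f"{len(inliers)} inlier matches out of {len(matches)}")
```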
Preferably, step S4 includes the steps of:
s4-1, describing the heading angle θ of the gyroscope sensor of the autonomous mobile platform through a head-direction cell model, wherein head-direction cells are heading-dependent neurons expressed as a Gaussian model, and the head-direction transfer matrix is obtained by inputting the angle and angular offset of the gyroscope sensor; the modeled head-direction cell transfer equation of the velocity signal is given by equation (7), in which the signal generated by the i-th head-direction cell at time t is determined mainly by the preferred direction θ_b and the angular offset θ_i, D(t) is the head-direction transfer matrix, the preferred direction of the head-direction cells is usually set to 0°, θ_i is the offset of the i-th head-direction cell, θ_0 to θ_m are the preferred angles of the m head-direction cells with 0° < θ_i < 360°, and v(t) is the moving speed of the robot along direction θ(t) at time t;
s4-2, describing the linear velocity obtained from the accelerometer sensor with a stripe cell model, wherein stripe cells have a firing field resembling parallel stripes, and the stripe cell model obtains the current path of the autonomous mobile platform by integration; the transfer model of the stripe cells for the displacement signal is given by equation (8): the stripe cells encode displacement along a specific direction, and their firing is a periodically resetting code responsible for path integration of the linear velocity; defining an initial direction δ, a firing phase β and a firing period f, and with v(t) the moving speed of the robot along direction θ(t) at time t, the speed component along direction δ is

$$v_\delta(t) = \cos\big(\delta - \theta(t)\big)\,v(t) \qquad (9)$$
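Equation (9) and the stripe-cell path integration can be illustrated with a few lines of numpy; the trajectory below is invented for the example:

```python
import numpy as np

def stripe_cell_displacement(theta, v, delta, dt):
    """Path integration along a stripe cell's preferred direction delta (eq. 9):
    the speed component in that direction is cos(delta - theta(t)) * v(t), and the
    accumulated displacement is its time integral."""
    v_delta = np.cos(delta - theta) * v          # projected speed, equation (9)
    return np.cumsum(v_delta) * dt               # integrated displacement over time

# example trajectory: heading sweeps slowly while speed stays at 0.5 m/s
dt = 0.1
t = np.arange(0.0, 20.0, dt)
theta = 0.1 * t                                  # heading theta(t) from the head-direction model
v = np.full_like(t, 0.5)                         # platform speed v(t)
disp_east = stripe_cell_displacement(theta, v, delta=0.0, dt=dt)
disp_north = stripe_cell_displacement(theta, v, delta=np.pi / 2, dt=dt)
```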
s4-3, expressing the movement of the current autonomous mobile platform in two-dimensional space with a grid cell model, wherein grid cells are cells that fire regularly as rodents move through two-dimensional space and each firing field expresses part of the environment around a position; direction information and displacement information are fed into the grid cell model, which is modeled as a continuous attractor network so as to generate velocity-driven grid-cell population activity; in the attractor network the neurons are arranged on a two-dimensional neural sheet, and to remove the boundary of the attractor network the neurons on the upper, lower, left and right edges of the sheet are connected, forming a two-dimensional ring (toroidal) attractor model; the neurons are coupled through local excitation and global inhibition so that activity packets form on the sheet, and the activity packets move across the sheet as velocity signals are fed in, generating the grid-cell population; the velocity-driven dynamics of grid cell i are expressed as

$$T_c\,\frac{dS_i}{dt} + S_i = f\!\left(\sum_{j} W_{ij}\,S_j + F_i\right) \qquad (10)$$

wherein S_i is the current activity state of the grid cell, T_c is the response time constant of the neuron, Σ_j W_ij S_j is the inhibitory input of the surrounding ring neurons to neuron i, and F_i is the forward excitatory projection of the head-direction cells onto the grid cells, determined mainly by velocity and direction; the model is expressed through the function f(x), whose independent variable x is the sum of the excitatory and inhibitory projections in equation (10) and whose form is given by equation (11); in the connection weights, W_ij is the connection weight from grid cell i to grid cell j, g is the gain that adjusts the activity of the grid-cell population, γ is initialized to 1.05β, λ_grid is the activity spacing of the grid cells, x_i is the location of grid cell i on the neural sheet, and l is the reset distance of the stripe cell model;
The head-direction cell model and the stripe cell model integrate their information into regular firing and send signals to the grid cells; after receiving this information, the grid cells form specific grid fields with fixed spacing through regular repeated firing, and these grid fields correspond to multiple activation regions of the external space;
s4-4, describing the current position through a position cell model: position cells are cells that fire at specific positions in the map space and are spatially specific; through selective firing they map a place field in the hippocampus to a specific external spatial position; when the autonomous mobile platform reaches a specific position in space, the corresponding position cell fires maximally, forming a position activation and establishing a new position point; the grid cell model is mapped to the position cells through a Hebbian neural network (equation (12)), wherein A is the position-cell gain constant, M_l is the number of layers of neural sheets, r is the position of the autonomous mobile platform, w_ij is the connection weight from grid cell i to position cell j, C_place is the inhibition constant of the position cells, and s_j is the firing rate of the grid cells; the relation between the position cells and the grid cells is then obtained as equation (13), wherein k is the learning rate, p_i is the position-cell firing model and s_j the grid-cell firing rate; equation (13) expresses the relation between the grid-cell activation rate, the grid-cell inhibition rate and the strength of the synaptic connections: the synaptic connection becomes stronger if activation exceeds inhibition, and weaker otherwise.
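A heavily simplified continuous-attractor sketch in the spirit of equation (10): grid-cell activities on a periodic neural sheet are updated from an inhibitory center-surround weight kernel plus a velocity-tuned feedforward drive. The kernel shape, the preferred-direction assignment, the offset l and all parameter values (including λ_grid = 13) are assumptions borrowed from standard continuous-attractor grid-cell models, not parameters given in the patent:

```python
import numpy as np

n = 32                                           # n x n sheet of grid-cell neurons
Tc, dt, g = 10.0, 1.0, 1.0                       # response time constant, step, gain
beta = 3.0 / 13.0 ** 2                           # assumed activity spacing lambda_grid = 13
gamma = 1.05 * beta                              # gamma initialized to 1.05 * beta

# neuron locations on the (periodic) sheet and assumed preferred shift directions
xs = np.stack(np.meshgrid(np.arange(n), np.arange(n)), -1).reshape(-1, 2).astype(float)
prefs = np.tile(np.array([[1, 0], [0, 1], [-1, 0], [0, -1]], float),
                (len(xs) // 4 + 1, 1))[:len(xs)]

def torus_diff(a, b):
    d = a - b
    return (d + n / 2) % n - n / 2               # wrap-around distance on the torus

# center-surround weights with a small offset l along each cell's preferred direction
l = 2.0                                          # assumed "reset distance" offset
d = torus_diff(xs[:, None, :], xs[None, :, :] + l * prefs[None, :, :])
dist2 = (d ** 2).sum(-1)
W = g * (np.exp(-gamma * dist2) - np.exp(-beta * dist2))

def step(S, v, theta):
    """One Euler step of Tc*dS/dt + S = f(W@S + F) with rectification as f and a
    velocity/direction-tuned feedforward drive F from the head-direction system."""
    F = 1.0 + 0.3 * (prefs @ (v * np.array([np.cos(theta), np.sin(theta)])))
    return S + (dt / Tc) * (np.maximum(W @ S + F, 0.0) - S)

S = np.random.rand(len(xs)) * 0.1                # grid-cell activities
for _ in range(200):
    S = step(S, v=0.8, theta=0.3)                # activity is shaped by the kernel and velocity drive
```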
Preferably, step S5 includes the following steps:
s5-1, obtaining the semantic segmentation result of the current frame: the image frame is sent to the improved semantic segmentation network of step S1 to obtain the place semantic vector;
s5-2, if the place semantic feature vector does not exist in the existing cognitive map nodes and the depth information of the image frame differs from that of the known places, it is judged that a new position has been reached and a new position cell is generated;
s5-3, if a new position cell is generated, the current position cell is turned into a node and added to the existing cognitive map, the semantic vector information of the current image is stored in the place library, and the depth information of the viewed objects is added to the current position cell; a cognitive map node is denoted e, and each cognitive node contains the position-cell firing p, the environment image feature V, the topological relation L between cognitive nodes and the depth information D of the environment image:

$$e_i = \{p_i,\, V_i,\, L_i,\, D\} \qquad (14)$$

the position-cell state p_i is judged through the environment image feature V_i; when the environment image feature V_i does not match the information in the existing cognitive nodes, a new cognitive node is created:

$$e_i = \{p_i,\, V_i,\, D,\, L_i + \Delta L\} \qquad (15)$$

where ΔL is the distance between the current cognitive node and the previous cognitive node, with the displacement information coming from the odometry system;

the pose information p of each node is updated through the transfer matrix t of the experience map: the pose information of the current node e_i is fused with that of the successfully closed-loop-matched node e_j, giving the update formula of the experience-map transfer matrix t

$$\Delta p_i = \alpha\left[\sum_{j=1}^{N_f}\big(p_j - p_i - \Delta p_{ij}\big) + \sum_{k=1}^{N_t}\big(p_k - p_i - \Delta p_{ki}\big)\right] \qquad (16)$$

where α is the correction rate, N_f is the number of connections between the current node e_i and other nodes, N_t is the number of connections of the successfully closed-loop-matched node, p_i and Δp_i are the current value and the change of the pose, p_j and p_k are the pose values at positions j and k, and Δp_ij and Δp_ki are the stored translation information of those links;
s5-4, as the autonomous mobile platform moves, the position cells activate the whole cognitive map; the semantic information of the cognitive nodes and the depth information of the image frames are obtained from the cognitive map, and navigation is performed from the environment information and the map information;
s5-5, finally, a new thread is started to run a global optimization that computes the optimal structure of the whole cognitive map.
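A minimal sketch of a cognitive/experience-map node and the pose relaxation suggested by equation (16); the storage convention for link offsets (offset from source node to target node) and all numeric values are assumptions:

```python
import numpy as np

class CognitiveNode:
    """Cognitive-map node e_i: pose estimate p_i, place semantic feature vector V_i and
    depth information D; the topological links L_i are kept in a separate dictionary."""
    def __init__(self, pose, feature, depth):
        self.p = np.asarray(pose, dtype=float)   # pose p_i, here simply (x, y)
        self.V = feature                         # environment image feature V_i
        self.D = depth                           # depth information D

def relax(nodes, links, alpha=0.5, iterations=50):
    """Pose correction in the spirit of equation (16): every stored link pulls the poses
    of the two connected nodes towards consistency with the recorded offset. Here
    links[(a, b)] stores the odometric offset from node a to node b, so the incoming-link
    term enters with the opposite sign of the stored offset."""
    for _ in range(iterations):
        for i, node in enumerate(nodes):
            correction = np.zeros_like(node.p)
            for (a, b), dp in links.items():
                if a == i:                        # outgoing link e_i -> e_j
                    correction += nodes[b].p - node.p - dp
                elif b == i:                      # incoming link e_k -> e_i
                    correction += nodes[a].p - node.p + dp
            node.p = node.p + alpha * correction

# three nodes in a loop; the loop-closure link (2, 0) is inconsistent with the odometry
nodes = [CognitiveNode((0.0, 0.0), None, None),
         CognitiveNode((1.0, 0.1), None, None),
         CognitiveNode((2.1, 0.0), None, None)]
links = {(0, 1): np.array([1.0, 0.0]),
         (1, 2): np.array([1.0, 0.0]),
         (2, 0): np.array([-2.0, 0.0])}          # closure: node 0 lies 2 m behind node 2
relax(nodes, links)
print([n.p for n in nodes])                      # poses after relaxation
```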
Compared with the prior art, the invention has the beneficial effects that:
the method aims at solving the problem that the traditional identification navigation method can not be accurately identified under the condition of overlarge map range or uneven scene illumination. The method uses the bionics principle to establish the empirical map, greatly reduces the parameter quantity in the map, and can store the scene map information with a larger range. Meanwhile, the robustness of detection can be improved by semantically segmenting and modeling the place information and fusing the place information with the position cells. The powerful feature extraction and learning capabilities of the convolutional neural network are fully utilized, and the problems of excessive map building calculation amount and inaccurate identification of the traditional SLAM method can be improved.
Drawings
FIG. 1 is a flowchart of an environment sensing and mapping method for an autonomous mobile platform based on bionics according to an embodiment of the present invention;
fig. 2 is a structural diagram of an improved semantic segmentation network in an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As shown in fig. 1 and fig. 2, an environment sensing and mapping method for an autonomous mobile platform based on bionics includes the following steps:
s1, collecting an automatic driving data set, sending the pictures into an improved semantic segmentation network, and training the improved semantic segmentation network;
the method aims at solving the problem that the common identification navigation method cannot accurately detect the map in the condition of overlarge map range. The method uses a cognitive map to greatly reduce the number of parameters in the map. So that a wide range of scene map information can be saved. Meanwhile, the robustness of detection can be improved by semantically segmenting modeling location information and comparing the location information with the location cells. The powerful feature extraction and learning capacity of the convolutional neural network is fully utilized, and the problems of excessive map building calculation amount and inaccurate identification of the traditional SLAM method are solved.
Specifically, pictures of roads in the automatic driving data set are collected, each road-scene picture is divided into rectangular areas of 128 × 128, the data are split into a training set, a test set and a verification set in a 6:3:1 ratio, and the pictures in the training set are fed into the convolutional neural network.
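A possible implementation of the tiling and 6:3:1 split described above; the directory layout and file naming are assumptions:

```python
import random
from pathlib import Path

import cv2

def tile_and_split(image_dir, out_dir, tile=128, ratios=(0.6, 0.3, 0.1), seed=0):
    """Cut road-scene images into tile x tile rectangles and split them 6:3:1 into
    training / test / verification sets."""
    tiles = []
    for path in sorted(Path(image_dir).glob("*.png")):
        img = cv2.imread(str(path))
        h, w = img.shape[:2]
        for y in range(0, h - tile + 1, tile):
            for x in range(0, w - tile + 1, tile):
                tiles.append((path.stem, x, y, img[y:y + tile, x:x + tile]))
    random.Random(seed).shuffle(tiles)
    n = len(tiles)
    bounds = [int(n * ratios[0]), int(n * (ratios[0] + ratios[1]))]
    splits = {"train": tiles[:bounds[0]],
              "test": tiles[bounds[0]:bounds[1]],
              "val": tiles[bounds[1]:]}
    for name, items in splits.items():
        split_dir = Path(out_dir) / name
        split_dir.mkdir(parents=True, exist_ok=True)
        for stem, x, y, patch in items:
            cv2.imwrite(str(split_dir / f"{stem}_{x}_{y}.png"), patch)
    return {k: len(v) for k, v in splits.items()}
```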
The improved convolutional neural network is used for semantic place modeling. As shown in fig. 2, the network is divided into a feature extraction layer, a feature fusion layer, a feature coding layer and a feature decoding layer. In the feature extraction stage the network uses an improved DeepLab network (an encoder-decoder network with separable convolutions), which extracts features of the image through several convolutional layers of different sizes, then applies atrous (dilated) convolution kernels of sizes 1 × 1 and 3 × 3 to extract further features, and sends the resulting convolution output through a 1 × 1 convolution layer to the feature decoding layer for pixel prediction. In the decoding stage the feature map is upsampled by a factor of 4, concatenated with the conv3 low-level features, and expanded by a factor of 4 with convolution kernels of size 3 to obtain the semantic segmentation result, from which the semantic category vector is extracted. A convolution kernel of size 1 is added to the middle convolutional layer to convert its feature map into a one-dimensional object feature vector, and the semantic segmentation vector and the object feature vector are merged to obtain the place semantic vector.
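The following compact PyTorch sketch mirrors the described structure (parallel 1 × 1 and 3 × 3 atrous branches, 1 × 1 fusion, 4× upsampling, concatenation with conv3 low-level features, and a 1 × 1 side branch that yields the one-dimensional object feature vector). The backbone, channel widths and dilation rate are assumptions, not the patent's exact network:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PlaceSegNet(nn.Module):
    """Sketch of the improved encoder-decoder: atrous 1x1/3x3 branches, 1x1 fusion,
    4x upsampling, concatenation with low-level (conv3) features, and a side branch
    that turns mid-level features into a 1-D object feature vector."""
    def __init__(self, num_classes=19, low_ch=64, high_ch=256, feat_dim=128):
        super().__init__()
        self.conv3 = nn.Sequential(nn.Conv2d(3, low_ch, 3, stride=4, padding=1), nn.ReLU())
        self.encoder = nn.Sequential(nn.Conv2d(low_ch, high_ch, 3, stride=4, padding=1), nn.ReLU())
        self.branch1 = nn.Conv2d(high_ch, high_ch, 1)                         # 1x1 branch
        self.branch3 = nn.Conv2d(high_ch, high_ch, 3, padding=6, dilation=6)  # 3x3 atrous branch
        self.project = nn.Conv2d(2 * high_ch, high_ch, 1)                     # 1x1 fusion
        self.object_vec = nn.Conv2d(high_ch, feat_dim, 1)                     # 1-D feature branch
        self.classifier = nn.Conv2d(high_ch + low_ch, num_classes, 3, padding=1)

    def forward(self, x):
        low = self.conv3(x)                                                   # conv3 low-level features
        high = self.encoder(low)
        fused = self.project(torch.cat([self.branch1(high), self.branch3(high)], dim=1))
        up = F.interpolate(fused, scale_factor=4, mode="bilinear", align_corners=False)
        logits = self.classifier(torch.cat([up, low], dim=1))
        logits = F.interpolate(logits, size=x.shape[2:], mode="bilinear", align_corners=False)
        feat = torch.flatten(F.adaptive_avg_pool2d(self.object_vec(high), 1), 1)
        return logits, feat                                                   # class scores + feature vector

net = PlaceSegNet()
logits, feat = net(torch.randn(1, 3, 128, 128))
```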
Step S2, binocular vision and IMU initialization.
S2-1, initializing binocular stereo vision: the camera uses a pinhole imaging model, and the existing binocular camera modules are jointly calibrated. The correspondence from the camera coordinate system to the world coordinate system is represented by the matrix

$$\begin{bmatrix} X_c \\ Y_c \\ Z_c \\ 1 \end{bmatrix} = \begin{bmatrix} R & t \\ \mathbf{0}^T & 1 \end{bmatrix} \begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix} \qquad (1)$$

where (Xc, Yc, Zc) is a point in the camera coordinate system, (Xw, Yw, Zw) is the same point in the world coordinate system, R is the rotation matrix of the camera coordinate system relative to the world coordinate system, and t is the translation of the camera coordinate system relative to the world coordinate system (the homogeneous transformation matrix above is written in simplified form).

The translation vector from the imaging point to the origin of the pixel coordinate system is [d_x, d_y]^T; with focal length f (in pixels), the pixel point [u, v]^T is related to the image coordinate system by

$$\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} 1/d_x & 0 & u_0 \\ 0 & 1/d_y & v_0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \qquad (2)$$

where x, y are the coordinates in the image coordinate system and u_0 and v_0 are half the length and width of the imaging plane.

The transformation between points in the camera coordinate system and points on the image plane is

$$Z_c \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} X_c \\ Y_c \\ Z_c \\ 1 \end{bmatrix} \qquad (3)$$

Combining the above formulas gives the conversion from the image coordinate system to the world coordinate system, which is used to calibrate the binocular camera:

$$Z_c \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = M_1 M_2 \begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix} = M P_w \qquad (4)$$

where M_1 is the matrix of camera intrinsic parameters, M_2 is the matrix of camera extrinsic parameters (both are simplified representations of the homogeneous transformation matrices described above), the matrix M denotes the combined intrinsic-extrinsic parameter matrix, and P_w is the homogeneous three-dimensional coordinate of the spatial point. Given the coordinates of several calibration-board points, substituting the image coordinates and world coordinates of the corresponding points yields the camera parameter matrix M, which is used to compute the spatial position of matched points of the binocular camera.
S2-2, the IMU module is the pose input module of the autonomous mobile platform environment perception system and is used to obtain the displacement information of the autonomous mobile platform. The binocular module and the IMU module are operated, and the IMU is initialized using the pose of each key frame measured by binocular visual simultaneous localization and mapping (SLAM); the parameters to be computed are the scale, the gravity direction, the velocity, the acceleration and the gyroscope bias. The IMU coordinate system is denoted B and the world coordinate system W; the IMU output is sampled at a fixed interval Δt, measuring the IMU acceleration ã_B(t) and angular velocity ω̃_WB(t), and the IMU measurement model is

$$\begin{aligned} {}_{B}\tilde{\omega}_{WB}(t) &= {}_{B}\omega'_{WB}(t) + b_g(t) + \eta_g(t) \\ {}_{B}\tilde{a}(t) &= R_{WB}^{T}(t)\,\big({}_{W}a(t) - g_W\big) + b_a(t) + \eta_a(t) \end{aligned} \qquad (5)$$

where ω'_WB(t) ∈ R³ is the true instantaneous angular velocity of the gyroscope in coordinate system B relative to the world coordinate system, b_g(t) is the slowly time-varying internal (bias) error of the gyroscope, η_g(t) is gyroscope measurement white noise, a_W(t) ∈ R³ is the instantaneous acceleration of the IMU in the W coordinate system, g_W is the gravity acceleration vector in the world coordinate system W, a'_B(t) is the true instantaneous acceleration in coordinate system B relative to the world coordinate system, R_WB^T(t) is the transpose of the rotation matrix from B to the world coordinate system W, b_a(t) is the slowly time-varying acceleration bias of the IMU, and η_a(t) is accelerometer measurement white noise.
S2-2-1, the gyroscope bias is estimated from several consecutive key frames: assuming a constant bias b_g plus a change δb_g, the change in the gyroscope bias is computed by minimizing the rotation error

$$\delta b_g = \arg\min_{\delta b_g}\ \sum_{i=1}^{N-1}\left\|\operatorname{Log}\!\Big(\big(\Delta R_{i,i+1}\operatorname{Exp}(J_{\Delta R}^{g}\,\delta b_g)\big)^{T}\, R_{BW}^{i}\, R_{WB}^{i+1}\Big)\right\|^{2} \qquad (6)$$

where N is the number of key frames, R_WB^i = R_WC^i R_CB is obtained from the camera rotation R_WC^i and the rigid-body-to-camera rotation transformation matrix R_CB, and ΔR_{i,i+1} is the pre-integrated rotation measurement between two consecutive key frames; with a zero initial value, the initial gyroscope bias that minimizes the objective is solved by Gauss-Newton iteration.
S2-2-2, the acceleration bias estimate, the scale and an accurate value of the gravity direction are solved from a linear equation between consecutive frames.
S2-2-3, with the scale and the gravity acceleration direction known, substituting into formula (5) gives R_WB and the position and velocity measurements, completing the IMU initialization.
Step S3, fusing binocular vision and the IMU.
S3-1-1, two images shot at the same time are called a frame; a group of left and right images captured by the binocular camera while the IMU data are stable is selected as the initial frame, the corresponding IMU data are recorded, and the image frames are preprocessed by distortion correction.
S3-1-2, from the pose between the two frames computed by matching, key points are extracted from the left and right images of the binocular camera, and point features are detected and described with the ORB operator. The ORB descriptor is a 256-bit binary feature vector, and an offline dictionary of key-point features is built by normalizing the feature vectors.
S3-1-3, point features are matched frame to frame, and mismatched ORB feature points are rejected with the RANSAC algorithm.
S3-1-4, the disparity of the binocular views at corresponding pixels is computed to form a disparity map, a global energy function associated with the disparity map is defined, the optimal disparity of each pixel is solved by minimizing this energy function, and depth information is obtained from the disparity between the left and right images, yielding the depth map.
Step S4, constructing the bionic cell model.
S4-1, a head-direction cell model is constructed to describe the heading angle θ of the gyroscope sensor of the autonomous mobile platform. Head-direction cells are expressed as a Gaussian model, and the head-direction transfer matrix is obtained by inputting the angle and angular offset of the gyroscope sensor. The modeled head-direction cell transfer of the velocity signal is given by equation (7), in which the signal generated by the i-th head-direction cell at time t is determined mainly by the preferred direction θ_b and the angular offset θ_i, D(t) is the head-direction transfer matrix, the preferred direction of the head-direction cells is usually set to 0°, θ_i is the offset of the i-th head-direction cell, θ_0 to θ_m are the preferred angles of the m head-direction cells with 0° < θ_i < 360°, and v(t) is the moving speed of the robot along direction θ(t) at time t.
S4-2, a stripe cell model is constructed to describe the linear velocity obtained from the accelerometer sensor. Stripe cells have a firing field resembling parallel stripes, and the model obtains the path of the current autonomous mobile platform by integration. The transfer model of the stripe cells for the displacement signal is given by equation (8): the stripe cells encode displacement along a specific direction, and their firing is a periodically resetting code responsible for path integration of the linear velocity. Defining an initial direction δ, a firing phase β and a firing period f, with v(t) the moving speed of the robot along direction θ(t) at time t, the speed component along direction δ is

$$v_\delta(t) = \cos\big(\delta - \theta(t)\big)\,v(t) \qquad (9)$$

S4-3, the movement of the current autonomous mobile platform in two-dimensional space is expressed with a grid cell model. Grid cells are cells that fire regularly as rodents move through two-dimensional space, and each firing field expresses part of the environment around a position. Direction information and displacement information are fed into the grid cell model, which is modeled as a continuous attractor network so as to generate velocity-driven grid-cell population activity. In the attractor network the neurons are arranged on a two-dimensional neural sheet; to remove the boundary of the attractor network, the neurons on the upper, lower, left and right edges of the sheet are connected, forming a two-dimensional ring (toroidal) attractor model. The neurons are coupled through local excitation and global inhibition so that activity packets form on the sheet, and the activity packets move across the sheet as velocity signals are fed in, generating the grid-cell population. The velocity-driven dynamics of grid cell i are expressed as

$$T_c\,\frac{dS_i}{dt} + S_i = f\!\left(\sum_{j} W_{ij}\,S_j + F_i\right) \qquad (10)$$

where S_i is the current activity state of the grid cell, T_c is the response time constant of the neuron, Σ_j W_ij S_j is the inhibitory input of the surrounding ring neurons to neuron i, and F_i is the forward excitatory projection of the head-direction cells onto the grid cells, determined mainly by velocity and direction. The model is expressed through the function f(x), whose independent variable x is the sum of the excitatory and inhibitory projections in equation (10) and whose form is given by equation (11). In the connection weights, W_ij is the connection weight from grid cell i to grid cell j, g is the gain that adjusts the activity of the grid-cell population, γ is initialized to 1.05β, λ_grid is the activity spacing of the grid cells, x_i is the location of grid cell i on the neural sheet, and l is the reset distance of the stripe cell model.
The head-direction cell model and the stripe cell model integrate their information into regular firing and send signals to the grid cells; after receiving this information, the grid cells form specific grid fields with fixed spacing through regular repeated firing, and these grid fields correspond to multiple activation regions of the external space.
S4-4, the current position is described by a position cell model: position cells are cells that fire at specific positions in the map space and are spatially specific; through selective firing they map a place field in the hippocampus to a specific external spatial position. When the autonomous mobile platform reaches a specific position in space, the corresponding position cell fires maximally, forming a position activation and establishing a new position point. The grid cell model is mapped to the position cells through a Hebbian neural network (equation (12)), where A is the position-cell gain constant, M_l is the number of layers of neural sheets, r is the position of the autonomous mobile platform, w_ij is the connection weight from grid cell i to position cell j, C_place is the inhibition constant of the position cells, and s_j is the firing rate of the grid cells. The relation between the position cells and the grid cells is then obtained as equation (13), where k is the learning rate, p_i is the position-cell firing model and s_j the grid-cell firing rate; equation (13) expresses the relation between the grid-cell activation rate, the grid-cell inhibition rate and the strength of the synaptic connections: the synaptic connection becomes stronger if activation exceeds inhibition, and weaker otherwise.
Step S5, cognitive map construction.
S5-1, computing the semantic segmentation result of the current frame: the current image frame is sent to the improved semantic segmentation network of step S1 to obtain the place semantic vector.
S5-2, if the place semantic feature vector does not exist in the existing place-library nodes and the depth information of the objects differs from that of the known places, it is judged that a new position has been reached and a new position cell is generated.
S5-3, if a new position cell is generated, the current position cell is turned into a node and added to the existing cognitive map, the semantic vector information of the current image is stored in the place library, and the depth information of the viewed objects is added to the current position cell. A cognitive map node is denoted e; each cognitive node contains the position-cell firing p, the environment image feature V, the topological relation L between cognitive nodes and the depth information D of the environment image:

$$e_i = \{p_i,\, V_i,\, L_i,\, D\} \qquad (14)$$

The position-cell state p_i is judged through the environment image feature V_i; when the environment image feature V_i does not match the information in the existing cognitive nodes, a new cognitive node is created:

$$e_i = \{p_i,\, V_i,\, D,\, L_i + \Delta L\} \qquad (15)$$

where ΔL is the distance between the current cognitive node and the previous cognitive node, with the displacement information coming from the odometry system.

The pose information p of each node is updated through the transfer matrix t of the experience map: the pose information of the current node e_i is fused with that of the successfully closed-loop-matched node e_j, giving the update formula of the experience-map transfer matrix t

$$\Delta p_i = \alpha\left[\sum_{j=1}^{N_f}\big(p_j - p_i - \Delta p_{ij}\big) + \sum_{k=1}^{N_t}\big(p_k - p_i - \Delta p_{ki}\big)\right] \qquad (16)$$

where α is the correction rate, N_f is the number of connections between the current node e_i and other nodes, N_t is the number of connections of the successfully closed-loop-matched node, p_i and Δp_i are the current value and the change of the pose, p_j and p_k are the pose values at positions j and k, and Δp_ij and Δp_ki are the stored translation information of those links.
S5-4, as the autonomous mobile platform moves, the position cells activate the whole cognitive map; the semantic information of the cognitive nodes and the depth information of the image frames are obtained from the cognitive map, and navigation is performed from the environment information and the map information.
S5-5, finally, a new thread is started to run a global optimization that computes the optimal structure of the whole cognitive map.
While, for purposes of simplicity of explanation, the method above has been described as a series of steps, it is to be understood and appreciated that the methodology is not limited by the order of the operations: as one skilled in the art will appreciate, some acts may occur in different orders, or concurrently with other acts, from those described herein.
The foregoing is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, various modifications and decorations can be made without departing from the principle of the present invention, and these modifications and decorations should also be regarded as the protection scope of the present invention.

Claims (10)

1. An autonomous mobile platform environment sensing and mapping method based on bionics is characterized by comprising the following steps:
s1, collecting an automatic driving data set, sending the pictures into an improved semantic segmentation network, and training the improved semantic segmentation network;
s2, initializing binocular vision and IMU combination;
s3, fusing binocular vision and IMU;
s4, constructing a bionic cell model;
and S5, constructing the cognitive map.
2. The context awareness and mapping method for the autonomous mobile platform based on bionics as claimed in claim 1, wherein: an improved convolutional neural network is used for semantic modeling and comprises a feature extraction layer, a feature fusion layer, a feature coding layer and a feature decoding layer; in the feature extraction stage, an improved DeepLab network (an encoder-decoder network with separable convolutions) extracts features of the input image through several convolutional layers, the features are further processed by atrous convolution kernels of sizes c × c and b × b, and the result is passed through a c × c convolution to the feature decoding layer for pixel prediction; in the decoding stage, the feature map is first upsampled by a factor of 4, then concatenated with the conv3 low-level features and expanded by a factor of 4 with convolution kernels of size 3 to obtain the semantic segmentation result, from which the semantic category vector is extracted; a convolution kernel of size 1 is added to the middle convolutional layer to convert its feature map into a one-dimensional object feature vector, and the semantic segmentation vector and the object feature vector are merged to obtain the semantic place vector.
3. The context awareness and mapping method for the autonomous mobile platform based on bionics as claimed in claim 2, wherein: step S1 specifically includes:
s1-1, collecting pictures of roads in the automatic driving data set, dividing the pictures into rectangular areas, and dividing a training set, a test set and a verification set according to a proportion;
and S1-2, sending the pictures in the training set into the improved semantic segmentation network, running a semantic segmentation test on the training set, and saving the set of network parameters that achieves the best segmentation performance on the data set.
4. The context awareness and mapping method for the autonomous mobile platform based on bionics as claimed in claim 3, wherein: initializing binocular vision comprises the steps of:
s2-1, carrying out combined calibration on the binocular camera modules, and specifically representing the corresponding relation from a camera coordinate system to a world coordinate system by using a matrix:
$$\begin{bmatrix} X_c \\ Y_c \\ Z_c \\ 1 \end{bmatrix} = \begin{bmatrix} R & t \\ \mathbf{0}^T & 1 \end{bmatrix} \begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix} \qquad (1)$$

wherein (Xc, Yc, Zc) is a point in the camera coordinate system, (Xw, Yw, Zw) is the same point in the world coordinate system, R is the rotation matrix of the camera coordinate system relative to the world coordinate system, and t is the translation of the camera coordinate system relative to the world coordinate system;

the translation vector from the imaging point to the origin of the pixel coordinate system is [d_x, d_y]^T; with focal length f (in pixels), the pixel point [u, v]^T is related to the image coordinate system by

$$\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} 1/d_x & 0 & u_0 \\ 0 & 1/d_y & v_0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \qquad (2)$$

wherein x, y are the coordinates in the image coordinate system and u_0 and v_0 are half the length and width of the imaging plane;

the transformation between points in the camera coordinate system and points on the image plane is

$$Z_c \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} X_c \\ Y_c \\ Z_c \\ 1 \end{bmatrix} \qquad (3)$$

combining the above formulas gives the conversion from the image coordinate system to the world coordinate system, which is used to calibrate the binocular camera:

$$Z_c \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = M_1 M_2 \begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix} = M P_w \qquad (4)$$

wherein M_1 is the matrix of camera intrinsic parameters, M_2 is the matrix of camera extrinsic parameters, the two matrices being simplified representations of the homogeneous transformation matrices described by the equations above, M denotes the combined intrinsic-extrinsic parameter matrix, and P_w is the homogeneous three-dimensional coordinate of the spatial point; given the coordinates of several calibration-board points, substituting the image coordinates and world coordinates of the corresponding points yields the camera parameter matrix M, which is used to compute the spatial position of matched points of the binocular camera;

s2-2, operating the binocular camera module and the IMU module, initializing the inertial measurement unit (IMU) using the pose of each key frame measured by binocular visual simultaneous localization and mapping (SLAM), the parameters to be computed being the scale, the gravity direction, the velocity, the acceleration and the gyroscope bias; the IMU coordinate system is denoted B and the world coordinate system W, the IMU output is sampled at a fixed interval Δt, measuring the IMU acceleration ã_B(t) and angular velocity ω̃_WB(t), and the IMU measurement model is

$$\begin{aligned} {}_{B}\tilde{\omega}_{WB}(t) &= {}_{B}\omega'_{WB}(t) + b_g(t) + \eta_g(t) \\ {}_{B}\tilde{a}(t) &= R_{WB}^{T}(t)\,\big({}_{W}a(t) - g_W\big) + b_a(t) + \eta_a(t) \end{aligned} \qquad (5)$$

wherein ω'_WB(t) ∈ R³ is the true instantaneous angular velocity of the gyroscope in coordinate system B relative to the world coordinate system, b_g(t) is the slowly time-varying internal (bias) error of the IMU gyroscope, η_g(t) is gyroscope measurement white noise, a_W(t) ∈ R³ is the instantaneous acceleration of the IMU in the W coordinate system, g_W is the gravity acceleration vector in the world coordinate system W, a'_B(t) is the true instantaneous acceleration in coordinate system B relative to the world coordinate system, R_WB^T(t) is the transpose of the rotation matrix from B to the world coordinate system W, b_a(t) is the slowly time-varying acceleration bias of the IMU, and η_a(t) is accelerometer measurement white noise.
5. The context awareness and mapping method for the autonomous mobile platform based on bionics as claimed in claim 4, wherein: the binocular camera module is the vision module, is used to collect visual image information and is fixed on the front of the autonomous mobile platform; the binocular system is initialized automatically, a homography matrix being used for planar scenes and a fundamental matrix for non-planar scenes; two image frames are selected as initial frames through matching, and the pose between the two frames is computed from the matches; the disparity of the binocular views at corresponding pixels is computed to form a disparity map, a global energy function associated with the disparity map is defined, the optimal disparity of each pixel is solved by minimizing this energy function, and depth information is obtained from the disparity between the left and right images, yielding the depth map.
6. The context awareness and mapping method for the autonomous mobile platform based on bionics of claim 5, wherein: the IMU module serves as an encoder module, is used for acquiring displacement information of the autonomous mobile platform, and is installed on the base of the autonomous mobile platform.
7. The context awareness and mapping method for the autonomous mobile platform based on bionics as claimed in claim 6, wherein: initializing the IMU includes the steps of:
s2-2-1, using several consecutive key frames to estimate the gyroscope bias, assuming, on the basis of (2), a constant b_g plus the variation δb_g;
The update amount of the gyroscope bias is calculated by minimizing the rotation error:
min over δb_g of Σ_{i=1}^{N−1} || Log( ( ΔR_{i,i+1} · Exp( J^g_ΔR · δb_g ) )^T · (R^i_WB)^T · R^{i+1}_WB ) ||²
where N is the number of key frames; R_WB denotes the rotation matrix from the IMU sensor coordinate system to the world coordinate system and is obtained as R_WB = R_WC · R_CB from the camera rotation R_WC estimated by vision and the rotation transformation matrix R_CB from the rigid body to the camera; ΔR_{i,i+1} is the pre-integrated rotation measurement between two consecutive key frames, and J^g_ΔR is its Jacobian with respect to the gyroscope bias. With an initial value of zero, the gyroscope bias corresponding to the minimum of this objective is solved by Gauss–Newton iteration (an illustrative numerical sketch follows this claim);
s2-2-2, solving a linear equation system constructed between consecutive frames to obtain the accelerometer bias estimate, the scale, and an accurate value of the gravity direction;
s2-2-3, with the scale and the gravity acceleration direction known, substituting into formula (5) to calculate R_WB and the position and velocity measurements, thereby completing the IMU initialization.
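The sketch below illustrates step S2-2-1 under standard pre-integration assumptions rather than the patent's own implementation: given the key-frame rotations from visual SLAM (already transformed through R_CB), the pre-integrated rotations and their gyroscope-bias Jacobians, the bias is found by Gauss–Newton iteration on the rotation residuals; the SO(3) helpers and the first-order Jacobian approximation are simplifications.

```python
import numpy as np

def skew(v):
    return np.array([[0., -v[2], v[1]], [v[2], 0., -v[0]], [-v[1], v[0], 0.]])

def so3_exp(phi):
    theta = np.linalg.norm(phi)
    if theta < 1e-9:
        return np.eye(3) + skew(phi)
    K = skew(phi / theta)
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * K @ K

def so3_log(R):
    c = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    theta = np.arccos(c)
    if theta < 1e-9:
        return np.zeros(3)
    return theta / (2.0 * np.sin(theta)) * np.array(
        [R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])

def estimate_gyro_bias(R_WB, dR, J_g, iters=5):
    """R_WB: N key-frame rotations from visual SLAM (body frame); dR, J_g: N-1
    pre-integrated rotations and their Jacobians with respect to the gyro bias."""
    b_g = np.zeros(3)
    for _ in range(iters):
        H, g = np.zeros((3, 3)), np.zeros(3)
        for i in range(len(dR)):
            pred = dR[i] @ so3_exp(J_g[i] @ b_g)                # bias-corrected pre-integration
            r = so3_log(pred.T @ R_WB[i].T @ R_WB[i + 1])       # rotation residual
            J = J_g[i]              # first-order approximation: d r / d b ≈ -J_g
            H += J.T @ J
            g += J.T @ r
        b_g = b_g + np.linalg.solve(H, g)                       # Gauss-Newton step
    return b_g
```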
8. The context awareness and mapping method for the autonomous mobile platform based on bionics as claimed in claim 7, wherein: step S3 includes the following steps:
s3-1-1, two images shot at the same time are called a frame; a pair of left and right images captured by the binocular camera while the IMU data are stable is selected as the initial frame, the IMU data from that moment onward are stored, and the image frames are preprocessed by distortion correction;
s3-1-2, respectively extracting key points from the left and right images of the binocular camera, detecting and describing the point features with the ORB operator, the ORB descriptor being a 256-bit binary descriptor, and establishing an offline dictionary of the key-point features from the normalized feature descriptors;
s3-1-3, carrying out frame-to-frame point feature matching, and eliminating mismatched ORB feature points by means of the Random Sample Consensus (RANSAC) algorithm (an illustrative sketch follows this claim);
s3-1-4, selecting the matched feature points in the feature map to calculate the disparity and form a disparity map, setting a global energy function related to the disparity map, and minimizing the energy function to solve for the optimal disparity of each pixel.
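A brief, hedged sketch of steps S3-1-2 and S3-1-3, using OpenCV as a stand-in implementation: ORB key points with 256-bit binary descriptors, Hamming-distance matching, and RANSAC (applied through the fundamental-matrix estimate) to reject mismatched points; the feature count and thresholds are illustrative assumptions.

```python
import cv2
import numpy as np

def match_orb_with_ransac(img_left, img_right):
    """ORB keypoints + Hamming matching + RANSAC mismatch rejection (illustrative)."""
    orb = cv2.ORB_create(nfeatures=1000)               # 256-bit binary ORB descriptors
    kp1, des1 = orb.detectAndCompute(img_left, None)
    kp2, des2 = orb.detectAndCompute(img_right, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des1, des2)
    if len(matches) < 8:                                # too few matches for RANSAC
        return kp1, kp2, matches
    pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])
    # RANSAC on the epipolar constraint discards mismatched ORB points
    F, mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 1.0, 0.99)
    if mask is None:
        return kp1, kp2, matches
    inliers = [m for m, keep in zip(matches, mask.ravel()) if keep]
    return kp1, kp2, inliers
```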
9. The context awareness and mapping method for the autonomous mobile platform based on bionics as claimed in claim 8, wherein: step S4 includes the following steps:
s4-1, describing the heading angle θ of the gyroscope sensor of the autonomous mobile platform through a head direction cell model, the head direction cells being neurons whose activity depends on head orientation and being expressed as a Gaussian model; the head direction transfer matrix is obtained by inputting the angle and the angular offset of the gyroscope sensor; modeling yields the head direction cell transfer equation for the velocity signal as:
[Gaussian head direction cell transfer model producing the head direction transfer matrix D(t); the formula is given as an image in the original]
wherein the signal generated by the i-th head direction cell at time t is determined mainly by the preferred direction θ_b and the angular offset θ_i; D(t) is the head direction transfer matrix; the preferred direction of the head direction cells is typically set to 0°; θ_i is the offset of the i-th head direction cell; θ_0 to θ_m denote the respective orientation angles of the m head direction cells, in the range 0° < θ_i < 360°; and v(t) is the moving speed of the robot along the θ(t) direction at time t;
s4-2, describing the linear velocity obtained from the accelerometer sensor with a stripe cell model, the stripe cells having firing fields resembling parallel stripes; the stripe cell model obtains the current path of the autonomous mobile platform through integration, and the transfer model of the stripe cells for the displacement signal is established as:
[Stripe cell transfer model; the formula is given as an image in the original: a periodically reset encoding of the displacement integrated along a preferred direction]
The stripe cells encode displacement in a specific direction, and their firing activity is a periodically reset encoding process responsible for the path integration of the linear velocity; an initial direction δ, a firing phase β and a firing period f are defined; the moving speed of the robot in the θ(t) direction at time t is v(t), and the speed along the δ direction is:
v_δ(t) = cos(δ − θ(t)) · v(t)    (9);
s4-3, expressing the movement of the current autonomous mobile platform in two-dimensional space with a grid cell model, grid cells being cells in rodents that fire regularly over two-dimensional space, their firing fields expressing part of the environment at given positions; the direction information and the displacement information are fed into the grid cell model, which is modeled as a continuous attractor network to generate velocity-driven grid cell population activity; in the attractor network the neurons are arranged on a two-dimensional neural sheet, and to resolve the attractor network boundary problem the neurons on the upper, lower, left and right boundaries of the sheet are connected to form a two-dimensional ring attractor model; the neurons are coupled through local excitation and global inhibition so that an activity packet forms on the sheet, and this activity packet moves across the sheet as the velocity signal is input, generating the grid cell population; the velocity-based dynamic equation of grid cell i is expressed as:
T_c · dS_i/dt + S_i = f( Σ_j w_ij · S_j + F_i )    (10)
wherein S_i is the current activity state of grid cell i; T_c is the time constant of the neuron response; Σ_j w_ij · S_j is the inhibitory input from the surrounding ring neurons to neuron i; F_i is the feed-forward excitatory projection from the head direction cells onto the grid cells, determined mainly by the velocity and the direction; the model uses a transfer function f(x), whose argument x is the sum of the excitatory and inhibitory projections in equation (10), taken as the rectification
f(x) = x for x > 0, and f(x) = 0 otherwise;
the connection weight w_ij from grid cell i to grid cell j is
w_ij = g · ( exp( −γ · || x_i − x_j − l·e_θi ||² ) − exp( −β · || x_i − x_j − l·e_θi ||² ) )
wherein g is the gain adjusting the activity of the grid cell population; γ is initialized as γ = 1.05β, with β determined by the activity spacing λ_grid of the grid cells; x_i is the location of grid cell i on the neural sheet; e_θi is the unit vector along the preferred direction of neuron i; and l is the reset distance of the stripe cell model;
The head direction cell model and the stripe cell model integrate their information to form regular firing and send signals to the grid cells; after receiving this information, the grid cells form specific grid firing fields with fixed spacing through regular, repeated firing, and these grid fields correspond to a plurality of activated regions of the external space (a compressed numerical sketch of this attractor update follows this claim);
s4-4, describing the current position through a position cell model: position cells are cells that fire at specific positions in the map space and have spatial specificity; through selective firing they map the position fields in the hippocampus to specific positions of the external space; when the autonomous mobile platform reaches a specific position in space, the corresponding position cell fires maximally to form a position activation and a new position point is established; the position cell response is modeled from the grid cell model through a Hebbian neural network as
p_j(r) = A · ( Σ_{l=1}^{M_l} Σ_i w_ij · s_i(r) − C_place )
wherein A is the position cell gain constant; M_l is the number of layers of the neural sheet; r denotes the position of the autonomous mobile platform; w_ij is the connection weight from grid cell i to position cell j; C_place is the inhibition constant of the position cells; and s_i denotes the firing rate of grid cell i; the relation between the position cells and the grid cells is obtained as:
[Formula (13), given as an image in the original: Hebbian update of the grid-to-position-cell connection weight with learning rate k, strengthening the synapse when the grid cell activation exceeds the inhibition and weakening it otherwise]
where k is the learning rate, p_i is the position cell firing model and s_j is the grid cell firing rate; equation (13) relates the grid cell activation rate, the grid cell inhibition rate and the synaptic connection strength: the synaptic connection becomes stronger if the activation is greater than the inhibition, and weaker otherwise.
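A compressed numerical sketch of the step S4-3 attractor dynamics, under assumptions in the spirit of the connectivity described above: the sheet size N, the spacing lam_grid, the random preferred directions and the simplified feed-forward drive are illustrative, and the directional weight offset that shifts the activity packet with motion is omitted for brevity.

```python
import numpy as np

N = 32                                   # grid cells on an N x N periodic neural sheet
tau, dt = 10.0, 1.0                      # neuron time constant T_c and Euler step
lam_grid = 13.0                          # assumed grid activity spacing (in neurons)
beta = 3.0 / lam_grid ** 2
gamma = 1.05 * beta                      # gamma = 1.05 * beta, as described above

coords = np.stack(np.meshgrid(np.arange(N), np.arange(N)),
                  axis=-1).reshape(-1, 2).astype(float)
prefs = 2.0 * np.pi * np.random.rand(N * N)   # assumed preferred directions of the neurons

d = np.abs(coords[:, None, :] - coords[None, :, :])
d = np.minimum(d, N - d)                 # boundary neurons wrapped: two-dimensional ring attractor
d2 = np.sum(d * d, axis=-1)
W = np.exp(-gamma * d2) - np.exp(-beta * d2)  # difference-of-Gaussians (center-surround) weights

def step(S, speed, heading, g=1.0):
    """One Euler step of T_c*dS_i/dt + S_i = f(sum_j w_ij*S_j + F_i)."""
    F = 1.0 + g * speed * np.cos(heading - prefs)  # feed-forward head-direction/speed drive F_i
    x = W @ S + F
    return S + dt / tau * (np.maximum(x, 0.0) - S)  # f(x): rectification

S = 0.01 * np.random.rand(N * N)
for _ in range(500):                     # relax the network so activity packets can form
    S = step(S, speed=0.0, heading=0.0)
```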
10. The context awareness and mapping method for the autonomous mobile platform based on bionics of claim 9, wherein: step S5 includes the following steps:
s5-1, obtaining the semantic segmentation result of the current frame: sending the image frame to the improved semantic segmentation network in the step S1 to obtain a place semantic vector;
s5-2, if the place semantic feature vector does not exist in the existing cognitive map nodes and the depth information of the image frame differs from that of the known places, judging that a new position has been reached and generating a new position cell;
s5-3, if a new position cell is generated, converting the current position cell into a node and adding it to the existing cognitive map, storing the semantic vector information of the current image in the position library, and adding the depth information of the viewed object to the current position cell (see the sketch after this claim); the cognitive map network nodes are denoted by e, and each cognitive node comprises the position cell firing p, the environment image features V, the topological relation L between cognitive nodes and the depth information D of the environment image:
e_i = { p_i, V_i, L_i, D }    (14)
The position cell state p_i is judged through the environment image features V_i; when the environment image features V_i do not match the information in the existing cognitive nodes, a new cognitive node is created:
e_i = { p_i, V_i, D, L_i + ΔL }    (15)
wherein ΔL is the distance between the current cognitive node and the previous cognitive node, and the displacement information comes from the odometry system;
the pose information p of each node is updated through the transfer matrix t of the experience map; the pose information of the current node e_i is fused with the pose information of the node e_j matched by a successful loop closure detection, and the update formula of the experience map transfer matrix t is:
Δp_i = α · [ Σ_{j=1}^{N_f} ( p_j − p_i − Δp_ij ) + Σ_{k=1}^{N_t} ( p_k − p_i − Δp_ki ) ]    (16)
where α is the correction rate; N_f is the number of connections from the current node e_i to other nodes; N_t is the number of connections of the node matched by the successful loop closure detection; p_i and Δp_i are the current pose value and its correction, respectively; p_j and p_k are the pose values at positions j and k, respectively; and Δp_ij and Δp_ki are the stored relative translation information between the corresponding nodes;
s5-4, obtaining the cognitive node semantic information and the depth information of the image frame from the cognitive map, and navigating according to the environment information and the map information;
and S5-5, starting a new thread to execute global optimization and optimize the overall structure of the cognitive map.
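To make the bookkeeping of steps S5-2 and S5-3 concrete, a hedged sketch follows; the CognitiveNode layout, the similarity thresholds and the simplified relaxation step (in the spirit of formula (16)) are assumptions rather than the claimed implementation.

```python
from dataclasses import dataclass, field
import numpy as np

@dataclass
class CognitiveNode:
    pose: np.ndarray                 # p_i : pose estimate of the node
    semantic: np.ndarray             # V_i : place semantic feature vector
    depth: np.ndarray                # D   : depth information of the view
    links: list = field(default_factory=list)   # L_i : list of (neighbour index, relative pose)

def is_new_place(nodes, semantic, depth, sem_thresh=0.8, depth_thresh=0.5):
    """A new position cell / cognitive node is created when no stored place matches."""
    for n in nodes:
        sim = float(semantic @ n.semantic) / (
            np.linalg.norm(semantic) * np.linalg.norm(n.semantic) + 1e-9)
        if sim > sem_thresh and np.mean(np.abs(depth - n.depth)) < depth_thresh:
            return False
    return True

def relax_node(nodes, i, alpha=0.5):
    """One relaxation step in the spirit of formula (16): pull node i towards the poses
    implied by its stored links after a successful loop closure."""
    correction = np.zeros_like(nodes[i].pose)
    for j, delta_p in nodes[i].links:
        correction += nodes[j].pose - nodes[i].pose - delta_p
    nodes[i].pose = nodes[i].pose + alpha * correction
```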
CN202011521855.0A 2020-12-21 2020-12-21 Bionic-based autonomous mobile platform environment sensing and mapping method Pending CN112509051A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011521855.0A CN112509051A (en) 2020-12-21 2020-12-21 Bionic-based autonomous mobile platform environment sensing and mapping method

Publications (1)

Publication Number Publication Date
CN112509051A (en) 2021-03-16

Family

ID=74921888

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011521855.0A Pending CN112509051A (en) 2020-12-21 2020-12-21 Bionic-based autonomous mobile platform environment sensing and mapping method

Country Status (1)

Country Link
CN (1) CN112509051A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180046153A1 (en) * 2016-07-10 2018-02-15 Beijing University Of Technology Method of Constructing Navigation Map by Robot using Mouse Hippocampal Place Cell Model
CN109668566A (en) * 2018-12-05 2019-04-23 大连理工大学 Robot scene cognition map construction and navigation method based on mouse brain positioning cells
CN111462135A (en) * 2020-03-31 2020-07-28 华东理工大学 Semantic mapping method based on visual SLAM and two-dimensional semantic segmentation
CN111376273A (en) * 2020-04-23 2020-07-07 大连理工大学 Brain-like inspired robot cognitive map construction method
CN111583136A (en) * 2020-04-25 2020-08-25 华南理工大学 Method for simultaneously positioning and establishing image of autonomous mobile platform in rescue scene

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WU XINSHENG et al.: "A Semantic Segmentation Algorithm Based on Improved Attention Mechanism", IEEE, pages 244-248 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114612767A (en) * 2022-03-11 2022-06-10 电子科技大学 Scene graph-based image understanding and expressing method, system and storage medium
CN114812565A (en) * 2022-06-23 2022-07-29 北京航空航天大学 Dynamic navigation method based on artificial intelligence network
CN116542859A (en) * 2023-07-06 2023-08-04 武汉船舶职业技术学院 Intelligent generation method of building structure column image thumbnail for intelligent construction
CN116542859B (en) * 2023-07-06 2023-09-01 武汉船舶职业技术学院 Intelligent generation method of building structure column image thumbnail for intelligent construction
CN117723048A (en) * 2023-12-18 2024-03-19 哈尔滨工业大学 Multi-robot compressed communication collaborative mapping method and system under communication limitation

Similar Documents

Publication Publication Date Title
Zhou et al. To learn or not to learn: Visual localization from essential matrices
Dong et al. 4D crop monitoring: Spatio-temporal reconstruction for agriculture
CN110827415B (en) All-weather unknown environment unmanned autonomous working platform
US10372968B2 (en) Object-focused active three-dimensional reconstruction
CN113345018B (en) Laser monocular vision fusion positioning mapping method in dynamic scene
CN112509051A (en) Bionic-based autonomous mobile platform environment sensing and mapping method
Paz et al. Probabilistic semantic mapping for urban autonomous driving applications
CN113674416A (en) Three-dimensional map construction method and device, electronic equipment and storage medium
CN111998862A (en) Dense binocular SLAM method based on BNN
Yin et al. Automerge: A framework for map assembling and smoothing in city-scale environments
Saleem et al. Neural network-based recent research developments in SLAM for autonomous ground vehicles: A review
Chiu et al. Sub-meter vehicle navigation using efficient pre-mapped visual landmarks
Zhuang et al. A biologically-inspired simultaneous localization and mapping system based on LiDAR sensor
CN115690343A (en) Robot laser radar scanning and mapping method based on visual following
Billy et al. Adaptive SLAM with synthetic stereo dataset generation for real-time dense 3D reconstruction
Sujiwo et al. Robust and accurate monocular vision-based localization in outdoor environments of real-world robot challenge
Liu et al. Laser 3D tightly coupled mapping method based on visual information
Yang et al. A Leader-following Method Based on Binocular Stereo Vision For Quadruped Robots
Li et al. RF-LOAM: Robust and Fast LiDAR Odometry and Mapping in Urban Dynamic Environment
Le Barz et al. Absolute geo-localization thanks to Hidden Markov Model and exemplar-based metric learning
Lu Image-based localization for self-driving vehicles based on online network adjustment in a dynamic scope
Mo Towards a fast, robust and accurate visual-inertial simultaneous localization and mapping system
Lee et al. Autonomous Vehicle Localization Without Prior High-Definition Map
CN117557599B (en) 3D moving object tracking method and system and storage medium
Song et al. Recalling direct 2d-3d matches for large-scale visual localization

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
AD01 Patent right deemed abandoned (Effective date of abandoning: 20240507)