WO2019228195A1 - Method and apparatus for perceiving spatial environment - Google Patents

Method and apparatus for perceiving spatial environment

Info

Publication number
WO2019228195A1
Authority
WO
WIPO (PCT)
Prior art keywords
module
image
blurred image
residual
model
Prior art date
Application number
PCT/CN2019/087272
Other languages
French (fr)
Chinese (zh)
Inventor
李军
张文强
缪弘
Original Assignee
中兴通讯股份有限公司 (ZTE Corporation)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corporation (中兴通讯股份有限公司)
Publication of WO2019228195A1 publication Critical patent/WO2019228195A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06T 5/60
    • G06T 5/73
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N 23/60 Control of cameras or camera modules
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N 23/60 Control of cameras or camera modules
    • H04N 23/68 Control of cameras or camera modules for stable pick-up of the scene, e.g. compensating for camera body vibrations
    • H04N 23/682 Vibration or motion blur correction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N 23/80 Camera processing pipelines; Components thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20084 Artificial neural networks [ANN]

Definitions

  • This application relates to, but is not limited to, the field of computers.
  • Simultaneous Localization And Mapping (SLAM) is a technology that uses various sensors to perceive a device's own position and the surrounding environment.
  • A visual SLAM system recovers the current pose of the sensor and the three-dimensional structure of the current scene from the information obtained by the visual sensor, and uses them for localization and map construction.
  • During operation of a SLAM system, image blur caused by camera shake is difficult to avoid and is a common problem that degrades SLAM performance.
  • In a visual SLAM system, image blur caused by camera shake can cause tracking loss and reduce overall system efficiency.
  • A typical SLAM system has no dedicated anti-shake subsystem.
  • When the camera shakes and the image becomes blurred, image tracking often fails.
  • When image tracking fails, the camera has to be stopped or moved back to capture a clear image, a search and match over the entire map is performed to relocalize the current position, and the camera can only continue moving once tracking succeeds again.
  • A global map search is a time-consuming operation; if it is triggered frequently because of blurred images, overall operating efficiency suffers. At the same time, stopping or moving the camera back makes the whole run discontinuous and harms smoothness.
  • A method for sensing a spatial environment is provided, including: determining a first blurred image that causes a tracking failure of a SLAM system; analyzing the first blurred image using a first model to obtain a first clear image, wherein the first model is obtained through machine learning training using multiple sets of data, and each set of data in the multiple sets of data includes a blurred image and a clear image corresponding to the blurred image; and perceiving the spatial environment according to the first clear image.
  • A spatial environment sensing apparatus is provided, including: a determining module for determining a first blurred image that causes a tracking failure of the SLAM system; an obtaining module for analyzing the first blurred image using a first model to obtain a first clear image, wherein the first model is obtained through machine learning training using multiple sets of data, and each set of data in the multiple sets of data includes a blurred image and a clear image corresponding to the blurred image; and a sensing module for perceiving the spatial environment according to the first clear image.
  • A storage medium is further provided, on which a computer program is stored.
  • When the computer program is run by a processor, the processor executes the method for sensing a spatial environment according to the present application.
  • An electronic device is further provided, including a memory and a processor, where a computer program is stored in the memory, and when the processor runs the computer program, the processor executes the method for sensing a spatial environment according to the present application.
  • FIG. 1 is a flowchart of a method for sensing a space environment according to an embodiment of the present application
  • FIG. 2 is a schematic structural diagram of a neural network according to an embodiment of the present application.
  • FIG. 3 is a schematic structural diagram of an autoencoder according to an embodiment of the present application.
  • FIG. 4 is an output schematic diagram of a residual module according to an embodiment of the present application.
  • FIG. 5 is a schematic flowchart of an overall deblurring method according to an embodiment of the present application.
  • FIG. 6 is a specific flowchart of deblurring a blurred image according to an embodiment of the present application.
  • This application can be applied to a SLAM system, but is not limited thereto; it can also be applied to various other application scenarios in which a clear image is recovered from a blurred image and the clear image is then used.
  • This application can use (but is not limited to) the PyTorch framework for training and testing.
  • Various neural network frameworks can support this application, such as TensorFlow, MXNet, and so on.
  • FIG. 1 is a flowchart of a method for sensing a space environment according to an embodiment of the present application.
  • the method for sensing a spatial environment includes steps S102 to S106.
  • In step S102, a first blurred image that causes tracking failure of the SLAM system is determined.
  • In step S104, the first blurred image is analyzed using the first model to obtain a first clear image.
  • The first model may be obtained through machine learning training using multiple sets of data, and each set of data in the multiple sets of data includes a blurred image and a clear image corresponding to the blurred image.
  • The first clear image may be an image obtained after analyzing the first blurred image.
  • In step S106, a spatial environment is perceived based on the first clear image.
  • The problem of SLAM system tracking failure caused by capturing a blurred image is thereby solved.
  • The blurred image is analyzed to obtain a clear image, and the spatial environment is subsequently perceived using the obtained clear image.
  • The subject executing the method for sensing the spatial environment according to the embodiments of the present application may be a terminal using a SLAM system, such as a sweeping robot or an exploring robot, but is not limited thereto.
  • Before step S104, it is necessary to determine that this is the first time the first blurred image is analyzed within a predetermined period of time.
  • If, within the predetermined period of time, the first blurred image has already been analyzed using the method of the present application but the SLAM tracking failure was not resolved, the cause of the failure may not be the first blurred image, or the processing of the first blurred image may have been ineffective; in either case, there is no need to analyze the first blurred image again within the predetermined time period.
  • Besides the predetermined time period, other conditions may be used to decide whether the first blurred image is being analyzed for the first time; for example, after the terminal using the SLAM system is started, it may be determined that the first blurred image is analyzed for the first time.
  • In an embodiment, the first model includes an input layer, an autoencoder, and an output layer.
  • Step S104 may include: receiving the first blurred image through the input layer and outputting a feature map to the autoencoder; and receiving, through the output layer, the feature map processed by the autoencoder and outputting the first clear image according to the processed feature map.
  • the first model may be a neural network structure.
  • the autoencoder includes an initial residual module, an encoding module, and a decoding module.
  • the encoding module includes multiple downsampling layers and multiple residual modules
  • the decoding module includes multiple upsampling layers and multiple residual modules.
  • the output of the residual module in the encoding module and the output of the residual module of the same size in the decoding module are added as the output of the decoding module.
  • The residual module includes multiple convolution groups with different dilation rates, and each convolution group includes two convolution layers.
  • In the two convolution layers, the number of channels is first reduced and then increased; that is, the two convolution layers form an hourglass shape.
  • The dilation rate of the convolution operations is different in each convolution group.
  • The first model is obtained by training on the multiple sets of data with a decaying learning rate.
  • the core of the embodiment of the present application is a visual SLAM anti-shake system based on deep learning.
  • The principle of the anti-shake system is to use a neural network to deblur a single image.
  • the input of the system is a blurred image
  • the output is a corresponding clear image.
  • In the structure of the neural network, the residual network is combined with the network structure of an autoencoder, and multiple layers of convolutions with different dilation rates are combined at the same time.
  • Such a network structure ensures the effectiveness of the algorithm while greatly reducing the number of network parameters and speeding up the algorithm.
  • The entire neural network is trained offline on a public data set, and can obtain a clearer image from a blurred image in a short time for use by the subsequent SLAM algorithm.
  • The anti-shake system does not run all the time; it is started to process the blurred image only when the SLAM system fails to track due to camera shake, which prevents the additional system from affecting the efficiency of the SLAM system during normal operation.
  • FIG. 2 is a schematic structural diagram of a neural network according to an embodiment of the present application.
  • the neural network includes an input layer, two auto-encoder modules, and an output layer.
  • the input to the neural network can be a 3-channel RGB blurred image, and the output of the neural network is a corresponding clear image.
  • The input layer can be a 5×5 convolution layer, which accepts a 3-channel RGB image as input and outputs a 64-channel feature map.
  • The output layer can be a 5×5 convolution layer, which accepts a 64-channel feature map as input and outputs a 3-channel RGB image as the system output.
  • the loss function of the neural network is the mean square error function, which measures the mean square error between the model output and the real clear image.
  • FIG. 3 is a schematic structural diagram of an autoencoder according to an embodiment of the present application.
  • the autoencoder includes an initial residual module, an encoding module, and a decoding module.
  • The encoding module includes 4 downsampling layers and 4 residual modules, and each downsampling layer uses a 2×2 max pooling layer.
  • The decoding module includes 4 upsampling layers and 4 residual modules, and each upsampling layer uses nearest-neighbor upsampling.
  • Spatial (skip) connections are used in the autoencoder to improve the result and shorten the training time.
  • The spatial connection adds the output of a residual module in the encoder module to the output of the residual module of the same size in the decoder module, and the sum serves as the decoder output.
  • FIG. 4 is a schematic diagram of an output of a residual module according to an embodiment of the present application.
  • The residual module includes multiple convolution groups with different dilation rates.
  • Each convolution group contains two convolution layers. These two convolution layers can form an hourglass shape, that is, the number of channels first decreases and then increases. The dilation rate of the convolution operations differs between convolution groups.
  • The residual module adds the results of the multiple convolution groups to the input to obtain the output.
  • This neural network combines the structures of the residual network and the autoencoder, and uses multiple convolution groups with different dilation rates inside the residual modules, which reduces the number of network layers and increases the network's speed while preserving the result quality.
  • An embodiment of the present application further provides a neural network training method, including: collecting training data (step one); training on a training set until convergence (step two); and testing on a test set to verify the result (step three). Here, convergence can be taken to mean that the loss reaches a preset value.
  • the public image deblurring dataset can be used.
  • The data set uses images taken by a high-speed camera as clear images, and averages multiple adjacent images to obtain the corresponding blurred images. In this way, a large number of blurred-clear image pairs are obtained to form the data set.
  • In step two, because the original images in the data set are large, a 256×256 image patch can be cropped from the original image each time as the training input.
  • A learning-rate decay scheme is used during training: a relatively large learning rate is used at first so that training quickly reaches a good result, and the learning rate is then reduced to further refine the result and approach the optimal solution as closely as possible.
  • In step three, when testing the result on the test set, the test set should have no intersection with the training set. Based on the results on the test set, the network's hyperparameters can be adjusted if necessary and step two can be repeated for retraining.
  • The hyperparameters may include the learning rate described above.
  • When the SLAM system keeps tracking successfully, it is assumed that no camera shake has blurred the image, so the anti-shake system is not activated. When the SLAM system fails to track, it is assumed that image blur due to camera shake has occurred, so the anti-shake system is started: the neural network deblurs the blurred image to obtain the corresponding clear image, which is returned to the SLAM system for tracking. Once the SLAM system tracks successfully again, the anti-shake system is shut down.
  • the advantage of this method is that when the SLAM system can successfully track normally, there is no extra overhead. Only when the SLAM system fails to track due to camera shake, the anti-shake system is started to process the blurred image. This can prevent the efficiency of the SLAM system from being affected by an additional system.
  • the tracking module in the visual SLAM system needs to extract feature points from the current image frame and match them with the map points obtained previously to obtain the pose estimation of the current camera.
  • When the camera shakes, the captured image becomes blurred, so feature points cannot be extracted or their quality deteriorates; matching then fails and SLAM tracking becomes invalid.
  • When the spatial environment sensing method of the present application is applied to a scene where tracking fails because of image blur, the blurred image can be deblurred to obtain a clear image so that tracking can continue.
  • FIG. 5 is a schematic flowchart of an overall deblurring method according to an embodiment of the present application.
  • The overall deblurring method may include: inputting the current image frame captured by the camera into the system (i.e., the step of obtaining the current frame); extracting feature points of the current frame, matching them against the map points retained by the system, and obtaining the pose estimate of the current camera using the PnP algorithm (i.e., the tracking step); if the current frame is tracked successfully, continuing with the subsequent steps of the SLAM algorithm; and if tracking fails, further judging whether the anti-shake system needs to be started to process the current frame.
  • In the further judgment, it is determined whether this is the first tracking failure: if it is the first failure, the anti-shake system is started for deblurring; if not, tracking could not succeed even after the anti-shake system ran, which may be because the failure was not caused by image blur or because the anti-shake result was poor, so there is no need to start the anti-shake system again.
  • FIG. 6 is a specific flowchart of deblurring a blurred image according to an embodiment of the present application.
  • The specific process of deblurring the blurred image may include: inputting the current blurred frame into the neural network; running the neural network to process the current frame; and obtaining the current clear frame as the output of the neural network and returning it to the tracking module in the SLAM system to continue tracking.
  • the technical solution of the present application can be applied to the SLAM system and combined with the tracking module in the SLAM system to improve the overall smoothness of operation.
  • The technical solution of this application may also be combined with other systems that require a camera, to handle image blur caused by shaking and improve the quality of captured images.
  • the smoothness of the overall operation of the SLAM system can be improved.
  • When image blur caused by camera shake occurs, the SLAM system fails to track.
  • In the related art, the camera then needs to stop moving or retreat briefly while a global search is performed to relocalize; if tracking failures occur frequently, the smoothness of system operation is seriously affected.
  • the blurred image can be deblurred to make it clear, so that it can be used for tracking in SLAM, reducing the number of tracking failures in SLAM, and improving the overall fluency.
  • The computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions for causing a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to execute the methods described in the embodiments of the present application.
  • A spatial environment sensing apparatus is further provided, which is used to implement the spatial environment sensing methods according to the embodiments of the application; descriptions already given are not repeated.
  • The term "module" may refer to a combination of software and/or hardware that implements a predetermined function.
  • Although the devices described in the following embodiments may be implemented as software modules, implementations in hardware, or in a combination of software and hardware, are also possible and conceived.
  • An apparatus for sensing a spatial environment includes: a determining module for determining a first blurred image that causes a tracking failure of a SLAM system; an obtaining module for analyzing the first blurred image using a first model to obtain a first clear image, wherein the first model is obtained through machine learning training using multiple sets of data, each set of data in the multiple sets of data including a blurred image and a clear image corresponding to the blurred image; and a perception module for sensing the spatial environment according to the first clear image.
  • the determining module may be further configured to determine that the first blurred image is analyzed for the first time within a predetermined period of time.
  • the first model includes an input layer, an auto-encoder, and an output layer
  • The acquisition module is configured to receive the first blurred image through the input layer, output a feature map to the autoencoder, receive through the output layer the feature map processed by the autoencoder, and output the first clear image according to the processed feature map.
  • the autoencoder includes an initial residual module, an encoding module, and a decoding module.
  • the encoding module includes multiple downsampling layers and multiple residual modules
  • the decoding module includes multiple upsampling layers and multiple residual modules.
  • the output of the residual module in the encoding module and the output of the residual module of the same size in the decoding module are added as the output of the decoding module.
  • The residual module includes multiple convolution groups with different dilation rates, and each convolution group includes two convolution layers.
  • In the two convolution layers, the number of channels is first reduced and then increased; that is, the two convolution layers form an hourglass shape.
  • The dilation rate of the convolution operations is different in each convolution group.
  • The first model is obtained by training on the multiple sets of data with a decaying learning rate.
  • Each of the foregoing modules may be implemented by software, hardware, or a combination of software and hardware; the foregoing modules may all be implemented by the same processor, or each module and any combination thereof may be implemented by different processors.
  • An embodiment of the present application further provides a storage medium on which a computer program is stored.
  • the processor executes a method for sensing a space environment according to the embodiments of the present application.
  • An embodiment of the present application further provides an electronic device including a memory and a processor, where a computer program is stored in the memory, and when the processor runs the computer program, the processor executes the method for sensing a spatial environment according to the embodiments of the present application.
  • The modules or steps of the present application may be implemented by a general-purpose computing device; they may be concentrated on a single computing device or distributed over a network composed of multiple computing devices.
  • Each module or step of the present application may be implemented by program code executable by a computing device, so that the program code may be stored in a storage device and executed by the computing device; in some cases, the steps shown or described may be performed in an order different from the one described here, or each module or step may be implemented as a separate integrated circuit module, or multiple modules or steps may be combined into a single integrated circuit module. As such, this application is not limited to any particular combination of hardware and software.

Abstract

Provided are a method and an apparatus for perceiving a spatial environment. The method comprises: determining a first blurred image which causes a SLAM system tracking failure; using a first model to analyze the first blurred image so as to acquire a first sharp image, wherein the first model is obtained through machine learning training based on multiple groups of data; and perceiving the spatial environment according to the first sharp image.

Description

Method and Apparatus for Sensing a Spatial Environment
Technical Field
This application relates to, but is not limited to, the field of computers.
Background
In the related art, Simultaneous Localization And Mapping (SLAM) is a technology that uses various sensors to perceive a device's own position and the surrounding environment.
A visual SLAM system recovers the current pose of the sensor and the three-dimensional structure of the current scene from the information obtained by the visual sensor, and uses them for localization and map construction. During operation of a SLAM system, image blur caused by camera shake is difficult to avoid and is a common problem that degrades SLAM performance. In a visual SLAM system, image blur caused by camera shake can cause tracking loss and reduce overall system efficiency.
At present, a typical SLAM system has no dedicated anti-shake subsystem. When the camera shakes and the image becomes blurred, image tracking often fails. When an image tracking failure is encountered, the camera has to be stopped or moved back to capture a clear image, and a search and match over the entire map is performed to relocalize the current position; the camera can only continue moving once tracking succeeds again.
Therefore, whenever camera-shake blur occurs, the camera must stop moving or retreat briefly, and a global map search is required. A global map search is a time-consuming operation; if it is triggered frequently because of blurred images, overall operating efficiency suffers. At the same time, stopping or moving the camera back makes the whole run discontinuous and harms smoothness.
Summary of the Invention
According to an embodiment of the present application, a method for sensing a spatial environment is provided, including: determining a first blurred image that causes a tracking failure of a SLAM system; analyzing the first blurred image using a first model to obtain a first clear image, wherein the first model is obtained through machine learning training using multiple sets of data, and each set of data in the multiple sets of data includes a blurred image and a clear image corresponding to the blurred image; and perceiving the spatial environment according to the first clear image.
According to another embodiment of the present application, a spatial environment sensing apparatus is further provided, including: a determining module for determining a first blurred image that causes a tracking failure of the SLAM system; an obtaining module for analyzing the first blurred image using a first model to obtain a first clear image, wherein the first model is obtained through machine learning training using multiple sets of data, and each set of data in the multiple sets of data includes a blurred image and a clear image corresponding to the blurred image; and a sensing module for perceiving the spatial environment according to the first clear image.
According to yet another embodiment of the present application, a storage medium is further provided, on which a computer program is stored; when the computer program is run by a processor, the processor executes the method for sensing a spatial environment according to the present application.
According to yet another embodiment of the present application, an electronic device is further provided, including a memory and a processor, where a computer program is stored in the memory, and when the processor runs the computer program, the processor executes the method for sensing a spatial environment according to the present application.
Brief Description of the Drawings
The drawings described herein are provided for a further understanding of the present application and constitute a part of the present application; the schematic embodiments of the present application and their descriptions are used to explain the present application and do not constitute an improper limitation on the present application. In the drawings:
FIG. 1 is a flowchart of a method for sensing a spatial environment according to an embodiment of the present application;
FIG. 2 is a schematic structural diagram of a neural network according to an embodiment of the present application;
FIG. 3 is a schematic structural diagram of an autoencoder according to an embodiment of the present application;
FIG. 4 is a schematic diagram of the output of a residual module according to an embodiment of the present application;
FIG. 5 is a schematic flowchart of an overall deblurring method according to an embodiment of the present application; and
FIG. 6 is a specific flowchart of deblurring a blurred image according to an embodiment of the present application.
Detailed Description of the Embodiments
Hereinafter, the present application will be described in detail with reference to the drawings and in conjunction with embodiments. It should be noted that, where no conflict arises, the embodiments in the present application and the features in the embodiments can be combined with each other.
It should be noted that the terms "first", "second", and the like in the specification, the claims, and the drawings of the present application are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence.
The technical solution of this application can be applied to a SLAM system, but is not limited thereto; it can also be applied to various other application scenarios in which a clear image is recovered from a blurred image and the clear image is then used. In terms of software architecture, this application can use (but is not limited to) the PyTorch framework for training and testing; various neural network frameworks, such as TensorFlow and MXNet, can support this application.
FIG. 1 is a flowchart of a method for sensing a spatial environment according to an embodiment of the present application.
As shown in FIG. 1, the method for sensing a spatial environment according to an embodiment of the present application includes steps S102 to S106.
In step S102, a first blurred image that causes tracking failure of the SLAM system is determined.
In step S104, the first blurred image is analyzed using a first model to obtain a first clear image. The first model may be obtained through machine learning training using multiple sets of data, and each set of data in the multiple sets of data includes a blurred image and a clear image corresponding to the blurred image. The first clear image may be an image obtained after analyzing the first blurred image.
In step S106, a spatial environment is perceived based on the first clear image.
According to the spatial environment sensing method of the present application, the problem of SLAM tracking failure caused by a captured blurred image is solved: the blurred image is analyzed to obtain a clear image, and the clear image is then used to perceive the spatial environment.
According to the embodiments of the present application, the subject executing the method for sensing the spatial environment may be a terminal using a SLAM system, such as a sweeping robot or an exploring robot, but is not limited thereto.
In an embodiment, before step S104, it is necessary to determine that this is the first time the first blurred image is analyzed within a predetermined period of time.
If, within the predetermined period of time, the first blurred image has already been analyzed using the method of the present application but the SLAM tracking failure was not resolved, the cause of the failure may not be the first blurred image, or the processing of the first blurred image may have been ineffective; in either case, there is no need to analyze the first blurred image again within the predetermined time period.
Besides the predetermined time period, other conditions may be used to decide whether the first blurred image is being analyzed for the first time; for example, after the terminal using the SLAM system is started, it may be determined that the first blurred image is analyzed for the first time.
In an embodiment, the first model includes an input layer, an autoencoder, and an output layer, and step S104 may include: receiving the first blurred image through the input layer and outputting a feature map to the autoencoder; and receiving, through the output layer, the feature map processed by the autoencoder and outputting the first clear image according to the processed feature map. The first model may be a neural network structure.
In an embodiment, the autoencoder includes an initial residual module, an encoding module, and a decoding module. The encoding module includes multiple downsampling layers and multiple residual modules, and the decoding module includes multiple upsampling layers and multiple residual modules.
In the autoencoder, the output of a residual module in the encoding module is added to the output of the residual module of the same size in the decoding module, and the sum serves as the output of the decoding module.
In an embodiment, the residual module includes multiple convolution groups with different dilation rates, and each convolution group includes two convolution layers; in the two convolution layers, the number of channels is first reduced and then increased, that is, the two convolution layers form an hourglass shape.
In an embodiment, the dilation rate of the convolution operations is different in each convolution group.
In an embodiment, the first model is obtained by training on the multiple sets of data with a decaying learning rate.
The core of the embodiments of the present application is a deep-learning-based visual SLAM anti-shake system. The principle of the anti-shake system is to use a neural network to deblur a single image: the input of the system is a blurred image, and the output is the corresponding clear image. In the structure of the neural network, the residual network is combined with the network structure of an autoencoder, and multiple layers of convolutions with different dilation rates are combined at the same time. Such a network structure ensures the effectiveness of the algorithm while greatly reducing the number of network parameters and speeding up the algorithm. The entire neural network is trained offline on a public data set and can obtain a clearer image from a blurred image in a short time for use by the subsequent SLAM algorithm.
Meanwhile, to preserve the overall operating efficiency of the SLAM algorithm, the anti-shake system does not run all the time; it is started to process the blurred image only when the SLAM system fails to track due to camera shake, which prevents the additional system from affecting the efficiency of the SLAM system during normal operation.
FIG. 2 is a schematic structural diagram of a neural network according to an embodiment of the present application.
As shown in FIG. 2, the neural network includes an input layer, two autoencoder modules, and an output layer.
The input to the neural network can be a 3-channel RGB blurred image, and the output of the neural network is the corresponding clear image.
The input layer can be a 5×5 convolution layer, which accepts a 3-channel RGB image as input and outputs a 64-channel feature map.
The output layer can be a 5×5 convolution layer, which accepts a 64-channel feature map as input and outputs a 3-channel RGB image as the system output.
The loss function of the neural network is the mean squared error function, which measures the mean squared error between the model output and the ground-truth clear image.
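By way of illustration, the overall structure described above can be sketched in PyTorch (named earlier as one possible framework). This is a minimal sketch rather than the exact implementation: the class name, padding, and parameter defaults are assumptions, and the two autoencoder modules are passed in as arguments (one possible form of such a module is sketched after the description of FIG. 3).

```python
import torch
import torch.nn as nn

class DeblurNet(nn.Module):
    """Input layer -> two stacked autoencoder modules -> output layer."""
    def __init__(self, autoencoder1: nn.Module, autoencoder2: nn.Module, channels: int = 64):
        super().__init__()
        # Input layer: 5x5 convolution, 3-channel RGB image in, 64-channel feature map out.
        self.input_layer = nn.Conv2d(3, channels, kernel_size=5, padding=2)
        self.autoencoder1 = autoencoder1
        self.autoencoder2 = autoencoder2
        # Output layer: 5x5 convolution, 64-channel feature map in, 3-channel RGB image out.
        self.output_layer = nn.Conv2d(channels, 3, kernel_size=5, padding=2)

    def forward(self, blurred: torch.Tensor) -> torch.Tensor:
        feat = self.input_layer(blurred)
        feat = self.autoencoder1(feat)
        feat = self.autoencoder2(feat)
        return self.output_layer(feat)

# Loss: mean squared error between the network output and the ground-truth clear image.
criterion = nn.MSELoss()
```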
FIG. 3 is a schematic structural diagram of an autoencoder according to an embodiment of the present application.
As shown in FIG. 3, the autoencoder includes an initial residual module, an encoding module, and a decoding module. The encoding module includes 4 downsampling layers and 4 residual modules, and each downsampling layer uses a 2×2 max pooling layer. Correspondingly, the decoding module includes 4 upsampling layers and 4 residual modules, and each upsampling layer uses nearest-neighbor upsampling.
Spatial (skip) connections are used in the autoencoder to improve the result and shorten the training time. The spatial connection adds the output of a residual module in the encoder module to the output of the residual module of the same size in the decoder module, and the sum serves as the decoder output.
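A minimal sketch of such an autoencoder module follows. It assumes that, at each encoder level, the residual module runs before the 2×2 max pooling (the exact ordering within a level is not fixed by the description above), and it uses a ResidualModule block of the kind sketched after the description of FIG. 4.

```python
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    """Initial residual module, 4-level encoder (max pooling) and 4-level decoder
    (nearest-neighbour upsampling) with additive skip connections between
    same-sized residual module outputs."""
    def __init__(self, channels: int = 64, depth: int = 4):
        super().__init__()
        self.initial = ResidualModule(channels)
        self.enc_res = nn.ModuleList([ResidualModule(channels) for _ in range(depth)])
        self.down = nn.ModuleList([nn.MaxPool2d(kernel_size=2) for _ in range(depth)])
        self.up = nn.ModuleList([nn.Upsample(scale_factor=2, mode="nearest") for _ in range(depth)])
        self.dec_res = nn.ModuleList([ResidualModule(channels) for _ in range(depth)])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.initial(x)
        skips = []
        for res, pool in zip(self.enc_res, self.down):
            x = res(x)
            skips.append(x)      # keep the encoder residual output for the skip connection
            x = pool(x)
        for up, res in zip(self.up, self.dec_res):
            x = up(x)
            # Add the encoder residual output of the same spatial size to the
            # decoder residual output, and use the sum as the decoder output.
            x = res(x) + skips.pop()
        return x
```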
FIG. 4 is a schematic diagram of the output of a residual module according to an embodiment of the present application.
As shown in FIG. 4, the residual module includes multiple convolution groups with different dilation rates. Each convolution group contains two convolution layers; these two convolution layers can form an hourglass shape, that is, the number of channels first decreases and then increases. The dilation rate of the convolution operations differs between groups. The residual module adds the results of the multiple convolution groups to the input to obtain the output.
The advantage of this neural network is that it combines the structures of the residual network and the autoencoder, and uses multiple convolution groups with different dilation rates inside the residual modules, which reduces the number of network layers and increases the network's speed while preserving the result quality. At the same time, the modular design makes it easy to adjust the network structure.
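The residual module could be sketched as follows. The number of groups, the specific dilation rates, the kernel size, the activation function, and the intermediate channel count are illustrative assumptions; only the overall pattern, namely parallel two-layer hourglass groups with different dilation rates whose results are added to the input, follows the description above.

```python
import torch
import torch.nn as nn

class ResidualModule(nn.Module):
    """Several parallel two-layer 'hourglass' convolution groups, each with its own
    dilation rate; the group outputs are added to the input to form the output."""
    def __init__(self, channels: int = 64, mid_channels: int = 16, dilations=(1, 2, 4)):
        super().__init__()
        self.groups = nn.ModuleList([
            nn.Sequential(
                # First layer narrows the channel count (the waist of the hourglass).
                nn.Conv2d(channels, mid_channels, kernel_size=3, padding=d, dilation=d),
                nn.ReLU(inplace=True),
                # Second layer widens it back to the original channel count.
                nn.Conv2d(mid_channels, channels, kernel_size=3, padding=d, dilation=d),
            )
            for d in dilations
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Sum the results of all convolution groups and add the input (residual connection).
        return x + sum(group(x) for group in self.groups)
```

Dilated convolutions enlarge the receptive field without extra downsampling, which is one reason such a structure can stay shallow while preserving the result quality.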
An embodiment of the present application further provides a neural network training method, including: collecting training data (step one); training on a training set until convergence (step two); and testing on a test set to verify the result (step three). Here, convergence can be taken to mean that the loss reaches a preset value.
In step one, a public image deblurring data set can be used. The data set uses images taken by a high-speed camera as clear images, and averages multiple adjacent images to obtain the corresponding blurred images; in this way, a large number of blurred-clear image pairs are obtained to form the data set.
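A sketch of how such a blurred-clear pair can be synthesized from high-speed camera frames is shown below; the window size (how many adjacent frames are averaged) is an assumed value, not one fixed by this application.

```python
import numpy as np

def make_blur_sharp_pair(frames, center, window=7):
    """frames: list of consecutive H x W x 3 uint8 frames from a high-speed camera.
    The middle frame is used as the sharp image; the average of the surrounding
    frames is used as the corresponding synthetic blurred image."""
    half = window // 2
    sharp = frames[center]
    stack = np.stack(frames[center - half:center + half + 1]).astype(np.float32)
    blurred = stack.mean(axis=0).round().astype(np.uint8)
    return blurred, sharp
```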
In step two, because the original images in the data set are large, a 256×256 image patch can be cropped from the original image each time as the training input. A learning-rate decay scheme is used during training: a relatively large learning rate is used at first so that training quickly reaches a good result, and the learning rate is then reduced to further refine the result and approach the optimal solution as closely as possible.
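A minimal training loop with step-wise learning-rate decay might look as follows. The optimizer, the initial learning rate, the decay schedule, and the number of epochs are assumptions, and the data loader is assumed to yield random 256×256 blurred/clear patch pairs.

```python
import torch

def train(model, train_loader, epochs=100, lr=1e-4, decay_every=30, gamma=0.5, device="cuda"):
    """Train the deblurring network with MSE loss and a decaying learning rate."""
    model = model.to(device).train()
    criterion = torch.nn.MSELoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=decay_every, gamma=gamma)
    for _ in range(epochs):
        for blurred, sharp in train_loader:   # random 256x256 patches cropped from the originals
            blurred, sharp = blurred.to(device), sharp.to(device)
            optimizer.zero_grad()
            loss = criterion(model(blurred), sharp)
            loss.backward()
            optimizer.step()
        scheduler.step()                      # start large, then reduce the learning rate
    return model
```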
In step three, when testing the result on the test set, the test set should have no intersection with the training set. Based on the results on the test set, the network's hyperparameters (for example, the learning rate described above) can be adjusted if necessary, and step two can be repeated for retraining.
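Step three can be as simple as measuring the average reconstruction error on the held-out test set, as in the sketch below; using the training loss (mean squared error) as the test metric is an illustrative choice.

```python
import torch

@torch.no_grad()
def evaluate(model, test_loader, device="cuda"):
    """Average MSE of the deblurred output against the ground-truth clear images."""
    model = model.to(device).eval()
    criterion = torch.nn.MSELoss()
    total, batches = 0.0, 0
    for blurred, sharp in test_loader:        # test set disjoint from the training set
        blurred, sharp = blurred.to(device), sharp.to(device)
        total += criterion(model(blurred), sharp).item()
        batches += 1
    return total / max(batches, 1)
```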
In an embodiment, when the SLAM system keeps tracking successfully, it is assumed that no camera shake has blurred the image, so the anti-shake system is not activated. When the SLAM system fails to track, it is assumed that image blur due to camera shake has occurred, so the anti-shake system is started: the neural network deblurs the blurred image to obtain the corresponding clear image, which is returned to the SLAM system for tracking. Once the SLAM system tracks successfully again, the anti-shake system is shut down.
The advantage of this approach is that there is no extra overhead while the SLAM system tracks normally; the anti-shake system is started to process the blurred image only when the SLAM system fails to track due to camera shake, which prevents the additional system from affecting the efficiency of the SLAM system during normal operation.
The tracking module in the visual SLAM system needs to extract feature points from the current image frame and match them against the previously obtained map points to obtain the pose estimate of the current camera. When the camera shakes, the captured image becomes blurred, so feature points cannot be extracted or their quality deteriorates; matching then fails and SLAM tracking becomes invalid. When the spatial environment sensing method of the present application is applied to a scene where tracking fails because of image blur, the blurred image can be deblurred to obtain a clear image so that tracking can continue.
FIG. 5 is a schematic flowchart of an overall deblurring method according to an embodiment of the present application.
As shown in FIG. 5, the overall deblurring method according to an embodiment of the present application may include: inputting the current image frame captured by the camera into the system (i.e., the step of obtaining the current frame); extracting feature points of the current frame, matching them against the map points retained by the system, and obtaining the pose estimate of the current camera using the PnP algorithm (i.e., the tracking step); if the current frame is tracked successfully, continuing with the subsequent steps of the SLAM algorithm; and if tracking fails, further judging whether the anti-shake system needs to be started to process the current frame.
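The tracking step is not tied to a particular feature type or solver. As one possible realization for illustration (using OpenCV, which is an assumed choice), ORB features of the current frame could be matched against the stored map points and the camera pose estimated with a RANSAC PnP solver:

```python
import cv2
import numpy as np

def track_frame(frame_gray, map_points_3d, map_descriptors, camera_matrix, dist_coeffs=None):
    """Extract features from the current frame, match them against map points, and
    estimate the current camera pose with PnP. Returns (success, rvec, tvec)."""
    orb = cv2.ORB_create()
    keypoints, descriptors = orb.detectAndCompute(frame_gray, None)
    if descriptors is None or len(keypoints) < 10:
        return False, None, None          # too few features, e.g. because the frame is blurred
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(map_descriptors, descriptors)
    if len(matches) < 10:
        return False, None, None          # matching failed, tracking is lost
    object_points = np.float32([map_points_3d[m.queryIdx] for m in matches])
    image_points = np.float32([keypoints[m.trainIdx].pt for m in matches])
    ok, rvec, tvec, _ = cv2.solvePnPRansac(object_points, image_points, camera_matrix, dist_coeffs)
    return bool(ok), rvec, tvec
```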
In the further judgment, it is determined whether this is the first tracking failure: if it is the first failure, the anti-shake system is started for deblurring; if not, tracking could not succeed even after the anti-shake system ran, which may be because the failure was not caused by image blur or because the anti-shake result was poor, and in that case there is no need to start the anti-shake system again.
FIG. 6 is a specific flowchart of deblurring a blurred image according to an embodiment of the present application.
As shown in FIG. 6, the specific process of deblurring the blurred image according to an embodiment of the present application may include: inputting the current blurred frame into the neural network; running the neural network to process the current frame; and obtaining the current clear frame as the output of the neural network and returning it to the tracking module in the SLAM system to continue tracking.
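Putting FIG. 5 and FIG. 6 together, the per-frame control flow could be sketched as follows. The slam.track and slam.relocalize calls stand for a hypothetical SLAM interface assumed here for illustration; the deblurring network is invoked only on the first tracking failure for a frame.

```python
import torch

def process_frame(frame, slam, deblur_net, device="cuda"):
    """frame: H x W x 3 uint8 image from the camera. Returns True if tracking succeeded."""
    if slam.track(frame):                     # normal case: tracking succeeds, no extra cost
        return True
    # First tracking failure for this frame: assume camera-shake blur and deblur it.
    with torch.no_grad():
        x = torch.from_numpy(frame).permute(2, 0, 1).float().div(255.0).unsqueeze(0).to(device)
        restored = deblur_net(x).clamp(0.0, 1.0)
    clear_frame = (restored.squeeze(0).permute(1, 2, 0).cpu().numpy() * 255.0).astype("uint8")
    if slam.track(clear_frame):               # return the clear frame to the tracking module
        return True
    # Still failing: the cause is probably not blur, so do not run the anti-shake system again;
    # fall back to the SLAM system's usual global relocalization.
    slam.relocalize()
    return False
```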
The technical solution of the present application can be applied to a SLAM system and combined with the tracking module in the SLAM system to improve the overall smoothness of operation. At the same time, the technical solution of this application may also be combined with other systems that require a camera, to handle image blur caused by shaking and improve the quality of captured images.
According to the technical solution of the present application, the smoothness of the overall operation of the SLAM system can be improved. When image blur caused by camera shake occurs, the SLAM system fails to track; in the related art, the camera then needs to stop moving or retreat briefly while a global search is performed to relocalize, and frequent tracking failures seriously affect the smoothness of system operation. According to the technical solution of the present application, the blurred image can be deblurred and made clear so that it can be used for tracking in SLAM, which reduces the number of tracking failures and improves overall fluency.
Through the description of the above embodiments, those skilled in the art can clearly understand that the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, and of course also by hardware or a combination of hardware and software. Based on this understanding, the technical solution of the present application can be embodied in the form of a software product; the computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions for causing a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to execute the methods described in the embodiments of the present application.
According to the embodiments of the present application, a spatial environment sensing apparatus is further provided, which is used to implement the spatial environment sensing methods according to the embodiments of the application; descriptions already given are not repeated.
As used below, the term "module" may refer to a combination of software and/or hardware that implements a predetermined function. Although the devices described in the following embodiments may be implemented as software modules, implementations in hardware, or in a combination of software and hardware, are also possible and conceived.
A spatial environment sensing apparatus according to an embodiment of the present application includes: a determining module for determining a first blurred image that causes a tracking failure of a SLAM system; an obtaining module for analyzing the first blurred image using a first model to obtain a first clear image, wherein the first model is obtained through machine learning training using multiple sets of data, each set of data in the multiple sets of data including a blurred image and a clear image corresponding to the blurred image; and a perception module for sensing the spatial environment according to the first clear image.
The determining module may be further configured to determine that the first blurred image is analyzed for the first time within a predetermined period of time.
In an embodiment, the first model includes an input layer, an autoencoder, and an output layer, and the obtaining module is configured to receive the first blurred image through the input layer, output a feature map to the autoencoder, receive through the output layer the feature map processed by the autoencoder, and output the first clear image according to the processed feature map.
In an embodiment, the autoencoder includes an initial residual module, an encoding module, and a decoding module. The encoding module includes multiple downsampling layers and multiple residual modules, and the decoding module includes multiple upsampling layers and multiple residual modules.
In the autoencoder, the output of a residual module in the encoding module is added to the output of the residual module of the same size in the decoding module, and the sum serves as the output of the decoding module.
In an embodiment, the residual module includes multiple convolution groups with different dilation rates, and each convolution group includes two convolution layers; in the two convolution layers, the number of channels is first reduced and then increased, that is, the two convolution layers form an hourglass shape.
In an embodiment, the dilation rate of the convolution operations is different in each convolution group.
In an embodiment, the first model is obtained by training on the multiple sets of data with a decaying learning rate.
It should be noted that each of the foregoing modules may be implemented by software, hardware, or a combination of software and hardware; the foregoing modules may all be implemented by the same processor, or each module and any combination thereof may be implemented by different processors.
本申请的实施例还提供一种存储介质,其上存储有计算机程序,所述计算机程序被处理器运行时,所述处理器执行根据本申请各实施例的空间环境的感知方法。An embodiment of the present application further provides a storage medium on which a computer program is stored. When the computer program is executed by a processor, the processor executes a method for sensing a space environment according to the embodiments of the present application.
本申请的实施例还提供一种电子装置,包括存储器和处理器,所述存储器中存储有计算机程序,所述处理器运行所述计算机程序时,所述处理器执行根据本申请各实施例的空间环境的感知方法。An embodiment of the present application further provides an electronic device including a memory and a processor, where the computer program is stored in the memory, and when the processor runs the computer program, the processor executes the program according to the embodiments of the present application. Perception of space environment.
显然,本领域的技术人员应该明白,上述的本申请的各模块或各步骤可以用通用的计算装置来实现,它们可以集中在单个的计算装置上,或者分布在多个计算装置所组成的网络上。本申请的各模块或各步骤可以用计算装置可执行的程序代码来实现,从而,可以将程序代码存储在存储装置中由计算装置来执行,并且在某些情况下,可以以不同于此处的顺序执行所示出或描述的步骤,或者将各个模块或各步骤分别实现为各个集成电路模块,或者将它们中的多个模块或步骤制作成单个集成电路模块来实现。这样,本申请不限制于任何特定的硬件和软件结合。Obviously, those skilled in the art should understand that the above-mentioned modules or steps of the present application may be implemented by a general-purpose computing device, and they may be concentrated on a single computing device or distributed on a network composed of multiple computing devices. on. Each module or each step of the present application may be implemented by a program code executable by a computing device, so that the program code may be stored in a storage device and executed by the computing device, and in some cases, may be different from here The steps shown or described are performed sequentially, or each module or step is implemented as each integrated circuit module, or multiple modules or steps in them are made into a single integrated circuit module for implementation. As such, this application is not limited to any particular combination of hardware and software.
The above are only embodiments of the present application and are not intended to limit the present application; those skilled in the art may make various modifications and variations. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present application shall fall within the protection scope of the present application.

Claims (18)

  1. A method for perceiving a spatial environment, comprising:
    determining a first blurred image that causes tracking of a simultaneous localization and mapping (SLAM) system to fail;
    analyzing the first blurred image by using a first model to obtain a first clear image, wherein the first model is obtained through machine learning training using multiple sets of data, and each of the multiple sets of data comprises a blurred image and a clear image corresponding to the blurred image; and
    perceiving a spatial environment according to the first clear image.
  2. The method according to claim 1, wherein, before the step of analyzing the first blurred image by using the first model to obtain the first clear image, the method further comprises:
    determining that the first blurred image is being analyzed for the first time within a predetermined period of time.
  3. The method according to claim 1, wherein the first model comprises an input layer, an autoencoder, and an output layer, and
    the step of analyzing the first blurred image by using the first model to obtain the first clear image comprises:
    receiving the first blurred image through the input layer, and outputting a feature map to the autoencoder; and
    receiving, through the output layer, the feature map processed by the autoencoder, and outputting the first clear image according to the processed feature map.
  4. The method according to claim 3, wherein the autoencoder comprises an initial residual module, an encoding module, and a decoding module, and
    wherein the encoding module comprises a plurality of downsampling layers and a plurality of residual modules, and the decoding module comprises a plurality of upsampling layers and a plurality of residual modules.
  5. The method according to claim 4, wherein, in the autoencoder, the output of a residual module in the encoding module is added to the output of the residual module of the same size in the decoding module, and the sum serves as the output of the decoding module.
  6. The method according to claim 4, wherein the residual module comprises a plurality of convolution groups with different dilation rates, each convolution group comprises two convolutional layers, and in the two convolutional layers, the number of channels first decreases and then increases.
  7. The method according to claim 6, wherein the dilation rate of the convolution operations differs from one convolution group to another.
  8. The method according to claim 1, wherein the first model is obtained by training with learning rate decay on the multiple sets of data.
  9. An apparatus for perceiving a spatial environment, comprising:
    a determining module, configured to determine a first blurred image that causes tracking of a simultaneous localization and mapping (SLAM) system to fail;
    an obtaining module, configured to analyze the first blurred image by using a first model to obtain a first clear image, wherein the first model is obtained through machine learning training using multiple sets of data, and each of the multiple sets of data comprises a blurred image and a clear image corresponding to the blurred image; and
    a perceiving module, configured to perceive a spatial environment according to the first clear image.
  10. The apparatus according to claim 9, wherein the determining module is further configured to determine that the first blurred image is being analyzed for the first time within a predetermined period of time.
  11. The apparatus according to claim 9, wherein the first model comprises an input layer, an autoencoder, and an output layer, and
    the obtaining module is configured to:
    receive the first blurred image through the input layer, and output a feature map to the autoencoder; and
    receive, through the output layer, the feature map processed by the autoencoder, and output the first clear image according to the processed feature map.
  12. The apparatus according to claim 11, wherein the autoencoder comprises an initial residual module, an encoding module, and a decoding module, and
    wherein the encoding module comprises a plurality of downsampling layers and a plurality of residual modules, and the decoding module comprises a plurality of upsampling layers and a plurality of residual modules.
  13. The apparatus according to claim 12, wherein, in the autoencoder, the output of a residual module in the encoding module is added to the output of the residual module of the same size in the decoding module, and the sum serves as the output of the decoding module.
  14. The apparatus according to claim 12, wherein the residual module comprises a plurality of convolution groups with different dilation rates, each convolution group comprises two convolutional layers, and in the two convolutional layers, the number of channels first decreases and then increases.
  15. The apparatus according to claim 13, wherein the dilation rate of the convolution operations differs from one convolution group to another.
  16. The apparatus according to claim 9, wherein the first model is obtained by training with learning rate decay on the multiple sets of data.
  17. A storage medium storing a computer program, wherein, when the computer program is run by a processor, the processor executes the method for perceiving a spatial environment according to any one of claims 1 to 8.
  18. An electronic device, comprising a memory and a processor, wherein a computer program is stored in the memory, and when the processor runs the computer program, the processor executes the method for perceiving a spatial environment according to any one of claims 1 to 8.
PCT/CN2019/087272 2018-05-28 2019-05-16 Method and apparatus for perceiving spatial environment WO2019228195A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810522285.3 2018-05-28
CN201810522285.3A CN110545373B (en) 2018-05-28 2018-05-28 Spatial environment sensing method and device

Publications (1)

Publication Number Publication Date
WO2019228195A1 true WO2019228195A1 (en) 2019-12-05

Family

ID=68697852

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/087272 WO2019228195A1 (en) 2018-05-28 2019-05-16 Method and apparatus for perceiving spatial environment

Country Status (2)

Country Link
CN (1) CN110545373B (en)
WO (1) WO2019228195A1 (en)


Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102589470A (en) * 2012-02-14 2012-07-18 大闽食品(漳州)有限公司 Fuzzy-neural-network-based tea leaf appearance quality quantification method
US20140307950A1 (en) * 2013-04-13 2014-10-16 Microsoft Corporation Image deblurring
CN103926933A (en) * 2014-03-29 2014-07-16 北京航空航天大学 Indoor simultaneous locating and environment modeling method for unmanned aerial vehicle
CN109945844B (en) * 2014-05-05 2021-03-12 赫克斯冈技术中心 Measurement subsystem and measurement system
CN106372663B (en) * 2016-08-30 2019-09-10 北京小米移动软件有限公司 Construct the method and device of disaggregated model
CN106447626B (en) * 2016-09-07 2019-06-07 华中科技大学 A kind of fuzzy core size estimation method and system based on deep learning
CN106952239A (en) * 2017-03-28 2017-07-14 厦门幻世网络科技有限公司 image generating method and device
CN107689034B (en) * 2017-08-16 2020-12-01 清华-伯克利深圳学院筹备办公室 Denoising method and denoising device
CN107948510B (en) * 2017-11-27 2020-04-07 北京小米移动软件有限公司 Focal length adjusting method and device and storage medium

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104680491A (en) * 2015-02-28 2015-06-03 西安交通大学 Non-uniform image motion blur removing method based on deep neural network
US20180121767A1 (en) * 2016-11-02 2018-05-03 Adobe Systems Incorporated Video deblurring using neural networks
CN106530256A (en) * 2016-11-18 2017-03-22 四川长虹电器股份有限公司 Improved-deep-learning-based intelligent camera image blind super-resolution system
CN106920227A (en) * 2016-12-27 2017-07-04 北京工业大学 Based on the Segmentation Method of Retinal Blood Vessels that deep learning is combined with conventional method

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112202696A (en) * 2020-10-12 2021-01-08 青岛科技大学 Underwater sound signal automatic modulation identification method based on fuzzy self-encoder

Also Published As

Publication number Publication date
CN110545373B (en) 2021-12-28
CN110545373A (en) 2019-12-06

Similar Documents

Publication Publication Date Title
US10573018B2 (en) Three dimensional scene reconstruction based on contextual analysis
US20190043216A1 (en) Information processing apparatus and estimating method for estimating line-of-sight direction of person, and learning apparatus and learning method
US9390475B2 (en) Backlight detection method and device
CN110008806B (en) Information processing device, learning processing method, learning device, and object recognition device
EP3816929A1 (en) Method and apparatus for restoring image
CN108958473A (en) Eyeball tracking method, electronic device and non-transient computer-readable recording medium
CN111695421B (en) Image recognition method and device and electronic equipment
WO2012177166A1 (en) An efficient approach to estimate disparity map
CN109005334A (en) A kind of imaging method, device, terminal and storage medium
JP2018170003A (en) Detection device and method for event in video, and image processor
CN110705353A (en) Method and device for identifying face to be shielded based on attention mechanism
CN113724379B (en) Three-dimensional reconstruction method and device for fusing image and laser point cloud
CN111325798A (en) Camera model correction method and device, AR implementation equipment and readable storage medium
WO2021101732A1 (en) Joint rolling shutter correction and image deblurring
CN110673607B (en) Feature point extraction method and device under dynamic scene and terminal equipment
WO2019228195A1 (en) Method and apparatus for perceiving spatial environment
WO2020044630A1 (en) Detector generation device, monitoring device, detector generation method, and detector generation program
CN116977876A (en) Unmanned aerial vehicle image processing method, system and medium
CN113409331B (en) Image processing method, image processing device, terminal and readable storage medium
CN114743090A (en) Open type building block splicing prompting method and device, electronic equipment and storage medium
JP6962450B2 (en) Image processing equipment, image processing methods, and programs
CN114120423A (en) Face image detection method and device, electronic equipment and computer readable medium
CN108764110B (en) Recursive false detection verification method, system and equipment based on HOG characteristic pedestrian detector
CN114511591B (en) Track tracking method and device, electronic equipment and storage medium
WO2020044629A1 (en) Detector generation device, monitoring device, detector generation method, and detector generation program

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19810596

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 20-04-2021)

122 Ep: pct application non-entry in european phase

Ref document number: 19810596

Country of ref document: EP

Kind code of ref document: A1