GB2609676A - A method and a system for detecting free space region in surroundings of autonomous objects - Google Patents

A method and a system for detecting free space region in surroundings of autonomous objects

Info

Publication number
GB2609676A
GB2609676A GB2114542.0A GB202114542A GB2609676A GB 2609676 A GB2609676 A GB 2609676A GB 202114542 A GB202114542 A GB 202114542A GB 2609676 A GB2609676 A GB 2609676A
Authority
GB
United Kingdom
Prior art keywords
real
free space
space region
time
scene
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
GB2114542.0A
Other versions
GB202114542D0 (en)
Inventor
Kannan Srividhya
Hegde K Sneha
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Continental Automotive GmbH
Original Assignee
Continental Automotive GmbH
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Continental Automotive GmbH filed Critical Continental Automotive GmbH
Publication of GB202114542D0 publication Critical patent/GB202114542D0/en
Publication of GB2609676A publication Critical patent/GB2609676A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/56 Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/58 Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/803 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of input or preprocessed data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10028 Range image; Depth image; 3D point clouds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30248 Vehicle exterior or interior
    • G06T2207/30252 Vehicle exterior; Vicinity of vehicle

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

The present disclosure relates to a method for detecting a free space region in the surroundings of autonomous objects 101. The method is performed by a detection system 106 and comprises receiving a real-time image 112 and real-time point cloud data 103 of a scene ahead of the autonomous object 101. The scene comprises at least one of one or more obstacles and a free space region. Further, the method comprises determining a variance for each pixel of the real-time image 112, based on information in the plurality of pixels. Furthermore, the method comprises mapping the real-time image 112 with the real-time point cloud data 103. Thereafter, the method comprises providing the real-time mapped image (404, fig. 5, not shown) and the variance (402) to a neural network (501). The neural network determines the free space region in the surroundings of the autonomous object 101.

Description

A METHOD AND A SYSTEM FOR DETECTING FREE SPACE REGION IN
SURROUNDINGS OF AUTONOMOUS OBJECTS
TECHNICAL FIELD
[1] The present disclosure generally relates to autonomous objects. More particularly, the present disclosure relates to a method and a system for detecting a free space region in surroundings of autonomous objects.
BACKGROUND
[2] Autonomous objects such as autonomous vehicles, autonomous robots, and the like are intelligent machines capable of performing tasks by themselves, without explicit human control. Autonomous objects must perceive their environment in order to advance through it, and they require comprehensive perception of static and moving obstacles. Obstacles such as trees, buildings, poles, automobiles, people, animals, holes, and the like may be in the surroundings of the autonomous objects. A free space region has to be detected in the surroundings of the autonomous objects. Detection of the free space aids in the safe movement of the autonomous objects.
[3] Conventional techniques for detecting the free space region use images from a camera. The images from the camera are two-dimensional. The free space region determined from the images may not be accurate when there are shadows on a road, reflections on the road, poor weather and lighting conditions, and the like. Further, some of the conventional techniques use point cloud data for detecting the free space region. However, using point cloud data alone may not be accurate, since point cloud data is sparse in nature. Data is considered sparse when certain values in a dataset are missing. When the data is sparse, the accuracy of detecting the free space region is reduced. There is a need for an improved system that can detect the free space region accurately.
[4] The information disclosed in this background of the disclosure section is only for enhancement of understanding of the general background of the invention and should not be taken as an acknowledgement or any form of suggestion that this information forms the prior art already known to a person skilled in the art.
SUMMARY
[5] In an embodiment, the present disclosure discloses a method for detecting a free space region in surroundings of autonomous objects. The method comprises receiving a real-time image of a scene and real-time point cloud data of the scene ahead of an autonomous object. The scene comprises at least one of, one or more obstacles and a free space region for movement of the autonomous object. Further, the method comprises determining a variance for each pixel from a plurality of pixels of the real-time image. The variance is determined based on information in the plurality of pixels. Furthermore, the method comprises mapping the real-time image with the real-time point cloud data, to obtain a real-time mapped image of the scene. Thereafter, the method comprises providing the real-time mapped image and the variance to a neural network. The neural network determines the free space region in the surroundings of the autonomous object.
[6] In an embodiment, the present disclosure discloses a detection system for detecting a free space region in surroundings of autonomous objects. The detection system comprises one or more processors and a memory. The one or more processors are configured to receive a real-time image of a scene and real-time point cloud data of the scene ahead of an autonomous object. The scene comprises at least one of, one or more obstacles and a free space region for movement of the autonomous object. Further, the one or more processors are configured to determine a variance for each pixel from a plurality of pixels of the real-time image. The variance is determined based on information in the plurality of pixels. Furthermore, the one or more processors are configured to map the real-time image with the real-time point cloud data, to obtain a real-time mapped image of the scene. Thereafter, the one or more processors are configured to provide the real-time mapped image and the variance, to a neural network. The neural network determines the free space region in the surroundings of the autonomous object.
[7] As used in this summary, in the description below, in the claims below, and in the accompanying drawings, the term "autonomous object" is defined as an entity that functions without human direction, moving freely and interacting with humans and other objects to perform a specific task. For example, the autonomous object may be a vehicle, a robot, and the like.
[8] As used in this summary, in the description below, in the claims below, and in the accompanying drawings, the term "free space region" is defined as a region without any obstacle. The free space region is the region ahead of the autonomous object, where the autonomous object may potentially move. For example, when the autonomous object is a vehicle, the free space region may be an area on a road without any obstacles. In another example, when the autonomous object is a robot implemented in a store, the free space region may be a space between aisles in the store.
[009] As used in this summary, in the description below, in the claims below, and in the accompanying drawings, the term "at least" followed by a number is used to denote the start of a range beginning with that number (which may be a range having an upper limit or no upper limit, depending on the variable being defined). For example, "at least one" means one or more than one.
[0010] As used in this summary, in the description below, in the claims below, and in the accompanying drawings, the term "scene" is defined as a view of what a person or camera actually sees in the real world ahead of the autonomous object. A scene is distinguishable from an image because an image is a picture of what a person or camera actually sees in the real world that is captured, for example, by a camera. The scene may comprise at least one of, the one or more obstacles and the free space region. For example, when the autonomous object is a vehicle, the scene may comprise the free space region, a tree, a building, a pole, a junction, and the like. In another example, when the autonomous object is an industrial robot, the scene may comprise people, machines, racks, and the like.
[0011] As used in this summary, in the description below, in the claims below, and in the accompanying drawings, the term "real-time image" is defined as a picture of the scene ahead of the autonomous object. A scene is distinguishable from an image because an image is a picture of what a person or camera actually sees in the real world that is captured, for example, by a camera. The real-time image indicates objects in the surroundings of the autonomous object. For example, when the autonomous object is a vehicle, the real-time image may comprise other vehicles, people, buildings, and the like. In another example, when the autonomous object is a robot equipped for household maintenance, the real-time image may comprise electronic appliances, furniture, and the like.
[0012] As used in this summary, in the description below, in the claims below, and in the accompanying drawings, the term "real-time point cloud data" is defined as a data of plurality of three-dimensional points representing coordinates of the one or more obstacles and the free space region in the scene.
[0013] As used in this summary, in the description below, in the claims below, and in the accompanying drawings, the term "mapping" is defined as transformation of the real-time point cloud data to the real-time image. Each three-dimensional point in the real-time point cloud data is transformed to a two-dimensional point in the real-time image.
[0014] The foregoing summary is illustrative only and is not intended to be in any way limiting. In addition to the illustrative aspects, embodiments, and features described above, further aspects, embodiments, and features will become apparent by reference to the drawings and the following detailed description.
BRIEF DESCRIPTION OF THE ACCOMPANYING DRAWINGS
[0015] The novel features and characteristics of the disclosure are set forth in the appended claims. The disclosure itself, however, as well as a preferred mode of use, further objectives, and advantages thereof, will best be understood by reference to the following detailed description of an illustrative embodiment when read in conjunction with the accompanying figures. One or more embodiments are now described, by way of example only, with reference to the accompanying figures, wherein like reference numerals represent like elements and in which:
[0016] Figure 1 illustrates an exemplary environment for detecting a free space region in surroundings of autonomous objects, in accordance with some embodiments of the present disclosure;
[0017] Figure 2 illustrates an internal architecture of a detection system for detecting a free space region in surroundings of autonomous objects, in accordance with some embodiments of the present disclosure;
[0018] Figure 3 shows an exemplary flow chart illustrating method steps for detecting a free space region in surroundings of autonomous objects, in accordance with some embodiments of the present disclosure;
[0019] Figure 4A shows an exemplary illustration for detecting a free space region in surroundings of autonomous objects, in accordance with some embodiments of the present disclosure;
[0020] Figure 4B shows an exemplary ground truth image for training a neural network, in accordance with some embodiments of the present disclosure;
[0021] Figure 5 shows an exemplary illustration for determining a free space region by a neural network, in accordance with some embodiments of the present disclosure; and
[0022] Figure 6 shows a block diagram of a general-purpose computing system for detecting a free space region in surroundings of autonomous objects, in accordance with embodiments of the present disclosure.
[0023] It should be appreciated by those skilled in the art that any block diagram herein represents conceptual views of illustrative systems embodying the principles of the present subject matter. Similarly, it will be appreciated that any flow charts, flow diagrams, state transition diagrams, pseudo code, and the like represent various processes which may be substantially represented in computer readable medium and executed by a computer or processor, whether or not such computer or processor is explicitly shown.
DETAILED DESCRIPTION
[0024] In the present document, the word "exemplary" is used herein to mean "serving as an example, instance, or illustration." Any embodiment or implementation of the present subject matter described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
[0025] While the disclosure is susceptible to various modifications and alternative forms, specific embodiments thereof have been shown by way of example in the drawings and will be described in detail below. It should be understood, however, that it is not intended to limit the disclosure to the particular forms disclosed, but on the contrary, the disclosure is to cover all modifications, equivalents, and alternatives falling within the scope of the disclosure.
[0026] The terms "comprises", "comprising", or any other variations thereof are intended to cover a non-exclusive inclusion, such that a setup, device, or method that comprises a list of components or steps does not include only those components or steps but may include other components or steps not expressly listed or inherent to such setup or device or method. In other words, one or more elements in a system or apparatus preceded by "comprises... a" does not, without more constraints, preclude the existence of other elements or additional elements in the system or apparatus.
[0027] Embodiments of the present disclosure relate to a method for detecting a free space region in surroundings of autonomous objects. A real-time image and real-time point cloud data of a scene ahead of an autonomous object are received. A variance is determined for each pixel from a plurality of pixels of the image. The variance is determined based on information in the plurality of pixels. Embodiments of the present disclosure consider low values of variance, and not only zero variance. This helps in detection of the free space region when there are shadows, reflections, poor weather and lighting conditions, and the like on a road. Furthermore, the real-time image is mapped with the real-time point cloud data. The real-time mapped image and the variance are provided to a neural network to determine a free space region in surroundings of the autonomous object. Embodiments of the present disclosure provide an accurate detection of the free space region by using both the image and the point cloud data. The drawback of reflections, poor weather and lighting conditions in images is overcome by use of the point cloud data. Further, the drawback of sparsity in the point cloud data is overcome by use of the image. Hence, the present disclosure provides a robust system for detecting the free space region. Further, since the variance is provided to the neural network, the computational complexity in determining the free space region is reduced.
[0028] Figure 1 illustrates an exemplary environment 100 for detecting a free space region in surroundings of autonomous objects, in accordance with some embodiments of the present disclosure. The exemplary environment 100 comprises an autonomous object 101, a capturing unit 102, a scene ahead of the autonomous object 101, and a detection system 106. The scene may comprise at least one of, one or more obstacles and a free space region. For example, the scene comprises the free space region, a tree 104, and a building 105. The autonomous object 101 is illustrated as a car in Figure 1. The autonomous object 101 may be a car, a truck, a bus, and the like. The autonomous object 101 may be a self-driving vehicle. The autonomous object 101 may be a vehicle embedded with Advanced Driver Assistance Systems (ADAS). In an embodiment, the autonomous object 101 may be an autonomous robot. The capturing unit 102 may be associated with the autonomous object 101. The capturing unit 102 may be configured to capture a real-time image 112 of a scene ahead of the autonomous object 101. The real-time image 112 is a two-dimensional image indicating objects in the surroundings of the autonomous object 101. In an embodiment, the capturing unit 102 may be a camera. A person skilled in the art may appreciate that other kinds of capturing units may be used (e.g., thermal cameras, IR cameras, etc.). The capturing unit 102 may be placed on top of the autonomous object 101 such that a focal view of the capturing unit 102 covers the entire scene or a portion thereof. An exemplary focal view of the capturing unit 102 is represented by dotted bold lines in Figure 1. A person skilled in the art will appreciate installation of the capturing unit 102 at other locations such that the entire scene or a portion thereof is covered. In an embodiment, when the autonomous object 101 is a vehicle, the detection system 106 may be an electronic control unit of the vehicle. In other embodiments, the detection system 106 may be an embedded unit of the autonomous object 101.
[0029] The detection system 106 may be configured to receive the real-time image 112 of the scene. The image may be obtained from the capturing unit 102. Further, the detection system 106 may receive real-time point cloud data 103 of the scene. The real-time point cloud data 103 of the scene comprises a plurality of three-dimensional points representing coordinates of the one or more obstacles (104, 105) and the free space region in the scene. The real-time point cloud data 103 may be received from a Light Detection and Ranging (LIDAR) sensor. A person skilled in the art will appreciate that the real-time point cloud data 103 of the scene may be received from any other suitable sensor, such as a Radio Detection and Ranging (RADAR) sensor. Further, the detection system 106 may be configured to determine a variance for each pixel from a plurality of pixels of the real-time image 112. The variance may be determined based on information in the plurality of pixels. Further, the detection system 106 may map the real-time image 112 with the real-time point cloud data 103. A real-time mapped image may be obtained upon mapping. Further, the detection system 106 may be configured to provide the real-time mapped image and the variance to a neural network. The neural network may determine the free space region in the surroundings of the autonomous object 101.
[0030] The detection system 106 may include Central Processing Units 107 (also referred to as "CPUs" or "one or more processors 107"), an Input/Output (I/O) interface 108, and a memory 109. In some embodiments, the memory 109 may be communicatively coupled to the processor 107. The memory 109 stores instructions executable by the one or more processors 107. The one or more processors 107 may comprise at least one data processor for executing program components for executing user or system-generated requests. The memory 109 may be communicatively coupled to the one or more processors 107. The memory 109 stores instructions, executable by the one or more processors 107, which, on execution, may cause the one or more processors 107 to detect the free space region in the surroundings of the autonomous object 101. In an embodiment, the memory 109 may include one or more modules 111 and data 110. The one or more modules 111 may be configured to perform the steps of the present disclosure using the data 110, to detect the free space region in the surroundings of the autonomous object 101. In an embodiment, each of the one or more modules 111 may be a hardware unit which may be outside the memory 109 and coupled with the detection system 106. As used herein, the term modules 111 refers to an Application Specific Integrated Circuit (ASIC), an electronic circuit, a Field-Programmable Gate Array (FPGA), a Programmable System-on-Chip (PSoC), a combinational logic circuit, and/or other suitable components that provide the described functionality. The one or more modules 111, when configured with the described functionality defined in the present disclosure, will result in novel hardware. Further, the I/O interface 108 is coupled with the one or more processors 107, through which an input signal or/and an output signal is communicated. For example, the detection system 106 may receive the real-time image 112 and the real-time point cloud data 103 of the scene via the I/O interface 108. The detection system 106 may transmit data related to the determined free space region via the I/O interface 108. In an embodiment, the detection system 106, to detect the free space region in the surroundings of the autonomous object 101, may be implemented in a variety of computing systems, such as a laptop computer, a desktop computer, a Personal Computer (PC), a notebook, a smartphone, a tablet, e-book readers, a server, a network server, a cloud-based server, and the like.
[0031] Figure 2 illustrates an internal architecture 200 of the detection system 106 to detect the free space region in the surroundings of the autonomous object 101, in accordance with some embodiments of the present disclosure. The detection system 106 may include the one or more processors 107, the memory 109, and the I/O interface 108.
[0032] In one implementation, the modules 111 may include, for example, an input module 206, a variance determination module 207, an image map module 208, an image provide module 209, and other modules 210. It will be appreciated that the aforementioned modules 111 may be represented as a single module or a combination of different modules. In one implementation, the data 110 may include, for example, input data 201, variance data 202, mapped data 203, image data 204, and other data 205.
[0033] In an embodiment, the input module 206 may be configured to receive the real-time image 112 of the scene ahead of the autonomous object 101. The real-time image 112 of the scene may be captured by the capturing unit 102. The scene may comprise at least one of, the one or more obstacles (104, 105) and the free space region. The real-time image 112 may be received to determine the free space region in the scene. The real-time image 112 comprises a plurality of two-dimensional points representing pixel information of the scene. The pixel information comprises RGB (Red Green Blue) color data of the scene. The input module 206 may be configured with the capturing unit 102. The input module 206 may receive the real-time image 112 from the capturing unit 102. Further, the input module 206 may receive the real-time point cloud data 103. In an embodiment, the real-time point cloud data 103 may be received from the LIDAR sensor. A person skilled in the art will appreciate that the real-time point cloud data 103 of the scene may be received from any other suitable sensor, such as the RADAR sensor. The real-time point cloud data 103 comprises a plurality of three-dimensional points representing coordinates of the one or more obstacles (104, 105) and the free space region in the scene. The input module 206 may be configured with the LIDAR sensor, the RADAR sensor, and the like to receive the real-time point cloud data 103 of the scene.
[0034] In an embodiment, the input module 206 may be configured to receive the real-time image 112 and the real-time point cloud data 103 continuously. For example, the input module 206 may be configured to receive the real-time image 112 and the real-time point cloud data 103 when the vehicle starts. Further, the input module 206 may be configured to continue receiving the real-time image 112 and the real-time point cloud data 103 until the vehicle stops. In an embodiment, the input module 206 may be configured to receive the real-time image 112 and the real-time point cloud data 103 at pre-defined time intervals. For example, the pre-defined time intervals may be in microseconds. In another embodiment, the input module 206 may be configured to receive the real-time image 112 and the real-time point cloud data 103 when the autonomous object 101 is in motion. In another embodiment, the input module 206 may be configured to receive an indication from a user to receive the real-time image 112 and the real-time point cloud data 103. The real-time image 112 and the real-time point cloud data 103 may be stored as the input data 201 in the memory 109. In an embodiment, the input module 206 may pre-process the real-time image 112 and the real-time point cloud data 103. Pre-processing may include, but is not limited to, compressing the data, removing noise, normalizing, analog-to-digital conversion, changing the format, and the like.
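The disclosure does not prescribe a specific pre-processing routine; the Python sketch below is only an illustration, under the assumption of an RGB frame and an Nx3 point array, of two of the operations mentioned above (normalization and removal of missing values). The function name preprocess_inputs and the chosen operations are hypothetical.

```python
import numpy as np

def preprocess_inputs(image_u8: np.ndarray, points_xyz: np.ndarray):
    """Minimal pre-processing sketch: normalize the image and clean the point cloud.

    image_u8   : HxWx3 uint8 RGB frame from the capturing unit
    points_xyz : Nx3 float array of LIDAR/RADAR returns
    """
    # Normalize pixel values to [0, 1] so later statistics are scale-independent.
    image = image_u8.astype(np.float32) / 255.0

    # Drop returns with missing coordinates (sparse / invalid measurements).
    valid = np.isfinite(points_xyz).all(axis=1)
    return image, points_xyz[valid]

# Example usage with synthetic data; the second point is dropped as invalid.
img = (np.random.rand(4, 4, 3) * 255).astype(np.uint8)
pts = np.array([[1.0, 2.0, 10.0], [np.nan, 0.0, 5.0]])
clean_img, clean_pts = preprocess_inputs(img, pts)
```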
[0035] In an embodiment, the variance determination module 207 may be configured to receive the real-time image 112 from the input module 206. The variance determination module 207 may determine the variance for each pixel from a plurality of pixels of the real-time image 112. The variance may be determined based on information in the plurality of pixels. The variance determination module 207 may be configured to average the values of each of the plurality of pixels. The variance determination module 207 may determine a mean value associated with the plurality of pixels. Further, the variance determination module 207 may be configured to determine a difference between the value of each pixel from the plurality of pixels and the mean value associated with the plurality of pixels. The difference may indicate the variance of the corresponding pixel. A person skilled in the art will appreciate that any known method may be used to determine the variance of each pixel from the plurality of pixels and that the determination is not limited to the above-mentioned method.
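As a hedged illustration of the variance determination described above, the sketch below averages the channel values of each pixel, takes the mean over all pixels, and uses the squared deviation from that mean as the per-pixel variance. The squaring keeps the result non-negative and is an assumption of this sketch; the disclosure only speaks of the difference from the mean, and the function name pixel_variance is illustrative.

```python
import numpy as np

def pixel_variance(image: np.ndarray) -> np.ndarray:
    """Per-pixel variance: squared deviation of each pixel's intensity from the
    mean over all pixels of the frame.

    image : HxWx3 float array (RGB); channel values are averaged per pixel first.
    """
    intensity = image.mean(axis=2)          # average the channel values of each pixel
    mean_value = intensity.mean()           # mean value over the plurality of pixels
    return (intensity - mean_value) ** 2    # the deviation indicates the pixel's variance

# A uniform (free-space-like) region yields zero variance for every pixel.
flat = np.full((3, 3, 3), 0.5)
assert np.allclose(pixel_variance(flat), 0.0)
```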
[0036] In an embodiment, the image map module 208 may be configured to receive the real-time image 112 and the real-time point cloud data 103 of the scene from the input module 206. The image map module 208 may be configured to upsample the real-time point cloud data 103. The upsampling of the real-time point cloud data 103 may be required since the real-time point cloud data 103 is sparse in nature. Data is considered sparse when certain values in a dataset are missing. Upsampling the real-time point cloud data 103 may comprise increasing a spatial resolution of the real-time point cloud data 103. The real-time point cloud data 103 may be upsampled by interpolation of the data. A person skilled in the art will appreciate that any known upsampling technique may be used to upsample the real-time point cloud data 103 and that the upsampling is not limited to the above-mentioned technique. The image map module 208 may be configured to transform each three-dimensional point from the plurality of three-dimensional points in the real-time point cloud data 103 to a two-dimensional point from a plurality of two-dimensional points in the real-time image 112. In an embodiment, the mapping may be performed using a transformation matrix. Intrinsic parameters and extrinsic parameters of the capturing unit 102 may be considered for the transformation. The intrinsic parameters may comprise focal length, principal points, skew coefficients, distortion coefficients, and the like. The extrinsic parameters may comprise rotation parameters, translation parameters, and the like. A person skilled in the art will appreciate that any known method of transforming three-dimensional points to two-dimensional points may be used for the mapping. The real-time point cloud data 103 outside a range of the real-time image 112 may be discarded. The real-time mapped image, obtained by mapping the real-time point cloud data 103 with the real-time image 112, may be stored as the mapped data 203 in the memory 109.
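The sketch below shows one conventional way to carry out the three-dimensional to two-dimensional transformation described above, using an intrinsic matrix K and extrinsic rotation/translation (R, t) of the capturing unit, and discarding points that fall outside the image. The matrix shapes, the argument names, and the choice of a sparse depth map as output are assumptions for illustration, not the claimed implementation.

```python
import numpy as np

def project_points(points_xyz: np.ndarray, K: np.ndarray, R: np.ndarray,
                   t: np.ndarray, height: int, width: int) -> np.ndarray:
    """Project Nx3 points from the point cloud into the image plane.

    K    : 3x3 intrinsic matrix (focal lengths, principal point)
    R, t : extrinsic rotation (3x3) and translation (3,)
    Returns a sparse HxW depth map; points outside the image are discarded.
    """
    cam = points_xyz @ R.T + t                      # world -> camera coordinates
    cam = cam[cam[:, 2] > 0]                        # keep points in front of the camera
    uvw = cam @ K.T                                 # camera -> homogeneous pixel coords
    u = (uvw[:, 0] / uvw[:, 2]).astype(int)
    v = (uvw[:, 1] / uvw[:, 2]).astype(int)
    depth = np.zeros((height, width), dtype=np.float32)
    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    depth[v[inside], u[inside]] = cam[inside, 2]    # store depth at the mapped pixels
    return depth
```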
[0037] In an embodiment, the image provide module 209 may be configured to receive the real-time mapped image from the image map module 208. Further, the image provide module 209 may be configured to receive the variance of each of the plurality of pixels of the real-time image 112 from the variance determination module 207. The image provide module 209 may provide the real-time mapped image and the variance to a neural network. The neural network may determine the free space region in the surroundings of the autonomous object 101. In an embodiment, the neural network may be part of the image provide module 209. In another embodiment, the neural network may be a part of a system integrated with the image provide module 209. The system and the image provide module 209 may communicate over a communication network. The communication network may include, without limitation, a direct interconnection, a local area network (LAN), a wide area network (WAN), a wireless network (e.g., using Wireless Application Protocol), the Internet, etc.
[0038] The neural network is trained by providing a plurality of mapped training images, a plurality of ground truth images corresponding to a plurality of training images, and a variance associated with the plurality of training images associated with a plurality of scenes for the autonomous object 101. The plurality of mapped training images of each scene of the plurality of scenes may be obtained by mapping the plurality of training images with training point cloud data associated with the corresponding scene from the plurality of scenes. The term "ground truth image" may be defined as actual measured data used to train the neural network. The plurality of ground truth images comprises a plurality of labels. The plurality of labels indicates a free space region in the corresponding scene, to be predicted by the neural network. The term "label" may be defined as a final output class to be predicted by the neural network. For example, a scene may comprise a free space region, a tree, and a building. The free space region may have a label "1". The tree and the building may have a label "0". In this example, the two classes may be the free space region and not the free space region, associated with the labels "1" and "0", respectively. The labels "1" in a ground truth image may indicate the free space region to be predicted by the neural network. The variance is associated with each pixel from a plurality of pixels in each of the plurality of training images. The neural network may identify one or more pixels in the real-time mapped image with a variance lesser than a pre-determined threshold value. When the variance is equal to zero, the neural network may determine the region of the one or more pixels from the plurality of pixels to be the free space region. When the variance of the one or more pixels is lesser than the pre-determined threshold value, the neural network may compare the one or more pixels of the real-time mapped image with one or more pixels in each of a plurality of ground truth images to determine the free space region. By considering lower values of the variance, the free space region may be detected when there are shadows, reflections, poor weather and lighting conditions, and the like on a road. The real-time mapped image and the variance of each pixel of the real-time image 112 may be stored as the image data 204 in the memory 109.
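The decision rule described above can be summarized in code. The sketch below is not the trained network itself; it only restates the thresholding logic (zero variance means free space, low but non-zero variance falls back to a comparison against ground-truth labels), with assumed argument names and an externally supplied comparison result.

```python
import numpy as np

def classify_by_variance(variance: np.ndarray, threshold: float,
                         matches_ground_truth: np.ndarray) -> np.ndarray:
    """Illustration of the variance rule described above (not the trained network).

    variance             : HxW per-pixel variance of the real-time image
    threshold            : pre-determined threshold value
    matches_ground_truth : HxW boolean map, True where a low-variance pixel
                           agrees with the free-space label of a ground-truth image
    Returns an HxW boolean free-space mask.
    """
    free_space = variance == 0                          # zero variance -> free space
    low_variance = (variance > 0) & (variance < threshold)
    # Low but non-zero variance (e.g. shadows): fall back to ground-truth comparison.
    free_space |= low_variance & matches_ground_truth
    return free_space
```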
[0039] The other data 205 may store data, including temporary data and temporary files, generated by the one or more modules 111 for performing the various functions of the detection system 106. For example, the other data 205 may comprise data related to training the neural network. The data related to training the neural network may comprise the plurality of mapped training images, the plurality of ground truth images, and the variance associated with the plurality of training images. The term "associated" may be used to define a relation between two terms. For example, the variance associated with a training image may be defined as the variance of each pixel in the training image. The one or more modules 111 may also include the other modules 210 to perform various miscellaneous functionalities of the detection system 106. The other data 205 may be stored in the memory 109. It will be appreciated that the one or more modules 111 may be represented as a single module or a combination of different modules.
[0040] Figure 3 shows an exemplary flow chart illustrating method steps to detect the free space region in the surroundings of the autonomous object 101, in accordance with some embodiments of the present disclosure. As illustrated in Figure 3, the method 300 may comprise one or more steps. The method 300 may be described in the general context of computer executable instructions. Generally, computer executable instructions can include routines, programs, objects, components, data structures, procedures, modules, and functions, which perform particular functions or implement particular abstract data types.
[0041] The order in which the method 300 is described is not intended to be construed as a limitation, and any number of the described method blocks can be combined in any order to implement the method. Additionally, individual blocks may be deleted from the methods without departing from the scope of the subject matter described herein. Furthermore, the method can be implemented in any suitable hardware, software, firmware, or combination thereof.
[0042] At step 301, the detection system 106 may receive the real-time image 112 of the scene ahead of the autonomous object 101. The real-time image 112 of the scene may be captured by the capturing unit 102. The scene may comprise at least one of, the one or more obstacles (104, 105) and the free space region. The real-time image 112 may be received to determine the free space region in the scene. The real-time image 112 comprises a plurality of two-dimensional points representing pixel information of the scene. The detection system 106 may receive the real-time image 112 from the capturing unit 102. Referring to example 400 of Figure 4A, 401 shows a real-time image of a scene ahead of the autonomous object 101. The autonomous object 101 may be a vehicle. Further, the detection system 106 may receive the real-time point cloud data 103. In an embodiment, the real-time point cloud data 103 may be received from the LIDAR sensor. The real-time point cloud data 103 comprises a plurality of three-dimensional points representing coordinates of the one or more obstacles (104, 105) and the free space region in the scene. 403 in Figure 4A shows real-time point cloud data of the scene ahead of the autonomous object 101. In an embodiment, the detection system 106 may be configured to receive the real-time image 112 and the real-time point cloud data 103 continuously. For example, the detection system 106 may be configured to receive the real-time image 112 and the real-time point cloud data 103 when the vehicle starts. Further, the detection system 106 may continue to receive the real-time image 112 and the real-time point cloud data 103 until the vehicle stops. In an embodiment, the input module 206 may be configured to receive the real-time image 112 and the real-time point cloud data 103 at pre-defined time intervals. For example, the pre-defined time intervals may be in microseconds. In another embodiment, the detection system 106 may be configured to receive the real-time image 112 and the real-time point cloud data 103 when the autonomous object 101 is in motion. For example, the real-time image 112 and the real-time point cloud data 103 may not be received when an autonomous vehicle is stationary due to traffic. In another embodiment, the detection system 106 may be configured to receive an indication from a user to receive the real-time image 112 and the real-time point cloud data 103. For example, the autonomous object 101 may be a self-driving car. In certain traffic areas, the driver may drive the car. In such areas, determination of the free space region may not be a requirement. The indication may be provided by the driver according to the requirement.
[0043] Referring again to Figure 3, at step 302, the detection system 106 may determine the variance for each pixel from the plurality of pixels of the real-time image 112. The variance may be determined based on information in the plurality of pixels. The detection system 106 may be configured to average the values of each of the plurality of pixels. The detection system 106 may determine a mean value associated with the plurality of pixels.
Further, the detection system 106 may be configured to determine a difference between the value of each pixel from the plurality of pixels and the mean value associated with the plurality of pixels. The difference may indicate the variance of the corresponding pixel. A person skilled in the art will appreciate that any known method may be used to determine the variance of each pixel from the plurality of pixels and that the determination is not limited to the above-mentioned method. In an example, consider that the scene comprises a free space region. Since all pixels in an image of the free space region are similar, the variance may be equal to zero. In another example, consider that the scene comprises an obstacle. The values of each pixel may vary to a greater extent from the mean value of all pixels in an image of the obstacle. The variance may be high. In another example, consider that the scene comprises a shadow. The variance may not be equal to zero, but the variance may have a small value. The small value of the variance may indicate the free space region, since the presence of a shadow is not considered an obstacle and is considered part of the free space region. Referring back to example 400 of Figure 4A, 402 shows the determination of the variance for each of the plurality of pixels of the real-time image 112.
[0044] Referring again to Figure 3, at step 303, the detection system 106 may map the real-time image 112 with the real-time point cloud data 103. The detection system 106 may be configured to upsample the real-time point cloud data 103. The term "upsampling" is defined as increasing a spatial resolution of data. The upsampling of the real-time point cloud data 103 may be required since the real-time point cloud data is sparse in nature. Upsampling the real-time point cloud data 103 may comprise increasing a spatial resolution of the real-time point cloud data 103. The detection system 106 may be configured to transform each three-dimensional point from the plurality of three-dimensional points in the real-time point cloud data 103 to a two-dimensional point from a plurality of two-dimensional points in the real-time image 112. The mapping may be performed using a transformation matrix. Intrinsic parameters and extrinsic parameters of the capturing unit 102 may be considered for the transformation. The intrinsic parameters may comprise focal length, principal points, skew coefficients, distortion coefficients, and the like. The extrinsic parameters may comprise rotation parameters, translation parameters, and the like. A person skilled in the art will appreciate that any known method of transforming three-dimensional points to two-dimensional points may be used for the mapping. The real-time point cloud data 103 outside a range of the real-time image 112 may be discarded. Referring back to the example 400 of Figure 4A, 404 shows a real-time mapped image obtained by mapping the real-time image 401 with the real-time point cloud data 403.
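As one hedged example of the interpolation-based upsampling mentioned above, the sketch below densifies a sparse depth map (such as the one produced by the projection sketch earlier) using nearest-neighbour interpolation from SciPy. The choice of nearest-neighbour interpolation and the zero-means-missing convention are assumptions; the disclosure only requires that the spatial resolution of the point cloud data be increased.

```python
import numpy as np
from scipy.interpolate import griddata

def upsample_sparse_depth(sparse_depth: np.ndarray) -> np.ndarray:
    """Densify a sparse depth map (zeros = no LIDAR return) by interpolation."""
    h, w = sparse_depth.shape
    vs, us = np.nonzero(sparse_depth)                 # pixels that received a point
    known = np.stack([vs, us], axis=1)                # known (row, col) positions
    grid_v, grid_u = np.mgrid[0:h, 0:w]               # every pixel of the image grid
    dense = griddata(known, sparse_depth[vs, us],
                     (grid_v, grid_u), method='nearest')   # nearest-neighbour fill
    return dense.astype(np.float32)
```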
[0045] At step 304, the detection system 106 provides the real-time mapped image and the variance to the neural network. The neural network may determine the free space region in the surroundings of the autonomous object 101. The neural network is trained by providing a plurality of mapped training images, a plurality of ground truth images, and a variance associated with a plurality of training images associated with a plurality of scenes for the autonomous object 101. In an embodiment, the neural network may be a Convolutional Neural Network (CNN). In another embodiment, the neural network may be a Recurrent Neural Network (RNN). A person skilled in the art will appreciate that any known neural network may be used to determine the free space region and that the determination is not limited to the above-mentioned neural networks. The plurality of training images may be pre-defined. For example, a scene A may comprise a free space region and poles on both sides of a road. A scene B may comprise multiple vehicles, people, and a minimum free space region. The plurality of mapped training images of each scene of the plurality of scenes may be obtained by mapping the plurality of training images with training point cloud data associated with the corresponding scene from the plurality of scenes. The plurality of ground truth images corresponding to the plurality of training images comprises a plurality of labels indicating a free space region in the corresponding scene, to be predicted by the neural network. The variance is associated with each pixel from a plurality of pixels in each of the plurality of training images.
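To make the training setup concrete, the following PyTorch sketch stacks a mapped training image (assumed here to be RGB plus a projected depth channel) with the variance map and trains a small fully convolutional network against the 0/1 ground-truth labels. The architecture, the five-channel input layout, the binary cross-entropy loss, and all names are assumptions for illustration; the disclosure only states that a CNN or an RNN may be used.

```python
import torch
import torch.nn as nn

class FreeSpaceNet(nn.Module):
    """Tiny fully convolutional network predicting a per-pixel free-space logit."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(5, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, kernel_size=1),            # one logit per pixel
        )

    def forward(self, x):
        return self.body(x)

def train_step(model, optimizer, mapped_image, variance, ground_truth):
    """mapped_image: Bx4xHxW, variance: Bx1xHxW, ground_truth: Bx1xHxW in {0, 1}."""
    x = torch.cat([mapped_image, variance], dim=1)      # stack mapped image + variance
    logits = model(x)
    loss = nn.functional.binary_cross_entropy_with_logits(logits, ground_truth)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

model = FreeSpaceNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
# One step with synthetic tensors, just to show the expected shapes.
loss = train_step(model, opt,
                  torch.rand(1, 4, 64, 64), torch.rand(1, 1, 64, 64),
                  torch.randint(0, 2, (1, 1, 64, 64)).float())
```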
[0046] The neural network may identify one or more pixels in the real-time mapped image with a variance lesser than a pre-determined threshold value. When the variance is equal to zero, the neural network may determine the region of the one or more pixels from the plurality of pixels to be the free space region. When the variance of the one or more pixels is lesser than the pre-determined threshold value, the neural network may compare the one or more pixels of the real-time mapped image with one or more pixels in each of a plurality of ground truth images to determine the free space region. A ground truth image 405 is shown in Figure 4B. 502 in Figure 5 shows an output of the neural network 501. 503 indicates the determined free space region. The determined free space region may be used by an autonomous vehicle when navigating. Also, an autonomous robot may use the determined free space region during its movement to move without being obstructed by obstacles in the surroundings. In general, the detected free space may be used by any autonomous object and aids in its safe movement.
COMPUTER SYSTEM
[0047] Figure 6 illustrates a block diagram of an exemplary computer system 600 for implementing embodiments consistent with the present disclosure. In an embodiment, the computer system 600 may be used to implement the detection system 106. Thus, the computer system 600 may be used to detect the free space region in the surroundings of the autonomous object 101. In an embodiment, the computer system 600 may receive the determined free space region from the neural network 612 over the communication network 609. The computer system 600 may comprise a Central Processing Unit 602 (also referred to as "CPU" or "processor"). The processor 602 may comprise at least one data processor. The processor 602 may include specialized processing units such as integrated system (bus) controllers, memory management control units, floating point units, graphics processing units, digital signal processing units, etc.
[0048] The processor 602 may be disposed in communication with one or more input/output (I/O) devices (not shown) via an I/O interface 601. The I/O interface 601 may employ communication protocols/methods such as, without limitation, audio, analog, digital, monaural, RCA, stereo, IEEE (Institute of Electrical and Electronics Engineers)-1394, serial bus, universal serial bus (USB), infrared, PS/2, BNC, coaxial, component, composite, digital visual interface (DVI), high-definition multimedia interface (HDMI), Radio Frequency (RF) antennas, S-Video, VGA, IEEE 802.11a/b/g/n/x, Bluetooth, cellular (e.g., code-division multiple access (CDMA), high-speed packet access (HSPA+), global system for mobile communications (GSM), long-term evolution (LTE), WiMax, or the like), etc.
[0049] Using the I/O interface 601, the computer system 600 may communicate with one or more I/O devices. For example, the input device 610 may be an antenna, keyboard, mouse, joystick, (infrared) remote control, camera, card reader, fax machine, dongle, biometric reader, microphone, touch screen, touchpad, trackball, stylus, scanner, storage device, transceiver, video device/source, etc. The output device 611 may be a printer, fax machine, video display (e.g., cathode ray tube (CRT), liquid crystal display (LCD), light-emitting diode (LED), plasma, Plasma display panel (PDP), Organic light-emitting diode display (OLED), or the like), audio speaker, etc.
[0050] The computer system 600 is connected to the neural network 612 through a communication network 609. The processor 602 may be disposed in communication with the communication network 609 via a network interface 603. The network interface 603 may communicate with the communication network 609. The network interface 603 may employ connection protocols including, without limitation, direct connect, Ethernet (e.g., twisted pair 10/100/1000 Base T), transmission control protocol/internet protocol (TCP/IP), token ring, IEEE 802.11a/b/g/n/x, etc. The communication network 609 may include, without limitation, a direct interconnection, a local area network (LAN), a wide area network (WAN), a wireless network (e.g., using Wireless Application Protocol), the Internet, etc.
[0051] The communication network 609 includes, but is not limited to, a direct interconnection, an e-commerce network, a peer-to-peer (P2P) network, a local area network (LAN), a wide area network (WAN), a wireless network (e.g., using Wireless Application Protocol), the Internet, Wi-Fi, and such. The first network and the second network may either be a dedicated network or a shared network, which represents an association of the different types of networks that use a variety of protocols, for example, Hypertext Transfer Protocol (HTTP), Transmission Control Protocol/Internet Protocol (TCP/IP), Wireless Application Protocol (WAP), etc., to communicate with each other. Further, the first network and the second network may include a variety of network devices, including routers, bridges, servers, computing devices, storage devices, etc.
[0052] In some embodiments, the processor 602 may be disposed in communication with a memory 605 (e.g., RAM, ROM, etc., not shown in Figure 6) via a storage interface 604. The storage interface 604 may connect to the memory 605 including, without limitation, memory drives, removable disc drives, etc., employing connection protocols such as serial advanced technology attachment (SATA), Integrated Drive Electronics (IDE), IEEE-1394, Universal Serial Bus (USB), fiber channel, Small Computer Systems Interface (SCSI), etc. The memory drives may further include a drum, magnetic disc drive, magneto-optical drive, optical drive, Redundant Array of Independent Discs (RAID), solid-state memory devices, solid-state drives, etc.
[0053] The memory 605 may store a collection of program or database components, including, without limitation, a user interface 606, an operating system 607, a web browser 608, etc. In some embodiments, the computer system 600 may store user/application data such as the data, variables, records, etc., as described in this disclosure. Such databases may be implemented as fault-tolerant, relational, scalable, secure databases such as Oracle® or Sybase®.
[0054] The operating system 607 may facilitate resource management and operation of the computer system 600. Examples of operating systems include, without limitation, APPLE MACINTOSH® OS X, UNIX®, UNIX-like system distributions (e.g., BERKELEY SOFTWARE DISTRIBUTION™ (BSD), FREEBSD™, NETBSD™, OPENBSD™, etc.), LINUX DISTRIBUTIONS™ (e.g., RED HAT™, UBUNTU™, KUBUNTU™, etc.), IBM™ OS/2, MICROSOFT™ WINDOWS™ (VISTA™/7/8/10, etc.), APPLE®, GOOGLE® ANDROID™, BLACKBERRY® OS, or the like.
[0055] In some embodiments, the computer system 600 may implement the web browser 608 stored program component. The web browser 608 may be a hypertext viewing application, for example MICROSOFT® INTERNET EXPLORER™, GOOGLE® CHROME™, MOZILLA® FIREFOX™, APPLE® SAFARI™, etc. Secure web browsing may be provided using Secure Hypertext Transport Protocol (HTTPS), Secure Sockets Layer (SSL), Transport Layer Security (TLS), etc. Web browsers 608 may utilize facilities such as AJAX™, DHTML™, ADOBE® FLASH™, JAVASCRIPT™, JAVA™, Application Programming Interfaces (APIs), etc. In some embodiments, the computer system 600 may implement a mail server (not shown in the Figure) stored program component. The mail server may be an Internet mail server such as Microsoft Exchange, or the like. The mail server may utilize facilities such as ASP™, ACTIVEX™, ANSI™ C++/C#, MICROSOFT® .NET™, CGI SCRIPTS™, JAVA™, JAVASCRIPT™, PERL™, PHP™, PYTHON™, WEBOBJECTS™, etc. The mail server may utilize communication protocols such as Internet Message Access Protocol (IMAP), Messaging Application Programming Interface (MAPI), MICROSOFT® Exchange, Post Office Protocol (POP), Simple Mail Transfer Protocol (SMTP), or the like. In some embodiments, the computer system 600 may implement a mail client stored program component. The mail client (not shown in the Figure) may be a mail viewing application, such as APPLE® MAIL™, MICROSOFT® ENTOURAGE™, MICROSOFT® OUTLOOK™, MOZILLA® THUNDERBIRD™, etc.
[0056] Furthermore, one or more computer-readable storage media may be utilized in implementing embodiments consistent with the present disclosure. A computer-readable storage medium refers to any type of physical memory on which information or data readable by a processor may be stored. Thus, a computer-readable storage medium may store instructions for execution by one or more processors, including instructions for causing the processor(s) to perform steps or stages consistent with the embodiments described herein. The term "computer-readable medium" should be understood to include tangible items and exclude carrier waves and transient signals, i.e., be non-transitory. Examples include Random Access Memory (RAM), Read-Only Memory (ROM), volatile memory, non-volatile memory, hard drives, Compact Disc Read-Only Memory (CD-ROMs), Digital Video Discs (DVDs), flash drives, disks, and any other known physical storage media.
[0057] Embodiments of the present disclosure consider low values of variance, and not only zero variance. This helps in detection of the free space region when there are shadows, reflections, poor weather and lighting conditions, and the like on a road.
[0058] In the present disclosure, the real-time image is mapped with the real-time point cloud data. Embodiments of the present disclosure provide an accurate detection of the free space region by using both the image and the point cloud data. The drawback of reflections, poor weather and lighting conditions in images is overcome by use of the point cloud data. Further, the drawback of sparsity in the point cloud data is overcome by use of the image. Hence, the present disclosure provides a robust system for detecting the free space region.
[0059] In the embodiments of the present disclosure, variance is provided to the neural network during training. The computational complexity in determining the free space region is reduced, since the variance is provided to the neural network along with the real-time mapped image.
[0060] The terms "an embodiment", "embodiment", "embodiments", "the embodiment", "the embodiments", "one or more embodiments", "some embodiments", and "one embodiment" mean "one or more (but not all) embodiments of the invention(s)" unless expressly specified otherwise.
[0061] The terms "including", "comprising", "having" and variations thereof mean "including but not limited to", unless expressly specified otherwise.
[0062] The enumerated listing of items does not imply that any or all of the items are mutually exclusive, unless expressly specified otherwise. The terms "a", "an" and "the" mean "one or more", unless expressly specified otherwise.
[0063] A description of an embodiment with several components in communication with each other does not imply that all such components are required. On the contrary, a variety of optional components are described to illustrate the wide variety of possible embodiments of the invention.
[0064] When a single device or article is described herein, it will be readily apparent that more than one device/article (whether or not they cooperate) may be used in place of a single device/article. Similarly, where more than one device or article is described herein (whether or not they cooperate), it will be readily apparent that a single device/article may be used in place of the more than one device or article, or a different number of devices/articles may be used instead of the shown number of devices or programs. The functionality and/or the features of a device may be alternatively embodied by one or more other devices which are not explicitly described as having such functionality/features. Thus, other embodiments of the invention need not include the device itself.

[0065] The illustrated operations of Figure 3 show certain events occurring in a certain order. In alternative embodiments, certain operations may be performed in a different order, modified, or removed. Moreover, steps may be added to the above-described logic and still conform to the described embodiments. Further, operations described herein may occur sequentially, or certain operations may be processed in parallel. Yet further, operations may be performed by a single processing unit or by distributed processing units.
[0066] Finally, the language used in the specification has been principally selected for readability and instructional purposes, and it may not have been selected to delineate or circumscribe the inventive subject matter. It is therefore intended that the scope of the invention be limited not by this detailed description, but rather by any claims that issue on an application based hereon. Accordingly, the disclosure of the embodiments of the invention is intended to be illustrative, but not limiting, of the scope of the invention, which is set forth in the following claims.
[0067] While various aspects and embodiments have been disclosed herein, other aspects and embodiments will be apparent to those skilled in the art. The various aspects and embodiments disclosed herein are for purposes of illustration and are not intended to be limiting, with the true scope being indicated by the following claims.

Referral Numerals:
Referral number Description
100 Exemplary environment
101 Autonomous object
102 Capturing unit
103 Real-time point cloud data
104 Obstacle
105 Obstacle
106 Detection system
107 Processor
108 I/O interface
109 Memory
110 Data
111 Modules
112 Real-time image
200 Internal architecture
201 Input data
202 Variance data
203 Mapped data
204 Image data
205 Other data
206 Input module
207 Variance determination module
208 Image map module
209 Image provide module
210 Other modules
400 Example
401 Real-time image
402 Determination of variance for each pixel
403 Real-time point cloud data
404 Real-time mapped image
405 Ground truth image
500 Example
501 Neural network
502 Output of neural network
503 Determined free space region
600 Computer system
601 I/O interface
602 Processor
603 Network interface
604 Storage interface
605 Memory
606 User interface
607 Operating system
608 Web browser
609 Communication network
610 Input device
611 Output device
612 Neural network

Claims (14)

We claim:

1. A method for detecting a free space region in the surroundings of autonomous objects, the method comprising: receiving, by one or more processors (107), a real-time image (112) of a scene and real-time point cloud data (103) of the scene ahead of an autonomous object (101), wherein the scene comprises at least one of, one or more obstacles (104, 105) or a free space region for movement of the autonomous object (101); determining, by the one or more processors (107), a variance for each pixel from a plurality of pixels of the real-time image (112), based on information in the plurality of pixels; mapping, by the one or more processors (107), the real-time image (112) with the real-time point cloud data (103), to obtain a real-time mapped image (404) of the scene; and providing, by the one or more processors (107), the real-time mapped image (404) and the variance, to a neural network (501) to determine the free space region in the surroundings of the autonomous object (101).
2. The method of claim 1, wherein the neural network (501) is trained by providing a plurality of mapped training images, a plurality of ground truth images (405) corresponding to a plurality of training images, and a variance associated with the plurality of training images associated with a plurality of scenes for the autonomous object (101), wherein the plurality of mapped training images of each scene of the plurality of scenes is obtained by mapping the plurality of training images with a training point cloud data associated with the plurality of scenes, wherein the plurality of ground truth images (405) comprises a plurality of labels indicating a free space region in the corresponding scene, to be predicted by the neural network (501), wherein the variance is associated with each pixel from a plurality of pixels in each of the plurality of training images.
3. The method of claim 1, wherein the real-time point cloud data (103) comprises a plurality of three-dimensional points representing coordinates of the one or more obstacles (104, 105) and the free space region in the scene, and the real-time image (112) comprises a plurality of two-dimensional points representing pixel information of the scene.
4. The method of claim 1, wherein the variance of a pixel from the plurality of pixels is determined to be a difference between value of the pixel and a mean value associated with the plurality of pixels, wherein the mean value is determined by averaging values of each of the plurality of pixels.
5. The method of claim 1, wherein mapping the real-time image (112) with the real-time point cloud data (103) comprises: transforming, by the one or more processors (107), each three-dimensional point from a plurality of three-dimensional points in the real-time point cloud data (103) to two-dimensional points from a plurality of two-dimensional points in the real-time image (112).
6. The method of claim 1, wherein the mapping comprises upsampling, by the processor, the real-time point cloud data (103).
7. The method of claim 1, wherein determining the free space region comprises: identifying, by the one or more processors (107), one or more pixels in the real-time mapped image (404) with variance lesser than a pre-determined threshold value; and comparing, by the one or more processors (107), the one or more pixels of the real-time mapped image (404) with one or more pixels in each of a plurality of ground truth images (405) to determine the free space region.
8. A detection system (106) for detecting a free space region in surroundings of autonomous objects, the detection system (106) comprising: one or more processors (107); and a memory (109), wherein the memory (109) stores processor-executable instructions, which, on execution, cause the one or more processors (107) to: receive a real-time image (112) of a scene and real-time point cloud data (103) of the scene ahead of an autonomous object (101), wherein the scene comprises at least one of, one or more obstacles (104, 105) and a free space region for movement of the autonomous object (101); determine a variance for each pixel from a plurality of pixels of the real-time image (112), based on information in the plurality of pixels; map the real-time image (112) with the real-time point cloud data (103), to obtain a real-time mapped image (404) of the scene; and provide the real-time mapped image (404) and the variance, to a neural network (501) to determine the free space region in the surroundings of the autonomous object (101).
9. The detection system (106) of claim 8, wherein the neural network (501) is trained by providing a plurality of mapped training images, a plurality of ground truth images (405) corresponding to a plurality of training images, and a variance associated with the plurality of training images associated with a plurality of scenes for the autonomous object (101), wherein the plurality of mapped training images of each scene of the plurality of scenes is obtained by mapping the plurality of training images with a training point cloud data associated with the plurality of scenes, wherein the plurality of ground truth images (405) comprises a plurality of labels indicating a free space region in the corresponding scene, to be predicted by the neural network (501), wherein the variance is associated with each pixel from a plurality of pixels in each of the plurality of training images.
10. The detection system (106) of claim 8, wherein the real-time point cloud data (103) comprises a plurality of three-dimensional points representing coordinates of the one or more obstacles (104, 105) and the free space region in the scene, and the real-time image (112) comprises a plurality of two-dimensional points representing pixel information of the scene.
11. The detection system (106) of claim 8, wherein the one or more processors (107) determine the variance of a pixel from the plurality of pixels to be a difference between value of the pixel and a mean value associated with the plurality of pixels, wherein the mean value is determined by averaging values of each of the plurality of pixels.
12. The detection system (106) of claim 8, wherein the one or more processors (107) maps the real-time image (112) with the real-time point cloud data (103) by: transforming each three-dimensional point from a plurality of three-dimensional points in the real-time point cloud data (103) to two-dimensional points from a plurality of two-dimensional points in the real-time image (112).
13. The detection system (106) of claim 8, wherein the mapping comprises upsampling the real-time point cloud data (103).
14. The detection system (106) of claim 8, wherein the one or more processors (107) determines the free space region by: identifying one or more pixels in the real-time mapped image (404) with variance lesser than a pre-determined threshold value; and comparing the one or more pixels of the real-time mapped image (404) with one or more pixels in each of a plurality of ground truth images (405) to determine the free space region.
GB2114542.0A 2021-08-13 2021-10-12 A method and a system for detecting free space region in surroundings of autonomous objects Pending GB2609676A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
IN202141036805 2021-08-13

Publications (2)

Publication Number Publication Date
GB202114542D0 GB202114542D0 (en) 2021-11-24
GB2609676A true GB2609676A (en) 2023-02-15

Family

ID=78595027

Family Applications (1)

Application Number Title Priority Date Filing Date
GB2114542.0A Pending GB2609676A (en) 2021-08-13 2021-10-12 A method and a system for detecting free space region in surroundings of autonomous objects

Country Status (1)

Country Link
GB (1) GB2609676A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10445928B2 (en) * 2017-02-11 2019-10-15 Vayavision Ltd. Method and system for generating multidimensional maps of a scene using a plurality of sensors of various types
US20200371524A1 (en) * 2019-05-24 2020-11-26 Ford Global Technologies, Llc Eccentricity image fusion
WO2021092702A1 (en) * 2019-11-13 2021-05-20 Youval Nehmadi Autonomous vehicle environmental perception software architecture

Also Published As

Publication number Publication date
GB202114542D0 (en) 2021-11-24

Similar Documents

Publication Publication Date Title
US10694105B1 (en) Method and system for handling occluded regions in image frame to generate a surround view
US11048867B2 (en) System and method for extracting tabular data from a document
US11657513B2 (en) Method and system for generating a tri-map for image matting
EP3527947B1 (en) Method for generating a safe navigation path for a vehicle and a system thereof
US11257242B2 (en) Method and device for determining operation of an autonomous device
US11422559B2 (en) Method and system of navigating an autonomous vehicle at an intersection of roads
US20200265259A1 (en) Method and system for synthesizing three-dimensional data
WO2022206414A1 (en) Three-dimensional target detection method and apparatus
US20200264620A1 (en) Method and system for determining drivable road regions for safe navigation of an autonomous vehicle
TWI726278B (en) Driving detection method, vehicle and driving processing device
US11080561B2 (en) Training and verification of learning models using high-definition map information and positioning information
US20220165072A1 (en) Method and system for detecting and classifying lanes
US11403773B2 (en) Method of stitching images captured by a vehicle, and a system thereof
US10297056B2 (en) Method and system for remotely annotating an object in real-time
GB2609676A (en) A method and a system for detecting free space region in surroundings of autonomous objects
US20220164584A1 (en) Method and system for detecting lane pattern
US10859389B2 (en) Method for generation of a safe navigation path for a vehicle and system thereof
US10929992B2 (en) Method and system for rendering augmented reality (AR) content for textureless objects
EP3477581B1 (en) Method and system of stitching frames to assist driver of a vehicle
US20230245333A1 (en) Apparatus and a method for estimating depth of a scene
GB2609620A (en) System and computer-implemented method for performing object detection for objects present in 3D environment
US11029710B2 (en) Method and system for real-time tracking of a moving target object
KR102694715B1 (en) Method for detecting obstacle, electronic device, roadside device and cloud control platform
US11622073B2 (en) Method of generating a panoramic image
JP7479070B2 (en) Hierarchical occlusion inference module and system and method for invisible object instance segmentation using same

Legal Events

Date Code Title Description
732E Amendments to the register in respect of changes of name or changes affecting rights (sect. 32/1977)

Free format text: REGISTERED BETWEEN 20230216 AND 20230222