GB2579080A - Improvements in or relating to perception modules - Google Patents

Improvements in or relating to perception modules

Info

Publication number
GB2579080A
GB2579080A GB1818841.7A GB201818841A
Authority
GB
United Kingdom
Prior art keywords
data
video
feed
vehicle
augmented
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
GB1818841.7A
Other versions
GB201818841D0 (en)
Inventor
Szyjanowicz Piotr
Maraci Mohamed
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
COSWORTH GROUP HOLDINGS Ltd
Original Assignee
COSWORTH GROUP HOLDINGS Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by COSWORTH GROUP HOLDINGS Ltd filed Critical COSWORTH GROUP HOLDINGS Ltd
Priority to GB1818841.7A priority Critical patent/GB2579080A/en
Publication of GB201818841D0 publication Critical patent/GB201818841D0/en
Publication of GB2579080A publication Critical patent/GB2579080A/en
Withdrawn legal-status Critical Current

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00Television systems
    • H04N7/18Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • H04N7/181Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast for receiving images from a plurality of remote sources
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B60VEHICLES IN GENERAL
    • B60QARRANGEMENT OF SIGNALLING OR LIGHTING DEVICES, THE MOUNTING OR SUPPORTING THEREOF OR CIRCUITS THEREFOR, FOR VEHICLES IN GENERAL
    • B60Q9/00Arrangement or adaptation of signal devices not provided for in one of main groups B60Q1/00 - B60Q7/00, e.g. haptic signalling
    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05DSYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
    • G05D1/02Control of position or course in two dimensions
    • G05D1/021Control of position or course in two dimensions specially adapted to land vehicles
    • G05D1/0231Control of position or course in two dimensions specially adapted to land vehicles using optical position detecting means
    • G05D1/0246Control of position or course in two dimensions specially adapted to land vehicles using optical position detecting means using a video camera in combination with image processing means
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/217Validation; Performance evaluation; Active pattern learning techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/251Fusion techniques of input or preprocessed data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/50Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/30Determination of transform parameters for the alignment of images, i.e. image registration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/98Detection or correction of errors, e.g. by rescanning the pattern or by human intervention; Evaluation of the quality of the acquired patterns
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/56Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/56Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/58Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/64Three-dimensional objects
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/222Studio circuitry; Studio devices; Studio equipment
    • H04N5/262Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects
    • H04N5/265Mixing
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00Television systems
    • H04N7/18Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10024Color image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10028Range image; Depth image; 3D point clouds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20212Image combination
    • G06T2207/20221Image fusion; Image merging
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30248Vehicle exterior or interior
    • G06T2207/30252Vehicle exterior; Vicinity of vehicle

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Electromagnetism (AREA)
  • Mechanical Engineering (AREA)
  • Human Computer Interaction (AREA)
  • Aviation & Aerospace Engineering (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Automation & Control Theory (AREA)
  • Traffic Control Systems (AREA)
  • Closed-Circuit Television Systems (AREA)

Abstract

A method of combining sensed data in a vehicle 20, such as an autonomous vehicle, comprises the steps of: obtaining at least one video data feed from one or more video cameras mounted on the vehicle; obtaining at least one other data input from a corresponding sensor 22, 24, 28, such as an environmental, ranging or positional sensor; processing the or each video data feed so that the other data is incorporated into the video data feed to produce at least one augmented video feed, where the processing may involve calibrating the other data to correct for alignment errors and fusion of the other data onto a frame of the video data; and outputting the or each augmented video feed for use in the vehicle, for example as an input to a perception module. A synchronised system time may be allocated to the video data feed and other data. A flag may be applied to denote inconsistencies between the video data and other data. Object recognition and tracking may be performed on the augmented video feed using a neural network.

Description

IMPROVEMENTS IN OR RELATING TO PERCEPTION MODULES
This invention relates to improvements in or relating to perception modules and, in particular, to improvements in data synchronisation, resulting in a single output stream adjusted to compensate for the movement of the vehicle.
Modern vehicles are provided with a plethora of sensors that sense different aspects of the environment surrounding the vehicle. These sensors have differing fields of view and different data sampling rates. The outputs from these sensors are deployed in different parts of the vehicle control system, providing interventions ranging from driver warning information, through driver assist functionality, to partially or even fully autonomous drive.
Regardless of the end use of the data, whether it is merely informative for the driver or whether it is passed through to prediction and policy modules in a fully autonomous environment, data integrity is key. However, as the number of sensors on the vehicle increases, so does the volume of data. Data storage and processing are therefore a major focus, so that time-critical analysis of sensed data can be completed within a time frame during which it remains relevant.
One approach deployed to date has been the creation of a point cloud describing an object in the field of view. The point cloud information can be augmented with data from multiple sensors. However, this data is not particularly compressible and therefore the volume of data can be challenging to process and, where relevant, store.
It is against this background that the present invention has arisen.
According to the present invention there is provided a method of combining sensed data in a vehicle, the method comprising the steps of: obtaining at least one video data feed from one or more video cameras mounted on the vehicle; obtaining at least one other data input from a corresponding sensor; processing the or each video data feed so that the other data is incorporated into the video data feed to produce at least one augmented video feed; and outputting the or each augmented video feed for use in the vehicle.
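By way of illustration only, the claimed steps can be pictured as a simple per-frame loop: ingest a video frame, ingest the corresponding other-sensor sample, incorporate that sample into the frame, and output the augmented frame. The Python sketch below is a non-limiting illustration under assumed, hypothetical interfaces; the DummySensor class and fuse() helper are invented for the example and are not part of the disclosure.

    import numpy as np

    def fuse(frame, sample):
        """Stand-in for the fusion step: attach the other-sensor value to the frame
        as an extra per-pixel channel (the colour/luminosity encoding is sketched later)."""
        extra = np.full(frame.shape[:2], sample.get("range_m", 0.0), dtype=np.float32)
        return np.dstack([frame.astype(np.float32), extra])

    def combine_sensed_data(video_feed, other_sensor):
        """Yield an augmented video feed from one video feed and one other sensor."""
        for frame in video_feed:                  # obtain at least one video data feed
            sample = other_sensor.read()          # obtain at least one other data input
            augmented = fuse(frame, sample)       # incorporate the other data into the frame
            yield augmented                       # output the augmented video feed

    class DummySensor:
        """Invented placeholder for any non-video sensor."""
        def read(self):
            return {"range_m": 12.5}

    dummy_feed = (np.zeros((480, 640, 3), dtype=np.uint8) for _ in range(3))
    for augmented in combine_sensed_data(dummy_feed, DummySensor()):
        print(augmented.shape)                    # (480, 640, 4): RGB plus a fused data channel

A real implementation would of course operate on live camera and sensor interfaces rather than the dummy sources used here.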
The method has several notable advantages over current systems. The fusion of all of the sensor data onto the video feed to create an augmented video feed means that the output data format is a well-known format that can be manipulated, interrogated and compressed in line with standard video data techniques. These techniques are constantly evolving and the augmented video feed will be able to benefit from this evolution.
The provision of an output in the form of an augmented video data stream makes the output directly addressable by the driver of the vehicle in the form of driver assist functionality without necessarily requiring further processing of the data. Furthermore, because the output format is immediately addressable by a human, it enables the output data stream to be interfaced with a neural network for the purpose of object recognition. Neural networks are trained from a human-taught training data set, and the provision of the output in the form of a video data feed interfaces very well with a neural network. In contrast, a human user cannot easily review a training data set made up of point cloud data and therefore a point cloud based approach cannot easily benefit from training by a human operator.
Furthermore, by removing the sensor data from its initial context and applying it to the video data stream, the sensor data is removed from the sensor identity itself. Any processing provided at the sensor level will therefore be incorporated prior to the inclusion of the data in the video data stream. As a result, differences arising from the sensor itself should be expunged. Such differences might arise, for example, if the sensor has been replaced by a different proprietary sensor, which undertakes different processing of the data prior to outputting its data. This makes the method substantially agnostic as to the sensor type, thus increasing the robustness of the system.
The step of obtaining at least one video data feed may include the allocation of a system time to that video data feed; and the step of obtaining at least one other data input may include the allocation of a system time to the other data. These system time stamps on each thread of input data enable the step of processing the or each video data feed to include the synchronisation of the system time provided to each video data feed and other data feed.
The ingested video, sensor and vehicle data is time stamped using system time and these time stamps, together with location stamps created from GPS position data and Inertial Measurement Unit (IMU) trajectory data enable the data from different sensors to be associated with the correspondingly timed and located video data to augment the video data stream with the data from the non-video sensors.
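A minimal sketch of the time stamping and synchronisation described above is given below; it assumes samples are held as simple dictionaries and pairs each video frame with the other-sensor sample nearest in system time. The helper names are illustrative, and location stamping from GPS and IMU data is omitted for brevity.

    import time
    from bisect import bisect_left

    def stamp(sample):
        """Allocate a system time to an ingested sample (video frame or other data)."""
        return {"t": time.monotonic(), "data": sample}

    def nearest(stamped, t):
        """Return the sample whose system time is closest to t (stamped is sorted by time)."""
        times = [s["t"] for s in stamped]
        i = bisect_left(times, t)
        window = stamped[max(i - 1, 0):i + 1]
        return min(window, key=lambda s: abs(s["t"] - t))

    # Pairing each stamped video frame with the closest-in-time sample from another sensor:
    # synchronised = [(f, nearest(lidar_samples, f["t"])) for f in video_frames]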
The step of processing the video data may include calibrating other data to correct for alignment errors in the other data. Where there is an overlap in field of view between two sources of data, whether that is an overlap between the field of view of two of the video cameras or whether that is an overlap between the field of view of a ranging sensor such as a LIDAR or RADAR sensor and the field of view of one or more of the video cameras, these sensors will obtain data from the same objects. Calibration of the location and appearance of an identified object within the overlapping part of the field of view will enable corrections to the data to bring alignment between the sources.
For example, if one of the video cameras had moved from its previous position such that the combination of two video data feeds was effectively creating "double vision" within the overlap in their fields of view, this duplication could be used to correct the input from one of the cameras across its entire field of view. This would enable the data within the overlapping field of view to be aligned such that duplication was removed and, in the non-overlapping part of the field of view, the same correction could be applied in order to improve the accuracy of the data.
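As a non-limiting sketch of this kind of correction, the snippet below estimates a horizontal pixel offset from the overlapping strips of two camera images by one-dimensional cross-correlation and then applies the same shift across the whole frame. It assumes a purely translational, column-wise misalignment between rectified greyscale images, which is a simplification of the calibration described above; the function names are invented for the example.

    import numpy as np

    def estimate_shift(strip_a, strip_b):
        """Estimate the column offset between two greyscale strips that should depict the
        same overlapping field of view, using 1-D cross-correlation of their column means."""
        a = strip_a.mean(axis=0) - strip_a.mean()
        b = strip_b.mean(axis=0) - strip_b.mean()
        corr = np.correlate(a, b, mode="full")
        return int(corr.argmax()) - (len(b) - 1)   # > 0: strip_a appears shifted to the right

    def apply_correction(frame, shift):
        """Apply the same column shift across the camera's entire field of view.
        np.roll wraps the edge columns; a real system would crop or pad instead."""
        return np.roll(frame, -shift, axis=1)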
The step of processing the video data may include identifying inconsistencies between video data and other data and wherein a corresponding flag may be applied to the augmented video feed to identify these inconsistencies.
The application of flags to inconsistent data may be used in scenarios where it is not immediately apparent which source of data is correct and should therefore have other data sources corrected into alignment. The provision of flags to potentially inconsistent data will also help identify a faulty sensor or a circumstance in which the data from one sensor should be given primacy over the input from another data source. For example, if it is foggy, the video data stream may be compromised in comparison with the RADAR data stream and therefore the RADAR data should be given primacy where there is inconsistency between the two.
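A minimal sketch of such flagging and primacy logic is shown below, assuming video-derived and RADAR-derived range estimates for the same object and a simple fog indicator; the tolerance value and the structure of the flag are arbitrary illustrative choices.

    def check_consistency(video_range_m, radar_range_m, tolerance_m=2.0, foggy=False):
        """Return (chosen_range, flag). The flag would be carried in the augmented feed
        so that the inconsistency can be reviewed retrospectively."""
        if abs(video_range_m - radar_range_m) <= tolerance_m:
            return (video_range_m + radar_range_m) / 2.0, None
        # In fog the video stream may be compromised, so the RADAR data is given primacy.
        chosen = radar_range_m if foggy else video_range_m
        flag = {"type": "range_inconsistency",
                "video_m": video_range_m, "radar_m": radar_range_m}
        return chosen, flag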
The flags may be applied to the augmented data stream and therefore included therein.
This enables their presence to be identified retrospectively when reviewing the data. This may be a useful indicator for event reconstruction and forensic review of the data.
The processing step may include the fusion of the other data onto a corresponding frame of the or each video data feed. This may result in a single augmented video data stream or a plurality of augmented video data streams corresponding to the number of video cameras provided on the vehicle. For example, if six video cameras are provided to give 360° coverage of the vehicle, this may be combined into a single panoramic augmented stream; or four streams, with two cameras contributing to each of the front and rear views and one camera providing each side view; or six separate streams may be provided, one from each camera.
The fusion of the data, rather than the mere superposition of the data, enables the augmented data stream to be manipulated, compressed, interrogated and saved in line with standard techniques for dealing with video data. Therefore, by fusing ranging data from LiDAR and RADAR sensors into a video data stream, the fusion processor can leverage existing lossless compression methods to reduce the size of the output data stream significantly below the size of the raw 3D point cloud produced by LiDAR sensors.
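The size advantage can be illustrated with a toy comparison, sketched below, in which the standard-library zlib codec stands in for a production lossless video compressor: a fused frame containing smooth image and range channels compresses far more readily than an equivalent volume of raw, noisy point-cloud samples. The array shapes and values are invented purely for illustration.

    import zlib
    import numpy as np

    rng = np.random.default_rng(0)

    # A fused frame: three colour channels plus a range channel, held as uint16 and
    # dominated by smooth regions (here simply a constant patch).
    fused_frame = np.zeros((480, 640, 4), dtype=np.uint16)
    fused_frame[..., 3] = 5000

    # A raw point cloud for comparison: 100,000 points of float32 x, y, z, intensity noise.
    point_cloud = rng.standard_normal((100_000, 4)).astype(np.float32)

    print(len(zlib.compress(fused_frame.tobytes())))   # smooth fused frame shrinks dramatically
    print(len(zlib.compress(point_cloud.tobytes())))   # noisy point cloud barely compresses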
The vehicle may be at least partially autonomous and the augmented video feed may be provided as an input to a vehicle controller. Alternatively or additionally, the augmented video feed may be provided to the driver to aid the driver to understand the vehicle surroundings.
The method may comprise the additional step of performing object recognition and tracking on the augmented video feed using a neural network. By retaining the video feed as the base for the output stream of augmented video data, the fusion processor enables optimal convolutional neural network performance because humans can easily interact with the video data in order to create training sets for the neural network. This is a considerable improvement on a point cloud based approach, which cannot be so readily interrogated by a human user in order to create training data sets.
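Purely as an illustration of how an augmented feed could be presented to a convolutional neural network, the sketch below builds a toy four-channel (RGB plus fused range) classifier in PyTorch; the architecture, class labels and framework choice are assumptions for the example and are not taken from the disclosure.

    import torch
    import torch.nn as nn

    # Toy classifier over 4-channel augmented frames (RGB plus the fused range channel).
    model = nn.Sequential(
        nn.Conv2d(4, 16, kernel_size=3, padding=1), nn.ReLU(),
        nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(32, 5),        # e.g. car / pedestrian / cyclist / sign / background
    )

    batch = torch.randn(2, 4, 240, 320)   # two augmented frames
    print(model(batch).shape)             # torch.Size([2, 5])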
The other data may be displayed in the form of an overlay on the video data feed. This could be a heat map, artificial pixel coloration and/or augmentation annotation overlays to indicate values of the other data. This enables the reviewer or driver, depending on the context, to digest the additional data without losing sight of the video data itself.
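A minimal sketch of such an overlay is given below, blending an artificial red-channel "heat" rendering of the other data over the video frame so the underlying image remains visible; the colour mapping and blend factor are arbitrary illustrative choices.

    import numpy as np

    def heat_overlay(frame_rgb, values, alpha=0.4):
        """Blend a red-channel 'heat' rendering of per-pixel values over the video frame.
        values is an HxW array of the other data (e.g. range); alpha keeps the image visible."""
        v = (values - values.min()) / (np.ptp(values) + 1e-9)     # normalise to [0, 1]
        heat = np.zeros_like(frame_rgb, dtype=np.float32)
        heat[..., 0] = 255.0 * v                                  # red channel carries the data
        blended = (1 - alpha) * frame_rgb.astype(np.float32) + alpha * heat
        return blended.astype(np.uint8)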
The method may further comprise the step of saving at least part of the augmented video feed in a memory. The memory may act as a cyclic buffer so that the last 15 minutes of data are retained at all times. The time period of data stored can be selected depending on the complexity of the system and the size of the memory, so it will be understood that 15 minutes is merely exemplary and not limiting. Alternatively or additionally, data surrounding certain trigger events or circumstances may be retained. For example, if a flag indicating a data inconsistency were to be raised, then the data for a few minutes on either side of the flag being raised may be retained. If the flag is short lived, then the data corresponding to the entire time period for which the flag was in force may be stored. Alternatively, if the flag remained once raised, then the data most relevant to fault finding would be the data immediately preceding the raising of the flag. Therefore, once the flag has remained beyond a predetermined threshold time, the later data may be discarded and the data leading up to the raising of the flag retained.
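The buffering and trigger-based retention described above might be sketched as follows; the frame rate, buffer length and retention window are placeholders rather than values taken from the disclosure.

    from collections import deque

    class AugmentedVideoBuffer:
        """Cyclic buffer of augmented frames with simple trigger-based retention."""
        def __init__(self, minutes=15, fps=30):
            self.fps = fps
            self.frames = deque(maxlen=minutes * 60 * fps)   # oldest frames drop off automatically
            self.retained = []                               # clips kept around trigger events

        def push(self, frame, flag=None):
            self.frames.append(frame)
            if flag is not None:
                # Retain the data leading up to the raising of the flag (last two minutes here).
                self.retained.append(list(self.frames)[-2 * 60 * self.fps:])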
The step of saving at least part of the augmented video feed may include the compression of the data. The augmented video can be subjected to lossless compression, including the embedded data track which includes the synchronised sensor and vehicle data.
Furthermore, according to the present invention there is provided a perception module for use in a vehicle, the module comprising: at least one video camera mounted on the vehicle configured to create at least one video data stream; and at least one other sensor configured to provide data, wherein there is an overlap in a field of view of the sensor with a field of view of at least one of the video cameras; the perception module further comprises a control system configured to fuse the output from the sensor into the video data stream of the video camera with an overlapping field of view to create an augmented video stream.
The control system is configured to fuse the data output from the sensor or sensors into the video stream or video streams with which there is an overlap in the field of view. The resultant augmented video stream or streams are therefore entirely independent of the sensor specification and/or manufacturer. Therefore, the replacement of a sensor should not interfere with the integrity of the data or, if it does appear to interfere, then this inconsistency can be identified and potentially dealt with to the extent to which the error in the output is understood to be systemic, rather than random. This results in a system that is more robust and is agnostic as to the specification and source or manufacturer of the sensors deployed within it.
It should be understood that the fusion of data is more than the mere superposition of data, and that the fused data stream has a consistent system time and includes the data from all available sensors. The fused data may be graphically represented in the style of a heat map over the video data, but it should be understood that this simplified user illustration is not representative of the computational activity that has created a truly fused data set.
The provision of an augmented video data stream as the output from the control system enables the resulting stream to be dealt with as any other video stream including compression for storage via standard techniques typical to video data manipulation.
The perception module may further comprise a memory configured to store the augmented video stream. The memory may be co-located with the control system or it may form part of the vehicle. Alternatively or additionally, the system may include telecommunications capability to enable the data to be exported from the vehicle and stored remotely. This is particularly pertinent when the vehicle is being operated as part of a fleet of vehicles and the fleet operator wishes to review data from across the fleet.
The control system may be configured to select which augmented video stream data is stored in the memory. The selection may comprise a rolling buffer covering the previous 5, 10, 15, 30 or 60 minutes. The exact time covered by the buffered stored data will depend on the size of the memory and the extent of the data compression that has been effected thereon. Additionally or alternatively, the data around any flagged data inconsistency may be stored for further review at a later date.
The perception module may comprise two or more video cameras and the control system may be configured to fuse the outputs of the video cameras to create a single video data stream. This single video data stream could potentially show up to a full 360° panorama around the vehicle. Once the video data has been fused and manipulated to ensure internal consistency, then data from other sensors may be fused into the single video data stream to provide a single augmented video data stream.
Alternatively, the module may comprise two or more video cameras and the control system may be configured to fuse the output from the sensor into each video data stream with an overlapping field of view to create an augmented video stream corresponding to each video camera. This enables an augmented video data stream to be prepared pertaining to one aspect of the vehicle. For example, there may be an augmented video data stream of the forward view from the vehicle, and then a separate augmented video data stream of the view from the rear of the vehicle. Further augmented video data streams of the sides of the vehicle may be separately provided.
The non-video sensors may be selected from a group including speed sensors, displacement sensors, inertial measurement sensors, pressure sensors, GPS coordinates, GPS time data, vehicle control signals, LiDAR sensors, radar sensors.
The invention will now be further and more particularly described, by way of example only, and with reference to the accompanying drawings, in which: Figure 1 shows a plan view of a vehicle with video and ranging sensors; Figures 2A to 2C illustrate the data obtained from three different sensors observing the same scene; Figure 3 illustrates the encoding of ranging data onto a frame of video data to create an augmented frame of video data; and Figure 4 shows schematically the constituent parts of the system.
Figure 1 shows a vehicle 20 provided with a video camera 26₁ and two ranging sensors 24₁, 24₂. The video camera 26₁ has a 60° horizontal field of view.
Figures 2 and 3 show the steps taken in the fusion of video and distance ranging data. The captured data is illustrated in Figures 2A to 2C.
In Figure 2A data 46 captured by the video camera 26₁ is displayed and it consists of a 2D grid of X₁ by Y₁ pixels. A small number of these pixels are magnified in this illustration for clarity. These are the top left extreme of the display grid and they include the first pixel a₁, b₁. In Figure 2B data 44 captured by the LiDAR sensor 24₁ is displayed. The LiDAR sensor 24₁ has a 30° field of view and the data is divided into pixels. Just one is shown in Figure 2B, the top left pixel a₂, b₂. In Figure 2C data 45 captured by the RADAR sensor 24₂ is displayed. The RADAR sensor 24₂ has an 8° field of view and the data is again divided into pixels. Again, for clarity just one is shown, a₃, b₃. This data is all captured at a distance of 300 feet.
The captured data is calibrated for field of view and angular resolution for and between each of the sensors. This enables a set of distance-based "pixel-to-point" mappings to be defined. In this example these mappings take the following form: at a distance of X m, video pixel co-ordinate (a₁, b₁) corresponds to LiDAR point (a₂, b₂).
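Purely by way of illustration, a sketch of one such mapping along a single horizontal axis is given below. It assumes co-located, boresight-aligned sensors and therefore omits the distance dependence that arises from offset mounting positions; the function name and parameters are invented for the example.

    def pixel_to_point(col_video, width_video, fov_video_deg, width_lidar, fov_lidar_deg):
        """Map a video pixel column to the LiDAR column viewing the same bearing,
        assuming both sensors share an optical centre and boresight."""
        # Bearing of the video column relative to the boresight, in degrees.
        bearing = (col_video / (width_video - 1) - 0.5) * fov_video_deg
        if abs(bearing) > fov_lidar_deg / 2:
            return None                            # outside the LiDAR field of view
        # Corresponding LiDAR column at the LiDAR's own angular resolution.
        return round((bearing / fov_lidar_deg + 0.5) * (width_lidar - 1))

    # e.g. a 60 degree camera of 640 columns against a 30 degree LiDAR of 320 columns:
    print(pixel_to_point(320, 640, 60.0, 320, 30.0))   # the centre columns map onto one another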
Figure 3 shows a subsequent step of encoding the ranging data, in this example LiDAR data 44, into colour or luminosity values of the video data 46 to provide a distance for each pixel, thus creating an augmented video data stream 48 or 3D video.
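A minimal sketch of this encoding step is shown below, assuming the range for each pixel is carried in an additional 16-bit channel appended to the RGB frame; the maximum range and scale factor are arbitrary choices for the illustration rather than values from the specification.

    import numpy as np

    MAX_RANGE_M = 200.0   # illustrative full-scale range

    def encode_range(frame_rgb, range_m):
        """Append a 16-bit channel whose value encodes the distance for each pixel.
        range_m is an HxW array of metres; pixels with no return are encoded as 0."""
        scale = 65535.0 / MAX_RANGE_M
        depth = np.clip(np.nan_to_num(range_m, nan=0.0), 0.0, MAX_RANGE_M) * scale
        depth = depth.astype(np.uint16)[..., None]
        return np.concatenate([frame_rgb.astype(np.uint16), depth], axis=-1)

    def decode_range(augmented_frame):
        """Recover metres from the encoded channel of an augmented frame."""
        return augmented_frame[..., 3].astype(np.float32) * (MAX_RANGE_M / 65535.0)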
If, during the calibration and encoding steps shown schematically in Figures 2 and 3, an inconsistency between data streams is identified, the fusion processor 10 cross references the ingested data to provide an indication of individual sensor performance. This can enable fault detection where one sensor is consistently providing data that is incompatible with the consensus of other data being provided from video and other sensors. Where the error appears to be systemic and relate only to the field of view of the sensor, then a systemic correction can be applied to the totality of the data from that sensor and the data may continue to be utilised by the fusion processor 10. This may occur, for example, where the position of one of the sensors has changed and it no longer provides data of the field of view that it was previously expected to provide. The fusion processor 10 therefore recalibrates the data received from this sensor to correct it in line with the consensus data gleaned from all of the other sensors.
If an inconsistency between data streams is identified that indicates that a sensor is faulty, the fusion processor 10 can exclude data from this sensor to prevent the augmented video data stream from being distorted by data from a faulty sensor. The fusion processor 10 may be configured to store this information in the memory 12 for interrogation at a later date, such as a vehicle service, or this information may be communicated via the wireless communications device 32 so that the remote monitoring of the vehicle 20 can take into account the effect of the exclusion of data from the faulty sensor.
Figure 4 shows a schematic diagram of the key constituent parts of the system. Various sensors are provided on the vehicle 20. These include environmental sensors 22, ranging sensors 24, positional sensors 28, and vehicle dynamics and control signal data 30. The positional sensors 28 can include GPS sensors, inertial measurement sensors or a combination of these and other positional sensors. The ranging sensors 24 can include LiDAR or RADAR sensors, or both LiDAR and RADAR sensors may be provided. Data from all of these sensors is fed into a fusion processor 10. These data must be synchronised using a high resolution (1 kHz) system time anchored to GPS time from a GNSS receiver in the fusion processor 10. These data are time stamped using system time and location stamped using GPS position and Inertial Measurement Unit (IMU) trajectory data.
The data rates of the sensors will not all be the same and therefore the sensor data is managed to interpolate or extrapolate as appropriate to provide corresponding non-video data from each sensor for each frame of the video data. Non-video data here includes data from all of the other sensors, which detect ranging data, environment data and so on. The fusion processor 10 undertakes the fusion of the data into an augmented video data feed by undertaking the steps shown schematically in Figures 2 and 3, namely the calibration of the field of view and angular resolution of each data stream and then the encoding of the non-video data from the other sensors into the colour/luminosity values of the video to provide a distance for each pixel.
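As a brief illustration of that interpolation step, the sketch below resamples a slower ranging sensor onto 30 Hz video frame times using numpy.interp, which also holds the end values where extrapolation would otherwise be needed; the rates and the closing-range signal are invented for the example.

    import numpy as np

    # Video frames at 30 Hz and a ranging sensor at 10 Hz (times in seconds of system time).
    frame_times = np.arange(0.0, 1.0, 1 / 30)
    sensor_times = np.arange(0.0, 1.0, 1 / 10)
    sensor_range_m = 50.0 - 20.0 * sensor_times          # a target closing at 20 m/s

    # One range value per video frame; end values are held where extrapolation would be needed.
    range_per_frame = np.interp(frame_times, sensor_times, sensor_range_m)
    print(range_per_frame.shape)                         # (30,)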
The augmented video stream is output from the fusion processor 10 in a lossless compression format with an embedded data track which includes the synchronised sensor and vehicle data. This augmented video stream can be provided to a driver of the vehicle 20 if the vehicle is only partially autonomous. Additionally or alternatively, the data can be stored in a memory 12 for interrogation at a later date. The extent of the data which is stored within the memory 12 can be managed to include a rolling update of the most recent 1, 2, 5, 10, 15, 30 or 60 minutes of the most recent journey. Additionally, or alternatively, the memory 12 can be managed to retain 1, 2, 5 or 15 minutes of data before and/or after a number of events. Such events may include the raising of a flag indicative of inconsistencies between data sources. Additionally or alternatively, the data can be shared, via a wireless communications device 32, so that the activity and experience of the vehicle 20 can be monitored remotely.
It will further be appreciated by those skilled in the art that although the invention has been described by way of example with reference to several embodiments it is not limited to the disclosed embodiments and that alternative embodiments could be constructed without departing from the scope of the invention as defined in the appended claims.

Claims (17)

  1. A method of combining sensed data in a vehicle, the method comprising the steps of: obtaining at least one video data feed from one or more video cameras mounted on the vehicle; obtaining at least one other data input from a corresponding sensor; processing the or each video data feed so that the other data is incorporated into the video data feed to produce at least one augmented video feed; and outputting the or each augmented video feed for use in the vehicle.
  2. The method according to claim 1, wherein the step of obtaining at least one video data feed includes the allocation of a system time to that video data feed; and wherein the step of obtaining at least one other data input includes the allocation of a system time to the other data; and wherein the step of processing the or each video data feed includes the synchronisation of the system time provided to each video data feed and other data feed.
  3. The method according to claim 1, wherein the step of processing the video data includes calibrating other data to correct for alignment errors in the other data.
  4. The method according to claim 1, wherein the step of processing the video data includes identifying inconsistencies between video data and other data and wherein a corresponding flag is applied to the augmented video feed to identify these inconsistencies.
  5. The method according to any one of claims 1 to 4, wherein the processing step includes the fusion of the other data onto a corresponding frame of the or each video data feed.
  6. The method according to any one of claims 1 to 5, wherein the vehicle is at least partially autonomous and the augmented video feed is provided as an input to a vehicle controller.
  7. The method according to claim 6, further comprising the step of performing object recognition and tracking on the augmented video feed using a neural network.
  8. The method according to any one of claims 1 to 7, wherein the augmented video feed is provided to the driver.
  9. The method according to claim 8, wherein the other data is displayed in the form of a heat map overlay on the video data feed.
  10. The method according to any one of claims 1 to 9, wherein the method further comprises the step of saving at least part of the augmented video feed in a memory.
  11. The method according to claim 10, wherein the step of saving at least part of the augmented video feed includes compression of the data.
  12. A perception module for use in a vehicle, the module comprising: at least one video camera mounted on the vehicle configured to create at least one video data stream; at least one other sensor configured to provide data, wherein there is an overlap in a field of view of the sensor with a field of view of at least one of the video cameras; wherein the perception module further comprises a control system configured to fuse the output from the sensor into the video data stream of the video camera with an overlapping field of view to create an augmented video stream.
  13. The perception module according to claim 12, further comprising a memory configured to store the augmented video stream.
  14. The perception module according to claim 13, wherein the control system is configured to select which augmented video stream data is stored in the memory.
  15. The perception module according to claim 12, wherein the module comprises two or more video cameras and wherein the control system is configured to fuse the outputs of the video cameras to create a single video data stream.
  16. The perception module according to claim 12, wherein the module comprises two or more video cameras and wherein the control system is configured to fuse the output from the sensor into each video data stream with an overlapping field of view to create an augmented video stream corresponding to each video camera.
  17. The perception module according to any one of claims 12 to 16, wherein the at least one other sensor is selected from a group including speed sensors, displacement sensors, inertial measurement sensors, pressure sensors, GPS co-ordinates, GPS time data, vehicle control signals, LiDAR sensors, radar sensors.
GB1818841.7A 2018-11-19 2018-11-19 Improvements in or relating to perception modules Withdrawn GB2579080A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
GB1818841.7A GB2579080A (en) 2018-11-19 2018-11-19 Improvements in or relating to perception modules

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB1818841.7A GB2579080A (en) 2018-11-19 2018-11-19 Improvements in or relating to perception modules

Publications (2)

Publication Number Publication Date
GB201818841D0 GB201818841D0 (en) 2019-01-02
GB2579080A true GB2579080A (en) 2020-06-10

Family

ID=64740149

Family Applications (1)

Application Number Title Priority Date Filing Date
GB1818841.7A Withdrawn GB2579080A (en) 2018-11-19 2018-11-19 Improvements in or relating to perception modules

Country Status (1)

Country Link
GB (1) GB2579080A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110153198A1 (en) * 2009-12-21 2011-06-23 Navisus LLC Method for the display of navigation instructions using an augmented-reality concept
US20140267723A1 (en) * 2013-01-30 2014-09-18 Insitu, Inc. Augmented video system providing enhanced situational awareness
US20160035391A1 (en) * 2013-08-14 2016-02-04 Digital Ally, Inc. Forensic video recording with presence detection
US20170039765A1 (en) * 2014-05-05 2017-02-09 Avigilon Fortress Corporation System and method for real-time overlay of map features onto a video feed
WO2017020132A1 (en) * 2015-08-04 2017-02-09 Yasrebi Seyed-Nima Augmented reality in vehicle platforms
CN107911607A (en) * 2017-11-29 2018-04-13 天津聚飞创新科技有限公司 Video smoothing method, apparatus, unmanned plane and storage medium

Also Published As

Publication number Publication date
GB201818841D0 (en) 2019-01-02

Similar Documents

Publication Publication Date Title
US11379173B2 (en) Method of maintaining accuracy in a 3D image formation system
US8180107B2 (en) Active coordinated tracking for multi-camera systems
CN109644264B (en) Array detector for depth mapping
US20140285523A1 (en) Method for Integrating Virtual Object into Vehicle Displays
US20190206115A1 (en) Image processing device and method
US20170372444A1 (en) Image processing device, image processing method, program, and system
EP2725548A2 (en) Image processing apparatus and method
US11016560B1 (en) Video timewarp for mixed reality and cloud rendering applications
US20100239122A1 (en) Method for creating and/or updating textures of background object models, video monitoring system for carrying out the method, and computer program
US20090079830A1 (en) Robust framework for enhancing navigation, surveillance, tele-presence and interactivity
US20130155190A1 (en) Driving assistance device and method
US9170339B2 (en) Radiation measurement apparatus
CN109525790B (en) Video file generation method and system, and playing method and device
KR20080100984A (en) Three-dimensional picture display method and apparatus
WO2019163558A1 (en) Image processing device, image processing method, and program
CN110717994A (en) Method for realizing remote video interaction and related equipment
CN111210386A (en) Image shooting and splicing method and system
KR102234376B1 (en) Camera system, calibration device and calibration method
JP2006060425A (en) Image generating method and apparatus thereof
CN114549595A (en) Data processing method and device, electronic equipment and storage medium
GB2579080A (en) Improvements in or relating to perception modules
US11222481B2 (en) Visualization apparatus and program
KR20180024756A (en) Traffic accident analyzing system using multi view blackbox image data
CN106067942B (en) Image processing apparatus, image processing method and mobile unit
WO2015185537A1 (en) Method and device for reconstruction the face of a user wearing a head mounted display

Legal Events

Date Code Title Description
WAP Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1)