CN110154896B - Method and equipment for detecting obstacle - Google Patents

Method and equipment for detecting obstacle

Info

Publication number
CN110154896B
CN110154896B (application CN201810218085.9A)
Authority
CN
China
Prior art keywords
processor
image
obstacles
images
sequence
Prior art date
Legal status
Active
Application number
CN201810218085.9A
Other languages
Chinese (zh)
Other versions
CN110154896A (en)
Inventor
余贵珍
胡超伟
王云鹏
苏鸿杰
雷傲
Current Assignee
Beijing Tage Idriver Technology Co Ltd
Original Assignee
Beihang University
Priority date
Filing date
Publication date
Application filed by Beihang University
Priority to CN201810218085.9A
Publication of CN110154896A
Application granted
Publication of CN110154896B
Status: Active
Anticipated expiration

Classifications

    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B60: VEHICLES IN GENERAL
    • B60R: VEHICLES, VEHICLE FITTINGS, OR VEHICLE PARTS, NOT OTHERWISE PROVIDED FOR
    • B60R1/00: Optical viewing arrangements; real-time viewing arrangements for drivers or passengers using optical image capturing systems, e.g. cameras or video systems specially adapted for use in or on vehicles
    • B60R2300/00: Details of viewing arrangements using cameras and displays, specially adapted for use in a vehicle
    • B60R2300/30: Details of viewing arrangements characterised by the type of image processing
    • B60R2300/80: Details of viewing arrangements characterised by the intended use of the viewing arrangement
    • B60R2300/8093: Details of viewing arrangements for obstacle warning

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Mechanical Engineering (AREA)
  • Image Analysis (AREA)
  • Traffic Control Systems (AREA)

Abstract

The invention discloses a method and a device for detecting obstacles. The method comprises the following steps: receiving, by a first processor, a sequence of images from an image acquisition device; sending, by the first processor, a first image in the sequence of images to a second processor, and invoking a child thread run by the second processor to detect a plurality of obstacles in the first image; receiving, by the first processor, the detection result from the second processor; initializing, by the first processor, a plurality of trackers for tracking at least a portion of the plurality of obstacles in the sequence of images according to the detection result; and tracking, by the first processor, the at least a portion of the obstacles in the sequence of images with the plurality of trackers.

Description

Method and equipment for detecting obstacle
Technical Field
The present invention relates to the field of image recognition for vehicle driving, and more particularly, to a method and apparatus for detecting an obstacle.
Background
Driver assistance systems help drivers on expressways and in urban environments, and are of great significance for relieving traffic congestion and for safe driving. Obstacle collision warning is a core function of such systems. At present, researchers at home and abroad have proposed many algorithms for detecting obstacles in front of a vehicle; commonly used sensors include the monocular camera, the binocular camera, lidar, and millimeter-wave radar. Detection schemes based on a monocular camera are technologically mature but strongly affected by the environment; schemes based on a binocular camera offer high ranging precision but carry a large computational load and are also strongly affected by the environment; lidar-based schemes offer long detection range and good robustness but are costly and cannot capture environmental details; millimeter-wave-radar-based schemes offer good real-time performance and a small computational load but return little data and have low precision.
The obstacle detection method based on the monocular camera is mainly divided into two types:
the first category comprises traditional computer vision methods, which generally proceed in three stages. First, candidate regions are selected on a given image. Then, features are extracted from those regions; common features at this stage include SIFT and HOG, and, for vehicle detection, the shadow under the vehicle, the vehicle's transverse and longitudinal edges, texture information, and the like. Finally, a trained classifier performs classification; the mainstream classification algorithms include SVM and Adaboost. Traditional target detection has two main problems: region selection based on a sliding window is untargeted, with high time complexity and redundant windows; and manually selected features are not robust to diverse variations. Deep-learning-based target detection algorithms address both problems well.
The second category comprises obstacle detection methods based on deep learning. A deep-learning algorithm recognizes a target by imitating human-brain learning with a deep neural network: the features of the target are passed layer by layer from low to high, becoming more abstract at higher levels, and the output result is the most accurate feature expression of the target. At present, image target detection methods based on deep learning divide mainly into those based on the classification idea and those based on the regression idea; the latter offer better detection real-time performance than the former, at slightly reduced precision.
For the technical problems of high time complexity, window redundancy, and poor robustness in existing monocular-camera obstacle detection methods, and of poor real-time performance in deep-learning detection methods, no effective solution has yet been proposed.
Disclosure of Invention
The embodiments of the invention provide a method and equipment for detecting obstacles, which at least solve the technical problems that detection schemes based on a traditional monocular or binocular camera are strongly affected by the environment; that lidar-based schemes are costly and cannot capture environmental details; and that millimeter-wave-radar-based schemes return little data and have low precision.
According to an aspect of the embodiments of the present invention, there is provided a method of detecting obstacles, including: receiving, by a first processor, a sequence of images from an image acquisition device; sending, by the first processor, a first image in the sequence of images to a second processor, and invoking a child thread run by the second processor to detect a plurality of obstacles in the first image; receiving, by the first processor, the detection result from the second processor; initializing, by the first processor, a plurality of trackers for tracking at least a portion of the plurality of obstacles in the sequence of images according to the detection result; and tracking, by the first processor, the at least a portion of the obstacles in the sequence of images with the plurality of trackers.
According to another aspect of the embodiments of the present invention, there is also provided an apparatus for detecting obstacles, including: an image acquisition device; a first processor; and a second processor. The first processor runs a first program, which performs the following processing steps on a sequence of images output from the image acquisition device: receiving the sequence of images from the image acquisition device; sending a first image in the image sequence to the second processor, and calling a second program run by the second processor to detect a plurality of obstacles in the first image; receiving the detection result from the second processor; initializing, according to the detection result, a plurality of trackers respectively corresponding to at least a part of the plurality of obstacles; and tracking the at least a part of the obstacles in the sequence of images with the plurality of trackers. The second processor runs the second program, which performs the following processing steps: receiving the first image from the first processor; detecting the plurality of obstacles in the first image; and sending the detection result to the first processor.
In the embodiment of the invention, two parallel processors are utilized to solve the technical problems of high time complexity, window redundancy and poor robustness of the existing obstacle detection method based on a monocular camera and the technical problem of poor real-time performance of the detection method based on deep learning. Therefore, the obstacle detection can be completed with high real-time performance and high accuracy.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
fig. 1 is a schematic diagram of a vehicle-mounted terminal for executing the method according to the embodiment of the invention;
FIG. 2 shows a flow chart of a method of detecting an obstacle according to an embodiment of the invention;
FIG. 3 is a diagram illustrating the parallel operation of a first processor and a second processor according to an embodiment of the invention;
FIG. 4 shows a detailed flow diagram of a method according to an embodiment of the invention; and
fig. 5 shows a schematic view of an apparatus for detecting obstacles according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
First, some terms or terms appearing in the description of the embodiments of the present application are applicable to the following explanations:
SSD algorithm structure: SSD stands for Single Shot MultiBox Detector, a target detection algorithm.
There is also provided, in accordance with a first aspect of embodiments of the present invention, an embodiment of a method of detecting obstacles, it being noted that the steps illustrated in the flowchart of the drawings may be performed in a computer system such as a set of computer-executable instructions and that, although a logical order is illustrated in the flowchart, in some cases the steps illustrated or described may be performed in an order different than here.
The method provided by the first embodiment of the present application may be executed in a mobile terminal, a computer terminal, or a similar computing device. Fig. 1 shows a hardware block diagram of an in-vehicle terminal 10 for implementing the method of detecting an obstacle. As shown in fig. 1, the in-vehicle terminal 10 may include one or more processors 102 (where processor 102a may be a CPU and processor 102b a GPU; the processors 102 may further include, but are not limited to, processing devices such as a microprocessor (MCU) or a programmable logic device (FPGA)), a memory 104 for storing data, a transmission device 106 for communication, and an image capturing device 108 for acquiring image information (for example, in the present embodiment the image capturing device 108 may be a USB monocular camera with a fixed housing). The in-vehicle terminal 10 may further include: a display, an input/output interface (I/O interface), a Universal Serial Bus (USB) port (which may be one of the ports of the I/O interface), a network interface, a power source, and/or a camera. It will be understood by those skilled in the art that the structure shown in fig. 1 is only an illustration and does not limit the structure of the electronic device. For example, the in-vehicle terminal 10 may include more or fewer components than shown in fig. 1, or have a different configuration.
It should be noted that the one or more processors 102 and/or other data processing circuitry described above may be referred to generally herein as "data processing circuitry". The data processing circuitry may be embodied in whole or in part in software, hardware, firmware, or any combination thereof. Further, the data processing circuitry may be a single, stand-alone processing module, or incorporated in whole or in part into any of the other elements in the in-vehicle terminal 10. As referred to in the embodiments of the application, the data processing circuitry acts as a processor control (e.g., selection of a variable-resistance termination path connected to the interface).
The memory 104 may be used to store software programs and modules of application software, such as the program instructions/data storage devices corresponding to the obstacle detection method in the embodiment of the present invention; the processor 102 executes various functional applications and data processing by running the software programs and modules stored in the memory 104, thereby implementing the obstacle detection method described above. The memory 104 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, connected to the in-vehicle terminal 10 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used for receiving or transmitting data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of the in-vehicle terminal 10. In one example, the transmission device 106 includes a network interface controller (NIC) that can be connected to other network devices through a base station so as to communicate with the internet. In another example, the transmission device 106 can be a radio frequency (RF) module, which is used to communicate with the internet wirelessly.
The display may be, for example, a touch screen type Liquid Crystal Display (LCD) that may enable a user to interact with a user interface of the in-vehicle terminal 10.
Fig. 2 shows a flow chart of a method of detecting an obstacle according to the first aspect of the present embodiment. As shown in fig. 2, an embodiment of the present invention provides a method for detecting an obstacle, including:
S202: receiving, by a first processor, a sequence of images from an image acquisition device;
S204: sending, by the first processor, a first image in the sequence of images to a second processor, and invoking a child thread run by the second processor to detect a plurality of obstacles in the first image;
S206: receiving, by the first processor, the detection result from the second processor;
S208: initializing, by the first processor, a plurality of trackers for tracking at least a portion of the plurality of obstacles in the sequence of images according to the detection result; and
S210: tracking, by the first processor, the at least a portion of the obstacles in the sequence of images with the plurality of trackers.
Specifically, referring to fig. 1, according to the method of the present embodiment, the in-vehicle terminal 10 first receives an image sequence from the image capturing device 108 through the first processor 102a (i.e., the CPU); the in-vehicle terminal 10 then sends a first image in the image sequence to the second processor 102b (i.e., the GPU) through the first processor 102a, and invokes a sub-thread run by the second processor 102b to detect a plurality of obstacles in the first image; the in-vehicle terminal 10 then receives the detection result from the second processor 102b through the first processor 102a; the in-vehicle terminal 10 then initializes, according to the detection result, a plurality of trackers for tracking at least a part of the plurality of obstacles in the image sequence; finally, the in-vehicle terminal 10 tracks the at least a part of the obstacles in the image sequence with the plurality of trackers through the first processor 102a.
The invention provides an in-vehicle terminal 10 that detects obstacles through parallel operation, the in-vehicle terminal 10 comprising a first processor 102a and a second processor 102b. On this basis, the in-vehicle terminal 10 uses a thread on the second processor 102b to receive the image from the first processor 102a and detect the obstacles in the image, while the first processor 102a performs the tracking operation on the sequence of images acquired by the image acquisition device based on the detected obstacles, so that the processors 102a and 102b operate in parallel.
For video detection, there is a clear temporal relation between targets in adjacent frames; in addition to the necessary target detection algorithm, a target tracking algorithm can be used to extract the location of a particular target within adjacent frames based on the detection results. Tracking therefore performs well in both precision and real time. By employing trackers to track the plurality of obstacles in the sequence of images, the first processor 102a thus greatly improves both accuracy and real-time performance. To run the tracking algorithm, however, the feature information of the obstacles must first be extracted by the target detection algorithm, so that tracking can proceed through the image sequence.
In addition, image target detection methods based on deep learning (especially those based on the classification idea) require long computation times and therefore have poor real-time performance. The in-vehicle terminal 10 of the present embodiment, however, performs obstacle detection on the GPU 102b in parallel with the CPU 102a, so that the operation of obstacle detection runs in parallel with the operation of obstacle tracking.
Thus, the first processor 102a may continuously transmit images to the second processor 102b during obstacle tracking, so that obstacle detection on the second processor 102b proceeds simultaneously with obstacle tracking on the first processor 102a. The second processor 102b sends each newly detected obstacle to the first processor 102a, which uses it as a template to track that obstacle in the sequence of images captured by the camera.
Therefore, in the above manner, the two parallel processors 102a and 102b solve the technical problems of high time complexity, window redundancy, and poor robustness in existing monocular-camera obstacle detection methods, and of poor real-time performance in deep-learning detection methods. Obstacle detection can thus be completed with high real-time performance and high accuracy.
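The CPU/GPU split described above can be sketched in a few lines. This is a minimal, hypothetical illustration rather than the patent's implementation: `detect_on_gpu` stands in for the CNN detector running in the detection sub-thread, `Tracker` for the correlation-filter tracker, and a one-slot queue hands the newest frame to the detector while the main loop keeps tracking.

```python
# Sketch of the parallel detect/track scheme: a detection worker (the patent's
# GPU sub-thread) and a main loop (the CPU) that tracks on every frame.
import threading
import queue

def detect_on_gpu(frame):
    # Hypothetical stand-in for the deep-learning detector;
    # returns (label, bbox) pairs.
    return [("car", (10, 20, 40, 30))]

class Tracker:
    def __init__(self, frame, bbox):
        self.bbox = bbox
    def update(self, frame):
        return self.bbox  # a real tracker would re-locate the template

def detection_worker(frame_q, result_q):
    while True:
        frame = frame_q.get()
        if frame is None:           # sentinel: shut down
            break
        result_q.put(detect_on_gpu(frame))

def run(frames):
    frame_q, result_q = queue.Queue(maxsize=1), queue.Queue()
    worker = threading.Thread(target=detection_worker, args=(frame_q, result_q))
    worker.start()
    trackers, tracks = [], []
    for frame in frames:
        if frame_q.empty():
            frame_q.put(frame)       # hand the newest frame to the detector
        while not result_q.empty():  # re-initialize trackers on new detections
            trackers = [Tracker(frame, b) for _, b in result_q.get()]
        tracks.append([t.update(frame) for t in trackers])
    frame_q.put(None)
    worker.join()
    return tracks
```

The one-slot queue means the detector always works on a recent frame and the tracking loop never blocks on it, which is the essence of the parallel design.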
Thus, based on the vehicle-mounted embedded terminal and a monocular camera, the invention realizes real-time detection, tracking, and ranging of various forward obstacles on an ordinary vehicle, with high real-time performance and high robustness. The algorithm is multithreaded, runs the GPU and CPU in parallel, and follows a tracking-first, detection-assisted design to achieve target recognition with high real-time performance and high robustness.
Optionally, the operation of detecting the plurality of obstacles comprises: detecting the first image with a pre-trained convolutional neural network to obtain a plurality of targets; and selecting, as the plurality of obstacles, those targets among the plurality of targets that satisfy the following conditions: the probability of belonging to one category of obstacle is greater than a first predetermined value; and the distance to the image acquisition device is less than a second predetermined value.
Specifically, in the method of this embodiment, a sub-thread running on the second processor 102b cyclically invokes a target detection module to perform target detection on the picture. The target detection module includes two sub-modules, namely an obstacle detection module and a visual ranging module. The obstacle detection module employs a deep-learning target detection method based on a convolutional neural network, and the final detection result is a set M:

M = {O_1, O_2, O_3, ..., O_N}

where N represents the number of detected targets and O_k (k = 1, 2, 3, ..., N) represents the attribute set of the k-th detected target, as shown in the following equation:

O_k = {t_k, x_k, y_k, w_k, h_k, s_k, d_k}  (k = 1, 2, 3, ..., N)

where t_k is the target-type number (the eight classes pedestrian, bicycle, motorcycle, tricycle, car, van truck, open truck, and bus correspond to numbers 1 to 8); (x_k, y_k, w_k, h_k) give the position of the target in the image, namely the abscissa and ordinate of the upper-left corner of the rectangular region and the pixel width and height of that region; s_k is the probability that the target belongs to a certain class; and d_k is the actual relative distance between the target and the camera.

A probability threshold P and a danger distance D are set, and the elements O_k of M satisfying s_k > P and d_k < D are collected into a new set F:

F = {O_k | O_k ∈ M, s_k > P, d_k < D}

The set F is sorted by d_k in ascending order and used as the dangerous-target set for subsequent target tracking.
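The construction of the dangerous-target set F can be illustrated directly. The dictionary field names below mirror the attributes t_k, x_k, y_k, w_k, h_k, s_k, d_k; the sample detections are invented for illustration.

```python
# Build the dangerous-target set F from detections M: keep elements with
# class probability s > P and distance d < D, then sort by d ascending.
def dangerous_targets(M, P=0.5, D=50.0):
    F = [o for o in M if o["s"] > P and o["d"] < D]
    return sorted(F, key=lambda o: o["d"])

M = [
    {"t": 5, "x": 100, "y": 80, "w": 60, "h": 40, "s": 0.92, "d": 35.0},  # car
    {"t": 1, "x": 300, "y": 90, "w": 20, "h": 50, "s": 0.40, "d": 12.0},  # low confidence
    {"t": 8, "x": 200, "y": 60, "w": 90, "h": 70, "s": 0.88, "d": 18.5},  # bus
    {"t": 2, "x": 50,  "y": 95, "w": 15, "h": 30, "s": 0.75, "d": 80.0},  # beyond D
]
F = dangerous_targets(M, P=0.5, D=50.0)
# F keeps the bus (d = 18.5) and the car (d = 35.0), nearest first
```

Sorting nearest-first matches the patent's use of F: the first K elements become the K dangerous targets handed to the tracking sub-threads.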
Thus, through the above operations, the sub-thread running on the second processor 102b passes the features of the target layer by layer from low to high through the deep-learning obstacle detection method, so that the output result is the most accurate feature expression of the target, thereby achieving higher recognition accuracy.
Optionally, the present invention employs a convolutional neural network based on the SSD algorithm structure. Optionally, the front-end network of the SSD structure is a partial layer of a shallow residual network, and the back-end network of the SSD structure is a plurality of convolutional layers.
Specifically, a convolutional neural network is selected for deep-learning model training. Based on the existing shallow residual network structure ResNet-18 and the existing target detection algorithm SSD, the first layer through the 'res5b' layer of ResNet-18 are selected as the front-end network of the SSD, and three convolutional layers are then added as the back-end network of the SSD. The positions of targets in the image are predicted from the 'res3b', 'res4b', and 'res5b' layers together with the back-end network, forming the convolutional neural network structure used in this embodiment. All pictures and the corresponding annotation files are then input into the deep-learning training program, with a picture input size of 224 × 224; the model file generated after training is used for actual detection, and the actual image detection algorithm is based on a multithreaded, parallel scheme.
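As a sanity check on the multi-scale prediction layers, one can compute the feature-map sizes a 224 × 224 input would produce, assuming the standard ResNet-18 cumulative strides of 8, 16, and 32 for the stages here called 'res3b', 'res4b', and 'res5b' (this stride mapping is an assumption, not stated in the patent):

```python
# Back-of-envelope feature-map sizes for a 224x224 input under standard
# ResNet-18 downsampling: stride 8 by stage 3, 16 by stage 4, 32 by stage 5.
def feature_map_side(input_side, total_stride):
    return input_side // total_stride

strides = {"res3b": 8, "res4b": 16, "res5b": 32}  # assumed cumulative strides
sizes = {name: feature_map_side(224, s) for name, s in strides.items()}
# sizes == {"res3b": 28, "res4b": 14, "res5b": 7}
```

Tapping three resolutions (28 × 28, 14 × 14, 7 × 7) is what lets the SSD head predict small, medium, and large obstacles from one forward pass.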
Optionally, the category of obstacle in the present invention includes at least one of the following: pedestrians, bicycles, motorcycles, tricycles, cars, van trucks, open trucks, and buses.
Specifically, road image samples are collected on a real vehicle and the pictures are manually annotated. The image samples comprise the VOC public picture data set and pictures collected by a dash camera, numbering 5,000 to 6,000 in total; the annotated classes comprise the eight obstacle classes above: pedestrians, bicycles, motorcycles, tricycles, cars, van trucks, open trucks, and buses.
Optionally, in the present invention, the plurality of trackers are single-target trackers based on correlation filtering, chosen because they are fast, technologically mature, and low-cost.
Optionally, initializing the plurality of trackers comprises: the plurality of trackers are initialized with position information of at least a portion of the obstacle in the first image and a current image in the sequence of images.
Referring to fig. 3, a schematic diagram of the target detection and target tracking algorithms executing in parallel according to the present disclosure is shown. Specifically, the method of the invention performs target detection in the GPU and starts K sub-threads in the CPU, the K sub-threads corresponding one-to-one to the first K dangerous targets; each sub-thread initializes a tracker with the current picture and the position information of its corresponding dangerous target, so that K trackers are initialized in total.
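The per-target tracking sub-threads can be sketched as follows. `TemplateTracker` is a hypothetical stand-in for the correlation-filter tracker; the point of the sketch is the structure: one thread per dangerous target, each initialized with the current frame and its target's bounding box, with a join barrier before the next step.

```python
# One tracking sub-thread per dangerous target, initialized from the current
# frame and that target's bbox, joined before tracking results are used.
import threading

class TemplateTracker:
    def __init__(self, frame, bbox):
        self.bbox = bbox      # a real tracker would store correlation features
    def update(self, frame):
        return self.bbox      # a real tracker would re-locate the target

def track_all(frame, dangerous_bboxes):
    results = [None] * len(dangerous_bboxes)
    def worker(i, bbox):
        tracker = TemplateTracker(frame, bbox)   # per-thread initialization
        results[i] = tracker.update(frame)
    threads = [threading.Thread(target=worker, args=(i, b))
               for i, b in enumerate(dangerous_bboxes)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()              # proceed only after every sub-thread has ended
    return results
```

The join mirrors the patent's rule that the next step runs only once all tracking sub-threads have finished.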
Optionally, the method of this embodiment further includes: displaying a sequence of images on a display; and displaying the tracking result and the distance of at least a part of the obstacles from the camera in the images of the image sequence.
Specifically, when all the child threads for target tracking have ended, the next step is performed.
The final tracking result is a set T:

T = {U_1, U_2, ..., U_K}

where K represents the number of tracked targets and U_k (k = 1, 2, ..., K) represents the attribute set of the k-th target, specifically expressed by the formula:

U_k = {t_k, x_k, y_k, w_k, h_k}  (k = 1, 2, ..., K)

where t_k is the target-type number and (x_k, y_k, w_k, h_k) give the position of the target in the image.

The set T is input into the visual ranging module, which outputs the distance of each target in T. Finally, the target tracking result and the distance information are displayed on the currently read frame image.
Referring to fig. 4, the method of the present embodiment is specifically described below in chronological order. Fig. 4 shows a flowchart of a specific method according to the present embodiment. Referring to fig. 4, the method includes:
step S402: the camera is installed on the real vehicle, the installation position of the camera is located on the upper portion of the center of the inner side of a front windshield of the vehicle, and the direction of an optical axis of the camera is parallel to the direction of a vehicle head and the horizontal plane. Measuring the height and the pitch angle of the camera, calibrating the camera based on the existing method, and acquiring internal parameters of the camera; and obtaining the accurate mapping relation between the actual longitudinal distance of the object in the world coordinate system and the pixel line number on the image through calibration, and further obtaining a formula for calculating the actual relative distance of the target in the image.
Step S404: the method comprises the steps of collecting road image samples in a real vehicle, manually marking the images, wherein the image samples comprise a VOC public image data set and images collected by a driving recorder, the number of the images is 5-6 thousands, the images are manually marked by ImageNet marking software, and the marking types comprise eight types in all, namely pedestrians, bicycles, motorcycles, tricycles, cars, van trucks, uncovered trucks and buses.
Step S406: selecting and designing a convolutional neural network, and training a deep learning model. Based on an existing shallow residual network structure ResNet-18 and an existing target detection algorithm SSD, selecting a ResNet-18 first layer to a 'res 5 b' layer as a front-end network of the SSD, then adding three convolutional layers as a rear-end network of the SSD, and simultaneously selecting the three convolutional layers of the 'res 3 b', 'res 4 b', 'res 5 b' and the rear-end network to predict the position of a target in an image, then inputting all pictures and corresponding annotation files into a deep learning training program, wherein the picture input size is 224 x 224, and a model file generated after training is used for actual detection.
The actual image detection algorithm is based on a multithreading and parallel scheme, and the specific flow is as follows:
Step S408: The main program runs on the CPU (i.e., the first processor 102a). It starts a target-detection sub-thread that calls the deep-learning-based target-detection module running on the GPU (i.e., the second processor 102b). The main program runs an infinite loop; in each iteration it reads one frame of picture from the camera and calls the target-tracking module once, so each frame is available to both the target-detection module and the target-tracking module at the same time.
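The main-thread/sub-thread split of step S408 can be sketched with standard Python threading; the `detect` and `track` callables and the frame source are stand-ins for the patent's modules, and the variable names are illustrative:

```python
import threading, queue

frame_q = queue.Queue(maxsize=1)   # latest frame handed to the detector
result_lock = threading.Lock()
latest = {"L": 0, "F": []}         # detection-done flag L and danger set F

def detection_thread(detect):
    """Runs the (GPU) detector in a loop; detect(frame) -> list of targets."""
    while True:
        frame = frame_q.get()
        if frame is None:          # sentinel: shut down
            return
        targets = detect(frame)
        with result_lock:
            latest["F"] = targets
            latest["L"] = 1        # signal a fresh detection result

def main_loop(frames, detect, track):
    t = threading.Thread(target=detection_thread, args=(detect,), daemon=True)
    t.start()
    outputs = []
    for frame in frames:
        try:
            frame_q.put_nowait(frame)   # offer this frame to the detector
        except queue.Full:
            pass                        # detector still busy: skip the offer
        with result_lock:
            fresh, targets = latest["L"], list(latest["F"])
            latest["L"] = 0
        # Track on every frame; reinitialise only when detection is fresh.
        outputs.append(track(frame, targets if fresh else None))
    frame_q.put(None)
    t.join()
    return outputs
```

The size-1 queue means the main loop never blocks on a slow detector, matching the patent's point that tracking, not detection, bounds the per-frame latency.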
Step S410: The target-detection sub-thread cyclically calls the target-detection module on the GPU to detect targets in the picture. The module comprises two sub-modules, obstacle detection and visual ranging; the obstacle-detection sub-module uses a regression-based deep-learning detection method, and its final detection result is a set M,
M = {O1, O2, O3, …, ON}
where N is the number of detected targets and Ok (k = 1, 2, 3, …, N) is the attribute set of the k-th detected target, specifically expressed as:
Ok={tk,xk,yk,wk,hk,sk,dk}(k=1,2,3…,N)
where tk is the serial number of the target class, the eight classes pedestrian, bicycle, motorcycle, tricycle, car, van truck, uncovered truck and bus corresponding to serial numbers 1-8; (xk, yk, wk, hk) gives the position of the target in the image, namely the abscissa and ordinate of the upper-left corner of its rectangular region and that region's pixel width and pixel height; sk is the probability that the target belongs to its class; and dk is the actual relative distance between the target and the camera.
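The attribute set Ok described above can be captured as a small record type; the field names below are illustrative, not from the patent:

```python
from dataclasses import dataclass

@dataclass
class Target:
    """One detection O_k = {t, x, y, w, h, s, d} as described above."""
    t: int      # class serial number, 1-8 (pedestrian ... bus)
    x: int      # abscissa of the box's upper-left corner (px)
    y: int      # ordinate of the box's upper-left corner (px)
    w: int      # box pixel width
    h: int      # box pixel height
    s: float    # class-membership probability in [0, 1]
    d: float    # estimated relative distance to the camera (m)
```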
Set a probability threshold P and a danger distance D, and search the set M for elements Ok satisfying sk > P and dk < D. These qualifying elements form a new set F, i.e.
F={Ok|Ok∈M,sk>P,dk<D}
The set F is sorted by dk in ascending order, and this set serves as the dangerous-target set for subsequent target tracking.
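Building the dangerous-target set F and sorting it by distance follows directly from the definitions above; a minimal sketch (dictionary keys are illustrative):

```python
def danger_set(M, P=0.5, D=30.0):
    """Filter detections M (dicts with confidence 's' and distance 'd')
    to those with confidence above P and distance inside D, nearest first."""
    F = [o for o in M if o["s"] > P and o["d"] < D]
    F.sort(key=lambda o: o["d"])   # ascending d_k: most dangerous first
    return F
```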
A detection-finished flag L is set with an initial value of 0; after a frame of picture has been detected, if a new target was detected, the flag L is set to 1.
Step S412: The main program cyclically calls the target-tracking module and reads the flag L.
If L = 1, there is a new target-detection result: immediately reset L to 0 and read the dangerous-target set F. To save computing resources, the algorithm tracks only the most dangerous targets. Let the number of dangerous targets to be tracked be I, and suppose F contains J elements; then take
K=min(I,J)
Because the tracker used is a fast single-target tracker based on correlation filtering, K sub-threads are started on the CPU, in one-to-one correspondence with the first K dangerous targets. Each sub-thread initializes one tracker with the current picture and the position information of its dangerous target, so K trackers are initialized in total; each tracker is then used in its sub-thread to track the corresponding target in the current picture.
If L = 0, there is no new target-detection result; in that case the existing trackers continue tracking their dangerous targets on the newly read picture and update their state values.
When all of the child threads for target tracking are finished, the next step is performed.
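The per-target tracking sub-threads above can be sketched as follows; `Tracker` here is any object exposing an `update(frame)` method, standing in for the correlation-filtering tracker, and the K = min(I, J) trackers are assumed already initialized:

```python
import threading

def update_trackers(trackers, frame):
    """Run each single-target tracker on the frame in its own thread and
    wait for all of them, mirroring the per-target sub-threads above."""
    results = [None] * len(trackers)

    def work(i, trk):
        # Each sub-thread updates exactly one tracker on the shared frame.
        results[i] = trk.update(frame)

    threads = [threading.Thread(target=work, args=(i, t))
               for i, t in enumerate(trackers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()        # "when all of the child threads ... are finished"
    return results
```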
The result of the final tracking is a set T,
T = {U1, U2, …, UK}
where K is the number of tracked targets and Uk (k = 1, 2, …, K) is the attribute set of the k-th tracked target, specifically expressed as:
Uk={tk,xk,yk,wk,hk}(k=1,2…,K)
where tk is the target class number and (xk, yk, wk, hk) gives the position of the target in the image.
The set T is input to the visual-ranging module, which outputs the distance of each target in T. Finally, the tracking results and distance information are displayed on the currently read frame.
Steps S410 and S412 above run in parallel: step S410 continuously executes the obstacle-detection module on the GPU in a sub-thread, while step S412 continuously executes the obstacle tracking and ranging module in the main CPU thread. The obstacle-tracking result is taken as the final recognition result for each picture; the accurate obstacle-detection result is used only to initialize the trackers, which keeps obstacle tracking accurate across consecutive pictures. Regarding real-time performance, deep-learning-based target detection consumes considerable computing resources and therefore runs relatively slowly on the GPU of an embedded platform, whereas the correlation-filtering single-target tracking algorithm is fast and accurate, so multiple trackers are initialized in multiple CPU threads to track multiple targets.
Thus, by operating the two processors 102a and 102b in parallel in the manner described above, the technical problems of high time complexity, window redundancy and poor robustness in existing monocular-camera obstacle-detection methods, as well as the poor real-time performance of deep-learning-based detection methods, are solved, and obstacle detection is completed with high real-time performance and high accuracy.
Further, referring to fig. 1, according to a second aspect of the present embodiment, there is provided a storage medium including a stored program, wherein an apparatus on which the storage medium is located is controlled to execute the above-described method when the program is executed.
Further, referring to fig. 5, according to a third aspect of the present embodiment, there is provided an apparatus 500 for detecting an obstacle. The apparatus 500 comprises: an image acquisition device; a first processor; and a second processor.
The first processor runs a first program which, when run, performs the following processing steps for a sequence of images output from the image acquisition device: receiving the sequence of images from the image acquisition device; sending a first image in the image sequence to the second processor and calling a second program run by the second processor to detect a plurality of obstacles in the first image; receiving the detection result from the second processor; initializing, according to the detection result, a plurality of trackers respectively corresponding to at least a portion of the plurality of obstacles; and tracking the at least a portion of the obstacles in the sequence of images with the plurality of trackers.
The second processor runs the second program, wherein the second program runs to execute the following processing steps: receiving the first image from the first processor; detecting the plurality of obstacles in the first image; and sending the detection result to the first processor.
The apparatus 500 shown in fig. 5 corresponds to the in-vehicle terminal 10 shown in fig. 1: the first processor in fig. 5 corresponds to the first processor 102a in fig. 1, the second processor in fig. 5 corresponds to the second processor 102b shown in fig. 1, and the storage medium in fig. 5 corresponds to the memory 104 in fig. 1.
Therefore, with the device 500 provided in this embodiment, by using two parallel processors, the technical problems of high time complexity, window redundancy and poor robustness of the existing obstacle detection method based on a monocular camera and the technical problem of poor real-time performance of the detection method based on deep learning are solved. Therefore, the obstacle detection can be completed with high real-time performance and high accuracy.
Optionally, the operation of detecting the plurality of obstacles comprises: detecting the first image by using a pre-trained convolutional neural network to obtain a plurality of targets; and selecting, as the plurality of obstacles, a target that satisfies the following condition from among the plurality of targets: a probability of belonging to one category of obstacles is greater than a first predetermined value; and the distance to the image acquisition device is less than a second predetermined value.
Optionally, the convolutional neural network is a convolutional neural network based on an SSD algorithm structure.
Optionally, the front-end network of the SSD algorithm structure is a partial layer of the residual error network, and the back-end network of the SSD algorithm structure is a plurality of convolutional layers.
Optionally, the categories of obstacles include at least one of the following categories: pedestrians, bicycles, motorcycles, tricycles, cars, vans, gondola cars and buses.
Optionally, the plurality of trackers are single target trackers based on correlation filtering.
Optionally, initializing the plurality of trackers comprises: initializing the plurality of trackers using position information of the at least a portion of the obstacle in the first image and a current image in the sequence of images.
Optionally, the first program further executes the following processing steps when running: displaying the sequence of images on a display; and displaying the result of the tracking and the distance of the at least a part of the obstacles from the image acquisition device in the images of the image sequence.
In summary, the real-time forward-obstacle recognition method and system provided by the invention are based on a vehicle-mounted embedded terminal and a monocular camera, and realize real-time detection, tracking and ranging of various forward obstacles on an ordinary vehicle with high real-time performance and high robustness. The algorithm is multithreaded, runs the GPU and the CPU in parallel, and follows a tracking-first, detection-assisted design to achieve a target-recognition algorithm with high real-time performance and high robustness.
Therefore, the invention has the beneficial effects that:
(1) The invention provides a real-time vehicle forward-obstacle recognition method based on parallel operation. It is based on monocular vision, uses a deep-learning method to detect obstacles, is trained in advance on a large data set, and shows high precision and high robustness even for partially visible or severely occluded targets.
(2) To overcome the insufficient real-time performance of a deep-learning algorithm running on an embedded terminal, the method combines multithreading with GPU/CPU parallelism under a tracking-first, detection-assisted design: the target-detection algorithm runs on the GPU while a fast target-tracking algorithm runs on the CPU. The recognition result displayed on each frame is the target-tracking result, and the accurate target-detection result is used only to initialize the trackers, so the processing time of each frame depends only on the tracking time, which effectively improves the processing speed.
(3) The invention provides a real-time vehicle forward-obstacle recognition method based on parallel operation that recognizes eight target classes, namely people, bicycles, motorcycles, tricycles, cars, van trucks, uncovered trucks and buses, including targets that are only partially visible or severely occluded, with per-frame processing time depending only on the tracking time. The method guarantees high real-time performance, high precision and high robustness of target recognition, and can be used to realize the forward-obstacle early-warning function, adaptive cruise function and the like of a driver-assistance system.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required by the invention.
Through the above description of the embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better implementation mode in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed technology can be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one type of division of logical functions, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic or optical disk, and other various media capable of storing program codes.
The foregoing is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, various modifications and decorations can be made without departing from the principle of the present invention, and these modifications and decorations should also be regarded as the protection scope of the present invention.

Claims (8)

1. A method of detecting an obstacle, comprising:
receiving, by a first processor, a sequence of images from an image acquisition device;
sending, by the first processor, a first image in the sequence of images to a second processor, and invoking a child thread run by the second processor to detect a plurality of obstacles in the first image;
receiving, by the first processor, the detection result from the second processor;
initializing, by the first processor, a plurality of trackers for tracking at least a portion of the plurality of obstacles in the sequence of images according to the detection result;
and
tracking, by the first processor, the at least a portion of the obstacles in the sequence of images with the plurality of trackers;
the operation of detecting the plurality of obstacles comprises:
detecting the first image by using a pre-trained convolutional neural network to obtain a plurality of targets; and
selecting, as the plurality of obstacles, a target that satisfies the following condition from the plurality of targets:
a probability of belonging to one category of obstacles is greater than a first predetermined value; and
the distance to the image acquisition device is less than a second predetermined value.
2. The method of claim 1, wherein the convolutional neural network is a convolutional neural network based on an SSD algorithm structure.
3. The method of claim 2, wherein the front-end network of the SSD algorithmic structure is a partial layer of a shallow residual network, and wherein the back-end network of the SSD algorithmic structure is a plurality of convolutional layers.
4. The method of claim 3, wherein the category of obstacles comprises at least one of the following categories: pedestrians, bicycles, motorcycles, tricycles, cars, vans, gondola cars and buses.
5. The method of claim 4, wherein the plurality of trackers are single-target trackers based on correlation filtering.
6. The method of claim 5, wherein initializing operation of the plurality of trackers comprises:
initializing the plurality of trackers with position information of the at least a portion of the obstacle in the first image, and then tracking the obstacle in the sequence of images using the plurality of trackers.
7. The method of claim 6, further comprising:
displaying the sequence of images on a display; and
displaying the result of the tracking and the distance of the at least a portion of the obstacle from the image acquisition device in the images of the sequence of images.
8. An apparatus for detecting an obstacle, comprising: an image acquisition device; a first processor; and a second processor,
wherein the first processor runs a first program, wherein the first program runs to perform the following processing steps for a sequence of images output from the image acquisition device:
receiving a sequence of images from an image acquisition device;
sending a first image in the image sequence to a second processor, and calling a second program run by the second processor to detect a plurality of obstacles in the first image;
receiving the detection result from the second processor;
initializing a plurality of trackers respectively corresponding to at least a part of obstacles in the plurality of obstacles according to the detection result; and
tracking the at least a portion of the obstacle in the sequence of images with the plurality of trackers, and
the second processor runs the second program, wherein the second program runs to execute the following processing steps:
receiving the first image from the first processor;
detecting the plurality of obstacles in the first image; and
sending the detection result to the first processor;
the operation of detecting the plurality of obstacles comprises:
detecting the first image by using a pre-trained convolutional neural network to obtain a plurality of targets; and
selecting, as the plurality of obstacles, a target that satisfies the following condition from the plurality of targets:
a probability of belonging to one category of obstacles is greater than a first predetermined value; and
the distance to the image acquisition device is less than a second predetermined value.
CN201810218085.9A 2018-03-16 2018-03-16 Method and equipment for detecting obstacle Active CN110154896B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810218085.9A CN110154896B (en) 2018-03-16 2018-03-16 Method and equipment for detecting obstacle

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810218085.9A CN110154896B (en) 2018-03-16 2018-03-16 Method and equipment for detecting obstacle

Publications (2)

Publication Number Publication Date
CN110154896A CN110154896A (en) 2019-08-23
CN110154896B true CN110154896B (en) 2020-04-07

Family

ID=67636222

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810218085.9A Active CN110154896B (en) 2018-03-16 2018-03-16 Method and equipment for detecting obstacle

Country Status (1)

Country Link
CN (1) CN110154896B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111447561B (en) * 2020-03-16 2023-04-18 阿波罗智联(北京)科技有限公司 Image processing system for vehicle
CN112114969A (en) * 2020-09-23 2020-12-22 北京百度网讯科技有限公司 Data processing method and device, electronic equipment and storage medium
CN112733820B (en) * 2021-03-31 2021-07-27 禾多科技(北京)有限公司 Obstacle information generation method and device, electronic equipment and computer readable medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105513087A (en) * 2016-03-03 2016-04-20 北京志光伯元科技有限公司 Laser aiming and tracking equipment and method for controlling same
US9633283B1 (en) * 2015-12-28 2017-04-25 Automotive Research & Test Center Adaptive device and adaptive method for classifying objects with parallel architecture
CN106599832A (en) * 2016-12-09 2017-04-26 重庆邮电大学 Method for detecting and recognizing various types of obstacles based on convolution neural network


Also Published As

Publication number Publication date
CN110154896A (en) 2019-08-23

Similar Documents

Publication Publication Date Title
US9905015B2 (en) Systems and methods for non-obstacle area detection
Wu et al. Lane-mark extraction for automobiles under complex conditions
WO2020151172A1 (en) Moving object detection method and apparatus, computer device, and storage medium
KR20180042254A (en) Systems and methods for object tracking
CN110443225A (en) A kind of actual situation Lane detection method and device thereof based on statistics of pixel eigenvalue
CN110154896B (en) Method and equipment for detecting obstacle
CN110163039B (en) Method, apparatus, storage medium, and processor for determining vehicle driving state
CN105426863A (en) Method and device for detecting lane line
WO2022161139A1 (en) Driving direction test method and apparatus, computer device, and storage medium
Zhao et al. Real-time lane departure and front collision warning system on an FPGA
KR20220035335A (en) Vehicle identification method and device, electronic device and storage medium
CN110909656B (en) Pedestrian detection method and system integrating radar and camera
CN113255444A (en) Training method of image recognition model, image recognition method and device
CN114841910A (en) Vehicle-mounted lens shielding identification method and device
CN112669615B (en) Parking space detection method and system based on camera
CN116229406B (en) Lane line detection method, system, electronic equipment and storage medium
CN111062311B (en) Pedestrian gesture recognition and interaction method based on depth-level separable convolution network
CN116363100A (en) Image quality evaluation method, device, equipment and storage medium
CN107452230B (en) Obstacle detection method and device, terminal equipment and storage medium
CN112232317B (en) Target detection method and device, equipment and medium for target orientation recognition
Arróspide et al. Region-dependent vehicle classification using PCA features
CN115129886A (en) Driving scene recognition method and device and vehicle
CN110677491B (en) Method for estimating position of vehicle
US20240203107A1 (en) Obstacle identification method, vehicle-mounted device and storage medium
CN115063594B (en) Feature extraction method and device based on automatic driving

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20211126

Address after: 100176 901, 9th floor, building 2, yard 10, KEGU 1st Street, Beijing Economic and Technological Development Zone, Daxing District, Beijing

Patentee after: BEIJING TAGE IDRIVER TECHNOLOGY CO.,LTD.

Address before: 100191 No. 37, Haidian District, Beijing, Xueyuan Road

Patentee before: BEIHANG University
