CN113011268A - Intelligent vehicle navigation method and device, electronic equipment and storage medium - Google Patents

Intelligent vehicle navigation method and device, electronic equipment and storage medium

Info

Publication number
CN113011268A
CN113011268A (application CN202110201502.0A)
Authority
CN
China
Prior art keywords
traffic sign
road traffic
image
area
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110201502.0A
Other languages
Chinese (zh)
Inventor
任君兰
张叔奇
郑宇恒
孙尚民
宗春光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tongfang Vision Technology Jiangsu Co ltd
Nuctech Co Ltd
Original Assignee
Tongfang Vision Technology Jiangsu Co ltd
Nuctech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tongfang Vision Technology Jiangsu Co ltd, Nuctech Co Ltd filed Critical Tongfang Vision Technology Jiangsu Co ltd
Priority to CN202110201502.0A
Publication of CN113011268A
Legal status: Pending

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06V — IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/582 — Recognition of traffic objects, e.g. traffic signs, traffic lights or roads: traffic signs
    • G06F 18/2411 — Classification techniques based on the proximity to a decision surface, e.g. support vector machines
    • G06V 10/25 — Determination of region of interest [ROI] or a volume of interest [VOI]
    • G06V 10/30 — Noise filtering
    • G06V 10/50 — Extraction of image or video features by using histograms, e.g. histogram of oriented gradients [HoG]
    • G06V 10/56 — Extraction of image or video features relating to colour

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Traffic Control Systems (AREA)
  • Image Analysis (AREA)

Abstract

The disclosure provides an intelligent vehicle navigation method and device, an electronic device, and a storage medium. The method comprises the following steps: acquiring a video stream captured by a target vehicle while it travels under automatic navigation; obtaining video frames from the video stream; determining, by color difference, regions in a video frame that match the color of a road traffic sign; screening the determined regions by shape features to obtain a road traffic sign region image; obtaining a road traffic sign feature vector of the road traffic sign region image; processing the feature vector with a trained road traffic sign classification model to predict the traffic sign category to which the road traffic sign region image belongs; and sending a navigation instruction matching the predicted traffic sign category to control the travel of the target vehicle. The method can accurately predict the traffic sign category of a road traffic sign region.

Description

Intelligent vehicle navigation method and device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a method and an apparatus for vehicle intelligent navigation, an electronic device, and a storage medium.
Background
With the rapid development of intelligent automobiles and unmanned-driving technologies, the recognition of road traffic signs has become an important component of safe driving. Road traffic signs warn, prohibit, restrict, and guide road users. They are generally provided as markings on signboards along both sides of the road, or drawn directly on the road surface. In the related art, the accuracy of identifying road traffic sign regions and predicting traffic sign categories is poor.
It is to be noted that the information disclosed in the above background section is only for enhancement of understanding of the background of the present disclosure, and thus may include information that does not constitute prior art known to those of ordinary skill in the art.
Disclosure of Invention
The invention aims to provide a vehicle intelligent navigation method, a vehicle intelligent navigation device, electronic equipment and a storage medium.
Additional features and advantages of the disclosure will be set forth in the detailed description which follows, or in part will be obvious from the description, or may be learned by practice of the disclosure.
The embodiment of the disclosure provides an intelligent navigation method for a vehicle, which comprises the following steps: acquiring a video stream acquired by a target vehicle in an automatic navigation advancing process; obtaining a video frame from a video stream; determining an area which accords with the color of the road traffic sign in the video frame through the color difference; utilizing shape characteristics to screen in the determined area to obtain an image of the road traffic sign area; obtaining a road traffic sign feature vector of the road traffic sign region image; processing the road traffic sign feature vector by using the trained road traffic sign classification model, and predicting the traffic sign category to which the road traffic sign region image belongs; and sending a navigation instruction matched with the predicted road surface traffic sign area image according to the traffic sign category to which the road surface traffic sign area image belongs so as to control the running of the target vehicle.
In some exemplary embodiments of the present disclosure, determining an area in a video frame that corresponds to a color of a road traffic sign through a color difference includes: scanning each pixel in a video frame using a template of a predetermined size; replacing the value of the central pixel point corresponding to each template with the weighted average pixel value of the pixels in the neighborhood determined by each template to obtain a denoised video frame; converting the color format of the denoised video frame from a BGR image format to an HSV image format; and determining a pixel point set within a preset color threshold range in the denoised HSV image format video frame as an area which accords with the color of the road traffic sign.
In some exemplary embodiments of the present disclosure, obtaining an image of a road traffic sign region using shape feature screening in a determined region includes: acquiring a minimum circumscribed rectangle of the region; and if the length and the width of the minimum circumscribed rectangle are both larger than the preset length and the length-width ratio of the minimum circumscribed rectangle is within the preset proportion range, determining the area as the road traffic sign area image.
In some exemplary embodiments of the present disclosure, obtaining a road traffic sign feature vector of a road traffic sign region image comprises: determining the gradient amplitude and the gradient direction corresponding to each pixel point in the road traffic sign region image; obtaining a descriptor of a cell unit according to the gradient amplitude and the gradient direction corresponding to each pixel point, wherein the cell unit comprises a first number of pixel points; concatenating the descriptors of the cell units to obtain a descriptor of a square unit, the square unit comprising a second number of cell units; and connecting the descriptors of the square units in series to obtain the road traffic sign feature vector of the road traffic sign area image.
In some exemplary embodiments of the present disclosure, determining a gradient amplitude and a gradient direction corresponding to each pixel point in an image of a road traffic sign region includes: determining a horizontal direction gradient amplitude and a vertical direction gradient amplitude corresponding to each pixel point in the road traffic sign region image; and determining the gradient amplitude and the gradient direction corresponding to each pixel point according to the horizontal direction gradient amplitude and the vertical direction gradient amplitude corresponding to each pixel point.
In some exemplary embodiments of the present disclosure, the method further includes: acquiring a road traffic sign training image and setting a class label; converting a road traffic sign training image into a gray image, and zooming the gray image into a preset size; obtaining a traffic sign training feature vector of a road traffic sign training image; and training the road traffic sign classification model by taking the class label of the road traffic sign training image and the traffic sign training feature vector thereof as input.
In some exemplary embodiments of the present disclosure, acquiring a road traffic sign training image comprises: acquiring training video streams shot at a plurality of positions and under a plurality of environmental conditions; obtaining training video frames from a training video stream; and cutting the training video frame to obtain a road traffic sign training image.
The embodiment of the present disclosure provides a vehicle intelligent navigation device, including: the video stream acquisition module is used for acquiring a video stream acquired by a target vehicle in an automatic navigation advancing process; a video frame obtaining module, configured to obtain a video frame from a video stream; the area determining module is used for determining an area which accords with the color of the road traffic sign in the video frame through color difference; the road traffic sign region image obtaining module is used for obtaining road traffic sign region images in the determined regions by utilizing shape feature screening; the characteristic vector obtaining module is used for obtaining road traffic sign characteristic vectors of the road traffic sign area images; the mark type prediction module is used for processing the road traffic mark characteristic vector by utilizing the trained road traffic mark classification model and predicting the traffic mark type to which the road traffic mark region image belongs; and the instruction sending module is used for sending a navigation instruction matched with the predicted road traffic sign region image according to the traffic sign type to which the road traffic sign region image belongs so as to control the running of the target vehicle.
An embodiment of the present disclosure provides an electronic device, including: at least one processor; a storage device for storing at least one program which, when executed by the at least one processor, causes the at least one processor to implement any one of the vehicle intelligent navigation methods as described above.
The embodiment of the disclosure provides a computer-readable storage medium on which a computer program is stored, the computer program, when executed by a processor, implementing any one of the above-mentioned vehicle intelligent navigation methods.
Some embodiments of the present disclosure provide an intelligent vehicle navigation method that obtains video frames from a video stream captured by a target vehicle while it travels under automatic navigation. By judging whether the value of each pixel point in a video frame falls within a color threshold range, the region matching the color of a road traffic sign can be determined accurately, and the road traffic sign region image can then be screened out of that region accurately using shape features. Obtaining the road traffic sign feature vector of the region image reduces the computation required by the subsequent model and shortens the computation time. The trained road traffic sign classification model can then accurately predict the traffic sign category to which the region image belongs, and a navigation instruction matching the predicted category is sent to control the travel of the target vehicle accurately. In addition, the method can effectively realize automatic navigation of the target vehicle while traveling, greatly reduce labor cost, increase transportation and economic benefits, and reduce potential safety hazards.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure. It is to be understood that the drawings in the following description are merely exemplary of the disclosure, and that other drawings may be derived from those drawings by one of ordinary skill in the art without the exercise of inventive faculty.
Fig. 1 shows a schematic diagram of an exemplary system architecture to which the vehicle intelligent navigation method of the disclosed embodiments may be applied.
FIG. 2 is a flow chart illustrating a method for intelligent navigation of a vehicle, according to an exemplary embodiment.
FIG. 3 is a flow chart illustrating another method for intelligent navigation of a vehicle, according to an exemplary embodiment.
FIG. 4 is a flow chart illustrating another method for intelligent navigation of a vehicle, according to an exemplary embodiment.
FIG. 5 is a flow chart illustrating another method for intelligent navigation of a vehicle, according to an exemplary embodiment.
Fig. 6 is a block diagram illustrating a vehicle smart navigation device according to an exemplary embodiment.
Fig. 7 is a schematic diagram of an electronic device according to an example embodiment.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Furthermore, the drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus their repetitive description will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
Fig. 1 shows a schematic diagram of an exemplary system architecture to which the vehicle intelligent navigation method of the disclosed embodiments may be applied.
As shown in fig. 1, the system architecture may include a server 101, a network 102, and a terminal device 103. Network 102 is the medium used to provide communication links between terminal devices 103 and server 101. Network 102 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The server 101 may be a server that provides various services, such as a background management server that provides support for devices operated by the user using the terminal apparatus 103. The background management server can analyze and process the received data such as the request and feed back the processing result to the terminal equipment.
The terminal device 103 may be, for example, a mobile terminal such as a mobile phone or a tablet computer. The terminal device 103 may be placed in a target vehicle and may control the target vehicle to perform automatic navigation.
The terminal device 103 may collect a video stream during the automatic navigation traveling process of the target vehicle and transmit the collected video stream to the server 101; the server 101 may receive a video stream collected by a target vehicle during an automatic navigation travel; the server 101 may obtain video frames from the video stream; the server 101 can determine the area in the video frame according with the color of the road traffic sign through the color difference; the server 101 can obtain road traffic sign area images in the determined areas by utilizing shape feature screening; the server 101 may obtain road traffic sign feature vectors of road traffic sign region images; the server 101 may use the trained road traffic sign classification model to process the road traffic sign feature vector, and predict the traffic sign category to which the road traffic sign region image belongs; the server 101 may send a navigation instruction matched with the predicted road traffic sign region image to the terminal device 103 according to the traffic sign category to which the road traffic sign region image belongs; the terminal device 103 may receive the navigation instruction and control the travel of the target vehicle according to the navigation instruction.
It should be understood that the numbers of terminal devices, networks, and servers in fig. 1 are only illustrative; the server 101 may be a physical server, a server cluster formed by a plurality of servers, or a cloud server, and there may be any number of terminal devices, networks, and servers according to actual needs.
Hereinafter, the steps of the vehicle intelligent navigation method in the exemplary embodiment of the present disclosure will be described in more detail with reference to the drawings and the embodiment.
FIG. 2 is a flow chart illustrating a method for intelligent navigation of a vehicle, according to an exemplary embodiment. The method provided by the embodiment of the present disclosure may be executed by a server as shown in fig. 1, but the present disclosure is not limited thereto.
As shown in fig. 2, a vehicle intelligent navigation method provided by the embodiment of the present disclosure may include the following steps.
In step S202, a video stream captured by the target vehicle during automatic navigation travel is acquired.
In the disclosed embodiment, the target vehicle may be a car, a truck, a bus, a motorcycle, an electric vehicle, or the like.
In the embodiment of the disclosure, during automatic navigation travel the target vehicle may acquire the video stream itself, or the video stream may be acquired through a terminal device or camera device mounted on the target vehicle. The video stream may be a video containing road surface information captured during travel, and the road surface information may include road traffic signs.
In step S204, a video frame is obtained from the video stream.
In the embodiment of the present disclosure, each frame image in the video stream may be acquired as a video frame and stored in a preset format. The preset format may be JPG (Joint Photographic Experts Group), the video frame may be stored at an image size of 2048 × 1536, and the horizontal and vertical resolutions may be set to 96 DPI (Dots Per Inch).
In step S206, the area in the video frame that matches the color of the road traffic sign is determined by the color difference.
In the embodiment of the present disclosure, a plurality of video frames may be obtained from a video stream, and one of the video frames is taken as an example for explanation.
In the embodiment of the disclosure, the region in the video frame matching the color of the road traffic sign can be determined by judging whether the value of each pixel point in the video frame is within the color threshold range. The region matching the color of the road traffic sign is a region where a road traffic sign may exist; it may be, for example, a blue region, and a person skilled in the art may also set regions of other colors as needed, which is not limited by this disclosure.
For example, a color threshold range may be preset, and a set of pixel points in the video frame within the color threshold range is used as an area conforming to the color of the road traffic sign.
In step S208, a road surface traffic sign region image is obtained in the determined region using shape feature filtering.
Wherein the road traffic sign may include at least one of a reverse (Down) sign, a no-entry (Forbid) sign, a left-turn (Left) sign, a stop (P) sign, a right-turn (Right) sign, a curve (S) sign, a U-turn (Turn) sign, and a go-forward (Up) sign; the road traffic sign region image may be an image containing a road traffic sign.
In the embodiment of the disclosure, the road traffic sign region image can be screened out according to the shape characteristics of the region.
For example, the minimum bounding rectangle of the region may be obtained, whether the length, width, and aspect ratio of the minimum bounding rectangle of the region satisfy preset conditions is determined, and the region satisfying the preset conditions is used as the road traffic sign region image. The preset condition may be set as required, which is not limited by the present disclosure.
In an exemplary embodiment, a minimum bounding rectangle of a region is obtained; and if the length and the width of the minimum circumscribed rectangle are both larger than the preset length and the length-width ratio of the minimum circumscribed rectangle is within the preset proportion range, determining the area as the road traffic sign area image.
The preset length and the preset proportion range can be set according to actual needs, and the preset proportion can be a preset length-width proportion.
For example, the preset length may be 50 pixels, and the preset aspect ratio range may be [0.7, 1.2].
For example, it may be determined whether the length and width of the minimum bounding rectangle of the region are both greater than 50 pixels and whether its aspect ratio is within [0.7, 1.2]; if so, the region may be cropped out and determined to be the road traffic sign region image. If the region is inclined, it can be rotated to horizontal according to its inclination angle, and the rotated region cropped out as the road traffic sign region image.
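The screening rule above can be sketched as a small predicate. This is a minimal illustration, not the patent's implementation: the 50-pixel minimum side and the [0.7, 1.2] ratio range are the example values given in this embodiment, and the function name is invented.

```python
MIN_SIDE_PX = 50          # "preset length" example value from the text
RATIO_RANGE = (0.7, 1.2)  # "preset proportion range" example value

def passes_shape_screen(width, height):
    """Return True if a minimum bounding rectangle of the given size
    could contain a road traffic sign under the screening rule:
    both sides strictly greater than the preset length, and the
    width/height ratio inside the preset range."""
    if width <= MIN_SIDE_PX or height <= MIN_SIDE_PX:
        return False
    ratio = width / height
    return RATIO_RANGE[0] <= ratio <= RATIO_RANGE[1]
```

A region whose rectangle is 100 × 100 passes, while a 200 × 100 rectangle is rejected for its 2.0 aspect ratio.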
In step S210, a road traffic sign feature vector of the road traffic sign region image is obtained.
The road traffic sign feature vector may be a Histogram of Oriented Gradient (HOG) feature vector.
In the embodiment of the disclosure, the direction gradient histogram feature vector of the road traffic sign region image can be extracted as the road traffic sign feature vector of the road traffic sign region image.
For example, the gradient amplitude and the gradient direction corresponding to each pixel point in the image of the road traffic sign region can be calculated by using a direction gradient histogram algorithm, and the road traffic sign feature vector of the image of the road traffic sign region is obtained according to the gradient amplitude and the gradient direction corresponding to each pixel point.
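As a rough illustration of the per-pixel step above, the gradient magnitude and direction can be computed from central differences. This stdlib-only sketch assumes a grayscale image stored as a list of rows; it shows the arithmetic only, not the patent's implementation.

```python
import math

def gradient_at(img, x, y):
    """Central-difference gradient of a 2-D grayscale image at an
    interior pixel (x, y); returns (magnitude, direction_degrees).
    The direction is folded into [0, 180) as is usual for HOG."""
    gx = img[y][x + 1] - img[y][x - 1]   # horizontal gradient amplitude
    gy = img[y + 1][x] - img[y - 1][x]   # vertical gradient amplitude
    magnitude = math.hypot(gx, gy)
    direction = math.degrees(math.atan2(gy, gx)) % 180
    return magnitude, direction
```

The magnitudes are then accumulated into per-cell orientation histograms to form the descriptors described above.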
In step S212, the road traffic sign feature vector is processed by using the trained road traffic sign classification model to predict the traffic sign class to which the road traffic sign region image belongs.
Wherein the traffic sign category may include at least one of back, no go, left turn, stop, right turn, curve, u-turn, and forward; the road traffic sign classification model may be an SVM (Support Vector Machine) model.
In the disclosed embodiment, the traffic sign category may be represented by words or by decimal numbers. For example, reverse may be represented by 1, no-go by 2, left turn by 3, stop by 4, right turn by 5, curve by 6, U-turn by 7, and forward by 8.
In the embodiment of the disclosure, the road traffic sign feature vector can be input into a trained road traffic sign classification model, and the road traffic sign feature vector is processed by using the road traffic sign classification model, so that the traffic sign category to which the road traffic sign region image belongs can be predicted.
For example, the SVM model may be used to process the histogram-of-oriented-gradients feature vector of the road traffic sign region image, and the predicted traffic sign category to which the image belongs may be, for example, a left turn.
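The prediction step can be illustrated with a toy one-vs-rest linear decision function of the kind a trained SVM produces. The weights, biases, and labels below are invented for illustration; a real model would be trained on HOG vectors as described later.

```python
def predict_category(weights, biases, feature_vec):
    """Return the class label whose linear score w . x + b is largest,
    mimicking a one-vs-rest linear SVM decision at inference time."""
    best_label, best_score = None, float("-inf")
    for label in weights:
        score = sum(w * x for w, x in zip(weights[label], feature_vec))
        score += biases[label]
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

With hand-made weights favouring the first feature for "left_turn" and the second for "right_turn", a vector dominated by the first feature is classified as a left turn.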
In step S214, a navigation instruction matching the predicted road surface traffic sign region image is transmitted according to the traffic sign category to which the road surface traffic sign region image belongs, so as to control the traveling of the target vehicle.
The transmitted navigation instruction is matched with the predicted traffic sign type, for example, if the predicted traffic sign type is a left turn, the navigation instruction of the left turn can be transmitted to the target vehicle.
In the embodiment of the disclosure, a navigation instruction matched with the traffic sign type can be sent to a target vehicle or a terminal device on the target vehicle according to the predicted traffic sign type to which the road traffic sign area image belongs, and the target vehicle can operate according to the navigation instruction.
The intelligent vehicle navigation method provided by the embodiment of the disclosure obtains video frames from a video stream captured by a target vehicle while it travels under automatic navigation. By judging whether the value of each pixel point in a video frame falls within a color threshold range, it can accurately determine the region that matches the color of a road traffic sign, and can accurately screen the road traffic sign region image out of that region using shape features. Obtaining the road traffic sign feature vector of the region image reduces the computation required by the subsequent model and shortens the computation time. The trained road traffic sign classification model can then accurately predict the traffic sign category to which the region image belongs, and by sending a navigation instruction matching the predicted category, the travel of the target vehicle can be controlled accurately. In addition, the method can effectively realize automatic navigation of the target vehicle while traveling, greatly reduce labor cost, increase transportation and economic benefits, and reduce potential safety hazards.
FIG. 3 is a flow chart illustrating another method for intelligent navigation of a vehicle, according to an exemplary embodiment.
In the embodiment of the present disclosure, the vehicle intelligent navigation method shown in fig. 3 is a detailed description of the step of determining the area in the video frame according to the color of the road traffic sign through the color difference in the vehicle intelligent navigation method shown in fig. 2, that is, an embodiment of the step S206 is provided.
As shown in fig. 3, step S206 may include the following steps.
In step S302, each pixel in the video frame is scanned with a template of a predetermined size.
Wherein, the predetermined size can be set according to actual needs.
For example, the predetermined size may be 7 × 7.
For example, each pixel in a video frame may be scanned using a 7 × 7 template.
In step S304, the weighted average pixel value of the pixels in the neighborhood determined by each template is used to replace the value of the central pixel point corresponding to each template, so as to obtain a denoised video frame.
The neighborhood pixels of each template are the pixels in the template other than the central pixel. For example, if the size of the template is 7 × 7, the neighborhood of the template is the region of 7 × 7 − 1 = 48 pixels surrounding the center point.
In the embodiment of the present disclosure, a Gaussian denoising method may be used: the value of the central pixel point corresponding to each template is replaced with the weighted average pixel value of the pixel points in the neighborhood determined by that template, yielding the denoised video frame. This smooths the image and reduces the influence of noise on the subsequent color conversion and analysis.
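The weighted-average replacement can be sketched for a single interior pixel as follows. This is an assumption-laden illustration: a 3 × 3 Gaussian-like kernel is used for brevity where the text above uses a 7 × 7 template, and the kernel here includes a center weight (set it to 0 to match the text's neighborhood-only reading).

```python
def smooth_pixel(img, x, y, kernel):
    """Return the Gaussian-weighted average replacing the pixel at
    (x, y). `img` is a 2-D grayscale image as a list of rows; `kernel`
    is a square, odd-sized weight matrix, normalised here so the
    weights sum to 1."""
    k = len(kernel) // 2
    total_w = sum(sum(row) for row in kernel)
    acc = 0.0
    for dy in range(-k, k + 1):
        for dx in range(-k, k + 1):
            acc += kernel[dy + k][dx + k] * img[y + dy][x + dx]
    return acc / total_w
```

Applying this at every interior pixel with the same kernel yields the denoised frame; on a uniform image the pixel value is unchanged, as expected of a normalised smoothing filter.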
In step S306, the color format of the denoised video frame is converted from the BGR image format to the HSV image format.
In the embodiment of the present disclosure, the color format of the denoised video frame may be converted from a BGR (Blue Green Red) image format to an HSV (Hue Saturation Value) image format.
Among them, the HSV type image can reflect the color information of the image more intuitively.
In step S308, a set of pixel points in the video frame in the HSV image format after denoising within the preset color threshold range is determined as an area conforming to the color of the road traffic sign.
The preset color threshold range can be set according to actual needs.
For example, the highest value of the preset color threshold range may be [124, 255, 255], the lowest value of the preset color threshold range may be [60, 100, 60], and the region corresponding to the color of the road traffic sign may be a blue region.
For example, a pixel point set between [60, 100, 60] to [124, 255, 255] in a denoised video frame in the HSV image format may be used as a region conforming to the color of a road traffic sign, and a pixel point set not between [60, 100, 60] to [124, 255, 255] may be an interference region.
In the embodiment of the disclosure, in order to reduce the influence of changes in shooting angle on the recognition accuracy of the subsequent road traffic sign classification model, the areas matching the color of the road traffic sign may first be determined, and the areas matching the shape features screened out from them. Each screened area is then rotated to horizontal according to its inclination angle, and the rotated area is cropped out as the road traffic sign image.
FIG. 4 is a flow chart illustrating another method for intelligent navigation of a vehicle, according to an exemplary embodiment.
In the embodiment of the present disclosure, the vehicle intelligent navigation method shown in fig. 4 is a detailed description of the step of obtaining the road traffic sign feature vector of the road traffic sign region image in the vehicle intelligent navigation method shown in fig. 2, that is, an embodiment of the step S210 is provided.
As shown in fig. 4, step S210 may include the following steps.
In step S402, the gradient amplitude and the gradient direction corresponding to each pixel point in the road traffic sign region image are determined.
In the embodiment of the present disclosure, a Gradient amplitude and a Gradient direction corresponding to each pixel point in the road traffic sign region image may be calculated by using a Histogram of Oriented Gradients (HOG) algorithm.
The HOG algorithm operates on the cell units of the image, so the extracted features are little affected by geometric and optical variations. For example, under coarse spatial sampling, fine direction sampling and strong local optical normalization, small skew variations of the image have an almost negligible influence of the extracted features on the recognition result. In addition, because the HOG descriptor does not compute rotation or scale invariance, its calculation amount is small, which can greatly shorten the computing time in automatic vehicle navigation.
The parameters of the histogram of oriented gradients algorithm may be set as follows: number of unsigned direction divisions: orientations = 9; number of pixels per cell (cell unit): pixels_per_cell = (8, 8); number of cells in each block: cells_per_block = (2, 2); norm type adopted inside each block: block_norm; gamma correction: transform_sqrt = True.
In an exemplary embodiment, the horizontal direction gradient amplitude and the vertical direction gradient amplitude corresponding to each pixel point in the road traffic sign region image may be determined.
For example, the horizontal direction gradient amplitude and the vertical direction gradient amplitude corresponding to each pixel point can be determined according to the following formulas:
Gx(x, y) = H(x+1, y) − H(x−1, y)
Gy(x, y) = H(x, y+1) − H(x, y−1)
where Gx(x, y) represents the magnitude of the horizontal gradient at the pixel point (x, y), Gy(x, y) represents the magnitude of the vertical gradient at the pixel point (x, y), and H(x, y) represents the pixel value at the pixel point (x, y).
In an exemplary embodiment, the gradient amplitude and the gradient direction corresponding to each pixel point may be determined according to the horizontal direction gradient amplitude and the vertical direction gradient amplitude corresponding to each pixel point.
For example, the gradient magnitude and gradient direction corresponding to each pixel point can be determined according to the following formulas:
G(x, y) = √(Gx(x, y)² + Gy(x, y)²)
α(x, y) = arctan(Gy(x, y) / Gx(x, y))
wherein, G(x, y) represents the gradient magnitude of the pixel point (x, y), and α(x, y) represents the gradient direction of the pixel point (x, y).
In the embodiment of the disclosure, the gradient amplitude and the gradient direction corresponding to each pixel point in the road traffic sign area image are determined, so that the contour information can be captured, a darker area is lightened, and the influence of shadow and illumination change on the image is reduced.
In step S404, a descriptor of a cell unit is obtained according to the gradient magnitude and the gradient direction corresponding to each pixel point, where the cell unit includes a first number of pixel points.
Wherein the first number may be set as desired. For example, the first number may be 8 × 8, i.e., one cell unit may include 8 × 8 pixel points.
In the embodiment of the present disclosure, the unsigned gradient direction range of 0-180° may be divided into 9 direction bins. Each pixel point in the cell unit is projected, weighted by its gradient amplitude, into the bin corresponding to its gradient direction, yielding a gradient direction histogram of the cell unit, that is, a 9-dimensional feature vector, which may be used as the descriptor of the cell unit.
In step S406, the descriptors of the cell units are concatenated to obtain a descriptor of a square unit, the square unit including a second number of cell units.
Wherein the second number may be set as desired. For example, the second number may be 2 × 2, i.e., one square unit may comprise 2 × 2 cell units.
In the embodiment of the present disclosure, the descriptors of the cell units in the square unit can be connected in series to obtain the descriptors of the square unit.
In step S408, the descriptors of the square cells are concatenated to obtain the road traffic sign feature vector of the road traffic sign region image.
In the embodiment of the disclosure, descriptors of each block unit in the road traffic sign region image may be connected in series to obtain a road traffic sign feature vector of the road traffic sign region image, that is, a representation of the road traffic sign region image.
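Steps S402 to S408 together can be sketched as a minimal HOG implementation in pure NumPy. Hard bin assignment and a plain L2 block norm are simplifications of this sketch (library implementations such as skimage.feature.hog also interpolate between neighboring bins); the cell size (8 × 8), block size (2 × 2) and 9 bins follow the parameters above.

```python
import numpy as np

def hog_descriptor(gray, cell=8, block=2, bins=9):
    """Cells -> 9-bin histograms -> normalized blocks -> feature vector."""
    H = gray.astype(np.float64)
    # Step S402: per-pixel gradient magnitude and (unsigned) direction.
    Gx = np.zeros_like(H); Gy = np.zeros_like(H)
    Gx[:, 1:-1] = H[:, 2:] - H[:, :-2]
    Gy[1:-1, :] = H[2:, :] - H[:-2, :]
    mag = np.hypot(Gx, Gy)
    ang = np.degrees(np.arctan2(Gy, Gx)) % 180.0
    # Step S404: one magnitude-weighted histogram per cell unit.
    ny, nx = H.shape[0] // cell, H.shape[1] // cell
    bin_idx = np.minimum((ang / (180.0 / bins)).astype(int), bins - 1)
    cells = np.zeros((ny, nx, bins))
    for cy in range(ny):
        for cx in range(nx):
            sl = np.s_[cy*cell:(cy+1)*cell, cx*cell:(cx+1)*cell]
            cells[cy, cx] = np.bincount(bin_idx[sl].ravel(),
                                        weights=mag[sl].ravel(),
                                        minlength=bins)
    # Steps S406/S408: concatenate cell descriptors into block descriptors,
    # normalize each block, then concatenate all blocks.
    feats = []
    for by in range(ny - block + 1):
        for bx in range(nx - block + 1):
            v = cells[by:by+block, bx:bx+block].ravel()
            feats.append(v / (np.linalg.norm(v) + 1e-6))
    return np.concatenate(feats)
```

For a 64 × 64 image this yields 8 × 8 cells, 7 × 7 overlapping blocks of 2 × 2 × 9 = 36 dimensions each, i.e. a 1764-dimensional road traffic sign feature vector.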
FIG. 5 is a flow chart illustrating another method for intelligent navigation of a vehicle, according to an exemplary embodiment.
In the embodiment of the present disclosure, on the basis of the vehicle intelligent navigation method shown in fig. 2, the vehicle intelligent navigation method shown in fig. 5 may further include the following steps.
In step S502, a road traffic sign training image is acquired and a category label is set up.
In an exemplary embodiment, a training video stream taken at a plurality of locations and a plurality of environmental conditions is obtained; obtaining training video frames from a training video stream; and cutting the training video frame to obtain a road traffic sign training image.
In the embodiment of the disclosure, in the intelligent rail environment, a camera may be used to shoot and collect each road traffic sign from different positions and under different illumination conditions to obtain a training video stream; collection may start once the focal length and object distance have been adjusted so that the traffic sign is clearly visible. After the training video stream is acquired, each frame of image in it may be extracted and stored in JPG format (the image size may be 2048 × 1536, with horizontal and vertical resolutions of 96 DPI) as a training video frame. The training video frames may then be cropped to obtain the road traffic sign training image in each frame, the obtained training images may be classified and given category labels, and 2000 road traffic sign training images per category may be selected to construct the training set of the road traffic sign classification model.
Wherein the category labels may be labeled with decimal number labels.
In step S504, the road traffic sign training image is converted into a grayscale image, and the grayscale image is scaled to a preset size.
In the embodiment of the disclosure, before feature analysis and extraction, the road traffic sign training image may be converted into a grayscale image. The gray value range is 0-255; that is, the original three-channel image is converted, according to an image type conversion rule, into a single-channel image with values between 0 and 255. 0 denotes black with the lowest brightness, 255 denotes white with the highest brightness, and intermediate values are grays representing the brightness levels between black and white. A grayscale image clearly reflects the outlines of different objects in the image according to brightness, which facilitates analyzing and extracting the overall feature information of the road traffic sign training image.
In the disclosed embodiments, the grayscale image may be scaled to a preset size in order to speed up feature analysis and extraction. The preset size may be set according to actual conditions, and for example, the preset size may be 64 × 64.
In step S506, a traffic sign training feature vector of the road traffic sign training image is obtained.
In the embodiment of the present disclosure, step S506 may refer to step S210 described above, and the details of the present disclosure are not repeated herein.
In step S508, the road traffic sign classification model is trained using the class labels of the road traffic sign training images and the traffic sign training feature vectors thereof as inputs.
In the embodiment of the disclosure, the road traffic sign classification model may be an SVM model. Because the SVM model involves essentially no probability measures or the law of large numbers, it can simplify common classification and regression problems. The computational complexity of the SVM model depends on the number of support vectors rather than the dimensionality of the sample space, which in a sense avoids the "curse of dimensionality". In addition, the SVM model can grasp the key samples, eliminate redundant samples, and has good robustness.
In the embodiment of the present disclosure, the "one-against-one" (one-to-one) method of the SVM model may be used; that is, n(n − 1)/2 classifiers may be constructed according to the required number of classes n, each classifier being trained on two classes of road traffic sign training images. Training an SVM classifier amounts to solving a quadratic programming problem.
In the embodiment of the present disclosure, the training parameters of the SVM model may be set as: penalty parameter C = 10; kernel function set to Gaussian, i.e. kernel = 'rbf'; degree of the polynomial (poly) kernel set to 3; error tolerance for stopping training set to tol = 1e-5; probability estimation enabled, i.e. probability = True.
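Training with these parameters can be sketched as follows. scikit-learn's SVC is assumed (the disclosure names no library); it implements the one-against-one scheme internally, building n(n − 1)/2 pairwise classifiers. The feature vectors and labels here are synthetic stand-ins for the HOG training set:

```python
import numpy as np
from sklearn.svm import SVC

# Synthetic stand-ins for 1764-dimensional HOG feature vectors of two
# sign categories (labels 0 and 1); real training data would come from
# the cropped, labeled training images described above.
rng = np.random.default_rng(0)
X_train = np.vstack([rng.normal(0.0, 1.0, (20, 1764)),
                     rng.normal(3.0, 1.0, (20, 1764))])
y_train = np.array([0] * 20 + [1] * 20)

# Training parameters as listed above.
clf = SVC(C=10, kernel='rbf', degree=3, tol=1e-5, probability=True)
clf.fit(X_train, y_train)
pred = clf.predict(rng.normal(3.0, 1.0, (1, 1764)))  # a class-1-like sample
```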
Each SVM classifier trains on one pair of classes of road traffic sign training images (each pair without repetition) until all classifiers have been trained on all the road traffic sign training images. For the i-th class and the j-th class of road traffic sign training images, the training principle may be:
min over (w, b, ε): (1/2)‖w‖² + C Σ_t ε_t
subject to:
wᵀφ(x_t) + b ≥ 1 − ε_t, if x_t belongs to the i-th class;
wᵀφ(x_t) + b ≤ −(1 − ε_t), if x_t belongs to the j-th class;
ε_t ≥ 0.
wherein w represents the hyperplane normal vector, and w is the unique optimization target; b denotes the intercept of the hyperplane; ε_t denotes the relaxation (slack) variables; the subscript t denotes the index of a sample in the union of the i-th and j-th class data; and φ(·) represents the non-linear mapping of the input space to the feature space.
In the embodiment of the disclosure, after the training of the SVM model is completed, the input road traffic sign feature vector can be processed according to the trained SVM model, and the traffic sign category of the road traffic sign region is predicted.
In the embodiment of the disclosure, each trained SVM classifier casts one prediction (vote) on the input road traffic sign feature vector X_new through its decision function. Taking the SVM classifier between class i and class j as an example, if X_new is predicted as the i-th class, one vote is added to class i; otherwise, one vote is added to class j. Finally, the class with the most votes is the model's prediction for the road traffic sign feature vector. The decision function is the basis on which the SVM model judges the input data:
f(X_new) = sign(wᵀφ(X_new) + b)
wherein X_new represents the input data to be predicted and f(X_new) indicates the prediction result. In the embodiment of the disclosure, X_new represents the input road traffic sign feature vector, and f(X_new) represents the predicted traffic sign category.
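The one-against-one voting just described can be observed directly with scikit-learn's SVC (an assumed library choice) on a toy three-class problem with synthetic data: three classes yield 3 × (3 − 1) / 2 = 3 pairwise classifiers, and the class collecting the most votes wins.

```python
import numpy as np
from sklearn.svm import SVC

# Three synthetic classes clustered around 0, 3 and 6 in a 2-D space.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(c, 0.3, (15, 2)) for c in (0.0, 3.0, 6.0)])
y = np.repeat([0, 1, 2], 15)

clf = SVC(C=10, kernel='rbf', decision_function_shape='ovo').fit(X, y)
X_new = np.array([[6.0, 6.0]])
votes = clf.decision_function(X_new)  # one score per pairwise classifier
label = clf.predict(X_new)            # class with the most votes
```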
It is noted that the above-mentioned figures are merely schematic illustrations of processes involved in methods according to exemplary embodiments of the present disclosure, and are not intended to be limiting. It will be readily understood that the processes shown in the above figures are not intended to indicate or limit the chronological order of the processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, e.g., in multiple modules.
The following are embodiments of the disclosed apparatus that may be used to perform embodiments of the disclosed methods. For details not disclosed in the embodiments of the apparatus of the present disclosure, refer to the embodiments of the method of the present disclosure.
Fig. 6 is a block diagram illustrating a vehicle smart navigation device according to an exemplary embodiment.
As shown in fig. 6, the vehicular smart navigation apparatus 600 may include: the system comprises a video stream acquisition module 602, a video frame acquisition module 604, an area determination module 606, a road traffic sign area image acquisition module 608, a feature vector acquisition module 610, a sign category prediction module 612 and an instruction sending module 614.
The video stream acquiring module 602 may be configured to acquire a video stream acquired by a target vehicle during an automatic navigation process; the video frame obtaining module 604 may be configured to obtain a video frame from a video stream; the region determining module 606 may be configured to determine a region in the video frame that matches the color of the road traffic sign according to the color difference; the area image obtaining module 608 may be configured to obtain an image of a road traffic sign area in the determined area by using shape feature screening; the feature vector obtaining module 610 may be configured to obtain a road traffic sign feature vector of the road traffic sign region image; the sign category prediction module 612 may be configured to process the road traffic sign feature vector by using the trained road traffic sign classification model, and predict a traffic sign category to which the road traffic sign region image belongs; the instruction sending module 614 may be configured to send a navigation instruction matched with the predicted road traffic sign region image according to the traffic sign category to which the road traffic sign region image belongs, so as to control the traveling of the target vehicle.
In an exemplary embodiment, the region determining module 606 may include: a template obtaining unit operable to scan each pixel in a video frame with a template of a predetermined size; the denoising unit can be used for replacing the value of the central pixel point corresponding to each template with the weighted average pixel value of the pixels in the neighborhood determined by each template to obtain a denoised video frame; the format conversion unit can be used for converting the color format of the denoised video frame from a BGR image format into a HSV image format; the area determining unit may be configured to determine a pixel point set in a preset color range in the denoised HSV image format video frame as an area that conforms to a color of the road traffic sign.
In an exemplary embodiment, the region image obtaining module 608 may include: the circumscribed rectangle acquiring unit can be used for acquiring the minimum circumscribed rectangle of the region; and the area image obtaining unit can be used for determining the area as the road traffic sign area image if the length and the width of the minimum circumscribed rectangle are both larger than the preset length and the length-width ratio of the minimum circumscribed rectangle is within the preset proportion range.
In an exemplary embodiment, the feature vector obtaining module 610 may include: the gradient determining unit can be used for determining the gradient amplitude and the gradient direction corresponding to each pixel point in the road traffic sign region image; the descriptor obtaining unit can be used for obtaining a descriptor of a cell unit according to the gradient amplitude and the gradient direction corresponding to each pixel point, and the cell unit comprises a first number of pixel points; a first concatenation unit operable to concatenate the descriptors of the cell units to obtain a descriptor of a square unit, the square unit comprising a second number of cell units; and the second series unit can be used for serially connecting the descriptors of the square units to obtain the road traffic sign feature vector of the road traffic sign area image.
In an exemplary embodiment, the gradient determining unit may include: the gradient amplitude determining unit can be used for determining the horizontal direction gradient amplitude and the vertical direction gradient amplitude corresponding to each pixel point in the road traffic sign region image; and the amplitude and direction determining unit can be used for determining the gradient amplitude and the gradient direction corresponding to each pixel point according to the horizontal direction gradient amplitude and the vertical direction gradient amplitude corresponding to each pixel point.
In an exemplary embodiment, the vehicular smart navigation apparatus 600 may further include: the training image acquisition module can be used for acquiring a road traffic sign training image and setting a class label; the image conversion module can be used for converting the road traffic sign training image into a gray image and zooming the gray image into a preset size; the training feature vector obtaining module can be used for obtaining a traffic sign training feature vector of a road traffic sign training image; and the model training module can be used for training the road traffic sign classification model by taking the class labels of the road traffic sign training images and the traffic sign training feature vectors thereof as input.
In an exemplary embodiment, training the image acquisition module may include: a training video stream acquisition unit operable to acquire a training video stream photographed at a plurality of positions and under a plurality of environmental conditions; a training video frame obtaining unit operable to obtain a training video frame from a training video stream; and the training video frame clipping unit can be used for clipping the training video frame to obtain the road traffic sign training image.
It is noted that the block diagrams shown in the above figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
Fig. 7 is a schematic structural diagram of an electronic device according to an example embodiment. It should be noted that the electronic device shown in fig. 7 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiment of the present invention.
As shown in fig. 7, the electronic apparatus 700 includes a Central Processing Unit (CPU)701, which can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM)702 or a program loaded from a storage section 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data necessary for the operation of the system 700 are also stored. The CPU 701, the ROM 702, and the RAM 703 are connected to each other via a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.
The following components are connected to the I/O interface 705: an input portion 706 including a keyboard, a mouse, and the like; an output section 707 including a display such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker; a storage section 708 including a hard disk and the like; and a communication section 709 including a network interface card such as a LAN card, a modem, or the like. The communication section 709 performs communication processing via a network such as the internet. A drive 710 is also connected to the I/O interface 705 as needed. A removable medium 711 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 710 as necessary, so that a computer program read out therefrom is mounted into the storage section 708 as necessary.
In particular, according to an embodiment of the present invention, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the invention include a computer program product comprising a computer program embodied on a computer-readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program can be downloaded and installed from a network through the communication section 709, and/or installed from the removable medium 711. The computer program performs the above-described functions defined in the system of the present invention when executed by the Central Processing Unit (CPU) 701.
It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present invention may be implemented by software or hardware. The described units may also be provided in a processor, and may be described as: a processor includes a transmitting unit, an obtaining unit, a determining unit, and a first processing unit. The names of these units do not in some cases constitute a limitation to the unit itself, and for example, the sending unit may also be described as a "unit sending a picture acquisition request to a connected server".
As another aspect, the present invention also provides a computer-readable medium that may be contained in the apparatus described in the above embodiments; or may be separate and not incorporated into the device. The computer readable medium carries one or more programs which, when executed by a device, cause the device to comprise: acquiring a video stream acquired by a target vehicle in an automatic navigation advancing process; obtaining a video frame from a video stream; determining an area which accords with the color of the road traffic sign in the video frame through the color difference; utilizing shape characteristics to screen in the determined area to obtain an image of the road traffic sign area; obtaining a road traffic sign feature vector of the road traffic sign region image; processing the road traffic sign feature vector by using the trained road traffic sign classification model, and predicting the traffic sign category to which the road traffic sign region image belongs; and sending a navigation instruction matched with the predicted road surface traffic sign area image according to the traffic sign category to which the road surface traffic sign area image belongs so as to control the running of the target vehicle.
Exemplary embodiments of the present invention are specifically illustrated and described above. It is to be understood that the invention is not limited to the precise construction, arrangements, or instrumentalities described herein; on the contrary, the invention is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims (10)

1. A vehicle intelligent navigation method is characterized by comprising the following steps:
acquiring a video stream acquired by a target vehicle in an automatic navigation advancing process;
obtaining video frames from the video stream;
determining an area which accords with the color of the road traffic sign in the video frame through color difference;
utilizing shape feature screening to obtain road traffic sign area images in the determined areas;
obtaining a road traffic sign feature vector of the road traffic sign region image;
processing the road traffic sign feature vector by using a trained road traffic sign classification model, and predicting the traffic sign category to which the road traffic sign region image belongs;
and sending a navigation instruction matched with the predicted road surface traffic sign area image according to the traffic sign category to which the road surface traffic sign area image belongs so as to control the running of the target vehicle.
2. The method of claim 1, wherein determining the area of the video frame that corresponds to the color of the road traffic sign from the color difference comprises:
scanning each pixel in the video frame with a template of a predetermined size;
replacing the value of the central pixel point corresponding to each template with the weighted average pixel value of the pixels in the neighborhood determined by each template to obtain a denoised video frame;
converting the color format of the denoised video frame from a BGR image format to an HSV image format;
and determining a pixel point set within a preset color threshold range in the denoised HSV image format video frame as an area which accords with the color of the road traffic sign.
3. The method of claim 2, wherein obtaining an image of an area of a road traffic sign using shape feature screening in the determined area comprises:
acquiring a minimum circumscribed rectangle of the region;
and if the length and the width of the minimum circumscribed rectangle are both larger than the preset length and the length-width ratio of the minimum circumscribed rectangle is within the preset proportion range, determining the area as the road traffic sign area image.
4. The method of claim 1, wherein obtaining a road traffic sign feature vector for the road traffic sign region image comprises:
determining the gradient amplitude and the gradient direction corresponding to each pixel point in the road traffic sign region image;
obtaining descriptors of cell units according to the gradient amplitude and the gradient direction corresponding to each pixel point, wherein the cell units comprise a first number of pixel points;
concatenating the descriptors of the cell units to obtain a descriptor of a square unit, the square unit comprising a second number of cell units;
and connecting the descriptors of the square units in series to obtain the road traffic sign feature vector of the road traffic sign area image.
5. The method of claim 4, wherein determining the gradient magnitude and gradient direction corresponding to each pixel point in the road traffic sign region image comprises:
determining a horizontal gradient magnitude and a vertical gradient magnitude corresponding to each pixel point in the road traffic sign region image;
and determining the gradient magnitude and gradient direction of each pixel point from its horizontal and vertical gradient magnitudes.
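A sketch of the per-pixel gradient computation in claim 5, using central differences (the kernel choice is an assumption; a [-1, 0, 1] kernel is typical for HOG):

```python
import math

def pixel_gradient(gray, x, y):
    """Gradient magnitude and direction (degrees) at interior pixel (x, y)
    of a grayscale image stored as a list of rows."""
    gx = gray[y][x + 1] - gray[y][x - 1]   # horizontal gradient component
    gy = gray[y + 1][x] - gray[y - 1][x]   # vertical gradient component
    return math.hypot(gx, gy), math.degrees(math.atan2(gy, gx))
```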
6. The method of claim 1, further comprising:
acquiring a road traffic sign training image and setting a class label for it;
converting the road traffic sign training image into a grayscale image, and scaling the grayscale image to a preset size;
obtaining a traffic sign training feature vector of the road traffic sign training image;
and training the road traffic sign classification model with the class label of the road traffic sign training image and its traffic sign training feature vector as input.
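The pre-processing in claim 6 (grayscale conversion and scaling to a preset size) might be sketched as below; the BT.601 luma weights match what `cv2.cvtColor`'s BGR2GRAY uses, while the nearest-neighbour scaling and any particular target size (e.g. 64x64 before feature extraction) are assumptions:

```python
def to_gray(bgr_image):
    """Grayscale conversion with ITU-R BT.601 luma weights."""
    return [[0.114 * b + 0.587 * g + 0.299 * r for (b, g, r) in row]
            for row in bgr_image]

def resize_nearest(image, out_w, out_h):
    """Nearest-neighbour scaling of a list-of-rows image to a preset size."""
    in_h, in_w = len(image), len(image[0])
    return [[image[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
            for y in range(out_h)]
```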
7. The method of claim 6, wherein acquiring a road traffic sign training image comprises:
acquiring training video streams captured at a plurality of locations and under a plurality of environmental conditions;
obtaining training video frames from the training video streams;
and cropping the training video frames to obtain road traffic sign training images.
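Cropping a sign patch out of a training video frame (claim 7) reduces to a plain slice on a list-of-rows image; the coordinates below are illustrative:

```python
def crop(frame, x, y, w, h):
    """Cut a w-by-h region starting at (x, y) out of a list-of-rows frame."""
    return [row[x:x + w] for row in frame[y:y + h]]
```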
8. An intelligent vehicle navigation device, comprising:
a video stream acquisition module configured to acquire a video stream captured by a target vehicle while traveling under automatic navigation;
a video frame obtaining module configured to obtain a video frame from the video stream;
an area determining module configured to determine, by color difference, an area in the video frame that conforms to the color of a road traffic sign;
a road traffic sign region image obtaining module configured to obtain a road traffic sign region image in the determined area by shape feature screening;
a feature vector obtaining module configured to obtain a road traffic sign feature vector of the road traffic sign region image;
a sign class prediction module configured to process the road traffic sign feature vector with a trained road traffic sign classification model to predict the traffic sign class to which the road traffic sign region image belongs;
and an instruction sending module configured to send, according to the predicted traffic sign class to which the road traffic sign region image belongs, a navigation instruction matching the road traffic sign region image, so as to control the travel of the target vehicle.
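How the modules of claim 8 might be wired together, as hypothetical glue code: every callable here is a stand-in for one of the modules described above, and the class-to-instruction table is invented for illustration.

```python
class IntelligentNavigator:
    """Frame-by-frame pipeline: region detection -> features -> class -> instruction."""

    def __init__(self, detect_regions, extract_features, classify, command_table):
        self.detect_regions = detect_regions      # colour + shape screening
        self.extract_features = extract_features  # e.g. HOG-style feature vector
        self.classify = classify                  # trained sign classification model
        self.command_table = command_table        # sign class -> navigation instruction

    def process_frame(self, frame):
        """Yield one navigation instruction per detected sign region."""
        for region in self.detect_regions(frame):
            sign_class = self.classify(self.extract_features(region))
            if sign_class in self.command_table:
                yield self.command_table[sign_class]
```

With dummy stand-ins (`detect_regions` returning the whole frame, a classifier that always answers "stop"), `process_frame` yields the mapped instruction, e.g. "BRAKE".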
9. An electronic device, comprising:
at least one processor;
storage means for storing at least one program which, when executed by the at least one processor, causes the at least one processor to carry out the method of any one of claims 1 to 7.
10. A computer-readable storage medium having computer-executable instructions stored thereon which, when executed by a processor, implement the method of any one of claims 1 to 7.
CN202110201502.0A 2021-02-23 2021-02-23 Intelligent vehicle navigation method and device, electronic equipment and storage medium Pending CN113011268A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110201502.0A CN113011268A (en) 2021-02-23 2021-02-23 Intelligent vehicle navigation method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110201502.0A CN113011268A (en) 2021-02-23 2021-02-23 Intelligent vehicle navigation method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113011268A true CN113011268A (en) 2021-06-22

Family

ID=76407618

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110201502.0A Pending CN113011268A (en) 2021-02-23 2021-02-23 Intelligent vehicle navigation method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113011268A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116168333A (en) * 2023-04-20 2023-05-26 华南理工大学 Self-supervision visual language navigation pre-training method, device and storage medium
CN116168333B (en) * 2023-04-20 2023-08-22 华南理工大学 Self-supervision visual language navigation pre-training method, device and storage medium

Similar Documents

Publication Publication Date Title
CN107944450B (en) License plate recognition method and device
CN109583345B (en) Road recognition method, device, computer device and computer readable storage medium
CN104200228B (en) Recognizing method and system for safety belt
CN113723377B (en) Traffic sign detection method based on LD-SSD network
CN112712057B (en) Traffic signal identification method and device, electronic equipment and storage medium
CN114565895B (en) Security monitoring system and method based on intelligent society
CN114092917B (en) MR-SSD-based shielded traffic sign detection method and system
CN112381092B (en) Tracking method, tracking device and computer readable storage medium
CN111091023A (en) Vehicle detection method and device and electronic equipment
CN111507337A (en) License plate recognition method based on hybrid neural network
CN113223614A (en) Chromosome karyotype analysis method, system, terminal device and storage medium
CN115601717B (en) Deep learning-based traffic offence behavior classification detection method and SoC chip
CN117197763A (en) Road crack detection method and system based on cross attention guide feature alignment network
CN116052090A (en) Image quality evaluation method, model training method, device, equipment and medium
CN114821519B (en) Traffic sign recognition method and system based on coordinate attention
CN114495060B (en) Road traffic marking recognition method and device
CN116246059A (en) Vehicle target recognition method based on improved YOLO multi-scale detection
Lashkov et al. Edge-computing-facilitated nighttime vehicle detection investigations with CLAHE-enhanced images
CN113011268A (en) Intelligent vehicle navigation method and device, electronic equipment and storage medium
CN112115737B (en) Vehicle orientation determining method and device and vehicle-mounted terminal
CN115965831A (en) Vehicle detection model training method and vehicle detection method
CN115937818A (en) Road surface type surveying method and device for intelligent automobile and related equipment
CN114648736A (en) Robust engineering vehicle identification method and system based on target detection
CN114882205A (en) Target detection method based on attention mechanism
CN112434591B (en) Lane line determination method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination