CN117315596A - Deep learning-based motor vehicle black smoke detection and identification method - Google Patents
- Publication number
- CN117315596A (application CN202311151301.XA)
- Authority
- CN
- China
- Prior art keywords: black smoke, frame, motor vehicle, image, region
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06V20/54—Surveillance or monitoring of activities of traffic, e.g. cars on the road
- G06N3/0464—Convolutional networks [CNN, ConvNet]
- G06N3/084—Backpropagation, e.g. using gradient descent
- G06N3/048—Activation functions
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
- G06V10/267—Segmentation of patterns by performing operations on regions, e.g. growing, shrinking or watersheds
- G06V10/469—Contour-based spatial representations, e.g. vector-coding
- G06V10/54—Extraction of image or video features relating to texture
- G06V10/56—Extraction of image or video features relating to colour
- G06V10/762—Pattern recognition or machine learning using clustering
- G06V10/764—Pattern recognition or machine learning using classification
- G06V10/82—Pattern recognition or machine learning using neural networks
- G06V2201/08—Detecting or categorising vehicles
- Y02T10/40—Engine management systems
Abstract
The invention relates to the technical field of black smoke detection, and in particular to a deep learning-based method for detecting and identifying black smoke from motor vehicles. Motor vehicle identification: a picture is preprocessed and fed into a deep learning convolutional neural network to obtain an identification frame; pictures are randomly selected and candidate regions are generated on them; the candidate regions are labelled and divided into positive and negative samples; picture features are extracted and the positions and categories of the candidate regions are predicted. Motor vehicle black smoke identification. Motor vehicle black smoke contour detection: one-dimensional pixel features are acquired; black smoke features are obtained; region-based image segmentation is performed; a convolutional neural network is trained; the optimization is updated with a large number of training samples until an optimal solution is obtained. The design of the invention reduces the workload, saves computation time and wasted computation, and improves the speed and accuracy of recognition and detection; it effectively improves prediction speed; and it improves the accuracy of black smoke detection while shortening the time consumed and reducing the labor required.
Description
Technical Field
The invention relates to the technical field of black smoke detection, in particular to a motor vehicle black smoke detection and identification method based on deep learning.
Background
Conventional black smoke vehicle detection methods include public reporting, periodic inspection, night patrol, tail gas analyzers, and manual video monitoring. However, these methods are expensive and require substantial manpower and material resources. With the continuous increase in the number of urban cameras and the development of computer vision technology, automatic detection of black smoke vehicles based on video monitoring has become the mainstream; such systems can run unattended 24 hours a day, automatically detecting black smoke vehicles and capturing and uploading evidence.
At present, although many domestic cities have successively installed corresponding intelligent electronic snapshot systems for black smoke vehicles, these still consume a great deal of manpower and time, and their recognition and detection efficiency is low. In view of the above, we propose a deep learning-based method for detecting and identifying black smoke from motor vehicles.
Disclosure of Invention
The invention aims to provide a motor vehicle black smoke detection and identification method based on deep learning, so as to solve the problems in the background technology.
In order to solve the technical problems, one of the purposes of the invention is to provide a motor vehicle black smoke detection and recognition method based on deep learning, which mainly comprises three parts of motor vehicle recognition, motor vehicle black smoke recognition and motor vehicle black smoke contour detection; the specific method flow comprises the following steps:
s1, motor vehicle identification:
s1.1, after relevant preprocessing is carried out on a picture containing a motor vehicle, the picture is fed into a deep learning convolutional neural network to obtain a motor vehicle identification frame;
s1.2, randomly selecting a plurality of pictures from the dataset, randomly scaling, randomly distributing and splicing them, generating a series of candidate areas on the resulting picture according to a certain rule, and labelling the candidate areas according to the positional relation between each candidate area and the real frame of the object on the picture;
s1.3, marking candidate areas close enough to the real frame as positive samples, and taking the position of the real frame as a position target of the positive samples; those candidate regions that deviate more from the true box will be marked as negative samples, which do not need predicted positions or categories;
s1.4, extracting picture features by using a convolutional neural network, predicting the position and the category of a candidate region, and establishing a loss function;
s2, identifying black smoke of the motor vehicle: according to the identification method in the step S1, or in the motor vehicle identification process of the step S1, the motor vehicle black smoke identification operation is synchronously carried out;
s3, detecting the black smoke outline of the motor vehicle:
s3.1, starting from the brightness space of the image pixels and taking the characteristics of the color space into account, first obtaining a one-dimensional pixel feature from the multidimensional color-space components of the pixel brightness;
s3.2, mapping the pixel feature to the pixel gray level in the gray-level image to obtain black smoke features such as shape and brightness; these features can be obtained from a set of training sample images and are described by feature descriptor operators;
s3.3, carrying out region-based image segmentation on the input automobile black smoke image by using a segmentation algorithm, so that the characteristics of brightness, texture and the like of the image of the automobile black smoke in the segmentation result region have similarity, and finally obtaining and marking the black smoke outline;
s3.4, training the convolutional neural network on the input images and their corresponding segmentation maps using stochastic gradient descent;
s3.5, using a high momentum so that a large number of previously seen training samples determine the update in the current optimization step, until an optimal solution is obtained.
As a further improvement of the present technical solution, in the step S1.2, a candidate region is generated, that is: generating a series of anchor frames fixed in position on the picture according to a certain rule, and regarding the anchor frames as possible candidate areas; predicting whether the anchor frame contains a target vehicle, and if so, adjusting the amplitude of the prediction frame relative to the position of the anchor frame; the specific algorithm is as follows:
dividing an original picture into m×n areas: setting the resolution of the original picture as (h, w) and the size of a small block area as k×k, then m and n are respectively:
m = ⌈h/k⌉
n = ⌈w/k⌉
i.e. the original image can be divided into ⌈h/k⌉ rows and ⌈w/k⌉ columns of small square areas.
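The grid computation above can be sketched as follows (a minimal Python sketch; ceiling division is assumed for the case where k does not divide h or w exactly, and the function name is illustrative):

```python
import math

def grid_dims(h, w, k):
    """Number of rows (m) and columns (n) of k-by-k cells covering an h-by-w image."""
    return math.ceil(h / k), math.ceil(w / k)

# Example matching the embodiment below: a picture of resolution (640, 480)
# with 32x32 cells gives a 20-row by 15-column grid.
m, n = grid_dims(640, 480, 32)
```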
As a further improvement of the present technical solution, in the step S1.2, an algorithm for adjusting the position of the prediction frame relative to the anchor frame is:
the algorithm generates a series of anchor frames at the center of each region; the anchor frame is fixed in position and generally cannot coincide exactly with the vehicle bounding box, so fine adjustment of the position is needed on the basis of the anchor frame to generate a prediction frame, which has a different position center and size relative to the anchor frame; for example, on the basis of the candidate region, for an anchor frame generated at the center of the small square area in row m_i (i ∈ {1, 2, …, m}) and column n_j (j ∈ {1, 2, …, n}), taking the width of the small square as the unit length, the position coordinates of the upper-left corner of the small square area are:
c_x = n_j
c_y = m_i
the center coordinates (center_x, center_y) of this anchor frame are:
center_x = c_x + 0.5
center_y = c_y + 0.5
the center coordinates (b_x, b_y) of the prediction frame can be generated in the following manner:
b_x = c_x + σ(t_x)
b_y = c_y + σ(t_y)
wherein t_x and t_y are real numbers, and σ(x) is the Sigmoid function, defined as follows:
σ(x) = 1 / (1 + e^(−x))
since the value of the Sigmoid function lies between 0 and 1, the center point of the prediction frame calculated by the above formulas always falls inside the small region in row m_i and column n_j;
when t_x = t_y = 0, b_x = c_x + 0.5 and b_y = c_y + 0.5, i.e. the center of the prediction frame coincides with the center of the anchor frame, both lying at the center of the small area; the size of the anchor frame is preset and can be regarded as a hyperparameter of the model; the anchor frame height is denoted p_h and its width p_w; the prediction frame height is denoted b_h and its width b_w;
The size of the prediction box is generated by the following formulas:
b_h = p_h · e^(t_h)
b_w = p_w · e^(t_w)
when t_x = t_y = 0 and t_h = t_w = 0, the prediction box and the anchor box coincide.
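The anchor-to-prediction-frame decoding described above can be sketched in a few lines (the function name and argument order are illustrative assumptions):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def decode_box(c_x, c_y, p_w, p_h, t_x, t_y, t_w, t_h):
    """Decode a prediction frame from the anchor at cell corner (c_x, c_y):
    the center offset is squashed by the Sigmoid so it stays inside the cell,
    while the size rescales the anchor (width p_w, height p_h) exponentially."""
    b_x = c_x + sigmoid(t_x)
    b_y = c_y + sigmoid(t_y)
    b_w = p_w * math.exp(t_w)
    b_h = p_h * math.exp(t_h)
    return b_x, b_y, b_w, b_h
```

With all offsets zero, `decode_box(4, 10, 250, 350, 0, 0, 0, 0)` returns `(4.5, 10.5, 250.0, 350.0)`, i.e. the prediction frame coincides with the anchor, as stated above.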
As a further improvement of the present technical solution, in the step S1.4, the method for establishing the loss function is as follows:
firstly, extracting picture features by using a convolutional neural network and predicting the positions and the categories of candidate areas;
then, each prediction frame is regarded as a sample, and label values are obtained according to the positions and the categories of the real frames relative to the prediction frames;
the position and the category of the prediction frame are predicted through the convolutional neural network model, and the convolutional neural network prediction value and the label value are compared, so that a loss function can be established.
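One way the positive/negative labelling of prediction frames against real frames could be realised is an intersection-over-union test; the threshold values below are illustrative assumptions, not values from the patent:

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def label_anchors(anchors, gt_boxes, pos_thr=0.7, neg_thr=0.3):
    """Mark each anchor positive (1), negative (0), or ignored (-1).
    Positive anchors take the best-matching real frame as their position target;
    negative anchors need no predicted position or category."""
    labels = []
    for anc in anchors:
        best = max((iou(anc, gt) for gt in gt_boxes), default=0.0)
        labels.append(1 if best >= pos_thr else (0 if best < neg_thr else -1))
    return labels
```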
As a further improvement of the technical scheme, in the step S3.2, the black smoke features mainly comprise shape, brightness and texture; these features may be obtained from a set of training sample images.
As a further improvement of the present technical solution, in the step S3.3, the region-based image segmentation process is a process of marking pixels in an image as different marks, and the connected pixels with the same marks form a segmented region;
the image can be segmented by applying a classification or clustering algorithm from the field of pattern recognition to the pixel points; image segmentation can thus be regarded as a transformation from the original digital image to a region-labelled image;
wherein the segmentation algorithm may employ the feature operator in step S3.2 to perform segmentation;
a marker-based region growing algorithm and a marker-based watershed algorithm are also possible;
a region-division merging method may also be employed.
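The pixel-marking view of segmentation described above, in which connected pixels sharing a mark form one region, can be sketched as a minimal 4-connected labelling pass; the function name and binary-mask input are illustrative assumptions:

```python
from collections import deque

def label_regions(mask):
    """Label 4-connected foreground regions of a binary mask (nested lists of 0/1).
    Each connected region of 1-pixels receives a distinct label id >= 1."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    next_id = 0
    for sy in range(h):
        for sx in range(w):
            if mask[sy][sx] and not labels[sy][sx]:
                next_id += 1            # new region found: flood-fill it
                labels[sy][sx] = next_id
                q = deque([(sy, sx)])
                while q:
                    y, x = q.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not labels[ny][nx]:
                            labels[ny][nx] = next_id
                            q.append((ny, nx))
    return labels
```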
As a further improvement of the present technical solution, in the step S3.4, when the convolutional neural network is trained by stochastic gradient descent, since no padding is performed during training, the output image is smaller than the input image by a constant border width; large input tiles are favored over a large batch size, reducing the batch to a single image, in order to minimize overhead and make maximum use of memory.
As a further improvement of the present technical solution, in the step S3.5, the specific algorithm for updating the current optimization step using high momentum includes:
firstly, an energy function is established, which combines a pixel-wise soft_max over the final feature map with the cross-entropy loss function; the soft_max is defined as:
p_k(x) = exp(a_k(x)) / Σ_{k'=1}^{K} exp(a_{k'}(x))
wherein a_k(x) denotes the activation in feature channel k at the pixel position x ∈ Ω; K is the number of classes and p_k(x) is the approximated maximum function, i.e. p_k(x) ≈ 1 for the channel k with the maximum activation a_k(x) and p_k(x) ≈ 0 for all other k; the cross entropy then penalizes, at each position, the deviation of p_{l(x)}(x) from 1:
E = Σ_{x∈Ω} w(x) · log(p_{l(x)}(x))
wherein E represents the energy value, l: Ω → {1, …, K} is the true label of each pixel, and w: Ω → R is an introduced weight map that gives some pixels more importance during the training phase; w(x) represents the mapping weight;
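A minimal numerical sketch of the pixel-wise soft_max and weighted cross-entropy above (the patent's energy formula is written without an explicit sign; the sketch below returns the usual negative log-likelihood form, an assumption about the intended loss direction):

```python
import numpy as np

def pixel_softmax(a):
    """Pixel-wise soft-max over a (K, H, W) activation map."""
    e = np.exp(a - a.max(axis=0, keepdims=True))  # subtract max for numerical stability
    return e / e.sum(axis=0, keepdims=True)

def weighted_cross_entropy(a, labels, w):
    """-sum_x w(x) * log(p_{l(x)}(x)) for integer labels (H, W) and weights (H, W)."""
    p = pixel_softmax(a)
    h_idx, w_idx = np.indices(labels.shape)
    return float(-(w * np.log(p[labels, h_idx, w_idx])).sum())
```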
a weight map is pre-computed for each ground-truth segmentation to compensate for the different frequencies of pixels from each class in the training data, and to force the convolutional neural network to learn the small separation borders introduced around the black smoke regions;
the separation borders are computed using morphological operations; the mapping weight is then:
w(x) = w_c(x) + w_0 · exp(−(d_1(x) + d_2(x))² / (2σ²))
wherein w_c: Ω → R is the weight map used to balance the class frequencies, d_1: Ω → R denotes the distance to the border of the nearest black smoke region, d_2: Ω → R the distance to the border of the second-nearest black smoke region, and w_0 represents the initial weight;
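The weight-map formula can be written down directly; this section does not fix the values of w_0 and σ, so the defaults below follow the U-Net paper's choices and are assumptions:

```python
import numpy as np

def smoke_weight_map(d1, d2, w_c, w0=10.0, sigma=5.0):
    """w(x) = w_c(x) + w0 * exp(-(d1(x) + d2(x))^2 / (2*sigma^2)).

    d1, d2: per-pixel distances to the nearest / second-nearest smoke border;
    w_c: per-pixel class-frequency balancing term. All arrays share one shape.
    """
    return w_c + w0 * np.exp(-((d1 + d2) ** 2) / (2.0 * sigma ** 2))
```

On a border pixel (d_1 = d_2 = 0) the weight reaches its maximum w_c + w_0, which is what drives the network toward learning the thin separation borders.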
the initial weights are adjusted such that each feature map in the convolutional neural network has approximately unit variance; this is achieved by drawing the initial weights from a Gaussian distribution with standard deviation √(2/N), where N denotes the number of incoming nodes of one neuron.
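A sketch of the √(2/N) initialization for a convolution weight tensor (the (out_ch, in_ch, kh, kw) layout and helper name are assumptions; for a 3×3 convolution over 64 input channels, N = 9 · 64 = 576):

```python
import numpy as np

def gaussian_init(shape, seed=0):
    """Draw conv weights from N(0, 2/N), N = incoming nodes = in_ch * kh * kw,
    so that each feature map keeps approximately unit variance."""
    rng = np.random.default_rng(seed)
    fan_in = int(np.prod(shape[1:]))  # (out_ch, in_ch, kh, kw) -> in_ch * kh * kw
    return rng.normal(0.0, np.sqrt(2.0 / fan_in), size=shape)
```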
Another object of the present invention is to provide a platform device for detecting and identifying black smoke of a motor vehicle, which comprises a processor, a memory and a computer program stored in the memory and running on the processor, wherein the processor is used for implementing the steps of the method for detecting and identifying black smoke of a motor vehicle based on deep learning when executing the computer program.
It is a further object of the present invention to provide a computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of the deep learning based vehicle soot detection and identification method described above.
Compared with the prior art, the invention has the beneficial effects that:
1. in the deep learning-based motor vehicle black smoke detection and recognition method, the detection work is divided into three parts, namely motor vehicle recognition, motor vehicle black smoke recognition and motor vehicle black smoke contour detection; motor vehicle and black smoke recognition can be carried out synchronously, and the approach of first recognizing the motor vehicle and its black smoke and then detecting the black smoke contour reduces the workload, saves computation time and wasted computation, and improves recognition and detection speed and accuracy;
2. in the motor vehicle black smoke detection and recognition method based on deep learning, a convolutional neural network based on deep learning is adopted, and prediction is carried out by firstly learning and marking a pre-selected frame for pre-dividing a preprocessed random picture, so that the prediction speed can be effectively improved;
3. according to the motor vehicle black smoke detection and recognition method based on deep learning, through feature extraction and description of images, segmentation is carried out on the images, black smoke with similar features is detected in each segmentation area, black smoke contours are finally obtained, and then continuous optimization updating training is carried out by using a convolution neural network based on deep learning until the black smoke contours at accurate positions are detected, so that the accuracy of black smoke detection is improved, time consumption is shortened, labor investment is reduced, and further substantial contribution is made to effectively controlling black smoke emission and improving urban air quality.
Drawings
FIG. 1 is an exemplary overall process flow diagram of the present invention;
FIG. 2 is one of exemplary partial process flow diagrams of the present invention;
FIG. 3 is a second exemplary partial process flow diagram of the present invention;
fig. 4 is a block diagram of an exemplary electronic computer product apparatus according to the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Example 1
As shown in fig. 1-3, the embodiment provides a method for detecting and identifying black smoke of a motor vehicle based on deep learning, which mainly comprises three parts of motor vehicle identification, motor vehicle black smoke identification and motor vehicle black smoke contour detection; the specific method flow comprises the following steps.
S1, motor vehicle identification, as shown in FIG. 2:
s1.1, after relevant pretreatment is carried out on the pictures containing the motor vehicles, the pictures are led into a deep learning convolutional neural network, and a motor vehicle identification frame is obtained.
S1.2, randomly selecting a plurality of pictures, such as 4 pictures, randomly scaling, randomly distributing and splicing the pictures, generating a series of candidate areas on the pictures according to a certain rule, and marking the candidate areas according to the position relation between the candidate areas and the real frames of the objects on the pictures;
in this step, candidate regions are generated, namely: generating a series of anchor frames fixed in position on the picture according to a certain rule, and regarding the anchor frames as possible candidate areas; predicting whether the anchor frame contains a target vehicle, and if so, adjusting the amplitude of the prediction frame relative to the position of the anchor frame; the specific algorithm is as follows:
dividing the original picture into m×n regions: setting the resolution of the original picture as (h, w), e.g. (640, 480), and the small block region as k×k, e.g. 32×32, then m and n are respectively:
m = ⌈h/k⌉ = 640/32 = 20
n = ⌈w/k⌉ = 480/32 = 15
i.e. the original image is divided into 20 rows and 15 columns of small square areas.
Further, the algorithm for adjusting the position of the prediction frame relative to the anchor frame is as follows:
the algorithm generates a series of anchor frames at the center of each region; the anchor frame is fixed in position and generally cannot coincide exactly with the vehicle bounding box, so fine adjustment of the position is needed on the basis of the anchor frame to generate a prediction frame, which has a different position center and size relative to the anchor frame; for example, consider an anchor frame generated at the center of the small square area in row m_i = 10 and column n_j = 4, taking the width of the small square as the unit length; the position coordinates of the upper-left corner of this small square area are:
c_x = n_j = 4
c_y = m_i = 10
the center coordinates (center_x, center_y) of this anchor frame are:
center_x = c_x + 0.5 = 4.5
center_y = c_y + 0.5 = 10.5
the center coordinates (b_x, b_y) of the prediction frame can be generated in the following manner:
b_x = c_x + σ(t_x)
b_y = c_y + σ(t_y)
wherein t_x and t_y are real numbers, and σ(x) is the Sigmoid function:
σ(x) = 1 / (1 + e^(−x))
since the function value of Sigmoid is between 0 and 1, the center point of the prediction frame calculated by the above formula always falls inside the small area of row 10 and column 4;
when t_x = t_y = 0, b_x = c_x + 0.5 and b_y = c_y + 0.5, i.e. the center of the prediction frame coincides with the center of the anchor frame, both lying at the center of the small area; the size of the anchor frame is preset and can be regarded as a hyperparameter of the model; the anchor frame height is p_h and its width p_w; the prediction frame height is b_h and its width b_w; the anchor frame size is set as:
p_h = 350
p_w = 250
the size of the prediction box is generated by the following formulas:
b_h = p_h · e^(t_h)
b_w = p_w · e^(t_w)
when t_x = t_y = 0 and t_h = t_w = 0, the prediction box and the anchor box coincide.
If t_x, t_y, t_h and t_w are randomly assigned as:
t_x = 0.2, t_y = 0.3, t_h = −0.12, t_w = 0.1;
then the predicted frame parameters are (154.98, 357.44, 276.29, 310.42).
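The size components of this worked example can be checked numerically, assuming the exponential size decoding b_h = p_h · e^(t_h) and b_w = p_w · e^(t_w) used above:

```python
import math

p_h, p_w = 350, 250
t_h, t_w = -0.12, 0.1

b_h = p_h * math.exp(t_h)  # 350 * e^(-0.12), approx. 310.42
b_w = p_w * math.exp(t_w)  # 250 * e^(0.1),  approx. 276.29
```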
S1.3, marking candidate areas close enough to the real frame as positive samples, and taking the position of the real frame as a position target of the positive samples; those candidate regions that deviate significantly from the true box will then be marked as negative examples, which do not require predicted locations or categories.
S1.4, extracting picture features by using a convolutional neural network and predicting the positions and the categories of candidate areas, so that each prediction frame can be regarded as a sample, and labeling is carried out according to the positions and the categories of the real frames relative to the prediction frames to obtain a label value; the position and the category of the prediction frame are predicted through the convolutional neural network model, and the convolutional neural network prediction value and the label value are compared, so that a loss function can be established.
S2, identifying black smoke of the motor vehicle: according to the recognition method in step S1, or in the vehicle recognition process in step S1, the vehicle black smoke recognition operation is performed synchronously.
S3, detecting the black smoke outline of the motor vehicle, as shown in FIG. 3:
s3.1, according to the brightness space of the image pixel, considering the characteristics of the color space, firstly obtaining a one-dimensional pixel characteristic according to the multidimensional color space component of the pixel brightness.
S3.2, mapping the pixel feature to the pixel gray level in the gray-level image to obtain black smoke features such as shape and brightness; these features can be obtained from a set of training sample images and are described by feature descriptor operators.
And S3.3, carrying out region-based image segmentation on the input automobile black smoke image by using a segmentation algorithm, so that the characteristics of brightness, texture and the like of the image of the automobile black smoke in the segmentation result region have similarity, and finally obtaining and marking the black smoke outline.
In this step, the region-based image segmentation process labels the pixels of the image with different marks, and connected pixels carrying the same mark form a segmented region;
the image can be segmented by applying classification or clustering algorithms from the pattern recognition field to the pixels, so image segmentation can be regarded as a transformation from the original digital image to a region-labeled image;
wherein the segmentation algorithm may employ the feature operator in step S3.2 to perform segmentation;
a marker-based region growing algorithm and a marker-based watershed algorithm are also possible;
a region-division merging method may also be employed.
In particular, the image segmentation may be performed using an a priori feature operator in such a segmentation algorithm. Alternatively, the method can use visual judgment of the motor vehicle black smoke to place markers and then segment the image on the basis of those markers, specifically with a marker-based region growing algorithm, a marker-based watershed algorithm, or the like. In general, these algorithms either require training samples within the segmentation algorithm to obtain prior information, or are supervised segmentation algorithms requiring manual intervention.
Further, the basic method includes a region growing method and a region dividing and merging method.
First, the region growing method is a processing method that aggregates pixels or sub-regions into larger regions according to a predefined growth criterion. Typically, growing starts from a set of seed points, and pixels whose features are similar to those of the seed points (for example, pixels whose brightness differs from a seed point by less than some specified threshold) are appended to the regions grown from the seeds. Depending on whether prior information is needed for seed selection, region growing can be supervised or unsupervised. The prior information for seed selection may come from images that have already been successfully segmented (used as samples), or from the features of the objects to be identified. Without prior information, features must be computed for all pixels according to a uniform criterion, seeds are selected by the features of individual pixels, and pixels are appended during the growth process based on these features. For example, pixels whose gray values are local maxima or minima are used as seed points, and neighboring pixels of the growth region with similar gray values are added to the region. Without preprocessing, the seed points are selected as the pixels at the local minima of the gradient image, and a distance measure based on a topographic metric is used when appending pixels during region growing. The region growing method is thus a bottom-up method that grows regions from seed points.
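The seeded growth described above can be sketched as a breadth-first flood fill with a brightness-difference criterion; the threshold value and 4-connectivity are illustrative assumptions:

```python
from collections import deque

def region_grow(gray, seed, max_diff):
    """Grow a region from `seed` over a 2-D gray-level image (list of lists).

    A neighboring pixel joins the region when its gray value differs from
    the seed's by less than `max_diff` (the 'specified brightness
    difference' in the text); 4-connectivity is an illustrative choice.
    """
    h, w = len(gray), len(gray[0])
    sy, sx = seed
    seed_val = gray[sy][sx]
    region = {(sy, sx)}
    frontier = deque([(sy, sx)])
    while frontier:
        y, x = frontier.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (0 <= ny < h and 0 <= nx < w and (ny, nx) not in region
                    and abs(gray[ny][nx] - seed_val) < max_diff):
                region.add((ny, nx))
                frontier.append((ny, nx))
    return region

# Dark smoke pixels (values near 30) grow into one region; bright
# background pixels (values above 200) are excluded:
smoke = [[30, 32, 200],
         [31, 29, 210],
         [205, 28, 220]]
print(sorted(region_grow(smoke, (0, 0), 20)))
# -> [(0, 0), (0, 1), (1, 0), (1, 1), (2, 1)]
```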
Secondly, the region splitting and merging method uses a top-down splitting process to subdivide the image into a set of disjoint regions, and then merges adjacent regions with similar characteristics.
S3.4, training the convolutional neural network with the input images and their corresponding segmentation maps, using stochastic gradient descent (e.g., the Caffe implementation);
in this step, since no padding is used, the output image is smaller than the input image by a constant border width. To minimize overhead and make maximal use of memory, we favor large input tiles over a large batch size, reducing the batch to a single image.
S3.5, using a high momentum, so that a large number of previously seen training samples determine the update in the current optimization step, until an optimal solution is obtained;
in this step, the specific algorithm for updating the current optimization step using high momentum includes:
firstly, an energy function is established by combining a pixel-wise soft-max over the final feature map with a cross-entropy loss function; the soft-max is defined as:
p_k(x) = exp(a_k(x)) / Σ_{k'=1}^{K} exp(a_{k'}(x))
where a_k(x) denotes the activation in feature channel k at pixel position x ∈ Ω; K is the number of classes and p_k(x) is the approximated maximum function, i.e. p_k(x) ≈ 1 for the k with maximal activation a_k(x) and p_k(x) ≈ 0 for all other k; the cross entropy
E = Σ_{x∈Ω} w(x) log(p_{l(x)}(x))
then penalizes, at each position, the deviation of p_{l(x)}(x) from 1;
where E denotes the energy value, l: Ω → {1, ..., K} is the true label of each pixel, and w: Ω → R is a weight map we introduce to give some pixels more importance during the training phase; w(x) denotes the mapping weight;
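As an illustration only, the energy described above can be transcribed directly for a tiny two-class feature map; a minus sign is added so that the energy is a loss to minimize, which is the usual cross-entropy convention:

```python
import math

def pixelwise_softmax(activations):
    """activations[k][y][x]: activation of channel k at pixel (y, x).
    Returns p[k][y][x] = exp(a_k(x)) / sum_k' exp(a_k'(x))."""
    K = len(activations)
    h, w = len(activations[0]), len(activations[0][0])
    p = [[[0.0] * w for _ in range(h)] for _ in range(K)]
    for y in range(h):
        for x in range(w):
            z = sum(math.exp(activations[k][y][x]) for k in range(K))
            for k in range(K):
                p[k][y][x] = math.exp(activations[k][y][x]) / z
    return p

def energy(activations, labels, weights):
    """Weighted cross entropy E = -sum_x w(x) * log(p_{l(x)}(x)).
    The minus sign makes E a loss to minimize (the text writes the sum
    without it)."""
    p = pixelwise_softmax(activations)
    h, w = len(labels), len(labels[0])
    return -sum(weights[y][x] * math.log(p[labels[y][x]][y][x])
                for y in range(h) for x in range(w))

# Two channels, one 1x2 image: pixel 0 labeled class 0, pixel 1 class 1.
a = [[[2.0, 0.0]], [[0.0, 2.0]]]
l = [[0, 1]]
w = [[1.0, 1.0]]
print(round(energy(a, l, w), 4))  # 0.2539
```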
the weight map is pre-computed for each ground-truth segmentation to compensate for the different pixel frequencies of certain classes in the training data, and to force the convolutional neural network to learn the small separation borders that we introduce around black smoke regions;
the separation borders are computed using morphological operations; the mapping weight is then:
w(x) = w_c(x) + w_0 · exp(−(d_1(x) + d_2(x))² / (2σ²))
where w_c: Ω → R is the weight map used to balance the class frequencies, d_1: Ω → R denotes the distance to the border of the nearest black smoke region, d_2: Ω → R the distance to the border of the second nearest black smoke region, and w_0 denotes the initial weight; in practice we set w_0 = 10 and σ ≈ 5 pixels.
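Given d_1 and d_2 (e.g., computed by a distance transform over the ground-truth smoke masks), the weight map is a direct transcription of the formula above; treating w_c as a per-pixel scalar input is an illustrative simplification:

```python
import math

def border_weight(wc, d1, d2, w0=10.0, sigma=5.0):
    """w(x) = w_c(x) + w_0 * exp(-(d1(x) + d2(x))^2 / (2 * sigma^2)).

    wc     -- class-frequency balancing weight at this pixel
    d1, d2 -- distances to the nearest and second-nearest smoke borders
    w0, sigma -- the values used in practice (w0 = 10, sigma ~ 5 pixels)
    """
    return wc + w0 * math.exp(-((d1 + d2) ** 2) / (2.0 * sigma ** 2))

# On a border between two smoke regions (d1 = d2 = 0) the extra term is
# maximal; far from any border it vanishes:
print(border_weight(1.0, 0.0, 0.0))              # 11.0
print(round(border_weight(1.0, 30.0, 40.0), 6))  # 1.0
```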
Finally, the initial weights are adjusted so that each feature map in the convolutional neural network has approximately unit variance; this is achieved by drawing the initial weights from a Gaussian distribution with standard deviation √(2/N), where N denotes the number of incoming nodes of one neuron. For example, for a 3×3 convolution with 64 feature channels in the previous layer, N = 9·64 = 576.
As shown in fig. 4, the present embodiment also provides a motor vehicle black smoke detection and identification platform device comprising a processor, a memory, and a computer program stored in the memory and executable on the processor.
The processor comprises one or more than one processing core, the processor is connected with the memory through a bus, the memory is used for storing program instructions, and the steps of the deep learning-based motor vehicle black smoke detection and identification method are realized when the processor executes the program instructions in the memory.
Alternatively, the memory may be implemented by any type or combination of volatile or nonvolatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk.
In addition, the invention also provides a computer readable storage medium, wherein the computer readable storage medium stores a computer program, and the computer program realizes the steps of the method for detecting and identifying the black smoke of the motor vehicle based on deep learning when being executed by a processor.
Optionally, the present invention also provides a computer program product comprising instructions which, when run on a computer, cause the computer to perform the steps of the deep learning based vehicle soot detection and identification method of the above aspects.
It will be appreciated by those of ordinary skill in the art that the processes for implementing all or part of the steps of the above embodiments may be implemented by hardware, or may be implemented by a program for instructing the relevant hardware, and the program may be stored in a computer readable storage medium, where the above storage medium may be a read-only memory, a magnetic disk or optical disk, etc.
The foregoing has shown and described the basic principles, principal features and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the above-described embodiments, and that the above-described embodiments and descriptions are only preferred embodiments of the present invention, and are not intended to limit the invention, and that various changes and modifications may be made therein without departing from the spirit and scope of the invention as claimed. The scope of the invention is defined by the appended claims and equivalents thereof.
Claims (8)
1. A motor vehicle black smoke detection and identification method based on deep learning is characterized in that: the method mainly comprises three parts of motor vehicle identification, motor vehicle black smoke identification and motor vehicle black smoke contour detection; the specific method flow comprises the following steps:
s1, motor vehicle identification:
s1.1, after relevant pretreatment is carried out on a picture containing a motor vehicle, the picture is led into a deep learning convolutional neural network to obtain a motor vehicle identification frame;
s1.2, randomly selecting a plurality of pictures from a data set picture, randomly scaling, randomly distributing and splicing the pictures, generating a series of candidate areas on the pictures according to a certain rule, and marking the candidate areas according to the position relation between the candidate areas and a real frame of an object on the picture;
s1.3, marking candidate areas close enough to the real frame as positive samples, and taking the position of the real frame as a position target of the positive samples; those candidate regions that deviate more from the true box will be marked as negative samples;
s1.4, extracting picture features by using a convolutional neural network, predicting the position and the category of a candidate region, and establishing a loss function;
s2, identifying black smoke of the motor vehicle: according to the identification method in the step S1, or in the motor vehicle identification process of the step S1, the motor vehicle black smoke identification operation is synchronously carried out;
s3, detecting the black smoke outline of the motor vehicle:
s3.1, firstly obtaining a one-dimensional pixel characteristic according to the multidimensional color space component of the pixel brightness;
s3.2, the pixel characteristic is corresponding to the pixel gray in the gray image, black smoke characteristics are obtained, and the black smoke characteristics are described through a characteristic drawing operator;
s3.3, carrying out region-based image segmentation on the input automobile black smoke image by using a segmentation algorithm to enable the characteristics of the automobile black smoke in the segmentation result region to have similarity, and finally obtaining and marking the black smoke outline;
s3.4, training the convolutional neural network by utilizing the input image and the corresponding segmentation diagram thereof and adopting random gradient descent;
s3.5, using high momentum to determine updating in the current optimization step by using a large number of training samples until an optimal solution is obtained.
2. The deep learning-based vehicle soot detection and recognition method of claim 1, wherein: in the step S1.2, candidate regions are generated, namely: generating a series of anchor frames fixed in position on the picture according to a certain rule, and regarding the anchor frames as possible candidate areas; predicting whether the anchor frame contains a target vehicle, and if so, adjusting the amplitude of the prediction frame relative to the position of the anchor frame; the specific algorithm is as follows:
dividing the original picture into m×n regions; let the resolution of the original picture be (h, w) and the side length of each small square region be k, then m and n are respectively:
m = h/k, n = w/k
i.e. the original image can be divided into h/k rows and w/k columns of small square regions.
3. The deep learning-based vehicle soot detection and recognition method of claim 2, wherein: in the step S1.2, the algorithm for adjusting the position of the prediction frame relative to the anchor frame is as follows:
the algorithm generates a series of anchor frames at the center of each region; the anchor frame is fixed in position and generally cannot coincide exactly with the vehicle bounding box, so a prediction frame is generated by fine-tuning the position on the basis of the anchor frame; the prediction frame has a different center and size from the anchor frame; for example, on the basis of the candidate regions, consider an anchor frame generated at the center of the small square region in row m_i (i ∈ {1, 2, ..., m}) and column n_j (j ∈ {1, 2, ..., n}); taking the side length of the small square as the unit length, the coordinates of the upper-left corner of the small square region are:
c_x = n_j
c_y = m_i
The center coordinates (center_x, center_y) of this anchor frame are:
center_x = c_x + 0.5
center_y = c_y + 0.5
The center coordinates (b_x, b_y) of the prediction frame can then be generated in the following manner:
b_x = c_x + σ(t_x)
b_y = c_y + σ(t_y)
where t_x and t_y are real numbers and σ(x) is the Sigmoid function, defined as:
σ(x) = 1 / (1 + exp(−x))
Since the value of the Sigmoid function lies between 0 and 1, the center point of the prediction frame computed from the above formulas always falls inside the small region in row m_i and column n_j;
when t_x = t_y = 0, b_x = c_x + 0.5 and b_y = c_y + 0.5, i.e. the center of the prediction frame coincides with the center of the anchor frame, both being the center of the small region; the size of the anchor frame is preset and can be regarded as a hyperparameter of the model; the anchor frame size is set as follows: the anchor frame height is p_h and the anchor frame width is p_w; the prediction frame size is: the prediction frame height is b_h and the prediction frame width is b_w;
The size of the prediction frame is generated by the following formulas:
b_h = p_h · exp(t_h)
b_w = p_w · exp(t_w)
when t_x = t_y = 0 and t_h = t_w = 0, the prediction box and the anchor box coincide.
4. The deep learning-based vehicle soot detection and recognition method of claim 1, wherein: in the step S1.4, the method for establishing the loss function includes:
firstly, extracting picture features by using a convolutional neural network and predicting the positions and the categories of candidate areas;
then, each prediction frame is regarded as a sample, and label values are obtained according to the positions and the categories of the real frames relative to the prediction frames;
the position and the category of the prediction frame are predicted through the convolutional neural network model, and the convolutional neural network prediction value and the label value are compared, so that a loss function can be established.
5. The deep learning-based vehicle soot detection and recognition method of claim 1, wherein: in the step S3.2, the black smoke features mainly comprise shape, brightness and texture; these features may be obtained from a set of training sample images.
6. The deep learning-based vehicle soot detection and recognition method of claim 1, wherein: in the step S3.3, the region-based image segmentation process is a process of marking pixels in an image as different marks, and the connected pixels with the same marks form a segmented region;
the image can be segmented by applying a classification or clustering algorithm in the pattern recognition field to the pixel points, and the image segmentation can be regarded as a transformation of the image marked with the region obtained from the original digital image;
wherein the segmentation algorithm may employ the feature operator in step S3.2 to perform segmentation;
a marker-based region growing algorithm and a marker-based watershed algorithm are also possible;
a region-division merging method may also be employed.
7. The deep learning-based vehicle soot detection and recognition method of claim 1, wherein: in the step S3.4, when the convolutional neural network is trained by adopting random gradient descent, since no padding is performed in the training process, the output image is smaller than the input image by a constant boundary width; large input blocks are used instead of large batch size to reduce the batch to a single image to minimize overhead and maximize memory utilization.
8. The deep learning-based vehicle soot detection and recognition method of claim 1, wherein: in the step S3.5, the specific algorithm for updating the current optimization step using the high momentum includes:
firstly, an energy function is established by combining a pixel-wise soft-max over the final feature map with a cross-entropy loss function; the soft-max is defined as:
p_k(x) = exp(a_k(x)) / Σ_{k'=1}^{K} exp(a_{k'}(x))
where a_k(x) denotes the activation in feature channel k at pixel position x ∈ Ω; K is the number of classes and p_k(x) is the approximated maximum function, i.e. p_k(x) ≈ 1 for the k with maximal activation a_k(x) and p_k(x) ≈ 0 for all other k; the cross entropy
E = Σ_{x∈Ω} w(x) log(p_{l(x)}(x))
then penalizes, at each position, the deviation of p_{l(x)}(x) from 1;
where E denotes the energy value, l: Ω → {1, ..., K} is the true label of each pixel, and w: Ω → R is a weight map we introduce to give some pixels more importance during the training phase; w(x) denotes the mapping weight;
pre-computing a weight map for each ground-truth segmentation to compensate for the different pixel frequencies of certain classes in the training data, and to force the convolutional neural network to learn the small separation borders that we introduce around black smoke regions;
calculating the separation borders using morphological operations; the mapping weight is then:
w(x) = w_c(x) + w_0 · exp(−(d_1(x) + d_2(x))² / (2σ²))
where w_c: Ω → R is the weight map used to balance the class frequencies, d_1: Ω → R denotes the distance to the border of the nearest black smoke region, d_2: Ω → R the distance to the border of the second nearest black smoke region, and w_0 denotes the initial weight;
adjusting the initial weights such that each feature map in the convolutional neural network has approximately unit variance, by drawing the initial weights from a Gaussian distribution with standard deviation √(2/N), where N denotes the number of incoming nodes of one neuron.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311151301.XA CN117315596A (en) | 2023-09-07 | 2023-09-07 | Deep learning-based motor vehicle black smoke detection and identification method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117315596A true CN117315596A (en) | 2023-12-29 |
Family
ID=89280263
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117315596A (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109191495A (en) * | 2018-07-17 | 2019-01-11 | 东南大学 | Black smoke vehicle detection method based on self-organizing background subtraction model and multiple features fusion |
CN110059554A (en) * | 2019-03-13 | 2019-07-26 | 重庆邮电大学 | A kind of multiple branch circuit object detection method based on traffic scene |
CN110660222A (en) * | 2019-11-01 | 2020-01-07 | 河北工业大学 | Intelligent environment-friendly electronic snapshot system for black smoke vehicle on road |
CN111539343A (en) * | 2020-04-26 | 2020-08-14 | 安徽皖仪科技股份有限公司 | Black smoke vehicle detection method based on convolution attention network |
CN112289022A (en) * | 2020-09-29 | 2021-01-29 | 西安电子科技大学 | Black smoke vehicle detection judgment and system based on space-time background comparison |
CN115082834A (en) * | 2022-07-20 | 2022-09-20 | 成都考拉悠然科技有限公司 | Engineering vehicle black smoke emission monitoring method and system based on deep learning |
CN116071560A (en) * | 2022-11-21 | 2023-05-05 | 上海师范大学 | Fruit identification method based on convolutional neural network |
Non-Patent Citations (1)
Title |
---|
OLAF RONNEBERGER, PHILIPP FISCHER, AND THOMAS BROX: ""U-Net: Convolutional Networks for Biomedical Image Segmentation"", 《MEDICAL IMAGE COMPUTING AND COMPUTER-ASSISTED INTERVENTION – MICCAI 2015》, 18 November 2015 (2015-11-18), pages 1 - 8 * |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |