WO2022217630A1 - Vehicle speed determination method and apparatus, device, and medium - Google Patents


Info

Publication number
WO2022217630A1
Authority
WO
WIPO (PCT)
Prior art keywords
detection area
vehicle
area
image
prediction
Prior art date
Application number
PCT/CN2021/088536
Other languages
French (fr)
Chinese (zh)
Inventor
薛春芳
刘鹏
***
肖传欣
胡瑞通
Original Assignee
华北电力大学扬中智能电气研究中心 (Yangzhong Intelligent Electrical Research Center, North China Electric Power University)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华北电力大学扬中智能电气研究中心 (Yangzhong Intelligent Electrical Research Center, North China Electric Power University)
Publication of WO2022217630A1 publication Critical patent/WO2022217630A1/en

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/20: Image preprocessing
    • G06V10/25: Determination of region of interest [ROI] or a volume of interest [VOI]
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/22: Matching criteria, e.g. proximity measures
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning

Definitions

  • the present invention relates to the technical field of image processing, and in particular, to a vehicle speed determination method, apparatus, device, and medium.
  • the speed measurement technologies in the prior art include radar speed measurement, laser speed measurement, ground-sensing coil speed measurement, and the like.
  • radar speed measurement is based on the principle of the Doppler effect: by transmitting radar waves to the vehicle and receiving the radar waves reflected back from the vehicle, the vehicle speed is determined according to the frequency of the transmitted radar waves and the frequency of the received reflected radar waves.
  • however, radar speed measurement can only be used in mobile and close-range scenarios; it cannot be used for long-distance speed measurement, such as measuring the speed of vehicles on highways, and it requires radar equipment to be installed, which makes determining the vehicle speed costly.
  • Laser speed measurement is based on the principle of laser ranging.
  • the range finder emits two lasers with a set time interval to the vehicle and receives the returned laser to determine the moving distance of the vehicle within the set time interval, thereby determining the speed of the vehicle.
  • however, the laser speed measurement method places strict requirements on the measurement deviation angle, resulting in a low success rate in determining the vehicle speed, and the rangefinder can only be used while stationary.
  • ground-sensing coil speed measurement is based on the principle of electromagnetic induction: when a vehicle passes over the coil area, the magnetic flux of the coil changes and a signal is triggered, indicating that a vehicle has passed. The speed can be worked out from two coils a known distance apart, combined with the time the vehicle takes to pass between the two coils.
  • however, installing the ground-sensing coils requires a large amount of construction, the coils are easily damaged, and subsequent maintenance is difficult.
  • the position of the moving vehicle in the image can also be determined by the frame difference method, and the vehicles in every two adjacent frames are matched to detect the position of the same vehicle in different frames.
  • the actual distance traveled by the vehicle is then determined from the pixel distance, and the speed of the vehicle is determined according to the actual distance and the time difference between the two frames.
  • the difference image is obtained by subtracting the corresponding pixel values of adjacent frame images, and the difference image is then binarized.
  • if the pixel difference value corresponding to a pixel point is less than a predetermined threshold, it is determined to be a background pixel point; if it is not less than the predetermined threshold, it is determined to belong to the moving vehicle in the image. Since this method can accurately determine the position of the vehicle in the image only when the ambient brightness changes little, the detected location of the vehicle is affected by the illumination and the picture background, which leads to an inaccurate vehicle position in the image and, as a result, low accuracy of the determined vehicle speed.
  • the existing visual detection technology can only determine the vehicle speed when only a single vehicle is included in the image, but cannot determine the vehicle speed when multiple vehicles are included in the image.
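As an illustration of the frame-difference binarization just described, a minimal NumPy sketch follows; the threshold value of 30 and the toy 4x4 frames are illustrative assumptions, not values from this publication:

```python
import numpy as np

def frame_difference_mask(prev_frame, curr_frame, threshold=30):
    """Binarize the absolute difference of two grayscale frames.

    Pixels whose difference is below `threshold` are treated as
    background (0); the remaining pixels are marked as belonging to
    the moving object (255). Threshold and frames are illustrative.
    """
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    return np.where(diff < threshold, 0, 255).astype(np.uint8)

# Toy frames: a bright 2x2 block moves one pixel to the right.
prev = np.zeros((4, 4), dtype=np.uint8)
prev[1:3, 0:2] = 200
curr = np.zeros((4, 4), dtype=np.uint8)
curr[1:3, 1:3] = 200

mask = frame_difference_mask(prev, curr)
# Only the leading and trailing edges of the block change, so the
# mask highlights the columns the block entered and left.
```

As the passage above notes, such a mask is only reliable when ambient brightness is stable; a global illumination change flips many background pixels past the threshold.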
  • the present invention provides a vehicle speed determination method, device, device and medium to solve the problem of how to improve the accuracy of the determined vehicle speed when a plurality of vehicles are included in an image.
  • the present invention provides a method for determining vehicle speed, the method comprising:
  • according to the first detection area in the current frame image, predicting the first prediction area of the vehicle in the next frame image; according to the similarity between the first prediction area and the second candidate detection area, determining a target second detection area that matches the first prediction area;
  • the speed of the vehicle is determined according to the first detection area, the target second detection area, and the time difference between two adjacent frames of images.
  • determining the speed of the vehicle according to the first detection area, the target second detection area, and the time difference between two adjacent frames of images includes:
  • the pixel distance between the first detection area and the target second detection area is determined according to the first coordinates of the vehicle preset position in the first detection area and the second coordinates of the vehicle preset position in the target second detection area, and the actual distance moved by the vehicle is determined from the pixel distance;
  • the speed of the vehicle is determined according to the actual distance and the time difference between two adjacent frames of images.
  • predicting the first prediction area of the vehicle in the next frame image includes:
  • predicting, based on a standard Kalman filter with a constant-velocity model and a linear observation model, the vehicle corresponding to the first detection area, and determining the first prediction area of the first detection area in the next frame of image.
  • the determining, according to the similarity between the first prediction region and the second candidate detection region, the target second detection region matching the first prediction region includes:
  • according to each first pixel point in the first prediction area and each corresponding second pixel point in the second candidate detection area, determining the Mahalanobis distance sum value and the cosine distance sum value between the first prediction area and the second candidate detection area;
  • according to the Mahalanobis distance sum value, the cosine distance sum value, and the corresponding preset weights, determining the distance weight sum value of the first prediction area and the second candidate detection area;
  • if the distance weight sum value is less than a preset threshold, the second candidate detection area is determined to be the target second detection area.
  • the training process of the deep learning model includes:
  • for any sample image in the sample set, obtaining the sample image and first label information corresponding to the sample image, wherein the first label information identifies a set range area that includes a vehicle in the sample image;
  • inputting the sample image into the original deep learning model to obtain second label information of the output; and adjusting the parameter values of the parameters of the original deep learning model according to the first label information and the second label information, to obtain the trained deep learning model.
  • the present invention provides a vehicle speed determination device, the device comprising:
  • an identification module, configured to identify, based on the pre-trained deep learning model, the first detection area of the vehicle in the current frame image and the second candidate detection area of the vehicle in the next frame image adjacent to the current frame image;
  • a matching module, configured to predict the first prediction area of the vehicle in the next frame image according to the first detection area in the current frame image, and to determine, according to the similarity between the first prediction area and the second candidate detection area, the target second detection area that matches the first prediction area;
  • a determination module configured to determine the speed of the vehicle according to the first detection area, the target second detection area and the time difference between two adjacent frames of images.
  • the determining module is specifically configured to determine the pixel distance between the first detection area and the target second detection area according to the first coordinates of the vehicle preset position in the first detection area and the second coordinates of the vehicle preset position in the target second detection area, determine the actual distance moved by the vehicle from the pixel distance, and determine the speed of the vehicle according to the actual distance and the time difference between two adjacent frames of images.
  • the matching module is specifically configured to predict the vehicle corresponding to the first detection area based on the standard Kalman filter of the constant-velocity model and the linear observation model, and determine the first prediction area of the first detection area in the next frame of image.
  • the matching module is further configured to: determine, according to each first pixel point in the first prediction area and each corresponding second pixel point in the second candidate detection area, the Mahalanobis distance sum value and the cosine distance sum value between the first prediction area and the second candidate detection area; determine, according to the Mahalanobis distance sum value, the cosine distance sum value, and the corresponding preset weights, the distance weight sum value between the first prediction area and the second candidate detection area; and, if the distance weight sum value is less than the preset threshold, determine the second candidate detection area as the target second detection area.
  • the device also includes:
  • a training module, specifically used for: for any sample image in the sample set, obtaining the sample image and the first label information corresponding to the sample image, wherein the first label information identifies the set range area of the vehicle contained in the sample image; inputting the sample image into the original deep learning model to obtain the second label information output for the sample image; and adjusting the parameter values of each parameter of the original deep learning model according to the first label information and the second label information, to obtain the trained deep learning model.
  • the present invention provides an electronic device, comprising: a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with each other through the communication bus;
  • a computer program is stored in the memory, and when the program is executed by the processor, the processor implements the steps of any one of the above-mentioned vehicle speed determination methods.
  • the present invention provides a computer-readable storage medium, which stores a computer program, and when the computer program is executed by a processor, implements the steps of any one of the above-mentioned vehicle speed determination methods.
  • the present invention provides a vehicle speed determination method, apparatus, device, and medium.
  • based on a pre-trained deep learning model, the first detection area of a vehicle in the current frame image and the second candidate detection area of the vehicle in the next frame image adjacent to the current frame image are identified.
  • since the deep learning model identifies the detection area containing the vehicle, the influence of illumination and the picture background on pixel values will not reduce the accuracy of the determined detection area containing the vehicle.
  • FIG. 1 is a schematic process diagram of a method for determining vehicle speed according to an embodiment of the present invention
  • FIG. 2 is a schematic diagram of the similar triangles formed by the lines connecting the vehicle in the image and the actual vehicle, respectively, with the image acquisition device, according to an embodiment of the present invention
  • FIG. 3 is a schematic diagram of a YOLOv3 backbone network Darknet-53 according to an embodiment of the present invention
  • FIG. 4 is an overall structural diagram of a YOLOv3 provided by an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of a basic component DBL of YOLOv3 provided by an embodiment of the present invention.
  • FIG. 6 is a schematic diagram of a basic component Res_unit of YOLOv3 provided by an embodiment of the present invention.
  • FIG. 7 is a schematic diagram of a basic component Resblock_body of YOLOv3 provided by an embodiment of the present invention.
  • FIG. 8 is a schematic structural diagram of a vehicle speed determination device according to an embodiment of the present invention.
  • FIG. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
  • embodiments of the present invention provide a vehicle speed determination method, apparatus, device and medium.
  • FIG. 1 is a schematic process diagram of a vehicle speed determination method provided by an embodiment of the present invention, and the process includes the following steps:
  • S101 Based on a pre-trained deep learning model, identify a first detection area of the vehicle in the current frame image and a second candidate detection area of the vehicle in the next frame image adjacent to the current frame image.
  • a vehicle speed determination method provided by an embodiment of the present invention is applied to an electronic device.
  • the electronic device may be a smart terminal device such as a PC, a tablet computer, or a smart phone, or it may be a server, where the server may be a local server or a cloud server.
  • the electronic device determines the speed of the vehicle in the video frame image according to the video frame images in the video collected by the image collection device, wherein the image collection device is a device that collects images of the moving vehicle, such as a surveillance camera, a video camera, etc.
  • the current frame image in the video collected by the image acquisition device may contain one vehicle or multiple vehicles; in order to determine the speed of the vehicle in the current frame image, the detection area of the vehicle in the current frame image needs to be identified.
  • the electronic device saves the trained deep learning model, and inputs the current frame image and the adjacent next frame image into the pre-trained deep learning model; the deep learning model processes the two images and identifies the first detection area of the vehicle in the current frame image and the second candidate detection area of the vehicle in the next frame image.
  • the deep learning model may be a convolutional neural network model (Convolutional neural networks, CNN), or a target detection network model (YOLOv3), or a deep residual network (Deep residual networks, DRN).
  • S102 Predict the first prediction area of the vehicle in the next frame image according to the first detection area in the current frame image; determine, according to the similarity between the first prediction area and the second candidate detection area, the target second detection area that matches the first prediction area.
  • the electronic device further determines, for any vehicle in the current frame image, according to the identified first detection area of the vehicle in the current frame image, the target second detection area corresponding to that vehicle in the next frame image.
  • the method of predicting the first prediction area of the first detection area of the vehicle in the next frame image belongs to the prior art, which will not be repeated in this embodiment of the present invention.
  • the similarity between the first prediction region and the second candidate detection region is determined according to the first prediction region and the second candidate detection region of the next frame of image. For example, the similarity can be determined by calculating the Mahalanobis distance between the first prediction region and the second candidate detection region. The smaller the Mahalanobis distance, the higher the similarity between the first prediction region and the second candidate detection region.
  • a target second detection area matching the first prediction area is determined.
  • the second candidate detection area with the highest similarity and greater than a preset threshold may be determined as the target second detection area.
  • S103 Determine the speed of the vehicle according to the first detection area, the target second detection area, and the time difference between two adjacent frames of images.
  • since the coordinate systems of the current frame image and the next frame image are the same, the actual distance that the vehicle in the first detection area moves within the time of two adjacent frames is determined according to the first detection area in the current frame image and the target second detection area in the next frame image, and the speed of the vehicle corresponding to the first detection area is determined according to the actual distance and the time difference between the two adjacent frames of images.
  • in the embodiment of the present invention, based on a pre-trained deep learning model, the first detection area of the vehicle in the current frame image and the second candidate detection area of the vehicle in the next frame image adjacent to the current frame image are identified; since the deep learning model identifies the detection area containing the vehicle, the influence of illumination and the picture background on pixel values will not reduce the accuracy of the determined detection area containing the vehicle.
  • according to the first detection area in the current frame image, the first prediction area of the vehicle in the next frame of image is predicted; the target second detection area matching the first prediction area is determined according to the similarity between the first prediction area and the second candidate detection area; and the speed of the vehicle is determined according to the first detection area, the target second detection area, and the time difference between two adjacent frames of images.
  • thus, even when multiple vehicles are included in the image, the first detection area of each vehicle and its corresponding target second detection area are determined, so the speed of each vehicle can be determined accurately.
  • determining the speed of the vehicle based on the first detection area, the target second detection area, and the time difference between two adjacent frames of images includes:
  • the pixel distance between the first detection area and the target second detection area is determined according to the first coordinates of the vehicle preset position in the first detection area and the second coordinates of the vehicle preset position in the target second detection area, and the actual distance moved by the vehicle is determined from the pixel distance;
  • the speed of the vehicle is determined according to the actual distance and the time difference between two adjacent frames of images.
  • the actual distance moved by the vehicle corresponding to the first detection area within the time difference between the two adjacent frames of images is determined.
  • according to the first coordinate of the preset position of the vehicle in the first detection area and the second coordinate of the preset position of the vehicle in the target second detection area, the distance between the first coordinate and the second coordinate is determined, and this distance is used as the pixel distance between the first detection area and the target second detection area.
  • the preset position of the vehicle may be the center position of the first detection area, or the position of any other point in the first detection area; since the area ranges of the first detection area and the target second detection area are the same, the preset vehicle positions in the two areas are the coordinate points corresponding to the same row number and the same column number in the first detection area and the target second detection area, respectively.
  • the preset width is the actual width of the vehicle, for example, the preset width is 3 meters, 3.5 meters, etc.
  • the ratio is the ratio of the pixel width to the actual width.
  • FIG. 2 is a schematic diagram, provided by an embodiment of the present invention, of the similar triangles formed by the lines connecting the vehicle in the image and the actual vehicle, respectively, with the image acquisition device.
  • the connecting line AB in the figure represents an actual vehicle.
  • the connecting line CD in the figure represents the vehicle in the image.
  • the connecting lines of AB and CD and the focus O of the image acquisition device respectively form a triangle OAB and a triangle OCD, and the triangle OAB and the triangle OCD are similar triangles.
  • the ratio of the focal length of the image acquisition device to the distance between the vehicle and the image acquisition device is equivalent to the ratio of the pixel width to the actual width; because the distance between the vehicle and the image acquisition device is large, the effect of the vehicle's movement within the lane on this distance can be ignored, so the ratio of the focal length to the vehicle-to-device distance can be treated as a fixed value, that is, the ratio of the pixel width to the actual width can be taken as the ratio of the pixel distance to the actual distance.
  • the quotient of the pixel distance divided by the ratio is determined; this quotient is the actual distance that the vehicle corresponding to the first detection area moves in the time difference between two adjacent frames of images.
  • according to the actual distance moved by the vehicle in the first detection area and the time difference between two adjacent frames of images, the quotient of the actual distance divided by the time difference is determined; this quotient is the speed of the vehicle corresponding to the first detection area.
  • for example, suppose the center point position of the first detection area of the vehicle in the first frame is (x1, y1), and the pixel height and pixel width of the first detection area are h1 and w1 respectively; the center point position of the target second detection area of the vehicle in the second frame is (x2, y2). The pixel distance D_p of the vehicle is: D_p = sqrt((x2 - x1)^2 + (y2 - y1)^2).
  • the ratio p of the pixel width to the preset width W is determined as p = w1 / W, and the actual distance D that the vehicle moves in the time difference between two adjacent frames of images is D = D_p / p.
  • the time difference t between two adjacent frames of images is determined; for example, with 25 video frames processed per second, t = 1/25 s = 0.04 s. From the vehicle speed formula v = D / t, the speed of the vehicle is determined.
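The computation walked through above can be sketched as follows; the 3-metre preset width and the 25 fps frame rate are illustrative values consistent with the text, and the function name and inputs are hypothetical:

```python
import math

def vehicle_speed(c1, c2, w1_pixels, preset_width_m=3.0, fps=25.0):
    """Speed from the matched detection areas of two adjacent frames.

    c1, c2: center (x, y) of the detection area in each frame.
    w1_pixels: pixel width of the first detection area.
    preset_width_m: assumed actual vehicle width (illustrative).
    """
    # Pixel distance D_p between the two center points.
    d_p = math.hypot(c2[0] - c1[0], c2[1] - c1[1])
    # Ratio p of the pixel width to the preset (actual) width.
    p = w1_pixels / preset_width_m
    # Actual distance D moved within one inter-frame interval.
    d = d_p / p
    # Time difference t between adjacent frames.
    t = 1.0 / fps
    return d / t  # metres per second

# A vehicle 150 px wide whose center moves 30 px between frames:
# 30 px / (150 px / 3 m) = 0.6 m moved in 0.04 s, i.e. 15 m/s.
speed = vehicle_speed((100, 200), (130, 200), w1_pixels=150)
```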
  • predicting the first prediction area of the vehicle in the next frame of image includes:
  • predicting, based on a standard Kalman filter with a constant-velocity model and a linear observation model, the vehicle corresponding to the first detection area, and determining the first prediction area of the first detection area in the next frame of image.
  • in order to predict the first prediction area of the vehicle in the next frame of image, in the embodiment of the present invention, the vehicle corresponding to the first detection area is predicted according to the existing standard Kalman filter based on the constant-velocity model and the linear observation model, together with the center point coordinates (x, y), pixel width w, and pixel height h of the first detection area, and the first prediction area of the first detection area in the next frame image adjacent to the current frame image is determined.
  • the pixel width and pixel height of the first prediction area are the same as the pixel width and pixel height of the first detection area, and only the center point coordinates of the first prediction area and the center point coordinates of the first detection area are different.
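The constant-velocity prediction just described can be sketched as a standard Kalman predict step; the state layout, process-noise value, and the numbers below are illustrative assumptions, not the filter parameters of this publication (width and height are simply carried over unchanged, as stated above):

```python
import numpy as np

def kalman_predict(state, P, dt=1.0, q=1e-2):
    """One predict step of a constant-velocity Kalman filter.

    state: [cx, cy, vx, vy] for the detection-area center point.
    P: 4x4 state covariance. dt and q are illustrative values.
    """
    F = np.array([[1, 0, dt, 0],
                  [0, 1, 0, dt],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]], dtype=float)
    Q = q * np.eye(4)            # process noise (assumed isotropic)
    state = F @ state            # project the state ahead
    P = F @ P @ F.T + Q          # project the covariance ahead
    return state, P

# Center at (100, 50) moving 4 px/frame right and 2 px/frame down:
s0 = np.array([100.0, 50.0, 4.0, 2.0])
P0 = np.eye(4)
s1, P1 = kalman_predict(s0, P0)
# Predicted center for the next frame: (104.0, 52.0)
```

In a full tracker the subsequent measurement update would fold the next frame's detection back into the state; only the predict half is needed to obtain the first prediction area.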
  • determining the target second detection area that matches the first prediction area according to the similarity between the first prediction area and the second candidate detection area includes:
  • according to each first pixel point in the first prediction area and each corresponding second pixel point in the second candidate detection area, determining the Mahalanobis distance sum value and the cosine distance sum value between the first prediction area and the second candidate detection area;
  • according to the Mahalanobis distance sum value, the cosine distance sum value, and the corresponding preset weights, determining the distance weight sum value of the first prediction area and the second candidate detection area;
  • if the distance weight sum value is less than the preset threshold, the second candidate detection area is determined to be the target second detection area.
  • when the existing target tracking algorithm determines the target second detection area that matches the first prediction area, the intersection over union (IOU) between the first prediction area and the second candidate detection area is used as the metric for judging a match, so as to determine the areas of the same vehicle in the current frame image and the next frame image and thereby realize tracking of the vehicle.
  • in the embodiment of the present invention, a multi-target tracking algorithm (Deep Sort) is used to determine the target second detection area that matches the first prediction area.
  • the Mahalanobis distance and the cosine distance between each first pixel point and its corresponding second pixel point are determined.
  • for each first pixel point in the first prediction area, the pixel point at the same row number and column number in the second candidate detection area can be determined; this is the second pixel point corresponding to that first pixel point in the second candidate detection area.
  • according to the Mahalanobis distances between each first pixel in the first prediction area and the corresponding second pixels in the second candidate detection area, the Mahalanobis distance sum value between the first prediction area and the second candidate detection area is determined;
  • according to the cosine distances between each first pixel in the first prediction area and the corresponding second pixels in the second candidate detection area, the cosine distance sum value between the first prediction area and the second candidate detection area is determined.
  • based on the Mahalanobis distance sum value, the cosine distance sum value, and their corresponding preset weights, the Mahalanobis distance sum value is multiplied by its weight to obtain a first product value, and the cosine distance sum value is multiplied by its weight to obtain a second product value; the sum of the first product value and the second product value is the weighted distance sum value of the first prediction area and the second candidate detection area.
  • the weighted distance sum value is an index for evaluating the similarity between the first prediction area and the second candidate detection area: the smaller the weighted distance sum value, the more similar the two areas are, the higher their degree of coincidence, and the more accurate the predicted first prediction area.
  • a preset threshold value for judging whether the areas match is also pre-stored.
  • the preset threshold is set in advance; to improve the accuracy of area matching, the preset threshold can be set smaller, and to improve the robustness of area matching, it can be set larger.
  • a second candidate detection area whose weighted distance sum is smaller than the preset threshold is determined as the target second detection area.
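The weighted-distance gating described above can be sketched as follows, assuming the per-candidate Mahalanobis and cosine distance sums have already been computed; the equal 0.5 weights and the threshold of 1.0 are illustrative presets, not values from this publication:

```python
def match_candidates(maha_sums, cos_sums, w_maha=0.5, w_cos=0.5, threshold=1.0):
    """Select candidate areas whose weighted distance sum beats the threshold.

    maha_sums / cos_sums: per-candidate Mahalanobis and cosine distance
    sums against the prediction area (assumed precomputed elsewhere).
    Returns (index, weighted distance) pairs, most similar first.
    """
    matches = []
    for i, (dm, dc) in enumerate(zip(maha_sums, cos_sums)):
        # First product value + second product value = weighted distance sum.
        weighted = w_maha * dm + w_cos * dc
        if weighted < threshold:
            matches.append((i, weighted))
    # Smaller weighted distance means higher similarity.
    return sorted(matches, key=lambda pair: pair[1])

# Three candidate areas; only the second (index 1) passes the threshold:
result = match_candidates([2.0, 0.4, 1.6], [1.8, 0.6, 1.2])
```

Lowering the threshold tightens the gate (fewer, more accurate matches); raising it admits more candidates, trading accuracy for robustness, exactly as the passage above describes.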
  • the training process of the deep learning model includes:
  • for any sample image in the sample set, the sample image and the first label information corresponding to the sample image are obtained, wherein the first label information identifies a set range area that includes a vehicle in the sample image;
  • the sample image is input into the original deep learning model to obtain second label information of the output, and the parameter values of the parameters of the original deep learning model are adjusted according to the first label information and the second label information, to obtain the trained deep learning model.
  • the sample images in the sample set include vehicle images of various vehicles, and the first label information of the sample images in the sample set is manually pre-labeled, wherein the first label information is used to identify the set range area of the vehicle included in the sample image.
  • the sample image is input into the original deep learning model, and the original deep learning model outputs the second label information of the sample image.
  • the second label information identifies the set range area containing a vehicle in the sample image as identified by the original deep learning model.
  • the original deep learning model is trained according to the second label information and the first label information of the sample image, so as to adjust the parameter value of each parameter of the original deep learning model.
  • the above operations are performed on each sample image included in the sample set for training the original deep learning model, and when a preset condition is met, a trained deep learning model is obtained.
  • the preset condition may be that the number of sample images whose first label information is consistent with the second label information obtained after training on the sample set is greater than a set number, or that the number of training iterations reaches a set maximum number of iterations, etc.; this application does not specifically limit it.
  • the sample images in the sample set can be divided into training sample images and test sample images; the original deep learning model is first trained based on the training sample images, and the reliability of the trained deep learning model is then tested based on the test sample images.
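The per-sample training loop and stopping conditions described above can be sketched as follows. This is a minimal illustration, not the application's actual implementation: `model_predict` and `train_step` are hypothetical callables standing in for the deep learning model's forward pass and parameter adjustment, and the consistency check is simplified to label equality.

```python
def train(sample_set, model_predict, train_step, set_number, max_iterations):
    """sample_set: list of (image, first_label) pairs.
    Stop when the number of samples whose predicted (second) label matches
    the manual (first) label exceeds set_number, or after max_iterations."""
    for iteration in range(1, max_iterations + 1):
        consistent = 0
        for image, first_label in sample_set:
            second_label = model_predict(image)   # model's predicted area
            if second_label == first_label:
                consistent += 1
            else:
                train_step(image, first_label)    # adjust model parameters
        if consistent > set_number:               # preset condition met
            return iteration, consistent
    return max_iterations, consistent             # iteration cap reached
```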
  • the backbone network of YOLOv3 is Darknet-53, a deep learning framework consisting of 52 convolutional layers followed by a final fully connected layer.
  • 1x1 and 3x3 convolution kernels are used, which greatly reduces the number of parameters and the amount of computation during model inference.
  • YOLOv3 is an improvement on the basis of YOLOv2.
  • five residual blocks (Residual) are added to the backbone network structure; that is, the identity-mapping principle of the residual network ResNet is used so that a deep network can achieve the same performance as a shallow network, avoiding the gradient explosion caused by an overly deep network.
  • YOLOv3 uses Predictions Across Scales, which draws on the principle of feature pyramid networks (FPN) and uses multiple scales to detect targets of different sizes.
  • YOLOv3 provides 3 different sizes of bounding boxes; that is, for each target to be predicted, 3 prediction boxes of different sizes are obtained, and the probability of each prediction box is then calculated to select the best-matching result.
  • the system uses this idea to extract features of different sizes to form a pyramid network.
  • the last convolutional layer of the YOLOv3 network predicts a 3D tensor encoding the bounding box, objectness and class predictions.
  • the tensor obtained under the COCO dataset is N*N*[3*(4+1+80)]: 4 bounding box offsets, 1 objectness prediction and 80 class predictions.
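The tensor depth quoted above can be checked with a little arithmetic. As a side note (an inference, not something the application states), the 75-channel outputs described for the yolo layers in FIG. 4 are consistent with the same formula under a 20-class setup:

```python
def yolo_depth(num_classes, num_anchors=3):
    # per grid cell: num_anchors boxes, each with 4 box offsets,
    # 1 objectness prediction, and num_classes class predictions
    return num_anchors * (4 + 1 + num_classes)

depth = yolo_depth(80)          # COCO: 3 * (4 + 1 + 80) = 255
grid = 13                       # smallest-scale grid in FIG. 4
tensor_elems = grid * grid * depth
```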
  • FIG. 3 is a schematic diagram of a YOLOv3 backbone network Darknet-53 provided by an embodiment of the present invention.
  • in FIG. 3, Convolutional denotes a convolutional layer, Residual denotes a residual block, Type denotes the type of network layer, Filters denotes the number of convolution kernels in the convolutional layer, Size denotes the kernel size, Output denotes the output dimensions, Avgpool denotes the average pooling layer, Connected denotes the fully connected layer, Softmax denotes the function used for numerical normalization, and Global denotes the global pooling layer.
  • Figure 4 is an overall structural diagram of a YOLOv3 provided by an embodiment of the present invention.
  • the first row is the smallest-scale yolo layer; a 13*13 feature map with 1024 channels is input.
  • after a series of convolutions, the size of the feature map remains the same but the number of channels is finally reduced to 75; the final output is a 13*13 feature map with 75 channels, on which classification and position regression are then performed.
  • the second row is the mesoscale yolo layer.
  • the 13*13, 512-channel feature map from layer 79 is convolved to generate a 13*13, 256-channel feature map, which is then upsampled to generate a 26*26, 256-channel feature map; this is merged with the 26*26, 512-channel mesoscale feature map.
  • a series of convolution operations is then performed; the feature map size remains unchanged but the number of channels is finally reduced to 75, and a 26*26 feature map with 75 channels is output, on which classification and position regression are then performed.
  • the third row is the large-scale yolo layer.
  • the 26*26, 512-channel feature map from layer 91 is convolved to generate a 26*26, 128-channel feature map, which is then upsampled to generate a 52*52, 128-channel feature map; this is merged with the 52*52, 256-channel feature map from layer 36.
  • a series of convolution operations is then performed; the feature map size remains unchanged but the number of channels is finally reduced to 75, and a 52*52 feature map with 75 channels is output, on which classification and position regression are then performed.
  • Fig. 5 is a schematic diagram of a basic component DBL of YOLOv3 provided by an embodiment of the present invention.
  • DBL includes a convolutional layer, batch normalization (BN) and Leaky ReLU; BN and Leaky ReLU are inseparable from the convolutional layer, and together they form the smallest component, DBL.
  • FIG. 6 is a schematic diagram of a basic component Res_unit of YOLOv3 provided by an embodiment of the present invention. As shown in FIG. 6 , the Res_unit includes two basic components DBL and an add layer.
  • FIG. 7 is a schematic diagram of a basic component Resblock_body of YOLOv3 provided by an embodiment of the present invention. As shown in FIG. 7, the Resblock_body includes the basic components DBL, zero padding, and Res_unit.
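As a framework-free illustration of how the components in FIGS. 5-7 compose (scalar values stand in for tensors, and `conv` and `bn` are hypothetical callables standing in for the convolution and batch-normalization operations):

```python
def leaky_relu(x, slope=0.1):
    # the Leaky ReLU activation in the DBL component
    return x if x >= 0 else slope * x

def dbl(x, conv, bn):
    # DBL = convolution, then BN, then Leaky ReLU (FIG. 5)
    return leaky_relu(bn(conv(x)))

def res_unit(x, dbl1, dbl2):
    # Res_unit = two DBL components plus an add layer (FIG. 6);
    # the add layer realises the identity mapping: input + transformed input
    return x + dbl2(dbl1(x))
```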
  • FIG. 8 is a schematic structural diagram of a vehicle speed determination device provided by an embodiment of the present invention, and the device includes:
  • the identification module 801 is configured to identify, based on the pre-trained deep learning model, the first detection area of the vehicle in the current frame image and the second candidate detection area of the vehicle in the next frame image adjacent to the current frame image;
  • the matching module 802 is configured to predict, according to the first detection area in the current frame image, the first prediction area of the vehicle in the next frame image, and to determine, according to the similarity between the first prediction area and the second candidate detection area, a target second detection area that matches the first prediction area;
  • the determining module 803 is configured to determine the speed of the vehicle according to the first detection area, the target second detection area and the time difference between two adjacent frames of images.
  • the determining module is specifically configured to determine the pixel distance between the first detection area and the target second detection area according to the first coordinates of the vehicle's preset position in the first detection area and the second coordinates of the vehicle's preset position in the target second detection area, to determine the actual distance moved by the vehicle according to the ratio of the pixel width of the first detection area to a preset width and the pixel distance, and to determine the speed of the vehicle according to the actual distance and the time difference between two adjacent frames of images.
  • the matching module is specifically configured to predict the vehicle corresponding to the first detection area based on a standard Kalman filter with a constant velocity model and a linear observation model, and to determine the first prediction area of the first detection area in the next frame image.
  • the matching module is further configured to determine the Mahalanobis distance sum value and the cosine distance sum value between the first prediction area and the second candidate detection area according to each first pixel point in the first prediction area and each corresponding second pixel point in the second candidate detection area; to determine the distance weight sum value between the first prediction area and the second candidate detection area according to the Mahalanobis distance sum value, the cosine distance sum value and the corresponding preset weights; and, if the distance weight sum value is less than the preset threshold, to determine the second candidate detection area as the target second detection area.
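A minimal sketch of the weighted-distance decision described above. The weights and threshold here are illustrative values, not ones specified in this application, and the Mahalanobis and cosine distance sums are assumed to have been computed already from the pixel points of the two areas:

```python
def distance_weight_sum(mahalanobis_sum, cosine_sum,
                        w_mahalanobis=0.4, w_cosine=0.6):
    # weighted combination of the two distance sums with preset weights
    return w_mahalanobis * mahalanobis_sum + w_cosine * cosine_sum

def is_target_area(mahalanobis_sum, cosine_sum, threshold):
    # the candidate is accepted as the target second detection area
    # when the weighted sum falls below the preset threshold
    return distance_weight_sum(mahalanobis_sum, cosine_sum) < threshold
```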
  • the device also includes:
  • a training module, specifically configured to obtain, for any sample image in the sample set, the sample image and the first label information corresponding to the sample image, wherein the first label information identifies the set range area that includes a vehicle in the sample image; to input the sample image into the original deep learning model and obtain output second label information of the sample image; and to adjust the parameter values of the parameters of the original deep learning model according to the first label information and the second label information, obtaining the trained deep learning model.
  • FIG. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
  • an electronic device is also provided in an embodiment of the present invention, including a processor 901, a communication interface 902, a memory 903 and a communication bus 904, wherein the processor 901, the communication interface 902 and the memory 903 communicate with each other through the communication bus 904;
  • a computer program is stored in the memory 903, and when the program is executed by the processor 901, the processor 901 is caused to perform the following steps:
  • according to the first detection area in the current frame image, the first prediction area of the vehicle in the next frame image is predicted; according to the similarity between the first prediction area and the second candidate detection area, a target second detection area that matches the first prediction area is determined;
  • the speed of the vehicle is determined according to the first detection area, the target second detection area, and the time difference between two adjacent frames of images.
  • the processor 901 is specifically configured to determine the speed of the vehicle according to the first detection area, the target second detection area, and the time difference between two adjacent frames of images, including:
  • the pixel distance between the first detection area and the target second detection area is determined according to the first coordinates of the vehicle's preset position in the first detection area and the second coordinates of the vehicle's preset position in the target second detection area, wherein the first detection area and the target second detection area have the same area range; the actual distance moved by the vehicle is determined according to the ratio of the pixel width of the first detection area to a preset width and the pixel distance;
  • the speed of the vehicle is determined according to the actual distance and the time difference between two adjacent frames of images.
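The speed computation can be sketched as follows, under the assumption that the ratio of the preset real-world width to the detection area's pixel width converts pixel distance to actual distance; the application combines the ratio and the pixel distance but does not spell out the exact formula, so this scaling is an illustrative reading:

```python
import math

def vehicle_speed(first_xy, second_xy, box_pixel_width,
                  preset_width_m, frame_dt_s):
    """first_xy / second_xy: the vehicle's preset-position coordinates in
    the first detection area and the target second detection area."""
    (x1, y1), (x2, y2) = first_xy, second_xy
    pixel_distance = math.hypot(x2 - x1, y2 - y1)   # distance in pixels
    metres_per_pixel = preset_width_m / box_pixel_width
    actual_distance = pixel_distance * metres_per_pixel
    return actual_distance / frame_dt_s             # speed in m/s
```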
  • the processor 901 is further configured such that predicting, according to the first detection area in the current frame image, the first prediction area of the vehicle in the next frame image includes:
  • based on a standard Kalman filter with a constant velocity model and a linear observation model, the vehicle corresponding to the first detection area is predicted, and the first prediction area of the first detection area in the next frame image is determined.
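A minimal prediction step of a standard Kalman filter with a constant-velocity motion model, as referenced above. The state here is the detection-area centre plus its per-frame velocity; the covariance values are illustrative, and the process-noise term and the update step of the full filter are omitted for brevity:

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def kalman_predict(x, p, dt=1.0):
    """x: state (cx, cy, vx, vy); p: 4x4 state covariance."""
    # constant-velocity transition: position += velocity * dt
    f = [[1, 0, dt, 0],
         [0, 1, 0, dt],
         [0, 0, 1, 0],
         [0, 0, 0, 1]]
    x_col = [[v] for v in x]
    x_pred = [row[0] for row in matmul(f, x_col)]
    p_pred = matmul(matmul(f, p), transpose(f))  # F P F^T (noise omitted)
    return x_pred, p_pred
```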
  • the processor 901 is further configured such that determining, according to the similarity between the first prediction area and the second candidate detection area, a target second detection area that matches the first prediction area includes:
  • according to each first pixel point in the first prediction area and each corresponding second pixel point in the second candidate detection area, the Mahalanobis distance sum value and the cosine distance sum value between the first prediction area and the second candidate detection area are determined;
  • according to the Mahalanobis distance sum value, the cosine distance sum value and the corresponding preset weights, the distance weight sum value of the first prediction area and the second candidate detection area is determined;
  • if the distance weight sum value is less than the preset threshold, the second candidate detection area is determined as the target second detection area.
  • the processor 901 is further configured such that the training process of the deep learning model includes:
  • for any sample image in the sample set, obtaining the sample image and the first label information corresponding to the sample image, wherein the first label information identifies a set range area that includes a vehicle in the sample image;
  • inputting the sample image into the original deep learning model to obtain output second label information of the sample image; and adjusting the parameter values of the parameters of the original deep learning model according to the first label information and the second label information to obtain the trained deep learning model.
  • the communication bus mentioned in the above electronic device may be a peripheral component interconnect standard (Peripheral Component Interconnect, PCI) bus or an Extended Industry Standard Architecture (Extended Industry Standard Architecture, EISA) bus or the like.
  • the communication bus can be divided into an address bus, a data bus, a control bus, and the like. For ease of presentation, only one thick line is used in the figure, but it does not mean that there is only one bus or one type of bus.
  • the communication interface 902 is used for communication between the above-mentioned electronic device and other devices.
  • the memory may include random access memory (Random Access Memory, RAM), and may also include non-volatile memory (Non-Volatile Memory, NVM), such as at least one disk storage.
  • the memory may also be at least one storage device located remotely from the aforementioned processor.
  • the above-mentioned processor may be a general-purpose processor, including a central processing unit, a network processor (Network Processor, NP), etc.; it may also be a digital signal processor (Digital Signal Processing, DSP), an application-specific integrated circuit, a field programmable gate array or other programmable logic device, discrete gate or transistor logic device, discrete hardware component, etc.
  • an embodiment of the present invention further provides a computer-readable storage medium storing a computer program which, when executed by a processor, performs the following steps:
  • according to the first detection area in the current frame image, the first prediction area of the vehicle in the next frame image is predicted; according to the similarity between the first prediction area and the second candidate detection area, a target second detection area that matches the first prediction area is determined;
  • the speed of the vehicle is determined according to the first detection area, the target second detection area, and the time difference between two adjacent frames of images.
  • determining the speed of the vehicle according to the first detection area, the target second detection area, and the time difference between two adjacent frames of images includes:
  • the pixel distance between the first detection area and the target second detection area is determined according to the first coordinates of the vehicle's preset position in the first detection area and the second coordinates of the vehicle's preset position in the target second detection area, wherein the first detection area and the target second detection area have the same area range; the actual distance moved by the vehicle is determined according to the ratio of the pixel width of the first detection area to a preset width and the pixel distance;
  • the speed of the vehicle is determined according to the actual distance and the time difference between two adjacent frames of images.
  • predicting, according to the first detection area in the current frame image, the first prediction area of the vehicle in the next frame image includes:
  • based on a standard Kalman filter with a constant velocity model and a linear observation model, the vehicle corresponding to the first detection area is predicted, and the first prediction area of the first detection area in the next frame image is determined.
  • the determining, according to the similarity between the first prediction region and the second candidate detection region, the target second detection region matching the first prediction region includes:
  • according to each first pixel point in the first prediction area and each corresponding second pixel point in the second candidate detection area, the Mahalanobis distance sum value and the cosine distance sum value between the first prediction area and the second candidate detection area are determined;
  • according to the Mahalanobis distance sum value, the cosine distance sum value and the corresponding preset weights, the distance weight sum value of the first prediction area and the second candidate detection area is determined;
  • if the distance weight sum value is less than the preset threshold, the second candidate detection area is determined as the target second detection area.
  • the training process of the deep learning model includes:
  • for any sample image in the sample set, obtaining the sample image and the first label information corresponding to the sample image, wherein the first label information identifies a set range area that includes a vehicle in the sample image;
  • inputting the sample image into the original deep learning model to obtain output second label information of the sample image; and adjusting the parameter values of the parameters of the original deep learning model according to the first label information and the second label information to obtain the trained deep learning model.
  • embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
  • These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising an instruction apparatus that implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.


Abstract

A vehicle speed determination method and apparatus. In the method, a first detection area of a vehicle in a current frame of an image and a second candidate detection area of the vehicle in the next frame of the image adjacent to the current frame of the image are identified on the basis of a pre-trained deep learning model. Since the deep learning model identifies detection areas comprising the vehicle, the impact of light and a picture background on a pixel value will not reduce the accuracy of the determined detection areas. A first prediction area of the vehicle in the next frame of the image is predicted according to the first detection area in the current frame of the image. A target second detection area that matches the first prediction area is determined according to the similarity between the first prediction area and the second candidate detection area. The vehicle speed of the vehicle is determined according to the first detection area, the target second detection area, and a time difference between two adjacent frames of an image. Therefore, the vehicle speed of the vehicle can be accurately determined when the image comprises a plurality of vehicles.

Description

A vehicle speed determination method, device, equipment and medium
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims priority to the Chinese patent application filed with the China Patent Office on April 15, 2021, with application number 202110405781.2 and the title "A Vehicle Speed Determination Method, Device, Equipment and Medium", the entire contents of which are incorporated into this application by reference.
Technical Field
The present invention relates to the technical field of image processing, and in particular to a vehicle speed determination method, device, equipment and medium.
Background Art
With the progress of society and rapid economic development, the number of cars keeps increasing. While this provides convenience to society, the probability of traffic accidents and the rate of road congestion have also greatly increased. The main cause of traffic accidents and road congestion is drivers arbitrarily increasing and decreasing speed, so accurately determining vehicle speed has become an urgent need.
Speed measurement technologies in the prior art include radar speed measurement, laser speed measurement, ground-sensing coil speed measurement, and the like. Radar speed measurement is based on the Doppler effect: radar waves are transmitted toward the vehicle and the waves reflected back from it are received, and the vehicle speed is determined from the frequency of the transmitted radar waves and the frequency of the received reflected waves. However, radar speed measurement can only be applied in mobile, close-range scenarios and cannot be used for long-distance measurement, such as measuring the speed of vehicles on a highway; moreover, it requires installing radar equipment, making speed determination costly.
Laser speed measurement is based on the principle of laser ranging. The rangefinder emits two laser pulses toward the vehicle separated by a set time interval and receives the returned light, determines the distance the vehicle moved within that interval, and thereby determines the vehicle speed. However, this method imposes strict requirements on the measurement deviation angle, resulting in a low success rate in determining vehicle speed, and the rangefinder can only be used while stationary.
Ground-sensing coil speed measurement is based on the principle of electromagnetic induction. When a vehicle passes over the coil area, the magnetic flux of the coil changes and a signal is triggered indicating a passing vehicle; using two coils a known distance apart, combined with the time the vehicle takes to pass between them, the speed can be determined. However, this approach requires installing ground-sensing coils, which involves substantial construction work; the coils are easily damaged and subsequent maintenance is difficult.
Due to the above problems of traditional speed measurement methods, and given that machine vision technology has developed rapidly in recent years, is widely applied across industries, and offers a high degree of automation, high efficiency and high precision, the prior art has proposed vehicle speed measurement based on visual detection technology.
In existing visual detection technology, the position of a moving vehicle in the image is determined by the frame difference method, and vehicles are matched between every two frames to detect the position of the same vehicle in different frames; the actual distance traveled by the vehicle is determined from its pixel distance, and the vehicle speed is determined from the actual distance and the time difference between the two frames.
Specifically, when the frame difference method is used to determine the position of a moving vehicle in the image, the corresponding pixel values of adjacent frame images are subtracted to obtain a difference image, which is then binarized. When the ambient brightness changes little, a pixel whose difference value is smaller than a predetermined threshold can be classified as a background pixel, while a pixel whose difference value is not smaller than the threshold can be classified as a pixel of the moving vehicle. Since this method can accurately determine the position of the vehicle in the image only when the ambient brightness changes little, in practice illumination and the picture background affect the determined position, making the determined vehicle position in the image inaccurate and thus the determined speed inaccurate.
Moreover, existing visual detection technology can determine vehicle speed only when the image contains a single vehicle, and cannot do so when the image contains multiple vehicles.
Therefore, how to improve the accuracy of the determined vehicle speed when an image contains multiple vehicles has become an urgent technical problem to be solved.
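The frame-difference binarization described in this background can be sketched as follows, with grayscale frames represented as nested lists for simplicity; the threshold is the predetermined value mentioned above:

```python
def frame_difference_mask(prev_frame, curr_frame, threshold):
    """Subtract corresponding pixel values of adjacent frames and binarize:
    1 marks a pixel classified as the moving vehicle (difference not smaller
    than the threshold), 0 marks a background pixel."""
    return [[1 if abs(c - p) >= threshold else 0
             for p, c in zip(prev_row, curr_row)]
            for prev_row, curr_row in zip(prev_frame, curr_frame)]
```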
SUMMARY OF THE INVENTION
The present invention provides a vehicle speed determination method, device, equipment and medium to solve the problem of how to improve the accuracy of the determined vehicle speed when multiple vehicles are included in an image.
The present invention provides a vehicle speed determination method, the method comprising:
identifying, based on a pre-trained deep learning model, a first detection area of a vehicle in a current frame image and a second candidate detection area of the vehicle in a next frame image adjacent to the current frame image;
predicting, according to the first detection area in the current frame image, a first prediction area of the vehicle in the next frame image, and determining, according to the similarity between the first prediction area and the second candidate detection area, a target second detection area that matches the first prediction area;
determining the speed of the vehicle according to the first detection area, the target second detection area and the time difference between two adjacent frames of images.
Further, determining the speed of the vehicle according to the first detection area, the target second detection area and the time difference between two adjacent frames of images includes:
determining the pixel distance between the first detection area and the target second detection area according to first coordinates of a vehicle preset position in the first detection area and second coordinates of the vehicle preset position in the target second detection area, wherein the first detection area and the target second detection area have the same area range;
determining the actual distance moved by the vehicle according to the ratio of the pixel width of the first detection area to a preset width and the pixel distance;
determining the speed of the vehicle according to the actual distance and the time difference between two adjacent frames of images.
Further, predicting, according to the first detection area in the current frame image, the first prediction area of the vehicle in the next frame image includes:
predicting the vehicle corresponding to the first detection area based on a standard Kalman filter with a constant velocity model and a linear observation model, and determining the first prediction area of the first detection area in the next frame image.
Further, determining, according to the similarity between the first prediction area and the second candidate detection area, the target second detection area that matches the first prediction area includes:
determining the Mahalanobis distance sum value and the cosine distance sum value between the first prediction area and the second candidate detection area according to each first pixel point in the first prediction area and each corresponding second pixel point in the second candidate detection area;
determining the distance weight sum value of the first prediction area and the second candidate detection area according to the Mahalanobis distance sum value, the cosine distance sum value and the respective corresponding preset weights;
if the distance weight sum value is less than the preset threshold, determining the second candidate detection area as the target second detection area.
Further, the training process of the deep learning model includes:
for any sample image in a sample set, obtaining the sample image and first label information corresponding to the sample image, wherein the first label information identifies a set range area in the sample image that contains a vehicle;
inputting the sample image into an original deep learning model, and obtaining output second label information of the sample image;
adjusting the parameter values of the parameters of the original deep learning model according to the first label information and the second label information to obtain the trained deep learning model.
相应地,本发明提供了一种车速确定装置,所述装置包括:Accordingly, the present invention provides a vehicle speed determination device, the device comprising:
识别模块,用于基于预先训练完成的深度学习模型,识别当前帧图像中的车辆的第一检测区域以及与所述当前帧图像相邻的下一帧图像中所述车辆的第二候选检测区域;The identification module is used to identify the first detection area of the vehicle in the current frame image and the second candidate detection area of the vehicle in the next frame image adjacent to the current frame image based on the pre-trained deep learning model ;
匹配模块,用于根据当前帧图像中的所述第一检测区域,预测所述车辆在所述下一帧图像的第一预测区域;根据所述第一预测区域与所述第二候选检测区域之间的相似度,确定与所述第一预测区域匹配的目标第二检测区域;a matching module, configured to predict the first prediction area of the vehicle in the next frame image according to the first detection area in the current frame image; according to the first prediction area and the second candidate detection area The similarity between the two, determine the target second detection area that matches the first prediction area;
确定模块,用于根据所述第一检测区域、所述目标第二检测区域以及相邻两帧图像的时间差值,确定所述车辆的车速。A determination module, configured to determine the speed of the vehicle according to the first detection area, the target second detection area and the time difference between two adjacent frames of images.
进一步地,所述确定模块,具体用于根据所述第一检测区域中车辆预设位置的第一坐标和所述目标第二检测区域中所述车辆预设位置的第二坐标,确定所述第一检测区域与所述目标第二检测区域的像素距离,其中所述第一检测区域与所述目标第二检测区域的区域范围相同;根据所述第一检测区域的像素宽度和预设宽度的比值和所述像素距离,确定所述车辆移动的实际距 离;根据所述实际距离和相邻两帧图像的时间差值,确定所述车辆的车速。Further, the determining module is specifically configured to determine the said vehicle according to the first coordinates of the vehicle preset position in the first detection area and the second coordinates of the vehicle preset position in the target second detection area The pixel distance between the first detection area and the target second detection area, wherein the first detection area and the target second detection area have the same area range; according to the pixel width of the first detection area and the preset width Determine the actual distance moved by the vehicle; and determine the speed of the vehicle according to the actual distance and the time difference between two adjacent frames of images.
进一步地，所述匹配模块，具体用于基于常量速度模型和线性观测模型的标准卡尔曼滤波器，对所述第一检测区域对应的车辆进行预测，确定所述第一检测区域在所述下一帧图像的第一预测区域。Further, the matching module is specifically configured to predict the vehicle corresponding to the first detection area using a standard Kalman filter based on a constant velocity model and a linear observation model, and to determine the first prediction area of the first detection area in the next frame image.
进一步地，所述匹配模块，具体还用于根据所述第一预测区域中的每个第一像素点与所述第二候选检测区域中对应的每个第二像素点，确定所述第一预测区域与所述第二候选检测区域的马氏距离和值与余弦距离和值；根据所述马氏距离和值、所述余弦距离和值以及分别对应的预设权重，确定所述第一预测区域与所述第二候选检测区域的距离权重和值；若所述距离权重和值小于所述预设阈值，则确定第二候选检测区域为目标第二检测区域。Further, the matching module is further configured to: determine a Mahalanobis distance sum value and a cosine distance sum value between the first prediction area and the second candidate detection area according to each first pixel point in the first prediction area and each corresponding second pixel point in the second candidate detection area; determine a distance weight sum value between the first prediction area and the second candidate detection area according to the Mahalanobis distance sum value, the cosine distance sum value and their respective preset weights; and, if the distance weight sum value is less than the preset threshold, determine the second candidate detection area as the target second detection area.
进一步地,所述装置还包括:Further, the device also includes:
训练模块，具体用于针对样本集中的任一样本图像，获取所述样本图像及所述样本图像对应的第一标签信息，其中所述第一标签信息标识所述样本图像中包含车辆的设定范围区域；将所述样本图像输入到原始深度学习模型中，获取输出的所述样本图像的第二标签信息；根据所述第一标签信息和所述第二标签信息，对所述原始深度学习模型的各参数的参数值进行调整，得到训练完成的所述深度学习模型。A training module, specifically configured to: for any sample image in a sample set, obtain the sample image and first label information corresponding to the sample image, wherein the first label information identifies a set range area containing a vehicle in the sample image; input the sample image into an original deep learning model to obtain output second label information of the sample image; and adjust the parameter values of the parameters of the original deep learning model according to the first label information and the second label information, to obtain the trained deep learning model.
相应地,本发明提供了一种电子设备,包括:处理器、通信接口、存储器和通信总线,其中,处理器,通信接口,存储器通过通信总线完成相互间的通信;Correspondingly, the present invention provides an electronic device, comprising: a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with each other through the communication bus;
所述存储器中存储有计算机程序，当所述程序被所述处理器执行时，使得所述处理器实现上述车速确定方法中任一所述方法的步骤。A computer program is stored in the memory; when the program is executed by the processor, the processor implements the steps of any one of the above vehicle speed determination methods.
相应地,本发明提供了一种计算机可读存储介质,其存储有计算机程序,所述计算机程序被处理器执行时实现上述车速确定方法中任一所述方法的步骤。Correspondingly, the present invention provides a computer-readable storage medium, which stores a computer program, and when the computer program is executed by a processor, implements the steps of any one of the above-mentioned vehicle speed determination methods.
本发明提供了一种车速确定方法、装置、设备和介质，该方法中基于预先训练完成的深度学习模型，识别当前帧图像中的车辆的第一检测区域以及与所述当前帧图像相邻的下一帧图像中所述车辆的第二候选检测区域，由于深度学习模型识别出的是包含车辆的检测区域，光照和画面背景对像素值的影响并不会降低确定的包含车辆的检测区域的准确性，根据当前帧图像中的所述第一检测区域，预测所述车辆在所述下一帧图像的第一预测区域；根据所述第一预测区域与所述第二候选检测区域之间的相似度，确定与所述第一预测区域匹配的目标第二检测区域；根据所述第一检测区域、所述目标第二检测区域以及相邻两帧图像的时间差值，确定所述车辆的车速；从而可以在图像中包括多个车辆时确定出一个车辆的第一检测区域和对应的目标第二检测区域，从而准确地确定出车辆的车速。The present invention provides a vehicle speed determination method, apparatus, device and medium. In the method, based on a pre-trained deep learning model, a first detection area of a vehicle in a current frame image and a second candidate detection area of the vehicle in the next frame image adjacent to the current frame image are identified. Since what the deep learning model identifies is a detection area containing the vehicle, the influence of illumination and picture background on pixel values does not reduce the accuracy of the determined detection area containing the vehicle. According to the first detection area in the current frame image, a first prediction area of the vehicle in the next frame image is predicted; according to the similarity between the first prediction area and the second candidate detection area, a target second detection area matching the first prediction area is determined; and according to the first detection area, the target second detection area and the time difference between the two adjacent frames, the speed of the vehicle is determined. Thus, when the image contains multiple vehicles, the first detection area of one vehicle and the corresponding target second detection area can be determined, so that the speed of the vehicle is determined accurately.
附图说明Description of drawings
为了更清楚地说明本发明实施例中的技术方案,下面将对实施例描述中所需要使用的附图作简要介绍,显而易见地,下面描述中的附图仅仅是本发明的一些实施例,对于本领域的普通技术人员来讲,在不付出创造性劳动性的前提下,还可以根据这些附图获得其他的附图。In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the following briefly introduces the accompanying drawings used in the description of the embodiments. Obviously, the drawings in the following description are only some embodiments of the present invention. For those of ordinary skill in the art, other drawings can also be obtained from these drawings without any creative effort.
图1为本发明实施例提供的一种车速确定方法的过程示意图;FIG. 1 is a schematic process diagram of a method for determining vehicle speed according to an embodiment of the present invention;
图2为本发明实施例提供的一种图像中车辆与实际的车辆分别与图像采集设备的连线构成的相似三角形的示意图;FIG. 2 is a schematic diagram of a similar triangle formed by a connection line between a vehicle in an image and an actual vehicle and an image acquisition device, respectively, according to an embodiment of the present invention;
图3为本发明实施例提供的一种YOLOv3的骨干网络Darknet-53的示意图;3 is a schematic diagram of a YOLOv3 backbone network Darknet-53 according to an embodiment of the present invention;
图4为本发明实施例提供的一种YOLOv3的整体结构图;4 is an overall structural diagram of a YOLOv3 provided by an embodiment of the present invention;
图5为本发明实施例提供的一种YOLOv3的基本组件DBL的示意图;5 is a schematic diagram of a basic component DBL of YOLOv3 provided by an embodiment of the present invention;
图6为本发明实施例提供的一种YOLOv3的基本组件Res_unit的示意图;6 is a schematic diagram of a basic component Res_unit of YOLOv3 provided by an embodiment of the present invention;
图7为本发明实施例提供的一种YOLOv3的基本组件Resblock_body的示意图;7 is a schematic diagram of a basic component Resblock_body of YOLOv3 provided by an embodiment of the present invention;
图8为本发明实施例提供的一种车速确定装置的结构示意图;FIG. 8 is a schematic structural diagram of a vehicle speed determination device according to an embodiment of the present invention;
图9为本发明实施例提供的一种电子设备结构示意图。FIG. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
具体实施方式Description of Embodiments
为了使本发明的目的、技术方案和优点更加清楚,下面将结合附图对本发明作进一步地详细描述,显然,所描述的实施例仅仅是本发明一部分实施例,而不是全部的实施例。基于本发明中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其它实施例,都属于本发明保护的范围。In order to make the objectives, technical solutions and advantages of the present invention clearer, the present invention will be further described in detail below with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all of the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative efforts shall fall within the protection scope of the present invention.
为了在图像中包括多个车辆时提高确定的车速的准确性,本发明实施例提供了一种车速确定方法、装置、设备和介质。In order to improve the accuracy of the determined vehicle speed when a plurality of vehicles are included in the image, embodiments of the present invention provide a vehicle speed determination method, apparatus, device and medium.
实施例1:Example 1:
图1为本发明实施例提供的一种车速确定方法的过程示意图,该过程包括以下步骤:1 is a schematic process diagram of a vehicle speed determination method provided by an embodiment of the present invention, and the process includes the following steps:
S101:基于预先训练完成的深度学习模型,识别当前帧图像中的车辆的第一检测区域以及与所述当前帧图像相邻的下一帧图像中所述车辆的第二候选检测区域。S101: Based on a pre-trained deep learning model, identify a first detection area of the vehicle in the current frame image and a second candidate detection area of the vehicle in the next frame image adjacent to the current frame image.
本发明实施例提供的一种车速确定方法应用于电子设备，该电子设备可以是PC、平板电脑、智能手机等智能终端设备，也可以是服务器，其中该服务器可以是本地服务器，也可以是云端服务器。The vehicle speed determination method provided by the embodiments of the present invention is applied to an electronic device. The electronic device may be a smart terminal device such as a PC, a tablet computer or a smartphone, or may be a server, where the server may be a local server or a cloud server.
在本发明实施例中，根据图像采集设备采集的视频中的视频帧图像，该电子设备确定视频帧图像中车辆的车速，其中该图像采集设备为采集行驶的车辆的图像的设备，例如监控摄像头、摄像机等。In the embodiments of the present invention, the electronic device determines the speed of a vehicle in video frame images of a video collected by an image acquisition device, where the image acquisition device is a device that captures images of moving vehicles, such as a surveillance camera or a video camera.
图像采集设备采集的视频中的当前帧图像可能是包含一个车辆,也可能是包含多个车辆;为了确定当前帧图像中的车辆的车速,需要识别当前帧图像中车辆的检测区域。The current frame image in the video collected by the image acquisition device may contain one vehicle or multiple vehicles; in order to determine the speed of the vehicle in the current frame image, the detection area of the vehicle in the current frame image needs to be identified.
为了识别当前帧图像中车辆的检测区域，该电子设备保存有训练完成的深度学习模型，将当前帧图像和与当前帧图像相邻的下一帧图像输入到该预先训练完成的深度学习模型，深度学习模型对当前帧图像和下一帧图像中的车辆进行处理，识别出该当前帧图像中车辆的第一检测区域和下一帧图像中车辆的第二候选检测区域。In order to identify the detection area of the vehicle in the current frame image, the electronic device stores a trained deep learning model. The current frame image and the next frame image adjacent to it are input into the pre-trained deep learning model, which processes the vehicles in both images and identifies the first detection area of the vehicle in the current frame image and the second candidate detection area of the vehicle in the next frame image.
其中,该深度学习模型可以是卷积神经网络模型(Convolutional neural networks,CNN)、或目标检测网络模型(YOLOv3)、或深度残差网络(Deep residual networks,DRN)。Wherein, the deep learning model may be a convolutional neural network model (Convolutional neural networks, CNN), or a target detection network model (YOLOv3), or a deep residual network (Deep residual networks, DRN).
S102:根据当前帧图像中的所述第一检测区域，预测所述车辆在所述下一帧图像的第一预测区域；根据所述第一预测区域与所述第二候选检测区域之间的相似度，确定与所述第一预测区域匹配的目标第二检测区域。S102: Predict a first prediction area of the vehicle in the next frame image according to the first detection area in the current frame image; and determine, according to the similarity between the first prediction area and the second candidate detection area, a target second detection area matching the first prediction area.
由于当前帧图像和下一帧图像中均包含多个车辆，为了基于识别出的第一检测区域和第二候选检测区域，确定出同一车辆在当前帧图像和下一帧图像中的区域，在本发明实施例中，电子设备还针对当前帧图像中任一车辆，根据识别出的车辆在当前帧图像的第一检测区域，确定该第一检测区域对应车辆在下一帧图像中的目标第二检测区域。Since both the current frame image and the next frame image may contain multiple vehicles, in order to determine, based on the identified first detection area and second candidate detection areas, the areas of the same vehicle in the two images, in the embodiments of the present invention the electronic device further determines, for any vehicle in the current frame image and according to the identified first detection area of that vehicle, the target second detection area of the corresponding vehicle in the next frame image.
由于现有车辆的类型包括轿车、卡车、大巴等类型，而同一类型的车辆相似度较高，为了确定出该第一检测区域对应车辆在下一帧图像中的目标第二检测区域，根据当前帧图像的第一检测区域，预测该第一检测区域对应车辆在下一帧图像中的第一预测区域，基于该第一预测区域确定出该第一检测区域对应车辆在下一帧图像中的目标第二检测区域。其中，根据当前帧图像的第一检测区域，预测该第一检测区域对应车辆在下一帧图像中的第一预测区域的方法属于现有技术，本发明实施例对此不做赘述。Since existing vehicle types include cars, trucks, buses and the like, and vehicles of the same type are highly similar, in order to determine the target second detection area of the vehicle corresponding to the first detection area in the next frame image, the first prediction area of that vehicle in the next frame image is predicted according to the first detection area in the current frame image, and the target second detection area is then determined based on the first prediction area. The method of predicting, from the first detection area of the current frame image, the first prediction area of the corresponding vehicle in the next frame image belongs to the prior art and is not described in detail in the embodiments of the present invention.
根据下一帧图像的第一预测区域和第二候选检测区域,确定第一预测区域和第二候选检测区域的相似度。例如可以通过计算第一预测区域和第二候选检测区域之间的马氏距离确定相似度,马氏距离越小时则第一预测区域和第二候选检测区域的相似度越高。The similarity between the first prediction region and the second candidate detection region is determined according to the first prediction region and the second candidate detection region of the next frame of image. For example, the similarity can be determined by calculating the Mahalanobis distance between the first prediction region and the second candidate detection region. The smaller the Mahalanobis distance, the higher the similarity between the first prediction region and the second candidate detection region.
根据第一预测区域和第二候选检测区域的相似度,确定与第一预测区域匹配的目标第二检测区域。例如可以是将相似度最高,且大于预设阈值的第二候选检测区域确定为目标第二检测区域。According to the similarity between the first prediction area and the second candidate detection area, a target second detection area matching the first prediction area is determined. For example, the second candidate detection area with the highest similarity and greater than a preset threshold may be determined as the target second detection area.
S103:根据所述第一检测区域、所述目标第二检测区域以及相邻两帧图像的时间差值，确定所述车辆的车速。S103: Determine the speed of the vehicle according to the first detection area, the target second detection area and the time difference between two adjacent frames of images.
由于当前帧图像和下一帧图像对应的区域均为同一图像采集区域，因此该当前帧图像和下一帧图像的坐标系相同，根据当前帧图像中的第一检测区域和下一帧图像中的目标第二检测区域，确定出第一检测区域的车辆在相邻两帧时间内移动的实际距离，并根据实际距离和相邻两帧图像的时间差值，确定出第一检测区域对应车辆的车速。Since the current frame image and the next frame image correspond to the same image acquisition area, the two images share the same coordinate system. According to the first detection area in the current frame image and the target second detection area in the next frame image, the actual distance moved by the vehicle of the first detection area within the time of two adjacent frames is determined, and the speed of the vehicle corresponding to the first detection area is determined according to the actual distance and the time difference between the two adjacent frames.
由于本发明实施例中基于预先训练完成的深度学习模型，识别当前帧图像中的车辆的第一检测区域以及与所述当前帧图像相邻的下一帧图像中所述车辆的第二候选检测区域，由于深度学习模型识别出的是包含车辆的检测区域，光照和画面背景对像素值的影响并不会降低确定的包含车辆的检测区域的准确性，根据当前帧图像中的所述第一检测区域，预测所述车辆在所述下一帧图像的第一预测区域；根据所述第一预测区域与所述第二候选检测区域之间的相似度，确定与所述第一预测区域匹配的目标第二检测区域；根据所述第一检测区域、所述目标第二检测区域以及相邻两帧图像的时间差值，确定所述车辆的车速；从而可以在图像中包括多个车辆时确定出一个车辆的第一检测区域和对应的目标第二检测区域，从而准确地确定出车辆的车速。In the embodiments of the present invention, based on a pre-trained deep learning model, the first detection area of the vehicle in the current frame image and the second candidate detection area of the vehicle in the next frame image adjacent to the current frame image are identified. Since what the deep learning model identifies is the detection area containing the vehicle, the influence of illumination and picture background on pixel values does not reduce the accuracy of the determined detection area containing the vehicle. According to the first detection area in the current frame image, the first prediction area of the vehicle in the next frame image is predicted; according to the similarity between the first prediction area and the second candidate detection area, the target second detection area matching the first prediction area is determined; and according to the first detection area, the target second detection area and the time difference between two adjacent frames, the speed of the vehicle is determined. Thus, when the image contains multiple vehicles, the first detection area of one vehicle and the corresponding target second detection area can be determined, so that the speed of the vehicle is determined accurately.
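The steps S101–S103 above can be tied together as a per-frame loop. The following is a minimal sketch with stubbed-out components; all function names (detect, predict, match, speed) are illustrative assumptions, not interfaces defined by the patent:

```python
def track_and_speed(frames, detect, predict, match, speed):
    """Run the S101-S103 pipeline over consecutive frame pairs.

    detect(frame)      -> list of detection areas (S101)
    predict(area)      -> first prediction area in the next frame (S102)
    match(pred, cands) -> matched target second detection area or None (S102)
    speed(a1, a2, dt)  -> vehicle speed from the two matched areas (S103)
    """
    dt = 1.0 / 25.0  # time between adjacent frames at an assumed 25 fps
    speeds = []
    for cur, nxt in zip(frames, frames[1:]):
        first_areas = detect(cur)   # first detection areas
        candidates = detect(nxt)    # second candidate detection areas
        for area in first_areas:
            pred = predict(area)
            target = match(pred, candidates)
            if target is not None:
                speeds.append(speed(area, target, dt))
    return speeds

# Tiny smoke run with stub components (frames stand in for images).
detect = lambda frame: [frame]
predict = lambda area: area
match = lambda pred, cands: cands[0] if cands else None
speed = lambda a1, a2, dt: abs(a2 - a1) / dt
print(track_and_speed([0, 1, 2], detect, predict, match, speed))
```

The stubs only exercise the control flow; real detection, Kalman prediction and Deep SORT matching would be plugged in behind the same four callables.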
实施例2:Example 2:
为了确定车辆的车速，在上述实施例的基础上，在本发明实施例中，所述根据所述第一检测区域、所述目标第二检测区域以及相邻两帧图像的时间差值，确定所述车辆的车速包括：In order to determine the speed of the vehicle, on the basis of the above embodiment, in the embodiments of the present invention, determining the speed of the vehicle according to the first detection area, the target second detection area and the time difference between two adjacent frames of images includes:
根据所述第一检测区域中车辆预设位置的第一坐标和所述目标第二检测区域中所述车辆预设位置的第二坐标，确定所述第一检测区域与所述目标第二检测区域的像素距离，其中所述第一检测区域与所述目标第二检测区域的区域范围相同；determining a pixel distance between the first detection area and the target second detection area according to first coordinates of a preset vehicle position in the first detection area and second coordinates of the preset vehicle position in the target second detection area, wherein the first detection area and the target second detection area have the same area extent;
根据所述第一检测区域的像素宽度和预设宽度的比值和所述像素距离,确定所述车辆移动的实际距离;Determine the actual distance that the vehicle moves according to the ratio of the pixel width of the first detection area to the preset width and the pixel distance;
根据所述实际距离和相邻两帧图像的时间差值,确定所述车辆的车速。The speed of the vehicle is determined according to the actual distance and the time difference between two adjacent frames of images.
为了确定车辆的车速，首先根据第一检测区域和目标第二检测区域，确定出第一检测区域对应车辆在相邻两帧图像的时间差值移动的实际距离。In order to determine the speed of the vehicle, first, according to the first detection area and the target second detection area, the actual distance moved by the vehicle corresponding to the first detection area within the time difference between the two adjacent frames is determined.
根据第一检测区域中车辆预设位置的第一坐标、以及目标第二检测区域中该车辆预设位置的第二坐标，确定出第一坐标与第二坐标的距离，将该距离作为第一检测区域与目标第二检测区域之间的像素距离。According to the first coordinates of the preset vehicle position in the first detection area and the second coordinates of the preset vehicle position in the target second detection area, the distance between the first coordinates and the second coordinates is determined, and this distance is taken as the pixel distance between the first detection area and the target second detection area.
其中，该车辆预设位置可以是第一检测区域的中心位置，也可以是第一检测区域的任一点的位置，该第一检测区域与目标第二检测区域的区域范围相同，因此在该第一检测区域中与目标第二检测区域中，车辆预设位置均为第一检测区域与目标第二检测区域中的相同行数和相同列数对应的坐标点的位置。The preset vehicle position may be the center of the first detection area, or the position of any point in the first detection area. Since the first detection area and the target second detection area have the same area extent, in both areas the preset vehicle position is the coordinate point at the same row and column number within the first detection area and the target second detection area.
确定第一检测区域的像素宽度与预设宽度的比值，其中该预设宽度为车辆常规的实际宽度，例如该预设宽度为3米，3.5米等。该比值为像素宽度与实际宽度的比值，基于图像采集设备的基本成像原理，图像中车辆与实际的车辆分别与图像采集设备的连线构成了相似三角形。The ratio of the pixel width of the first detection area to a preset width is determined, where the preset width is a typical actual vehicle width, for example 3 meters or 3.5 meters. This ratio is the ratio of pixel width to actual width. Based on the basic imaging principle of the image acquisition device, the lines connecting the vehicle in the image and the actual vehicle to the image acquisition device form similar triangles.
图2为本发明实施例提供的一种图像中车辆与实际的车辆分别与图像采集设备的连线构成的相似三角形的示意图,如图2所示,图中的连线AB表示为实际车辆,图中的连线CD表示为图像中车辆,AB和CD分别与图像采集设备焦点O连线构成三角形OAB和三角形OCD,三角形OAB和三角形OCD为相似三角形。FIG. 2 is a schematic diagram of a similar triangle formed by connecting lines between a vehicle and an actual vehicle in an image provided by an embodiment of the present invention, respectively, and an image acquisition device. As shown in FIG. 2 , the connecting line AB in the figure represents an actual vehicle. The connecting line CD in the figure represents the vehicle in the image. The connecting lines of AB and CD and the focus O of the image acquisition device respectively form a triangle OAB and a triangle OCD, and the triangle OAB and the triangle OCD are similar triangles.
因此，图像采集设备的焦距与车辆距图像采集设备的距离的比值，即相当于像素宽度与实际宽度的比值，而车辆距图像采集设备的距离较远，车辆在车道中的移动对车辆距图像采集设备的距离的影响可以忽略不计，因此可以认为图像采集设备的焦距与车辆距图像采集设备的距离的比值为固定值，即可以将像素宽度与实际宽度的比值确认为该像素距离与实际距离的比值。Therefore, the ratio of the focal length of the image acquisition device to the distance from the vehicle to the image acquisition device is equivalent to the ratio of the pixel width to the actual width. Since the vehicle is far from the image acquisition device, the influence of the vehicle's movement within the lane on its distance to the device is negligible; the ratio of focal length to distance can thus be regarded as a fixed value, that is, the ratio of pixel width to actual width can be taken as the ratio of pixel distance to actual distance.
根据确定的第一检测区域与目标第二检测区域的像素距离、以及该像素距离与实际距离的比值，确定出该像素距离除以该比值的商值，该商值即为第一检测区域对应车辆在相邻两帧图像的时间差值移动的实际距离。根据第一检测区域对应车辆移动的该实际距离和相邻两帧图像的时间差值，确定出该实际距离除以该时间差值的商值，该商值即为第一检测区域对应车辆的车速。According to the determined pixel distance between the first detection area and the target second detection area and the ratio of pixel distance to actual distance, the quotient of the pixel distance divided by the ratio is determined; this quotient is the actual distance moved by the vehicle corresponding to the first detection area within the time difference between the two adjacent frames. According to this actual distance and the time difference between the two adjacent frames, the quotient of the actual distance divided by the time difference is determined; this quotient is the speed of the vehicle corresponding to the first detection area.
下面通过一个具体的实施例对本发明的确定车辆的车速的过程进行说明，假设车辆在第1帧中的第一检测区域的中心点位置是(x1,y1)，第一检测区域的像素高度和像素宽度分别是h1和w1；车辆在第2帧中的目标第二检测区域的中心点位置是(x2,y2)，该车辆移动的像素距离D_p为：
D_p = √((x2-x1)²+(y2-y1)²)
The process of determining the vehicle speed according to the present invention is described below through a specific embodiment. Assume that the center point of the first detection area of the vehicle in frame 1 is (x1, y1), and the pixel height and pixel width of the first detection area are h1 and w1, respectively; the center point of the target second detection area of the vehicle in frame 2 is (x2, y2). The pixel distance D_p moved by the vehicle is:
D_p = √((x2-x1)²+(y2-y1)²)
根据该第一检测区域的像素宽度w1和预设宽度3米，确定出该像素宽度与预设宽度的比例p：
p = w1/3
确定出该车辆在相邻两帧图像的时间差值内移动的实际距离D：
D = D_p/p
According to the pixel width w1 of the first detection area and the preset width of 3 meters, the ratio p of the pixel width to the preset width is determined:
p = w1/3
The actual distance D moved by the vehicle within the time difference between the two adjacent frames is then:
D = D_p/p
根据每秒处理的视频帧数fps，确定出相邻两帧图像的时间差值t：
t = 1/fps
一般情况下每秒处理的视频帧数fps为25，因此时间差值t为：
t = 1/25 = 0.04秒
According to the number of video frames processed per second (fps), the time difference t between two adjacent frames is determined:
t = 1/fps
In general, the number of video frames processed per second is 25, so the time difference t is:
t = 1/25 = 0.04 s
根据车辆速度的计算公式：
v = D/t
因此确定出车辆的车速：
v = D/0.04 = 25D（米/秒）
According to the calculation formula of the vehicle speed:
v = D/t
the speed of the vehicle is thus determined:
v = D/0.04 = 25D (m/s)
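The worked example above can be expressed directly in code. The following is a minimal sketch; the function name and default values (3 m preset width, 25 fps) follow the example, while everything else is illustrative:

```python
import math

def vehicle_speed(c1, c2, w1, preset_width=3.0, fps=25.0):
    """Estimate vehicle speed in m/s from box centers in two adjacent frames.

    c1, c2: (x, y) centers of the first detection area and the matched
            target second detection area, in pixels.
    w1:     pixel width of the first detection area.
    preset_width: assumed actual vehicle width in meters.
    """
    # Pixel distance D_p between the two centers.
    d_p = math.hypot(c2[0] - c1[0], c2[1] - c1[1])
    # Ratio p of pixel width to real width (pixels per meter).
    p = w1 / preset_width
    # Actual distance D moved, in meters.
    d = d_p / p
    # Time difference t between adjacent frames.
    t = 1.0 / fps
    return d / t

# Example: center moves 40 px between frames, box is 120 px wide,
# so p = 40 px/m and the car moved 1 m in 0.04 s -> 25 m/s.
print(vehicle_speed((100, 200), (140, 200), w1=120))
```

Note the implicit assumption, as in the embodiment, that the camera-to-vehicle distance is large enough for the pixels-per-meter ratio to be treated as constant between the two frames.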
实施例3:Example 3:
为了预测出车辆在下一帧图像的第一预测区域，在上述各实施例的基础上，在本发明实施例中，所述根据当前帧图像中的所述第一检测区域，预测所述车辆在所述下一帧图像的第一预测区域包括：In order to predict the first prediction area of the vehicle in the next frame image, on the basis of the above embodiments, in the embodiments of the present invention, predicting the first prediction area of the vehicle in the next frame image according to the first detection area in the current frame image includes:
基于常量速度模型和线性观测模型的标准卡尔曼滤波器，对所述第一检测区域对应的车辆进行预测，确定所述第一检测区域在所述下一帧图像的第一预测区域。predicting, by a standard Kalman filter based on a constant velocity model and a linear observation model, the vehicle corresponding to the first detection area, and determining the first prediction area of the first detection area in the next frame image.
为了预测出车辆在下一帧图像中的第一预测区域，在本发明实施例中，根据现有的基于常量速度模型和线性观测模型的标准卡尔曼滤波器，以及该第一检测区域的中心点坐标(x,y)、像素宽度w和像素高度h，对该第一检测区域对应的车辆进行预测，确定出第一检测区域在与当前帧图像相邻的下一帧图像的第一预测区域。In order to predict the first prediction area of the vehicle in the next frame image, in the embodiments of the present invention, the vehicle corresponding to the first detection area is predicted using the existing standard Kalman filter based on a constant velocity model and a linear observation model, together with the center point coordinates (x, y), pixel width w and pixel height h of the first detection area, and the first prediction area of the first detection area in the next frame image adjacent to the current frame image is thereby determined.
其中,该第一预测区域的像素宽度和像素高度与第一检测区域的像素宽度和像素高度相同,仅该第一预测区域的中心点坐标与第一检测区域的中心点坐标不同。The pixel width and pixel height of the first prediction area are the same as the pixel width and pixel height of the first detection area, and only the center point coordinates of the first prediction area and the center point coordinates of the first detection area are different.
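The prediction step described above can be sketched as follows. This is a deliberately simplified illustration: a full Kalman filter would also propagate a covariance matrix and run a measurement update, whereas this sketch keeps only the constant-velocity mean propagation that produces the first prediction area; the state layout is an assumption:

```python
def kf_predict(state):
    """One Kalman predict step under a constant-velocity motion model.

    state: [cx, cy, w, h, vx, vy] -- box center, pixel size, and center
    velocity in pixels per frame (as estimated from previous frames).
    The center is shifted by the velocity; consistent with the
    embodiment, the pixel width and height are carried over unchanged.
    """
    cx, cy, w, h, vx, vy = state
    return [cx + vx, cy + vy, w, h, vx, vy]

# Box centered at (50, 80), 120x60 px, moving 10 px right, 4 px down per frame.
print(kf_predict([50, 80, 120, 60, 10, 4]))
```

In a complete tracker this predict step would alternate with an update step that corrects the state using the matched detection in the new frame.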
实施例4:Example 4:
为了确定与第一预测区域匹配的目标第二检测区域，在上述各实施例的基础上，在本发明实施例中，所述根据所述第一预测区域与所述第二候选检测区域之间的相似度，确定与所述第一预测区域匹配的目标第二检测区域包括：In order to determine the target second detection area matching the first prediction area, on the basis of the above embodiments, in the embodiments of the present invention, determining, according to the similarity between the first prediction area and the second candidate detection area, the target second detection area matching the first prediction area includes:
根据所述第一预测区域中的每个第一像素点与所述第二候选检测区域中对应的每个第二像素点，确定所述第一预测区域与所述第二候选检测区域的马氏距离和值与余弦距离和值；determining a Mahalanobis distance sum value and a cosine distance sum value between the first prediction area and the second candidate detection area according to each first pixel point in the first prediction area and each corresponding second pixel point in the second candidate detection area;
根据所述马氏距离和值、所述余弦距离和值以及分别对应的预设权重，确定所述第一预测区域与所述第二候选检测区域的距离权重和值；determining a distance weight sum value between the first prediction area and the second candidate detection area according to the Mahalanobis distance sum value, the cosine distance sum value and their respective preset weights;
若所述距离权重和值小于所述预设阈值,则确定第二候选检测区域为目标第二检测区域。If the distance weight sum value is less than the preset threshold, the second candidate detection area is determined to be the target second detection area.
现有的目标跟踪算法(Sort)确定与第一预测区域匹配的目标第二检测区域时,是以第一预测区域与第二候选检测区域间的交并比(IOU)作为判断匹配的度量标准,从而确定出当前帧图像和下一帧图像中的同一车辆的区域,实现对车辆的跟踪。When the existing target tracking algorithm (Sort) determines the target second detection area that matches the first prediction area, the intersection and union ratio (IOU) between the first prediction area and the second candidate detection area is used as a metric for judging matching. , so as to determine the area of the same vehicle in the current frame image and the next frame image, so as to realize the tracking of the vehicle.
由于目标跟踪算法(Sort)忽略了车辆的检测区域的表观信息，在车辆有遮挡的情况下的准确度很低，因此本发明实施例中采用多目标跟踪算法(Deep Sort)确定与第一预测区域匹配的目标第二检测区域。Since the target tracking algorithm (SORT) ignores the appearance information of the vehicle's detection area, its accuracy is very low when the vehicle is occluded. Therefore, in the embodiments of the present invention, a multi-target tracking algorithm (Deep SORT) is used to determine the target second detection area matching the first prediction area.
具体的，为了确定出第一预测区域与第二候选检测区域的匹配度，根据第一预测区域中的每个第一像素点与第二候选检测区域中对应的每个第二像素点，确定出第一像素点与第二像素点之间的马氏距离和余弦距离。Specifically, in order to determine the degree of matching between the first prediction area and the second candidate detection area, the Mahalanobis distance and the cosine distance between each first pixel point in the first prediction area and the corresponding second pixel point in the second candidate detection area are determined.
针对第一预测区域中每个第一像素点,根据该第一像素点在第一预测区 域的行数和列数,可以确定出第一候选检测区域中相同的行数和列数的第二像素点,该第二像素点即为第一像素点在第二候选检测区域对应的第二像素点。For each first pixel in the first prediction area, according to the number of rows and columns of the first pixel in the first prediction area, a second candidate detection area with the same number of rows and columns can be determined. Pixel point, the second pixel point is the second pixel point corresponding to the first pixel point in the second candidate detection area.
根据第一预测区域中每个第一像素点与第二候选检测区域对应的第二像素点的马氏距离，确定第一预测区域与第二候选检测区域的马氏距离和值；根据第一预测区域中每个第一像素点与第二候选检测区域对应的第二像素点的余弦距离，确定第一预测区域与第二候选检测区域的余弦距离和值。According to the Mahalanobis distance between each first pixel point in the first prediction area and the corresponding second pixel point in the second candidate detection area, the Mahalanobis distance sum value of the first prediction area and the second candidate detection area is determined; according to the cosine distance between each first pixel point in the first prediction area and the corresponding second pixel point in the second candidate detection area, the cosine distance sum value of the first prediction area and the second candidate detection area is determined.
根据第一预测区域与第二候选检测区域的马氏距离和值、余弦距离和值以及分别对应的权重值，将马氏距离和值与对应的权重值相乘得到第一乘积值，将余弦距离和值与对应的权重值相乘得到第二乘积值，确定第一乘积值与第二乘积值的和值，该和值即为第一预测区域与第二候选检测区域的权重距离和值。According to the Mahalanobis distance sum value, the cosine distance sum value and their respective weight values, the Mahalanobis distance sum value is multiplied by its weight value to obtain a first product value, the cosine distance sum value is multiplied by its weight value to obtain a second product value, and the sum of the first product value and the second product value is determined; this sum is the weighted distance sum value of the first prediction area and the second candidate detection area.
该权重距离和值即为评价第一预测区域与第二候选检测区域的相似度的指标，该权重距离和值越小时，则表示第一预测区域与第二候选检测区域越相似，也表示第一预测区域与第二候选检测区域的重合程度越高，即预测的第一预测区域越准确。The weighted distance sum value is an index for evaluating the similarity between the first prediction area and the second candidate detection area. The smaller the weighted distance sum value, the more similar the first prediction area and the second candidate detection area are, the higher their degree of coincidence, and therefore the more accurate the predicted first prediction area.
为了确定出与第一预测区域匹配的目标第二检测区域,还预先保存有判断区域是否匹配的预设阈值,其中该预设阈值是预先设置的,若希望提高区域匹配的准确度,则可以将该预设阈值设置地较小一些,若希望提高区域匹配的鲁棒性,则可以将该预设阈值设置地较大一些。In order to determine the target second detection area that matches the first prediction area, a preset threshold value for judging whether the area is matched is also pre-stored. The preset threshold value is preset. If you want to improve the accuracy of area matching, you can The preset threshold is set to be smaller, and if the robustness of region matching is desired to be improved, the preset threshold can be set to be larger.
根据该第一预测区域每个第二候选检测区域的权重距离和值以及该预设阈值,确定出权重距离和值小于预设阈值的第二候选检测区域为目标第二检测区域。According to the weighted distance and value of each second candidate detection area in the first prediction area and the preset threshold, a second candidate detection area whose weighted distance sum is smaller than the preset threshold is determined as the target second detection area.
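The matching rule above can be sketched as follows. This is an illustrative sketch, not the patent's implementation: all function names, the weight values and the identity inverse covariance are our own assumptions (with an identity inverse covariance the Mahalanobis distance reduces to the Euclidean distance; a full tracker would supply the covariance from the prediction step).

```python
import numpy as np

def weighted_distance_sum(pred, cand, cov_inv=None, w_maha=0.6, w_cos=0.4):
    """Weighted sum of per-pixel Mahalanobis and cosine distances between
    a prediction area and a candidate detection area of identical shape.
    Pixels correspond by identical row and column index."""
    a = pred.reshape(-1, pred.shape[-1]).astype(float)
    b = cand.reshape(-1, cand.shape[-1]).astype(float)
    if cov_inv is None:
        cov_inv = np.eye(a.shape[1])   # identity: Mahalanobis == Euclidean
    d = a - b
    # Per-pixel Mahalanobis distance sqrt(d^T * cov_inv * d), then summed.
    maha_sum = np.sqrt(np.einsum('ij,jk,ik->i', d, cov_inv, d)).sum()
    norms = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    cos_sum = (1.0 - (a * b).sum(axis=1) / np.maximum(norms, 1e-12)).sum()
    return w_maha * maha_sum + w_cos * cos_sum

def matching_areas(pred, candidates, threshold):
    """Indices of candidate areas whose weighted distance sum is below the
    preset threshold, i.e. candidates for the target second detection area."""
    return [i for i, c in enumerate(candidates)
            if weighted_distance_sum(pred, c) < threshold]
```

An identical area yields a weighted distance sum of zero and is always matched; raising the threshold admits less similar candidates, which mirrors the accuracy/robustness trade-off described above.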
Example 5:
To train the deep learning model, on the basis of the above embodiments, in an embodiment of the present invention the training process of the deep learning model includes:

for any sample image in a sample set, obtaining the sample image and first label information corresponding to the sample image, where the first label information identifies a set range area in the sample image that contains a vehicle;

inputting the sample image into an original deep learning model and obtaining second label information output for the sample image;

adjusting the parameter values of the parameters of the original deep learning model according to the first label information and the second label information, to obtain the trained deep learning model.

To enable training of the deep learning model, a sample set used for training is stored in this application. The sample images in the sample set include a vehicle image for each vehicle, and the first label information of each sample image is manually annotated in advance; the first label information identifies the set range area of the vehicle contained in the sample image.

In this application, after any sample image in the sample set and its first label information are obtained, the sample image is input into the original deep learning model, which outputs second label information for the sample image. The second label information identifies the set range area containing a vehicle that the original deep learning model recognizes in the sample image.

After the second label information of the sample image is determined by the original deep learning model, the model is trained according to the second label information and the first label information of the sample image, so as to adjust the parameter values of its parameters.

The above operations are performed for every sample image in the training sample set, and the trained deep learning model is obtained once a preset condition is met. The preset condition may be that the number of sample images whose first label information is consistent with the second label information produced by the original deep learning model exceeds a set number, or that the number of training iterations reaches a set maximum number, etc. This application places no specific restriction on the condition.

As a possible implementation, when training the original deep learning model, the sample images in the sample set can be divided into training sample images and test sample images: the original deep learning model is first trained on the training sample images, and the reliability of the trained model is then tested on the test sample images.
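The stopping conditions described above, label agreement on enough samples or a maximum iteration count, can be sketched as a generic loop. The ThresholdModel below is a toy stand-in so the loop is runnable; it is not the patent's detection model, and all names are illustrative assumptions.

```python
class ThresholdModel:
    """Toy stand-in for a trainable model: predicts label 1 when x >= t."""
    def __init__(self, t=10.0):
        self.t = t

    def predict(self, x):
        return 1 if x >= self.t else 0

    def update(self, x, y):
        # Nudge the decision threshold toward classifying x as y.
        self.t += 0.5 if y == 0 else -0.5

def train(model, samples, labels, max_iters=100, min_consistent=None):
    """Iterate until enough predicted (second) labels agree with the
    annotated (first) labels, or the maximum iteration count is reached."""
    if min_consistent is None:
        min_consistent = len(samples)
    consistent = 0
    for it in range(1, max_iters + 1):
        consistent = 0
        for x, y in zip(samples, labels):
            if model.predict(x) == y:
                consistent += 1
            else:
                model.update(x, y)
        if consistent >= min_consistent:
            break
    return it, consistent
```

Either exit condition ends training: full agreement triggers an early break, otherwise the loop stops at `max_iters`, matching the preset conditions described above.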
When the deep learning model is the object detection model YOLOv3, the backbone network of YOLOv3 is Darknet-53, composed of 52 convolutional layers and a final fully connected layer. The YOLOv3 network structure uses only 1x1 and 3x3 convolution kernels, which greatly reduces the parameter count and amount of computation during model inference. YOLOv3 improves on YOLOv2 by adding five residual blocks (Residual) to the backbone, applying the identity-mapping principle of the residual network ResNet so that the deep network achieves the same performance as a shallower one and avoids the gradient explosion caused by overly deep layers.

The inference process of YOLOv3 uses predictions across scales, borrowing the idea of feature pyramid networks (FPN) to detect targets of different sizes at multiple scales: the finer the grid, the smaller the objects it can detect. YOLOv3 provides bounding boxes at three different sizes; for each target to be predicted, three prediction boxes of different sizes are produced, and probabilities are then computed for the prediction boxes to filter out the best-matching result. The system uses this idea to extract features of different sizes, forming a pyramid network. The last convolutional layer of the YOLOv3 network predicts a three-dimensional tensor encoding the prediction box, objectness and class. On the COCO dataset the tensor is N*N*[3*(4+1+80)]: 4 bounding-box offsets, 1 objectness prediction and 80 class predictions.

Figure 3 is a schematic diagram of the YOLOv3 backbone network Darknet-53 provided by an embodiment of the present invention. In Figure 3, Convolutional denotes a convolutional layer, Residual a residual block, Type the network layer type, Filters the convolution kernels included in a convolutional layer, Size the kernel size, Output the output size, Avgpool the average pooling layer, Connected the fully connected layer, Softmax the function used for numerical processing, and Global global pooling.
Figure 4 is an overall structure diagram of YOLOv3 provided by an embodiment of the present invention. As shown in Figure 4, the first row is the smallest-scale yolo layer: it takes a 13*13 feature map with 1024 channels as input; after a series of convolution operations the spatial size of the feature map is unchanged but the number of channels is finally reduced to 75, producing a 13*13 feature map with 75 channels on which classification and position regression are then performed.

The second row is the medium-scale yolo layer: the 13*13, 512-channel feature map from layer 79 is convolved to produce a 13*13, 256-channel feature map, which is then upsampled to a 26*26, 256-channel feature map and merged with the 26*26, 512-channel medium-scale feature map. After a series of convolution operations the spatial size is unchanged but the number of channels is finally reduced to 75, producing a 26*26 feature map with 75 channels on which classification and position regression are then performed.

The third row is the large-scale yolo layer: the 26*26, 512-channel feature map from layer 91 is convolved to produce a 26*26, 128-channel feature map, which is then upsampled to a 52*52, 128-channel feature map and merged with the 52*52, 256-channel feature map from layer 36. After a series of convolution operations the spatial size is unchanged but the number of channels is finally reduced to 75, producing a 52*52 feature map with 75 channels on which classification and position regression are then performed.
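The channel counts quoted above follow from the per-cell prediction layout: each grid cell predicts three boxes, and each box carries 4 offsets, 1 objectness score and one score per class. The inference that the 75-channel heads correspond to a 20-class dataset (versus 255 channels for the 80-class COCO tensor described earlier) is ours, not stated in the text.

```python
def yolo_head_channels(num_classes, boxes_per_cell=3):
    # Each grid cell predicts boxes_per_cell boxes; each box carries
    # 4 bounding-box offsets, 1 objectness score and num_classes scores.
    return boxes_per_cell * (4 + 1 + num_classes)

# 80 classes (COCO)  -> 3 * (4 + 1 + 80) = 255 channels
# 20 classes (VOC-style) -> 3 * (4 + 1 + 20) = 75 channels,
# matching the 13*13*75, 26*26*75 and 52*52*75 outputs described above.
```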
Figure 5 is a schematic diagram of DBL, a basic component of YOLOv3, provided by an embodiment of the present invention. As shown in Figure 5, DBL comprises a convolutional layer, BN (batch normalization) and Leaky ReLU; in YOLOv3, BN and Leaky ReLU are inseparable from the convolutional layer, and together they form the smallest component, DBL.

Figure 6 is a schematic diagram of Res_unit, a basic component of YOLOv3, provided by an embodiment of the present invention. As shown in Figure 6, a Res_unit comprises two basic DBL components and an add layer.

Figure 7 is a schematic diagram of Resblock_body, a basic component of YOLOv3, provided by an embodiment of the present invention. As shown in Figure 7, a Resblock_body comprises the basic components DBL, zero padding and res unit.
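A shape-level sketch of the components in Figures 5 and 6 may clarify why the add layer works. This follows the usual Darknet conventions (1x1/3x3 kernels with 'same' padding) and is an illustration, not the patent's code: a stride-1 DBL preserves spatial size, and a Res_unit leaves its input shape unchanged, which is exactly what the elementwise add requires.

```python
def dbl_shape(shape, out_channels, stride=1):
    """DBL = convolution + batch norm + Leaky ReLU. With 'same' padding,
    a stride-1 DBL keeps the spatial size; stride 2 halves it."""
    h, w, _ = shape
    return (h // stride, w // stride, out_channels)

def res_unit_shape(shape):
    """Res_unit: a 1x1 DBL halving the channels, a 3x3 DBL restoring
    them, then an elementwise add with the input (shape unchanged)."""
    h, w, c = shape
    s = dbl_shape(shape, c // 2)   # 1x1 DBL
    s = dbl_shape(s, c)            # 3x3 DBL
    assert s == shape              # add layer requires matching shapes
    return s
```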
Example 6:
On the basis of the above embodiments, Figure 8 is a schematic structural diagram of a vehicle speed determination apparatus provided by an embodiment of the present invention. The apparatus includes:

an identification module 801, configured to identify, based on a pre-trained deep learning model, a first detection area of a vehicle in a current frame image and a second candidate detection area of the vehicle in the next frame image adjacent to the current frame image;

a matching module 802, configured to predict a first prediction area of the vehicle in the next frame image according to the first detection area in the current frame image, and to determine, according to the similarity between the first prediction area and the second candidate detection area, a target second detection area matching the first prediction area;

a determining module 803, configured to determine the speed of the vehicle according to the first detection area, the target second detection area and the time difference between the two adjacent frame images.

Further, the determining module is specifically configured to: determine the pixel distance between the first detection area and the target second detection area according to the first coordinates of a vehicle preset position in the first detection area and the second coordinates of the vehicle preset position in the target second detection area, where the first detection area and the target second detection area have the same area range; determine the actual distance moved by the vehicle according to the pixel distance and the ratio of the pixel width of the first detection area to a preset width; and determine the speed of the vehicle according to the actual distance and the time difference between the two adjacent frame images.

Further, the matching module is specifically configured to predict the vehicle corresponding to the first detection area with a standard Kalman filter based on a constant-velocity model and a linear observation model, to determine the first prediction area of the first detection area in the next frame image.

Further, the matching module is further configured to: determine the Mahalanobis distance sum and the cosine distance sum between the first prediction area and the second candidate detection area according to each first pixel in the first prediction area and the corresponding second pixel in the second candidate detection area; determine the weighted distance sum of the first prediction area and the second candidate detection area according to the Mahalanobis distance sum, the cosine distance sum and their respective preset weights; and, if the weighted distance sum is less than the preset threshold, determine the second candidate detection area to be the target second detection area.

Further, the apparatus also includes:

a training module, specifically configured to: for any sample image in a sample set, obtain the sample image and first label information corresponding to the sample image, where the first label information identifies a set range area in the sample image that contains a vehicle; input the sample image into an original deep learning model and obtain second label information output for the sample image; and adjust the parameter values of the parameters of the original deep learning model according to the first label information and the second label information, to obtain the trained deep learning model.
Example 7:
Figure 9 is a schematic structural diagram of an electronic device provided by an embodiment of the present invention. On the basis of the above embodiments, an embodiment of the present invention further provides an electronic device including a processor 901, a communication interface 902, a memory 903 and a communication bus 904, where the processor 901, the communication interface 902 and the memory 903 communicate with one another via the communication bus 904;

the memory 903 stores a computer program which, when executed by the processor 901, causes the processor 901 to perform the following steps:

identifying, based on a pre-trained deep learning model, a first detection area of a vehicle in a current frame image and a second candidate detection area of the vehicle in the next frame image adjacent to the current frame image;

predicting a first prediction area of the vehicle in the next frame image according to the first detection area in the current frame image, and determining, according to the similarity between the first prediction area and the second candidate detection area, a target second detection area matching the first prediction area;

determining the speed of the vehicle according to the first detection area, the target second detection area and the time difference between the two adjacent frame images.
Further, the processor 901 is specifically configured such that determining the speed of the vehicle according to the first detection area, the target second detection area and the time difference between the two adjacent frame images includes:

determining the pixel distance between the first detection area and the target second detection area according to the first coordinates of a vehicle preset position in the first detection area and the second coordinates of the vehicle preset position in the target second detection area, where the first detection area and the target second detection area have the same area range;

determining the actual distance moved by the vehicle according to the pixel distance and the ratio of the pixel width of the first detection area to a preset width;

determining the speed of the vehicle according to the actual distance and the time difference between the two adjacent frame images.
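The three steps above can be sketched as follows. This is an illustrative sketch with assumed names; the preset width is the known real-world vehicle width, which fixes the metres-per-pixel scale of the detection box.

```python
import math

def vehicle_speed_mps(first_xy, second_xy, box_pixel_width,
                      preset_width_m, frame_dt_s):
    """Speed from the pixel displacement of a preset vehicle position
    between two adjacent frames, converted to metres using the ratio of
    the detection box's pixel width to the preset (real) vehicle width."""
    # Step 1: pixel distance between the two preset-position coordinates.
    pixel_dist = math.hypot(second_xy[0] - first_xy[0],
                            second_xy[1] - first_xy[1])
    # Step 2: actual distance moved, via the width ratio.
    metres_per_pixel = preset_width_m / box_pixel_width
    actual_dist_m = pixel_dist * metres_per_pixel
    # Step 3: speed from distance and the inter-frame time difference.
    return actual_dist_m / frame_dt_s
```

For example, a 50-pixel displacement with a 100-pixel-wide box, a 2 m preset vehicle width and a 0.04 s frame interval gives 25 m/s (90 km/h); these numbers are illustrative, not from the source.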
Further, the processor 901 is further configured such that predicting the first prediction area of the vehicle in the next frame image according to the first detection area in the current frame image includes:

predicting the vehicle corresponding to the first detection area with a standard Kalman filter based on a constant-velocity model and a linear observation model, to determine the first prediction area of the first detection area in the next frame image.
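The constant-velocity prediction step can be sketched with the standard Kalman predict equations. This is a deliberately minimal four-dimensional state [cx, cy, vx, vy]; trackers built on this idea typically carry a larger state including box size, so the state layout and noise value here are assumptions for illustration.

```python
import numpy as np

def kalman_predict(x, P, dt=1.0, q=1e-2):
    """Standard Kalman predict step under a constant-velocity motion model.
    x: state [cx, cy, vx, vy]; P: state covariance matrix."""
    F = np.array([[1.0, 0.0, dt,  0.0],
                  [0.0, 1.0, 0.0, dt ],
                  [0.0, 0.0, 1.0, 0.0],
                  [0.0, 0.0, 0.0, 1.0]])
    Q = q * np.eye(4)            # process noise (assumed isotropic)
    x_pred = F @ x               # position advances by velocity * dt
    P_pred = F @ P @ F.T + Q     # uncertainty grows between observations
    return x_pred, P_pred
```

The predicted position gives the centre of the first prediction area in the next frame; the update step against the matched detection (a linear observation of the position components) is omitted here.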
Further, the processor 901 is further configured such that determining, according to the similarity between the first prediction area and the second candidate detection area, the target second detection area matching the first prediction area includes:

determining the Mahalanobis distance sum and the cosine distance sum between the first prediction area and the second candidate detection area according to each first pixel in the first prediction area and the corresponding second pixel in the second candidate detection area;

determining the weighted distance sum of the first prediction area and the second candidate detection area according to the Mahalanobis distance sum, the cosine distance sum and their respective preset weights;

if the weighted distance sum is less than the preset threshold, determining the second candidate detection area to be the target second detection area.
Further, the processor 901 is also configured such that the training process of the deep learning model includes:

for any sample image in a sample set, obtaining the sample image and first label information corresponding to the sample image, where the first label information identifies a set range area in the sample image that contains a vehicle;

inputting the sample image into an original deep learning model and obtaining second label information output for the sample image;

adjusting the parameter values of the parameters of the original deep learning model according to the first label information and the second label information, to obtain the trained deep learning model.
The communication bus mentioned for the above electronic device may be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus, etc. The communication bus can be divided into an address bus, a data bus, a control bus and so on. For ease of presentation, only one thick line is used in the figure, but this does not mean there is only one bus or one type of bus.

The communication interface 902 is used for communication between the above electronic device and other devices.

The memory may include random access memory (RAM) and may also include non-volatile memory (NVM), for example at least one disk storage. Optionally, the memory may also be at least one storage apparatus located remotely from the aforementioned processor.

The above processor may be a general-purpose processor, including a central processing unit, a network processor (NP), etc.; it may also be a digital signal processor (DSP), an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
Example 8:
On the basis of the above embodiments, an embodiment of the present invention further provides a computer-readable storage medium storing a computer program which, when executed by a processor, performs the following steps:

identifying, based on a pre-trained deep learning model, a first detection area of a vehicle in a current frame image and a second candidate detection area of the vehicle in the next frame image adjacent to the current frame image;

predicting a first prediction area of the vehicle in the next frame image according to the first detection area in the current frame image, and determining, according to the similarity between the first prediction area and the second candidate detection area, a target second detection area matching the first prediction area;

determining the speed of the vehicle according to the first detection area, the target second detection area and the time difference between the two adjacent frame images.
Further, determining the speed of the vehicle according to the first detection area, the target second detection area and the time difference between the two adjacent frame images includes:

determining the pixel distance between the first detection area and the target second detection area according to the first coordinates of a vehicle preset position in the first detection area and the second coordinates of the vehicle preset position in the target second detection area, where the first detection area and the target second detection area have the same area range;

determining the actual distance moved by the vehicle according to the pixel distance and the ratio of the pixel width of the first detection area to a preset width;

determining the speed of the vehicle according to the actual distance and the time difference between the two adjacent frame images.

Further, predicting the first prediction area of the vehicle in the next frame image according to the first detection area in the current frame image includes:

predicting the vehicle corresponding to the first detection area with a standard Kalman filter based on a constant-velocity model and a linear observation model, to determine the first prediction area of the first detection area in the next frame image.

Further, determining, according to the similarity between the first prediction area and the second candidate detection area, the target second detection area matching the first prediction area includes:

determining the Mahalanobis distance sum and the cosine distance sum between the first prediction area and the second candidate detection area according to each first pixel in the first prediction area and the corresponding second pixel in the second candidate detection area;

determining the weighted distance sum of the first prediction area and the second candidate detection area according to the Mahalanobis distance sum, the cosine distance sum and their respective preset weights;

if the weighted distance sum is less than the preset threshold, determining the second candidate detection area to be the target second detection area.

Further, the training process of the deep learning model includes:

for any sample image in a sample set, obtaining the sample image and first label information corresponding to the sample image, where the first label information identifies a set range area in the sample image that contains a vehicle;

inputting the sample image into an original deep learning model and obtaining second label information output for the sample image;

adjusting the parameter values of the parameters of the original deep learning model according to the first label information and the second label information, to obtain the trained deep learning model.
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a system or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.

The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems) and computer program products according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor or other programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus that implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is executed on the computer or other programmable device to produce computer-implemented processing; the instructions executed on the computer or other programmable device thus provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

Although preferred embodiments of the present invention have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn the basic inventive concept. Therefore, the appended claims are intended to be construed as including the preferred embodiments and all changes and modifications falling within the scope of the present invention.

Obviously, those skilled in the art can make various changes and variations to the embodiments of the present invention without departing from the spirit and scope of the embodiments of the present invention. Thus, if these modifications and variations of the embodiments of the present invention fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include them.

Claims (10)

  1. A vehicle speed determination method, characterized in that the method comprises:
    identifying, based on a pre-trained deep learning model, a first detection area of a vehicle in a current frame image and second candidate detection areas of the vehicle in a next frame image adjacent to the current frame image;
    predicting, according to the first detection area in the current frame image, a first prediction area of the vehicle in the next frame image, and determining, according to the similarity between the first prediction area and the second candidate detection areas, a target second detection area that matches the first prediction area;
    determining the speed of the vehicle according to the first detection area, the target second detection area, and the time difference between the two adjacent frame images.
  2. The method according to claim 1, characterized in that determining the speed of the vehicle according to the first detection area, the target second detection area, and the time difference between the two adjacent frame images comprises:
    determining the pixel distance between the first detection area and the target second detection area according to first coordinates of a preset vehicle position in the first detection area and second coordinates of the preset vehicle position in the target second detection area, wherein the first detection area and the target second detection area have the same area range;
    determining the actual distance moved by the vehicle according to the pixel distance and the ratio of the pixel width of the first detection area to a preset width;
    determining the speed of the vehicle according to the actual distance and the time difference between the two adjacent frame images.
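The distance-and-speed computation recited in claim 2 can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the function name, the choice of the box's top-left corner as the "preset vehicle position", and all sample numbers are assumptions.

```python
# Sketch of claim 2: pixel displacement of a fixed reference point between two
# equal-sized detection boxes, scaled to meters via the ratio of an assumed
# real-world vehicle width to the box's pixel width.
import math

def vehicle_speed(first_box, second_box, real_width_m, frame_dt_s):
    """first_box / second_box: (x, y, w, h) in pixels, same size in both frames.
    real_width_m: assumed real-world width matching the box's pixel width.
    frame_dt_s: time difference between the adjacent frames, in seconds.
    Returns speed in meters per second."""
    x1, y1, w1, _ = first_box
    x2, y2, _, _ = second_box
    pixel_distance = math.hypot(x2 - x1, y2 - y1)  # distance between preset positions
    meters_per_pixel = real_width_m / w1           # scale: preset width / pixel width
    actual_distance = pixel_distance * meters_per_pixel
    return actual_distance / frame_dt_s

# e.g. a 1.8 m-wide vehicle whose 90 px box moves 100 px between frames 0.04 s apart
speed = vehicle_speed((100, 200, 90, 60), (160, 280, 90, 60), 1.8, 0.04)
```

With these sample numbers the displacement is 100 px, i.e. 2.0 m in 0.04 s.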
  3. The method according to claim 1, characterized in that predicting, according to the first detection area in the current frame image, the first prediction area of the vehicle in the next frame image comprises:
    predicting the vehicle corresponding to the first detection area using a standard Kalman filter based on a constant-velocity model and a linear observation model, and determining the first prediction area of the first detection area in the next frame image.
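The constant-velocity / linear-observation Kalman prediction recited in claim 3 can be sketched with the standard filter equations. The state layout (box center plus its velocity) and the noise covariances below are illustrative assumptions, not values from the patent.

```python
import numpy as np

# State: [x, y, vx, vy]; observation: [x, y]. Matrices follow the standard
# Kalman filter; Q and R are assumed noise values for illustration only.
dt = 1.0                                   # one frame step
F = np.array([[1, 0, dt, 0],               # constant-velocity state transition
              [0, 1, 0, dt],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)
H = np.array([[1, 0, 0, 0],                # linear observation model: we observe (x, y)
              [0, 1, 0, 0]], dtype=float)
Q = np.eye(4) * 0.01                       # process noise (assumed)
R = np.eye(2) * 1.0                        # measurement noise (assumed)

def predict(x, P):
    """Advance the state one frame: this yields the first prediction area's center."""
    return F @ x, F @ P @ F.T + Q

def update(x, P, z):
    """Correct the prediction with a matched detection z = [x, y]."""
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)         # Kalman gain
    return x + K @ (z - H @ x), (np.eye(4) - K @ H) @ P

# A detection centered at (100, 200) moving ~5 px/frame to the right:
x = np.array([100.0, 200.0, 5.0, 0.0])
P = np.eye(4)
x_pred, P_pred = predict(x, P)             # predicted center for the next frame
x_new, P_new = update(x_pred, P_pred, np.array([104.0, 201.0]))
```

The predicted center is simply the current center advanced by the estimated velocity, which is then matched against the candidate detections of the next frame.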
  4. The method according to claim 1, characterized in that determining, according to the similarity between the first prediction area and the second candidate detection areas, the target second detection area that matches the first prediction area comprises:
    determining a Mahalanobis distance sum and a cosine distance sum between the first prediction area and a second candidate detection area according to each first pixel point in the first prediction area and each corresponding second pixel point in the second candidate detection area;
    determining a weighted distance sum between the first prediction area and the second candidate detection area according to the Mahalanobis distance sum, the cosine distance sum, and their respective preset weights;
    if the weighted distance sum is less than a preset threshold, determining the second candidate detection area to be the target second detection area.
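The weighted-distance matching rule of claim 4 can be sketched as follows. The feature vectors standing in for the two areas, the 0.5/0.5 preset weights, the identity inverse covariance, and the threshold of 1.0 are all assumptions for illustration.

```python
import numpy as np

# Sketch of claim 4: combine a Mahalanobis term and a cosine term with preset
# weights, then accept a candidate whose weighted sum is below a threshold.

def mahalanobis(u, v, cov_inv):
    d = u - v
    return float(np.sqrt(d @ cov_inv @ d))

def cosine_distance(u, v):
    return 1.0 - float(u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

def match(pred_vec, cand_vec, cov_inv, w_maha=0.5, w_cos=0.5, threshold=1.0):
    """Return (is_match, weighted distance sum) for one candidate area."""
    score = w_maha * mahalanobis(pred_vec, cand_vec, cov_inv) \
          + w_cos * cosine_distance(pred_vec, cand_vec)
    return score < threshold, score

# An identical prediction/candidate pair yields a weighted sum of zero:
pred = np.array([1.0, 2.0])
cand = np.array([1.0, 2.0])
ok, score = match(pred, cand, np.eye(2))
```

In practice the candidate with the smallest weighted sum below the threshold would be taken as the target second detection area.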
  5. The method according to claim 1, characterized in that the training process of the deep learning model comprises:
    for any sample image in a sample set, obtaining the sample image and first label information corresponding to the sample image, wherein the first label information identifies a set range area in the sample image that contains a vehicle;
    inputting the sample image into an original deep learning model, and obtaining output second label information of the sample image;
    adjusting parameter values of the parameters of the original deep learning model according to the first label information and the second label information, to obtain the trained deep learning model.
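The training loop of claim 5 can be sketched with a toy linear model so that the parameter adjustment driven by the two labels is visible. The model, the squared-error loss, and the learning rate are assumptions; the patent's detector would use a full deep network in the same loop structure.

```python
import numpy as np

# Sketch of claim 5: the "first label" is the ground truth, the "second label"
# is the model's output, and parameters are adjusted from their discrepancy.
rng = np.random.default_rng(0)
W = rng.normal(size=(4,)) * 0.1            # "original model" parameters

def model(sample, W):
    return sample @ W                       # stand-in for the detector forward pass

def train_step(sample, first_label, W, lr=0.1):
    second_label = model(sample, W)                     # output label information
    grad = 2 * (second_label - first_label) * sample    # dL/dW for squared error
    return W - lr * grad                                # adjusted parameter values

sample = np.array([1.0, 0.5, -0.5, 2.0])   # stand-in for one sample image
first_label = 3.0                          # stand-in for its annotated label
for _ in range(200):
    W = train_step(sample, first_label, W)
```

After the loop the model's second label converges to the annotated first label, which is the stopping condition implied by "training completed".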
  6. A vehicle speed determination apparatus, characterized in that the apparatus comprises:
    an identification module, configured to identify, based on a pre-trained deep learning model, a first detection area of a vehicle in a current frame image and second candidate detection areas of the vehicle in a next frame image adjacent to the current frame image;
    a matching module, configured to predict, according to the first detection area in the current frame image, a first prediction area of the vehicle in the next frame image, and to determine, according to the similarity between the first prediction area and the second candidate detection areas, a target second detection area that matches the first prediction area;
    a determination module, configured to determine the speed of the vehicle according to the first detection area, the target second detection area, and the time difference between the two adjacent frame images.
  7. The apparatus according to claim 6, characterized in that the determination module is specifically configured to: determine the pixel distance between the first detection area and the target second detection area according to first coordinates of a preset vehicle position in the first detection area and second coordinates of the preset vehicle position in the target second detection area, wherein the first detection area and the target second detection area have the same area range; determine the actual distance moved by the vehicle according to the pixel distance and the ratio of the pixel width of the first detection area to a preset width; and determine the speed of the vehicle according to the actual distance and the time difference between the two adjacent frame images.
  8. The apparatus according to claim 6, characterized in that the matching module is specifically configured to predict the vehicle corresponding to the first detection area using a standard Kalman filter based on a constant-velocity model and a linear observation model, and to determine the first prediction area of the first detection area in the next frame image.
  9. The apparatus according to claim 6, characterized in that the matching module is further configured to: determine a Mahalanobis distance sum and a cosine distance sum between the first prediction area and a second candidate detection area according to each first pixel point in the first prediction area and each corresponding second pixel point in the second candidate detection area; determine a weighted distance sum between the first prediction area and the second candidate detection area according to the Mahalanobis distance sum, the cosine distance sum, and their respective preset weights; and if the weighted distance sum is less than a preset threshold, determine the second candidate detection area to be the target second detection area.
  10. The apparatus according to claim 6, characterized in that the apparatus further comprises:
    a training module, specifically configured to: for any sample image in a sample set, obtain the sample image and first label information corresponding to the sample image, wherein the first label information identifies a set range area in the sample image that contains a vehicle; input the sample image into an original deep learning model, and obtain output second label information of the sample image; and adjust parameter values of the parameters of the original deep learning model according to the first label information and the second label information, to obtain the trained deep learning model.
PCT/CN2021/088536 2021-04-15 2021-04-20 Vehicle speed determination method and apparatus, device, and medium WO2022217630A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110405781.2 2021-04-15
CN202110405781.2A CN113191353A (en) 2021-04-15 2021-04-15 Vehicle speed determination method, device, equipment and medium

Publications (1)

Publication Number Publication Date
WO2022217630A1

Family

ID=76977107

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/088536 WO2022217630A1 (en) 2021-04-15 2021-04-20 Vehicle speed determination method and apparatus, device, and medium

Country Status (2)

Country Link
CN (1) CN113191353A (en)
WO (1) WO2022217630A1 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114005095B (en) * 2021-10-29 2023-06-30 北京百度网讯科技有限公司 Vehicle attribute identification method, device, electronic equipment and medium
CN114627649B (en) * 2022-04-13 2022-12-27 北京魔门塔科技有限公司 Speed control model generation method, vehicle control method and device
CN114898585B (en) * 2022-04-20 2023-04-14 清华大学 Intersection multi-view-angle-based vehicle track prediction planning method and system
CN118015850B (en) * 2024-04-08 2024-06-28 云南省公路科学技术研究院 Multi-target vehicle speed synchronous estimation method, system, terminal and medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5761326A (en) * 1993-12-08 1998-06-02 Minnesota Mining And Manufacturing Company Method and apparatus for machine vision classification and tracking
CN107766821A (en) * 2017-10-23 2018-03-06 江苏鸿信***集成有限公司 All the period of time vehicle detecting and tracking method and system in video based on Kalman filtering and deep learning
CN108961315A (en) * 2018-08-01 2018-12-07 腾讯科技(深圳)有限公司 Method for tracking target, device, computer equipment and storage medium
WO2020149601A1 (en) * 2019-01-15 2020-07-23 포항공과대학교 산학협력단 Method and device for high-speed image recognition using 3d cnn
CN111523447A (en) * 2020-04-22 2020-08-11 北京邮电大学 Vehicle tracking method, device, electronic equipment and storage medium
CN111738032A (en) * 2019-03-24 2020-10-02 初速度(苏州)科技有限公司 Vehicle driving information determination method and device and vehicle-mounted terminal
US20210050095A1 (en) * 2018-12-13 2021-02-18 Christine I. Podilchuk Neural network-based object detection in visual input

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108446622A (en) * 2018-03-14 2018-08-24 海信集团有限公司 Detecting and tracking method and device, the terminal of target object
CN111127508B (en) * 2018-10-31 2023-05-02 杭州海康威视数字技术股份有限公司 Target tracking method and device based on video
CN111738033B (en) * 2019-03-24 2022-05-13 魔门塔(苏州)科技有限公司 Vehicle driving information determination method and device based on plane segmentation and vehicle-mounted terminal
CN110415277B (en) * 2019-07-24 2022-03-08 中国科学院自动化研究所 Multi-target tracking method, system and device based on optical flow and Kalman filtering

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116124499A (en) * 2022-11-25 2023-05-16 上海方酋机器人有限公司 Coal mining method, equipment and medium based on moving vehicle
CN116124499B (en) * 2022-11-25 2024-04-09 上海方酋机器人有限公司 Coal mining method, equipment and medium based on moving vehicle
CN116758732A (en) * 2023-05-18 2023-09-15 内蒙古工业大学 Intersection vehicle detection and bus priority passing method under fog computing environment

Also Published As

Publication number Publication date
CN113191353A (en) 2021-07-30

Similar Documents

Publication Publication Date Title
WO2022217630A1 (en) Vehicle speed determination method and apparatus, device, and medium
WO2022083402A1 (en) Obstacle detection method and apparatus, computer device, and storage medium
CN109087510B (en) Traffic monitoring method and device
US20230144209A1 (en) Lane line detection method and related device
US10078790B2 (en) Systems for generating parking maps and methods thereof
JP2020052694A (en) Object detection apparatus, object detection method, and computer program for object detection
US9361702B2 (en) Image detection method and device
US20210149408A1 (en) Generating Depth From Camera Images and Known Depth Data Using Neural Networks
US11776277B2 (en) Apparatus, method, and computer program for identifying state of object, and controller
Zhao et al. Automated traffic surveillance system with aerial camera arrays imagery: Macroscopic data collection with vehicle tracking
US20220129685A1 (en) System and Method for Determining Object Characteristics in Real-time
CN114495064A (en) Monocular depth estimation-based vehicle surrounding obstacle early warning method
Tak et al. Development of AI‐Based Vehicle Detection and Tracking System for C‐ITS Application
CN115063454A (en) Multi-target tracking matching method, device, terminal and storage medium
CN115546705A (en) Target identification method, terminal device and storage medium
Anandhalli et al. Image projection method for vehicle speed estimation model in video system
Nesti et al. Ultra-sonic sensor based object detection for autonomous vehicles
Sharma et al. Deep Learning-Based Object Detection and Classification for Autonomous Vehicles in Different Weather Scenarios of Quebec, Canada
Miles et al. Camera-based system for the automatic detection of vehicle axle count and speed using convolutional neural networks
CN111353481A (en) Road obstacle identification method based on laser point cloud and video image
CN116665179A (en) Data processing method, device, domain controller and storage medium
CN113887455B (en) Face mask detection system and method based on improved FCOS
CN114638947A (en) Data labeling method and device, electronic equipment and storage medium
Naresh et al. Real Time Vehicle Tracking using YOLO Algorithm
Sekhar et al. Vehicle Tracking and Speed Estimation Using Deep Sort

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21936504

Country of ref document: EP

Kind code of ref document: A1