CN114998815A - Traffic vehicle identification tracking method and system based on video analysis - Google Patents


Info

Publication number
CN114998815A
CN114998815A (application CN202210931367.XA; granted as CN114998815B)
Authority
CN
China
Prior art keywords
image
license plate
video
vehicle
video frame
Prior art date
Legal status
Granted
Application number
CN202210931367.XA
Other languages
Chinese (zh)
Other versions
CN114998815B (en)
Inventor
岳建明
杨睿
杨冬俊
Current Assignee
Jiangsu Sanleng Smartcity&iot System Co ltd
Original Assignee
Jiangsu Sanleng Smartcity&iot System Co ltd
Priority date
Filing date
Publication date
Application filed by Jiangsu Sanleng Smartcity&iot System Co ltd
Priority to CN202210931367.XA
Publication of CN114998815A
Application granted
Publication of CN114998815B
Legal status: Active

Classifications

    • G06V20/46: Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G06V10/26: Segmentation of patterns in the image field; cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; detection of occlusion
    • G06V10/764: Image or video recognition using classification, e.g. of video objects
    • G06V10/774: Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V10/80: Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V20/58: Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • G06V20/625: License plates
    • G06V2201/08: Detecting or categorising vehicles
    • Y02T10/40: Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a traffic vehicle identification and tracking method and system based on video analysis, wherein the method comprises the following steps: step 1, extracting a first video and a second video, wherein the first video and the second video have space-time relevance; step 2, performing vehicle identification on the first video; step 3, performing vehicle identification on the second video; and step 4, marking the successfully identified vehicles with the same license plate on a map, connecting the marked points on the map according to the space-time relevance of the first and second videos to obtain the vehicle travel track, and completing vehicle tracking according to that track. By combining an improved Swin Transformer deep learning model with an AlexNet model trained under the MapReduce framework, the method improves the efficiency of vehicle detection, identification and tracking.

Description

Traffic vehicle identification tracking method and system based on video analysis
Technical Field
The invention relates to the technical field of image processing, and in particular to a traffic vehicle identification and tracking method and system based on video analysis.
Background
In conventional vehicle recognition technology, vehicle-related image data are collected and then statistically analyzed and processed by a single trained model, and most such methods track the vehicle by recognizing the whole license plate image. For example, CN107273896A discloses a license plate detection and recognition method based on image recognition whose recognition process comprises moving-vehicle detection, image preprocessing, license plate positioning, character segmentation and character recognition. In that method, vehicle detection uses a background-difference method in which the background is updated with a Gaussian mixture background model; image binarization preprocessing uses an edge-detection binarization method; license plate positioning uses a method based on edge detection and prior knowledge; and the recognition of all license plate characters is completed with a BP neural network. However, that method uses a single model and cannot overcome the defects inherent in it; in addition, directly detecting the whole license plate greatly reduces efficiency, so the vehicle cannot be detected and tracked quickly and in real time, and the accuracy is not high.
Disclosure of Invention
To overcome the defects of conventional vehicle recognition and tracking technology, namely reliance on a single model, low recognition efficiency, low accuracy and poor real-time performance, the invention provides a traffic vehicle recognition and tracking method and system based on video analysis.
Technical scheme: to achieve the above purpose, the invention adopts the following technical scheme:
a traffic vehicle identification tracking method based on video analysis comprises the following steps:
Step 1, extracting a first video and a second video, wherein the first video and the second video have space-time relevance.
Step 2, performing vehicle identification on the first video, comprising the following steps:
Step 2.1, acquiring video frame images of the first video at a rate of 1 frame per second and normalizing them to obtain first video frame images;
Step 2.2, automatically recognizing the first video frame image from step 2.1 with an improved Swin Transformer deep learning model to obtain grayscale image 1 containing the vehicle reference area;
Step 2.3, constructing an image segmentation and fusion model, and using it to rapidly partition and segment grayscale image 1 of the vehicle reference region to obtain license plate reference grayscale image 1 of the video image;
Step 2.4, training an AlexNet model under the MapReduce framework to generate an AlexNet for image set classification; recognizing and classifying the license plate reference images 1 corresponding to the partition images of license plate reference grayscale image 1 with the trained AlexNet model in combination with MapReduce to obtain license plate targets, and marking the license plate targets with symbols in the first video frame image; license plate reference image 1 is the image at the position of license plate reference grayscale image 1 in the original first video frame;
Step 2.5, when the symbol mark of a license plate target appears continuously at the same or a nearby position in the first video stream for a preset number of frames, the vehicle is successfully identified.
Step 3, performing vehicle identification on the second video, comprising the following steps:
Step 3.1, acquiring video frame images of the second video at a rate of 1 frame per second and normalizing them to obtain second video frame images;
Step 3.2, automatically recognizing the second video frame image with the improved Swin Transformer deep learning model to obtain grayscale image 2 containing the vehicle reference area;
Step 3.3, constructing an image segmentation and fusion model, and using it to rapidly partition and segment grayscale image 2 of the vehicle reference region to obtain license plate reference grayscale image 2 of the video image;
Step 3.4, training an AlexNet model under the MapReduce framework to generate an AlexNet for image set classification; recognizing and classifying the license plate reference images 2 corresponding to the partition images of license plate reference grayscale image 2 with the trained AlexNet model in combination with MapReduce to obtain license plate targets, and marking the license plate targets with symbols in the second video frame image; license plate reference image 2 is the image at the position of license plate reference grayscale image 2 in the original second video frame;
Step 3.5, when the symbol mark of a license plate target appears continuously at the same or a nearby position in the second video stream for a preset number of frames, the vehicle is successfully identified.
Step 4, marking the vehicles with the same license plate successfully identified in steps 2.5 and 3.5 on a map, connecting the marked points on the map according to the space-time relevance of the first and second videos to obtain the vehicle travel track, and completing vehicle tracking according to that track.
Further, the improved Swin Transformer deep learning model is specifically as follows: an attention mechanism module is introduced into the Swin Transformer, and a multiscale mixed convolution is introduced into the PatchEmbed of the Swin Transformer.
Further, in step 2.2, automatically recognizing the first video frame image from step 2.1 with the improved Swin Transformer deep learning model to obtain grayscale image 1 containing the vehicle reference region specifically comprises: inputting the normalized first video frame image into the improved Swin Transformer deep learning model for detection to obtain the image of the vehicle reference region in the video frame, and then binarizing it, with the pixel values of the vehicle region set to 1 and the pixel values of the non-vehicle region set to 0 in the binary image.
In step 3.2, automatically recognizing the second video frame image with the improved Swin Transformer deep learning model to obtain grayscale image 2 containing the vehicle reference region specifically comprises: inputting the normalized second video frame image into the improved Swin Transformer deep learning model for detection to obtain the image of the vehicle reference region in the video frame, and then binarizing it, with the pixel values of the vehicle region set to 1 and the pixel values of the non-vehicle region set to 0 in the binary image.
Further, the spatiotemporal relevance specifically includes a temporal order and a spatial position relationship.
Further, the image segmentation and fusion model consists of a VGG network and a U-Net, wherein the VGG network consists of layer1 through layer5 of VGG16.
Further, the loss function Loss of the image segmentation and fusion model is customized as follows:
(the formula is given only as an image in the original publication)
The optimal network parameters are obtained by minimizing this loss function, where pred denotes the set of predicted values, true denotes the set of true values, α and γ are adjustment coefficients with α = 0.5, y denotes the label, n denotes the number of categories, y_i = 1 if the sample belongs to category i and y_i = 0 otherwise, p_i denotes the probability output for category i, and L_1 denotes the mean square error.
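The formula itself survives only as an unreproduced image. Based solely on the variable definitions above, one plausible reading, offered purely as an assumption, combines a focal-loss-style classification term (with adjustment coefficients α and γ) and the mean-square-error term L_1:

```latex
\mathrm{Loss} = -\alpha \sum_{i=1}^{n} y_i \,(1 - p_i)^{\gamma} \log p_i \;+\; L_1,
\qquad
L_1 = \frac{1}{n}\sum_{i=1}^{n}\bigl(\mathrm{pred}_i - \mathrm{true}_i\bigr)^2 .
```

The exact form used by the inventors cannot be recovered from the text; this sketch only matches the stated roles of α, γ, y_i, p_i and L_1.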
Further, recognizing and classifying the license plate reference images 1 corresponding to the partition images of license plate reference grayscale image 1 with the trained AlexNet model in combination with MapReduce to obtain license plate targets, and marking the license plate targets with symbols in the first video frame image, specifically comprises: finding the license plate reference images 1 at the positions of the partition images of license plate reference grayscale image 1 in the first video frame, normalizing all license plate reference images 1 and inputting them into the AlexNet model, recognizing all license plate target positions, and marking all license plate target positions with symbols in the first video frame image.
Recognizing and classifying the license plate reference images 2 corresponding to the partition images of license plate reference grayscale image 2 with the trained AlexNet model in combination with MapReduce to obtain license plate targets, and marking the license plate targets with symbols in the second video frame image, specifically comprises: finding the license plate reference images 2 at the positions of the partition images of license plate reference grayscale image 2 in the second video frame, normalizing all license plate reference images 2 and inputting them into the AlexNet model, recognizing all license plate target positions, and marking all license plate target positions with symbols in the second video frame image.
Based on the same inventive concept, the invention discloses a traffic vehicle identification and tracking system based on video analysis, which is used to implement the above traffic vehicle identification and tracking method based on video analysis and specifically comprises:
The extraction module is used for extracting a first video and a second video, the first video and the second video having space-time relevance.
The identification module 1 is used for performing vehicle identification on the first video, and comprises:
The acquisition module 1 is used for acquiring video frame images of the first video at a rate of 1 frame per second and normalizing them to obtain first video frame images.
The improvement module 1 is used for automatically recognizing the first video frame image obtained by the acquisition module 1 with the improved Swin Transformer deep learning model to obtain grayscale image 1 containing the vehicle reference area.
The segmentation module 1 is used for constructing an image segmentation and fusion model and using it to rapidly partition and segment grayscale image 1 of the vehicle reference region to obtain license plate reference grayscale image 1 of the video image.
The classification module 1 is used for training an AlexNet model under the MapReduce framework to generate an AlexNet for image set classification, recognizing and classifying the license plate reference images 1 corresponding to the partition images of license plate reference grayscale image 1 with the trained AlexNet model in combination with MapReduce to obtain license plate targets, and marking the license plate targets with symbols in the first video frame image; license plate reference image 1 is the image at the position of license plate reference grayscale image 1 in the original first video frame.
The detection module 1 is used for determining that a vehicle is successfully identified when the symbol mark of a license plate target appears continuously at the same or a nearby position in the first video stream for a preset number of frames.
The identification module 2 is used for performing vehicle identification on the second video, and comprises:
The acquisition module 2 is used for acquiring video frame images of the second video at a rate of 1 frame per second and normalizing them to obtain second video frame images.
The improvement module 2 is used for automatically recognizing the second video frame image with the improved Swin Transformer deep learning model to obtain grayscale image 2 containing the vehicle reference area.
The segmentation module 2 is used for constructing an image segmentation and fusion model and using it to rapidly partition and segment grayscale image 2 of the vehicle reference region to obtain license plate reference grayscale image 2 of the video image.
The classification module 2 is used for training an AlexNet model under the MapReduce framework to generate an AlexNet for image set classification, recognizing and classifying the license plate reference images 2 corresponding to the partition images of license plate reference grayscale image 2 with the trained AlexNet model in combination with MapReduce to obtain license plate targets, and marking the license plate targets with symbols in the second video frame image; license plate reference image 2 is the image at the position of license plate reference grayscale image 2 in the original second video frame.
The detection module 2 is used for determining that a vehicle is successfully identified when the symbol mark of a license plate target appears continuously at the same or a nearby position in the second video stream for a preset number of frames.
The tracking module is used for marking the vehicles with the same license plate successfully identified by the detection modules on a map, connecting the marked points on the map according to the space-time relevance of the first and second videos to obtain the vehicle travel track, and completing vehicle tracking according to that track.
Based on the same inventive concept, the invention discloses a traffic vehicle identification and tracking system based on video analysis, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the computer program, when loaded into the processor, implements the above traffic vehicle identification and tracking method based on video analysis.
Compared with the prior art, the invention has the beneficial effects that:
1. The method creatively utilizes the Swin Transformer deep learning model, which enlarges the receptive field and network depth, suppresses interfering background information and extracts richer feature information, thereby enhancing visual representation capability and improving recognition speed and accuracy.
2. By fusing the improved Swin Transformer deep learning model with the AlexNet model, the defects of interference and of relying on a single model are effectively overcome.
3. Because the Swin Transformer is capable of high throughput and large-scale parallel processing, and the MapReduce framework provides distributed computing capability, combining the two technologies enables distributed parallel processing of images at every stage of image processing, greatly improving the efficiency of vehicle/license plate detection, identification and tracking.
4. The grayscale images of the vehicle reference regions are partitioned by the image segmentation and fusion model, and the license plate reference images corresponding to the partition images are then recognized and classified by the AlexNet model, thereby achieving vehicle detection.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
Fig. 2 is a schematic diagram of an image segmentation fusion model.
Detailed Description
An embodiment of the present invention is described in detail below with reference to the accompanying drawings, but it should be understood that the scope of the present invention is not limited to this embodiment.
As shown in fig. 1-2, this embodiment provides a traffic vehicle identification and tracking method based on video analysis, which comprises the following steps:
step 1, extracting a first video and a second video, wherein the first video and the second video have space-time relevance.
Specifically, the first video and the second video are both road surveillance videos, which can be cut to appropriate lengths to facilitate subsequent processing and analysis; the space-time relevance specifically includes temporal order and spatial position relationship.
Step 2, performing vehicle identification on the first video, comprising the following steps:
Step 2.1, acquiring video frame images of the first video at a rate of 1 frame per second and normalizing them to obtain first video frame images;
Specifically, the method further includes preprocessing the acquired video frame images of the first video before the normalization, and the preprocessing may specifically include Gaussian filtering.
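As a minimal sketch of this acquisition step, assuming frames arrive as 8-bit arrays (the Gaussian filtering itself would typically use a library routine such as OpenCV's `GaussianBlur`, omitted here to keep the sketch self-contained):

```python
import numpy as np

def sample_one_per_second(frames: np.ndarray, fps: int) -> np.ndarray:
    """Keep one frame per second of video, matching the method's
    acquisition of video frame images at a rate of 1 frame per second."""
    return frames[::fps]

def normalize_frame(frame: np.ndarray) -> np.ndarray:
    """Scale 8-bit pixel values into [0, 1] before the detector sees them."""
    return frame.astype(np.float32) / 255.0
```

For example, a 4-second clip at 25 fps yields 4 sampled frames, each then normalized.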
Step 2.2, automatically recognizing the first video frame image from step 2.1 with an improved Swin Transformer deep learning model to obtain grayscale image 1 containing the vehicle reference area;
Specifically, the improved Swin Transformer deep learning model is as follows: channel attention modules are added to the MSA and MLP layers of the Swin Transformer, and a multiscale mixed convolution is introduced into the PatchEmbed of the Swin Transformer.
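The patent does not give the attention module's internal structure. As an illustrative sketch only, a squeeze-and-excitation style channel attention of the kind commonly attached to MSA/MLP outputs can be written in plain NumPy; the weights `w1` and `w2` stand in for learned parameters and are assumptions of this sketch:

```python
import numpy as np

def channel_attention(feat: np.ndarray, w1: np.ndarray, w2: np.ndarray) -> np.ndarray:
    """Squeeze-and-excitation style gate over a (C, H, W) feature map:
    global-average-pool each channel, pass the C-vector through a small
    two-layer network, squash to (0, 1) with a sigmoid, and rescale channels."""
    c = feat.shape[0]
    squeeze = feat.reshape(c, -1).mean(axis=1)      # (C,) channel descriptors
    hidden = np.maximum(w1 @ squeeze, 0.0)          # ReLU bottleneck
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))     # per-channel weight in (0, 1)
    return feat * gate[:, None, None]               # reweight the channels
```

In the same spirit, the multiscale mixed convolution in PatchEmbed would run parallel convolutions with different kernel sizes and merge their outputs before tokenization.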
Step 2.3, constructing an image segmentation and fusion model, and using it to rapidly partition and segment grayscale image 1 of the vehicle reference region to obtain license plate reference grayscale image 1 of the video image;
Specifically, the image segmentation and fusion model is composed of a VGG network and a U-Net, wherein the VGG network is composed of layer1 through layer5 of VGG16.
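The text gives only the composition (VGG16's layer1 through layer5 as the encoder, U-Net as the decoder). A shapes-only NumPy sketch of the encoder/decoder data flow, with pooling and nearest-neighbour upsampling standing in for the convolutional blocks, illustrates how skip connections fuse encoder features back in:

```python
import numpy as np

def max_pool2(x: np.ndarray) -> np.ndarray:
    """Halve spatial resolution, as the pooling between VGG16 blocks does."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def upsample2(x: np.ndarray) -> np.ndarray:
    """Nearest-neighbour upsampling, the U-Net decoder's expansion step."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def unet_flow(img: np.ndarray, depth: int = 2) -> np.ndarray:
    """Encoder stores a skip at each scale; decoder upsamples and fuses it.
    Averaging stands in for convolution over the concatenated feature maps."""
    skips, x = [], img
    for _ in range(depth):         # contracting path (VGG16 encoder blocks)
        skips.append(x)
        x = max_pool2(x)
    for skip in reversed(skips):   # expanding path with skip connections
        x = (upsample2(x) + skip) / 2.0
    return x
```

The output keeps the input resolution, as required for a per-pixel segmentation of the license plate region.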
Step 2.4, training an AlexNet model under the MapReduce framework to generate an AlexNet for image set classification; recognizing and classifying the license plate reference images 1 corresponding to the partition images of license plate reference grayscale image 1 with the trained AlexNet model in combination with MapReduce to obtain license plate targets, and marking the license plate targets with symbols in the first video frame image; license plate reference image 1 is the image at the position of license plate reference grayscale image 1 in the original first video frame.
Specifically, MapReduce, as a parallel program design model and method, provides a simple approach to parallel programming: the two functions Map and Reduce realize the basic parallel computing tasks, and an abstract operation and parallel programming interface is provided so that programming and computation over large-scale data can be completed simply and conveniently. The distributed computing capability provided by the MapReduce framework is fully exploited to improve the efficiency of vehicle/license plate detection, identification and tracking.
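As a toy illustration of the Map/Reduce split described here (a real deployment would distribute the trained AlexNet over Hadoop-style workers; `classify` below is a hypothetical stand-in for that model):

```python
from collections import Counter
from functools import reduce

def map_phase(partition_images):
    """Map: each worker labels its share of partition images independently."""
    def classify(img):
        # hypothetical stand-in for the trained AlexNet classifier
        return "license_plate" if img.get("has_plate_chars") else "background"
    return [(classify(img), 1) for img in partition_images]

def reduce_phase(mapped_pairs):
    """Reduce: merge the (label, count) pairs emitted by all mappers."""
    def merge(acc, pair):
        acc[pair[0]] += pair[1]
        return acc
    return reduce(merge, mapped_pairs, Counter())
```

Splitting the partition images across several mappers and reducing their outputs gives the same totals as a single pass, which is what makes the distribution transparent to the rest of the pipeline.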
Step 2.5, when the symbol mark of a license plate target appears continuously at the same or a nearby position in the first video stream for a preset number of frames, the vehicle is successfully identified.
Specifically, if the same symbol mark appears in 20 consecutive frame images, the vehicle identification is successful.
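This confirmation rule can be sketched as a simple run-length check over the per-frame detections (20 frames is the value this embodiment uses; the spatial tolerance for "nearby" positions is omitted here):

```python
def vehicle_confirmed(detections, plate, threshold=20):
    """True once the same plate symbol has appeared in `threshold`
    consecutive frames; any interruption resets the run."""
    run = 0
    for detected_plate in detections:
        run = run + 1 if detected_plate == plate else 0
        if run >= threshold:
            return True
    return False
```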
Step 3, performing vehicle identification on the second video, comprising the following steps:
Step 3.1, acquiring video frame images of the second video at a rate of 1 frame per second and normalizing them to obtain second video frame images;
Specifically, the method further includes preprocessing the acquired video frame images of the second video before the normalization, and the preprocessing may specifically include Gaussian filtering.
Step 3.2, automatically recognizing the second video frame image with the improved Swin Transformer deep learning model to obtain grayscale image 2 containing the vehicle reference area;
Step 3.3, constructing an image segmentation and fusion model, and using it to rapidly partition and segment grayscale image 2 of the vehicle reference region to obtain license plate reference grayscale image 2 of the video image;
Step 3.4, training an AlexNet model under the MapReduce framework to generate an AlexNet for image set classification; recognizing and classifying the license plate reference images 2 corresponding to the partition images of license plate reference grayscale image 2 with the trained AlexNet model in combination with MapReduce to obtain license plate targets, and marking the license plate targets with symbols in the second video frame image; license plate reference image 2 is the image at the position of license plate reference grayscale image 2 in the original second video frame;
Step 3.5, when the symbol mark of a license plate target appears continuously at the same or a nearby position in the second video stream for a preset number of frames, the vehicle is successfully identified.
Specifically, if the same symbol mark appears in 20 consecutive frame images, the vehicle identification is successful.
Step 4, marking the vehicles with the same license plate successfully identified in steps 2.5 and 3.5 on a map, connecting the marked points on the map according to the space-time relevance of the first and second videos to obtain the vehicle travel track, and completing vehicle tracking according to that track.
Specifically, the successfully identified vehicle positions are marked on a Baidu map or a Gaode map, and the marked points of the vehicle are connected in sequence based on temporal order and spatial position relationship to obtain the vehicle travel track; the vehicle is then tracked according to that track.
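A minimal sketch of this track-building step, assuming each successful identification yields a (timestamp, latitude, longitude) sighting of the plate (the map API call itself is omitted):

```python
def build_track(sightings):
    """Order one plate's map marks by time and return the polyline of
    (lat, lon) points; consecutive point pairs are the segments drawn on the map."""
    ordered = sorted(sightings, key=lambda s: s[0])      # sort by timestamp
    points = [(lat, lon) for _, lat, lon in ordered]     # marked points in time order
    segments = list(zip(points, points[1:]))             # connected track segments
    return points, segments
```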
Based on the same inventive concept, an embodiment of the invention discloses a traffic vehicle identification and tracking system based on video analysis, used to implement the above traffic vehicle identification and tracking method based on video analysis. The system specifically comprises:
an extraction module, configured to extract a first video and a second video, the first video and the second video having spatio-temporal correlation;
an identification module 1, configured to perform vehicle identification on the first video, comprising:
an acquisition module 1, configured to acquire video frame images of the first video at a frame rate of 1 frame per second and normalize them to obtain the first video frame image;
an improvement module 1, configured to automatically recognize the first video frame image with an improved Swin Transformer deep learning model to obtain grayscale image 1, which contains the vehicle reference region;
a segmentation module 1, configured to construct an image segmentation and fusion model and use it to rapidly partition and segment grayscale image 1 of the vehicle reference region, obtaining license plate reference grayscale image 1 of the video image;
a classification module 1, configured to train an AlexNet model under a MapReduce framework to generate an AlexNet for image set classification; to recognize and classify, using MapReduce combined with the trained AlexNet model, the license plate reference images 1 corresponding to the partition images in license plate reference grayscale image 1, obtaining license plate targets; and to mark the license plate targets with symbols in the first video frame image, where license plate reference image 1 is the image at the position of license plate reference grayscale image 1 in the original first video frame;
a detection module 1, configured to judge the vehicle as successfully identified when the symbol mark of a license plate target has appeared continuously at the same or a nearby position for a preset number of frames of the first video stream;
an identification module 2, configured to perform vehicle identification on the second video, comprising:
an acquisition module 2, configured to acquire video frame images of the second video at a frame rate of 1 frame per second and normalize them to obtain the second video frame image;
an improvement module 2, configured to automatically recognize the second video frame image with the improved Swin Transformer deep learning model to obtain grayscale image 2, which contains the vehicle reference region;
a segmentation module 2, configured to construct an image segmentation and fusion model and use it to rapidly partition and segment grayscale image 2 of the vehicle reference region, obtaining license plate reference grayscale image 2 of the video image;
a classification module 2, configured to train an AlexNet model under a MapReduce framework to generate an AlexNet for image set classification; to recognize and classify, using MapReduce combined with the trained AlexNet model, the license plate reference images 2 corresponding to the partition images in license plate reference grayscale image 2, obtaining license plate targets; and to mark the license plate targets with symbols in the second video frame image, where license plate reference image 2 is the image at the position of license plate reference grayscale image 2 in the original second video frame;
a detection module 2, configured to judge the vehicle as successfully identified when the symbol mark of a license plate target has appeared continuously at the same or a nearby position for a preset number of frames of the second video stream;
and a tracking module, configured to mark the vehicle bearing the same license plate, as successfully identified by detection module 1 and detection module 2, on a map, connect the marker points on the map according to the spatio-temporal correlation of the first video and the second video to obtain the vehicle travel track, and complete vehicle tracking according to the vehicle travel track.
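The persistence rule applied by the detection modules can be sketched as follows. This is a minimal illustration assuming pixel-space plate-mark centers per sampled frame; the patent fixes neither the distance measure nor the threshold values, so `min_frames` and `max_jump` are placeholders:

```python
import math

def vehicle_confirmed(plate_centers, min_frames=5, max_jump=20.0):
    """plate_centers: per-frame (x, y) of the license plate's symbol mark,
    or None for frames where no mark was produced.
    Returns True once the mark has appeared at the same or a nearby
    position (within max_jump pixels) for min_frames consecutive frames."""
    run, prev = 0, None
    for center in plate_centers:
        if center is None:
            run, prev = 0, None  # mark disappeared: restart the count
            continue
        if prev is None or math.hypot(center[0] - prev[0], center[1] - prev[1]) <= max_jump:
            run += 1
        else:
            run = 1  # plate jumped too far to be "the same or nearby"
        if run >= min_frames:
            return True
        prev = center
    return False
```

For example, five steady sightings confirm the vehicle, while a gap or a large positional jump restarts the count.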
Based on the same inventive concept, an embodiment of the invention discloses a traffic vehicle identification and tracking system based on video analysis, comprising a memory, a processor, and a computer program stored in the memory and runnable on the processor; when loaded into the processor, the computer program implements the traffic vehicle identification and tracking method based on video analysis.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that it may be embodied in other specific forms without departing from its spirit or essential attributes. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description; all changes which come within the meaning and range of equivalency of the claims are intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.
Furthermore, although this description is presented in terms of embodiments, not every embodiment contains only a single independent technical solution; the description is written this way merely for clarity. Those skilled in the art should treat the description as a whole, and the embodiments may be combined appropriately to form further embodiments they would understand.

Claims (10)

1. A traffic vehicle identification and tracking method based on video analysis, characterized by comprising the following steps:
step 1, extracting a first video and a second video, wherein the first video and the second video have spatio-temporal correlation;
step 2, performing vehicle identification on the first video, comprising:
step 2.1, acquiring video frame images of the first video at a frame rate of 1 frame per second, and normalizing them to obtain the first video frame image;
step 2.2, automatically recognizing the first video frame image of step 2.1 with an improved Swin Transformer deep learning model to obtain grayscale image 1, which contains the vehicle reference region;
step 2.3, constructing an image segmentation and fusion model, and using it to rapidly partition and segment grayscale image 1 of the vehicle reference region, obtaining license plate reference grayscale image 1 of the video image;
step 2.4, training an AlexNet model under a MapReduce framework to generate an AlexNet for image set classification; recognizing and classifying, using MapReduce combined with the trained AlexNet model, the license plate reference images 1 corresponding to the partition images in license plate reference grayscale image 1 to obtain license plate targets, and marking the license plate targets with symbols in the first video frame image, where license plate reference image 1 is the image at the position of license plate reference grayscale image 1 in the original first video frame;
step 2.5, when the symbol mark of a license plate target has appeared continuously at the same or a nearby position for a preset number of frames of the first video stream, judging the vehicle as successfully identified;
step 3, performing vehicle identification on the second video, comprising:
step 3.1, acquiring video frame images of the second video at a frame rate of 1 frame per second, and normalizing them to obtain the second video frame image;
step 3.2, automatically recognizing the second video frame image with the improved Swin Transformer deep learning model to obtain grayscale image 2, which contains the vehicle reference region;
step 3.3, constructing an image segmentation and fusion model, and using it to rapidly partition and segment grayscale image 2 of the vehicle reference region, obtaining license plate reference grayscale image 2 of the video image;
step 3.4, training an AlexNet model under a MapReduce framework to generate an AlexNet for image set classification; recognizing and classifying, using MapReduce combined with the trained AlexNet model, the license plate reference images 2 corresponding to the partition images in license plate reference grayscale image 2 to obtain license plate targets, and marking the license plate targets with symbols in the second video frame image, where license plate reference image 2 is the image at the position of license plate reference grayscale image 2 in the original second video frame;
step 3.5, when the symbol mark of a license plate target has appeared continuously at the same or a nearby position for a preset number of frames of the second video stream, judging the vehicle as successfully identified;
step 4, marking the vehicle bearing the same license plate, as successfully identified in step 2.5 and step 3.5, on a map, connecting the marker points on the map according to the spatio-temporal correlation of the first video and the second video to obtain the vehicle travel track, and completing vehicle tracking according to the vehicle travel track.
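Step 2.1 / step 3.1 can be sketched as below. This is a minimal illustration assuming a known native frame rate and min-max normalization; the claim specifies neither the frame-selection rule nor the normalization scheme, so both are assumptions:

```python
def sample_indices(total_frames, native_fps, sample_fps=1):
    """Indices of the frames to grab so that roughly sample_fps frames
    per second are taken from a native_fps video stream."""
    step = max(1, round(native_fps / sample_fps))
    return list(range(0, total_frames, step))

def normalize(pixels):
    """Min-max normalize a flat list of pixel values into [0, 1]
    (one common reading of 'normalization'; the patent does not say)."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        return [0.0] * len(pixels)
    return [(p - lo) / (hi - lo) for p in pixels]
```

For a 25 fps stream sampled at 1 fps, every 25th frame is taken; each sampled frame's pixels are then rescaled before being fed to the detector.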
2. The traffic vehicle identification and tracking method based on video analysis according to claim 1, characterized in that the improved Swin Transformer deep learning model is obtained by introducing an attention mechanism module into the Swin Transformer and introducing multiscale mixed convolution into the PatchEmbed of the Swin Transformer.
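One plausible reading of "multiscale mixed convolution" in PatchEmbed is mixing the responses of several kernel sizes before patch projection. The sketch below is a naive single-channel illustration using mean-filter kernels; the patent discloses neither the actual kernels, the scales, nor the mixing rule, so all of these are assumptions:

```python
import numpy as np

def conv2d_same(img, kernel):
    """Naive single-channel 2-D convolution with zero 'same' padding."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)))
    out = np.zeros(img.shape)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

def multiscale_mixed_conv(img, scales=(3, 5, 7)):
    """Average the responses of mean filters at several receptive-field
    sizes -- one simple way of mixing multiscale convolution outputs."""
    responses = [conv2d_same(img, np.ones((s, s)) / (s * s)) for s in scales]
    return sum(responses) / len(responses)

# On a constant image, every interior response equals the constant.
img = np.ones((9, 9))
mixed = multiscale_mixed_conv(img)
```

In an actual Swin Transformer this mixing would feed the patch-projection layer; only the multiscale response combination is illustrated here.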
3. The traffic vehicle identification and tracking method based on video analysis according to claim 1, characterized in that step 2.2, automatically recognizing the first video frame image of step 2.1 with the improved Swin Transformer deep learning model to obtain grayscale image 1 containing the vehicle reference region, specifically comprises: inputting the normalized first video frame image into the improved Swin Transformer deep learning model for detection to obtain the image of the vehicle reference region of that video frame, and then binarizing it, with the pixel values of the vehicle region set to 1 and the pixel values of the non-vehicle region set to 0 in the binary image;
and step 3.2, automatically recognizing the second video frame image with the improved Swin Transformer deep learning model to obtain grayscale image 2 containing the vehicle reference region, specifically comprises: inputting the normalized second video frame image into the improved Swin Transformer deep learning model for detection to obtain the image of the vehicle reference region of that video frame, and then binarizing it, with the pixel values of the vehicle region set to 1 and the pixel values of the non-vehicle region set to 0 in the binary image.
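The binarization described in claim 3 can be sketched as follows; the threshold value is illustrative, and the detector's output is assumed to be a grayscale confidence map:

```python
def binarize_vehicle_mask(gray, thresh=128):
    """Set vehicle-region pixels to 1 and non-vehicle pixels to 0 by
    thresholding the grayscale detection output (gray: 2-D list)."""
    return [[1 if p >= thresh else 0 for p in row] for row in gray]

# Pixels at or above the threshold are treated as vehicle region.
mask = binarize_vehicle_mask([[0, 200], [130, 10]])
```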
4. The traffic vehicle identification and tracking method based on video analysis according to claim 1, characterized in that the spatio-temporal correlation specifically comprises a temporal sequence and a spatial position relationship.
5. The traffic vehicle identification and tracking method based on video analysis according to claim 1, characterized in that the image segmentation and fusion model is composed of a VGG network and a U-Net, wherein the VGG network consists of layer1, layer2, layer3, layer4 and layer5 of VGG16.
6. The traffic vehicle identification and tracking method based on video analysis according to claim 1 or 5, characterized in that the loss function Loss of the image segmentation and fusion model is custom-defined as:
(loss-function equation published as an image in the original document; not reproduced here)
The optimal network parameters are obtained by minimizing the loss function, where pred denotes the set of predicted values, true denotes the set of true values, α and γ are adjustment coefficients with α = 0.5, y denotes the label and n the number of classes (y_i = 1 if the sample belongs to class i, otherwise y_i = 0), p_i denotes the probability output for class i, and L_1 denotes the mean square error.
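Since the loss equation itself appears only as an image, the following is a hedged reconstruction from the symbol descriptions in claim 6: a focal-style classification term built from α, γ, the one-hot labels y_i and class probabilities p_i, plus a mean-square-error term L_1. The exact arrangement in the patent may differ:

```python
import math

def custom_loss(probs, labels, alpha=0.5, gamma=2.0):
    """probs: predicted class probabilities p_i; labels: one-hot y_i.
    Focal-style term (alpha, gamma) plus mean square error L_1 --
    a plausible reconstruction, not the patent's exact formula."""
    focal = -sum(alpha * (1.0 - p) ** gamma * math.log(max(p, 1e-12))
                 for y, p in zip(labels, probs) if y == 1)
    mse = sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(probs)
    return focal + mse
```

A perfect prediction yields zero loss; as the probability assigned to the true class drops, the focal term grows, with γ down-weighting already-easy examples.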
7. The traffic vehicle identification and tracking method based on video analysis according to claim 1, characterized in that recognizing and classifying, using MapReduce combined with the trained AlexNet model, the license plate reference images 1 corresponding to the partition images in license plate reference grayscale image 1 to obtain license plate targets, and marking the license plate targets in the first video frame image, specifically comprises: finding, in the first video frame, the license plate reference images 1 at the positions corresponding to the partition images of license plate reference grayscale image 1, normalizing each license plate reference image 1 and inputting it into the AlexNet model, recognizing each license plate target position, and marking each license plate target position with symbols in the first video frame image;
and recognizing and classifying, using MapReduce combined with the trained AlexNet model, the license plate reference images 2 corresponding to the partition images in license plate reference grayscale image 2 to obtain license plate targets, and marking the license plate targets with symbols in the second video frame image, specifically comprises: finding, in the second video frame, the license plate reference images 2 at the positions corresponding to the partition images of license plate reference grayscale image 2, normalizing each license plate reference image 2 and inputting it into the AlexNet model, recognizing each license plate target position, and marking each license plate target position with symbols in the second video frame image.
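Claim 7's mapping of partition images back to the original frame can be sketched as below, assuming each partition is described by an axis-aligned bounding box (the box format and the downstream AlexNet call are illustrative assumptions):

```python
def crop_plate_references(frame, boxes):
    """frame: 2-D list of pixels (the original video frame);
    boxes: (y0, y1, x0, x1) bounding box of each partition image.
    Returns the license plate reference images at the corresponding
    positions, ready for normalization and AlexNet classification."""
    return [[row[x0:x1] for row in frame[y0:y1]] for (y0, y1, x0, x1) in boxes]
```

Each crop would then be normalized and passed to the trained classifier; only the position-to-crop mapping is shown here.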
8. A traffic vehicle identification and tracking system based on video analysis, used to implement the traffic vehicle identification and tracking method based on video analysis according to any one of claims 1-7, and characterized by comprising:
an extraction module, configured to extract a first video and a second video, the first video and the second video having spatio-temporal correlation;
an identification module 1, configured to perform vehicle identification on the first video, comprising:
an acquisition module 1, configured to acquire video frame images of the first video at a frame rate of 1 frame per second and normalize them to obtain the first video frame image;
an improvement module 1, configured to automatically recognize the first video frame image with the improved Swin Transformer deep learning model to obtain grayscale image 1, which contains the vehicle reference region;
a segmentation module 1, configured to construct an image segmentation and fusion model and use it to rapidly partition and segment grayscale image 1 of the vehicle reference region, obtaining license plate reference grayscale image 1 of the video image;
a classification module 1, configured to train an AlexNet model under a MapReduce framework to generate an AlexNet for image set classification; to recognize and classify, using MapReduce combined with the trained AlexNet model, the license plate reference images 1 corresponding to the partition images in license plate reference grayscale image 1, obtaining license plate targets; and to mark the license plate targets with symbols in the first video frame image, where license plate reference image 1 is the image at the position of license plate reference grayscale image 1 in the original first video frame;
a detection module 1, configured to judge the vehicle as successfully identified when the symbol mark of a license plate target has appeared continuously at the same or a nearby position for a preset number of frames of the first video stream;
an identification module 2, configured to perform vehicle identification on the second video, comprising:
an acquisition module 2, configured to acquire video frame images of the second video at a frame rate of 1 frame per second and normalize them to obtain the second video frame image;
an improvement module 2, configured to automatically recognize the second video frame image with the improved Swin Transformer deep learning model to obtain grayscale image 2, which contains the vehicle reference region;
a segmentation module 2, configured to construct an image segmentation and fusion model and use it to rapidly partition and segment grayscale image 2 of the vehicle reference region, obtaining license plate reference grayscale image 2 of the video image;
a classification module 2, configured to train an AlexNet model under a MapReduce framework to generate an AlexNet for image set classification; to recognize and classify, using MapReduce combined with the trained AlexNet model, the license plate reference images 2 corresponding to the partition images in license plate reference grayscale image 2, obtaining license plate targets; and to mark the license plate targets with symbols in the second video frame image, where license plate reference image 2 is the image at the position of license plate reference grayscale image 2 in the original second video frame;
a detection module 2, configured to judge the vehicle as successfully identified when the symbol mark of a license plate target has appeared continuously at the same or a nearby position for a preset number of frames of the second video stream;
and a tracking module, configured to mark the vehicle bearing the same license plate, as successfully identified by detection module 1 and detection module 2, on a map, connect the marker points on the map according to the spatio-temporal correlation of the first video and the second video to obtain the vehicle travel track, and complete vehicle tracking according to the vehicle travel track.
9. The traffic vehicle identification and tracking system based on video analysis according to claim 8, characterized in that the improved Swin Transformer deep learning model is obtained by introducing an attention mechanism module into the Swin Transformer and introducing multiscale mixed convolution into the PatchEmbed of the Swin Transformer.
10. A traffic vehicle identification and tracking system based on video analysis, comprising a memory, a processor and a computer program stored in the memory and runnable on the processor, characterized in that, when loaded into the processor, the computer program implements the traffic vehicle identification and tracking method based on video analysis according to any one of claims 1-7.
CN202210931367.XA 2022-08-04 2022-08-04 Traffic vehicle identification tracking method and system based on video analysis Active CN114998815B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210931367.XA CN114998815B (en) 2022-08-04 2022-08-04 Traffic vehicle identification tracking method and system based on video analysis


Publications (2)

Publication Number Publication Date
CN114998815A true CN114998815A (en) 2022-09-02
CN114998815B CN114998815B (en) 2022-10-25

Family

ID=83023197


Country Status (1)

Country Link
CN (1) CN114998815B (en)


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112836699A (en) * 2020-11-30 2021-05-25 爱泊车美好科技有限公司 Long-time multi-target tracking-based berth entrance and exit event analysis method
CN113705577A (en) * 2021-04-23 2021-11-26 中山大学 License plate recognition method based on deep learning
CN113936222A (en) * 2021-09-18 2022-01-14 北京控制工程研究所 Mars terrain segmentation method based on double-branch input neural network
CN114724131A (en) * 2022-04-01 2022-07-08 北京明略昭辉科技有限公司 Vehicle tracking method and device, electronic equipment and storage medium
CN114782901A (en) * 2022-06-21 2022-07-22 深圳市禾讯数字创意有限公司 Sand table projection method, device, equipment and medium based on visual change analysis


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115171029A (en) * 2022-09-09 2022-10-11 山东省凯麟环保设备股份有限公司 Unmanned-driving-based method and system for segmenting instances in urban scene
CN115171029B (en) * 2022-09-09 2022-12-30 山东省凯麟环保设备股份有限公司 Unmanned-driving-based method and system for segmenting instances in urban scene
CN115844424A (en) * 2022-10-17 2023-03-28 北京大学 Sleep spindle wave grading identification method and system
CN115844424B (en) * 2022-10-17 2023-09-22 北京大学 Sleep spindle wave hierarchical identification method and system

Also Published As

Publication number Publication date
CN114998815B (en) 2022-10-25

Similar Documents

Publication Publication Date Title
CN108171112B (en) Vehicle identification and tracking method based on convolutional neural network
Chen et al. Vehicle detection in high-resolution aerial images via sparse representation and superpixels
CN114998815B (en) Traffic vehicle identification tracking method and system based on video analysis
CN110119726B (en) Vehicle brand multi-angle identification method based on YOLOv3 model
Zhang et al. Ripple-GAN: Lane line detection with ripple lane line detection network and Wasserstein GAN
Dehghan et al. View independent vehicle make, model and color recognition using convolutional neural network
Zhang et al. Study on traffic sign recognition by optimized Lenet-5 algorithm
Nandi et al. Traffic sign detection based on color segmentation of obscure image candidates: a comprehensive study
CN103093201B (en) Vehicle-logo location recognition methods and system
CN112966631A (en) License plate detection and identification system and method under unlimited security scene
Ye et al. A two-stage real-time YOLOv2-based road marking detector with lightweight spatial transformation-invariant classification
Naufal et al. Preprocessed mask RCNN for parking space detection in smart parking systems
CN111008574A (en) Key person track analysis method based on body shape recognition technology
Hatolkar et al. A survey on road traffic sign recognition system using convolution neural network
Zaarane et al. Real‐Time Vehicle Detection Using Cross‐Correlation and 2D‐DWT for Feature Extraction
Gad et al. Real-time lane instance segmentation using SegNet and image processing
Latha et al. Image understanding: semantic segmentation of graphics and text using faster-RCNN
Lee et al. License plate detection via information maximization
CN117475353A (en) Video-based abnormal smoke identification method and system
CN114359493B (en) Method and system for generating three-dimensional semantic map for unmanned ship
Zhang et al. A front vehicle detection algorithm for intelligent vehicle based on improved gabor filter and SVM
CN111986233A (en) Large-scene minimum target remote sensing video tracking method based on feature self-learning
Kodwani et al. Automatic license plate recognition in real time videos using visual surveillance techniques
Lokkondra et al. DEFUSE: deep fused end-to-end video text detection and recognition
CN111178158B (en) Rider detection method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant