CN112597996B - Method for detecting traffic sign significance in natural scene based on task driving

Method for detecting traffic sign significance in natural scene based on task driving

Info

Publication number
CN112597996B
CN112597996B (application CN202011577655.7A)
Authority
CN
China
Prior art keywords
convolutional
features
images
traffic sign
global
Prior art date
Legal status
Active
Application number
CN202011577655.7A
Other languages
Chinese (zh)
Other versions
CN112597996A (en)
Inventor
李雨萌
Current Assignee
Shanxi Cloud Times R & D Innovation Center Co ltd
Original Assignee
Shanxi Cloud Times R & D Innovation Center Co ltd
Priority date
Filing date
Publication date
Application filed by Shanxi Cloud Times R & D Innovation Center Co ltd
Priority to CN202011577655.7A
Publication of CN112597996A
Application granted
Publication of CN112597996B
Legal status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/62Text, e.g. of license plates, overlay texts or captions on TV images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/56Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/58Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • G06V20/582Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads of traffic signs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20016Hierarchical, coarse-to-fine, multiscale or multiresolution image processing; Pyramid transform
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the field of computer vision and discloses a task-driven method for detecting the saliency of traffic signs in natural scenes, comprising the following steps: S1, acquire training data; S2, input the images of the training set, extract the overall global features of each image with a convolutional neural network, and extract feature information at several different resolutions; a dilated convolutional network applies multi-layer dilated convolution to the feature information at each resolution to extract features as well as contrast features; S3, upsample the features and contrast features to obtain a feature map at each resolution, then fuse the feature maps into the overall local features; S4, predict the traffic sign saliency feature map; S5, repeat steps S2-S4 to train the convolutional neural network and save the trained model; S6, input the image to be predicted and obtain its traffic sign saliency feature map. The method improves traffic sign detection accuracy and can be widely applied in the field of autonomous driving.

Description

Method for detecting traffic sign significance in natural scene based on task driving
Technical Field
The invention relates to the field of computer vision, in particular to the technical field of deep-learning-based image saliency detection, and more particularly to a task-driven method for detecting the saliency of traffic signs in natural scenes.
Background
As the number of vehicles grows, traffic problems grow with it. Recognizing traffic signs is one of the most important problems in driving, and it also matters for road maintenance, driver assistance systems, and autonomous vehicles.
Many practical factors must be considered; for example, in the development of Advanced Driver Assistance Systems (ADAS), the most fundamental task is Traffic Sign Recognition (TSR). TSR is a difficult real-scene pattern recognition problem whose main function is to detect traffic signs so as to provide road information to the driver and prompt reasonable actions. When a vehicle encounters complex conditions such as road congestion, rain or snow, or driver fatigue, TSR can help prevent the accidents that arise from inattention, fatigued driving, or severe weather.
For these reasons, enabling a computer to locate the coordinates of traffic signs precisely becomes particularly important. Image saliency detection is a well-known means of extracting the principal information in image processing; it has become a key technology in computer vision and is widely used in practical vision tasks. It works mainly by simulating the mechanisms of human vision to extract the parts of a scene that people attend to. In other words, detecting traffic signs as salient objects is an important technique for autonomous driving, and the central problems it now faces are how to make detection results match people's subjective intent in real scenes and how to keep detection efficient and robust in complex scenes.
The purpose of saliency detection is to mimic the human visual system, whose selective attention can be divided into two mechanisms: a data-driven, bottom-up attention mechanism, which is task-independent, saliency-driven, and relatively fast; and a task-driven, top-down attention mechanism, which is task-dependent, controlled by intent, and relatively slow. The prior art generally detects in the task-independent, saliency-driven manner because it is fast. In a real scene, however, when a specific target must be detected, as in autonomous driving where the traffic signs a driver needs to attend to must be found, the first approach is likely to detect only the most salient sign while failing to show all the traffic signs in the image. A task-driven method for detecting traffic signs in natural scenes is therefore needed to achieve accurate detection.
Disclosure of Invention
The invention overcomes the defects of the prior art and solves the following technical problem: providing a task-driven method for detecting the saliency of traffic signs in natural scenes, so that traffic signs in natural scenes can be located more accurately.
To solve this technical problem, the invention adopts the following technical scheme: a task-driven method for detecting the saliency of traffic signs in natural scenes, comprising the following steps:
S1, acquisition of training data: collect images containing traffic signs in natural scenes, annotate the traffic signs in the images, and unify the image resolution;
S2, input the images of the training set, extract the overall global features of each image with a convolutional neural network, and extract feature information at several different resolutions; the convolutional neural network comprises several sequentially connected convolutional blocks and a global convolutional block, where the output of each convolutional block corresponds to feature information at one image resolution and the output of the global convolutional block corresponds to the overall global features of the image; a dilated convolutional network applies multi-layer dilated convolution to the feature information at each resolution to extract features, and contrast features are extracted from those features;
S3, upsample the features and contrast features obtained at each resolution in step S2 back to the original resolution to obtain a feature map at each resolution, then fuse the feature maps from the different resolutions into the overall local features by concat fusion;
S4, predict the final traffic sign saliency map from the overall global features extracted in step S2 and the overall local features obtained in step S3;
S5, adjust the parameters of the convolutional neural network and repeat steps S2-S4 until the predicted traffic sign saliency map is consistent with the annotated traffic signs, then save the trained model;
S6, input the image to be predicted and repeat steps S2-S4 to obtain its traffic sign saliency map.
In step S2, the convolutional neural network comprises five convolutional blocks CONV1-CONV5 and a GLOBAL convolutional block; blocks CONV1-CONV2 each comprise two convolutional layers with 3×3 kernels, blocks CONV3-CONV5 each comprise three convolutional layers with 3×3 kernels, the outputs of blocks CONV1, CONV2, CONV3, CONV4 and CONV5 are each connected to a dilated convolutional network, and the GLOBAL convolutional block comprises three convolutional layers with kernel sizes of 5×5, 5×5 and 3×3.
In step S2, the dilated convolutional network comprises four dilated convolutional layers with dilation rates of 1, 3, 5 and 7, respectively.
In step S1, the image resolution is unified to 256×256, and the resolutions of the feature maps output by the convolutional blocks CONV1-CONV5 are 256×256, 128×128, 64×64, 32×32 and 16×16, respectively.
In step S2, a Gaussian pyramid algorithm is used to extract the contrast features.
In step S3, when the feature maps at different resolutions are fused by concat fusion, the convolution kernel size is 1×1.
In step S4, the specific prediction method for the final traffic sign saliency map is as follows: convolve the overall global features obtained in step S2 and the overall local features obtained in step S3 with 1×1 kernels to obtain a global score and a local score, respectively, then perform per-pixel saliency prediction with a Softmax function to obtain the final traffic sign saliency map.
Compared with the prior art, the invention has the following beneficial effects. The invention provides a task-driven method for detecting the saliency of traffic signs in natural scenes. It mainly adopts convolutions whose field of view expands layer by layer, that is, it fully exploits the growing receptive field of the dilated convolutional network to take in as much target information as possible, improving the extraction of traffic sign features, learning the semantic information of the traffic target region and its context, and minimizing the loss of relevant traffic sign information. In addition, the invention captures features by contrast: since a salient object is a foreground object distinct from the background region, a Gaussian pyramid method for computing contrast features is used to find the relatively prominent and important pixels and regions in the image. Finally, the predictions are combined effectively into the final saliency map, and experimental results show that the method locates traffic signs in natural scenes more accurately.
Drawings
Fig. 1 is a network flow chart of the task-driven method for detecting the saliency of traffic signs in natural scenes provided by an embodiment of the invention;
Fig. 2 is a schematic diagram of the structure of the dilated convolutional network according to an embodiment of the invention;
Fig. 3 compares the detection results of the saliency detection method of the invention with those of other methods.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described below clearly and completely. The described embodiments are plainly some, but not all, embodiments of the invention; all other embodiments obtained by those of ordinary skill in the art from these embodiments without inventive effort fall within the scope of the invention.
As shown in Figs. 1-2, an embodiment of the invention provides a task-driven method for detecting the saliency of traffic signs in natural scenes, comprising the following steps:
S1, acquisition of training data: collect images containing traffic signs in natural scenes, annotate the traffic signs in the images, and unify the image resolution.
Specifically, in this embodiment, the images are uniformly resized to 256×256 before feature extraction.
S2, input the images of the training set, extract the overall global features of each image with a convolutional neural network, and extract feature information at several different resolutions. The convolutional neural network comprises several sequentially connected convolutional blocks and a global convolutional block, where the output of each convolutional block corresponds to feature information at one image resolution and the output of the global convolutional block corresponds to the overall global features of the image. A dilated convolutional network applies multi-layer dilated convolution to the feature information at each resolution to extract features, and contrast features are then extracted from those features.
Specifically, as shown in Figs. 1-2, in this embodiment the convolutional neural network comprises five convolutional blocks CONV1-CONV5 and a GLOBAL convolutional block. Blocks CONV1-CONV2 each comprise two convolutional layers with 3×3 kernels, and blocks CONV3-CONV5 each comprise three convolutional layers with 3×3 kernels. The outputs of blocks CONV1, CONV2, CONV3, CONV4 and CONV5 are each connected to a dilated convolutional network DILACON1-DILACON5, where each dilated convolutional network comprises four dilated convolutional layers with dilation rates of 1, 3, 5 and 7, respectively. The GLOBAL convolutional block comprises three convolutional layers with kernel sizes of 5×5, 5×5 and 3×3.
As shown in Fig. 1, after the image data enter the convolutional neural network, they pass through blocks CONV1-CONV5 in sequence and are then fed to the GLOBAL block. Meanwhile, blocks CONV1-CONV5 each output feature information at a different resolution; after the dilated convolution operations of DILACON1-DILACON5, the image features at the different resolutions are learned and extracted, and the contrast feature extraction modules (contrast 1-contrast 5) then derive contrast features from them. In this embodiment, blocks CONV1, CONV2, CONV3, CONV4 and CONV5 are the first 13 layers of the VGG-16 network; to extract the overall GLOBAL features of the image, the fully connected layers after the last convolutional block of VGG-16 are removed and a global convolutional block named GLOBAL is appended, comprising three convolutional layers with kernel sizes of 5×5, 5×5 and 3×3.
Specifically, in this embodiment, the resolutions of the feature maps output by blocks CONV1-CONV5 are 256×256, 128×128, 64×64, 32×32 and 16×16, respectively.
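To make the architecture concrete, below is a minimal TensorFlow/Keras sketch of the backbone (TensorFlow being the framework named later in this embodiment). Only the block layout, kernel sizes, VGG-16 correspondence and output resolutions come from the description above; the 128-filter width of the GLOBAL block and the exact pooling placement are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers

def conv_block(x, filters, n_layers, name):
    # n_layers stacked 3x3 convolutions with ReLU, as in VGG-16.
    for i in range(n_layers):
        x = layers.Conv2D(filters, 3, padding='same', activation='relu',
                          name=f'{name}_conv{i + 1}')(x)
    return x

inputs = tf.keras.Input(shape=(256, 256, 3))

# CONV1-CONV5: the first 13 convolutional layers of VGG-16.
c1 = conv_block(inputs, 64, 2, 'conv1')                    # 256x256
c2 = conv_block(layers.MaxPool2D()(c1), 128, 2, 'conv2')   # 128x128
c3 = conv_block(layers.MaxPool2D()(c2), 256, 3, 'conv3')   # 64x64
c4 = conv_block(layers.MaxPool2D()(c3), 512, 3, 'conv4')   # 32x32
c5 = conv_block(layers.MaxPool2D()(c4), 512, 3, 'conv5')   # 16x16

# GLOBAL block: three convolutions (5x5, 5x5, 3x3) replacing the
# VGG-16 fully connected layers; the filter count is an assumption.
g = layers.Conv2D(128, 5, padding='same', activation='relu')(c5)
g = layers.Conv2D(128, 5, padding='same', activation='relu')(g)
global_features = layers.Conv2D(128, 3, padding='same', activation='relu')(g)
```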
As shown in Fig. 2, in this embodiment the dilated convolutional network performs dilated convolution learning with multiple layers of dilated convolutions (dilation rates 1, 3, 5 and 7, respectively) to extract the target's feature information, enlarging the field of view of the convolution so as to capture more informative features.
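Continuing the sketch above, each DILACON branch can be written as four stacked dilated convolutions with the stated rates; the 3×3 kernel size and 128-filter width are assumptions, since the description fixes only the number of layers and the dilation rates.

```python
def dilated_branch(x, filters=128, name='dilaconv'):
    # Four stacked dilated convolutions with dilation rates 1, 3, 5, 7;
    # each layer enlarges the receptive field without reducing resolution.
    for rate in (1, 3, 5, 7):
        x = layers.Conv2D(filters, 3, padding='same', dilation_rate=rate,
                          activation='relu', name=f'{name}_rate{rate}')(x)
    return x

# One branch per backbone block, DILACON1-DILACON5.
dila_feats = [dilated_branch(c, name=f'dilaconv{i + 1}')
              for i, c in enumerate([c1, c2, c3, c4, c5])]
```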
Specifically, in step S2 a Gaussian pyramid algorithm is used to extract the contrast features, finding the relatively prominent and important pixels and regions in the image and thereby obtaining contrast features at the different resolutions.
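The description does not spell out the Gaussian pyramid construction. One plausible reading, sketched below under that assumption, computes each feature map's contrast against its Gaussian-smoothed surround, so responses that stand out from their local background are emphasized; the kernel size and sigma are likewise assumed.

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    # 2D Gaussian kernel, normalized to sum to 1.
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    k = np.outer(g, g)
    return (k / k.sum()).astype('float32')

def contrast_feature(x, size=5, sigma=1.0):
    # Contrast of each channel against its Gaussian-smoothed surround:
    # salient foreground responses differ from their local background.
    channels = x.shape[-1]
    k = gaussian_kernel(size, sigma)
    k = np.tile(k[:, :, None, None], (1, 1, channels, 1))  # depthwise filter
    smoothed = tf.nn.depthwise_conv2d(x, tf.constant(k),
                                      strides=[1, 1, 1, 1], padding='SAME')
    return x - smoothed

contrast_feats = [contrast_feature(f) for f in dila_feats]
```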
S3, upsample the features and contrast features obtained at each resolution in step S2 back to the original resolution to obtain a feature map at each resolution, then fuse the feature maps from the different resolutions into the overall local features by concat fusion with a 1×1 convolution kernel.
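In the running sketch this step looks as follows. The "upsampling learning" of the description suggests a learned transposed convolution; for brevity the sketch substitutes bilinear upsampling before the stated 1×1 fusion convolution, and the 128 output channels are an assumption.

```python
# Restore each resolution's features and contrast features to 256x256,
# then fuse everything into the overall local features with a 1x1 conv.
restored = []
for feat, con in zip(dila_feats, contrast_feats):
    merged = layers.Concatenate()([feat, con])
    scale = 256 // merged.shape[1]          # e.g. 16x16 needs factor 16
    if scale > 1:
        merged = layers.UpSampling2D(scale, interpolation='bilinear')(merged)
    restored.append(merged)

local_features = layers.Conv2D(
    128, 1, activation='relu',
    name='local_fusion')(layers.Concatenate()(restored))
```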
S4, predict the final traffic sign saliency map from the overall global features extracted in step S2 and the overall local features obtained in step S3. The specific prediction method is as follows: convolve the overall global features from step S2 and the overall local features from step S3 with 1×1 kernels to obtain a global score and a local score, respectively, then perform per-pixel saliency prediction with a Softmax function to obtain the final traffic sign saliency map.
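A minimal prediction head consistent with this step is sketched below. How the local and global scores are combined before the Softmax is not stated, so the element-wise addition, and the bilinear upsampling of the 16×16 global score back to image resolution, are assumptions.

```python
# 1x1 convolutions give a two-channel (salient / background) score map
# from the local and global features; Softmax predicts per pixel.
local_score = layers.Conv2D(2, 1, name='local_score')(local_features)
global_score = layers.Conv2D(2, 1, name='global_score')(global_features)
global_score = layers.UpSampling2D(16, interpolation='bilinear')(global_score)

saliency_map = layers.Softmax(axis=-1)(
    layers.Add()([local_score, global_score]))
model = tf.keras.Model(inputs, saliency_map)
```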
S5, adjust the parameters of the convolutional neural network and repeat steps S2-S4 until the predicted traffic sign saliency map is consistent with the annotated traffic signs.
S6, input the image to be predicted and repeat steps S2-S4 to obtain its traffic sign saliency map.
In the experiments, the collected images were made into a training data set and a test data set, and the network model was trained under TensorFlow. The weights of the convolutional neural network were initialized with the pretrained weights of VGG-16, the weights of all newly added convolutional layers were randomly initialized (δ = 0.01), and the biases were initialized to 0. In this embodiment, the model was trained with an Adam optimizer with an initial learning rate of 10⁻⁶, β₁ = 0.9 and β₂ = 0.999. With a batch size of one image over 20 training epochs, the entire training process took about 6 hours on an NVIDIA 1080Ti GPU.
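These hyperparameters translate directly into a configuration sketch like the following; the cross-entropy loss is an assumption, since the description does not name its loss function, and the file path in the final comment is hypothetical.

```python
# Adam optimizer with the stated hyperparameters; batch size 1, 20 epochs.
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-6,
                                     beta_1=0.9, beta_2=0.999)
model.compile(optimizer=optimizer,
              loss=tf.keras.losses.CategoricalCrossentropy())
# model.fit(train_images, train_masks, batch_size=1, epochs=20)
# model.save('tsr_saliency_model.h5')   # hypothetical path
```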
The trained model is saved and test images are then input for verification; the experimental results show that the method achieves higher accuracy than other saliency detection methods. Performance is first measured with the standard F-measure evaluation index (higher is better) and the mean absolute error (MAE, lower is better): the method achieves a maxF_β of 0.886 and an MAE of 0.062. The experimental effect graphs are then compared; as shown in Fig. 3, each of the three groups a, b and c of detection results shows, from left to right, the detected natural scene image containing traffic signs, the result of the wCtr algorithm, the result of the NDF algorithm, and the result of the algorithm of the invention. The results show that the invention better enlarges the field of view to learn context information, improves feature extraction, and minimizes information loss. In Fig. 3, groups a to c are natural scenes with dark weather, small targets, and occluding objects, respectively, and the method of the invention still achieves relatively accurate results under these conditions.
Finally, it should be noted that the above embodiments merely illustrate, rather than limit, the technical solution of the invention. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features replaced by equivalents, and that such modifications and substitutions do not depart from the spirit of the invention.

Claims (4)

1. A task-driven method for detecting the saliency of traffic signs in a natural scene, characterized by comprising the following steps:
S1, acquisition of training data: collecting images containing traffic signs in natural scenes, annotating the traffic signs in the images, and unifying the image resolution;
S2, inputting the images of the training set, extracting the overall global features of each image with a convolutional neural network, and extracting feature information at several different resolutions; the convolutional neural network comprises several sequentially connected convolutional blocks and a global convolutional block, where the output of each convolutional block corresponds to feature information at one image resolution and the output of the global convolutional block corresponds to the overall global features of the image; a dilated convolutional network applies multi-layer dilated convolution to the feature information at each resolution to extract features, and contrast features are extracted from those features;
S3, upsampling the features and contrast features obtained at each resolution in step S2 back to the original resolution to obtain a feature map at each resolution, then fusing the feature maps from the different resolutions into the overall local features by concat fusion;
S4, predicting the final traffic sign saliency feature map from the overall global features extracted in step S2 and the overall local features obtained in step S3;
S5, adjusting the parameters of the convolutional neural network and repeating steps S2-S4 until the predicted traffic sign saliency feature map is consistent with the annotated traffic signs, then saving the trained model;
S6, inputting the image to be predicted and repeating steps S2-S4 to obtain its traffic sign saliency feature map;
in step S2, the convolutional neural network comprises five convolutional blocks CONV1-CONV5 and a GLOBAL convolutional block connected in sequence; blocks CONV1-CONV2 each comprise two convolutional layers with 3×3 kernels, blocks CONV3-CONV5 each comprise three convolutional layers with 3×3 kernels, the outputs of blocks CONV1, CONV2, CONV3, CONV4 and CONV5 are each connected to a dilated convolutional network, and the GLOBAL convolutional block comprises three convolutional layers with kernel sizes of 5×5, 5×5 and 3×3;
in step S2, a Gaussian pyramid algorithm is used to extract the contrast features;
in step S4, the specific prediction method for the final traffic sign saliency feature map is as follows: convolving the overall global features obtained in step S2 and the overall local features obtained in step S3 with 1×1 kernels to obtain a global score and a local score, respectively, then performing per-pixel saliency prediction with a Softmax function to obtain the final traffic sign saliency feature map.
2. The task-driven method for detecting the saliency of traffic signs in a natural scene according to claim 1, characterized in that in step S2, the dilated convolutional network comprises four dilated convolutional layers with dilation rates of 1, 3, 5 and 7, respectively.
3. The task-driven method for detecting the saliency of traffic signs in a natural scene according to claim 1, characterized in that in step S1 the image resolution is unified to 256×256, and the resolutions of the feature maps output by the convolutional blocks CONV1-CONV5 are 256×256, 128×128, 64×64, 32×32 and 16×16, respectively.
4. The task-driven method for detecting the saliency of traffic signs in a natural scene according to claim 1, characterized in that in step S3, when the feature maps at different resolutions are fused by concat fusion, the convolution kernel size is 1×1.
CN202011577655.7A 2020-12-28 2020-12-28 Method for detecting traffic sign significance in natural scene based on task driving Active CN112597996B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011577655.7A CN112597996B (en) 2020-12-28 2020-12-28 Method for detecting traffic sign significance in natural scene based on task driving

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011577655.7A CN112597996B (en) 2020-12-28 2020-12-28 Method for detecting traffic sign significance in natural scene based on task driving

Publications (2)

Publication Number Publication Date
CN112597996A CN112597996A (en) 2021-04-02
CN112597996B true CN112597996B (en) 2024-03-29

Family

ID=75202815

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011577655.7A Active CN112597996B (en) 2020-12-28 2020-12-28 Method for detecting traffic sign significance in natural scene based on task driving

Country Status (1)

Country Link
CN (1) CN112597996B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113536973B (en) * 2021-06-28 2023-08-18 杭州电子科技大学 Traffic sign detection method based on saliency
CN113449667A (en) * 2021-07-08 2021-09-28 四川师范大学 Salient object detection method based on global convolution and boundary refinement

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109558880A (en) * 2018-10-16 2019-04-02 杭州电子科技大学 A kind of whole profile testing method with Local Feature Fusion of view-based access control model
CN110287981A (en) * 2019-05-08 2019-09-27 中国科学院西安光学精密机械研究所 Conspicuousness detection method and system based on biological enlightening representative learning
KR20190119261A (en) * 2018-04-12 2019-10-22 가천대학교 산학협력단 Apparatus and method for segmenting of semantic image using fully convolutional neural network based on multi scale image and multi scale dilated convolution
CN110633708A (en) * 2019-06-28 2019-12-31 中国人民解放军军事科学院国防科技创新研究院 Deep network significance detection method based on global model and local optimization
CN110929735A (en) * 2019-10-17 2020-03-27 杭州电子科技大学 Rapid significance detection method based on multi-scale feature attention mechanism
CN111563418A (en) * 2020-04-14 2020-08-21 浙江科技学院 Asymmetric multi-mode fusion significance detection method based on attention mechanism
CN112070753A (en) * 2020-09-10 2020-12-11 浙江科技学院 Multi-scale information enhanced binocular convolutional neural network saliency image detection method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
SG10202108020VA (en) * 2017-10-16 2021-09-29 Illumina Inc Deep learning-based techniques for training deep convolutional neural networks

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Morphologically dilated convolutional neural network for hyperspectral image classification; Vinod Kumar, et al.; Signal Processing: Image Communication; Article 116549 *
Underwater image enhancement algorithm fusing deep learning and an imaging model; Chen Xuelei, et al.; Computer Engineering; Vol. 48, No. 2, pp. 243-249 *
Dilated fully convolutional neural network for vehicle detection; Cheng Yahui, et al.; Computer Systems & Applications *

Also Published As

Publication number Publication date
CN112597996A (en) 2021-04-02

Similar Documents

Publication Publication Date Title
CN111368687B (en) Sidewalk vehicle illegal parking detection method based on target detection and semantic segmentation
CN108876780B (en) Bridge crack image crack detection method under complex background
CN111488789B (en) Pedestrian detection method and device for monitoring based on image analysis
CN110136154B (en) Remote sensing image semantic segmentation method based on full convolution network and morphological processing
CN111127449B (en) Automatic crack detection method based on encoder-decoder
CN112488025B (en) Double-temporal remote sensing image semantic change detection method based on multi-modal feature fusion
CN113468967A (en) Lane line detection method, device, equipment and medium based on attention mechanism
CN111611861B (en) Image change detection method based on multi-scale feature association
CN110781980B (en) Training method of target detection model, target detection method and device
CN112597996B (en) Method for detecting traffic sign significance in natural scene based on task driving
CN111882620A (en) Road drivable area segmentation method based on multi-scale information
JP2022025008A (en) License plate recognition method based on text line recognition
CN115063786A (en) High-order distant view fuzzy license plate detection method
CN114913498A (en) Parallel multi-scale feature aggregation lane line detection method based on key point estimation
CN114519819A (en) Remote sensing image target detection method based on global context awareness
CN112785610B (en) Lane line semantic segmentation method integrating low-level features
Saravanarajan et al. Improving semantic segmentation under hazy weather for autonomous vehicles using explainable artificial intelligence and adaptive dehazing approach
CN116580232A (en) Automatic image labeling method and system and electronic equipment
CN117115616A (en) Real-time low-illumination image target detection method based on convolutional neural network
CN111832463A (en) Deep learning-based traffic sign detection method
CN115049836B (en) Image segmentation method, device, equipment and storage medium
CN111160274A (en) Pedestrian detection method based on binaryzation fast RCNN (radar cross-correlation neural network)
CN116052189A (en) Text recognition method, system and storage medium
CN112446292B (en) 2D image salient object detection method and system
Jia et al. Sample generation of semi‐automatic pavement crack labelling and robustness in detection of pavement diseases

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant