CN111488834A - Crowd counting method based on multi-level feature fusion - Google Patents

Crowd counting method based on multi-level feature fusion

Info

Publication number
CN111488834A
Authority
CN
China
Prior art keywords
crowd
feature
convolution
layer
density map
Prior art date
Legal status
Granted
Application number
CN202010284030.5A
Other languages
Chinese (zh)
Other versions
CN111488834B (en)
Inventor
霍占强
路斌
宋素玲
雒芬
乔应旭
Current Assignee
Henan University of Technology
Original Assignee
Henan University of Technology
Priority date
Filing date
Publication date
Application filed by Henan University of Technology
Priority to CN202010284030.5A
Publication of CN111488834A
Application granted
Publication of CN111488834B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V20/53 Recognition of crowd images, e.g. recognition of crowd congestion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks


Abstract

The invention relates to a crowd counting method based on multi-level feature fusion, which comprises the following steps: preprocess the acquired crowd images and generate the corresponding crowd density maps from the annotation information; construct a multi-level feature fusion crowd counting network and initialize its weight parameters; input the preprocessed crowd images and density maps into the network and complete forward propagation; calculate the loss between the forward-propagation result and the true density map and update the model parameters; iterate forward propagation and parameter updating a specified number of times; and obtain the crowd density map to derive the estimated number of people. The method overcomes the problem of crowd scale variation in the crowd counting task and makes crowd counting more accurate.

Description

Crowd counting method based on multi-level feature fusion
Technical Field
The invention relates to the fields of image-based crowd counting and deep learning, and in particular to a crowd counting method based on deep learning.
Background
Crowd counting is an important problem in image processing and computer vision. Its aim is to automatically generate a crowd density map from crowd images and to estimate the number of people in the scene. Crowd counting is widely applied in traffic scheduling, security prevention and control, city management, and other fields.
Traditional crowd counting methods require complex preprocessing of crowd images and manually designed, hand-extracted human-body features, and the features must be re-extracted when crossing scenes, so their adaptability is poor. In recent years, the successful application of convolutional neural networks has brought a major breakthrough to the crowd counting task. Zhang et al. [1] proposed a convolutional neural network model suited to crowd counting that is trained end to end without foreground segmentation or hand-designed feature extraction; it obtains high-level features through multiple convolution layers and improves counting performance across scenes. However, crowd scales differ greatly between crowded scenes, and even within a single image the density and distribution of the crowd vary with distance from the camera, so this method is less accurate on scenes with large differences in crowd scale.
To address the problem of crowd scale variation, existing research has mainly focused on extracting features at several different scales to reduce the influence of scale change. Zhang et al. [2] proposed a multi-branch convolutional neural network in which each branch is composed of convolution kernels of a different size, so that features of different scales are extracted by the different branches. Cao et al. [3] proposed a scale-aware network that tackles scale variation with feature-extraction modules built from convolution kernels of different sizes. All of these methods handle crowd scale variation by extracting features of different scales with convolution kernels of different sizes. The scale of a crowd in an image varies continuously, however, while convolution kernels of different sizes can only extract crowd features at discrete scales and therefore ignore crowds at the other scales. The problem of crowd scale differences across scenes thus remains incompletely solved.
Reference documents:
1. C. Zhang, H. Li, X. Wang, and X. Yang. Cross-Scene Crowd Counting via Deep Convolutional Neural Networks [C]. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, 833-841.
2. Y. Zhang, D. Zhou, S. Chen, et al. Single-Image Crowd Counting via Multi-Column Convolutional Neural Network [C]. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, 589-597.
3. X. Cao, Z. Wang, Y. Zhao, and F. Su. Scale Aggregation Network for Accurate and Efficient Crowd Counting [C]. European Conference on Computer Vision, 2018, 734-750.
Disclosure of Invention
The invention provides a crowd counting method based on multi-level feature fusion, aiming to solve the prior-art problem of crowd scale differences across scenes. The method mainly comprises the following steps:
step S1: preprocessing the acquired crowd image, and generating a corresponding crowd density map by using the labeling information;
step S2: constructing a multi-level feature fused crowd counting network;
step S3: initializing a network weight parameter;
step S4: inputting the crowd images and crowd density maps preprocessed in step S1 into the network to complete forward propagation;
step S5: calculating the loss between the forward-propagation result of step S4 and the true density map, and updating the model parameters;
step S6: iterating steps S4, S5 a specified number of times;
step S7: and acquiring a crowd density map to obtain the estimated number of people.
Compared with current methods that handle crowd scale variation with multi-branch, multi-size convolution kernels, the invention provides a method based on multi-level feature fusion. In the VGG16 feature extractor contained in the network, the shallow output features carry the spatial and texture information of the crowd, while the high-level output features carry its semantic information: the shallow features describe where the crowd is located, and the high-level features supply the specific details of the crowd characteristics. By fusing the low-level and high-level features, the method effectively handles crowd scale variation and overcomes the drawback of multi-branch, multi-size-kernel methods, which can only extract crowd features at discrete scales. The proposed method is therefore more accurate than existing methods.
Drawings
Fig. 1 is a flowchart of a crowd counting method based on multi-level feature fusion according to the present invention.
Fig. 2 is a diagram of a crowd counting network structure based on multi-level feature fusion according to the present invention.
Fig. 3 is a structural diagram of a channel domain attention module of a crowd counting network based on multi-level feature fusion according to the present invention.
Detailed Description
Fig. 1 is a flowchart of the crowd counting method based on multi-level feature fusion according to the present invention. The method mainly comprises the following steps: preprocess the acquired crowd images and generate the corresponding crowd density maps from the annotation information; construct the multi-level feature fusion crowd counting network and initialize its weight parameters; input the preprocessed crowd images and density maps into the network and complete forward propagation; calculate the loss between the forward-propagation result and the true density map and update the model parameters; iterate forward propagation and parameter updating a specified number of times; and obtain the crowd density map to derive the estimated number of people. The implementation details of each step are as follows:
step S1: preprocessing the acquired crowd image, and generating a corresponding crowd density map by using the labeling information, wherein the specific mode is as follows:
step S11: the collected crowd image is subjected to centralization processing, specifically, the average value corresponding to the channel is subtracted from the elements on the three channels of the image R, G and B, and then the average value is divided by the standard deviation corresponding to the channel, wherein the average value corresponding to the three channels of R, G and B is (0.485,0.456,0.406), and the corresponding standard deviation is (0.229,0.224, 0.225).
Step S12: Generate a position matrix from the provided annotation information. Specifically, create an all-zero matrix with the same resolution as the corresponding image, then set to 1 the element at each coordinate given by the annotations.
Step S13: Randomly crop fixed-size image blocks and matrices from the centered crowd images and their position matrices; in the specific embodiment of the invention the crop size is 400 × 400.
Step S14: Generate the corresponding crowd density map by convolving the position matrix with a Gaussian kernel. Specifically, generate two one-dimensional Gaussian convolution kernels with μ = 15 and σ = 4, transpose one of them and multiply it with the other to obtain a two-dimensional Gaussian kernel, then convolve that kernel with the elements of value 1 in the position matrix to produce the crowd density map.
Step S15: Down-sample the density map generated in step S14 to 200 × 200 resolution. Specifically, convolve the density map with a 2 × 2 convolution kernel whose parameters are all 1, using a stride of 2.
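A minimal NumPy/SciPy sketch of steps S12, S14 and S15, assuming head annotations given as (x, y) pixel coordinates; scipy's gaussian_filter stands in for the explicit outer product of two one-dimensional kernels described in step S14, and the function names are illustrative:

import numpy as np
from scipy.ndimage import gaussian_filter

def make_density_map(points, height, width, sigma=4.0):
    # Step S12: all-zero position matrix, 1 at each annotated head coordinate.
    pos = np.zeros((height, width), dtype=np.float32)
    for x, y in points:
        pos[int(y), int(x)] = 1.0
    # Step S14: blur the position matrix with a Gaussian kernel (sigma = 4).
    return gaussian_filter(pos, sigma=sigma)

def downsample_2x(density):
    # Step S15: 2x2 all-ones kernel with stride 2 (sum pooling), e.g. 400x400 -> 200x200;
    # summing rather than averaging preserves the total head count.
    h, w = density.shape
    return density[: h - h % 2, : w - w % 2].reshape(h // 2, 2, w // 2, 2).sum(axis=(1, 3))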
Step S2: Construct the multi-level feature fusion crowd counting network, as shown in Fig. 2, as follows:
step S21: a VGG16 network was built that did not contain a full connectivity layer.
Step S22: Build the channel-domain attention module, as shown in Fig. 3. Specifically, build a global average pooling layer that pools the input feature X into a 1 × 1 × C feature, append two fully connected layers with C/4 and C neurons respectively, follow them with a Sigmoid activation layer, and multiply the activation output element-wise with the input feature X to obtain the output of the channel-domain attention module.
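A sketch of this module in PyTorch. The structure follows the text (pool, fully connected layers with C/4 and C neurons, Sigmoid, element-wise scaling); note the text states no activation between the two fully connected layers, unlike a standard squeeze-and-excitation block, and the class name is illustrative:

import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    # Step S22: global average pool -> FC(C/4) -> FC(C) -> Sigmoid -> reweight channels.
    def __init__(self, channels):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)          # N x C x H x W -> N x C x 1 x 1
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // 4),
            nn.Linear(channels // 4, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        n, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(n, c)).view(n, c, 1, 1)
        return x * w                                  # element-wise multiplication with X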
Step S23: Fuse the output features X50 and X40 of the fifth and fourth layers of the VGG16 network constructed in step S21. Specifically, apply an upsampling operation to the fifth-layer output X50 (the magnification factors of all upsampling layers in the invention are 2), concatenate the upsampled feature with the fourth-layer output X40 along the channel domain, feed the concatenated feature into a channel-domain attention module, and feed the module's output into a convolution block consisting of two 3 × 3 convolution layers with 256 channels to obtain the block's output feature X41.
Step S24: Fuse the output features X40 and X30 of the fourth and third layers of the VGG16 network with the feature X41 obtained in step S23. Specifically, upsample X40 and concatenate the result with X30 along the channel domain; feed the concatenated feature into a convolution block of two 3 × 3 convolution layers with 128 channels to obtain feature X31; upsample X41 to obtain feature X32; concatenate X31 and X32 along the channel domain, feed the result into a channel-domain attention module, and feed the module's output into a convolution block of two 3 × 3 convolution layers with 128 channels to obtain the block's output feature X33.
Step S25: Fuse the output features X30 and X20 of the third and second layers of the VGG16 network with the features X31 and X33 obtained in step S24. Specifically, upsample X30 and concatenate the result with X20 along the channel domain; feed the concatenated feature into a convolution block of two 3 × 3 convolution layers with 64 channels to obtain feature X21; upsample X31 to obtain feature X22; concatenate X21 and X22 along the channel domain and feed the result into a convolution block of two 3 × 3 convolution layers with 64 channels to obtain feature X23; upsample X33 to obtain feature X24; concatenate X23 and X24 along the channel domain, feed the result into a channel-domain attention module, feed the module's output into a convolution block consisting of two 3 × 3 convolution layers with 64 channels and one 3 × 3 convolution layer with 32 channels, and feed the block's output into a 1 × 1 convolution layer with 1 channel. This completes the construction of the multi-level feature fusion crowd counting network.
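Steps S23 to S25 repeat one pattern: upsample the deeper feature by a factor of 2, concatenate with the shallower feature along the channel axis, apply channel attention, and convolve. A hedged sketch of that pattern, reusing the ChannelAttention module from the step S22 sketch; the ReLU activations, the bilinear upsampling mode and the example channel sizes are assumptions not stated in the text, and the plain branches of steps S24 and S25 use the same building blocks without the attention module:

import torch
import torch.nn as nn

def conv_block(in_ch, out_ch, n_convs=2):
    # n_convs 3x3 convolution layers with out_ch channels, as in steps S23-S25.
    layers = []
    for i in range(n_convs):
        layers += [nn.Conv2d(in_ch if i == 0 else out_ch, out_ch, 3, padding=1),
                   nn.ReLU(inplace=True)]           # activation assumed, not in the text
    return nn.Sequential(*layers)

class FusionStage(nn.Module):
    # One fusion step: upsample the deep feature x2, splice with the shallow feature
    # on the channel domain, apply channel attention, then a convolution block.
    def __init__(self, deep_ch, shallow_ch, out_ch):
        super().__init__()
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        self.attn = ChannelAttention(deep_ch + shallow_ch)  # from the step S22 sketch
        self.conv = conv_block(deep_ch + shallow_ch, out_ch)

    def forward(self, deep, shallow):
        x = torch.cat([self.up(deep), shallow], dim=1)
        return self.conv(self.attn(x))

# Step S23, for example: X41 = FusionStage(512, 512, 256)(X50, X40)
# (512 channels for the VGG16 fourth/fifth layer outputs is an assumed layer split.)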
Step S3: Initialize the network weight parameters. Specifically, for the crowd counting network obtained in step S2, the feature extractor VGG16 is initialized with the ImageNet classification weights of VGG16 without the fully connected layers, and all other convolution layers and fully connected layers are initialized from a normal distribution with μ = 0 and σ = 0.01.
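A sketch of this initialization in PyTorch; the μ = 0, σ = 0.01 values are from the text, while the module names in the usage comment are hypothetical:

import torch.nn as nn

def init_weights(module):
    # Step S3: Normal(mu=0, sigma=0.01) initialization for the non-backbone layers.
    if isinstance(module, (nn.Conv2d, nn.Linear)):
        nn.init.normal_(module.weight, mean=0.0, std=0.01)
        if module.bias is not None:
            nn.init.zeros_(module.bias)

# Hypothetical usage: apply to everything except the pretrained VGG16 backbone.
# decoder.apply(init_weights)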
Step S4: Input the crowd images and crowd density maps preprocessed in step S1 into the network and complete forward propagation.
Step S5: Calculate the loss between the forward-propagation result of step S4 and the true density map fed to the network, and update the model parameters as follows:
Step S51: Calculate the mean square error loss L_MSE between the forward-propagation result and the true density map:
L_MSE = (1/N) Σ_{i=1}^{N} || D_i^est - D_i^gt ||²

where N is the number of input samples propagated forward at one time (N = 8 in the invention), D_i^est denotes the density map computed by forward propagation for the current i-th datum, and D_i^gt denotes the true density map of the current i-th datum.
Step S52: Use the loss L_MSE computed in step S51 to update the model parameters by stochastic gradient descent.
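Steps S4, S51 and S52 together correspond to one standard training step. A hedged PyTorch sketch with the batch size N = 8 from the text; model denotes the network built in step S2 (assumed in scope), the learning rate and momentum are assumptions, and nn.MSELoss averages over all elements while the patent's formula averages over the N samples:

import torch
import torch.nn as nn

criterion = nn.MSELoss()                      # L_MSE of step S51
optimizer = torch.optim.SGD(model.parameters(), lr=1e-6, momentum=0.9)  # values assumed

def train_step(images, gt_density):
    # images: N x 3 x 400 x 400 batch (N = 8); gt_density: N x 1 x 200 x 200.
    pred = model(images)                      # forward propagation (step S4)
    loss = criterion(pred, gt_density)        # compare with the true density maps (S51)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()                          # stochastic gradient descent update (S52)
    return loss.item()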
Step S6: Iterate steps S4 and S5 a specified number of times; here the number of iterations is 50.
Step S7: Obtain the crowd density map to derive the estimated number of people. Specifically, the number of people contained in the crowd image is obtained by summing all pixels of the crowd density map computed by the model.
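Step S7 then reduces to a sum over the predicted density map; continuing the assumptions of the previous sketches:

import torch

with torch.no_grad():
    density = model(image_tensor.unsqueeze(0))    # 1 x 1 x H x W density map
    estimated_count = density.sum().item()        # estimated number of people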

Claims (1)

1. A crowd counting method based on multi-level feature fusion, characterized by comprising the following steps:
step S1: preprocessing the acquired crowd images and generating the corresponding crowd density maps from the annotation information, specifically:
step S11: centering the acquired crowd images, specifically, for each of the three image channels R, G and B, subtracting the channel mean and dividing by the channel standard deviation, the means for the R, G and B channels being (0.485, 0.456, 0.406) and the standard deviations being (0.229, 0.224, 0.225);
step S12: generating a position matrix from the provided annotation information, specifically, creating an all-zero matrix with the same resolution as the corresponding image and setting to 1 the element at each coordinate given by the annotations;
step S13: randomly cropping fixed-size image blocks and matrices from the centered crowd images and the corresponding position matrices, the crop size being 400 × 400 in the specific embodiment of the invention;
step S14: generating the corresponding crowd density map by convolving the position matrix with a Gaussian kernel, specifically, generating two one-dimensional Gaussian convolution kernels with μ = 15 and σ = 4, transposing one of them and multiplying it with the other to obtain a two-dimensional Gaussian kernel, and convolving that kernel with the elements of value 1 in the position matrix to produce the crowd density map;
step S15: down-sampling the density map generated in step S14 to 200 × 200 resolution, specifically, convolving the density map with a 2 × 2 convolution kernel whose parameters are all 1, using a stride of 2;
step S2: constructing the multi-level feature fusion crowd counting network, specifically:
step S21: building a VGG16 network without the fully connected layers;
step S22: building a channel-domain attention module, specifically, building a global average pooling layer that pools the input feature X into a 1 × 1 × C feature, appending two fully connected layers with C/4 and C neurons respectively, following them with a Sigmoid activation layer, and multiplying the activation output element-wise with the input feature X to obtain the output of the channel-domain attention module;
step S23: fusing the output features X50 and X40 of the fifth and fourth layers of the VGG16 network constructed in step S21, specifically, applying an upsampling operation to the fifth-layer output X50 (the magnification factors of all upsampling layers in the invention are 2), concatenating the upsampled feature with the fourth-layer output X40 along the channel domain, feeding the concatenated feature into a channel-domain attention module, and feeding the module's output into a convolution block consisting of two 3 × 3 convolution layers with 256 channels to obtain the block's output feature X41;
step S24: fusing the output features X40 and X30 of the fourth and third layers of the VGG16 network with the feature X41 obtained in step S23, specifically, upsampling X40 and concatenating the result with X30 along the channel domain, feeding the concatenated feature into a convolution block of two 3 × 3 convolution layers with 128 channels to obtain feature X31, upsampling X41 to obtain feature X32, concatenating X31 and X32 along the channel domain, feeding the result into a channel-domain attention module, and feeding the module's output into a convolution block of two 3 × 3 convolution layers with 128 channels to obtain the block's output feature X33;
step S25: fusing the output features X30 and X20 of the third and second layers of the VGG16 network with the features X31 and X33 obtained in step S24, specifically, upsampling X30 and concatenating the result with X20 along the channel domain, feeding the concatenated feature into a convolution block of two 3 × 3 convolution layers with 64 channels to obtain feature X21, upsampling X31 to obtain feature X22, concatenating X21 and X22 along the channel domain and feeding the result into a convolution block of two 3 × 3 convolution layers with 64 channels to obtain feature X23, upsampling X33 to obtain feature X24, concatenating X23 and X24 along the channel domain, feeding the result into a channel-domain attention module, feeding the module's output into a convolution block consisting of two 3 × 3 convolution layers with 64 channels and one 3 × 3 convolution layer with 32 channels, and feeding the block's output into a 1 × 1 convolution layer with 1 channel, thereby completing the construction of the multi-level feature fusion crowd counting network;
step S3: initializing the network weight parameters, specifically, for the crowd counting network obtained in step S2, initializing the feature extractor VGG16 with the ImageNet classification weights of VGG16 without the fully connected layers, and initializing all other convolution layers and fully connected layers from a normal distribution with μ = 0 and σ = 0.01;
step S4: inputting the crowd images and crowd density maps preprocessed in step S1 into the network to complete forward propagation;
step S5: calculating the loss between the forward-propagation result of step S4 and the true density map fed to the network, and updating the model parameters, specifically:
step S51: calculating the mean square error loss L_MSE between the forward-propagation result and the true density map:
L_MSE = (1/N) Σ_{i=1}^{N} || D_i^est - D_i^gt ||²

where N is the number of input samples propagated forward at one time (N = 8 in the invention), D_i^est denotes the density map computed by forward propagation for the current i-th datum, and D_i^gt denotes the true density map of the current i-th datum;
step S52: using the loss L_MSE computed in step S51 to update the model parameters by stochastic gradient descent;
step S6: iterating steps S4 and S5 a specified number of times, the number of iterations being 50;
step S7: obtaining the crowd density map to derive the estimated number of people, specifically, summing all pixels of the crowd density map computed by the model to obtain the number of people contained in the crowd image.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010284030.5A 2020-04-13 2020-04-13 Crowd counting method based on multi-level feature fusion


Publications (2)

Publication Number Publication Date
CN111488834A true CN111488834A (en) 2020-08-04
CN111488834B CN111488834B (en) 2023-07-04

Family

ID=71792806

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010284030.5A Active CN111488834B (en) 2020-04-13 2020-04-13 Crowd counting method based on multi-level feature fusion

Country Status (1)

Country Link
CN (1) CN111488834B (en)



Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107301387A (en) * 2017-06-16 2017-10-27 华南理工大学 A kind of image Dense crowd method of counting based on deep learning
CN109271960A (en) * 2018-10-08 2019-01-25 燕山大学 A kind of demographic method based on convolutional neural networks
CN109598220A (en) * 2018-11-26 2019-04-09 山东大学 A kind of demographic method based on the polynary multiple dimensioned convolution of input
CN109903339A (en) * 2019-03-26 2019-06-18 南京邮电大学 A kind of video group personage's position finding and detection method based on multidimensional fusion feature
CN110705344A (en) * 2019-08-21 2020-01-17 中山大学 Crowd counting model based on deep learning and implementation method thereof

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112801340A (en) * 2020-12-16 2021-05-14 北京交通大学 Crowd density prediction method based on multilevel city information unit portrait
CN112801340B (en) * 2020-12-16 2024-04-26 北京交通大学 Crowd density prediction method based on multi-level city information unit portraits

Also Published As

Publication number Publication date
CN111488834B (en) 2023-07-04

Similar Documents

Publication Publication Date Title
CN112541503B (en) Real-time semantic segmentation method based on context attention mechanism and information fusion
CN110210551B (en) Visual target tracking method based on adaptive subject sensitivity
CN107704857A (en) A kind of lightweight licence plate recognition method and device end to end
CN113344806A (en) Image defogging method and system based on global feature fusion attention network
CN111815665B (en) Single image crowd counting method based on depth information and scale perception information
CN112396607A (en) Streetscape image semantic segmentation method for deformable convolution fusion enhancement
CN113449735B (en) Semantic segmentation method and device for super-pixel segmentation
CN107506792B (en) Semi-supervised salient object detection method
CN111640116B (en) Aerial photography graph building segmentation method and device based on deep convolutional residual error network
CN105243154A (en) Remote sensing image retrieval method and system based on significant point characteristics and spare self-encodings
CN108921850B (en) Image local feature extraction method based on image segmentation technology
CN114048822A (en) Attention mechanism feature fusion segmentation method for image
CN111833360B (en) Image processing method, device, equipment and computer readable storage medium
CN112348870A (en) Significance target detection method based on residual error fusion
CN113269224A (en) Scene image classification method, system and storage medium
CN112967218A (en) Multi-scale image restoration system based on wire frame and edge structure
CN116797787B (en) Remote sensing image semantic segmentation method based on cross-modal fusion and graph neural network
CN116258757A (en) Monocular image depth estimation method based on multi-scale cross attention
CN111488834B (en) Crowd counting method based on multi-level feature fusion
CN115049945A (en) Method and device for extracting lodging area of wheat based on unmanned aerial vehicle image
CN117726954A (en) Sea-land segmentation method and system for remote sensing image
CN111275076B (en) Image significance detection method based on feature selection and feature fusion
CN113553949A (en) Tailing pond semantic segmentation method based on photogrammetric data
CN112560719A (en) High-resolution image water body extraction method based on multi-scale convolution-multi-core pooling
CN116543165A (en) Remote sensing image fruit tree segmentation method based on dual-channel composite depth network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant