WO2023207073A1 - Target detection method, apparatus, device and medium - Google Patents

Target detection method, apparatus, device and medium

Info

Publication number
WO2023207073A1
WO2023207073A1 (PCT/CN2022/135154)
Authority
WO
WIPO (PCT)
Prior art keywords
image
target detection
data
position parameters
image size
Prior art date
Application number
PCT/CN2022/135154
Other languages
English (en)
French (fr)
Inventor
金良
郭振华
赵雅倩
范宝余
刘璐
徐聪
李辰
蒋东东
Original Assignee
浪潮电子信息产业股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 浪潮电子信息产业股份有限公司
Publication of WO2023207073A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection

Definitions

  • the present application relates to the field of image detection technology, and in particular to a target detection method, device, equipment and non-volatile readable storage medium.
  • Target detection is one of the most important research directions in the field of computer vision.
  • Common public datasets include COCO, VOC, object365, etc., and algorithms based on these public datasets include YOLO, CenterNet, etc.
  • the PANDA dataset is the first super-resolution, large-scene object detection dataset for real scenes: a scene covers a square-kilometer-scale area on average, thousands of people can be observed at the same time, and faces are clearly identifiable hundreds of meters away.
  • the video resolution approaches one billion pixels, but existing computing power cannot feed the entire image into a model for training or inference. Simply reducing the image resolution loses a large amount of detail and many small targets, so algorithms based on common datasets such as COCO cannot adapt to such a scene.
  • a super-resolution image is usually divided into a number of small images the model can process; each small image is fed into the model in turn for inference to obtain its detection result, and the detection results of all the small images are finally stitched into the detection result of the super-resolution image.
  • because the original super-resolution image is divided into small blocks fed into the detection model one after another, detection efficiency suffers severely.
  • the purpose of this application is to provide a target detection method, device, equipment and medium that can improve the efficiency of target detection.
  • the specific plan is as follows:
  • This application discloses a target detection method, including:
  • the target detection results are fused to obtain the final target detection result.
  • obtaining the tile position parameters corresponding to the image size includes:
  • obtaining a preset parameter mapping table corresponding to the image size, where the preset parameter mapping table is a mapping table between different scales and different tile position parameters;
  • tiling the image to be detected based on the tile position parameters to obtain tile data includes:
  • tiling the image to be detected based on the different tile position parameters corresponding to different scales, to obtain the tile data at each scale.
  • the construction process of the preset parameter mapping table includes:
  • fusing the target detection results to obtain the final target detection result includes:
  • fusing the target detection results of the tile data at the same scale, respectively, to obtain fusion results;
  • constructing all the tile data of the image to be detected into one batch includes:
  • fusing the target detection results of the tile data at the same scale, respectively, to obtain the fusion results, includes:
  • fusing the target detection results of the tile data at the same scale based on the identification information, respectively, to obtain the fusion results.
  • the tile position parameters are tile position parameters calculated based on the image size, a preset sliding-window size, and a preset sliding-window step.
  • optionally, the method further includes:
  • correspondingly, obtaining the tile position parameters corresponding to the image size includes:
  • This application discloses a target detection device, including:
  • a size determination module, used to determine the image size of the image to be detected;
  • a parameter acquisition module, used to obtain the tile position parameters corresponding to the image size;
  • an image tiling module, used to tile the image to be detected based on the tile position parameters to obtain tile data;
  • a data construction module, used to construct all the tile data of the image to be detected into one batch;
  • a model inference module, used to input the batch into the target detection model to obtain the target detection result of each tile;
  • a result determination module, used to fuse the target detection results to obtain the final target detection result.
  • This application discloses an electronic device, including a processor and a memory; wherein,
  • the memory is used to store a computer program;
  • the processor is used to execute the computer program to implement the aforementioned target detection method.
  • This application discloses a computer-readable storage medium for storing a computer program, wherein the computer program implements the aforementioned target detection method when executed by a processor.
  • this application first determines the image size of the image to be detected, then obtains the tile position parameters corresponding to the image size, then tiles the image to be detected based on the tile position parameters to obtain tile data, and constructs all the tile data of the image to be detected into one batch; the batch is then input into the target detection model to obtain the target detection result of each tile, and finally the target detection results are fused to obtain the final target detection result.
  • in other words, this application first obtains the corresponding tile position parameters based on the image size of the image to be detected, then tiles the image to be detected based on those parameters and constructs all the tile data into one batch. In this way, the target detection model performs a single inference pass over the batch to obtain the per-tile target detection results, which are finally fused into the final detection result. This reduces the number of model inference passes and thereby improves target detection efficiency.
  • Figure 1 is a flow chart of a target detection method disclosed in this application.
  • Figure 2 is a flow chart of a specific target detection method disclosed in this application.
  • Figure 3 is a schematic structural diagram of a target detection device disclosed in this application.
  • Figure 4 is a structural diagram of an electronic device disclosed in this application.
  • at present, sliding-window tiling and stitching is often used to handle super-resolution large-scene object detection tasks.
  • this approach borrows from classic target detection algorithms: a fixed-size window slides across the original image with a fixed step, and each time only the image region overlapping the window is processed. A super-resolution image is thereby divided into a number of small images the model can process, and the detection results of the small images are stitched into the detection result of the large image.
  • during training, the sliding-window tiling method decomposes images into small images used as training samples for model tuning; during testing, the same method is used to predict the tiles of the image to be detected and stitch them into one complete target detection result.
  • although the sliding-window tiling algorithm performs well, the window size and overlap must be set manually, and manually chosen parameters cannot match every situation: a target may be cut in two by the sliding window so that it never appears complete inside either of two adjacent windows, leading to missed detections.
  • to solve this problem, the multi-scale pyramid tiling inference algorithm came into being.
  • the multi-scale pyramid down-sampling tiling method tiles the image and down-samples the tiles to different sizes in turn; the tiled images are then detected and all detection results are fused.
  • although the multi-scale pyramid tiling inference algorithm solves the missed-detection problem of the sliding-window tiling algorithm, during inference the original super-resolution image is still divided into small blocks fed into the detection model one after another, which inevitably hurts inference efficiency. That is to say, at present, for target detection on super-resolution images, a super-resolution image is usually divided into a number of small images the model can process; each small image is fed into the model in turn for inference, and the detection results of all the small images are finally stitched into the detection result of the super-resolution image.
  • because the original super-resolution image is divided into small blocks fed into the detection model one after another, detection efficiency suffers severely.
  • an embodiment of the present application discloses a target detection method, including:
  • Step S11: Determine the image size of the image to be detected.
  • in a specific implementation, the image to be detected is a super-resolution image of a large scene.
  • when an image captured by a camera device is obtained, the image size of the image is first determined.
  • for example, the image size is 204800 pixels × 115200 pixels.
  • Step S12: Obtain the tile position parameters corresponding to the image size.
  • the image sizes of the images captured by each camera device are obtained; the tile position parameters corresponding to each image size are computed respectively, yielding the mapping relationship between each image size and each set of tile position parameters; the tile position parameters corresponding to the image size are then obtained based on this mapping relationship.
  • a preset parameter mapping table corresponding to the image size can be obtained, where the preset parameter mapping table is a mapping table between different scales and different tile position parameters.
  • the construction process of the preset parameter mapping table includes: determining the tile sizes corresponding to different scales based on the numbers of splits at different scales and the image size; and computing the tile position parameters corresponding to different scales based on the tile sizes, to obtain the preset parameter mapping table.
  • an embodiment of the present application can use the multi-scale pyramid tiling algorithm to compute the tile position parameters corresponding to each image size and build the preset parameter mapping table corresponding to each image size.
  • This algorithm computes the position of each tile from a known number of splits. Let the numbers of splits in the horizontal and vertical directions be h_nums and v_nums respectively. Two cases arise:
  • (1) No overlap between tiles. The tile size is computed as sliding_w = img_w / h_nums and sliding_h = img_h / v_nums, where img_w and img_h denote the width and height of the image, and sliding_w and sliding_h denote the width and height of a tile. Correspondingly, the position of each tile in the original image is computed as:
  • start_pos_r = (h_idx − 1) * sliding_w
  • end_pos_r = h_idx * sliding_w
  • start_pos_c = (v_idx − 1) * sliding_h
  • end_pos_c = v_idx * sliding_h
  • where h_idx ∈ {1, …, h_nums} and v_idx ∈ {1, …, v_nums}; start_pos_r and end_pos_r denote the row start and end positions, and start_pos_c and end_pos_c denote the column start and end positions.
  • (2) Overlap between tiles, with horizontal and vertical overlap ratios h_ratio and v_ratio. The tile size is computed as sliding_w = img_w / ((h_nums − 1)(1 − h_ratio) + 1) and sliding_h = img_h / ((v_nums − 1)(1 − v_ratio) + 1), with img_w, img_h, sliding_w and sliding_h defined as above. Correspondingly, the position of each tile in the original image is computed as:
  • start_pos_r = (h_idx − 1) * (1 − h_ratio) * sliding_w
  • end_pos_r = (h_idx − 1) * (1 − h_ratio) * sliding_w + sliding_w
  • start_pos_c = (v_idx − 1) * (1 − v_ratio) * sliding_h
  • end_pos_c = (v_idx − 1) * (1 − v_ratio) * sliding_h + sliding_h
  • where h_idx ∈ {1, …, h_nums} and v_idx ∈ {1, …, v_nums}, and the row/column start and end positions are as defined above.
  • on the basis of the selected multi-scale pyramid tiling algorithm, this application can set the relevant parameters and build a mapping table between scales and tile position parameters for all image sizes.
  • the specific steps include: obtain the current image size from all image sizes, denoted w, h; according to w and h, use the multi-scale overlapping pyramid tiling algorithm to compute the tile region positions at each scale, each region storing its row start and end positions and its column start and end positions, as shown in Table 1; establish and store the mapping table between the different scales of the current image size w, h and the tile parameters;
  • traverse all image sizes and establish the scale-to-tile parameter mapping table corresponding to each image size. In this way, for an image to be detected, all the tile data can be obtained in one shot based on the parameter mapping table and Python vectorized computation, for example via a single vectorized reshape in the non-overlapping case (see the sketch below).
  • Table 1
  • Number of pyramid tiles    Tile position parameters
  • 1×1    Region1 = [0:img_w, 0:img_h]
  • 2×2    Region1, Region2, Region3, Region4
  • 3×3    Region1, Region2, …, Region9
  • …    …
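  • As an illustration of the vectorized-extraction idea only — the function name, the use of numpy, and the assumption of non-overlapping, equal-size tiles at one scale are ours, not the application's — all tiles of a scale can be produced with a single reshape instead of a Python loop:

```python
import numpy as np

# Sketch only: vectorized extraction of all non-overlapping tiles at one
# scale; a single reshape replaces a Python loop over tile regions.
def tiles_vectorized(image, h_nums, v_nums):
    h, w = image.shape[:2]
    tile_h, tile_w = h // v_nums, w // h_nums
    img = image[:tile_h * v_nums, :tile_w * h_nums]        # crop to a multiple
    img = img.reshape(v_nums, tile_h, h_nums, tile_w, -1)  # split rows and cols
    img = img.swapaxes(1, 2)                               # group tile indices
    return img.reshape(v_nums * h_nums, tile_h, tile_w, -1)

tiles = tiles_vectorized(np.zeros((1080, 1920, 3)), 4, 4)  # -> (16, 270, 480, 3)
```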
  • the tile position parameter is a tile position parameter calculated based on the image size, the preset sliding window size, and the preset sliding window movement step.
  • an embodiment of the present application can use the sliding-window tiling algorithm to compute the tile position parameters corresponding to each image size.
  • This algorithm splits the image directly from a known sliding-window size and overlap.
  • the numbers of tiles in the horizontal and vertical directions are computed as h_nums = ⌈(img_w − sliding_w) / step_w⌉ + 1 and v_nums = ⌈(img_h − sliding_h) / step_h⌉ + 1,
  • where h_nums and v_nums denote the numbers of splits in the horizontal and vertical directions respectively,
  • step_w and step_h denote the horizontal and vertical steps of the sliding window (normally sliding_w > step_w and sliding_h > step_h, to guarantee overlap),
  • and img_w and img_h denote the width and height of the image, while sliding_w and sliding_h denote the width and height of a tile.
  • Step S13: Tile the image to be detected based on the tile position parameters to obtain tile data.
  • the image to be detected can be tiled based on the different tile position parameters corresponding to different scales, to obtain the tile data at each scale.
  • Step S14: Construct all the tile data of the image to be detected into one batch.
  • identification information can be added to each tile, where the identification information identifies the scale corresponding to the tile; all the tile data of the image to be detected are then constructed into one batch;
  • Step S15: Input the batch into the target detection model to obtain the target detection result of each tile.
  • Step S16: Fuse the target detection results to obtain the final target detection result.
  • the target detection results of the tile data at the same scale can be fused respectively to obtain fusion results; the fusion results of different scales can then be fused to obtain the final target detection result.
  • the target detection results of the tile data at the same scale can be fused based on the identification information to obtain the fusion results.
  • embodiments of the present application can build a mapping table between different scales and identification information based on the identification information of each tile, as shown in Table 2: at scale 0 there is 1 tile, whose identifier is 0; at scale 1 there are 4 tiles, whose identifiers are 1, 2, 3 and 4, with start identifier 1 and end identifier 4; and so on. Correspondingly, the target detection results of the tiles at the same scale are fused based on this mapping table to obtain the fusion results.
  • Figure 2 provides a flow chart of a specific target detection method of this application.
  • Read the super-resolution image to be processed and obtain its width and height, denoted w and h respectively; based on the image size w and h, look up the corresponding multi-scale pyramid tile position parameters in the mapping table; then extract each tile's data according to the tile position parameters, and build the mapping table between the different scales and the identification information. Based on this mapping table, organize all the tile data of the entire image into one batch and attach the identification information. With the constructed batch and a pre-trained model, perform a single forward pass and save the detection results.
  • According to the batch detection results and the mapping table between scales and identification information, related algorithms such as soft-NMS (soft non-maximum suppression) or softer-NMS are used to fuse the tile detection results at the same scale. On the basis of the same-scale fusion, soft-NMS, softer-NMS or similar algorithms are then used to fuse across different scales, and the final detection result is output. In this way, the super-resolution image is assembled into one batch, and the detection of the entire image can be completed with only a single model inference pass per super-resolution image, which greatly improves the inference speed of super-resolution large-scene object detection.
  • the embodiment of the present application first determines the image size of the image to be detected, then obtains the tile position parameters corresponding to the image size, then tiles the image to be detected based on the tile position parameters to obtain tile data, and constructs all the tile data of the image to be detected into one batch; the batch is then input into the target detection model to obtain the target detection result of each tile, and finally the target detection results are fused to obtain the final target detection result.
  • in other words, this application first obtains the corresponding tile position parameters based on the image size of the image to be detected, then tiles the image to be detected based on those parameters and constructs all the tile data into one batch. In this way, the target detection model performs a single inference pass over the batch to obtain the per-tile target detection results, which are finally fused into the final detection result. This reduces the number of model inference passes and thereby improves target detection efficiency.
  • an embodiment of the present application discloses a target detection device, which includes:
  • a size determination module 11, used to determine the image size of the image to be detected;
  • a parameter acquisition module 12, used to obtain the tile position parameters corresponding to the image size;
  • an image tiling module 13, used to tile the image to be detected based on the tile position parameters to obtain tile data;
  • a data construction module 14, used to construct all the tile data of the image to be detected into one batch;
  • a model inference module 15, used to input the batch into the target detection model to obtain the target detection result of each tile;
  • a result determination module 16, used to fuse the target detection results to obtain the final target detection result.
  • the embodiment of the present application first determines the image size of the image to be detected, then obtains the tile position parameters corresponding to the image size, then tiles the image to be detected based on the tile position parameters to obtain tile data, and constructs all the tile data of the image to be detected into one batch; the batch is then input into the target detection model to obtain the target detection result of each tile, and finally the target detection results are fused to obtain the final target detection result.
  • in other words, this application first obtains the corresponding tile position parameters based on the image size of the image to be detected, then tiles the image to be detected based on those parameters and constructs all the tile data into one batch. In this way, the target detection model performs a single inference pass over the batch to obtain the per-tile target detection results, which are finally fused into the final detection result. This reduces the number of model inference passes and thereby improves target detection efficiency.
  • the parameter acquisition module 12 is specifically used to obtain a preset parameter mapping table corresponding to the image size, where the preset parameter mapping table is a mapping table between different scales and different tile position parameters;
  • correspondingly, the image tiling module 13 is specifically used to:
  • tile the image to be detected based on the different tile position parameters corresponding to different scales, to obtain the tile data at each scale.
  • the device also includes a parameter mapping table construction module, used to determine the tile sizes corresponding to different scales based on the numbers of splits at different scales and the image size, and to compute the tile position parameters corresponding to different scales based on the tile sizes, obtaining the preset parameter mapping table.
  • the result determination module 16 specifically includes:
  • a first fusion sub-module, used to fuse the target detection results of the tile data at the same scale respectively to obtain fusion results;
  • a second fusion sub-module, used to fuse the fusion results of different scales to obtain the final target detection result.
  • the data construction module 14 specifically includes:
  • an identification information adding sub-module, used to add identification information to each tile, where the identification information identifies the scale corresponding to the tile;
  • a data construction sub-module, used to construct all the tile data of the image to be detected into one batch;
  • correspondingly, the first fusion sub-module is specifically used to fuse the target detection results of the tile data at the same scale based on the identification information, to obtain the fusion results.
  • the tile position parameters are tile position parameters calculated based on the image size, a preset sliding-window size, and a preset sliding-window step.
  • the device also includes:
  • an image size determination module, used to obtain the image sizes of the images captured by each camera device;
  • a mapping relationship acquisition module, used to compute the tile position parameters corresponding to each image size, obtaining the mapping relationship between each image size and each set of tile position parameters;
  • correspondingly, the parameter acquisition module 12 is specifically configured to obtain the tile position parameters corresponding to the image size based on the mapping relationship.
  • the embodiment of the present application discloses an electronic device 20, which includes a processor 21 and a memory 22; the memory 22 is used to store the computer program, and the processor 21 is used to execute the computer program to implement the target detection method disclosed in the foregoing embodiments.
  • the memory 22, as a carrier for resource storage, may be a read-only memory, a random access memory, a magnetic disk or an optical disk, and the storage may be transient or permanent.
  • the electronic device 20 also includes a power supply 23, a communication interface 24, an input/output interface 25 and a communication bus 26. The power supply 23 provides the operating voltage for each hardware device on the electronic device 20; the communication interface 24 can create a data transmission channel between the electronic device 20 and external devices, and the communication protocol it follows may be any communication protocol applicable to the technical solution of this application, which is not specifically limited here; the input/output interface 25 is used to obtain external input data or to output data to the outside world, and its specific interface type may be selected according to the specific application needs, which is likewise not specifically limited here.
  • embodiments of the present application also disclose a computer non-volatile readable storage medium for storing a computer program, where the computer program, when executed by a processor, implements the target detection method disclosed in the foregoing embodiments.
  • the computer non-volatile readable storage medium may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as one or more of static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk or optical disk.
  • the steps of the methods or algorithms described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

A target detection method, apparatus, device and medium, including: determining the image size of an image to be detected (S11); obtaining tile position parameters corresponding to the image size (S12); tiling the image to be detected based on the tile position parameters to obtain tile data (S13); constructing all the tile data of the image to be detected into one batch (S14); inputting the batch into a target detection model to obtain the target detection result of each tile (S15); and fusing the target detection results to obtain a final target detection result (S16). In this way, the target detection model performs a single inference pass over the batch to obtain the target detection result of each tile, and the per-tile results are finally fused into the final detection result; this reduces the number of model inference passes and thereby improves target detection efficiency.

Description

Target detection method, apparatus, device and medium
CROSS-REFERENCE TO RELATED APPLICATION
This application claims priority to Chinese patent application No. 202210466978.1, entitled "一种目标检测方法、装置、设备及介质" ("Target detection method, apparatus, device and medium"), filed with the China Patent Office on April 29, 2022, the entire contents of which are incorporated herein by reference.
TECHNICAL FIELD
This application relates to the field of image detection technology, and in particular to a target detection method, apparatus, device and non-volatile readable storage medium.
BACKGROUND
Target detection is one of the most important research directions in computer vision. Common public datasets include COCO, VOC, object365, etc., and algorithms based on these public datasets include YOLO, CenterNet, etc. In real scenes, however, the resolution of images captured by cameras is far larger than that of the public datasets, which makes existing models hard to adapt to real scenes. The PANDA dataset is the first super-resolution, large-scene object detection dataset for real scenes: a scene covers a square-kilometer-scale area on average, thousands of people can be observed at the same time, faces are clearly identifiable hundreds of meters away, and the video resolution approaches one billion pixels. Existing computing power cannot feed the entire image into a model for training or inference, and simply lowering the image resolution loses a large amount of detail and many small targets, so algorithms based on common datasets such as COCO cannot adapt to such scenes.
At present, for target detection on super-resolution images, a super-resolution image is usually divided into a number of small images the model can process; each small image is fed into the model in turn for inference to obtain its detection result, and the detection results of all the small images are finally stitched into the detection result of the super-resolution image. However, because the original super-resolution image is divided into small blocks fed into the detection model one after another, detection efficiency suffers severely.
SUMMARY
In view of this, the purpose of this application is to provide a target detection method, apparatus, device and medium that can improve target detection efficiency. The specific scheme is as follows:
This application discloses a target detection method, including:
determining the image size of an image to be detected;
obtaining tile position parameters corresponding to the image size;
tiling the image to be detected based on the tile position parameters to obtain tile data;
constructing all the tile data of the image to be detected into one batch;
inputting the batch into a target detection model to obtain the target detection result of each tile;
fusing the target detection results to obtain a final target detection result.
Optionally, obtaining the tile position parameters corresponding to the image size includes:
obtaining a preset parameter mapping table corresponding to the image size, where the preset parameter mapping table is a mapping table between different scales and different tile position parameters;
correspondingly, tiling the image to be detected based on the tile position parameters to obtain tile data includes:
tiling the image to be detected based on the different tile position parameters corresponding to different scales, to obtain the tile data at each scale.
Optionally, the construction process of the preset parameter mapping table includes:
determining the tile sizes corresponding to different scales based on the numbers of splits at different scales and the image size;
computing the tile position parameters corresponding to different scales based on the tile sizes, to obtain the preset parameter mapping table.
Optionally, fusing the target detection results to obtain the final target detection result includes:
fusing the target detection results of the tile data at the same scale, respectively, to obtain fusion results;
fusing the fusion results of different scales to obtain the final target detection result.
Optionally, constructing all the tile data of the image to be detected into one batch includes:
adding identification information to each tile, where the identification information identifies the scale corresponding to the tile;
constructing all the tile data of the image to be detected into one batch;
correspondingly, fusing the target detection results of the tile data at the same scale, respectively, to obtain fusion results includes:
fusing the target detection results of the tile data at the same scale based on the identification information, respectively, to obtain the fusion results.
Optionally, the tile position parameters are tile position parameters calculated based on the image size, a preset sliding-window size and a preset sliding-window step.
Optionally, the method further includes:
obtaining the image sizes of the images captured by each camera device;
computing the tile position parameters corresponding to each image size, respectively, to obtain the mapping relationship between each image size and each set of tile position parameters;
correspondingly, obtaining the tile position parameters corresponding to the image size includes:
obtaining the tile position parameters corresponding to the image size based on the mapping relationship.
This application discloses a target detection device, including:
a size determination module, used to determine the image size of an image to be detected;
a parameter acquisition module, used to obtain tile position parameters corresponding to the image size;
an image tiling module, used to tile the image to be detected based on the tile position parameters to obtain tile data;
a data construction module, used to construct all the tile data of the image to be detected into one batch;
a model inference module, used to input the batch into a target detection model to obtain the target detection result of each tile;
a result determination module, used to fuse the target detection results to obtain a final target detection result.
This application discloses an electronic device, including a processor and a memory, where
the memory is used to store a computer program;
the processor is used to execute the computer program to implement the aforementioned target detection method.
This application discloses a computer-readable storage medium for storing a computer program, where the computer program, when executed by a processor, implements the aforementioned target detection method.
It can be seen that this application first determines the image size of the image to be detected, then obtains the tile position parameters corresponding to the image size, then tiles the image to be detected based on the tile position parameters to obtain tile data, constructs all the tile data of the image to be detected into one batch, inputs the batch into the target detection model to obtain the target detection result of each tile, and finally fuses the target detection results to obtain the final target detection result. In other words, this application first obtains the corresponding tile position parameters based on the image size of the image to be detected, then tiles the image to be detected based on those parameters and constructs all the tile data into one batch; in this way, the target detection model performs a single inference pass over the batch to obtain the per-tile detection results, which are finally fused into the final detection result. This reduces the number of model inference passes and thereby improves target detection efficiency.
BRIEF DESCRIPTION OF THE DRAWINGS
To describe the technical solutions in the embodiments of this application or the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are merely embodiments of this application, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Figure 1 is a flow chart of a target detection method disclosed in this application;
Figure 2 is a flow chart of a specific target detection method disclosed in this application;
Figure 3 is a schematic structural diagram of a target detection device disclosed in this application;
Figure 4 is a structural diagram of an electronic device disclosed in this application.
DETAILED DESCRIPTION
The technical solutions in the embodiments of this application are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some of the embodiments of this application, not all of them. Based on the embodiments of this application, all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the protection scope of this application.
At present, super-resolution large-scene object detection tasks are often handled by sliding-window tiling and stitching, an approach that borrows from classic target detection algorithms. A fixed-size window slides across the original image with a fixed step, and each time only the image region overlapping the window is processed; a super-resolution image is thereby divided into a number of small images the model can process, and the detection results of the small images are stitched into the detection result of the large image. During training, the sliding-window tiling method decomposes images into small images used as training samples for model tuning; during testing, the same method is used to predict the tiles of the image to be detected and stitch them into one complete target detection result. Although sliding-window tiling performs well, the window size and the overlap must be set manually, and manually chosen parameters cannot match every situation: a target may be cut in two by the sliding window so that it never appears complete inside either of two adjacent windows, causing missed detections. To solve this problem, the multi-scale pyramid tiling inference algorithm came into being: the image is tiled and down-sampled to different sizes in turn, the resulting tiles are detected, and all detection results are fused. Although the multi-scale pyramid tiling inference algorithm solves the missed-detection problem of sliding-window tiling, during inference the original super-resolution image is still divided into small blocks fed into the detection model one after another, which inevitably hurts inference efficiency. That is to say, at present, for target detection on super-resolution images, a super-resolution image is usually divided into a number of small images the model can process; each small image is fed into the model in turn for inference, and the detection results of all the small images are finally stitched into the detection result of the super-resolution image. Because the original super-resolution image is divided into small blocks fed into the detection model one after another, detection efficiency suffers severely.
Referring to Figure 1, an embodiment of this application discloses a target detection method, including:
Step S11: Determine the image size of the image to be detected.
In a specific implementation, the image to be detected is a super-resolution image of a large scene. When an image captured by a camera device is obtained, the image size of the image is first determined; for example, the image size is 204800 pixels × 115200 pixels.
Step S12: Obtain the tile position parameters corresponding to the image size.
In a specific implementation, the image sizes of the images captured by each camera device are obtained; the tile position parameters corresponding to each image size are computed respectively, yielding the mapping relationship between each image size and each set of tile position parameters; the tile position parameters corresponding to the image size are then obtained based on this mapping relationship.
It should be pointed out that in real scenes super-resolution images often come from multiple cameras whose parameters differ, producing different resolutions; moreover, the same camera may produce different resolutions at different times as its parameters change. Therefore, the various image sizes of the images to be detected are collected, the tile position parameters corresponding to each image size are pre-computed, and the mapping relationship between the image sizes and the tile position parameters is built; when an image to be detected is obtained, its image size is determined and the corresponding tile position parameters are obtained from this mapping relationship.
In one implementation, a preset parameter mapping table corresponding to the image size can be obtained, where the preset parameter mapping table is a mapping table between different scales and different tile position parameters.
The construction process of the preset parameter mapping table includes: determining the tile sizes corresponding to different scales based on the numbers of splits at different scales and the image size; and computing the tile position parameters corresponding to different scales based on the tile sizes, to obtain the preset parameter mapping table.
That is, in a specific implementation, an embodiment of this application can use the multi-scale pyramid tiling algorithm to compute the tile position parameters corresponding to each image size and build the preset parameter mapping table corresponding to each image size. This algorithm computes the position of each tile from a known number of splits. Let the numbers of splits in the horizontal and vertical directions be h_nums and v_nums respectively; two cases arise:
(1) No overlap, i.e., the tiles do not overlap. The tile size is computed as:
sliding_w = img_w / h_nums
sliding_h = img_h / v_nums
where img_w and img_h denote the width and height of the image, and sliding_w and sliding_h denote the width and height of a tile. Correspondingly, the position of each tile in the original image is computed as:
start_pos_r = (h_idx − 1) * sliding_w
end_pos_r = h_idx * sliding_w
start_pos_c = (v_idx − 1) * sliding_h
end_pos_c = v_idx * sliding_h
where h_idx ∈ {1, …, h_nums} and v_idx ∈ {1, …, v_nums}; start_pos_r and end_pos_r denote the row start and end positions, and start_pos_c and end_pos_c denote the column start and end positions. This yields tile position parameters comprising start_pos_r, end_pos_r, start_pos_c and end_pos_c.
(2) Overlap, i.e., the tiles overlap, with horizontal and vertical overlap ratios h_ratio and v_ratio. The tile size is computed as:
sliding_w = img_w / ((h_nums − 1) * (1 − h_ratio) + 1)
sliding_h = img_h / ((v_nums − 1) * (1 − v_ratio) + 1)
where img_w and img_h denote the width and height of the image, and sliding_w and sliding_h denote the width and height of a tile. Correspondingly, the position of each tile in the original image is computed as:
start_pos_r = (h_idx − 1) * (1 − h_ratio) * sliding_w
end_pos_r = (h_idx − 1) * (1 − h_ratio) * sliding_w + sliding_w
start_pos_c = (v_idx − 1) * (1 − v_ratio) * sliding_h
end_pos_c = (v_idx − 1) * (1 − v_ratio) * sliding_h + sliding_h
where h_idx ∈ {1, …, h_nums} and v_idx ∈ {1, …, v_nums}; start_pos_r and end_pos_r denote the row start and end positions, and start_pos_c and end_pos_c denote the column start and end positions. This yields tile position parameters comprising start_pos_r, end_pos_r, start_pos_c and end_pos_c.
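Likewise, a minimal sketch of the overlapping case; the tile-size expressions follow from requiring the last tile to end exactly at the image border, and the function name and rounding are our assumptions:

```python
# Sketch only: tile regions for the overlapping case at one pyramid scale.
def overlap_regions(img_w, img_h, h_nums, v_nums, h_ratio, v_ratio):
    sliding_w = img_w / ((h_nums - 1) * (1 - h_ratio) + 1)
    sliding_h = img_h / ((v_nums - 1) * (1 - v_ratio) + 1)
    regions = []
    for h_idx in range(1, h_nums + 1):
        for v_idx in range(1, v_nums + 1):
            start_pos_r = (h_idx - 1) * (1 - h_ratio) * sliding_w
            start_pos_c = (v_idx - 1) * (1 - v_ratio) * sliding_h
            regions.append((round(start_pos_r), round(start_pos_r + sliding_w),
                            round(start_pos_c), round(start_pos_c + sliding_h)))
    return regions
```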
That is, on the basis of the selected multi-scale pyramid tiling algorithm, this application can set the relevant parameters and build the mapping table between scales and tile position parameters for all image sizes. Taking overlapping multi-scale pyramid tiling as an example, building this mapping table specifically includes: obtain the current image size from all image sizes, denoted w, h; according to w and h, use the overlapping multi-scale pyramid tiling algorithm to compute the tile region positions at each scale, each region storing its row start and end positions and its column start and end positions, as shown in Table 1; establish and store the mapping table between the different scales of the current image size w, h and the tile parameters; traverse all image sizes and establish the scale-to-tile parameter mapping table corresponding to each image size. In this way, for an image to be detected, all the tile data can be obtained in one shot based on the parameter mapping table together with Python vectorized computation.
Table 1
Number of pyramid tiles    Tile position parameters
1×1    Region1 = [0:img_w, 0:img_h]
2×2    Region1, Region2, Region3, Region4
3×3    Region1, Region2, …, Region9
…    …
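A sketch of how such a Table-1 style mapping could be assembled, reusing overlap_regions from the sketch above; the scale range and the overlap ratios are illustrative assumptions only:

```python
# Sketch only: maps each pyramid scale (1x1, 2x2, 3x3, ...) to its tile
# regions for one image size, then builds one such table per camera size.
def build_scale_table(img_w, img_h, max_splits=3, h_ratio=0.2, v_ratio=0.2):
    return {n: overlap_regions(img_w, img_h, n, n, h_ratio, v_ratio)
            for n in range(1, max_splits + 1)}   # n = 1 gives [0:img_w, 0:img_h]

def build_param_maps(image_sizes):
    return {(w, h): build_scale_table(w, h) for (w, h) in image_sizes}

param_maps = build_param_maps([(1920, 1080), (3840, 2160)])
```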
In another implementation, the tile position parameters are tile position parameters calculated based on the image size, a preset sliding-window size and a preset sliding-window step.
Further, in a specific implementation, an embodiment of this application can use the sliding-window tiling algorithm to compute the tile position parameters corresponding to each image size. This algorithm splits the image directly from a known sliding-window size and overlap; the numbers of tiles in the horizontal and vertical directions are computed as:
h_nums = ⌈(img_w − sliding_w) / step_w⌉ + 1
v_nums = ⌈(img_h − sliding_h) / step_h⌉ + 1
where h_nums and v_nums denote the numbers of splits in the horizontal and vertical directions, and step_w and step_h denote the horizontal and vertical steps of the sliding window; normally sliding_w > step_w and sliding_h > step_h, to guarantee overlap. img_w and img_h denote the width and height of the image, and sliding_w and sliding_h denote the width and height of a tile.
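A one-function sketch of these counts, assuming the ceiling form given above:

```python
import math

# Sketch only: number of sliding-window tiles per axis; sliding_w > step_w
# and sliding_h > step_h guarantee overlap between adjacent windows.
def window_counts(img_w, img_h, sliding_w, sliding_h, step_w, step_h):
    h_nums = math.ceil((img_w - sliding_w) / step_w) + 1
    v_nums = math.ceil((img_h - sliding_h) / step_h) + 1
    return h_nums, v_nums

print(window_counts(4000, 3000, 1024, 1024, 768, 768))  # (5, 4)
```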
Step S13: Tile the image to be detected based on the tile position parameters to obtain tile data.
In one implementation, the image to be detected can be tiled based on the different tile position parameters corresponding to different scales, to obtain the tile data at each scale.
Step S14: Construct all the tile data of the image to be detected into one batch.
In a specific implementation, identification information can be added to each tile, where the identification information identifies the scale corresponding to the tile; all the tile data of the image to be detected are then constructed into one batch.
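A sketch of this step under the scheme above; numpy and OpenCV resizing are our assumptions (the application does not prescribe a specific library), and the scale-table keys serve as the scale identifiers:

```python
import numpy as np
import cv2  # assumed available for resizing; any resize routine works

# Sketch only: crops every tile at every scale, resizes to the model input
# size, stacks everything into one batch, and records each tile's scale
# identifier so the results can later be grouped per scale. Region tuples
# follow the application's (start_r, end_r, start_c, end_c) convention,
# where the "row" positions span the image width.
def build_batch(image, scale_table, model_size=(640, 640)):
    tiles, scale_ids = [], []
    for scale, regions in sorted(scale_table.items()):
        for (x0, x1, y0, y1) in regions:
            tiles.append(cv2.resize(image[y0:y1, x0:x1], model_size))
            scale_ids.append(scale)
    return np.stack(tiles), np.array(scale_ids)   # one batch, one forward pass
```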
Step S15: Input the batch into the target detection model to obtain the target detection result of each tile.
Step S16: Fuse the target detection results to obtain the final target detection result.
In one implementation, the target detection results of the tile data at the same scale can be fused respectively to obtain fusion results, and the fusion results of different scales can then be fused to obtain the final target detection result.
In a specific implementation, the target detection results of the tile data at the same scale can be fused based on the identification information, respectively, to obtain the fusion results.
Further, an embodiment of this application can build a mapping table between the different scales and the identification information based on the identification information of each tile, as shown in Table 2: at scale 0 there is 1 tile, whose identifier is 0; at scale 1 there are 4 tiles, whose identifiers are 1, 2, 3 and 4, with start identifier 1 and end identifier 4; and so on. Correspondingly, the target detection results of the tiles at the same scale are fused based on this mapping table to obtain the fusion results.
Table 2
Scale    Start identifier    End identifier    Number of tiles
0    0    0    1
1    1    4    4
2    5    13    9
…    …    …    …
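A small sketch of how the Table-2 style identifier ranges could be derived from the per-scale tile counts; the function and field names are illustrative only:

```python
# Sketch only: assigns consecutive tile identifiers across scales and
# records each scale's start/end identifiers and tile count (Table 2).
def build_id_table(tiles_per_scale):
    table, next_id = {}, 0
    for scale, count in enumerate(tiles_per_scale):   # e.g. [1, 4, 9]
        table[scale] = {"start": next_id, "end": next_id + count - 1,
                        "count": count}
        next_id += count
    return table

print(build_id_table([1, 4, 9]))
# {0: {'start': 0, 'end': 0, 'count': 1},
#  1: {'start': 1, 'end': 4, 'count': 4},
#  2: {'start': 5, 'end': 13, 'count': 9}}
```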
Of course, when hardware resources are sufficient, the batch data can be given one more dimension, so that a single card can process multiple super-resolution images at the same time; the corresponding mapping table between scales and identification information is shown in Table 3:
Table 3
Image name    Scale    Start identifier    End identifier    Number of tiles
Img0    0    0    0    1
Img0    1    1    4    4
Img0    2    5    13    9
Img1    0    0    0    1
Img1    1    1    4    4
Img1    2    5    13    9
For example, referring to Figure 2, Figure 2 is a flow chart of a specific target detection method provided by this application. Read the super-resolution image to be processed and obtain its width and height, denoted w and h respectively; based on the image size w and h, look up the corresponding multi-scale pyramid tile position parameters in the mapping table; then extract each tile's data according to the tile position parameters, and build the mapping table between the different scales and the identification information. Based on this mapping table, organize all the tile data of the entire image into one batch and attach the identification information. With the constructed batch and a pre-trained model, perform a single forward pass and save the detection results. According to the batch detection results and the mapping table between scales and identification information, related algorithms such as soft-NMS (soft non-maximum suppression) or softer-NMS are used to fuse the tile detection results at the same scale; on the basis of the same-scale fusion, soft-NMS, softer-NMS or similar algorithms are used to fuse across different scales, and the final detection result is output. In this way, the super-resolution image is assembled into one batch, and the detection of the entire image can be completed with only a single model inference pass per super-resolution image, which greatly improves the inference speed of super-resolution large-scene object detection.
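A sketch of this fusion stage; the soft_nms callable is assumed to be supplied by an external detection library (it is not defined by the application), and boxes are assumed already rescaled from model-input resolution back to tile resolution:

```python
# Sketch only: shifts per-tile detections back into whole-image
# coordinates, fuses them within each scale, then fuses across scales.
def fuse_detections(dets_per_tile, regions, scale_ids, soft_nms):
    by_scale = {}
    for dets, (x_off, _, y_off, _), s in zip(dets_per_tile, regions, scale_ids):
        for (x0, y0, x1, y1, score, cls) in dets:
            # tile-local box -> whole-image coordinates
            by_scale.setdefault(s, []).append(
                (x0 + x_off, y0 + y_off, x1 + x_off, y1 + y_off, score, cls))
    per_scale = [soft_nms(b) for b in by_scale.values()]     # same-scale fusion
    return soft_nms([box for b in per_scale for box in b])   # cross-scale fusion
```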
It can be seen that the embodiment of this application first determines the image size of the image to be detected, then obtains the tile position parameters corresponding to the image size, then tiles the image to be detected based on the tile position parameters to obtain tile data, constructs all the tile data of the image to be detected into one batch, inputs the batch into the target detection model to obtain the target detection result of each tile, and finally fuses the target detection results to obtain the final target detection result. In other words, this application first obtains the corresponding tile position parameters based on the image size of the image to be detected, then tiles the image to be detected based on those parameters and constructs all the tile data into one batch; in this way, the target detection model performs a single inference pass over the batch to obtain the per-tile detection results, which are finally fused into the final detection result. This reduces the number of model inference passes and thereby improves target detection efficiency.
Referring to Figure 3, an embodiment of this application discloses a target detection device, including:
a size determination module 11, used to determine the image size of the image to be detected;
a parameter acquisition module 12, used to obtain the tile position parameters corresponding to the image size;
an image tiling module 13, used to tile the image to be detected based on the tile position parameters to obtain tile data;
a data construction module 14, used to construct all the tile data of the image to be detected into one batch;
a model inference module 15, used to input the batch into the target detection model to obtain the target detection result of each tile;
a result determination module 16, used to fuse the target detection results to obtain the final target detection result.
It can be seen that the embodiment of this application first determines the image size of the image to be detected, then obtains the tile position parameters corresponding to the image size, then tiles the image to be detected based on the tile position parameters to obtain tile data, constructs all the tile data of the image to be detected into one batch, inputs the batch into the target detection model to obtain the target detection result of each tile, and finally fuses the target detection results to obtain the final target detection result. In other words, this application first obtains the corresponding tile position parameters based on the image size of the image to be detected, then tiles the image to be detected based on those parameters and constructs all the tile data into one batch; in this way, the target detection model performs a single inference pass over the batch to obtain the per-tile detection results, which are finally fused into the final detection result. This reduces the number of model inference passes and thereby improves target detection efficiency.
The parameter acquisition module 12 is specifically used to obtain a preset parameter mapping table corresponding to the image size, where the preset parameter mapping table is a mapping table between different scales and different tile position parameters;
correspondingly, the image tiling module 13 is specifically used to:
tile the image to be detected based on the different tile position parameters corresponding to different scales, to obtain the tile data at each scale.
Further, the device also includes a parameter mapping table construction module, used to determine the tile sizes corresponding to different scales based on the numbers of splits at different scales and the image size, and to compute the tile position parameters corresponding to different scales based on the tile sizes, obtaining the preset parameter mapping table.
Further, the result determination module 16 specifically includes:
a first fusion sub-module, used to fuse the target detection results of the tile data at the same scale respectively to obtain fusion results;
a second fusion sub-module, used to fuse the fusion results of different scales to obtain the final target detection result.
Moreover, the data construction module 14 specifically includes:
an identification information adding sub-module, used to add identification information to each tile, where the identification information identifies the scale corresponding to the tile;
a data construction sub-module, used to construct all the tile data of the image to be detected into one batch;
correspondingly, the first fusion sub-module is specifically used to fuse the target detection results of the tile data at the same scale based on the identification information, to obtain the fusion results.
In another implementation, the tile position parameters are tile position parameters calculated based on the image size, a preset sliding-window size and a preset sliding-window step.
Further, the device also includes:
an image size determination module, used to obtain the image sizes of the images captured by each camera device;
a mapping relationship acquisition module, used to compute the tile position parameters corresponding to each image size, respectively, obtaining the mapping relationship between each image size and each set of tile position parameters;
correspondingly, the parameter acquisition module 12 is specifically used to obtain the tile position parameters corresponding to the image size based on the mapping relationship.
Referring to Figure 4, an embodiment of this application discloses an electronic device 20, including a processor 21 and a memory 22, where the memory 22 is used to store a computer program and the processor 21 is used to execute the computer program to implement the target detection method disclosed in the foregoing embodiments.
For the specific process of the above target detection method, reference can be made to the corresponding content disclosed in the foregoing embodiments, which is not repeated here.
Moreover, the memory 22, as a carrier for resource storage, may be a read-only memory, a random access memory, a magnetic disk or an optical disk, and the storage may be transient or permanent.
In addition, the electronic device 20 also includes a power supply 23, a communication interface 24, an input/output interface 25 and a communication bus 26. The power supply 23 provides the operating voltage for each hardware device on the electronic device 20; the communication interface 24 can create a data transmission channel between the electronic device 20 and external devices, following any communication protocol applicable to the technical solution of this application, which is not specifically limited here; the input/output interface 25 is used to obtain external input data or to output data to the outside world, and its specific interface type can be selected according to the specific application needs, which is likewise not specifically limited here.
Further, an embodiment of this application also discloses a computer non-volatile readable storage medium for storing a computer program, where the computer program, when executed by a processor, implements the target detection method disclosed in the foregoing embodiments.
For the specific process of the above target detection method, reference can be made to the corresponding content disclosed in the foregoing embodiments, which is not repeated here.
The computer non-volatile readable storage medium may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as one or more of static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk or optical disk.
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the others, and the same or similar parts of the embodiments can be referred to one another. Since the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for relevant details, see the description of the method.
The steps of the methods or algorithms described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.
The target detection method, apparatus, device and medium provided by this application have been introduced in detail above. Specific examples are used herein to explain the principles and implementations of this application, and the description of the above embodiments is only intended to help understand the method of this application and its core idea. Meanwhile, for a person of ordinary skill in the art, there will be changes in the specific implementation and the application scope according to the idea of this application. In summary, the contents of this specification should not be construed as limiting this application.

Claims (20)

  1. A target detection method, wherein the method comprises:
    determining the image size of an image to be detected;
    obtaining tile position parameters corresponding to the image size;
    tiling the image to be detected based on the tile position parameters to obtain tile data;
    constructing all the tile data of the image to be detected into one batch;
    inputting the batch into a target detection model to obtain the target detection result of each tile;
    fusing the target detection results to obtain a final target detection result.
  2. The target detection method according to claim 1, wherein obtaining the tile position parameters corresponding to the image size comprises:
    obtaining a preset parameter mapping table corresponding to the image size, wherein the preset parameter mapping table is a mapping table between different scales and different tile position parameters;
    correspondingly, tiling the image to be detected based on the tile position parameters to obtain tile data comprises:
    tiling the image to be detected based on the different tile position parameters corresponding to different scales, to obtain the tile data at each scale.
  3. The target detection method according to claim 2, wherein the construction process of the preset parameter mapping table comprises:
    determining the tile sizes corresponding to different scales based on the numbers of splits at different scales and the image size;
    computing the tile position parameters corresponding to different scales based on the tile sizes, to obtain the preset parameter mapping table.
  4. The target detection method according to claim 3, wherein computing the tile position parameters corresponding to different scales based on the tile sizes, to obtain the preset parameter mapping table, comprises:
    obtaining the current image size from all the image sizes;
    computing the tile position parameters at each scale;
    establishing the parameter mapping table between the different scales of the current image size and the tile position parameters;
    traversing all the image sizes and establishing the parameter mapping table between the scales corresponding to each image size and the tile position parameters.
  5. The target detection method according to claim 4, wherein after establishing the parameter mapping table between the different scales of the current image size and the tile position parameters, the construction process of the preset parameter mapping table further comprises:
    storing the parameter mapping table between the different scales of the current image size and the tile position parameters.
  6. The target detection method according to claim 4, wherein tiling the image to be detected based on the tile position parameters to obtain tile data comprises:
    applying vectorized computation to the image to be detected based on the different tile position parameters corresponding to different scales, to obtain the tile data at each scale.
  7. The target detection method according to claim 2, wherein fusing the target detection results to obtain the final target detection result comprises:
    fusing the target detection results of the tile data at the same scale, respectively, to obtain fusion results;
    fusing the fusion results of different scales to obtain the final target detection result.
  8. The target detection method according to claim 7, wherein constructing all the tile data of the image to be detected into one batch comprises:
    adding identification information to each tile, wherein the identification information identifies the scale corresponding to each tile;
    constructing all the tile data of the image to be detected into one batch;
    correspondingly, fusing the target detection results of the tile data at the same scale, respectively, to obtain the fusion results comprises:
    fusing the target detection results of the tile data at the same scale based on the identification information, respectively, to obtain the fusion results.
  9. The target detection method according to claim 8, further comprising:
    building a mapping table between different scales and identification information based on the identification information of each tile.
  10. The target detection method according to claim 9, wherein fusing the target detection results of the tile data at the same scale, respectively, to obtain the fusion results comprises:
    fusing the target detection results of the tile data at the same scale based on the mapping table between the different scales and the identification information, respectively, to obtain the fusion results.
  11. The target detection method according to claim 1, wherein the tile position parameters are tile position parameters calculated based on the image size, a preset sliding-window size, and a preset sliding-window step.
  12. The target detection method according to claim 11, wherein obtaining the tile position parameters corresponding to the image size comprises:
    using a sliding-window tiling algorithm to compute the tile position parameters corresponding to each image size.
  13. The target detection method according to any one of claims 1 to 12, further comprising:
    obtaining the image sizes of the images captured by each camera device;
    computing the tile position parameters corresponding to each image size, respectively, to obtain the mapping relationship between each image size and each set of tile position parameters;
    correspondingly, obtaining the tile position parameters corresponding to the image size comprises:
    obtaining the tile position parameters corresponding to the image size based on the mapping relationship.
  14. The target detection method according to claim 13, wherein computing the tile position parameters corresponding to each image size, respectively, to obtain the mapping relationship between each image size and each set of tile position parameters comprises:
    collecting statistics on each image size, computing the tile position parameters corresponding to each image size, and building the mapping relationship between each image size and each set of tile position parameters.
  15. The target detection method according to any one of claims 1 to 12, wherein the image to be detected is a super-resolution image.
  16. The target detection method according to claim 15, wherein determining the image size of the image to be detected comprises:
    reading the super-resolution image and obtaining the width and height corresponding to the super-resolution image;
    combining the width and the height to generate the image size of the super-resolution image.
  17. The target detection method according to claim 1, wherein the tile position parameters comprise a row start position and a row end position, and a column start position and a column end position.
  18. A target detection device, wherein the device comprises:
    a size determination module, used to determine the image size of an image to be detected;
    a parameter acquisition module, used to obtain tile position parameters corresponding to the image size;
    an image tiling module, used to tile the image to be detected based on the tile position parameters to obtain tile data;
    a data construction module, used to construct all the tile data of the image to be detected into one batch;
    a model inference module, used to input the batch into a target detection model to obtain the target detection result of each tile;
    a result determination module, used to fuse the target detection results to obtain a final target detection result.
  19. An electronic device, comprising a processor and a memory, wherein
    the memory is used to store a computer program;
    the processor is used to execute the computer program to implement the target detection method according to any one of claims 1 to 16.
  20. A computer non-volatile readable storage medium for storing a computer program, wherein the computer program, when executed by a processor, implements the target detection method according to any one of claims 1 to 16.
PCT/CN2022/135154 2022-04-29 2022-11-29 Target detection method, apparatus, device and medium WO2023207073A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210466978.1A CN114842203A (zh) 2022-04-29 2022-04-29 Target detection method, apparatus, device and medium
CN202210466978.1 2022-04-29

Publications (1)

Publication Number Publication Date
WO2023207073A1

Family

ID=82567049

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/135154 WO2023207073A1 (zh) 2022-04-29 2022-11-29 Target detection method, apparatus, device and medium

Country Status (2)

Country Link
CN (1) CN114842203A (zh)
WO (1) WO2023207073A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114842203A (zh) * 2022-04-29 2022-08-02 浪潮电子信息产业股份有限公司 Target detection method, apparatus, device and medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110738208A (zh) * 2019-10-08 2020-01-31 Efficient scale-normalized target detection training method
CN113221895A (zh) * 2021-05-31 2021-08-06 Small target detection method, apparatus, device and medium
CN113989744A (zh) * 2021-10-29 2022-01-28 Pedestrian target detection method and system based on ultra-large high-resolution images
WO2022037087A1 (zh) * 2020-08-18 2022-02-24 Method and apparatus for improving video target detection performance in surveillance edge computing
CN114842203A (zh) * 2022-04-29 2022-08-02 Target detection method, apparatus, device and medium

Also Published As

Publication number Publication date
CN114842203A (zh) 2022-08-02


Legal Events

Date Code Title Description
121    Ep: the epo has been informed by wipo that ep was designated in this application
       Ref document number: 22939887; Country of ref document: EP; Kind code of ref document: A1