WO2024031736A1 - Image classification method and system, storage medium and terminal - Google Patents

Image classification method and system, storage medium and terminal

Info

Publication number
WO2024031736A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
feature vector
vector
classified
target detection
Prior art date
Application number
PCT/CN2022/112630
Other languages
English (en)
French (fr)
Inventor
孔欧
刘益东
王君
Original Assignee
上海蜜度科技股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海蜜度科技股份有限公司
Publication of WO2024031736A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/51 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/55 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • The present invention relates to the technical field of image classification, and in particular to an image classification method and system, a storage medium and a terminal.
  • Image classification is an image processing method that distinguishes targets of different categories according to the different features they exhibit in image information. It uses a computer to analyze an image quantitatively and assigns the image, or each pixel or region in it, to one of several categories, replacing human visual interpretation.
  • In the prior art, image classification tasks usually require training an image classification model on a classification dataset and then classifying images with the trained model.
  • For a dataset containing 1,000 categories, the trained model can only classify those 1,000 categories. To classify images outside these 1,000 categories, the image classification model must be retrained. Existing image classification models therefore have the following shortcomings:
  • When a new image classification category is added, the image classification model must be retrained; if the number of categories is too large, for example 100 million, the image classification model cannot be trained;
  • The purpose of the present invention is to provide an image classification method and system, a storage medium and a terminal that achieve accurate retrieval of images through target detection, image recognition and feature vector retrieval, and that make it easy to extend the classification categories.
  • The present invention provides an image classification method including the following steps: constructing an object vector retrieval library, which is used to store object feature vectors and object names of objects; performing target detection on an image to be classified to obtain the object image of each object contained in the image; performing image recognition on the object image to obtain the object feature vector of the object image; and querying the object vector retrieval library for the object name corresponding to the object feature vector that matches the object feature vector of the object image, and using the object name as a category of the image to be classified.
  • constructing the object vector retrieval library includes the following steps:
  • the object name and the object feature vector are stored in a one-to-one correspondence manner to complete the construction of the object vector retrieval library.
  • it also includes updating the object vector retrieval library when a new object image appears;
  • Updating the object vector retrieval library includes the following steps:
  • target detection is performed on the image to be classified, and obtaining the object image of each object included in the image to be classified includes the following steps:
  • An object image of the object is cropped from the image to be classified based on the object position.
  • image recognition is performed on the object image, and obtaining the object feature vector of the object image includes the following steps:
  • querying the object vector retrieval database for the object name corresponding to the object feature vector that matches the object feature vector of the object image includes the following steps:
  • the similarity is cosine similarity.
  • the invention provides an image classification system, including a building module, a target detection module, an image recognition module and a classification module;
  • the building module is used to build an object vector retrieval library, and the object vector retrieval library is used to store the object feature vector and object name of the object;
  • the target detection module is used to perform target detection on the image to be classified and obtain the object image of each object contained in the image to be classified;
  • the image recognition module is used to perform image recognition on the object image and obtain the object feature vector of the object image
  • the classification module is configured to query the object vector retrieval library for the object name corresponding to the object feature vector that matches the object feature vector of the object image, and use the object name as a category of the image to be classified.
  • the present invention provides a storage medium on which a computer program is stored.
  • When the program is executed by a processor, the above image classification method is implemented.
  • the invention provides an image classification terminal, including: a processor and a memory;
  • the memory is used to store computer programs
  • the processor is configured to execute the computer program stored in the memory, so that the image classification terminal executes the above image classification method.
  • the image classification method and system, storage medium and terminal according to the present invention have the following beneficial effects:
  • When the classification categories are extended, the target detection model only needs to be trained to recognize the one new object.
  • The image recognition model does not need to be retrained. Only the object feature vector retrieval library needs to be updated, which simplifies the process and reduces the system load.
  • Figure 1 shows a flow chart of the image classification method in one embodiment of the present invention
  • Figure 2 shows a schematic structural diagram of the image classification system in one embodiment of the present invention
  • FIG. 3 is a schematic structural diagram of an image classification terminal according to an embodiment of the present invention.
  • Through target detection, image recognition and feature vector retrieval, the image classification method and system, storage medium and terminal of the present invention not only achieve accurate retrieval of images but also make classification categories easy to extend, without being limited by the number or size of objects. This meets the needs of practical application scenarios and is highly practical.
  • the image classification method of the present invention includes the following steps:
  • Step S1 Construct an object vector retrieval library, which is used to store object feature vectors and object names of objects.
  • building the object vector retrieval library includes the following steps:
  • the object image contains a single object.
  • image recognition is performed on the object image based on an image recognition model, thereby obtaining an object feature vector of the object.
  • For example, the image recognition model can produce a feature vector of 512 values for an object image.
  • Assuming there are 5,000 object images, passing every one of them through the image recognition model yields a 5000*512 feature matrix.
  • Preferably, NumPy and PP-LCNet are used to save the object feature vectors.
  • the image recognition model adopts PaddlePaddle's open source PP-LCNet image recognition model.
  • the corresponding object name needs to be obtained, such as vehicle, person, computer, etc.
  • the object vector retrieval library is constructed based on the object name and the object feature vector, and can achieve one-to-one corresponding storage of the object name and the object feature vector.
  • Step S2 Perform target detection on the image to be classified, and obtain the object image of each object contained in the image to be classified.
  • performing target detection on the image to be classified and obtaining the object image of each object contained in the image to be classified includes the following steps:
  • The image to be classified is input into the target detection model to obtain the object positions, such as object coordinates, of the objects it contains. The target detection model may detect one or more objects.
  • Preferably, the target detection model is PaddlePaddle's open-source picodet target detection network.
  • Using the object position, an object image of the object is cropped from the image to be classified.
  • When there is one object, the object image of that object is cropped; when there are multiple objects, the object image corresponding to each object is cropped.
  • Step S3: Perform image recognition on the object image, and obtain the object feature vector of the object image.
  • When the object image is input into the PP-LCNet image recognition model, the object feature vector of the object image is output.
  • Step S4 Query the object name corresponding to the object feature vector matching the object feature vector of the object image in the object vector retrieval database, and use the object name as a category of the image to be classified.
  • querying the object vector retrieval database for the object name corresponding to the object feature vector matching the object feature vector of the object image includes the following steps:
  • the similarity is calculated one by one with each object feature vector in the object vector retrieval library.
  • the similarity adopts cosine similarity.
  • the object feature vector in the object vector retrieval library corresponding to the calculated maximum similarity value is determined to be a matching object feature vector.
  • the object name corresponding to the matching object feature vector is searched in the object vector retrieval library, and the object name is used as a category of the image to be classified.
  • When the image to be classified contains multiple object images, multiple corresponding categories can be obtained.
  • When a new object image appears, the number of image classification categories increases, and the object vector retrieval library needs to be updated. Specifically, updating the object vector retrieval library includes the following steps:
  • Image recognition is performed on the new object image based on the image recognition model, and the corresponding object feature vector and object name are obtained.
  • For image classification, the target detection model only needs to be trained to detect the new object, and the image recognition model does not need to be retrained, so image classification can be extended quickly without a fundamental change to the algorithm.
  • the image classification system of the present invention includes a building module 21 , a target detection module 22 , an image recognition module 23 and a classification module 24 .
  • the building module 21 is used to build an object vector retrieval library, and the object vector retrieval library is used to store object feature vectors and object names of objects.
  • the target detection module 22 is used to perform target detection on the image to be classified, and obtain the object image of each object included in the image to be classified.
  • the image recognition module 23 is connected to the target detection module 22 and is used to perform image recognition on the object image and obtain the object feature vector of the object image.
  • The classification module 24 is connected to the building module 21 and the image recognition module 23. It is used to query the object vector retrieval library for the object name corresponding to the object feature vector that matches the object feature vector of the object image, and to use the object name as a category of the image to be classified.
  • the structures and principles of the building module 21, the target detection module 22, the image recognition module 23 and the classification module 24 correspond to the steps in the above image classification method, so they will not be described again here.
  • the present invention provides a storage medium on which a computer program is stored.
  • When the program is executed by a processor, the above image classification method is implemented.
  • each module of the above device is only a division of logical functions. In actual implementation, they can be fully or partially integrated into a physical entity, or they can also be physically separated. And these modules can all be implemented in the form of software calling through processing components, or they can all be implemented in the form of hardware. Some modules can also be implemented in the form of software calling through processing components, and some modules can be implemented in the form of hardware.
  • For example, an x module can be a separately established processing element, or it can be integrated into a chip of the above device.
  • In addition, the x module can be stored in the memory of the above device in the form of program code, and a processing element of the device can call and execute the function of the x module.
  • the implementation of other modules is similar. All or part of these modules can be integrated together or implemented independently.
  • the processing element described here may be an integrated circuit with signal processing capabilities. During the implementation process, each step of the above method or each of the above modules can be completed by instructions in the form of hardware integrated logic circuits or software in the processor element.
  • The above modules may be one or more integrated circuits configured to implement the above method, for example one or more application-specific integrated circuits (Application Specific Integrated Circuit, ASIC for short), one or more digital signal processors (Digital Signal Processor, DSP for short), or one or more field-programmable gate arrays (Field Programmable Gate Array, FPGA for short).
  • ASIC application specific integrated circuit
  • DSP Digital Signal Processor
  • FPGA Field Programmable Gate Array
  • When one of the above modules is implemented as program code scheduled by a processing element, the processing element can be a general-purpose processor, such as a central processing unit (Central Processing Unit, CPU for short) or another processor that can call program code.
  • CPU Central Processing Unit
  • SOC system-on-a-chip
  • a computer program is stored on the storage medium of the present invention, and when the program is executed by a processor, the above image classification method is implemented.
  • the storage medium includes: ROM, RAM, magnetic disk, USB disk, memory card or optical disk and other various media that can store program codes.
  • the image classification terminal of the present invention includes: a processor 31 and a memory 32 .
  • the memory 32 is used to store computer programs.
  • the memory 32 includes various media that can store program codes, such as ROM, RAM, magnetic disk, USB disk, memory card or optical disk.
  • the processor 31 is connected to the memory 32 and is used to execute the computer program stored in the memory, so that the image classification terminal executes the above image classification method.
  • Preferably, the processor 31 can be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU for short), a network processor (Network Processor, NP for short), etc.; it can also be a digital signal processor (Digital Signal Processor, DSP for short), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC for short), a field-programmable gate array (Field Programmable Gate Array, FPGA for short) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • In summary, the image classification method and system, storage medium and terminal of the present invention achieve accurate retrieval of images through target detection, image recognition and feature vector retrieval; classification categories are easy to extend, and the number of image classification categories is not limited; when the classification categories are extended, the target detection model only needs to be trained to recognize the one new object and the image recognition model does not need to be retrained. Only the object feature vector retrieval library needs to be updated, which simplifies the process and reduces the system load. The present invention therefore effectively overcomes various shortcomings of the prior art and has high industrial utilization value.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Library & Information Science (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

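The present invention provides an image classification method and system, a storage medium and a terminal. The method includes the following steps: constructing an object vector retrieval library, which is used to store object feature vectors and object names of objects; performing target detection on an image to be classified to obtain the object image of each object contained in the image; performing image recognition on the object image to obtain the object feature vector of the object image; and querying the object vector retrieval library for the object name corresponding to the object feature vector that matches the object feature vector of the object image, and using the object name as a category of the image to be classified. Through target detection, image recognition and feature vector retrieval, the image classification method and system, storage medium and terminal of the present invention achieve accurate retrieval of images and make it easy to extend the classification categories.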

Description

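Image classification method and system, storage medium and terminal
Technical Field
The present invention relates to the technical field of image classification, and in particular to an image classification method and system, a storage medium and a terminal.
Background Art
Image classification is an image processing method that distinguishes targets of different categories according to the different features they exhibit in image information. It uses a computer to analyze an image quantitatively and assigns the image, or each pixel or region in it, to one of several categories, replacing human visual interpretation.
In the prior art, an image classification task usually requires training an image classification model on a classification dataset and then classifying images with the trained model. For a dataset containing 1,000 categories, the trained model can only classify those 1,000 categories. To classify images outside these 1,000 categories, the image classification model must be retrained. Existing image classification models therefore have the following shortcomings:
(1) When a new image classification category is added, the image classification model must be retrained; if the number of categories is too large, for example 100 million, the image classification model cannot be trained;
(2) Multiple objects in an image cannot be recognized, and only one category is output;
(3) Recognition performance is poor for small objects that appear in an image.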
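Summary of the Invention
In view of the above shortcomings of the prior art, the purpose of the present invention is to provide an image classification method and system, a storage medium and a terminal that achieve accurate retrieval of images through target detection, image recognition and feature vector retrieval, and that make it easy to extend the classification categories.
To achieve the above purpose and other related purposes, the present invention provides an image classification method including the following steps: constructing an object vector retrieval library, the object vector retrieval library being used to store object feature vectors and object names of objects; performing target detection on an image to be classified to obtain the object image of each object contained in the image to be classified; performing image recognition on the object image to obtain the object feature vector of the object image; and querying the object vector retrieval library for the object name corresponding to the object feature vector that matches the object feature vector of the object image, and using the object name as a category of the image to be classified.
In an embodiment of the present invention, constructing the object vector retrieval library includes the following steps:
obtaining an object image;
performing image recognition on the object image to obtain the object feature vector of the object image;
obtaining the object name corresponding to the object image;
storing the object name and the object feature vector in one-to-one correspondence to complete the construction of the object vector retrieval library.
In an embodiment of the present invention, the method further includes updating the object vector retrieval library when a new object image appears;
updating the object vector retrieval library includes the following steps:
performing image recognition on the new object image to obtain the object feature vector and object name of the new object image;
adding the object name and the object feature vector to the object vector retrieval library.
In an embodiment of the present invention, performing target detection on the image to be classified and obtaining the object image of each object contained in the image to be classified includes the following steps:
performing target detection on the image to be classified based on a target detection model to obtain the object position of each object contained in the image to be classified;
cropping the object image of the object from the image to be classified based on the object position.
In an embodiment of the present invention, performing image recognition on the object image and obtaining the object feature vector of the object image includes the following steps:
performing image recognition on the object image based on the PP-LCNet image recognition model;
outputting the object feature vector of the object image.
In an embodiment of the present invention, querying the object vector retrieval library for the object name corresponding to the object feature vector that matches the object feature vector of the object image includes the following steps:
calculating the similarity between the object feature vector of the object image and each object feature vector in the object vector retrieval library;
determining that the object feature vector with the greatest similarity in the object vector retrieval library matches the object feature vector of the object image;
obtaining from the object vector retrieval library the object name corresponding to the matching object feature vector.
In an embodiment of the present invention, the similarity is cosine similarity.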
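For reference, cosine similarity and the maximum-similarity match are conventionally written as below. The symbols q (the query object feature vector), v_i (the i-th feature vector in the library), d (the vector length, 512 in the description's example) and i* (the index of the match) are notation introduced here for illustration; they do not appear in the original text.

```latex
\[
\operatorname{sim}(\mathbf{q}, \mathbf{v}_i)
  = \frac{\mathbf{q} \cdot \mathbf{v}_i}{\lVert \mathbf{q} \rVert \, \lVert \mathbf{v}_i \rVert}
  = \frac{\sum_{k=1}^{d} q_k \, v_{i,k}}
         {\sqrt{\sum_{k=1}^{d} q_k^{2}} \; \sqrt{\sum_{k=1}^{d} v_{i,k}^{2}}},
\qquad
i^{*} = \arg\max_{i} \, \operatorname{sim}(\mathbf{q}, \mathbf{v}_i)
\]
```

The value lies in [-1, 1] and depends only on the angle between the two vectors, not on their magnitudes, so it is undefined for an all-zero vector.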
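The present invention provides an image classification system including a building module, a target detection module, an image recognition module and a classification module;
the building module is used to construct an object vector retrieval library, and the object vector retrieval library is used to store object feature vectors and object names of objects;
the target detection module is used to perform target detection on an image to be classified and obtain the object image of each object contained in the image to be classified;
the image recognition module is used to perform image recognition on the object image and obtain the object feature vector of the object image;
the classification module is used to query the object vector retrieval library for the object name corresponding to the object feature vector that matches the object feature vector of the object image, and to use the object name as a category of the image to be classified.
The present invention provides a storage medium on which a computer program is stored; when the program is executed by a processor, the above image classification method is implemented.
The present invention provides an image classification terminal including a processor and a memory;
the memory is used to store a computer program;
the processor is used to execute the computer program stored in the memory, so that the image classification terminal performs the above image classification method.
As described above, the image classification method and system, storage medium and terminal of the present invention have the following beneficial effects:
(1) Accurate retrieval of images is achieved through target detection, image recognition and feature vector retrieval;
(2) Classification categories are easy to extend, and the number of image classification categories is not limited;
(3) When the classification categories are extended, the target detection model only needs to be trained to recognize the one new object and the image recognition model does not need to be retrained; only the object feature vector retrieval library needs to be updated, which simplifies the process and reduces the system load.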
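Brief Description of the Drawings
Figure 1 is a flow chart of the image classification method of the present invention in one embodiment;
Figure 2 is a schematic structural diagram of the image classification system of the present invention in one embodiment;
Figure 3 is a schematic structural diagram of the image classification terminal of the present invention in one embodiment.
Description of Reference Numerals
21       Building module
22       Target detection module
23       Image recognition module
24       Classification module
31       Processor
32       Memory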
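Detailed Description
The embodiments of the present invention are described below through specific examples, and those skilled in the art can readily understand other advantages and effects of the present invention from the content disclosed in this specification. The present invention can also be implemented or applied through other, different specific embodiments, and the details in this specification can be modified or changed in various ways based on different viewpoints and applications without departing from the spirit of the present invention. It should be noted that, where no conflict arises, the following embodiments and the features in them can be combined with one another.
It should be noted that the drawings provided with the following embodiments illustrate the basic concept of the present invention only schematically. The drawings therefore show only the components related to the present invention rather than the number, shape and size of the components in an actual implementation; in an actual implementation the form, number and proportion of each component may be changed freely, and the component layout may be more complex.
Through target detection, image recognition and feature vector retrieval, the image classification method and system, storage medium and terminal of the present invention not only achieve accurate retrieval of images but also make classification categories easy to extend, without being limited by the number or size of objects. This meets the needs of practical application scenarios and is highly practical.
As shown in Figure 1, in one embodiment, the image classification method of the present invention includes the following steps:
Step S1: Construct an object vector retrieval library, which is used to store object feature vectors and object names of objects.
Specifically, constructing the object vector retrieval library includes the following steps:
11) Obtain object images.
Specifically, a certain number of object images are collected. Each object image contains a single object.
12) Perform image recognition on the object images to obtain their object feature vectors.
Specifically, image recognition is performed on each object image based on an image recognition model to obtain the object feature vector of the object. For example, the image recognition model can produce a feature vector of 512 values for an object image. Assuming there are 5,000 object images, passing every one of them through the image recognition model yields a 5000*512 feature matrix. Preferably, NumPy and PP-LCNet are used to save the object feature vectors.
Preferably, the image recognition model is PaddlePaddle's open-source PP-LCNet image recognition model.
13) Obtain the object name corresponding to each object image.
Specifically, for each object image, the corresponding object name must also be obtained, such as vehicle, person or computer.
14) Store the object names and the object feature vectors in one-to-one correspondence to complete the construction of the object vector retrieval library.
Specifically, the object vector retrieval library is constructed from the object names and the object feature vectors, so that each object name is stored in one-to-one correspondence with its object feature vector.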
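Step S1 can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function name `build_library`, the dictionary layout and the random stand-in features are hypothetical, and in practice the rows of the matrix would be the outputs of the image recognition model.

```python
import numpy as np

FEATURE_DIM = 512  # vector length from the description's example


def build_library(features, names):
    """Pair each feature vector (one row) with exactly one object name."""
    vectors = np.asarray(features, dtype=np.float32)
    if vectors.ndim != 2 or vectors.shape[0] != len(names):
        raise ValueError("need exactly one name per feature vector")
    return {"vectors": vectors, "names": list(names)}


# Stand-in for 5,000 recognition-model outputs (hypothetical random data).
rng = np.random.default_rng(0)
features = rng.standard_normal((5000, FEATURE_DIM)).astype(np.float32)
names = [f"object_{i}" for i in range(5000)]

library = build_library(features, names)
print(library["vectors"].shape)  # (5000, 512)
```

Row i of the matrix and entry i of the name list belong together, which is one simple way to keep the one-to-one correspondence; the matrix could be written to disk with `np.save` if persistence is needed.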
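Step S2: Perform target detection on the image to be classified, and obtain the object image of each object contained in the image to be classified.
Specifically, performing target detection on the image to be classified and obtaining the object image of each object contained in the image to be classified includes the following steps:
21) Perform target detection on the image to be classified based on a target detection model to obtain the object positions of the objects contained in the image to be classified.
Specifically, the image to be classified is input into the target detection model to obtain the object positions, such as object coordinates, of the objects it contains. The target detection model may detect one or more objects. Preferably, the target detection model is PaddlePaddle's open-source picodet target detection network.
22) Crop the object image of each object from the image to be classified based on its object position.
Specifically, the object image of each object is cropped from the image to be classified using its object position. When there is one object, the object image of that object is cropped; when there are multiple objects, the object image corresponding to each object is cropped.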
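The cropping in step 22) can be sketched as below, assuming the detector reports each object position as a pixel box `(x1, y1, x2, y2)` and the image is a NumPy array of shape (height, width, channels). The box format and the name `crop_objects` are assumptions for illustration; the source only says the position is given as "object coordinates".

```python
import numpy as np


def crop_objects(image, boxes):
    """Cut one sub-image per detected box (x1, y1, x2, y2) in pixels."""
    height, width = image.shape[:2]
    crops = []
    for x1, y1, x2, y2 in boxes:
        # Clamp to the image bounds so a slightly oversized box still works.
        x1, y1 = max(0, int(x1)), max(0, int(y1))
        x2, y2 = min(width, int(x2)), min(height, int(y2))
        if x2 > x1 and y2 > y1:  # skip boxes that are empty after clamping
            crops.append(image[y1:y2, x1:x2].copy())
    return crops


image = np.zeros((100, 200, 3), dtype=np.uint8)    # 100 rows, 200 columns
boxes = [(10, 20, 60, 80), (150, 50, 260, 90)]     # second box overruns the edge
crops = crop_objects(image, boxes)
print([c.shape for c in crops])  # [(60, 50, 3), (40, 50, 3)]
```

One crop is produced per detected object, so an image with several objects yields several object images for the next step.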
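Step S3: Perform image recognition on the object image, and obtain the object feature vector of the object image.
Specifically, the object image is input into the PP-LCNet image recognition model, which outputs the object feature vector of the object image.
Step S4: Query the object vector retrieval library for the object name corresponding to the object feature vector that matches the object feature vector of the object image, and use the object name as a category of the image to be classified.
Specifically, querying the object vector retrieval library for the object name corresponding to the object feature vector that matches the object feature vector of the object image includes the following steps:
41) Calculate the similarity between the object feature vector of the object image and each object feature vector in the object vector retrieval library.
Specifically, the object feature vector obtained for each object image is compared with each object feature vector in the object vector retrieval library one by one to calculate their similarity. Preferably, the similarity is cosine similarity.
42) Determine that the object feature vector with the greatest similarity in the object vector retrieval library matches the object feature vector of the object image.
Specifically, the object feature vector in the object vector retrieval library that corresponds to the maximum calculated similarity is determined to be the matching object feature vector.
43) Obtain from the object vector retrieval library the object name corresponding to the matching object feature vector.
Specifically, the object name corresponding to the matching object feature vector is looked up in the object vector retrieval library, and this object name is used as a category of the image to be classified. When the image to be classified contains multiple object images, multiple corresponding categories can be obtained.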
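Steps 41) to 43) amount to a nearest-neighbour lookup under cosine similarity. The sketch below computes all similarities in one matrix product instead of a literal one-by-one loop, which gives the same result; the function name `match_name` and the tiny two-dimensional vectors are hypothetical, and all vectors are assumed to be non-zero.

```python
import numpy as np


def match_name(query, vectors, names):
    """Return the name whose library vector has the highest cosine similarity."""
    q = query / np.linalg.norm(query)
    lib = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    sims = lib @ q                 # cosine similarity against every library row
    best = int(np.argmax(sims))    # index of the maximum similarity
    return names[best], float(sims[best])


# Toy library; real vectors would have 512 values each.
vectors = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
names = ["vehicle", "person", "computer"]

name, score = match_name(np.array([0.9, 0.1]), vectors, names)
print(name)  # vehicle
```

Note that this always returns the closest entry, however dissimilar; the source does not describe a minimum-similarity threshold, so none is applied here.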
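In an embodiment of the present invention, when a new object image appears, the number of image classification categories increases, and the object vector retrieval library needs to be updated. Specifically, updating the object vector retrieval library includes the following steps:
a) Perform image recognition on the new object image to obtain the object feature vector and object name of the new object image.
Specifically, image recognition is performed on the new object image based on the image recognition model, and the corresponding object feature vector and object name are obtained.
b) Add the object name and the object feature vector to the object vector retrieval library.
For image classification, the target detection model only needs to be trained to detect the new object, and the image recognition model does not need to be retrained, so image classification can be extended quickly without a fundamental change to the algorithm.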
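The update in steps a) and b) can be sketched as appending one row and one name to the library, with no model retraining involved in this step. The function name `add_object` and the four-value vectors are hypothetical and chosen only to keep the example small.

```python
import numpy as np


def add_object(vectors, names, new_vector, new_name):
    """Append one feature vector and its name, keeping rows and names aligned."""
    row = np.asarray(new_vector, dtype=vectors.dtype).reshape(1, -1)
    if row.shape[1] != vectors.shape[1]:
        raise ValueError("feature dimension mismatch")
    return np.vstack([vectors, row]), names + [new_name]


vectors = np.zeros((2, 4), dtype=np.float32)
names = ["vehicle", "person"]

vectors, names = add_object(vectors, names, [1, 0, 0, 0], "computer")
print(vectors.shape, names[-1])  # (3, 4) computer
```

Because classification is a lookup against this library, the new category becomes available to the matching step as soon as its row is added.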
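As shown in Figure 2, in one embodiment, the image classification system of the present invention includes a building module 21, a target detection module 22, an image recognition module 23 and a classification module 24.
The building module 21 is used to construct an object vector retrieval library, and the object vector retrieval library is used to store object feature vectors and object names of objects.
The target detection module 22 is used to perform target detection on the image to be classified and obtain the object image of each object contained in the image to be classified.
The image recognition module 23 is connected to the target detection module 22 and is used to perform image recognition on the object image and obtain the object feature vector of the object image.
The classification module 24 is connected to the building module 21 and the image recognition module 23. It is used to query the object vector retrieval library for the object name corresponding to the object feature vector that matches the object feature vector of the object image, and to use the object name as a category of the image to be classified.
The structures and principles of the building module 21, the target detection module 22, the image recognition module 23 and the classification module 24 correspond one-to-one to the steps of the above image classification method, so they are not described again here.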
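One way to picture how the four modules connect is the sketch below. It is an illustration only: the class `ImageClassifier`, its field names, and the toy detector and recognizer (fixed boxes and mean colour) are hypothetical stand-ins for the target detection and image recognition models, which the source names as picodet and PP-LCNet but does not show in code.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

import numpy as np

Box = Tuple[int, int, int, int]  # assumed (x1, y1, x2, y2) pixel box


@dataclass
class ImageClassifier:
    detect: Callable[[np.ndarray], List[Box]]   # target detection module
    embed: Callable[[np.ndarray], np.ndarray]   # image recognition module
    vectors: np.ndarray                         # library built by the building module
    names: List[str]

    def classify(self, image: np.ndarray) -> List[str]:
        """Classification module: one category per detected object."""
        categories = []
        for x1, y1, x2, y2 in self.detect(image):
            query = self.embed(image[y1:y2, x1:x2])
            sims = (self.vectors @ query) / (
                np.linalg.norm(self.vectors, axis=1) * np.linalg.norm(query)
            )
            categories.append(self.names[int(np.argmax(sims))])
        return categories


# Toy image: left half pure red, right half pure blue.
image = np.zeros((10, 20, 3))
image[:, :10, 0] = 1.0
image[:, 10:, 2] = 1.0

classifier = ImageClassifier(
    detect=lambda img: [(0, 0, 10, 10), (10, 0, 20, 10)],  # stand-in detector
    embed=lambda crop: crop.reshape(-1, 3).mean(axis=0),   # stand-in recognizer
    vectors=np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]),
    names=["red object", "blue object"],
)
print(classifier.classify(image))  # ['red object', 'blue object']
```

Passing the detector and recognizer in as callables mirrors the module split: either can be swapped without touching the lookup, and the image receives one category per detected object.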
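The present invention provides a storage medium on which a computer program is stored; when the program is executed by a processor, the above image classification method is implemented.
It should be noted that the division of the above device into modules is only a division of logical functions. In an actual implementation, the modules can be fully or partially integrated into one physical entity, or they can be physically separate. The modules can all be implemented as software invoked by a processing element, or all as hardware; alternatively, some modules can be implemented as software invoked by a processing element and others as hardware. For example, an x module can be a separately established processing element, or it can be integrated into a chip of the above device. In addition, the x module can be stored in the memory of the above device in the form of program code, and a processing element of the device can call and execute the function of the x module. The other modules are implemented similarly. All or some of these modules can be integrated together or implemented independently. The processing element described here may be an integrated circuit with signal processing capability. During implementation, each step of the above method or each of the above modules can be completed by an integrated logic circuit of hardware in the processor element or by instructions in the form of software. The above modules may be one or more integrated circuits configured to implement the above method, for example one or more application-specific integrated circuits (Application Specific Integrated Circuit, ASIC for short), one or more digital signal processors (Digital Signal Processor, DSP for short), or one or more field-programmable gate arrays (Field Programmable Gate Array, FPGA for short). When one of the above modules is implemented as program code scheduled by a processing element, the processing element can be a general-purpose processor, such as a central processing unit (Central Processing Unit, CPU for short) or another processor that can call program code. These modules can be integrated together and implemented in the form of a system-on-a-chip (SOC for short).
A computer program is stored on the storage medium of the present invention, and when the program is executed by a processor, the above image classification method is implemented. Preferably, the storage medium includes various media that can store program code, such as ROM, RAM, magnetic disk, USB disk, memory card or optical disk.
As shown in Figure 3, in one embodiment, the image classification terminal of the present invention includes a processor 31 and a memory 32.
The memory 32 is used to store a computer program.
The memory 32 includes various media that can store program code, such as ROM, RAM, magnetic disk, USB disk, memory card or optical disk.
The processor 31 is connected to the memory 32 and is used to execute the computer program stored in the memory, so that the image classification terminal performs the above image classification method.
Preferably, the processor 31 can be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU for short), a network processor (Network Processor, NP for short), etc.; it can also be a digital signal processor (Digital Signal Processor, DSP for short), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC for short), a field-programmable gate array (Field Programmable Gate Array, FPGA for short) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In summary, the image classification method and system, storage medium and terminal of the present invention achieve accurate retrieval of images through target detection, image recognition and feature vector retrieval; classification categories are easy to extend, and the number of image classification categories is not limited; when the classification categories are extended, the target detection model only needs to be trained to recognize the one new object and the image recognition model does not need to be retrained; only the object feature vector retrieval library needs to be updated, which simplifies the process and reduces the system load. The present invention therefore effectively overcomes various shortcomings of the prior art and has high industrial utilization value.
The above embodiments merely illustrate the principles and effects of the present invention and are not intended to limit it. Anyone familiar with this technology can modify or change the above embodiments without departing from the spirit and scope of the present invention. Accordingly, all equivalent modifications or changes made by those of ordinary skill in the art without departing from the spirit and technical ideas disclosed by the present invention shall still be covered by the claims of the present invention.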

Claims (10)

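  1. An image classification method, characterized by including the following steps:
    constructing an object vector retrieval library, the object vector retrieval library being used to store object feature vectors and object names of objects;
    performing target detection on an image to be classified to obtain the object image of each object contained in the image to be classified;
    performing image recognition on the object image to obtain the object feature vector of the object image;
    querying the object vector retrieval library for the object name corresponding to the object feature vector that matches the object feature vector of the object image, and using the object name as a category of the image to be classified.
  2. The image classification method according to claim 1, characterized in that constructing the object vector retrieval library includes the following steps:
    obtaining an object image;
    performing image recognition on the object image to obtain the object feature vector of the object image;
    obtaining the object name corresponding to the object image;
    storing the object name and the object feature vector in one-to-one correspondence to complete the construction of the object vector retrieval library.
  3. The image classification method according to claim 1, characterized by further including updating the object vector retrieval library when a new object image appears;
    updating the object vector retrieval library includes the following steps:
    performing image recognition on the new object image to obtain the object feature vector and object name of the new object image;
    adding the object name and the object feature vector to the object vector retrieval library.
  4. The image classification method according to claim 1, characterized in that performing target detection on the image to be classified and obtaining the object image of each object contained in the image to be classified includes the following steps:
    performing target detection on the image to be classified based on a target detection model to obtain the object position of each object contained in the image to be classified;
    cropping the object image of the object from the image to be classified based on the object position.
  5. The image classification method according to claim 1, characterized in that performing image recognition on the object image and obtaining the object feature vector of the object image includes the following steps:
    performing image recognition on the object image based on the PP-LCNet image recognition model;
    outputting the object feature vector of the object image.
  6. The image classification method according to claim 1, characterized in that querying the object vector retrieval library for the object name corresponding to the object feature vector that matches the object feature vector of the object image includes the following steps:
    calculating the similarity between the object feature vector of the object image and each object feature vector in the object vector retrieval library;
    determining that the object feature vector with the greatest similarity in the object vector retrieval library matches the object feature vector of the object image;
    obtaining from the object vector retrieval library the object name corresponding to the matching object feature vector.
  7. The image classification method according to claim 6, characterized in that the similarity is cosine similarity.
  8. An image classification system, characterized by including a building module, a target detection module, an image recognition module and a classification module;
    the building module is used to construct an object vector retrieval library, and the object vector retrieval library is used to store object feature vectors and object names of objects;
    the target detection module is used to perform target detection on an image to be classified and obtain the object image of each object contained in the image to be classified;
    the image recognition module is used to perform image recognition on the object image and obtain the object feature vector of the object image;
    the classification module is used to query the object vector retrieval library for the object name corresponding to the object feature vector that matches the object feature vector of the object image, and to use the object name as a category of the image to be classified.
  9. A storage medium on which a computer program is stored, characterized in that when the program is executed by a processor, the image classification method according to any one of claims 1 to 7 is implemented.
  10. An image classification terminal, characterized by including a processor and a memory;
    the memory is used to store a computer program;
    the processor is used to execute the computer program stored in the memory, so that the image classification terminal performs the image classification method according to any one of claims 1 to 7.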
PCT/CN2022/112630 2022-08-10 2022-08-16 Image classification method and system, storage medium and terminal WO2024031736A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210955787.1 2022-08-10
CN202210955787.1A CN117633264A (zh) 2022-08-10 Image classification method and system, storage medium and terminal

Publications (1)

Publication Number Publication Date
WO2024031736A1 (zh) 2024-02-15

Family

ID=89850432

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/112630 WO2024031736A1 (zh) 2022-08-10 2022-08-16 Image classification method and system, storage medium and terminal

Country Status (2)

Country Link
CN (1) CN117633264A (zh)
WO (1) WO2024031736A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
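CN104657708A (zh) * 2015-02-02 2015-05-27 郑州酷派电子设备有限公司 A novel three-dimensional object recognition device and method
CN107256262A (zh) * 2017-06-13 2017-10-17 西安电子科技大学 An image retrieval method based on object detection
CN109919149A (zh) * 2019-01-18 2019-06-21 平安科技(深圳)有限公司 Object annotation method based on an object detection model and related device
CN114049484A (zh) * 2021-11-02 2022-02-15 广州华多网络科技有限公司 Commodity image retrieval method and its apparatus, device, medium and product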

Also Published As

Publication number Publication date
CN117633264A (zh) 2024-03-01

Similar Documents

Publication Publication Date Title
Ding et al. Extreme learning machine with kernel model based on deep learning
Xie et al. Point clouds learning with attention-based graph convolution networks
Iyer et al. Shape-based searching for product lifecycle applications
Zhang et al. Panorama: a data system for unbounded vocabulary querying over video
US20220414439A1 (en) Neuromorphic Synthesizer
WO2023201924A1 (zh) 对象缺陷检测方法、装置、计算机设备和存储介质
Li et al. Fuzzy based affinity learning for spectral clustering
WO2020168814A1 (zh) 服饰识别、分类及检索的方法、装置、设备及存储介质
CN111191526A (zh) 行人属性识别网络训练方法、***、介质及终端
CN108804617B (zh) 领域术语抽取方法、装置、终端设备及存储介质
Wang et al. A novel GCN-based point cloud classification model robust to pose variances
CN105320764A (zh) 一种基于增量慢特征的3d模型检索方法及其检索装置
Chew et al. Large-scale 3D point-cloud semantic segmentation of urban and rural scenes using data volume decomposition coupled with pipeline parallelism
Pedronette et al. A graph-based ranked-list model for unsupervised distance learning on shape retrieval
CN115605862A (zh) 训练用于3d模型数据库查询的可微分渲染器和神经网络
Chiu et al. Integrating content-based image retrieval and deep learning to improve wafer bin map defect patterns classification
CN114821140A (zh) 基于曼哈顿距离的图像聚类方法、终端设备及存储介质
CN110083731A (zh) 图像检索方法、装置、计算机设备及存储介质
CN113837635A (zh) 风险检测处理方法、装置及设备
CN111488479A (zh) 超图构建方法、装置以及计算机***和介质
CN113420642A (zh) 一种基于类别语义特征重加权的小样本目标检测方法及***
WO2024031736A1 (zh) 图像分类方法及***、存储介质及终端
Kim et al. Image recognition accelerator design using in-memory processing
Liu et al. An incremental broad learning approach for semi-supervised classification
CN111767710B (zh) 印尼语的情感分类方法、装置、设备及介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22954684

Country of ref document: EP

Kind code of ref document: A1