CN117011575A - Training method and related device for small sample target detection model - Google Patents


Info

  • Publication number: CN117011575A
  • Authority: CN (China)
  • Prior art keywords: target, training, detection model, small sample, classification network
  • Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
  • Application number: CN202211330223.5A
  • Other languages: Chinese (zh)
  • Inventors: 高斌斌, 陈晓辰
  • Current assignee: Tencent Technology Shenzhen Co Ltd (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
  • Original assignee: Tencent Technology Shenzhen Co Ltd
  • Application filed by Tencent Technology Shenzhen Co Ltd
  • Priority: CN202211330223.5A


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/764: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/20: Image preprocessing
    • G06V 10/25: Determination of region of interest [ROI] or a volume of interest [VOI]
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/40: Extraction of image or video features
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77: Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/774: Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 2201/00: Indexing scheme relating to image or video recognition or understanding
    • G06V 2201/07: Target detection
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00: Road transport of goods or passengers
    • Y02T 10/10: Internal combustion engine [ICE] based vehicles
    • Y02T 10/40: Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a training method and a related device for a small sample target detection model, applicable to the technical field of computer vision. The method comprises the following steps: first, a first training image carrying first label information and a second training image carrying second label information are acquired; the first training image is then input into a first target classification network in the small sample target detection model, which outputs first prediction information; next, the second training image is input into a second target classification network in the small sample target detection model, which outputs second prediction information; finally, an optimized small sample target detection model is generated according to the first prediction information, the first label information, the second prediction information and the second label information. By decomposing the small sample target detection model into a first target classification network and a second target classification network that detect the first target object and the second target object independently, the method improves the accuracy of identifying small sample target objects.

Description

Training method and related device for small sample target detection model
Technical Field
The application relates to the technical field of artificial intelligence, and in particular to a training method and a related device for a small sample target detection model.
Background
In recent years, fully supervised deep convolutional neural networks have made significant progress in various computer vision tasks such as image classification, object detection, semantic segmentation, and instance segmentation. However, this excellent performance largely depends on large-scale, precisely annotated image datasets.
However, for some new classes that occur infrequently and whose samples are difficult to collect, only a small number of labeled samples are available. This limits the model's generalization ability for these classes and greatly reduces its recognition rate on them. For example, in an industrial manufacturing scenario, product surfaces are inspected by artificial intelligence (AI) based deep learning methods to detect defective products with surface flaws. As industrial production advances, fewer defective products come off the production line, so the sample data available for training a target detection model also shrinks. This limits the target detection model's ability to recognize the various surface defects during quality inspection, reducing both the accuracy of target detection and the quality of the inspection.
Disclosure of Invention
The embodiment of the application provides a training method and a related device for a small sample target detection model. The small sample target detection model is decomposed into a first target classification network and a second target classification network. The first target classification network is trained on a large amount of labeled sample image data, giving the model the capability to detect a first target object; the second target classification network is trained on a small amount of labeled sample image data, giving the model the capability to detect a second target object. Because the two networks detect the first and second target objects independently, the biased classification problem caused by label scarcity in small sample scenarios is alleviated, and the accuracy of identifying small sample target objects is improved.
One aspect of the present application provides a training method for a small sample target detection model, including:
acquiring a first training sample set and a second training sample set, wherein the first training sample set comprises m first training images containing P first target objects, and the P first target objects carry P pieces of first label information; the second training sample set comprises n second training images containing Q second target objects, and the Q second target objects carry Q pieces of second label information; the first label information indicates the labeling category of a first target object, and the second label information indicates the labeling category of a second target object; m, n, P and Q are integers greater than 1, m is greater than n, and P is greater than Q;
taking the m first training images as input of the small sample target detection model, and outputting P pieces of first prediction information through a first target classification network in the small sample target detection model, wherein the first prediction information indicates the prediction category of a first target object;
taking the n second training images as input of the small sample target detection model, and outputting Q pieces of second prediction information through a second target classification network in the small sample target detection model, wherein the second prediction information indicates the prediction category of a second target object;
optimizing parameters of the first target classification network according to the P pieces of first prediction information and the P pieces of first label information, optimizing parameters of the second target classification network according to the Q pieces of second prediction information and the Q pieces of second label information, and generating an optimized small sample target detection model.
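As a rough illustration of the decomposed optimization described above, the sketch below trains two independent linear classifier heads as hypothetical stand-ins for the first and second target classification networks, over randomly generated stand-in backbone features. The actual detector architecture, feature extractor, class counts and optimizer are not specified at this level of the disclosure; everything named here is an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(probs, labels):
    return float(-np.log(probs[np.arange(len(labels)), labels] + 1e-12).mean())

class LinearHead:
    """Linear softmax classifier head; a stand-in for one target classification network."""
    def __init__(self, dim, num_classes):
        self.W = rng.normal(0.0, 0.01, (dim, num_classes))
    def forward(self, feats):
        return softmax(feats @ self.W)
    def step(self, feats, labels, lr=0.1):
        """One gradient update on this head only; returns the pre-update loss."""
        probs = self.forward(feats)
        onehot = np.eye(self.W.shape[1])[labels]
        self.W -= lr * feats.T @ (probs - onehot) / len(labels)
        return cross_entropy(probs, labels)

dim = 16
# Stand-ins for backbone features of m labeled base-class images (P objects)
# and n labeled novel-class images (Q objects), with m > n and P > Q.
base_feats, base_labels = rng.normal(size=(64, dim)), rng.integers(0, 5, 64)
novel_feats, novel_labels = rng.normal(size=(8, dim)), rng.integers(0, 3, 8)

head1 = LinearHead(dim, 5)  # first target classification network
head2 = LinearHead(dim, 3)  # second target classification network

losses1, losses2 = [], []
for _ in range(50):
    # Each head is optimized independently, with its own images and labels.
    losses1.append(head1.step(base_feats, base_labels))
    losses2.append(head2.step(novel_feats, novel_labels))
```

The point of the sketch is only the decomposition: the two heads share no parameters, so the abundant base-class data cannot bias the novel-class classifier.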
One aspect of the present application provides a target detection method, including:
acquiring a target image;
and inputting the target image into the small sample target detection model trained by the method described above, and outputting first prediction information and second prediction information through the small sample target detection model, wherein the first prediction information indicates the prediction category of a first target object included in the target image, and the second prediction information indicates the prediction category of a second target object included in the target image.
One aspect of the present application provides a training apparatus for a small sample target detection model, including:
the training sample set acquisition module is used for acquiring a first training sample set and a second training sample set, wherein the first training sample set comprises m first training images containing P first target objects, and the P first target objects carry P pieces of first label information; the second training sample set comprises n second training images containing Q second target objects, and the Q second target objects carry Q pieces of second label information; the first label information indicates the labeling category of a first target object, and the second label information indicates the labeling category of a second target object; m, n, P and Q are integers greater than 1, m is greater than n, and P is greater than Q;
the first target object training module is used for taking the m first training images as input of the small sample target detection model, and outputting P pieces of first prediction information through a first target classification network in the small sample target detection model, wherein the first prediction information indicates the prediction category of a first target object;
the second target object training module is used for taking the n second training images as input of the small sample target detection model, and outputting Q pieces of second prediction information through a second target classification network in the small sample target detection model, wherein the second prediction information indicates the prediction category of a second target object;
the small sample target detection model optimization module is used for optimizing parameters of the first target classification network according to the P pieces of first prediction information and the P pieces of first label information, optimizing parameters of the second target classification network according to the Q pieces of second prediction information and the Q pieces of second label information, and generating an optimized small sample target detection model.
In another implementation manner of the embodiment of the present application, the second target object training module is further configured to:
acquiring a training target image from the n second training images, wherein the training target image comprises a second target object and a second background area; the second target object carries second label information, and the second background area carries background category information;
and taking the second label information and the background category information as input of the second target classification network, and generating second prediction information of the second target object through the second target classification network.
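One plausible reading of this step is that background regions are treated as an extra category alongside the second-target classes, so the second target classification network learns to separate objects from background. The sketch below assumes hypothetical class names and a single background index; none of these names come from the patent.

```python
# Hypothetical label scheme: novel-object classes 0..C-1, plus one background class C.
NOVEL_CLASSES = ["scratch", "dent", "stain"]  # illustrative second-target categories
BACKGROUND = len(NOVEL_CLASSES)               # background gets its own class index

def make_training_targets(regions):
    """Map each labeled region of a training target image to a class index.

    `regions` is a list of (kind, label) pairs, where kind is "object" or
    "background"; all background regions share the extra background index.
    """
    targets = []
    for kind, label in regions:
        if kind == "object":
            targets.append(NOVEL_CLASSES.index(label))
        else:
            targets.append(BACKGROUND)
    return targets

regions = [("object", "dent"), ("background", None), ("object", "scratch")]
print(make_training_targets(regions))  # → [1, 3, 0]
```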
In another implementation manner of the embodiment of the present application, the second target object carries a second reference bounding box, and the second target object training module is further configured to:
acquiring a confidence threshold;
generating K candidate boxes according to the training target image, wherein K is an integer greater than 1;
determining a second prediction bounding box from the K candidate boxes according to the second reference bounding box and the confidence threshold;
and determining the position information of the second target object according to the second prediction bounding box.
In another implementation manner of the embodiment of the present application, the second target object training module is further configured to:
calculating K intersection-over-union (IoU) values according to the second reference bounding box and the K candidate boxes, wherein an IoU value represents the degree of overlap between the second reference bounding box and a candidate box;
and generating the second prediction bounding box from the candidate boxes whose IoU values meet the confidence threshold.
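The IoU computation and threshold-based candidate selection described above can be sketched as follows. The (x1, y1, x2, y2) box format is an assumption for illustration; the patent does not fix a coordinate convention.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def select_prediction_boxes(reference_box, candidates, threshold):
    """Keep the candidate boxes whose IoU with the reference box meets the threshold."""
    return [c for c in candidates if iou(reference_box, c) >= threshold]

ref = (0, 0, 10, 10)
candidates = [(0, 0, 10, 10), (5, 5, 15, 15), (20, 20, 30, 30)]
print(select_prediction_boxes(ref, candidates, 0.5))  # → [(0, 0, 10, 10)]
```

Here the second candidate overlaps the reference box but its IoU (about 0.14) falls below the 0.5 threshold, so only the exact match survives.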
In another implementation manner of the embodiment of the present application, the small sample target detection model optimization module is further configured to:
extracting features of the second training image to obtain second training image features, wherein the second training image features comprise second target object features corresponding to the second target object and second background region features corresponding to the second background region;
and performing a gradient update on the second target classification network according to the second target object features and the second background region features, to obtain an updated second target classification network.
In another implementation manner of the embodiment of the present application, the small sample target detection model optimization module is further configured to:
extracting features of the first training image to obtain first training image features;
and performing a gradient update on the first target classification network according to the first training image features, to obtain an updated first target classification network.
In another implementation manner of the embodiment of the present application, the small sample target detection model optimization module is further configured to:
generating a loss function of the first target classification network according to the P pieces of first prediction information and the P pieces of first label information;
generating a loss function of the second target classification network according to the Q pieces of second prediction information and the Q pieces of second label information;
and generating a loss function of the small sample target detection model according to the loss function of the first target classification network and the loss function of the second target classification network.
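A minimal sketch of combining the two branch losses. The choice of cross-entropy, the simple weighted sum, and the weights `w1`/`w2` are all assumptions: the patent only states that the model loss is generated from the two network losses, not how they are combined.

```python
import numpy as np

def cross_entropy(probs, labels):
    """Mean cross-entropy between predicted class probabilities and integer labels."""
    return float(-np.log(probs[np.arange(len(labels)), labels] + 1e-12).mean())

def detection_model_loss(pred1, labels1, pred2, labels2, w1=1.0, w2=1.0):
    """Total model loss: weighted sum of the two branch classification losses."""
    loss1 = cross_entropy(pred1, labels1)  # first target classification network
    loss2 = cross_entropy(pred2, labels2)  # second target classification network
    return w1 * loss1 + w2 * loss2

pred1 = np.array([[0.9, 0.1], [0.2, 0.8]])  # P pieces of first prediction information
pred2 = np.array([[0.7, 0.3]])              # Q pieces of second prediction information
total = detection_model_loss(pred1, [0, 1], pred2, [0])  # ≈ 0.52
```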
Another aspect of the present application provides an object detection apparatus including:
the target image acquisition module is used for acquiring a target image;
the target object prediction module is used for inputting the target image into the small sample target detection model trained by the method described above, and outputting first prediction information and second prediction information through the small sample target detection model, wherein the first prediction information indicates the prediction category of a first target object included in the target image, and the second prediction information indicates the prediction category of a second target object included in the target image.
Another aspect of the present application provides a computer apparatus comprising:
memory, transceiver, processor, and bus system;
wherein the memory is used for storing programs;
the processor is configured to execute the program in the memory, including performing the methods of the above aspects;
the bus system is used to connect the memory and the processor, so that the memory and the processor can communicate.
Another aspect of the application provides a computer readable storage medium having instructions stored therein which, when run on a computer, cause the computer to perform the methods of the above aspects.
Another aspect of the application provides a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device performs the methods provided in the above aspects.
From the above technical solutions, the embodiment of the present application has the following advantages:
the application provides a training method and a related device for a small sample target detection model, wherein the method comprises the following steps. First, a first training sample set and a second training sample set are acquired: the first training sample set comprises m first training images containing P first target objects, which carry P pieces of first label information; the second training sample set comprises n second training images containing Q second target objects, which carry Q pieces of second label information; the first and second label information indicate the labeling categories of the first and second target objects respectively. Then, the m first training images are taken as input of the small sample target detection model, and P pieces of first prediction information, indicating the prediction categories of the first target objects, are output through the first target classification network in the model. Next, the n second training images are taken as input of the model, and Q pieces of second prediction information, indicating the prediction categories of the second target objects, are output through the second target classification network. Finally, parameters of the first target classification network are optimized according to the P pieces of first prediction information and the P pieces of first label information, parameters of the second target classification network are optimized according to the Q pieces of second prediction information and the Q pieces of second label information, and the optimized small sample target detection model is generated.
According to the method provided by the embodiment of the application, the small sample target detection model is decomposed into a first target classification network and a second target classification network. The first target classification network is trained on a large amount of labeled sample image data, so that the model can detect a first target object; the second target classification network is trained on a small amount of labeled sample image data, so that the model can detect a second target object. Because the two networks detect the first and second target objects independently, the biased classification problem caused by label scarcity in small sample scenarios is alleviated, and the accuracy of identifying small sample target objects is improved.
Drawings
FIG. 1 is a schematic diagram of a training system for a small sample target detection model according to an embodiment of the present application;
FIG. 2 is a flowchart of a training method of a small sample target detection model according to an embodiment of the present application;
FIG. 3 is a flowchart of a training method of a small sample target detection model according to another embodiment of the present application;
FIG. 4 is a flowchart of a training method of a small sample target detection model according to another embodiment of the present application;
FIG. 5 is a flowchart of a training method of a small sample target detection model according to another embodiment of the present application;
FIG. 6 is a schematic diagram illustrating calculation of an intersection ratio according to an embodiment of the present application;
FIG. 7 is a flowchart of a training method of a small sample target detection model according to another embodiment of the present application;
FIG. 8 is a flowchart of a training method of a small sample target detection model according to another embodiment of the present application;
FIG. 9 is a flowchart of a training method of a small sample target detection model according to another embodiment of the present application;
FIG. 10 is a flowchart of a target detection method according to an embodiment of the present application;
FIG. 11 is a schematic structural diagram of a training device for a small sample target detection model according to an embodiment of the present application;
FIG. 12 is a schematic diagram of a target detection apparatus according to an embodiment of the present application;
fig. 13 is a schematic diagram of a server structure according to an embodiment of the present application.
Detailed Description
The embodiment of the application provides a training method for a small sample target detection model. The small sample target detection model is decomposed into a first target classification network and a second target classification network. The first target classification network is trained on a large amount of labeled sample image data, giving the model the capability to detect a first target object; the second target classification network is trained on a small amount of labeled sample image data, giving the model the capability to detect a second target object. Because the two networks detect the first and second target objects independently, the biased classification problem caused by label scarcity in small sample scenarios is alleviated, and the accuracy of identifying small sample target objects is improved.
The terms "first," "second," "third," "fourth" and the like in the description and in the claims and in the above drawings, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the application described herein may be implemented, for example, in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "includes" and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed or inherent to such process, method, article, or apparatus.
Artificial intelligence (Artificial Intelligence, AI) is the theory, method, technique and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use the knowledge to obtain optimal results. In other words, artificial intelligence is a comprehensive branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a way similar to human intelligence. Artificial intelligence research covers the design principles and implementation methods of various intelligent machines, so that the machines can perceive, reason and make decisions.
Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, including both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated AI chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. AI software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning.
Computer vision (Computer Vision, CV) is the science of how to make machines "see": replacing human eyes with cameras and computers to recognize and measure targets, and further processing the images so that they are more suitable for human observation or for transmission to instruments for detection. As a scientific discipline, computer vision studies the theory and technology needed to build artificial intelligence systems that can acquire information from images or multidimensional data. Computer vision techniques typically include image processing, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D techniques, virtual reality, augmented reality, and simultaneous localization and mapping, as well as common biometric techniques such as face recognition and fingerprint recognition.
Machine learning (Machine Learning, ML) is a multi-field interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other disciplines. It studies how a computer can simulate or implement human learning behavior to acquire new knowledge or skills, and reorganize existing knowledge structures to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent; it is applied throughout all areas of artificial intelligence. Machine learning and deep learning typically include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstration.
The scheme provided by the embodiment of the application relates to the technology of computer vision, machine learning and the like of artificial intelligence.
In order to facilitate understanding of the technical solution provided by the embodiments of the present application, some key terms used in the embodiments of the present application are explained here:
small sample study: in the training stage, the object counting model can obtain a sample image, position labels of all sample objects in the sample image and a plurality of bounding boxes surrounding the sample objects in the sample image, so as to perform counting learning.
Small sample target detection (few shot object detection, FSOD) addresses the target detection problem when only a few training samples are available. Compared with a small sample classification task, FSOD must not only judge whether a target object appears in an image but also give the specific position of the target object, so its learning difficulty is higher.
General small sample target detection (general few shot object detection, GFSOD) addresses the problem, within small sample target detection, of forgetting how to detect base class (base) target objects.
Biased classification problem: due to class imbalance among the training samples, the model tends to classify samples into the majority classes, resulting in misclassification.
In recent years, fully supervised deep convolutional neural networks have made significant progress in various computer vision tasks such as image classification, object detection, semantic segmentation, and instance segmentation. This excellent performance depends to a large extent on large-scale, precisely annotated image datasets. However, in some scenarios there is little sample data for training a target detection model, and labeled sample data is even scarcer, so the recognition rate of the target detection model on such target objects is low.
For example, in an industrial manufacturing scenario, the surface of a product is inspected by a deep learning method based on artificial intelligence (Artificial Intelligence, AI) to detect defective products with surface defects. As industrial production advances, the number of defective products produced on an industrial production line decreases, so the sample data available for training a target detection model also decreases. This limits the recognition capability of the target detection model for the various product surface defects during quality inspection, reducing the accuracy of target detection and the quality of the inspection.
Similarly, in a rare animal detection scenario, animals in ecological environment images are detected by an AI-based deep learning method. Because rare animal sample data is scarce, the recognition capability of the target detection model for rare animals is limited, reducing the accuracy of target detection.
This application proposes a training method for a small sample target detection model. The method decomposes the small sample target detection model into a first target classification network and a second target classification network. The first target classification network is trained with a large amount of labeled sample image data, so that the model has the ability to detect a first target object; the second target classification network is trained with a small amount of labeled sample image data, so that the model has the ability to detect a second target object. The two classification networks detect the first and second target objects independently, which alleviates the biased classification problem caused by label scarcity in small sample scenarios and improves the accuracy of identifying small sample target objects.
For easy understanding, referring to fig. 1, fig. 1 is an application environment diagram of a training method of a small sample target detection model in an embodiment of the present application. As shown in fig. 1, the training method is applied to a training system of a small sample target detection model. The training system comprises a server and a terminal device. The server may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks (Content Delivery Network, CDN), big data, and artificial intelligence platforms. The terminal device may be, but is not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, etc. The terminal device and the server may be directly or indirectly connected through wired or wireless communication, which is not limited in the embodiments of the present application.
The server first obtains a first training sample set and a second training sample set. The first training sample set comprises m first training images containing P first target objects, which carry P pieces of first label information; the second training sample set comprises n second training images containing Q second target objects, which carry Q pieces of second label information. The first label information indicates the labeling category of the first target objects, and the second label information indicates the labeling category of the second target objects. The server then takes the m first training images as input to the small sample target detection model and outputs P pieces of first prediction information through the first target classification network in the model, the first prediction information indicating the prediction category of the first target object. Next, the server takes the n second training images as input to the model and outputs Q pieces of second prediction information through the second target classification network, the second prediction information indicating the prediction category of the second target object. Finally, the server optimizes the parameters of the first target classification network according to the P pieces of first prediction information and first label information, and optimizes the parameters of the second target classification network according to the Q pieces of second prediction information and second label information, thereby generating an optimized small sample target detection model.
The training method of the small sample target detection model in the application will be described from the perspective of the server. Referring to fig. 2, the training method for the small sample target detection model provided by the embodiment of the application includes: step S110 to step S140. Specific:
s110, acquiring a first training sample set and a second training sample set.
The first training sample set comprises m first training images, the m first training images comprise P first target objects, the P first target objects carry P first label information, the second training sample set comprises n second training images, the n second training images comprise Q second target objects, the Q second target objects carry Q second label information, the first label information is used for indicating the labeling category of the first target objects, the second label information is used for indicating the labeling category of the second target objects, m, n, P and Q are integers greater than 1, m is greater than n, and P is greater than Q.
It will be appreciated that the first training image includes a first target object and the first target object carries first tag information indicating a class of annotation of the first target object, and that the first training image may also include a second target object, but the second target object in the first training image does not carry second tag information indicating a class of annotation of the second target object. The second training image includes a second target object, and the second target object carries second label information for indicating a labeling category of the second target object, and the second training image may also include the first target object, but the first target object in the second training image does not carry first label information for indicating a labeling category of the first target object. Or the second training image comprises a first target object and a second target object, the first target object carries first label information for indicating the labeling category of the first target object, and the second target object carries second label information for indicating the labeling category of the second target object.
The first target object is a base-class target object of the small sample target detection model, and the second target object is a new-class target object of the model. More sample data, and more sample label data, of the first target object than of the second target object are used for training the small sample target detection model. In the embodiments of the application, the "small sample" in small sample target detection refers to the second target object.
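The organization of the two sample sets described above can be sketched as a small data layout; the file names, class names, and counts below are illustrative assumptions, not taken from the application:

```python
# Hypothetical layout of the two training sample sets: the first set holds
# many images with labeled base-class (first) target objects, the second
# holds few images with labeled new-class (second) target objects.
first_set = [
    {"image": "img_0001.png", "objects": [{"class": "pit", "bbox": [10, 10, 50, 60]}]},
    {"image": "img_0002.png", "objects": [{"class": "pit", "bbox": [5, 20, 40, 80]},
                                          {"class": "scratch", "bbox": [60, 15, 90, 45]}]},
    {"image": "img_0003.png", "objects": [{"class": "scratch", "bbox": [12, 8, 44, 30]}]},
]
second_set = [
    {"image": "img_1001.png", "objects": [{"class": "missing_part", "bbox": [30, 30, 70, 90]}]},
    {"image": "img_1002.png", "objects": [{"class": "missing_part", "bbox": [25, 40, 60, 95]}]},
]

m, n = len(first_set), len(second_set)                 # image counts
P = sum(len(s["objects"]) for s in first_set)          # labeled first target objects
Q = sum(len(s["objects"]) for s in second_set)         # labeled second target objects
print(m, n, P, Q)  # m > n and P > Q, as required by the method
```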
S120, taking m first training images as input of a small sample target detection model, and outputting P first prediction information through a first target classification network in the small sample target detection model.
The first prediction information is used for indicating a prediction category of the first target object.
S130, taking n second training images as input of a small sample target detection model, and outputting Q second prediction information through a second target classification network in the small sample target detection model.
The second prediction information is used for indicating the prediction category of the second target object.
It may be appreciated that the small sample object detection model provided by the embodiment of the present application may be an object classifier, the first object classification network may be a positive output head in the object classifier, and the second object classification network may be a negative output head in the object classifier.
The first target classification network is trained with m first training images comprising base class target objects such that the small sample target detection model has the ability to detect the base class target objects (first target objects). The second target classification network is trained by n second training images comprising new class target objects such that the small sample target detection model has the ability to detect new class target objects (second target objects).
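The decomposition into two classification networks over a shared feature can be sketched as follows; the linear form, feature dimension, and class counts are assumptions made only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

class TwoHeadClassifier:
    """Sketch of a detector classifier split into a first head for base
    classes and a second head for new classes, both reading the same
    backbone feature. Each head has one extra slot for the background."""
    def __init__(self, feat_dim, num_base, num_novel):
        self.W_base = rng.standard_normal((feat_dim, num_base + 1)) * 0.01
        self.W_novel = rng.standard_normal((feat_dim, num_novel + 1)) * 0.01

    def forward(self, feat):
        # Each head produces its own independent logits.
        return feat @ self.W_base, feat @ self.W_novel

model = TwoHeadClassifier(feat_dim=16, num_base=5, num_novel=2)
feat = rng.standard_normal(16)               # stand-in for a backbone feature
base_logits, novel_logits = model.forward(feat)
print(base_logits.shape, novel_logits.shape)  # (6,) (3,)
```

Because the heads are separate, each can be trained and optimized on its own sample set, which is the independence the method relies on.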
And S140, optimizing parameters of the first target classification network according to the P pieces of first prediction information and the P pieces of first label information, optimizing parameters of the second target classification network according to the Q pieces of second prediction information and the Q pieces of second label information, and generating an optimized small sample target detection model.
It will be appreciated that the parameters of the small sample target detection model include three parts: the basic detection model parameters, the first target classification network parameters, and the second target classification network parameters. The small sample target detection model first learns to identify the first target object from a large number of first training images, completing the setting and optimization of the basic detection model parameters and the first target classification network parameters; at this point the recognition rate for the first target object is high. On this basis, the model then learns to identify the second target object directly from a small number of second training images, for which only the second target classification network parameters need to be set and optimized. The degree to which parameters can be optimized is determined by the amount of sample data: the more sample data, the higher the degree of optimization; the less sample data, the lower the degree of optimization. The first training images provide more sample data, so the parameters of the first target classification network are optimized to a higher degree; the second training images provide less sample data, so the parameters of the second target classification network are optimized to a lower degree.
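The parameter partition above — basic and first-head parameters already optimized, only the second-head parameters updated in the few-shot stage — can be sketched as a frozen-parameter update. The shapes, values, and plain SGD rule are assumptions for the sketch:

```python
import numpy as np

# Three parameter groups, mirroring the three parts named in the text.
params = {
    "backbone": np.ones((4, 4)),     # basic detection model parameters (frozen)
    "head_base": np.ones((4, 3)),    # first target classification network (frozen)
    "head_novel": np.zeros((4, 2)),  # second target classification network (trainable)
}
trainable = {"head_novel"}

def sgd_step(params, grads, lr=0.1):
    for name in params:
        if name not in trainable:
            continue                 # frozen groups receive no update
        params[name] -= lr * grads[name]

grads = {k: np.ones_like(v) for k, v in params.items()}
sgd_step(params, grads)
print(params["backbone"][0, 0], params["head_novel"][0, 0])  # 1.0 -0.1
```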
According to the method provided by the embodiment of the application, the small sample target detection model is decomposed into the first target classification network and the second target classification network. The first target classification network is trained with a large amount of labeled sample image data, so that the model has the ability to detect the first target object; the second target classification network is trained with a small amount of labeled sample image data, so that the model has the ability to detect the second target object. The two networks detect the first and second target objects independently, which alleviates the biased classification problem caused by label scarcity in small sample scenarios and improves the accuracy of identifying small sample target objects.
In an alternative embodiment of the training method of the small sample target detection model provided in the corresponding embodiment of fig. 2, referring to fig. 3, step S130 includes sub-steps S1301 to S1302.
Specific:
s1301, acquiring training target images from n second training images.
The training target image comprises a second target object and a second background area, the second target object carries second label information, and the second background area carries background category information.
It can be understood that the region where the second target object carrying the second tag information is located in the training target image is a foreground region, and the portion excluding the foreground region in the training target image is a background region.
S1302, second label information and background category information are used as input of a second target classification network, and second prediction information of a second target object is generated through the second target classification network.
It can be understood that the training of the second target classification network is limited to the second tag information and the background category information, that is, the second target classification network performs the detection learning of the second target object in the second tag information and the background category information, so as to solve the bias classification problem.
For the n second training images, first determine the Q second target objects they contain and the Q pieces of second label information carried by those objects, and record the class-presence vector as $\mathbf{m} = (m_1, \dots, m_Q, m_{Q+1})$, where $m_i$ ($1 \le i \le Q$) is a binary indicator for the $i$-th class of second target object and $m_{Q+1}$ indicates the background category in the training image. If a second training image includes second target objects of the first class $m_1$, the second class $m_2$, and the fifth class $m_5$, then $m_1 = m_2 = m_5 = m_{Q+1} = 1$ and all other indicators are zero. Next, the logits are constrained on the condition that the class is present; that is, classification is restricted to the classes with $m_i = 1$ (the labeled second target classes plus the background). A masked softmax is then obtained:

$$p(x_i) = \frac{m_i \, e^{z_i}}{\sum_{t=1}^{Q+1} m_t \, e^{z_t}}$$

wherein $p(x_i)$ is the prediction information for the label class $x_i$ of the $i$-th class of target object, $z_i$ is the logit of the $i$-th class, and $t$ ranges over all classes including the background; classes with $m_t = 0$ contribute nothing to the normalization.
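A minimal sketch of the constrained (masked) softmax described above, assuming the constraint is implemented by excluding absent classes from the normalization:

```python
import numpy as np

def masked_softmax(logits, mask):
    """Softmax restricted to the classes present in the image plus the
    background entry: logits of absent classes (mask 0) are excluded
    from the normalization, so their probability is exactly zero."""
    mask = np.asarray(mask, dtype=float)
    z = np.where(mask > 0, logits, -np.inf)   # suppress absent classes
    z = z - z[mask > 0].max()                 # numerical stability
    e = np.where(mask > 0, np.exp(z), 0.0)
    return e / e.sum()

# Q = 5 second-target classes + 1 background slot; only classes 1, 2, 5
# and the background appear in this hypothetical training image.
logits = np.array([2.0, 1.0, 0.5, 0.3, 1.5, 0.0])
mask   = np.array([1,   1,   0,   0,   1,   1])
p = masked_softmax(logits, mask)
print(np.round(p, 3))  # roughly [0.474 0.174 0.    0.    0.287 0.064]
```

Absent classes never receive probability mass, so no gradient pushes the head toward them — which is how the masking counteracts the biased classification problem.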
For the m first training images, first determine the P first target objects they contain and the P pieces of first label information carried by those objects, recorded as $\mathbf{m} = (m_1, \dots, m_P, m_{P+1})$, where $m_i$ ($1 \le i \le P$) is a binary indicator for the $i$-th class of first target object. If a first training image includes first target objects of the first class $m_1$, the second class $m_2$, and the fifth class $m_5$, then $m_1 = m_2 = m_5 = 1$. A softmax over all first target classes is then obtained:

$$p(x_i) = \frac{e^{z_i}}{\sum_{t=1}^{P+1} e^{z_t}}$$

wherein $p(x_i)$ is the prediction information for the label class $x_i$ of the $i$-th class of target object, $z_i$ is the logit of the $i$-th class, and $t$ ranges over all classes; $x_t$ denotes the label class of the $t$-th class of target object.
According to the method provided by the embodiment of the application, training of the second target classification network is limited in the second label information and the background category information, so that the bias classification problem is solved.
In an alternative embodiment of the training method of the small sample object detection model provided in the corresponding embodiment of fig. 3 of the present application, please refer to fig. 4, the sub-step S1302 further includes sub-steps S1303 to S1306. Specific:
s1303, obtaining a confidence threshold.
It will be appreciated that, since the number of training samples of the second target object is small, a relatively permissive confidence threshold, for example 0.5, may be selected when predicting the position information of the second target object, so as to improve the accuracy of the position prediction.
S1304, generating K candidate frames according to the training target image.
Wherein K is an integer greater than 1.
It can be appreciated that the training target image is segmented to obtain K candidate frames. The candidate box is used for determining the position information of the second target object.
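Candidate-box generation from the training target image can be sketched with a simple sliding-window grid; real detectors typically use anchor sets or region-proposal networks, and the window size and stride here are illustrative assumptions:

```python
# Generate K candidate boxes by sliding a fixed-size window over the
# training target image on a regular grid. Boxes are (x1, y1, x2, y2).
def grid_candidates(img_w, img_h, box=64, stride=32):
    boxes = []
    for y in range(0, img_h - box + 1, stride):
        for x in range(0, img_w - box + 1, stride):
            boxes.append((x, y, x + box, y + box))
    return boxes

candidates = grid_candidates(256, 256)
print(len(candidates))  # 49 — the K candidate boxes for a 256x256 image
```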
S1305, determining a second prediction boundary box from the K candidate boxes according to the second reference boundary box and the confidence threshold.
It is understood that the candidate box satisfying the confidence threshold is taken as the second prediction boundary box.
S1306, determining the position information of the second target object according to the second prediction boundary box.
It is understood that the second prediction bounding box is the position information of the second target object predicted by the small sample target detection model.
According to the method provided by the embodiment of the application, the accuracy of the position information prediction of the second target object is improved by setting the confidence threshold.
In an alternative embodiment of the training method of the small sample target detection model provided in the corresponding embodiment of fig. 4 of the present application, please refer to fig. 5, the sub-step S1305 includes sub-steps S13051 to S13052. Specific:
S13051, according to the second reference bounding box and the K candidate boxes, K intersection-over-union values are calculated.

Wherein the intersection-over-union value is used to represent the degree of overlap between the second reference bounding box and a candidate box.
It will be appreciated that the intersection-over-union (intersection over union, IoU) is used to measure the relative size of the overlap between the second reference bounding box and a candidate box. The higher the value, the larger the overlapping portion of the second reference bounding box and the candidate box; the lower the value, the smaller the overlapping portion. Referring to fig. 6, fig. 6 is a schematic diagram illustrating the calculation of the intersection-over-union according to an embodiment of the application. Region A and region B are the two regions for which the intersection-over-union is to be calculated; it can be computed from the area of A∩B and the area of A∪B as IoU = |A∩B| / |A∪B|.
S13052, generating a second prediction bounding box from the candidate boxes whose intersection-over-union values, among the K values, satisfy the confidence threshold.
It is appreciated that the second prediction bounding box is determined based on the relative size of the overlap between the second reference bounding box and the candidate boxes, together with the confidence threshold. For example, 5 candidate boxes are generated from the training target image, and 5 intersection-over-union values are calculated against the second reference bounding box: 0.1, 0.2, 0.6, 0.3 and 0.2. With a confidence threshold of 0.5, the candidate box whose intersection-over-union value exceeds the threshold, namely the one with value 0.6, is taken as the second prediction bounding box.
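The intersection-over-union computation and the threshold-based selection in this example can be sketched as follows (box coordinates are illustrative):

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

reference = (50, 50, 150, 150)     # second reference bounding box
candidates = [(0, 0, 60, 60), (40, 40, 140, 140), (200, 200, 250, 250)]
threshold = 0.5                    # confidence threshold from the text
# Keep candidates whose IoU with the reference exceeds the threshold.
kept = [c for c in candidates if iou(reference, c) > threshold]
print(kept)  # [(40, 40, 140, 140)]
```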
According to the method provided by the embodiment of the application, the accuracy of the position prediction of the second target object is improved by setting a confidence threshold and calculating the intersection-over-union of the reference box and the candidate boxes.
In an alternative embodiment of the training method of the small sample object detection model provided in the corresponding embodiment of fig. 3 of the present application, please refer to fig. 7, step S140 includes sub-steps S1402 to S1404.
Specific:
and S1402, extracting the characteristics of the second training image to obtain the characteristic dimension of the second training image.
The second training image feature dimension includes a second target object feature dimension and a second background region feature dimension, the second target object feature dimension corresponds to the second target object, and the second background region feature dimension corresponds to the second background region.
And S1404, carrying out gradient update on the second target classification network according to the characteristic dimension of the second target object and the characteristic dimension of the second background area to obtain an updated second target classification network.
It can be understood that, in order to restrict the training of the second target classification network to the second label information and the background category information (that is, so that the second target classification network performs detection learning of the second target object only within those classes), feature extraction is performed on the second training image to obtain its feature dimensions. For the gradient update of the second target classification network, only the feature dimensions of the second target object and of the second background area (the specific dimensions) are considered, so as to reduce classification bias.
The parameters of the second target classification network are updated by gradient descent:

$$\theta_{cls} \leftarrow \theta_{cls} - \lambda \frac{\partial \mathcal{L}_{cls}}{\partial \theta_{cls}}$$

wherein $\theta_{cls}$ denotes the parameters of the target classification network, $\lambda$ is the learning rate, $\mathcal{L}_{cls}$ is the loss function of the target classification network, and $\partial$ is the derivative symbol. By the chain rule,

$$\frac{\partial \mathcal{L}_{cls}}{\partial \theta_{cls}} = \frac{\partial \mathcal{L}_{cls}}{\partial \hat{y}} \cdot \frac{\partial \hat{y}}{\partial \theta_{cls}}$$

wherein $\hat{y}$ is the prediction information and $y$ is the second label information. For the second target classification network, the gradient is confined to the specific dimensions — those of the second target classes and the background — which mitigates the bias in classification.
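The restriction of the gradient to specific dimensions can be sketched for a linear classification head, assuming the usual softmax cross-entropy gradient of the form (probs − target) with respect to the logits; values and shapes are illustrative:

```python
import numpy as np

def restricted_grad_step(W, feat, probs, target, mask, lr=0.1):
    """Gradient-descent step on a linear head where the classification
    gradient is confined to the masked (specific) dimensions: absent-class
    columns receive exactly zero update."""
    g_logits = (probs - target) * mask        # masked softmax-CE gradient
    W -= lr * np.outer(feat, g_logits)        # linear-head weight update
    return W

feat = np.ones(4)                             # stand-in backbone feature
W = np.zeros((4, 6))                          # head weights: 5 classes + background
probs  = np.array([0.5, 0.2, 0.1, 0.1, 0.05, 0.05])
target = np.array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0])
mask   = np.array([1.0, 1.0, 0.0, 0.0, 1.0, 1.0])  # present classes + background
W = restricted_grad_step(W, feat, probs, target, mask)
print(np.round(W[0], 3))  # columns 2 and 3 stay zero
```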
According to the method provided by the embodiment of the application, only the characteristic dimension of the second target object and the characteristic dimension of the second background area are considered when the gradient update is carried out on the second target classification network, so that the classification bias is reduced.
In an alternative embodiment of the training method of the small sample target detection model provided in the corresponding embodiment of fig. 2 of the present application, referring to fig. 8, step S140 includes sub-steps S1401 to S1403.
Specific:
and S1401, extracting features of the first training image to obtain the feature dimension of the first training image.
S1403, gradient updating is carried out on the first target classification network according to the feature dimension of the first training image, and the updated first target classification network is obtained.
It can be understood that feature extraction is performed on the first training image to obtain its feature dimensions, and that for the gradient update of the first target classification network, all of the feature dimensions of the first training image are used.
The parameters of the first target classification network are updated by gradient descent:

$$\theta_{cls} \leftarrow \theta_{cls} - \lambda \frac{\partial \mathcal{L}_{cls}}{\partial \theta_{cls}}$$

wherein $\theta_{cls}$ denotes the parameters of the target classification network, $\lambda$ is the learning rate, $\mathcal{L}_{cls}$ is the loss function of the target classification network, and $\partial$ is the derivative symbol. By the chain rule,

$$\frac{\partial \mathcal{L}_{cls}}{\partial \theta_{cls}} = \frac{\partial \mathcal{L}_{cls}}{\partial \hat{y}} \cdot \frac{\partial \hat{y}}{\partial \theta_{cls}}$$

wherein $\hat{y}$ is the prediction information and $y$ is the first label information. For the first target classification network, the gradient is taken over all feature dimensions.
According to the method provided by the embodiment of the application, when the gradient update is carried out on the first target classification network, the accuracy of the first target object identification is improved by considering all dimensions in the feature dimensions of the first training image.
In an alternative embodiment of the training method of the small sample target detection model provided in the corresponding embodiment of fig. 2, referring to fig. 9, step S140 includes sub-steps S141 to S143.
Specific:
s141, generating a loss function of the first target classification network according to the P pieces of first prediction information and the P pieces of first label information.
It will be appreciated that the loss function of the first target classification network is expressed by the following formula:

$$\mathcal{L}_1 = \frac{1}{P} \sum_{i=1}^{P} \ell\!\left(\hat{y}_i, y_i\right)$$

wherein $\mathcal{L}_1$ is the loss function of the first target classification network, $P$ is the total number of first target objects, $\ell$ is a classification loss (for example, cross-entropy), $\hat{y}_i$ is the prediction information for the $i$-th first target object, and $y_i$ is its label information.
S142, generating a loss function of the second target classification network according to the Q pieces of second prediction information and the Q pieces of second label information.
It will be appreciated that the loss function of the second target classification network is expressed by the following formula:

$$\mathcal{L}_2 = \frac{1}{Q} \sum_{i=1}^{Q} \ell\!\left(\hat{y}_i, y_i\right)$$

wherein $\mathcal{L}_2$ is the loss function of the second target classification network, $Q$ is the total number of second target objects, $\ell$ is a classification loss (for example, cross-entropy), $\hat{y}_i$ is the prediction information for the $i$-th second target object, and $y_i$ is its label information.
S143, generating a loss function of the small sample target detection model according to the loss function of the first target classification network and the loss function of the second target classification network.
It will be appreciated that the loss function of the small sample target detection model is the sum of the loss function of the first target classification network and the loss function of the second target classification network, calculated by the following formula:

$$\mathcal{L} = \mathcal{L}_1 + \mathcal{L}_2$$

wherein $\mathcal{L}_1$ is the loss function of the first target classification network and $\mathcal{L}_2$ is the loss function of the second target classification network.
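Assuming a cross-entropy form for each head's loss (the application only states that the total is their sum), the combined loss can be sketched as:

```python
import numpy as np

def cross_entropy(pred_probs, labels):
    """Mean negative log-likelihood of the correct class for each
    prediction; the cross-entropy form is an assumption for this sketch."""
    return float(-np.mean(np.log(pred_probs[np.arange(len(labels)), labels])))

# First head: P=2 predictions over base classes; second head: Q=1 prediction.
p1 = np.array([[0.8, 0.1, 0.1], [0.2, 0.7, 0.1]]); y1 = np.array([0, 1])
p2 = np.array([[0.6, 0.4]]);                        y2 = np.array([0])
# Total model loss = first-head loss + second-head loss.
total = cross_entropy(p1, y1) + cross_entropy(p2, y2)
print(round(total, 4))
```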
According to the method provided by the embodiment of the application, the loss function of the small sample target detection model is calculated through the loss function of the first target classification network and the loss function of the second target classification network, so that the accuracy of identifying the first target object and the second target object of the small sample target detection model is improved.
For easy understanding, a training method of a small sample target detection model applied to product defects in industrial quality inspection is described below, and the training method comprises steps 1 to 4. Specific:
step 1, a first training sample set and a second training sample set are obtained.
The first training sample set comprises m first training images, the m first training images comprise P first target objects, the P first target objects carry P first label information, the second training sample set comprises n second training images, the n second training images comprise Q second target objects, the Q second target objects carry Q second label information, the first label information is used for indicating the labeling category of the first target objects, the second label information is used for indicating the labeling category of the second target objects, m, n, P and Q are integers greater than 1, m is greater than n, and P is greater than Q.
It will be appreciated that the first target object refers to a defect of the surface-dishing type, and the second target object refers to a defect in which the product lacks a specific component. As industrial development progresses, dishing is a common problem on a product production line, while the lack of a specific part is rare; that is, the n second training images in the second training sample set form a small sample data set relative to the m first training images in the first training sample set. The small sample data set is small in two respects: fewer samples and less label data. That is, the number n of second training images is smaller than the number m of first training images, and the number Q of pieces of second label information indicating the labeling category of the second target object is smaller than the number P of pieces of first label information indicating the labeling category of the first target object. The first training image may include a first target object carrying first label information indicating its labeling category, and may also include a second target object that does not carry second label information. The second training image may include a second target object carrying second label information indicating its labeling category, and may also include a first target object that does not carry first label information.
The first target object is a basic target object of the small sample target detection model, and the second target object is a new class target object of the small sample target detection model.
And 2, taking m first training images as input of a small sample target detection model, and outputting P first prediction information through a first target classification network in the small sample target detection model.
The first prediction information is used for indicating a prediction category of the first target object.
And step 3, taking n second training images as the input of a small sample target detection model, and outputting Q second prediction information through a second target classification network in the small sample target detection model.
The second prediction information is used for indicating the prediction category of the second target object.
It will be appreciated that the first target classification network is trained with m first training images comprising base class target objects such that the small sample target detection model has the ability to detect the base class target objects (first target objects). The second target classification network is trained by n second training images comprising new class target objects such that the small sample target detection model has the ability to detect new class target objects (second target objects).
And 4, optimizing parameters of the first target classification network according to the P pieces of first prediction information and the P pieces of first label information, optimizing parameters of the second target classification network according to the Q pieces of second prediction information and the Q pieces of second label information, and generating an optimized small sample target detection model.
It will be appreciated that the parameters of the small sample target detection model include three parts: the basic detection model parameters, the first target classification network parameters, and the second target classification network parameters. The small sample target detection model first learns to identify the first target object from a large number of first training images, completing the setting and optimization of the basic detection model parameters and the first target classification network parameters; at this point the recognition rate for the first target object is high. On this basis, the model then learns to identify the second target object directly from a small number of second training images, for which only the second target classification network parameters need to be set and optimized. The degree to which parameters can be optimized is determined by the amount of sample data: the more sample data, the higher the degree of optimization; the less sample data, the lower the degree of optimization. The first training images provide more sample data, so the parameters of the first target classification network are optimized to a higher degree; the second training images provide less sample data, so the parameters of the second target classification network are optimized to a lower degree.
Step 3 further comprises substeps 3.1 to 3.6. Specifically:
and 3.1, acquiring training target images from the n second training images.
The training target image comprises a second target object and a second background area, the second target object carries second label information, and the second background area carries background category information.
And 3.2, taking the second label information and the background category information as the input of a second target classification network, and generating second prediction information of a second target object through the second target classification network.
It can be understood that the training of the second target classification network is restricted to the second label information and the background category information; that is, the second target classification network learns to detect the second target object only over the second-label and background categories, which alleviates the biased classification problem.
And 3.3, obtaining a confidence coefficient threshold value.
And 3.4, generating K candidate frames according to the training target image.
Wherein K is an integer greater than 1.
And 3.5, determining a second prediction boundary box from the K candidate boxes according to the second reference boundary box and the confidence threshold.
And 3.6, determining the position information of the second target object according to the second prediction boundary box.
Step 3.5 further comprises substeps 3.5.1 to 3.5.2. Specifically:
step 3.5.1, calculating K intersection ratio values according to the second reference bounding box and the K candidate boxes, wherein an intersection ratio value is used for representing the degree of coincidence between the second reference bounding box and a candidate box;
and 3.5.2, generating the second prediction bounding box according to the candidate boxes among the K whose intersection ratio values meet the confidence threshold.
It is appreciated that the second prediction bounding box is determined from the candidate boxes whose degree of overlap with the second reference bounding box, as measured by the intersection ratio, meets the confidence threshold.
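The overlap test of substeps 3.5.1 and 3.5.2 can be sketched as follows; the (x1, y1, x2, y2) box format and the concrete threshold value are assumptions for illustration, not the embodiment's implementation:

```python
def iou(box_a, box_b):
    """Intersection ratio (intersection-over-union) of two axis-aligned
    boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def select_prediction_boxes(reference_box, candidates, conf_threshold):
    """Keep the candidate boxes whose intersection ratio with the reference
    bounding box meets the confidence threshold (substeps 3.5.1-3.5.2)."""
    return [c for c in candidates if iou(reference_box, c) >= conf_threshold]

ref = (10, 10, 50, 50)                 # stand-in second reference bounding box
candidates = [(12, 12, 52, 52), (100, 100, 140, 140), (10, 10, 50, 50)]
kept = select_prediction_boxes(ref, candidates, conf_threshold=0.5)
```

A candidate that does not overlap the reference box yields an intersection ratio of 0 and is discarded; closely overlapping candidates survive the threshold.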
Step 4 further comprises steps 4.1 to 4.3. Specifically:
and 4.1, generating a loss function of the first target classification network according to the P pieces of first prediction information and the P pieces of first label information.
And 4.2, generating a loss function of the second target classification network according to the Q pieces of second prediction information and the Q pieces of second label information.
And 4.3, generating a loss function of the small sample target detection model according to the loss function of the first target classification network and the loss function of the second target classification network.
It will be appreciated that the loss function of the small sample target detection model is calculated by the following formula:

$\mathcal{L} = \mathcal{L}_{1} + \mathcal{L}_{2}$

wherein $\mathcal{L}_{1}$ is the loss function of the first target classification network, and $\mathcal{L}_{2}$ is the loss function of the second target classification network.
The loss function of the first target classification network is expressed by the following formula:

$\mathcal{L}_{1} = -\frac{1}{P} \sum_{i=1}^{P} y_i \log \hat{y}_i$

wherein $\mathcal{L}_{1}$ is the loss function of the first target classification network, $P$ is the total number of first target objects, $i$ indexes the $i$-th first target object in the $m$ first training images, $\hat{y}_i$ is the prediction information for the $i$-th first target object, and $y_i$ is its label information.
The first training image prediction information $\hat{y}_i$ is expressed by the following formula:

$\hat{y}_i = \frac{\exp(m_i x_i)}{\exp(m_i x_i) + \sum_{t \neq i} \exp(m_t x_t)}$

wherein $\hat{y}_i$ is the first training image prediction information, $t$ is an integer different from $i$, $m_i$ represents the class-$i$ target object, $x_i$ represents the label category of the class-$i$ target object, $m_t$ represents the class-$t$ target object, and $x_t$ represents the label category of the $t$-th target object.
The parameters of the first target classification network are updated by gradient descent:

$\theta_{cls} \leftarrow \theta_{cls} - \lambda \frac{\partial \mathcal{L}_{cls}}{\partial \theta_{cls}}$

wherein $\theta_{cls}$ denotes the parameters of the target classification network, $\lambda$ is the learning rate, $\mathcal{L}_{cls}$ is the loss function of the target classification network, and $\partial$ is the derivative symbol.
According to the chain rule, $\frac{\partial \mathcal{L}_{1}}{\partial \theta_{cls}} = \frac{\partial \mathcal{L}_{1}}{\partial \hat{y}_i} \cdot \frac{\partial \hat{y}_i}{\partial \theta_{cls}}$, wherein $\hat{y}_i$ is the first training image prediction information and $y_i$ is the first label information. Then, for the first target classification network, $\frac{\partial \mathcal{L}_{1}}{\partial m_i} = (\hat{y}_i - y_i)\, x_i$, wherein $m_i$ is the class of the target object and $x_i$ is the label category.
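The gradient expression above follows from the usual softmax/cross-entropy pairing; a finite-difference check with toy, hypothetical dimensions confirms that the derivative with respect to each class-weight column has the form $(\hat{y} - y)\,x$:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=3)          # feature vector (stand-in)
W = rng.normal(size=(3, 4))     # classifier weights, one column per class
y = 2                           # ground-truth class index

def loss(W):
    """Cross-entropy of a softmax classifier for the single sample (x, y)."""
    logits = x @ W
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return -np.log(p[y])

# Analytic gradient: for each class j, dL/dW[:, j] = (p_j - 1{j == y}) * x
logits = x @ W
p = np.exp(logits - logits.max()); p /= p.sum()
analytic = np.outer(x, p - np.eye(4)[y])

# Central finite-difference check of every weight entry
eps = 1e-6
numeric = np.zeros_like(W)
for i in range(W.shape[0]):
    for j in range(W.shape[1]):
        Wp = W.copy(); Wp[i, j] += eps
        Wm = W.copy(); Wm[i, j] -= eps
        numeric[i, j] = (loss(Wp) - loss(Wm)) / (2 * eps)

assert np.allclose(analytic, numeric, atol=1e-5)
```

The same check applies unchanged to the second target classification network's gradient below.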
The loss function of the second target classification network is expressed by the following formula:

$\mathcal{L}_{2} = -\frac{1}{Q} \sum_{i=1}^{Q} y_i \log \hat{y}_i$

wherein $\mathcal{L}_{2}$ is the loss function of the second target classification network, $Q$ is the total number of second target objects, $i$ indexes the $i$-th second target object in the $n$ second training images, $\hat{y}_i$ is the prediction information for the $i$-th second target object, and $y_i$ is its label information.
The second training image prediction information $\hat{y}_i$ is expressed by the following formula:

$\hat{y}_i = \frac{\exp(m_i x_i)}{\exp(m_i x_i) + \sum_{t \neq i} \exp(m_t x_t)}$

wherein $\hat{y}_i$ is the second training image prediction information, $t$ is an integer different from $i$, $m_i$ represents the class-$i$ target object, $x_i$ represents the label category of the class-$i$ target object, $m_t$ represents the class-$t$ target object, and $x_t$ represents the label category of the $t$-th target object.
The parameters of the second target classification network are updated by gradient descent:

$\theta_{cls} \leftarrow \theta_{cls} - \lambda \frac{\partial \mathcal{L}_{cls}}{\partial \theta_{cls}}$

wherein $\theta_{cls}$ denotes the parameters of the target classification network, $\lambda$ is the learning rate, $\mathcal{L}_{cls}$ is the loss function of the target classification network, and $\partial$ is the derivative symbol.
According to the chain rule, $\frac{\partial \mathcal{L}_{2}}{\partial \theta_{cls}} = \frac{\partial \mathcal{L}_{2}}{\partial \hat{y}_i} \cdot \frac{\partial \hat{y}_i}{\partial \theta_{cls}}$, wherein $\hat{y}_i$ is the prediction information and $y_i$ is the second label information. Then, for the second target classification network, $\frac{\partial \mathcal{L}_{2}}{\partial m_i} = (\hat{y}_i - y_i)\, x_i$, wherein $m_i$ is the class of the target object and $x_i$ is the label category. For the second target classification network, because $\frac{\partial \mathcal{L}_{2}}{\partial m_i}$ is defined only over specific dimensions (the new-class and background dimensions), the bias in classification is mitigated.
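A minimal sketch of restricting the gradient to specific dimensions, as just described for the second target classification network; the split of the class axis into base, new-class, and background indices is a hypothetical layout chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

num_base, num_novel = 3, 2
num_classes = num_base + num_novel + 1          # +1 background category
W = rng.normal(size=(4, num_classes))           # joint classifier weights

x = rng.normal(size=4)                          # feature of a new-class sample
y = num_base                                    # its label: first new class

logits = x @ W
p = np.exp(logits - logits.max()); p /= p.sum()
grad = np.outer(x, p - np.eye(num_classes)[y])  # full softmax-CE gradient

# Zero out the base-class columns: only the new-class and background
# dimensions receive updates, mitigating bias toward the base classes.
mask = np.zeros(num_classes)
mask[num_base:] = 1.0                           # new classes + background
masked_grad = grad * mask

W_new = W - 0.1 * masked_grad
assert np.array_equal(W_new[:, :num_base], W[:, :num_base])  # base frozen
```

Because the base-class columns never move during the few-shot stage, the abundant base classes cannot pull the classifier away from the scarce new classes.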
According to the method provided by the embodiment of the application, the small sample target detection model is decomposed into a first target classification network and a second target classification network. The first target classification network is trained on a large amount of labeled sample image data so that the model has the ability to detect the first target object, and the second target classification network is trained on a small amount of labeled sample image data so that the model has the ability to detect the second target object. Because the two networks detect the first target object and the second target object independently, the biased classification problem caused by missing labels in the small sample scenario is alleviated, and the accuracy of small sample target object recognition is improved.
The following describes the target detection method in the present application. Referring to fig. 10, the target detection method provided by the embodiment of the application includes steps S210 to S220. Specifically:
s210, acquiring a target image.
S220, inputting the target image into a small sample target detection model trained by any one of the above methods, and outputting first prediction information and second prediction information through the small sample target detection model.
The first prediction information is used for indicating the prediction category of a first target object included in the target image, and the second prediction information is used for indicating the prediction category of a second target object included in the target image.
It will be appreciated that the training of the small sample target detection model described above provides the model with the ability to detect the first target object and the second target object, and to output their predicted category information, position information, and the like.
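The inference path of steps S210 to S220 can be sketched with two independent classification heads sharing one image feature; the weights and dimensions below are stand-ins for illustration, not the trained model:

```python
import numpy as np

rng = np.random.default_rng(3)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical trained weights of the two classification heads.
W_base = rng.normal(size=(4, 3))    # first target classification network
W_novel = rng.normal(size=(4, 2))   # second target classification network

def detect(feature):
    """S220: run both heads on the same image feature; each head
    independently yields prediction information for its own classes."""
    first_pred = softmax(feature @ W_base)     # base-class probabilities
    second_pred = softmax(feature @ W_novel)   # new-class probabilities
    return first_pred, second_pred

feature = rng.normal(size=4)        # stand-in for the extracted image feature
first_pred, second_pred = detect(feature)
```

The two heads never compete over each other's categories, which is what lets base-class and new-class objects be reported side by side.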
According to the method provided by the embodiment of the application, the target image is processed through the small sample target detection model, so that the prediction information of the first target object and the second target object in the target image is obtained, the prediction information comprises the predicted category information, the position information and the like, and the accuracy of target object identification is improved.
In order to facilitate understanding, a product defect detection method applied to industrial quality inspection is described below, comprising steps 1 to 2. Specifically:
and step 1, acquiring a target image.
It will be appreciated that the target image is a photographic image of the product in an industrial quality inspection.
And 2, inputting the target image into the small sample target detection model trained by any one of the above methods, and outputting first prediction information and second prediction information through the small sample target detection model.
The first prediction information is used for indicating the prediction category of a first target object included in the target image, and the second prediction information is used for indicating the prediction category of a second target object included in the target image.
It will be appreciated that the first target object refers to a dent (dishing) type defect in the product surface, and the second target object refers to a defect in which the product lacks a particular component.
According to the method provided by the embodiment of the application, the target image is processed through the small sample target detection model, so that the prediction information of the first target object and the second target object in the target image is obtained, the prediction information comprises the predicted category information, the position information and the like, and the accuracy of target object identification is improved.
The training device of the small sample object detection model in the present application is described in detail below, referring to fig. 11. Fig. 11 is a schematic diagram of an embodiment of a training apparatus 10 for a small sample target detection model according to an embodiment of the present application, where the training apparatus 10 for a small sample target detection model includes:
the training sample set obtaining module 110 is configured to obtain a first training sample set and a second training sample set.
The first training sample set comprises m first training images, the m first training images comprise P first target objects, the P first target objects carry P first label information, the second training sample set comprises n second training images, the n second training images comprise Q second target objects, the Q second target objects carry Q second label information, the first label information is used for indicating the labeling category of the first target objects, the second label information is used for indicating the labeling category of the second target objects, m, n, P and Q are integers greater than 1, m is greater than n, and P is greater than Q.
The first target object training module 120 is configured to take m first training images as input of a small sample target detection model, and output P first prediction information through a first target classification network in the small sample target detection model.
The first prediction information is used for indicating a prediction category of the first target object.
The second target object training module 130 is configured to take n second training images as input of the small sample target detection model, and output Q second prediction information through a second target classification network in the small sample target detection model.
The second prediction information is used for indicating the prediction category of the second target object;
the small sample target detection model optimizing module 140 is configured to optimize parameters of the first target classification network according to the P first prediction information and the P first tag information, and optimize parameters of the second target classification network according to the Q second prediction information and the Q second tag information, so as to generate an optimized small sample target detection model.
According to the device provided by the embodiment of the application, the small sample target detection model is decomposed into a first target classification network and a second target classification network. The first target classification network is trained on a large amount of labeled sample image data so that the model has the ability to detect the first target object, and the second target classification network is trained on a small amount of labeled sample image data so that the model has the ability to detect the second target object. Because the two networks detect the first target object and the second target object independently, the biased classification problem caused by missing labels in the small sample scenario is alleviated, and the accuracy of small sample target object recognition is improved.
In an alternative embodiment of the training apparatus for a small sample target detection model provided in the corresponding embodiment of fig. 11 of the present application, the second target object training module 130 is further configured to:
training target images are acquired from n second training images.
The training target image comprises a second target object and a second background area, the second target object carries second label information, and the second background area carries background category information.
And taking the second label information and the background category information as the input of a second target classification network, and generating second prediction information of a second target object through the second target classification network.
The device provided by the embodiment of the application limits the training of the second target classification network to the second label information and the background category information so as to solve the bias classification problem.
In an alternative embodiment of the training apparatus for a small sample target detection model provided in the corresponding embodiment of fig. 11 of the present application, the second target object carries a second reference bounding box, and the second target object training module 130 is further configured to:
acquiring a confidence threshold;
and generating K candidate frames according to the training target image.
Wherein K is an integer greater than 1.
And determining a second prediction boundary box from the K candidate boxes according to the second reference boundary box and the confidence threshold.
And determining the position information of the second target object according to the second prediction boundary box.
The device provided by the embodiment of the application improves the accuracy of the position information prediction of the second target object by setting the confidence threshold.
In an alternative embodiment of the training apparatus for a small sample target detection model provided in the corresponding embodiment of fig. 11 of the present application, the second target object training module 130 is further configured to:
And according to the second reference bounding box and the K candidate boxes, K intersection ratio values are calculated.
Wherein the intersection ratio value is used for representing the degree of coincidence between the second reference bounding box and a candidate box.
And generating the second prediction bounding box according to the candidate boxes among the K whose intersection ratio values meet the confidence threshold.
The device provided by the embodiment of the application improves the accuracy of the position information prediction of the second target object by setting the confidence threshold and calculating the intersection ratio of the reference frame and the candidate frame.
In an alternative embodiment of the training apparatus for a small sample target detection model provided in the corresponding embodiment of fig. 11 of the present application, the small sample target detection model optimizing module 140 is further configured to:
And extracting the features of the second training image to obtain the feature dimension of the second training image.
The second training image feature dimension includes a second target object feature dimension and a second background region feature dimension, the second target object feature dimension corresponds to the second target object, and the second background region feature dimension corresponds to the second background region.
And carrying out gradient update on the second target classification network according to the characteristic dimension of the second target object and the characteristic dimension of the second background area to obtain an updated second target classification network.
According to the device provided by the embodiment of the application, only the characteristic dimension of the second target object and the characteristic dimension of the second background area are considered when the gradient update is carried out on the second target classification network, so that the classification bias is reduced.
In an alternative embodiment of the training apparatus for a small sample target detection model provided in the corresponding embodiment of fig. 11 of the present application, the small sample target detection model optimizing module 140 is further configured to:
and extracting the features of the first training image to obtain the feature dimension of the first training image.
And carrying out gradient update on the first target classification network according to the feature dimension of the first training image to obtain an updated first target classification network.
According to the device provided by the embodiment of the application, when the gradient update is carried out on the first target classification network, the accuracy of the first target object identification is improved by considering all dimensions in the feature dimensions of the first training image.
In an alternative embodiment of the training apparatus for a small sample target detection model provided in the corresponding embodiment of fig. 11 of the present application, the small sample target detection model optimizing module 140 is further configured to:
and generating a loss function of the first target classification network according to the P pieces of first prediction information and the P pieces of first label information.
And generating a loss function of the second target classification network according to the Q pieces of second prediction information and the Q pieces of second label information.
And generating a loss function of the small sample target detection model according to the loss function of the first target classification network and the loss function of the second target classification network.
According to the device provided by the embodiment of the application, the loss function of the small sample target detection model is calculated through the loss function of the first target classification network and the loss function of the second target classification network, so that the accuracy of identifying the first target object and the second target object of the small sample target detection model is improved.
The following describes the object detection device in detail, please refer to fig. 12. Fig. 12 is a schematic diagram of an embodiment of an object detection device 20 according to an embodiment of the present application, where the object detection device 20 includes:
The target image acquisition module 210 is configured to acquire a target image.
The target object prediction module 220 is configured to input the target image into the small sample target detection model trained by the above method, and output the first prediction information and the second prediction information through the small sample target detection model.
The first prediction information is used for indicating the prediction category of a first target object included in the target image, and the second prediction information is used for indicating the prediction category of a second target object included in the target image.
According to the device provided by the embodiment of the application, the target image is processed through the small sample target detection model, so that the prediction information of the first target object and the second target object in the target image is obtained, the prediction information comprises the predicted category information, the position information and the like, and the accuracy of target object identification is improved.
Fig. 13 is a schematic diagram of a server structure according to an embodiment of the present application. The server 300 may vary considerably in configuration or performance, and may include one or more central processing units (CPU) 322 (e.g., one or more processors), a memory 332, and one or more storage media 330 (e.g., one or more mass storage devices) storing application programs 342 or data 344. The memory 332 and the storage medium 330 may be transitory or persistent. The program stored on the storage medium 330 may include one or more modules (not shown), each of which may include a series of instruction operations on the server. Still further, the central processor 322 may be configured to communicate with the storage medium 330 and execute the series of instruction operations in the storage medium 330 on the server 300.
The server 300 may also include one or more power supplies 326, one or more wired or wireless network interfaces 350, one or more input/output interfaces 358, and/or one or more operating systems 341, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, etc.
The steps performed by the server in the above embodiments may be based on the server structure shown in fig. 13.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, which are not repeated herein.
In the several embodiments provided in the present application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of elements is merely a logical functional division, and there may be additional divisions of actual implementation, e.g., multiple elements or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application may be embodied in essence or a part contributing to the prior art or all or part of the technical solution in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the methods of the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a read-only memory (ROM), a random access memory (random access memory, RAM), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
The above embodiments are only for illustrating the technical solution of the present application, and not for limiting the same; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims (13)

1. A training method for a small sample target detection model, comprising:
acquiring a first training sample set and a second training sample set, wherein the first training sample set comprises m first training images, the m first training images comprise P first target objects, the P first target objects carry P first label information, the second training sample set comprises n second training images, the n second training images comprise Q second target objects, the Q second target objects carry Q second label information, the first label information is used for indicating the labeling category of the first target objects, the second label information is used for indicating the labeling category of the second target objects, m, n, P and Q are integers greater than 1, m is greater than n, and P is greater than Q;
Taking the m first training images as input of a small sample target detection model, and outputting P pieces of first prediction information through a first target classification network in the small sample target detection model, wherein the first prediction information is used for indicating the prediction category of the first target object;
taking the n second training images as input of the small sample target detection model, and outputting Q second prediction information through a second target classification network in the small sample target detection model, wherein the second prediction information is used for indicating the prediction category of the second target object;
optimizing parameters of the first target classification network according to the P pieces of first prediction information and the P pieces of first label information, optimizing parameters of the second target classification network according to the Q pieces of second prediction information and the Q pieces of second label information, and generating an optimized small sample target detection model.
2. The method for training a small sample object detection model according to claim 1, wherein the outputting Q second prediction information by the second object classification network in the small sample object detection model using the n second training images as input of the small sample object detection model comprises:
Acquiring training target images from the n second training images, wherein the training target images comprise second target objects and second background areas, the second target objects carry second label information, and the second background areas carry background category information;
and taking the second tag information and the background category information as the input of the second target classification network, and generating second prediction information of the second target object through the second target classification network.
3. The method for training a small sample object detection model according to claim 2, wherein the second target object carries a second reference bounding box, and further comprising, after outputting the second prediction information through the second target classification network:
acquiring a confidence threshold;
generating K candidate frames according to the training target image, wherein K is an integer greater than 1;
determining a second prediction boundary box from K candidate boxes according to the second reference boundary box and the confidence threshold;
and determining the position information of the second target object according to the second prediction boundary box.
4. A method of training a small sample object detection model as claimed in claim 3 wherein said determining a second prediction bounding box from K candidate boxes based on said second reference bounding box and said confidence threshold comprises:
according to the second reference bounding box and the K candidate boxes, calculating K intersection ratio values, wherein the intersection ratio value is used for representing the degree of coincidence between the second reference bounding box and a candidate box;
and generating the second prediction bounding box according to the candidate boxes among the K whose intersection ratio values meet the confidence threshold.
5. The method of training a small sample object detection model according to claim 2, wherein optimizing parameters of the second object classification network according to the Q second prediction information and the Q second label information comprises:
extracting features of the second training image to obtain a second training image feature dimension, wherein the second training image feature dimension comprises a second target object feature dimension and a second background region feature dimension, the second target object feature dimension corresponds to a second target object, and the second background region feature dimension corresponds to the second background region;
and carrying out gradient update on the second target classification network according to the characteristic dimension of the second target object and the characteristic dimension of the second background area to obtain an updated second target classification network.
6. The method of training a small sample object detection model according to claim 1, wherein optimizing parameters of the first object classification network according to the P first prediction information and the P first label information comprises:
Extracting features of the first training image to obtain feature dimensions of the first training image;
and carrying out gradient update on the first target classification network according to the feature dimension of the first training image to obtain an updated first target classification network.
7. The method for training a small sample object detection model according to claim 1, wherein optimizing the parameters of the first target classification network according to the P pieces of first prediction information and the P pieces of first label information, and optimizing the parameters of the second target classification network according to the Q pieces of second prediction information and the Q pieces of second label information, to generate the optimized small sample object detection model comprises:
generating a loss function of the first target classification network according to the P pieces of first prediction information and the P pieces of first label information;
generating a loss function of the second target classification network according to the Q pieces of second prediction information and the Q pieces of second label information;
and generating the loss function of the small sample object detection model according to the loss function of the first target classification network and the loss function of the second target classification network.
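The loss construction in this claim can be sketched as per-head softmax cross-entropy losses combined into one model loss. A minimal NumPy sketch; the equal weighting of the two terms and the function names are assumptions, since the claim only states that the overall loss is generated from the two branch losses.

```python
import numpy as np

def cross_entropy(logits, labels):
    """Mean softmax cross-entropy over a batch of predictions."""
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def detection_model_loss(first_logits, first_labels, second_logits, second_labels):
    """Overall loss of the (hypothetical) small sample detection model:
    sum of the two classification-branch losses."""
    loss1 = cross_entropy(first_logits, first_labels)    # first-branch loss
    loss2 = cross_entropy(second_logits, second_labels)  # second-branch loss
    return loss1 + loss2
```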
8. A method of detecting an object, comprising:
acquiring a target image;
inputting the target image into the small sample target detection model trained by the method according to any one of claims 1 to 7, and outputting first prediction information and second prediction information through the small sample target detection model, wherein the first prediction information is used for indicating the prediction category of a first target object included in the target image, and the second prediction information is used for indicating the prediction category of a second target object included in the target image.
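The inference path of claim 8 — one image scored by both classification branches, yielding the two prediction outputs — might look like the following sketch. The shared image feature vector, the linear heads, and the argmax decoding are illustrative assumptions, not details given in the claim.

```python
import numpy as np

def detect(image_feat, W_first, W_second):
    """Score one image feature with two classification heads and return
    (first-branch predicted class, second-branch predicted class)."""
    first_scores = image_feat @ W_first    # head for the first target objects
    second_scores = image_feat @ W_second  # head for the second target objects
    return int(first_scores.argmax()), int(second_scores.argmax())
```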
9. A training device for a small sample object detection model, comprising:
the training sample set acquisition module is used for acquiring a first training sample set and a second training sample set, wherein the first training sample set comprises m first training images, the m first training images comprise P first target objects, the P first target objects carry P first label information, the second training sample set comprises n second training images, the n second training images comprise Q second target objects, the Q second target objects carry Q second label information, the first label information is used for indicating the labeling category of the first target objects, the second label information is used for indicating the labeling category of the second target objects, m, n, P and Q are integers larger than 1, m is larger than n, and P is larger than Q;
The first target object training module is used for taking the m first training images as input of a small sample target detection model, and outputting P pieces of first prediction information through a first target classification network in the small sample target detection model, wherein the first prediction information is used for indicating the prediction category of the first target object;
the second target object training module is used for taking the n second training images as input of the small sample target detection model, and outputting Q second prediction information through a second target classification network in the small sample target detection model, wherein the second prediction information is used for indicating the prediction category of the second target object;
the small sample target detection model optimizing module is used for optimizing parameters of the first target classification network according to the P pieces of first prediction information and the P pieces of first label information, optimizing parameters of the second target classification network according to the Q pieces of second prediction information and the Q pieces of second label information, and generating an optimized small sample target detection model.
10. An object detection apparatus, comprising:
the target image acquisition module is used for acquiring a target image;
A target object prediction module, configured to input the target image into the small sample target detection model trained by the method according to any one of claims 1 to 7, and to output first prediction information and second prediction information through the small sample target detection model, where the first prediction information is used to indicate the prediction class of a first target object included in the target image, and the second prediction information is used to indicate the prediction class of a second target object included in the target image.
11. A computer device, comprising: memory, transceiver, processor, and bus system;
wherein the memory is used for storing programs;
the processor is configured to execute the program in the memory, including performing the method for training a small sample object detection model according to any one of claims 1 to 7 or the object detection method according to claim 8;
the bus system is used for connecting the memory and the processor so as to enable the memory and the processor to communicate.
12. A computer readable storage medium comprising instructions which, when run on a computer, cause the computer to perform the training method of the small sample object detection model of any one of claims 1 to 7 or the object detection method of claim 8.
13. A computer program product comprising a computer program, wherein the computer program, when executed by a processor, implements the method for training a small sample object detection model according to any one of claims 1 to 7 or the object detection method according to claim 8.
CN202211330223.5A 2022-10-27 2022-10-27 Training method and related device for small sample target detection model Pending CN117011575A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211330223.5A CN117011575A (en) 2022-10-27 2022-10-27 Training method and related device for small sample target detection model

Publications (1)

Publication Number Publication Date
CN117011575A 2023-11-07

Family

ID=88564207

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211330223.5A Pending CN117011575A (en) 2022-10-27 2022-10-27 Training method and related device for small sample target detection model

Country Status (1)

Country Link
CN (1) CN117011575A (en)

Citations (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160283735A1 (en) * 2015-03-24 2016-09-29 International Business Machines Corporation Privacy and modeling preserved data sharing
US20190304090A1 (en) * 2018-04-02 2019-10-03 Pearson Education, Inc. Automatic graph scoring for neuropsychological assessments
CN111210024A (en) * 2020-01-14 2020-05-29 深圳供电局有限公司 Model training method and device, computer equipment and storage medium
CN111444828A (en) * 2020-03-25 2020-07-24 腾讯科技(深圳)有限公司 Model training method, target detection method, device and storage medium
EP3731144A1 (en) * 2019-04-25 2020-10-28 Koninklijke Philips N.V. Deep adversarial artifact removal
US20210042570A1 (en) * 2019-08-07 2021-02-11 Applied Materials, Inc. Automatic and adaptive fault detection and classification limits
AU2020103901A4 (en) * 2020-12-04 2021-02-11 Chongqing Normal University Image Semantic Segmentation Method Based on Deep Full Convolutional Network and Conditional Random Field
CN112633425A (en) * 2021-03-11 2021-04-09 腾讯科技(深圳)有限公司 Image classification method and device
CN112990306A (en) * 2021-03-12 2021-06-18 国网智能科技股份有限公司 Transformer equipment defect identification method and system
CN113191273A (en) * 2021-04-30 2021-07-30 西安聚全网络科技有限公司 Oil field well site video target detection and identification method and system based on neural network
CN113221993A (en) * 2021-05-06 2021-08-06 西安电子科技大学 Large-view-field small-sample target detection method based on meta-learning and cross-stage hourglass
CN113256594A (en) * 2021-06-07 2021-08-13 之江实验室 Small sample model generation and weld joint detection method based on regional characteristic metric learning
CN113283513A (en) * 2021-05-31 2021-08-20 西安电子科技大学 Small sample target detection method and system based on target interchange and metric learning
CN113591967A (en) * 2021-07-27 2021-11-02 南京旭锐软件科技有限公司 Image processing method, device and equipment and computer storage medium
CN113822368A (en) * 2021-09-29 2021-12-21 成都信息工程大学 Anchor-free incremental target detection method
CN113868497A (en) * 2021-09-28 2021-12-31 绿盟科技集团股份有限公司 Data classification method and device and storage medium
CN114022774A (en) * 2022-01-10 2022-02-08 航天宏图信息技术股份有限公司 Radar image-based marine mesoscale vortex monitoring method and device
CN114385846A (en) * 2021-12-23 2022-04-22 北京旷视科技有限公司 Image classification method, electronic device, storage medium and program product
CN114445670A (en) * 2022-04-11 2022-05-06 腾讯科技(深圳)有限公司 Training method, device and equipment of image processing model and storage medium
CN114510570A (en) * 2022-01-21 2022-05-17 平安科技(深圳)有限公司 Intention classification method and device based on small sample corpus and computer equipment
US20220198781A1 (en) * 2020-12-17 2022-06-23 Robert Bosch Gmbh Device and method for training a classifier
CN115082740A (en) * 2022-07-18 2022-09-20 北京百度网讯科技有限公司 Target detection model training method, target detection method, device and electronic equipment
CN115146761A (en) * 2022-05-26 2022-10-04 腾讯科技(深圳)有限公司 Defect detection model training method and related device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HASSENE BEN AMARA; FAKHRI KARRAY: "End-to-End Multiview Gesture Recognition for Autonomous Car Parking System", INSTRUMENTATION, no. 03, 15 September 2019 (2019-09-15) *
ZHOU Mo; XU Ling; YANG Mengning; LIAO Shengping; YAN Meng: "Software Defect Prediction Method Based on Deep Autoencoder Networks", Computer Engineering and Science, no. 10, 15 October 2018 (2018-10-15) *

Similar Documents

Publication Publication Date Title
CN109948425B (en) Pedestrian searching method and device for structure-aware self-attention and online instance aggregation matching
CN107133569B (en) Monitoring video multi-granularity labeling method based on generalized multi-label learning
CN116049397B (en) Sensitive information discovery and automatic classification method based on multi-mode fusion
CN113569554B (en) Entity pair matching method and device in database, electronic equipment and storage medium
CN112036514B (en) Image classification method, device, server and computer readable storage medium
CN112487822A (en) Cross-modal retrieval method based on deep learning
CN111325237B (en) Image recognition method based on attention interaction mechanism
CN113158777B (en) Quality scoring method, training method of quality scoring model and related device
CN114694178A (en) Method and system for monitoring safety helmet in power operation based on fast-RCNN algorithm
CN114332473A (en) Object detection method, object detection device, computer equipment, storage medium and program product
CN114358279A (en) Image recognition network model pruning method, device, equipment and storage medium
CN111898528B (en) Data processing method, device, computer readable medium and electronic equipment
CN117829243A (en) Model training method, target detection device, electronic equipment and medium
CN106980878B (en) Method and device for determining geometric style of three-dimensional model
CN113128465A (en) Small target detection method based on improved YOLOv4 for industrial scene
CN112861881A (en) Honeycomb lung recognition method based on improved MobileNet model
CN115439919B (en) Model updating method, device, equipment, storage medium and program product
CN116958724A (en) Training method and related device for product classification model
CN117011575A (en) Training method and related device for small sample target detection model
Ke et al. Human attribute recognition method based on pose estimation and multiple-feature fusion
CN113627522A (en) Image classification method, device and equipment based on relational network and storage medium
Kopparthi et al. Content based image retrieval using deep learning technique with distance measures
CN117975204B (en) Model training method, defect detection method and related device
Zheng et al. Research on Target Detection Algorithm of Bank Card Number Recognition
CN117372815A (en) Quality inspection method and related device for industrial parts based on computer vision

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination