CN113255915A - Knowledge distillation method, device, equipment and medium based on structured instance graph - Google Patents


Info

Publication number: CN113255915A
Application number: CN202110551061.7A
Authority: CN (China)
Prior art keywords: loss, network, distillation, node, diagram
Legal status: Granted; Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other languages: Chinese (zh)
Other versions: CN113255915B, CN113255915B8
Inventors: 陈亦新, 陈鹏光, 贾佳亚, 沈小勇, 吕江波
Current assignees: Shenzhen Smartmore Technology Co Ltd; Shanghai Smartmore Technology Co Ltd (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Application filed by Shenzhen Smartmore Technology Co Ltd and Shanghai Smartmore Technology Co Ltd
Priority to CN202110551061.7A
Publications: CN113255915A; CN113255915B; CN113255915B8 (application granted)

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 - Computing arrangements using knowledge-based models
    • G06N5/02 - Knowledge representation; Symbolic representation
    • G06N5/022 - Knowledge engineering; Knowledge acquisition
    • G06N5/027 - Frames
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/24 - Classification techniques
    • G06F18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/25 - Fusion techniques
    • G06F18/253 - Fusion techniques of extracted features
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The present application relates to a knowledge distillation method, apparatus, computer device and storage medium based on structured instance graphs. With the method and apparatus, the foreground features and background features in a teacher network can be effectively migrated to a student network, further improving the accuracy of target detection. The method comprises the following steps: acquiring a training image, and inputting it into a teacher network and a student network respectively to obtain a first feature map and a second feature map; inputting the first feature map and the second feature map into a region proposal network respectively to obtain a first diagram to be detected and a second diagram to be detected, each containing bounding boxes; encoding the object instances in each bounding box to obtain a first structural diagram and a second structural diagram; obtaining a distillation loss part based on the first structural diagram and the second structural diagram; obtaining a base loss part according to the distance between the detection result of the second diagram to be detected and the real label; and training the student network based on the distillation loss part and the base loss part.

Description

Knowledge distillation method, device, equipment and medium based on structured instance graph
Technical Field
The present application relates to the field of artificial intelligence technology, and in particular, to a knowledge distillation method, apparatus, computer device, and storage medium based on a structured instance graph.
Background
With the development of neural networks in the field of target detection, target detectors have become more and more powerful. These detectors generally adopt deep neural network structures, but the large number of weights in such deep networks consumes considerable memory and computation, making them difficult to deploy on mobile devices.
To solve this problem, knowledge distillation methods have emerged. Knowledge Distillation is a model compression method whose main idea is to use a fully trained, larger teacher network to assist the training of a student network with low resource consumption, so as to reduce the resource consumption of computer vision task processing while achieving the same task processing effect as the teacher network.
However, during computer vision task processing, the proportions of foreground features and background features in an image differ greatly, so the foreground and background features learned by the student network during knowledge distillation are always unbalanced: foreground features in the image may be neglected and background features insufficiently utilized, and the corresponding features thus fail to play their due role in knowledge transfer.
Disclosure of Invention
In view of the above, there is a need to provide a knowledge distillation method, apparatus, computer device and storage medium based on a structured instance graph.
A knowledge distillation method based on a structured instance graph, the method comprising:
acquiring a training image;
inputting the training image into a backbone network of a teacher network to obtain a first feature map, and inputting the training image into a backbone network of a student network to obtain a second feature map;
inputting the first feature map and the second feature map into a region proposal network respectively to obtain a first diagram to be detected containing bounding boxes and a second diagram to be detected containing bounding boxes;
encoding based on the object instances in the bounding boxes of the first diagram to be detected and the object instances in the bounding boxes of the second diagram to be detected to obtain a first structural diagram and a second structural diagram;
obtaining a distillation loss part based on the first structural diagram and the second structural diagram in combination with a preset distillation loss function;
obtaining a base loss part according to the distance between the detection result of the second diagram to be detected and the real label;
training the student network based on the distillation loss part and the base loss part.
In one embodiment, the object instances are characterized using nodes and corresponding edges; the nodes comprise foreground nodes and background nodes; and the encoding based on the object instances in the bounding boxes of the first diagram to be detected and the second diagram to be detected to obtain the first structural diagram and the second structural diagram comprises the following steps:
calculating classification loss values of the background nodes by using a preset classification loss function;
and after removing the background nodes whose classification loss values are smaller than a preset threshold, encoding the remaining background nodes together with the foreground nodes to obtain the first structural diagram and the second structural diagram.
In one embodiment, the training of the student network based on the distillation loss part and the base loss part comprises:
taking the sum of the distillation loss part and the base loss part as the global loss of the student network;
and adjusting network parameters of the student network based on the global loss until the global loss meets a preset condition.
In one embodiment, the object instances are characterized using nodes and corresponding edges; the distillation loss part comprises a foreground node loss part, a background node loss part and an edge loss part; the distillation loss function is:

$$L_G = \lambda_1 L_V^{fg} + \lambda_2 L_V^{bg} + \lambda_3 L_E$$

where

$$L_V^{fg} = \frac{1}{N_{fg}} \sum_{i=1}^{N_{fg}} \left\lVert v_i^{T,fg} - v_i^{S,fg} \right\rVert_2^2$$

$$L_V^{bg} = \frac{1}{N_{bg}} \sum_{i=1}^{N_{bg}} \left\lVert v_i^{T,bg} - v_i^{S,bg} \right\rVert_2^2$$

$$L_E = \frac{1}{N} \sum_{i,j} \left\lVert e_{ij}^{T} - e_{ij}^{S} \right\rVert_2^2$$

Here $L_G$ denotes the distillation loss function; $L_V^{fg}$ denotes the distillation loss of the foreground nodes; $L_V^{bg}$ denotes the distillation loss of the background nodes; $L_E$ denotes the edge loss; $\lambda_1$ is the distillation loss weight of the foreground nodes, $\lambda_2$ the distillation loss weight of the background nodes, and $\lambda_3$ the distillation loss weight of the edges. $N_{fg}$ denotes the total number of foreground nodes, $v_i^{T,fg}$ the node vector of the $i$-th foreground node in the teacher network, and $v_i^{S,fg}$ the node vector of the $i$-th foreground node in the student network; $N_{bg}$ denotes the total number of background nodes, $v_i^{T,bg}$ the node vector of the $i$-th background node in the teacher network, and $v_i^{S,bg}$ the node vector of the $i$-th background node in the student network; $N$ denotes the total number of edges, $e_{ij}^{T}$ the correlation of the $i$-th and $j$-th nodes in feature space in the teacher network, and $e_{ij}^{S}$ the correlation of the $i$-th and $j$-th nodes in feature space in the student network.
In one embodiment, the base loss section includes a detection loss function and a KL divergence function.
A method of target detection, the method comprising:
training the student network for image instance detection by utilizing the steps in the embodiment of the knowledge distillation method based on the structured instance graph;
acquiring an image including an object to be detected;
inputting the image to the student network to cause the student network to output a target image containing a prediction box; the prediction box is used for identifying the class label and the position of the object to be detected.
A knowledge distillation apparatus based on a structured instance graph, the apparatus comprising:
the image acquisition module is used for acquiring a training image;
the characteristic diagram acquisition module is used for inputting the training image into a backbone network of a teacher network to obtain a first characteristic diagram and inputting the training image into a backbone network of a student network to obtain a second characteristic diagram;
the diagram-to-be-detected acquisition module is used for inputting the first feature map and the second feature map into a region proposal network respectively to obtain a first diagram to be detected containing bounding boxes and a second diagram to be detected containing bounding boxes;
the structural diagram output module is used for encoding based on the object instances in the bounding boxes of the first diagram to be detected and the object instances in the bounding boxes of the second diagram to be detected to obtain a first structural diagram and a second structural diagram;
a distillation loss acquisition module, configured to obtain a distillation loss part by combining a preset distillation loss function based on the first structural diagram and the second structural diagram;
a basic loss acquisition module for obtaining a basic loss part according to the distance between the detection result of the second to-be-detected image and the real label;
and the student network training module is used for training the student network based on the distillation loss part and the basic loss part.
An object detection apparatus, the apparatus comprising:
the image acquisition module is used for acquiring an image comprising an object to be detected;
a target image output module, configured to input the image into the student network trained for image instance detection through the steps in the above embodiment of the knowledge distillation method based on the structured instance graph, so that the student network outputs a target image containing a prediction box; the prediction box is used to identify the class label and position of the object to be detected.
A computer device comprising a memory storing a computer program and a processor that, when executing the computer program, implements the steps of any of the above embodiments of the knowledge distillation method based on a structured instance graph and of the target detection method.
A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of any of the above embodiments of the knowledge distillation method based on a structured instance graph and of the target detection method.
According to the knowledge distillation method and apparatus based on the structured instance graph, the computer device and the storage medium, a training image is acquired and input into the backbone network of a teacher network and the backbone network of a student network respectively to obtain a first feature map and a second feature map; the first feature map and the second feature map are input into a region proposal network respectively to obtain a first diagram to be detected containing bounding boxes and a second diagram to be detected containing bounding boxes; encoding is carried out based on the object instances in each bounding box to obtain a first structural diagram and a second structural diagram; a distillation loss part is obtained based on the first structural diagram and the second structural diagram in combination with a preset distillation loss function; a base loss part is obtained according to the distance between the detection result of the second diagram to be detected and the real label; and the student network is trained based on the distillation loss part and the base loss part. The method uses an instance model with a graph structure and fuses the relation between foreground and background features in the image through the preset distillation loss function, so that the foreground and background features in the teacher network can be effectively transferred to the student network, the student network can extract more effective knowledge from the teacher network, and the accuracy of target detection is further improved.
Drawings
FIG. 1 is a schematic diagram of the distillation structure of a knowledge distillation method based on a structured instance graph in one embodiment;
FIG. 2 is a schematic flow diagram of a knowledge distillation method based on a structured instance graph in one embodiment;
FIG. 3 is a schematic flow chart diagram of a method for object detection in one embodiment;
FIG. 4 is a block diagram of a knowledge distillation apparatus based on a structured instance graph in one embodiment;
FIG. 5 is a block diagram of an embodiment of an object detection device;
FIG. 6 is a diagram illustrating an internal structure of a computer device according to an embodiment;
fig. 7 is an internal structural view of a computer device in another embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The knowledge distillation method based on the structured instance graph provided by the present application can be understood with the aid of the knowledge distillation structure schematic diagram shown in FIG. 1. Knowledge distillation trains a compact neural network using knowledge gathered and extracted from a large model or a set of models. The large model or model set is called the teacher network, and the small, compact model is called the student network. The teacher network generally has high hardware requirements and usually needs a large server or a server cluster, while the student network can run on various personal computers, notebook computers, smartphones, tablet computers and portable wearable devices.
In one embodiment, as shown in fig. 2, a knowledge distillation method based on a structured instance graph is provided, comprising the following steps:
step S201, acquiring a training image;
the training image is an image used for model construction, the image includes category labels of object categories, positions, ranges and other elements which are marked manually or by a machine, and one image includes an image surrounded by a bounding box with different colors, such as a character image surrounded by a red bounding box and a puppy image surrounded by a yellow bounding box.
Step S202, inputting a training image into a backbone network of a teacher network to obtain a first feature map, and inputting the training image into the backbone network of a student network to obtain a second feature map;
the backbone network refers to a front-end network used for extracting image features in a teacher network or a student network.
Specifically, the training image is input into the teacher network and the student network respectively. The backbone networks of the teacher network and the student network are both composed of convolution kernels, where each convolution kernel represents one feature extraction mode, i.e. one convolution kernel extracts features of only one form. Feature extraction is performed on the training image through the backbone networks of the teacher network and the student network to obtain the first feature map and the second feature map respectively; the feature maps contain different features such as color, shape and texture.
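As a hedged illustration of this step, the following minimal NumPy sketch treats each backbone as a single convolution layer; the patent's actual backbones are full deep networks, and the helper name `conv2d`, the toy shapes and the random weights are all assumptions for illustration. The wider teacher kernel yields a first feature map with more channels than the student's second feature map:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d(x, w):
    """Valid 2-D convolution of x (H, W, C_in) with w (k, k, C_in, C_out),
    followed by a ReLU, standing in for one backbone stage."""
    k = w.shape[0]
    h, wd = x.shape[0] - k + 1, x.shape[1] - k + 1
    out = np.zeros((h, wd, w.shape[3]))
    for i in range(h):
        for j in range(wd):
            # each kernel extracts one form of feature from this patch
            out[i, j] = np.tensordot(x[i:i + k, j:j + k, :], w, axes=3)
    return np.maximum(out, 0.0)

image = rng.standard_normal((16, 16, 3))       # toy training image
w_teacher = rng.standard_normal((3, 3, 3, 8))  # wider teacher backbone
w_student = rng.standard_normal((3, 3, 3, 4))  # narrower student backbone

first_feature_map = conv2d(image, w_teacher)
second_feature_map = conv2d(image, w_student)
print(first_feature_map.shape, second_feature_map.shape)  # (14, 14, 8) (14, 14, 4)
```

The channel gap (8 vs 4) is what makes the student cheaper to run and is the reason distillation is needed to transfer the teacher's richer features.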
Step S203, inputting the first feature map and the second feature map into a region proposal network respectively to obtain a first diagram to be detected containing bounding boxes and a second diagram to be detected containing bounding boxes;
As shown in fig. 1, the model further includes a Region Proposal Network (RPN). The RPN is configured to output a diagram to be detected containing bounding boxes; the bounding boxes are mainly used to distinguish foreground from background in the first feature map and the second feature map, framing the foreground and background respectively so that they can be input to the next layer for further detection and analysis. In this step, the first feature map and the second feature map are processed by the RPN to obtain, respectively, a first diagram to be detected containing bounding boxes and a second diagram to be detected containing bounding boxes.
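The RPN itself is a learned network; as a rough, assumed sketch of what it does, the toy `propose_boxes` helper below scores every feature-map cell for objectness and keeps the top-scoring cells as unit bounding boxes. Real RPNs regress anchor offsets and apply non-maximum suppression, both omitted here:

```python
import numpy as np

rng = np.random.default_rng(1)

def propose_boxes(feature_map, w_obj, top_k=5):
    """Toy RPN head: one objectness score per feature-map cell; the
    top_k cells become unit bounding boxes (x1, y1, x2, y2)."""
    h, w, c = feature_map.shape
    scores = feature_map.reshape(-1, c) @ w_obj
    order = np.argsort(scores)[::-1][:top_k]      # highest objectness first
    boxes = np.array([(idx % w, idx // w, idx % w + 1, idx // w + 1)
                      for idx in order])
    return boxes, scores[order]

feature_map = rng.standard_normal((14, 14, 8))
w_obj = rng.standard_normal(8)
boxes, scores = propose_boxes(feature_map, w_obj)
print(boxes.shape)  # (5, 4)
```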
Step S204, encoding based on the object instances in the bounding boxes of the first diagram to be detected and the object instances in the bounding boxes of the second diagram to be detected to obtain a first structural diagram and a second structural diagram;
the object instance refers to a specific object which is expected to be identified in the image, such as a red balloon, a blue balloon, a pet dog, a building, a person and the like, different individuals in the same object are distinguished by the instance, for example, a plurality of people exist in the image, and each person is called an object instance.
Specifically, the first diagram to be detected containing bounding boxes and the second diagram to be detected containing bounding boxes obtained in step S203 are first passed to the next layer, the ROI Pooling layer in fig. 1. The ROI Pooling layer pools (i.e. downsamples) the first and second diagrams to be detected and outputs Regions of Interest. The student network and the teacher network share the sampled regions so as to align their distillation objects, allowing the teacher network and the student network to distill the same object instance simultaneously. Further, since this application uses a graph-structure representation, the regions of interest must also be represented with a graph structure, in which object instances in the regions of interest are represented by nodes and edges: a node represents the attribute features of an object instance (such as color, texture and shape), and an edge represents the relationship features between nodes (such as the distance between them). These nodes and edges are all represented by corresponding matrices, so all nodes and edges need to be encoded to obtain the corresponding matrices. Finally, the graph structures over all nodes are obtained: the teacher network yields the corresponding first structural diagram, and the student network yields the corresponding second structural diagram.
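A minimal sketch of the RoI pooling step described above, under the assumption that the region size divides evenly by the output size; the helper name `roi_pool` and the toy shapes are illustrative, not the patent's implementation. Because the teacher and student pool the same shared region, the resulting node vectors stay aligned per instance for distillation:

```python
import numpy as np

rng = np.random.default_rng(2)

def roi_pool(feature_map, box, out_size=2):
    """Max-pool the region of interest inside box (x1, y1, x2, y2) down to
    out_size x out_size cells, then flatten into one node vector.
    Assumes the region's height and width divide evenly by out_size."""
    x1, y1, x2, y2 = box
    region = feature_map[y1:y2, x1:x2, :]
    h, w, c = region.shape
    hs, ws = h // out_size, w // out_size
    pooled = np.zeros((out_size, out_size, c))
    for i in range(out_size):
        for j in range(out_size):
            pooled[i, j] = region[i * hs:(i + 1) * hs,
                                  j * ws:(j + 1) * ws, :].max(axis=(0, 1))
    return pooled.reshape(-1)

# teacher and student pool the SAME shared region, so their
# distillation objects (node vectors) are aligned per instance
teacher_map = rng.standard_normal((14, 14, 8))
student_map = rng.standard_normal((14, 14, 8))
box = (2, 2, 6, 6)
teacher_node = roi_pool(teacher_map, box)
student_node = roi_pool(student_map, box)
print(teacher_node.shape, student_node.shape)  # (32,) (32,)
```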
Optionally, the encoding process may be self-encoding using a corresponding model, or may be represented by manual encoding.
The relationship features include node correlation, which is measured using cosine similarity between node vectors.
Step S205, based on the first structural diagram and the second structural diagram, combining a preset distillation loss function to obtain a distillation loss part;
The preset distillation loss function can be regarded as the difference loss between the first structural diagram of the teacher network and the second structural diagram of the student network, comprising the graph node loss $L_V$ and the graph edge loss $L_E$. The present application computes these losses with the Euclidean distance function, i.e. the distance between corresponding object instances in the teacher network and the student network:

$$L_G = \lambda_1 L_V^{fg} + \lambda_2 L_V^{bg} + \lambda_3 L_E$$

where $L_G$ denotes the distillation loss function; $L_V^{fg}$ denotes the distillation loss of the foreground nodes; $L_V^{bg}$ denotes the distillation loss of the background nodes; $L_E$ denotes the edge loss; and $\lambda_1$, $\lambda_2$ and $\lambda_3$ are the distillation loss weights of the foreground nodes, the background nodes and the edges, respectively. The distillation loss part is calculated using the above distillation loss function.
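The distillation loss part can then be sketched directly from these definitions; the helper below is an assumed NumPy rendering (mean squared Euclidean distance per term), not code from the patent:

```python
import numpy as np

def distillation_loss(t_fg, s_fg, t_bg, s_bg, t_edges, s_edges,
                      lam1=1.0, lam2=1.0, lam3=1.0):
    """L_G = lam1*L_V_fg + lam2*L_V_bg + lam3*L_E: each term is a mean
    squared Euclidean distance between teacher and student graphs."""
    l_fg = np.mean(np.sum((np.asarray(t_fg) - np.asarray(s_fg)) ** 2, axis=1))
    l_bg = np.mean(np.sum((np.asarray(t_bg) - np.asarray(s_bg)) ** 2, axis=1))
    l_e = np.mean((np.asarray(t_edges) - np.asarray(s_edges)) ** 2)
    return lam1 * l_fg + lam2 * l_bg + lam3 * l_e

loss = distillation_loss(
    t_fg=[[1.0, 0.0]], s_fg=[[0.0, 0.0]],   # foreground node gap -> 1.0
    t_bg=[[0.0, 0.0]], s_bg=[[0.0, 0.0]],   # background nodes match -> 0.0
    t_edges=[[1.0]],   s_edges=[[1.0]])     # edges match -> 0.0
print(loss)  # 1.0
```

Raising lam1 relative to lam2 and lam3 makes the student imitate the teacher's foreground nodes more strongly, which is the knob the weighted formulation provides.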
Step S206, obtaining a base loss part according to the distance between the detection result of the second diagram to be detected and the real label.
Here, the real label refers to the label that has already been annotated by a machine or a human.
Specifically, the original classification task also has a loss function, determined by the difference between the prediction result of the student network and the real label; this loss function may be, for example, a softmax loss function (softmax loss) or a bounding-box regression loss function (bbox regression loss).
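For illustration, a hedged NumPy sketch of such a base loss, combining a softmax classification loss with a smooth-L1 bounding-box regression loss; the exact loss functions and their reduction are assumptions here, since detectors vary on both:

```python
import numpy as np

def softmax_ce(logits, label):
    """Softmax classification loss against the ground-truth class label."""
    z = logits - logits.max()
    return float(np.log(np.exp(z).sum()) - z[label])

def smooth_l1(pred_box, gt_box):
    """Smooth-L1 bounding-box regression loss against the real label."""
    d = np.abs(np.asarray(pred_box, float) - np.asarray(gt_box, float))
    return float(np.where(d < 1.0, 0.5 * d ** 2, d - 0.5).sum())

base_loss = softmax_ce(np.array([2.0, 0.5, 0.1]), label=0) \
          + smooth_l1([10, 10, 20, 20], [10.5, 10, 20, 20])
print(round(base_loss, 4))
```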
And step S207, training the student network based on the distillation loss part and the basic loss part.
Specifically, the sum of the distillation loss part and the base loss part is used as the global loss function. During training, the network parameters of the student network are continuously adjusted so that the global loss function value meets a preset condition; the preset condition may be that the global loss function value reaches a minimum, or a minimum within a certain range.
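The training loop then reduces to: compute the global loss, adjust the student parameters, and stop once the preset condition is met. The sketch below shows this on a single scalar parameter, with the squared gap to the teacher standing in for the full distillation-plus-base loss (an illustrative assumption, not the patent's optimizer):

```python
def train_student(student_param, teacher_param, lr=0.1, tol=1e-4, max_steps=1000):
    """Gradient descent on a stand-in global loss (squared gap to the
    teacher) until it meets the preset condition (loss below tol)."""
    loss = (student_param - teacher_param) ** 2
    for _ in range(max_steps):
        if loss < tol:
            break
        student_param -= lr * 2.0 * (student_param - teacher_param)
        loss = (student_param - teacher_param) ** 2
    return student_param, loss

param, final_loss = train_student(0.0, 1.0)
print(final_loss < 1e-4)  # True
```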
In this embodiment, a training image is acquired and input into the backbone network of the teacher network and the backbone network of the student network respectively to obtain the first feature map and the second feature map; the first feature map and the second feature map are input into the region proposal network respectively to obtain a first diagram to be detected containing bounding boxes and a second diagram to be detected containing bounding boxes; encoding is carried out based on the object instances in each bounding box to obtain a first structural diagram and a second structural diagram; a distillation loss part is obtained based on the first structural diagram and the second structural diagram in combination with a preset distillation loss function; a base loss part is obtained according to the distance between the detection result of the second diagram to be detected and the real label; and the student network is trained based on the distillation loss part and the base loss part. The method uses an instance model with a graph structure and fuses the relation between foreground and background features in the image through the preset distillation loss function, so that the foreground and background features in the teacher network can be effectively transferred to the student network, the student network can extract more effective knowledge from the teacher network, and the accuracy of target detection is further improved.
In one embodiment, the object instances are characterized using nodes and corresponding edges; the nodes include foreground nodes and background nodes. Step S204 includes: calculating classification loss values of the background nodes using a preset classification loss function; and after removing the background nodes whose classification loss values are smaller than a preset threshold, encoding the remaining background nodes together with the foreground nodes to obtain the first structural diagram and the second structural diagram.
Specifically, in the present application object instances are characterized using nodes and edges, where the nodes include foreground nodes and background nodes. The application establishes nodes from the features obtained by RoI pooling, classifying the nodes based on IoU (Intersection over Union, i.e. the intersection area of two rectangular boxes divided by their union area). The node set in each graph G is represented as

$$V = \{v_i^{fg}\}_{i=1}^{n} \cup \{v_j^{bg}\}_{j=1}^{m}$$

where $v_i^{fg}$ denotes the feature vector of the $i$-th foreground instance and $v_j^{bg}$ denotes the feature vector of the $j$-th background instance. In this node set, $n$ and $m$ denote the numbers of foreground and background object instances, respectively, in each graph.
The set of edges in each graph G is denoted $E = [e_{ij}]_{k \times k}$, where $k$ denotes the size of the node set and $e_{ij}$ represents the correlation of the $i$-th node and the $j$-th node in feature space:

$$e_{ij} := \mathrm{sim\_function}(v_i, v_j)$$

Here $\mathrm{sim\_function}(v_i, v_j)$ is a correlation function between the $i$-th node $v_i$ and the $j$-th node $v_j$. In the present application, cosine similarity may be adopted as the measure of correlation between two nodes, namely:

$$e_{ij} = \frac{v_i \cdot v_j}{\lVert v_i \rVert \, \lVert v_j \rVert}$$

where $v_i$ is the node vector of the $i$-th node and $v_j$ is the node vector of the $j$-th node. Cosine similarity is not affected by the magnitudes of the two node vectors. We assume that there is an edge between every two nodes in graph G, so graph G is a complete graph. In practice, because the correlation metric is a symmetric function, $e_{ij} = e_{ji}$ for every $i$ and $j$; that is, the edge from node i to node j is equal to the edge from node j to node i.
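A small NumPy sketch of this complete-graph edge set; normalizing the node vectors first makes the matrix product equal to pairwise cosine similarity, and the symmetry e_ij = e_ji then follows automatically:

```python
import numpy as np

def edge_matrix(nodes):
    """Complete-graph edge set E = [e_ij]_(k x k): cosine similarity of
    every node pair. Symmetric because cosine similarity is symmetric."""
    v = np.asarray(nodes, dtype=float)
    unit = v / np.linalg.norm(v, axis=1, keepdims=True)
    return unit @ unit.T

E = edge_matrix([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
print(np.allclose(E, E.T))  # True
```

Orthogonal nodes get edge weight 0 while the diagonal is always 1, matching the scale-invariance property stated above.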
For the complete graph above, distilling the entire non-sparse edge similarity matrix may adversely affect training, because the large number of background nodes generates many useless edges, which contribute many unproductive loss terms during distillation and hinder model convergence. If instead all edges related to the background are pruned, leaving an edge set containing only foreground nodes, a large amount of information is lost. A background sample mining method is therefore designed to mine the background nodes that meet our expectations and construct a graph with informational value.
The background sample mining method in this application selects the nodes added to the graph based on the RoI classification loss: a preset classification loss function is used to calculate the classification loss values of the background nodes, and after the background nodes whose classification loss values are smaller than a preset threshold are removed, the remaining background nodes are added to the graph under construction. These remaining background nodes can be regarded as nodes that are easily misclassified as foreground, so they are encoded together with the foreground nodes. This background sample mining method is adopted for graph construction in both the teacher network and the student network, finally yielding the first structural diagram and the second structural diagram.
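A hedged sketch of the background sample mining rule as described: background nodes whose classification loss falls below the preset threshold are removed, and only the easily-misclassified remainder joins the graph (the helper name and the toy loss values are assumptions):

```python
import numpy as np

def mine_background_nodes(bg_nodes, bg_cls_losses, threshold):
    """Keep only background nodes whose classification loss is at least
    the preset threshold, i.e. the ones easily misclassified as
    foreground; drop the rest before building the graph."""
    keep = np.asarray(bg_cls_losses) >= threshold
    return [node for node, k in zip(bg_nodes, keep) if k]

bg_nodes = ["bg0", "bg1", "bg2", "bg3"]
losses = [0.05, 0.90, 0.40, 0.02]
kept = mine_background_nodes(bg_nodes, losses, threshold=0.3)
print(kept)  # ['bg1', 'bg2']
```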
In this embodiment, by adopting the background sample mining method, a portion of useless background nodes is removed while the nodes closely related to the foreground are kept, which simplifies the model, reduces the amount of computation, and ensures that effective knowledge is transferred from the teacher network to the student network without omission.
In an embodiment, the object instances are characterized by nodes and corresponding edges, the distillation loss part includes a foreground node loss part, a background node loss part and an edge loss part, and the distillation loss function is:
$$L_G = \lambda_1 L_V^{fg} + \lambda_2 L_V^{bg} + \lambda_3 L_E$$

where

$$L_V^{fg} = \frac{1}{N_{fg}} \sum_{i=1}^{N_{fg}} \left\| v_i^t - v_i^s \right\|_2^2$$

$$L_V^{bg} = \frac{1}{N_{bg}} \sum_{i=1}^{N_{bg}} \left\| v_i^t - v_i^s \right\|_2^2$$

$$L_E = \frac{1}{n} \sum_{i,j} \left( e_{ij}^t - e_{ij}^s \right)^2$$
wherein L_G represents the distillation loss function, which can be viewed as the difference loss between the object instances of the teacher network and the object instances of the student network, comprising the graph node loss L_V and the graph edge loss L_E; in the present application, both losses are computed using the Euclidean distance function. In the above formulas, L_V^{fg} represents the distillation loss of the foreground nodes; L_V^{bg} represents the distillation loss of the background nodes; L_E represents the edge loss; λ_1 is the distillation loss weight of the foreground nodes, λ_2 is the distillation loss weight of the background nodes, and λ_3 is the distillation loss weight of the edges; N_{fg} represents the total number of foreground nodes, with v_i^t and v_i^s representing the node vectors of the i-th foreground node in the teacher network and the student network, respectively; N_{bg} represents the total number of background nodes, with v_i^t and v_i^s likewise representing the node vectors of the i-th background node in the two networks; and n represents the total number of edges, with e_{ij}^t representing the correlation in feature space between the i-th node and the j-th node in the teacher network, and e_{ij}^s representing the corresponding correlation in the student network.
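Under the definitions above, the distillation loss L_G can be sketched as follows (NumPy; the function signature and default weights are illustrative assumptions, not a definitive implementation):

```python
import numpy as np

def graph_distillation_loss(v_fg_t, v_fg_s, v_bg_t, v_bg_s, e_t, e_s,
                            lam1=0.5, lam2=0.5, lam3=0.5):
    """L_G = lam1 * L_V_fg + lam2 * L_V_bg + lam3 * L_E (Euclidean distances).

    v_fg_* : (N_fg, d) foreground node vectors (teacher / student)
    v_bg_* : (N_bg, d) background node vectors (teacher / student)
    e_*    : (N, N) edge (correlation) matrices (teacher / student)
    """
    l_fg = np.mean(np.sum((v_fg_t - v_fg_s) ** 2, axis=1))  # foreground node loss
    l_bg = np.mean(np.sum((v_bg_t - v_bg_s) ** 2, axis=1))  # background node loss
    l_e = np.mean((e_t - e_s) ** 2)                         # edge loss over all n edges
    return lam1 * l_fg + lam2 * l_bg + lam3 * l_e

rng = np.random.default_rng(0)
vt = rng.normal(size=(4, 8)); vb = rng.normal(size=(3, 8))
et = rng.normal(size=(7, 7))
assert graph_distillation_loss(vt, vt, vb, vb, et, et) == 0.0  # identical graphs
assert graph_distillation_loss(vt, vt + 0.1, vb, vb, et, et) > 0.0
```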
Based on cross validation, λ_1 and λ_3 are both set to 0.5, and λ_2 is designed as an adaptive loss weight scaled by a hyperparameter α, where α is chosen so that the background node loss stays on the same order of magnitude as the other losses.
The above embodiment adjusts the parameters of the student network by accounting for both the graph node loss and the graph edge loss in the distillation loss function. The graph node loss is the loss generated over the node set; it performs pixel-level matching between the object instance features of the student network and those of the teacher network. Generally speaking, directly matching the feature maps of the two networks is a simple and direct distillation method. In a detection model, however, not all pixels in the feature map subsequently contribute to the classification and bounding-box regression losses. Instead of using the global feature map, we therefore sample the foreground and background features of the image to compute the graph node loss, which makes the student model focus on the RoIs and on valuable knowledge. The graph edge loss is the loss generated over the edge set; it aligns the correlations produced by the nodes of the student network with those produced by the nodes of the teacher network. Distillation at the pixel level alone cannot fully exploit the potential of knowledge migration from the teacher network to the student network, because the correlations of high-order semantics are not well distilled during training; the designed edge loss function directly promotes the optimization of these pairwise correlations. Therefore, to align the topological relationship between the student and the teacher, the edge loss must be designed so as to capture the global structural information in the detection network model.
In one embodiment, the base loss section includes a detection loss function and a KL divergence function.
Specifically, the global loss function of the student network includes a base loss part and a distillation loss part, and the global loss function is expressed as follows:
$$L = L_{Det} + L_G + L_{logits}$$

wherein L_{Det} is the detection loss function; commonly used detection loss functions include L1, L2 and Smooth L1, which can be selected and adjusted according to actual needs; L_G is the distillation loss part described above; and L_{logits} is the KL-divergence (KLD) loss function applied to the classification and regression outputs.
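The L_logits term is a KL divergence on the class distributions; a minimal sketch follows (NumPy, with an assumed temperature-softened softmax, which is common in logit distillation but not spelled out in the patent):

```python
import numpy as np

def kl_logits_loss(teacher_logits, student_logits, temperature=1.0):
    """KL(p_teacher || p_student) on temperature-softened class distributions."""
    def softmax(z):
        z = np.asarray(z, dtype=float) / temperature
        z = z - z.max(axis=-1, keepdims=True)  # numerical stability
        e = np.exp(z)
        return e / e.sum(axis=-1, keepdims=True)

    p = softmax(teacher_logits)  # teacher distribution (target)
    q = softmax(student_logits)  # student distribution
    return float(np.mean(np.sum(p * np.log(p / q), axis=-1)))

# KL divergence is zero iff the two distributions match.
assert abs(kl_logits_loss([[2.0, 1.0]], [[2.0, 1.0]])) < 1e-12
assert kl_logits_loss([[2.0, 1.0]], [[1.0, 2.0]]) > 0.0
```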
In this embodiment, the basic loss function and the distillation loss function together serve as the global loss function for training the student network, which further optimizes the performance of the student network and allows it to learn more comprehensive knowledge from the teacher network.
In an embodiment, a target detection method is further provided, as shown in fig. 3, including the following steps:
step S301, acquiring an image including an object to be detected;
Specifically, given any image to be detected, the image contains objects to be detected, such as a specific person and that person's location.
Step S302, inputting the image to the student network so that the student network outputs a target image containing a prediction frame; the prediction box is used for identifying the class label and the position of the object to be detected. The student network is obtained by training through the steps in the method embodiments.
Specifically, a student network usable for target detection is obtained by training with the above knowledge distillation method. The image to be detected is input into the trained student network, which outputs a target image containing prediction boxes. A prediction box marks the category and position of a desired target region; for example, a desired person and the person's position in the image are marked with a box, where object A is marked with a red box, object B with a green box, and so on.
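The inference step can be sketched as follows (the student-model interface, names, and score threshold are hypothetical; the patent does not specify them):

```python
def detect(student_model, image, score_threshold=0.5):
    """Run the trained student network and keep confident predictions.

    `student_model` is assumed to return a list of (label, score, box)
    tuples for the input image; this interface is hypothetical.
    Returns the (label, box) pairs whose score clears the threshold.
    """
    predictions = student_model(image)
    return [(label, box) for (label, score, box) in predictions
            if score >= score_threshold]

# Stub standing in for the trained student network.
fake_student = lambda img: [("person", 0.92, (10, 20, 50, 80)),
                            ("person", 0.30, (5, 5, 15, 15))]
assert detect(fake_student, image=None) == [("person", (10, 20, 50, 80))]
```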
In this embodiment, the student network is obtained by training with the methods in the above knowledge distillation method embodiments. Such a student network extracts deeper knowledge from the teacher network and can be used to detect image targets and output their positions and classes, further improving the performance of the student network.
It should be understood that although the various steps in the flow charts of figs. 1-3 are shown sequentially as indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated otherwise herein, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in figs. 1-3 may include multiple sub-steps or stages, which are not necessarily completed at the same time but may be performed at different times, and which are not necessarily performed in sequence but may be performed in turn or alternately with other steps or with sub-steps of other steps.
In one embodiment, as shown in fig. 4, there is provided a knowledge distillation apparatus 400 based on a structured instance graph, comprising: an image acquisition module 401, a feature map acquisition module 402, a to-be-detected map acquisition module 403, a structure diagram output module 404, a distillation loss acquisition module 405, a basic loss acquisition module 406 and a student network training module 407, wherein:
an image acquisition module 401, configured to acquire a training image;
a feature map obtaining module 402, configured to input the training image to a backbone network of a teacher network to obtain a first feature map, and input the training image to a backbone network of a student network to obtain a second feature map;
a to-be-detected map acquisition module 403, configured to input the first feature map and the second feature map into a region candidate network, respectively, to obtain a first to-be-detected map containing a bounding box and a second to-be-detected map containing a bounding box;
a structure diagram output module 404, configured to perform encoding based on an object instance in a boundary frame of the first to-be-detected map and an object instance in a boundary frame of the second to-be-detected map to obtain a first structure diagram and a second structure diagram;
a distillation loss obtaining module 405, configured to obtain a distillation loss part based on the first structural diagram and the second structural diagram by combining a preset distillation loss function;
a base loss obtaining module 406, configured to obtain a base loss portion according to a distance between a detection result of the second to-be-detected map and the real label;
a student network training module 407, configured to train the student network based on the distillation loss part and the base loss part.
In one embodiment, the object instances are characterized by using nodes and corresponding edges; the nodes comprise foreground nodes and background nodes; the structure diagram output module 404 is further configured to calculate a classification loss value of the background node by using a preset classification loss function; and after removing the background nodes with the classification loss value smaller than a preset threshold value, coding the rest background nodes and the foreground nodes to obtain the first structural diagram and the second structural diagram.
In an embodiment, the student network training module 407 is further configured to take the sum of the distillation loss fraction and the base loss fraction as the global loss of the student network; and adjusting network parameters of the student network based on the global loss until the global loss meets a preset condition.
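The training loop implied by this module can be sketched as follows (the step function and tolerance are hypothetical stand-ins for one optimizer update and for the preset condition on the global loss):

```python
def train_student(step_fn, max_iters=1000, tol=1e-3):
    """Adjust student parameters until the global loss meets a preset condition.

    `step_fn` is a hypothetical closure that performs one update of the
    student network and returns the global loss L = L_distill + L_base.
    Training stops once the loss falls below `tol` or `max_iters` is hit.
    """
    loss = float("inf")
    for _ in range(max_iters):
        loss = step_fn()
        if loss < tol:  # preset condition: global loss small enough
            break
    return loss

# Stub standing in for one gradient step; the loss decays geometrically.
losses = iter([0.5 * (0.5 ** k) for k in range(20)])
final = train_student(lambda: next(losses), tol=1e-3)
assert final < 1e-3
```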
In one embodiment, the object instances are characterized using nodes and corresponding edges; the distillation loss part comprises a foreground node loss part, a background node loss part and an edge loss part; the distillation loss function is:
$$L_G = \lambda_1 L_V^{fg} + \lambda_2 L_V^{bg} + \lambda_3 L_E$$

where

$$L_V^{fg} = \frac{1}{N_{fg}} \sum_{i=1}^{N_{fg}} \left\| v_i^t - v_i^s \right\|_2^2$$

$$L_V^{bg} = \frac{1}{N_{bg}} \sum_{i=1}^{N_{bg}} \left\| v_i^t - v_i^s \right\|_2^2$$

$$L_E = \frac{1}{n} \sum_{i,j} \left( e_{ij}^t - e_{ij}^s \right)^2$$

wherein L_G represents the distillation loss function; L_V^{fg} represents the distillation loss of the foreground nodes; L_V^{bg} represents the distillation loss of the background nodes; L_E represents the edge loss; λ_1 is the distillation loss weight of the foreground nodes, λ_2 is the distillation loss weight of the background nodes, and λ_3 is the distillation loss weight of the edges; N_{fg} represents the total number of foreground nodes, with v_i^t and v_i^s representing the node vectors of the i-th foreground node in the teacher network and the student network, respectively; N_{bg} represents the total number of background nodes, with v_i^t and v_i^s likewise representing the node vectors of the i-th background node in the two networks; and n represents the total number of edges, with e_{ij}^t representing the correlation in feature space between the i-th node and the j-th node in the teacher network, and e_{ij}^s representing the corresponding correlation in the student network.
In one embodiment, the base loss section includes a detection loss function and a KL divergence function.
In one embodiment, as shown in fig. 5, there is also provided an object detection apparatus 500, comprising: an image acquisition module 501 and a target image output module 502, wherein:
an image obtaining module 501, configured to obtain an image including an object to be detected;
a target image output module 502, configured to use the student network for image instance detection trained in the above embodiments of the knowledge distillation method based on the structured instance graph, and to input the image into the student network so that the student network outputs a target image containing a prediction box; the prediction box is used to identify the class label and position of the object to be detected.
For specific limitations of the knowledge distillation apparatus and the target detection apparatus based on the structured example diagram, reference may be made to the above limitations of the knowledge distillation method and the target detection method based on the structured example diagram, which are not described herein again. The various modules in the knowledge distillation apparatus and the target detection apparatus based on the structured example diagram described above may be implemented in whole or in part by software, hardware, and combinations thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a server, and its internal structure diagram may be as shown in fig. 6. The computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is used to store image characteristic data. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a knowledge distillation method or an object detection method based on a structured example graph.
In one embodiment, a computer device is provided, which may be a terminal, and its internal structure diagram may be as shown in fig. 7. The computer device includes a processor, a memory, a communication interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The communication interface of the computer device is used for carrying out wired or wireless communication with an external terminal, and the wireless communication can be realized through WIFI, an operator network, NFC (near field communication) or other technologies. The computer program is executed by a processor to implement a knowledge distillation method or an object detection method based on a structured example graph. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.
It will be appreciated by those skilled in the art that the configurations shown in figs. 6-7 are merely block diagrams of portions of the configurations relevant to the present disclosure and do not constitute a limitation on the computer devices to which the present disclosure may be applied; a particular computer device may include more or fewer components than shown in the figures, combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory and a processor, the memory storing a computer program, and the processor, when executing the computer program, implementing the steps in the above embodiments of the knowledge distillation method based on the structured instance graph and of the target detection method.
In one embodiment, a computer readable storage medium is provided, having stored thereon a computer program that, when executed by a processor, performs the steps in the structured instance graph-based knowledge distillation method embodiment and the object detection method embodiment as described above.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware; the computer program can be stored in a non-volatile computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database or other medium used in the embodiments provided herein can include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical storage, or the like. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM can take many forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM), among others.
The technical features of the above embodiments can be combined arbitrarily. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as there is no contradiction in a combination of these technical features, it should be considered within the scope of this specification.
The above-mentioned embodiments express only several embodiments of the present application, and their description is relatively specific and detailed, but should not be construed as limiting the scope of the invention patent. It should be noted that a person skilled in the art can make several variations and improvements without departing from the concept of the present application, all of which fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (10)

1. A knowledge distillation method based on a structured example graph, the method comprising:
acquiring a training image;
inputting the training image into a backbone network of a teacher network to obtain a first feature map, and inputting the training image into a backbone network of a student network to obtain a second feature map;
inputting the first feature map and the second feature map into a regional candidate network respectively to obtain a first to-be-detected map containing a boundary frame and a second to-be-detected map containing a boundary frame;
coding is carried out on the basis of an object instance in the boundary frame of the first diagram to be detected and an object instance in the boundary frame of the second diagram to be detected, and a first structural diagram and a second structural diagram are obtained;
obtaining a distillation loss part by combining a preset distillation loss function based on the first structural diagram and the second structural diagram;
obtaining a basic loss part according to the distance between the detection result of the second image to be detected and the real label;
training the student network based on the distillation loss fraction and the base loss fraction.
2. The method of claim 1, wherein the object instances are characterized using nodes and corresponding edges; the nodes comprise foreground nodes and background nodes; the encoding is carried out based on the object instance in the boundary frame of the first diagram to be detected and the object instance in the boundary frame of the second diagram to be detected, so as to obtain a first structural diagram and a second structural diagram, and the method comprises the following steps:
calculating a classification loss value of the background node by using a preset classification loss function;
and after removing the background nodes with the classification loss value smaller than a preset threshold value, coding the rest background nodes and the foreground nodes to obtain the first structural diagram and the second structural diagram.
3. The method of claim 1, wherein training the student network based on the distillation loss fraction and the base loss fraction comprises:
taking the sum of the distillation loss fraction and the base loss fraction as the global loss of the student network;
and adjusting network parameters of the student network based on the global loss until the global loss meets a preset condition.
4. The method of claim 3, wherein the object instances are characterized using nodes and corresponding edges; the distillation loss part comprises a foreground node loss part, a background node loss part and an edge loss part; the distillation loss function is:
$$L_G = \lambda_1 L_V^{fg} + \lambda_2 L_V^{bg} + \lambda_3 L_E$$

where

$$L_V^{fg} = \frac{1}{N_{fg}} \sum_{i=1}^{N_{fg}} \left\| v_i^t - v_i^s \right\|_2^2$$

$$L_V^{bg} = \frac{1}{N_{bg}} \sum_{i=1}^{N_{bg}} \left\| v_i^t - v_i^s \right\|_2^2$$

$$L_E = \frac{1}{n} \sum_{i,j} \left( e_{ij}^t - e_{ij}^s \right)^2$$

wherein L_G represents the distillation loss function; L_V^{fg} represents the distillation loss of the foreground nodes; L_V^{bg} represents the distillation loss of the background nodes; L_E represents the edge loss; λ_1 is the distillation loss weight of the foreground nodes, λ_2 is the distillation loss weight of the background nodes, and λ_3 is the distillation loss weight of the edges; N_{fg} represents the total number of foreground nodes, with v_i^t and v_i^s representing the node vectors of the i-th foreground node in the teacher network and the student network, respectively; N_{bg} represents the total number of background nodes, with v_i^t and v_i^s likewise representing the node vectors of the i-th background node in the two networks; and n represents the total number of edges, with e_{ij}^t representing the correlation in feature space between the i-th node and the j-th node in the teacher network, and e_{ij}^s representing the corresponding correlation in the student network.
5. The method according to any of claims 1 to 4, wherein the base loss fraction comprises a detection loss function and a KL divergence function.
6. A method of object detection, the method comprising:
training the student network for image instance detection using the method of any one of claims 1 to 5;
acquiring an image including an object to be detected;
inputting the image to the student network to cause the student network to output a target image containing a prediction box; the prediction box is used for identifying the class label and the position of the object to be detected.
7. A knowledge distillation apparatus based on a structured example graph, the apparatus comprising:
the image acquisition module is used for acquiring a training image;
the characteristic diagram acquisition module is used for inputting the training image into a backbone network of a teacher network to obtain a first characteristic diagram and inputting the training image into a backbone network of a student network to obtain a second characteristic diagram;
the to-be-detected map acquisition module is used for inputting the first feature map and the second feature map into a region candidate network respectively to obtain a first to-be-detected map containing a bounding box and a second to-be-detected map containing a bounding box;
the structure diagram output module is used for coding based on an object example in a boundary frame of the first diagram to be detected and an object example in a boundary frame of the second diagram to be detected to obtain a first structure diagram and a second structure diagram;
a distillation loss acquisition module, configured to obtain a distillation loss part by combining a preset distillation loss function based on the first structural diagram and the second structural diagram;
a basic loss acquisition module for obtaining a basic loss part according to the distance between the detection result of the second to-be-detected image and the real label;
and the student network training module is used for training the student network based on the distillation loss part and the basic loss part.
8. An object detection apparatus, characterized in that the apparatus comprises:
the image acquisition module is used for acquiring an image comprising an object to be detected;
a target image output module, configured to utilize the student network for image instance detection trained by the method according to any one of claims 1 to 5, input the image to the student network, so that the student network outputs a target image including a prediction box; the prediction box is used for identifying the class label and the position of the object to be detected.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any of claims 1 to 6.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 6.
CN202110551061.7A 2021-05-20 2021-05-20 Knowledge distillation method, device, equipment and medium based on structured example graph Active CN113255915B8 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110551061.7A CN113255915B8 (en) 2021-05-20 2021-05-20 Knowledge distillation method, device, equipment and medium based on structured example graph

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110551061.7A CN113255915B8 (en) 2021-05-20 2021-05-20 Knowledge distillation method, device, equipment and medium based on structured example graph

Publications (3)

Publication Number Publication Date
CN113255915A true CN113255915A (en) 2021-08-13
CN113255915B CN113255915B (en) 2022-11-18
CN113255915B8 CN113255915B8 (en) 2024-02-06

Family

ID=77182967

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110551061.7A Active CN113255915B8 (en) 2021-05-20 2021-05-20 Knowledge distillation method, device, equipment and medium based on structured example graph

Country Status (1)

Country Link
CN (1) CN113255915B8 (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113887610A (en) * 2021-09-29 2022-01-04 内蒙古工业大学 Pollen image classification method based on cross attention distillation transducer
CN113920307A (en) * 2021-09-29 2022-01-11 北京百度网讯科技有限公司 Model training method, device, equipment, storage medium and image detection method
CN114359649A (en) * 2021-11-22 2022-04-15 腾讯科技(深圳)有限公司 Image processing method, apparatus, device, storage medium, and program product
CN114842449A (en) * 2022-05-10 2022-08-02 安徽蔚来智驾科技有限公司 Target detection method, electronic device, medium, and vehicle
CN114882243A (en) * 2022-07-11 2022-08-09 浙江大华技术股份有限公司 Target detection method, electronic device, and computer-readable storage medium
CN114970862A (en) * 2022-04-28 2022-08-30 北京航空航天大学 PDL1 expression level prediction method based on multi-instance knowledge distillation model
CN115019060A (en) * 2022-07-12 2022-09-06 北京百度网讯科技有限公司 Target recognition method, and training method and device of target recognition model
CN116664840A (en) * 2023-05-31 2023-08-29 博衍科技(珠海)有限公司 Semantic segmentation method, device and equipment based on mutual relationship knowledge distillation

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180268292A1 (en) * 2017-03-17 2018-09-20 Nec Laboratories America, Inc. Learning efficient object detection models with knowledge distillation
CN110163344A (en) * 2019-04-26 2019-08-23 北京迈格威科技有限公司 Neural network training method, device, equipment and storage medium
CN110580487A (en) * 2018-06-08 2019-12-17 Oppo广东移动通信有限公司 Neural network training method, neural network construction method, image processing method and device
CN111462162A (en) * 2019-01-18 2020-07-28 上海大学 Foreground segmentation algorithm for specific class of pictures
CN111507227A (en) * 2020-04-10 2020-08-07 南京汉韬科技有限公司 Multi-student individual segmentation and state autonomous identification method based on deep learning
CN111709497A (en) * 2020-08-20 2020-09-25 腾讯科技(深圳)有限公司 Information processing method and device and computer readable storage medium
CN111898735A (en) * 2020-07-14 2020-11-06 上海眼控科技股份有限公司 Distillation learning method, distillation learning device, computer equipment and storage medium
CN111914727A (en) * 2020-07-28 2020-11-10 联芯智能(南京)科技有限公司 Small target human body detection method based on balance sampling and nonlinear feature fusion
CN112164054A (en) * 2020-09-30 2021-01-01 交叉信息核心技术研究院(西安)有限公司 Knowledge distillation-based image target detection method and detector and training method thereof
CN112200318A (en) * 2020-10-10 2021-01-08 广州云从人工智能技术有限公司 Target detection method, device, machine readable medium and equipment
CN112330709A (en) * 2020-10-29 2021-02-05 奥比中光科技集团股份有限公司 Foreground image extraction method and device, readable storage medium and terminal equipment
WO2021090771A1 (en) * 2019-11-08 2021-05-14 Canon Kabushiki Kaisha Method, apparatus and system for training a neural network, and storage medium storing instructions


Non-Patent Citations (1)

Title
Fu Zhihang, et al.: "Foreground gating and background refining network for surveillance object detection", IEEE Transactions on Image Processing *

Cited By (13)

Publication number Priority date Publication date Assignee Title
CN113887610B (en) * 2021-09-29 2024-02-02 Inner Mongolia University of Technology Pollen image classification method based on cross-attention distillation Transformer
CN113920307A (en) * 2021-09-29 2022-01-11 Beijing Baidu Netcom Science and Technology Co., Ltd. Model training method, device, equipment, storage medium and image detection method
CN113887610A (en) * 2021-09-29 2022-01-04 Inner Mongolia University of Technology Pollen image classification method based on cross-attention distillation Transformer
CN114359649A (en) * 2021-11-22 2022-04-15 Tencent Technology (Shenzhen) Co., Ltd. Image processing method, apparatus, device, storage medium, and program product
CN114359649B (en) * 2021-11-22 2024-03-22 Tencent Technology (Shenzhen) Co., Ltd. Image processing method, apparatus, device, storage medium, and program product
CN114970862B (en) * 2022-04-28 2024-05-28 Beihang University PDL1 expression level prediction method based on multi-instance knowledge distillation model
CN114970862A (en) * 2022-04-28 2022-08-30 Beihang University PDL1 expression level prediction method based on multi-instance knowledge distillation model
CN114842449A (en) * 2022-05-10 2022-08-02 Anhui NIO Autonomous Driving Technology Co., Ltd. Target detection method, electronic device, medium, and vehicle
CN114882243B (en) * 2022-07-11 2022-11-22 Zhejiang Dahua Technology Co., Ltd. Target detection method, electronic device, and computer-readable storage medium
CN114882243A (en) * 2022-07-11 2022-08-09 Zhejiang Dahua Technology Co., Ltd. Target detection method, electronic device, and computer-readable storage medium
CN115019060A (en) * 2022-07-12 2022-09-06 Beijing Baidu Netcom Science and Technology Co., Ltd. Target recognition method, and training method and device of target recognition model
CN116664840A (en) * 2023-05-31 2023-08-29 Boyan Technology (Zhuhai) Co., Ltd. Semantic segmentation method, device and equipment based on mutual-relationship knowledge distillation
CN116664840B (en) * 2023-05-31 2024-02-13 Boyan Technology (Zhuhai) Co., Ltd. Semantic segmentation method, device and equipment based on mutual-relationship knowledge distillation

Also Published As

Publication number Publication date
CN113255915B (en) 2022-11-18
CN113255915B8 (en) 2024-02-06

Similar Documents

Publication Publication Date Title
CN113255915B (en) Knowledge distillation method, device, equipment and medium based on structured instance graph
CN112766244B (en) Target object detection method and device, computer equipment and storage medium
CN111126258B (en) Image recognition method and related device
Wang et al. Deep networks for saliency detection via local estimation and global search
EP3757905A1 (en) Deep neural network training method and apparatus
Arietta et al. City forensics: Using visual elements to predict non-visual city attributes
CN109960742B (en) Local information searching method and device
Dong et al. Oil palm plantation mapping from high-resolution remote sensing images using deep learning
CN112101165A (en) Interest point identification method and device, computer equipment and storage medium
CN110633708A (en) Deep network saliency detection method based on global model and local optimization
CN113807399A (en) Neural network training method, neural network detection method and neural network detection device
CN112801236B (en) Image recognition model migration method, device, equipment and storage medium
CN112116599A (en) Sputum smear tubercle bacillus semantic segmentation method and system based on weak supervised learning
CN106156777A (en) Textual image detection method and device
CN110334628B (en) Outdoor monocular image depth estimation method based on structured random forest
CN113240120A (en) Knowledge distillation method and device based on temperature learning mechanism, computer equipment and medium
CN115577768A (en) Semi-supervised model training method and device
Li et al. Coarse-to-fine salient object detection based on deep convolutional neural networks
CN112241736A (en) Text detection method and device
CN111914809B (en) Target object positioning method, image processing method, device and computer equipment
Yang et al. From intuition to reasoning: Analyzing correlative attributes of walkability in urban environments with machine learning
Osuna-Coutiño et al. Structure extraction in urbanized aerial images from a single view using a CNN-based approach
CN115660069A (en) Semi-supervised satellite image semantic segmentation network construction method and device and electronic equipment
Huo et al. Local graph regularized coding for salient object detection
CN115019055A (en) Image matching method and device, intelligent equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Chen Yixin; Chen Pengguang; Shen Xiaoyong; Lv Jiangbo

Inventor before: Chen Yixin; Chen Pengguang; Jia Jiaya; Shen Xiaoyong; Lv Jiangbo

GR01 Patent grant
OR01 Other related matters
CI03 Correction of invention patent

Correction item: Inventor
Correct: Chen Yixin|Chen Pengguang|Shen Xiaoyong|Lv Jiangbo
False: Chen Yixin|Shen Xiaoyong|Lv Jiangbo
Number: 46-02
Page: The title page
Volume: 38

Correction item: Inventor
Correct: Chen Yixin|Chen Pengguang|Shen Xiaoyong|Lv Jiangbo
False: Chen Yixin|Shen Xiaoyong|Lv Jiangbo
Number: 46-02
Volume: 38