CN115147873A - Method, equipment and medium for automatically classifying dental images based on dual-label cascade - Google Patents
- Publication number
- CN115147873A (application CN202211063304.3A)
- Authority
- CN
- China
- Prior art keywords
- photograph
- dental
- label
- dual
- cascade
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/761—Proximity, similarity or dissimilarity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
Abstract
The invention relates to the technical field of image processing, and in particular to a method, equipment and medium for automatically classifying dental images based on a dual-label cascade. The method comprises the following steps: creating a dental image dataset; performing data enhancement on the dataset; designing a lightweight network that detects specific targets in the dataset and divides the images into ten categories, namely frontal photograph, profile photograph, intraoral left and right photographs, intraoral frontal photograph, intraoral upper and lower jaw photographs, lateral cephalometric film, panoramic film, positioning film, joint film and small dental film; performing secondary classification of the intraoral left and right photographs with a dual-label classification network; and performing secondary classification of the frontal photographs and the intraoral upper and lower jaw photographs with target detection, which yields the frontal and smile photographs and the intraoral upper jaw and lower jaw photographs. The method effectively addresses the poor performance and low accuracy of automatic dental image classification, improves classification accuracy, and stabilizes the performance of the triplet loss function.
Description
Technical Field
The invention relates to the technical field of image processing, in particular to a method, equipment and medium for automatically classifying dental images based on dual-label cascade.
Background
With the rapid development of society and the improvement in people's quality of life, oral health receives more and more attention. A dentist takes various radiographs of a patient, such as lateral cephalometric, panoramic, frontal and small dental films, with a CT machine, and acquires frontal, smile, profile and intraoral photographs of the patient from various angles with a single-lens reflex camera. From this collection of related image data, the dentist makes a corresponding diagnosis.
This large amount of image data must be identified and classified manually, which is a huge workload for doctors, and a doctor's ability to distinguish images declines as workload and working hours grow. How to distinguish different images efficiently and accurately has therefore become a focus of research.
In recent years, with the rapid development of deep learning, many industries have applied it maturely, greatly improving work and production efficiency. In dentistry, automatically classifying patient images with deep learning can greatly improve a dentist's working efficiency. However, some dental images are highly similar, which makes automatic classification very difficult: intraoral left and right photographs, for example, are visually nothing more than mirror images of each other. Compared with the large-scale data common in general classification tasks, dental data are markedly diverse and class-imbalanced. Automatically and accurately classifying such similar dental images is therefore a crucial problem.
Therefore, there is a need for further improvement of the method, apparatus and medium for automatic classification of dental images based on dual-label cascade to solve the above problems.
Disclosure of Invention
The purpose of this application is to provide a method, equipment and medium for automatically classifying dental images based on a dual-label cascade that effectively solve the problems of poor automatic classification performance and low classification accuracy, improve the accuracy of automatic dental image classification, and stabilize the performance of the triplet loss function.
This purpose is achieved through the following technical scheme. A dual-label cascade-based method for automatically classifying dental images comprises the following steps:
s1: creating a dental image dataset;
s2: performing data enhancement on the dental image dataset;
s3: designing a lightweight network that detects specific targets in the dental image dataset of step S2 and divides the image data into ten categories, namely frontal photograph, profile photograph, intraoral left and right photographs, intraoral frontal photograph, intraoral upper and lower jaw photographs, lateral cephalometric film, panoramic film, positioning film, joint film and small dental film;
s4: for the intraoral left and right photographs output by step S3, performing secondary classification with a dual-label classification network to obtain the intraoral left photograph and the intraoral right photograph;
S5: performing secondary classification on the front face photos and the intraoral upper and lower jaw photos output in the step S3 by using target detection, classifying the front face photos to obtain the front face photos and the smile photos, and classifying the intraoral upper and lower jaw photos to obtain the intraoral upper jaw photos and the intraoral lower jaw photos;
wherein, the step S4 specifically includes:
s41, constructing a dual-label classification network;
s42, selecting positive and negative samples; each anchor point selects a positive sample and a negative sample which have large information quantity and are representative;
s43, self-adaptive triple loss; features of the same label are spatially close together and features of different labels are spatially far apart.
Preferably, the step S1 specifically includes:
s11, dental image data set composition: dividing actually shot dental images into a plurality of categories, and dividing the plurality of categories into single labels and double labels, wherein the double labels are additional labeling categories for special image data;
s12, labeling dental image data sets: the dental images are annotated for target detection using the labelImg annotation tool;
s13: dental image dataset partitioning: the dental image dataset is divided into a training set, a verification set and a test set in the ratio 8:1:1.
Preferably, the data enhancement in step S2 specifically includes: and carrying out left-right mirror image adjustment or illumination and contrast adjustment on the dental image data set.
Preferably, the S3 specifically includes:
s31, designing a lightweight network: the model consists of 10 convolutional layers, and conventional convolution and depth separable convolution are utilized;
s32, a class-balanced loss function: the class imbalance problem is handled using CB-Focal Loss, whose formula is

CB_focal(p, y) = −((1 − β) / (1 − β^{n_y})) · (1 − p_y)^γ · log(p_y)

where n_y is the number of training samples of class y, β and γ are both weighting parameters, and p_y is the probability value predicted by the network for the true class y.
S33, model initialization: initialization is done using VGG 16 weights trained on ImageNet data sets.
Preferably, the S42 specifically includes:
s421, evaluating the information content of the dental image;
s422, from the positive and negative samples with the largest information content, representative samples are selected to construct triplets;
s423. for each image x_i relative to an anchor a, an information score is computed; P(x_i) indicates whether x_i is a candidate positive sample and N(x_i) indicates whether x_i is a candidate negative sample. In the calculation, S(x_i, a) denotes the class-label similarity between images x_i and a, computed with a pairwise similarity distance, and D(x_i, a) denotes the difficulty between them, computed as the Euclidean distance between their embeddings.
Preferably, the step S43 specifically includes:
s431, the features of the same label are made close in spatial position and the features of different labels far apart; for two positive examples and one negative example of the same class, the negative example must lie at least a margin term farther from the anchor than the positive example. The triplet loss is expressed as

L_tri = max(0, d(a, p) − d(a, n) + m)

where max(0, ·) denotes taking the maximum with 0, a, p and n denote the anchor, positive sample and negative sample respectively, d(·, ·) is the embedding-space distance and m is the margin term;
s432, a threshold, expressed through an angle θ, is set between the positive and negative sample pairs; the resulting adaptive triplet loss function combines a strict margin term and a slack margin term weighted by a hyper-parameter λ.
Preferably, the step S5 specifically includes:
s51, a target detection network; taking NanoDet as a detection network, selecting ShuffleNet V2 as a backbone, removing the last layer of convolution, extracting 8, 16 and 32 times of down-sampling features, and inputting the features into PAN for multi-scale feature fusion;
s52, a target detection loss function; the Generalized Focal Loss is selected to train the network;
s53, target detection training; dual labels are used for training.
Preferably, the formula adopted in step S52 is that of the Generalized Focal Loss, whose quality focal loss component has the standard form QFL(σ) = −|y − σ|^β · ((1 − y)·log(1 − σ) + y·log(σ)), where σ is the predicted estimate, y the continuous quality label and β a scaling factor.
the present invention also provides an electronic device, comprising: one or more processors; storage means for storing one or more programs; when executed by the one or more processors, cause the one or more processors to implement a dual-label cascade-based dental image automatic classification method as provided by the present invention.
The present invention also provides a computer-readable storage medium storing a computer program executable by a computer processor to implement any one of the above-mentioned methods for automatic classification of dental images based on dual-label cascade.
Compared with the prior art, the application has the following obvious advantages and effects:
1. in the invention, the tooth classification is carried out in a dual-label and cascade mode, so that the problem of similarity of dental images is solved, and the accuracy of dental image classification is improved.
2. In the invention, the performance of the triple loss function is stabilized through the self-adaptive triple loss function.
3. In the invention, a new positive and negative sample selection strategy is designed, and samples are selected according to the information content of the images, namely the relevance, the difficulty and the representativeness of the images.
Drawings
Fig. 1 is a flow chart of a dual-label cascade network of the present application.
Fig. 2 is a schematic diagram of a dual-tag cascade network structure according to the present application.
Fig. 3 is a lightweight classification network in the present application.
Fig. 4 is a diagram of the composition of a dual-label classification network in the present application.
Fig. 5 is a diagram of a triplet loss function in the present application.
FIG. 6 is an intraoral maxillomandibular view under dual tag target detection in the present application.
Fig. 7 is a schematic structural diagram of an electronic device in the present application.
Reference numbers in this application:
processor 101, storage device 102, input device 103, output device 104, bus 105.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
Before discussing exemplary embodiments in greater detail, it should be noted that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart may describe the operations (or steps) as a sequential process, many of the operations (or steps) can be performed in parallel, concurrently or simultaneously. In addition, the order of the operations may be re-arranged. The process may be terminated when its operations are completed, but may have additional steps not included in the figure. The processes may correspond to methods, functions, procedures, subroutines, and the like.
The method, apparatus and medium for automatic classification of dental images based on dual tag cascade provided in the present application are described in detail by the following embodiments and alternatives thereof.
Fig. 1 is a flowchart of a method for automatically classifying dental images based on dual-label cascade in an embodiment of the present invention. Fig. 2 is a schematic diagram of a dual-tag cascade network structure according to the present application. The embodiment of the invention can be suitable for the condition of a double-label cascade dental image automatic classification method. The method can be executed by a double-label cascade dental image automatic classification device which can be realized in a software and/or hardware mode and is integrated on any electronic device with a network communication function. As shown in fig. 1, the method for automatically classifying a two-label cascade dental image provided in the embodiment of the present application may include the following steps:
s1: creating a dental image dataset; the step S1 specifically includes:
s11, dental image data set composition: dividing actually shot dental images into a plurality of categories, and dividing the plurality of categories into single labels and double labels, wherein the double labels are additional labeling categories for special image data;
In the embodiment of the present application, the data are derived from images actually taken by doctors and are divided into 10 categories: frontal photograph, profile photograph, intraoral left and right photographs, intraoral frontal photograph, intraoral upper and lower jaw photographs, lateral cephalometric film, panoramic film, positioning film, joint film and small dental film. Among these, the frontal photograph, the intraoral left and right photographs and the intraoral upper and lower jaw photographs are dual-labeled. A dual label is an additional label attached to particular data: under the frontal photograph, the extra labels are not smiling and smiling; under the intraoral left and right photographs, intraoral left and intraoral right; and under the intraoral upper and lower jaw photographs, intraoral upper jaw and intraoral lower jaw.
S12, labeling of dental image data sets: labeling the dental image data set by a doctor with related professional experience according to target detection, and labeling by using a labelimg labeling tool;
s13: dental image dataset partitioning: the dental image dataset is divided into a training set, a verification set and a test set in the ratio 8:1:1.
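The 8:1:1 split above can be sketched as follows; the file names and the fixed random seed are illustrative, not taken from the patent:

```python
import random

def split_dataset(items, seed=42):
    """Shuffle a list of image paths and split it 8:1:1 into train/val/test."""
    rng = random.Random(seed)
    items = list(items)
    rng.shuffle(items)
    n_train = int(len(items) * 0.8)
    n_val = int(len(items) * 0.1)
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test

# placeholder file names; the real dataset layout is not given in the patent
images = [f"img_{i:04d}.jpg" for i in range(100)]
train, val, test = split_dataset(images)
```

Shuffling before the split keeps each subset representative of all ten categories.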
S2: performing data enhancement on the dental image dataset;
In the embodiment of the present application, the data enhancement specifically includes left-right mirroring or illumination and contrast adjustment of the dental image dataset. Specifically, among the 10 label categories only the intraoral left and right photographs are restricted to illumination and contrast adjustment; all other images additionally undergo left-right mirroring.
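A minimal sketch of this augmentation policy; the label names and jitter ranges are hypothetical, since the patent does not specify its augmentation parameters:

```python
import random
import numpy as np

# hypothetical label names; mirroring an intraoral left/right photograph
# would swap its class, so these two classes are never mirrored
MIRROR_SENSITIVE = {"intraoral_left", "intraoral_right"}

def augment(image, label, rng):
    """Illumination/contrast jitter for every class; left-right mirroring
    only for classes whose label survives a mirror."""
    alpha = rng.uniform(0.8, 1.2)      # contrast factor
    beta = rng.uniform(-20.0, 20.0)    # brightness offset
    out = np.clip(alpha * image.astype(np.float32) + beta, 0, 255).astype(np.uint8)
    if label not in MIRROR_SENSITIVE and rng.random() < 0.5:
        out = out[:, ::-1]             # horizontal mirror
    return out

img = np.tile(np.arange(50, 150, dtype=np.uint8), (4, 1))  # toy gradient image
aug = augment(img, "intraoral_left", random.Random(0))
```

Because the intraoral side photographs are excluded from mirroring, the left-to-right gradient of the toy image is preserved after augmentation.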
S3: designing a specific target of the dental image data set in the lightweight network detection step S2, and dividing the image data into a plurality of categories; respectively as follows: frontal irradiation, side irradiation, oral left and right side irradiation, oral front irradiation, oral upper and lower jaw irradiation, side film, panoramic film, normal film, joint film, and small dental film; and step S3 is a first stage, and single-label classification is carried out on the dental images by using a lightweight network, so that a foundation is laid for subsequent steps. S3 specifically comprises the following steps:
s31, designing a lightweight network: the model consists of 10 convolutional layers, and conventional convolution and depth separable convolution are utilized;
As shown in fig. 3, the lightweight classification network in the present application is a lightweight deep CNN model designed to use embedded-device resources effectively. The model consists of 10 convolutional layers using conventional and depthwise separable convolutions: the first three layers are conventional convolutional layers, and the remaining layers are depthwise separable convolutional layers. Using depthwise separable convolutions reduces network computation time and makes the model smaller. Layer 9 is the bottleneck layer; the output of the convolutional layers is then flattened and fed into a sigmoid classifier. Since most of a CNN's calculations and parameters occur in fully connected layers, none are employed, which reduces model size and processing time. Max pooling layers with kernel size 2 and stride 2 follow layers 3, 5, 6, 7 and 8, and the max pooling reduces the computational cost. The ReLU activation function is used in all layers, as ReLU generally converges faster than other activation functions.
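The size advantage of the depthwise separable layers can be checked with a quick parameter count; the channel and kernel sizes below are illustrative, not the patent's actual layer sizes:

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias terms ignored)."""
    return k * k * c_in * c_out

def depthwise_separable_params(c_in, c_out, k):
    """Depthwise k x k convolution (one filter per input channel)
    followed by a 1 x 1 pointwise convolution."""
    return k * k * c_in + c_in * c_out

standard = conv_params(64, 128, 3)                  # 73728 weights
separable = depthwise_separable_params(64, 128, 3)  # 8768 weights
```

For a k x k kernel and large channel counts the separable form needs roughly 1/k² of the weights, which is where the speed and size savings come from.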
S32, class balance loss function: the choice of using CB-Focal local to deal with the class imbalance problem, CB is a re-weighting scheme, and Loss is re-balanced by the number of valid samples of each class, and the formula of CB-Focal local is as follows:
wherein,is a categoryThe number of true samples of the sample,andare all the parameters of the weight that are,a probability value predicted for the network.
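A single-sample sketch of this loss, assuming the CB-Focal formulation of Cui et al. (CVPR 2019), whose symbols (n_y, β, γ and the predicted probability) match those described here; the β and γ defaults are illustrative:

```python
import math

def cb_focal_loss(p_y, n_y, beta=0.999, gamma=2.0):
    """Class-balanced focal loss for one sample.

    p_y : probability the network predicts for the true class y
    n_y : number of training samples of class y
    beta, gamma : the two weighting parameters of the formula
    """
    cb_weight = (1.0 - beta) / (1.0 - beta ** n_y)  # inverse effective number of samples
    focal = (1.0 - p_y) ** gamma * -math.log(p_y)   # focal loss term
    return cb_weight * focal

# a rare class (20 images) is weighted more heavily than a common one (2000)
rare, common = cb_focal_loss(0.7, 20), cb_focal_loss(0.7, 2000)
```

The CB weight grows as n_y shrinks, so minority dental categories contribute more to the gradient, while the focal term down-weights examples the network already classifies confidently.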
S33, model initialization: initialization is done using VGG 16 weights trained on ImageNet data sets.
In the embodiment of the present application, weight initialization is an important step in training the model. The designed lightweight classification model was initialized with VGG 16 weights trained on the ImageNet dataset.
S4: aiming at the left and right side photographs in the mouth output in the step S3, in the second stage, secondary classification is carried out by using a double-label classification network to obtain the left side photograph in the mouth and the right side photograph in the mouth; step S4 specifically includes:
s41, constructing a dual-label classification network; fig. 4 shows the composition of the dual-label classification network in the present application. In the embodiment, the dual-label classification network reuses the lightweight classification network designed in the first stage and performs secondary classification of the intraoral left/right category images acquired in the first stage; the classes of this classification are the intraoral left photograph and the intraoral right photograph. Unlike the first-stage network, the second stage replaces the first three convolutional layers with a single convolutional layer with a 7 x 7 kernel, and layers 4 and 5 with a single convolutional layer with a 5 x 5 kernel; the rest of the model remains unchanged. The number of layers drops from 10 to 7, and although the convolution kernels of the first two layers are larger than in the first-stage network, the receptive field is the same. The average processing time is reduced to 1 second and the number of parameters is also reduced. This design makes the model faster and smaller, with only a slight reduction in precision.
S42, selecting positive and negative samples; each anchor point selects a positive sample and a negative sample which have large information quantity and are representative; in the embodiment of the present application, the selection strategy of positive and negative samples aims to select positive and negative samples with large amount of information (i.e., correlation and difficulty) and representativeness (i.e., different from each other in the feature space) for each anchor point. The sample to anchor point correlation is defined based on its tag similarity with respect to the anchor point. If the class label similarity of a positive sample is high, then its correlation to the anchor point is high, and vice versa. If the class label similarity of a negative example is small, it is highly correlated with the anchor point, and vice versa. The difficulty of a sample is related to its distance to the anchor point in the feature space. In the embedding space, positive samples may be difficult if the distance to the anchor point is large, while negative samples may be difficult if the distance to the anchor point is small.
The positive and negative sample selection strategy: the information content (i.e., relevance and difficulty) of each image is evaluated first; then, among the positive and negative samples with the largest information content, representative (diverse) samples are selected to construct triplets. For each image x_i relative to an anchor a, an information score is computed; P(x_i) indicates whether x_i is a candidate positive sample and N(x_i) indicates whether x_i is a candidate negative sample.
In the calculation, S(x_i, a) denotes the class-label similarity between images x_i and a, computed using a pairwise similarity distance, and D(x_i, a) denotes the difficulty between them, computed as the Euclidean distance between their embeddings. This new positive and negative sample selection strategy selects samples according to the information content of the images, that is, their relevance, difficulty and representativeness. Selecting samples by relevance and difficulty lets the network choose better samples, which help training more and give the model better accuracy and generalization.
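The score formulas themselves are not reproduced in this text, so the sketch below assumes a simple combination of the two described quantities (label similarity and embedding-distance difficulty); the patent's exact formula may differ:

```python
import numpy as np

def information_scores(anchor_emb, anchor_label, embs, labels):
    """Relevance = label similarity to the anchor; difficulty = Euclidean
    distance in embedding space. Hard positives share the label but lie far
    away; hard negatives differ in label but lie close."""
    sim = np.array([float(lbl == anchor_label) for lbl in labels])  # S(x_i, a)
    dist = np.linalg.norm(embs - anchor_emb, axis=1)                # D(x_i, a)
    pos_score = sim * dist                  # informative positives: same label, far
    neg_score = (1.0 - sim) / (1.0 + dist)  # informative negatives: other label, near
    return pos_score, neg_score

embs = np.array([[3.0, 0.0], [0.5, 0.0], [0.2, 0.0]])
labels = ["left", "left", "right"]  # toy labels relative to a "left" anchor
pos, neg = information_scores(np.zeros(2), "left", embs, labels)
hardest_pos = int(np.argmax(pos))  # the distant same-label sample
hardest_neg = int(np.argmax(neg))  # the nearby other-label sample
```

Top-scoring candidates would then be filtered for diversity before forming triplets, matching the representativeness criterion above.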
S43, self-adaptive triple loss; features of the same label are spatially close together and features of different labels are spatially far apart.
Fig. 5 is a schematic diagram of the triplet loss function in the present application. In the embodiment, the objective of the triplet loss function is to bring the features of the same label as close as possible in spatial position and push the features of different labels as far apart as possible; to keep the sample features from collapsing into a very small region, it is required that for two positive examples and one negative example of the same class, the negative example lie at least a margin term away from the positive example. The mathematical expression is

L_tri = max(0, d(a, p) − d(a, n) + m)

where max(0, ·) denotes taking the maximum with 0, a, p and n represent the anchor (candidate sample), positive sample and negative sample, d(·, ·) is the Euclidean distance in the embedding space and m is the margin term. However, this expression only ensures that the distance between the anchor and the positive sample is strictly smaller than the distance between the anchor and the negative sample, and it allows the special case in which both distances are arbitrarily small. In that case, although increasing the margin term m enlarges the distance to the negative sample, a margin greater than 0.5 causes a significant degradation of network performance. Therefore, the present embodiment proposes setting a threshold, expressed through an angle θ, between the positive and negative sample pairs. The proposed adaptive triplet loss function combines a strict margin term and a slack margin term weighted by a hyper-parameter λ; through this adaptive triplet loss function, the performance of the triplet loss is stabilized.
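The base triplet term underlying this section can be sketched as follows; the adaptive thresholded variant's exact formula is not reproduced in this text, so only the standard loss is shown:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss: max(0, d(a, p) - d(a, n) + margin)."""
    d_ap = np.linalg.norm(anchor - positive)
    d_an = np.linalg.norm(anchor - negative)
    return max(0.0, d_ap - d_an + margin)

a = np.array([0.0, 0.0])
p = np.array([0.1, 0.0])
satisfied = triplet_loss(a, p, np.array([2.0, 0.0]))  # negative already far: loss 0
violating = triplet_loss(a, p, np.array([0.2, 0.0]))  # negative too close: penalized
```

The loss is zero once the negative is more than a margin farther from the anchor than the positive, which is exactly the failure mode the adaptive variant addresses when both distances shrink together.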
And S44, the intraoral left and right photographs output by S3 are fed to the trained dual-label classification network, which classifies them further according to the output category scores to obtain the specific category of each input photograph, namely intraoral left photograph or intraoral right photograph.
S5: in the third stage, secondary classification is performed by target detection on the frontal photographs and the intraoral upper and lower jaw photographs output in step S3: the frontal photographs are classified into frontal photographs and smiling photographs, and the intraoral upper and lower jaw photographs are classified into intraoral upper jaw photographs and intraoral lower jaw photographs. S51, target detection network: NanoDet is used as the detection network, with ShuffleNetV2 selected as the backbone; the last convolutional layer is removed, features at 8×, 16× and 32× down-sampling are extracted, and these features are fed into a PAN for multi-scale feature fusion. In the embodiment of the application, the structure of the PAN differs from the conventional PAN: all convolutions in the PAN are removed, only the 1×1 convolutions applied to the features extracted from the backbone are retained to align the feature channel dimensions, and both up-sampling and down-sampling are performed by interpolation. Unlike the concatenate operation used by YOLO, feature maps at multiple scales are added directly, so the overall feature fusion module becomes less computationally intensive.
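A toy illustration of the lightweight fusion just described, using nearest-neighbour interpolation as a stand-in for the interpolation-based rescaling and element-wise addition in place of concatenation (pure Python on list-of-lists maps, not the NanoDet implementation):

```python
def upsample_nearest(feat, factor=2):
    """Resize a 2-D feature map (list of lists) by nearest-neighbour
    interpolation: each value is repeated factor times in both axes."""
    out = []
    for row in feat:
        wide = [v for v in row for _ in range(factor)]
        for _ in range(factor):
            out.append(list(wide))
    return out

def fuse_add(a, b):
    """Element-wise addition of two same-shape maps, replacing
    concatenation so the fused map keeps the same channel count."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]
```

Because addition keeps the channel dimension fixed, no extra convolutions are needed after fusion, which is the source of the computational saving mentioned above.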
S52, target detection loss function: in the present embodiment, considering the dispersion of dental data and in order to ensure consistency between training and inference, Generalized Focal Loss is selected to train the network. Its general form, for a continuous target y bracketed by the discrete labels y_l ≤ y ≤ y_r with predicted probabilities p_{y_l} and p_{y_r}, is:

GFL(p_{y_l}, p_{y_r}) = -|y - (y_l · p_{y_l} + y_r · p_{y_r})|^β · ((y_r - y) · log(p_{y_l}) + (y - y_l) · log(p_{y_r}))
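For reference, a sketch of the published Generalized Focal Loss formulation (Li et al.; this is an outside reconstruction under the assumption of a continuous target y bracketed by discrete labels y_l ≤ y ≤ y_r, not code from the patent):

```python
import math

def generalized_focal_loss(y, y_l, y_r, p_l, p_r, beta=2.0):
    """GFL for a continuous label y in [y_l, y_r]: a cross-entropy over
    the two bracketing labels, modulated by how far the expected value
    y_l*p_l + y_r*p_r is from the target (focal term with exponent beta)."""
    focal = abs(y - (y_l * p_l + y_r * p_r)) ** beta
    ce = (y_r - y) * math.log(p_l) + (y - y_l) * math.log(p_r)
    return -focal * ce
```

When the prediction is perfect for unit-spaced labels (p_l = y_r - y and p_r = y - y_l), the focal term is zero and so is the loss; mispredictions are penalized in proportion to how far the expected value drifts from y.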
S53, target detection training: as shown in fig. 6, which is an intraoral upper and lower jaw diagram under dual-label target detection in the present application, dual-label training is performed in this embodiment to better distinguish the frontal photograph from the smiling photograph and the intraoral upper jaw photograph from the intraoral lower jaw photograph. For the frontal and smiling photographs, mouth detection is added on the basis of the original frontal label, and the detected mouths are secondarily classified into two categories, smiling and non-smiling. For the intraoral upper and lower jaw photographs, after the oral cavity is detected, a secondary classification is performed into lower jaw photographs containing the tongue and upper jaw photographs without the tongue.
And S54, the frontal photographs and the intraoral jaw photographs output in S3 are sent to the trained target detection network. The network decides whether a frontal photograph is smiling according to whether it contains a mouth and the characteristics of that mouth, and classifies the intraoral photographs into upper jaw photographs and lower jaw photographs according to whether they contain the tongue.
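The secondary decision rules in S53 and S54 can be sketched as a small rule layer over the detector's findings (function and category names here are illustrative assumptions, not the patent's API; in the actual method the split is made by the detector's dual-label outputs):

```python
def refine_category(coarse, has_mouth=False, mouth_smiling=False, has_tongue=False):
    """Secondary classification from detector findings: frontal photos
    split by the detected mouth's smile class, intraoral jaw photos
    split by tongue presence (tongue present -> lower jaw)."""
    if coarse == "frontal":
        return "smiling photograph" if (has_mouth and mouth_smiling) else "frontal photograph"
    if coarse == "intraoral_jaw":
        return "intraoral lower jaw photograph" if has_tongue else "intraoral upper jaw photograph"
    raise ValueError(f"unexpected coarse category: {coarse}")
```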
By adopting the dual-label training method, the present application solves the problem that traditional target detectors and classifiers cannot effectively handle the similarity of dental data; the cascade framework addresses the diversity and class imbalance of dental data; the automatic classification accuracy of dental images is improved and the performance of the triplet loss function is stabilized.
The present invention further provides an electronic device, as shown in fig. 7, which is a schematic structural diagram of an electronic device in the present application, and includes one or more processors 101 and a storage device 102; the processor 101 in the electronic device may be one or more, and fig. 7 illustrates one processor 101 as an example; storage 102 is used to store one or more programs; the one or more programs are executable by the one or more processors 101 to cause the one or more processors 101 to implement a method for automatic classification of dental images based on dual-label cascading as described in any of the embodiments of the present invention.
The electronic device may further include: an input device 103 and an output device 104. The processor 101, the storage device 102, the input device 103, and the output device 104 in the electronic apparatus may be connected by a bus 105 or by other means; fig. 7 illustrates connection by the bus 105 as an example.
The storage device 102 in the electronic apparatus is used as a computer-readable storage medium for storing one or more programs, which may be software programs, computer-executable programs, and modules, such as program instructions/modules corresponding to the method for automatic classification of dental images based on dual-label cascading provided in the embodiments of the present invention. The processor 101 executes various functional applications and data processing of the electronic device by running software programs, instructions and modules stored in the storage device 102, namely, the method for automatically classifying dental images based on dual-label cascade in the above method embodiment is realized.
The storage device 102 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to use of the electronic device, and the like. In addition, the storage device 102 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some examples, the storage 102 may further include memory located remotely from the processor 101, which may be connected to the device over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input device 103 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic apparatus. The output device 104 may include a display device such as a display screen.
When the one or more programs included in the above-mentioned electronic device are executed by the one or more processors 101, the programs perform the following operations:
creating a dental image dataset;
performing data enhancement on the dental image dataset;
designing a lightweight network to detect specific targets of the dental image dataset, and classifying the image data into a plurality of categories, respectively: frontal photograph, side photograph, intraoral left and right side photograph, intraoral frontal photograph, intraoral upper and lower jaw photograph, side film, panoramic film, positioning film, joint film, and small dental film;
performing secondary classification by using a dual-label classification network on the intraoral left and right side photographs to obtain intraoral left photographs and intraoral right photographs;
performing secondary classification by using target detection on the frontal photographs and the intraoral upper and lower jaw photographs, classifying the frontal photographs to obtain frontal photographs and smiling photographs, and classifying the intraoral upper and lower jaw photographs to obtain intraoral upper jaw photographs and intraoral lower jaw photographs.
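Assembled into one pipeline, the cascade described by these operations might be dispatched as follows (the stage models and category strings are illustrative assumptions; any callables with these interfaces would do):

```python
def cascade_classify(image, stage1, stage2, stage3):
    """Three-stage cascade: stage1 assigns a coarse category; intraoral
    left/right photos go to the dual-label classifier (stage2), frontal
    and intraoral jaw photos to the detector-based refiner (stage3)."""
    coarse = stage1(image)
    if coarse == "intraoral left/right side photograph":
        return stage2(image)           # -> intraoral left or right photograph
    if coarse in ("frontal photograph", "intraoral upper/lower jaw photograph"):
        return stage3(image, coarse)   # -> frontal/smiling or upper/lower jaw
    return coarse                      # the remaining categories are final
```

The design keeps the expensive refiners off the easy categories (panoramic film, joint film, etc.), which only pass through the lightweight first stage.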
Of course, it will be understood by those skilled in the art that when the one or more programs included in the electronic device are executed by the one or more processors 101, the programs may also perform operations related to the method for automatically classifying dental images based on dual-label cascade provided in any embodiment of the present invention.
It should be further noted that the present invention also provides a computer-readable storage medium, which stores a computer program, where the computer program can be executed by a computer processor, to implement the above-mentioned embodiment of the method for automatic classification of dental images based on dual-label cascade. The computer program may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
A computer readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java or C++ and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (e.g., through the internet using an internet service provider).
Any modifications, equivalents, improvements, etc. made within the spirit and principles of the present application, as may readily occur to those skilled in the art, are intended to be included within the scope of the claims of this application.
Claims (10)
1. A dental image automatic classification method based on dual-label cascade is characterized in that: the method comprises the following steps:
s1: creating a dental image dataset;
s2: performing data enhancement on the dental image dataset;
s3: designing a lightweight network to detect specific targets of the dental image dataset of step S2, and dividing the image data into a plurality of categories, respectively: frontal photograph, side photograph, intraoral left and right side photograph, intraoral frontal photograph, intraoral upper and lower jaw photograph, side film, panoramic film, positioning film, joint film and small dental film;
s4: for the intraoral left and right side photographs output in step S3, performing secondary classification by using a dual-label classification network to obtain the intraoral left photograph and the intraoral right photograph;
s5: performing secondary classification by using target detection on the frontal photographs and the intraoral upper and lower jaw photographs output in step S3, classifying the frontal photographs to obtain frontal photographs and smiling photographs, and classifying the intraoral upper and lower jaw photographs to obtain intraoral upper jaw photographs and intraoral lower jaw photographs;
wherein, the step S4 specifically includes:
s41, constructing a dual-label classification network;
s42, selecting positive and negative samples: for each anchor point, a positive sample and a negative sample that are highly informative and representative are selected;
s43, adaptive triplet loss: features of the same label are made spatially close and features of different labels spatially far apart.
2. The method for automatic classification of dental images based on dual-label cascade as claimed in claim 1, wherein the step S1 specifically comprises:
s11, dental image data set composition: dividing actually shot dental images into a plurality of categories, and dividing the plurality of categories into single labels and double labels, wherein the double labels are additional labeling categories for special image data;
s12, labeling of dental image datasets: the dental images are labeled for target detection using the LabelImg labeling tool;
s13: dental image dataset partitioning: the dental image dataset is divided into a training set, a verification set and a test set in the ratio of 8:1:1.
3. The method for automatically classifying dental images based on dual-label cascade as claimed in claim 1, wherein the data enhancement in step S2 specifically comprises: performing left-right mirroring or illumination and contrast adjustment on the dental image dataset.
4. The automatic classification method for dental images based on dual-label cascade as claimed in claim 1, characterized in that: the S3 specifically includes:
s31, designing a lightweight network: the model consists of 10 convolutional layers, using conventional convolutions and depthwise separable convolutions;
s32, class-balanced loss function: the class imbalance problem is handled using CB-Focal loss, given by the following formula:

CB_focal = -((1 - β) / (1 - β^{n_y})) · Σ_i (1 - p_i^t)^γ · log(p_i^t)

wherein n_y is the number of true samples of category y, β and γ are both weighting parameters, and p_i^t is the predicted probability value of the network;
s33, model initialization: initialization is done using VGG16 weights trained on the ImageNet dataset.
5. The automatic classification method for dental images based on dual-label cascade as claimed in claim 1, characterized in that: the S42 specifically includes:
s421, evaluating the information content of the dental image;
s422, selecting a representative sample from the positive sample and the negative sample with the largest information quantity to construct a triple;
s423. For each image x_i, an informativeness score s_i relative to the anchor point x_a is computed, where P_i represents whether x_i is a candidate positive sample and N_i represents whether x_i is a candidate negative sample; the calculation formula is as follows:
6. The automatic classification method for dental images based on dual-label cascade as claimed in claim 1, characterized in that: the step S43 specifically includes:
s431, features of the same label are made close in spatial position and features of different labels far apart in spatial position; for two positive examples of the same class and one negative example, the negative example must be at least a margin term away from the positive example; the formula is expressed as follows:

L_tri = max(d(x_a, x_p) - d(x_a, x_n) + m, 0)

wherein max(·, 0) indicates taking the maximum value in comparison with 0; x_a, x_p and x_n respectively represent the anchor point, the positive sample and the negative sample; and m represents the margin term;

s432, a threshold is set on the negative sample pair, i.e. d(x_a, x_n) ≥ ε, where θ represents the angle between the feature vectors; the adaptive triplet loss function is:

L_ada = max(d(x_a, x_p) - d(x_a, x_n) + m, 0) + λ · max(ε - d(x_a, x_n), 0)
7. The automatic classification method for dental images based on dual-label cascade as claimed in claim 1, characterized in that: the step S5 specifically includes:
s51, a target detection network; taking NanoDet as a detection network, selecting ShuffleNet V2 as a backbone, removing the last layer of convolution, extracting 8, 16 and 32 times of down-sampling features, and inputting the features into PAN for multi-scale feature fusion;
s52, a target detection loss function; selecting a Generalized local to train the network;
s53, target detection training; dual labels are used for training.
9. An electronic device, comprising:
one or more processors;
storage means for storing one or more programs;
when executed by the one or more processors, cause the one or more processors to implement the dual-label cascade-based dental image automatic classification method of any one of claims 1 to 8.
10. A computer-readable storage medium, having stored thereon a computer program, characterized in that the computer program is executable by a computer processor for executing computer-readable instructions for carrying out the method according to any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211063304.3A CN115147873A (en) | 2022-09-01 | 2022-09-01 | Method, equipment and medium for automatically classifying dental images based on dual-label cascade |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115147873A true CN115147873A (en) | 2022-10-04 |
Family
ID=83416227
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211063304.3A Pending CN115147873A (en) | 2022-09-01 | 2022-09-01 | Method, equipment and medium for automatically classifying dental images based on dual-label cascade |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115147873A (en) |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109875513A (en) * | 2019-03-19 | 2019-06-14 | 厦门瑞谱拓医疗科技有限公司 | A kind of tooth detection device and automatic classification method based on couple photographing mode |
CN112418134A (en) * | 2020-12-01 | 2021-02-26 | 厦门大学 | Multi-stream multi-label pedestrian re-identification method based on pedestrian analysis |
CN112734031A (en) * | 2020-12-31 | 2021-04-30 | 珠海格力电器股份有限公司 | Neural network model training method, neural network model recognition method, storage medium, and apparatus |
CN113033555A (en) * | 2021-03-25 | 2021-06-25 | 天津大学 | Visual SLAM closed loop detection method based on metric learning |
CN113723236A (en) * | 2021-08-17 | 2021-11-30 | 广东工业大学 | Cross-mode pedestrian re-identification method combined with local threshold value binary image |
CN114022904A (en) * | 2021-11-05 | 2022-02-08 | 湖南大学 | Noise robust pedestrian re-identification method based on two stages |
CN114283316A (en) * | 2021-09-16 | 2022-04-05 | 腾讯科技(深圳)有限公司 | Image identification method and device, electronic equipment and storage medium |
CN114419672A (en) * | 2022-01-19 | 2022-04-29 | 中山大学 | Cross-scene continuous learning pedestrian re-identification method and device based on consistency learning |
US20220138252A1 (en) * | 2019-09-03 | 2022-05-05 | Guangdong Oppo Mobile Telecommunications Corp., Ltd. | Image searches based on word vectors and image vectors |
CN114550729A (en) * | 2022-01-22 | 2022-05-27 | 珠海亿智电子科技有限公司 | Cry detection model training method and device, electronic equipment and storage medium |
CN114862771A (en) * | 2022-04-18 | 2022-08-05 | 四川大学 | Smart tooth identification and classification method based on deep learning network |
CN114898407A (en) * | 2022-06-15 | 2022-08-12 | 汉斯夫(杭州)医学科技有限公司 | Tooth target instance segmentation and intelligent preview method based on deep learning |
2022-09-01: CN application CN202211063304.3A, publication CN115147873A, status pending.
Non-Patent Citations (3)
Title |
---|
KHANH NGUYEN: "AdaTriplet: Adaptive Gradient Triplet Loss with Automatic Margin Learning for Forensic Medical Image Matching", 《CS.CV》 *
XIAONAN ZHAO: "A Weakly Supervised Adaptive Triplet Loss for Deep Metric Learning", 《CS.CV》 *
PAN LILI: "Fine-grained image retrieval algorithm based on adaptive triplet network", 《Journal of Zhengzhou University》 *
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
AD01 | Patent right deemed abandoned | ||
Effective date of abandoning: 20230811 |