CN116993712A - Method and device for processing oral cavity CBCT image based on deep learning - Google Patents
- Publication number
- CN116993712A CN116993712A CN202311057358.3A CN202311057358A CN116993712A CN 116993712 A CN116993712 A CN 116993712A CN 202311057358 A CN202311057358 A CN 202311057358A CN 116993712 A CN116993712 A CN 116993712A
- Authority
- CN
- China
- Prior art keywords
- image
- model
- feature extraction
- deep learning
- preprocessing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0012—Biomedical image inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10072—Tomographic images
- G06T2207/10081—Computed x-ray tomography [CT]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30036—Dental; Teeth
Abstract
The application discloses a method and a device for processing oral CBCT images based on deep learning, relating to the technical field of computer vision and comprising the following steps: preprocessing an original cone beam CT image, where the preprocessing includes normalization and resizing of the image; a target detection module performs feature extraction using a VGG16 model and processes the extracted features to obtain the coordinates of a target frame; in the MB2 recognition module, a GhostNet model is first used for feature extraction, the parameters of each convolution layer are then evaluated with a joint importance index and model pruning is performed, and finally a Transformer is used for feature reconstruction and classification. The method identifies the second mesiobuccal canal (MB2) of the maxillary first molar on CBCT images, so that this canal can be recognized automatically and accurately, which not only helps to improve the success rate of root canal treatment but also greatly reduces the workload of dentists and improves diagnostic efficiency.
Description
Technical Field
The application relates to the technical field of computer vision, in particular to a method and a device for processing oral CBCT images based on deep learning.
Background
In the current field of oral medicine, accurate root canal treatment is an important task, and the key to its success is accurately identifying and treating the complex root canal system within the tooth. For the maxillary first molar, the complexity of the root canal anatomy and inter-individual variation make accurate identification and treatment of the canals particularly difficult. In general, a given root of the maxillary first molar may contain one or more root canals, including an easily overlooked "second canal" (the second mesiobuccal canal, MB2). In clinical practice, some patients who have received maxillary molar root canal treatment retain or develop new periapical lesions at the mesiobuccal root, often because the second canal was missed. This commonly overlooked anatomical structure must be observed using cone beam CT (CBCT) images, which are therefore critical to identifying and assessing whether a second mesiobuccal canal is present in the maxillary first molar.
Conventional two-dimensional X-ray films, because of their inherent limitations, often cannot accurately display all root canals, particularly hidden or overlapping ones, which causes some root canal treatments to fail. The introduction of cone beam computed tomography (CBCT) provides a clear image of the three-dimensional anatomy of the tooth, opening a new path for root canal identification; however, interpreting CBCT image data still requires substantial expertise and experience, so the approach remains challenging in practical applications.
In the prior art, the assessment process usually relies on experienced medical staff who determine the presence of the second canal by manually inspecting CBCT images. This approach has obvious shortcomings: assessment is inefficient, and the result is often influenced by the subjective judgment of the medical staff, lacking objectivity and consistency. Moreover, the large volume of image processing work places medical staff under enormous pressure, making manual assessment of the second canal increasingly difficult.
Disclosure of Invention
The application aims to solve the technical problem that the second root canal that may exist in the maxillary first molar cannot be identified accurately and efficiently, and provides a method and a device for processing oral CBCT images based on deep learning that solve this problem.
The application is realized by the following technical scheme:
in a first aspect, the present application provides a method for processing oral CBCT images based on deep learning, comprising:
preprocessing an original cone beam CT image, wherein the preprocessing comprises standardization and size adjustment of the image;
the target detection module performs feature extraction by using a VGG16 model, and processes the extracted features to obtain coordinates of a target frame;
in the MB2 recognition module, first using a GhostNet model for feature extraction and then using a Transformer for feature reconstruction and classification; during model training, after feature extraction by the GhostNet model, the parameters of each convolution layer are evaluated using a joint importance index and model pruning is performed.
The method provided by the application identifies the second canal of the maxillary first molar on CBCT images, so that it can be recognized automatically and accurately. This not only improves the success rate of root canal treatment but also greatly reduces the workload of dentists and improves diagnostic efficiency, opening a new path in dental diagnostics. The method has broad application prospects: it can be extended to identifying the root canals of other teeth or other complex dental anatomical structures, and can substantially improve the quality of dental diagnosis and treatment.
Preferably, when the image is preprocessed, its gray values are normalized to the range 0 to 1.
Preferably, when the image is preprocessed, each pixel value is normalized according to the following formula:

Pixel_norm = (Pixel - Min(I)) / (Max(I) - Min(I))

where Pixel represents the original pixel value, and Max(I) and Min(I) represent the maximum and minimum pixel values of image I, respectively.
Preferably, the input image is first downsampled to a resolution of 300 x 200 before being fed into the VGG16 model; after the feature extraction, the output feature map is sent into a single-layer full-connection model to carry out regression of the coordinates of the target frame; the target value of the regression is the coordinates of the target box.
Preferably, the loss function uses a mean square error.
Preferably, during model training, after feature extraction by the GhostNet model, the parameters of each convolution layer are evaluated using the joint importance index I, and model pruning is performed. I is calculated according to the following formula:

I = (1/N) · Σ_{i=1}^{N} |W_i| · |∂L/∂W_i|

where W_i is the i-th parameter of the convolution kernel, N is the total number of convolution-kernel parameters, and ∂L/∂W_i is the gradient of the loss function L with respect to W_i.
Preferably, the MB2 recognition module uses the BCELoss loss function and the Adam optimizer, with the learning rate set to 0.001.
In a second aspect, the present application provides an apparatus for processing oral CBCT images based on deep learning, comprising:
the material and data preprocessing module is used for preprocessing the image and standardizing the gray value to be between 0 and 1;
the target detection module is used for extracting features;
and the MB2 recognition module is used for cutting, extracting, reconstructing and classifying the target area of the CBCT image.
In a third aspect, the present application provides a computer device comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of the above method.
In a fourth aspect, the present application provides a computer readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of the above method.
Compared with the prior art, the application has the following advantages and beneficial effects:
(1) Improved accuracy of image diagnosis: by performing feature extraction with the VGG16 and GhostNet models and combining them with a Transformer for feature reconstruction and classification, the presence or absence of a second root canal can be accurately identified from a CBCT image, remarkably improving the accuracy of image diagnosis;
(2) Improved efficiency of image processing: the application adopts image preprocessing and model pruning strategies, resizing the original image to a suitable size and pruning the model by evaluating convolution-kernel importance with the joint importance index, which significantly improves processing efficiency and reduces the computational complexity of the model;
(3) Improved interpretability and reliability of the model: evaluating convolution-kernel importance with the joint importance index provides an intuitive way to understand and explain the model's decision process, improving interpretability; at the same time, pruning low-importance convolution kernels removes factors that may introduce noise or overfitting, improving reliability;
(4) Improved automation of image diagnosis: the method of the application makes second-root-canal identification in CBCT images automatic, greatly reducing doctors' workload and allowing medical staff to devote more time and effort to tasks that require greater professional skill.
Drawings
In order to more clearly illustrate the technical solutions of the exemplary embodiments of the present application, the drawings that are needed in the examples will be briefly described below, it being understood that the following drawings only illustrate some examples of the present application and therefore should not be considered as limiting the scope, and that other related drawings may be obtained from these drawings without inventive effort for a person skilled in the art. In the drawings:
fig. 1 is a flowchart of a method for processing an oral CBCT image based on deep learning according to embodiment 1 of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. The components of the embodiments of the present application generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the application, as presented in the figures, is not intended to limit the scope of the application, as claimed, but is merely representative of selected embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
Example 1
As shown in fig. 1, this embodiment provides a method for processing an oral CBCT image based on deep learning, which is used to accurately identify a second root canal that may exist in a first molar of the upper jaw, the method specifically comprising:
material and data pretreatment:
the cone beam CT image (CBCT) with the resolution of 3000 multiplied by 2000 is used for preprocessing the image, and the standardized gray scale value is between 0 and 1, and the specific implementation is as follows: each pixel value is normalized using the following formula:
where Pixel represents the original Pixel value and Max (I) and Min (I) represent the maximum and minimum Pixel values of image I, respectively. The training sample of this example is 1000 images for a total of 2000 regions.
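The min–max normalization described above can be sketched as follows (a minimal illustration; the function name and the use of NumPy are assumptions, not part of the patent):

```python
import numpy as np

def normalize_cbct(image: np.ndarray) -> np.ndarray:
    """Min-max normalize a CBCT slice so its gray values lie in [0, 1]."""
    i_min, i_max = image.min(), image.max()
    # Guard against a constant image, where Max(I) == Min(I)
    if i_max == i_min:
        return np.zeros_like(image, dtype=np.float64)
    return (image - i_min) / (i_max - i_min)

# Example on a tiny synthetic image (the patent's images are 3000x2000)
img = np.array([[0, 128], [255, 64]], dtype=np.float64)
norm = normalize_cbct(img)
```

After normalization, the darkest pixel maps to 0 and the brightest to 1, regardless of the original dynamic range.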
Target detection:
A VGG16 model is used for feature extraction. Before the input image is fed into the model, it is first downsampled to a resolution of 300×200; after feature extraction, the output feature map is fed into a single-layer fully connected model for regression of the target-frame coordinates. The regression target is the (normalized) coordinates of the target frame. The loss function uses the mean squared error.
Image recognition:
In the MB2 recognition module (which uses the BCELoss loss function and the Adam optimizer, with the learning rate set to 0.001), the original CBCT image is first cropped to the target region, and the selected region is scaled to a size of 150×150. These cropped and scaled images are denoted Img_150×150 and are then input to the GhostNet feature extraction layer.
In the GhostNet feature extraction stage, the output feature map is described by a function f_G, which produces a 960×9×9 tensor output:

f_G(Img_150×150) = Features_960×9×9.
during the model training phase, the parameters of each convolution layer are evaluated using a back propagation algorithm and model pruning is performed (this operation is performed when model training is near tail sounds, and this step is not required for model operation after model pruning is completed). To comprehensively evaluate the importance of the convolution kernel, a joint importance index I is designed:
wherein W is i Is the ith parameter of the convolution kernel, N is the total number of convolution kernel parameters,is a loss function about W i Is a gradient of (a). By calculating I, importance evaluation of each convolution kernel can be obtained, and for the convolution kernels with evaluation structures lower than a preset threshold value, corresponding parameters are cut, so that a simplified feature extractor +.>The reduced feature extractor outputs a 16 x 9 feature map:
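The pruning step can be sketched as follows. Note the hedge: the patent's exact formula for I is rendered as an image and not recoverable from the text, so this sketch assumes one common Taylor-style joint measure — the mean over a kernel's parameters of |W_i| · |∂L/∂W_i| — and the function names and threshold value are illustrative assumptions.

```python
import numpy as np

def kernel_importance(weights: np.ndarray, grads: np.ndarray) -> float:
    """Joint importance of one convolution kernel: mean over its
    parameters of |W_i| * |dL/dW_i| (assumed form of the index I)."""
    return float(np.mean(np.abs(weights) * np.abs(grads)))

def prune_kernels(kernels, grads, threshold):
    """Keep only kernels whose joint importance reaches the threshold."""
    scores = [kernel_importance(w, g) for w, g in zip(kernels, grads)]
    kept = [w for w, s in zip(kernels, scores) if s >= threshold]
    return kept, scores

# Toy example: four 3x3 kernels with their loss gradients
rng = np.random.default_rng(0)
kernels = [rng.normal(size=(3, 3)) for _ in range(4)]
grads = [rng.normal(size=(3, 3)) for _ in range(4)]
kept, scores = prune_kernels(kernels, grads, threshold=0.5)
```

In practice the gradients would come from back-propagation near the end of training, and pruning would shrink the 960-channel GhostNet output down to the 16 channels the patent describes.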
in the transducer stage, the feature map is first rearranged into a feature vector V of 16×81 16×81 :
V 16×81 =reshape(Features 16×9×9 )。
The MB2 recognition module contains a Transformer module consisting of 3 Transformer encoder layers, denoted f_T:

f_T(V_16×81) = Features'_16×81.
Finally, the output features of the Transformer module pass through a fully connected layer f_FC for binary classification, giving the predicted output Pred:

f_FC(Features'_16×81) = Pred,

where Pred is a binary value: positive (the second root canal is present) is recorded as 1 and negative (absent) is recorded as 0.
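The reshape, Transformer, and classification stages can be sketched as follows. This is a hedged illustration: the patent specifies 3 encoder layers, a BCELoss objective, and Adam with learning rate 0.001, but encoder hyperparameters such as `nhead=3` (chosen so it divides the 81-dimensional token size evenly) are assumptions.

```python
import torch
import torch.nn as nn

# Pruned 16x9x9 feature map -> sequence of 16 tokens of dimension 81
features = torch.randn(1, 16, 9, 9)
tokens = features.flatten(2)               # (1, 16, 81): reshape(Features_16x9x9)

encoder_layer = nn.TransformerEncoderLayer(
    d_model=81, nhead=3, batch_first=True)  # nhead=3 is an assumed setting
transformer = nn.TransformerEncoder(encoder_layer, num_layers=3)  # 3 layers, per the patent
fc = nn.Linear(16 * 81, 1)                  # fully connected binary classifier

out = transformer(tokens)                   # (1, 16, 81) reconstructed features
logit = fc(out.flatten(1))                  # (1, 1)
prob = torch.sigmoid(logit)                 # probability that MB2 is present
loss = nn.BCELoss()(prob, torch.ones(1, 1)) # BCELoss, as stated in the patent
opt = torch.optim.Adam(
    list(transformer.parameters()) + list(fc.parameters()), lr=0.001)
```

Thresholding `prob` at 0.5 would yield the binary Pred value described above (1 = second canal present, 0 = absent).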
In this embodiment, the output values for both the left and right sides are 1, indicating that a second root canal is present on both sides.
Example 2
This embodiment provides an apparatus for processing oral CBCT images based on deep learning for accurately identifying a second canal that may be present in a first molar of the upper jaw, the apparatus comprising:
the material and data preprocessing module is used for preprocessing the image and standardizing the gray value to be between 0 and 1;
the cone beam CT image (CBCT) with the resolution of 3000 multiplied by 2000 is used for preprocessing the image, and the standardized gray scale value is between 0 and 1, and the specific implementation is as follows: each pixel value is normalized using the following formula:
where Pixel represents the original Pixel value and Max (I) and Min (I) represent the maximum and minimum Pixel values of image I, respectively. The training sample of this example is 1000 images for a total of 2000 regions.
The target detection module is used for extracting features;
and (3) carrying out feature extraction by adopting a VGG16 model, wherein before the input image is sent into the model, firstly, the input image is downsampled to have the resolution of 300 multiplied by 200, and after the feature extraction, the output feature image is sent into a single-layer full-connection model to carry out regression of coordinates of a target frame. The target value of the regression is the coordinates (normalized value) of the target frame. The loss function uses a mean square error.
And the MB2 recognition module is used for cutting, extracting, reconstructing and classifying the target area of the CBCT image.
In the MB2 recognition module (which uses the BCELoss loss function and the Adam optimizer, with the learning rate set to 0.001), the original CBCT image is first cropped to the target region, and the selected region is scaled to a size of 150×150. These cropped and scaled images are denoted Img_150×150 and are then input to the GhostNet feature extraction layer.
In the GhostNet feature extraction stage, the output feature map is described by a function f_G, which produces a 960×9×9 tensor output:

f_G(Img_150×150) = Features_960×9×9.
next, the parameters of each convolutional layer are evaluated using a back-propagation algorithm and model pruning is performed. To comprehensively evaluate the importance of the convolution kernel, a joint importance index I is designed:
wherein W is i Is the ith parameter of the convolution kernel, N is the total number of convolution kernel parameters,is a loss function about W i Is a gradient of (a). By calculating I, importance evaluation of each convolution kernel can be obtained, and for the convolution kernels with evaluation structures lower than a preset threshold value, corresponding parameters are cut, so that a simplified feature extractor +.>The reduced feature extractor outputsIs a characteristic diagram of 16×9×9:
in the transducer stage, the feature map is first rearranged into a feature vector V of 16×81 16×81 :
V 16×81 =reshape(Features 16×9×9 )。
The MB2 recognition module contains a Transformer module consisting of 3 Transformer encoder layers, denoted f_T:

f_T(V_16×81) = Features'_16×81.
Finally, the output features of the Transformer module pass through a fully connected layer f_FC for binary classification, giving the predicted output Pred:

f_FC(Features'_16×81) = Pred,

where Pred is a binary value: positive (the second root canal is present) is recorded as 1 and negative (absent) is recorded as 0.
Example 3
This embodiment provides a computer device comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of the method of processing oral CBCT images based on deep learning described in embodiment 1 above.
The computer equipment can be computing equipment such as a desktop computer, a notebook computer, a palm computer, a cloud server and the like. The computer equipment can perform man-machine interaction with a user through a keyboard, a mouse, a remote controller, a touch pad or voice control equipment and the like.
The memory includes at least one type of readable storage medium, including flash memory, hard disk, multimedia card, card-type memory (e.g., SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disk, etc. In some embodiments, the memory may be an internal storage unit of the computer device, for example a hard disk or memory of the computer device. In other embodiments, the memory may also be an external storage device of the computer device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a Flash Card, or the like. Of course, the memory may also include both an internal storage unit and an external storage device of the computer device. In this embodiment, the memory is typically used to store the operating system and the various application software installed on the computer device, such as the program code for running the method for processing oral CBCT images based on deep learning. In addition, the memory may be used to temporarily store various types of data that have been output or are to be output.
The processor may be a central processing unit (Central Processing Unit, CPU), controller, microcontroller, microprocessor, or other data processing chip in some embodiments. The processor is typically used to control the overall operation of the computer device. In this embodiment, the processor is configured to execute the program code stored in the memory or to process data, such as program code for executing the method of processing oral CBCT images based on deep learning.
Example 4
This embodiment provides a computer readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of the method of processing oral CBCT images based on deep learning described above.
Wherein the computer-readable storage medium stores an interface display program executable by at least one processor to cause the at least one processor to perform the steps of a method of processing oral CBCT images based on deep learning.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course, may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) comprising several instructions for causing a terminal device (which may be a mobile phone, a computer, a server or a network device, etc.) to perform the method according to the embodiments of the present application.
The foregoing description of the embodiments illustrates the general principles of the application and is not intended to limit its scope to the particular embodiments; any modifications, equivalents, improvements, etc. that fall within the spirit and principles of the application are intended to be included within its scope. It will be evident to those skilled in the art that the application is not limited to the details of the foregoing illustrative embodiments and may be embodied in other specific forms without departing from its spirit or essential characteristics. The present embodiments are therefore to be considered in all respects illustrative and not restrictive, the scope of the application being indicated by the appended claims rather than by the foregoing description, and all changes that come within the meaning and range of equivalency of the claims are intended to be embraced therein.
Claims (10)
1. A method for processing oral CBCT images based on deep learning, comprising:
preprocessing an original cone beam CT image, wherein the preprocessing comprises standardization and size adjustment of the image;
the target detection module performs feature extraction by using a VGG16 model, and processes the extracted features to obtain coordinates of a target frame;
in an MB2 recognition module, feature extraction is first performed using a GhostNet model, followed by feature reconstruction and classification using a Transformer; during model training, after feature extraction by the GhostNet model, the parameters of each convolution layer are evaluated using a joint importance index and model pruning is performed.
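The three claimed stages can be sketched as a minimal, framework-free pipeline. All function and parameter names here are illustrative assumptions, not from the patent; `detector` and `mb2_recognizer` stand in for the trained VGG16 regression head and the pruned GhostNet + Transformer classifier:

```python
def normalize_and_resize(image):
    # Stand-in for the claimed preprocessing: min-max normalize to [0, 1].
    # (Resizing is omitted here; a real pipeline would also resample the volume.)
    lo, hi = min(image), max(image)
    if hi == lo:  # guard against a constant image
        return [0.0] * len(image)
    return [(p - lo) / (hi - lo) for p in image]

def process_cbct(image, detector, mb2_recognizer):
    """Hypothetical driver for the claimed flow: preprocess the CBCT image,
    detect target boxes (VGG16 stage), then classify each region (MB2 stage)."""
    x = normalize_and_resize(image)
    boxes = detector(x)                        # -> list of target-box coordinates
    return [mb2_recognizer(x, box) for box in boxes]
```

Any callables can be plugged in for the two models, which makes the staged structure of claim 1 easy to exercise in isolation.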
2. The method for processing oral CBCT images based on deep learning according to claim 1, wherein, during preprocessing, the gray values of the image are normalized to the range 0 to 1.
3. The method of claim 1, wherein, during preprocessing, each pixel value of the image is normalized according to the following formula:

Pixel_norm = (Pixel - Min(I)) / (Max(I) - Min(I))

where Pixel represents the original pixel value, and Max(I) and Min(I) represent the maximum and minimum pixel values of image I, respectively.
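The min-max normalization of claims 2-3 can be written directly from the formula; a minimal sketch (the function name and the zero-range guard are additions for illustration):

```python
import numpy as np

def normalize_image(image):
    """Min-max normalize pixel values into [0, 1]:
    Pixel_norm = (Pixel - Min(I)) / (Max(I) - Min(I))."""
    image = np.asarray(image, dtype=np.float64)
    lo, hi = image.min(), image.max()
    if hi == lo:  # constant image: avoid division by zero
        return np.zeros_like(image)
    return (image - lo) / (hi - lo)

# Example: a toy 2x2 "CBCT slice" with raw intensities 0..400
norm = normalize_image([[0, 100], [200, 400]])
```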
4. The method of claim 1, wherein the input image is first downsampled to a resolution of 300 x 200 before being fed into the VGG16 model; after feature extraction, the output feature map is fed into a single-layer fully connected model to perform regression of the target-box coordinates; the regression target is the coordinates of the target box.
5. The method of claim 4, wherein the loss function used for the regression is the mean squared error.
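As a concrete instance of the regression loss in claims 4-5, a minimal mean-squared-error computation over box coordinates (the function name and the (x1, y1, x2, y2) coordinate layout are assumptions for illustration):

```python
import numpy as np

def mse_loss(pred_boxes, target_boxes):
    """Mean squared error between predicted and ground-truth box
    coordinates, averaged over all coordinate entries."""
    pred = np.asarray(pred_boxes, dtype=np.float64)
    target = np.asarray(target_boxes, dtype=np.float64)
    return float(np.mean((pred - target) ** 2))
```

For example, a prediction off by 2 pixels in one of four coordinates yields a loss of 4 / 4 = 1.0.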
6. The method for processing oral CBCT images based on deep learning according to claim 1, wherein, after feature extraction by the GhostNet model during model training, the joint importance index I used to evaluate the parameters of each convolution layer and perform model pruning is calculated according to the following formula:

I = (1/N) * Σ_{i=1..N} |W_i · ∂L/∂W_i|

where W_i is the i-th parameter of the convolution kernel, N is the total number of convolution-kernel parameters, and ∂L/∂W_i is the gradient of the loss function L with respect to W_i.
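A weight-times-gradient importance score of this kind is straightforward to compute per layer. The sketch below assumes the first-order (Taylor-style) form I = (1/N) * Σ |W_i · ∂L/∂W_i| and a simple keep-top-fraction pruning policy; both the exact formula variant and the `prune_layers` helper are illustrative assumptions, not reproduced from the patent:

```python
import numpy as np

def joint_importance(weights, grads):
    """Mean |W_i * dL/dW_i| over all parameters of one convolution layer."""
    w = np.asarray(weights, dtype=np.float64).ravel()
    g = np.asarray(grads, dtype=np.float64).ravel()
    return float(np.mean(np.abs(w * g)))

def prune_layers(layer_scores, keep_ratio=0.5):
    """Hypothetical pruning step: keep the highest-importance layers."""
    ranked = sorted(layer_scores, key=layer_scores.get, reverse=True)
    n_keep = max(1, int(len(ranked) * keep_ratio))
    return set(ranked[:n_keep])
```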
7. The method of claim 1, wherein the MB2 recognition module uses a BCELoss loss function and an Adam optimizer, with the learning rate set to 0.001.
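Binary cross-entropy, the role BCELoss plays for the MB2 classifier in claim 7, can be sketched in a few lines (the function name and the epsilon clamp are illustrative additions; framework implementations such as PyTorch's BCELoss behave equivalently on sigmoid outputs):

```python
import math

def bce_loss(preds, targets, eps=1e-7):
    """Mean binary cross-entropy over probability outputs in (0, 1)."""
    total = 0.0
    for p, t in zip(preds, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(preds)
```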
8. An apparatus for processing oral CBCT images based on deep learning, comprising:
a material and data preprocessing module, configured to preprocess the image and normalize its gray values to between 0 and 1;
a target detection module, configured to extract features;
an MB2 recognition module, configured to crop, extract, reconstruct, and classify the target region of the CBCT image.
9. A computer device comprising a memory and a processor, the memory storing a computer program that, when executed by the processor, causes the processor to perform the steps of the method of any of claims 1 to 7.
10. A computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of the method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311057358.3A CN116993712A (en) | 2023-08-21 | 2023-08-21 | Method and device for processing oral cavity CBCT image based on deep learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116993712A true CN116993712A (en) | 2023-11-03 |
Family
ID=88532057
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311057358.3A Pending CN116993712A (en) | 2023-08-21 | 2023-08-21 | Method and device for processing oral cavity CBCT image based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116993712A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||