CN116416221A - Ultrasonic image analysis method - Google Patents

Ultrasonic image analysis method Download PDF

Info

Publication number
CN116416221A
CN116416221A
Authority
CN
China
Prior art keywords
nodule
feature
ultrasonic
ultrasonic image
classification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310250057.6A
Other languages
Chinese (zh)
Inventor
朴锦春
韩雪华
李德
胡昊
杨天宇
李慧瑛
周雨桐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yanbian University
Original Assignee
Yanbian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yanbian University filed Critical Yanbian University
Priority to CN202310250057.6A priority Critical patent/CN116416221A/en
Publication of CN116416221A publication Critical patent/CN116416221A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0012Biomedical image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10132Ultrasound image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing
    • G06T2207/30061Lung
    • G06T2207/30064Lung nodule

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Quality & Reliability (AREA)
  • Radiology & Medical Imaging (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Ultrasonic Diagnosis Equipment (AREA)

Abstract

The invention relates to the technical field of ultrasonic imaging, and in particular to an ultrasound image analysis method. The method comprises the following steps: acquiring an ultrasound image obtained by ultrasonically scanning the thyroid region of a subject; detecting a benign/malignant classification result for a nodule in the ultrasound image; performing semantic segmentation on the ultrasound image to obtain a nodule segmentation map; analyzing nodule feature information and image findings based on the nodule segmentation map; and, combining the nodule feature information with the benign/malignant classification result, determining the TI-RADS grade and malignancy probability of the nodule based on a pre-trained TI-RADS scoring and grading model. The method addresses the technical problems of existing processing schemes, which lack corresponding image-findings and grading reports, produce diagnostic results that lack interpretability, cannot provide treatment and diagnosis suggestions, and therefore have limited clinical significance.

Description

Ultrasonic image analysis method
Technical Field
The invention relates to the technical field of ultrasonic imaging, in particular to an ultrasonic image analysis method.
Background
Over the last 20 years, many studies have elucidated the association between thyroid nodules and cancer risk. The focus of this research has been to define classification systems for ultrasound images, where each class represents a specific combination of imaging features; how to select the feature combination is key to diagnostic accuracy. In artificial-intelligence-based thyroid ultrasound diagnosis, model training depends on the size and quality of the data, and the resulting diagnosis lacks interpretability.
Artificial-intelligence-assisted ultrasound diagnosis can not only make ultrasound images clearer; it can also, on the one hand, free up capacity and improve efficiency and, on the other hand, diagnose nodule features in the ultrasound image.
However, existing artificial-intelligence-based thyroid ultrasound diagnosis techniques only locate and classify the nodule region; they lack corresponding image-findings and grading reports, cannot provide treatment and diagnosis suggestions, and therefore have limited clinical significance.
Disclosure of Invention
To solve the above technical problems, the ultrasound image analysis method provided herein reliably produces classification results and image findings based on the grading model; through continual learning, the grading model is continuously optimized on small samples and refined during use of the system, laying a solid foundation for better diagnosis suggestions.
In order to achieve the above purpose, the technical solution adopted in the embodiment of the present application is as follows:
In a first aspect, an ultrasound image analysis method is provided, the method comprising: acquiring an ultrasound image obtained by ultrasonically scanning the thyroid region of a subject; detecting a benign/malignant classification result for a nodule in the ultrasound image; performing semantic segmentation on the ultrasound image to obtain a nodule segmentation map; analyzing nodule feature information and image findings based on the nodule segmentation map; and, combining the nodule feature information with the benign/malignant classification result, determining the TI-RADS grade and malignancy probability of the nodule based on a trained TI-RADS scoring and grading model.
Further, detecting the benign/malignant classification result of the nodule in the ultrasound image comprises: extracting nodule features from the ultrasound image with a trained target detection model to obtain a nodule detection map, and processing the extracted nodule features to obtain a benign/malignant nodule probability value. The target detection model comprises a feature extraction network for extracting the nodule feature information from the ultrasound image; a feature fusion network for fusing the extracted nodule feature information to obtain a nodule fusion feature map; and a classification network for classifying the nodule fusion features to obtain nodule probability values under different classes, the classes comprising early-, middle-, and late-stage nodules.
Further, the feature extraction network comprises five convolution layers; the first convolution layer contains a Focus structure and an inner convolution layer connected to it. The Focus structure converts the width-height information of the ultrasound image into channels, and the inner convolution layer extracts a preliminary feature map containing shallow features from the ultrasound image via inner convolution.
Further, the convolution structure in the third convolution layer is a deformable convolution.
Further, the feature fusion network comprises a pyramid network structure that, starting from a plurality of feature maps ordered by semantic level, first fuses features bottom-up and then samples features top-down.
Further, the detection head comprises three yolo detection heads, which are used to detect nodules of different sizes.
Further, performing semantic segmentation on the ultrasound image to obtain the nodule segmentation map comprises: extracting features from the ultrasound image with the encoder of a U-Net network to obtain a feature map, and predicting an output for each pixel of the extracted feature map with the decoder of the U-Net network to obtain the segmented image, i.e., the nodule segmentation map.
Further, the U-Net network is constructed based on a valid convolution structure.
Further, the TI-RADS scoring and grading model comprises a trained nodule feature analysis model and a scoring model. The nodule feature analysis model performs feature analysis on the nodule segmentation map to obtain ultrasonic feature information, which is converted into ultrasound image feature information based on a CLIP language model; the scoring model combines the ultrasound image feature information with the benign/malignant nodule classification result under the TI-RADS scheme to obtain the TI-RADS grade and nodule malignancy probability.
Further, the scoring model is trained with a regularization-based continual learning method.
In a second aspect, an ultrasound image analysis system is provided, comprising: an ultrasonic probe; a transmitting circuit for exciting the ultrasonic probe to transmit ultrasonic waves to the thyroid region of the subject; a receiving circuit for controlling the ultrasonic probe to receive the ultrasonic echo returned from the thyroid region to obtain an ultrasonic echo signal; and a processor for performing the corresponding ultrasound image analysis method and determining the nodule TI-RADS grade and nodule malignancy probability.
In a third aspect, a computer-readable storage medium is provided, storing a computer program which, when executed by a processor, implements any of the methods above.
In the technical solution provided by the embodiments of the present application, ACR TI-RADS grading, which is widely used at home and abroad in the clinical diagnosis of thyroid nodule ultrasound images, is applied to artificial-intelligence-based clinical ultrasound diagnosis, forming a more objective and accurate basis for ultrasound diagnosis reports. Building on conventional artificial-intelligence medical imaging, this lays a solid foundation for mitigating the poor diagnostic performance caused by small data volumes, continuously refining the model during use of the system, and providing better diagnosis suggestions. The method addresses the technical problems of existing processing schemes, which lack corresponding image-findings and grading reports, produce diagnostic results that lack interpretability, cannot provide treatment and diagnosis suggestions, and therefore have limited clinical significance.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
The methods, systems, and/or programs in the accompanying drawings are further described in terms of exemplary embodiments, which are described in detail with reference to the drawings. These exemplary embodiments are non-limiting; like reference numbers represent like mechanisms throughout the several views of the drawings.
Fig. 1 is a schematic diagram of an ultrasound image analysis system according to an embodiment of the present application.
Fig. 2 is a flowchart of an ultrasound image analysis method provided in an embodiment of the present application.
Fig. 3 is a schematic diagram of a deformable convolution structure provided in an embodiment of the present application.
Fig. 4 is a schematic view of the attention layer structure provided in the embodiment of the present application.
Fig. 5 is a schematic diagram of a feature fusion network structure provided in an embodiment of the present application.
Fig. 6 is a schematic diagram of a hierarchical model structure of a TI-RADS scoring model according to an embodiment of the present application.
Fig. 7 is a schematic diagram of a scoring model of TI-RADS provided in an embodiment of the present application.
Detailed Description
For a better understanding of the above technical solutions, they are described in detail below with reference to the accompanying drawings and specific embodiments. It should be understood that the specific features of the embodiments of the present application are detailed illustrations of the technical solutions, not limitations on them, and the technical features of the embodiments may be combined with each other where no conflict arises.
In the following detailed description, numerous specific details are set forth by way of examples in order to provide a thorough understanding of the relevant teachings. However, it will be apparent to one skilled in the art that the present application may be practiced without these details. In other instances, well-known methods, procedures, systems, components, and/or circuits have been described at a relatively high-level, without detail, in order to avoid unnecessarily obscuring aspects of the present application.
Flowcharts are used in this application to describe operations performed by systems according to embodiments of the present application. It should be understood that the operations need not be performed exactly in the order shown; rather, they may be performed in reverse order or concurrently, other operations may be added to the flowcharts, and one or more operations may be removed from them.
Before describing embodiments of the present invention in further detail, the terms used in the embodiments are explained as follows.
(1) In response to: represents a condition or state on which a performed operation depends; when the condition or state is satisfied, the one or more operations performed may run in real time or with a set delay, and unless otherwise specified there is no restriction on their order of execution.
(2) Based on: represents a condition or state on which a performed operation depends; when the condition or state is satisfied, the one or more operations performed may run in real time or with a set delay, and unless otherwise specified there is no restriction on their order of execution.
The main application scenario of the technical solution provided by the embodiments of the present application is to analyze ultrasound images and identify the grade and pathological condition of the corresponding nodule. Ultrasound is one of the most common imaging modalities in clinical practice and, together with X-ray, CT, and magnetic resonance imaging, forms the four major imaging modalities. Ultrasound is relatively safe, low-cost, non-invasive, and real-time, and is widely applied; compared with medical images such as CT, however, it suffers from low imaging quality and high variability. Sonographers therefore need specific medical knowledge and experience, yet highly skilled sonographers are scarce: according to National Health Commission statistics, the national shortfall of sonographers is at least 150,000. Thyroid nodules are among the most common endocrine disorders in the population, and ultrasound is the most informative and widely used imaging means for thyroid and parathyroid assessment. But familiarity with ultrasound equipment, knowledge of normal neck anatomy, proficiency in ultrasound examination, accuracy of ultrasound-guided biopsy, and the ability to recognize image features and make clinical diagnostic assessments are all prerequisites for performing neck ultrasound. With artificial-intelligence-assisted ultrasound diagnosis, imaging can be made clearer; more importantly, applying artificial intelligence on the one hand frees up capacity and improves efficiency and on the other hand enables diagnosis of nodule features in the ultrasound image, compensating for the manual demands of neck ultrasound.
In the prior art, artificial-intelligence-based thyroid ultrasound diagnosis depends on the size and quality of the dataset used for model training, and the resulting diagnosis lacks interpretability, so researchers face the following issues in concrete implementations:
1. Problems in implementing the artificial intelligence technology: ultrasound images have low quality and severe noise; nodule shapes are irregular and boundaries are blurred; ultrasound image datasets are scarce, and data acquisition and annotation are costly; diagnostic accuracy for early-stage thyroid nodules is low, and detection of benign nodules is lacking; images and diagnostic results generally deviate from clinical diagnostic standards;
2. Problems in ultrasound medicine: familiarity with ultrasound equipment and ultrasound examination is low; knowledge of normal neck anatomy is limited; image feature recognition and clinical diagnostic evaluation capabilities are uneven; medical resources are unevenly distributed and highly skilled sonographers are scarce; diagnosis depends on manual work and physician experience.
The key issue behind these problems is that existing artificial-intelligence-based thyroid ultrasound diagnosis only locates and classifies the nodule region; it lacks corresponding image-findings and grading reports, lacks interpretability of the diagnostic results, cannot provide treatment and diagnosis suggestions, and has limited clinical significance.
Against the above technical background, the present embodiment provides an ultrasound image analysis system, including:
an ultrasonic probe;
the transmitting circuit is used for exciting the ultrasonic probe to transmit ultrasonic waves to a thyroid region of a tested object;
a receiving circuit for controlling the ultrasonic probe to receive ultrasonic echo returned from the thyroid region to obtain an ultrasonic echo signal;
a processor for performing a corresponding ultrasound image analysis method, determining a nodule TI-RADS classification and a nodule malignancy probability.
In view of the above technical background and the corresponding ultrasound image analysis system, the ultrasound image analysis method classifies nodules in ultrasound images, grades nodules based on TI-RADS, and determines the final probability of nodule malignancy. In this embodiment, nodule classification is based on the medical definitions of various nodule properties, specifically including but not limited to the following: the composition of the nodule, the echogenicity of the nodule, the shape of the nodule, the margin of the nodule, and the echogenic foci of the nodule, with different medical criteria under each definition. For example, composition may be cystic or almost completely cystic, spongiform, or mixed cystic and solid; echogenicity includes anechoic, hyperechoic or isoechoic, hypoechoic, and very hypoechoic; shape includes wider-than-tall and taller-than-wide; margin includes smooth, ill-defined, lobulated or irregular, and extrathyroidal extension; echogenic foci include none or large comet-tail artifacts, macrocalcifications, peripheral (rim) calcifications, and punctate echogenic foci. When assessing a nodule, each finding is scored accordingly, the total score determines the TI-RADS grade, and specific medical advice is given.
In the ultrasound image analysis method provided by this embodiment, the acquired ultrasound image is processed by computer technology to obtain the findings, and the corresponding features in the image are scored according to the image processing results to determine the final nodule type and score.
The method specifically comprises the following steps:
and S210, obtaining an ultrasonic image by carrying out ultrasonic scanning on a region to be detected of the detected object.
In this embodiment, the ultrasound image is acquired by ultrasonic scanning, and the region to be scanned is determined by the target disease; for example, nodules of thyroid disease are obtained by scanning over the thyroid region. This embodiment mainly identifies disease corresponding to thyroid nodules: the scanned region is the thyroid region, and the identified nodule type is the thyroid nodule.
Step S220-1: detect the benign/malignant classification result of the nodule in the ultrasound image.
In this embodiment, benign/malignant nodules in the ultrasound image are recognized with a trained target detection model. Specifically, nodule features are extracted from the ultrasound image by the target detection model to obtain a nodule detection map, and the extracted nodule features are processed to obtain benign/malignant nodule probability values.
The target detection model comprises a feature extraction network, a feature fusion network and a detection head, wherein the detection head comprises a classification network.
Specifically, the feature extraction network extracts the nodule feature information from the ultrasound image; the feature fusion network fuses the extracted nodule feature information to obtain nodule fusion features; and the detection head classifies the nodule fusion features to obtain nodule probability values under different classes. In this embodiment, the classes comprise early-, middle-, and late-stage nodules, and this staging is mainly used to determine the degree of benignity or malignancy of the nodule.
In this embodiment, the feature extraction network comprises five basic convolution-layer stages, i.e., the convolution processing of stages S1-S5. In stage S1, the width-height information of the thyroid ultrasound image is first converted into channels using a Focus structure in the convolution layer, and shallow features with strong spatial specificity are then extracted using inner convolution. In stage S2, the width and height of the feature map are further compressed and the number of channels further expanded. Stages S3, S4, and S5 each generate a feature map used to construct the feature fusion network, which has a pyramid structure and fuses feature information of different scales through bottom-up and top-down connections.
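As context for the Focus operation just mentioned, it is a space-to-depth slicing that moves width-height information into channels, (H, W, C) -> (H/2, W/2, 4C). A minimal sketch follows; the input shape is illustrative, not taken from the patent:

```python
# A minimal sketch of Focus slicing: every second pixel is sampled into four
# sub-images that are concatenated on the channel axis.
import torch

def focus_slice(x: torch.Tensor) -> torch.Tensor:
    # x: (N, C, H, W) with even H and W
    return torch.cat(
        [x[..., ::2, ::2], x[..., 1::2, ::2], x[..., ::2, 1::2], x[..., 1::2, 1::2]],
        dim=1,
    )

img = torch.randn(1, 1, 640, 640)   # a single-channel ultrasound frame (illustrative)
print(focus_slice(img).shape)       # torch.Size([1, 4, 320, 320])
```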
In this embodiment, the Focus structure slices the input image to reduce the amount of computation. However, the resolution of real ultrasound images is low, and directly using the Focus-processed image for subsequent convolutions would impair the extraction of edge information such as lines and textures in the shallow convolution layers. Therefore, to obtain more detailed information from the ultrasound image, this embodiment introduces an inner convolution layer after the Focus structure. Specifically, for the $H \times W \times 4C$ feature map $X$ generated by the Focus structure, the inner convolution layer first maps each feature vector $v_{ij}$ of $X$ to a vector $F \in \mathbb{R}^{k^2}$. Next, $F$ is reshaped to obtain a convolution kernel $Z$ of size $k \times k \times 1$. Then the kernel $Z$ is multiplied with the feature vectors of the $k \times k$ region around $v_{ij}$ to obtain a feature tensor $F_p$ of size $k \times k \times 4C$. Finally, the $k \times k$ $4C$-dimensional feature vectors in $F_p$ are summed to obtain the inner convolution result $y_{ij} \in \mathbb{R}^{4C}$. Through this operation, a convolution kernel is generated from the channels that contain width-height information and is used to aggregate the feature vectors of the region, yielding a preliminary feature map. In this embodiment, the preliminary feature map is a spatially specific feature map $Y$ that provides richer detail for subsequent convolution operations.
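The inner convolution just described matches the involution pattern: a per-position kernel is generated from the channel vector and applied to its k×k neighborhood. A minimal sketch follows; the 1×1 convolution used here to generate the kernel is an assumption for illustration, as the patent only states that the kernel is derived from the channel vector:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class InnerConv(nn.Module):
    """Per-position kernel generated from the channel vector (involution-style)."""
    def __init__(self, channels: int, k: int = 3):
        super().__init__()
        self.k = k
        self.to_kernel = nn.Conv2d(channels, k * k, 1)   # v_ij -> k*k kernel Z

    def forward(self, x):
        n, c, h, w = x.shape
        kernel = self.to_kernel(x).view(n, 1, self.k * self.k, h, w)
        # Gather the k*k neighborhood of every position.
        patches = F.unfold(x, self.k, padding=self.k // 2)       # (n, c*k*k, h*w)
        patches = patches.view(n, c, self.k * self.k, h, w)
        # Multiply kernel with the region's vectors and sum over the k*k positions.
        return (kernel * patches).sum(dim=2)                     # (n, c, h, w)

y = InnerConv(4)(torch.randn(1, 4, 320, 320))  # Focus output from the sketch above
```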
In this embodiment, the receptive field in stage S3 of the feature extraction network is small, and this stage is mainly responsible for detecting small targets. However, the conventional convolution used in S3 samples a fixed rectangular area, so the extracted early-nodule features contain a large amount of surrounding tissue information, degrading early-nodule detection. Therefore, the embodiment of the present application introduces deformable convolution in stage S3 to flexibly handle irregularly shaped early nodules and extract more accurate early-nodule features, breaking the limitation of conventional convolution.
Specifically, referring to Fig. 3, the deformable convolution first applies a conventional convolution to obtain an offset field $\Delta$ whose width and height are the same as those of the input feature map and whose number of channels is twice the kernel size, i.e., $2 \times k \times k$; each pixel of $\Delta$ represents the offsets of the $k \times k$ sampling points. The sampling area is then shifted according to the values of the pixels in $\Delta$ (the middle grid in the figure). Finally, convolution is performed over the adjusted sampling area (the blue grid positions in the figure), forming the deformable convolution. The process can be expressed mathematically as follows.

First, a set of sampling positions $R$ is constructed, as shown below, where each element of $R$ represents the relative position of a sampling point; taking a $3 \times 3$ convolution region as an example, $(-1,-1)$ denotes the upper-left corner of the sampling area:

$$R = \{(-1,-1),\ (-1,0),\ \ldots,\ (0,1),\ (1,1)\}$$

The following convolution is then performed over $R$. For the sampling area centered at $p_0$, an offset $\Delta p_n$ is first added to each sampling point $p_n$, increasing the flexibility of the sampling locations, and the convolution is then evaluated to obtain the pixel value $y(p_0)$:

$$y(p_0) = \sum_{p_n \in R} w(p_n) \cdot x(p_0 + p_n + \Delta p_n)$$

Since sampling positions in the feature map are integers while positions after adding $\Delta p_n$ are generally floating-point, the deformable convolution samples the input feature map by bilinear interpolation:

$$x(p) = \sum_{q} G(q, p) \cdot x(q)$$

where, for the floating-point sampling point $p$, the actually existing pixel values $x(q)$ are taken from its four neighbors, the weight of each $x(q)$ is computed by the bilinear kernel $G(\cdot,\cdot)$, and the weighted sum is used as the pixel value $x(p)$ at position $p$ in the input feature map.
In this embodiment, deformable convolution samples more flexibly than conventional convolution, adapts better to early nodules of varying shapes in thyroid ultrasound images, and extracts more accurate early-nodule features, thereby improving the model's detection of early nodules.
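For reference, torchvision ships an op implementing the sampling formula above; a minimal sketch follows, where the offset-branch design and layer sizes are illustrative, since the patent does not specify them:

```python
# A minimal deformable-convolution sketch using torchvision's built-in op.
import torch
import torch.nn as nn
from torchvision.ops import deform_conv2d

class DeformableConv2d(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, k: int = 3, padding: int = 1):
        super().__init__()
        # Conventional conv predicts 2*k*k offsets (dx, dy per sampling point).
        self.offset_conv = nn.Conv2d(in_ch, 2 * k * k, k, padding=padding)
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, k, k) * 0.01)
        self.padding = padding

    def forward(self, x):
        offset = self.offset_conv(x)          # (N, 2*k*k, H, W)
        return deform_conv2d(x, offset, self.weight, padding=self.padding)

feat = torch.randn(1, 64, 40, 40)             # an S3-stage feature map (illustrative)
out = DeformableConv2d(64, 64)(feat)          # same spatial size, flexible sampling
```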
In this embodiment, the detection head comprises three Yolo detection heads used to detect nodules of different scales, i.e., early-, middle-, and late-stage nodules.
Unlike images of natural scenes, thyroid ultrasound images contain heavy noise and complex background information. As the number of network layers grows, this redundant information is treated equally in subsequent feature maps, hampering the characterization capability of the whole network and affecting the model's overall nodule detection.
Referring to fig. 4, in this embodiment the feature extraction network initially gathers information from the thyroid ultrasound images and outputs three feature maps as inputs to the feature fusion network. To make the feature fusion network generate more effective features, this embodiment introduces a CBAM hybrid attention mechanism between the feature extraction network and the feature fusion network. With CBAM, each feature map produced by the feature extraction network is weighted in the spatial and channel directions before being fed into the feature fusion network, suppressing noise and complex background information while strengthening the relationships of features across channels and space, which improves nodule detection in thyroid ultrasound images.
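A minimal CBAM sketch of the channel-then-spatial weighting just described; the reduction ratio and the 7×7 spatial kernel are the common defaults from the CBAM paper, not values specified by the patent:

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Channel attention followed by spatial attention (Woo et al., 2018)."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels),
        )
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        n, c, _, _ = x.shape
        # Channel attention: shared MLP over average- and max-pooled descriptors.
        avg = self.mlp(x.mean(dim=(2, 3)))
        mx = self.mlp(x.amax(dim=(2, 3)))
        x = x * torch.sigmoid(avg + mx).view(n, c, 1, 1)
        # Spatial attention: conv over channel-wise mean and max maps.
        s = torch.cat([x.mean(1, keepdim=True), x.amax(1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))

weighted = CBAM(128)(torch.randn(1, 128, 80, 80))  # illustrative backbone output
```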
Referring to fig. 5, in this embodiment the feature fusion network is a special pyramid network structure that receives the three feature maps output by the feature extraction network and processes features bottom-up and top-down. In the bottom-up pass, the high-semantic feature map is upsampled and fused with feature maps rich in low-level semantic information, yielding multi-scale features that combine abstract and detailed information. In the top-down pass, secondary features are extracted from the feature maps to strengthen the association between features of different scales and achieve deep feature fusion. Finally, three fused feature maps are output for the subsequent classification and regression tasks.
Specifically, the feature fusion network mainly addresses the multi-scale problem in target detection: by changing the network connections of the special pyramid structure, detection performance is greatly improved at almost no additional computational cost. Using this pyramid-structured feature fusion network, shallow features, which carry little semantic information but accurate target positions, are fused with high-level features, which carry rich semantic information but coarse target positions, improving the representational power of the features and enhancing target detection; a sketch of one fusion pass follows.
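A classic FPN-style sketch of one half of this fusion (the top-down traversal); channel widths are illustrative, and the bottom-up aggregation pass described above would add a mirrored traversal:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FPNFuse(nn.Module):
    """Top-down fusion of three backbone maps (channel widths illustrative)."""
    def __init__(self, chs=(128, 256, 512), out_ch=128):
        super().__init__()
        self.lateral = nn.ModuleList(nn.Conv2d(c, out_ch, 1) for c in chs)
        self.smooth = nn.ModuleList(nn.Conv2d(out_ch, out_ch, 3, padding=1) for _ in chs)

    def forward(self, c3, c4, c5):
        # High-semantic map is upsampled and added to lower, detail-rich maps.
        p5 = self.lateral[2](c5)
        p4 = self.lateral[1](c4) + F.interpolate(p5, scale_factor=2, mode="nearest")
        p3 = self.lateral[0](c3) + F.interpolate(p4, scale_factor=2, mode="nearest")
        return [s(p) for s, p in zip(self.smooth, (p3, p4, p5))]

c3, c4, c5 = (torch.randn(1, c, s, s) for c, s in [(128, 80), (256, 40), (512, 20)])
p3, p4, p5 = FPNFuse()(c3, c4, c5)  # three fused maps for the detection heads
```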
In this embodiment, the classification network comprises the three yolo detection heads used to detect nodules at different stages, i.e., early, middle, and late. The detection head in this embodiment may adopt the detection head of the existing yolo algorithm.
The probability values corresponding to benign and malignant nodules obtained by the above processing serve as the benign/malignant classification result, which is used as input to the subsequent processing.
Step S220-2: perform semantic segmentation on the ultrasound image to obtain a nodule segmentation map, and analyze the nodule feature information based on the nodule segmentation map.
In this embodiment, semantic segmentation of the ultrasound image is mainly performed with a U-Net network, on which an encoder and a decoder are configured: the encoder extracts features from the ultrasound image, and the decoder recovers the details and edges of the feature maps obtained by the encoder, the whole forming a U-shaped structure. Finally, each pixel of the feature map is predicted and output, yielding the segmented image. Through the encoder, the decoder, and the U-Net structure, this embodiment combines contextual semantics; it trains quickly and needs little data, matching the pain points of medical image segmentation.
In this embodiment, the U-Net network adopts skip-connection feature fusion by "stitching": features are concatenated along the channel dimension to form thicker features, ensuring that the final recovered feature map fuses more low-level semantic features. Fusing features of different scales lets the segmentation recover the fine edge information that the medical imaging field cares about most. U-Net also adds feature channels during upsampling, allowing more of the original image texture to propagate and restoring lesion contour details. In addition, because U-Net uses valid convolutions throughout, the spatial-context features that must not be lost in the segmentation result are preserved. The valid convolution mode is an existing technique and is not described further in this embodiment. Moreover, when the U-Net network is used for semantic segmentation of thyroid nodules, the network depth and kernel sizes can be flexibly adjusted to the requirements of the relevant private dataset, so the adapted segmentation network can be applied to medical datasets with smaller sample sizes.
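A minimal sketch of one decoder step combining the "stitching" concatenation and valid convolutions described above; channel sizes are illustrative:

```python
import torch
import torch.nn as nn

class UpBlock(nn.Module):
    """U-Net decoder step: upsample, crop-and-concatenate encoder features, convolve."""
    def __init__(self, in_ch: int, skip_ch: int, out_ch: int):
        super().__init__()
        self.up = nn.ConvTranspose2d(in_ch, in_ch // 2, 2, stride=2)
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch // 2 + skip_ch, out_ch, 3), nn.ReLU(),  # valid conv
            nn.Conv2d(out_ch, out_ch, 3), nn.ReLU(),                # valid conv
        )

    def forward(self, x, skip):
        x = self.up(x)
        # Valid convolutions shrink the maps, so center-crop the encoder features.
        dh = (skip.shape[2] - x.shape[2]) // 2
        dw = (skip.shape[3] - x.shape[3]) // 2
        skip = skip[:, :, dh:dh + x.shape[2], dw:dw + x.shape[3]]
        return self.conv(torch.cat([skip, x], dim=1))  # channel-wise "stitching"

dec = UpBlock(256, 128, 128)
out = dec(torch.randn(1, 256, 28, 28), torch.randn(1, 128, 64, 64))
```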
Step S230: combine the nodule feature information with the benign/malignant nodule classification result, and determine the nodule TI-RADS grade and nodule malignancy probability based on a pre-trained TI-RADS scoring and grading model.
In this embodiment, the TI-RADS scoring and grading model comprises a trained nodule feature analysis model and a scoring model. The nodule feature analysis model performs feature analysis on the nodule segmentation map to obtain ultrasonic feature information, which is converted into ultrasound image feature information based on a CLIP language model; the scoring model combines the ultrasound image feature information with the benign/malignant nodule classification result under the TI-RADS scheme to obtain the TI-RADS grade and nodule malignancy probability. CLIP (Contrastive Language-Image Pre-Training) is a multimodal pre-training model that processes text and image data simultaneously and links them together. The goal of CLIP is to let the model understand the relationship between text and images so that it can identify the content of an image and generate an appropriate textual description. CLIP is trained without manual labels on large amounts of paired text and image data: during training, an image and a text passage are fed into the model, which is optimized by computing the similarity between them. Once trained, the CLIP model can be used for tasks such as image classification, image retrieval, and text classification. Unlike other image recognition models, CLIP needs no annotation data, so it can be applied in any field without spending large amounts of time and effort collecting and annotating data.
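The patent does not spell out how CLIP is invoked; one plausible sketch matches TI-RADS feature phrases against a segmented nodule crop in zero-shot fashion. The candidate phrases and the file name are illustrative assumptions:

```python
# A hedged sketch of scoring TI-RADS feature phrases against a nodule crop with
# a pretrained CLIP model via HuggingFace transformers.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

texts = ["a solid hypoechoic thyroid nodule", "a spongiform thyroid nodule"]
image = Image.open("nodule_crop.png")  # hypothetical segmented-nodule crop

inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    probs = model(**inputs).logits_per_image.softmax(dim=-1)
print(dict(zip(texts, probs[0].tolist())))
```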
The grading evaluation here follows the ACR TI-RADS method, a widely used clinical standard for thyroid nodule ultrasound diagnosis; according to this standard, the sonographer gives corresponding diagnostic advice.
Referring to fig. 6, the TI-RADS grading contains five scored categories, namely composition, echogenicity, shape, margin, and echogenic foci, plus one conditional item, nodule size. The total of the five category scores determines the TI-RADS level: TR1 (benign), TR2 (not suspicious), TR3 (mildly suspicious for malignancy), TR4 (moderately suspicious for malignancy), and TR5 (highly suspicious for malignancy); further diagnostic advice is given in combination with the nodule size.
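The total-score-to-level mapping can be written down directly from the published ACR TI-RADS chart; in this sketch a 1-point total is treated as TR2, a choice the chart leaves implicit:

```python
def tirads_level(composition: int, echogenicity: int, shape: int,
                 margin: int, echogenic_foci: int) -> str:
    """Map the five ACR TI-RADS category points to a TR level."""
    total = composition + echogenicity + shape + margin + echogenic_foci
    if total == 0:
        return "TR1 (benign)"
    if total <= 2:
        return "TR2 (not suspicious)"
    if total == 3:
        return "TR3 (mildly suspicious)"
    if total <= 6:
        return "TR4 (moderately suspicious)"
    return "TR5 (highly suspicious)"

# e.g. solid (2) + very hypoechoic (3) + taller-than-wide (3) + smooth (0) + none (0)
print(tirads_level(2, 3, 3, 0, 0))  # TR5 (highly suspicious)
```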
For the nodule feature analysis model, the TI-RADS grade and nodule malignancy probability are obtained from the thyroid nodule features and the benign/malignant classification results produced in steps S210 and S220. Because the scoring model initially has few samples, a continual learning or zero-shot learning method is designed to compensate for the insufficient dataset.
In this embodiment a continual learning method is adopted: a new grade and malignancy probability are obtained from the extracted features and the benign/malignant classification result, predictions are then continuously updated through continual learning, and manually identified features can be added according to actual use to improve model accuracy. The mathematical formulation of continual learning is as follows:
Continual learning starts from an initial, non-incremental stage $S_0$, whose model $M_0$ is trained on the dataset $D_0 = \{(X_i, Y_i);\ i = 1, 2, \ldots, P_0\}$, where $X_i$ and $Y_i$ denote the sample set and label set of the $i$-th class and $P_0$ is the number of classes trained in stage $S_0$. A continual learning process with $T$ stages comprises the initial stage and $T-1$ incremental stages; incremental stage $S_t$ introduces the new data $D_t = \{(X_i, Y_i);\ i = N_{t-1}+1, \ldots, N_{t-1}+P_t\}$, so that the model $M_t$ can recognize $N_t = P_0 + P_1 + \cdots + P_t$ classes.
In this embodiment, the data processing of the proposed method may involve patient privacy and privacy-sensitive application scenarios, and historical data may be unavailable in subsequent training for privacy reasons, so a regularization-based continual learning method is more suitable. Regularization-based continual learning requires no storage of old samples and is realized by adding a regularization term to the training loss function.
The algorithm proceeds as follows:
(1) Select the weights that are more important for the old task.
(2) Rank the weights by importance.
(3) During optimization, constrain the more important weights to change less, ensuring they vary only within a small range so that the old task is not strongly affected.
The specific method is as follows: the posterior probability of the model is fitted with a Gaussian distribution whose mean is the weights of the old task and whose variance is the reciprocal of the diagonal elements of the Fisher information matrix; the variance represents the importance of each weight.
For the model, the posterior probability of the parameters $\theta$ given the dataset $D$ is obtained with the Bayesian formula, and the importance of $\theta$ is computed from this probability:

$$\log p(\theta \mid D) = \log p(D \mid \theta) + \log p(\theta) - \log p(D)$$

The importance of each parameter is estimated by the diagonal of the Fisher information matrix:

$$F_i = \mathbb{E}\left[\left(\frac{\partial \log p(D \mid \theta)}{\partial \theta_i}\right)^2\right]$$
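A minimal sketch of this regularization-based scheme in the spirit of elastic weight consolidation; the λ value and the Fisher estimation loop are generic choices, not taken from the patent:

```python
import torch

def fisher_diagonal(model, loader, loss_fn):
    """Estimate diagonal Fisher information from squared gradients on the old task."""
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    for x, y in loader:
        model.zero_grad()
        loss_fn(model(x), y).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                fisher[n] += p.grad.detach() ** 2
    return {n: f / max(len(loader), 1) for n, f in fisher.items()}

def ewc_penalty(model, fisher, old_params, lam=1000.0):
    """Quadratic penalty: important weights are pulled toward their old values."""
    loss = 0.0
    for n, p in model.named_parameters():
        loss = loss + (fisher[n] * (p - old_params[n]) ** 2).sum()
    return 0.5 * lam * loss

# Training on a new increment: total loss = new-task loss + ewc_penalty(...),
# where old_params holds the weights frozen after the previous stage.
```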
the following describes each component of the processor in detail:
In the present embodiment, the processor is an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present application, for example one or more digital signal processors (DSPs) or one or more field programmable gate arrays (FPGAs).
Alternatively, the processor may perform its functions, such as the method shown in fig. 2 described above, by running or executing a software program stored in memory and invoking data stored in memory.
In a particular implementation, the processor may include one or more microprocessors, as one embodiment.
The memory is configured to store a software program for executing the solution of the present application, and the processor is used to control the execution of the software program, and the specific implementation manner may refer to the above method embodiment, which is not described herein again.
Alternatively, the memory may be a read-only memory (ROM) or another type of static storage device capable of storing static information and instructions, a random access memory (RAM) or another type of dynamic storage device capable of storing information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, digital versatile discs, Blu-ray discs, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, without limitation. The memory may be integrated with the processor or may exist separately and be coupled to the processing unit through the processor's interface circuit, which is not specifically limited in the embodiments of the present application.
It should be noted that the processor structure shown in this embodiment does not constitute a limitation on the apparatus; an actual apparatus may include more or fewer components than shown, combine some components, or arrange components differently.
In addition, the technical effects of the processor may refer to the technical effects of the method described in the foregoing method embodiments, which are not described herein.
It should be appreciated that the processor in embodiments of the present application may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
It should also be appreciated that the memory in embodiments of the present application may be volatile or non-volatile, or include both. The non-volatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), used as an external cache. By way of example and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct rambus RAM (DR RAM).
The above embodiments may be implemented in whole or in part by software, hardware (e.g., circuitry), firmware, or any combination thereof. When implemented in software, the above embodiments may be realized in whole or in part as a computer program product comprising one or more computer instructions or programs. When the computer instructions or program are loaded or executed on a computer, the processes or functions described in the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another by wire (e.g., coaxial cable, optical fiber, digital subscriber line) or wirelessly (e.g., infrared, radio, microwave). The computer-readable storage medium may be any available medium accessible to a computer, or a data storage device such as a server or data center containing one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., a solid-state disk).
In the present application, "at least one" means one or more, and "a plurality" means two or more. "At least one of" the following items or the like means any combination of these items, including any combination of single items or plural items. For example, at least one of a, b, and c may represent: a; b; c; a and b; a and c; b and c; or a, b, and c; where a, b, and c may each be single or plural.
It should be understood that, in various embodiments of the present application, the sequence numbers of the foregoing processes do not mean the order of execution, and the order of execution of the processes should be determined by the functions and internal logic thereof, and should not constitute any limitation on the implementation process of the embodiments of the present application.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, and are not repeated herein.
In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the methods described in the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a read-only memory (ROM), a random access memory (random access memory, RAM), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
The foregoing is merely specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily think about changes or substitutions within the technical scope of the present application, and the changes and substitutions are intended to be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A method of ultrasound image analysis, the method comprising:
acquiring an ultrasonic image obtained by ultrasonically scanning the thyroid region of a subject; detecting a benign/malignant classification result for a nodule in the ultrasonic image; performing semantic segmentation on the ultrasonic image to obtain a nodule segmentation map; analyzing nodule feature information and image findings based on the nodule segmentation map; and, combining the nodule feature information with the benign/malignant classification result, determining the nodule TI-RADS grade and malignancy probability based on a trained TI-RADS scoring and grading model.
2. The ultrasound image analysis method according to claim 1, wherein detecting the benign/malignant classification result of the nodule in the ultrasound image comprises:
extracting nodule features from the ultrasonic image with a trained target detection model to obtain a nodule detection map, and processing the extracted nodule features to obtain a benign/malignant nodule probability value; the target detection model comprising a feature extraction network for extracting the nodule feature information from the ultrasonic image; a feature fusion network for fusing the extracted nodule feature information to obtain nodule fusion features; and a detection head for classifying the nodule fusion features to obtain nodule probability values under different classes, the classes comprising early-, middle-, and late-stage nodules.
3. The ultrasonic image analysis method according to claim 2, wherein the feature extraction network comprises five convolution layers, the first convolution layer containing a Focus structure and an inner convolution layer connected to it; the Focus structure converts the width-height information of the ultrasonic image into channels, and the inner convolution layer extracts a preliminary feature map containing shallow features from the ultrasonic image via inner convolution.
4. The method of ultrasound image analysis according to claim 3, wherein the convolution structure in the third convolution layer is a deformable convolution.
5. The method of claim 3, wherein the feature fusion network comprises a pyramid network structure that, starting from a plurality of the feature maps ordered by semantic level, first fuses features bottom-up and then samples features top-down.
6. The method of claim 3, wherein the detection head comprises three yolo detection heads configured to detect nodules of different scales.
7. The method of claim 3, wherein semantically segmenting the ultrasonic image to obtain the nodule segmentation map comprises: extracting features from the ultrasonic image with the encoder of a U-Net network to obtain a feature map, and predicting an output for each pixel of the extracted feature map with the decoder of the U-Net network to obtain the segmented image, i.e., the nodule segmentation map.
8. The ultrasound image analysis method of claim 7, wherein the U-Net network is constructed based on a valid convolution structure.
9. The ultrasonic image analysis method according to claim 7, wherein the TI-RADS scoring and grading model comprises a trained nodule feature analysis model and a scoring model; the nodule feature analysis model performs feature analysis on the nodule segmentation map to obtain ultrasonic feature information and converts it into ultrasound image feature information based on a CLIP language model; and the scoring model combines the ultrasound image feature information with the benign/malignant nodule classification result under the TI-RADS scheme to obtain the TI-RADS grade and nodule malignancy probability.
10. The ultrasound image analysis method according to claim 9, wherein the training of the scoring model is performed based on a regularized continuous learning method.
CN202310250057.6A 2023-03-15 2023-03-15 Ultrasonic image analysis method Pending CN116416221A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310250057.6A CN116416221A (en) 2023-03-15 2023-03-15 Ultrasonic image analysis method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310250057.6A CN116416221A (en) 2023-03-15 2023-03-15 Ultrasonic image analysis method

Publications (1)

Publication Number Publication Date
CN116416221A true CN116416221A (en) 2023-07-11

Family

ID=87058953

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310250057.6A Pending CN116416221A (en) 2023-03-15 2023-03-15 Ultrasonic image analysis method

Country Status (1)

Country Link
CN (1) CN116416221A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116934738A (en) * 2023-08-14 2023-10-24 威朋(苏州)医疗器械有限公司 Organ and nodule joint segmentation method and system based on ultrasonic image
CN116934738B (en) * 2023-08-14 2024-03-22 威朋(苏州)医疗器械有限公司 Organ and nodule joint segmentation method and system based on ultrasonic image

Similar Documents

Publication Publication Date Title
CN109993726B (en) Medical image detection method, device, equipment and storage medium
WO2020224406A1 (en) Image classification method, computer readable storage medium, and computer device
Singh et al. Machine learning in cardiac CT: basic concepts and contemporary data
CN110930416B (en) MRI image prostate segmentation method based on U-shaped network
Oghli et al. Automatic fetal biometry prediction using a novel deep convolutional network architecture
CN112132959B (en) Digital rock core image processing method and device, computer equipment and storage medium
CN111539930A (en) Dynamic ultrasonic breast nodule real-time segmentation and identification method based on deep learning
Singh et al. A quantum-clustering optimization method for COVID-19 CT scan image segmentation
US20230281809A1 (en) Connected machine-learning models with joint training for lesion detection
Li et al. Automated measurement network for accurate segmentation and parameter modification in fetal head ultrasound images
Bhadoria et al. Comparison of segmentation tools for multiple modalities in medical imaging
Hussein et al. Fully‐automatic identification of gynaecological abnormality using a new adaptive frequency filter and histogram of oriented gradients (HOG)
WO2022110525A1 (en) Comprehensive detection apparatus and method for cancerous region
CN116416221A (en) Ultrasonic image analysis method
CN103169506A (en) Ultrasonic diagnosis device and method capable of recognizing liver cancer automatically
Xing et al. Automatic detection of A‐line in lung ultrasound images using deep learning and image processing
CN111128348A (en) Medical image processing method, device, storage medium and computer equipment
WO2021032325A1 (en) Updating boundary segmentations
CN116129184A (en) Multi-phase focus classification method, device, equipment and readable storage medium
Dai et al. More reliable AI solution: Breast ultrasound diagnosis using multi-AI combination
CN112862786B (en) CTA image data processing method, device and storage medium
CN112862785B (en) CTA image data identification method, device and storage medium
CN111768367B (en) Data processing method, device and storage medium
CN115393246A (en) Image segmentation system and image segmentation method
Ren et al. Automated segmentation of left ventricular myocardium using cascading convolutional neural networks based on echocardiography

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination