CN116433607B - Bone age assessment method and system for X-ray images of hand bones of children based on double weighted fusion of key region features - Google Patents

Bone age assessment method and system for X-ray images of hand bones of children based on double weighted fusion of key region features

Info

Publication number
CN116433607B
CN116433607B (application CN202310261475.5A)
Authority
CN
China
Prior art keywords
region
bone
network
hand bone
key
Prior art date
Legal status
Active
Application number
CN202310261475.5A
Other languages
Chinese (zh)
Other versions
CN116433607A (en)
Inventor
郑欣
田博
李娟
周頔
唐成玉
陶安位
张兴宇
Current Assignee
SICHUAN UNIVERSITY OF ARTS AND SCIENCE
Original Assignee
SICHUAN UNIVERSITY OF ARTS AND SCIENCE
Priority date
Filing date
Publication date
Application filed by SICHUAN UNIVERSITY OF ARTS AND SCIENCE
Priority to CN202310261475.5A
Publication of CN116433607A
Application granted
Publication of CN116433607B
Legal status: Active
Anticipated expiration


Classifications

    • G06T7/0012 Biomedical image inspection
    • G06N3/08 Neural networks; learning methods
    • G06V10/25 Determination of region of interest [ROI] or volume of interest [VOI]
    • G06V10/40 Extraction of image or video features
    • G06V10/806 Fusion of extracted features
    • G06V10/82 Image or video recognition using neural networks
    • G06T2207/10116 X-ray image
    • G06T2207/20081 Training; learning
    • G06T2207/20084 Artificial neural networks [ANN]
    • G06T2207/20104 Interactive definition of region of interest [ROI]
    • G06T2207/30008 Bone
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Medical Informatics (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Quality & Reliability (AREA)
  • Radiology & Medical Imaging (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a bone age assessment method for children's hand bone X-ray images based on double weighted fusion of key region features, which comprises the following steps: constructing a bone age prediction network comprising a hand bone region screening sub-network and a bone age prediction sub-network, where the bone age prediction sub-network comprises a complete hand bone region learning network and key hand bone region learning networks; acquiring children's hand bone X-ray images and extracting, with the hand bone region screening sub-network, the complete hand bone region and the key hand bone regions; sending the extracted complete hand bone region into the complete hand bone region learning network for feature extraction and scale transformation; sending the different key hand bone regions into their respective key hand bone region learning networks through a region weight configuration module, which configures a corresponding weight for each key hand bone region, and performing feature extraction and scale transformation; and fusing the features extracted by the complete hand bone region learning network and the key hand bone region learning networks with sex information to output the bone age assessment result. The method achieves better bone age prediction performance.

Description

Bone age assessment method and system for X-ray images of hand bones of children based on double weighted fusion of key region features
Technical Field
The invention relates to the technical field of artificial intelligence computer vision and deep learning image processing, in particular to a method and a system for assessing bone age of a child hand bone X-ray image based on double weighted fusion of key region features.
Background
Bone age assessment of children is widely applied in pediatric clinical diagnosis, prediction of children's adult height, sports competition and other fields. Since the hand and wrist best represent skeletal development and growth potential, hand and wrist radiographs are the most commonly used basis for children's bone age assessment.
Traditional bone age assessment methods fall into two types: the Greulich-Pyle atlas method and the Tanner-Whitehouse scoring method. Both require doctors to evaluate bone age by reading the morphological features of epiphyseal and metaphyseal development, and demand high expertise of the evaluator. In addition, manual film reading, analysis and deduction of bone age have obvious drawbacks such as long time consumption, large error and poor consistency: for the same hand bone X-ray image, the judgments made by the same reader at different times, or by different readers, can differ greatly.
With the development of computer vision technology, intelligent bone age assessment methods have gradually emerged. Early intelligent methods automated the extraction of the features used in manual assessment and were mostly tested on proprietary datasets, but the accuracy of the resulting predictions was poor. In recent years, deep learning has made breakthrough progress in the field of computer vision. As a representative deep learning method, the convolutional neural network (CNN) can automatically and rapidly extract key features from images, replacing traditional feature extraction based on prior knowledge. Research on CNN-based bone age assessment of children has grown rapidly and has improved assessment accuracy, but shortcomings remain, mostly because the structural advantages of the backbone network are not fully exploited and the semantic features of the hand bone region are not fully learned. To obtain more accurate bone age assessment results, a method with higher accuracy is urgently needed.
Disclosure of Invention
In order to overcome the defects in the prior art, the invention aims to provide a bone age assessment method for a children hand bone X-ray image based on double weighted fusion of key region features.
In order to achieve the above purpose, the present invention provides a bone age assessment method for children's hand bone X-ray images based on double weighted fusion of key region features, comprising the following steps:
acquiring an X-ray image set of the hand bones of the child;
constructing a bone age prediction network, wherein the bone age prediction network comprises a hand bone region screening sub-network and a bone age prediction sub-network, the bone age prediction sub-network comprises a complete hand bone region learning network and a key hand bone region learning network, and the key hand bone region learning network comprises a wrist joint region learning network and/or a finger joint region learning network;
the method comprises the steps of collecting X-ray images of the hand bones of children in a hand bone region screening sub-network to extract complete hand bone regions and key hand bone regions, wherein the key hand bone regions comprise wrist joint regions and/or finger joint regions, sending the extracted complete hand bone regions into the complete hand bone region learning network to extract characteristics and scale transformation, sending different key hand bone regions into different key hand bone region learning networks through a region weight configuration module, configuring corresponding weights for the key hand bone regions, extracting the characteristics and the scale transformation, fusing the characteristics extracted by the complete hand bone region learning network and the key hand bone region learning network with sex information, and outputting bone age assessment results.
By fusing the feature information of the key hand bone regions in the hand bone X-ray image, the method improves performance; by taking sex information into account, it compensates for the physiological bone age difference between males and females. The method can fully extract the effective features of children's hand bone X-ray images, improve bone age assessment accuracy and obtain better bone age prediction performance.
A preferred scheme of the bone age assessment method for children's hand bone X-ray images based on double weighted fusion of key region features is as follows: the complete hand bone region learning network comprises M variable convolution network modules, the wrist joint region learning network comprises N variable convolution network modules, and the finger joint region learning network comprises P variable convolution network modules;
the number of the network layers of each variable convolution network module is a plurality of layers, each variable convolution network module comprises a plurality of layers of perception attention modules and a plurality of layers of aggregation attention modules, and the plurality of layers of perception attention modules and the plurality of layers of aggregation attention modules are connected in parallel or in series to capture key effective characteristics in a hand bone region.
This preferred scheme introduces the multi-layer perceptual attention module and the aggregation attention module into the variable convolution network to improve the backbone network, so that the bone age prediction sub-network can refine features along different dimensions and thereby extract deeper, more effective key features for characterizing bone age.
In the preferred embodiment, the feature map input to the multi-layer perceptual attention module is defined as CF_IN, with dimensions H×W×C. When CF_IN enters the module, max pooling and average pooling first generate two feature maps of dimension 1×1×C: max pooling preserves the locally typical features of the image, while average pooling preserves its overall distribution information. The two 1×1×C feature maps are then compressed by a shared multi-layer perceptron (MLP); the Sigmoid-normalized outputs of the two MLP paths are added to obtain the multi-layer perceptual attention weight A_MPAM, which is multiplied with the input CF_IN to give the module output CF_OUT:

CF_OUT = CF_IN × A_MPAM = CF_IN × (F_s(F_mlp(F_max(CF_IN))) ⊕ F_s(F_mlp(F_avg(CF_IN)))),

where F_s denotes Sigmoid normalization, F_mlp the MLP operation, F_max the max pooling operation, F_avg the average pooling operation, and ⊕ element-wise addition of the feature channels (the number of channels is unchanged).
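The channel-attention computation above can be sketched in numpy. This is an illustrative sketch, not the patent's implementation: the MLP is assumed to be a two-layer reduction-style perceptron with ReLU, and its reduction ratio r is an assumption the patent does not state.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mpam(cf_in, w1, w2):
    """Multi-layer perceptual attention sketch: max/avg pooled channel
    descriptors pass through a shared MLP; each path is Sigmoid-normalized
    and the two paths are added (per the module's formula), then the
    resulting channel weights re-scale the input feature map."""
    f_max = cf_in.max(axis=(0, 1))    # local typical features, shape (C,)
    f_avg = cf_in.mean(axis=(0, 1))   # overall distribution, shape (C,)
    mlp = lambda v: np.maximum(v @ w1, 0.0) @ w2   # shared two-layer MLP
    a_mpam = sigmoid(mlp(f_max)) + sigmoid(mlp(f_avg))   # weight per channel
    return cf_in * a_mpam.reshape(1, 1, -1)

H, W, C, r = 4, 4, 8, 2            # r: assumed MLP reduction ratio
rng = np.random.default_rng(0)
cf = rng.standard_normal((H, W, C))
w1 = 0.1 * rng.standard_normal((C, C // r))
w2 = 0.1 * rng.standard_normal((C // r, C))
out = mpam(cf, w1, w2)
assert out.shape == cf.shape
```

Because each Sigmoid lies in (0, 1), the summed per-channel weight lies in (0, 2), so the output is a bounded re-weighting of the input channels.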
The convolution kernel size k of the multi-layer perceptual attention module is adaptively selected from the number of channels C of the input feature map, where ⟨·⟩_odd denotes the odd number closest to the result of the operation.
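The ⟨·⟩_odd operator can be illustrated with a small helper. The tie-breaking rule (rounding toward the larger odd number when two odd numbers are equidistant) is an assumption; the text only states "the odd number closest to the result of the operation".

```python
def nearest_odd(x: float) -> int:
    """Return the odd integer nearest to x (the <.>_odd operator).

    Ties between two equidistant odd numbers are broken upward (assumed),
    and the result is clamped to be at least 1, since a convolution
    kernel size must be a positive odd integer.
    """
    k = int(round(x))
    if k % 2 == 0:                  # rounded to an even number: step to
        k = k + 1 if x >= k else k - 1   # the odd neighbour on x's side
    return max(k, 1)

assert nearest_odd(3.8) == 3
assert nearest_odd(4.2) == 5
assert nearest_odd(7.0) == 7
```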
The multi-layer perceptual attention module adaptively calibrates the channel feature weights, enhancing the network's ability to extract channel features; it identifies the significant channels in the feature map and provides enhanced key feature information to the aggregation attention module.
In the preferred embodiment, the feature map input to the aggregation attention module is defined as SF_IN, with dimensions H×W×C'. When SF_IN enters the module, max pooling and average pooling first generate two feature maps, which are concatenated into a position weight of scale H×W×2; a v×v dimensionality-reducing convolution followed by Sigmoid normalization then yields the aggregation attention weight A_FAM, which is multiplied with the input SF_IN to give the module output SF_OUT:

SF_OUT = SF_IN × A_FAM = SF_IN × F_s(F_{v×v,conv}(F_max(SF_IN) ⊙ F_avg(SF_IN))),

where F_s denotes Sigmoid normalization, F_{v×v,conv} the v×v dimensionality-reducing convolution, F_max the max pooling operation, F_avg the average pooling operation, and ⊙ channel concatenation (the number of channels changes).
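The spatial-attention computation above can likewise be sketched in numpy. For brevity the v×v dimensionality-reducing convolution is approximated by its v = 1 case (a per-position linear map from 2 channels to 1) — an assumption made only to keep the sketch short.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fam(sf_in, conv_w):
    """Aggregation (focusing) attention sketch: channel-wise max and average
    pooling form an HxWx2 position weight; a 1x1 stand-in for the vxv
    reducing convolution plus Sigmoid yields a spatial attention map that
    re-weights every position of the input feature map."""
    f_max = sf_in.max(axis=2)                 # (H, W) channel-wise max pool
    f_avg = sf_in.mean(axis=2)                # (H, W) channel-wise avg pool
    pos = np.stack([f_max, f_avg], axis=2)    # position weight, (H, W, 2)
    a_fam = sigmoid(pos @ conv_w)             # (H, W) spatial attention
    return sf_in * a_fam[..., None]

rng = np.random.default_rng(2)
sf = rng.standard_normal((5, 5, 6))
out = fam(sf, np.array([0.5, 0.5]))
assert out.shape == sf.shape
```

Since the Sigmoid output lies in (0, 1), every spatial position of the input is attenuated according to how salient its pooled descriptors are.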
The convolution kernel size v of the aggregation attention module is adaptively selected from the number of channels C' of the input feature map, where ⟨·⟩_odd denotes the odd number closest to the result of the operation.
The aggregation attention module focuses the network's feature extraction on meaningful positions, improving the network's sensitivity to the key feature areas in the hand bone X-ray image.
Another preferred scheme of the bone age assessment method for children's hand bone X-ray images based on double weighted fusion of key region features is as follows: when the different key hand bone regions are sent into their respective key hand bone region learning networks through the region weight configuration module and corresponding weights are configured for them, the configured weights include a wrist joint region weight-information imbalance adjustment factor IUAF_w and a finger joint region weight-information imbalance adjustment factor IUAF_f.
The wrist joint region weight-information imbalance adjustment factor IUAF_w and the finger joint region weight-information imbalance adjustment factor IUAF_f are computed from γ_h, γ_w and γ_f, the cumulative gradients of the complete hand bone region, the wrist joint region and the finger joint region respectively; four associated threshold parameters truncate the information imbalance adjustment factors, and IUAF_max is the maximum value of the information imbalance adjustment factor.
The wrist joint region factor IUAF_w and the finger joint region factor IUAF_f assign a larger weight, during forward training, to samples whose key hand bone regions carry less obvious information (difficult samples). Based on this estimate of the information richness of the key hand bone regions, greater attention is paid to difficult samples; compared with ordinary difficult samples, learning of difficult samples within the key regions is emphasized.
Another preferred scheme of the bone age assessment method for children's hand bone X-ray images based on double weighted fusion of key region features concerns the loss function of the bone age prediction sub-network. In the loss function, N is the number of samples, ŷ_i is the bone age predicted by the bone age prediction sub-network and y_i the corresponding ground-truth label; l_h^i, l_w^i and l_f^i denote the loss contributions of the complete hand bone region, the wrist joint region and the finger joint region in the i-th sample; IUAF_w and IUAF_f are the wrist joint region and finger joint region weight-information imbalance adjustment factors; and CUWF_h, CUWF_w and CUWF_f are the class-imbalance weighting factors of the complete hand bone region, the wrist joint region and the finger joint region respectively.
Introducing IUAF_w and IUAF_f into the loss function gives difficult samples a larger weight when the backward error is computed, so that, based on the predicted information richness of the key hand bone regions, more attention is paid to difficult samples: the loss contributions of different samples are rebalanced and the contribution of difficult samples is significantly increased. The loss function also introduces class-imbalance weighting factors, so that the training process focuses on the imbalance in the numbers of key regions. The loss function thus redistributes the weights of its components, simultaneously addressing the information imbalance between the complete hand bone region and the key hand bone regions and the sample imbalance among the key hand bone regions.
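The doubly weighted combination of the region loss terms can be sketched as follows. The patent's exact closed form is not reproduced in this text, so the multiplicative per-sample combination below is an assumed reading consistent with the terms listed: class-imbalance factors (CUWF) scale each region's contribution, and information-imbalance factors (IUAF) further up-weight difficult samples in the key regions.

```python
import numpy as np

def bone_age_loss(l_h, l_w, l_f, iuaf_w, iuaf_f, cuwf_h, cuwf_w, cuwf_f):
    """Assumed sketch of the doubly weighted loss, averaged over N samples.

    l_h, l_w, l_f: per-sample loss contributions of the complete hand bone,
    wrist joint and finger joint regions. iuaf_*: per-sample information
    imbalance adjustment factors (0 when the region is not effective).
    cuwf_*: scalar class-imbalance weighting factors.
    """
    per_sample = (cuwf_h * l_h
                  + iuaf_w * cuwf_w * l_w
                  + iuaf_f * cuwf_f * l_f)
    return float(per_sample.mean())

# toy per-sample region losses for N = 3 samples
l_h = np.array([1.0, 2.0, 3.0])
l_w = np.array([0.5, 0.5, 0.5])
l_f = np.array([0.2, 0.4, 0.6])
iuaf_w = np.array([1.0, 1.5, 0.0])   # 0.0: no effective wrist region
iuaf_f = np.array([1.2, 0.0, 1.0])   # 0.0: no effective finger region
loss = bone_age_loss(l_h, l_w, l_f, iuaf_w, iuaf_f, 2.0, 1.0, 1.0)
assert loss > 0
```

Setting an IUAF value to zero removes that region's contribution for the sample, which matches the notion of "effective" regions used by the weighting factors below.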
In the preferred embodiment, the class-imbalance weighting factor of the complete hand bone region is CUWF_h = Key_N, where Key_N is the number of key hand bone regions set; the class-imbalance weighting factor CUWF_w of the wrist joint region and the class-imbalance weighting factor CUWF_f of the finger joint region are computed as follows:
where T_h, T_w and T_f are respectively the numbers of complete hand bone regions, effective wrist joint regions and effective finger joint regions in the whole training set, and ⟨·⟩ denotes rounding. The number of complete hand bone regions T_h equals the number of X-ray images in the training set; the number of effective wrist joint regions T_w equals the number of training images whose wrist joint region adjustment factor satisfies IUAF_w > 0; and the number of effective finger joint regions T_f equals the number of training images whose finger joint region adjustment factor satisfies IUAF_f > 0.
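The counting rules just stated can be sketched directly; the formulas combining these counts into CUWF_w and CUWF_f appear only as images in the original, so only the quantities they consume are computed here (the IUAF values are illustrative toy data).

```python
import numpy as np

# Per-image adjustment factors for a toy training set of 5 images.
# A value of 0 means the region is not effective in that image.
iuaf_w = np.array([0.0, 0.5, 1.2, 0.0, 0.7])
iuaf_f = np.array([0.3, 0.0, 0.8, 0.9, 0.0])

key_n = 2                                  # wrist + finger key regions
T_h = len(iuaf_w)                          # every image has a complete hand bone region
T_w = int(np.count_nonzero(iuaf_w > 0))    # images with IUAF_w > 0
T_f = int(np.count_nonzero(iuaf_f > 0))    # images with IUAF_f > 0
cuwf_h = key_n                             # CUWF_h = Key_N, as stated

assert (T_h, T_w, T_f) == (5, 3, 3)
```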
The invention also provides a children's hand bone X-ray image bone age assessment system comprising an image receiving module, a processing module and a storage module. The image receiving module receives images for training or assessment and sends them to the processing module; the processing module is communicatively connected with the storage module, which stores at least one executable instruction that causes the processing module to execute, on the received images, the operations of the above bone age assessment method based on double weighted fusion of key region features. The system has all the advantages of the method.
The beneficial effects of the invention are as follows: the hand bone region screening sub-network removes the background area of the original image and screens out the key hand bone regions, so that the bone age prediction sub-network can focus on the hand bone regions containing key features, improving bone age assessment performance; and the bone age prediction sub-network introduces an attention mechanism comprising the multi-layer perceptual attention module and the aggregation attention module, enabling the network to refine features along several independent dimensions and extract deeper, more effective features.
Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
The foregoing and/or additional aspects and advantages of the invention will become apparent and may be better understood from the following description of embodiments taken in conjunction with the accompanying drawings in which:
FIG. 1 is a schematic view of the overall framework of the present invention;
FIG. 2 is a hand bone X-ray image;
FIG. 3 is a schematic diagram of a hand bone region screening sub-network structure;
FIG. 4 is a schematic diagram of a bone age prediction sub-network;
FIG. 5 is a schematic diagram of the weight configuration of a region weight configuration module to a key hand bone region in a bone age prediction sub-network;
FIG. 6 (a) is a schematic diagram of MPAM and FAM parallel connections;
FIG. 6 (b) is a schematic diagram of a serial connection of MPAM and FAM;
FIG. 6 (c) is another schematic diagram of a serial connection of MPAM and FAM;
FIG. 7 is a schematic block diagram of an MPAM module;
fig. 8 is a schematic diagram of a FAM module structure;
FIG. 9 is a plot of age and gender distribution in the RSNA dataset;
FIG. 10 is a comparative graph showing the effect of sex factors on bone age assessment.
Detailed Description
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative only and are not to be construed as limiting the invention.
In the description of the present invention, unless otherwise specified and defined, the terms "mounted," "connected," and "coupled" are to be construed broadly: a connection may, for example, be mechanical or electrical, direct or indirect through intermediaries, or an internal communication between two elements. Those skilled in the art will understand the specific meaning of these terms according to the context.
As shown in FIG. 1, the invention provides a method for assessing the bone age of children's hand bone X-ray images based on double weighted fusion of key region features, comprising the following steps:
and acquiring an X-ray image set of the hand bones of the child. The pediatric hand bone X-ray image set in this example was taken from a pediatric bone age challenge game public dataset held by the north american radiology society (Radiological Society of North America, RSNA) 2017. The data set comprises 14236X-ray images of the bones of the hands of the children for 1-228 months, wherein the training set comprises 12611 pieces, the verification set comprises 1425 pieces, the test set comprises 200 pieces, and the details of the data set are shown in fig. 9. These images were labeled by six radiologists with bone age and sex provided by clinical radiology reports. The training set is used for model training, the verification set is used for monitoring the training process and feeding back real-time training performance, and the test set evaluates the model after training.
The bone age prediction network is constructed and comprises a hand bone region screening sub-network and a bone age prediction sub-network, wherein the bone age prediction sub-network comprises a complete hand bone region learning network and a key hand bone region learning network, and the key hand bone region learning network comprises a wrist joint region learning network and/or a finger joint region learning network.
The complete hand bone region of a child's hand bone X-ray image contains the effective information required for bone age assessment, as shown in box A of FIG. 2, among which the information in the wrist joint region (box B of FIG. 2) and the finger joint regions (box C of FIG. 2) is the most critical. Therefore, the acquired children's hand bone X-ray images are processed by the hand bone region screening sub-network to extract the complete hand bone region and the key hand bone regions, the key hand bone regions comprising the wrist joint region and/or the finger joint regions.
Mask R-CNN is a two-stage segmentation framework: the first stage scans the image and generates candidate regions via a region proposal network (RPN), and the second stage classifies the proposals and generates bounding boxes and masks. As shown in FIG. 3, Mask R-CNN extends the object detection framework Faster R-CNN by adding a mask-prediction branch built from fully convolutional networks (FCN). It also introduces a feature pyramid network (FPN) to fuse low-level and high-level feature maps so as to fully exploit the features of every stage. In addition, the RoI Pooling layer of Faster R-CNN is replaced by a RoI Align layer, which uses bilinear interpolation to obtain pixel values at floating-point coordinates, thereby resolving the misalignment introduced by RoI Pooling.
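The bilinear interpolation at the heart of RoI Align can be shown in a few lines; this sketch samples a single floating-point coordinate from a 2-D image, which is the operation RoI Align performs at each of its sampling points to avoid RoI Pooling's coordinate quantization.

```python
import numpy as np

def bilinear_sample(img, y, x):
    """Bilinear interpolation of img at the floating-point coordinate (y, x)."""
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1 = min(y0 + 1, img.shape[0] - 1)       # clamp to the image border
    x1 = min(x0 + 1, img.shape[1] - 1)
    dy, dx = y - y0, x - x0
    return ((1 - dy) * (1 - dx) * img[y0, x0] + (1 - dy) * dx * img[y0, x1]
            + dy * (1 - dx) * img[y1, x0] + dy * dx * img[y1, x1])

img = np.array([[0.0, 1.0], [2.0, 3.0]])
v = bilinear_sample(img, 0.5, 0.5)   # centre of the four pixels -> their mean
assert abs(v - 1.5) < 1e-12
```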
As shown in FIG. 4, the complete hand bone region learning network comprises M variable convolution network modules (AMVCN), the wrist joint region learning network comprises N AMVCN modules, and the finger joint region learning network comprises P AMVCN modules; each AMVCN module contains several network layers, and M, N and P are positive integers. In this embodiment, M is preferably, but not limited to, 4, and N and P are preferably, but not limited to, 3. That is, the complete hand bone region passes through four AMVCN modules (AMVCN[11], AMVCN[12], AMVCN[13], AMVCN[14]), whose numbers of network layers (i.e., numbers of convolution units) are Num_11, Num_12, Num_13 and Num_14, with preferred values 4, 9, 4; the wrist joint region passes through three AMVCN modules (AMVCN[21], AMVCN[22], AMVCN[23]), whose numbers of network layers Num_21, Num_22 and Num_23 preferably take the values 4, 9, 4 respectively; and the finger joint region passes through three AMVCN modules (AMVCN[31], AMVCN[32], AMVCN[33]), whose numbers of network layers Num_31, Num_32 and Num_33 preferably take the values 4, 9, 4 respectively. Finally, the features extracted by the complete hand bone region learning network and the key hand bone region learning networks (i.e., the wrist joint region and finger joint region learning networks) are fused with sex information, and the bone age assessment result is output through two densely connected layers and one fully connected layer.
In this embodiment, the variable convolution network module AMVCN is an attention-based variable convolution network module formed by introducing a dual attention mechanism, namely a multi-layer perceptual attention module (MPAM) and an aggregate attention module (focusing attention module, FAM), into the variable convolution network VCN. The MPAM and FAM modules can be connected in parallel or in series to capture the key effective features in the hand bone region; FIGS. 6(a) to 6(c) give three connection schemes, serial and parallel. The input feature F_in is divided into G sub-feature groups, the channel features and spatial features of each group of feature maps are learned within the groups, and the feature F_out is output. Table 1 describes the operation flow of the bone age prediction sub-network and the scale transformation of the feature maps. Each layer of each variable convolution network module AMVCN in the table is formed by connecting Conv1X1, Conv3X3 and Conv1X1 modules in series (two 1X1 convolution modules and one 3X3 convolution module), followed by a multi-layer perceptual attention module MPAM and an aggregate attention module FAM. For example, let F_O denote the output feature vector of the variable convolution network module AMVCN and F_I its input feature vector; then:
F_O = <A_MPAM, A_FAM>(F_CBlock(F_I)), where F_CBlock denotes the series connection of the 3 convolution layers (i.e., Conv1X1, Conv3X3 and Conv1X1) applied to the input feature vector, and <A_MPAM, A_FAM> denotes the connection of the multi-layer perceptual attention module MPAM with the aggregate attention module FAM. When the connection adopts the preferred scheme shown in FIG. 6(c) (MPAM module in series with the FAM module), the expression is <A_MPAM, A_FAM> = A_MPAM × A_FAM, where A_MPAM is the attention weight of the multi-layer perceptual attention module MPAM and A_FAM is the attention weight of the aggregate attention module FAM.
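A structural sketch of one AMVCN layer as described (Conv1X1, Conv3X3, Conv1X1 in series, then MPAM and FAM attention in the serial scheme of FIG. 6(c)) is given below in PyTorch. The class name and the default identity attention blocks are illustrative placeholders, not the patent's implementation:

```python
import torch
import torch.nn as nn

class AMVCNLayer(nn.Module):
    """One layer of an AMVCN module: a Conv1x1 -> Conv3x3 -> Conv1x1 block
    (the F_CBlock of the description), followed by channel attention (MPAM)
    and spatial attention (FAM) in series. The attention blocks are passed
    in by the caller; they default to identity in this structural sketch."""
    def __init__(self, channels, mpam=None, fam=None):
        super().__init__()
        self.cblock = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.Conv2d(channels, channels, kernel_size=1),
        )
        self.mpam = mpam if mpam is not None else nn.Identity()
        self.fam = fam if fam is not None else nn.Identity()

    def forward(self, x):
        # F_O = <A_MPAM, A_FAM>(F_CBlock(F_I)), serial scheme of FIG. 6(c)
        out = self.cblock(x)
        out = self.mpam(out)
        out = self.fam(out)
        return out

x = torch.randn(1, 64, 32, 32)
print(AMVCNLayer(64)(x).shape)  # torch.Size([1, 64, 32, 32])
```

In the serial scheme the two attention weights multiply the features in turn, which is what <A_MPAM, A_FAM> = A_MPAM × A_FAM expresses.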
TABLE 1 network structure of bone age prediction subnetwork
The following is a description of the multi-layer awareness attention module MPAM, the aggregate attention module FAM.
Define the feature map input to the multi-layer perceptual attention module MPAM as CF_IN, with dimension H×W×C. As shown in FIG. 7, when CF_IN enters the MPAM module, it first passes through max pooling and average pooling to generate two feature maps of scale 1×1×C: max pooling retains the local typical features of the image, while average pooling retains its overall distribution information. The two 1×1×C feature maps are then each compressed by a multi-layer perceptron (MLP) to reduce parameter overhead, the two MLP outputs are added and Sigmoid-normalized to obtain the multi-layer perceptual attention weight A_MPAM, and A_MPAM is multiplied with the input CF_IN to obtain the output feature CF_OUT of the MPAM module.
The calculation process is as follows:
CF_OUT = CF_IN × A_MPAM = CF_IN × F_s(F_mlp(F_max(CF_IN)) ⊕ F_mlp(F_avg(CF_IN))), where F_s denotes Sigmoid normalization, F_mlp the MLP operation, F_max the max pooling operation, F_avg the average pooling operation, and ⊕ the element-wise addition of feature channels (the number of channels is unchanged). To achieve appropriate cross-channel interaction, the convolution kernel size k of the multi-layer perceptual attention module MPAM is adaptively selected from the channel number C of the input feature map CF_IN so as to aggregate similar features at different spatial positions; the correspondence between the two is shown in the following formula:
where <·>_odd denotes the odd number closest to the result of the operation.
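A minimal NumPy sketch of the MPAM computation described above. The shared two-layer MLP weights are hypothetical random placeholders; the flow follows the prose: max/avg pooling to 1×1×C, shared MLP, element-wise addition, Sigmoid, then channel-wise rescaling of the input:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mpam(cf_in, w1, w2):
    """Multi-layer perceptual attention (channel attention) sketch.
    cf_in: (H, W, C) feature map; w1: (C, C//r) and w2: (C//r, C) are the
    shared MLP weights that compress and then restore the channel dimension."""
    max_vec = cf_in.max(axis=(0, 1))    # 1x1xC: local typical features
    avg_vec = cf_in.mean(axis=(0, 1))   # 1x1xC: overall distribution
    attn = sigmoid(max_vec @ w1 @ w2 + avg_vec @ w1 @ w2)  # A_MPAM, shape (C,)
    return cf_in * attn                 # broadcast over H and W

rng = np.random.default_rng(0)
cf_in = rng.standard_normal((8, 8, 16))
w1 = rng.standard_normal((16, 4)) * 0.1   # compress C -> C/r (r = 4 assumed)
w2 = rng.standard_normal((4, 16)) * 0.1   # restore C/r -> C
cf_out = mpam(cf_in, w1, w2)
print(cf_out.shape)  # (8, 8, 16)
```

Because the attention weights lie in (0, 1), the module rescales each channel of CF_IN without changing the feature-map dimensions.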
Define the feature map input to the aggregate attention module FAM as SF_IN, with dimension H×W×C′. As shown in FIG. 8, when SF_IN enters the FAM module, it first passes through max pooling and average pooling to generate two feature maps, which are concatenated into a feature map of dimension H×W×2; this is then reduced by a v×v convolution and Sigmoid-normalized to obtain the aggregate attention weight A_FAM, and A_FAM is multiplied with the input SF_IN to obtain the output SF_OUT of the aggregate attention module. The calculation process is as follows:
SF_OUT = SF_IN × A_FAM = SF_IN × F_s(F_{v×v,conv}(F_max(SF_IN) ⊙ F_avg(SF_IN))), where F_s denotes Sigmoid normalization, F_{v×v,conv} the v×v convolution dimensionality-reduction operation, F_max the max pooling operation, F_avg the average pooling operation, and ⊙ the channel concatenation (the number of channels changes). To achieve an effective spatial aggregation effect, the convolution kernel size v of the aggregate attention module FAM is adaptively selected from the channel number C′ of the input feature map so as to aggregate similar features at different spatial positions; the correspondence between the two is shown in the following formula:
where <·>_odd denotes the odd number closest to the result of the operation.
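A minimal NumPy sketch of the FAM computation, with a hypothetical random v×v kernel: the channel-wise max and average maps are stacked into H×W×2, convolved down to a single channel, Sigmoid-normalized, and used to rescale the input spatially:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fam(sf_in, kernel):
    """Aggregate (focusing) attention, a spatial-attention sketch.
    sf_in: (H, W, C') feature map; kernel: (v, v, 2) convolution weights
    reducing the stacked max/avg maps to one spatial attention map."""
    max_map = sf_in.max(axis=2)
    avg_map = sf_in.mean(axis=2)
    stacked = np.stack([max_map, avg_map], axis=2)   # H x W x 2
    v = kernel.shape[0]
    pad = v // 2
    padded = np.pad(stacked, ((pad, pad), (pad, pad), (0, 0)))
    h, w = max_map.shape
    att = np.empty((h, w))
    for i in range(h):                               # 'same' v x v convolution
        for j in range(w):
            att[i, j] = np.sum(padded[i:i + v, j:j + v] * kernel)
    att = sigmoid(att)                               # A_FAM, H x W in (0, 1)
    return sf_in * att[:, :, None]                   # rescale each position

rng = np.random.default_rng(1)
sf_in = rng.standard_normal((8, 8, 16))
kernel = rng.standard_normal((3, 3, 2)) * 0.1        # v = 3 assumed
print(fam(sf_in, kernel).shape)  # (8, 8, 16)
```

Where MPAM weights channels, FAM weights spatial positions; in the serial scheme both rescalings are applied in turn.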
As can be seen from the above, in this embodiment, the number of network layers of each variable convolution network module AMVCN is variable, and the scale of convolution kernels in the multi-layer perception attention module MPAM and the aggregation attention module FAM in the variable convolution network module AMVCN is variable.
After the bone age prediction network is constructed, as shown in FIG. 5, the extracted complete hand bone region is sent to the complete hand bone region learning network for feature extraction and scale transformation, while the different key hand bone regions are routed to the different key hand bone region learning networks through a region weight configuration module RWC, which assigns corresponding weights to each key hand bone region learning network (i.e., the wrist joint region learning network and the finger joint region learning network). The configured weights include the wrist joint region weight information imbalance adjustment factor IUAF_w and the finger joint region weight information imbalance adjustment factor IUAF_f; each factor indicates whether the bone age prediction sub-network uses the corresponding key region of a sample and, if so, the attention weight of that region during forward propagation of the model and in the loss function.
The wrist joint region weight information imbalance adjustment factor IUAF_w and the finger joint region weight information imbalance adjustment factor IUAF_f are calculated respectively as follows:
where γ_h, γ_w and γ_f are the cumulative gradients of the complete hand bone region, the wrist joint region and the finger joint region respectively; the four related threshold parameters used to truncate the information imbalance adjustment factors are preferably but not limited to 0.3 and 0.1 respectively; and IUAF_max, the maximum value of the information imbalance adjustment factor, is preferably but not limited to 10.
When the wrist joint region weight information imbalance adjustment factor IUAF_w = 0, the bone age prediction sub-network does not use the corresponding wrist joint region of the sample; when the finger joint region weight information imbalance adjustment factor IUAF_f = 0, the bone age prediction sub-network does not use the corresponding finger joint region of the sample.
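The exact IUAF formulas appear only as images in the source. The sketch below is therefore a hypothetical reconstruction matching the described behavior: the factor is derived from cumulative gradients, truncated to 0 below a threshold (meaning the region is not used), and capped at IUAF_max. The function name and the specific ratio form are assumptions, not the patent's formula:

```python
def iuaf(gamma_region, gamma_hand, t_low=0.1, iuaf_max=10.0):
    """Hypothetical information imbalance adjustment factor.
    gamma_region: cumulative gradient of the key region (wrist or finger);
    gamma_hand: cumulative gradient of the complete hand bone region.
    Below t_low the factor is truncated to 0 (region unused); otherwise the
    factor grows with the region's relative information, capped at iuaf_max."""
    ratio = gamma_region / max(gamma_hand, 1e-12)
    if ratio < t_low:
        return 0.0
    return min(ratio / t_low, iuaf_max)

print(iuaf(0.05, 1.0))  # region carries too little information: 0.0
print(iuaf(2.0, 1.0))   # clipped at the cap: 10.0
```

Whatever the precise form, the two limiting behaviors (truncation to 0 and the IUAF_max cap) are what the loss function below relies on.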
Considering that the appearance of ossification centers and the fusion of the metaphyses occur 1-2 years later in males than in females, sex information is added during the training of the bone age prediction sub-network, as shown in FIGS. 4 and 5. The sex feature (1 for male, 0 for female) is fused, through a dense connection layer with 32 neurons, with the image features output by the complete hand bone region learning network and the key hand bone region learning networks, followed by two dense connection units. In this embodiment, each dense connection unit consists of a densely connected layer with 1024 neurons, a ReLU activation layer and a Dropout(0.2) layer, and the final layer is a fully connected layer formed by one linearly activated neuron that predicts the bone age. The two dense connection units provide more learnable parameters for the bone age prediction sub-network so that it can be tuned during training and the accuracy of bone age assessment improved, while the added Dropout layers prevent overfitting and improve the generalization ability of the network.
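The sex-fusion head described above can be sketched in PyTorch as follows. The image-feature width (512) is an assumed placeholder; the 32-neuron sex layer, the two 1024-neuron dense units with ReLU and Dropout(0.2), and the single linearly activated output neuron follow the description:

```python
import torch
import torch.nn as nn

class BoneAgeHead(nn.Module):
    """Fusion head sketch: the sex feature (1 male, 0 female) passes through
    a 32-neuron dense layer, is concatenated with the image features, then
    two dense units (1024 neurons + ReLU + Dropout(0.2)) and one linearly
    activated neuron predicting the bone age. feat_dim is an assumption."""
    def __init__(self, feat_dim=512):
        super().__init__()
        self.sex_fc = nn.Linear(1, 32)
        self.dense = nn.Sequential(
            nn.Linear(feat_dim + 32, 1024), nn.ReLU(), nn.Dropout(0.2),
            nn.Linear(1024, 1024), nn.ReLU(), nn.Dropout(0.2),
        )
        self.out = nn.Linear(1024, 1)   # linear activation: predicted bone age

    def forward(self, image_feat, sex):
        s = self.sex_fc(sex)
        x = torch.cat([image_feat, s], dim=1)
        return self.out(self.dense(x))

feats = torch.randn(2, 512)             # fused region features (placeholder)
sex = torch.tensor([[1.0], [0.0]])      # one male, one female sample
print(BoneAgeHead()(feats, sex).shape)  # torch.Size([2, 1])
```

The output is a single continuous value per sample (bone age in months), which is why the final neuron is linearly activated rather than softmax-normalized.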
For the loss function, this embodiment proposes the information- and class-double-weighted root mean square error DWIE-RMSE: the information imbalance adjustment factor IUAF and the class imbalance weighting factor CUWF form, within the root mean square error RMSE, a pair of adjustment factors addressing the imbalance of estimated information and the imbalance of key region classes. By introducing these double adjustment factors, the bone age prediction sub-network redistributes the weights of the individual terms in the loss function, simultaneously alleviating the information imbalance between the complete hand bone region and the key regions and the sample imbalance between the key regions. The double adjustment factors introduced are:
information imbalance modifier (information unbalance adjustment factor, IUAF): according to the estimated information proportion of the whole hand bone region and the key region, firstly determining whether the key region is used, then determining the forward transmission of the key region in the model, and calculating the weight of the loss function. In this embodiment, the above-mentioned wrist region weight information imbalance adjustment factor IUAF is adopted w And the imbalance adjustment factor IUAF of the weight information of the finger joint area f
Class imbalance weighting factor (CUWF): increases the influence of minority classes in the key regions, ensuring that the loss contribution of minority-class samples is not overwhelmed by majority-class samples.
The DWIE-RMSE enables the model to dynamically adjust each sample's loss contribution according to its estimated information and corresponding class state, so this embodiment takes the DWIE-RMSE as the loss function of the bone age prediction sub-network; the calculation method is as follows:
where N is the number of samples, ŷ_i is the bone age predicted by the bone age prediction sub-network, y_i is the corresponding labeled true value, and the three loss terms represent the loss contributions of the complete hand bone region, the wrist joint region and the finger joint region in the i-th sample respectively. IUAF_w and IUAF_f are the wrist joint and finger joint region weight information imbalance adjustment factors: samples whose key regions carry less obvious information (difficult samples) receive a greater weight during forward training and in the backward error computation, so that, based on the estimated information richness of the key regions, the information imbalance adjustment factors rebalance the loss contributions of different samples, markedly increase the loss contribution of difficult samples, and focus learning on the difficult-sample classes of the key regions more than on ordinary difficult samples. CUWF_h, CUWF_f and CUWF_w are the class imbalance weighting factors of the complete hand bone region, the wrist joint region and the finger joint region, used to increase the influence of minority classes in the key regions and ensure that the loss contribution of minority-class samples is not overwhelmed by majority-class samples.
Specifically, the class imbalance weighting factor of the complete hand bone region is CUWF_h = Key_n, where Key_n is the number of key hand bone regions set, here preferably but not limited to 2. The class imbalance weighting factors of the wrist joint region and of the finger joint region are computed from T_h, T_f and T_w, the numbers of complete hand bone regions, effective wrist joint regions and effective finger joint regions in the whole training set respectively, where <·> denotes rounding. The number of complete hand bone regions T_h equals the number of X-ray images in the training set; the number of effective wrist joint regions T_f equals the number of X-ray images in the training set whose wrist joint region weight information imbalance adjustment factor satisfies IUAF_w > 0; and the number of effective finger joint regions T_w equals the number of X-ray images in the training set whose finger joint region weight information imbalance adjustment factor satisfies IUAF_f > 0.
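The DWIE-RMSE formula itself is an image in the source, so the sketch below is a hypothetical assembly of the named ingredients: per-region squared-error contributions weighted by the CUWF class factors and the IUAF information factors, averaged over samples, then square-rooted. The exact weighting layout inside the radical is an assumption:

```python
import numpy as np

def dwie_rmse(losses_h, losses_w, losses_f, iuaf_w, iuaf_f,
              cuwf_h, cuwf_w, cuwf_f):
    """Hypothetical DWIE-RMSE assembly. losses_*: per-sample squared-error
    contributions of the complete hand, wrist and finger regions; iuaf_*:
    per-sample information factors (0 disables a region); cuwf_*: scalar
    class imbalance weighting factors."""
    per_sample = (cuwf_h * losses_h
                  + cuwf_w * iuaf_w * losses_w
                  + cuwf_f * iuaf_f * losses_f)
    return float(np.sqrt(per_sample.mean()))

n = 4
losses = np.full(n, 4.0)                   # identical squared errors, for clarity
iuaf_w = np.array([1.0, 1.0, 0.0, 1.0])    # third sample's wrist region unused
iuaf_f = np.ones(n)
# CUWF_h = Key_n = 2 per the description; key-region factors set to 1 here
print(dwie_rmse(losses, losses, losses, iuaf_w, iuaf_f, 2.0, 1.0, 1.0))
```

Note how an IUAF of 0 removes a key region's contribution from that sample's loss, which is exactly the "region not used" behavior described above.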
The hardware environment used in this embodiment is an Intel(R) Core(TM) i7-8700K CPU with four NVIDIA GeForce RTX 2080Ti (11 GB) GPUs and 64 GB of memory. The software environment is the Ubuntu 16.04 operating system with the PyTorch 1.7 open-source framework. The network input image size is 299×299 pixels, and the training parameters are set as follows:
(1) When training the hand bone region screening sub-network, the adaptive moment estimation algorithm (Adam) is used to optimize the network, the batch size is set to 2, the initial learning rate is set to 0.001, 120 epochs are trained iteratively, and the optimal training model is selected for hand bone region segmentation of the X-ray images.
(2) In the bone age prediction sub-network, the adaptive moment estimation algorithm is used to optimize the network, the batch size is set to 8, the initial learning rate is set to 0.001, and 200 epochs are trained iteratively. When the validation-set loss does not decrease for 10 consecutive epochs, the learning rate is reduced to 10% of its value. The optimal training model is selected for bone age assessment on the test set.
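The schedule in (2) corresponds to PyTorch's `ReduceLROnPlateau`. A minimal sketch, in which the stand-in model and the constant validation loss are placeholders:

```python
import torch
from torch.optim.lr_scheduler import ReduceLROnPlateau

model = torch.nn.Linear(10, 1)    # stand-in for the bone age prediction sub-network
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
# When the validation loss does not drop for 10 consecutive epochs,
# reduce the learning rate to 10% of its current value.
scheduler = ReduceLROnPlateau(optimizer, mode="min", factor=0.1, patience=10)

for epoch in range(3):            # 200 epochs in the original setup
    val_loss = 1.0                # placeholder: compute on the validation set
    scheduler.step(val_loss)

print(optimizer.param_groups[0]["lr"])  # still 0.001 after only 3 flat epochs
```

With `patience=10` the reduction only triggers after ten epochs without improvement, matching the description.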
Evaluation index
In the bone age assessment, the mean absolute error (MAE) is used as the evaluation index; the calculation method is shown in the following formula. The smaller the MAE value, the better the assessment result.
where N is the number of samples, y_i is the labeled true value, and ŷ_i is the bone age result assessed by each model.
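The MAE index is a direct NumPy computation; the sample values below are illustrative:

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error in months, the bone age evaluation index."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))

# Three samples with absolute errors of 4, 6 and 1 months
print(mae([120, 96, 150], [124, 90, 151]))  # (4 + 6 + 1) / 3
```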
Comparative experiments using different reference networks:
To select a suitable reference network for bone age assessment, seven classical networks (EfficientNet-B4, Inception-V4, ResNet-101, ResNet-50, DenseNet-201, Inception-ResNet-V2 and Xception) are chosen here for bone age assessment and their results compared. In the evaluation, neither the network structures nor the dataset are modified; the image size is unified to 299×299 before being input to each reference network, giving the bone age assessment results shown in Table 2. The variable convolution network module AMVCN achieves the best result among the eight groups of networks, with an MAE of 7.31 months, so AMVCN is selected for improvement in the subsequent bone age assessment work to optimize the final assessment accuracy.
Table 2 evaluation errors for different reference networks
Reference network Mean absolute error (MAE: month)
EfficientNet-B4 7.68
Inception-V4 8.94
ResNet-101 8.68
ResNet-50 8.42
DenseNet-201 8.48
Inception-ResNet-V2 8.37
Xception 7.59
AMVCN 7.31
Ablation experiments for bone age assessment:
The bone age assessment work here is divided into two parts: extraction of the hand bone regions and bone age regression using the integrated network of variable convolution network modules AMVCN. To verify the validity and necessity of the above modules and related mechanisms, the following ablation experiments were designed: (1) bone age assessment directly on the original images with the VCN network; (2) segmenting the hand bone region with Mask R-CNN first and then inputting it to the VCN network; (3) segmenting the hand bone region with Mask R-CNN first and then inputting it to VCN combined with the MPAM module; (4) the method of this paper. All ablation experiments used sex information, and the assessment accuracy is shown in Table 3.
Table 3 results of ablation experiments
In Table 3, the MAE of bone age assessment using VCN directly on the original images is 7.08 months; after hand bone region extraction with the Mask R-CNN network, it drops to 5.22 months. Introducing the MPAM module further reduces it to 4.81 months. Finally, with the full dual attention mechanism introduced, the final MAE is 4.59 months. The ablation results show that extracting the three key regions of the hand bones through the Mask R-CNN network effectively reduces the interference of background information, and that adding the MPAM and FAM dual attention mechanisms to VCN lets the network attend to richer key features in the hand bone image regions. These modules and related mechanisms effectively improve the accuracy of bone age assessment.
Sex factor comparison experiment:
In the growth and development of children, males and females differ in hand development maturity at the same age, so four groups of bone age assessment experiments related to the sex factor were designed: (1) bone age assessment of the male images in the dataset; (2) bone age assessment of the female images in the dataset; (3) bone age assessment of all images in the dataset; (4) bone age assessment of all images in the dataset combined with sex information. The results of these experiments are shown in FIG. 10.
As can be seen from FIG. 10, the MAE of bone age assessment for males alone and for females alone is 4.78 and 4.92 months respectively, while the MAE without sex information is 5.38 months. After sex information is added, the MAE is 4.59 months. Compared with assessment without sex information, single-sex assessment reduces the error by 0.60 and 0.46 months respectively, while adding the sex factor reduces it by 0.79 months. Therefore, adding sex information to bone age assessment effectively reduces the error and improves the assessment precision.
Comparison of different deep learning methods:
To illustrate the advancement of the method here in bone age assessment, it is compared with a variety of representative bone age assessment methods of recent years. The method here achieves the highest bone age assessment accuracy (the smallest MAE, 4.59 months).
The application also provides a child hand bone X-ray image bone age assessment system, which comprises an image receiving module, a processing module and a storage module, wherein the image receiving module receives images used for training or to be assessed and sends the received images to the processing module, the processing module is in communication connection with the storage module, and the storage module is used for storing at least one executable instruction, so that the processing module executes operations corresponding to the child hand bone X-ray image bone age assessment method based on the critical area feature dual weighted fusion according to the received images.
In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the present invention have been shown and described, it will be understood by those of ordinary skill in the art that: many changes, modifications, substitutions and variations may be made to the embodiments without departing from the spirit and principles of the invention, the scope of which is defined by the claims and their equivalents.

Claims (5)

1. A bone age assessment method for children hand bone X-ray images based on dual weighted fusion of key region features, characterized by comprising the following steps:
acquiring an X-ray image set of the hand bones of the child;
constructing a bone age prediction network, wherein the bone age prediction network comprises a hand bone region screening sub-network and a bone age prediction sub-network, the bone age prediction sub-network comprises a complete hand bone region learning network and a key hand bone region learning network, and the key hand bone region learning network comprises a wrist joint region learning network and/or a finger joint region learning network;
the complete hand bone region learning network comprises M variable convolution network modules, the wrist joint region learning network comprises N variable convolution network modules, and the finger joint region learning network comprises P variable convolution network modules;
each variable convolution network module has several network layers and comprises a multi-layer perceptual attention module and an aggregate attention module, the multi-layer perceptual attention module and the aggregate attention module being connected in parallel or in series to capture key effective features in the hand bone region;
defining the feature map input to the multi-layer perceptual attention module as CF_IN with dimension H×W×C; when CF_IN is input to the multi-layer perceptual attention module, CF_IN first passes through max pooling and average pooling to generate two feature maps of scale 1×1×C respectively, max pooling retaining the local typical features of the image and average pooling retaining its overall distribution information; the two 1×1×C feature maps are then each compressed in feature size by a multi-layer perceptron, the two outputs of the multi-layer perceptron are added and Sigmoid-normalized to obtain the multi-layer perceptual attention weight A_MPAM, and the multi-layer perceptual attention weight A_MPAM is multiplied with the input CF_IN to obtain the output feature CF_OUT of the multi-layer perceptual attention module;
CF_OUT = CF_IN × A_MPAM = CF_IN × F_s(F_mlp(F_max(CF_IN)) ⊕ F_mlp(F_avg(CF_IN))), wherein F_s denotes Sigmoid normalization, F_mlp denotes the MLP operation, F_max denotes the max pooling operation, F_avg denotes the average pooling operation, and ⊕ denotes element-wise addition of feature channels with the number of channels unchanged;
defining the feature map input to the aggregate attention module as SF_IN with dimension H×W×C′; when SF_IN is input to the aggregate attention module, SF_IN first passes through max pooling and average pooling to generate two feature maps respectively, the two feature maps are concatenated to obtain a feature map of dimension H×W×2, and v×v convolution dimensionality reduction and Sigmoid normalization are then performed to obtain the aggregate attention weight A_FAM; the aggregate attention weight A_FAM is multiplied with the input SF_IN to obtain the output SF_OUT of the aggregate attention module;
SF_OUT = SF_IN × A_FAM = SF_IN × F_s(F_{v×v,conv}(F_max(SF_IN) ⊙ F_avg(SF_IN))), wherein F_s denotes Sigmoid normalization, F_{v×v,conv} denotes the v×v convolution dimensionality-reduction operation, F_max denotes the max pooling operation, F_avg denotes the average pooling operation, and ⊙ denotes channel concatenation with a change in the number of channels;
inputting the children hand bone X-ray image set to the hand bone region screening sub-network to extract the complete hand bone region and the key hand bone regions, the key hand bone regions comprising the wrist joint region and/or the finger joint region; sending the extracted complete hand bone region to the complete hand bone region learning network for feature extraction and scale transformation; and sending the different key hand bone regions to the different key hand bone region learning networks through a region weight configuration module, the region weight configuration module configuring corresponding weights for each key hand bone region learning network for feature extraction and scale transformation,
the configured corresponding weights comprising the wrist joint region weight information imbalance adjustment factor IUAF_w and the finger joint region weight information imbalance adjustment factor IUAF_f;
the wrist joint region weight information imbalance adjustment factor IUAF_w and the finger joint region weight information imbalance adjustment factor IUAF_f being calculated respectively as follows:
wherein γ_h, γ_w and γ_f are the cumulative gradients of the complete hand bone region, the wrist joint region and the finger joint region respectively, the four related threshold parameters are used to truncate the information imbalance adjustment factors, and IUAF_max is the maximum value of the information imbalance adjustment factor;
the loss function of the bone age prediction sub-network is as follows: wherein N is the number of samples, ŷ_i is the bone age result predicted by the bone age prediction sub-network, y_i is the corresponding labeled true value, and the three loss terms represent the loss contributions of the complete hand bone region, the wrist joint region and the finger joint region in the i-th sample respectively; IUAF_w is the wrist joint region weight information imbalance adjustment factor and IUAF_f is the finger joint region weight information imbalance adjustment factor; CUWF_h, CUWF_f and CUWF_w are the class imbalance weighting factors of the complete hand bone region, the wrist joint region and the finger joint region respectively;
and fusing the features extracted by the complete hand bone region learning network and the key hand bone region learning networks with the sex information, and outputting a bone age assessment result.
2. The bone age assessment method for children hand bone X-ray images based on dual weighted fusion of key region features according to claim 1, wherein the convolution kernel size k of the multi-layer perceptual attention module is adaptively selected through the channel number C of the input feature map, the correspondence between the two being shown in the following formula:
wherein <·>_odd denotes the odd number closest to the result of the operation.
3. The bone age assessment method for children hand bone X-ray images based on dual weighted fusion of key region features according to claim 1, wherein the convolution kernel size v of the aggregate attention module is adaptively selected through the channel number C′ of the input feature map, the correspondence between the two being shown in the following formula:
wherein <·>_odd denotes the odd number closest to the result of the operation.
4. The bone age assessment method for children hand bone X-ray images based on dual weighted fusion of key region features according to claim 1, wherein the class imbalance weighting factor of the complete hand bone region is CUWF_h = Key_n, where Key_n refers to the number of key hand bone regions set;
class imbalance weighting factor for wrist region
Class imbalance weighting factors for knuckle regions
wherein T_h, T_f and T_w are respectively the numbers of complete hand bone regions, effective wrist joint regions and effective finger joint regions in the whole training set, and <·> denotes rounding; the number of complete hand bone regions T_h is equal to the number of X-ray images in the training set, the number of effective wrist joint regions T_f is equal to the number of X-ray images in the training set whose wrist joint region weight information imbalance adjustment factor satisfies IUAF_w > 0, and the number of effective finger joint regions T_w is equal to the number of X-ray images in the training set whose finger joint region weight information imbalance adjustment factor satisfies IUAF_f > 0.
5. The system for assessing the bone age of the X-ray images of the hands of the children is characterized by comprising an image receiving module, a processing module and a storage module, wherein the image receiving module receives images used for training or to be assessed and sends the received images to the processing module, the processing module is in communication connection with the storage module, and the storage module is used for storing at least one executable instruction, and the executable instruction enables the processing module to execute the operation corresponding to the bone age assessment method for the X-ray images of the hands of the children based on the double weighted fusion of the key area characteristics according to the images received by the processing module.
CN202310261475.5A 2023-03-17 2023-03-17 Bone age assessment method and system for X-ray images of hand bones of children based on double weighted fusion of key region features Active CN116433607B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310261475.5A CN116433607B (en) 2023-03-17 2023-03-17 Bone age assessment method and system for X-ray images of hand bones of children based on double weighted fusion of key region features

Publications (2)

Publication Number Publication Date
CN116433607A CN116433607A (en) 2023-07-14
CN116433607B true CN116433607B (en) 2024-03-15

Family

ID=87082428


Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112070760A (en) * 2020-09-17 2020-12-11 安徽大学 Bone mass detection method based on convolutional neural network
CN113298780A (en) * 2021-05-24 2021-08-24 云南大学 Child bone age assessment method and system based on deep learning
CN113887459A (en) * 2021-10-12 2022-01-04 中国矿业大学(北京) Open-pit mining area stope change area detection method based on improved Unet +
CN114549470A (en) * 2022-02-23 2022-05-27 合肥工业大学 Method for acquiring critical region of hand bone based on convolutional neural network and multi-granularity attention
CN114663735A (en) * 2022-04-06 2022-06-24 杭州健培科技有限公司 Double-bone-age assessment method based on joint global and local convolution neural network characteristics
CN114663426A (en) * 2022-04-21 2022-06-24 重庆邮电大学 Bone age assessment method based on key bone area positioning
CN115063665A (en) * 2022-07-05 2022-09-16 西安邮电大学 Lightweight fire detection method based on weighted bidirectional feature pyramid network
CN115375897A (en) * 2022-07-29 2022-11-22 五邑大学 Image processing method, apparatus, device and medium
CN115393336A (en) * 2022-09-01 2022-11-25 杭州类脑科技有限公司 Bone age assessment method, system and computer readable storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Adaptive Multilayer Perceptual Attention Network for Facial Expression Recognition; Hanwei Liu et al.; IEEE Transactions on Circuits and Systems for Video Technology; Vol. 32, No. 9, pp. 6253-6266 *
Focusing Attention: Towards Accurate Text Recognition in Natural Images; Zhanzhan Cheng et al.; 2017 IEEE International Conference on Computer Vision; pp. 5086-5094 *

Also Published As

Publication number Publication date
CN116433607A (en) 2023-07-14

Similar Documents

Publication Publication Date Title
CN110930416B (en) MRI image prostate segmentation method based on U-shaped network
CN116342516B (en) Model integration-based method and system for assessing bone age of X-ray images of hand bones of children
CN113298830B (en) Acute intracranial ICH region image segmentation method based on self-supervision
CN110991254B (en) Ultrasonic image video classification prediction method and system
CN111583285A (en) Liver image semantic segmentation method based on edge attention strategy
CN112052877B (en) Picture fine granularity classification method based on cascade enhancement network
CN114549469A (en) Deep neural network medical image diagnosis method based on confidence degree calibration
CN112820399A (en) Method and device for automatically diagnosing benign and malignant thyroid nodules
Yang et al. RADCU-Net: Residual attention and dual-supervision cascaded U-Net for retinal blood vessel segmentation
Bhimavarapu et al. Analysis and characterization of plant diseases using transfer learning
CN113343755A (en) System and method for classifying red blood cells in red blood cell image
CN115526801A Automatic color homogenizing method and device for remote sensing images based on a conditional adversarial neural network
CN111383759A (en) Automatic pneumonia diagnosis system
Hou et al. Image quality assessment guided collaborative learning of image enhancement and classification for diabetic retinopathy grading
Liu et al. A cross-lesion attention network for accurate diabetic retinopathy grading with fundus images
CN116740041B (en) CTA scanning image analysis system and method based on machine vision
CN116433607B (en) Bone age assessment method and system for X-ray images of hand bones of children based on double weighted fusion of key region features
Chen et al. Enhancing multi-disease diagnosis of chest X-rays with advanced deep-learning networks in real-world data
CN116341620A (en) Efficient neural network architecture method and system based on ERetinaNet
CN112785559B (en) Bone age prediction method based on deep learning and formed by mutually combining multiple heterogeneous models
CN116091446A (en) Method, system, medium and equipment for detecting abnormality of esophageal endoscope image
CN112418290A (en) ROI (region of interest) region prediction method and display method of real-time OCT (optical coherence tomography) image
CN112489012A (en) Neural network architecture method for CT image recognition
Wu et al. Mscan: Multi-scale channel attention for fundus retinal vessel segmentation
Lei et al. GNN-fused CapsNet with multi-head prediction for diabetic retinopathy grading

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant