WO2023165033A1 - Model training, method, device and medium for identifying targets in medical images - Google Patents

Model training, method, device and medium for identifying targets in medical images

Info

Publication number
WO2023165033A1
WO2023165033A1 PCT/CN2022/095137 CN2022095137W WO2023165033A1 WO 2023165033 A1 WO2023165033 A1 WO 2023165033A1 CN 2022095137 W CN2022095137 W CN 2022095137W WO 2023165033 A1 WO2023165033 A1 WO 2023165033A1
Authority
WO
WIPO (PCT)
Prior art keywords
region
model
area
training
target
Prior art date
Application number
PCT/CN2022/095137
Other languages
English (en)
French (fr)
Inventor
潘晓春
王娟
陈素平
夏斌
Original Assignee
深圳硅基智能科技有限公司
Priority date
Filing date
Publication date
Application filed by 深圳硅基智能科技有限公司
Publication of WO2023165033A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/24 - Classification techniques
    • G06F 18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2415 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H - HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H 30/00 - ICT specially adapted for the handling or processing of medical images
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 - Image acquisition modality
    • G06T 2207/10004 - Still image; Photographic image
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T - CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00 - Road transport of goods or passengers
    • Y02T 10/10 - Internal combustion engine [ICE] based vehicles
    • Y02T 10/40 - Engine management systems

Definitions

  • the present disclosure relates to the field of image processing based on artificial intelligence, and in particular to model training, a method, a device and a medium for identifying targets in medical images.
  • deep learning target recognition technology can obtain high recognition accuracy for large-sized targets, but its recognition performance for small targets (such as thin objects or small objects) is not satisfactory, which is likely to cause false positives and false alarms, and it is also difficult to distinguish the categories of small targets.
  • small target signs such as spot hemorrhages and microaneurysms are small, light in color, and close to one another in color, so they are not easy to find and distinguish when deep learning is used for target recognition. Therefore, how to effectively identify small targets remains to be studied.
  • the present disclosure is proposed in view of the above-mentioned state of the prior art, and its purpose is to provide model training, a method, a device and a medium capable of effectively identifying small targets in medical images.
  • the first aspect of the present disclosure provides a model training method for identifying a target in a medical image, including: acquiring the medical image as a training sample and a labeled region corresponding to the target in the training sample; determining a region segmentation result corresponding to the labeled region and constructing a training set from the training sample and the region segmentation result, wherein the region segmentation result is obtained by under-segmenting the image data in the labeled region; and training the model to be trained based on the training set and optimizing the model to be trained using a training loss function, wherein, in the training loss function, spatial weights are used to reduce the negative impact on the model to be trained of pixels in a first region of the training sample, the first region being the region of the labeled region other than the target region of the target, and the target region being determined by the region segmentation result.
  • obtaining the region segmentation result further includes: obtaining the image data to be segmented based on the image data corresponding to the labeled region in the training sample, or based on both that image data and the image data corresponding to the labeled region in the segmentation result of interest, wherein the segmentation result of interest is a binary image used to identify the region of interest of the training sample; and performing threshold segmentation on the image data to be segmented using a target segmentation threshold to obtain the region segmentation result, wherein the region segmentation result is a binary image.
  • the target region in the image data to be segmented can be identified through threshold segmentation, and when the labeled region includes a region other than the region of interest, noise outside the region of interest can be eliminated.
  • the target segmentation threshold is obtained according to the threshold acquisition method of the label category to which the target belongs, wherein the threshold acquisition method of each label category is determined by the average area and average color of that label category. The threshold acquisition methods include a first method and a second method, where the average area of the label category corresponding to the first method is larger than that of the label category corresponding to the second method, and the average color of the label category corresponding to the first method is lighter than that of the label category corresponding to the second method. For the first method, a threshold is searched for such that the area of the pixels in the image data to be segmented whose gray value is greater than the threshold is less than a preset multiple of the area of the image data to be segmented, and that threshold is used as the target segmentation threshold, wherein the preset multiple is greater than 0 and less than 1.
  • the target segmentation threshold can be obtained according to the characteristics of the label category corresponding to the target. Accordingly, the accuracy of threshold segmentation can be improved.
  • an erosion operation is also performed on the threshold segmentation result of the image data to be segmented to obtain at least one connected region, and the connected region whose center is closest to the center of the image data to be segmented is selected from the at least one connected region as the region segmentation result.
  • the pixels in the first region in the training sample are assigned a first weight, wherein the first weight is 0.
  • samples of undetermined categories can be ignored to reduce the negative impact of samples of undetermined categories on the model to be trained.
  • the pixels in the first region, the second region, the third region and the fourth region in the training sample are respectively assigned a first weight, a second weight, a third weight, and a fourth weight, wherein the second region is the target region, the third region is the region within the region of interest that does not belong to the labeled region, and the fourth region is the region outside the region of interest; the first weight is less than the second weight and less than the third weight, and the fourth weight is less than the second weight and less than the third weight.
  • the negative impact of pixels of undetermined categories and pixels outside the region of interest on the model to be trained can be suppressed, and the positive influence on the model to be trained of the pixels within the target region and of the non-target pixels within the region of interest can be increased.
  • the accuracy of the model can be improved.
  • the model to be trained is a semantic segmentation model, and the prediction result of the model to be trained is a semantic segmentation result of the training sample.
  • the shape of the labeled region is a rectangle.
  • the difficulty of labeling can be reduced.
  • a second aspect of the present disclosure provides an electronic device, which includes: at least one processing circuit configured to execute the steps of the model training method described in the first aspect of the present disclosure.
  • a third aspect of the present disclosure provides a computer-readable storage medium, the computer-readable storage medium stores at least one instruction, and when the at least one instruction is executed by a processor, the steps of the above-mentioned model training method are implemented.
  • a fourth aspect of the present disclosure provides a method for identifying a target in a medical image, the method comprising: acquiring the medical image as an input image; and, using at least one trained model trained by the model training method according to the first aspect of the present disclosure, determining a prediction result of each trained model for the input image and obtaining a target prediction result based on the prediction result of the at least one trained model.
  • the prediction result of each trained model includes the probability that each pixel in the input image belongs to the corresponding label category; the prediction results of the at least one trained model are integrated to obtain an integrated probability that each pixel of the input image belongs to the corresponding label category, a connected region is determined based on the integrated probability, and the target prediction result corresponding to each label category is obtained based on the connected region.
  • when there is a single trained model, its predicted probability is used as the integrated probability; when there are multiple trained models, the mean of their prediction results is calculated to obtain the probability that each pixel in the input image belongs to the corresponding label category, and this probability mean is used as the integrated probability. In this case, obtaining the target prediction result based on the integrated probability can further improve the accuracy of the target prediction result.
  • the medical image is a fundus image.
  • the trained model is able to recognize small objects in fundus images.
  • the target includes microaneurysm, spot hemorrhage, sheet hemorrhage and linear hemorrhage.
  • the trained model is able to recognize small objects in fundus images.
  • a fifth aspect of the present disclosure provides an electronic device, the electronic device comprising: at least one processing circuit configured to: acquire the medical image as an input image; and, using at least one trained model trained by the model training method according to the first aspect of the present disclosure, determine a prediction result of each trained model for the input image and obtain a target prediction result based on the prediction result of the at least one trained model.
  • according to the present disclosure, model training, a method, a device and a medium capable of effectively identifying small targets in medical images are provided.
  • FIG. 1 is a schematic diagram showing an example of a recognition target environment related to an example of the present disclosure.
  • FIG. 2 is a flowchart illustrating an example of a model training method related to an example of the present disclosure.
  • FIG. 3 is a schematic diagram illustrating some examples of labeled regions related to examples of the present disclosure.
  • FIG. 4 is a schematic diagram showing some example region segmentation results related to examples of the present disclosure.
  • FIG. 5 is a flowchart illustrating an example of obtaining a region segmentation result related to an example of the present disclosure.
  • FIG. 6 is an architecture diagram showing an example of a model to be trained using the U-Net architecture involved in an example of the present disclosure.
  • FIG. 7 is a schematic diagram illustrating the several regions in some examples related to the present disclosure.
  • FIG. 8 is a flowchart illustrating an example of a method of recognizing an object in an image according to an example of the present disclosure.
  • circuitry herein may refer to hardware circuits and/or a combination of hardware circuits and software.
  • model in this disclosure is capable of processing an input and providing a corresponding output.
  • herein, the terms "neural network", "deep neural network", "model", "network" and "neural network model" are used interchangeably.
  • where this document mentions rectangular characteristics (such as sides, width, and height) of related objects (such as labeled regions, image data to be segmented, and targets), if the object itself is not rectangular then, unless otherwise specified, the rectangular characteristics default to those of the bounding rectangle of the object.
  • the present disclosure obtains a region segmentation result by under-segmenting the labeled region, and uses the result as a gold standard to segment the image, thereby realizing accurate recognition of small objects.
  • the present disclosure adopts a spatial weight method to deal with the negative impact of pixels of undetermined categories on image segmentation caused by under-segmentation in the labeled region. In this case, small targets can be effectively identified.
  • examples of the present disclosure propose a scheme for training models and recognizing objects in images to address one or more of the above-mentioned problems and/or other potential problems.
  • the scheme adopts the method of image segmentation for target recognition (that is, image segmentation is first performed on the image data in the marked area in the training sample to obtain the area segmentation result, and then the area segmentation result is post-processed to obtain the target recognition result).
  • this scheme identifies pixels of undetermined categories in the labeled region by under-segmenting the image data in the labeled region of the training sample, and trains the neural network model in combination with spatial weights (that is, weights whose values can depend on the position of the pixels) to reduce the negative impact of pixels of undetermined categories in the labeled region on the neural network model, which can improve the accuracy of the prediction result of the trained model on the input image (e.g., a medical image).
  • the trained model can be a trained neural network model, for example, a trained semantic segmentation model.
  • the trained model may be the optimal neural network model obtained after training.
  • Examples of the present disclosure relate to schemes for training models and recognizing targets in images that can efficiently recognize small targets.
  • the model training method for recognizing objects in images involved in the examples of the present disclosure may be simply referred to as a model training method or a training method. It should be noted that the solutions involved in the examples of the present disclosure are also applicable to the recognition of large objects.
  • Examples of the present disclosure may refer to images from cameras, CT scans, PET-CT scans, SPECT scans, MRI, ultrasound, X-rays, angiograms, fluoroscopy, capsule endoscopic images, or combinations thereof.
  • the images may be medical images.
  • medical images may include, but are not limited to, fundus images, lung images, stomach images, chest images, brain images, and the like.
  • small targets in medical images can be identified.
  • the image can be a natural image.
  • the natural image may be an image observed or captured in a natural scene.
  • small objects in natural images can be identified.
  • other types of images can be used without limitation.
  • FIG. 1 is a schematic diagram showing an example of a recognition target environment 100 related to an example of the present disclosure.
  • recognition target environment 100 may include computing device 110 .
  • Computing device 110 may be any device having computing capabilities.
  • computing device 110 may be a cloud server, a personal computer, a mainframe, a distributed computing system, and the like.
  • the computing device 110 can obtain an input 120 and generate an output 140 corresponding to the input 120 by using a neural network model 130 (sometimes also referred to as a model to be trained 130 or a model 130 for short).
  • the input 120 may be the above-mentioned image
  • the output 140 may be a prediction result, training parameters (eg, weights), or performance indicators (eg, accuracy rate and error rate), etc.
  • the neural network model 130 may include, but is not limited to, a semantic segmentation model (eg, U-Net), or other models related to image processing.
  • the neural network model 130 may be implemented using any suitable network structure. For example, convolutional neural network (CNN), recurrent neural network (RNN) and deep neural network (DNN), etc.
  • the recognition target environment 100 may further include a model training device and a model application device (not shown).
  • the model training device can be used to implement the training method of training the neural network model 130 to obtain a trained model.
  • the model application device can be used to implement a related method for obtaining prediction results using a trained model to identify objects in an image.
  • the neural network model 130 may be the model 130 to be trained.
  • the neural network model 130 may be a trained model.
  • FIG. 2 is a flowchart illustrating an example of a model training method related to an example of the present disclosure.
  • the model training method can be performed by the computing device 110 shown in FIG. 1 .
  • the model training method can train a model that recognizes objects in medical images.
  • the model training method may include step S102.
  • in step S102, medical images serving as training samples and labeled regions corresponding to targets in the training samples may be acquired. That is, in the training phase, medical images can be obtained as training samples. Thereby, it becomes possible to recognize the target in the medical image.
  • the medical images may be color images. Thus, the accuracy of recognizing small targets can be improved.
  • the medical image may contain a corresponding target, and the target may belong to at least one category of interest (that is, the category that needs to be identified).
  • the target may include small targets such as microaneurysm, spot hemorrhage, sheet hemorrhage, and linear hemorrhage.
  • the trained model is able to recognize small objects in fundus images.
  • FIG. 3 is a schematic diagram illustrating some examples of labeled regions related to examples of the present disclosure.
  • objects in the training samples may be labeled to obtain labeled regions.
  • the shape of the labeled region can be a rectangle, a circle, or a shape that matches the shape of the object in the training sample (for example, the shape of the labeled region can be the outline of the object).
  • the shape of the marked area can be a rectangle.
  • FIG. 3 shows a marked area D1 in a fundus image, where the marked area D1 is rectangular in shape, and the target in the marked area D1 is a sheet hemorrhage.
  • the marked region may have a corresponding marked label (that is, marked category of the target), and the marked label may be used to distinguish the category of the target.
  • Label categories can be in one-to-one correspondence with target categories.
  • the target category and label category may respectively include, but are not limited to, microaneurysm, spot hemorrhage, sheet hemorrhage, linear hemorrhage, and the like.
  • corresponding label categories may be represented numerically.
  • the labeled regions and corresponding labeled labels may be referred to as labeled results.
  • the model training method may further include step S104.
  • in step S104, a region segmentation result (also called a pseudo-segmentation result) corresponding to the labeled region in the training sample may be determined, and a training set may be constructed using the training sample and the region segmentation result. It should be noted that, in some other examples, it is not necessary to determine the region segmentation result corresponding to the labeled region, as long as the target region in the labeled region (described later) can be identified and the pixels in the target region can be determined to belong to the target.
  • the training samples can be preprocessed and then used to construct the training set.
  • preprocessing the training samples may include unifying the size of the training samples.
  • the size of the training samples can be unified to 1024 × 1024 or 2048 × 2048.
  • the disclosure does not limit the size of the training samples.
  • preprocessing the training samples may include clipping the training samples.
  • a region of interest in the training sample may be obtained and the region of interest may be used to clip the training sample. Accordingly, it is possible to make the size of the training samples uniform and include the region of interest.
  • the region of interest may be a region where objects may exist (also referred to as a foreground region).
  • the region of interest may be a fundus region.
  • training samples may be segmented to obtain regions of interest.
  • threshold segmentation may be performed on the training samples to obtain segmentation results of interest, wherein the segmentation results of interest may be used to identify regions of interest of the training samples. Thereby, a region of interest can be identified.
  • the segmentation result of interest obtained through threshold segmentation may be a binary image. It can be understood that, although the segmentation result of interest is obtained through threshold segmentation above, other approaches suitable for obtaining the segmentation result of interest are also applicable; for example, the segmentation result of interest can be obtained through a neural network.
  • performing threshold segmentation on the training sample may consist of dividing the training sample into a preset number of parts (for example, 9 equal parts), determining the segmentation threshold based on the gray values of the four corner areas and the central area of the training sample, and performing threshold segmentation on the training sample based on that segmentation threshold to obtain the segmentation result of interest.
  • determining the segmentation threshold based on the gray values of the four corner areas and the central area of the training sample may mean taking the average of the mean gray value of the pixels in each of the four corner areas and the mean gray value of the pixels in the central area as the segmentation threshold for threshold segmentation, thereby obtaining the segmentation result of interest.
  • an erosion operation may be performed on the segmentation result of the threshold corresponding to the training sample (that is, the initial segmentation result) to obtain the segmentation result of interest.
  • two erosion operations may be performed on the threshold segmentation results of the training samples to obtain the segmentation results of interest, where the size of the erosion kernel may be 5. In this way, noise at the edge of the region of interest (for example, the fundus region) can be eliminated.
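  • As an illustration only (not the patent's reference implementation), the threshold selection and erosion described above might be sketched as follows; the function name segment_roi, the 3x3 grid split, and the use of OpenCV are assumptions.

```python
import cv2
import numpy as np

def segment_roi(gray: np.ndarray) -> np.ndarray:
    """Sketch: threshold a grayscale training sample into a binary region-of-interest
    mask using the mean gray values of the four corner blocks and the central block
    of a 3x3 grid, then apply two erosions with a 5x5 kernel."""
    h, w = gray.shape
    bh, bw = h // 3, w // 3
    blocks = [
        gray[:bh, :bw], gray[:bh, -bw:],    # top-left, top-right corner blocks
        gray[-bh:, :bw], gray[-bh:, -bw:],  # bottom-left, bottom-right corner blocks
        gray[bh:2 * bh, bw:2 * bw],         # central block
    ]
    # Average of the per-block mean gray values serves as the segmentation threshold.
    threshold = float(np.mean([b.mean() for b in blocks]))
    _, mask = cv2.threshold(gray, threshold, 1, cv2.THRESH_BINARY)
    kernel = np.ones((5, 5), np.uint8)
    return cv2.erode(mask.astype(np.uint8), kernel, iterations=2)
```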
  • the region segmentation result corresponding to the labeled region may be determined.
  • the region segmentation result can be used to determine the object region of the object within the labeled region.
  • the target area in the labeled area can be identified, and then pixels of undetermined categories can be determined based on the target area.
  • the pixels outside the target area in the marked area in the training sample may be pixels of undetermined categories.
  • the region segmentation result may be any form of data (for example, an image) that can identify the target region.
  • the region segmentation result may be a binary image.
  • the region formed by pixels with a value of 1 can be taken as the target region (that is, a pixel value of 1 indicates that the pixel at the corresponding position in the training sample belongs to the target, while a pixel value of 0 indicates that the pixel at the corresponding position in the training sample is of an undetermined category). In this case, it is possible to reduce the negative impact of pixels of undetermined categories on the model to be trained 130.
  • FIG. 4 is a schematic diagram showing some example region segmentation results related to examples of the present disclosure.
  • FIG. 4 shows the region segmentation result A1 corresponding to the labeled region D1 in FIG. 3 , where D2 is the target region.
  • the region segmentation result A1 in FIG. 4 is the result of an equal scale enlargement, which does not represent a limitation to the present disclosure.
  • the region segmentation result A1 in FIG. 4 can actually be compared with the labeled region D1 of the same size.
  • under-segmentation may be performed on the image data in the marked region in the training sample to obtain the region segmentation result (that is, the target region corresponding to the target in the marked region may be segmented through under-segmentation to obtain the region segmentation result).
  • pixels of undetermined categories in the labeled region can be identified based on the region segmentation result obtained by under-segmentation.
  • segmentation in which foreground (target) pixels may be mis-segmented as background, but background pixels are not mis-segmented as foreground, can be called under-segmentation. That is, in under-segmentation, pixels belonging to the target in the labeled region may be mis-segmented as non-target, but pixels in the labeled region that do not belong to the target are not mis-segmented as target.
  • the pixels in the target area in the area segmentation result can be determined to belong to the target.
  • pixels outside the target region do not necessarily belong to the target (that is, they may be pixels of undetermined categories).
  • the region segmentation results corresponding to the marked regions may be determined based on the training samples and the image data respectively corresponding to the marked regions in the above segmentation results of interest.
  • specifically, the image data corresponding to the labeled region in the training sample (hereinafter referred to as the first image data) and the image data corresponding to the labeled region in the above-mentioned segmentation result of interest (that is, the binary image used to identify the region of interest of the training sample; hereinafter referred to as the second image data) may be multiplied element by element to obtain the image data to be segmented (that is, the image data in the labeled region), and the image data to be segmented may then be under-segmented to determine the region segmentation result corresponding to the labeled region.
  • in this case, when the labeled region includes regions other than the region of interest, noise outside the region of interest can be eliminated.
  • FIG. 5 is a flowchart illustrating an example of obtaining a region segmentation result related to an example of the present disclosure, that is, the flow by which some examples of the present disclosure obtain the region segmentation result.
  • obtaining the region segmentation result may include step S202.
  • in step S202, the image data to be segmented may be acquired based on the labeled region.
  • the first image data may be the image data corresponding to the labeled region in the training sample, and the second image data may be the image data corresponding to the labeled region in the segmentation result of interest. That is, the first image data and/or the second image data may be acquired based on the labeled region, and the image data to be segmented may then be acquired based on the first image data alone, or based on the first image data together with the second image data.
  • image data to be segmented may be obtained based on the first image data.
  • the image data to be segmented may be acquired based on color channels (eg, red channel, green channel, blue channel) of the first image data. Taking the fundus image as an example, the image data to be segmented may be acquired based on the green channel of the first image data.
  • the first image data corresponding to the labeled region can be obtained (for example, cropped) from the training sample, the green channel (i.e., G channel) of the first image data can then be taken, and the image data to be segmented can be obtained based on that green channel.
  • for example, the corresponding color channel (here, the green channel) of the first image data may be used directly as the image data to be segmented.
  • the color space and color channel can be selected according to the characteristics of the medical image itself, which is not particularly limited in this disclosure.
  • the image data to be segmented may be acquired based on the first image data and the second image data. In this case, when the labeled region includes regions other than the region of interest, noise outside the region of interest can be eliminated.
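  • As a rough sketch only, combining the first and second image data might look like the following; the function name, the box coordinate convention, and the elementwise product are assumptions, not the patent's implementation.

```python
import numpy as np

def get_data_to_segment(sample_rgb: np.ndarray,
                        roi_mask: np.ndarray,
                        box: tuple) -> np.ndarray:
    """Sketch: crop the labeled region (x0, y0, x1, y1) from the training sample,
    keep its green channel (the first image data), and multiply it elementwise by
    the same crop of the binary ROI mask (the second image data) so that pixels
    outside the region of interest are zeroed out."""
    x0, y0, x1, y1 = box
    first = sample_rgb[y0:y1, x0:x1, 1]   # green channel of the labeled-region crop
    second = roi_mask[y0:y1, x0:x1]       # ROI mask for the same crop (values 0/1)
    return first * second
```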
  • the first image data, the second image data, and the image data to be segmented may each represent image data (for example, pixel data, a data stream, or an image) of the corresponding region; in practice, the values of the pixels of the corresponding region, or markers of their positions, may be stored in a corresponding medium (such as a memory or a disk) to form image data in a corresponding form that can be conveniently processed.
  • the shapes of the first image data, the second image data, and the image data to be segmented can match the shape of the marked area, or can be a circumscribed rectangle of the marked area, which can be selected according to the method of obtaining the area segmentation result.
  • in the process of obtaining the region segmentation result, if it is necessary to use rectangular characteristics of the image data to be segmented (for example, sides, length, width, height, or the four corners) and the shape of the labeled region is not rectangular, the image data to be segmented can be acquired based on the region corresponding to the circumscribed rectangle of the labeled region. That is, after the shape of the labeled region is converted into a rectangle, the image data to be segmented can be obtained based on the converted labeled region.
  • obtaining the region segmentation result may further include step S204.
  • in step S204, threshold segmentation may be performed on the image data to be segmented to obtain a region segmentation result.
  • the examples of the present disclosure are not limited thereto.
  • the image data to be segmented may be under-segmented in other ways to obtain the region segmentation result.
  • a target segmentation threshold (described later) may be used to perform threshold segmentation on the image data to be segmented, and then obtain a region segmentation result.
  • the target region in the image data to be segmented can be identified through threshold segmentation.
  • the value of the pixels in the image data to be segmented whose gray value is not less than the target segmentation threshold can be set to 1, and the value of the other pixels set to 0, and the region segmentation result can then be obtained.
  • an erosion operation may also be performed on the threshold segmentation result (that is, the initial segmentation result) of the image data to be segmented. In this case, it is possible to reduce the probability of isolated pixels in the threshold segmentation result due to the influence of noise.
  • the erosion kernel k may satisfy the formula:
  • where h can represent the height of the labeled region (that is, the labeled region corresponding to the image data to be segmented), w the width of the labeled region, H the height of the training sample, W the width of the training sample, and p a preset hyperparameter.
  • in this case, an erosion kernel of an appropriate size can be obtained according to the size of the training sample, the size of the labeled region, and the preset hyperparameter. Thereby, excessive erosion can be suppressed.
  • the preset hyperparameter may be used to tune the size of the erosion kernel.
  • smaller erosion kernels can be used for particularly small targets. Thereby, it is possible to avoid an excessive erosion operation that would cause the target region of a particularly small target to disappear.
  • the preset hyperparameters may be fixed values. In some examples, the preset hyperparameters may be determined according to the average size of objects of the same category in the medical images. In some examples, the preset hyperparameters may be determined according to the average width and average height of objects of the same category in the medical images. In some examples, the preset hyperparameter p can satisfy the formula:
  • the medical image may be an image in a data source for obtaining preset hyperparameters.
  • the width and height of objects of the same category in multiple training samples and the width and height of the training samples may be counted to obtain relevant parameters of preset hyperparameters. That is, the data source can be training data.
  • when acquiring the preset hyperparameters, in a medical image with labeled regions (for example, a training sample), the width and height of the target may also be taken as the width and height of the corresponding labeled region. Thus, the width and height of the target can be acquired conveniently.
  • there may be multiple connected regions in the threshold segmentation result of the image data to be segmented.
  • an erosion operation may be performed on the threshold segmentation result of the image data to be segmented to obtain at least one connected region, and a connected region whose center is closest to the center of the image data to be segmented is selected from the at least one connected region as the region segmentation result.
  • the connected region closest to the center of the image data to be segmented may represent the identified target region.
  • specifically, contours can be searched for in the erosion result (that is, in the at least one connected region), a preset number (for example, 3) of contours with the largest area can be taken as candidates, and among the candidate contours the connected region corresponding to the contour whose center is closest to the center of the image data to be segmented can be retained as the region segmentation result.
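  • The selection step just described might be sketched roughly as below; the OpenCV-based approach, the kernel size argument, and keeping the three largest contours are illustrative assumptions rather than the patent's reference implementation.

```python
import cv2
import numpy as np

def pick_central_region(binary: np.ndarray, kernel_size: int, top_k: int = 3) -> np.ndarray:
    """Sketch: erode the threshold-segmentation result, find its contours, keep the
    top_k largest ones, and return a mask containing only the contour whose center
    is closest to the center of the image data to be segmented."""
    kernel = np.ones((kernel_size, kernel_size), np.uint8)
    eroded = cv2.erode(binary.astype(np.uint8), kernel)
    contours, _ = cv2.findContours(eroded, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return eroded
    contours = sorted(contours, key=cv2.contourArea, reverse=True)[:top_k]
    cy, cx = (eroded.shape[0] - 1) / 2.0, (eroded.shape[1] - 1) / 2.0

    def dist_to_center(cnt):
        m = cv2.moments(cnt)
        if m["m00"] == 0:
            return float("inf")
        return (m["m10"] / m["m00"] - cx) ** 2 + (m["m01"] / m["m00"] - cy) ** 2

    best = min(contours, key=dist_to_center)
    result = np.zeros_like(eroded)
    cv2.drawContours(result, [best], -1, 1, thickness=-1)  # fill the chosen region
    return result
```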
  • the target segmentation threshold can be obtained by Otsu's method (OTSU).
  • at least one manner of acquiring the object segmentation threshold may be selected from the manners described in the examples of the present disclosure.
  • the target segmentation threshold may be obtained according to the label category to which the target belongs.
  • the object segmentation threshold may be acquired according to an acquisition threshold method of the label category to which the object belongs.
  • the target segmentation threshold can be obtained according to the characteristics of the label category corresponding to the target. Accordingly, the accuracy of threshold segmentation can be improved.
  • the method for obtaining the threshold value of the labeled category may include the first method and the second method.
  • the labeled categories of the objects in the training samples can be known.
  • the labeling category to which the target in the training sample belongs may be the labeling label in the labeling result.
  • the method for obtaining the threshold value of each labeling category may be obtained through the features of each labeling category.
  • the acquisition threshold method may be determined according to the average area and average color of each label category.
  • the examples of the present disclosure are not limited thereto, and in some other examples, the method for obtaining the threshold value of the label category may also be determined empirically.
  • the first method can be used for sheet hemorrhage in the fundus image
  • the second method can be used for microaneurysms, spot hemorrhages, and linear hemorrhages in the fundus image.
  • the average area and average color of each label category may be fixed values, which may be obtained by statistics on the area and color of targets of the same category (for the training samples, the same category may refer to the same labeled category) in sample data (e.g., training samples).
  • the fixed value may also be an empirical value.
  • the average area of the label category corresponding to the first method may be greater than the average area of the label category corresponding to the second method, and the average color of the label category corresponding to the first method may be lighter than the average color of the label category corresponding to the second method.
  • the first method can be used for label categories whose targets have large areas and light colors (for example, sheet hemorrhages in fundus images).
  • the second method can be used for label categories whose targets have small areas and dark colors (for example, microaneurysms, spot hemorrhages and linear hemorrhages in fundus images).
  • the method for obtaining the threshold value for the label category may be determined by the first preset area and preset color values. In this way, it is possible to automatically acquire the acquisition threshold method used by the label category.
  • if the average area of the label category is greater than the first preset area and the average color value is less than the preset color value (that is, the target of the label category is relatively large in area and relatively light in color), the label category can be determined to use the first method; otherwise, if the average area of the label category is not greater than the first preset area and the average color value is not less than the preset color value (that is, the target of the label category is relatively small in area and relatively dark in color), the label category can be determined to use the second method.
  • the first preset area and the preset color value can be adjusted according to the result of the region segmentation.
  • the first preset area and the preset color value may be fixed values, and the fixed values may be obtained according to statistics on sample data. That is, a statistical method can be used to evaluate the region segmentation results of a small amount of sample data under different first preset areas and preset color values, so as to determine the first preset area and preset color value that classify best.
  • the target segmentation threshold can be obtained according to the method for obtaining the threshold value of the label category to which the target belongs.
  • the object segmentation threshold may be obtained according to the method for obtaining the threshold value of the labeled category to which the object belongs and the image data to be segmented corresponding to the training samples.
  • specifically, a threshold can be searched for such that the area of the pixels in the image data to be segmented whose gray value is greater than the threshold is smaller than a preset multiple of the area of the image data to be segmented, and that threshold is used as the target segmentation threshold, wherein the preset multiple may be greater than 0 and less than 1.
  • for example, thresholds from 0 to 255 can be traversed to find a threshold such that the area of the pixels in the image data to be segmented whose gray value is greater than the threshold is smaller than the preset multiple of the area of the image data to be segmented.
  • the preset multiple can be any value that makes the target area not divided.
  • the preset multiplier can take a smaller value so that the target area is not divided.
  • the preset multiplier may be empirically determined by the shape of the target.
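  • A minimal sketch of this threshold search is given below; the function name and the example value of the preset multiple are assumptions.

```python
import numpy as np

def first_method_threshold(data: np.ndarray, preset_multiple: float = 0.1) -> int:
    """Sketch of the 'first method': traverse thresholds from 0 to 255 and return the
    first threshold for which the area of pixels brighter than the threshold is smaller
    than preset_multiple times the total area of the image data to be segmented."""
    total_area = data.size
    for t in range(256):
        if np.count_nonzero(data > t) < preset_multiple * total_area:
            return t
    return 255
```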
  • for the second method, the mean gray value of the pixels in the image data to be segmented can be used as the target segmentation threshold, or the target segmentation threshold can be determined based on the gray values of the four corner regions and the central region of the image data to be segmented.
  • the preset length can be any value that prevents the target area from being segmented.
  • the preset length may be a first preset ratio of the smallest side of the training sample. Specifically, the preset length can be represented as min(rH, rW), where r can represent the first preset ratio, H can represent the height of the training sample, and W can represent the width of the training sample.
  • the first preset ratio may be a fixed value. In some examples, the first preset ratio may be determined according to an average size of objects of the same category in the medical image. In some examples, the first preset ratio may be determined according to the average width and average height of objects of the same category in the medical image. In some examples, the first preset ratio may satisfy the formula:
  • the medical image may be an image in the data source used to obtain the first preset ratio.
  • the data source can be training data.
  • related parameters related to the first preset ratio may be acquired in a manner similar to related parameters related to acquiring preset hyperparameters, which will not be repeated here.
  • in some examples, the target segmentation threshold can be determined based on the gray values of the four corner regions and the central region of the image data to be segmented.
  • for example, the image data to be segmented can be equally divided into a preset number of parts (for example, 9 equal parts), and the target segmentation threshold can be determined based on the gray values of the four corner regions and the central region of the image data to be segmented.
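  • The 'second method' thresholds might be sketched as follows; the function name, the grid variant switch, and the 3x3 split are assumptions for illustration.

```python
import numpy as np

def second_method_threshold(data: np.ndarray, use_grid: bool = False) -> float:
    """Sketch of the 'second method': either the plain mean gray value of the image
    data to be segmented, or the average of the mean gray values of the four corner
    blocks and the central block of a 3x3 grid."""
    if not use_grid:
        return float(data.mean())
    h, w = data.shape
    bh, bw = h // 3, w // 3
    blocks = [data[:bh, :bw], data[:bh, -bw:],
              data[-bh:, :bw], data[-bh:, -bw:],
              data[bh:2 * bh, bw:2 * bw]]
    return float(np.mean([b.mean() for b in blocks]))
```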
  • a training set may be constructed using training samples and region segmentation results. That is, the training set may be constructed based on the training samples and at least one region segmentation result corresponding to the training samples.
  • the training set can include training samples and a gold standard of training samples.
  • a gold standard of training samples can be obtained based on the region segmentation results. That is, the target region can be identified based on the region segmentation result, and then the real category to which the pixels in the training samples belong can be determined based on the target region. Thereby, the gold standard of training samples can be obtained.
  • the ground-truth category may include at least one of the labeled categories of the target (for example, for fundus images, microaneurysm, spot hemorrhage, sheet hemorrhage, and linear hemorrhage), the no-target category, and the undetermined category; the specific choice is related to the process of optimizing the model to be trained 130.
  • the labeled category of the object in the real category may be the category to which the pixels of the target area (ie, the second area described later) of the object in the labeled area in the training sample belong.
  • the undetermined category in the true category may be the category to which the pixels of the region other than the target region (that is, the first region described later) of the object in the labeled region in the training sample belong.
  • the non-target category in the ground truth category can be the category to which the pixels outside the labeled region in the training samples belong.
  • the areas outside the labeled area in the training samples may include areas within the ROI that do not belong to the labeled area (ie, the third area described later).
  • the region within the region of interest and not belonging to the marked region may be the region corresponding to the non-target tissue in the medical image.
  • the areas outside the marked area in the training samples may include areas within the ROI that do not belong to the marked area, and areas outside the ROI (ie, the fourth area described later).
  • a validation set and a test set can also be constructed using the training samples and the region segmentation results.
  • the model training method may further include step S106.
  • in step S106, the model to be trained 130 may be trained based on the training set, and the model to be trained 130 may be optimized using a training loss function.
  • the model to be trained 130 may include, but is not limited to, a semantic segmentation model.
  • the prediction results of the model to be trained 130 may include, but not limited to, semantic segmentation results of training samples.
  • small objects can be identified.
  • the prediction result may be the semantic segmentation result of the image data.
  • the above-mentioned input 120 may be color image data.
  • high-dimensional feature information can be added to the model to be trained 130 .
  • the accuracy of recognition of small objects can be improved.
  • for feature information of different dimensions in medical images (such as the training samples), the feature information of a preset number of dimensions close to the highest dimension can be fused with the feature information of the highest dimension to enrich the high-dimensional feature information.
  • FIG. 6 is an architecture diagram showing an example of a model to be trained 130 employing a U-Net architecture involved in an example of the present disclosure.
  • FIG. 6 shows a model 130 to be trained using the U-Net architecture, wherein the network layers common to the U-Net architecture are not explained in detail here.
  • the preset number of dimensions can be 2, and the feature information of the two dimensions can include feature information 131a and feature information 131b, wherein the feature information 131a can be fused with the feature information of the highest dimension through the upsampling layer 132a, and the feature information 131b can be fused with the feature information of the highest dimension through the upsampling layer 132b.
  • the convolution size of the upsampling layer 132a and the upsampling layer 132b may be any value that makes feature information (for example, feature information 131a and feature information 131b ) consistent with the size of the feature information of the highest dimension after upsampling.
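  • A loose PyTorch-style sketch of this kind of fusion is shown below; the module name, channel arguments, use of 1x1 convolutions plus interpolation as the "upsampling layers", and fusion by concatenation are all assumptions, since the patent does not spell out these details here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HighDimFusion(nn.Module):
    """Sketch: bring two feature maps (standing in for 131a and 131b) to the spatial
    size of the highest-dimension feature map and fuse them with it by concatenation."""

    def __init__(self, ch_a: int, ch_b: int, ch_top: int):
        super().__init__()
        self.proj_a = nn.Conv2d(ch_a, ch_top, kernel_size=1)  # stands in for layer 132a
        self.proj_b = nn.Conv2d(ch_b, ch_top, kernel_size=1)  # stands in for layer 132b

    def forward(self, feat_a, feat_b, feat_top):
        size = feat_top.shape[-2:]
        a = F.interpolate(self.proj_a(feat_a), size=size, mode="bilinear", align_corners=False)
        b = F.interpolate(self.proj_b(feat_b), size=size, mode="bilinear", align_corners=False)
        return torch.cat([feat_top, a, b], dim=1)
```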
  • the prediction result corresponding to the training sample can be obtained by the model to be trained 130 based on the training sample of the training set, and the training loss function can then be constructed based on the region segmentation result and the prediction result corresponding to the training sample (that is, the training loss function can be constructed using the prediction result and the gold standard of the training sample obtained based on the region segmentation result).
  • the training loss function can represent the degree of difference between the gold standard of the training sample and the corresponding prediction result.
  • the region segmentation results can be directly used as the gold standard for training samples.
  • the region segmentation result may be used as the gold standard of the pixels in the labeled region corresponding to the target in the training sample to obtain the gold standard of the training sample.
  • the gold standard of pixels in areas other than the marked area corresponding to the target in the training sample can be set as required. For example, it may be fixed as a category (for example, it may be the no-target category involved in the example of the present disclosure). For another example, it may be set by manually labeling the training samples or automatically labeling the training samples by an artificial intelligence algorithm. The example of the present disclosure does not specifically limit the method of setting the gold standard of the pixels in the region other than the marked region corresponding to the target in the training sample.
  • weights may be assigned to the aforementioned pixels of undetermined categories in the training samples to reduce the negative impact of the pixels of undetermined categories on the model 130 to be trained.
  • accuracy of the model to be trained 130 can be improved.
  • spatial weights can be used to reduce the negative impact of pixels of undetermined categories in the training samples on the model 130 to be trained.
  • the training sample may be divided into several regions (also referred to as at least one region), and weights are used to adjust the influence of each of the several regions on the model 130 to be trained.
  • the number of regions may include the first region.
  • the first area may be an area of pixels of undetermined categories in the training sample (that is, an area other than the target area in the marked area in the training sample).
  • spatial weights may be used to reduce the negative impact of pixels in the first region of the training samples on the model to be trained 130 .
  • the pixels in the first region in the training samples may be assigned a first weight to reduce the negative impact on the model 130 to be trained.
  • the first weight may be any value that reduces the negative impact on the model 130 to be trained.
  • the first weight may be a fixed value.
  • the first weight may be zero. In this case, samples of undetermined categories can be ignored, so as to reduce the negative impact of samples of undetermined categories on the model 130 to be trained.
  • the number of regions may include the second region.
  • the second area may be the target area of the training samples.
  • pixels of the second region may be assigned a second weight in the spatial weights.
  • the first weight may be less than the second weight.
  • the second weight may be any value that increases the positive influence of the pixels in the second region on the model 130 to be trained.
  • the second weight can be a fixed value.
  • the second weight may be one.
  • the number of regions may include a third region.
  • the third region may be a region in the region of interest in the training sample that does not belong to the labeled region.
  • pixels of the third region may be assigned a third weight in the spatial weights.
  • the first weight may be less than the third weight.
  • the principle of setting the third weight may be similar to that of the second weight.
  • the number of regions may include a fourth region.
  • the fourth area may be an area outside the area of interest in the training sample.
  • pixels of the fourth region may be assigned a fourth weight in the spatial weights.
  • the fourth weight may be less than the second weight.
  • the principle of setting the fourth weight may be similar to that of the first weight.
  • several areas may include the first area, the second area, the third area and the fourth area at the same time, and the pixels in the first area, the second area, the third area and the fourth area may be respectively assigned the first weight, second weight, third weight and fourth weight, wherein the first weight may be smaller than the second weight and smaller than the third weight, and the fourth weight may be smaller than the second weight and smaller than the third weight.
  • the first weight may be 0, the second weight may be 1, the third weight may be 1, and the fourth weight may be 0.
  • the accuracy of the model to be trained 130 can be improved.
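  • As an illustration under the example weights 0/1/1/0, a per-pixel spatial weight map might be built roughly as follows; the mask names and the helper function are assumptions.

```python
import numpy as np

def build_weight_map(target_mask: np.ndarray,
                     labeled_mask: np.ndarray,
                     roi_mask: np.ndarray) -> np.ndarray:
    """Sketch: per-pixel spatial weights using the example values 0/1/1/0.
    target_mask  - 1 inside the target region (second region)
    labeled_mask - 1 inside any labeled region
    roi_mask     - 1 inside the region of interest"""
    weights = np.zeros(target_mask.shape, dtype=np.float32)
    first = (labeled_mask == 1) & (target_mask == 0)   # undetermined pixels
    second = target_mask == 1                          # target region
    third = (roi_mask == 1) & (labeled_mask == 0)      # inside ROI, outside labels
    fourth = roi_mask == 0                             # outside ROI
    weights[first] = 0.0
    weights[second] = 1.0
    weights[third] = 1.0
    weights[fourth] = 0.0
    return weights
```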
  • the several areas may include any combination of the first area, the second area, the third area and the fourth area.
  • FIG. 7 is a schematic diagram illustrating the several regions in some examples related to the present disclosure.
  • FIG. 7 is a schematic diagram showing various binarized regions, and does not limit the present disclosure to be divided into all the regions shown in FIG. 7 .
  • D3 may represent the first area
  • D4 may represent the second area
  • D5 may represent the third area
  • D6 may represent the fourth area.
  • in the spatial weighting, the training sample can be divided into several regions, and the weights can be used to adjust the influence of each of the several regions on the model to be trained 130.
  • losses may be computed per class.
  • the ground-truth category may include at least one of the labeled category of the target, the non-target category, and the undetermined category.
  • the classes can be derived from the ground-truth classes described above. That is, the categories in the training loss function may include the labeled category of the target and the non-target category, or the categories in the training loss function may include the labeled category of the target, the non-target category, and the undetermined category.
  • the categories in the specific training loss function are related to the samples selected in the training loss function.
  • in the training loss function, if the samples (that is, pixels) of each category in the training sample belong to corresponding regions among the several regions, the loss of the corresponding samples may be multiplied by the weight of the corresponding region.
  • the training loss function can be determined based on the spatial weights, and then the influence of pixels in different regions on the model to be trained 130 can be adjusted.
  • the influence of the samples of each category on the model 130 to be trained can be adjusted based on the weight of each category.
  • the influence of samples to be trained on the model 130 can be adjusted based on both spatial weights and category weights.
  • the influence of samples on the model to be trained 130 can be adjusted by region and class.
  • the training loss function may employ weighted balanced cross-entropy. In this case, the imbalance between positive and negative samples can be suppressed, thereby further improving the recognition accuracy of the model 130 to be trained for small targets.
  • when training the model 130 to be trained, a training loss function based on weighted balanced cross-entropy may be used, and spatial weights may be used to control the negative impact of pixels of undetermined category on the model to be trained 130 .
  • in the following, spatial weights in which the first weight of the first area is 0, the second weight of the second area is 1, the third weight of the third area is 1, and the fourth weight of the fourth area is 0 are taken as an example, and a training loss function based on weighted balanced cross-entropy is described on that basis. It should be noted that this does not represent a limitation of the disclosure, and those skilled in the art can design a training loss function based on weighted balanced cross-entropy by freely combining the weights of each area and of each category according to the situation.
  • the training loss function L based on weighted balanced cross-entropy can satisfy a formula of the following form (that is, it is equivalent to ignoring the losses of the first area and the fourth area by setting the first weight and the fourth weight to 0):

      L = -\sum_{i=1}^{C} \frac{W_i}{M_i} \sum_{j=1}^{M_i} \left[ y_{ij} \log p_{ij} + (1 - y_{ij}) \log (1 - p_{ij}) \right]    (1)
  • C can represent the number of categories
  • W_i can represent the weight of the i-th category
  • M_i can represent the number of samples of the i-th category
  • y_ij can represent the actual value of the j-th sample of the i-th category in the gold standard of the above training sample, and p_ij can represent the predicted value of the j-th sample of the i-th category in the prediction result (that is, the probability that the j-th sample belongs to the i-th category).
  • the samples of each category may be pixels of the corresponding category in the training samples.
  • the samples of a category can be determined based on the above-mentioned gold standard of the training samples.
  • the weight of the category can adjust the impact of samples of each category on the model 130 to be trained.
  • the categories in the training loss function can include the labeled category of the target and the non-target category, where the labeled category of the target may be the category to which the pixels in the second region of the training sample belong, and the non-target category may be the category to which the pixels in the third region of the training sample belong.
  • taking the fundus image as an example, the categories in the training loss function of Equation (1) can include microangioma, spot hemorrhage, sheet hemorrhage, linear hemorrhage and the non-target category.
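  • a sketch of how a loss of the kind in Equation (1) could be combined with the spatial weights is given below; it assumes per-pixel class probabilities of shape (C, H, W), a one-hot gold standard derived from the region segmentation results, per-category weights W_i, and PyTorch as the framework, all of which are illustrative assumptions rather than the disclosed implementation.

      import torch

      def weighted_balanced_ce(probs, gold, class_weights, spatial_weights, eps=1e-7):
          """probs, gold: (C, H, W) predicted probabilities and one-hot gold standard.
          class_weights: (C,) per-category weights W_i.
          spatial_weights: (H, W) map, 0 for the first/fourth regions, 1 otherwise."""
          loss = probs.new_zeros(())
          for i in range(probs.shape[0]):
              p, y = probs[i], gold[i]
              # per-pixel cross-entropy for category i
              ce = -(y * torch.log(p + eps) + (1.0 - y) * torch.log(1.0 - p + eps))
              ce = ce * spatial_weights                         # drop ignored regions
              m_i = (y * spatial_weights).sum().clamp(min=1.0)  # counted samples of category i
              loss = loss + class_weights[i] * ce.sum() / m_i
          return loss

    Whether M_i counts all contributing pixels or only the weighted pixels of category i is a design choice of this sketch, not a detail taken from the source.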
  • FIG. 8 is a flowchart illustrating an example of a method of recognizing an object in an image according to an example of the present disclosure.
  • the identification method may include step S302.
  • in step S302, a medical image serving as the input image may be acquired.
  • the input image can be input into the trained model after the same preprocessing as the above-mentioned training samples.
  • the identification method may further include step S304.
  • in step S304, at least one trained model can be used to determine the prediction result of each trained model for the input image, and the target prediction result can be obtained based on the prediction results of the at least one trained model, wherein the at least one trained model can be obtained by training with the above-mentioned model training method.
  • the at least one trained model may be models based on the same type of network architecture (e.g., the U-Net network architecture) but with different network structures and/or parameters. For example, some branches or network levels may be added or removed to form the at least one trained model.
  • but the examples of the present disclosure are not limited thereto, and in other examples, the at least one trained model may not be based on the same type of network architecture.
  • the prediction results of each trained model may include the probability that each pixel in the input image belongs to the corresponding labeled category.
  • the labeling category may be the labeling category of the above-mentioned target.
  • the prediction results of the at least one trained model may be integrated by labeled category and by pixel to obtain the integrated probability that each pixel of the input image belongs to the corresponding labeled category, a connected region may be determined based on the integrated probability, and the target prediction result corresponding to each labeled category may be obtained based on the connected region. In this case, obtaining the target prediction result based on the integrated probability can further improve the accuracy of the target prediction result.
  • in obtaining the integrated probability, if only one trained model exists, the probability that each pixel in the input image belongs to the corresponding labeled category in the prediction result of that trained model can be used as the integrated probability; otherwise, the prediction results of the multiple trained models are averaged to obtain the average probability that each pixel in the input image belongs to the corresponding labeled category (that is, a pixel-level probability average can be calculated for each labeled category), and this average is used as the integrated probability.
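  • a sketch of this integration step, assuming each trained model returns an array of shape (C, H, W) with per-pixel probabilities for the C labeled categories (the function name and array layout are assumptions):

      import numpy as np

      def integrated_probability(per_model_probs):
          """per_model_probs: list of (C, H, W) arrays, one per trained model."""
          if len(per_model_probs) == 1:
              # a single trained model: use its probabilities directly
              return per_model_probs[0]
          # multiple models: pixel-level mean per labeled category
          return np.mean(np.stack(per_model_probs, axis=0), axis=0)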
  • in determining the connected region based on the integrated probability, the connected region may be determined based on the integrated probability and the classification threshold of each labeled category. Specifically, the values of pixels whose integrated probability is not less than the classification threshold can be set to 1, and the values of other pixels can be set to 0. In some examples, the classification threshold can be determined on a validation set using performance metrics. In addition, if connected regions exist, the number of connected regions may be one or more.
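  • the thresholding and connected-region step could be sketched as below; the use of OpenCV's connectedComponents and the per-category threshold list are assumptions, and how the classification thresholds are tuned on the validation set is not reproduced here:

      import numpy as np
      import cv2

      def connected_regions(integrated_prob, class_thresholds):
          """integrated_prob: (C, H, W) array; class_thresholds: length-C sequence."""
          results = []
          for i, thr in enumerate(class_thresholds):
              binary = (integrated_prob[i] >= thr).astype(np.uint8)  # 1 where probability >= threshold
              num_labels, labels = cv2.connectedComponents(binary)   # label 0 is the background
              results.append((num_labels - 1, labels))               # (number of regions, label map)
          return results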
  • in obtaining the target prediction result based on the connected regions, the circumscribed rectangle of each connected region can be obtained. If the area of the circumscribed rectangle is greater than the second preset area, it can indicate that a target exists at the circumscribed rectangle; otherwise, it can indicate that no target exists at the circumscribed rectangle.
  • the second preset area may be a second preset ratio of the area of the training samples.
  • the second preset area may be expressed as sHW, where s may represent the second preset ratio, H may represent the height of the input image, and W may represent the width of the input image.
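  • continuing the sketch, the second preset area sHW could be used to filter the circumscribed rectangles of the connected regions as follows (the loop over label ids and the return format are assumptions):

      import numpy as np

      def detect_targets(labels, num_regions, s, H, W):
          """Keep a connected region only if its circumscribed rectangle exceeds s*H*W."""
          min_area = s * H * W                         # the second preset area
          detections = []
          for region_id in range(1, num_regions + 1):
              ys, xs = np.nonzero(labels == region_id)
              if xs.size == 0:
                  continue
              w = int(xs.max() - xs.min() + 1)         # width of the circumscribed rectangle
              h = int(ys.max() - ys.min() + 1)         # height of the circumscribed rectangle
              if w * h > min_area:                     # a target exists at this rectangle
                  detections.append((int(xs.min()), int(ys.min()), w, h))
          return detections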
  • the second preset ratio may be a fixed value. In some examples, the second preset ratio may be determined according to the median area of objects of the same category in the medical image. In some examples, the second preset ratio s may satisfy the formula:
  • m can represent the median of the areas of targets of the same category in the medical images
  • σ can represent the standard deviation of the areas of targets of the same category in the medical images
  • the remaining parameters in the formula can represent the average width and average height of the medical images, respectively.
  • the medical image may be an image in the data source used to acquire the second preset ratio.
  • the data source can be training data.
  • the relevant parameters involved in the second preset ratio may be acquired in a similar manner to the related parameters involved in acquiring the preset hyperparameters, which will not be repeated here.
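  • the statistics that enter the second preset ratio could be gathered from the data source (e.g. the training data) as sketched below; the text above lists the median and standard deviation of same-category target areas and the average image size, but does not reproduce how they are combined, so the final line is only a placeholder assumption:

      import numpy as np

      def second_preset_ratio(target_areas, image_widths, image_heights):
          """target_areas: areas of same-category targets; image_widths/image_heights: image sizes."""
          m = np.median(target_areas)        # median area of same-category targets
          sigma = np.std(target_areas)       # standard deviation of those areas
          mean_w = np.mean(image_widths)     # average image width
          mean_h = np.mean(image_heights)    # average image height
          # Placeholder assumption: scale the (median - deviation) area by the average image area.
          return max(m - sigma, 0.0) / (mean_w * mean_h)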
  • the present disclosure also relates to a computer-readable storage medium.
  • the computer-readable storage medium can store at least one instruction, and when the at least one instruction is executed by a processor, one or more steps in the above-mentioned model training method or recognition method are realized.
  • the present disclosure also relates to electronic devices, which may include at least one processing circuit.
  • the at least one processing circuit is configured to perform one or more steps in the above-mentioned model training method or recognition method.
  • the model training, method, device and medium for identifying targets in medical images according to the examples of the present disclosure under-segment the image data in the labeled region of the training sample to identify pixels of undetermined category in the labeled region, and train the model 130 to be trained in combination with the spatial weights to reduce the negative impact of the pixels of undetermined category in the labeled region on the model 130 to be trained, thereby improving the accuracy of the prediction result of the trained model for the input image.
  • small targets can be effectively identified.


Abstract

一种识别医学图像中的目标的模型训练、方法、设备及介质。模型训练包括获取作为训练样本的医学图像和训练样本中的目标对应的标注区域(S102);确定标注区域对应的区域分割结果,利用训练样本和区域分割结果构建训练集(S104),其中,通过对标注区域内的图像数据进行欠分割以获取区域分割结果;并且基于训练集训练待训练模型,并利用训练损失函数优化待训练模型(S106),其中,在训练损失函数中,利用空间权重减小训练样本中的第一区域的像素对待训练模型的负面影响,第一区域为训练样本中的标注区域内的目标的目标区域以外的区域,目标区域由区域分割结果确定。由此,能够有效地对小目标进行识别。

Description

识别医学图像中的目标的模型训练、方法、设备及介质 技术领域
本公开涉及基于人工智能的图像处理领域,具体涉及一种识别医学图像中的目标的模型训练、方法、设备及介质。
背景技术
近些年来,人工智能技术在计算机视觉领域中取得了巨大的成就。例如,深度学习技术在语义分割、图像分类和目标识别等方面的应用越来越广泛。特别在医学领域,常常通过对医学图像中的目标进行分割、识别或分类以辅助对目标进行分析。
目前,深度学习目标识别技术对大尺寸的目标可以获得较高的识别准确度,但是对小目标(例如细物体或小物体)的识别性能却不尽人意,容易造成漏报和虚警的情况,并且区别小目标的类别也很困难。例如,在眼底图像中,点状出血和微血管瘤等小目标体征由于目标小、颜色浅、颜色接近,因此在深度学习进行目标识别时不易发现,也不易区分。因此,如何有效地对小目标进行识别还有待于研究。
发明内容
本公开是有鉴于上述现有技术的状况而提出的,其目的在于提供一种能够有效地对小目标进行识别的识别医学图像中的目标的模型训练、方法、设备及介质。
为此,本公开第一方面提供一种识别医学图像中的目标的模型训练方法,包括:获取作为训练样本的所述医学图像和所述训练样本中的所述目标对应的标注区域;确定所述标注区域对应的区域分割结果,并利用所述训练样本和所述区域分割结果构建训练集,其中,通过对所述标注区域内的图像数据进行欠分割以获取所述区域分割结果;并且基于所述训练集训练待训练模型,并利用训练损失函数优化所述待训练模型,其中,在所述训练损失函数中,利用空间权重减小所述训 练样本中的第一区域的像素对所述待训练模型的负面影响,所述第一区域为所述训练样本中的所述标注区域内的所述目标的目标区域以外的区域,所述目标区域由所述区域分割结果确定。在这种情况下,通过对训练样本中标注区域内的图像数据进行欠分割以识别标注区域内未确定类别的像素,并结合空间权重对待训练模型进行训练以减小标注区域内的未确定类别的像素对待训练模型的负面影响,进而能够使训练后的待训练模型对输入图像的预测结果的准确性提高。由此,能够有效地对小目标进行识别。
另外,在本公开的第一方面所涉及的模型训练方法中,可选地,获取所述区域分割结果进一步包括:基于所述训练样本中所述标注区域对应的图像数据获取待分割图像数据、或基于所述训练样本中所述标注区域对应的图像数据以及感兴趣分割结果中所述标注区域对应的图像数据获取所述待分割图像数据,其中,所述感兴趣分割结果为用于识别所述训练样本的感兴趣区域的二值图像;并且利用目标分割阈值对所述待分割图像数据进行阈值分割,进而获取所述区域分割结果,其中,所述区域分割结果为二值图像。在这种情况下,能够通过阈值分割识别待分割图像数据中的目标区域,并且在标注区域包括感兴趣区域以外的区域时,能够消除感兴趣区域以外的噪声。
另外,在本公开的第一方面所涉及的模型训练方法中,可选地,根据所述目标所属的标注类别的获取阈值方法获取所述目标分割阈值,其中,各个标注类别的获取阈值方法由各个标注类别的平均面积和平均颜色确定,所述获取阈值方法包括第一种方法和第二种方法,所述第一种方法对应的标注类别的平均面积大于所述第二种方法对应的标注类别的平均面积且所述第一种方法对应的标注类别的平均颜色比所述第二种方法对应的标注类别的平均颜色浅;对于所述第一种方法,查找阈值,使所述待分割图像数据内灰度值大于所述阈值的像素的面积小于所述待分割图像数据的面积的预设倍数,将所述阈值作为所述目标分割阈值,其中,所述预设倍数大于0且小于1;对于所述第二种方法,若所述待分割图像数据的最小的边的长度小于预设长度,则取所述待分割图像数据中像素的灰度值的均值作为所述目标分割阈值,否则基于所述待分割图像数据的四个角的区域和中心区域的灰度值确 定所述目标分割阈值。在这种情况下,能够根据目标对应的标注类别自身的特点获取目标分割阈值。由此,能够提高阈值分割的准确性。
另外,在本公开的第一方面所涉及的模型训练方法中,可选地,在获取所述区域分割结果之前:还对所述待分割图像数据的阈值分割结果进行腐蚀操作以获取至少一个连通区域,从所述至少一个连通区域中选择中心离所述待分割图像数据的中心最近的所述连通区域作为所述区域分割结果。由此,能够获得准确的目标区域。
另外,在本公开的第一方面所涉及的模型训练方法中,可选地,在所述空间权重中,所述训练样本中的所述第一区域的像素被分配第一权重,其中,所述第一权重为0。在这种情况下,能够忽略未确定类别的样本,以减小未确定类别的样本对待训练模型的负面影响。
另外,在本公开的第一方面所涉及的模型训练方法中,可选地,所述训练样本中的所述第一区域、第二区域、第三区域和第四区域的像素分别被分配第一权重、第二权重、第三权重和第四权重,其中,所述第二区域为所述目标区域,所述第三区域为感兴趣区域内的不属于所述标注区域的区域,所述第四区域为所述感兴趣区域之外的区域,所述第一权重小于所述第二权重且小于所述第三权重,所述第四权重小于所述第二权重且小于所述第三权重。在这种情况下,能够抑制未确定类别的像素以及感兴趣区域以外的像素对待训练模型的负面影响,提高目标区域以内和感兴趣区域内的无目标区域对待训练模型的正面影响。由此,能够提高模型的准确性。
另外,在本公开的第一方面所涉及的模型训练方法中,可选地,所述待训练模型是语义分割模型,所述待训练模型的预测结果是所述训练样本的语义分割结果。由此,能够对小目标进行识别。
另外,在本公开的第一方面所涉及的模型训练方法中,可选地,所述标注区域的形状为矩形。由此,能够降低标注的难度。
本公开第二方面提供了一种电子设备,该电子设备包括:至少一个处理电路,所述至少一个处理电路被配置为执行本公开第一方面所述的模型训练方法的步骤。
本公开第三方面提供了一种计算机可读存储介质,所述计算机可读存储介质存储有至少一个指令,所述至少一个指令被处理器执行时 实现上述的模型训练方法的步骤。
本公开第四方面提供了一种识别医学图像中目标的方法,该方法包括:获取作为输入图像的所述医学图像;并且利用根据本公开第一方面所述的模型训练方法训练的至少一个经训练模型,确定针对所述输入图像的各个经训练模型的预测结果,基于所述至少一个经训练模型的预测结果获取目标预测结果。
另外,在本公开的第四方面所涉及的方法中,可选地,各个经训练模型的预测结果包括所述输入图像中的各个像素属于相应标注类别的概率,按标注类别和像素对所述至少一个经训练模型的预测结果进行集成以获取所述输入图像的各个像素属于相应标注类别的集成概率,基于所述集成概率确定连通区域,基于该连通区域获取各个标注类别对应的所述目标预测结果,其中,若仅存在一个经训练模型,则将所述概率作为所述集成概率,否则对多个经训练模型的预测结果求均值以获取所述输入图像中的各个像素属于相应标注类别的概率均值并作为所述集成概率。在这种情况下,基于集成概率获取目标预测结果,能够进一步提高目标预测结果的准确性。
另外,在本公开的第四方面所涉及的方法中,可选地,所述医学图像为眼底图像。在这种情况下,训练后获得的模型能够对眼底图像中的小目标进行识别。
另外,在本公开的第四方面所涉及的方法中,可选地,所述目标包括微血管瘤、点状出血、片状出血和线状出血。在这种情况下,训练后获得的模型能够对眼底图像中的小目标进行识别。
本公开第五方面提供了一种电子设备,该电子设备包括:至少一个处理电路,所述至少一个处理电路被配置为:获取作为输入图像的所述医学图像;并且利用根据本公开第一方面所述的模型训练方法训练的至少一个经训练模型,确定针对所述输入图像的各个经训练模型的预测结果,基于所述至少一个经训练模型的预测结果获取目标预测结果。
根据本公开,提供一种能够有效地对小目标进行识别的识别医学图像中的目标的模型训练、方法、设备及介质。
附图说明
现在将仅通过参考附图的例子进一步详细地解释本公开,其中:
图1是示出了本公开示例所涉及的识别目标环境的示例的示意图。
图2是示出了本公开示例所涉及的模型训练方法的示例的流程图。
图3是示出了本公开示例所涉及的一些示例的标注区域的示意图。
图4是示出了本公开示例所涉及的一些示例的区域分割结果的示意图。
图5是示出了本公开示例所涉及的获取区域分割结果的示例的流程图。
图6是示出了本公开示例所涉及的采用U-Net架构的待训练模型的示例的架构图。
图7是示出了本公开示例所涉及的一些示例的若干个区域的示意图。
图8是示出了本公开示例所涉及的识别图像中的目标的方法的示例的流程图。
具体实施方式
以下,参考附图,详细地说明本公开的优选实施方式。在下面的说明中,对于相同的部件赋予相同的符号,省略重复的说明。另外,附图只是示意性的图,部件相互之间的尺寸的比例或者部件的形状等可以与实际的不同。需要说明的是,本公开中的术语“包括”和“具有”以及它们的任何变形,例如所包括或所具有的一系列步骤或单元的过程、方法、***、产品或设备不必限于清楚地列出的那些步骤或单元,而是可以包括或具有没有清楚地列出的或对于这些过程、方法、产品或设备固有的其它步骤或单元。本公开所描述的所有方法可以以任何合适的顺序执行,除非在此另有指示或者与上下文明显矛盾。
本文中的术语“电路”可以指硬件电路和/或硬件电路和软件的组合。在本公开中术语“模型”能够处理输入并且提供相应输出。在本文中,术语“神经网络”、“深度神经网络”、“模型”、“网络”和“神经网络模型”可互换使用。另外,本文提到相关对象(例如,标注区域、待分割图像数据和目标)的矩形特性(例如,边、宽、高、 宽度和高度),若对象本身不为矩形,除非特别说明,可以默认为对象的外接矩形的矩形特性。
现有深度学习目标识别的方案,采用各种形状的框标的标注方式(也即,不要求准确边界的标注方式)对小目标进行识别。但是如上文所简要提及的,该方案对图像中小目标的识别效果不尽人意,存在较大的漏报和虚警的风险。这是由于小目标的面积较小,可提取的特征少,容易受噪声和其他组织的干扰。因此一种更好的方案是通过深度学习目标分割的方式对小目标进行分割从而实现对小目标的识别。但是该方案需要对小目标的边界进行精确标注,造成图像标注的困难。为了克服上面两个方案的不足,本公开通过对标注区域进行欠分割以获取区域分割结果,并使用该结果作为金标准对图像进行分割,从而实现了小目标的准确识别。特别地,本公开采用空间权重的方法处理标注区域内欠分割所导致的未确定类别的像素对图像分割的负面影响。在这种情况下,能够有效地对小目标进行识别。
因此,本公开的示例提出了一种训练模型和识别图像中的目标的方案,以解决上述问题和/或其他潜在问题中的一个或多个。该方案采用图像分割的方法进行目标识别(也即,先对训练样本中标注区域内的图像数据进行图像分割以获取区域分割结果,然后将区域分割结果进行后处理得到目标识别结果)。具体地,该方案通过对训练样本中标注区域内的图像数据进行欠分割以识别标注区域内未确定类别的像素,并结合空间权重(也即,权重的设置可以与像素的位置相关)对神经网络模型进行训练以减小标注区域内的未确定类别的像素对神经网络模型的负面影响,能够提高经训练模型对输入图像(例如,医学图像)的预测结果的准确性。另外,经训练模型可以为经训练的神经网络模型(也即,训练后的神经网络模型。例如,经训练的语义分割模型。由此,能够优化训练所得的模型的性能,提高模型对小目标的识别的准确性。在一些示例中,经训练模型可以是训练后获得的最优的神经网络模型。
本公开的示例涉及的训练模型和识别图像中的目标的方案,其有效地对小目标进行识别。本公开的示例涉及的识别图像中的目标的模 型训练方法可以简称为模型训练方法或训练方法。需要说明的是,本公开的示例涉及的方案同样适用于大目标的识别。
本公开的示例涉及的图像可以来自相机、CT扫描、PET-CT扫描、SPECT扫描、MRI、超声、X射线、血管造影照片、荧光图、胶囊内窥镜拍摄的图像或其组合。在一些示例中,图像可以为医学图像。例如,医学图像可以包括但不限于眼底图像、肺部图像、胃部图像、胸部图像和脑部图像等。由此,能够对医学图像中的小目标进行识别。在一些示例中,图像可以为自然图像。自然图像可以为自然场景下观察或者拍摄到的图像。由此,能够对自然图像中的小目标进行识别。以下以图像为医学图像中的眼底图像为例描述了本公开的示例,并且这样的描述并不限制本公开的范围,对于本领域技术人员而言,可以使用其它类型的图像而没有限制。
以下将结合附图来详细描述本公开的示例。为了便于理解,在下文描述中提及的具体数据均是示例性的,并不用于限定本公开的保护范围。应当理解,根据本公开示例还可以包括未示出的附加模块、可以省略所示出的模块、未示出的附加动作和/或可以省略所示出的动作,本公开的范围在此方面不受限制。
图1是示出了本公开示例所涉及的识别目标环境100的示例的示意图。如图1所示,识别目标环境100可以包括计算设备110。计算设备110可以是具有计算能力的任何设备。例如,计算设备110可以为云服务器、个人计算机、大型机和分布式计算***等。
计算设备110可以获取输入120并利用神经网络模型130(有时也可以被简称为待训练模型130或模型130)生成与输入120对应的输出140。在一些示例中,输入120可以为上述的图像,输出140可以为预测结果、训练参数(例如,权重)、或性能指标(例如,准确率和错误率)等。在一些示例中,神经网络模型130可以包括但不限于语义分割模型(例如,U-Net)、或者其他与图像处理相关的模型。另外,可以利用任何合适的网络结构来实现神经网络模型130。例如,卷积神经网络(CNN)、循环神经网络(RNN)和深度神经网络(DNN)等。
在一些示例中,识别目标环境100还可以包括模型训练装置和模型应用装置(未图示)。模型训练装置可以用于实施训练神经网络模 型130的训练方法以获取经训练模型。模型应用装置可以用于实施利用经训练模型获取预测结果的相关方法以对图像中的目标进行识别。另外,在模型训练阶段,神经网络模型130可以是待训练模型130。在模型应用阶段,神经网络模型130可以是经训练模型。
图2是示出了本公开示例所涉及的模型训练方法的示例的流程图。例如,模型训练方法可以由图1所示的计算设备110来执行。另外,模型训练方法可以训练识别医学图像中的目标的模型。
如图2所示,模型训练方法可以包括步骤S102。在步骤S102中,可以获取作为训练样本的医学图像和训练样本中的目标对应的标注区域。也即,在训练阶段,可以获取医学图像作为训练样本。由此,能够对医学图像中的目标进行识别。在一些示例中,医学图像可以为彩色图像。由此,能够提高对小目标识别的准确性。
另外,医学图像中可以包含相应的目标,目标可以属于至少一个感兴趣的类别(也即,需要识别的类别)。在一些示例中,对于医学图像为眼底图像,目标可以包括微血管瘤、点状出血、片状出血和线状出血等小目标。在这种情况下,训练后获得的模型能够对眼底图像中的小目标进行识别。
图3是示出了本公开示例所涉及的一些示例的标注区域的示意图。
在一些示例中,可以对训练样本中目标进行标注以获取标注区域。另外,标注区域的形状可以为矩形、圆形或与训练样本中目标的形状相匹配的形状(例如,标注区域的形状可以为目标的轮廓)。优先地,标注区域的形状可以为矩形。由此,能够降低标注的难度。作为示例,图3示出了眼底图像中的标注区域D1,其中,标注区域D1的形状为矩形,标注区域D1内的目标为片状出血。
另外,标注区域可以具有相应的标注标签(也即,目标的标注类别),该标注标签可以用于区分目标的类别。标注类别可以与目标的类别一一对应。例如,对于眼底图像,目标的类别和标注类别可以分别包括但不限于微血管瘤、点状出血、片状出血和线状出血等。在一些示例中,可以用数字表示相应的标注类别。由此,能够方便计算设备110进行计算。另外,标注区域和对应的标注标签可以称为标注结果。
如图2所示,模型训练方法还可以包括步骤S104。在步骤S104 中,可以确定训练样本中的标注区域对应的区域分割结果(也可以称为伪分割结果),并利用训练样本和区域分割结果构建训练集。需要说明的是,在另一些示例中,确定标注区域对应的区域分割结果也可以不是必须,只要能够识别中标注区域中的目标区域(稍后描述),且目标区域内的像素确定属于目标即可。
在一些示例中,根据实际情况(例如,训练样本的质量不满足训练要求或训练样本的尺寸大小不统一),在构建训练集之前,可以对训练样本进行相应的预处理后再用于构建训练集。
在一些示例中,对训练样本进行预处理可以包括对训练样本的尺寸大小进行统一。例如,可以将训练样本的尺寸大小统一为1024×1024或2048×2048。本公开并不限制训练样本的尺寸大小。在一些示例中,对训练样本进行预处理可以包括对训练样本进行裁剪。在一些示例中,在对训练样本进行裁剪中,可以获取训练样本中的感兴趣区域并利用感兴趣区域对训练样本进行裁剪。由此,能够使训练样本的尺寸一致且包含感兴趣区域。在一些示例中,感兴趣区域可以为可存在目标的区域(也可以称为前景区域)。例如,对于眼底图像,感兴趣区域可以为眼底区域。
在一些示例中,可以对训练样本进行分割以获取感兴趣区域。在一些示例中,可以对训练样本进行阈值分割以获取感兴趣分割结果,其中,感兴趣分割结果可以用于识别训练样本的感兴趣区域。由此,能够识别感兴趣区域。另外,通过阈值分割获得的感兴趣分割结果可以为二值图像(也可以称为二值化图像)。可以理解的是,虽然上文通过阈值分割的方式获取感兴趣分割结果,但其他适合于获取感兴趣分割结果也同样适用。例如,可以通过神经网络的方式获取感兴趣分割结果。
在一些示例中,对训练样本进行阈值分割可以为将训练样本平分为预设份数(例如9等份),基于训练样本的四个角的区域和中心区域的灰度值确定分割阈值,基于分割阈值对训练样本进行阈值分割,进而获取感兴趣分割结果。在一些示例中,基于训练样本的四个角的区域和中心区域的灰度值确定分割阈值可以为取四个角的区域中的各个区域的像素的灰度均值和中间区域的像素的灰度均值的平均值作为分 割阈值进行阈值分割,进而获取感兴趣分割结果。
另外,在阈值分割中,获取感兴趣分割结果之前,可以对训练样本对应的阈值分割结果(也即,初始分割结果)做腐蚀操作以获取感兴趣分割结果。例如,可以对训练样本的阈值分割结果做两次腐蚀操作以获取感兴趣分割结果,其中,腐蚀核的大小可以为5。由此,能够消除感兴趣区域(例如眼底区域)边缘的噪声。
返回参考图2,如上所述,在步骤S104中,可以确定标注区域对应的区域分割结果。另外,区域分割结果可以用于确定标注区域内的目标的目标区域。在这种情况下,能够识别出标注区域中的目标区域,进而能够基于目标区域确定未确定类别的像素。具体地,训练样本中的标注区域内的目标区域以外的像素可以为未确定类别的像素。
另外,区域分割结果可以是能够识别出目标区域的任意形式的数据(例如,图像)。在一些示例中,区域分割结果可以为二值图像。在一些示例中,对于为二值图像的区域分割结果中,可以令值为1的像素对应的区域为目标区域(也即,若像素的值为1,则可以表示训练样本中对应位置的像素属于目标,若像素的值为0,则可以表示训练样本中对应位置的像素为未确定类别的像素)。在这种情况下,能够减小未确定类别的像素对待训练模型130的负面影响。
图4是示出了本公开示例所涉及的一些示例的区域分割结果的示意图。
作为示例,图4示出了图3中的标注区域D1对应的区域分割结果A1,其中,D2为目标区域。另外,为了使区域分割结果A1显示得更清楚,图4的区域分割结果A1是进行了等比例放大的结果,并不表示对本公开的限制,图4的区域分割结果A1实际可以与标注区域D1的大小一致。
在一些示例中,可以对训练样本中标注区域内的图像数据进行欠分割以获取区域分割结果(也即,可以通过欠分割分割出标注区域内的目标对应的目标区域以获取区域分割结果)。由此,能够基于欠分割获得的区域分割结果识别标注区域内未确定类别的像素。一般而言,前景目标物误分割为背景但是背景未被误分割为前景目标可以称为欠分割。这里,欠分割可以为标注区域中的属于目标的像素误分割为非 目标但标注区域中的不属于目标的像素未被误分割为目标。在这种情况下,能够使区域分割结果中目标区域内的像素确定属于目标。另外,在标注区域以内,目标区域以外的像素不一定属于目标(也即,可以为未确定类别的像素)。
在一些示例中,可以基于训练样本和上述的感兴趣分割结果中标注区域分别对应的图像数据确定标注区域对应的区域分割结果。具体地,可以将训练样本中标注区域对应的图像数据(以下将训练样本中标注区域对应的图像数据简称为第一图像数据)和上述的感兴趣分割结果(也即,感兴趣分割结果可以为用于识别训练样本的感兴趣区域的二值图像)中标注区域对应的图像数据(以下将感兴趣分割结果中标注区域对应的图像数据简称为第二图像数据)进行乘积运算以获取待分割图像数据(也即,标注区域内的图像数据),对待分割图像数据进行欠分割确定标注区域对应的区域分割结果。在这种情况下,在标注区域包括感兴趣区域以外的区域时,能够消除感兴趣区域以外的噪声。
图5是示出了本公开示例所涉及的获取区域分割结果的示例的流程图。也即,本公开的一些示例获取区域分割结果的流程。
如图5所示,获取区域分割结果可以包括步骤S202。在步骤S202中,可以基于标注区域获取待分割图像数据。如上所述,第一图像数据可以为训练样本中标注区域对应的图像数据,第二图像数据可以为感兴趣分割结果中标注区域对应的图像数据。也即,可以基于标注区域可以获取第一图像数据和/或第二图像数据,然后基于第一图像数据、或第一图像数据和第二图像数据获取待分割图像数据。
在一些示例中,可以基于第一图像数据获取待分割图像数据。在一些示例,可以基于第一图像数据的颜色通道(例如,红色通道、绿色通道、蓝色通道)获取待分割图像数据。以眼底图像为例,可以基于第一图像数据的绿色通道获取待分割图像数据。具体地,可以从训练样本中获取(例如裁剪)标注区域对应的第一图像数据,然后取第一图像数据的绿色通道(也即G通道),基于第一图像数据的绿色通道获取待分割图像数据。在一些示例中,可以将第一图像数据的相应颜色通道(绿色通道)作为待分割图像数据以获取待分割图像数据。另 外,可以根据医学图像自身的特点选择颜色空间和颜色通道,本公开不做特别限制。
在另一些示例中,可以基于第一图像数据以及第二图像数据获取待分割图像数据。在这种情况下,在标注区域包括感兴趣区域以外的区域时,能够消除感兴趣区域以外的噪声。在一些示例中,可以基于第一图像数据的颜色通道以及第二图像数据获取待分割图像数据。具体地,可以令第一图像数据的颜色通道表示为G 1,第二图像数据表示为B 1,则待分割图像数据可以表示为I 1=G 1□B 1,其中,I 1可以表示待分割图像数据,□可以表示元素(也即,像素的灰度值)乘积运算。
需要说明的是,第一图像数据、第二图像数据以及待分割图像数据可以表示相应区域的图像数据(例如,像素数据、数据流或图像),在实践中,可以根据需要将相应区域的像素的值或像素的位置标记存储在相应的介质(例如内存或磁盘)以形成相应形式的图像数据,进而能够方便处理。另外,第一图像数据、第二图像数据和待分割图像数据的形状可以与标注区域的形状相匹配,也可以为标注区域的外接矩形,可以根据获取区域分割结果的方式进行选择。
另外,获取区域分割结果过程中,若必须利用待分割图像数据的矩形特性(例如,边、长、宽、高、四个角)且标注区域的形状不为矩形时,可以基于标注区域的外接矩形对应的区域获取待分割图像数据。也即,将标注区域的形状转换成矩形后,可以基于转换后的标注区域获取待分割图像数据。
如图5所示,获取区域分割结果还可以包括步骤S204。在步骤S204中,可以对待分割图像数据进行阈值分割以获取区域分割结果。但本公开的示例不限于此,在另一些示例中,也可以使用其他方式对待分割图像数据进行欠分割以获取区域分割结果。
在一些示例中,在步骤S204中,可以利用目标分割阈值(稍后描述)对待分割图像数据进行阈值分割,进而获取区域分割结果。由此,能够通过阈值分割识别待分割图像数据中的目标区域。在一些示例中,在阈值分割中,可以令待分割图像数据中灰度值不小于目标分割阈值的像素的值为1,其他像素的值为0,进而获取区域分割结果。
在一些示例中,在获取区域分割结果之前,还可以对待分割图像 数据的阈值分割结果(也即,初始分割结果)进行腐蚀操作。在这种情况下,能够降低由于噪声的影响,导致阈值分割结果中存在像素上孤立的概率。
在一些示例中,对待分割图像数据的阈值分割结果进行腐蚀操作中,腐蚀核k可以满足公式:
Figure PCTCN2022095137-appb-000001
其中,h可以表示标注区域(也即,待分割图像数据对应的标注区域)的高和w可以表示标注区域的宽,H可以表示训练样本的高度,W可以表示训练样本的宽度,p可以表示预设的超参数。在这种情况下,能够根据训练样本的大小、标注区域的大小和预设的超参数获取合适大小的腐蚀核。由此,能够抑制过度腐蚀。
在一些示例中,预设的超参数可以是用于调整腐蚀核的大小。在这种情况下,能够使特别小的目标采用较小的腐蚀核。由此,能够避免过度腐蚀操作而导致特别小的目标的目标区域消失。
在一些示例中,预设的超参数可以为固定值。在一些示例中,预设的超参数可以根据医疗图像中同类别的目标的平均的尺寸大小确定。在一些示例中,预设的超参数可以根据医疗图像中同类别的目标的平均宽度和平均高度确定。在一些示例中,预设的超参数p可以满足公式:
Figure PCTCN2022095137-appb-000002
其中,
Figure PCTCN2022095137-appb-000003
Figure PCTCN2022095137-appb-000004
可以分别表示医疗图像中同类别的目标的平均宽度和平均高度,σ w和σ h可以分别表示宽度标准差和高度标准差,
Figure PCTCN2022095137-appb-000005
Figure PCTCN2022095137-appb-000006
可以分别表示医疗图像的平均宽度和平均高度。这里,医疗图像可以为用于获取预设的超参数的数据源中的图像。在一些示例中,可以对多个训练样本中同类别的目标的宽度和高度以及训练样本的宽度和高度进行统计,以获取预设的超参数的相关参数。也即,数据源可以为训练数据。在一些示例中,在具有标注区域的医疗图像(例如,训练样本)中,获取预设的超参数时,目标的宽度和高度也可以为对应的标注区域的宽度和高度。由此,能够方便地获取目标的宽度和高度。
一般而言,待分割图像数据的阈值分割结果可能存在多个连通区域。在一些示例中,可以对待分割图像数据的阈值分割结果进行腐蚀操作以获取至少一个连通区域,从至少一个连通区域中选择中心离待分割图像数据的中心最近的连通区域作为区域分割结果。另外,离待分割图像数据的中心最近的连通区域可以表示识别出的目标区域。由此,能够获得准确的目标区域。在一些示例中,可以对腐蚀结果(也即,至少一个连通区域)查找轮廓,取面积最大的预设数量(例如3个)的轮廓作为候选,并保留候选轮廓中轮廓中心离待分割图像数据的中心最近的轮廓所对应的连通区域为区域分割结果。
另外,在待分割图像数据的阈值分割中,获取目标分割阈值的方式可以有多种。例如,可以根据常见的大律法(OTSU)的方式获取目标分割阈值。在一些示例中,获取目标分割阈值的方式可以从本公开的示例中描述的方式中选择至少一种。
在一些示例中,可以根据目标所属的标注类别获取目标分割阈值。在一些示例中,可以根据目标所属的标注类别的获取阈值方法获取目标分割阈值。在这种情况下,能够根据目标对应的标注类别自身的特点获取目标分割阈值。由此,能够提高阈值分割的准确性。另外,标注类别的获取阈值方法可以包括第一种方法和第二种方法。另外,训练样本中目标所属的标注类别可以是已知的。例如训练样本中目标所属的标注类别可以为标注结果中的标注标签。
在一些示例中,各个标注类别的获取阈值方法可以是通过各个标注类别的特征得到。在一些示例中,可以根据各个标注类别的平均面积和平均颜色来确定获取阈值方法。但本公开的示例不限于此,在另外一些示例中,也可以根据经验确定标注类别的获取阈值方法。例如,对于眼底图像,眼底图像中的片状出血可以使用第一种方法,眼底图像中的微血管瘤、点状出血和线状出血可以使用第二种方法。
在一些示例中,各个标注类别的平均面积和平均颜色可以是固定值,可以根据对样本数据进行统计获得。例如,可以对样本数据(例如训练样本)中同类别(例如,对于训练样本,同类别可以指同标注类别)的目标的面积和颜色分别进行求平均以获取平均面积和平均颜色。在另一些示例中,固定值也可以是经验值。
在一些示例中,在根据各个标注类别的平均面积和平均颜色确定标注类别的获取阈值方法中,第一种方法对应的标注类别的平均面积可以大于第二种方法对应的标注类别的平均面积且第一种方法对应的标注类别的平均颜色可以比第二种方法对应的标注类别的平均颜色浅。例如,第一种方法可以针对面积大,颜色浅的这种标注类别的目标(例如,眼底图像中的片状出血)。第二种方法可以针对面积小,颜色深的这种标注类别的目标(例如,眼底图像中的微血管瘤、点状出血和线状出血)。
在一些示例中,在根据各个标注类别的平均面积和平均颜色确定标注类别的获取阈值方法中,可以通过第一预设面积和预设颜色值确定标注类别使用的获取阈值方法。由此,能够自动获取标注类别所使用的获取阈值方法。
在一些示例中,若标注类别的平均面积大于第一预设面积且平均颜色小于预设颜色值(也即,该标注类别的目标的面积相对大,颜色相对浅),则可以将该标注类别确定为使用第一种方法,否则若标注类别的平均面积不大于第一预设面积且平均颜色不小于预设颜色值(也即,该标注类别的目标的面积相对小,颜色相对深),则可以将该标注类别确定为使用第二种方法。
在一些示例中,第一预设面积和预设颜色值可以根据区域分割结果进行调整。在一些示例中,第一预设面积和预设颜色值可以是固定值,固定值可以根据对样本数据进行统计获得。也即,可以利用统计学的方法统计少量的样本数据在不同第一预设面积和预设颜色值下的区域分割结果以确定最佳的用于分类的第一预设面积和预设颜色值。
如上所述,可以根据目标所属的标注类别的获取阈值方法获取目标分割阈值。在一些示例中,可以根据目标所属的标注类别的获取阈值方法和训练样本对应的待分割图像数据获取目标分割阈值。
在一些示例中,对于第一种方法(也即,目标所属的标注类别的获取阈值方法为第一种方法),可以查找阈值,使待分割图像数据内灰度值大于该阈值的像素的面积小于待分割图像数据的面积的预设倍数,将该阈值作为目标分割阈值,其中,预设倍数可以大于0且小于1。以医学图像为8位量化的图像为例,可以遍历0至255的阈值,找到阈 值使得待分割图像数据内灰度值大于该阈值的像素的面积小于待分割图像数据的面积的预设倍数,将该阈值作为目标分割阈值。另外,预设倍数可以为使目标区域不过分割的任意值。例如,预设倍数可以取偏小的值以使目标区域不过分割。在一些示例中,预设倍数可以由目标的形状通过经验来确定。
在一些示例中,对于第二种方法(也即,目标所属的标注类别的获取阈值方法为第二种方法),可以将待分割图像数据中像素的灰度值的均值作为目标分割阈值或基于待分割图像数据的四个角的区域和中心区域的灰度值确定目标分割阈值。
在一些示例中,对于第二种方法,若待分割图像数据的最小的边的长度小于预设长度,则可以取待分割图像数据中像素的灰度值的均值作为目标分割阈值。在一些示例中,预设长度可以为使目标区域不过分割的任意值。在一些示例中,预设长度可以为训练样本的最小边的第一预设比例。具体地,预设长度可以表示为min(rH,rW),其中,r可以表示第一预设比例,H可以表示训练样本的高度,W可以表示训练样本的宽度。
在一些示例中,第一预设比例可以为固定值。在一些示例中,第一预设比例可以根据医疗图像中同类别的目标的平均的尺寸大小确定。在一些示例中,第一预设比例可以根据医疗图像中同类别的目标的平均宽度和平均高度确定。在一些示例中,第一预设比例可以满足公式:
Figure PCTCN2022095137-appb-000007
其中,
Figure PCTCN2022095137-appb-000008
Figure PCTCN2022095137-appb-000009
可以分别表示医疗图像中同类别的目标的平均宽度和平均高度,σ w和σ h可以分别表示宽度标准差和高度标准差,
Figure PCTCN2022095137-appb-000010
Figure PCTCN2022095137-appb-000011
可以分别表示医疗图像的平均宽度和平均高度。这里,医疗图像可以为用于获取第一预设比例的数据源中的图像。在一些示例中,数据源可以为训练数据。另外,可以与获取预设的超参数涉及的相关参数类似的方式获取第一预设比例涉及的相关参数,此处不再赘述。
在一些示例中,对于第二种方法,若待分割图像数据的最小的边的长度不小于预设长度,则可以基于待分割图像数据的四个角的区域和中心区域的灰度值确定目标分割阈值。具体地,可以将待分割图像数据平分为预设份数(例如9等份),并基于待分割图像数据的四个角 的区域和中心区域的灰度值确定目标分割阈值。具体内容参见上述基于训练样本的四个角的区域和中心区域的灰度值确定分割阈值的相关描述。
返回参考图2,如上所述,在步骤S104中,可以利用训练样本和区域分割结果构建训练集。也即,可以基于训练样本和训练样本对应的至少一个区域分割结果构建训练集。在一些示例中,训练集可以包括训练样本和训练样本的金标准。在一些示例中,可以基于区域分割结果获取训练样本的金标准。也即,可以基于区域分割结果识别目标区域,进而基于目标区域确定训练样本中的像素属于的真实类别。由此,能够获取训练样本的金标准。
在一些示例中,真实类别可以包括目标的标注类别(例如,对于眼底图像,可以包括微血管瘤、点状出血、片状出血和线状出血)、无目标类别和未确定类别中的至少一项。具体跟优化待训练模型130的过程有关。
另外,真实类别中的目标的标注类别可以为训练样本中的标注区域内的目标的目标区域(也即,稍后描述的第二区域)的像素属于的类别。另外,真实类别中的未确定类别可以为训练样本中的标注区域内的目标的目标区域以外的区域(也即,稍后描述的第一区域)的像素属于的类别。另外,真实类别中的无目标类别可以为训练样本中标注区域之外的像素属于的类别。在一些示例中,训练样本中标注区域之外的区域可以包括感兴趣区域之内且不属于标注区域的区域(也即,稍后描述的第三区域)。例如,对于医学图像,感兴趣区域之内且不属于标注区域的区域可以为医学图像中无目标的组织对应的区域。在一些示例中,训练样本中标注区域之外的区域可以包括感兴趣区域之内且不属于标注区域的区域、和感兴趣区域之外的区域(也即,稍后描述的第四区域)。
在一些示例中,还可以利用训练样本和区域分割结果构建验证集和测试集。
返回参考图2,模型训练方法还可以包括步骤S106。在步骤S106中,可以基于训练集训练待训练模型130,并利用训练损失函数优化待训练模型130。
在一些示例中,待训练模型130可以包括但不限于是语义分割模型。另外,待训练模型130的预测结果可以包括但不限于是训练样本的语义分割结果。由此,能够对小目标进行识别。例如,在上述的输入120是待进行语义分割的图像数据,并且待训练模型130是语义分割模型的示例中,预测结果可以是图像数据的语义分割结果。另外,上述的输入120可以为彩色的图像数据。
在一些示例中,在待训练模型130中,可以增加高维度的特征信息。由此,能够提高对小目标的识别的准确性。在一些示例中,在待训练模型130中,可以提取医学图像(例如训练样本)中的不同维度的特征信息,并将靠近最高维度的特征信息的预设维度的特征信息与最高维度的特征信息进行融合以增加高维度的特征信息。
图6是示出了本公开示例所涉及的采用U-Net架构的待训练模型130的示例的架构图。
作为示例,图6示出了采用U-Net架构的待训练模型130,其中,对于U-Net架构中的常见的网络层,此处不做过多解释。如图6所示,预设维度可以为2,2个维度的特征信息可以包括特征信息131a和特征信息132b,其中,特征信息131a可以通过上采样层132a与最高维度的特征信息进行融合,特征信息131b可以通过上采样层132b与最高维度的特征信息进行融合。另外,上采样层132a和上采样层132b的卷积大小可以是使特征信息(例如,特征信息131a和特征信息131b)经过上采样与最高维度的特征信息的大小一致的任意值。
在一些示例中,在训练待训练模型130中,可以通过待训练模型130基于训练集的训练样本,获得训练样本对应的预测结果,然后基于训练样本对应的区域分割结果和预测结果构建训练损失函数(也即,可以利用基于区域分割结果获得的训练样本的金标准和预测结果构建训练损失函数)。另外,训练损失函数可以表示训练样本的金标准与对应的预测结果的差异程度。
在一些示例中,可以直接将区域分割结果作为训练样本的金标准。在一些示例中,可以将区域分割结果作为训练样本中的目标对应的标注区域内的像素的金标准以获取训练样本的金标准。另外,训练样本中的目标对应的标注区域以外的区域的像素的金标准可以根据需要进 行设置。例如,可以固定设置为一种类别(例如,可以为本公开示例涉及的无目标类别)。又例如,可以通过对训练样本进行人工标注的方式进行设置或通过人工智能算法对训练样本进行自动标注的方式进行设置。本公开的示例对训练样本中的目标对应的标注区域以外的区域的像素的金标准的设置方式不做特别限制。
在一些示例中,在训练损失函数中,可以对训练样本中上述的未确定类别的像素分配权重以减小未确定类别的像素对待训练模型130的负面影响。由此,能够提高待训练模型130的准确性。在一些示例中,在训练损失函数中,可以利用空间权重减小训练样本中的未确定类别的像素对待训练模型130的负面影响。
在一些示例中,在空间权重中,可以将训练样本分成若干个区域(也可以称为至少一个区域),并利用权重调整若干个区域中的各个区域对待训练模型130的影响。
在一些示例中,若干个区域可以包括第一区域。第一区域可以为训练样本中的未确定类别的像素的区域(也即,训练样本中标注区域内的目标区域以外的区域)。在一些示例中,在训练损失函数中,可以利用空间权重减小训练样本中的第一区域的像素对待训练模型130的负面影响。在一些示例中,在空间权重中,训练样本中的第一区域的像素可以被分配第一权重以减小对待训练模型130的负面影响。
另外,第一权重可以是使对待训练模型130的负面影响减小的任意值。在一些示例中,第一权重可以是固定值。在一些示例中,第一权重可以为0。在这种情况下,能够忽略未确定类别的样本,以减小未确定类别的样本对待训练模型130的负面影响。
在一些示例中,若干个区域可以包括第二区域。第二区域可以为训练样本的目标区域。在一些示例中,在空间权重中,第二区域的像素可以被分配第二权重。在一些示例中,第一权重可以小于第二权重。另外,第二权重可以是使第二区域的像素对待训练模型130的正面影响增大的任意值。在一些示例中,第二权重可以是固定值。在一些示例中,第二权重可以为1。
在一些示例中,若干个区域可以包括第三区域。第三区域可以为训练样本中感兴趣区域内的不属于标注区域的区域。在一些示例中, 在空间权重中,第三区域的像素可以被分配第三权重。在一些示例中,第一权重可以小于第三权重。另外,第三权重的设置原则可以与第二权重类似。
在一些示例中,若干个区域可以包括第四区域。第四区域可以为训练样本中感兴趣区域之外的区域。在一些示例中,在空间权重中,第四区域的像素可以被分配第四权重。在一些示例中,第四权重可以小于第二权重。另外,第四权重的设置原则可以与第一权重类似。
在一些示例中,若干个区域可以同时包括第一区域、第二区域、第三区域和第四区域,第一区域、第二区域、第三区域和第四区域的像素可以分别被分配第一权重、第二权重、第三权重和第四权重,其中,第一权重可以小于第二权重且小于第三权重,第四权重可以小于第二权重且小于第三权重。在这种情况下,能够抑制未确定类别的像素以及感兴趣区域以外的像素对待训练模型130的负面影响,提高目标区域以内和感兴趣区域内的无目标区域对待训练模型130的正面影响。由此,能够提高模型的准确性。优选地,第一权重可以为0,第二权重可以为1,第三权重可以为1,第四权重可以为0。在这种情况下,能够避免未确定类别的像素以及感兴趣区域以外的像素对待训练模型130的负面影响,提高目标区域以内和感兴趣区域内的无目标区域对待训练模型130的正面影响。由此,能够提高待训练模型130的准确性。
但本公开的示例不限于此,在另一些示例中,若干个区域可以包括第一区域、第二区域、第三区域和第四区域的任意组合。
图7是示出了本公开示例所涉及的一些示例的若干个区域的示意图。
另外,为了清楚地描述若干个区域,图7是示出了二值化的各个区域的示意图,并不限制本公开一定要划分成图7所示的所有区域。其中,D3可以表示第一区域、D4可以表示第二区域,D5可以表示第三区域,D6可以表示第四区域。
如上所述,在一些示例中,在空间权重中,可以将训练样本分成若干个区域,并利用权重调整若干个区域中的各个区域对待训练模型130的影响。
在一些示例中,在训练损失函数中,可以按类别计算损失。如上 所述,真实类别可以包括目标的标注类别、无目标类别和未确定类别中的至少一项。在一些示例中,在训练损失函数中,类别可以来源上述的真实类别。也即,训练损失函数中的类别可以包括目标的标注类别和无目标类别、或训练损失函数中的类别可以包括目标的标注类别、无目标类别、和未确定类别。具体训练损失函数中的类别与训练损失函数中选择的样本有关。
在一些示例中,在训练损失函数中,若训练样本中各个类别的样本(也即,像素)属于若干个区域中的相应区域,则可以将相应样本的损失乘以对应区域的权重。在这种情况下,能够基于空间权重确定训练损失函数,进而调整不同区域的像素对待训练模型130的影响。
在一些示例中,在训练损失函数中,可以基于各个类别的权重调整各个类别的样本对待训练模型130的影响。由此,能够调整不同类别的样本对待训练模型130的影响。
在一些示例中,在训练损失函数中,可以同时基于空间权重和类别的权重调整样本对待训练模型130的影响。由此,能够按区域和类别调整样本对待训练模型130的影响。
在一些示例中,训练损失函数可以采用加权均衡交叉熵。在这种情况下,能够抑制正负样本失衡,进而进一步提高待训练模型130对小目标的识别的准确性。在一些示例中,在训练待训练模型130时,可以基于加权均衡交叉熵的训练损失函数,并利用空间权重来控制未确定类别的像素对待训练模型130的负面影响。
以下以空间权重中,第一区域的第一权重为0,第二区域的第二权重为1,第三区域的第三权重为1,第四区域的第四权重为0为例,描述基于加权均衡交叉熵的训练损失函数。需要说明的是,并不表示对本公开的限制,本领域人员可以根据情况,通过自由组合若干个区域中各个区域的权重和各个类别的权重设计基于加权均衡交叉熵的训练损失函数。基于加权均衡交叉熵的训练损失函数L可以满足公式(也即,相当于通过设置第一权重和第四权重为0,忽略第一区域和第四区域的损失):
Figure PCTCN2022095137-appb-000012
其中,C可以表示类别的数量,W i可以表示第i个类别的权重,M i可以表示第i个类别的样本数量,y ij可以表示上述训练样本的金标准中第i个类别的第j个样本的真实值,p ij可以表示预测结果中第i个类别的第j个样本的预测值(也即,第j个样本属于第i个类别的概率)。另外,各个类别的样本可以为训练样本中相应类别的像素。另外,一个类别的样本,可以基于上述训练样本的金标准确定。如上所述,类别的权重可以调整各个类别的样本对待训练模型130的影响。
另外,公式(1)中,通过设置第一权重和第四权重为0,忽略第一区域和第四区域的样本,训练损失函数中的类别可以包括目标的标注类别和无目标类别,目标的标注类别可以为训练样本中第二区域的像素所属的类别,无目标类别可以为训练样本中第三区域的像素所属的类别。以眼底图像为例,公式(1)的训练损失函数中的类别可以包括微血管瘤、点状出血、片状出血和线状出血和无目标类别。
以下结合图8描述本公开涉及的识别图像中的目标的方法(以下简称识别方法)。另外,识别方法可以识别医学图像中的目标。图8是示出了本公开示例所涉及的识别图像中的目标的方法的示例的流程图。
如图8所示,识别方法可以包括步骤S302。在步骤S302中,可以获取作为输入图像的医学图像。在一些示例中,输入图像可以经过与上述的训练样本一样的预处理后再输入经训练模型。
如图8所示,识别方法还可以包括步骤S304。在步骤S304中,可以利用至少一个经训练模型,确定针对输入图像的各个经训练模型的预测结果,基于至少一个经训练模型的预测结果获取目标预测结果,其中,至少一个经训练模型可以根据上述模型训练方法训练获得。另外,至少一个经训练模型可以为基于同一种类型的网络架构(例如,U-Net),但网络结构不同和/或参数不同的模型。例如,可以增加或减少一些分支或网络层次以形成至少一个经训练模型。但本公开的示例不限于此,在另一些示例中,至少一个经训练模型也可以不基于同一种类型的网络架构。
另外,各个经训练模型的预测结果可以包括输入图像中的各个像素属于相应标注类别的概率。标注类别可以为上述目标的标注类别。在一些示例中,可以按标注类别和像素对至少一个经训练模型的预测 结果进行集成以获取输入图像的各个像素属于相应标注类别的集成概率,基于集成概率确定连通区域,基于该连通区域获取各个标注类别对应的目标预测结果。在这种情况下,基于集成概率获取目标预测结果,能够进一步提高目标预测结果的准确性。
在一些示例中,在获取集成概率中,若仅存在一个经训练模型,则可以将该经训练模型的预测结果中输入图像中的各个像素属于相应标注类别的概率作为集成概率,否则可以对多个经训练模型的预测结果求均值以获取输入图像中的各个像素属于相应标注类别的概率均值(也即,可以按标注类别进行像素级别的求概率均值)。
在一些示例中,在基于集成概率确定连通区域中,可以基于集成概率和各个标注类别的分类阈值确定连通区域。具体地,可以将集成概率不小于分类阈值的像素的值设置为1,其他像素的值设置为0。在一些示例中,分类阈值可以基于验证集并采用性能指标来确定。另外,若存在连通区域,连通区域的数量可以为一个或多个。
在一些示例中,在基于连通区域获取目标预测结果中,可以获取各个连通区域的外接矩形,若外接矩形的面积大于第二预设面积,则可以表示该外接矩形处存在目标,否则可以表示该外接矩形处不存在目标。
在一些示例中,第二预设面积可以为训练样本的面积的第二预设比例。具体地,第二预设面积可以表示为sHW,其中,s可以表示第二预设比例,H可以表示输入图像的高度,W可以表示输入图像的宽度。
在一些示例中,第二预设比例可以为固定值。在一些示例中,第二预设比例可以根据医疗图像中同类别的目标的面积的中值确定。在一些示例中,第二预设比例s可以满足公式:
Figure PCTCN2022095137-appb-000013
其中,m可以分别表示医疗图像中同类别的目标的面积的中值,σ可以表示医疗图像中同类别的目标的面积的标准差,
Figure PCTCN2022095137-appb-000014
Figure PCTCN2022095137-appb-000015
可以分别表示医疗图像的平均宽度和平均高度。这里,医疗图像可以为用于获取第二预设比例的数据源中的图像。在一些示例中,数据源可以为训练数据。另外,可以与获取预设的超参数涉及的相关参数类似的方式 获取第二预设比例涉及的相关参数,此处不再赘述。
本公开还涉及计算机可读存储介质,该计算机可读存储介质可以存储有至少一个指令,至少一个指令被处理器执行时实现上述的模型训练方法或识别方法中的一个或多个步骤。
本公开还涉及电子设备,电子设备可以包括至少一个处理电路。至少一个处理电路被配置为上述的模型训练方法或识别方法中的一个或多个步骤。
本公开的示例的识别医学图像中的目标的模型训练、方法、设备及介质,通过对训练样本中标注区域内的图像数据进行欠分割以识别标注区域内未确定类别的像素,并结合空间权重对待训练模型130进行训练以减小标注区域内的未确定类别的像素对待训练模型130的负面影响,进而能够使训练后的待训练模型130对输入图像的预测结果的准确性提高。由此,能够有效地对小目标进行识别。
虽然以上结合附图和示例对本公开进行了具体说明,但是可以理解,上述说明不以任何形式限制本公开。本领域技术人员在不偏离本公开的实质精神和范围的情况下可以根据需要对本公开进行变形和变化,这些变形和变化均落入本公开的范围内。

Claims (15)

  1. 一种识别医学图像中的目标的模型训练方法,其特征在于,包括:获取作为训练样本的所述医学图像和所述训练样本中的所述目标对应的标注区域;确定所述标注区域对应的区域分割结果,并利用所述训练样本和所述区域分割结果构建训练集,其中,通过对所述标注区域内的图像数据进行欠分割以获取所述区域分割结果;并且基于所述训练集训练待训练模型,并利用训练损失函数优化所述待训练模型,其中,在所述训练损失函数中,利用空间权重减小所述训练样本中的第一区域的像素对所述待训练模型的负面影响,所述第一区域为所述训练样本中的所述标注区域内的所述目标的目标区域以外的区域,所述目标区域由所述区域分割结果确定。
  2. 根据权利要求1所述的模型训练方法,其特征在于,获取所述区域分割结果进一步包括:
    基于所述训练样本中所述标注区域对应的图像数据获取待分割图像数据、或基于所述训练样本中所述标注区域对应的图像数据以及感兴趣分割结果中所述标注区域对应的图像数据获取所述待分割图像数据,其中,所述感兴趣分割结果为用于识别所述训练样本的感兴趣区域的二值图像;并且利用目标分割阈值对所述待分割图像数据进行阈值分割,进而获取所述区域分割结果,其中,所述区域分割结果为二值图像。
  3. 根据权利要求2所述的模型训练方法,其特征在于:
    根据所述目标所属的标注类别的获取阈值方法获取所述目标分割阈值,其中,各个标注类别的获取阈值方法由各个标注类别的平均面积和平均颜色确定,所述获取阈值方法包括第一种方法和第二种方法,所述第一种方法对应的标注类别的平均面积大于所述第二种方法对应的标注类别的平均面积且所述第一种方法对应的标注类别的平均颜色比所述第二种方法对应的标注类别的平均颜色浅;对于所述第一种方法,查找阈值,使所述待分割图像数据内灰度值大于所述阈值的像素的面积小于所述待分割图像数据的面积的预设倍数,将所述阈值作为所述目标分割阈值,其中,所述预设倍数大于0且小于1;对于所述第二种方法,若所述待分割图像数据的最小的边的长度小于预设长度,则取所 述待分割图像数据中像素的灰度值的均值作为所述目标分割阈值,否则基于所述待分割图像数据的四个角的区域和中心区域的灰度值确定所述目标分割阈值。
  4. 根据权利要求2所述的模型训练方法,其特征在于,在获取所述区域分割结果之前:还对所述待分割图像数据的阈值分割结果进行腐蚀操作以获取至少一个连通区域,从所述至少一个连通区域中选择中心离所述待分割图像数据的中心最近的所述连通区域作为所述区域分割结果。
  5. 根据权利要求1所述的模型训练方法,其特征在于:
    在所述空间权重中,所述训练样本中的所述第一区域的像素被分配第一权重,其中,所述第一权重为0。
  6. 根据权利要求1所述的模型训练方法,其特征在于:
    所述训练样本中的所述第一区域、第二区域、第三区域和第四区域的像素分别被分配第一权重、第二权重、第三权重和第四权重,其中,所述第二区域为所述目标区域,所述第三区域为感兴趣区域内的不属于所述标注区域的区域,所述第四区域为所述感兴趣区域之外的区域,所述第一权重小于所述第二权重且小于所述第三权重,所述第四权重小于所述第二权重且小于所述第三权重。
  7. 根据权利要求1所述的模型训练方法,其特征在于:
    所述待训练模型是语义分割模型,所述待训练模型的预测结果是所述训练样本的语义分割结果。
  8. 根据权利要求1所述的模型训练方法,其特征在于:
    所述标注区域的形状为矩形。
  9. 一种电子设备,其特征在于,包括,至少一个处理电路,所述至少一个处理电路被配置为执行如权利要求1至8中任一项所述的模型训练方法。
  10. 一种计算机可读存储介质,其特征在于,所述计算机可读存储介质存储有至少一个指令,所述至少一个指令被处理器执行时实现如权利要求1至8 中任一项所述的模型训练方法。
  11. 一种识别医学图像中目标的方法,其特征在于,包括:获取作为输入图像的所述医学图像;并且利用根据权利要求1至8中任一项所述的模型训练方法训练的至少一个经训练模型,确定针对所述输入图像的各个经训练模型的预测结果,基于所述至少一个经训练模型的预测结果获取目标预测结果。
  12. 根据权利要求11所述的方法,其特征在于:
    各个经训练模型的预测结果包括所述输入图像中的各个像素属于相应标注类别的概率,按标注类别和像素对所述至少一个经训练模型的预测结果进行集成以获取所述输入图像的各个像素属于相应标注类别的集成概率,基于所述集成概率确定连通区域,基于该连通区域获取各个标注类别对应的所述目标预测结果,其中,若仅存在一个经训练模型,则将所述概率作为所述集成概率,否则对多个经训练模型的预测结果求均值以获取所述输入图像中的各个像素属于相应标注类别的概率均值并作为所述集成概率。
  13. 根据权利要求11所述的方法,其特征在于:
    所述医学图像为眼底图像。
  14. 根据权利要求13所述的方法,其特征在于:
    所述目标包括微血管瘤、点状出血、片状出血和线状出血。
  15. 一种电子设备,其特征在于,包括:至少一个处理电路,所述至少一个处理电路被配置为:获取作为输入图像的医学图像;并且利用根据权利要求1至8中任一项所述的模型训练方法训练的至少一个经训练模型,确定针对所述输入图像的各个经训练模型的预测结果,基于所述至少一个经训练模型的预测结果获取目标预测结果。
PCT/CN2022/095137 2022-03-02 2022-05-26 识别医学图像中的目标的模型训练、方法、设备及介质 WO2023165033A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210205467.4 2022-03-02
CN202210205467.4A CN114581709A (zh) 2022-03-02 2022-03-02 识别医学图像中的目标的模型训练、方法、设备及介质

Publications (1)

Publication Number Publication Date
WO2023165033A1 true WO2023165033A1 (zh) 2023-09-07

Family

ID=81777415

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/095137 WO2023165033A1 (zh) 2022-03-02 2022-05-26 识别医学图像中的目标的模型训练、方法、设备及介质

Country Status (2)

Country Link
CN (1) CN114581709A (zh)
WO (1) WO2023165033A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117611926A (zh) * 2024-01-22 2024-02-27 重庆医科大学绍兴柯桥医学检验技术研究中心 一种基于ai模型的医学影像识别方法及***
CN117689660A (zh) * 2024-02-02 2024-03-12 杭州百子尖科技股份有限公司 基于机器视觉的保温杯温度质检方法

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115082990A (zh) * 2022-06-27 2022-09-20 平安银行股份有限公司 人脸的活体检测方法及装置

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110503654A (zh) * 2019-08-01 2019-11-26 中国科学院深圳先进技术研究院 一种基于生成对抗网络的医学图像分割方法、***及电子设备
CN110543911A (zh) * 2019-08-31 2019-12-06 华南理工大学 一种结合分类任务的弱监督目标分割方法
CN110689548A (zh) * 2019-09-29 2020-01-14 浪潮电子信息产业股份有限公司 一种医学图像分割方法、装置、设备及可读存储介质
WO2021179205A1 (zh) * 2020-03-11 2021-09-16 深圳先进技术研究院 医学图像分割方法、医学图像分割装置及终端设备
CN113678142A (zh) * 2019-04-11 2021-11-19 安捷伦科技有限公司 基于深度学习的经由回归层的实例分割训练
CN113920109A (zh) * 2021-10-29 2022-01-11 沈阳东软智能医疗科技研究院有限公司 医疗影像识别模型训练方法、识别方法、装置及设备
US20220058446A1 (en) * 2019-07-12 2022-02-24 Tencent Technology (Shenzhen) Company Limited Image processing method and apparatus, terminal, and storage medium

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150051434A1 (en) * 2012-02-21 2015-02-19 Koninklijkie Philips N.V. Method for regularizing aperture shape for milling
CN105761250A (zh) * 2016-02-01 2016-07-13 福建师范大学 一种基于模糊场景分割的建筑物提取方法
GB201709672D0 (en) * 2017-06-16 2017-08-02 Ucl Business Plc A system and computer-implemented method for segmenting an image
CN109741346B (zh) * 2018-12-30 2020-12-08 上海联影智能医疗科技有限公司 感兴趣区域提取方法、装置、设备及存储介质
CN110766694B (zh) * 2019-09-24 2021-03-26 清华大学 一种三维医学图像的交互式分割方法
AU2020401794A1 (en) * 2019-12-09 2022-07-28 Janssen Biotech, Inc. Method for determining severity of skin disease based on percentage of body surface area covered by lesions
CN113920420A (zh) * 2020-07-07 2022-01-11 香港理工大学深圳研究院 一种建筑物提取方法、装置、终端设备及可读存储介质
CN111951274A (zh) * 2020-07-24 2020-11-17 上海联影智能医疗科技有限公司 图像分割方法、***、可读存储介质和设备
CN112418205A (zh) * 2020-11-19 2021-02-26 上海交通大学 基于专注误分割区域的交互式图像分割方法和***

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113678142A (zh) * 2019-04-11 2021-11-19 安捷伦科技有限公司 基于深度学习的经由回归层的实例分割训练
US20220058446A1 (en) * 2019-07-12 2022-02-24 Tencent Technology (Shenzhen) Company Limited Image processing method and apparatus, terminal, and storage medium
CN110503654A (zh) * 2019-08-01 2019-11-26 中国科学院深圳先进技术研究院 一种基于生成对抗网络的医学图像分割方法、***及电子设备
CN110543911A (zh) * 2019-08-31 2019-12-06 华南理工大学 一种结合分类任务的弱监督目标分割方法
CN110689548A (zh) * 2019-09-29 2020-01-14 浪潮电子信息产业股份有限公司 一种医学图像分割方法、装置、设备及可读存储介质
WO2021179205A1 (zh) * 2020-03-11 2021-09-16 深圳先进技术研究院 医学图像分割方法、医学图像分割装置及终端设备
CN113920109A (zh) * 2021-10-29 2022-01-11 沈阳东软智能医疗科技研究院有限公司 医疗影像识别模型训练方法、识别方法、装置及设备

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117611926A (zh) * 2024-01-22 2024-02-27 重庆医科大学绍兴柯桥医学检验技术研究中心 一种基于ai模型的医学影像识别方法及***
CN117611926B (zh) * 2024-01-22 2024-04-23 重庆医科大学绍兴柯桥医学检验技术研究中心 一种基于ai模型的医学影像识别方法及***
CN117689660A (zh) * 2024-02-02 2024-03-12 杭州百子尖科技股份有限公司 基于机器视觉的保温杯温度质检方法
CN117689660B (zh) * 2024-02-02 2024-05-14 杭州百子尖科技股份有限公司 基于机器视觉的保温杯温度质检方法

Also Published As

Publication number Publication date
CN114581709A (zh) 2022-06-03

Similar Documents

Publication Publication Date Title
WO2023165033A1 (zh) 识别医学图像中的目标的模型训练、方法、设备及介质
US9965719B2 (en) Subcategory-aware convolutional neural networks for object detection
US20220108546A1 (en) Object detection method and apparatus, and computer storage medium
Wang et al. Interactive deep learning method for segmenting moving objects
Shang et al. End-to-end crowd counting via joint learning local and global count
WO2018108129A1 (zh) 用于识别物体类别的方法及装置、电子设备
Yang et al. Towards real-time traffic sign detection and classification
CN111414906B (zh) 纸质票据图片的数据合成与文本识别方法
CN104599275B (zh) 基于概率图模型的非参数化的rgb-d场景理解方法
US8295637B2 (en) Method of classifying red-eye objects using feature extraction and classifiers
CN111461213B (zh) 一种目标检测模型的训练方法、目标快速检测方法
US9773185B2 (en) Image processing apparatus, image processing method, and computer readable recording device
WO2022237153A1 (zh) 目标检测方法及其模型训练方法、相关装置、介质及程序产品
CN112926652B (zh) 一种基于深度学习的鱼类细粒度图像识别方法
CN110580499B (zh) 基于众包重复标签的深度学习目标检测方法及***
CN110443279B (zh) 一种基于轻量级神经网络的无人机图像车辆检测方法
CN113658146B (zh) 一种结节分级方法、装置、电子设备及存储介质
CN113947732B (zh) 基于强化学习图像亮度调节的空中视角人群计数方法
CN110827327B (zh) 一种基于融合的长期目标跟踪方法
Yang et al. Edge computing-based real-time passenger counting using a compact convolutional neural network
WO2020087434A1 (zh) 一种人脸图像清晰度评价方法及装置
Huang et al. Cost-sensitive sparse linear regression for crowd counting with imbalanced training data
Yan et al. Deeper multi-column dilated convolutional network for congested crowd understanding
Zhang et al. Artifact detection in endoscopic video with deep convolutional neural networks
CN117173075A (zh) 医学图像检测方法及相关设备

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22929470

Country of ref document: EP

Kind code of ref document: A1