CN113706441A - Image prediction method based on artificial intelligence, related device and storage medium

Info

Publication number
CN113706441A
Authority
CN
China
Prior art keywords
image
model
mask
prediction
region
Prior art date
Legal status (assumed; not a legal conclusion)
Pending
Application number
CN202110276959.8A
Other languages
Chinese (zh)
Inventor
王晓宁
常健博
王任直
冯铭
姚建华
尚鸿
郑瀚
裴翰奇
陈星翰
Current Assignee (listing may be inaccurate)
Tencent Technology Shenzhen Co Ltd
Peking Union Medical College Hospital Chinese Academy of Medical Sciences
Original Assignee
Tencent Technology Shenzhen Co Ltd
Peking Union Medical College Hospital Chinese Academy of Medical Sciences
Priority date (assumed; not a legal conclusion)
Filing date
Publication date
Application filed by Tencent Technology (Shenzhen) Co., Ltd. and Peking Union Medical College Hospital, Chinese Academy of Medical Sciences
Priority to CN202110276959.8A
Publication of CN113706441A
Legal status: Pending

Classifications

    • G06T 7/0002 Image analysis; inspection of images, e.g. flaw detection
    • G06F 18/214 Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F 18/24 Pattern recognition; classification techniques
    • G06N 3/04 Neural networks; architecture, e.g. interconnection topology
    • G06N 3/084 Neural network learning methods; backpropagation, e.g. using gradient descent
    • G06T 7/11 Segmentation; region-based segmentation
    • G06T 2207/20076 Probabilistic image processing
    • G06T 2207/20081 Training; learning
    • G06T 2207/20084 Artificial neural networks [ANN]
    • G06T 2207/20132 Image cropping (image segmentation details)


Abstract

The application discloses an artificial-intelligence-based image prediction method, which includes the following steps: acquiring an image to be predicted; acquiring a first mask image through a region segmentation model based on the image to be predicted, where the first mask image includes a first segmentation region corresponding to a target object; acquiring a second mask image through a region prediction model based on the first mask image and the image to be predicted, where the second mask image includes a second segmentation region corresponding to the target object; and generating an image prediction result from the first mask image and the second mask image, where the image prediction result represents how the target object changes within a preset time. A related apparatus and a storage medium are also disclosed. The method provided by the application can improve the efficiency of image analysis, saving time and labor cost, and, by recognizing images with an artificial intelligence model, can improve the accuracy of image prediction to a certain extent.

Description

Image prediction method based on artificial intelligence, related device and storage medium
Technical Field
The present application relates to the field of computer vision technologies, and in particular, to an image prediction method based on artificial intelligence, a related apparatus, and a storage medium.
Background
With the continuous advance of artificial intelligence, computer vision has become one of the most active research areas in deep learning. Computer vision is a cross-disciplinary field spanning computer science, mathematics, engineering, physics, biology, psychology, and so on. The analysis and processing of images based on computer vision is also becoming more widespread.
Whether in nature or in professional fields, it is very important to anticipate how an object will develop: with a predicted development trend, people can react more promptly, reducing the risk of harm to a certain extent. At present, such prediction requires professionals to observe and analyze images based on experience in order to obtain a result.
However, predicting how an object will develop is a very complex process. Observation and analysis based solely on experience is, on one hand, time- and labor-intensive; on the other hand, a lack of experience can cause large deviations between the predicted and actual results, leading to low image prediction accuracy.
Disclosure of Invention
Embodiments of the present application provide an artificial-intelligence-based image prediction method, a related apparatus, and a storage medium. On one hand, the method can improve the efficiency of image analysis and save time and labor cost; on the other hand, because images are recognized by an artificial intelligence model, the accuracy of image prediction can be improved to a certain extent.
In view of the above, an aspect of the present application provides an image prediction method based on artificial intelligence, including:
acquiring an image to be predicted, where the image to be predicted corresponds to a target image size;
acquiring a first mask image through a region segmentation model based on the image to be predicted, where the first mask image corresponds to the target image size and includes a first segmentation region corresponding to a target object;
acquiring a second mask image through a region prediction model based on the first mask image and the image to be predicted, where the second mask image corresponds to the target image size and includes a second segmentation region corresponding to the target object;
and generating an image prediction result from the first mask image and the second mask image, where the image prediction result represents how the target object changes within a preset time.
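For orientation only, the following is a minimal sketch of how these four steps could compose at inference time, assuming PyTorch and already-trained models; the names seg_model and pred_model and the simple volume-difference change measure are illustrative assumptions, not part of the claimed method.

```python
import torch

def predict_change(image, seg_model, pred_model):
    # image: float tensor of shape (1, 1, D, H, W), already at the target image size
    with torch.no_grad():
        first_mask = seg_model(image)                      # first mask image
        model_in = torch.cat([image, first_mask], dim=1)   # channel-wise concatenation
        second_mask = pred_model(model_in)                 # second mask image
    # one simple change measure: difference in segmented volume
    change = second_mask.round().sum() - first_mask.round().sum()
    return first_mask, second_mask, change
```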
Another aspect of the present application provides an image prediction apparatus, including:
an acquisition module, configured to acquire an image to be predicted, where the image to be predicted corresponds to a target image size;
the acquisition module is further configured to acquire a first mask image through a region segmentation model based on the image to be predicted, where the first mask image corresponds to the target image size and includes a first segmentation region corresponding to a target object;
the acquisition module is further configured to acquire a second mask image through a region prediction model based on the first mask image and the image to be predicted, where the second mask image corresponds to the target image size and includes a second segmentation region corresponding to the target object;
and a generating module, configured to generate an image prediction result from the first mask image and the second mask image, where the image prediction result represents how the target object changes within a preset time.
In one possible design, in another implementation manner of another aspect of the embodiment of the present application, the image prediction apparatus further includes a training module;
the acquisition module is further configured to, before the first mask image is acquired through the region segmentation model based on the image to be predicted, acquire a training sample image pair, where the training sample image pair is derived from the same training object and includes an original sample image and an original annotated sample image, the original annotated sample image being an image obtained by annotating a segmentation region on the original sample image, and both images corresponding to the target image size;
the acquisition module is further configured to acquire a first prediction mask image through a to-be-trained region segmentation model based on the original sample image;
the training module is configured to update model parameters of the to-be-trained region segmentation model according to the first prediction mask image corresponding to the original sample image and the original annotated sample image;
and the acquisition module is further configured to, if a model training condition is met, use the updated model parameters as the model parameters of the region segmentation model to obtain the region segmentation model.
In one possible design, in another implementation manner of another aspect of the embodiment of the present application, the image prediction apparatus further includes a training module;
the acquisition module is further configured to, before the second mask image is acquired through the region prediction model based on the first mask image and the image to be predicted, acquire a training sample image pair, where the training sample image pair is derived from the same training object and includes an original sample image, an original annotated sample image, and a target annotated sample image; the original annotated sample image is an image obtained by annotating a segmentation region on the original sample image, the target annotated sample image is an image obtained by annotating a segmentation region on a target sample image, and the sampling interval between the target sample image and the original sample image is less than or equal to the preset time;
the acquisition module is further configured to acquire a second prediction mask image through a to-be-trained region prediction model based on the original sample image and the original annotated sample image;
the training module is configured to update model parameters of the to-be-trained region prediction model according to the second prediction mask image and the target annotated sample image;
and the acquisition module is further configured to, if the model training condition is met, use the updated model parameters as the model parameters of the region prediction model to obtain the region prediction model.
In one possible design, in another implementation of another aspect of an embodiment of the present application,
the acquisition module is specifically configured to acquire, at a first moment, an original sample image corresponding to a training object;
acquire a segmentation region annotation result for the original sample image to obtain an original annotated sample image corresponding to the training object;
acquire, at a second moment, a target sample image corresponding to the training object, where the time interval between the second moment and the first moment is the sampling interval between the target sample image and the original sample image, and the second moment occurs after the first moment;
acquire a segmentation region annotation result for the target sample image to obtain a target annotated sample image corresponding to the training object;
and obtain the training sample image pair from the original sample image, the original annotated sample image, and the target annotated sample image.
In one possible design, in another implementation manner of another aspect of the embodiment of the present application, the image prediction apparatus further includes a processing module;
the acquisition module is specifically configured to acquire, at the first moment, a to-be-processed original sample image corresponding to the training object;
crop the to-be-processed original sample image to obtain the original sample image;
the acquisition module is specifically configured to acquire, at the second moment, a to-be-processed target sample image corresponding to the training object;
crop the to-be-processed target sample image to obtain the target sample image;
and the processing module is configured to process the original sample image and the target sample image by at least one of image registration, image resampling, and image normalization.
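The patent does not fix concrete preprocessing algorithms. As a minimal sketch under that caveat, image resampling and image normalization might look as follows (SciPy-based; image registration, which would spatially align images of the same training object, is omitted):

```python
import numpy as np
from scipy.ndimage import zoom

def preprocess(volume, spacing, target_spacing=(1.0, 1.0, 1.0)):
    # resample a CT volume to a common voxel spacing (linear interpolation)
    factors = [s / t for s, t in zip(spacing, target_spacing)]
    resampled = zoom(volume.astype(np.float32), factors, order=1)
    # min-max normalize intensities to [0, 1]
    lo, hi = resampled.min(), resampled.max()
    return (resampled - lo) / (hi - lo + 1e-8)
```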
In one possible design, in another implementation of another aspect of this embodiment of the application, the region segmentation model is a first three-dimensional U-shaped network (3D-UNet) model, and the region prediction model is a second 3D-UNet model;
the acquisition module is specifically configured to perform downsampling through the convolutional layers and pooling layers included in the first 3D-UNet model based on the image to be predicted to obtain first feature data;
perform upsampling through the convolutional layers and pooling layers included in the first 3D-UNet model based on the first feature data to obtain the first mask image;
the acquisition module is specifically configured to perform downsampling through the convolutional layers and pooling layers included in the second 3D-UNet model based on the first mask image and the image to be predicted to obtain second feature data;
and perform upsampling through the convolutional layers and pooling layers included in the second 3D-UNet model based on the second feature data to obtain the second mask image.
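As a rough illustration of this downsampling/upsampling structure, below is a heavily simplified 3D U-shaped network in PyTorch; the channel counts, depth, activations, and sigmoid output are assumptions, since the patent does not specify the exact architecture. The 2D UNet design described next is analogous, with Conv2d, MaxPool2d, and ConvTranspose2d in place of the 3D layers.

```python
import torch
import torch.nn as nn

class TinyUNet3D(nn.Module):
    """One downsampling stage, one upsampling stage, one skip connection."""
    def __init__(self, in_ch=1, out_ch=1, base=16):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv3d(in_ch, base, 3, padding=1), nn.ReLU(inplace=True))
        self.pool = nn.MaxPool3d(2)                                 # downsampling
        self.bottom = nn.Sequential(nn.Conv3d(base, base * 2, 3, padding=1), nn.ReLU(inplace=True))
        self.up = nn.ConvTranspose3d(base * 2, base, 2, stride=2)   # upsampling
        self.dec = nn.Sequential(nn.Conv3d(base * 2, base, 3, padding=1), nn.ReLU(inplace=True))
        self.head = nn.Conv3d(base, out_ch, 1)

    def forward(self, x):                                # x: (B, C, D, H, W), D/H/W even
        e = self.enc(x)                                  # encoder features
        b = self.bottom(self.pool(e))                    # coarse feature data
        d = self.dec(torch.cat([self.up(b), e], dim=1))  # skip connection
        return torch.sigmoid(self.head(d))               # mask image in [0, 1]

seg_model = TinyUNet3D(in_ch=1)    # input: the image to be predicted
pred_model = TinyUNet3D(in_ch=2)   # input: image + first mask image
```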
In one possible design, in another implementation of another aspect of this embodiment of the application, the region segmentation model is a first U-shaped network (UNet) model, and the region prediction model is a second UNet model;
the acquisition module is specifically configured to perform downsampling through the convolutional layers and pooling layers included in the first UNet model based on the image to be predicted to obtain a first feature map;
perform upsampling through the convolutional layers and pooling layers included in the first UNet model based on the first feature map to obtain the first mask image;
the acquisition module is specifically configured to perform downsampling through the convolutional layers and pooling layers included in the second UNet model based on the first mask image and the image to be predicted to obtain a second feature map;
and perform upsampling through the convolutional layers and pooling layers included in the second UNet model based on the second feature map to obtain the second mask image.
In one possible design, in another implementation manner of another aspect of the embodiment of the present application, the image prediction apparatus further includes a determination module;
the acquisition module is further configured to, after the first mask image is acquired through the region segmentation model based on the image to be predicted, acquire a class probability distribution through an object classification model based on the first mask image;
the determining module is configured to determine a target category according to the class probability distribution;
and the acquisition module is further configured to, if the target category indicates that the target object is in a changing state, perform the step of acquiring the second mask image through the region prediction model based on the first mask image and the image to be predicted.
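A minimal sketch of this gating logic, assuming PyTorch, a batch size of one, and a hypothetical class layout in which class 1 means "changing state" (the patent does not specify the classes):

```python
import torch

def maybe_predict(image, first_mask, classifier, pred_model, changing_class=1):
    logits = classifier(first_mask)             # object classification model
    probs = torch.softmax(logits, dim=1)        # class probability distribution
    target_category = probs.argmax(dim=1).item()
    if target_category == changing_class:       # target object is changing
        return pred_model(torch.cat([image, first_mask], dim=1))
    return None                                 # otherwise skip the prediction step
```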
In one possible design, in another implementation manner of another aspect of the embodiment of the present application, the image prediction apparatus further includes a training module;
the acquisition module is further configured to, before the class probability distribution is acquired through the object classification model based on the first mask image, acquire an original sample image, where the original sample image corresponds to an annotated classification result;
the acquisition module is further configured to acquire a segmentation region annotation result for the original sample image to obtain an original annotated sample image corresponding to the training object;
the acquisition module is further configured to acquire a predicted class probability distribution through a to-be-trained object classification model based on the original annotated sample image;
the training module is configured to update model parameters of the to-be-trained object classification model according to the predicted class probability distribution and the annotated classification result;
and the acquisition module is further configured to, if the model training condition is met, use the updated model parameters as the model parameters of the object classification model to obtain the object classification model.
In one possible design, in another implementation manner of another aspect of the embodiment of the present application, the image prediction apparatus further includes a processing module;
the acquisition module is further configured to acquire object-associated information of the object to be predicted, where the object-associated information includes at least one of age, sex, height, weight, and medical history information;
the processing module is configured to perform information characterization on the object-associated information to obtain object-associated features;
the acquisition module is specifically configured to acquire the second mask image through the region prediction model and a fully connected layer based on the first mask image, the image to be predicted, and the object-associated features.
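The patent does not detail how the fully connected layer combines the object-associated features with the image data. One plausible scheme, sketched below as an assumption, embeds the tabular features with a fully connected layer and broadcast-adds the embedding to intermediate image features:

```python
import torch
import torch.nn as nn

class FusionHead(nn.Module):
    def __init__(self, n_assoc=5, feat_ch=16):    # sizes are illustrative
        super().__init__()
        self.fc = nn.Linear(n_assoc, feat_ch)     # the fully connected layer

    def forward(self, image_feat, assoc):
        # image_feat: (B, C, D, H, W); assoc: (B, n_assoc), e.g. age, sex, ...
        emb = self.fc(assoc)                      # (B, C)
        emb = emb.view(emb.shape[0], emb.shape[1], 1, 1, 1)
        return image_feat + emb                   # fused feature data
```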
In one possible design, in another implementation manner of another aspect of the embodiment of the present application, the image prediction apparatus further includes a processing module and a display module;
the processing module is configured to, after the second mask image is acquired through the region prediction model based on the first mask image and the image to be predicted, perform image alignment on the second mask image and the image to be predicted to obtain an aligned second mask image and an aligned image to be predicted;
the processing module is further configured to overlay the aligned second mask image on the aligned image to be predicted to obtain a composite image, where the part of the aligned second mask image outside the second segmentation region is a transparent area;
and the display module is configured to display the composite image, or to send the composite image to a terminal device so that the terminal device displays it.
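A minimal NumPy sketch of such an overlay, assuming a 2D uint8 grayscale slice; the red color and the opacity value are arbitrary choices:

```python
import numpy as np

def compose(aligned_image, aligned_mask, color=(255, 0, 0), alpha=0.4):
    rgb = np.stack([aligned_image] * 3, axis=-1).astype(np.float32)
    region = aligned_mask > 0.5                  # second segmentation region
    for c in range(3):                           # blend only inside the region;
        rgb[..., c][region] = (1 - alpha) * rgb[..., c][region] + alpha * color[c]
    return rgb.astype(np.uint8)                  # outside stays untouched (transparent)
```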
In one possible design, in another implementation of another aspect of an embodiment of the present application,
the generating module is specifically configured to determine a first region area and first position information from the first segmentation region included in the first mask image, where the first position information indicates the position of the first segmentation region in the first mask image;
determine a second region area and second position information from the second segmentation region included in the second mask image, where the second position information indicates the position of the second segmentation region in the second mask image;
determine an area change result from the first region area and the second region area;
determine a position change result from the first position information and the second position information;
and determine the image prediction result from the area change result and the position change result.
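As an illustrative sketch (the patent does not prescribe a particular computation), the area change and position change could be derived from binary masks via the region volume and centroid:

```python
import numpy as np

def change_report(first_mask, second_mask, voxel_volume=1.0):
    m1, m2 = first_mask > 0.5, second_mask > 0.5     # assumes non-empty masks
    vol1, vol2 = m1.sum() * voxel_volume, m2.sum() * voxel_volume
    c1 = np.argwhere(m1).mean(axis=0)                # first position (centroid)
    c2 = np.argwhere(m2).mean(axis=0)                # second position
    return {"area_change": vol2 - vol1,              # larger, smaller, or unchanged
            "position_shift": float(np.linalg.norm(c2 - c1))}
```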
Another aspect of the present application provides a computer device, comprising: a memory, a processor, and a bus system;
where the memory is configured to store a program;
the processor is configured to execute the program in the memory and to perform the method of the above aspects according to instructions in the program code;
and the bus system is configured to connect the memory and the processor so that the memory and the processor can communicate.
Another aspect of the present application provides a computer-readable storage medium having stored therein instructions, which when executed on a computer, cause the computer to perform the method of the above-described aspects.
In another aspect of the application, a computer program product or computer program is provided, the computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the method provided by the above aspects.
According to the above technical solutions, the embodiments of the present application have the following advantages:
An embodiment of the present application provides an artificial-intelligence-based image prediction method. First, an image to be predicted is acquired; then, based on the image to be predicted, a first mask image is acquired through a region segmentation model, where the first mask image includes a first segmentation region corresponding to a target object; next, based on the first mask image and the image to be predicted, a second mask image is acquired through a region prediction model, where the second mask image includes a second segmentation region corresponding to the target object; finally, an image prediction result is generated from the first mask image and the second mask image, where the image prediction result represents how the target object changes within a preset time. In this way, the trained region prediction model can be used to anticipate the development of the target object: on one hand, the efficiency of image analysis can be improved, saving time and labor cost; on the other hand, because the images are recognized by an artificial intelligence model, the accuracy of image prediction can be improved to a certain extent.
Drawings
FIG. 1 is a schematic diagram of an application scenario of image prediction based on natural images in an embodiment of the present application;
FIG. 2 is a schematic diagram of an application scenario of image prediction based on medical images in an embodiment of the present application;
FIG. 3 is an architecture diagram of an image prediction system in an embodiment of the present application;
FIG. 4 is a diagram of an embodiment of the image prediction method in an embodiment of the present application;
FIG. 5 is a schematic diagram of outputting the first mask image through the region segmentation model in an embodiment of the present application;
FIG. 6 is a schematic diagram of outputting the second mask image through the region prediction model in an embodiment of the present application;
FIG. 7 is a schematic diagram of a training sample image pair for training the region segmentation model in an embodiment of the present application;
FIG. 8 is a schematic diagram of a training sample image pair for training the region prediction model in an embodiment of the present application;
FIG. 9 is a schematic diagram of the network structure of a three-dimensional U-shaped network in an embodiment of the present application;
FIG. 10 is a schematic diagram of the network structure of a U-shaped network in an embodiment of the present application;
FIG. 11 is a schematic diagram of outputting the second mask image in combination with associated information in an embodiment of the present application;
FIG. 12 is a schematic diagram of an interface showing the image prediction result in an embodiment of the present application;
FIG. 13 is a schematic diagram of an embodiment of the image prediction apparatus in an embodiment of the present application;
FIG. 14 is a schematic structural diagram of a computer device in an embodiment of the present application.
Detailed Description
Embodiments of the present application provide an artificial-intelligence-based image prediction method, a related apparatus, and a storage medium. On one hand, the method can improve the efficiency of image analysis and save time and labor cost; on the other hand, because images are recognized by an artificial intelligence model, the accuracy of image prediction can be improved to a certain extent.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims of the present application and in the drawings described above, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are, for example, capable of operation in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "corresponding" and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Whether in nature or in professional fields, it is very important to anticipate how an object will develop: with a predicted development trend, people can react more promptly, reducing the risk of harm to a certain extent. On this basis, the present application provides an artificial-intelligence-based image prediction method that can predict, from an image acquired at the current moment, the development of an object in the image after a period of time. The image prediction method provided by the present application is described below with reference to specific scenarios.
Referring to FIG. 1, FIG. 1 is a schematic diagram of an application scenario of image prediction based on natural images in an embodiment of the present application. As shown in (A) of FIG. 1, a natural image is taken at 18:48:50 on March 6, 2021; the natural image includes a tree whose trunk has a necrotic part. The natural image is input into the trained region segmentation model, which outputs a first mask image; the first mask image and the natural image are then input into the region prediction model, which outputs a second mask image; and the second mask image is overlaid on the original natural image, yielding the composite image shown in (B) of FIG. 1.
Referring to FIG. 2, FIG. 2 is a schematic diagram of an application scenario of image prediction based on medical images in an embodiment of the present application. As shown in (A) of FIG. 2, a brain Computed Tomography (CT) image of patient A, including a hematoma region, is taken at 18:48:50 on March 6, 2021. The brain CT image is input into the trained region segmentation model, which outputs a first mask image; the first mask image and the brain CT image are then input into the region prediction model, which outputs a second mask image; and the second mask image is overlaid on the original brain CT image, yielding the composite image shown in (B) of FIG. 2, from which it can be seen that the hematoma region will continue to expand over the next 24 hours.
To realize image prediction in the above scenarios, the present application provides an artificial-intelligence-based image prediction method, applied to the image prediction system shown in FIG. 3. As shown in the figure, the server involved in the present application may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, Content Delivery Network (CDN), and big data and artificial intelligence platforms. The terminal device may be, but is not limited to, a smartphone, a tablet computer, a notebook computer, a palmtop computer, a personal computer, a smart television, a smart watch, and so on. The terminal device and the server may be directly or indirectly connected through wired or wireless communication, which is not limited in this application; nor are the numbers of servers and terminal devices limited. The image prediction system shown in FIG. 3 supports both online image prediction and offline image prediction, which are described below.
1. Online image prediction.
In this case, the image prediction system may include a server and a terminal device. The terminal device acquires the image to be predicted and uploads it to the server; the server processes the image to be predicted using the trained region segmentation model and region prediction model and feeds the resulting mask image (or composite image) back to the terminal device, which displays it.
2. Offline image prediction.
In this case, the image prediction system may include a terminal device. The terminal device acquires the image to be predicted, processes it using the trained region segmentation model and region prediction model to obtain a mask image (or composite image), and displays the mask image (or composite image).
It is understood that deep learning is applied to the image prediction task. Deep learning is a branch of artificial intelligence; artificial intelligence is the theory, method, technology, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the capabilities of perception, reasoning, and decision-making.
Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, including both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning.
Performing image prediction with the region segmentation model and the region prediction model involves Computer Vision (CV) technology. Computer vision is the science of how to make machines "see": using cameras and computers instead of human eyes to identify, track, and measure targets, and further processing the images so that they are more suitable for human observation or for transmission to instruments for detection. As a scientific discipline, computer vision studies related theories and technologies in an attempt to build artificial intelligence systems that can obtain information from images or multidimensional data. Computer vision technologies generally include image prediction, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D technology, virtual reality, augmented reality, and simultaneous localization and mapping, as well as common biometric technologies such as face recognition and fingerprint recognition.
Training the region segmentation model and the region prediction model involves Machine Learning (ML) technology. Machine learning is a multi-disciplinary field involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory, and many other subjects. It specializes in studying how a computer can simulate or implement human learning behaviors to acquire new knowledge or skills and reorganize existing knowledge structures to continuously improve its performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent; it is applied in all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstration.
The artificial-intelligence-based image prediction method in the present application is described below with reference to FIG. 4. An embodiment of the image prediction method in the present application includes the following steps.
101. Acquire an image to be predicted, where the image to be predicted corresponds to a target image size.
In this embodiment, the image prediction apparatus acquires the image to be predicted. The target image size may be 512 × 512 pixels, 1024 × 1024 pixels, or another size, which is not limited here.
It should be noted that the image to be predicted may be a medical image or a natural image; this application describes it as a medical image. Medical images include, but are not limited to, CT images, Magnetic Resonance Imaging (MRI), Digital Subtraction Angiography (DSA), Computed Radiography (CR), and Digital Radiography (DR) images. The body part examined in a medical image includes, but is not limited to, the head, chest, spine, bones, abdomen, and so on; the head is used as an example here and should not be construed as limiting the application.
It should be noted that the image prediction apparatus may be deployed in a terminal device, a server, or an image prediction system composed of a terminal device and a server, and the present application is not limited thereto.
102. Acquire a first mask image through the region segmentation model based on the image to be predicted, where the first mask image corresponds to the target image size and includes a first segmentation region corresponding to a target object.
in this embodiment, the image prediction apparatus invokes a region segmentation model to segment an image to be predicted, where the input image to be predicted and the output first mask image have the same image size, and the output first mask image includes a first segmentation region, where the first segmentation region is a region segmented based on the target object. Taking a CT image of the brain as an example, the target object may be a blood clot in the brain. Taking the tree image as an example, the target object may be a necrotic part at the trunk.
Specifically, for ease of understanding, refer to FIG. 5, which is a schematic diagram of outputting the first mask image through the region segmentation model in an embodiment of the present application. As shown in the figure, the image to be predicted is input into the trained region segmentation model, which outputs the first mask image; the white region in the first mask image can be understood as the first segmentation region. Suppose the image to be predicted is a brain CT image and the brain contains a blood clot. To better express the shape, size, and position of the blood clot, the region segmentation model outputs a corresponding first mask image in which only the first segmentation region is identified; other brain tissue does not need to be labeled.
103. Acquire a second mask image through the region prediction model based on the first mask image and the image to be predicted, where the second mask image corresponds to the target image size and includes a second segmentation region corresponding to the target object.
In this embodiment, the image prediction apparatus concatenates (concat) the first mask image and the image to be predicted and inputs the result into the trained region prediction model, which outputs the second mask image. The output second mask image and the image to be predicted have the same image size, and the second mask image includes a second segmentation region, i.e., a region segmented for the target object. It should be noted that the second mask image is a prediction of the image after a future period of time. For example, if the image to be predicted is acquired at the current moment, the second mask image may be the mask image 24 hours later. As another example, if the image to be predicted is acquired at 15:00 on March 7, 2021, the second mask image may be the mask image corresponding to 15:00 on March 8, 2021.
Specifically, for ease of understanding, refer to FIG. 6, which is a schematic diagram of outputting the second mask image through the region prediction model in an embodiment of the present application. As shown in the figure, the first mask image and the image to be predicted are input into the trained region prediction model, which outputs the second mask image; the white region in the second mask image can be understood as the second segmentation region. Suppose the image to be predicted is a brain CT image and the brain contains a blood clot. To predict the change of the blood clot within the preset time, the region prediction model outputs a corresponding second mask image in which only the second segmentation region is identified; other brain tissue does not need to be labeled.
104. Generate an image prediction result from the first mask image and the second mask image, where the image prediction result represents how the target object changes within the preset time.
In this embodiment, the image prediction apparatus generates a corresponding image prediction result from the actually acquired first mask image and the predicted second mask image. The image prediction result intuitively represents the change of the target object within the preset time, for example, a change in the size of the target object (larger, smaller, or unchanged), a change in its position, a change in its shape, and so on, which is not limited here.
It should be noted that the preset time is consistent with the sample acquisition interval used when training the model: if the sample acquisition interval is less than or equal to 24 hours, the preset time is 24 hours; if it is less than or equal to 72 hours, the preset time is 72 hours. Taking cerebral hemorrhage as an example, 24 hours is usually chosen as the preset time. In practice, a head CT image of a patient with acute cerebral hemorrhage is usually taken to observe the bleeding site, the amount of bleeding, the morphology of the hematoma, whether the hematoma breaks into the ventricles, and whether there is a low-density edema zone and a space-occupying effect around the hematoma. Beyond assessing the current state, physicians need to anticipate the patient's disease progression so as to formulate a targeted treatment plan; an important prediction is how the bleeding morphology will change, since this bears on whether surgery is performed and how it is performed. The method provided by the present application can predict the future development of the hematoma based on the CT image acquired at the current moment, thereby assisting physicians in formulating a treatment plan. In the present application, a large number of quantitative image features are mined from medical images by computer, and the most valuable radiomics features are screened by statistical or machine learning methods for analyzing clinical information, qualitative disease assessment, tumor grading and staging, efficacy evaluation, prognosis prediction, and so on.
This embodiment of the application provides an artificial-intelligence-based image prediction method. First, an image to be predicted is acquired; then, based on the image to be predicted, a first mask image is acquired through a region segmentation model, where the first mask image includes a first segmentation region corresponding to a target object; next, based on the first mask image and the image to be predicted, a second mask image is acquired through a region prediction model, where the second mask image includes a second segmentation region corresponding to the target object; finally, an image prediction result is generated from the first mask image and the second mask image, where the image prediction result represents how the target object changes within a preset time. In this way, the trained region prediction model can be used to anticipate the development of the target object: on one hand, the efficiency of image analysis can be improved, saving time and labor cost; on the other hand, because the images are recognized by an artificial intelligence model, the accuracy of image prediction can be improved to a certain extent.
Optionally, on the basis of the embodiment corresponding to FIG. 4, in another optional embodiment provided by this embodiment of the application, before the first mask image is acquired through the region segmentation model based on the image to be predicted, the method may further include:
acquiring a training sample image pair, where the training sample image pair is derived from the same training object and includes an original sample image and an original annotated sample image, the original annotated sample image being an image obtained by annotating a segmentation region on the original sample image, and both images corresponding to the target image size;
acquiring a first prediction mask image through a to-be-trained region segmentation model based on the original sample image;
updating model parameters of the to-be-trained region segmentation model according to the first prediction mask image corresponding to the original sample image and the original annotated sample image;
and, if a model training condition is met, using the updated model parameters as the model parameters of the region segmentation model to obtain the region segmentation model.
In this embodiment, a method for obtaining a region segmentation model through training is described. In the task of training the region segmentation model, a large number of training sample image pairs are required, and for convenience of description, the present application takes one training sample image pair as an example, but this should not be construed as limiting the present application.
Specifically, take the original sample image being a brain CT image as an example. For ease of understanding, refer to FIG. 7, which is a schematic diagram of a training sample image pair for training the region segmentation model in an embodiment of the present application. As shown in (A) of FIG. 7, a brain CT image acquired from a training object (e.g., a patient) for the first time is used as the original sample image (i.e., CT1), where the original sample image meets the inclusion criteria, for example, the bleeding type is cerebral parenchymal hemorrhage and the image has no obvious artifacts. On this basis, an annotator can use an annotation tool to segment the cerebral hemorrhage region indicated by A1 and annotate it accordingly. As shown in (B) of FIG. 7, the original annotated sample image (i.e., the CT1 mask) includes the segmented cerebral hemorrhage region. Both the original sample image and the original annotated sample image correspond to the target image size.
It is understood that the annotation tools used in the present application include, but are not limited to, the medical image annotation tool ITK-SNAP, the Medical Image Processing, Analysis and Visualization (MIPAV) tool, the Java-based image processing software ImageJ, and the like, which are not limited here.
In actual training, the batch size, learning rate, and maximum number of iterations can be set as required. Taking one original sample image as an example: first, the original sample image is input into the to-be-trained region segmentation model, which outputs a first prediction mask image corresponding to the original sample image; the first prediction mask image is the predicted image. Since the original sample image has already been annotated to obtain the original annotated sample image, the original annotated sample image is the real image. Then, the loss value between the first prediction mask image corresponding to the original sample image and the original annotated sample image is calculated using the following mean-square error (MSE) loss function:
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$$
where MSE denotes the loss value between the real image and the predicted image, n denotes the total number of pixels in the image, i indexes the i-th pixel, $y_i$ denotes the image data of the i-th pixel in the predicted image (i.e., the first prediction mask image), and $\hat{y}_i$ denotes the image data of the i-th pixel in the real image (i.e., the original annotated sample image).
It should be noted that, in the actual training process, other types of loss functions may also be used to calculate the loss value between images, for example, a cross-entropy loss or a Mean Absolute Error (MAE) loss.
After the loss value between the first prediction mask image and the original annotated sample image is obtained, the model parameters of the to-be-trained region segmentation model can be updated using a back-propagation algorithm. When the model training condition is met, training can end, and the updated model parameters are used as the model parameters of the region segmentation model. It is understood that, in one example, the model training condition is satisfied when the loss value converges to a certain degree; in another example, a maximum number of iterations is preset, and the model training condition is satisfied when the number of training iterations reaches that maximum.
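A condensed sketch of this training procedure, assuming PyTorch; the Adam optimizer and the data-loader interface are assumptions, since the patent specifies only the MSE loss, back-propagation, and the stopping conditions. The region prediction model described below is trained analogously, with the concatenated original sample image and original annotated sample image as input and the target annotated sample image as the label.

```python
import torch
import torch.nn as nn

def train_segmentation(model, loader, max_iters=10000, lr=1e-4):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    it = 0
    while it < max_iters:                        # one possible training condition
        for image, annotated_mask in loader:     # one training sample image pair
            pred_mask = model(image)             # first prediction mask image
            loss = loss_fn(pred_mask, annotated_mask)
            opt.zero_grad()
            loss.backward()                      # back-propagation
            opt.step()                           # update model parameters
            it += 1
            if it >= max_iters:
                break
    return model
```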
Second, this embodiment of the application provides a way to obtain the region segmentation model through training: the original sample image and the original annotated sample image are used as a training sample image pair to train the to-be-trained region segmentation model. Once training is complete, segmentation of the image to be predicted can be performed, and because an artificial intelligence model is used to segment the image, the accuracy of image segmentation is improved.
Optionally, on the basis of the embodiment corresponding to FIG. 4, in another optional embodiment provided by this embodiment of the application, before the second mask image is acquired through the region prediction model based on the first mask image and the image to be predicted, the method may further include:
acquiring a training sample image pair, where the training sample image pair is derived from the same training object and includes an original sample image, an original annotated sample image, and a target annotated sample image; the original annotated sample image is an image obtained by annotating a segmentation region on the original sample image, the target annotated sample image is an image obtained by annotating a segmentation region on a target sample image, and the sampling interval between the target sample image and the original sample image is less than or equal to the preset time;
acquiring a second prediction mask image through a to-be-trained region prediction model based on the original sample image and the original annotated sample image;
updating model parameters of the to-be-trained region prediction model according to the second prediction mask image and the target annotated sample image;
and, if the model training condition is met, using the updated model parameters as the model parameters of the region prediction model to obtain the region prediction model.
In this embodiment, a method for obtaining a regional prediction model through training is described. In the task of training the region prediction model, a large number of training sample image pairs are required, and for convenience of description, the present application takes one training sample image pair as an example, but this should not be construed as limiting the present application.
Specifically, take the original sample image being a brain CT image as an example. For ease of understanding, refer to FIG. 8, which is a schematic diagram of a training sample image pair for training the region prediction model in an embodiment of the present application. As shown in (A) of FIG. 8, a brain CT image acquired from a training object (e.g., a patient) for the first time is used as the original sample image (i.e., CT1), where the original sample image meets the inclusion criteria, for example, the bleeding type is cerebral parenchymal hemorrhage and the image has no obvious artifacts. On this basis, an annotator can use an annotation tool to segment the cerebral hemorrhage region indicated by B1 and annotate it accordingly. As shown in (B) of FIG. 8, the original annotated sample image (i.e., the CT1 mask) includes the segmented cerebral hemorrhage region.
As shown in (C) of FIG. 8, a brain CT image acquired from the same training object for the second time is used as the target sample image (i.e., CT2), where the target sample image likewise meets the inclusion criteria, for example, the bleeding type is cerebral parenchymal hemorrhage and the image has no obvious artifacts. On this basis, an annotator can use an annotation tool to segment the cerebral hemorrhage region indicated by B2 and annotate it accordingly. As shown in (D) of FIG. 8, the target annotated sample image (i.e., the CT2 mask) includes the segmented cerebral hemorrhage region.
It is understood that the original sample image, the original annotated sample image, the target sample image, and the target annotated sample image all correspond to the target image size, and that a given training sample image pair is derived from the same training object. The sampling interval between the target sample image and the original sample image is less than or equal to the preset time (for example, 24 hours); both images are acquired before surgery, and the future hematoma morphology provided by the target sample image serves as the target of model training. The annotation tools used in the present application include, but are not limited to, ITK-SNAP, the MIPAV tool, and ImageJ, which are not limited here.
In actual training, batch processing size, learning rate and maximum iteration number can be set as required. Firstly, the original sample image and the original labeled sample image are spliced and then input to a to-be-trained region prediction model, and a second prediction mask image corresponding to the original sample image is output through the to-be-trained region prediction model, wherein the second prediction mask image is a prediction image. The original sample image is changed into a target sample image after a period of time, so that the target marked sample image is marked, and the target marked sample image is a real image. Then, the following MSE loss function is used to calculate the loss value between the second prediction mask image corresponding to the original sample image and the target annotation sample image:
$$\mathrm{MSE}=\frac{1}{n}\sum_{i=1}^{n}\left(y_i-\hat{y}_i\right)^2$$

where MSE represents the loss value between the real image and the predicted image, n represents the total number of pixel points in the image, i indexes the i-th pixel point, $y_i$ represents the image data of the i-th pixel point in the predicted image (i.e., the second prediction mask image), and $\hat{y}_i$ represents the image data of the i-th pixel point in the real image (i.e., the target annotated sample image).
It should be noted that in the actual training process, other types of loss functions may also be used to calculate the loss value between the images, for example, a cross-entropy loss function or a mean absolute error (MAE) loss function.
After the loss value between the second prediction mask image and the target annotation sample image is obtained, the model parameters of the prediction model of the region to be trained can be updated by adopting a back propagation algorithm. When the model training condition is met, the training can be finished, and the updated model parameters are used as the model parameters of the region prediction model. It will be appreciated that in one example, the model training condition is satisfied when the loss value converges to a certain degree. In another example, a maximum iteration number is preset, and when the training iteration number reaches the maximum iteration number, the model training condition is satisfied.
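For illustration, the following is a minimal PyTorch-style sketch of one such training step with the MSE loss above; it is an example only, not the patent's implementation, and the model, optimizer and tensor names (region_predictor, ct1, ct1_mask, ct2_mask) are hypothetical:

```python
import torch
import torch.nn as nn

def train_step(region_predictor, optimizer, ct1, ct1_mask, ct2_mask):
    # ct1: original sample image; ct1_mask: original annotated sample image;
    # ct2_mask: target annotated sample image (the real image); all (N, 1, D, H, W).
    x = torch.cat([ct1, ct1_mask], dim=1)          # splice image and mask on channels
    pred = region_predictor(x)                     # second prediction mask image
    loss = nn.functional.mse_loss(pred, ct2_mask)  # the MSE loss defined above
    optimizer.zero_grad()
    loss.backward()                                # back propagation
    optimizer.step()                               # update model parameters
    return loss.item()
```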
Secondly, in the embodiment of the application, a method for obtaining the region prediction model through training is provided, and through the method, the original sample image, the original annotation sample image and the target annotation sample image are used as a training sample image pair, so that the region prediction model to be trained is trained, after the training is completed, the prediction and segmentation of the image to be recognized can be realized, and the artificial intelligence model is adopted to predict and segment the image, so that the accuracy of image prediction is improved.
Optionally, on the basis of the embodiment corresponding to fig. 4, in another optional embodiment provided in the embodiment of the present application, acquiring a training sample image pair specifically may include:
acquiring an original sample image corresponding to a training object at a first moment;
acquiring a segmentation region labeling result aiming at an original sample image to obtain an original labeling sample image corresponding to a training object;
acquiring a target sample image corresponding to a training object at a second moment, wherein the time interval between the second moment and the first moment is the sampling interval time between the target sample image and the original sample image, and the second moment occurs after the first moment;
acquiring a segmentation area labeling result aiming at a target sample image to obtain a target labeling sample image corresponding to a training object;
and acquiring a training sample image pair according to the original sample image, the original labeling sample image and the target labeling sample image.
In this embodiment, a method for acquiring a training sample image pair is introduced. In the task of training the region prediction model, a large number of training sample image pairs are required, and for convenience of description, the present application takes one training sample image pair as an example, but this should not be construed as limiting the present application.
Specifically, assuming that the preset time is 24 hours, an original sample image (e.g., a brain CT image) of a training object (e.g., a patient) is acquired at a first time, and then a target sample image corresponding to the training object is acquired at a second time, where a time interval between the second time and the first time is a sampling interval time, and the sampling interval time is less than or equal to the preset time (e.g., less than 24 hours). Based on this, the annotating personnel can obtain the corresponding segmentation result (i.e. the segmentation result marked by the marking tool) of the segmentation region of the original sample image, so as to obtain the original annotated sample image, wherein the segmented region of the original annotated sample image can be represented as a white region, and the rest regions can be represented as black regions. Similarly, the annotator can obtain the corresponding segmentation result (i.e. the segmentation result marked by the marking tool) of the segmentation region of the target sample image, so as to obtain the target annotation sample image, wherein the segmented region of the target annotation sample image can be represented as a white region, and the rest of the region can be represented as a black region.
In the embodiment of the application, a mode for acquiring a training sample image pair is provided, and by the above mode, when a sample image for training is acquired, not only the acquired original sample image and the acquired target sample image need to be labeled, but also the time for image sampling twice needs to be considered, so that the data adopted during training has higher reliability and accuracy, and the training is facilitated to obtain a model with a better prediction effect.
Optionally, on the basis of the embodiment corresponding to fig. 4, in another optional embodiment provided in the embodiment of the present application, acquiring an original sample image corresponding to a training object at a first time may specifically include:
acquiring an original sample image to be processed corresponding to a training object at a first moment;
cutting an original sample image to be processed to obtain an original sample image;
acquiring a target sample image corresponding to the training object at a second time, which may specifically include:
acquiring a target sample image to be processed corresponding to the training object at a second moment;
cutting a target sample image to be processed to obtain a target sample image;
the method can also comprise the following steps:
and processing the original sample image and the target sample image based on at least one of image registration, image resampling and image normalization.
In this embodiment, a method of preprocessing a sample image is described. In the task of training the region prediction model, a large number of training sample image pairs are required, and for convenience of description, the present application takes one training sample image pair as an example, but this should not be construed as limiting the present application.
Specifically, taking the original sample image as a brain CT image as an example, as can be seen from the foregoing embodiments, the to-be-processed original sample image of the training object is acquired at the first time, where the to-be-processed original sample image is an unprocessed image; similarly, the to-be-processed target sample image of the training object is acquired at the second time, where the to-be-processed target sample image is also an unprocessed image. Based on this, regions other than brain tissue, for example, regions outside the head and the skull portion, are first removed from the to-be-processed original sample image and the to-be-processed target sample image by image cropping.
After the original sample image and the target sample image are obtained, a series of processing may be performed, such as image registration processing, image resampling processing and image normalization processing, and after the processing is completed, the processed images are annotated. The image registration processing registers the original sample image and the target sample image to the same angle so that the target object (e.g., a lesion or a blood clot) in the two images is positionally aligned. In addition, for CT images, it is also necessary to resample the images to the same layer thickness and fix them to a uniform size by end zero-filling and cropping. For ease of training and processing, the images may also be normalized and mapped to a fixed range (e.g., 0 to 1, or -1 to 1).
It should be noted that the ways of registering images include, but are not limited to, multi-view registration, multi-temporal registration, multi-modal registration, etc. Multi-view registration refers to registering images of the same object in the same scene captured from different viewpoints, so that a better representation of the scanned object or scene is obtained from multiple perspectives. Multi-temporal registration refers to registering images of the same object in the same scene captured at different times, e.g., for motion tracking or tumor growth tracking. Multi-modal registration refers to registering images of the same scene acquired by different imaging modalities; it is common in the medical field because medical imaging devices can provide images carrying different forms of information about the patient (CT, MRI, etc.), and registration can accordingly be classified into single-modal and multi-modal registration.
Further, in the embodiment of the present application, a method for preprocessing a sample image is provided, and through the above method, a series of preprocessing needs to be performed on an acquired sample image, so that the reliability of training is improved. Taking CT images as an example, due to differences in CT devices and differences in doctor operations, the original sample image and the target sample image may have different layers, thicknesses, scanning areas, angles, and the like, and therefore, it is necessary to align the original sample image and the target sample image as much as possible and to remove changes to the images caused by non-focal self development.
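As an illustrative sketch of these preprocessing steps (an example under stated assumptions, not the patent's implementation; NumPy and SciPy are assumed, and the target spacing and shape are hypothetical parameters):

```python
import numpy as np
from scipy.ndimage import zoom

def resample_to_spacing(volume, spacing, target_spacing=(1.0, 1.0, 1.0)):
    """Resample a CT volume (D, H, W) so all samples share the same layer thickness."""
    factors = [s / t for s, t in zip(spacing, target_spacing)]
    return zoom(volume, factors, order=1)  # linear interpolation

def pad_or_crop(volume, target_shape=(64, 256, 256)):
    """Fix the volume to a uniform size by end zero-filling and cropping."""
    out = np.zeros(target_shape, dtype=volume.dtype)
    idx = tuple(slice(0, min(v, t)) for v, t in zip(volume.shape, target_shape))
    out[idx] = volume[idx]
    return out

def normalize(volume, lo=0.0, hi=1.0):
    """Map intensities to a fixed range (e.g., 0 to 1, or -1 to 1)."""
    vmin, vmax = float(volume.min()), float(volume.max())
    return (volume - vmin) / max(vmax - vmin, 1e-8) * (hi - lo) + lo
```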
Optionally, on the basis of the embodiment corresponding to fig. 4, in another optional embodiment provided by the embodiment of the present application, the region segmentation model is a first three-dimensional U-shaped network 3D-UNet model, and the region prediction model is a second 3D-UNet model;
based on the image to be predicted, obtaining a first mask image through a region segmentation model, which may specifically include:
based on a to-be-predicted image, performing downsampling processing through a convolutional layer and a pooling layer included in a first 3D-UNet model to obtain first feature data;
based on the first feature data, performing upsampling processing through the convolutional layer and the pooling layer included in the first 3D-UNet model to obtain a first mask image;
based on the first mask image and the image to be predicted, a second mask image is obtained through a region prediction model, and the method comprises the following steps:
based on the first mask image and the image to be predicted, performing downsampling processing through a convolution layer and a pooling layer included in the second 3D-UNet model to obtain second feature data;
and performing upsampling processing through the convolutional layer and the pooling layer included in the second 3D-UNet model based on the second feature data to obtain a second mask image.
In this embodiment, a method of segmenting an image based on a three-dimensional U-network (3D-UNet) model is described. The present application may use a 3D-UNet model as a region segmentation model or a region prediction model, and the manner of image segmentation and prediction for an image will be described with reference to fig. 9.
Specifically, referring to fig. 9, fig. 9 is a schematic diagram of the network structure of the three-dimensional U-shaped network in the embodiment of the present application. The 3D-UNet model can be regarded as a symmetric structure. The left side of the 3D-UNet model can be regarded as an encoding network; the encoding network shown in the figure includes 3 encoders, each encoder includes two convolutional layers, each convolutional layer is followed by a Batch Normalization (BN) layer and a linear rectification function (Rectified Linear Unit, ReLU) layer, and each encoder is followed by a downsampling layer implemented by max pooling. The feature data is obtained after the last encoder. The right side of the 3D-UNet model can be regarded as a decoding network; the decoding network shown in the figure includes 3 decoders, which successively restore the resolution of the feature data through upsampling operations, and the last decoder outputs the mask image.
The 3D-UNet model adopts a distinctive feature fusion mode, namely splicing features together in the channel dimension to form thicker features. It should be noted that the sizes of the input image and the output image of the 3D-UNet model may not be consistent; therefore, the mask image also needs to be resized so that the obtained mask image has the same size as the image to be predicted.
In the present application, the region segmentation model may be a first 3D-UNet model, and based on the network structure shown in fig. 9, the downsampling process is performed on the to-be-predicted image through the convolution layer and the pooling layer included in the first 3D-UNet model to obtain first feature data, and then the upsampling process is performed on the first feature data to obtain a first mask image. The area prediction model may be a second 3D-UNet model, and based on the network structure shown in fig. 9, the first mask image and the image to be predicted are downsampled by using the convolution layer and the pooling layer included in the second 3D-UNet model to obtain second feature data, and the second feature data is upsampled to obtain a second mask image.
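For illustration, a minimal 3D-UNet-style sketch consistent with the structure above is given below (assuming PyTorch; the channel widths and the sigmoid output are assumptions chosen for brevity, not the patent's exact configuration). With in_ch=1 it can stand in for the region segmentation model, and with in_ch=2 for the region prediction model taking the spliced image and first mask image:

```python
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    # Two convolutional layers, each followed by BN and ReLU, as described above.
    return nn.Sequential(
        nn.Conv3d(in_ch, out_ch, 3, padding=1), nn.BatchNorm3d(out_ch), nn.ReLU(inplace=True),
        nn.Conv3d(out_ch, out_ch, 3, padding=1), nn.BatchNorm3d(out_ch), nn.ReLU(inplace=True),
    )

class UNet3D(nn.Module):
    def __init__(self, in_ch=2, out_ch=1, width=16):
        super().__init__()
        self.enc1 = conv_block(in_ch, width)
        self.enc2 = conv_block(width, width * 2)
        self.enc3 = conv_block(width * 2, width * 4)
        self.pool = nn.MaxPool3d(2)  # downsampling layer implemented by max pooling
        self.up = nn.Upsample(scale_factor=2, mode="trilinear", align_corners=False)
        self.dec2 = conv_block(width * 4 + width * 2, width * 2)
        self.dec1 = conv_block(width * 2 + width, width)
        self.head = nn.Conv3d(width, out_ch, 1)

    def forward(self, x):  # x: (N, in_ch, D, H, W), with D, H, W divisible by 4
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        e3 = self.enc3(self.pool(e2))                        # feature data
        d2 = self.dec2(torch.cat([self.up(e3), e2], dim=1))  # channel-dim splicing
        d1 = self.dec1(torch.cat([self.up(d2), e1], dim=1))
        return torch.sigmoid(self.head(d1))                  # mask image
```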
Secondly, in the embodiment of the application, a mode of segmenting an image based on a 3D-UNet model is provided, and through the mode, segmentation of the image can be realized by using the 3D-UNet model, and multi-scale feature recognition of image features by a network is realized. Further, for medical images, the boundaries are fuzzy, the gradient is complex, more high-resolution information is needed for accurate segmentation, the internal structure of the living body is relatively fixed, and the segmentation target has certain regularity, so that the low-resolution information can provide the information for object identification. Based on the method, the 3D-UNet model combines low-resolution information and high-resolution information, and the accuracy of medical image segmentation can be improved. Finally, the three-dimensional image need not be trained with each slice entered separately, but rather the entire image can be used as input for the 3D-UNet.
Optionally, on the basis of the embodiment corresponding to fig. 4, in another optional embodiment provided in the embodiment of the present application, the region segmentation model is a first U-type network UNet model, and the region prediction model is a second UNet model;
based on the image to be predicted, obtaining a first mask image through a region segmentation model, which may specifically include:
based on a to-be-predicted image, performing downsampling processing through a convolution layer and a pooling layer included in a first UNet model to obtain a first feature map;
based on the first feature map, performing upsampling processing through the convolutional layer and the pooling layer included in the first UNet model to obtain a first mask image;
based on the first mask image and the image to be predicted, a second mask image is obtained through a region prediction model, and the method comprises the following steps:
based on the first mask image and the image to be predicted, performing downsampling processing through a convolution layer and a pooling layer included in the second UNet model to obtain a second feature map;
and performing upsampling processing through the convolution layer and the pooling layer included in the second UNet model based on the second feature map to obtain a second mask image.
In this embodiment, a method of segmenting an image based on a U-type network (UNet) model is described. The present application may use the UNet model as the region segmentation model or the region prediction model, and the manner of image segmentation and prediction will be described with reference to fig. 10.
Specifically, referring to fig. 10, fig. 10 is a schematic diagram of the network structure of the U-type network in the embodiment of the present application. As shown in the figure, the UNet model can be regarded as a symmetric structure. The left side of the UNet model can be regarded as an encoding network; the encoding network shown in the figure includes 4 encoders, each encoder includes two convolutional layers, and each encoder is followed by a downsampling layer implemented by max pooling. After the last encoder, a feature map is obtained. The right side of the UNet model can be regarded as a decoding network; the decoding network shown in the figure includes 4 decoders, which successively improve the resolution of the feature map through upsampling operations, that is, the feature map is input to the first decoder in the decoding network, and the mask image is output by the last decoder.
The UNet model adopts a distinctive feature fusion mode, namely splicing features together in the channel dimension to form thicker features. It should be noted that the sizes of the input image and the output image of the UNet model may not be the same; therefore, the mask image also needs to be resized so that the obtained mask image has the same size as the image to be predicted.
The region segmentation model or the region prediction model may also be a fully convolutional network (FCN), a LinkNet, an efficient neural network (ENet), or the like.
In the present application, the region segmentation model may be a first UNet model, and based on the network structure shown in fig. 10, the downsampling process may be performed on the image to be predicted through the convolution layer and the pooling layer included in the first UNet model to obtain a first feature map, and the upsampling process may be performed on the first feature map to obtain a first mask image. The area prediction model may be a second UNet model, and based on the network structure shown in fig. 10, the first mask image and the image to be predicted are downsampled by using the convolution layer and the pooling layer included in the second UNet model to obtain a second feature map, and the second feature map is upsampled to obtain a second mask image.
Secondly, in the embodiment of the application, a way of segmenting an image based on a U-net model is provided, and through the way, segmentation of the image can be realized by using the U-net model, and multi-scale feature recognition of image features by a network is realized. Further, for medical images, the boundaries are fuzzy, the gradient is complex, more high-resolution information is needed for accurate segmentation, the internal structure of the living body is relatively fixed, and the segmentation target has certain regularity, so that the low-resolution information can provide the information for object identification. Based on the method, the U-net model combines low-resolution information and high-resolution information, and the accuracy of medical image segmentation can be improved.
Optionally, on the basis of the embodiment corresponding to fig. 4, in another optional embodiment provided by the embodiment of the present application, after acquiring the first mask image by using the region segmentation model based on the image to be predicted, the method may further include:
obtaining class probability distribution through an object classification model based on the first mask image;
determining a target category according to the category probability distribution;
and if the target type is used for indicating that the target object is in a change state, executing a step of acquiring a second mask image through a region prediction model based on the first mask image and the image to be predicted.
In this embodiment, a method of performing image analysis using an object classification model is described. As described in the foregoing embodiments, the image to be predicted is first acquired and then input to the region segmentation model to obtain the first mask image. Then, based on the first mask image, the development trend over a future period of time is judged; in short, this can be regarded as a binary ("two-class") classification problem.
Specifically, the first mask image is input to a trained object classification model, and assuming that the object classification model is a two-class network, the output class probability distribution is (a, b), where a represents the probability of the first class, b represents the probability of the second class, and the sum of a and b is 1. In this application, the first category may represent a category in which the target object is in a changed state, and the second category may represent a category in which the target object is not in a changed state. Assuming that a is 0.7 and b is 0.3, the target class is determined to be the first class, so that the target object can be judged to change in the future for a period of time, and then the next model (namely, the region prediction model) is accessed to process the first mask image and the image to be predicted. Assuming that a is 0.2 and b is 0.8, the target class is determined to be the second class, so that the target object can be judged not to change in the future for a period of time, and whether further prediction is needed can be selected according to a set rule.
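The gating logic described above might be sketched as follows (illustrative only; classifier and region_predictor are hypothetical placeholders, and PyTorch is assumed):

```python
import torch

def predict_with_gating(classifier, region_predictor, first_mask, image):
    # The classifier outputs two logits; softmax yields the class probability
    # distribution (a, b) with a + b = 1.
    probs = torch.softmax(classifier(first_mask), dim=1)
    a, b = probs[0, 0].item(), probs[0, 1].item()
    if a > b:  # first class: the target object is in a change state
        x = torch.cat([image, first_mask], dim=1)  # splice on channel dimension
        return region_predictor(x)                 # second mask image
    return None  # no change predicted; further prediction is optional
```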
It is understood that the object classification model used in the present application may be a densely connected convolutional network (DenseNet), a residual network (ResNet), a Visual Geometry Group network (VGG), etc., and is not limited herein.
Secondly, in the embodiment of the present application, a method for performing image analysis by using an object classification model is provided, and by the above method, before predicting the second mask image, whether a target object changes or not may be determined by using the object classification model, and if the target object is in a change state, the image prediction may be further performed. If the image is not in the change state, the subsequent image prediction can be selected not to be carried out, so that the image processing resource is saved, and the subsequent image processing can be selected to be carried out, so that the accuracy of the image prediction is further improved.
Optionally, on the basis of the embodiment corresponding to fig. 4, in another optional embodiment provided in the embodiment of the present application, before obtaining the class probability distribution through the object classification model based on the first mask image, the method may further include:
acquiring an original sample image, wherein the original sample image corresponds to an annotation classification result;
acquiring a segmentation region labeling result aiming at an original sample image to obtain an original labeling sample image corresponding to a training object;
based on the original marked sample image, obtaining the probability distribution of the prediction category through a classification model of an object to be trained;
updating model parameters of the classification model of the object to be trained according to the prediction class probability distribution and the labeled classification result;
and if the model training condition is met, taking the updated model parameters as the model parameters of the object classification model to obtain the object classification model.
In this embodiment, a method for obtaining an object classification model through training is described. In the task of training the object classification model, a large number of original labeled sample images and their corresponding labeled classification results are required, and for convenience of description, the present application takes an original labeled sample image as an example, which should not be construed as a limitation to the present application.
Specifically, taking the original sample image as a brain CT image as an example, a brain CT image acquired from a training object (e.g., a certain patient) for the first time is taken as the original sample image (i.e., CT1), where the original sample image meets the recording conditions, for example, the bleeding type is brain parenchymal hemorrhage, the image has no obvious artifacts, and the like. Based on this, an annotator can use an annotation tool to segment the indicated cerebral hemorrhage region and annotate it correspondingly, so as to obtain the original annotated sample image (i.e., the CT1 mask). The brain CT image acquired from the training object for the second time is taken as the target sample image (i.e., CT2), where the target sample image likewise meets the recording conditions. The annotator annotates the classification result by observing the change of the segmentation object (e.g., a blood clot) between the original sample image and the target sample image: for example, if the blood clot changes, the classification result is annotated as "1", and if the blood clot does not change, the classification result is annotated as "0".
It is understood that the sampling interval time between the target sample image and the original sample image is less than or equal to a preset time (e.g., 24 hours), both images are acquired before surgery, and the future hematoma morphology provided by the target sample image serves as the target for model training. The labeling tools used in the present application include, but are not limited to, ITK-SNAP, MIPAV and ImageJ, and are not limited herein.
In actual training, batch processing size, learning rate and maximum iteration number can be set as required. Firstly, inputting the original labeled sample image into a classification model of an object to be trained, and outputting a prediction class probability distribution through the classification model of the object to be trained, wherein the prediction class probability distribution is a predicted value, and the labeled classification result is a true value, so that the loss value between the labeled classification result and the prediction class probability distribution corresponding to the original sample image is calculated by adopting the following cross entropy loss function:
$$\mathrm{Loss}=-\left[\,y\log\hat{y}+(1-y)\log(1-\hat{y})\,\right]$$

where Loss represents the loss value between the real value and the predicted value, y represents the real value (i.e., the annotated classification result), which takes the value 1 if the sample belongs to the positive example and 0 otherwise, and $\hat{y}$ represents the predicted probability (taken from the prediction class probability distribution) that the sample belongs to the positive example.
It should be noted that, in the actual training process, there may be a multi-class cross-entropy loss function, and the above example is described by taking a two-class loss function as an example, which should not be construed as a limitation to the present application.
After the loss value between the labeled classification result and the prediction class probability distribution is obtained, a back propagation algorithm can be adopted to update the model parameters of the classification model of the object to be trained. When the model training condition is met, the training can be finished, and the updated model parameters are used as the model parameters of the object classification model. It will be appreciated that in one example, the model training condition is satisfied when the loss value converges to a certain degree. In another example, a maximum iteration number is preset, and when the training iteration number reaches the maximum iteration number, the model training condition is satisfied.
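As a small numeric illustration of this binary cross-entropy loss (assuming PyTorch; the values are made up):

```python
import torch

y_true = torch.tensor([1.0])  # annotated classification result ("changed")
y_pred = torch.tensor([0.7])  # predicted probability of the positive example
loss = torch.nn.functional.binary_cross_entropy(y_pred, y_true)
print(loss)  # -(1*log(0.7) + 0*log(0.3)) ≈ 0.357
```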
Secondly, in the embodiment of the application, a method for obtaining an object classification model through training is provided. Through this method, the original annotated sample image and the annotated classification result are used as a group of training samples to train the object classification model to be trained; after the training is completed, the development condition of the target object in the image to be recognized can be predicted, so that the accuracy of image prediction is improved.
Optionally, on the basis of the embodiment corresponding to fig. 4, in another optional embodiment provided in the embodiment of the present application, the method may further include:
acquiring object associated information of an object to be predicted, wherein the object associated information comprises at least one of age, sex, height, weight and medical history information;
carrying out information characterization processing on the object associated information to obtain object associated characteristics;
obtaining a second mask image through a region prediction model based on the first mask image and the image to be predicted, which may specifically include:
and acquiring a second mask image through the region prediction model and the full connection layer based on the first mask image, the image to be predicted and the object associated characteristics.
In this embodiment, a method of adding associated information to assist prediction is introduced. Before the second mask image is predicted, object associated information of the object to be predicted can be obtained, and this object associated information can also serve as a basis for predicting the second mask image.
Specifically, taking the object to be predicted as a patient as an example, the acquired object associated information includes, but is not limited to, age, sex, height, weight and medical history information. Next, the object associated information may be processed based on feature engineering; for example, sex may be encoded by one-hot encoding, and height and weight may be encoded by count encoding, which are not listed here one by one. After the object associated information is characterized, the object associated features are obtained. Based on this, the first mask image, the image to be predicted and the object associated features are input to the region prediction model and the full connection layer, and the second mask image can be output.
For convenience of understanding, please refer to fig. 11, which is a schematic diagram of outputting the second mask image by combining associated information in the embodiment of the present application. As shown in the figure, the image to be predicted and the first mask image are input to the region prediction model, the object associated features are input to the full connection layer, and the second mask image is predicted based on the results output by the region prediction model and the full connection layer.
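One possible way to realize this fusion is sketched below. Note that the feature scaling in characterize and the broadcast-and-concatenate fusion point are assumptions made for illustration, since the patent only specifies that the full connection layer's output is combined with the region prediction model's result:

```python
import torch
import torch.nn as nn

def characterize(age, sex_is_male, height, weight):
    # Toy information characterization: scaled numeric features plus a
    # one-hot encoding of sex; the exact feature engineering is an assumption.
    sex_onehot = [1.0, 0.0] if sex_is_male else [0.0, 1.0]
    return torch.tensor([[age / 100.0, height / 200.0, weight / 150.0, *sex_onehot]])

class FusedPredictor(nn.Module):
    def __init__(self, region_predictor, assoc_dim=5, emb_ch=4):
        super().__init__()
        self.region_predictor = region_predictor  # must accept 2 + emb_ch channels
        self.fc = nn.Linear(assoc_dim, emb_ch)    # the full connection layer

    def forward(self, image, first_mask, assoc):
        emb = self.fc(assoc)                      # (N, emb_ch) associated features
        n, c = emb.shape
        d, h, w = image.shape[2:]
        # Broadcast the encoded features as constant channels over the volume
        # so the region prediction model can condition on them.
        emb_vol = emb.view(n, c, 1, 1, 1).expand(n, c, d, h, w)
        x = torch.cat([image, first_mask, emb_vol], dim=1)
        return self.region_predictor(x)           # second mask image
```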
Secondly, in the embodiment of the application, a mode for increasing associated information for realizing prediction is provided, and in the mode, information related to an object to be predicted can be added in the process of predicting the second mask image, and the object associated information can assist image prediction, so that the accuracy of image prediction is improved.
Optionally, on the basis of the embodiment corresponding to fig. 4, in another optional embodiment provided by the embodiment of the present application, after acquiring the second mask image through the region prediction model based on the first mask image and the image to be predicted, the method may further include:
carrying out image alignment processing on the second mask image and the image to be predicted to obtain an aligned second mask image and an aligned image to be predicted;
covering the aligned second mask image on the aligned image to be predicted to obtain a synthesized image, wherein the part of the aligned second mask image except the second segmentation area is a transparent area;
and displaying the composite image, or sending the composite image to the terminal equipment so as to enable the terminal equipment to display the composite image.
In the present embodiment, a method of generating a synthesized image based on a to-be-predicted image is described. As described in the foregoing embodiment, the second mask image may be output by the region prediction model, and in practical applications, the second mask image and the image to be predicted may be further synthesized, which will be described with reference to the following example.
Specifically, image alignment processing needs to be performed on the second mask image and the image to be predicted, and there are various alignment manners. For example, a marker may be stamped at the lower right corner of the image to be predicted; if the marker is a star, the star also appears in the output second mask image, and when the two star markers coincide, the second mask image and the image to be predicted are aligned, i.e., the aligned second mask image and the aligned image to be predicted are obtained. Optionally, the two images may also be aligned in other manners, for example, by image registration or by feature point matching, which is not described herein again.
Then, the aligned second mask image is overlaid on the aligned image to be predicted to obtain the synthesized image. Considering that the second mask image also includes regions outside the second segmentation region, directly overlaying these regions on the image to be predicted would make other brain tissue in the image to be predicted invisible. Therefore, the part of the aligned second mask image other than the second segmentation region is set to be transparent before the overlay, and the synthesized image can thus be obtained.
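A simple sketch of such compositing on a single slice follows (illustrative only; the overlay color and transparency factor are assumptions):

```python
import numpy as np

def compose(ct_slice, mask_slice, color=(255, 0, 0), alpha=0.5):
    """ct_slice: (H, W) grayscale image; mask_slice: (H, W) binary second mask."""
    rgb = np.stack([ct_slice] * 3, axis=-1).astype(np.float32)
    overlay = np.array(color, dtype=np.float32)
    region = mask_slice.astype(bool)  # the second segmentation region
    # Blend only inside the segmentation region; everything else stays
    # untouched, i.e., the rest of the mask is effectively transparent.
    rgb[region] = (1 - alpha) * rgb[region] + alpha * overlay
    return rgb.astype(np.uint8)       # synthesized image
```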
Taking the image to be predicted as a brain CT image as an example, the image prediction method provided by the present application is adopted to output the second mask image, and the synthesized image is obtained after further processing. If the image prediction apparatus is deployed in the terminal device, the synthesized image is displayed directly; if the image prediction apparatus is deployed in the server, the synthesized image is sent to the terminal device and displayed by the terminal device. For ease of understanding, referring to fig. 12, fig. 12 is an interface diagram showing an image prediction result in the embodiment of the present application. As shown in the figure, the left side is a brain CT image of patient A together with its acquisition time, and the right side is the synthesized image; further, the area change result and the position change result of the target object (e.g., a blood clot) can be displayed.
Secondly, in the embodiment of the present application, a method for generating a synthesized image based on a to-be-predicted image is provided, and by the above method, the output second mask image and the to-be-predicted image can be superimposed, so that a predicted image can be synthesized, and related personnel can know the development condition of a target object more intuitively, thereby increasing the practicability of the scheme.
Optionally, on the basis of the embodiment corresponding to fig. 4, in another optional embodiment provided in the embodiment of the present application, the generating an image prediction result according to the first mask image and the second mask image may specifically include:
determining a first region area and first position information according to the first segmentation region included in the first mask image, wherein the first position information represents the position of the first segmentation region in the first mask image;
determining a second region area and second position information according to the second segmentation region included in the second mask image, wherein the second position information represents the position of the second segmentation region in the second mask image;
determining an area change result according to the first region area and the second region area;
determining a position change result according to the first position information and the second position information;
and determining an image prediction result according to the area change result and the position change result.
In this embodiment, a method for generating an image prediction result is described. After the first mask image and the second mask image are obtained, the change between the first mask image and the second mask image can be analyzed, so that an area change result and a position change result are obtained, and both the area change result and the position change result belong to an image prediction result. This will be described in connection with two examples.
Illustratively, the first mask image and the second mask image have the same size (i.e., the target image size). A first region area is determined based on the first mask image, a second region area is determined based on the second mask image, and the area change result is determined from the two. For example, if the area of the first segmentation region is 150 pixels and the area of the second segmentation region is 250 pixels, the area change result is that the target object (e.g., a blood clot) has become larger by 100 pixels. Conversely, if the area of the first segmentation region is 250 pixels and the area of the second segmentation region is 150 pixels, the area change result is that the target object has become smaller by 100 pixels.
Illustratively, the first mask image and the second mask image both have the same size (i.e., the target image size). First position information is determined based on the first mask image, second position information is determined based on the second mask image, and a position change result is determined according to the first position information and the second position information. For example, the first location information may be edge pixel location information of the first partition, and the second location information may be edge pixel location information of the second partition. For another example, the first location information may be location information of a center pixel point of the first divided region, and the second location information may be location information of a center pixel point of the second divided region.
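These two computations might be sketched as follows (illustrative only; the area is taken as a pixel count and the position as the mask centroid, one of the options mentioned above):

```python
import numpy as np

def area_and_centroid(mask):
    """mask: binary array; returns (area in pixels, centroid coordinates)."""
    area = int(mask.sum())
    centroid = np.argwhere(mask).mean(axis=0) if area else None
    return area, centroid

def prediction_result(first_mask, second_mask):
    a1, c1 = area_and_centroid(first_mask)
    a2, c2 = area_and_centroid(second_mask)
    return {"area_change": a2 - a1,  # e.g., +100 pixels means the object grew
            "position_change": None if c1 is None or c2 is None else c2 - c1}
```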
Secondly, in the embodiment of the present application, a generation method of an image prediction result is provided, by which a first division area and first position information are determined from a first mask image, and a second division area and second position information are determined from a second mask image, thereby enabling common evaluation of the image prediction result from area change and position change, and improving feasibility and operability of a scheme. Further, for the problem of cerebral hemorrhage, compared with the existing method which only predicts whether the hemorrhage is enlarged or not, the method can predict the future morphology of the hematoma, so that richer and comprehensive information is provided, and doctors are assisted to make treatment schemes.
Referring to fig. 13, fig. 13 is a schematic diagram of an embodiment of an image prediction apparatus in an embodiment of the present application, in which the image prediction apparatus 20 includes:
an obtaining module 201, configured to obtain a to-be-predicted image, where the to-be-predicted image corresponds to a target image size;
the obtaining module 201 is further configured to obtain a first mask image through a region segmentation model based on the image to be predicted, where the first mask image corresponds to a size of the target image and includes a first segmentation region corresponding to the target object;
the obtaining module 201 is further configured to obtain a second mask image through a region prediction model based on the first mask image and the image to be predicted, where the second mask image corresponds to the target image size, and the second mask image includes a second divided region corresponding to the target object;
the generating module 202 is configured to generate an image prediction result according to the first mask image and the second mask image, where the image prediction result is used to indicate a change condition of the target object within a preset time.
In the embodiment of the application, an image prediction device is provided, and by adopting the device, the development condition of a target object can be predicted by using a trained regional prediction model, so that the image analysis efficiency can be improved, the time cost and the labor cost can be saved, and on the other hand, the image can be recognized based on an artificial intelligence model, so that the image prediction accuracy can be improved to a certain extent.
Optionally, on the basis of the embodiment corresponding to fig. 13, in another embodiment of the image prediction apparatus 20 provided in the embodiment of the present application, the image prediction apparatus 20 further includes a training module 203;
the obtaining module 201 is further configured to obtain a training sample image pair before obtaining the first mask image through the region segmentation model based on the image to be predicted, where the training sample image pair is derived from the same training object, the training sample image pair includes an original sample image and an original labeled sample image, the original labeled sample image is an image obtained by performing region segmentation labeling on the original sample image, and both the original sample image and the original labeled sample image correspond to a target image size;
the obtaining module 201 is further configured to obtain a first prediction mask image through a to-be-trained region segmentation model based on the original sample image;
the training module 203 is configured to update a model parameter of the to-be-trained region segmentation model according to the first prediction mask image and the original annotation sample image corresponding to the original sample image;
the obtaining module 201 is further configured to, if the model training condition is satisfied, use the updated model parameter as a model parameter of the region segmentation model to obtain the region segmentation model.
In the embodiment of the application, the image prediction device is provided, and by adopting the device, the original sample image and the original labeled sample image are used as a training sample image pair, so that a to-be-trained region segmentation model is trained, after the training is completed, the segmentation of the to-be-recognized image can be realized, and the image is segmented by adopting an artificial intelligence model, so that the accuracy of image segmentation is improved.
Optionally, on the basis of the embodiment corresponding to fig. 13, in another embodiment of the image prediction apparatus 20 provided in the embodiment of the present application, the image prediction apparatus 20 further includes a training module 203;
the obtaining module 201 is further configured to obtain a training sample image pair before obtaining a second mask image through a region prediction model based on a first mask image and a to-be-predicted image, where the training sample image pair is derived from the same training object, the training sample image pair includes an original sample image, an original labeled sample image, and a target labeled sample image, the original labeled sample image is an image obtained by labeling a segmentation region of the original sample image, the target labeled sample image is an image obtained by labeling a segmentation region of the target sample image, and a sampling interval time between the target sample image and the original sample image is less than or equal to a preset time;
the obtaining module 201 is further configured to obtain a second prediction mask image through a to-be-trained region prediction model based on the original sample image and the original labeled sample image;
the training module 203 is used for updating model parameters of the prediction model of the region to be trained according to the second prediction mask image and the target labeling sample image;
the obtaining module 201 is further configured to, if the model training condition is satisfied, use the updated model parameter as a model parameter of the area prediction model to obtain the area prediction model.
In the embodiment of the application, the image prediction device is provided, and the original sample image, the original labeled sample image and the target labeled sample image are used as a training sample image pair, so that a to-be-trained region prediction model is trained, after the training is completed, the prediction and segmentation of the to-be-recognized image can be realized, and the artificial intelligence model is used for predicting and segmenting the image, so that the accuracy of image prediction is improved.
Alternatively, on the basis of the embodiment corresponding to fig. 13, in another embodiment of the image prediction apparatus 20 provided in the embodiment of the present application,
an obtaining module 201, configured to specifically acquire an original sample image corresponding to a training object at a first time;
acquiring a segmentation region labeling result aiming at an original sample image to obtain an original labeling sample image corresponding to a training object;
acquiring a target sample image corresponding to a training object at a second moment, wherein the time interval between the second moment and the first moment is the sampling interval time between the target sample image and the original sample image, and the second moment occurs after the first moment;
acquiring a segmentation area labeling result aiming at a target sample image to obtain a target labeling sample image corresponding to a training object;
and acquiring a training sample image pair according to the original sample image, the original labeling sample image and the target labeling sample image.
In the embodiment of the application, an image prediction device is provided, and by adopting the device, when a sample image used for training is collected, not only the collected original sample image and the collected target sample image need to be labeled, but also the time of image sampling for two times needs to be considered, so that the data adopted during training has higher reliability and accuracy, and the training is facilitated to obtain a model with better prediction effect.
Optionally, on the basis of the embodiment corresponding to fig. 13, in another embodiment of the image prediction apparatus 20 provided in the embodiment of the present application, the image prediction apparatus 20 further includes a processing module 204;
an obtaining module 201, configured to specifically acquire an original sample image to be processed corresponding to a training object at a first time;
cutting an original sample image to be processed to obtain an original sample image;
the obtaining module 201 is specifically configured to collect a to-be-processed target sample image corresponding to the training object at a second time;
cutting a target sample image to be processed to obtain a target sample image;
the processing module 204 is configured to process the original sample image and the target sample image based on at least one of image registration, image resampling, and image normalization.
In the embodiment of the application, an image prediction device is provided, and by adopting the device, a series of preprocessing needs to be carried out on the collected sample image, so that the training reliability is improved. Taking CT images as an example, due to differences in CT devices and differences in doctor operations, the original sample image and the target sample image may have different layers, thicknesses, scanning areas, angles, and the like, and therefore, it is necessary to align the original sample image and the target sample image as much as possible and to remove changes to the images caused by non-focal self development.
Optionally, on the basis of the embodiment corresponding to fig. 13, in another embodiment of the image prediction apparatus 20 provided in the embodiment of the present application, the region segmentation model is a first three-dimensional U-shaped network 3D-UNet model, and the region prediction model is a second 3D-UNet model;
an obtaining module 201, specifically configured to perform downsampling processing through the convolution layer and the pooling layer included in the first 3D-UNet model based on the to-be-predicted image to obtain first feature data;
and perform upsampling processing through the convolutional layer and the pooling layer included in the first 3D-UNet model based on the first feature data to obtain a first mask image;
the obtaining module 201 is specifically configured to perform downsampling processing through the convolution layer and the pooling layer included in the second 3D-UNet model based on the first mask image and the image to be predicted to obtain second feature data;
and perform upsampling processing through the convolutional layer and the pooling layer included in the second 3D-UNet model based on the second feature data to obtain a second mask image.
The embodiment of the application provides an image prediction device, and by adopting the device, the segmentation of an image can be realized by utilizing a 3D-UNet model, and the multi-scale feature recognition of the image features by a network is realized. Further, for medical images, the boundaries are fuzzy, the gradient is complex, more high-resolution information is needed for accurate segmentation, the internal structure of the living body is relatively fixed, and the segmentation target has certain regularity, so that the low-resolution information can provide the information for object identification. Based on the method, the 3D-UNet model combines low-resolution information and high-resolution information, and the accuracy of medical image segmentation can be improved. Finally, the three-dimensional image need not be trained with each slice entered separately, but rather the entire image can be used as input for the 3D-UNet.
Alternatively, on the basis of the embodiment corresponding to fig. 13, in another embodiment of the image prediction apparatus 20 provided in the embodiment of the present application, the region segmentation model is a first U-type network UNet model, and the region prediction model is a second UNet model;
an obtaining module 201, specifically configured to perform downsampling processing through the convolution layer and the pooling layer included in the first UNet model based on the to-be-predicted image to obtain a first feature map;
and perform upsampling processing through the convolutional layer and the pooling layer included in the first UNet model based on the first feature map to obtain a first mask image;
the obtaining module 201 is specifically configured to perform downsampling processing through the convolution layer and the pooling layer included in the second UNet model based on the first mask image and the image to be predicted to obtain a second feature map;
and perform upsampling processing through the convolution layer and the pooling layer included in the second UNet model based on the second feature map to obtain a second mask image.
The embodiment of the application provides an image prediction device, and by adopting the device, the segmentation of an image can be realized by utilizing a U-net model, and the multi-scale feature recognition of the image features by a network is realized. Further, for medical images, the boundaries are fuzzy, the gradient is complex, more high-resolution information is needed for accurate segmentation, the internal structure of the living body is relatively fixed, and the segmentation target has certain regularity, so that the low-resolution information can provide the information for object identification. Based on the method, the U-net model combines low-resolution information and high-resolution information, and the accuracy of medical image segmentation can be improved.
Optionally, on the basis of the embodiment corresponding to fig. 13, in another embodiment of the image prediction apparatus 20 provided in the embodiment of the present application, the image prediction apparatus 20 further includes a determining module 205;
the obtaining module 201 is further configured to obtain a class probability distribution through an object classification model based on the first mask image after obtaining the first mask image through the region segmentation model based on the image to be predicted;
a determining module 205, configured to determine a target category according to the category probability distribution;
the obtaining module 201 is further configured to, if the target class is used to indicate that the target object is in a change state, perform a step of obtaining a second mask image through a region prediction model based on the first mask image and the image to be predicted.
In the embodiment of the present application, an image prediction apparatus is provided, and with the above apparatus, before predicting a second mask image, whether a target object changes may be determined by using an object classification model, and if the target object is in a change state, image prediction may be further performed. If the image is not in the change state, the subsequent image prediction can be selected not to be carried out, so that the image processing resource is saved, and the subsequent image processing can be selected to be carried out, so that the accuracy of the image prediction is further improved.
Optionally, on the basis of the embodiment corresponding to fig. 13, in another embodiment of the image prediction apparatus 20 provided in the embodiment of the present application, the image prediction apparatus 20 further includes a training module 203;
the obtaining module 201 is further configured to obtain an original sample image before obtaining the class probability distribution through the object classification model based on the first mask image, where the original sample image corresponds to the annotation classification result;
the obtaining module 201 is further configured to obtain a segmentation region labeling result for the original sample image, so as to obtain an original labeled sample image corresponding to the training object;
the obtaining module 201 is further configured to obtain a prediction class probability distribution through a to-be-trained object classification model based on the original labeled sample image;
the training module 203 is used for updating model parameters of the classification model of the object to be trained according to the prediction class probability distribution and the labeling classification result;
the obtaining module 201 is further configured to, if the model training condition is satisfied, use the updated model parameter as a model parameter of the object classification model to obtain the object classification model.
In the embodiment of the application, an image prediction device is provided. With the device, the original annotated sample image and the annotated classification result are used as a group of training samples to train the object classification model to be trained; after training is completed, the development condition of the target object in the image to be recognized can be predicted, improving the accuracy of image prediction.
Optionally, on the basis of the embodiment corresponding to fig. 13, in another embodiment of the image prediction apparatus 20 provided in the embodiment of the present application, the image prediction apparatus 20 further includes a processing module 204;
the acquiring module 201 is further configured to acquire object related information of the object to be predicted, where the object related information includes at least one of age, gender, height, weight, and medical history information;
the processing module 204 is configured to perform information characterization processing on the object association information to obtain object association characteristics;
the obtaining module 201 is specifically configured to obtain a second mask image through a region prediction model and a full connection layer based on the first mask image, the image to be predicted, and the object-related feature.
In the embodiment of the present application, an image prediction apparatus is provided. With this apparatus, in the process of predicting the second mask image, object-associated information of the object to be predicted can be added, and this information can assist the image prediction, thereby improving the accuracy of image prediction.
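One way such a fusion could look, sketched under the assumption that the object-associated information is first encoded as a small numeric vector and then joined to image features through a fully connected layer (all dimensions and names are assumptions of this sketch):

```python
import torch
import torch.nn as nn

class FusionHead(nn.Module):
    """Joins image features with object-associated features via a fully connected layer."""
    def __init__(self, image_feat_dim=256, assoc_feat_dim=4, out_dim=256):
        super().__init__()
        self.fc = nn.Linear(image_feat_dim + assoc_feat_dim, out_dim)

    def forward(self, image_features, assoc_features):
        fused = torch.cat([image_features, assoc_features], dim=-1)
        return torch.relu(self.fc(fused))

# e.g. age, gender, height, and weight characterized as a normalized vector
assoc = torch.tensor([[0.45, 1.0, 0.62, 0.55]])   # hypothetical encoding
head = FusionHead()
fused = head(torch.randn(1, 256), assoc)          # fed on to the region prediction decoder
```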
Optionally, on the basis of the embodiment corresponding to fig. 13, in another embodiment of the image prediction apparatus 20 provided in the embodiment of the present application, the image prediction apparatus 20 further includes a processing module 204 and a display module 206;
the processing module 204 is configured to, after the second mask image is obtained through the region prediction model based on the first mask image and the image to be predicted, perform image alignment processing on the second mask image and the image to be predicted to obtain an aligned second mask image and an aligned image to be predicted;
the processing module 204 is further configured to overlay the aligned second mask image on the aligned image to be predicted to obtain a composite image, where the portion of the aligned second mask image other than the second segmentation region is a transparent region;
and a display module 206, configured to display the composite image, or send the composite image to a terminal device so that the terminal device displays the composite image.
In the embodiment of the present application, an image prediction apparatus is provided. With this apparatus, the output second mask image can be superimposed on the image to be predicted to synthesize a predicted image, so that relevant personnel can understand the development of the target object more intuitively, which improves the practicability of the solution.
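A minimal compositing sketch, assuming the second mask image and the image to be predicted have already been registered to one another and are 2D slices; the red tint and alpha value are illustrative choices, not part of the disclosure:

```python
import numpy as np

def compose(aligned_image, aligned_mask, alpha=0.4):
    """Overlay the second segmentation region on the aligned image to be predicted."""
    rgb = np.stack([aligned_image] * 3, axis=-1).astype(np.float32)
    tint = np.array([255.0, 0.0, 0.0])       # highlight color for the predicted region
    region = aligned_mask > 0                # everything outside stays "transparent"
    rgb[region] = (1 - alpha) * rgb[region] + alpha * tint
    return rgb.astype(np.uint8)              # composite image for display
```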
Optionally, on the basis of the embodiment corresponding to fig. 13, in another embodiment of the image prediction apparatus 20 provided in the embodiment of the present application,
a generating module 202, configured to determine a first segmentation area and first position information according to the first segmentation region included in the first mask image, where the first position information indicates the position of the first segmentation region in the first mask image;
determining a second segmentation area and second position information according to the second segmentation region included in the second mask image, wherein the second position information represents the position of the second segmentation region in the second mask image;
determining an area change result according to the first division area and the second division area;
determining a position change result according to the first position information and the second position information;
and determining an image prediction result according to the area change result and the position change result.
In an embodiment of the present application, an image prediction apparatus is provided. With this apparatus, a first segmentation area and first position information are determined from the first mask image, and a second segmentation area and second position information are determined from the second mask image, so that both the area change and the position change can be used jointly to evaluate the image prediction result, improving the feasibility and operability of the solution. Further, for cerebral hemorrhage, compared with existing methods that only predict whether the hemorrhage will enlarge, this method can predict the future morphology of the hematoma, providing richer and more comprehensive information and assisting doctors in formulating treatment plans.
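As an illustration, the area and position changes could be derived from the two masks as below, taking the area as the (voxel-weighted) foreground count and the position as the region centroid; both choices, and all names, are assumptions of this sketch:

```python
import numpy as np

def image_prediction_result(first_mask, second_mask, voxel_volume=1.0):
    """Derive area and position change results from two binary masks (assumed non-empty)."""
    area1 = first_mask.sum() * voxel_volume          # first segmentation area
    area2 = second_mask.sum() * voxel_volume         # second segmentation area
    pos1 = np.array(np.nonzero(first_mask)).mean(axis=1)   # first position information (centroid)
    pos2 = np.array(np.nonzero(second_mask)).mean(axis=1)  # second position information (centroid)
    return {
        "area_change": float(area2 - area1),               # area change result
        "position_change": float(np.linalg.norm(pos2 - pos1)),  # position change result
    }
```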
As shown in fig. 14, for convenience of description, only the portions related to the embodiments of the present application are shown; for specific technical details that are not disclosed, please refer to the method portion of the embodiments of the present application. The computer device may be any computer device, including a personal computer, a mobile phone, a tablet computer, a Personal Digital Assistant (PDA), a Point of Sales (POS) terminal, a vehicle-mounted computer, and the like. The following takes a personal computer as an example:
fig. 14 is a block diagram showing a partial structure of a personal computer related to the computer device provided in the embodiment of the present application. Referring to fig. 14, the personal computer includes: radio Frequency (RF) circuit 310, memory 320, input unit 330, display unit 340, sensor 350, audio circuit 360, wireless fidelity (WiFi) module 370, processor 380, and power supply 390. Those skilled in the art will appreciate that the personal computer configuration shown in FIG. 14 is not intended to be limiting and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components.
The following describes each component of the personal computer in detail with reference to fig. 14:
the RF circuit 310 may be used for receiving and transmitting signals during information transmission and reception or during a call. In particular, it delivers downlink information received from a base station to the processor 380 for processing, and transmits uplink data to the base station. In general, the RF circuit 310 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a Low Noise Amplifier (LNA), a duplexer, and the like. In addition, the RF circuit 310 may also communicate with networks and other devices via wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to Global System for Mobile communication (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), email, Short Messaging Service (SMS), and the like.
The memory 320 may be used to store software programs and modules, and the processor 380 executes the various functional applications and data processing of the personal computer by running the software programs and modules stored in the memory 320. The memory 320 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the data storage area may store data (such as audio data, a phonebook, etc.) created according to the use of the personal computer, and the like. Further, the memory 320 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device.
The input unit 330 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the personal computer. Specifically, the input unit 330 may include a touch panel 331 and other input devices 332. The touch panel 331, also referred to as a touch screen, can collect touch operations of a user on or near it (e.g., operations performed on or near the touch panel 331 using any suitable object or accessory such as a finger or a stylus) and drive the corresponding connection device according to a preset program. Optionally, the touch panel 331 may include two parts: a touch detection device and a touch controller. The touch detection device detects the user's touch position, detects the signal brought by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into touch point coordinates, and sends them to the processor 380, and can also receive and execute commands sent by the processor 380. In addition, the touch panel 331 may be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave. The input unit 330 may include other input devices 332 in addition to the touch panel 331. In particular, the other input devices 332 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys and switch keys), a trackball, a mouse, a joystick, and the like.
The display unit 340 may be used to display information input by the user or information provided to the user, as well as the various menus of the personal computer. The display unit 340 may include a display panel 341; optionally, the display panel 341 may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like. Further, the touch panel 331 may cover the display panel 341; when the touch panel 331 detects a touch operation on or near it, the touch operation is transmitted to the processor 380 to determine the type of the touch event, and the processor 380 then provides a corresponding visual output on the display panel 341 according to the type of the touch event. Although the touch panel 331 and the display panel 341 are shown in fig. 14 as two separate components to implement the input and output functions of the personal computer, in some embodiments the touch panel 331 and the display panel 341 may be integrated to implement these input and output functions.
The personal computer may also include at least one sensor 350, such as a light sensor, a motion sensor, and other sensors. Specifically, the light sensor may include an ambient light sensor and a proximity sensor: the ambient light sensor adjusts the brightness of the display panel 341 according to the brightness of ambient light, and the proximity sensor turns off the display panel 341 and/or the backlight when the personal computer moves to the ear. As one kind of motion sensor, an accelerometer sensor can detect the magnitude of acceleration in each direction (generally three axes), can detect the magnitude and direction of gravity when stationary, and can be used in applications that identify the attitude of the personal computer (such as landscape/portrait switching, related games, and magnetometer attitude calibration) and in vibration-identification functions (such as a pedometer and tapping); other sensors such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor may also be configured in the personal computer and are not described herein.
The audio circuit 360, a speaker 361, and a microphone 362 may provide an audio interface between the user and the personal computer. The audio circuit 360 may transmit the electrical signal converted from received audio data to the speaker 361, which converts it into a sound signal for output; on the other hand, the microphone 362 converts a collected sound signal into an electrical signal, which is received by the audio circuit 360 and converted into audio data; the audio data is then processed by the processor 380 and either transmitted via the RF circuit 310 to, for example, another personal computer, or output to the memory 320 for further processing.
WiFi is a short-range wireless transmission technology. Through the WiFi module 370, the personal computer can help the user send and receive e-mails, browse web pages, access streaming media, and the like; it provides the user with wireless broadband Internet access. Although fig. 14 shows the WiFi module 370, it is understood that it is not an essential component of the personal computer and may be omitted as needed without changing the essence of the invention.
The processor 380 is the control center of the personal computer. It connects the various parts of the entire personal computer using various interfaces and lines, and performs the various functions of the personal computer and processes data by running or executing the software programs and/or modules stored in the memory 320 and calling the data stored in the memory 320, thereby monitoring the personal computer as a whole. Optionally, the processor 380 may include one or more processing units; optionally, the processor 380 may integrate an application processor, which mainly handles the operating system, user interfaces, application programs, and the like, and a modem processor, which mainly handles wireless communication. It will be appreciated that the modem processor may also not be integrated into the processor 380.
The personal computer also includes a power supply 390 (e.g., a battery) for powering the various components, which may optionally be logically coupled to the processor 380 via a power management system to manage charging, discharging, and power consumption via the power management system.
Although not shown, the personal computer may further include a camera, a bluetooth module, etc., which will not be described herein.
The steps performed by the computer device in the above embodiments may be based on the computer device structure shown in fig. 14.
Embodiments of the present application also provide a computer-readable storage medium, in which a computer program is stored, and when the computer program runs on a computer, the computer is caused to execute the method described in the foregoing embodiments.
Embodiments of the present application also provide a computer program product including a program, which, when run on a computer, causes the computer to perform the methods described in the foregoing embodiments.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as a stand-alone product, it may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (15)

1. An artificial intelligence based image prediction method, comprising:
acquiring an image to be predicted, wherein the image to be predicted corresponds to a target image size;
acquiring a first mask image through a region segmentation model based on the image to be predicted, wherein the first mask image corresponds to the target image size and comprises a first segmentation region corresponding to a target object;
acquiring a second mask image through a region prediction model based on the first mask image and the image to be predicted, wherein the second mask image corresponds to the target image size, and the second mask image comprises a second segmentation region corresponding to the target object;
and generating an image prediction result according to the first mask image and the second mask image, wherein the image prediction result is used for representing the change condition of the target object in a preset time.
2. The image prediction method according to claim 1, wherein before the obtaining of the first mask image by the region segmentation model based on the image to be predicted, the method further comprises:
acquiring a training sample image pair, wherein the training sample image pair is derived from the same training object, the training sample image pair comprises an original sample image and an original labeling sample image, the original labeling sample image is an image obtained by labeling a segmentation region of the original sample image, and the original sample image and the original labeling sample image both correspond to the size of the target image;
based on the original sample image, obtaining a first prediction mask image through a region segmentation model to be trained;
updating the model parameters of the to-be-trained region segmentation model according to the first prediction mask image corresponding to the original sample image and the original marked sample image;
and if the model training condition is met, taking the updated model parameters as the model parameters of the region segmentation model to obtain the region segmentation model.
3. The image prediction method according to claim 1, wherein before the obtaining of the second mask image by the region prediction model based on the first mask image and the image to be predicted, the method further comprises:
acquiring a training sample image pair, wherein the training sample image pair is derived from the same training object, the training sample image pair comprises an original sample image, an original labeling sample image and a target labeling sample image, the original labeling sample image is an image obtained by labeling a segmentation region of the original sample image, the target labeling sample image is an image obtained by labeling a segmentation region of the target sample image, and the sampling interval time between the target sample image and the original sample image is less than or equal to the preset time;
acquiring a second prediction mask image through a to-be-trained region prediction model based on the original sample image and the original marked sample image;
updating model parameters of the prediction model of the region to be trained according to the second prediction mask image and the target labeling sample image;
and if the model training condition is met, taking the updated model parameters as the model parameters of the region prediction model to obtain the region prediction model.
4. The image prediction method of claim 3, wherein the obtaining of the training sample image pair comprises:
acquiring the original sample image corresponding to a training object at a first moment;
acquiring a segmentation region labeling result aiming at the original sample image to obtain the original labeling sample image corresponding to the training object;
acquiring the target sample image corresponding to the training object at a second moment, wherein a time interval between the second moment and the first moment is a sampling interval time between the target sample image and the original sample image, and the second moment occurs after the first moment;
acquiring a segmentation region labeling result aiming at the target sample image to obtain the target labeling sample image corresponding to the training object;
and acquiring the training sample image pair according to the original sample image, the original labeling sample image and the target labeling sample image.
5. The image prediction method of claim 4, wherein the acquiring the original sample image corresponding to the training object at the first time comprises:
acquiring an original sample image to be processed corresponding to the training object at the first moment;
cutting the original sample image to be processed to obtain the original sample image;
the acquiring the target sample image corresponding to the training object at the second time includes:
acquiring a target sample image to be processed corresponding to the training object at the second moment;
cutting the target sample image to be processed to obtain the target sample image;
the method further comprises the following steps:
processing the original sample image and the target sample image based on at least one of image registration, image resampling, and image normalization.
6. The image prediction method according to claim 1, wherein the region segmentation model is a first three-dimensional U-network 3D-UNet model, and the region prediction model is a second 3D-UNet model;
the obtaining of the first mask image through the region segmentation model based on the image to be predicted includes:
based on the image to be predicted, performing downsampling processing through a convolutional layer and a pooling layer included in the first 3D-UNet model to obtain first feature data;
based on the first feature data, performing upsampling processing through the convolutional layer and the pooling layer included in the first 3D-UNet model to obtain the first mask image;
the obtaining a second mask image through a region prediction model based on the first mask image and the image to be predicted comprises:
based on the first mask image and the image to be predicted, performing downsampling processing through a convolutional layer and a pooling layer included in the second 3D-UNet model to obtain second feature data;
and performing upsampling processing through the convolutional layer and the pooling layer included in the second 3D-UNet model based on the second feature data to obtain the second mask image.
7. The image prediction method according to claim 1, wherein the region segmentation model is a first U-network UNet model, and the region prediction model is a second UNet model;
the obtaining of the first mask image through the region segmentation model based on the image to be predicted includes:
based on the image to be predicted, performing downsampling processing through a convolution layer and a pooling layer included in the first UNet model to obtain a first feature map;
based on the first feature map, performing upsampling processing through the convolutional layer and the pooling layer included in the first UNet model to obtain the first mask image;
the obtaining a second mask image through a region prediction model based on the first mask image and the image to be predicted comprises:
based on the first mask image and the image to be predicted, performing downsampling processing through a convolution layer and a pooling layer included in the second UNet model to obtain a second feature map;
and performing upsampling processing through the convolutional layer and the pooling layer included in the second UNet model based on the second feature map to obtain the second mask image.
8. The image prediction method according to claim 1, wherein after the obtaining of the first mask image by the region segmentation model based on the image to be predicted, the method further comprises:
obtaining class probability distribution through an object classification model based on the first mask image;
determining a target category according to the category probability distribution;
and if the target type is used for indicating that the target object is in a change state, executing the step of acquiring a second mask image through a region prediction model based on the first mask image and the image to be predicted.
9. The image prediction method of claim 1, wherein before obtaining the class probability distribution through the object classification model based on the first mask image, the method further comprises:
acquiring an original sample image, wherein the original sample image corresponds to an annotation classification result;
acquiring a segmentation region labeling result aiming at the original sample image to obtain the original labeling sample image corresponding to the training object;
based on the original marked sample image, obtaining prediction class probability distribution through a classification model of an object to be trained;
updating model parameters of the object classification model to be trained according to the prediction class probability distribution and the labeling classification result;
and if the model training condition is met, taking the updated model parameters as the model parameters of the object classification model to obtain the object classification model.
10. The image prediction method of claim 1, further comprising:
acquiring object associated information of an object to be predicted, wherein the object associated information comprises at least one of age, sex, height, weight and medical history information;
performing information characterization processing on the object associated information to obtain object associated characteristics;
the obtaining a second mask image through a region prediction model based on the first mask image and the image to be predicted comprises:
and acquiring a second mask image through the region prediction model and the full connection layer based on the first mask image, the image to be predicted and the object associated characteristics.
11. The image prediction method according to any one of claims 1 to 10, wherein after the obtaining of the second mask image by the region prediction model based on the first mask image and the image to be predicted, the method further comprises:
performing image alignment processing on the second mask image and the image to be predicted to obtain an aligned second mask image and an aligned image to be predicted;
covering the aligned second mask image on the aligned image to be predicted to obtain a composite image, wherein the part of the aligned second mask image except the second segmentation area is a transparent area;
and displaying the composite image, or sending the composite image to a terminal device to enable the terminal device to display the composite image.
12. The image prediction method according to any one of claims 1 to 10, wherein the generating an image prediction result from the first mask image and the second mask image comprises:
determining a first segmentation area and first position information according to the first segmentation region included in the first mask image, wherein the first position information represents the position of the first segmentation region in the first mask image;
determining a second segmentation area and second position information according to the second segmentation region included in the second mask image, wherein the second position information represents the position of the second segmentation region in the second mask image;
determining an area change result according to the first division area and the second division area;
determining a position change result according to the first position information and the second position information;
and determining the image prediction result according to the area change result and the position change result.
13. An image prediction apparatus comprising:
an obtaining module, configured to obtain an image to be predicted, wherein the image to be predicted corresponds to a target image size;
the obtaining module is further configured to obtain a first mask image through a region segmentation model based on the image to be predicted, where the first mask image corresponds to the target image size and includes a first segmentation region corresponding to a target object;
the obtaining module is further configured to obtain a second mask image through a region prediction model based on the first mask image and the image to be predicted, where the second mask image corresponds to the target image size, and the second mask image comprises a second segmentation region corresponding to the target object;
and the generating module is used for generating an image prediction result according to the first mask image and the second mask image, wherein the image prediction result is used for representing the change condition of the target object in preset time.
14. A computer device, comprising: a memory, a processor, and a bus system;
wherein the memory is used for storing programs;
the processor is configured to execute the program in the memory, and to perform the method of any one of claims 1 to 12 according to instructions in the program code;
the bus system is used for connecting the memory and the processor so as to enable the memory and the processor to communicate.
15. A computer-readable storage medium comprising instructions that, when executed on a computer, cause the computer to perform the method of any of claims 1 to 12.
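For readers wanting a concrete picture of claims 1 and 6, the following is a deliberately minimal PyTorch sketch of the two-stage pipeline; a real 3D-UNet has deeper encoder/decoder paths and skip connections, and all channel sizes, shapes, and names below are assumptions of this sketch, not the disclosed implementation:

```python
import torch
import torch.nn as nn

class TinyUNet3D(nn.Module):
    """Toy stand-in for a 3D-UNet: one downsampling and one upsampling stage."""
    def __init__(self, in_ch):
        super().__init__()
        self.down = nn.Sequential(                       # convolution + pooling (downsampling)
            nn.Conv3d(in_ch, 16, 3, padding=1), nn.ReLU(), nn.MaxPool3d(2))
        self.up = nn.Sequential(                         # upsampling back to the target image size
            nn.ConvTranspose3d(16, 16, 2, stride=2),
            nn.Conv3d(16, 1, 3, padding=1), nn.Sigmoid())

    def forward(self, x):
        return self.up(self.down(x))

segmenter = TinyUNet3D(in_ch=1)                          # region segmentation model
predictor = TinyUNet3D(in_ch=2)                          # region prediction model
image = torch.randn(1, 1, 32, 64, 64)                    # image to be predicted
first_mask = segmenter(image)                            # first mask image
second_mask = predictor(torch.cat([image, first_mask], dim=1))  # second mask image
```

Both outputs retain the target image size, which is what allows the first and second mask images to be compared region-for-region when generating the image prediction result.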
CN202110276959.8A 2021-03-15 2021-03-15 Image prediction method based on artificial intelligence, related device and storage medium Pending CN113706441A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110276959.8A CN113706441A (en) 2021-03-15 2021-03-15 Image prediction method based on artificial intelligence, related device and storage medium

Publications (1)

Publication Number Publication Date
CN113706441A true CN113706441A (en) 2021-11-26

Family

ID=78647812

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110276959.8A Pending CN113706441A (en) 2021-03-15 2021-03-15 Image prediction method based on artificial intelligence, related device and storage medium

Country Status (1)

Country Link
CN (1) CN113706441A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114898177A (en) * 2022-05-09 2022-08-12 梅卡曼德(北京)机器人科技有限公司 Defect image generation method, model training method, device, medium, and product
CN114898177B (en) * 2022-05-09 2023-08-04 梅卡曼德(北京)机器人科技有限公司 Defect image generation method, model training method, device, medium and product
CN115187577A (en) * 2022-08-05 2022-10-14 北京大学第三医院(北京大学第三临床医学院) Method and system for automatically delineating breast cancer clinical target area based on deep learning
CN115496765A (en) * 2022-09-23 2022-12-20 深圳市铱硙医疗科技有限公司 Image processing method and device for brain area, computer equipment and storage medium
CN115690143A (en) * 2022-09-26 2023-02-03 推想医疗科技股份有限公司 Image segmentation method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination