CN118037963A

CN118037963A - Reconstruction method, device, equipment and medium of digestive cavity inner wall three-dimensional model

Info

Publication number: CN118037963A
Application number: CN202410419709.9A
Authority: CN
Inventors: 王羽嗣; 周可; 王云忠; 刘思德
Original assignee: Guangzhou Side Medical Technology Co ltd
Current assignee: Guangzhou Side Medical Technology Co ltd
Priority date: 2024-04-09
Filing date: 2024-04-09
Publication date: 2024-05-14
Anticipated expiration: 2044-04-09
Also published as: CN118037963B

Abstract

The application provides a method, an apparatus, a device and a medium for reconstructing a three-dimensional model of the inner wall of a digestive cavity. The method comprises: acquiring an image set captured by a capsule endoscope in the digestive cavity; determining, in the image set, a plurality of frames of first images formed under the influence of peristalsis of the digestive cavity wall; determining a shrinkage area in each frame of first image; restoring the shrinkage area in each frame of first image to a flat area to obtain a plurality of frames of processed first images; and obtaining a three-dimensional model of the digestive cavity based on the plurality of frames of processed first images. Restoring the shrinkage areas in the first images affected by peristalsis of the digestive cavity wall to flat areas yields more stable images. Building the three-dimensional model from these stable images helps construct a more accurate digestive cavity model, helps doctors judge a patient's lesions more accurately, and improves the efficiency of medical treatment and patient satisfaction.

Description

Reconstruction method, device, equipment and medium of digestive cavity inner wall three-dimensional model

Technical Field

The present application relates to the field of three-dimensional model construction technology, and in particular, to a method, an apparatus, a computer device, a storage medium, and a computer program product for reconstructing a three-dimensional model of an inner wall of a digestive cavity.

Background

As sensor technology matures, it is being applied in more and more fields, and it plays a significant role in the construction of three-dimensional models. Common three-dimensional reconstruction techniques currently include reconstruction based on a binocular camera and reconstruction with a depth sensor. When such techniques are to be mounted on a capsule endoscope to construct a three-dimensional model of the digestive cavity, they are limited by the volume of the capsule endoscope: a binocular camera is difficult to install on it, and common depth sensors such as time-of-flight (TOF), structured light and radar are too large to be used in it. In addition, because the digestive cavity is in a peristaltic state at all times, conventional SLAM (Simultaneous Localization and Mapping) cannot obtain enough stable areas when constructing a map, so map construction fails and the finally constructed three-dimensional image is not accurate enough.

Disclosure of Invention

In view of the foregoing, it is desirable to provide a method, an apparatus, a computer device, a computer-readable storage medium, and a computer program product for reconstructing a three-dimensional model of the inner wall of a digestive cavity that can provide stable images from which the three-dimensional model is constructed.

In a first aspect, the present application provides a method for reconstructing a three-dimensional model of an inner wall of a digestive cavity, comprising:

acquiring an image set acquired by a capsule endoscope in a digestive cavity;

Determining a plurality of frames of first images formed under the influence of peristalsis of the digestive cavity wall in an image set, and determining a shrinkage area in each frame of first images;

Restoring the shrinkage area in each frame of the first image into a flat area to obtain a plurality of frames of processed first images;

based on the multi-frame processed first image, a three-dimensional model of the inner wall of the digestive cavity is obtained.

In one embodiment, determining a plurality of frames of first images formed under the influence of peristalsis of the digestive cavity wall in the image set, and determining a shrinkage area in each frame of first images, includes:

inputting the images in the image set into a pre-constructed shrinkage area identification model;

and obtaining a plurality of frames of first images formed under the influence of the peristaltic motion of the digestive cavity wall and the shrinkage area in each frame of first images according to the prediction result output by the shrinkage area identification model.

In one embodiment, inputting the images in the image set into a pre-constructed shrinkage area identification model includes:

Acquiring corresponding acceleration of the capsule endoscope when each image is acquired;

Screening out a candidate set from the image set according to whether the acceleration is larger than a threshold value;

Detecting motion blur of each image in the candidate set to obtain the motion blur condition of each image in the candidate set;

screening a target set from the candidate set according to whether the motion blur condition reaches a preset condition;

Each image in the target set is input into the pre-constructed shrinkage area identification model.

In one embodiment, restoring the shrinkage area in each frame of the first image into a flat area to obtain a plurality of frames of processed first images includes:

acquiring the motion condition among multiple frames of first images;

And restoring the shrinkage area in each frame of the first image into a flat area based on the motion condition to obtain a plurality of frames of processed first images.

In an exemplary embodiment, obtaining a three-dimensional model of the inner wall of the digestive lumen based on the plurality of frames of the processed first image comprises:

Inputting the multi-frame processed first image into a pre-constructed three-dimensional reconstruction algorithm;

And obtaining a three-dimensional model of the inner wall of the digestive cavity according to the output result of the three-dimensional reconstruction algorithm.

In one embodiment, obtaining a three-dimensional model of the inner wall of the digestive lumen based on the plurality of frames of the processed first images includes:

obtaining a first digestive cavity inner wall three-dimensional model based on the multi-frame processed first image and a second image except the first image in the image set;

If the accuracy of the three-dimensional model of the inner wall of the first digestive cavity does not reach the threshold value, acquiring the corresponding acceleration of the capsule endoscope when acquiring each second image;

Determining a second image to be restored in each second image according to the relative magnitude of the acceleration;

Performing leveling restoration processing on the second image to be restored to obtain a restored second image;

And obtaining a second digestive cavity inner wall three-dimensional model based on the restored second image, the multi-frame processed first image and other images in the image set.

In one embodiment, performing a leveling restoration process on a second image to be restored to obtain a restored second image, including:

and respectively carrying out leveling restoration processing on each second image to be restored through a filtering algorithm and a leveling algorithm to obtain corresponding restored second images.

In a second aspect, the present application also provides an apparatus for reconstructing a three-dimensional model of an inner wall of a digestive cavity, comprising:

The image set acquisition module is used for acquiring an image set acquired by the capsule endoscope in the digestive cavity;

The shrink region determining module is used for determining a plurality of frames of first images formed under the influence of peristalsis of the digestive cavity wall in the image set and determining a shrink region in each frame of first images;

the image processing module is used for recovering the shrinkage area in each frame of the first image into a flat area to obtain a plurality of frames of processed first images;

And the three-dimensional model determining module is used for obtaining a three-dimensional model of the inner wall of the digestive cavity based on the multi-frame processed first images.

In a third aspect, the present application also provides a computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:

acquiring an image set acquired by a capsule endoscope in a digestive cavity;

In a fourth aspect, the present application also provides a computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of:

acquiring an image set acquired by a capsule endoscope in a digestive cavity;

In a fifth aspect, the application also provides a computer program product comprising a computer program which, when executed by a processor, performs the steps of:

acquiring an image set acquired by a capsule endoscope in a digestive cavity;

With the above method, apparatus, computer device, storage medium and computer program product for reconstructing a three-dimensional model of the inner wall of a digestive cavity, an image set captured by the capsule endoscope in the digestive cavity is acquired, a plurality of frames of first images formed under the influence of peristalsis of the digestive cavity wall are determined in the image set, a shrinkage area in each frame of first image is determined, the shrinkage area in each frame of first image is restored to a flat area to obtain a plurality of frames of processed first images, and the three-dimensional model of the digestive cavity is obtained based on the plurality of frames of processed first images. Restoring the shrinkage areas in the first images affected by peristalsis of the digestive cavity wall to flat areas yields more stable images. Building the three-dimensional model from these stable images helps construct a more accurate digestive cavity model, helps doctors judge a patient's lesions more accurately, and improves the efficiency of medical treatment and patient satisfaction.

Drawings

In order to more clearly illustrate the embodiments of the present application or the technical solutions in the related art, the drawings that are required to be used in the embodiments or the related technical descriptions will be briefly described, and it is apparent that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to the drawings without inventive effort for those skilled in the art.

FIG. 1 is an application environment diagram of a method for reconstructing a three-dimensional model of an inner wall of a digestive cavity in one embodiment;

FIG. 2 is a flow chart of a method of reconstructing a three-dimensional model of an inner wall of a digestive cavity in one embodiment;

FIG. 3 is a schematic diagram of a model for determining a shrink region in a first image of each frame in one embodiment;

FIG. 4 is a flow chart of a method for three-dimensional modeling of the inner wall of a digestive lumen in another embodiment;

FIG. 5 is a schematic diagram of a process for building a three-dimensional model of a digestive cavity in one embodiment;

FIG. 6 is a block diagram of a reconstruction device for a three-dimensional model of the inner wall of a digestive lumen in one embodiment;

Fig. 7 is an internal structural diagram of a computer device in one embodiment.

Detailed Description

The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.

The reconstruction method of the three-dimensional model of the inner wall of the digestive cavity provided by the embodiment of the application can be applied to an application environment shown in figure 1. The capsule endoscope 102 communicates with the server 104 through a network, and a third party data processing device may also be disposed between the capsule endoscope 102 and the server 104 for data transfer or processing. The data storage system may store data that the server 104 needs to process. The data storage system may be integrated on the server 104 or may be located on a cloud or other network server. The server 104 acquires an image set acquired by the capsule endoscope 102 in the digestive cavity of the human body, determines a plurality of frames of first images formed under the influence of peristalsis of the digestive cavity wall in the image set, determines a shrinkage area in each frame of first images, restores the shrinkage area in each frame of first images to be a flat area, obtains a plurality of frames of processed first images, and obtains a corresponding three-dimensional model of the digestive cavity inner wall according to the plurality of frames of processed first images. The server 104 may be implemented as a stand-alone server or a server cluster including a plurality of servers.

In an exemplary embodiment, as shown in fig. 2, a method for reconstructing a three-dimensional model of an inner wall of a digestive cavity is provided, and the method is applied to the server 104 in fig. 1 for illustration, and includes the following steps S201 to S204. Wherein:

Step S201, acquiring an image set acquired by the capsule endoscope in the digestive cavity.

A capsule endoscope is a medical instrument used to examine the internal organs and tissues of the human body. The capsule endoscope enters the human body by being swallowed, and the camera on the endoscope photographs the interior of the digestive cavity as it passes through, so that the condition of the digestive cavity is clearly recorded. The digestive cavity is understood to be the organs of the human body that make up the digestive system, including the esophagus, stomach, small intestine, large intestine, and so on. The server 104 obtains, through a network, the photographs of the interior of the digestive cavity taken by the capsule endoscope.

Step S202, determining a plurality of frames of first images formed under the influence of peristaltic movement of the digestive cavity wall in the image set, and determining a shrinkage area in each frame of first images.

Wherein the first image is an image affected by peristalsis.

Digestion in the human body is a dynamic process; that is, in a normal physiological state the digestive organs are continuously in peristalsis, so the endoscope may be affected by peristalsis of the digestive cavity wall while it captures images. It is therefore necessary to determine, in the captured image set, the plurality of frames of first images affected by peristalsis, and to determine in each frame of first image the shrinkage area that is unstable because of the peristaltic wave. The acquired image set also contains a plurality of frames of images that are not affected by peristalsis of the digestive cavity wall; these unaffected images are retained for the subsequent construction of the model.

Step S203, recovering the shrinking area in each frame of the first image into a flat area to obtain a plurality of frames of processed first images.

Wherein the processed first image refers to the image after the leveling is restored.

If the first image severely affected by the peristalsis of the digestive cavity wall is directly used for constructing a corresponding three-dimensional model, the three-dimensional model constructed at the moment is inaccurate, and focus diagnosis accuracy is affected, so that a shrinkage area in each frame of the first image needs to be restored to be a flat area, and further a multi-frame processed first image is obtained.

Step S204, obtaining a three-dimensional model of the inner wall of the digestive cavity based on the multi-frame processed first image.

The three-dimensional model of the inner wall of the digestive cavity of the current user is constructed jointly from the multi-frame processed first images and the original images in the image set that are not affected by the peristaltic wave of the digestive cavity wall.

In the method for reconstructing the three-dimensional model of the inner wall of the digestive cavity, the image set acquired by the endoscope in the digestive cavity is acquired, a plurality of frames of first images formed under the influence of peristaltic motion of the digestive cavity wall are determined in the image set, the shrinkage area in each frame of first images is determined, the shrinkage area in each frame of first images is restored to be a flat area, a plurality of frames of processed first images are obtained, and the three-dimensional model of the digestive cavity is obtained based on the plurality of frames of processed first images. By recovering the shrinkage area in the multi-frame first image affected by the peristaltic motion of the digestive cavity wall in the image set into a flat area, more stable images are obtained, and the three-dimensional model is built according to the stable images, so that a more visual digestive cavity model is constructed, a doctor is helped to judge the focus of a patient more accurately, and the efficiency and satisfaction of the patient in medical treatment are improved.

In an exemplary embodiment, step S202 above, namely determining a plurality of frames of first images formed under the influence of peristalsis of the digestive cavity wall in the image set and determining a shrinkage area in each frame of first images, includes: inputting the images in the image set into a pre-constructed shrinkage area identification model; and obtaining, according to the prediction result output by the shrinkage area identification model, a plurality of frames of first images formed under the influence of peristalsis of the digestive cavity wall and the shrinkage area in each frame of first images.

The pre-constructed shrinkage area identification model comprises a trained neural network that extracts features from an input image and outputs a feature map, a region proposal network (RPN, Region Proposal Network) that generates regions of interest in the image, and a mask region-based convolutional neural network (Mask R-CNN, Mask Region-based Convolutional Neural Network). The neural network that outputs the feature map is, in this disclosure, a recursive feature pyramid network (RFP, Recursive Feature Pyramid).

A specific example of how the shrinkage area identification model determines the shrinkage area in each frame of first image is shown in fig. 3:

First, preprocessing operations are performed on an image set, wherein the preprocessing operations comprise operations such as image normalization, image smoothing, image enhancement and the like. The image set is a color image acquired by the endoscope by using the monocular camera.
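The normalization and smoothing operations named above can be sketched as follows. This is a minimal illustration only: the source does not state which normalization or smoothing filter is used, so min-max scaling and a 3x3 mean filter are assumptions, as is the use of a single grayscale channel instead of the color images the endoscope actually captures. The names `normalize` and `mean_smooth` are hypothetical.

```python
from typing import List

Image = List[List[float]]  # one grayscale channel, as rows of pixel intensities


def normalize(img: Image) -> Image:
    """Min-max normalise intensities to [0, 1] (assumed normalisation scheme)."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    if hi == lo:  # constant image: avoid division by zero
        return [[0.0 for _ in row] for row in img]
    return [[(p - lo) / (hi - lo) for p in row] for row in img]


def mean_smooth(img: Image) -> Image:
    """3x3 mean filter; border pixels average over the neighbours that exist."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[j][i]
                    for j in range(max(0, y - 1), min(h, y + 2))
                    for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = sum(vals) / len(vals)
    return out
```

For a color image the same functions could be applied to each channel separately.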

Then, each image is input into the pre-trained neural network (the recursive feature pyramid network) to obtain the feature map corresponding to that image. The recursive feature pyramid network RFP is mainly used to extract features from the input image and output a feature map. As shown in fig. 3, the RFP used in the present application includes a bottom-up backbone layer (C2, C3, C4 and C5, respectively) and a top-down FPN (Feature Pyramid Network) layer (P5, P4, P3 and P2, respectively, in fig. 3). Notably, the RFP adds feedback connections from the FPN layer to the backbone layer (indicated by the corresponding dashed arrows in fig. 3), so that the features of the input image are refined repeatedly and the expressive capability of the FPN layer is enriched. The extracted features are therefore better suited to subsequent detection, in particular to accurately identifying small targets in complex backgrounds. The recursion allows the error feedback from target detection to adjust the parameters of the backbone layer more directly.

Then, the feature map output by the recursive feature pyramid network is sent to the RPN (Region Proposal Network) for binary classification (distinguishing foreground from background) and bounding box (BB) regression. The RPN outputs two types of feature maps, namely a target shape feature map and a target position feature map, where each target position feature map contains a predetermined number of candidate boxes, shown as "region of interest" (ROI, Region Of Interest) in fig. 3. A part of the ROIs is then filtered out by non-maximum suppression.
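The non-maximum suppression step that filters the candidate ROIs can be sketched as below. This is an illustrative greedy variant, not necessarily the one used in the application: the box format `(x1, y1, x2, y2)`, the function names and the IoU threshold of 0.5 are all assumptions.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: visit boxes by descending score and keep a box only if it
    does not overlap an already kept box by more than iou_threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep
```

The returned list holds the indices of the surviving ROIs, ordered from highest to lowest score.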

Next, ROI alignment (ROI Align) is performed on the remaining ROIs. The feature map output by the region proposal network is smaller than the feature map input to it, so when a bounding box on the output feature map is mapped back onto the input feature map its position is shifted, which makes subsequent target detection insufficiently accurate. ROI alignment is therefore required: the pixels of the feature map input to the region proposal network (RPN) are first put into correspondence with the pixels of the feature map output by the aforementioned recursive feature pyramid network (RFP), and the feature map input to the RPN is then put into correspondence with the fixed-size features, so that the input feature map containing the ROIs is obtained.
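One common way to avoid the position shift described above is to sample the feature map at non-integer coordinates with bilinear interpolation, which is what ROI Align does in the original Mask R-CNN design. The sketch below assumes a single-channel feature map, one sample point per output bin, and ROI coordinates already expressed in feature-map units; `bilinear` and `roi_align` are hypothetical names, and the source does not give these details.

```python
import math
from typing import List, Tuple

FeatureMap = List[List[float]]


def bilinear(fm: FeatureMap, x: float, y: float) -> float:
    """Interpolate fm at a real-valued (x, y); integer coordinates are pixel centres."""
    h, w = len(fm), len(fm[0])
    x = min(max(x, 0.0), w - 1.0)  # clamp to the valid range
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    dx, dy = x - x0, y - y0
    top = fm[y0][x0] * (1 - dx) + fm[y0][x1] * dx
    bottom = fm[y1][x0] * (1 - dx) + fm[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def roi_align(fm: FeatureMap, roi: Tuple[float, float, float, float],
              out_size: int = 2) -> List[List[float]]:
    """Pool an ROI (x1, y1, x2, y2) into an out_size x out_size grid,
    sampling the centre of each bin without rounding its coordinates."""
    x1, y1, x2, y2 = roi
    bin_w = (x2 - x1) / out_size
    bin_h = (y2 - y1) / out_size
    return [[bilinear(fm, x1 + (i + 0.5) * bin_w, y1 + (j + 0.5) * bin_h)
             for i in range(out_size)]
            for j in range(out_size)]
```

Because no coordinate is rounded to the nearest pixel, every ROI yields a fixed-size feature regardless of its original size.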

Finally, the ROIs in the aligned feature map are classified. Specifically, the input feature map containing the ROIs is input into Mask R-CNN (Mask Region-based Convolutional Neural Network) for classification. Mask R-CNN is a deep learning model for object detection and instance segmentation; it can generate the bounding box of an object and an instance segmentation image of the object, and at the same time generate a binary mask of the object. The aligned input feature map containing the ROIs is identified and segmented by Mask R-CNN into peristaltic wave areas and non-peristaltic wave areas, and a bounding box and a binary mask for each peristaltic wave area are generated at the same time. An FCN (Fully Convolutional Network) operation is performed inside each ROI; that is, the fully connected layer in Mask R-CNN is replaced with a convolutional layer.
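The relation between the binary mask and the bounding box of a peristaltic wave area can be illustrated with a small helper. This is a sketch only; it assumes the mask is a 2D list in which non-zero entries mark the peristaltic wave area, and `mask_to_bbox` is a hypothetical name.

```python
from typing import List, Optional, Tuple


def mask_to_bbox(mask: List[List[int]]) -> Optional[Tuple[int, int, int, int]]:
    """Return the tight box (x_min, y_min, x_max, y_max) around all non-zero
    mask pixels, or None if the mask is empty."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for row in mask for x, v in enumerate(row) if v]
    if not rows:
        return None
    return (min(cols), min(rows), max(cols), max(rows))
```

An image whose mask is empty would be treated as containing no shrinkage area.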

In the embodiment, the image set is input into the pre-trained shrink region identification model, the corresponding shrink region of each frame of the first image is determined according to the output result of the model, the shrink region of each frame of the first image is quickly identified by using the pre-trained shrink region identification model, the construction speed of the three-dimensional model is increased, and the medical experience of a patient is improved.

In one embodiment, inputting the images in the image set into a pre-constructed shrinkage area identification model includes: acquiring the acceleration of the capsule endoscope corresponding to the acquisition of each image; screening a candidate set from the image set according to whether the acceleration is larger than a threshold value; detecting motion blur in each image of the candidate set to obtain the motion blur condition of each image in the candidate set; screening a target set from the candidate set according to whether the motion blur condition reaches a preset condition; and inputting each image in the target set into the pre-constructed shrinkage area identification model.

Under the pull of gravity, a capsule endoscope that has entered the human body is subjected to a force. This force is not constant and may differ from moment to moment, so the force on the capsule endoscope, and hence its acceleration, differs from one image acquisition to the next. Because a peristaltic wave acting on the capsule increases the acceleration accordingly, a candidate set is first screened from the image set on the condition that the acceleration is larger than a threshold value; these are the images whose motion blur needs to be examined further. The motion blur condition refers to blur in a captured image caused by motion. The motion blur condition of each image in the candidate set is detected, the images whose motion blur condition meets the criterion for possibly being affected by peristalsis of the digestive cavity wall are selected as the target set, and the target set is input into the shrinkage area identification model.

The candidate set that requires further motion blur detection is first selected from the image set by acceleration, and the target set to be input into the shrinkage area identification model is then determined from the candidate set according to the motion blur condition. Detecting motion blur over this reduced range lowers the detection cost, and inputting only the target set that meets the preset condition into the shrinkage area identification model lowers the cost of model inference, which further improves the construction speed and quality of the three-dimensional model.
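The two-stage screening can be sketched as follows. The source does not specify how motion blur is measured or what the preset condition is; this example assumes the variance of a 4-neighbour Laplacian as a sharpness score and treats a frame as blurred when that score falls below a threshold. `Frame`, `laplacian_variance`, `select_target_set` and both thresholds are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Frame:
    index: int
    acceleration: float        # acceleration magnitude at capture time (assumed available)
    pixels: List[List[float]]  # grayscale intensities


def laplacian_variance(img: List[List[float]]) -> float:
    """Variance of the 4-neighbour Laplacian over interior pixels.
    Low values indicate few sharp edges, i.e. a likely blurred image."""
    h, w = len(img), len(img[0])
    responses = [
        img[y - 1][x] + img[y + 1][x] + img[y][x - 1] + img[y][x + 1] - 4 * img[y][x]
        for y in range(1, h - 1) for x in range(1, w - 1)
    ]
    if not responses:
        return 0.0
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


def select_target_set(frames: List[Frame], accel_threshold: float,
                      sharpness_threshold: float) -> List[Frame]:
    # Stage 1: cheap screening by acceleration gives the candidate set.
    candidates = [f for f in frames if f.acceleration > accel_threshold]
    # Stage 2: the costlier blur check runs only on the candidates.
    return [f for f in candidates
            if laplacian_variance(f.pixels) < sharpness_threshold]
```

Only the frames returned by `select_target_set` would then be passed to the shrinkage area identification model.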

In one embodiment, step S203 above, namely restoring the shrinkage area in each frame of the first image into a flat area to obtain a plurality of frames of processed first images, includes: acquiring the motion condition among the multiple frames of first images; and restoring the shrinkage area in each frame of the first image into a flat area based on the motion condition to obtain a plurality of frames of processed first images.

In one embodiment of the present disclosure, the specific steps of how to restore the shrunken area in the first image of each frame to a flat area are as follows:

① Motion analysis

Images of the shrinkage area can be regarded as exhibiting small, random and relatively frequent motion. The direction of motion from frame to frame is detected first; this is accomplished by steps ② and ③.

② Corner detection

Any object in an image typically has distinctive features, often made up of a large number of pixels. Corner points are a small set of points that accurately describe the object. A corner detection algorithm finds the most salient feature points of the image, which are used for object identification and tracking. In stomach images, corner points may be the fold areas of the image, where the brightness changes noticeably compared with other areas.

③ Calculating optical flow

The apparent trajectory of a target object across two consecutive frames, caused by movement of the object or of the capsule endoscope, is called optical flow. It is a 2D vector field that shows how each point moves from the first frame to the second frame.

④RANSAC

RANSAC is an abbreviation for "RANdom SAmple Consensus". It iteratively estimates the parameters of a mathematical model from a set of observed data that contains outliers.

RANSAC can find the mutually matching point sets in noisy data, from which the transformation matrix between the two frames is calculated.
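The RANSAC idea can be sketched on matched point pairs as below. The source does not state which transformation model is fitted between frames; to keep the example short it assumes a pure 2D translation, which needs only one match per hypothesis, whereas a real pipeline would more likely fit an affine or homography matrix. The function name, tolerance and iteration count are hypothetical.

```python
import random
from typing import List, Tuple

Point = Tuple[float, float]


def ransac_translation(src: List[Point], dst: List[Point],
                       iterations: int = 100, tol: float = 1.0,
                       seed: int = 0) -> Tuple[Point, List[int]]:
    """Estimate the translation mapping src[i] to dst[i] despite outlier matches."""
    if not src or len(src) != len(dst):
        raise ValueError("src and dst must be non-empty and of equal length")
    rng = random.Random(seed)
    best: List[int] = []
    for _ in range(iterations):
        i = rng.randrange(len(src))  # minimal sample: one match
        tx, ty = dst[i][0] - src[i][0], dst[i][1] - src[i][1]
        # Consensus set: matches that agree with this hypothesis within tol.
        inliers = [k for k in range(len(src))
                   if abs(dst[k][0] - src[k][0] - tx) <= tol
                   and abs(dst[k][1] - src[k][1] - ty) <= tol]
        if len(inliers) > len(best):
            best = inliers
    # Refit on all inliers of the best hypothesis.
    n = len(best)
    tx = sum(dst[k][0] - src[k][0] for k in best) / n
    ty = sum(dst[k][1] - src[k][1] for k in best) / n
    return (tx, ty), best
```

The indices in the returned inlier list identify the matched points that support the estimated motion; the remaining matches are treated as outliers.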

⑤ Motion smoothing

The motion parameters of the image sequence are filtered with a filtering algorithm such as median filtering, mean filtering or Kalman filtering, which yields a smooth motion trajectory. Taking the difference between the smoothed trajectory and the original trajectory gives the motion restoration parameters of each frame.
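A minimal sketch of this step for a single motion parameter (for example horizontal translation) follows. It assumes mean filtering with a symmetric window, one of the filters named above, and assumes that the trajectory is the running sum of per-frame motion; `moving_average` and `restoration_offsets` are hypothetical names.

```python
from typing import List


def moving_average(values: List[float], radius: int = 1) -> List[float]:
    """Mean filter with a window of up to 2*radius + 1 samples, truncated at the ends."""
    n = len(values)
    out = []
    for i in range(n):
        window = values[max(0, i - radius): min(n, i + radius + 1)]
        out.append(sum(window) / len(window))
    return out


def restoration_offsets(per_frame_motion: List[float], radius: int = 1) -> List[float]:
    """Accumulate frame-to-frame motion into a trajectory, smooth it, and return
    smoothed minus original: the correction to apply to each frame."""
    trajectory: List[float] = []
    total = 0.0
    for m in per_frame_motion:
        total += m
        trajectory.append(total)
    smoothed = moving_average(trajectory, radius)
    return [s - t for s, t in zip(smoothed, trajectory)]
```

In practice the same procedure would be repeated for every motion parameter of the estimated transformation, such as vertical translation and rotation.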

⑥ Image restoration

According to the restoration parameters of each frame, the first image containing the shrinkage area, as output by the shrinkage area identification model, is transformed to obtain a stabilized image; that is, the area affected by the peristaltic wave is restored to its state before the peristaltic wave.

In the embodiment, by acquiring the movement condition among the multiple frames of first images and recovering the shrinkage area in each frame of first image into the flat area according to the movement condition, more stable data bases are laid for the follow-up construction of the three-dimensional model of the inner wall of the digestive cavity, and the integrity and the accuracy of the construction of the model are ensured.

In an exemplary embodiment, step S204 above is: obtaining a three-dimensional model of the inner wall of the digestive cavity based on the multi-frame processed first image, comprising: inputting the multi-frame processed first image into a pre-constructed three-dimensional reconstruction algorithm; and obtaining a three-dimensional model of the inner wall of the digestive cavity according to the output result of the three-dimensional reconstruction algorithm.

Wherein the processed first image is an image that has been restored to flatness. A three-dimensional reconstruction algorithm may be understood as a process in which two-dimensional images or point cloud data acquired from different viewing angles or sensors are fused and reconstructed into a three-dimensional scene or object by a computer algorithm. The three-dimensional reconstruction algorithm used in the application is, by way of example and not limitation, DIM-SLAM (Dense RGB Simultaneous Localization and Mapping with Neural Implicit Maps).

The three-dimensional model of the inner wall of the digestive cavity is constructed through the following specific process:

1. Data acquisition:

The images which are not affected by peristaltic waves are acquired (the images are two-dimensional images and comprise the original images which are acquired by the capsule endoscope and are not affected by peristaltic waves and the processed first images). These two-dimensional images cover different areas of the gastric cavity to facilitate a comprehensive three-dimensional reconstruction.

2. Image preprocessing:

Before SLAM (Simultaneous Localization and Mapping) is performed, the images need to be optimized by denoising, contrast enhancement and the like to improve the detectability of feature points.

3. Feature extraction and matching:

Feature points in the image are extracted using a feature extraction algorithm.

Based on the extracted feature points, association matching is performed between adjacent frames to track the motion of these points through the image sequence.
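The matching between adjacent frames can be sketched as follows. This is only an illustration under stated assumptions: each feature point is assumed to carry a numeric descriptor vector, and a nearest-neighbour search with a ratio test is used as the matching rule. The application does not name a specific feature extractor or matcher, and `match_features` and its `ratio` parameter are hypothetical.

```python
import math


def match_features(desc_a, desc_b, ratio=0.75):
    """Match descriptors of frame A to descriptors of frame B.

    Returns a list of (i, j) pairs meaning desc_a[i] matches desc_b[j].
    A match is kept only when the best distance is clearly smaller than
    the second-best distance (ratio test), which rejects ambiguous matches.
    """
    matches = []
    for i, da in enumerate(desc_a):
        dists = sorted((math.dist(da, db), j) for j, db in enumerate(desc_b))
        if not dists:
            continue
        if len(dists) == 1 or dists[0][0] < ratio * dists[1][0]:
            matches.append((i, dists[0][1]))
    return matches
```

The resulting index pairs link the same physical point across two adjacent frames and can then be passed to the motion estimation step.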

4. Capsule endoscope motion estimation:

The motion pose of the camera (the camera on the capsule endoscope, from which the pose of the capsule endoscope is in turn estimated) is estimated from the feature point matching information by methods such as the PnP (Perspective-n-Point) algorithm. This step is critical to estimating the capsule endoscope pose p (including position and orientation).
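A PnP-type method seeks the camera pose for which known 3D points, when projected into the image, land as close as possible to their observed pixel positions. The sketch below does not solve PnP; it only evaluates that reprojection error under a pinhole camera model and picks the best of a few candidate translations. The intrinsics tuple `(fx, fy, cx, cy)`, the fixed identity rotation, and all function names are assumptions made for illustration.

```python
import math

IDENTITY = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))


def project(point, rotation, translation, intrinsics):
    """Project a 3D world point to pixel coordinates with a pinhole camera."""
    fx, fy, cx, cy = intrinsics
    # Camera-frame coordinates: X_cam = R * X_world + t
    xc, yc, zc = (
        sum(rotation[r][k] * point[k] for k in range(3)) + translation[r]
        for r in range(3)
    )
    return (fx * xc / zc + cx, fy * yc / zc + cy)


def mean_reprojection_error(points3d, pixels, rotation, translation, intrinsics):
    """Average pixel distance between observed and re-projected points."""
    errors = [
        math.dist(project(p, rotation, translation, intrinsics), uv)
        for p, uv in zip(points3d, pixels)
    ]
    return sum(errors) / len(errors)


def best_pose(points3d, pixels, candidate_translations, intrinsics):
    """Return the candidate translation (rotation fixed to identity,
    an assumption) that gives the lowest mean reprojection error."""
    return min(
        candidate_translations,
        key=lambda t: mean_reprojection_error(
            points3d, pixels, IDENTITY, t, intrinsics
        ),
    )
```

A practical solver would optimize over rotation and translation jointly rather than scanning a candidate list; the sketch only makes the objective concrete.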

5. Map construction:

First, the tracked feature points are projected into a three-dimensional space to construct a sparse 3D map of the gastric cavity.

5.1, Selecting a convolutional neural network and a fully connected network structure to process the data: since the input data are mainly two-dimensional images and the three-dimensional structure is inferred from two-dimensional image features, a CNN (Convolutional Neural Network) is used to extract features from the image data (the convolutional layers effectively extract and process image features, and the pooling layers help the network capture features at different scales, which meets the needs of three-dimensional reconstruction), and fully connected layers are used to process and predict points in three-dimensional space.

5.2, Training of the network: the neural network is trained using known sparse three-dimensional point clouds and the corresponding RGB (red, green, blue) image data, and learns to predict the occupancy state or depth value of any point in space.

5.3, Sparse-to-dense mapping: using the data points in the sparse map as input, the trained network outputs a continuous geometric field representation, which can be used to infer the dense structure of the entire scene.

5.4, Scene reconstruction: depth inference is performed for each pixel through the neural implicit representation network to generate a depth map, yielding more continuous and detailed 3D information.

In DIM-SLAM, these data are processed using neural implicit map representations, with deep learning to optimize the accuracy and detail of the map.

It should be noted that representations of three-dimensional space can be classified as explicit or implicit. Commonly used explicit representations include voxels (Voxel), point clouds (Point Cloud), triangular meshes (Mesh), and the like; common implicit representations include signed distance functions, occupancy fields, neural radiance fields (NeRF), and the like. A neural implicit map is obtained by fitting a specific mapping relation with a neural network, so that the three-dimensional space is represented by the neural network. Here, a neural network is used to represent the three-dimensional map implicitly.

Specifically, as shown in fig. 4, for each camera pose p obtained in step 4, multi-scale features (feature volumes) are sampled along the view ray corresponding to the pose p, the sampled multi-scale features are concatenated to obtain a concatenated 3D feature code, and the concatenated feature is passed through a multilayer perceptron (Multilayer Perceptron, MLP) to obtain the depth and color of each pixel, that is, a depth image and a color image. The depth image and the color image are matched, and the 3D scene map is solved using the multilayer perceptron.

In detail, the camera pose p is input into the multilayer perceptron to obtain the predicted depth and color corresponding to the feature volumes on the view ray of that pose.

At the same time, a photometric warping loss function and a color loss function are used to adjust the multilayer perceptron, so as to enhance the consistency between the prediction and specific observed images (key frames selected from the frame sequence serve as the specific observed images).

5.5, Refinement and optimization: post-processing such as noise reduction and smoothing is performed on the generated dense map to further improve the reconstruction quality.

5.6, Rendering and visualization: the refined three-dimensional map is rendered to generate a visual 3D map, which not only has high-resolution detail in regions rich in feature points but also depicts regions not covered by the original sparse map. The dense map provides doctors with a comprehensive and fine three-dimensional model of the digestive cavity and helps improve the accuracy and efficiency of digestive cavity examination and diagnosis.
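The per-ray depth prediction described with reference to fig. 4 can be illustrated by the following sketch. It is a simplified example under stated assumptions: the learned multilayer perceptron is replaced by an arbitrary `occupancy_fn` callable that returns an occupancy probability in [0, 1] for a 3D point, and the depth is rendered with an occupancy-weighted sum along the ray, which is one common formulation in neural implicit mapping. The application does not state its rendering formula in this text, so this should not be read as the formula of the application.

```python
def sample_ray(origin, direction, near, far, n):
    """Return n (depth, point) samples evenly spaced between near and far."""
    step = (far - near) / (n - 1)
    samples = []
    for i in range(n):
        d = near + i * step
        point = tuple(o + d * v for o, v in zip(origin, direction))
        samples.append((d, point))
    return samples


def render_depth(samples, occupancy_fn):
    """Occupancy-weighted expected depth along one view ray.

    The weight of a sample is its occupancy times the probability that
    the ray was not stopped by any earlier sample (transmittance).
    Returns None when the ray hits nothing.
    """
    transmittance = 1.0
    weighted_depth = 0.0
    total_weight = 0.0
    for depth, point in samples:
        occ = occupancy_fn(point)
        weight = transmittance * occ
        weighted_depth += weight * depth
        total_weight += weight
        transmittance *= 1.0 - occ
    return weighted_depth / total_weight if total_weight > 0 else None
```

For example, with a hypothetical wall occupying all points with z >= 2 and a ray cast from the origin along the z axis, `render_depth` returns the depth of the first occupied sample.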

6. Photometric consistency check:

A photometric consistency check, i.e. a photometric consistency loss (Photometric Warping Loss), is applied to verify the correctness of the map points, ensuring the accuracy of the reconstruction results.

The photometric consistency loss is the MSE loss (Mean Squared Error Loss) between the image rendered at the corresponding pose and the real image. An exposure variable is added to the loss function: because different images are captured at different positions, their illumination differs, which affects image rendering.
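A minimal sketch of such a loss is given below. The exposure variable is modelled here as an affine correction (a gain and a bias applied to the rendered intensities); this form is an assumption, since the application only states that an exposure variable is added. Pixels are assumed to be flattened into plain lists of intensities, and the function name is hypothetical.

```python
def photometric_loss(rendered, observed, exposure_gain=1.0, exposure_bias=0.0):
    """Mean squared error between exposure-corrected rendered pixels
    and the pixels of the real image."""
    if len(rendered) != len(observed) or not rendered:
        raise ValueError("pixel lists must be non-empty and of equal length")
    total = 0.0
    for r, o in zip(rendered, observed):
        corrected = exposure_gain * r + exposure_bias
        total += (corrected - o) ** 2
    return total / len(rendered)
```

With the exposure terms left at their defaults, an image that is merely twice as bright as the rendering yields a large loss; setting `exposure_gain=2.0` removes it, which is the effect the exposure variable is meant to capture.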

Specifically:

6.1. Selecting a reference frame:

A frame of image is selected as a reference frame (typically the first active frame in the reconstruction process).

6.2. Projection mapping:

① Determining a reference viewing angle: an image at one point in time is selected as the reference view. Typically, the first image in the image sequence or the most stable image along the motion trajectory is selected.

② Estimating the camera pose: the position and orientation of the capsule endoscope at each time point are estimated using a SLAM (Simultaneous Localization and Mapping) algorithm. This typically involves feature point matching and motion estimation.

③ Establishing a three-dimensional reference frame: based on the estimated camera pose, the images taken by the capsule endoscope are projected into a common three-dimensional frame of reference for comparison.

④ Alignment between images: using an image registration technique, the images at different time points are aligned according to the three-dimensional reference frame, ensuring that the same physical point occupies matching positions in the different images.

⑤ Extraction and comparison of photometric values: the photometric values of the same scene point are extracted from the aligned images and compared. In an ideal case, the photometric values of the same physical point in different images should be identical even though the images are taken from different angles.

⑥ Difference analysis and threshold checking: the photometric difference is calculated and compared with a preset threshold. If the photometric difference exceeds the threshold, there may be errors in the three-dimensional position estimate or the image registration at that point.

6.3. Luminosity comparison:

The luminosity values (brightness and color) of the projected pixels in the reference frame and the current frame are compared. Ideally, if both the camera pose and the three-dimensional map are accurate, then these values should be consistent.

6.4. Error calculation:

The photometric error is calculated, typically by calculating the sum of squares of the pixel photometric differences between the reference frame and the other frames.

6.5. Threshold value judgment:

A threshold is set to determine if the photometric error is within an acceptable range. If the error is greater than this threshold, it may indicate that the camera pose estimate or the three-dimensional map is erroneous.
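Steps 6.1 to 6.5 can be sketched as follows. The example is deliberately simplified and rests on several assumptions: images are grayscale lists of rows, a per-pixel depth map of the reference frame is available, the relative camera motion is a pure translation (rotation is omitted), and nearest-pixel lookup is used instead of interpolation. The names `warp_pixel` and `check_consistency` are hypothetical.

```python
def warp_pixel(u, v, depth, intrinsics, translation):
    """Back-project reference pixel (u, v) with its depth to 3D, then
    re-project it into a camera shifted by `translation`.
    Returns None when the point falls behind the shifted camera."""
    fx, fy, cx, cy = intrinsics
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    x2, y2, z2 = x - translation[0], y - translation[1], depth - translation[2]
    if z2 <= 0:
        return None
    return (fx * x2 / z2 + cx, fy * y2 / z2 + cy)


def check_consistency(ref_img, cur_img, depth_map, intrinsics, translation, threshold):
    """Sum of squared photometric differences between the reference frame
    and the current frame; returns (error, passed)."""
    h, w = len(ref_img), len(ref_img[0])
    error = 0.0
    for v in range(h):
        for u in range(w):
            warped = warp_pixel(u, v, depth_map[v][u], intrinsics, translation)
            if warped is None:
                continue
            ui, vi = round(warped[0]), round(warped[1])
            if 0 <= ui < w and 0 <= vi < h:  # ignore pixels leaving the image
                error += (ref_img[v][u] - cur_img[vi][ui]) ** 2
    return error, error <= threshold
```

A failed check suggests that the pose or the depth used for warping is wrong, which is the situation the threshold judgment in step 6.5 is intended to flag.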

7. Precision optimization and refinement:

The accuracy of the camera trajectory and map points is further optimized using optimization techniques in SLAM (e.g., Bundle Adjustment).

The sparse map is refined using a deep neural network to generate a more detailed and continuous three-dimensional model of the gastric cavity.

8. Result verification:

Finally, the reconstructed three-dimensional model needs to be verified to ensure that it accurately reflects the actual structure of the gastric cavity.

In this embodiment, the multi-frame processed first image is input into the pre-constructed three-dimensional reconstruction algorithm, and the corresponding three-dimensional model of the inner wall of the digestive cavity is obtained from the output result of the algorithm. Using a pre-constructed three-dimensional reconstruction algorithm helps speed up construction of the model. In addition, after construction, the accuracy of the model is evaluated and further processing is performed if the accuracy is low, which further improves the accuracy of the constructed three-dimensional model of the inner wall of the digestive cavity.

In one embodiment, obtaining a three-dimensional model of the inner wall of the digestive lumen based on the plurality of frames of the processed first images includes: obtaining a first digestive cavity inner wall three-dimensional model based on the multi-frame processed first image and a second image except the first image in the image set; if the accuracy of the three-dimensional model of the inner wall of the first digestive cavity does not reach the threshold value, acquiring the corresponding acceleration of the capsule endoscope when acquiring each second image; determining a second image to be restored in each second image according to the relative magnitude of the acceleration; performing leveling restoration processing on the second image to be restored to obtain a restored second image; and obtaining a second digestive cavity inner wall three-dimensional model based on the restored second image, the multi-frame processed first image and other images in the image set.

When the three-dimensional model of the inner wall of the digestive cavity is built for the first time, the multiple frames of processed first images and the second images in the image set (the images other than the first images) are input into the pre-constructed three-dimensional reconstruction algorithm to obtain the first digestive cavity inner wall three-dimensional model. If the accuracy of the first model does not reach a preset accuracy threshold, the acceleration of the capsule endoscope at the time each second image was acquired is obtained, and the second images to be restored are selected according to the relative magnitude of the acceleration. A second image to be restored is an image affected by peristaltic waves of the digestive cavity wall, but less severely than a first image. To obtain enough stable images, leveling restoration processing is performed on the second images to be restored to obtain the restored second images. Finally, the second digestive cavity inner wall three-dimensional model is obtained from the restored second images, the multiple frames of processed first images, and the other images in the image set.

By the method, the second image to be restored, which is less influenced by peristaltic waves of the digestive cavity wall, is subjected to leveling restoration processing, so that more stable images are obtained, and the accuracy of the three-dimensional model of the digestive cavity inner wall is improved.
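The selection of second images to be restored according to the relative magnitude of the acceleration may be sketched as below. The specific rule, keeping every second image whose acceleration magnitude is at least a given fraction of the largest magnitude among the second images, is an assumption made for illustration; the application does not define how the relative magnitude is evaluated, and the `fraction` parameter is hypothetical.

```python
def select_second_images_to_restore(accelerations, fraction=0.5):
    """Return the indices of second images whose acceleration magnitude is
    at least `fraction` of the peak magnitude among all second images."""
    if not accelerations:
        return []
    peak = max(abs(a) for a in accelerations)
    if peak == 0:
        return []  # the capsule was not accelerating during any capture
    return [i for i, a in enumerate(accelerations) if abs(a) >= fraction * peak]
```

The returned indices identify the images that would be passed to the leveling restoration processing before the second model is built.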

In one embodiment, performing leveling restoration processing on the second image to be restored to obtain a restored second image includes: performing leveling restoration processing on each second image to be restored through a filtering algorithm and a smoothing algorithm, respectively, to obtain the corresponding restored second image.

A filtering algorithm is used to remove noise from the second image to be restored, and a smoothing algorithm is used to remove fluctuations or discontinuities so that the image becomes smoother, yielding the corresponding restored second image. Because the second image to be restored is only slightly affected by peristaltic waves of the digestive cavity wall, directly applying a filtering algorithm and a smoothing algorithm reduces the processing cost; the steps are simple and help speed up construction of the three-dimensional model of the digestive cavity inner wall.
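By way of example only, the two stages could be realized as a median filter followed by a small weighted-average kernel. Both choices are assumptions: the application does not name the filtering algorithm or the smoothing algorithm. Images are assumed to be grayscale lists of rows, and the function names are hypothetical.

```python
from statistics import median

# 3x3 binomial kernel (weights sum to 16) used for smoothing.
KERNEL = ((1, 2, 1), (2, 4, 2), (1, 2, 1))


def median_filter(img):
    """Filtering stage: 3x3 median, using only neighbours inside the image."""
    h, w = len(img), len(img[0])
    return [
        [
            median(
                img[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
            )
            for x in range(w)
        ]
        for y in range(h)
    ]


def smooth(img):
    """Smoothing stage: weighted 3x3 average with edge pixels replicated."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    acc += KERNEL[dy + 1][dx + 1] * img[yy][xx]
            out[y][x] = acc / 16
    return out


def restore_second_image(img):
    """Filtering followed by smoothing, in the order described above."""
    return smooth(median_filter(img))
```

The median stage removes isolated outlier pixels without blurring edges much, and the kernel stage then softens the remaining small fluctuations.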

In an exemplary embodiment, as shown in fig. 5, a detailed method for reconstructing a three-dimensional model of an inner wall of a digestive cavity is provided, and specific steps include S501 to S512, in which:

Step S501, acquiring an image set acquired by the capsule endoscope inside the digestive cavity.

Step S502, acquiring corresponding acceleration of the capsule endoscope when each image is acquired.

Step S503, selecting the images whose acceleration is greater than a threshold as a candidate set, and performing motion blur detection on the images in the candidate set to obtain the motion blur condition of each image in the candidate set.

Step S504, selecting from the candidate set the images whose motion blur condition reaches a preset condition as a target set, and inputting each image in the target set into a shrink region identification model.

Step S505, according to the prediction result output by the shrink region identification model, obtaining a plurality of frames of first images formed under the influence of the peristaltic motion of the digestive cavity wall and the shrink region in each frame of first images.

Step S506, the motion condition among the multiple frames of first images is obtained, and the shrinkage area in each frame of first images is restored to be a flat area according to the motion condition, so that multiple frames of processed first images are obtained.

Step S507, inputting the processed first images into a pre-constructed three-dimensional reconstruction algorithm to obtain a three-dimensional model of the inner wall of the digestive cavity.
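The screening in steps S502 to S504 can be illustrated with the following sketch. It assumes grayscale images stored as lists of rows (at least 3x3 pixels), uses the variance of a Laplacian response as the motion blur measure, and treats "reaching the preset condition" as the blur measure falling below a threshold, that is, the image being sufficiently blurred. The blur measure, the direction of the comparison, and the threshold names are all assumptions; the application does not specify them.

```python
from statistics import pvariance


def laplacian_variance(img):
    """Variance of the 4-neighbour Laplacian over interior pixels.
    Lower values indicate a blurrier image."""
    h, w = len(img), len(img[0])
    responses = [
        img[y - 1][x] + img[y + 1][x] + img[y][x - 1] + img[y][x + 1]
        - 4 * img[y][x]
        for y in range(1, h - 1)
        for x in range(1, w - 1)
    ]
    return pvariance(responses)


def screen_images(images, accelerations, accel_threshold, blur_threshold):
    """Return indices of images forming the target set.

    Candidate set: acceleration greater than accel_threshold (S503).
    Target set: candidates whose blur measure is below blur_threshold (S504).
    """
    candidates = [i for i, a in enumerate(accelerations) if a > accel_threshold]
    return [i for i in candidates if laplacian_variance(images[i]) < blur_threshold]
```

Screening by acceleration first keeps the more expensive blur measure from being computed on every image in the set.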

After the corresponding three-dimensional model of the inner wall of the digestive cavity is constructed according to the above steps, further judging whether the accuracy of the currently constructed three-dimensional model of the inner wall of the digestive cavity meets the standard, wherein the specific steps are as follows:

Step S508, obtaining a three-dimensional model of the inner wall of the first digestive cavity based on the processed first images and the second images except the first images in the image set.

Step S509, if the accuracy of the three-dimensional model of the inner wall of the first digestive cavity does not reach the threshold value, acquiring the corresponding acceleration of the capsule endoscope when acquiring each second image.

Step S510, determining the second image to be restored in each second image according to the relative magnitude of the acceleration.

Step S511, performing leveling restoration processing on each second image to be restored through a filtering algorithm and a smoothing algorithm, respectively, to obtain the corresponding restored second image.

Step S512, performing leveling restoration processing on the second image to be restored to obtain a restored second image.

Compared with the prior art, the application has the following advantages:

1) Inputting an image set in the digestive cavity obtained by the capsule endoscope into a shrink region identification model, determining a shrink region of a first image influenced by peristaltic waves of the digestive cavity wall according to an output result, and laying a data foundation for performing subsequent flattening recovery on the shrink region of the first image of each frame.

2) Leveling restoration processing is performed on the shrinkage area of each frame of first image according to the motion between the frames of first images to obtain the processed frames of first images. This yields more stable first images, which helps improve the accuracy of the three-dimensional model of the inner wall of the digestive cavity.

3) The processed multi-frame first image is input into a pre-built three-dimensional reconstruction algorithm, and a three-dimensional model of the inner wall of the digestive cavity is output, so that the building speed of the three-dimensional model of the inner wall of the digestive cavity is increased.

It should be understood that, although the steps in the flowcharts related to the embodiments described above are shown sequentially as indicated by the arrows, these steps are not necessarily performed in the order indicated by the arrows. Unless explicitly stated herein, the order of execution of the steps is not strictly limited, and the steps may be executed in other orders. Moreover, at least some of the steps in the flowcharts described in the above embodiments may include a plurality of sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.

Based on the same inventive concept, the embodiment of the application also provides a reconstruction device for the three-dimensional model of the digestive cavity inner wall, which is used for implementing the reconstruction method of the three-dimensional model of the digestive cavity inner wall described above. The implementation of the solution provided by the device is similar to that described for the method above, so for the specific limitations in the embodiments of the reconstruction device provided below, reference may be made to the limitations of the reconstruction method above, which are not repeated here.

In an exemplary embodiment, as shown in fig. 6, there is provided a reconstruction apparatus of a three-dimensional model of an inner wall of a digestive cavity, including: an image set acquisition module 601, a shrinkage region determination module 602, an image processing module 603, and a three-dimensional model determination module 604, wherein:

The image set acquisition module 601 is used for acquiring an image set acquired by the capsule endoscope in the digestive cavity.

The shrinkage region determination module 602 is configured to determine a plurality of frames of first images formed under the influence of peristalsis of the digestive cavity wall in the image set, and determine a shrinkage region in each frame of first image.

The image processing module 603 is configured to restore the shrunken area in each frame of the first image to a flat area, so as to obtain a plurality of frames of processed first images.

The three-dimensional model determining module 604 is configured to obtain a three-dimensional model of the inner wall of the digestive cavity based on the multiple frames of the processed first images.
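The cooperation of modules 601 to 604 may be pictured with the following structural sketch. It is purely illustrative: each module is represented by a callable supplied by the caller, the class and field names are hypothetical, and no actual detection, restoration, or reconstruction logic is implied.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass
class ReconstructionDevice:
    # Module 601: returns the image set acquired by the capsule endoscope.
    acquire_images: Callable[[], List[Any]]
    # Module 602: returns (image index, shrinkage region) for each first image.
    find_shrinkage: Callable[[List[Any]], List[Tuple[int, Any]]]
    # Module 603: restores the shrinkage region of one image to a flat region.
    flatten: Callable[[Any, Any], Any]
    # Module 604: builds the three-dimensional model from processed images.
    build_model: Callable[[List[Any]], Any]

    def run(self) -> Any:
        images = self.acquire_images()
        processed = [
            self.flatten(images[index], region)
            for index, region in self.find_shrinkage(images)
        ]
        return self.build_model(processed)
```

Passing the modules in as callables keeps the sketch independent of how each module is realized, whether in software, hardware, or a combination of the two.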

In one embodiment, the shrinkage region determination module 602 includes an image input sub-module and a shrinkage region determination sub-module, wherein:

the image input sub-module is used for inputting the images in the image set into a pre-constructed shrinkage area identification model;

and the shrinkage region determining submodule is used for obtaining a plurality of frames of first images formed under the influence of the peristaltic motion of the digestive cavity wall and shrinkage regions in each frame of first images according to the prediction result output by the shrinkage region identification model.

In an exemplary embodiment, the image input submodule is specifically configured to: obtain the acceleration of the capsule endoscope when each image is acquired; screen out a candidate set from the image set according to whether the acceleration is larger than a threshold value; perform motion blur detection on each image in the candidate set to obtain the motion blur condition of each image in the candidate set; screen a target set from the candidate set according to whether the motion blur condition reaches a preset condition; and input each image in the target set into a pre-constructed shrinkage region identification model.

In one embodiment, the image processing module 603 is specifically configured to acquire a motion situation between the first images of the plurality of frames; and restoring the shrinkage area in each frame of the first image into a flat area based on the motion condition to obtain a plurality of frames of processed first images.

In one embodiment, the three-dimensional model determining module 604 is specifically configured to input the multi-frame processed first image into a pre-constructed three-dimensional reconstruction algorithm; and obtaining a three-dimensional model of the inner wall of the digestive cavity according to the output result of the three-dimensional reconstruction algorithm.

In an exemplary embodiment, the three-dimensional model determining module 604 is further specifically configured to: obtain a first digestive cavity inner wall three-dimensional model based on the multiple frames of processed first images and the second images in the image set other than the first images; if the accuracy of the first digestive cavity inner wall three-dimensional model does not reach the threshold value, acquire the acceleration of the capsule endoscope when acquiring each second image; determine the second image to be restored among the second images according to the relative magnitude of the acceleration; perform leveling restoration processing on the second image to be restored to obtain a restored second image; and obtain a second digestive cavity inner wall three-dimensional model based on the restored second image, the multiple frames of processed first images, and the other images in the image set.

In one embodiment, the three-dimensional model determining module 604 is further configured to perform leveling restoration processing on each second image to be restored through a filtering algorithm and a smoothing algorithm, respectively, so as to obtain the corresponding restored second image.

Each of the modules in the above reconstruction device of the three-dimensional model of the inner wall of the digestive cavity may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor in the computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can call and execute the operations corresponding to the modules.

In one exemplary embodiment, a computer device is provided, which may be a server, the internal structure of which may be as shown in fig. 7. The computer device includes a processor, a memory, an Input/Output interface (I/O) and a communication interface. The processor, the memory and the input/output interface are connected through a system bus, and the communication interface is connected to the system bus through the input/output interface. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The database of the computer device is used for storing image set data acquired by the capsule endoscope in the digestive cavity. The input/output interface of the computer device is used to exchange information between the processor and the external device. The communication interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a method of reconstructing a three-dimensional model of an inner wall of a digestive cavity.

It will be appreciated by those skilled in the art that the structure shown in FIG. 7 is merely a block diagram of some of the structures associated with the present inventive arrangements and is not limiting of the computer device to which the present inventive arrangements may be applied, and that a particular computer device may include more or fewer components than shown, or may combine some of the components, or have a different arrangement of components.

In an exemplary embodiment, a computer device is provided, comprising a memory and a processor, the memory storing a computer program, the processor implementing the method for reconstructing a three-dimensional model of an inner wall of a digestive cavity of the above embodiment when the computer program is executed.

In an embodiment, a computer readable storage medium is provided, on which a computer program is stored, which computer program, when being executed by a processor, implements the method of reconstructing a three-dimensional model of an inner wall of a digestive cavity of the above-described embodiment.

In an embodiment, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the method of reconstructing a three-dimensional model of an inner wall of a digestive cavity of the above-described embodiment.

It should be noted that, the user information (including but not limited to user equipment information, user personal information, etc.) and the data (including but not limited to data for analysis, stored data, presented data, etc.) related to the present application are both information and data authorized by the user or sufficiently authorized by each party, and the collection, use and processing of the related data are required to meet the related regulations.

Those skilled in the art will appreciate that implementing all or part of the above described methods may be accomplished by way of a computer program stored on a non-transitory computer readable storage medium, which, when executed, may comprise the steps of the embodiments of the methods described above. Any reference to memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. The non-volatile memory may include read-only memory (Read-Only Memory, ROM), magnetic tape, floppy disk, flash memory, optical memory, high density embedded non-volatile memory, resistive random access memory (ReRAM), magnetoresistive random access memory (Magnetoresistive Random Access Memory, MRAM), ferroelectric random access memory (Ferroelectric Random Access Memory, FRAM), phase change memory (Phase Change Memory, PCM), graphene memory, and the like. Volatile memory can include random access memory (Random Access Memory, RAM) or external cache memory, and the like. By way of illustration, and not limitation, RAM can take various forms such as static random access memory (Static Random Access Memory, SRAM) or dynamic random access memory (Dynamic Random Access Memory, DRAM). The databases referred to in the embodiments provided herein may include at least one of a relational database and a non-relational database. The non-relational database may include, but is not limited to, a blockchain-based distributed database, and the like. The processor referred to in the embodiments provided in the present application may be a general-purpose processor, a central processing unit, a graphics processor, a digital signal processor, a programmable logic unit, a data processing logic unit based on quantum computing, or the like, but is not limited thereto.

The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features involves no contradiction, it should be considered within the scope of this description.

The above embodiments represent only a few implementations of the application; although they are described specifically and in detail, they should not be construed as limiting the scope of the application. It should be noted that those skilled in the art can make several variations and modifications without departing from the spirit of the application, all of which fall within the scope of the application. Accordingly, the scope of the application should be determined by the appended claims.

Claims

1. A method for reconstructing a three-dimensional model of an inner wall of a digestive cavity, the method comprising:

acquiring an image set acquired by a capsule endoscope in a digestive cavity;

determining a plurality of frames of first images formed under the influence of peristalsis of the digestive cavity wall in the image set, and determining a shrinkage area in each frame of first images;

and obtaining a three-dimensional model of the inner wall of the digestive cavity based on the multi-frame processed first image.

2. The method of claim 1, wherein determining a plurality of frames of the first image formed under the influence of peristalsis of the digestive lumen wall in the set of images and determining a region of shrinkage in each frame of the first image comprises:

3. The method of claim 2, wherein the inputting the images in the image set into a pre-constructed shrinkage area identification model comprises:

performing motion blur detection on each image in the candidate set to obtain motion blur conditions of each image in the candidate set;

and inputting each image in the target set into the pre-constructed shrinkage area identification model.

4. The method of claim 1, wherein the restoring the shrinkage area in each frame of first image to a flat area to obtain a plurality of frames of processed first images comprises:

acquiring the motion condition among multiple frames of first images;

5. The method of claim 1, wherein the deriving a three-dimensional model of the inner wall of the digestive lumen based on the plurality of frames of the processed first image comprises:

6. The method of claim 1, wherein the deriving a three-dimensional model of the inner wall of the digestive lumen based on the plurality of frames of the processed first image comprises:

if the accuracy of the three-dimensional model of the inner wall of the first digestive cavity does not reach a threshold value, acquiring corresponding acceleration of the capsule endoscope when acquiring each second image;

7. The method of claim 6, wherein performing a leveling restoration process on the second image to be restored to obtain a restored second image, comprises:

8. A device for reconstructing a three-dimensional model of an inner wall of a digestive lumen, the device comprising:

and a three-dimensional model determining module, configured to obtain a three-dimensional model of the inner wall of the digestive cavity based on the multi-frame processed first image.

9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1 to 7 when the computer program is executed.

10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 7.