CN114913113A - Method, device and equipment for processing image

Method, device and equipment for processing image

Info

Publication number
CN114913113A
Authority
CN
China
Prior art keywords
part information
target image
target
determining
processing
Prior art date
Legal status
Pending
Application number
CN202110181081.XA
Other languages
Chinese (zh)
Inventor
陈佳伟
贺光琳
徐雨亭
朱麒文
Current Assignee
Hangzhou Haikang Huiying Technology Co., Ltd.
Original Assignee
Hangzhou Haikang Huiying Technology Co., Ltd.
Priority date
Filing date
Publication date
Application filed by Hangzhou Haikang Huiying Technology Co., Ltd.
Priority to CN202110181081.XA
Publication of CN114913113A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/0002 Inspection of images, e.g. flaw detection
    • G06T 7/0012 Biomedical image inspection
    • G06T 5/73
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10068 Endoscopic image
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20004 Adaptive image processing
    • G06T 2207/20076 Probabilistic image processing

Abstract

The application discloses a method, a device and equipment for processing images, and belongs to the field of image processing. The method comprises the following steps: determining target part information corresponding to a captured target image based on the captured target image and a pre-trained part information detection model, wherein the target part information is used for indicating the detection part presented in the target image; determining a target image processing mode corresponding to the target part information based on the correspondence between part information and image processing modes; and processing the target image based on the target image processing mode to obtain a processed target image. The application can simplify the steps required to use an endoscope.

Description

Method, device and equipment for processing image
Technical Field
The present application relates to the field of image processing, and in particular, to a method, an apparatus, and a device for image processing.
Background
With the development of science and technology, endoscope technology has become increasingly mature and is widely applied in the medical field; for example, an endoscope may be used to perform an internal examination on a patient, or to provide real-time images during an endoscopic operation.
When an operator uses an endoscope to examine or operate on a patient, the catheter of the endoscope can be introduced into the corresponding part (such as an ear canal, a nasal cavity, an abdominal cavity, a joint cavity and the like) in the body of the patient, and an image pickup device can capture images of that part through the catheter of the endoscope and transmit the captured images to a display, which displays them.
In the process of implementing the present application, the inventors found that the related art has at least the following problems:
when an endoscope is used to examine different parts in a patient, the cavity environments of those parts differ, so the detected part in the captured image may not be displayed clearly; for example, the captured image may be too bright or have too low a color contrast. The operator therefore has to adjust different display parameters of the image according to the detected part each time the endoscope is used so that the detected part displayed in the image can be observed clearly, which makes the endoscope cumbersome to use.
Disclosure of Invention
The embodiments of the present application provide a method, a device and equipment for processing images, which can simplify the steps required to use an endoscope. The technical solution is as follows:
in one aspect, a method for image processing is provided, where the method includes:
determining target part information corresponding to a captured target image based on the captured target image and a pre-trained part information detection model, wherein the target part information is used for indicating the detection part presented in the target image;
determining a target image processing mode corresponding to the target part information based on the corresponding relation between the part information and the image processing mode;
and processing the target image based on the target image processing mode to obtain a processed target image.
Optionally, the determining target part information corresponding to the target image based on the captured target image and the pre-trained part information detection model includes:
inputting the target image into a pre-trained part information detection model to obtain probability values corresponding to various part information respectively;
and determining target part information corresponding to the target image based on the probability value corresponding to each part information.
Optionally, the part information detection model includes a plurality of part information detection models, where each part information detection model has different model attributes, and the model attributes include one or more of a model algorithm, a training sample, and a training amount;
inputting the target image into a pre-trained part information detection model to obtain probability values corresponding to various part information respectively, wherein the probability values comprise:
inputting the target image into a plurality of pre-trained part information detection models respectively to obtain probability values corresponding to various part information output by each part information detection model respectively;
the determining the target part information corresponding to the target image based on the probability value corresponding to each part information comprises:
for each part information, determining fusion probability values corresponding to the part information based on probability values output by different part information detection models corresponding to the part information;
and determining the target part information in a plurality of pieces of part information based on the fusion probability value corresponding to each piece of part information.
Optionally, the method further includes:
determining the confidence corresponding to any piece of part information based on a plurality of probability values determined for that part information by the plurality of part information detection models;
the determining the target part information in a plurality of pieces of part information based on the fusion probability value corresponding to each piece of part information comprises:
and determining the target part information in the plurality of pieces of part information based on the fusion probability value and the confidence corresponding to each piece of part information.
Optionally, the determining the target part information in the plurality of pieces of part information based on the fusion probability value and the confidence corresponding to each piece of part information includes:
and determining, in the plurality of pieces of part information, the part information with the highest corresponding fusion probability value, and if the confidence of the part information with the highest corresponding fusion probability value is greater than a preset confidence threshold, determining the part information with the highest corresponding fusion probability value as the target part information.
Optionally, the target image processing mode includes a target value of a preset display parameter; the processing the target image based on the target image processing mode to obtain a processed target image includes:
and adjusting the value of the preset display parameter corresponding to the target image to a target value to obtain a processed target image.
Optionally, the target image processing mode corresponds to a target processing algorithm; the processing the target image based on the target image processing mode to obtain a processed target image comprises the following steps:
and processing the target image based on the target processing algorithm to obtain the processed target image.
Optionally, the part information in the corresponding relationship includes ear canal, nasal cavity, throat, ureter, normal abdominal cavity, abdominal cavity fog and abdominal cavity bleeding;
when the target part information is any one of the ear canal, the nasal cavity and the ureter, the corresponding target image processing mode is: decreasing the sharpening intensity value corresponding to the target image, and increasing the dark-area brightening intensity value and the denoising intensity value corresponding to the target image;
when the target part information is the throat, the corresponding target image processing mode is: increasing the sharpening intensity value and the blood vessel enhancement intensity value corresponding to the target image;
when the target part information is a normal abdominal cavity, the corresponding target image processing mode is: increasing the sharpening intensity value, the blood vessel enhancement intensity value, the dark-area brightening intensity value and the red saturation control intensity value corresponding to the target image;
when the target part information is abdominal cavity fog, the corresponding target image processing mode is: processing the target image based on a defogging algorithm;
when the target part information is abdominal cavity bleeding, the corresponding target image processing mode is: processing the target image based on a red saturation suppression algorithm.
Optionally, the target site information is further used to indicate a state of the detection site.
Optionally, after obtaining the processed target image, the method further includes:
and displaying the processed target image.
In another aspect, there is provided an apparatus for performing image processing, the apparatus including:
the determination module is used for determining target part information corresponding to a captured target image based on the captured target image and a pre-trained part information detection model, wherein the target part information is used for indicating the detection part presented in the target image; and determining a target image processing mode corresponding to the target part information based on the corresponding relation between the part information and the image processing mode;
and the processing module is used for processing the target image based on the target image processing mode to obtain a processed target image.
Optionally, the determining module is configured to:
inputting the target image into a pre-trained part information detection model to obtain probability values corresponding to various part information respectively;
and determining target part information corresponding to the target image based on the probability value corresponding to each part information.
Optionally, the part information detection model includes a plurality of part information detection models, where each part information detection model has different model attributes, and the model attributes include one or more of a model algorithm, a training sample, and a training amount;
the determining module is configured to:
inputting the target image into a plurality of pre-trained part information detection models respectively to obtain probability values corresponding to various part information output by each part information detection model respectively;
for each part information, determining fusion probability values corresponding to the part information based on probability values output by different part information detection models corresponding to the part information;
and determining the target part information in a plurality of part information based on the fusion probability value corresponding to each part information.
Optionally, the determining module is further configured to:
determining confidence corresponding to any part information based on a plurality of probability values determined by the plurality of part information detection models for any part information;
and determining the target part information in the plurality of part information based on the fusion probability value and the confidence coefficient corresponding to each part information.
Optionally, the determining module is configured to:
and determining, in the plurality of pieces of part information, the part information with the highest corresponding fusion probability value, and if the confidence of the part information with the highest corresponding fusion probability value is greater than a preset confidence threshold, determining the part information with the highest corresponding fusion probability value as the target part information.
Optionally, the target image processing mode includes a target value of a preset display parameter; the processing module is configured to:
adjusting the value of a preset display parameter corresponding to the target image to a target value to obtain a processed target image;
alternatively,
the target image processing mode corresponds to a target processing algorithm; the processing module is configured to:
processing the target image based on the target processing algorithm to obtain the processed target image;
alternatively,
the part information in the corresponding relation comprises an ear canal, a nasal cavity, a throat, a ureter, a normal abdominal cavity, abdominal cavity fog and abdominal cavity bleeding; the processing module is configured to:
when the target part information is any one of the ear canal, the nasal cavity and the ureter, the corresponding target image processing mode is: decreasing the sharpening intensity value corresponding to the target image, and increasing the dark-area brightening intensity value and the denoising intensity value corresponding to the target image;
when the target part information is the throat, the corresponding target image processing mode is: increasing the sharpening intensity value and the blood vessel enhancement intensity value corresponding to the target image;
when the target part information is a normal abdominal cavity, the corresponding target image processing mode is: increasing the sharpening intensity value, the blood vessel enhancement intensity value, the dark-area brightening intensity value and the red saturation control intensity value corresponding to the target image;
when the target part information is abdominal cavity fog, the corresponding target image processing mode is: processing the target image based on a defogging algorithm;
when the target part information is abdominal cavity bleeding, the corresponding target image processing mode is: processing the target image based on a red saturation suppression algorithm.
Optionally, the apparatus further includes a display module, configured to:
and displaying the processed target image.
In yet another aspect, a computer device is provided, which includes a processor and a memory, where at least one instruction is stored, and the instruction is loaded and executed by the processor to implement the operations performed by the method for image processing as described above.
In yet another aspect, a computer-readable storage medium is provided, in which at least one instruction is stored, the instruction being loaded and executed by a processor to implement the operations performed by the method for image processing as described above.
The technical scheme provided by the embodiment of the application has the following beneficial effects:
the shot target image is detected through a pre-trained part information detection model to obtain target part information corresponding to the target image, and then a target image processing mode for processing the target image can be determined according to the corresponding relation between the pre-set part information and the image processing mode. Thus, the target image can be processed according to the target image processing mode. According to the method and the device, the user of the endoscope does not need to manually adjust the display parameters of the images shot by the endoscope according to the detected part of the endoscope, and the using steps of the endoscope can be simplified.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed for the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application, and other drawings can be obtained by those skilled in the art from these drawings without creative effort.
FIG. 1 is a block diagram of an endoscope system provided by an embodiment of the present application;
FIG. 2 is a functional block diagram of an endoscope system provided by an embodiment of the present application;
FIG. 3 is a flowchart of a method for image processing according to an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of a method for image processing according to an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of a method for image processing according to an embodiment of the present disclosure;
FIG. 6 is a schematic diagram of a method for image processing according to an embodiment of the present disclosure;
FIG. 7 is a schematic diagram of a method for image processing according to an embodiment of the present application;
FIG. 8 is a schematic structural diagram of an apparatus for performing image processing according to an embodiment of the present disclosure;
fig. 9 is a schematic structural diagram of a terminal according to an embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, the following detailed description of the embodiments of the present application will be made with reference to the accompanying drawings.
The method for processing an image provided by the embodiments of the present application can be implemented by a terminal. The terminal may be an image processing device that is connected to an image pickup device and a display device and has an image processing function: it can receive the image captured by the image pickup device, process the received image, and transmit the processed image to the display device, which displays it.
The method for processing an image provided by the embodiments of the present application can be applied to the process in which an operator examines or operates on a patient through an endoscope. When an operator uses the endoscope to examine or operate on a patient, the catheter of the endoscope can be guided into the corresponding part in the body of the patient, and an image pickup device can capture images of that part through the catheter of the endoscope and transmit the captured images to a display device (such as a display), which displays them. Fig. 1 is a configuration diagram of an endoscope system. The endoscope system includes an endoscope apparatus, a light source apparatus, an image pickup system host, a display apparatus (display device), and a storage apparatus. The endoscope apparatus is inserted into the subject and images each part in the subject. The light source device supplies the illumination light emitted from the distal end of the endoscope device. The imaging system host (image processing apparatus) performs processing such as encoding on the image data captured by the endoscope, and transmits the processed image to the display apparatus, which can display the received image, and to the storage device, which stores the received image. The operator can examine or operate on the patient by observing the in-vivo image displayed by the display device, for example, examining a bleeding part, a tumor part, an abnormal part, or the like of the patient.
Fig. 2 is a functional block diagram of the endoscope system. The endoscope device has an imaging optical system, an imaging unit, a processing unit, and an operation unit. The imaging optical system condenses light from the detection part and is composed of one or more lenses. The imaging unit photoelectrically converts the light received by each pixel to generate image data, and is composed of an image sensor such as a CMOS (complementary metal oxide semiconductor) or CCD (charge coupled device) sensor. The processing unit converts the image data generated by the imaging unit into a digital signal and sends the converted digital signal to the camera system host. The operation unit is used for outputting to the camera system host an instruction signal for switching the endoscope or an instruction signal for instructing the light source device to illuminate; the operation unit may be a switch, a button, a touch panel, or the like. The light source device comprises an illumination control unit and an illumination unit. The illumination control unit receives the indication signal of the camera system host and controls the illumination unit to provide illumination light to the endoscope. The camera system host processes the image data received from the endoscope and transmits the processed image data to the display device and the storage device, which are external devices. The camera system host comprises an image input unit, an image processing unit, an intelligent processing unit, a video coding unit, a control unit and an operation unit. The image input unit receives the digital signal sent by the endoscope device and transmits the received digital signal to the image processing unit. The image processing unit performs ISP (image signal processing) operations on the image from the image input unit, including but not limited to luminance transformation, sharpening, moiré removal and scaling. The image processed by the image processing unit is transmitted to the intelligent processing unit, the video coding unit or the display device. The intelligent processing unit detects the images processed by the image processing unit, including but not limited to scene classification based on deep learning, instrument head detection, gauze detection, moiré classification and dense fog classification. The image processed by the intelligent processing unit is transmitted to the image processing unit or the video coding unit. The image processing unit processes the image processed by the intelligent processing unit in a manner including, but not limited to, luminance transformation, moiré removal, framing and scaling. The video coding unit encodes and compresses the image processed by the image processing unit or the intelligent processing unit and transmits it to the storage device.
The method for processing an image provided by the embodiments of the present application can be used to perform image recognition on the image captured by the endoscope and determine the part presented in the captured image, such as an ear canal, a nasal cavity, a chest cavity, an abdominal cavity, a joint cavity and the like, and then to process the image accordingly so that the part in the image is clearly displayed on a display.
Fig. 3 is a flowchart of a method for performing image processing according to an embodiment of the present application. Referring to fig. 3, the embodiment includes:
step 301, target region information corresponding to the target image is determined based on the captured target image and a pre-trained region information detection model.
The target portion information is used to indicate a detection portion presented by the target image, and may be, for example, an ear canal, a nasal cavity, a chest cavity, an abdominal cavity, a joint cavity, and the like.
In practice, when the operator turns on the endoscope system, the camera of the endoscope may start capturing images and then transmit the captured images to the image processing device, which may be the endoscope system camera system host. A part information detection model for which training is completed may be set in advance in the image processing apparatus, and when the image processing apparatus receives one image each time, the received image (target image) may be input to the part information detection model, and the target image is detected by the part information detection model, and target part information corresponding to the target image is determined.
The part information detection model can output, for each piece of part information, a probability value that the part displayed in the target image corresponds to that part information, and the target part information corresponding to the target image is then determined based on the probability value corresponding to each piece of part information.
In implementation, after the target image is input to the pre-trained part information detection model, the model outputs the probability values that the part presented by the target image belongs to each part in the body, and the part with the highest probability value may then be determined as the part presented in the target image. For example, if the probability values output by the part information detection model for the ear canal, the nasal cavity, the thoracic cavity, the abdominal cavity and the joint cavity are 0.05, 0.10, 0.05, 0.75 and 0.05 respectively, it can be determined that the target image shows the abdominal cavity, that is, the endoscope is currently detecting the abdominal cavity.
In addition, a technician may set a probability threshold; when all the probability values output by the part information detection model for the parts in the body are lower than the probability threshold, it may be determined that the target image presents an in-vitro scene and does not show any part in the body. Alternatively, the part information detection model may directly output the probability value that the target image presents an in-vitro scene, and when that probability value is the highest, it may be determined that the target image presents an in-vitro scene.
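For illustration only, the selection rule described above could be sketched in Python as follows; the part labels, the threshold value and the function name are assumptions made for this sketch and are not specified in this description.

import numpy as np

# Hypothetical part labels; the text above uses the ear canal, nasal cavity,
# thoracic cavity, abdominal cavity and joint cavity as examples.
PARTS = ["ear_canal", "nasal_cavity", "thoracic_cavity", "abdominal_cavity", "joint_cavity"]
PROB_THRESHOLD = 0.5  # assumed value; in practice preset by the technician

def classify_part(probabilities: np.ndarray) -> str:
    """Pick the part with the highest probability, or report an in-vitro
    scene when every probability falls below the threshold."""
    if probabilities.max() < PROB_THRESHOLD:
        return "in_vitro_scene"
    return PARTS[int(probabilities.argmax())]

# Example matching the probability values used in the text above.
print(classify_part(np.array([0.05, 0.10, 0.05, 0.75, 0.05])))  # -> abdominal_cavity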
Optionally, the part information is also used to indicate the state of the detection part. For example, when the detection part is the abdominal cavity, the corresponding abdominal cavity states may include a normal abdominal cavity, abdominal cavity bleeding and abdominal cavity fog. Bleeding may occur in the abdominal cavity during endoscopic surgery, and fog may appear because a large amount of water vapor is generated in the abdominal cavity when some surgical instruments excise lesion tissue at high temperature. In addition, when the part information is also used to indicate the state of the detection part, the part information detection model may detect both the probability that the part presented by the target image belongs to each part in the body and the state probabilities of some parts. For example, the part information detection model may output probability values for the ear canal, the nasal cavity, the thoracic cavity, the abdominal cavity, the joint cavity, the normal abdominal cavity, abdominal cavity bleeding and abdominal cavity fog respectively.
Optionally, in order to improve the accuracy of detecting the target part information, the embodiments of the present application further provide a plurality of part information detection models with different model attributes to detect the target image, and the target part information of the target image is determined according to the detection result of each part information detection model. The model attributes include one or more of a model algorithm, a training sample, and a training amount.
In the embodiment of the present application, three part information detection models with different model algorithms are taken as an example to explain the scheme.
As shown in fig. 4, fig. 4 is a schematic diagram of the part information detection model A. After the target image is input into the part information detection model A, the target image may be processed by the multi-layer neural network included in the part information detection model A, and finally the processing result of the target image, that is, the probability value that the part presented by the target image is each part in the body, is output. For example, the multi-layer neural network in the part information detection model A may include a convolutional layer, a nonlinear activation layer, a normalization layer, and a softmax (a classification function) layer, where the softmax layer is the output layer.
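For illustration, a rough sketch of such a single-branch classifier in PyTorch could look as follows; the layer sizes, the number of parts and the choice of framework are assumptions and are not specified in this description.

import torch
import torch.nn as nn

class PartInfoModelA(nn.Module):
    """Sketch of a model-A-style classifier: convolution, nonlinear activation,
    normalization, and a softmax output over the part classes."""
    def __init__(self, num_parts: int = 5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.BatchNorm2d(16),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        self.classifier = nn.Linear(16, num_parts)

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        # Softmax layer: probability that the image shows each part in the body.
        return torch.softmax(self.classifier(self.features(image)), dim=1)

probs = PartInfoModelA()(torch.randn(1, 3, 224, 224))  # shape (1, num_parts)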
As shown in fig. 5, fig. 5 is a schematic diagram of the part information detection model B. The multi-layer neural network in the part information detection model B may include two output layers. After the target image is input to the part information detection model B, the target image may be processed by the multilayer neural network included in the part information detection model B. The first output layer obtains an indication value of whether the part presented by the target image belongs to an in-vivo part; when the indication value is smaller than a preset indication threshold, it can be determined that the target image is an in-vitro scene and does not show an in-vivo part; when the indication value is larger than the preset indication threshold, the subsequent classification layer continues to process the data produced by the first output layer, and finally the probability value that the part presented by the target image is each part in the body is obtained. For example, the multi-layer neural network in the part information detection model B may include a convolutional layer, a nonlinear activation layer, a normalization layer, a sigmoid layer, and a softmax layer, where the sigmoid (a classification function) layer may be the first output layer and the softmax layer may be the second output layer, i.e., the classification layer.
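A correspondingly hedged sketch of the two-output-layer idea is given below; the gate threshold, the layer sizes and the names are illustrative assumptions.

import torch
import torch.nn as nn

class PartInfoModelB(nn.Module):
    """Sketch of a model-B-style network: a sigmoid head first indicates whether
    the image shows an in-vivo part; only then does a softmax head classify the part."""
    def __init__(self, num_parts: int = 5, gate_threshold: float = 0.5):
        super().__init__()
        self.gate_threshold = gate_threshold
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.BatchNorm2d(16), nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.in_vivo_head = nn.Linear(16, 1)        # first output layer (sigmoid)
        self.part_head = nn.Linear(16, num_parts)   # second output layer (softmax)

    def forward(self, image: torch.Tensor):
        feat = self.backbone(image)
        in_vivo = torch.sigmoid(self.in_vivo_head(feat))
        if in_vivo.item() < self.gate_threshold:    # sketch assumes a batch of one image
            return None  # in-vitro scene: no per-part probabilities are produced
        return torch.softmax(self.part_head(feat), dim=1)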
As shown in fig. 6, fig. 6 is a schematic diagram of the part information detection model C. The input of the part information detection model C may include two images: one is the target image captured by the endoscope, and the other is a reference image, which may show a part or a state of a part, such as but not limited to any one of an ear canal, a nasal cavity, a thoracic cavity, an abdominal cavity, a joint cavity, a normal abdominal cavity, abdominal cavity bleeding, and abdominal cavity fog. The two input images are matched in the part information detection model C, and the probability value that the part represented by the target image is the part represented in the reference image is determined. Since only one part can be shown in a reference image, when each target image is identified, the target image is sequentially combined with a plurality of different part images to form input image pairs, and the input image pairs are respectively input into the part information detection model C to obtain the probability values that the part shown in the target image belongs to each part in the body. The multilayer neural network in the part information detection model C may comprise a convolutional layer, a nonlinear activation layer, a normalization layer, a sigmoid layer and a softmax layer.
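A minimal sketch of this pair-matching approach is shown below; the feature extractor, the similarity head and the reference-image set are assumptions made for illustration.

import torch
import torch.nn as nn

class PartInfoModelC(nn.Module):
    """Sketch of a model-C-style matcher: it scores how likely the target image
    shows the same part (or part state) as a given reference image."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.match_head = nn.Linear(32, 1)

    def forward(self, target: torch.Tensor, reference: torch.Tensor) -> torch.Tensor:
        pair = torch.cat([self.encoder(target), self.encoder(reference)], dim=1)
        return torch.sigmoid(self.match_head(pair))  # probability that the pair matches

def score_all_parts(model, target, reference_images: dict) -> dict:
    """Pair the target image with one reference image per part, as described above."""
    return {part: model(target, ref).item() for part, ref in reference_images.items()}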
It should be noted that the three different part information detection models described above are existing deep learning models, and the multilayer neural network in each part information detection model is composed of existing neural network layers. Each part information detection model is trained with an existing training method, which is not described here again.
In the embodiments of the present application, any two or more of the three different part information detection models may form a part information detection framework that performs recognition processing on the target image, and the corresponding processing is as follows:
respectively inputting the target image into a plurality of pre-trained part information detection models to obtain probability values respectively corresponding to a plurality of kinds of part information output by each part information detection model; for each part information, determining fusion probability values corresponding to the part information based on the probability values output by different part information detection models corresponding to the part information; and determining target part information in the plurality of part information based on the fusion probability value corresponding to each part information.
In implementation, a plurality of part information detection models may constitute a part information detection framework; the target image is then input into the part information detection framework and detected by the plurality of part information detection models in it. As shown in fig. 7, fig. 7 includes three different part information detection frameworks, and in practical applications any one of them may be used. The part information detection framework 1 is composed of the part information detection model A and the part information detection model B; the part information detection framework 2 is composed of the part information detection model A or B and the part information detection model C; the part information detection framework 3 is composed of two part information detection models C with different training samples or training amounts, where the training amount can be the number of times or the duration of training on the training samples.
After the target image is detected by the plurality of part information detection models in the part information detection framework, each part information detection model in the framework yields a probability value for each part that the target image may represent. Thus, for each part there are multiple probability values, that is, the multiple probability values obtained by the multiple models that the part represented by the target image belongs to that part. These probability values may then be fused, for example by averaging them, or by summing and normalizing them, to obtain a fused probability value that the part represented by the target image belongs to each part. The part with the highest corresponding fused probability value may then be determined as the part presented by the target image.
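A small sketch of this fusion step is given below; averaging is shown, and summing followed by normalization would be handled in the same way. The names and example values are illustrative.

import numpy as np

def fuse_probabilities(per_model_probs: list) -> np.ndarray:
    """Fuse the per-part probability vectors output by several part information
    detection models by averaging them element-wise and normalizing."""
    fused = np.mean(per_model_probs, axis=0)
    return fused / fused.sum()

probs_a = np.array([0.05, 0.10, 0.05, 0.75, 0.05])  # output of model A
probs_b = np.array([0.10, 0.05, 0.05, 0.70, 0.10])  # output of model B
fused = fuse_probabilities([probs_a, probs_b])
print(int(fused.argmax()))  # index of the part with the highest fused probability -> 3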
Optionally, determining a confidence corresponding to any part information based on a plurality of probability values determined by a plurality of part information detection models for any part information; and determining target part information corresponding to the target image from the plurality of part information based on the fusion probability value and the confidence coefficient corresponding to each part information.
In implementation, the fusion probability value that the part represented by the target image belongs to each part may be determined from the multiple probability values corresponding to that part, and a confidence of the corresponding fusion probability value may also be determined from those probability values. For example, if the part information detection framework includes two part information detection models, the corresponding confidence is determined by the following formula:
Conf_i = 1 - |P_i(A) - P_i(B)|
where Conf_i is the confidence that the part represented by the target image is part i; P_i(A) is the probability value, obtained by the part information detection model A, that the part represented by the target image is part i; and P_i(B) is the probability value, obtained by the part information detection model B, that the part represented by the target image is part i.
After the fusion probability value and the confidence that the part represented by the target image belongs to each part are obtained, the part to which the part represented by the target image belongs can be determined among the plurality of parts according to the fusion probability value and the confidence of each part. For example, a part whose confidence is greater than a preset confidence threshold and whose fusion probability value is greater than a preset probability value threshold may be determined as the part to which the part presented in the target image belongs.
In an optional scheme, the part information with the highest corresponding fusion probability value may be determined among the multiple pieces of part information, and if the confidence of that part information is greater than a preset confidence threshold, it is determined as the target part information corresponding to the target image.
In implementation, after the fusion probability value and the confidence that the part represented by the target image belongs to each part are obtained, the part information with the highest corresponding fusion probability value may be determined. If the confidence corresponding to that part information is greater than the preset confidence threshold, the corresponding part is determined as the part to which the part presented by the target image belongs. If the confidence corresponding to the part information with the highest fusion probability value is less than or equal to the preset confidence threshold, it can be determined that the target part information corresponding to the target image cannot currently be detected, and subsequent processing of the target image may be skipped.
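The decision rule just described could be sketched as follows; the threshold value and the function name are assumptions made for illustration.

import numpy as np

CONF_THRESHOLD = 0.8  # assumed preset confidence threshold

def pick_target_part(fused, probs_a, probs_b, parts):
    """Select the part with the highest fused probability, and accept it only when
    the two models agree closely enough (Conf_i = 1 - |P_i(A) - P_i(B)|)."""
    i = int(np.argmax(fused))
    confidence = 1.0 - abs(probs_a[i] - probs_b[i])
    if confidence > CONF_THRESHOLD:
        return parts[i]
    return None  # target part information cannot be detected; skip further processing

parts = ["ear_canal", "nasal_cavity", "thoracic_cavity", "abdominal_cavity", "joint_cavity"]
print(pick_target_part(np.array([0.075, 0.075, 0.05, 0.725, 0.075]),
                       np.array([0.05, 0.10, 0.05, 0.75, 0.05]),
                       np.array([0.10, 0.05, 0.05, 0.70, 0.10]),
                       parts))  # -> abdominal_cavity (confidence 0.95 > 0.8)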
Step 302, determining a target image processing mode corresponding to the target part information based on the corresponding relationship between the preset part information and the image processing mode.
In implementation, a technician may preset the image processing mode corresponding to each piece of part information, and after the part information of the target image is determined, the target image processing mode for processing the target image may be determined according to the preset correspondence.
In the preset correspondence between part information and image processing modes, the image processing mode may be to adjust the display parameters of the image to preset values. That is, each piece of part information may correspond to values of different display parameters; after the target part information corresponding to the target image is determined, the target value of each display parameter corresponding to the target part information may be determined according to the correspondence, and the value of each display parameter of the target image is then adjusted to that target value.
The display parameters include, but are not limited to, sharpening strength, red saturation control strength, dark area brightening strength, blood vessel enhancement strength, and noise reduction strength, and values corresponding to the display parameters may be preset by a technician. The adjustment of the values of the display parameters of the target image to the target values can be realized by an ISP algorithm, which belongs to the prior art, and detailed processing steps are not described here.
In the preset correspondence between part information and image processing modes, the image processing mode may also be to process the image with different image processing algorithms, that is, different part information corresponds to different image processing algorithms. After the target part information corresponding to the target image is determined, the target processing algorithm corresponding to the target part information can be determined according to the correspondence, and the target image is then processed by the target processing algorithm. For example, when the part information of the target image is determined to be abdominal cavity fog, the corresponding processing algorithm may be a defogging algorithm, which reduces the fog in the target image and improves its definition. When the part information of the target image is determined to be abdominal cavity bleeding, the corresponding processing algorithm may be a red saturation suppression algorithm, which reduces the saturation of red objects in the target image and restores the details of the objects.
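Taken together, the lookup-and-dispatch step described above could be sketched as follows; the part names, parameter names, placeholder functions and values are assumptions for illustration and are not the concrete values preset in this description.

# Illustrative correspondence table between part information and image processing modes.
PROCESSING_MODES = {
    "ear_canal":          {"params": {"sharpen": "low", "dark_brighten": "high", "denoise": "high"}},
    "throat":             {"params": {"sharpen": "high", "vessel_enhance": "high"}},
    "abdominal_fog":      {"algorithm": "defog"},
    "abdominal_bleeding": {"algorithm": "suppress_red_saturation"},
}

def apply_isp_parameters(image, params):
    # Placeholder for the ISP adjustment of sharpening, brightening, denoising, etc.
    return image

def run_algorithm(name, image):
    # Placeholder for the defogging / red saturation suppression algorithms.
    return image

def process_target_image(image, target_part):
    mode = PROCESSING_MODES.get(target_part)
    if mode is None:
        return image  # part not in the correspondence: keep the default display parameters
    if "algorithm" in mode:
        return run_algorithm(mode["algorithm"], image)
    return apply_isp_parameters(image, mode["params"])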
And step 303, processing the target image based on the target image processing mode to obtain a processed target image.
In implementation, after a target image processing method for processing the target image is determined, the target image may be further processed according to the target image processing method to obtain a processed target image.
Optionally, the part information in the corresponding relationship may include ear canal, nasal cavity, throat, ureter, normal abdominal cavity, abdominal cavity fog and abdominal cavity bleeding;
when the target part information is any one of the ear canal, the nasal cavity and the ureter, the corresponding target image processing mode is: decreasing the sharpening intensity value corresponding to the target image, and increasing the dark-area brightening intensity value and the noise reduction intensity value corresponding to the target image; when the target part information is the throat, the corresponding target image processing mode is: increasing the sharpening intensity value and the blood vessel enhancement intensity value corresponding to the target image; when the target part information is a normal abdominal cavity, the corresponding target image processing mode is: increasing the sharpening intensity value, the blood vessel enhancement intensity value, the dark-area brightening intensity value and the red saturation control intensity value corresponding to the target image; when the target part information is abdominal cavity fog, the corresponding target image processing mode is: processing the target image based on a defogging algorithm; when the target part information is abdominal cavity bleeding, the corresponding target image processing mode is: processing the target image based on a red saturation suppression algorithm.
In implementation, default values may be set for the display parameters of an image, and, for each piece of part information, the adjustment mode of each display parameter and the image processing algorithm used to process the image may be set. When the target part information corresponding to the image has not been determined, the image can be displayed with the default value of each display parameter. After the target part information corresponding to the image is determined, the display parameters may be adjusted according to the corresponding adjustment mode, and the image is displayed with the adjusted display parameters. The corresponding adjustment mode may be a decrease or an increase, and the display parameters to be adjusted may be the same or different for different part information. The adjustment mode corresponding to each piece of part information can be as shown in Table 1 below.
Table 1 (the adjustment modes of the display parameters for each piece of part information are given as an image in the original publication)
Note that the first numerical value, the second numerical value, the third numerical value, ..., and the fifteenth numerical value may be set in advance by a technician, and, in the image processing mode corresponding to each piece of part information, the value of any display parameter that is not adjusted may be its preset default value.
Optionally, after the processed target image is obtained, it may be handled further; for example, the processed image may be sent to a display, which displays it so that the operator can perform an endoscopic operation on the patient according to the processed image. While being displayed, the processed image may also be transmitted to a storage device and stored.
In the embodiments of the present application, the captured target image is detected by a pre-trained part information detection model to obtain the target part information corresponding to the target image, and a target image processing mode for processing the target image can then be determined according to the preset correspondence between part information and image processing modes. The target image can thus be processed according to the target image processing mode. With the present application, the user of the endoscope does not need to manually adjust the display parameters of the images captured by the endoscope according to the part being detected, which simplifies the steps required to use the endoscope.
All the above optional technical solutions may be combined arbitrarily to form the optional embodiments of the present disclosure, and are not described herein again.
Fig. 8 shows an apparatus for performing image processing according to an embodiment of the present application. The apparatus may be the terminal in the foregoing embodiments, and the apparatus includes:
a determining module 810, configured to determine target part information corresponding to a captured target image based on the captured target image and a pre-trained part information detection model, where the target part information is used to indicate the detection part presented in the target image; and to determine a target image processing mode corresponding to the target part information based on the correspondence between part information and image processing modes;
and the processing module 820 is configured to process the target image based on the target image processing manner to obtain a processed target image.
Optionally, the determining module 810 is configured to:
inputting the target image into a pre-trained part information detection model to obtain probability values corresponding to various part information respectively;
and determining target part information corresponding to the target image based on the probability value corresponding to each part information.
Optionally, the part information detection models include a plurality of models, wherein each part information detection model has different model attributes, and the model attributes include one or more of a model algorithm, a training sample, and a training amount;
the determining module 810 is configured to:
inputting the target image into a plurality of pre-trained part information detection models respectively to obtain probability values corresponding to a plurality of kinds of part information output by each part information detection model respectively;
for each part information, determining fusion probability values corresponding to the part information based on probability values output by different part information detection models corresponding to the part information;
and determining the target part information in a plurality of pieces of part information based on the fusion probability value corresponding to each piece of part information.
Optionally, the determining module 810 is further configured to:
determining the confidence corresponding to any piece of part information based on a plurality of probability values determined for that part information by the plurality of part information detection models;
and determining the target part information in the plurality of part information based on the fusion probability value and the confidence coefficient corresponding to each part information.
Optionally, the determining module 810 is configured to:
and determining, in the plurality of pieces of part information, the part information with the highest corresponding fusion probability value, and if the confidence of the part information with the highest corresponding fusion probability value is greater than a preset confidence threshold, determining the part information with the highest corresponding fusion probability value as the target part information.
Optionally, the target image processing manner includes a target value of a preset display parameter; the processing module 820 is configured to:
adjusting the value of a preset display parameter corresponding to the target image to a target value to obtain a processed target image;
alternatively,
the target image processing mode corresponds to a target processing algorithm; the processing module 820 is configured to:
processing the target image based on the target processing algorithm to obtain the processed target image;
alternatively,
the part information in the corresponding relation comprises an ear canal, a nasal cavity, a throat, a ureter, a normal abdominal cavity, abdominal cavity fog and abdominal cavity bleeding; the processing module 820 is configured to:
when the target part information is any one of the ear canal, the nasal cavity and the ureter, the corresponding target image processing mode is: decreasing the sharpening intensity value corresponding to the target image, and increasing the dark-area brightening intensity value and the denoising intensity value corresponding to the target image;
when the target part information is the throat, the corresponding target image processing mode is: increasing the sharpening intensity value and the blood vessel enhancement intensity value corresponding to the target image;
when the target part information is a normal abdominal cavity, the corresponding target image processing mode is: increasing the sharpening intensity value, the blood vessel enhancement intensity value, the dark-area brightening intensity value and the red saturation control intensity value corresponding to the target image;
when the target part information is abdominal cavity fog, the corresponding target image processing mode is: processing the target image based on a defogging algorithm;
when the target part information is abdominal cavity bleeding, the corresponding target image processing mode is: processing the target image based on a red saturation suppression algorithm.
Optionally, the apparatus further comprises a display module, configured to:
and displaying the processed target image.
It should be noted that, when the apparatus for performing image processing provided by the above embodiment processes an image, the division into the functional modules described above is merely used as an example; in practical applications, the functions may be assigned to different functional modules as needed, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the apparatus for performing image processing and the method for performing image processing provided by the above embodiments belong to the same concept, and the specific implementation process is detailed in the method embodiments and is not repeated here.
Fig. 9 shows a block diagram of a terminal 900 according to an exemplary embodiment of the present application. The terminal 900 may be a smart phone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a notebook computer, or a desktop computer. The terminal 900 may also be referred to by other names such as user equipment, portable terminal, laptop terminal, desktop terminal, and the like.
In general, terminal 900 includes: a processor 901 and a memory 902.
Processor 901 may include one or more processing cores, such as a 4-core processor, an 8-core processor, and so on. The processor 901 may be implemented in at least one hardware form of a DSP (Digital Signal Processing), an FPGA (Field-Programmable Gate Array), and a PLA (Programmable Logic Array). The processor 901 may also include a main processor and a coprocessor, where the main processor is a processor for Processing data in an awake state, and is also called a Central Processing Unit (CPU); a coprocessor is a low power processor for processing data in a standby state. In some embodiments, the processor 901 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content required to be displayed on the display screen. In some embodiments, the processor 901 may further include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
Memory 902 may include one or more computer-readable storage media, which may be non-transitory. The memory 902 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in memory 902 is used to store at least one instruction for execution by processor 901 to implement the method for image processing provided by the method embodiments herein.
In some embodiments, terminal 900 can also optionally include: a peripheral interface 903 and at least one peripheral. The processor 901, memory 902, and peripheral interface 903 may be connected by buses or signal lines. Various peripheral devices may be connected to the peripheral interface 903 via a bus, signal line, or circuit board. Specifically, the peripheral device includes: at least one of a radio frequency circuit 904, a touch display screen 905, a camera 906, an audio circuit 907, a positioning component 908, and a power supply 909.
The peripheral interface 903 may be used to connect at least one peripheral related to I/O (Input/Output) to the processor 901 and the memory 902. In some embodiments, the processor 901, memory 902, and peripheral interface 903 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 901, the memory 902 and the peripheral interface 903 may be implemented on a separate chip or circuit board, which is not limited by this embodiment.
The Radio Frequency circuit 904 is used to receive and transmit RF (Radio Frequency) signals, also known as electromagnetic signals. The radio frequency circuitry 904 communicates with communication networks and other communication devices via electromagnetic signals. The radio frequency circuit 904 converts an electrical signal into an electromagnetic signal for transmission, or converts a received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuit 904 comprises: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuit 904 may communicate with other terminals via at least one wireless communication protocol. The wireless communication protocols include, but are not limited to: metropolitan area networks, various generation mobile communication networks (2G, 3G, 4G, and 5G), Wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 904 may also include NFC (Near Field Communication) related circuits, which are not limited in this application.
The display screen 905 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 905 is a touch display screen, the display screen 905 also has the ability to capture touch signals on or over its surface. The touch signal may be input to the processor 901 as a control signal for processing. At this point, the display screen 905 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 905, provided on the front panel of the terminal 900; in other embodiments, there may be at least two display screens 905, each disposed on a different surface of the terminal 900 or in a foldable design; in still other embodiments, the display screen 905 may be a flexible display disposed on a curved surface or a folded surface of the terminal 900. Furthermore, the display screen 905 may even be arranged in a non-rectangular irregular shape, i.e., an irregularly shaped screen. The display screen 905 may be an LCD (Liquid Crystal Display), an OLED (Organic Light-Emitting Diode) display, or the like.
The camera assembly 906 is used to capture images or video. Optionally, camera assembly 906 includes a front camera and a rear camera. Generally, a front camera is disposed at a front panel of a terminal, and a rear camera is disposed at a rear surface of the terminal. In some embodiments, the number of the rear cameras is at least two, and each rear camera is any one of a main camera, a depth-of-field camera, a wide-angle camera and a telephoto camera, so that the main camera and the depth-of-field camera are fused to realize a background blurring function, and the main camera and the wide-angle camera are fused to realize panoramic shooting and VR (Virtual Reality) shooting functions or other fusion shooting functions. In some embodiments, camera assembly 906 may also include a flash. The flash lamp can be a single-color temperature flash lamp or a double-color temperature flash lamp. The double-color-temperature flash lamp is a combination of a warm-light flash lamp and a cold-light flash lamp, and can be used for light compensation at different color temperatures.
The audio circuit 907 may include a microphone and a speaker. The microphone is used to collect sound waves from the user and the environment, convert the sound waves into electrical signals, and input them to the processor 901 for processing, or to the radio frequency circuit 904 for voice communication. For stereo acquisition or noise reduction purposes, there may be multiple microphones disposed at different locations of the terminal 900. The microphone may also be an array microphone or an omnidirectional pickup microphone. The speaker is used to convert electrical signals from the processor 901 or the radio frequency circuit 904 into sound waves. The speaker may be a traditional diaphragm speaker or a piezoelectric ceramic speaker. When the speaker is a piezoelectric ceramic speaker, it can convert an electrical signal into sound waves audible to humans, or into sound waves inaudible to humans for purposes such as distance measurement. In some embodiments, the audio circuit 907 may also include a headphone jack.
The positioning component 908 is used to locate the current geographic location of the terminal 900 for navigation or LBS (Location Based Service). The positioning component 908 may be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.
The power supply 909 is used to supply power to the various components in the terminal 900. The power supply 909 may be alternating current, direct current, a disposable battery, or a rechargeable battery. When the power supply 909 includes a rechargeable battery, the rechargeable battery may support wired or wireless charging. The rechargeable battery may also be used to support fast charging technology.
In some embodiments, terminal 900 can also include one or more sensors 910. The one or more sensors 910 include, but are not limited to: an acceleration sensor 911, a gyro sensor 912, a pressure sensor 913, a fingerprint sensor 914, an optical sensor 915, and a proximity sensor 916.
The acceleration sensor 911 can detect the magnitude of acceleration in three coordinate axes of the coordinate system established with the terminal 900. For example, the acceleration sensor 911 may be used to detect the components of the gravitational acceleration in three coordinate axes. The processor 901 can control the touch display 905 to display the user interface in a landscape view or a portrait view according to the gravitational acceleration signal collected by the acceleration sensor 911. The acceleration sensor 911 may also be used for acquisition of motion data of a game or a user.
The gyro sensor 912 may detect a body direction and a rotation angle of the terminal 900, and the gyro sensor 912 may cooperate with the acceleration sensor 911 to acquire a 3D motion of the user on the terminal 900. The processor 901 can implement the following functions according to the data collected by the gyro sensor 912: motion sensing (such as changing the UI according to a user's tilting operation), image stabilization at the time of photographing, game control, and inertial navigation.
The pressure sensor 913 may be disposed on a side bezel of the terminal 900 and/or underneath the touch display screen 905. When the pressure sensor 913 is disposed on the side bezel of the terminal 900, it can detect the user's grip signal on the terminal 900, and the processor 901 performs left/right-hand recognition or shortcut operations according to the grip signal collected by the pressure sensor 913. When the pressure sensor 913 is disposed at the lower layer of the touch display screen 905, the processor 901 controls an operable control on the UI according to the user's pressure operation on the touch display screen 905. The operable controls include at least one of a button control, a scroll bar control, an icon control, and a menu control.
The fingerprint sensor 914 is used for collecting a fingerprint of the user, and the processor 901 identifies the user according to the fingerprint collected by the fingerprint sensor 914, or the fingerprint sensor 914 identifies the user according to the collected fingerprint. Upon identifying that the user's identity is a trusted identity, processor 901 authorizes the user to perform relevant sensitive operations including unlocking the screen, viewing encrypted information, downloading software, paying, and changing settings, etc. The fingerprint sensor 914 may be disposed on the front, back, or side of the terminal 900. When a physical key or vendor Logo is provided on the terminal 900, the fingerprint sensor 914 may be integrated with the physical key or vendor Logo.
The optical sensor 915 is used to collect ambient light intensity. In one embodiment, the processor 901 may control the display brightness of the touch screen 905 based on the ambient light intensity collected by the optical sensor 915. Specifically, when the ambient light intensity is high, the display brightness of the touch display screen 905 is increased; when the ambient light intensity is low, the display brightness of the touch display screen 905 is turned down. In another embodiment, the processor 901 can also dynamically adjust the shooting parameters of the camera assembly 906 according to the ambient light intensity collected by the optical sensor 915.
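As an illustration only, the ambient-light-driven brightness adjustment described above can be sketched as follows; the lux thresholds, step size, and function name are assumptions made for the example and are not part of this disclosure.

```python
# Minimal sketch of ambient-light-driven brightness control (illustrative only).
# The thresholds, step size, and function name are hypothetical.

def adjust_display_brightness(ambient_lux: float, brightness: float) -> float:
    """Turn display brightness up in bright surroundings and down in dim ones."""
    if ambient_lux > 500.0:              # high ambient light: increase brightness
        return min(1.0, brightness + 0.1)
    if ambient_lux < 50.0:               # low ambient light: decrease brightness
        return max(0.1, brightness - 0.1)
    return brightness                    # otherwise keep the current setting
```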
The proximity sensor 916, also known as a distance sensor, is typically provided on the front panel of the terminal 900. The proximity sensor 916 is used to collect the distance between the user and the front surface of the terminal 900. In one embodiment, when the proximity sensor 916 detects that the distance between the user and the front face of the terminal 900 is gradually decreasing, the processor 901 controls the touch display screen 905 to switch from the screen-on state to the screen-off state; when the proximity sensor 916 detects that the distance between the user and the front face of the terminal 900 is gradually increasing, the processor 901 controls the touch display screen 905 to switch from the screen-off state to the screen-on state.
Those skilled in the art will appreciate that the configuration shown in fig. 9 does not constitute a limitation of terminal 900, and may include more or fewer components than those shown, or may combine certain components, or may employ a different arrangement of components.
In an exemplary embodiment, there is also provided a computer-readable storage medium, such as a memory including instructions executable by a processor in a terminal to perform the method of image processing in the above-described embodiments. The computer readable storage medium may be non-transitory. For example, the computer-readable storage medium may be a ROM (Read-Only Memory), a RAM (Random Access Memory), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the storage medium may be a read-only memory, a magnetic disk or an optical disk.
The above description is only a preferred embodiment of the present application and should not be taken as limiting the present application, and any modifications, equivalents, improvements and the like that are made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (13)

1. A method of image processing, the method comprising:
determining target part information corresponding to a shot target image based on the shot target image and a pre-trained part information detection model, wherein the target part information is used for indicating a detection part presented by the target image;
determining a target image processing mode corresponding to the target part information based on the corresponding relation between the part information and the image processing mode;
and processing the target image based on the target image processing mode to obtain a processed target image.
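Read as a processing pipeline, claim 1 amounts to three steps: classify the captured frame to obtain the target part information, look up the image processing mode associated with that part, and apply the mode to the frame. The following sketch is purely illustrative and does not restate the claim; the interface `part_model.predict`, the lookup table `processing_modes`, and the method `mode.apply` are assumptions introduced for the example.

```python
# Illustrative flow for claim 1 (hypothetical names, not the claimed implementation).

def process_frame(target_image, part_model, processing_modes):
    """Classify the detection part, look up its processing mode, and apply it."""
    target_part = part_model.predict(target_image)   # part information for the captured image
    mode = processing_modes[target_part]             # correspondence: part information -> mode
    return mode.apply(target_image)                  # processed target image
```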
2. The method according to claim 1, wherein the determining target region information corresponding to the target image based on the captured target image and a pre-trained region information detection model comprises:
inputting the target image into a pre-trained part information detection model to obtain probability values corresponding to various part information respectively;
and determining target part information corresponding to the target image based on the probability value corresponding to each part information.
3. The method of claim 2, wherein the part information detection model comprises a plurality of models, wherein each part information detection model has different model attributes, and the model attributes comprise one or more of a model algorithm, a training sample, and a training amount;
inputting the target image into a pre-trained part information detection model to obtain probability values corresponding to various part information respectively, wherein the probability values comprise:
inputting the target image into a plurality of pre-trained part information detection models respectively to obtain probability values corresponding to various part information output by each part information detection model respectively;
the determining the target part information corresponding to the target image based on the probability value corresponding to each part information comprises:
for each piece of part information, determining a fusion probability value corresponding to the part information based on probability values output by different part information detection models corresponding to the part information;
and determining the target part information in a plurality of part information based on the fusion probability value corresponding to each part information.
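Claims 2 and 3 describe each detection model outputting a probability value per kind of part information, after which the per-model probabilities for the same part are fused. The sketch below illustrates one plausible fusion rule (a simple average); the claims do not fix a particular formula, and the model interface shown is an assumption.

```python
import numpy as np

# Illustrative fusion step for claim 3. Each model is assumed to return a dict
# {part_name: probability}; averaging is only one possible fusion rule.

def fuse_part_probabilities(target_image, models):
    per_model = [m.predict_proba(target_image) for m in models]   # one dict per model
    fused = {
        part: float(np.mean([probs[part] for probs in per_model]))
        for part in per_model[0]
    }
    return fused, per_model
```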
4. The method of claim 3, further comprising:
determining confidence corresponding to any part information based on a plurality of probability values determined by the plurality of part information detection models for any part information;
the determining the target region information in a plurality of region information based on the fusion probability value corresponding to each region information includes:
and determining the target part information in the plurality of part information based on the fusion probability value and the confidence coefficient corresponding to each part information.
5. The method according to claim 4, wherein the determining the target region information from a plurality of region information based on the fusion probability value and the confidence corresponding to each region information comprises:
and determining the part information with the highest corresponding fusion probability value in the plurality of part information, and if the confidence coefficient of the part information with the highest corresponding fusion probability value is greater than a preset confidence coefficient threshold, determining the part information with the highest corresponding fusion probability value as the target part information.
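Claims 4 and 5 additionally derive a confidence for each part from the several probability values produced by the models, and accept the part with the highest fusion probability only if its confidence exceeds a preset threshold. The sketch below continues the previous example; using 1 minus the standard deviation of the per-model probabilities as the confidence is an assumption, since the claims leave the exact definition open.

```python
import numpy as np

# Illustrative selection rule for claims 4-5, continuing the fusion sketch above.
# The confidence definition (1 - std of the per-model probabilities) is assumed.

def select_target_part(fused, per_model, confidence_threshold=0.8):
    confidence = {
        part: 1.0 - float(np.std([probs[part] for probs in per_model]))
        for part in fused
    }
    best_part = max(fused, key=fused.get)             # highest fusion probability value
    if confidence[best_part] > confidence_threshold:  # preset confidence threshold
        return best_part
    return None                                       # no target part information determined
```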
6. The method of claim 1, wherein the target image processing mode comprises a target value of a preset display parameter; the processing the target image based on the target image processing mode to obtain a processed target image comprises the following steps:
and adjusting the value of the preset display parameter corresponding to the target image to a target value to obtain a processed target image.
7. The method according to claim 1, wherein the target image processing mode corresponds to a target processing algorithm; the processing the target image based on the target image processing mode to obtain a processed target image includes:
and processing the target image based on the target processing algorithm to obtain the processed target image.
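Claims 6 and 7 distinguish two kinds of target image processing modes: one that sets preset display parameters to target values, and one that applies a dedicated target processing algorithm. A minimal way to model this split is sketched below; the class names and the `set_display_parameter` call on the image object are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical modelling of claims 6-7: a mode either carries target values for
# preset display parameters or wraps a target processing algorithm.

@dataclass
class ParameterMode:
    target_values: Dict[str, float]       # e.g. {"sharpening": 0.3, "denoising": 0.7}

    def apply(self, image):
        for name, value in self.target_values.items():
            image.set_display_parameter(name, value)   # assumed image API
        return image

@dataclass
class AlgorithmMode:
    algorithm: Callable                   # e.g. a defogging routine

    def apply(self, image):
        return self.algorithm(image)
```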
8. The method of claim 1, wherein the part information in the correspondence includes ear canal, nasal cavity, throat, ureter, normal abdominal cavity, abdominal cavity fog, and abdominal bleeding;
when the target part information is any one of the ear canal, the nasal cavity, and the ureter, the corresponding target image processing mode is: decreasing the sharpening intensity value corresponding to the target image, and increasing the dark-area brightening intensity value and the denoising intensity value corresponding to the target image;
when the target part information is the throat, the corresponding target image processing mode is: increasing the sharpening intensity value and the blood vessel enhancement intensity value corresponding to the target image;
when the target part information is the normal abdominal cavity, the corresponding target image processing mode is: increasing the sharpening intensity value, the blood vessel enhancement intensity value, the dark-area brightening intensity value, and the red saturation control intensity value corresponding to the target image;
when the target part information is the abdominal cavity fog, the corresponding target image processing mode is: processing the target image based on a defogging algorithm;
when the target part information is abdominal bleeding, the corresponding target image processing mode is: processing the target image based on a red saturation suppression algorithm.
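Claim 8 can be read as a correspondence table from detection parts and abdominal states to parameter adjustments or dedicated algorithms. A compact illustration follows; the adjustment directions follow the claim, while the concrete values and the defogging / red-saturation-suppression helper names are assumptions.

```python
# Illustrative correspondence table for claim 8. Signs follow the claim
# (negative = turn down, positive = turn up); the magnitudes and the helper
# names defog and suppress_red_saturation are hypothetical.

PROCESSING_MODES = {
    "ear_canal":          {"sharpening": -0.2, "dark_area_brightening": +0.2, "denoising": +0.2},
    "nasal_cavity":       {"sharpening": -0.2, "dark_area_brightening": +0.2, "denoising": +0.2},
    "ureter":             {"sharpening": -0.2, "dark_area_brightening": +0.2, "denoising": +0.2},
    "throat":             {"sharpening": +0.2, "vessel_enhancement": +0.2},
    "normal_abdomen":     {"sharpening": +0.2, "vessel_enhancement": +0.2,
                           "dark_area_brightening": +0.2, "red_saturation_control": +0.2},
    "abdominal_fog":      "defog",                      # apply a defogging algorithm
    "abdominal_bleeding": "suppress_red_saturation",    # apply a red saturation suppression algorithm
}
```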
9. The method of claim 1, wherein the target site information is further used to indicate a status of the detection site.
10. The method of claim 1, wherein after obtaining the processed target image, the method further comprises:
and displaying the processed target image.
11. An apparatus for performing image processing, the apparatus comprising:
the determination module is used for determining target part information corresponding to a shot target image based on the shot target image and a pre-trained part information detection model, wherein the target part information is used for indicating a detection part presented by the target image; determining a target image processing mode corresponding to the target part information based on the corresponding relation between the part information and the image processing mode;
and the processing module is used for processing the target image based on the target image processing mode to obtain a processed target image.
12. The apparatus of claim 11, wherein the determining module is configured to:
inputting the target image into a pre-trained part information detection model to obtain probability values corresponding to various part information respectively;
and determining the target part information corresponding to the target image based on the probability value corresponding to each part information.
The part information detection model comprises a plurality of part information detection models, wherein each part information detection model has different model attributes, and the model attributes comprise one or more of a model algorithm, a training sample and a training amount;
the determining module is configured to:
inputting the target image into a plurality of pre-trained part information detection models respectively to obtain probability values corresponding to a plurality of kinds of part information output by each part information detection model respectively;
for each part information, determining fusion probability values corresponding to the part information based on probability values output by different part information detection models corresponding to the part information;
determining the target part information in a plurality of part information based on the fusion probability value corresponding to each part information;
the determining module is further configured to:
determining confidence corresponding to any part information based on a plurality of probability values determined by the plurality of part information detection models for the any part information;
determining the target part information in the plurality of part information based on the fusion probability value and the confidence coefficient corresponding to each part information;
the determining module is configured to:
determining the part information with the highest corresponding fusion probability value in the plurality of part information, and if the confidence coefficient of the part information with the highest corresponding fusion probability value is greater than a preset confidence coefficient threshold, determining the part information with the highest corresponding fusion probability value as the target part information;
the target image processing mode comprises a target value of a preset display parameter; the processing module is configured to:
adjusting the value of a preset display parameter corresponding to the target image to a target value to obtain a processed target image;
alternatively,
the target image processing mode corresponds to a target processing algorithm; the processing module is configured to:
processing the target image based on the target processing algorithm to obtain the processed target image;
alternatively,
the part information in the corresponding relation comprises an ear canal, a nasal cavity, a throat, a ureter, a normal abdominal cavity, abdominal cavity fog and abdominal cavity bleeding; the processing module is configured to:
when the target part information is any one of the ear canal, the nasal cavity, and the ureter, the corresponding target image processing mode is: decreasing the sharpening intensity value corresponding to the target image, and increasing the dark-area brightening intensity value and the denoising intensity value corresponding to the target image;
when the target part information is the throat, the corresponding target image processing mode is: increasing the sharpening intensity value and the blood vessel enhancement intensity value corresponding to the target image;
when the target part information is the normal abdominal cavity, the corresponding target image processing mode is: increasing the sharpening intensity value, the blood vessel enhancement intensity value, the dark-area brightening intensity value, and the red saturation control intensity value corresponding to the target image;
when the target part information is the abdominal cavity fog, the corresponding target image processing mode is: processing the target image based on a defogging algorithm;
when the target part information is abdominal bleeding, the corresponding target image processing mode is: processing the target image based on a red saturation suppression algorithm;
the apparatus also includes a display module to:
and displaying the processed target image.
13. A computer device comprising a processor and a memory, the memory having stored therein at least one instruction that is loaded and executed by the processor to perform operations performed by a method of image processing according to any one of claims 1 to 10.
CN202110181081.XA 2021-02-09 2021-02-09 Method, device and equipment for processing image Pending CN114913113A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110181081.XA CN114913113A (en) 2021-02-09 2021-02-09 Method, device and equipment for processing image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110181081.XA CN114913113A (en) 2021-02-09 2021-02-09 Method, device and equipment for processing image

Publications (1)

Publication Number Publication Date
CN114913113A true CN114913113A (en) 2022-08-16

Family

ID=82761362

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110181081.XA Pending CN114913113A (en) 2021-02-09 2021-02-09 Method, device and equipment for processing image

Country Status (1)

Country Link
CN (1) CN114913113A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116091366A (en) * 2023-04-07 2023-05-09 成都华域天府数字科技有限公司 Multi-dimensional shooting operation video and method for eliminating moire
CN116091366B (en) * 2023-04-07 2023-08-22 成都华域天府数字科技有限公司 Multi-dimensional shooting operation video and method for eliminating moire

Similar Documents

Publication Publication Date Title
CN110502954B (en) Video analysis method and device
CN112911165B (en) Endoscope exposure method, device and computer readable storage medium
CN108616691B (en) Photographing method and device based on automatic white balance, server and storage medium
CN110659542B (en) Monitoring method and device
CN114693593A (en) Image processing method, device and computer device
CN112884666B (en) Image processing method, device and computer storage medium
CN110956580A (en) Image face changing method and device, computer equipment and storage medium
CN112614500A (en) Echo cancellation method, device, equipment and computer storage medium
CN113395382A (en) Method for data interaction between devices and related devices
CN111565309A (en) Display equipment and distortion parameter determination method, device and system thereof, and storage medium
CN112906682A (en) Method and device for controlling brightness of light source and computer storage medium
CN114913113A (en) Method, device and equipment for processing image
CN113518189B (en) Shooting method, shooting system, electronic equipment and storage medium
CN113364969B (en) Imaging method of non-line-of-sight object and electronic equipment
CN113364970B (en) Imaging method of non-line-of-sight object and electronic equipment
CN113627219A (en) Instrument detection method and device and computer equipment
CN111354378A (en) Voice endpoint detection method, device, equipment and computer storage medium
CN111860064A (en) Target detection method, device and equipment based on video and storage medium
CN112967261B (en) Image fusion method, device, equipment and storage medium
CN113824902A (en) Method, device, system, equipment and medium for determining time delay of infrared camera system
CN112243083B (en) Snapshot method and device and computer storage medium
CN111723615B (en) Method and device for judging matching of detected objects in detected object image
CN114757866A (en) Definition detection method, device and computer storage medium
CN111757146B (en) Method, system and storage medium for video splicing
CN110443841B (en) Method, device and system for measuring ground depth

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination