WO2018068511A1 - Image processing method and *** for gene sequencing - Google Patents

Image processing method and *** for gene sequencing

Info

Publication number
WO2018068511A1
WO2018068511A1 PCT/CN2017/085439 CN2017085439W WO2018068511A1 WO 2018068511 A1 WO2018068511 A1 WO 2018068511A1 CN 2017085439 W CN2017085439 W CN 2017085439W WO 2018068511 A1 WO2018068511 A1 WO 2018068511A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
bright spot
gene sequencing
processed
image processing
Prior art date
Application number
PCT/CN2017/085439
Other languages
English (en)
French (fr)
Inventor
徐伟彬
金欢
颜钦
姜泽飞
Original Assignee
深圳市瀚海基因生物科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市瀚海基因生物科技有限公司
Publication of WO2018068511A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/0012 Biomedical image inspection
    • G PHYSICS
    • G02 OPTICS
    • G02B OPTICAL ELEMENTS, SYSTEMS OR APPARATUS
    • G02B21/00 Microscopes
    • G02B21/36 Microscopes arranged for photographic purposes or projection purposes or digital imaging or video purposes including associated control and data processing arrangements
    • G02B21/365 Control or image processing arrangements for digital or video microscopes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/70 Denoising; Smoothing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/136 Segmentation; Edge detection involving thresholding
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B25/00 ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10056 Microscopic image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20024 Filtering details
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30004 Biomedical image processing
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B40/10 Signal processing, e.g. from mass spectrometry [MS] or from PCR

Definitions

  • the present invention relates to the field of gene sequencing technologies, and in particular, to an image processing method and system for gene sequencing and a computer readable storage medium.
  • Bright spot localization in images has important applications in gene sequencers and LED light points.
  • Image analysis is an important part of systems that use optical imaging principles for sequence determination.
  • The accuracy of bright spot localization directly determines the accuracy of gene sequencing.
  • Embodiments of the present invention aim to solve at least one of the technical problems existing in the prior art. To this end, embodiments of the present invention provide an image processing method and system for gene sequencing and a computer readable storage medium.
  • An image processing method for gene sequencing includes: an image preprocessing step of analyzing an input image to be processed to obtain a denoised image; and a bright spot detection step comprising: analyzing the image to be processed to calculate a bright spot determination threshold; analyzing the denoised image to obtain a candidate bright spot; and determining, according to the bright spot determination threshold, whether the candidate bright spot is a bright spot, and if so, calculating the sub-pixel center coordinates of the bright spot and/or the intensity value at the sub-pixel center coordinates, and if not, discarding the candidate bright spot.
  • In the above image processing method for gene sequencing, the image preprocessing step denoises the image, which reduces the amount of computation in the bright spot detection step; at the same time, judging whether a candidate bright spot is a bright spot against the bright spot determination threshold improves the accuracy of bright spot determination.
  • The image processing method of the present invention places no particular limitation on the image to be processed, that is, the original input data. It is applicable to the processing and analysis of images generated by any platform that performs nucleic acid sequence determination using optical detection principles, including but not limited to second-generation and third-generation sequencing. It offers high accuracy, high versatility and high precision, and can obtain more effective information from the image.
  • Known sequencing image processing methods and/or systems were mostly developed for second-generation sequencing platforms. The sequencing chips used for second-generation sequencing are generally patterned arrays, i.e., the probes on the sequencing chip are regularly arranged, and the captured images are pattern images, which are easy to process and analyze.
  • In addition, because second-generation sequencing generally includes amplification of the nucleic acid template, high-intensity bright spots can be obtained during image acquisition, and these are easy to identify and locate.
  • General second-generation sequencing image processing methods do not require high positioning accuracy; selecting and locating some of the brighter spots is enough to achieve sequence determination.
  • In third-generation (single-molecule) sequencing, by contrast, the sequencing chip used is random, that is, the probes on the sequencing chip are randomly arranged, and the captured images are random images, which are difficult to process and analyze.
  • Image processing and analysis is one of the most important factors determining the final sequencing efficiency of single-molecule sequencing. It places high demands on image processing and bright spot positioning: all bright spots in every image must be located accurately so that bases can be identified directly and data generated.
  • The image processing method of the present invention is suitable for both second-generation and third-generation sequencing, and is particularly advantageous for the random images of third-generation sequencing and for image processing with high precision requirements.
  • An image processing system for gene sequencing includes: an image preprocessing module configured to analyze an input image to be processed to obtain a denoised image, the image to be processed including at least one bright spot, and the bright spot having at least one pixel; and a bright spot detection module configured to analyze the image to be processed to calculate a bright spot determination threshold, analyze the denoised image to obtain a candidate bright spot, and determine, according to the bright spot determination threshold, whether the candidate bright spot is a bright spot.
  • In the above image processing system for gene sequencing, the image preprocessing module denoises the image, which reduces the amount of computation in the bright spot detection module; at the same time, judging whether a candidate bright spot is a bright spot against the bright spot determination threshold improves the accuracy of bright spot determination.
  • An image processing system for gene sequencing includes: a data input unit for inputting data; a data output unit for outputting data; a storage unit for storing data, the data including a computer executable program; and a processor for executing the computer executable program, where executing the computer executable program includes performing the method of any of the above embodiments.
  • The above image processing system for gene sequencing can improve the accuracy of bright spot determination.
  • A computer readable storage medium storing such a program can be used to detect bright spots and improve the accuracy of bright spot determination.
  • FIG. 1 is a schematic flow chart of an image processing method for gene sequencing according to an embodiment of the present invention
  • FIG. 2 is another schematic flow chart of an image processing method for gene sequencing according to an embodiment of the present invention.
  • FIG. 3 is a schematic flow chart of another embodiment of an image processing method for gene sequencing according to an embodiment of the present invention.
  • FIG. 4 is a schematic diagram showing a Mexican hat filter of an image processing method for gene sequencing according to an embodiment of the present invention
  • FIG. 5 is still another schematic flowchart of an image processing method for gene sequencing according to an embodiment of the present invention.
  • FIG. 6 is still another flow chart of an image processing method for gene sequencing according to an embodiment of the present invention.
  • FIG. 7 is a schematic diagram of 8-connected pixels in an image processing method for gene sequencing according to an embodiment of the present invention.
  • FIG. 9 is a schematic diagram of an image to be processed of an image processing method for gene sequencing according to an embodiment of the present invention.
  • FIG. 10 is a partial enlarged view of the image to be processed in FIG. 9;
  • FIG. 11 is a schematic diagram of an image with bright spots identified, in an image processing method for gene sequencing according to an embodiment of the present invention.
  • FIG. 12 is a partial enlarged view of the image identifying the bright spots in FIG. 11;
  • FIG. 13 is a block diagram showing an image processing system for gene sequencing according to an embodiment of the present invention.
  • FIG. 14 is another block diagram of an image processing system for gene sequencing according to an embodiment of the present invention.
  • FIG. 15 is another block diagram of an image processing system for gene sequencing according to an embodiment of the present invention.
  • FIG. 16 is still another block diagram of an image processing system for gene sequencing according to an embodiment of the present invention.
  • FIG. 17 is still another block diagram of an image processing system for gene sequencing according to an embodiment of the present invention.
  • The terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated.
  • Thus, features defined as "first" or "second" may explicitly or implicitly include one or more of the described features.
  • "A plurality" means two or more, unless specifically defined otherwise.
  • Unless otherwise explicitly specified and defined, the term "connection" is to be understood broadly: it may be, for example, a fixed connection, a detachable connection, or an integral connection; it may be a mechanical connection, an electrical connection, or mutual communication; and it may be a direct connection or an indirect connection through an intermediate medium, and may be internal communication between two elements or an interaction between two elements.
  • the "gene sequencing" and nucleic acid sequence determinations referred to in the embodiments of the present invention include DNA sequencing and/or RNA sequencing, including long fragment sequencing and/or short fragment sequencing.
  • the so-called “bright spot” refers to the light-emitting point on the image, and one light-emitting point occupies at least one pixel.
  • The terms "pixel point" and "pixel" are used interchangeably.
  • the image is from a sequencing platform that uses optical imaging principles for sequencing
  • the so-called sequencing platforms include, but are not limited to, sequencing platforms such as CG (Complete Genomics), Illumina/Solexa, Life Technologies ABI SOLiD, and Roche 454.
  • the detection of the so-called "bright spot” is the detection of an optical signal of an extended base or a base cluster.
  • the image is from a single molecule sequencing platform, such as Helicos
  • the input raw data is a parameter of a pixel point of the image
  • the detection of the so-called "bright spot” is the detection of a single molecule optical signal.
  • An image processing method for gene sequencing according to an embodiment of the present invention includes an image preprocessing step S11 and a bright spot detecting step S12.
  • The image preprocessing step S11 analyzes an input image to be processed to obtain a denoised image; the image to be processed includes at least one bright spot, and the bright spot has at least one pixel.
  • The bright spot detecting step S12 includes the steps of: S21, analyzing the image to be processed to calculate a bright spot determination threshold; S22, analyzing the denoised image to obtain a candidate bright spot; and S23, determining, according to the bright spot determination threshold, whether the candidate bright spot is a bright spot.
  • In the above image processing method for gene sequencing, the image preprocessing step denoises the image, which reduces the amount of computation in the bright spot detecting step; at the same time, judging whether a candidate bright spot is a bright spot against the bright spot determination threshold improves the accuracy of bright spot determination.
  • The input image to be processed may be a 16-bit TIFF image of 512*512 or 2048*2048 pixels, and the TIFF image may be a grayscale image. In this way, the processing performed by the image processing method for gene sequencing can be simplified.
  • The bright spot detecting step further includes: if the determination result is yes, S24, calculating the sub-pixel center coordinates of the bright spot and/or the intensity value at the sub-pixel center coordinates; if the determination result is no, S25, discarding the candidate bright spot. In this way, representing the center coordinates of the bright spot and/or the intensity value at the center coordinates at sub-pixel precision further improves the accuracy of the image processing method.
  • The image preprocessing step S11 includes a simplification step S01 and an image filtering step S02.
  • The simplification step S01 simplifies the image to be processed into a simplified image.
  • The image filtering step S02 filters the simplified image to obtain the denoised image.
  • The simplification step S01 reduces the subsequent amount of computation of the image processing method for gene sequencing, and the image filtering step S02 obtains the denoised image while retaining image detail features as far as possible, thereby improving the accuracy of the image processing method.
  • the simplified image is a binarized image
  • the image filtering step S02 performs Mexican hat filtering on the binarized image.
  • binarized images are easier to handle and have a wide range of applications.
  • Mexican hat filtering for binarized images is also easy to implement, reducing the cost of image processing methods for gene sequencing.
  • Mexican hat filtering can improve the contrast between the foreground and the background, making the foreground brighter and making the background darker.
  • the binarized image may include two values of 0 and 1 characterizing different attributes of the pixel, and the binarized image may be expressed as:
  • In Mexican hat filtering, Gaussian filtering is performed on the binarized image using an m*m window, and two-dimensional Laplacian sharpening is then performed on the Gaussian-filtered binarized image, where m is a natural number and an odd number greater than one.
  • In this way, Mexican hat filtering is achieved in two steps.
  • The Mexican hat kernel can be expressed as Equation 1, where x and y represent the coordinates of the pixel.
  • First, Gaussian filtering is performed on the binarized image using the m*m window, as shown in Equation 2, where t1 and t2 represent positions in the filtering window and w_{t1,t2} represents the Gaussian filtering weights.
  • The binarized image is then subjected to two-dimensional Laplacian sharpening, as shown in Equation 3, where K and k both represent Laplacian operators, which depend on the sharpening target. To strengthen or weaken the sharpening, modify K and k.
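  • The two-step filtering described above (Gaussian smoothing with an m*m window, then a Laplacian) can be sketched as follows. This is a minimal illustration and not the patent's Equations 1 to 3, which are not reproduced in this text: the Gaussian width `sigma`, the 4-neighbour Laplacian kernel, the sign convention (negated so that peaks give positive responses), and the edge-replication border handling are all assumptions, as are the function names.

```python
import math

def gaussian_kernel(m, sigma):
    """Return a normalized m*m Gaussian weight window (m odd, m > 1)."""
    r = m // 2
    k = [[math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
          for x in range(-r, r + 1)] for y in range(-r, r + 1)]
    total = sum(sum(row) for row in k)
    return [[v / total for v in row] for row in k]

def convolve(img, kernel):
    """Apply a square, symmetric kernel to img, replicating edge pixels."""
    h, w = len(img), len(img[0])
    r = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for t1 in range(-r, r + 1):
                for t2 in range(-r, r + 1):
                    yy = min(max(y + t1, 0), h - 1)  # clamp to image rows
                    xx = min(max(x + t2, 0), w - 1)  # clamp to image columns
                    acc += kernel[t1 + r][t2 + r] * img[yy][xx]
            out[y][x] = acc
    return out

# Assumed 4-neighbour Laplacian, negated so bright peaks respond positively.
NEG_LAPLACIAN = [[0, -1, 0], [-1, 4, -1], [0, -1, 0]]

def mexican_hat_filter(img, m=3, sigma=1.0):
    """Step 1: Gaussian smoothing; step 2: (negated) Laplacian."""
    smoothed = convolve(img, gaussian_kernel(m, sigma))
    return convolve(smoothed, NEG_LAPLACIAN)

# Usage: a single set pixel in a 7*7 binarized image.
binary = [[0] * 7 for _ in range(7)]
binary[3][3] = 1
response = mexican_hat_filter(binary)
```

  • In this sketch the response is largest at the isolated set pixel and zero in flat regions, which matches the stated effect of raising foreground contrast against the background.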
  • When Gaussian filtering is performed, Equation 2 becomes:
  • The image preprocessing step S11 further includes a background subtraction step S00, which performs background subtraction on the image to be processed to obtain a background-subtracted image and replaces the image to be processed with the background-subtracted image. In this way, the noise of the image to be processed can be further reduced, and the accuracy of the image processing method for gene sequencing is higher.
  • The simplification step acquires a signal-to-noise ratio matrix from the background-subtracted image and simplifies the background-subtracted image according to the signal-to-noise ratio matrix to obtain the simplified image. In this way, a simplified image with less noise is obtained, and the image processing method for gene sequencing is more accurate.
  • The signal-to-noise ratio matrix can be expressed as Equation 4, where x and y represent the coordinates of the pixel, h represents the height of the image, w represents the width of the image, i ∈ w, and j ∈ h.
  • The binarized image can be obtained from the signal-to-noise ratio matrix, as shown in Equation 5.
  • Performing background subtraction on the image to be processed includes: determining the background of the image to be processed using an opening operation, and performing background subtraction on the image to be processed according to that background.
  • The opening operation eliminates small objects, separates objects at thin connections, and smooths the boundaries of larger objects without significantly changing their area, so the background image can be acquired more accurately.
  • Specifically, an a*a window (for example, a 15*15 window) is moved over the image to be processed f(x, y) (such as a grayscale image), and an opening operation (erosion followed by dilation) is used to estimate the background of the image to be processed, as shown in Equation 6 and Equation 7, where g(x, y) is the eroded grayscale image, f(x, y) is the original grayscale image, and B is the structuring element.
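  • The opening-based background estimate can be sketched as below. It uses the standard grayscale definitions with a flat square structuring element (erosion is the window minimum, dilation is the window maximum); clipping the window at the image border and the function names are assumptions of this sketch, and it is not a reproduction of the patent's Equations 6 and 7.

```python
def _window_reduce(img, a, reduce_fn):
    """Apply reduce_fn (min or max) over an a*a window clipped to the image."""
    h, w = len(img), len(img[0])
    r = a // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[yy][xx]
                    for yy in range(max(0, y - r), min(h, y + r + 1))
                    for xx in range(max(0, x - r), min(w, x + r + 1))]
            out[y][x] = reduce_fn(vals)
    return out

def erode(img, a):
    """Grayscale erosion with a flat a*a structuring element."""
    return _window_reduce(img, a, min)

def dilate(img, a):
    """Grayscale dilation with a flat a*a structuring element."""
    return _window_reduce(img, a, max)

def subtract_background(img, a=15):
    """Estimate the background by opening (erode, then dilate) and subtract it."""
    background = dilate(erode(img, a), a)
    return [[p - b for p, b in zip(row, brow)]
            for row, brow in zip(img, background)]

# Usage: one bright pixel on a uniform background of 10, with a 3*3 window.
image = [[10] * 9 for _ in range(9)]
image[4][4] = 110
result = subtract_background(image, a=3)
```

  • Because opening never exceeds the original image, the subtracted result is non-negative; structures smaller than the window survive the subtraction while the slowly varying background is removed.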
  • the step of analyzing the image to be processed to calculate a bright spot determination threshold includes: processing the image to be processed by the Otsu method to calculate a bright spot determination threshold.
  • the search for the bright spot determination threshold is realized by a more mature and simple method, thereby improving the accuracy of the image processing method for gene sequencing and reducing the cost of the image processing method for gene sequencing.
  • the Otsu method can also be called the maximum inter-class variance method.
  • the Otsu method uses the largest variance between classes to segment the image, which means that the probability of misclassification is the smallest and the accuracy is high.
  • Assume the segmentation threshold between the foreground and background of the image to be processed is T, the proportion of pixels belonging to the foreground in the entire image is ω0 with average gray level μ0, and the proportion of pixels belonging to the background in the entire image is ω1 with average gray level μ1.
  • A traversal method is used to obtain the segmentation threshold T that maximizes the variance between classes, that is, the desired bright spot determination threshold T.
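  • The traversal can be sketched as follows, using the standard Otsu between-class variance ω0 * ω1 * (μ0 - μ1)^2. The convention that pixels with value greater than T are foreground, the `levels` parameter (256 here for brevity; 65536 would cover a 16-bit image), and the function name are assumptions of this sketch.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold T that maximizes the between-class variance.

    pixels: flat iterable of integer gray levels in [0, levels).
    Pixels <= T are treated as background, pixels > T as foreground.
    """
    hist = [0] * levels
    total = 0
    for p in pixels:
        hist[p] += 1
        total += 1
    sum_all = sum(i * c for i, c in enumerate(hist))

    best_t, best_var = 0, -1.0
    count_bg, sum_bg = 0, 0
    for t in range(levels):           # traverse every candidate threshold
        count_bg += hist[t]
        sum_bg += t * hist[t]
        count_fg = total - count_bg
        if count_bg == 0 or count_fg == 0:
            continue                  # one class is empty; variance undefined
        w1 = count_bg / total         # background proportion (omega_1)
        w0 = count_fg / total         # foreground proportion (omega_0)
        mu1 = sum_bg / count_bg       # background mean gray level (mu_1)
        mu0 = (sum_all - sum_bg) / count_fg  # foreground mean (mu_0)
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# Usage: a dim background population and a small bright population.
sample = [10] * 50 + [12] * 30 + [200] * 15 + [210] * 5
T = otsu_threshold(sample)
```

  • For the sample above, the selected threshold falls between the dim and bright populations, so the bright pixels are separated from the background.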
  • The step of determining whether the candidate bright spot is a bright spot according to the bright spot determination threshold includes: step S31, using an image reconstruction based method, finding, in the binarized image after Mexican hat filtering, pixels with greater than (m*m-1) connectivity and taking the pixels found as the centers of candidate bright spots, where each m*m window corresponds one-to-one to a bright spot and each value in the m*m window corresponds to one pixel; and step S32, determining whether the center of a candidate bright spot satisfies the condition I_max * A_BI * ceof_guass > T, where I_max is the strongest intensity at the center of the m*m window, A_BI is the proportion of the set value in the binarized image within the m*m window, ceof_guass is the correlation coefficient between the pixels of the m*m window and a two-dimensional Gaussian distribution, and T is the bright spot determination threshold.
  • If the above condition is satisfied, S33, the bright spot corresponding to the center of the candidate bright spot is determined to be a bright spot included in the image to be processed; if the above condition is not satisfied, S34, the bright spot corresponding to the center of the candidate bright spot is discarded. In this way, the detection of bright spots is achieved.
  • I_max can be understood as the strongest center intensity of the candidate bright spot.
  • In one example, m = 3, and pixels with greater than 8 connectivity are sought, as shown in the figure. The pixels found are used as the pixels of the candidate bright spots.
  • Here I_max is the strongest intensity at the center of the 3*3 window, A_BI is the proportion of the set value in the binarized image within the 3*3 window, and ceof_guass is the correlation coefficient between the pixels of the 3*3 window and a two-dimensional Gaussian distribution.
  • The set value in the binarized image may be the value assigned when a pixel satisfies a set condition.
  • For example, the binarized image may contain the two values 0 and 1, characterizing different attributes of the pixels; the set value is then 1, and A_BI is the proportion of 1s in the binarized image within the m*m window.
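  • The acceptance test on a candidate window can be sketched as below. Several details are assumptions of this sketch rather than statements of the patent: I_max is taken as the largest intensity in the m*m window, the correlation coefficient is the Pearson correlation against a sampled two-dimensional Gaussian of width `sigma`, and the variable `coef_gauss` corresponds to the coefficient the text writes as "ceof guass". Which image the intensity window is read from is also left open here.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if undefined)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / math.sqrt(va * vb)

def gaussian_template(m, sigma=1.0):
    """Flattened m*m samples of a centered two-dimensional Gaussian."""
    r = m // 2
    return [math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            for y in range(-r, r + 1) for x in range(-r, r + 1)]

def is_bright_spot(intensity_win, binary_win, threshold, sigma=1.0):
    """Evaluate I_max * A_BI * coef_gauss > T for one m*m candidate window."""
    m = len(intensity_win)
    flat = [v for row in intensity_win for v in row]
    i_max = max(flat)                                   # assumed: window maximum
    a_bi = sum(v == 1 for row in binary_win for v in row) / (m * m)
    coef_gauss = pearson(flat, gaussian_template(m, sigma))
    return i_max * a_bi * coef_gauss > threshold

# Usage: a peaked 3*3 window whose binarized pixels are all set.
peak = [[20, 60, 20], [60, 100, 60], [20, 60, 20]]
ones = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
accepted = is_bright_spot(peak, ones, threshold=50)
```

  • The product rewards windows that are bright, mostly set in the binarized image, and shaped like a Gaussian spot; a flat or inverted window yields a correlation of zero or below and is rejected.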
  • The step of calculating the sub-pixel center coordinates of the bright spot and/or the intensity value at the sub-pixel center coordinates includes: calculating the sub-pixel center coordinates of the bright spot using quadratic function interpolation, and/or calculating the intensity value at the sub-pixel center coordinates using quadratic spline interpolation.
  • Quadratic function and/or quadratic spline interpolation can further improve the accuracy of bright spot determination.
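  • One common form of quadratic function interpolation fits a parabola through the peak pixel and its two neighbours along each axis and takes the vertex as the sub-pixel center. The sketch below assumes this separable three-point fit; the text does not specify the exact scheme, the function names are hypothetical, and the quadratic spline interpolation of the intensity value is not covered here.

```python
def parabolic_offset(left, center, right):
    """Vertex offset (in pixels) of the parabola through three samples.

    The samples are taken at positions -1, 0 and +1 around the peak pixel.
    """
    denom = left - 2.0 * center + right
    if denom == 0:
        return 0.0  # the three samples are collinear; keep the integer position
    return 0.5 * (left - right) / denom

def subpixel_center(img, x, y):
    """Refine the integer peak (x, y) to sub-pixel coordinates.

    Assumes (x, y) is not on the image border.
    """
    dx = parabolic_offset(img[y][x - 1], img[y][x], img[y][x + 1])
    dy = parabolic_offset(img[y - 1][x], img[y][x], img[y + 1][x])
    return x + dx, y + dy

# Usage: a synthetic quadratic spot whose true center is (2.3, 1.8).
spot = [[100 - 4 * (x - 2.3) ** 2 - 4 * (y - 1.8) ** 2 for x in range(5)]
        for y in range(4)]
cx, cy = subpixel_center(spot, 2, 2)
```

  • For an exactly quadratic spot such as the one above, the three-point fit recovers the true center up to floating-point rounding; for real spots it is an approximation.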
  • The image processing method for gene sequencing further includes the step: S13, marking with an identifier the position in the image of the sub-pixel center coordinates of the bright spot. In this way, the user can conveniently check whether the bright spots are marked correctly and decide whether bright spot positioning needs to be performed again.
  • FIG. 9 is an image to be positioned, and FIG. 10 is an enlarged schematic view of a 293*173 region in the upper left corner of that image. FIG. 11 is an image in which the bright spots are marked with crosses (after bright spot positioning), and FIG. 12 is an enlarged schematic view of a 293*173 region in the upper left corner of the image shown in FIG. 11.
  • An image processing system 100 for gene sequencing includes: an image preprocessing module 102 configured to analyze an input image to be processed to obtain a denoised image, the image to be processed including at least one bright spot, and the bright spot having at least one pixel; and a bright spot detection module 104 configured to analyze the image to be processed to calculate a bright spot determination threshold, analyze the denoised image to obtain a candidate bright spot, and determine, according to the bright spot determination threshold, whether the candidate bright spot is a bright spot.
  • In the image processing system 100 for gene sequencing, the image preprocessing module 102 denoises the image, which reduces the amount of computation in the bright spot detection module 104; at the same time, judging whether a candidate bright spot is a bright spot against the bright spot determination threshold improves the accuracy of bright spot determination.
  • The bright spot detection module 104 is further configured to: if the determination result is yes, calculate the sub-pixel center coordinates of the bright spot and/or the intensity value at the sub-pixel center coordinates; if the determination result is no, discard the candidate bright spot. In this way, representing the center coordinates of the bright spot and/or the intensity value at the center coordinates at sub-pixel precision further improves the accuracy of the image processing system 100.
  • the image pre-processing module 102 includes a simplification module 106 and an image filtering module 108.
  • the simplification module 106 is for simplifying the image to be processed into a simplified image, and the image filtering module 108 is for filtering the simplified image to obtain a denoised image.
  • the simplification module 106 can reduce the amount of computation of the image processing system 100 for gene sequencing.
  • the image filtering module 108 can acquire the denoised image under the condition that the image detail features are retained as much as possible, thereby improving the accuracy of the image processing system 100.
  • the simplified image is a binarized image
  • the image filtering module 108 performs Mexican hat filtering on the binarized image.
  • binarized images are easier to handle and have a wide range of applications.
  • Mexican hat filtering of binarized images is also easy to implement, reducing the cost of the image processing system 100 for gene sequencing.
  • Mexican hat filtering can enhance the contrast between foreground and background, making the foreground brighter and making the background darker.
  • When performing Mexican hat filtering, the image filtering module 108 performs Gaussian filtering on the binarized image using an m*m window and then performs two-dimensional Laplacian sharpening on the Gaussian-filtered binarized image, where m is a natural number and an odd number greater than 1.
  • In this way, Mexican hat filtering is achieved in two steps.
  • The image preprocessing module 102 further includes a background subtraction module 110 for performing background subtraction on the image to be processed to obtain a background-subtracted image and replacing the image to be processed with the background-subtracted image. In this way, the noise of the image to be processed can be further reduced, and the accuracy of the image processing system 100 for gene sequencing is higher.
  • The simplification module 106 is configured to acquire a signal-to-noise ratio matrix from the background-subtracted image and to simplify the background-subtracted image according to the signal-to-noise ratio matrix to obtain the simplified image. In this way, a simplified image with less noise is obtained, and the accuracy of the image processing system 100 for gene sequencing is higher.
  • The background subtraction module 110 is configured to: determine the background of the image to be processed using an opening operation, and perform background subtraction on the image to be processed according to that background.
  • the open operation is used to eliminate small objects, separate objects at slender points, smooth the boundaries of larger objects, and does not significantly change the image area, so that the background image can be acquired more accurately.
  • The bright spot detection module 104 is configured to process the image to be processed by the Otsu method to calculate the bright spot determination threshold. In this way, the bright spot determination threshold is found by a mature and simple method, thereby improving the accuracy of the image processing system 100 for gene sequencing and reducing its cost.
  • The bright spot detection module 104 is configured to: using an image reconstruction based method, find, in the binarized image after Mexican hat filtering, pixels with greater than (m*m-1) connectivity and take the pixels found as the centers of candidate bright spots, where each m*m window corresponds one-to-one to a bright spot and each value in the m*m window corresponds to one pixel; and determine whether the center of a candidate bright spot satisfies the condition I_max * A_BI * ceof_guass > T, where I_max is the strongest intensity at the center of the m*m window, A_BI is the proportion of the set value in the binarized image within the m*m window, ceof_guass is the correlation coefficient between the pixels of the m*m window and a two-dimensional Gaussian distribution, and T is the bright spot determination threshold.
  • If the above condition is satisfied, the bright spot corresponding to the center of the candidate bright spot is determined to be a bright spot; if the above condition is not satisfied, the bright spot corresponding to the center of the candidate bright spot is discarded. In this way, the detection of bright spots is achieved.
  • The bright spot detection module 104 is configured to: calculate the sub-pixel center coordinates of the bright spot using quadratic function interpolation, and/or calculate the intensity value at the sub-pixel center coordinates using quadratic spline interpolation.
  • Quadratic function and/or quadratic spline interpolation can further improve the accuracy of bright spot determination.
  • The image processing system 100 for gene sequencing includes an identification module 112 for marking with an identifier the position in the image of the sub-pixel center coordinates of the bright spot. In this way, the user can conveniently check whether the bright spots are marked correctly and decide whether bright spot positioning needs to be performed again.
  • An image processing system 300 for gene sequencing includes: a data input unit 302 for inputting data; a data output unit 304 for outputting data; and a storage unit 306 for storing data.
  • The data includes a computer-executable program. A processor 308 executes the computer-executable program, and executing the program comprises performing the method of any of the above embodiments.
  • The above image processing system 300 for gene sequencing can improve the accuracy of bright spot determination.
  • A computer-readable storage medium stores a program for execution by a computer, and executing the program comprises performing the method of any of the above embodiments.
  • The above computer-readable storage medium can improve the accuracy of bright spot determination.
  • A computer-readable storage medium may be any apparatus that can contain, store, communicate, propagate, or transport the program for use by, or in conjunction with, an instruction execution system, apparatus, or device.
  • More specific examples of computer-readable media include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), a fiber optic device, and a portable compact disc read-only memory (CDROM).
  • The computer-readable medium may even be paper or another suitable medium on which the program is printed, because the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise suitably processing it where necessary, and then stored in computer memory.
  • The above-mentioned storage medium may be a read-only memory, a magnetic disk, an optical disk, or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Medical Informatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Biotechnology (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Molecular Biology (AREA)
  • Genetics & Genomics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Biophysics (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Optics & Photonics (AREA)
  • Image Processing (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

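An image processing method and system for gene sequencing. The method comprises: an image preprocessing step, which analyzes an input image to be processed to remove noise from it; and a bright spot detection step, which comprises: analyzing the image to be processed to calculate a bright spot determination threshold; analyzing the denoised image to obtain candidate pixels; and determining, according to the bright spot determination threshold, whether each candidate pixel is a bright spot; if so, calculating the sub-pixel center coordinates of the bright spot and the intensity value at those coordinates; if not, discarding the candidate pixel. Denoising the image in the preprocessing step reduces the computation required by the bright spot detection step, and judging candidate bright spots against the bright spot determination threshold improves the accuracy of bright spot determination.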

Description

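Image processing method and system for gene sequencing
Priority information
This application claims priority to and the benefit of prior patent application No. 201610882547.8, filed with the State Intellectual Property Office of China on October 10, 2016, the entire contents of which are incorporated herein by reference.
Technical field
The present invention relates to the technical field of gene sequencing, and in particular to an image processing method and system for gene sequencing and to a computer-readable storage medium.
Background
In the related art, image brightness localization has important applications in both gene sequencers and LED light spots.
In systems that determine sequences by optical imaging, image analysis is a key component. The accuracy of image brightness localization directly determines the accuracy of gene sequencing.
How to improve the accuracy of bright spot determination in images during nucleic acid sequencing has become one of the problems to be solved.
Summary of the invention
Embodiments of the present invention aim to solve at least one of the technical problems in the prior art. To this end, embodiments of the present invention provide an image processing method and system for gene sequencing and a computer-readable storage medium.
An image processing method for gene sequencing according to an embodiment of the present invention comprises: an image preprocessing step, which analyzes an input image to be processed to remove noise from it; and a bright spot detection step, which comprises: analyzing the image to be processed to calculate a bright spot determination threshold; analyzing the denoised image to obtain candidate pixels; and determining, according to the bright spot determination threshold, whether each candidate pixel is a bright spot; if so, calculating the sub-pixel center coordinates of the bright spot and the intensity value at those coordinates; if not, discarding the candidate pixel.
In the above method, denoising the image in the image preprocessing step reduces the computation required by the bright spot detection step, and judging candidate bright spots against the bright spot determination threshold improves the accuracy of bright spot determination.
This image processing method places no particular restriction on the image to be processed, i.e. the raw input data. It is suitable for processing and analyzing images produced by any platform that determines nucleic acid sequences by optical detection, including but not limited to second- and third-generation sequencing. It offers high accuracy, high generality and high precision, and can extract more useful information from the image.
In particular, currently known sequencing image processing methods and/or systems were essentially developed for images from second-generation sequencing platforms. The sequencing chips used in second-generation sequencing are generally arrayed, i.e. the probes on the chip are regularly arranged, so the captured images are pattern images that are easy to process and analyze. In addition, because second-generation sequencing generally includes amplification of the nucleic acid template, high-intensity bright spots are obtained during image acquisition and are easy to identify and localize. Image processing methods for second-generation sequencing therefore generally do not require high localization precision; sequencing can be achieved by selecting and localizing only some of the stronger, better-emitting points (bright spots).
For third-generation sequencing, i.e. single-molecule sequencing, the sequencing chips used are of the random type owing to the current state of chip surface treatment technology: the probes on the chip are irregularly arranged, and the captured images are random images that are difficult to process and analyze. Moreover, image processing and analysis is one of the most important factors determining the proportion of valid final sequences (reads) in single-molecule sequencing. The demands on image processing and bright spot localization are high: every bright spot in the image must be accurately localized so that bases can be identified directly and data generated.
The image processing method of the present invention is therefore applicable to both second- and third-generation sequencing, and is particularly advantageous for the random images and high-precision image processing required in third-generation sequencing.
An image processing system for gene sequencing according to an embodiment of the present invention comprises: an image preprocessing module for analyzing an input image to be processed to obtain a denoised image, the image to be processed containing at least one bright spot and each bright spot having at least one pixel; and a bright spot detection module for: analyzing the image to be processed to calculate a bright spot determination threshold, analyzing the denoised image to obtain candidate bright spots, and determining, according to the bright spot determination threshold, whether each candidate bright spot is a bright spot.
In the above system, denoising the image with the image preprocessing module reduces the computation required by the bright spot detection module, and judging candidate bright spots against the bright spot determination threshold improves the accuracy of bright spot determination.
An image processing system for gene sequencing according to an embodiment of the present invention comprises: a data input unit for inputting data; a data output unit for outputting data; a storage unit for storing data, the data including a computer-executable program; and a processor for executing the computer-executable program, where executing the program comprises performing the method of any of the above embodiments. This system can improve the accuracy of bright spot determination.
A computer-readable storage medium according to an embodiment of the present invention stores a program for execution by a computer, where executing the program comprises performing the method of any of the above embodiments. The computer-readable storage medium storing the program can be used to detect bright spots and improve the accuracy of bright spot determination.
Additional aspects and advantages of embodiments of the present invention are given in part in the description below, and in part will become apparent from that description or be learned through practice of the embodiments.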
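Brief description of the drawings
The above and/or additional aspects and advantages of embodiments of the present invention will become apparent and easy to understand from the following description of the embodiments in conjunction with the drawings, in which:
FIG. 1 is a schematic flowchart of the image processing method for gene sequencing according to an embodiment of the present invention;
FIG. 2 is another schematic flowchart of the image processing method for gene sequencing according to an embodiment of the present invention;
FIG. 3 is a further schematic flowchart of the image processing method for gene sequencing according to an embodiment of the present invention;
FIG. 4 is a schematic curve of the Mexican hat filter used in the image processing method for gene sequencing according to an embodiment of the present invention;
FIG. 5 is yet another schematic flowchart of the image processing method for gene sequencing according to an embodiment of the present invention;
FIG. 6 is still another schematic flowchart of the image processing method for gene sequencing according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of 8-connected pixels in the image processing method for gene sequencing according to an embodiment of the present invention;
FIG. 8 is a still further schematic flowchart of the image processing method for gene sequencing according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of an image to be processed by the image processing method for gene sequencing according to an embodiment of the present invention;
FIG. 10 is a partial enlarged view of the image to be processed in FIG. 9;
FIG. 11 is a schematic diagram of an image with bright spots marked by the image processing method for gene sequencing according to an embodiment of the present invention;
FIG. 12 is a partial enlarged view of the image with marked bright spots in FIG. 11;
FIG. 13 is a schematic module diagram of the image processing system for gene sequencing according to an embodiment of the present invention;
FIG. 14 is another schematic module diagram of the image processing system for gene sequencing according to an embodiment of the present invention;
FIG. 15 is yet another schematic module diagram of the image processing system for gene sequencing according to an embodiment of the present invention;
FIG. 16 is a further schematic module diagram of the image processing system for gene sequencing according to an embodiment of the present invention;
FIG. 17 is still another schematic module diagram of the image processing system for gene sequencing according to an embodiment of the present invention.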
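Detailed description
Embodiments of the present invention are described in detail below. Examples of the embodiments are shown in the drawings, in which identical or similar reference numerals denote identical or similar elements or elements with identical or similar functions throughout. The embodiments described below with reference to the drawings are exemplary, serve only to explain the present invention, and are not to be construed as limiting it.
In the description of the present invention, the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of technical features referred to. A feature qualified by "first" or "second" may therefore explicitly or implicitly include one or more of that feature. In the description of the present invention, "a plurality of" means two or more, unless otherwise expressly and specifically defined.
In the description of the present invention, unless otherwise expressly specified and defined, the terms "mounted", "coupled" and "connected" are to be understood broadly. A connection may, for example, be fixed, detachable, or integral; it may be mechanical, electrical, or a communication link; it may be direct or indirect through an intermediate medium; and it may be an internal connection between two elements or an interaction between two elements. A person of ordinary skill in the art can understand the specific meaning of these terms in the present invention according to the specific circumstances.
The present invention may repeat reference numerals and/or reference letters in different examples. This repetition is for simplicity and clarity and does not in itself indicate a relationship between the various embodiments and/or configurations discussed.
"Gene sequencing" as used in embodiments of the present invention is synonymous with nucleic acid sequence determination and includes DNA sequencing and/or RNA sequencing, and long-fragment sequencing and/or short-fragment sequencing.
A "bright spot" is a light-emitting point on an image; one light-emitting point occupies at least one pixel. The terms "pixel point" and "pixel" are used interchangeably.
In embodiments of the present invention, the image comes from a sequencing platform that determines sequences by optical imaging. Such platforms include, but are not limited to, the CG (Complete Genomics), Illumina/Solexa, Life Technologies ABI SOLiD and Roche 454 sequencing platforms. Detecting a "bright spot" means detecting the optical signal of an extended base or base cluster.
In one embodiment of the present invention, the image comes from a single-molecule sequencing platform such as Helicos, the raw input data are the parameters of the pixels of the image, and detecting a "bright spot" means detecting the optical signal of a single molecule.
Referring to FIG. 1, an image processing method for gene sequencing according to an embodiment of the present invention comprises: an image preprocessing step S11, which analyzes an input image to be processed to obtain a denoised image, the image to be processed containing at least one bright spot and each bright spot having at least one pixel; and a bright spot detection step S12, which comprises: S21, analyzing the image to be processed to calculate a bright spot determination threshold; S22, analyzing the denoised image to obtain candidate bright spots; and S23, determining, according to the bright spot determination threshold, whether each candidate bright spot is a bright spot. In this method, denoising the image in the image preprocessing step reduces the computation required by the bright spot detection step, and judging candidate bright spots against the bright spot determination threshold improves the accuracy of bright spot determination.
Specifically, in one example, the input image to be processed may be a 512*512 or 2048*2048 16-bit TIFF image, and the TIFF image may be a grayscale image. This simplifies the processing performed by the image processing method for gene sequencing.
In some embodiments of the image processing method for gene sequencing, referring to FIG. 2, the bright spot detection step further comprises: if the determination result is yes, S24, calculating the sub-pixel center coordinates of the bright spot and/or the intensity value at the sub-pixel center coordinates; if the determination result is no, S25, discarding the candidate bright spot. Representing the center coordinates of the bright spot and/or their intensity value at sub-pixel resolution further improves the accuracy of the image processing method.
In some embodiments of the image processing method for gene sequencing, referring to FIG. 3, the image preprocessing step S11 comprises a simplification step S01 and an image filtering step S02. The simplification step S01 simplifies the image to be processed into a simplified image, and the image filtering step S02 filters the simplified image to obtain the denoised image.
The simplification step S01 thus reduces the subsequent computation of the method, and the image filtering step S02 obtains the denoised image while preserving image detail as far as possible, which improves the accuracy of the image processing method.
In some embodiments of the image processing method for gene sequencing, the simplified image is a binarized image, and the image filtering step S02 applies Mexican hat filtering to the binarized image. A binarized image is easier to process and widely applicable. Mexican hat filtering of a binarized image is also easy to implement, which reduces the cost of the method; at the same time, Mexican hat filtering increases the contrast between foreground and background, making the foreground brighter and the background darker.
Specifically, in one example, the binarized image may contain two values, 0 and 1, that represent different attributes of a pixel, and may be expressed as: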
Figure PCTCN2017085439-appb-000001
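In some embodiments of the image processing method for gene sequencing, Mexican hat filtering is performed by applying Gaussian filtering to the binarized image with an m*m window and then applying two-dimensional Laplacian sharpening to the Gaussian-filtered binarized image, where m is a natural number and an odd number greater than 1. Mexican hat filtering is thus implemented in two steps.
Specifically, referring to FIG. 4, the Mexican hat kernel may be expressed as: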
Figure PCTCN2017085439-appb-000002
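Formula 1, where x and y denote the coordinates of a pixel. First, Gaussian filtering is applied to the binarized image with an m*m window, as shown in Formula 2: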
Figure PCTCN2017085439-appb-000003
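Formula 2, where t1 and t2 denote the position within the filter window and w_{t1,t2} denotes the Gaussian filter weight. Then two-dimensional Laplacian sharpening is applied to the binarized image, as shown in Formula 3: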
Figure PCTCN2017085439-appb-000004
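Formula 3, where K and k both denote Laplacian operators and depend on the sharpening target; to strengthen or weaken the sharpening, K and k are modified.
In one example, m=3, so m*m=3*3, and for Gaussian filtering Formula 2 becomes: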
Figure PCTCN2017085439-appb-000005
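The two-step Mexican hat filtering described above (Gaussian smoothing with a 3*3 window, then Laplacian sharpening) can be sketched as follows. This is only an illustration: the formula images for the weights are not reproduced in this text, so `GAUSS_3X3` and `SHARPEN_3X3` are assumed, commonly used kernels rather than the patent's values, and `convolve3x3` and `mexican_hat_filter` are hypothetical helper names. Plain Python lists are used to keep the sketch self-contained; a real pipeline would more likely use an array library.

```python
def convolve3x3(img, kernel):
    """Apply a 3x3 kernel to a 2D list, replicating edge pixels at the border."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)  # clamp row index
                    xx = min(max(x + dx, 0), w - 1)  # clamp column index
                    acc += kernel[dy + 1][dx + 1] * img[yy][xx]
            out[y][x] = acc
    return out


# Assumed weights (not taken from the patent): a common 3x3 Gaussian
# approximation and a 4-neighbour Laplacian sharpening kernel.
GAUSS_3X3 = [[1 / 16, 2 / 16, 1 / 16],
             [2 / 16, 4 / 16, 2 / 16],
             [1 / 16, 2 / 16, 1 / 16]]
SHARPEN_3X3 = [[0, -1, 0],
               [-1, 5, -1],
               [0, -1, 0]]


def mexican_hat_filter(binary_img):
    """Gaussian smoothing followed by Laplacian sharpening (m = 3)."""
    smoothed = convolve3x3(binary_img, GAUSS_3X3)
    return convolve3x3(smoothed, SHARPEN_3X3)


# Usage: a single set pixel in the middle of a 5x5 binarized image.
spot = [[0] * 5 for _ in range(5)]
spot[2][2] = 1
filtered = mexican_hat_filter(spot)
```

With these assumed kernels the response peaks at the set pixel and dips below zero two pixels away, which matches the stated effect of making the foreground brighter and the background darker.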
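In some embodiments of the image processing method for gene sequencing, referring to FIG. 5, the image preprocessing step S11 further comprises, before the simplification step S01, a background subtraction step S00. The background subtraction step S00 performs background subtraction on the image to be processed to obtain a background-subtracted image, which then replaces the image to be processed. This further reduces the noise in the image to be processed and makes the method more accurate.
In some embodiments of the image processing method for gene sequencing, the simplification step obtains a signal-to-noise ratio matrix from the background-subtracted image and simplifies the background-subtracted image according to that matrix to obtain the simplified image. This yields a simplified image with less noise and makes the method more accurate.
Specifically, in one example, the signal-to-noise ratio matrix may be expressed as: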
Figure PCTCN2017085439-appb-000006
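Formula 4, where x and y denote the coordinates of a pixel, h denotes the height of the image, w denotes the width of the image, i∈w and j∈h.
The binarized image can be obtained from the signal-to-noise ratio matrix, as shown in Formula 5: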
Figure PCTCN2017085439-appb-000007
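Formula 5.
In some embodiments of the image processing method for gene sequencing, performing background subtraction on the image to be processed comprises: determining the background of the image to be processed by an opening operation, and subtracting the background from the image to be processed. The opening operation removes small objects, separates objects at thin connections, and smooths the boundaries of larger objects without noticeably changing the image area, so the background-subtracted image is obtained more accurately.
Specifically, in an embodiment of the present invention, an a*a window (for example a 15*15 window) is moved over the image to be processed f(x,y) (for example a grayscale image), and the background of the image is estimated by an opening operation (erosion followed by dilation), as shown in Formulas 6 and 7:
$$g(x,y)=\mathrm{erode}[f(x,y),B]=\min\{f(x+x',y+y')-B(x',y')\mid(x',y')\in D_b\}$$
Formula 6, where g(x,y) is the eroded grayscale image, f(x,y) is the original grayscale image, and B is the structuring element;
$$g(x,y)=\mathrm{dilate}[f(x,y),B]=\max\{f(x-x',y-y')-B(x',y')\mid(x',y')\in D_b\}$$
Formula 7, where g(x,y) is the dilated grayscale image, f(x,y) is the original grayscale image, and B is the structuring element. The background noise is therefore
$$g=\mathrm{imopen}(f(x,y),B)=\mathrm{dilate}[\mathrm{erode}(f(x,y),B)]$$
Formula 8. The background is subtracted from the original image:
$$f=f-g=\{f(x,y)-g(x,y)\mid(x,y)\in D\}$$
Formula 9. The ratio matrix of the background-subtracted image to the background is then computed:
$$R=f/g=\{f(x,y)/g(x,y)\mid(x,y)\in D\}$$
Formula 10, where D denotes the dimensions of image f (height*width). From this the SNR matrix can be obtained: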
Figure PCTCN2017085439-appb-000008
Formula 11.
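The opening-based background estimate and the subtraction in Formulas 6 to 10 can be sketched as below. This is a minimal sketch under stated assumptions: the structuring element B is taken to be flat (all zeros), so erosion and dilation reduce to a sliding minimum and maximum; windows are clipped at the image border; and `eps` in `ratio_matrix` is a hypothetical guard against division by zero that the text does not mention. All function names are illustrative.

```python
def _window_reduce(img, size, reducer):
    """Apply min or max over a size*size window centred on each pixel.

    The window is clipped at the image borders.
    """
    h, w = len(img), len(img[0])
    r = size // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = reducer(
                img[yy][xx]
                for yy in range(max(0, y - r), min(h, y + r + 1))
                for xx in range(max(0, x - r), min(w, x + r + 1))
            )
    return out


def erode(img, size):
    """Grayscale erosion with a flat structuring element (sliding minimum)."""
    return _window_reduce(img, size, min)


def dilate(img, size):
    """Grayscale dilation with a flat structuring element (sliding maximum)."""
    return _window_reduce(img, size, max)


def estimate_background(img, size=15):
    """Opening: erosion followed by dilation, as in Formula 8."""
    return dilate(erode(img, size), size)


def subtract_background(img, size=15):
    """Return (background-subtracted image, background), as in Formula 9."""
    bg = estimate_background(img, size)
    sub = [[p - b for p, b in zip(row, bg_row)] for row, bg_row in zip(img, bg)]
    return sub, bg


def ratio_matrix(sub, bg, eps=1e-9):
    """Ratio of background-subtracted image to background, as in Formula 10.

    eps is an assumed guard for zero-valued background pixels.
    """
    return [[s / (b + eps) for s, b in zip(srow, brow)] for srow, brow in zip(sub, bg)]


# Usage: uniform background of 10 with one bright pixel of 50.
image = [[10] * 5 for _ in range(5)]
image[2][2] = 50
sub, bg = subtract_background(image, size=3)
```

In this toy image the single bright pixel is smaller than the window, so the opening removes it from the background estimate and the subtraction leaves only the spot.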
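In some embodiments of the image processing method for gene sequencing, the step of analyzing the image to be processed to calculate the bright spot determination threshold comprises: processing the image to be processed by the Otsu method to calculate the bright spot determination threshold. The threshold is thus found with a mature and simple method, which improves the accuracy and reduces the cost of the image processing method for gene sequencing.
Specifically, the Otsu method (OTSU algorithm), also called the maximum between-class variance method, segments an image by maximizing the between-class variance, which corresponds to the lowest probability of misclassification and therefore high accuracy. Let T be the segmentation threshold between the foreground and background of the image to be processed, let ω0 be the proportion of pixels in the whole image that belong to the foreground, with mean gray level μ0, and let ω1 be the proportion of pixels that belong to the background, with mean gray level μ1. Denoting the overall mean gray level of the image by μ and the between-class variance by var, we have:
$$\mu=\omega_0\mu_0+\omega_1\mu_1$$
Formula 12;
$$\mathrm{var}=\omega_0(\mu_0-\mu)^2+\omega_1(\mu_1-\mu)^2$$
Formula 13. Substituting Formula 12 into Formula 13 gives the equivalent Formula 14:
$$\mathrm{var}=\omega_0\omega_1(\mu_1-\mu_0)^2$$
Formula 14.
The segmentation threshold T that maximizes the between-class variance is found by exhaustive traversal and is taken as the bright spot determination threshold T.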
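The traversal over candidate thresholds can be sketched as follows, using the between-class variance of Formula 14. The function name `otsu_threshold`, the `levels` parameter, and the convention that pixels with value greater than T are foreground are assumptions made for illustration; for a 16-bit image `levels` would be 65536.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold T that maximizes var = w0 * w1 * (mu1 - mu0) ** 2.

    pixels: iterable of integer gray levels in [0, levels).
    Pixels with value <= T are treated as background, the rest as foreground.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    total_sum = sum(i * c for i, c in enumerate(hist))

    best_t, best_var = 0, -1.0
    count_bg, sum_bg = 0, 0
    for t in range(levels):
        count_bg += hist[t]
        sum_bg += t * hist[t]
        count_fg = total - count_bg
        if count_bg == 0 or count_fg == 0:
            continue  # one class is empty, the variance is undefined
        w1 = count_bg / total                   # background proportion (omega_1)
        w0 = count_fg / total                   # foreground proportion (omega_0)
        mu1 = sum_bg / count_bg                 # background mean (mu_1)
        mu0 = (total_sum - sum_bg) / count_fg   # foreground mean (mu_0)
        var = w0 * w1 * (mu1 - mu0) ** 2        # Formula 14
        if var > best_var:
            best_var, best_t = var, t
    return best_t


# Usage: a bimodal set of gray levels.
t = otsu_threshold([10] * 50 + [200] * 50)
```

Running sums over the histogram avoid recomputing the class means for every candidate T, so the traversal costs one pass over the gray levels.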
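In some embodiments of the image processing method for gene sequencing, referring to FIG. 6, the step of determining whether a candidate bright spot is a bright spot according to the bright spot determination threshold comprises: step S31, using an image-reconstruction-based method, searching the Mexican-hat-filtered binarized image for pixels with greater than (m*m-1) connectivity and taking each found pixel as the center of a candidate bright spot, where m*m corresponds one-to-one with a bright spot and each value in m*m corresponds to one pixel; and step S32, determining whether the center of the candidate bright spot satisfies the condition I_max*A_BI*ceof_guass > T, where I_max is the strongest intensity at the center of the m*m window, A_BI is the proportion of pixels in the m*m window of the binarized image that equal the set value, ceof_guass is the correlation coefficient between the pixels of the m*m window and a two-dimensional Gaussian distribution, and T is the bright spot determination threshold. If the condition is met, S33, the bright spot corresponding to the center of the candidate bright spot is determined to be a bright spot contained in the image to be processed; if the condition is not met, S34, the bright spot corresponding to the center of the candidate bright spot is discarded. In this way, bright spots are detected.
Specifically, I_max may be understood as the strongest intensity at the center of the candidate bright spot. In one example, m=3 and pixels with greater than 8-connectivity are searched for, as shown in FIG. 7. The found pixels are taken as the pixels of the candidate bright spots. I_max is the strongest intensity at the center of the 3*3 window, A_BI is the proportion of pixels in the 3*3 window of the binarized image that equal the set value, and ceof_guass is the correlation coefficient between the pixels of the 3*3 window and a two-dimensional Gaussian distribution.
The set value in the binarized image may be the value that a pixel takes when it satisfies a set condition. In another example, the binarized image may contain two values, 0 and 1, that represent different attributes of a pixel, the set value is 1, and A_BI is the proportion of 1s in the m*m window of the binarized image.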
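The acceptance test I_max*A_BI*ceof_guass > T can be sketched for one candidate window as below. Several choices here are assumptions, not taken from the source: I_max is read as the maximum intensity inside the m*m window, the correlation coefficient is computed as a Pearson correlation against an isotropic Gaussian template with a hypothetical `sigma` of 1.0, and all function names are illustrative.

```python
import math


def gaussian_template(m, sigma=1.0):
    """Flattened m*m samples of an isotropic 2D Gaussian centred on the window."""
    r = m // 2
    return [math.exp(-(dx * dx + dy * dy) / (2 * sigma * sigma))
            for dy in range(-r, r + 1) for dx in range(-r, r + 1)]


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def is_bright_spot(intensity_win, binary_win, threshold, set_value=1, sigma=1.0):
    """Evaluate I_max * A_BI * ceof_guass > T for one m*m candidate window.

    intensity_win and binary_win are m*m windows (lists of rows) centred on
    the candidate bright spot.
    """
    m = len(intensity_win)
    flat_i = [p for row in intensity_win for p in row]
    flat_b = [p for row in binary_win for p in row]
    i_max = max(flat_i)                                        # assumed reading of I_max
    a_bi = sum(1 for p in flat_b if p == set_value) / (m * m)  # share of set pixels
    coef = pearson(flat_i, gaussian_template(m, sigma))        # Gaussian likeness
    return i_max * a_bi * coef > threshold


# Usage: a peaked 3x3 window whose binarized pixels are all set.
peak = [[10, 20, 10], [20, 100, 20], [10, 20, 10]]
ones = [[1] * 3 for _ in range(3)]
accepted = is_bright_spot(peak, ones, threshold=50)
```

Multiplying the three factors means a candidate must be bright, well covered by set pixels, and roughly Gaussian in shape at the same time; a weak value in any one factor pulls the product below T.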
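In some embodiments of the image processing method for gene sequencing, the step of calculating the sub-pixel center coordinates of the bright spot and/or the intensity value at the sub-pixel center coordinates comprises: calculating the sub-pixel center coordinates of the bright spot by quadratic function interpolation, and/or calculating the intensity value at the sub-pixel center coordinates by quadratic spline interpolation. Using the quadratic function and/or quadratic spline method further improves the accuracy of bright spot determination.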
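One common way to realize quadratic function interpolation for a sub-pixel center is to fit a parabola through the peak pixel and its two neighbours along each axis and take the vertex. The sketch below assumes that reading; the source does not give the exact scheme, the function names are hypothetical, and the quadratic spline interpolation of the intensity value is not shown.

```python
def parabola_peak(left, centre, right):
    """Vertex of the parabola through (-1, left), (0, centre), (1, right).

    Returns (offset, peak_value); offset is relative to the centre sample.
    """
    denom = left - 2 * centre + right
    if denom == 0:
        return 0.0, float(centre)  # the three samples are collinear
    offset = 0.5 * (left - right) / denom
    peak = centre - 0.25 * (left - right) * offset
    return offset, peak


def subpixel_centre(img, x, y):
    """Refine an integer peak position (x, y) to sub-pixel coordinates.

    img is indexed as img[row][column]; (x, y) must not lie on the border.
    """
    dx, _ = parabola_peak(img[y][x - 1], img[y][x], img[y][x + 1])
    dy, _ = parabola_peak(img[y - 1][x], img[y][x], img[y + 1][x])
    return x + dx, y + dy


# Usage: samples of -(t - 0.25)**2 + 10 at t = -1, 0, 1 along the x axis.
window = [[0.0, 9.0, 0.0],
          [8.4375, 9.9375, 9.4375],
          [0.0, 9.0, 0.0]]
cx, cy = subpixel_centre(window, 1, 1)
```

For a peak that is locally parabolic the vertex is recovered exactly; for real spots it is an approximation whose error grows as the profile departs from a parabola.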
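In some embodiments of the image processing method for gene sequencing, referring to FIG. 8, the method further comprises: S13, marking with an identifier the position in the image of the sub-pixel center coordinates of the bright spot. This lets the user check whether the bright spots are marked correctly and decide whether bright spot localization needs to be performed again.
Specifically, in one example, a cross is used to mark the position in the image of the sub-pixel center coordinates of the bright spot. Referring to FIGS. 9, 10, 11 and 12, FIG. 9 is the image to be localized, and FIG. 10 is an enlarged view of the 293*173 region in the upper-left corner of the image in FIG. 9. FIG. 11 is the image with the bright spots marked by crosses (after bright spot localization), and FIG. 12 is an enlarged view of the 293*173 region in the upper-left corner of the image in FIG. 11.
Referring to FIG. 13, an image processing system 100 for gene sequencing according to an embodiment of the present invention comprises: an image preprocessing module 102 for analyzing an input image to be processed to obtain a denoised image, the image to be processed containing at least one bright spot and each bright spot having at least one pixel; and a bright spot detection module 104 for: analyzing the image to be processed to calculate a bright spot determination threshold, analyzing the denoised image to obtain candidate bright spots, and determining, according to the bright spot determination threshold, whether each candidate bright spot is a bright spot. In the image processing system 100, denoising the image with the image preprocessing module 102 reduces the computation required by the bright spot detection module 104, and judging candidate bright spots against the bright spot determination threshold improves the accuracy of bright spot determination.
It should be noted that the above explanation of the embodiments of the image processing method for gene sequencing also applies to the image processing system 100 for gene sequencing according to embodiments of the present invention and, to avoid redundancy, is not repeated in detail here.
In some embodiments of the image processing system 100 for gene sequencing, the bright spot detection module 104 is further configured to: if the determination result is yes, calculate the sub-pixel center coordinates of the bright spot and/or the intensity value at the sub-pixel center coordinates; if the determination result is no, discard the candidate bright spot. Representing the center coordinates of the bright spot and/or their intensity value at sub-pixel resolution further improves the accuracy of the image processing system 100.
In some embodiments of the image processing system 100 for gene sequencing, referring to FIG. 14, the image preprocessing module 102 comprises a simplification module 106 and an image filtering module 108.
The simplification module 106 simplifies the image to be processed into a simplified image, and the image filtering module 108 filters the simplified image to obtain the denoised image. The simplification module 106 thus reduces the subsequent computation of the image processing system 100, and the image filtering module 108 obtains the denoised image while preserving image detail as far as possible, which improves the accuracy of the image processing system 100.
In some embodiments of the image processing system 100 for gene sequencing, the simplified image is a binarized image, and the image filtering module 108 applies Mexican hat filtering to the binarized image. A binarized image is easier to process and widely applicable. Mexican hat filtering of a binarized image is also easy to implement, which reduces the cost of the image processing system 100; at the same time, Mexican hat filtering increases the contrast between foreground and background, making the foreground brighter and the background darker.
In some embodiments of the image processing system 100 for gene sequencing, the image filtering module 108 is configured, when performing Mexican hat filtering, to apply Gaussian filtering to the binarized image with an m*m window and to apply two-dimensional Laplacian sharpening to the Gaussian-filtered binarized image, where m is a natural number and an odd number greater than 1. Mexican hat filtering is thus implemented in two steps.
In some embodiments of the image processing system 100 for gene sequencing, referring to FIG. 15, the image preprocessing module 102 further comprises a background subtraction module 110, which performs background subtraction on the image to be processed to obtain a background-subtracted image that then replaces the image to be processed. This further reduces the noise in the image to be processed and makes the image processing system 100 more accurate.
In some embodiments of the image processing system 100 for gene sequencing, the simplification module 106 is configured to obtain a signal-to-noise ratio matrix from the background-subtracted image and to simplify the background-subtracted image according to that matrix to obtain the simplified image. This yields a simplified image with less noise and makes the image processing system 100 more accurate.
In some embodiments of the image processing system 100 for gene sequencing, the background subtraction module 110 is configured to: determine the background of the image to be processed by an opening operation, and subtract the background from the image to be processed. The opening operation removes small objects, separates objects at thin connections, and smooths the boundaries of larger objects without noticeably changing the image area, so the background-subtracted image is obtained more accurately.
In some embodiments of the image processing system for gene sequencing, the bright spot detection module 104 is configured to process the image to be processed by the Otsu method to calculate the bright spot determination threshold. The threshold is thus found with a mature and simple method, which improves the accuracy and reduces the cost of the image processing system 100.
In some embodiments of the image processing system for gene sequencing, the bright spot detection module 104 is configured to: using an image-reconstruction-based method, search the Mexican-hat-filtered binarized image for pixels with greater than (m*m-1) connectivity and take each found pixel as the center of a candidate bright spot, where m*m corresponds one-to-one with a bright spot and each value in m*m corresponds to one pixel; and determine whether the center of the candidate bright spot satisfies the condition I_max*A_BI*ceof_guass > T, where I_max is the strongest intensity at the center of the m*m window, A_BI is the proportion of pixels in the m*m window of the binarized image that equal the set value, ceof_guass is the correlation coefficient between the pixels of the m*m window and a two-dimensional Gaussian distribution, and T is the bright spot determination threshold. If the condition is met, the bright spot corresponding to the center of the candidate bright spot is determined to be a bright spot; if the condition is not met, the bright spot corresponding to the center of the candidate bright spot is discarded. In this way, bright spots are detected.
In some embodiments of the image processing system 100 for gene sequencing, the bright spot detection module 104 is configured to: calculate the sub-pixel center coordinates of the bright spot by quadratic function interpolation, and/or calculate the intensity value at the sub-pixel center coordinates by quadratic spline interpolation. Using the quadratic function and/or quadratic spline method further improves the accuracy of bright spot determination.
In some embodiments of the image processing system 100 for gene sequencing, referring to FIG. 16, the image processing system 100 comprises an identification module 112 configured to mark, with an identifier, the position in the image of the sub-pixel center coordinates of the bright spot. This lets the user check whether the bright spots are marked correctly and decide whether bright spot localization needs to be performed again.
Referring to FIG. 17, an image processing system 300 for gene sequencing according to an embodiment of the present invention comprises: a data input unit 302 for inputting data; a data output unit 304 for outputting data; a storage unit 306 for storing data, the data including a computer-executable program; and a processor 308 for executing the computer-executable program, where executing the program comprises performing the method of any of the above embodiments. The image processing system 300 can improve the accuracy of bright spot determination.
A computer-readable storage medium according to an embodiment of the present invention stores a program for execution by a computer, where executing the program comprises performing the method of any of the above embodiments. The computer-readable storage medium can improve the accuracy of bright spot determination. A computer-readable storage medium may be any apparatus that can contain, store, communicate, propagate, or transport a program for use by, or in conjunction with, an instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of computer-readable media include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), a fiber optic device, and a portable compact disc read-only memory (CDROM). In addition, the computer-readable medium may even be paper or another suitable medium on which the program is printed, because the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise suitably processing it where necessary, and then stored in computer memory. The above-mentioned storage medium may be a read-only memory, a magnetic disk, an optical disk, or the like.
In this specification, a description referring to the terms "one embodiment", "some embodiments", "illustrative embodiment", "example", "specific example" or "some examples" means that a specific feature, structure, material or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. Illustrative uses of these terms in this specification do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials or characteristics described may be combined in a suitable manner in any one or more embodiments or examples.
Although embodiments of the present invention have been shown and described above, it is to be understood that they are exemplary and are not to be construed as limiting the present invention. A person of ordinary skill in the art may change, modify, replace and vary the above embodiments within the scope of the present invention.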

Claims (26)

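  1. An image processing method for gene sequencing, characterized by comprising:
    an image preprocessing step, which analyzes an input image to be processed to obtain a denoised image, the image to be processed containing at least one bright spot, the bright spot having at least one pixel;
    a bright spot detection step, comprising the steps of:
    analyzing the image to be processed to calculate a bright spot determination threshold,
    analyzing the denoised image to obtain a candidate bright spot,
    determining, according to the bright spot determination threshold, whether the candidate bright spot is the bright spot.
  2. The image processing method for gene sequencing according to claim 1, characterized in that the bright spot detection step further comprises the steps of:
    if the determination result is yes, calculating the sub-pixel center coordinates of the bright spot and/or the intensity value at the sub-pixel center coordinates,
    if the determination result is no, discarding the candidate bright spot.
  3. The image processing method for gene sequencing according to claim 1, characterized in that the image preprocessing step comprises a simplification step and an image filtering step,
    the simplification step simplifying the image to be processed into a simplified image,
    the image filtering step filtering the simplified image to obtain the denoised image.
  4. The image processing method for gene sequencing according to claim 3, characterized in that the simplified image is a binarized image, and the image filtering step applies Mexican hat filtering to the binarized image.
  5. The image processing method for gene sequencing according to claim 4, characterized in that, when Mexican hat filtering is performed, Gaussian filtering is applied to the binarized image with an m*m window and two-dimensional Laplacian sharpening is applied to the Gaussian-filtered binarized image, m being a natural number and an odd number greater than 1.
  6. The image processing method for gene sequencing according to claim 3, characterized in that, before the simplification step, the image preprocessing step further comprises a background subtraction step, which performs background subtraction on the image to be processed to obtain a background-subtracted image, the background-subtracted image replacing the image to be processed.
  7. The image processing method for gene sequencing according to claim 6, characterized in that the simplification step obtains a signal-to-noise ratio matrix from the background-subtracted image and simplifies the background-subtracted image according to the signal-to-noise ratio matrix to obtain the simplified image.
  8. The image processing method for gene sequencing according to claim 6, characterized in that performing background subtraction on the image to be processed comprises:
    determining the background of the image to be processed by an opening operation,
    performing background subtraction on the image to be processed according to the background.
  9. The image processing method for gene sequencing according to any one of claims 1-8, characterized in that the step of analyzing the image to be processed to calculate a bright spot determination threshold comprises:
    processing the image to be processed by the Otsu method to calculate the bright spot determination threshold.
  10. The image processing method for gene sequencing according to claim 5, characterized in that the step of determining, according to the bright spot determination threshold, whether the candidate bright spot is the bright spot comprises:
    using an image-reconstruction-based method, searching the Mexican-hat-filtered binarized image for pixels with greater than (m*m-1) connectivity and taking each found pixel as the center of the candidate bright spot;
    determining whether the center of the candidate bright spot satisfies the condition I_max*A_BI*ceof_guass > T, where I_max is the strongest intensity at the center of the m*m window, A_BI is the proportion of pixels in the m*m window of the binarized image that equal the set value, ceof_guass is the correlation coefficient between the pixels of the m*m window and a two-dimensional Gaussian distribution, and T is the bright spot determination threshold,
    if the condition is met, determining that the bright spot corresponding to the center of the candidate bright spot is the bright spot,
    if the condition is not met, discarding the bright spot corresponding to the center of the candidate bright spot.
  11. The image processing method for gene sequencing according to claim 2, characterized in that the step of calculating the sub-pixel center coordinates of the bright spot and/or the intensity value at the sub-pixel center coordinates comprises:
    calculating the sub-pixel center coordinates of the bright spot by quadratic function interpolation, and/or calculating the intensity value at the sub-pixel center coordinates by quadratic spline interpolation.
  12. The image processing method for gene sequencing according to claim 2, characterized by further comprising the step of:
    marking with an identifier the position in the image of the sub-pixel center coordinates of the bright spot.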
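  13. An image processing system for gene sequencing, characterized by comprising:
    an image preprocessing module for analyzing an input image to be processed to obtain a denoised image, the image to be processed containing at least one bright spot, the bright spot having at least one pixel;
    a bright spot detection module for:
    analyzing the image to be processed to calculate a bright spot determination threshold,
    analyzing the denoised image to obtain a candidate bright spot,
    determining, according to the bright spot determination threshold, whether the candidate bright spot is the bright spot.
  14. The image processing system for gene sequencing according to claim 13, characterized in that the bright spot detection module is further configured to:
    if the determination result is yes, calculate the sub-pixel center coordinates of the bright spot and/or the intensity value at the sub-pixel center coordinates,
    if the determination result is no, discard the candidate bright spot.
  15. The image processing system for gene sequencing according to claim 13, characterized in that the image preprocessing module comprises a simplification module and an image filtering module,
    the simplification module being configured to simplify the image to be processed into a simplified image,
    the image filtering module being configured to filter the simplified image to obtain the denoised image.
  16. The image processing system for gene sequencing according to claim 15, characterized in that the simplified image is a binarized image, and the image filtering module applies Mexican hat filtering to the binarized image.
  17. The image processing system for gene sequencing according to claim 16, characterized in that the image filtering module is configured, when performing Mexican hat filtering, to apply Gaussian filtering to the binarized image with an m*m window and to apply two-dimensional Laplacian sharpening to the Gaussian-filtered binarized image, m being a natural number and an odd number greater than 1.
  18. The image processing system for gene sequencing according to claim 15, characterized in that the image preprocessing module further comprises a background subtraction module configured to perform background subtraction on the image to be processed to obtain a background-subtracted image, the background-subtracted image replacing the image to be processed.
  19. The image processing system for gene sequencing according to claim 18, characterized in that the simplification module is configured to obtain a signal-to-noise ratio matrix from the background-subtracted image and to simplify the background-subtracted image according to the signal-to-noise ratio matrix to obtain the simplified image.
  20. The image processing system for gene sequencing according to claim 18, characterized in that the background subtraction module is configured to:
    determine the background of the image to be processed by an opening operation,
    perform background subtraction on the image to be processed according to the background.
  21. The image processing system for gene sequencing according to any one of claims 13-20, characterized in that the bright spot detection module is configured to process the image to be processed by the Otsu method to calculate the bright spot determination threshold.
  22. The image processing system for gene sequencing according to claim 17, characterized in that the bright spot detection module is configured to:
    using an image-reconstruction-based method, search the Mexican-hat-filtered binarized image for pixels with greater than (m*m-1) connectivity and take each found pixel as the center of the candidate bright spot, where m*m corresponds one-to-one with a bright spot and each value in m*m corresponds to one pixel;
    determine whether the center of the candidate bright spot satisfies the condition I_max*A_BI*ceof_guass > T, where I_max is the strongest intensity at the center of the m*m window, A_BI is the proportion of pixels in the m*m window of the binarized image that equal the set value, ceof_guass is the correlation coefficient between the pixels of the m*m window and a two-dimensional Gaussian distribution, and T is the bright spot determination threshold,
    if the condition is met, determine that the bright spot corresponding to the center of the candidate bright spot is the bright spot,
    if the condition is not met, discard the bright spot corresponding to the center of the candidate bright spot.
  23. The image processing system for gene sequencing according to claim 14, characterized in that the bright spot detection module is configured to:
    calculate the sub-pixel center coordinates of the bright spot by quadratic function interpolation, and/or calculate the intensity value at the sub-pixel center coordinates by quadratic spline interpolation.
  24. The image processing system for gene sequencing according to claim 14, characterized in that the image processing system for gene sequencing comprises an identification module configured to:
    mark with an identifier the position in the image of the sub-pixel center coordinates of the bright spot.
  25. An image processing system for gene sequencing, characterized by comprising:
    a data input unit for inputting data;
    a data output unit for outputting data;
    a storage unit for storing data, the data including a computer-executable program;
    a processor for executing the computer-executable program, where executing the computer-executable program comprises performing the method according to any one of claims 1-12.
  26. A computer-readable storage medium, characterized in that it stores a program for execution by a computer, where executing the program comprises performing the method according to any one of claims 1-12.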
PCT/CN2017/085439 2016-10-10 2017-05-23 Image processing method and system for gene sequencing WO2018068511A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610882547.8 2016-10-10
CN201610882547 2016-10-10

Publications (1)

Publication Number Publication Date
WO2018068511A1 true WO2018068511A1 (zh) 2018-04-19

Family

ID=61898788

Family Applications (3)

Application Number Title Priority Date Filing Date
PCT/CN2017/085439 WO2018068511A1 (zh) 2016-10-10 2017-05-23 Image processing method and system for gene sequencing
PCT/CN2017/101056 WO2018068600A1 (zh) 2016-10-10 2017-09-08 Image processing method and system
PCT/CN2017/101054 WO2018068599A1 (zh) 2016-10-10 2017-09-08 Image processing method and system for gene sequencing

Family Applications After (2)

Application Number Title Priority Date Filing Date
PCT/CN2017/101056 WO2018068600A1 (zh) 2016-10-10 2017-09-08 Image processing method and system
PCT/CN2017/101054 WO2018068599A1 (zh) 2016-10-10 2017-09-08 Image processing method and system for gene sequencing

Country Status (3)

Country Link
CN (2) CN107945150B (zh)
HK (2) HK1247724A1 (zh)
WO (3) WO2018068511A1 (zh)

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3843033B1 (en) * 2018-08-22 2024-05-22 GeneMind Biosciences Company Limited Method for constructing sequencing template based on image, and base recognition method and device
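CN112288783B (zh) * 2018-08-22 2021-06-29 深圳市真迈生物科技有限公司 Method for constructing sequencing template based on image, and base recognition method and device
CN112289381B (zh) * 2018-08-22 2021-12-14 深圳市真迈生物科技有限公司 Method, device and computer product for constructing sequencing template based on image
CN112289377B (zh) * 2018-08-22 2022-11-15 深圳市真迈生物科技有限公司 Method, device and computer program product for detecting bright spots on an image
WO2020037573A1 (zh) * 2018-08-22 2020-02-27 深圳市真迈生物科技有限公司 Method, device and computer program product for detecting bright spots on an image
WO2020037570A1 (zh) * 2018-08-22 2020-02-27 深圳市真迈生物科技有限公司 Image registration method, device and computer program product
CN112285070B (zh) * 2018-08-22 2022-11-11 深圳市真迈生物科技有限公司 Method and device for detecting bright spots on an image, and image registration method and device
CN112823352B (zh) 2019-08-16 2023-03-10 深圳市真迈生物科技有限公司 Base recognition method and system, and sequencing system
CN113012757B (zh) 2019-12-21 2023-10-20 深圳市真迈生物科技有限公司 Method and system for identifying bases in nucleic acid
CN111951324B (zh) * 2020-07-30 2024-03-29 佛山科学技术学院 Method and system for detecting the packaging length of aluminum profiles
CN113034481A (zh) * 2021-04-02 2021-06-25 广州绿怡信息科技有限公司 Method and device for detecting blur in device images
CN113781351B (zh) * 2021-09-16 2023-12-08 广州安方生物科技有限公司 Image processing method, device and computer-readable storage medium
CN114166805B (zh) * 2021-11-03 2024-01-30 格力电器(合肥)有限公司 NTC temperature sensor detection method and device, NTC temperature sensor, and manufacturing method
CN114311572A (zh) * 2021-12-31 2022-04-12 深圳市新科聚合网络技术有限公司 Online inspection device for SMD LED injection-molded brackets and inspection method thereof
CN115294035B (zh) * 2022-07-22 2023-11-10 深圳赛陆医疗科技有限公司 Bright spot localization method, bright spot localization device, electronic device and storage medium
CN117721191B (zh) * 2024-02-07 2024-05-10 深圳赛陆医疗科技有限公司 Gene sequencing method, sequencing device, readable storage medium and gene sequencing system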

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
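CN105039147A (zh) * 2015-06-03 2015-11-11 西安交通大学 Device and method for a high-throughput gene sequencing base fluorescence image capture system
CN105205788A (zh) * 2015-07-22 2015-12-30 哈尔滨工业大学深圳研究生院 Denoising method for high-throughput gene sequencing images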

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
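JP2007315772A (ja) * 2006-05-23 2007-12-06 Canon Inc Fluorescence detection device and biochemical reaction analysis device
JP5499732B2 (ja) * 2009-06-23 2014-05-21 ソニー株式会社 Biological sample image acquisition device, biological sample image acquisition method and biological sample image acquisition program
CN102174384B (zh) * 2011-01-05 2014-04-02 深圳华因康基因科技有限公司 Method and system for controlling sequencing and signal processing of a gene sequencer
JP5413408B2 (ja) * 2011-06-09 2014-02-12 富士ゼロックス株式会社 Image processing device, program and image processing system
CN102354398A (zh) * 2011-09-22 2012-02-15 苏州大学 Gene chip processing method based on density center and adaptation
KR101348680B1 (ko) * 2013-01-09 2014-01-09 국방과학연구소 Target acquisition method for an image tracker and target acquisition device using the same
US20140349281A1 (en) * 2013-05-22 2014-11-27 Sunpower Technologies Llc System and Method for Dispensing Barcoded Solutions
CN104297249A (zh) * 2014-09-15 2015-01-21 浙江大学 Method for detecting and analyzing drug cardiotoxicity based on a cardiomyocyte sensor
WO2016107896A1 (en) * 2014-12-30 2016-07-07 Ventana Medical Systems, Inc. Systems and methods for co-expression analysis in immunoscore computation
CN105389581B (zh) * 2015-10-15 2019-08-06 哈尔滨工程大学 Intelligent recognition system for germ integrity of germ rice and recognition method thereof
CN105741266B (zh) * 2016-01-22 2018-08-21 北京航空航天大学 Method for rapid localization of cell nuclei in pathological images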

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
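CN105039147A (zh) * 2015-06-03 2015-11-11 西安交通大学 Device and method for a high-throughput gene sequencing base fluorescence image capture system
CN105205788A (zh) * 2015-07-22 2015-12-30 哈尔滨工业大学深圳研究生院 Denoising method for high-throughput gene sequencing images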

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YE, BINGGANG: "Raw image preprocessing experiments of high throughput sequencing image segmentation experiments", CHINA DOCTORAL DISSERTATIONS FULL-TEXT DATABASE (NON OFFICIAL TRANSLATION), no. 11, 15 November 2010 (2010-11-15), pages 34 - 35,46-49, ISSN: 1674-022x *

Also Published As

Publication number Publication date
CN107918931B (zh) 2021-11-09
WO2018068600A1 (zh) 2018-04-19
CN107918931A (zh) 2018-04-17
WO2018068599A1 (zh) 2018-04-19
HK1247724A1 (zh) 2018-09-28
CN107945150A (zh) 2018-04-20
HK1247722A1 (zh) 2018-09-28
CN107945150B (zh) 2021-11-09

Similar Documents

Publication Publication Date Title
WO2018068511A1 (zh) Image processing method and system for gene sequencing
EP3306566B1 (en) Method and system for processing image
US10783641B2 (en) Systems and methods for adaptive histopathology image unmixing
Lin et al. Hierarchical, model‐based merging of multiple fragments for improved three‐dimensional segmentation of nuclei
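WO2020037573A1 (zh) Method, device and computer program product for detecting bright spots on an image
CN110660072B (zh) Method and device for identifying straight-line edges, storage medium and electronic device
WO2021030952A1 (zh) Base recognition method and system, computer program product and sequencing system
CN108601509B (zh) Image processing device, image processing method and medium on which a program is recorded
WO2018103373A1 (zh) Method and device for identifying and counting single molecules
WO2020037572A1 (zh) Method and device for detecting bright spots on an image, and image registration method and device
WO2010017206A1 (en) Image analysis
JP5088329B2 (ja) Cell feature quantity calculation device and cell feature quantity calculation method
CN112289377B (zh) Method, device and computer program product for detecting bright spots on an image
WO2019181072A1 (ja) Image processing method, computer program and recording medium
WO2020037570A1 (zh) Image registration method, device and computer program product
Mace et al. Quantification of transcription factor expression from Arabidopsis images
WO2018103345A1 (zh) Method and device for identifying and counting single molecules
CN114945825A (zh) Cancer determination device, cancer determination method and program
WO2020037571A1 (zh) Method, device and computer program product for constructing sequencing template based on image
WO2020037574A1 (zh) Method for constructing sequencing template based on image, and base recognition method and device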
US11181463B2 (en) Image processing device, cell recognition apparatus, cell recognition method, and cell recognition program
DK2901415T3 (en) PROCEDURE FOR IDENTIFICATION OF CELLS IN A BIOLOGICAL Tissue
Bombrun et al. Decoding gene expression in 2D and 3D
CN111724366B (zh) Laser cavity identification method and device
WO2008016912A2 (en) Systems and methods of analyzing two dimensional gels

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17860312

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 08/08/2019)

122 Ep: pct application non-entry in european phase

Ref document number: 17860312

Country of ref document: EP

Kind code of ref document: A1