CN107430587A - Automated flow cytometry method and system - Google Patents
- Publication number
- CN107430587A (application CN201580075757.XA)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N15/00—Investigating characteristics of particles; Investigating permeability, pore-volume or surface-area of porous materials
- G01N15/10—Investigating individual particles
- G01N15/14—Optical investigation techniques, e.g. flow cytometry
- G01N15/1429—Signal processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/24323—Tree-organised classifiers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/245—Classification techniques relating to the decision surface
- G06F18/2453—Classification techniques relating to the decision surface non-linear, e.g. polynomial classifier
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/69—Microscopic objects, e.g. biological cells or cellular parts
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B35/00—ICT specially adapted for in silico combinatorial libraries of nucleic acids, proteins or peptides
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/60—In silico combinatorial chemistry
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N15/00—Investigating characteristics of particles; Investigating permeability, pore-volume or surface-area of porous materials
- G01N15/10—Investigating individual particles
- G01N15/14—Optical investigation techniques, e.g. flow cytometry
- G01N15/1456—Optical investigation techniques, e.g. flow cytometry without spatial resolution of the texture or inner structure of the particle, e.g. processing of pulse signals
- G01N15/1459—Optical investigation techniques, e.g. flow cytometry without spatial resolution of the texture or inner structure of the particle, e.g. processing of pulse signals the analysis being performed on a sample stream
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N15/00—Investigating characteristics of particles; Investigating permeability, pore-volume or surface-area of porous materials
- G01N15/10—Investigating individual particles
- G01N2015/1006—Investigating individual particles for cytology
Abstract
An automated method and system are provided for receiving an input of flow cytometry data and analyzing the data using a hierarchical arrangement of analysis elements, each of which classifies the data into different subpopulations using support vector machines in order to identify patterns in the data. The patterns may be used to generate a diagnostic prediction for a patient or to identify patterns in samples collected from multiple subjects.
Description
Related application
This application claims priority to U.S. Application No. 14/965,640, filed December 10, 2015, which is a non-provisional of U.S. Provisional Application No. 62/090,316, filed December 10, 2014, which is incorporated herein by reference in its entirety. This application is also related to the subject matter of U.S. Patent No. 8,628,810, which is incorporated herein by reference in its entirety.
Technical field
The present invention relates to a method and system for the automated analysis of distributional data, specifically flow cytometry data, using support vector machines.
Background of invention
Flow cytometry is the measurement of characteristics of minute particles suspended in a flowing liquid stream. A focused laser beam illuminates each moving particle, and light is scattered in all directions. Detectors placed forward of the intersection point or orthogonal to the laser beam receive the pulses of scattered light and generate signals that are input into a computer analyzer for interpretation. The total amount of forward-scattered light detected depends on particle size and refractive index but is closely correlated with the cross-sectional area of the particle as seen by the laser, whereas the amount of side-scattered light can indicate shape or granularity.
One of the most widely used applications of flow cytometry is cellular analysis for medical diagnostics, in which the particles of interest are cells suspended in a saline solution. Flow cytometry provides a high-throughput system for collecting large amounts of cellular data. Flow cytometry is an effective tool for detecting abnormalities (e.g., MM, CLL, LGL, AML, MDS, CMML, lymphocyte, MBL, etc.) in various types of specimens, including bone marrow, peripheral blood and tissue. Other properties of the cell, such as surface molecules or intracellular components, can also be accurately quantified if the cellular marker of interest can be labeled with a fluorescent dye; for example, an antibody-fluorescent dye conjugate may be used to attach to specific surface or intracellular receptors. Immunophenotyping, which characterizes cells at different stages of development using fluorescently labeled monoclonal antibodies against surface markers, is one of the most common applications of flow cytometry. Other dyes have been developed that bind to particular structures (e.g., DNA, mitochondria) or are sensitive to the local chemistry (e.g., Ca++ concentration, pH, etc.).
Although flow cytometry is widely used in medical diagnostics, it is also useful in non-medical applications, such as the analysis of water or other liquids. For example, seawater may be analyzed to identify the presence or types of bacteria or other organisms, milk may be analyzed to test for microbes, and fuels may be tested for contaminants or additives.
The laser beam that is used has a suitable color to excite the selected fluorochrome or fluorochromes. The quantity of fluorescence emitted can be correlated to the expression of the cellular marker in question. Each flow cytometer can usually detect many different fluorochromes simultaneously, depending on its configuration. In some instruments, multiple fluorochromes may be analyzed simultaneously by using multiple lasers emitting at different wavelengths. For example, the FACSCalibur™ flow cytometry system available from Becton Dickinson (Franklin Lakes, New Jersey) is a multi-color flow cytometer configured for four-color operation. The fluorescence emission from each cell is collected by a series of photomultiplier tubes, and the subsequent electrical events are collected and analyzed on a computer that assigns a fluorescence intensity value to each signal in Flow Cytometry Standard (FCS) data files. Data analysis involves identifying intersections or unions of polygonal regions in the marker hyperspace that are used to filter or "gate" the data and to define subsets of event subpopulations for further analysis or sorting.
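The gating operation described above can be sketched in a few lines of Python. This is a minimal illustration only, assuming two-parameter events and simple polygons; the function names (`polygon_gate`, `gate_and`, `gate_or`, `apply_gate`) and the sample coordinates are hypothetical and do not come from the patent or from any flow cytometry package.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]
Gate = Callable[[Point], bool]


def polygon_gate(vertices: Sequence[Point]) -> Gate:
    """Return a predicate that is True for points inside the polygon (ray casting)."""
    verts = list(vertices)

    def inside(p: Point) -> bool:
        x, y = p
        hit = False
        n = len(verts)
        for i in range(n):
            x1, y1 = verts[i]
            x2, y2 = verts[(i + 1) % n]
            # Only edges that straddle the horizontal line through p can be crossed.
            if (y1 > y) != (y2 > y):
                x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
                if x < x_cross:
                    hit = not hit
        return hit

    return inside


def gate_and(a: Gate, b: Gate) -> Gate:
    """Intersection of two gated regions."""
    return lambda p: a(p) and b(p)


def gate_or(a: Gate, b: Gate) -> Gate:
    """Union of two gated regions."""
    return lambda p: a(p) or b(p)


def apply_gate(events: Sequence[Point], gate: Gate) -> List[Point]:
    """Keep only the events that fall inside the gate."""
    return [e for e in events if gate(e)]


# Hypothetical two-parameter events, e.g. (forward scatter, side scatter).
events = [(1.0, 1.0), (3.0, 3.0), (5.0, 1.0), (9.0, 9.0)]
square_a = polygon_gate([(0, 0), (4, 0), (4, 4), (0, 4)])
square_b = polygon_gate([(2, 0), (6, 0), (6, 4), (2, 4)])

both = apply_gate(events, gate_and(square_a, square_b))
either = apply_gate(events, gate_or(square_a, square_b))
```

Representing each gate as a predicate keeps intersections and unions composable, so a subpopulation can be narrowed step by step in the way manual gating does.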
The International Society for Analytical Cytology (ISAC) has adopted the FCS data file standard for the common representation of flow cytometry (FCM) data. This standard is supported by all of the major analytical instruments for recording the measurements of a sample run through a cytometer, allowing researchers and clinicians to choose among numerous commercially available instruments and software without encountering major data compatibility issues. However, the standard does not describe a protocol for computational post-processing and data analysis.
Because of the large amount of data present in a flow cytometry analysis, it is often difficult to fully utilize the data through manual processes. The high dimensionality of the data also makes it infeasible to use traditional statistical methods and learning techniques such as artificial neural networks. The support vector machine is a kernel-based machine learning technique capable of processing high-dimensional data. With an appropriately designed kernel, the support vector machine can be an effective tool for handling flow data.
The flow data for a single case consists of multiple tubes. Each tube may include simultaneous measurements of several assays. When all assays are measured, more than 10^4 events are typically collected for each run, which can produce on the order of 10^6 measurements for analysis.
Conventional methods of analyzing flow data generally involve applying a "gating" procedure to the data to separate certain populations of cells, and manually examining a large set of 2D plots of the data, two parameters at a time. The features of flow cytometry data that are useful for diagnosis typically reside in the distribution of attribute values in a high-dimensional space. As a result, it is difficult for human readers to understand the convoluted, high-dimensional patterns in the data.
Modern technological advances such as flow cytometry have generated vast amounts of data in many different formats. The greatest challenge that this information explosion presents to computer and information scientists is to develop efficient methods for processing large data sets and extracting useful information. Traditional statistical methods, though effective on low-dimensional data, have proven inadequate for processing the "new data", which is often characterized by high complexity and high dimensionality. In particular, the so-called "curse of dimensionality" is a serious limitation of classical statistical tools. Machine learning represents a promising new paradigm in data processing and analysis for overcoming these limitations. It uses a "data-driven" approach to automatically "learn" a system, which can then be used to classify or make predictions on future data. The support vector machine (SVM) is a state-of-the-art machine learning technology that has revolutionized the field of machine learning and has provided real, effective solutions to many difficult data analysis problems.
The SVM combines the concept of an optimal hyperplane in a high-dimensional inner product space (often an infinite-dimensional Hilbert space) with the concept of a kernel function defined on the input space to achieve flexibility of data representation, computational efficiency and regularization of model capacity. The SVM can be used to solve both classification (pattern recognition) and regression (prediction) problems. A typical SVM pattern recognition setting is given below.
Given a set of training data:
(x_i, y_i), i = 1, 2, ..., m
SVM training can be formulated as the problem of finding an optimal hyperplane:
Using Lagrange multipliers, this is transformed into the dual problem:
Solving the quadratic programming problem yields the SVM solution:
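The formulas that accompany these three steps are not reproduced in this text. For reference only, the standard soft-margin SVM formulation is usually written as shown below. It assumes labels y_i in {-1, +1}, a feature map φ with kernel k(x, x') = ⟨φ(x), φ(x')⟩, a penalty parameter C and slack variables ξ_i. These symbols are assumptions introduced here, and the patent's own notation may differ.

```latex
\begin{align*}
\text{Primal:}\quad
  & \min_{w,\,b,\,\xi}\ \tfrac{1}{2}\langle w, w\rangle + C\sum_{i=1}^{m}\xi_i
    \quad\text{s.t.}\quad
    y_i\bigl(\langle w, \phi(x_i)\rangle + b\bigr) \ge 1 - \xi_i,\quad \xi_i \ge 0 \\
\text{Dual:}\quad
  & \max_{\alpha}\ \sum_{i=1}^{m}\alpha_i
    - \tfrac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{m}\alpha_i\alpha_j\, y_i y_j\, k(x_i, x_j)
    \quad\text{s.t.}\quad
    0 \le \alpha_i \le C,\quad \sum_{i=1}^{m}\alpha_i y_i = 0 \\
\text{Solution:}\quad
  & f(x) = \operatorname{sgn}\Bigl(\sum_{i=1}^{m}\alpha_i\, y_i\, k(x_i, x) + b\Bigr)
\end{align*}
```

The training inputs enter the dual problem and the decision function only through the kernel values k(x_i, x_j) and k(x_i, x).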
Because of the complexity of flow cytometry data, it is difficult to explicitly extract the essential features or to define the patterns that will predict the cytogenetics results. An SVM-based system offers a distinct advantage: it requires only the values of a similarity measure between examples in order to build a classifier.
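The point that a classifier can be built from pairwise similarity values alone can be illustrated with a small sketch. To stay short and dependency-free, the example below trains a kernel perceptron rather than an SVM. It is a simpler learner that shares the same dual form, and it is used here only as a stand-in. The similarity function, the toy data and all function names are hypothetical.

```python
import math
from typing import Callable, List, Sequence


def gram_matrix(items: Sequence, sim: Callable) -> List[List[float]]:
    """Pairwise similarity values; this is the only view of the data the learner gets."""
    return [[sim(a, b) for b in items] for a in items]


def train_kernel_perceptron(K: List[List[float]], y: Sequence[int],
                            epochs: int = 50) -> List[float]:
    """Learn dual coefficients from a Gram matrix K and labels in {-1, +1}."""
    m = len(y)
    alpha = [0.0] * m
    for _ in range(epochs):
        mistakes = 0
        for i in range(m):
            score = sum(alpha[j] * y[j] * K[j][i] for j in range(m))
            if y[i] * score <= 0:      # misclassified (or undecided): strengthen example i
                alpha[i] += 1.0
                mistakes += 1
        if mistakes == 0:              # every training example is classified correctly
            break
    return alpha


def predict(alpha: Sequence[float], y: Sequence[int],
            sims_to_train: Sequence[float]) -> int:
    """Classify a new example from its similarities to the training examples."""
    score = sum(a * yi * s for a, yi, s in zip(alpha, y, sims_to_train))
    return 1 if score > 0 else -1


def rbf(a: float, b: float) -> float:
    """Toy similarity measure; any positive-definite kernel could be substituted."""
    return math.exp(-(a - b) ** 2)


train_items = [0.0, 0.5, 3.0, 3.5]     # hypothetical 1-D examples
labels = [-1, -1, 1, 1]
alpha = train_kernel_perceptron(gram_matrix(train_items, rbf), labels)

low = predict(alpha, labels, [rbf(t, 0.2) for t in train_items])
high = predict(alpha, labels, [rbf(t, 3.2) for t in train_items])
```

Neither training nor prediction ever touches the raw examples. Swapping `rbf` for a similarity defined on whole event distributions would leave the learner unchanged.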
Summary of the invention
According to the present invention, a computer-assisted flow cytometry data analysis system is provided that automates most of the tedious steps of the analysis process using advanced machine learning techniques and other mathematical algorithms. Support vector machines (SVMs) with customized distributional kernels are used to detect abnormal flow distributions. Gaussian mixture models (GMMs) are used for automated clustering and gating. Special graph algorithms are used for automatic gate identification.
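As a rough illustration of how a Gaussian mixture model can cluster events without manual gates, the sketch below fits a two-component, one-dimensional mixture with the EM algorithm. It is a simplified assumption-laden example, not the patent's implementation: real flow data is multi-dimensional, the number of components would have to be chosen, and the synthetic data and function names are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple


def normal_pdf(x: float, mean: float, var: float) -> float:
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(xs: Sequence[float],
               iters: int = 100) -> Tuple[List[float], List[float], List[float]]:
    """Fit a two-component 1-D Gaussian mixture with EM; returns (weights, means, variances)."""
    n = len(xs)
    overall_mean = sum(xs) / n
    overall_var = sum((x - overall_mean) ** 2 for x in xs) / n
    weights = [0.5, 0.5]
    means = [min(xs), max(xs)]               # deterministic initialisation
    variances = [overall_var, overall_var]
    for _ in range(iters):
        # E-step: responsibility of each component for each event.
        resp = []
        for x in xs:
            p = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = p[0] + p[1]
            resp.append([p[0] / total, p[1] / total])
        # M-step: re-estimate the parameters from the soft assignments.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            variances[k] = max(var, 1e-6)    # floor to avoid a collapsed component
    return weights, means, variances


def assign(x: float, weights, means, variances) -> int:
    """Index of the component that most likely generated x (an automatic 'gate')."""
    scores = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
    return 0 if scores[0] >= scores[1] else 1


# Synthetic stand-in for one measured channel with two cell populations.
rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(200)] + [rng.gauss(5.0, 1.0) for _ in range(200)]
weights, means, variances = fit_gmm_1d(xs)
```

Once fitted, `assign` plays the role of a gate: each event is labeled by the component with the highest posterior weight instead of by a hand-drawn region.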
The system retains traditional features such as gate definition and adjustment, 2D plots and statistics tables. However, the system provides automation at every analysis step. In addition, the SVM approach facilitates analysis far beyond the 2D or 3D limits of conventional methods.
The system of the present invention provides automated flow cytometry data analysis, including automatic gate prediction, automatic determination of normal versus abnormal for each plot (each marker), automatic determination of abnormal results based on a summary table, and automatic determination of disease type based on the combination of abnormalities (summary table, individual plots and gate distributions). The system gives the user the ability to train and customize the normal versus abnormal designation. In some embodiments, the flow cytometry system provides means for distinguishing normal from abnormal by displaying flagged plots and values with visually distinguishable features. This may be achieved by highlighting, underlining, bold type, use of a specific color (e.g., red) or any other visually detectable indicator, so that abnormal results are clearly flagged for the system user. The flagged results are recorded in the associated patient record for evaluation by a pathologist, physician or other medical personnel.
The system of the present invention will help pathologists significantly improve the accuracy and efficiency of flow data analysis. It will also provide a powerful tool for discovering new patterns in flow cytometry.
Support vector machines are used to analyze flow cytometry data generated by conventional, commercially available flow cytometry devices. Examples of support vector machines are disclosed in, among others, U.S. Patent Nos. 6,760,715, 7,117,188 and 6,996,549. Exemplary systems for performing flow cytometry measurements are described in U.S. Patent Nos. 5,872,627 and 4,284,412, both of which are incorporated herein by reference. In the specific examples described herein, the data relates to a medical diagnostic application, particularly the detection of hematological conditions such as myelodysplastic syndrome (MDS). Flow cytometric immunophenotyping has proven to be an accurate and highly sensitive method for detecting quantitative and qualitative abnormalities in hematopoietic cells, even when combined morphology and cytogenetics are non-diagnostic. The automated flow cytometry data analysis system disclosed herein provides the ability to automatically analyze the large volumes of data generated during flow cytometry measurements, enhancing the accuracy, repeatability and versatility of flow cytometry methods. By enabling data mining and pattern recognition on massive volumes of flow cytometry data collected from many subjects, far beyond the limits of current methods, this ability not only increases the diagnostic value of flow cytometry but also extends its research applications.
In one aspect of the invention, a method for analyzing and classifying flow cytometry data, wherein the flow cytometry data comprise a plurality of features describing the data, includes: downloading an input data set comprising flow cytometry events for a cell population into a computer system comprising a processor and a storage device, wherein the processor is programmed to execute at least one support vector machine and to perform the following steps: defining a hierarchy of analysis elements, each analysis element corresponding to a different gating definition, wherein each analysis element applies a gating algorithm to classify cell subpopulations according to predetermined criteria for a combination of parameters, and wherein the classification is performed using a support vector machine with a distributional kernel; and generating, at a display device, an output display with an identification of the flow cytometry data classification. In some embodiments, the method further includes selecting a cell subpopulation and analyzing the selected subpopulation using a different analysis element that applies a different gating algorithm to further classify the subpopulation. In a preferred embodiment, the distributional kernel comprises a Bhattacharya affinity having the form:
$$\rho(p, q) = \frac{1}{8}(M_2 - M_1)^{T}\left[\frac{\Sigma_1 + \Sigma_2}{2}\right]^{-1}(M_2 - M_1) + \frac{1}{2}\ln\frac{\left|\frac{\Sigma_1 + \Sigma_2}{2}\right|}{\sqrt{|\Sigma_1|\,|\Sigma_2|}}$$
where p and q are input data points, M is the mean of the normal distribution, and Σ is the covariance matrix. The hierarchy may be a tree with multiple branches, and the method may further include a conclusion analysis step for combining the results produced by each branch into a diagnostic classification. The diagnostic classification may comprise the presence or absence of a disease. The different gating definitions may be selected from the group consisting of: sample tube identification, debris versus non-debris, granulocytes, monocytes, lymphocytes, negative marker intensity, and dim marker intensity.
In another aspect of the invention, a method for automatically analyzing flow cytometry data includes: detecting side scatter and forward scatter events for a sample comprising a plurality of cells; generating a plurality of plots of the side scatter and forward scatter events in two or three dimensions, the plurality of plots comprising flow cytometry data; processing the plurality of plots using a hierarchy of analysis elements, each analysis element corresponding to a different gating definition, wherein each analysis element applies a gating algorithm to classify cell subpopulations according to predetermined criteria for a combination of parameters, and wherein the classification is performed using a distributional kernel; and generating, at a display device, an output with an identification of one or more flow cytometry data classifications. The method may further include selecting a cell subpopulation and analyzing the selected subpopulation using a different analysis element that applies a different gating algorithm to further classify the subpopulation. In a preferred embodiment, the distributional kernel is a Bhattacharya affinity having the form:
$$\rho(p, q) = \frac{1}{8}(M_2 - M_1)^{T}\left[\frac{\Sigma_1 + \Sigma_2}{2}\right]^{-1}(M_2 - M_1) + \frac{1}{2}\ln\frac{\left|\frac{\Sigma_1 + \Sigma_2}{2}\right|}{\sqrt{|\Sigma_1|\,|\Sigma_2|}}$$
where p and q are input data points, M is the mean of the normal distribution, and Σ is the covariance matrix. The hierarchy may be a tree with multiple branches, and the method may further include a conclusion analysis step for combining the results produced by each branch into a diagnostic classification. The diagnostic classification may be the presence or absence of a disease. The different gating definitions are selected from the group consisting of: sample tube identification, debris versus non-debris, granulocytes, monocytes, lymphocytes, negative marker intensity, and dim marker intensity.
In yet another aspect of the invention, a system for automatically analyzing flow cytometry data includes: a computer processor in communication with a memory, the memory having stored therein flow cytometry data from a plurality of measurements performed on a plurality of samples comprising cells, the flow cytometry data comprising side scatter and forward scatter events; and a computer program product embodied on a non-transitory computer-readable medium, the computer program product comprising instructions for causing the computer processor to: receive the flow cytometry data; generate a plurality of plots of the side scatter and forward scatter events in two or three dimensions; process the plurality of plots using a hierarchy of analysis elements, each analysis element corresponding to a different gating definition, wherein each analysis element applies a gating algorithm to classify cell subpopulations within the sample according to predetermined criteria for a combination of parameters, and wherein the classification is performed using a distributional kernel; and generate, at a display device, an output with an identification of one or more flow cytometry data classifications of the cells. The computer program product may further comprise instructions for causing the computer processor to: select a cell subpopulation; and analyze the selected subpopulation using a different analysis element that applies a different gating algorithm to further classify the subpopulation. In a preferred embodiment, the distributional kernel comprises a Bhattacharya affinity having the form:
$$\rho(p, q) = \frac{1}{8}(M_2 - M_1)^{T}\left[\frac{\Sigma_1 + \Sigma_2}{2}\right]^{-1}(M_2 - M_1) + \frac{1}{2}\ln\frac{\left|\frac{\Sigma_1 + \Sigma_2}{2}\right|}{\sqrt{|\Sigma_1|\,|\Sigma_2|}}$$
where p and q are input data points, M is the mean of the normal distribution, and Σ is the covariance matrix. The hierarchy may be a tree with multiple branches, and the system may further include a conclusion analysis step for combining the results produced by each branch into a diagnostic classification. In some embodiments, the diagnostic classification comprises the presence or absence of a disease. The different gating definitions are selected from the group consisting of: sample tube identification, debris versus non-debris, granulocytes, monocytes, lymphocytes, negative marker intensity, and dim marker intensity. In some embodiments, the memory is associated with a flow cytometry instrument and is specific to a single subject, while in other embodiments, the memory may be a database configured to store cumulative flow cytometry data generated from samples collected from a plurality of subjects.
Brief description of the drawings
Fig. 1 is a schematic diagram of a system for automatically collecting and analyzing flow cytometry data according to the present invention.
Fig. 2 is an exemplary logarithmic display of the distribution of the relative populations in MDS flow cytometry.
Fig. 3 is a flow chart of a data analysis method according to the present invention.
Fig. 4 is a schematic diagram of an exemplary analysis hierarchy according to an embodiment of the invention.
Fig. 5 is a block diagram of the structure of each node of the tree of Fig. 4 according to an implementation of the system of the invention.
Figs. 6A and 6B are examples of analysis results generated by the system of the present invention.
Fig. 7 is a flow chart of an exemplary branch of an analysis tree according to an embodiment of the invention.
Figs. 8A-8E are sample screenshots of an exemplary analysis sequence for the branch of Fig. 7.
Fig. 9 is a sample screenshot of a three-dimensional plot produced by an embodiment of the flow cytometry system.
Fig. 10 is a sample screenshot of analysis results according to an embodiment of the invention.
Figs. 11A-11F are sample plots generated for six different analyses, where Figs. 11A-11C and Fig. 11F represent normal results, and Figs. 11D-11E are highlighted to indicate abnormal results.
Fig. 12 is a simple spreadsheet listing measured and calculated values for the different subpopulations.
Fig. 13 shows the parameters of one subpopulation and the corresponding flow cytometry data.
Fig. 14 shows the parameters of another subpopulation and the corresponding flow cytometry data.
Detailed Description
According to the present invention, a method and system are provided for analyzing flow cytometry data. In particular, the method includes creating a kernel for analyzing data that are distributional in nature. In flow cytometry applications, an input datum p is a collection of a large number of points in a space. For example, an image can be viewed as a set of points in a 2-dimensional space. After appropriate normalization, p can be viewed as a probability distribution. To define a kernel on two such input data p and q that captures the distributional trend, a function must be defined on p and q that measures the similarity between the two entire distributions, rather than only between individual points within those distributions.
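The normalization step can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that events are binned on a regular 2D grid; the function name and the `bin_width` parameter are hypothetical and are not taken from the described system.

```python
from collections import Counter
from typing import Dict, List, Tuple

def normalized_histogram(points: List[Tuple[float, float]],
                         bin_width: float) -> Dict[Tuple[int, int], float]:
    """Bin 2D points on a regular grid and scale the counts to sum to 1."""
    counts = Counter(
        (int(x // bin_width), int(y // bin_width)) for x, y in points
    )
    total = sum(counts.values())
    # Each grid cell now holds the fraction of events that fell into it.
    return {cell: n / total for cell, n in counts.items()}

# Four hypothetical events; two of them fall into grid cell (0, 0).
events = [(0.5, 0.5), (0.7, 0.2), (1.5, 0.5), (1.2, 1.8)]
dist = normalized_histogram(events, bin_width=1.0)
```

After this step the point set behaves like a discrete probability distribution, which is the form of input that a distribution-level similarity measure expects.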
One way to construct such a "distributional kernel" is to use a distance function (divergence) between the two distributions. If ρ(p, q) is a distance function, then the following is a kernel:
$$k(p, q) = e^{-\rho(p, q)} \qquad (1)$$
Many distance functions exist for measuring the difference between two probability distributions. The Kullback-Leibler divergence, Bhattacharya affinity, Jeffrey divergence, Mahalanobis distance, Kolmogorov variational distance, and expected conditional entropy are all examples of such distances. Given a distance function, a kernel can be constructed based on the above formula.
For example, a custom kernel can be constructed based on the Bhattacharya affinity. For normal distributions with mean M and covariance matrix Σ, the Bhattacharya affinity has the form:
$$\rho(p, q) = \frac{1}{8}(M_2 - M_1)^{T}\left[\frac{\Sigma_1 + \Sigma_2}{2}\right]^{-1}(M_2 - M_1) + \frac{1}{2}\ln\frac{\left|\frac{\Sigma_1 + \Sigma_2}{2}\right|}{\sqrt{|\Sigma_1|\,|\Sigma_2|}}$$
A new kernel is defined from this distance function using the above equation.
This distributional kernel is computationally efficient (it has linear complexity) and can handle large amounts of input data. Typical density estimation methods have computational complexity O(n²), which may be too high for some applications.
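The linear cost can be seen in a small Python sketch for 2D point sets. It assumes that each point set is summarized by a Gaussian fit (sample mean and covariance) and that the standard closed form of the Bhattacharyya distance between two normal distributions is the one intended. All function names are illustrative, and the sketch is not the patented implementation.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]

def mean_and_cov(points: List[Point]):
    """Linear-time estimate of the 2D sample mean and covariance."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return (mx, my), ((sxx, sxy), (sxy, syy))

def det2(m) -> float:
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def bhattacharyya_distance(p: List[Point], q: List[Point]) -> float:
    """Distance between the Gaussian fits of two 2D point sets."""
    (m1, s1), (m2, s2) = mean_and_cov(p), mean_and_cov(q)
    # Average covariance S = (S1 + S2) / 2.
    s = tuple(tuple((s1[i][j] + s2[i][j]) / 2 for j in range(2))
              for i in range(2))
    d = det2(s)
    dx, dy = m2[0] - m1[0], m2[1] - m1[1]
    # Quadratic form (M2 - M1)^T S^{-1} (M2 - M1) via the explicit 2x2 inverse.
    quad = (s[1][1] * dx * dx - 2 * s[0][1] * dx * dy + s[0][0] * dy * dy) / d
    return quad / 8 + 0.5 * math.log(d / math.sqrt(det2(s1) * det2(s2)))

def distributional_kernel(p: List[Point], q: List[Point]) -> float:
    """k(p, q) = exp(-rho(p, q)), as in equation (1)."""
    return math.exp(-bhattacharyya_distance(p, q))

# Hypothetical point sets: q is p shifted by (5, 5).
p = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 3.0)]
q = [(x + 5.0, y + 5.0) for x, y in p]
k_same = distributional_kernel(p, p)
k_shift = distributional_kernel(p, q)
```

Each point set is traversed a fixed number of times to obtain its mean and covariance, so the cost grows linearly with the number of events; the kernel evaluation itself then works only on the two small summaries.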
The distributional kernel of the present invention can be applied directly to SVMs or other machine learning systems to create classifiers and other predictive systems. The distributional kernel offers some distinct advantages over the standard kernels frequently used in SVMs and other kernel machines. It captures the similarity between the overall distributions of large data components, which may be critical in some applications.
Fig. 3 provides an exemplary process flow for analyzing flow cytometry data. As will be apparent to those skilled in the art, flow cytometry data are provided as an example of distributional data, and other types of distributional data can be processed and classified using the techniques described below.
The raw data generated by flow cytometer 106 are input (step 302) into a computer processing system comprising at least a memory and a processor programmed to execute one or more support vector machines. A typical personal computer (PC) or Apple-type processor is suitable for this processing. The input data set is divided into two parts, one for training the support vector machine and the other for testing the effectiveness of the training. In step 304, a feature selection algorithm is run on the training data set by executing one or more feature selection programs in the processor. In step 306, the training data set with the reduced feature set is processed using a support vector machine with a distributional kernel (such as a kernel based on the Bhattacharya affinity). The effectiveness of the training step is evaluated in step 308 by extracting, from the independent test data set, the data corresponding to the features selected in step 304 and processing the test data using the trained SVM with the distributional kernel. If the test results indicate a less than optimal solution, the SVM is retrained and retested until an optimal solution is obtained. If the training is determined to be satisfactory, live data corresponding to flow cytometry measurements made on a patient sample are input into the processor in step 310. In step 312, the features selected in step 304 are selected from the patient data and processed by the trained and tested SVM with the distributional kernel, the result being a classification of the patient sample as normal or abnormal. In step 314, a report summarizing the results is generated; the report may be displayed on computer monitor 122, provided as printed report 124, and/or transmitted by e-mail or another network file transfer system to a research or clinical laboratory, hospital, or physician's office. One- and two-dimensional histogram representations of the data groupings may also be displayed and/or printed. The results, together with the raw data, histograms and other patient data, may also be stored in computer memory or a database.
An optional additional diagnostic procedure can be combined with the flow cytometry data and results in the automated analysis system to provide enhanced confidence. Using a scheme similar to that disclosed in U.S. Patent No. 7,383,237, which is incorporated herein by reference, the results of the flow cytometry testing can be combined with other types of testing. Fig. 3 shows an optional flow path for performing computer-aided image analysis of cytogenetic data using SVMs, by extracting relevant features from chromosome images generated by conventional procedures (such as human chromosome karyotyping or fluorescence in situ hybridization (FISH)) in order to identify deletions, translocations, inversions and other abnormalities. In step 320, training image data are input into the computer processor, where they are pre-processed to identify and extract relevant features. In general, the training image data are pre-processed to identify relevant features (step 322), which are then used to train the image-processing SVM. Test image data are then used to verify that an optimal solution has been obtained (step 324). If not, step 324 is repeated and the SVM is retrained and retested. If the optimal solution has been achieved, live patient data are input (step 326), pre-processed (step 328), and classified (step 330).
In a preferred method, as described in U.S. Patent No. 7,383,237, each feature of interest within the image is separately pre-processed (step 322) and processed by an SVM optimized for that feature. The analysis results for all features of interest are combined in a second-level image-processing SVM to generate an output classifying the entire image. The trained SVM is tested using pre-processed image test data (step 324). If the solution is optimal, data corresponding to live patient data (from the same patient for whom flow cytometry was performed) are input into the processor (step 326). The patient image data are pre-processed (step 328) to identify the features of interest, and each feature of interest is processed by the trained first-level SVM optimized for that specific feature. The combined analysis results for the features of interest are input into the trained second-level image-processing SVM to generate an output classifying the entire image (step 330).
The results of step 330 can be communicated for storage in the patient's file in a database (step 316) and/or input into a second-level SVM for analysis in combination with the flow cytometry results from step 312. This second-level SVM will have been trained and tested using the training and test data, as indicated by the dashed lines between steps 308, 324 and 340. The results of step 316 and step 330 are combined for processing by the trained second-level SVM for combined analysis in step 342. The result of this combined processing is typically a binary output, e.g., normal or abnormal, diseased or not diseased. The combined result may be output for display (step 314) and/or input into a memory or database for storage (step 316). Additional optional secondary flow paths may be provided to incorporate other types of data and analysis, such as expert analysis and patient history, which can be combined to produce a final diagnostic or prognostic score or other output that can be used for screening, monitoring and/or treatment.
Example 1: Detection of Myelodysplastic Syndrome (MDS)
The object of this study was to investigate the potential association between the chromosomal abnormalities associated with myelodysplastic syndrome (MDS) in cytogenetics and the flow cytometry data. Such immunophenotyping is one of the most common applications of flow cytometry, and the protocols for sample collection and preparation are well known to those skilled in the art. Following the sequence shown in Fig. 1, bone marrow aspirate 102 from a patient suspected of having MDS is collected in saline or sodium heparin solution to create a cell suspension in a plurality of tubes 104 or other containers suitable for introducing the suspension into the flow cell of flow cytometer system 106. Reagents comprising monoclonal antibodies conjugated with different fluorochromes are introduced into the tubes, with each tube receiving a different antibody combination, each combination paired with one of several possible fluorochromes. Flow cytometers are commercially available from many manufacturers, including the FACSCalibur™ from Becton Dickinson (Franklin Lakes, New Jersey) and the Cytoron/Absolute™ from Ortho Diagnostics (Raritan, New Jersey). For this example, a FACSCalibur™ system was used for four-color measurement. As will be apparent to those skilled in the art, such systems provide automated handling of multiple samples loaded into a carousel, so the illustration is intended to be schematic, merely indicating the presence of a sample in the flow cytometer analyzer. Forward scatter detector 108 and side scatter detector 110 in flow cytometer system 106 generate electrical signals corresponding to the events detected as cells are directed through the analyzer stream. Fluorescence detectors included within side scatter detector 110 measure the amplitude of the fluorescent signals generated by the expression of antigens, as indicated by the antibodies conjugated with the different fluorescent markers. Numerical values are generated based on the pulse height (amplitude) measured by each detector. The resulting signals are input into a processor within computer workstation 120 and used to create histograms (single- or dual-parameter) corresponding to the detected events for display on graphics display monitor 122. Analysis of these data according to the present invention, which involves classifying the input data as normal or abnormal based on comparison with control samples, generates report 124, which can be printed or displayed on monitor 122. The raw data, histograms and report are also stored in the internal memory of computer workstation 120 or in a separate memory device, which may include database server 130, so that they can be associated with the patient's other records; the database server may be part of a data warehouse in a medical laboratory or other medical facility.
In the exemplary processing sequence, the input data set comprised 77 cases (patients) with both flow cytometry and cytogenetic data. All patients were suspected of having MDS. Of the 77 cases, 37 had chromosomal abnormalities as indicated by cytogenetic testing, which involves microscopic examination of whole chromosomes for changes in number or structure. The remaining 40 cases were found to be negative under cytogenetic testing.
The aspirated bone marrow suspension from each patient was distributed among 13 tubes. In a standard 4-color immunofluorescence protocol, forward light scatter (FSC) and right-angle light scatter (SSC) were collected along with 4-color antibody combinations to perform seven different assays, one of which was blank. Each case typically had 20,000-50,000 events, with all assays measured. The resulting flow cytometry data set for each case had approximately 10^6 measurements. Fig. 2 shows an exemplary histogram of side scatter versus CD45 expression, with the different cell populations marked.
For each of the 13 tubes, FSC and SSC were measured, allowing gating to exclude cell debris, as shown in the lower left corner of Fig. 2. In addition, a different combination of antigen specificities and fluorescent markers was used for each tube. Table 1 below lists the different combinations of monoclonal antibodies with the following markers: FITC (fluorescein isothiocyanate), PE (phycoerythrin), PerCP (peridinin-chlorophyll protein), and APC (allophycocyanin). Monoclonal antibodies conjugated with the identified fluorescent markers are commercially available from many different sources, including Becton-Dickinson Immunocytometry Systems (San Jose, California, USA), DakoCytomation (Carpinteria, California, USA), Caltag (Burlingame, California, USA), and Invitrogen Corporation (Camarillo, California, USA). CD45 antibody, used for enumerating mature lymphocytes, is included in each combination to verify lymphocyte gating.
Table 1
To provide data for SVM training and for evaluation of the training, the entire data set of 77 cases was divided into a training set and an independent test set. Forty cases (20 positive and 20 negative, as determined by cytogenetic testing) were used to train the SVM. The remaining 37 cases (17 positive and 20 negative) formed the independent test set.
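A split with these class counts can be sketched in Python as follows. The text does not say how cases were assigned to each set, so the random, seeded assignment below is an assumption, and the function name is hypothetical.

```python
import random
from typing import List, Tuple

def stratified_split(labels: List[int], n_train_per_class: int,
                     seed: int = 0) -> Tuple[List[int], List[int]]:
    """Pick the same number of training cases from every class;
    all remaining cases go to the independent test set."""
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        train += idx[:n_train_per_class]
        test += idx[n_train_per_class:]
    return sorted(train), sorted(test)

# 37 cytogenetically positive cases (+1) and 40 negative cases (-1).
labels = [1] * 37 + [-1] * 40
train_idx, test_idx = stratified_split(labels, n_train_per_class=20)
```

With 20 training cases per class, the test set is left with 17 positive and 20 negative cases, matching the counts given above.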
The aforementioned custom kernel based on the Bhattacharya affinity was used to analyze the flow cytometry data by measuring the difference between two probability distributions.
Including the data from all assays in the classifier does not produce the system with the best performance. Therefore, feature selection of the assays was performed based on the training set. Two performance measures were applied in the feature selection step. The first feature selection method, SVM leave-one-out (LOO), involves training an SVM on the initial data set and then updating the scaling parameters by performing a gradient step so that the LOO error decreases. These steps are repeated until a minimum LOO error is reached. A stopping criterion can be applied. The second feature selection method is kernel alignment. This technique is described in U.S. Patent No. 7,299,213, issued to Cristianini, which is incorporated herein by reference. Kernel alignment uses only the training data and can be performed before training of the kernel machine takes place.
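As a rough illustration of the second method, the Python sketch below computes the alignment between a kernel matrix K and the ideal target kernel yyᵀ, using the commonly cited Frobenius-inner-product definition. The formula is not spelled out in the text, so its use here, and the function name, are assumptions.

```python
import math
from typing import List

def kernel_alignment(K: List[List[float]], y: List[int]) -> float:
    """Alignment between kernel matrix K and the target kernel y y^T.

    For labels in {-1, +1} the Frobenius norm of y y^T equals n, so
    A = <K, y y^T>_F / (n * ||K||_F).
    """
    n = len(y)
    inner = sum(K[i][j] * y[i] * y[j] for i in range(n) for j in range(n))
    norm_k = math.sqrt(sum(K[i][j] ** 2 for i in range(n) for j in range(n)))
    return inner / (n * norm_k)

y = [1, 1, -1, -1]
ideal = [[yi * yj for yj in y] for yi in y]                 # K = y y^T
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
a_ideal = kernel_alignment(ideal, y)
a_identity = kernel_alignment(identity, y)
```

Only the training labels and the kernel matrix are needed, which is why the measure can be evaluated before any kernel machine is trained. One plausible use, assumed here, is to score the kernel built from each candidate assay and keep the assays with the highest alignment.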
During the feature selection process, it was determined that a large number of features did not contribute to accurate classification of the data. The results of the feature selection procedure are given in Table 2.
Table 2
A value of "1" in an entry of Table 2 indicates that the particular assay (tube/assay combination) was selected; "0" indicates that the assay was not selected. This reduced the number of features that need to be considered to classify the data for each case from the original 91 to 26. The data from the reduced number of assays were then used to train the SVM with the distributional kernel.
Using the selected assays, the trained SVM was then tested with the 37 independent cases. The results, with a cutoff of 0, were summarized using conventional statistical measures of the performance of a binary classification test. Sensitivity, or recall, measures the proportion of correctly classified positives relative to the total number of positives as determined by cytogenetic testing. Specificity measures the proportion of negatives that are correctly identified. The analysis results for the test data are as follows:
Sensitivity: 15/17 = 88%; Specificity: 19/20 = 95%
This yields an overall error rate of 3/37 = 8%. Using the estimated standard deviation of the binomial distribution, σ = 0.0449, the test error rate is less than 15% at a 95% confidence level.
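The reported figures can be reproduced from the counts with a few lines of Python. The counts come from the results above; the variable names are illustrative only.

```python
import math

# Test-set outcomes reported above: 17 positive and 20 negative cases.
true_pos, false_neg = 15, 2
true_neg, false_pos = 19, 1
n_test = true_pos + false_neg + true_neg + false_pos

sensitivity = true_pos / (true_pos + false_neg)   # correctly classified positives
specificity = true_neg / (true_neg + false_pos)   # correctly classified negatives
error_rate = (false_neg + false_pos) / n_test     # overall misclassification rate

# Standard deviation of the error-rate estimate under a binomial model.
sigma = math.sqrt(error_rate * (1 - error_rate) / n_test)
```

Here `sigma` evaluates to approximately 0.0449, the value quoted in the text.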
Fig. 4 shows the hierarchical structure of the system of the invention, represented by rooted tree 400. Each node 410 of the tree represents a basic analysis element that performs various tasks related to a specific gated subset of the flow data. Depending on the analysis performed at a given node, multiple branches can grow from the node. In the illustrated example, the starting node 410 separates into three branches 402, 404, 406. The number of nodes and branches on the tree will vary according to the parameters to be analyzed. For example, in branch 402, the second node separates into branches 402a and 402b. Branch 404 separates into three branches 404a, 404b and 404c at its second node, and branch 404b then separates into branches 404ba and 404bb at a third node. The tree structure reflects hierarchical gating. The input data at each node are the gating results of its parent node.
Fig. 5 shows the structure of each node 410 on the tree shown in Fig. 4. Each node includes a gate definition 502, a gated data set 504, a graphical plot 506 of the data, an SVM configuration 508, and a trained SVM data set 510.
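The parent-to-child flow of gated data can be sketched with a small Python data structure. This is a simplified illustration only: the class name, the parameter names and all thresholds are hypothetical, and the plot, SVM configuration and trained SVM data held by each node are omitted.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Event = Dict[str, float]

@dataclass
class GateNode:
    name: str
    gate: Callable[[Event], bool]                 # gate definition
    children: List["GateNode"] = field(default_factory=list)

    def apply(self, events: List[Event]) -> Dict[str, List[Event]]:
        """Gate the incoming events, then pass the result to every child."""
        gated = [e for e in events if self.gate(e)]
        results = {self.name: gated}
        for child in self.children:
            for path, subset in child.apply(gated).items():
                results[f"{self.name}/{path}"] = subset
        return results

# Hypothetical three-level branch with made-up thresholds.
tree = GateNode("tube1", lambda e: True, [
    GateNode("non_debris", lambda e: e["FS"] > 100 and e["SS"] > 50, [
        GateNode("lymphocytes", lambda e: e["CD45"] > 500 and e["SS"] < 200),
    ]),
])
events = [
    {"FS": 20, "SS": 10, "CD45": 5},       # debris-like event
    {"FS": 300, "SS": 100, "CD45": 900},   # lymphocyte-like event
    {"FS": 400, "SS": 600, "CD45": 400},   # other non-debris event
]
gated = tree.apply(events)
```

Because each child only ever sees the events that passed its parent's gate, the data at any node are the cumulative result of every gate on the path from the root.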
Example 2: Sample Results for a Standard Leukemia/Lymphoma Panel
Figs. 6A and 6B show exemplary results produced by the system of the present invention. The analysis software includes functions for reading data files in the standard FCS format. The software can also export results in various formats. Fig. 6A is split into multiple pages to provide sufficient resolution. In each case, the first page of the figure corresponds to the left panel 520 of the screenshot, the second page to the center panel 522, and the third page to the right panel 524. The left panel 520 shows the files corresponding to the gated data. As shown, the first gating parameter 526 is the sample tube number (tube 1, tube 2, ..., tube x). For example, this gating operation would correspond to the first node 410 in Fig. 4. The next gating 528 (sub-gating) is into non-debris and non-debris + debris, which would be, for example, the second node on branch 402a. The non-debris is then further sub-gated into monocytes and lymphocytes. Following the preceding example, this gating 530 and analysis would occur at the third node on branch 402a.
The center panel 522 of Fig. 6A shows the flow cytometry data with the different subpopulations identified by the parameters. In this case, the marker is CD45KO plotted against SS INT LIN (side scatter intensity, linear). The right panel 524 of Fig. 6A provides a table listing the parameters used in the gating and SVM analysis. As shown, the parameters SS INT LIN and CD45KO are checked under the heading "In SVM", indicating that the SVM analysis is performed on these parameters, which provide the data for p and q in the distributional kernel based on equation (3) above.
The bottom of the screenshot of Fig. 6B provides a list of exemplary possible markers (antibodies) in the screening panel for the illustrated test. Here, 24 markers are indicated: CD2, CD3, CD4, CD5, CD7, CD8, CD10, CD11c, CD13, CD14, CD16, CD19, CD20, CD23, CD33, CD34, CD38, CD45, CD56, CD64, CD117, HLA-DR, kappa and lambda, which represent a standard leukemia/lymphoma panel used to assist in leukemia and lymphoma diagnosis and post-treatment follow-up. Although not all of the markers may be shown in this screenshot, Fig. 6B shows a sample screenshot of analysis results, including two 2D flow cytometry plots: CD45KO versus SS INT LIN (upper left quadrant) and SS INT LIN versus FS INT LIN (upper right quadrant). In addition, as will be apparent to those skilled in the art, the selection of appropriate markers depends on the abnormality known or suspected to be present. For example, an extended leukemia/lymphoma panel can add CD11b, CD41, CD138, CD235a and FMC-7 to the markers listed for the standard panel. Smaller panels of selected markers can be used for prognosis and treatment monitoring. Regardless of which markers are used, the same basic procedure is followed to extract the relevant subspace of information from the large volume of data.
A portion of the software system facilitates the design of the gating structure, the configuration and training of the SVMs, and the default settings. Gating is defined as any process of selecting a subpopulation of cells based on given criteria on the observed parameters. Gating is an effective technique for reducing the complexity of the data and focusing the analysis on specific subsets of the data. However, to address all aspects of the analysis, there will typically be a large number of gates, and the gating structure itself can be complex.
The hierarchical structure of the system makes it flexible and easy to define a very general class of gates.
At each node, in step 502, a 2D gate is defined based on a selection of any two parameters. The 2D plot 506 is the basis for defining the gate.
The gated data 504 at a node are the cumulative result of the chain of gates at the series of nodes preceding the current node. Because each node defines a 2D gate on an arbitrary combination of parameters, the hierarchical scheme allows virtually any gating configuration to be defined.
For example, an FS (forward scatter) and SS (side scatter) gate can filter out debris. On the non-debris, another gate on FS and the CD45 marker can be defined to separate five subpopulations: CD45-Dim (dim marker), monocytes, CD45-Neg (negative marker), granulocytes and lymphocytes. The monocytes can feed a new node for further gating.
Fig. 7 provides a flowchart of a possible gating sequence within one branch of a tree such as tree 400 shown in Fig. 4. The illustrated branch includes three nodes, each having the structure of node 410 shown in Fig. 5, including an SVM processing step that separates the event data into the selected groups. For example, in step 650, side scatter (SS) and forward scatter (FS) events are detected and then plotted in step 652, producing a 2D image of the data distribution. Using the plot of the SS/FS data, in step 654, node #1 performs a gating operation that separates non-debris from debris. Fig. 8A shows this separation: the plot in the center panel of the screenshot shows the line between non-debris and debris. In step 656, the non-debris is selected, and a plot of the non-debris data evaluated for CD45 and SS INT LIN is then analyzed. The center panel of Fig. 8B shows this plot. In step 658, node #2 separates the non-debris data into five groups: granulocytes, monocytes, lymphocytes, CD45-Dim, and CD45-Neg. The plot in the center panel of Fig. 8C shows the groups identified by plotting the SS INT LIN data against the CD45KO marker. (Note the parameters checked under "In SVM" in the right panel of Fig. 8C: "SS INT LIN" and "CD45KO".) In the next step 660, the granulocyte data is excluded, and in node #3 the remaining monocyte data, plotted in the center panel of Fig. 8D, is gated (step 662) to separate the CD3 and CD5 cell-surface receptors. Fig. 8E provides the resulting plot, which shows the flow cytometry data gated into quadrants: % X+Y-, % X-Y+, % dual positive, and % dual negative. This breakdown is generated by SVM analysis of the plotted data using a distributional kernel. The upper portion of the right panel of Fig. 8E provides the numerical values of the distributional analysis.
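The quadrant percentages themselves are simple counts. The sketch below shows one way to compute them with fixed cut points; it is a threshold-based illustration only and does not reproduce the SVM analysis described above. The function name `quadrant_percentages` and the cut values are assumptions made for the example.

```python
from typing import Dict, List, Tuple


def quadrant_percentages(points: List[Tuple[float, float]],
                         x_cut: float, y_cut: float) -> Dict[str, float]:
    """Return the percentage of events in each quadrant around the cuts."""
    counts = {"X+Y-": 0, "X-Y+": 0, "dual+": 0, "dual-": 0}
    for x, y in points:
        x_pos, y_pos = x >= x_cut, y >= y_cut
        if x_pos and y_pos:
            counts["dual+"] += 1
        elif x_pos:
            counts["X+Y-"] += 1
        elif y_pos:
            counts["X-Y+"] += 1
        else:
            counts["dual-"] += 1
    total = len(points)
    if total == 0:
        return {name: 0.0 for name in counts}
    return {name: 100.0 * n / total for name, n in counts.items()}


# Four made-up events, one in each quadrant around the cut point (3, 3).
pts = [(5, 1), (6, 7), (1, 8), (0, 0)]
print(quadrant_percentages(pts, x_cut=3, y_cut=3))
```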
This process is repeated for each tube of the patient sample. Additional branches with different gate definitions can run in parallel; for example, branches can fork from node #1 to perform different separations. An optional final step combines the results of the tree branches to generate a diagnosis that takes into account the result reached at the end of each branch. In a preferred embodiment, this final analysis step is performed by a support vector machine, which generates a diagnostic score, a binary (e.g., positive or negative) result, a probability, a prognostic prediction, or another appropriate indicator of the subject's diagnosis or prognosis.
The following is an exemplary algorithm for automatic gate detection according to an embodiment of the invention:
The system automatically detects gate definitions from user-specified points and lines. Pseudocode for the algorithm is given below:
In some cases, gates may need to be adjusted for individual cases. Because of the large number of gates involved in an analysis, this can be a tedious process.
The system of the present invention provides a cluster-based automatic gate adjustment function. Gates in flow cytometry data are generally associated with cell clusters. Automatic clustering of the actual data provides an unbiased way to adjust the default gating panel appropriately.
A Gaussian mixture model (GMM) is a probability distribution formed as a weighted sum of Gaussian distributions:
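In standard notation, with symbols chosen here for illustration ($K$ components, mixing weights $w_k$, means $\mu_k$, and covariance matrices $\Sigma_k$), such a density can be written as:

$$p(x) = \sum_{k=1}^{K} w_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k), \qquad w_k \ge 0, \quad \sum_{k=1}^{K} w_k = 1.$$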
The parameters of a GMM can be determined by a learning algorithm known as the expectation-maximization (EM) algorithm. In statistics, the expectation-maximization algorithm is an iterative algorithm for finding maximum likelihood or maximum a posteriori (MAP) estimates of the parameters of a statistical model, where the model depends on unobserved latent variables.
The system applies a GMM to detect clusters in the flow data at a node. The cluster information is then used to adjust the gating template. The user can also adjust the gates manually.
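To make the EM iteration concrete, the following is a minimal sketch that fits a one-dimensional GMM using only the Python standard library. It is a simplification under stated assumptions: real flow data is multi-dimensional, the starting means are supplied by the caller, and the function name `fit_gmm_1d` and the sample values are hypothetical.

```python
import math
from typing import List, Tuple


def normal_pdf(x: float, mean: float, var: float) -> float:
    """Density of a 1D normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def fit_gmm_1d(data: List[float], means: List[float], n_iter: int = 50
               ) -> Tuple[List[float], List[float], List[float]]:
    """Fit a 1D Gaussian mixture with EM, starting from the given means."""
    k = len(means)
    means = list(means)
    weights = [1.0 / k] * k
    variances = [1.0] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point.
        # Assumes every point has non-zero density under some component.
        resp = []
        for x in data:
            p = [weights[j] * normal_pdf(x, means[j], variances[j])
                 for j in range(k)]
            s = sum(p)
            resp.append([pj / s for pj in p])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2
                      for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor avoids a collapsed component
    return weights, means, variances


# Two well-separated made-up clusters, around 1.0 and around 5.0.
data = [1.0, 1.2, 0.8, 1.1, 0.9, 5.0, 5.2, 4.8, 5.1, 4.9]
weights, means, variances = fit_gmm_1d(data, means=[0.0, 6.0])
print([round(m, 2) for m in means], [round(w, 2) for w in weights])
```

The fitted means and variances describe where each cluster sits and how wide it is, which is the kind of information a gate template could be shifted or resized to match.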
After gating, the features (parameters) that characterize each subpopulation are analyzed. Each node in the gating tree has an associated SVM, which is defined on the gated data present at that node. The SVM associated with a particular subpopulation is trained to analyze the distribution pattern in that subpopulation's data and to provide a quantitative assessment of whether the data in the subpopulation is normal or abnormal.
SVM inputs are not limited to 2D plots. Any combination of the parameters and of the gated populations at each node can be used for SVM learning and subsequent SVM classification. The system can use different types of SVM, such as C-SVM, nu-SVM, and one-class SVM.
Additional features of the software system include the following functions: importing data, adjusting gates, performing SVM analysis, and presenting results graphically.
The distributed system of SVM-based analysis nodes provides a quantitative indication of abnormality for the case as a whole.
Embodiments of the software system can include different visualization methods for displaying the data. In addition to traditional 2D plots, 3D plots are also available, as shown in Fig. 9, where the X axis is CD45KO (CD45-Krome Orange dye), the Y axis is SS INT LIN (side scatter intensity, linear), and the Z axis is FS INT LIN (forward scatter intensity, linear). Any three parameters can be selected for a 3D plot. The user can interactively move, rotate, and zoom the 3D plot. The 3D function provides a significantly enhanced representation of the structure of the flow data.
Example 3: Highlighting abnormal results
A key goal of the automated flow cytometry analysis system is to make it easier for laboratory technicians to identify cases that require review by a pathologist.
This is achieved in part by displaying abnormal plots and values in the analysis results with visually distinguishable features, such as a particular font color or highlighting (e.g., red).
Fig. 10 provides an example of a screen display 600 on the monitor of a user workstation. In this example, flow cytometry analysis has been performed on a patient sample. In one part of the analysis, plot 610 is generated to show the subpopulations identified by gating on SS and CD45, separating the subpopulations and giving the relative percentages of CD45-Neg (0.93%), granulocytes (50.58%), monocytes (3.78%), CD45-Dim (2.00%), and lymphocytes (42.70%), which are plotted with CD45KO (CD45-Krome Orange) on the X axis and SS INT LIN on the Y axis. In this example, the lymphocyte count exceeds the normal range of 20% to 40%, so the plot is highlighted to signal to the user that an abnormal value was measured. In a color display, the upper bar 612 of the plot can be red, or the entire plot can be marked in red. For purposes of illustration, the upper bar 612 of the plot is highlighted with wavy lines.
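The decision to highlight a value reduces to a range check. A minimal sketch, using the lymphocyte percentage and the 20% to 40% normal range from the example above, is shown here; the function name `flag_abnormal` and the dictionary layout are assumptions for illustration.

```python
from typing import Dict, Tuple


def flag_abnormal(values: Dict[str, float],
                  normal_ranges: Dict[str, Tuple[float, float]]
                  ) -> Dict[str, bool]:
    """Mark each subpopulation percentage that falls outside its range."""
    flags = {}
    for name, value in values.items():
        low, high = normal_ranges[name]
        flags[name] = not (low <= value <= high)
    return flags


measured = {"lymphocytes": 42.70}          # percentage from the example
ranges = {"lymphocytes": (20.0, 40.0)}     # normal range from the example
print(flag_abnormal(measured, ranges))
```

A display layer could then render any flagged plot or table cell in red, as described for upper bar 612.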
Plot 614 shows the result of gating on FS INT LIN and SS INT LIN. Because this gating does not show an abnormal result, the plot is not highlighted, as indicated by the clear upper bar 616 of the plot. Table 618 in the display provides the numerical results for each subpopulation. Again, because of the abnormal lymphocyte value, the displayed value is highlighted to indicate to the user that an abnormal value was measured. In a color display, the number "42.70" can appear in red or some other color to distinguish it from the other values. For purposes of illustration, the value is shown underlined, in bold, and in italics. The analysis of the subpopulations shown in plot 610 includes further gating of the lymphocytes, whose numerical results are shown in table 620 of the display. As described above, each subpopulation is analyzed by a separate node that branches off from the node that performed the initial gating and analysis. In this example, the lymphocytes are gated into the following subpopulations: T cells (CD2, CD3), B cells (CD19, CD20), NK cells (CD16, (CD3-CD56)), and pre-B cells (CD10+CD19). The resulting numerical results are entered into table 620, and abnormal results relating to B cells are indicated by highlighting values 622 and 624 in the display. In table 630 of the display, another abnormal value, for CD4-CD8, is highlighted.
Figs. 11A-11F further illustrate the display feature, with the upper bar indicating the presence of an abnormal result after analysis of a second sample from the patient. Fig. 11A plots Kappa FITC versus FS INT LIN. The clear upper bar indicates a normal result. Similarly, the results plotted in Fig. 11B (Lambda PE versus FS INT LIN) and Fig. 11C (CD23ECD versus FS INT LIN) are normal. However, Fig. 11D (CD19PC5.5 versus FS INT LIN) and Fig. 11E (CD11c PC7 versus FS INT LIN) are abnormal, as indicated by the highlighting in the bar above each plot. Fig. 11F (CD10APC versus FS INT LIN) indicates a normal result for this parameter.
Fig. 12 shows an exemplary spreadsheet 700 of the parameters that capture and quantify each subpopulation. The spreadsheet columns include the node number (column C), the gated parameter (e.g., tube number, non-debris) (column D), and the sub-gate feature (e.g., non-debris, debris, gate 1, CD4APCA, etc.) (column E). Column F corresponds to the X-axis parameter, and column G provides the Y-axis parameter. Columns H through M provide the weight, X mean, Y mean, and covariance of each group, all of which are analyzed by the SVM in combination with the distributional kernel.
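The per-group means and covariances are exactly the inputs needed by the closed-form kernel for two Gaussian distributions given in claim 3. The sketch below evaluates that kernel for 2D groups using only the standard library. The function names and the sample values are hypothetical, and how the group weights enter the full SVM analysis is not specified here, so they are left out.

```python
import math
from typing import List, Sequence

Matrix2 = List[List[float]]


def det2(m: Matrix2) -> float:
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def inv2(m: Matrix2) -> Matrix2:
    """Inverse of a 2x2 matrix (assumes a non-zero determinant)."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]


def gaussian_kernel(m1: Sequence[float], s1: Matrix2,
                    m2: Sequence[float], s2: Matrix2) -> float:
    """k(p, q) = exp(-rho(p, q)) for two 2D Gaussians (mean, covariance)."""
    # Average covariance (S1 + S2) / 2.
    s = [[(s1[i][j] + s2[i][j]) / 2 for j in range(2)] for i in range(2)]
    s_inv = inv2(s)
    d = [m2[0] - m1[0], m2[1] - m1[1]]
    # Quadratic form (M2 - M1)^T S^-1 (M2 - M1).
    quad = sum(d[i] * s_inv[i][j] * d[j] for i in range(2) for j in range(2))
    # Normalizing factor sqrt(|S| / sqrt(|S1| * |S2|)).
    norm = math.sqrt(det2(s) / math.sqrt(det2(s1) * det2(s2)))
    return math.exp(-quad / 8) / norm


identity = [[1.0, 0.0], [0.0, 1.0]]
print(gaussian_kernel([0.0, 0.0], identity, [0.0, 0.0], identity))
print(gaussian_kernel([0.0, 0.0], identity, [2.0, 0.0], identity))
```

The kernel equals 1 for identical distributions and decreases as the means move apart or the covariances diverge, which is what lets an SVM compare whole groups rather than individual events.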
Fig. 13 provides additional details of a process involved in flow cytometry data analysis according to an embodiment of the invention. Plot 712 shows the flow cytometry data for the monocyte 2 gate, plotted using the X marker and Y marker (CD20 V450 and CD23ECD, respectively). In the spreadsheet data 710 for the node performing this analysis (sample node number 65, from column C of Fig. 12), monocyte 2 is gated and then sub-gated into four quadrants: % X+Y-, % X-Y+, % dual positive, and % dual negative. Sub-gating into quadrants provides weights corresponding to the counts (percentages) of cells falling into the different quadrants. The spreadsheet provides the calculated mean of each marker as well as the distribution (covariance) of each group. Because these results lie outside the normal values, the upper band 714 of plot 712 is highlighted to indicate to the user that an abnormal result has been identified.
Fig. 14 provides another example of a process involved in flow cytometry data analysis according to an embodiment of the invention. Plot 812 shows the flow cytometry data for the lymphocyte 2 gate using the X marker CD20 V450 and the Y marker Kappa FITC. The spreadsheet data 810 for sample node number 77 (from column C of Fig. 12) is gated and sub-gated into four quadrants: % X+Y-, % X-Y+, % dual positive, and % dual negative. The spreadsheet provides the calculated mean of each marker as well as the distribution (covariance) of each group. Because these results lie outside the normal values, the upper band 814 is highlighted to indicate to the user that an abnormal result has been identified.
As will be apparent from the foregoing examples and figures, any combination of parameters can be used to automatically analyze flow cytometry data. Each parameter is separately
In some embodiments, the system is configured to maintain a database for collecting data from analyzed cases. (See, e.g., database 130 in Fig. 1.) All relevant data, reported statistics, and SVM-evaluated features are stored in this database. Flow cytometry experts widely agree that flow cytometry data contains more useful information than is currently known. This database will help promote further research to discover new patterns and diagnostic information in flow data.
The software preferably includes a prompt reminding the user to save the data at the end of an analysis. For repeated analyses of the same case, the old data can optionally be overwritten, or both versions of the data can be saved.
To ensure the integrity and security of the software system, a preferred embodiment of the software system includes a real-time authentication function. An authentication server is set up to handle authentication requests. The client software communicates with the server over the Internet using a secure protocol.
In some embodiments, the analysis can be performed on a client computer in a laboratory located remotely from the flow cytometer. For example, the raw data can be processed and transmitted over a network to one or more remote locations. The flow cytometry analysis software running on the client must complete authentication before it is allowed to begin normal operation.
In one embodiment, the client transmits an encrypted message to the server that includes the following fields:
Nonce
Timestamp
Account
Purpose
Software signature
Hardware signature
Upon receiving an authentication request, the server verifies each field. If authentication succeeds, the server sends an encrypted authentication message matching the request back to the client. The protocol is designed to prevent replay attacks. Using the nonce and timestamp ensures that these messages are unique even for the same client.
The authentication function helps provide the following assurances: the software has not been maliciously altered, the software is properly licensed, the system is properly configured in a legitimate environment, and all analyzed cases are accounted for.
Flow cytometry immunophenotyping is an accurate and highly sensitive method for detecting qualitative and quantitative abnormalities in blood cells, even when the combination of morphology and cytogenetics is non-diagnostic. The automated flow cytometry data analysis system disclosed herein provides the ability to automatically analyze the large volumes of data generated during flow cytometry assays, enhancing the accuracy, reproducibility, and versatility of flow cytometry methods. The capabilities provided by the methods disclosed herein not only increase the diagnostic value of flow cytometry but also extend the research field of the technology beyond currently limited methods, through data mining and pattern recognition performed by collecting and analyzing massive amounts of flow cytometry data from many patients.
Claims (22)
1. A method for analyzing and classifying flow cytometry data, wherein the flow cytometry data includes a plurality of features describing the data, the method comprising:
downloading an input data set comprising flow cytometry events for a cell population into a computer system comprising a processor and a storage device, wherein the processor is programmed to execute at least one support vector machine and to perform the following steps:
defining a hierarchy of analysis elements, each analysis element corresponding to a different gate definition, wherein each analysis element applies a gating algorithm to classify cell subpopulations according to predetermined criteria for a combination of parameters, wherein the classification is performed using a support vector machine with a distributional kernel; and
generating, at a display device, an output display with an identification of the flow cytometry data classification.
2. The method of claim 1, further comprising selecting a cell subpopulation and analyzing the selected cell subpopulation using a different analysis element that applies a different gating algorithm to further classify the subpopulation.
3. The method of claim 1, wherein the distributional kernel comprises a Bhattacharyya affinity of the following form:
$$k(p,q) = e^{-\rho(p,q)} = \left(\sqrt{\frac{\left|(\Sigma_1+\Sigma_2)/2\right|}{\sqrt{|\Sigma_1|\cdot|\Sigma_2|}}}\right)^{-1} \exp\left\{-\frac{1}{8}(M_2-M_1)^{T}\left[\frac{\Sigma_1+\Sigma_2}{2}\right]^{-1}(M_2-M_1)\right\},$$
where p and q are the input data points, M is the mean of the normal distribution, and Σ is the covariance matrix.
4. The method of claim 1, wherein the hierarchy comprises a tree with multiple branches, and the method further comprises a concluding analysis step for combining the results produced by each branch into a diagnostic classification.
5. The method of claim 4, wherein the diagnostic classification comprises the presence or absence of a disease.
6. The method of claim 1, wherein the different gate definitions are selected from the group consisting of: sample tube identification, debris versus non-debris, granulocytes, monocytes, lymphocytes, negative marker intensity, and dim marker intensity.
7. The method of claim 1, wherein generating the output display comprises highlighting abnormal results to facilitate visual detection by a user.
8. A method for automatically analyzing flow cytometry data, comprising:
detecting side scatter and forward scatter events of a sample comprising a plurality of cells;
generating a plurality of plots of the side scatter and forward scatter events in two or three dimensions, the plurality of plots comprising flow cytometry data;
processing the plurality of plots using a hierarchy of analysis elements, each analysis element corresponding to a different gate definition, wherein each analysis element applies a gating algorithm to classify cell subpopulations according to predetermined criteria for a combination of parameters, wherein the classification is performed using a distributional kernel; and
generating, at a display device, an output with an identification of one or more flow cytometry data classifications.
9. The method of claim 8, further comprising selecting a cell subpopulation and analyzing the selected cell subpopulation using a different analysis element that applies a different gating algorithm to further classify the subpopulation.
10. The method of claim 8, wherein the distributional kernel comprises a Bhattacharyya affinity of the following form:
$$k(p,q) = e^{-\rho(p,q)} = \left(\sqrt{\frac{\left|(\Sigma_1+\Sigma_2)/2\right|}{\sqrt{|\Sigma_1|\cdot|\Sigma_2|}}}\right)^{-1} \exp\left\{-\frac{1}{8}(M_2-M_1)^{T}\left[\frac{\Sigma_1+\Sigma_2}{2}\right]^{-1}(M_2-M_1)\right\},$$
where p and q are the input data points, M is the mean of the normal distribution, and Σ is the covariance matrix.
11. The method of claim 8, wherein the hierarchy comprises a tree with multiple branches, and the method further comprises a concluding analysis step for combining the results produced by each branch into a diagnostic classification.
12. The method of claim 11, wherein the diagnostic classification comprises the presence or absence of a disease.
13. The method of claim 8, wherein the different gate definitions are selected from the group consisting of: sample tube identification, debris versus non-debris, granulocytes, monocytes, lymphocytes, negative marker intensity, and dim marker intensity.
14. The method of claim 8, wherein generating the output display comprises highlighting abnormal results to facilitate visual detection by a user.
15. A system for automatically analyzing flow cytometry data, the system comprising:
a computer processor in communication with a memory, the memory having stored therein flow cytometry data comprising a plurality of measurements performed on a plurality of samples comprising cells, the flow cytometry data comprising side scatter and forward scatter events; and
a computer program product embodied in a non-transitory computer-readable medium, the computer program product comprising instructions for causing the computer processor to:
receive the flow cytometry data;
generate a plurality of plots of the side scatter and forward scatter events in two or three dimensions;
process the plurality of plots using a hierarchy of analysis elements, each analysis element corresponding to a different gate definition, wherein each analysis element applies a gating algorithm to classify cell subpopulations in the sample according to predetermined criteria for a combination of parameters, wherein the classification is performed using a distributional kernel; and
generate, at a display device, an output with an identification of one or more flow cytometry data classifications of the cells.
16. The system of claim 15, wherein the computer program product further comprises instructions for causing the computer processor to: select a cell subpopulation; and analyze the selected cell subpopulation using a different analysis element that applies a different gating algorithm to further classify the subpopulation.
17. The system of claim 15, wherein the distributional kernel comprises a Bhattacharyya affinity of the following form:
$$k(p,q) = e^{-\rho(p,q)} = \left(\sqrt{\frac{\left|(\Sigma_1+\Sigma_2)/2\right|}{\sqrt{|\Sigma_1|\cdot|\Sigma_2|}}}\right)^{-1} \exp\left\{-\frac{1}{8}(M_2-M_1)^{T}\left[\frac{\Sigma_1+\Sigma_2}{2}\right]^{-1}(M_2-M_1)\right\},$$
where p and q are the input data points, M is the mean of the normal distribution, and Σ is the covariance matrix.
18. The system of claim 15, wherein the hierarchy comprises a tree with multiple branches, and the system further comprises a concluding analysis step for combining the results produced by each branch into a diagnostic classification.
19. The system of claim 18, wherein the diagnostic classification comprises the presence or absence of a disease.
20. The system of claim 15, wherein the different gate definitions are selected from the group consisting of: sample tube identification, debris versus non-debris, granulocytes, monocytes, lymphocytes, negative marker intensity, and dim marker intensity.
21. The system of claim 15, wherein the memory is associated with a flow cytometry instrument, and the flow cytometry data is specific to a single subject.
22. The system of claim 15, wherein the memory comprises a database configured to store cumulative flow cytometry data generated from samples collected from multiple subjects.
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201462090316P | 2014-12-10 | 2014-12-10 | |
US62/090,316 | 2014-12-10 | ||
PCT/US2015/065095 WO2016094720A1 (en) | 2014-12-10 | 2015-12-10 | Automated flow cytometry analysis method and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107430587A true CN107430587A (en) | 2017-12-01 |
Family
ID=59810972
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201580075757.XA Pending CN107430587A (en) | 2014-12-10 | 2015-12-10 | Automate flow cytometry method and system |
Country Status (2)
Country | Link |
---|---|
EP (1) | EP3230887A4 (en) |
CN (1) | CN107430587A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109655616A (en) * | 2018-12-19 | 2019-04-19 | 广州金域医学检验中心有限公司 | Detect the composite reagent and system of acute myeloid leukemia cell |
CN113228049A (en) * | 2018-11-07 | 2021-08-06 | 福斯分析仪器公司 | Milk analyzer for classifying milk |
CN114323162A (en) * | 2022-03-14 | 2022-04-12 | 青岛清万水技术有限公司 | Shell vegetable growth data online monitoring method and device and electronic equipment |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101981446A (en) * | 2008-02-08 | 2011-02-23 | 医疗探索公司 | Method and system for analysis of flow cytometry data using support vector machines |
CN102625932A (en) * | 2009-09-08 | 2012-08-01 | 诺达利蒂公司 | Analysis of cell networks |
US20140228233A1 (en) * | 2011-06-07 | 2014-08-14 | Traci Pawlowski | Circulating biomarkers for cancer |
-
2015
- 2015-12-10 CN CN201580075757.XA patent/CN107430587A/en active Pending
- 2015-12-10 EP EP15868163.5A patent/EP3230887A4/en not_active Withdrawn
Non-Patent Citations (1)
Title |
---|
KAREL FISER ET AL.: "Detection and Monitoring of Normal and Leukemic Cell Populations with Hierarchical Clustering of Flow Cytometry Data", 《CYTOMETRY PART A》 * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113228049A (en) * | 2018-11-07 | 2021-08-06 | 福斯分析仪器公司 | Milk analyzer for classifying milk |
CN113228049B (en) * | 2018-11-07 | 2024-02-02 | 福斯分析仪器公司 | Milk analyzer for classifying milk |
CN109655616A (en) * | 2018-12-19 | 2019-04-19 | 广州金域医学检验中心有限公司 | Detect the composite reagent and system of acute myeloid leukemia cell |
CN109655616B (en) * | 2018-12-19 | 2022-05-06 | 广州金域医学检验中心有限公司 | Combined reagent and system for detecting acute myeloid leukemia cells |
CN114323162A (en) * | 2022-03-14 | 2022-04-12 | 青岛清万水技术有限公司 | Shell vegetable growth data online monitoring method and device and electronic equipment |
CN114323162B (en) * | 2022-03-14 | 2022-06-28 | 青岛清万水技术有限公司 | Method and device for on-line monitoring of shell vegetable growth data and electronic equipment |
Also Published As
Publication number | Publication date |
---|---|
EP3230887A4 (en) | 2018-08-01 |
EP3230887A1 (en) | 2017-10-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20160169786A1 (en) | Automated flow cytometry analysis method and system | |
CN101981446B (en) | For the method and system using support vector machine to analyze flow cytometry data | |
US20240044904A1 (en) | System, method, and article for detecting abnormal cells using multi-dimensional analysis | |
US20130060775A1 (en) | Spanning-tree progression analysis of density-normalized events (spade) | |
EP3631417B1 (en) | Visualization, comparative analysis, and automated difference detection for large multi-parameter data sets | |
WO2020081292A1 (en) | Adaptive sorting for particle analyzers | |
CN107430587A (en) | Automate flow cytometry method and system | |
US20230393048A1 (en) | Optimized Sorting Gates | |
Azad et al. | Immunophenotype discovery, hierarchical organization, and template-based classification of flow cytometry samples | |
EP2332073B1 (en) | Shape parameter for hematology instruments | |
US20230215571A1 (en) | Automated classification of immunophenotypes represented in flow cytometry data | |
US10235495B2 (en) | Method for analysis and interpretation of flow cytometry data | |
Bashashati et al. | A pipeline for automated analysis of flow cytometry data: preliminary results on lymphoma sub-type diagnosis | |
TWI792751B (en) | Medical image project management platform | |
TW202311742A (en) | Automated classification of immunophenotypes represented in flow cytometry data | |
Cordeiro et al. | Autogating in Flow Cytometry Data Using SVM Classifiers for Bacterioplankton Identification | |
Mohamed | Using Probability Binning and Bayesian Inference to measure Euclidean Distance of Flow Cytometric data | |
Lin | Bayesian variable selection in clustering and hierarchical mixture modeling |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication |
Application publication date: 20171201 |