CN107430587A - Automated flow cytometry method and system - Google Patents

Info

Publication number
CN107430587A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201580075757.XA
Other languages
Chinese (zh)
Inventor
M·阿尔比塔
张弘
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NeoGenomics Laboratories Inc
Original Assignee
NeoGenomics Laboratories Inc
Application filed by NeoGenomics Laboratories Inc
Priority claimed from PCT/US2015/065095 (WO2016094720A1)
Publication of CN107430587A

Classifications

    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N15/00Investigating characteristics of particles; Investigating permeability, pore-volume or surface-area of porous materials
    • G01N15/10Investigating individual particles
    • G01N15/14Optical investigation techniques, e.g. flow cytometry
    • G01N15/1429Signal processing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/243Classification techniques relating to the number of classes
    • G06F18/24323Tree-organised classifiers
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/245Classification techniques relating to the decision surface
    • G06F18/2453Classification techniques relating to the decision surface non-linear, e.g. polynomial classifier
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/69Microscopic objects, e.g. biological cells or cellular parts
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16BBIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B35/00ICT specially adapted for in silico combinatorial libraries of nucleic acids, proteins or peptides
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16CCOMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/60In silico combinatorial chemistry
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N15/00Investigating characteristics of particles; Investigating permeability, pore-volume or surface-area of porous materials
    • G01N15/10Investigating individual particles
    • G01N15/14Optical investigation techniques, e.g. flow cytometry
    • G01N15/1456Optical investigation techniques, e.g. flow cytometry without spatial resolution of the texture or inner structure of the particle, e.g. processing of pulse signals
    • G01N15/1459Optical investigation techniques, e.g. flow cytometry without spatial resolution of the texture or inner structure of the particle, e.g. processing of pulse signals the analysis being performed on a sample stream
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N15/00Investigating characteristics of particles; Investigating permeability, pore-volume or surface-area of porous materials
    • G01N15/10Investigating individual particles
    • G01N2015/1006Investigating individual particles for cytology

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Chemical & Material Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Biochemistry (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Biology (AREA)
  • Signal Processing (AREA)
  • Biomedical Technology (AREA)
  • Pathology (AREA)
  • Immunology (AREA)
  • Analytical Chemistry (AREA)
  • Dispersion Chemistry (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Nonlinear Science (AREA)
  • Mathematical Physics (AREA)
  • Medicinal Chemistry (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Biophysics (AREA)
  • Library & Information Science (AREA)
  • Biotechnology (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

An automated method and system are provided for receiving an input of flow cytometry data and analyzing the data using a hierarchical arrangement of analysis elements, each of which uses a support vector machine to classify the data into different subgroups in order to identify patterns in the data. The patterns can be used to generate a diagnostic prediction for a patient or to identify patterns in samples collected from multiple subjects.

Description

Automated flow cytometry method and system
Related applications
This application claims priority to U.S. Application No. 14/965,640, filed December 10, 2015, which is a non-provisional of U.S. Provisional Application No. 62/090,316, filed December 10, 2014, which is incorporated herein by reference in its entirety. This application is also related to the subject matter of U.S. Patent No. 8,628,810, which is incorporated herein by reference in its entirety.
Technical field
The present invention relates to a method and system for automatically analyzing distributed data, in particular flow cytometry data, using support vector machines.
Background of the invention
Flow cytometry measures the characteristics of microscopic particles suspended in a flowing fluid stream. A focused laser beam illuminates each moving particle, and light is scattered in all directions. Detectors placed in front of the intersection point or orthogonal to the laser beam receive the scattered light pulses and generate signals that are input into a computer analyzer for interpretation. The total amount of forward-scattered light detected depends on particle size and refractive index but is closely correlated with the cross-sectional area of the particle as seen by the laser, whereas the amount of side-scattered light can indicate shape or granularity.
One of the most widely used applications of flow cytometry is cellular analysis for medical diagnosis, in which the particles of interest are cells suspended in a saline solution. Flow cytometry provides a high-throughput means of collecting large amounts of cellular data. It is an effective tool for detecting abnormalities (such as MM, CLL, LGL, AML, MDS, CMML, lymphocyte, MBL, etc.) in various types of samples, including bone marrow, peripheral blood, and tissue. Other cellular features, such as surface molecules or intracellular components, can also be accurately quantified if the cellular marker of interest can be labeled with a fluorescent dye; for example, antibody-fluorescent dye conjugates can be used to bind to specific surface or intracellular receptors. Immunophenotyping, which characterizes cells at different stages of development using fluorescently labeled monoclonal antibodies against surface markers, is one of the most common applications of flow cytometry. Other dyes have been developed that bind to particular structures (e.g., DNA, mitochondria) or are sensitive to the local chemistry (e.g., Ca++ concentration, pH, etc.).
Although flow cytometry is widely used in medical diagnosis, it can also be used in non-medical applications such as the analysis of water or other fluids. For example, seawater can be analyzed to identify the presence or type of bacteria or other organisms, milk can be analyzed to test for microbes, and fuels can be tested for particulate contaminants or additives.
The laser beam used is of a suitable color to excite the selected fluorochrome or fluorochromes. The amount of fluorescence emitted can be correlated with the expression of the cellular marker in question. Each flow cytometer is usually able to detect many different fluorochromes simultaneously, depending on its configuration. In some instruments, multiple fluorochromes can be analyzed simultaneously by using multiple lasers emitting at different wavelengths. For example, the FACSCalibur™ flow cytometry system, available from Becton Dickinson (Franklin Lakes, New Jersey), is a multicolor flow cytometer configured for four-color operation. The fluorescence emission from each cell is collected by a series of photomultiplier tubes, and the subsequent electrical events are collected and analyzed on a computer that assigns a fluorescence intensity value to each signal in a Flow Cytometry Standard (FCS) data file. Data analysis involves identifying intersections or unions of polygonal regions in the marker hyperspace that are used to filter, or "gate", the data and to define subsets of the event population for further analysis or sorting.
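The polygonal gating described above can be illustrated with a minimal sketch. The channel names (`FSC`, `SSC`, `CD45`), the gate coordinates, and the helper functions are hypothetical and chosen only for illustration; they are not taken from the patent or from any cytometer software.

```python
# Illustrative sketch only: channel names, gate vertices and helpers are assumptions.

def in_polygon(x, y, vertices):
    """Ray-casting test: True if point (x, y) lies inside the polygon."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def gate(events, vertices, x_key="FSC", y_key="SSC"):
    """Keep only the events that fall inside one polygonal gate."""
    return [e for e in events if in_polygon(e[x_key], e[y_key], vertices)]


def gate_intersection(events, polygons, x_key="FSC", y_key="SSC"):
    """Events inside every polygon (intersection of gates)."""
    return [e for e in events
            if all(in_polygon(e[x_key], e[y_key], p) for p in polygons)]


def gate_union(events, polygons, x_key="FSC", y_key="SSC"):
    """Events inside at least one polygon (union of gates)."""
    return [e for e in events
            if any(in_polygon(e[x_key], e[y_key], p) for p in polygons)]


# Hypothetical rectangular gate in the forward/side scatter plane.
lymph_gate = [(200, 50), (450, 50), (450, 250), (200, 250)]
events = [
    {"FSC": 300, "SSC": 100, "CD45": 820},
    {"FSC": 700, "SSC": 600, "CD45": 410},
    {"FSC": 250, "SSC": 200, "CD45": 900},
    {"FSC": 50, "SSC": 20, "CD45": 15},
]
gated = gate(events, lymph_gate)
```

Each event is a dictionary of channel values, so the gated subset keeps all of its markers and can be passed on to a further gate or classifier.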
The International Society for Analytical Cytology (ISAC) has adopted the FCS data file standard for the common representation of flow cytometry (FCM) data. This standard is supported by all major analytical instruments for recording the measurements from a sample run through a cytometer, allowing researchers and clinicians to choose among the many commercially available instruments and software packages without encountering major data compatibility issues. However, the standard does not describe protocols for computational post-processing and data analysis.
Because of the large amount of data present in flow cytometry, it is often difficult to fully utilize the data through manual processes. The high dimensionality of the data also makes it infeasible to use traditional statistical methods and learning techniques such as artificial neural networks. Support vector machines are kernel-based machine learning techniques capable of handling high-dimensional data, and a support vector machine with a properly designed kernel can be an effective tool for processing flow data.
The flow data for a single case consists of multiple tubes. Each tube may contain simultaneous measurements of multiple assays. With all assays measured, each run typically collects more than $10^4$ events, which can yield on the order of $10^6$ measurements to analyze.
Conventional methods of analyzing flow data generally involve "gating" the data to isolate certain cell populations and manually inspecting a large set of 2D plots of the data, two parameters at a time. The features of flow cytometry data that are useful for diagnosis typically reside in the distribution of attribute values over a high-dimensional space. As a result, it is difficult for a human reader to comprehend the convoluted high-dimensional patterns in the data.
Modern technological advances such as flow cytometry have generated massive amounts of data in many different formats. The greatest challenge this information explosion presents to computer and information scientists is to develop effective methods for processing large amounts of data and extracting useful information. Although traditional statistical methods have proven effective for low-dimensional data, they are inadequate for handling the "new data", which are frequently highly complex and high-dimensional. In particular, the so-called "curse of dimensionality" is a serious limitation on classical statistical tools. Machine learning represents a promising new paradigm in data processing and analysis for overcoming these limitations. It uses a "data-driven" approach to automatically "learn" a system that can be used to classify or predict future data. The support vector machine (SVM) is a state-of-the-art machine learning technique that has revolutionized the field of machine learning and has provided real, effective solutions to many difficult data analysis problems.
The SVM combines the concept of an optimal hyperplane in a high-dimensional inner product space (often an infinite-dimensional Hilbert space) with that of a kernel function defined on the input space to achieve flexibility of data representation, computational efficiency, and regularization of model capacity. SVMs can be used to solve both classification (pattern recognition) and regression (prediction) problems. A typical setup for SVM pattern recognition is given below.
Given a set of training data:
$\{(x_i, y_i)\},\quad i = 1, 2, \ldots, m$
The problem of SVM training can be formulated as finding the optimal hyperplane:
Using Lagrange multipliers, this is transformed into the dual problem:
Solving the quadratic programming problem yields the SVM solution:
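For reference, the three steps above correspond to the textbook soft-margin SVM, sketched here under the assumption that the labels are $y_i \in \{-1, +1\}$, $C > 0$ is a regularization constant, $\xi_i$ are slack variables, $\alpha_i$ are the Lagrange multipliers, and $K$ is the kernel function. These symbols follow the standard formulation; the patent's own notation may differ.

```latex
% Primal problem: optimal (soft-margin) hyperplane
\min_{w,\,b,\,\xi}\ \tfrac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{m}\xi_i
\quad\text{s.t.}\quad
y_i\bigl(\langle w, x_i\rangle + b\bigr) \ge 1 - \xi_i,\quad \xi_i \ge 0,\quad i = 1,\ldots,m

% Dual problem obtained with Lagrange multipliers \alpha_i
\max_{\alpha}\ \sum_{i=1}^{m}\alpha_i
 - \tfrac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{m}\alpha_i\alpha_j\,y_i y_j\,K(x_i, x_j)
\quad\text{s.t.}\quad
0 \le \alpha_i \le C,\quad \sum_{i=1}^{m}\alpha_i y_i = 0

% Decision function from the quadratic-programming solution
f(x) = \operatorname{sgn}\Bigl(\sum_{i=1}^{m}\alpha_i\,y_i\,K(x_i, x) + b\Bigr)
```

The training data enter the dual problem and the decision function only through the kernel values $K(x_i, x_j)$, which is why a similarity measure between examples is all that is needed to build the classifier.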
Because of the complexity of flow cytometry data, it is difficult to explicitly extract the essential features or define the patterns that will predict cytogenetic results. An SVM-based system offers a distinct advantage: it needs only the value of a similarity measure between examples to build a classifier.
Summary of the invention
According to the present invention, a computer-aided flow cytometry data analysis system is provided that automates most of the tedious steps of the analysis process by using advanced machine learning techniques and other mathematical algorithms. Support vector machines (SVMs) with customized distributed kernels are used to detect abnormal flow distributions. Gaussian mixture models (GMMs) are used for automated clustering and gating. Special graphical algorithms are used for automatic gate identification.
The system retains traditional features such as gate definition and adjustment, 2D plots, and statistical tables. However, it provides automation at every analytical step. In addition, the SVM approach enables analysis well beyond the 2D or 3D limits of conventional methods.
The system of the present invention provides automated flow cytometry data analysis, including automatic gate prediction, automatic determination of normal versus abnormal for each plot (each marker), automatic determination of abnormal results based on a summary table, and automatic determination of disease type based on combinations of abnormalities (summary table, individual plots, and gate distributions). The system gives the user the ability to train and customize the normal versus abnormal designations. In some embodiments, the flow cytometry system provides means for distinguishing normal from abnormal by displaying labeled plots and values with visually distinguishable features, which can be achieved by highlighting, underlining, bolding, use of a particular color (e.g., red), or any other visually detectable indicator, so that abnormal results are clearly flagged for the system user. The flagged results are recorded in the associated patient record for evaluation by a pathologist, physician, or other medical personnel.
The system of the present invention will help pathologists significantly improve the accuracy and efficiency of analyzing flow data. It will also provide a powerful tool for discovering new patterns in flow cytometry.
Support vector machines are used to analyze flow cytometry data generated by conventional, commercially available flow cytometry equipment. Examples of support vector machines are disclosed in, among others, U.S. Patent Nos. 6,760,715, 7,117,188, and 6,996,549. Exemplary systems for performing flow cytometry measurements are described in U.S. Patent Nos. 5,872,627 and 4,284,412, both of which are incorporated herein by reference. In the specific examples described herein, the data relate to a medical diagnostic application, in particular the detection of hematological conditions such as myelodysplastic syndrome (MDS). Flow cytometric immunophenotyping has proven to be an accurate and highly sensitive method for detecting quantitative and qualitative abnormalities in hematopoietic cells even when combined morphology and cytogenetics are non-diagnostic. The automated flow cytometry data analysis system disclosed herein provides the ability to automatically analyze the large amounts of data generated during flow cytometry measurements, enhancing the accuracy, reproducibility, and versatility of flow cytometry methods. By collecting and analyzing, from many subjects, volumes of flow cytometry data far beyond the limits of current methods for data mining and pattern recognition, this capability not only increases the diagnostic value of flow cytometry but also extends the research applications of the method.
In one aspect of the invention, a method for analyzing and classifying flow cytometry data, wherein the flow cytometry data includes a plurality of features describing the data, comprises: downloading an input data set comprising flow cytometry events for a cell population into a computer system comprising a processor and a storage device, wherein the processor is programmed to execute at least one support vector machine and to perform the steps of: defining a hierarchy of analysis elements, each analysis element corresponding to a different gate definition, wherein each analysis element applies a gating algorithm to classify a cell subpopulation according to predetermined criteria for a combination of parameters, and wherein the classification is performed using a support vector machine with a distributed kernel; and generating, at a display device, an output display with an identification of the flow cytometry data classification. In some embodiments, the method further includes selecting a cell subpopulation and analyzing the selected subpopulation using a different analysis element that applies a different gating algorithm to further classify the subpopulation. In a preferred embodiment, the distributed kernel comprises a Bhattacharya affinity having the following form:
where p and q are the input data points, M is the mean of the normal distribution, and Σ is the covariance matrix. The hierarchy may be a tree having a plurality of branches, and the method may further include a conclusion analysis step for combining the results produced by each branch into a diagnostic classification. The diagnostic classification may include the presence or absence of a disease. The different gate definitions may be selected from the group consisting of: sample tube identification, debris versus non-debris, granulocytes, monocytes, lymphocytes, negative marker intensity, and disappearance marker intensity.
In another aspect of the invention, a method for automatically analyzing flow cytometry data comprises: detecting side scatter and forward scatter events for a sample comprising a plurality of cells; generating a plurality of plots of the side scatter and forward scatter events in two or three dimensions, the plurality of plots comprising flow cytometry data; processing the plurality of plots using a hierarchy of analysis elements, each analysis element corresponding to a different gate definition, wherein each analysis element applies a gating algorithm to classify a cell subpopulation according to predetermined criteria for a combination of parameters, and wherein the classification is performed using a distributed kernel; and generating, at a display device, an output with an identification of one or more flow cytometry data classifications. The method may further include selecting a cell subpopulation and analyzing the selected subpopulation using a different analysis element that applies a different gating algorithm to further classify the subpopulation. In a preferred embodiment, the distributed kernel is a Bhattacharya affinity having the following form:
where p and q are the input data points, M is the mean of the normal distribution, and Σ is the covariance matrix. The hierarchy may be a tree having a plurality of branches, and the method may further include a conclusion analysis step for combining the results produced by each branch into a diagnostic classification. The diagnostic classification may be the presence or absence of a disease. The different gate definitions are selected from the group consisting of: sample tube identification, debris versus non-debris, granulocytes, monocytes, lymphocytes, negative marker intensity, and disappearance marker intensity.
In yet another aspect of the invention, a system for automatically analyzing flow cytometry data comprises: a computer processor in communication with a memory, the memory having stored therein flow cytometry data from a plurality of assays performed on a plurality of samples comprising cells, the flow cytometry data comprising side scatter and forward scatter events; and a computer program product embodied on a non-transitory computer-readable medium, the computer program product comprising instructions for causing the computer processor to: receive the flow cytometry data; generate a plurality of plots of the side scatter and forward scatter events in two or three dimensions; process the plurality of plots using a hierarchy of analysis elements, each analysis element corresponding to a different gate definition, wherein each analysis element applies a gating algorithm to classify a cell subpopulation in the sample according to predetermined criteria for a combination of parameters, and wherein the classification is performed using a distributed kernel; and generate, at a display device, an output with an identification of one or more flow cytometry data classifications of the cells. The computer program product may further comprise instructions for causing the computer processor to: select a cell subpopulation; and analyze the selected subpopulation using a different analysis element that applies a different gating algorithm to further classify the subpopulation. In a preferred embodiment, the distributed kernel comprises a Bhattacharya affinity having the following form:
where p and q are the input data points, M is the mean of the normal distribution, and Σ is the covariance matrix. The hierarchy may be a tree having a plurality of branches, and the system may further include a conclusion analysis step for combining the results produced by each branch into a diagnostic classification. In some embodiments, the diagnostic classification includes the presence or absence of a disease. The different gate definitions are selected from the group consisting of: sample tube identification, debris versus non-debris, granulocytes, monocytes, lymphocytes, negative marker intensity, and disappearance marker intensity. In some embodiments, the memory is associated with a flow cytometry instrument and is specific to a single subject, while in other embodiments the memory may be a database configured to store accumulated flow cytometry data generated from samples collected from a plurality of subjects.
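The hierarchy of analysis elements with a final conclusion step can be pictured with a small sketch. This is an illustration under assumptions, not the patented implementation: the class `AnalysisElement`, the helper `fraction_rule`, the channel names, and the cutoffs are all hypothetical, and a simple threshold rule stands in for the SVM that each element would use.

```python
# Illustrative sketch only: names, channels and cutoffs are assumptions,
# and a threshold rule stands in for each node's SVM classifier.
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Event = Dict[str, float]


@dataclass
class AnalysisElement:
    name: str
    gate: Callable[[Event], bool]            # gate definition for this node
    classify: Callable[[List[Event]], str]   # stand-in for the node's SVM
    children: List["AnalysisElement"] = field(default_factory=list)

    def run(self, events: List[Event]) -> Dict[str, str]:
        """Gate the events, classify the subset, then recurse into children."""
        subset = [e for e in events if self.gate(e)]
        results = {self.name: self.classify(subset)}
        for child in self.children:
            results.update(child.run(subset))
        return results


def fraction_rule(marker: str, cutoff: float, max_fraction: float):
    """Flag a subset as abnormal when too many events exceed a marker cutoff."""
    def classify(events: List[Event]) -> str:
        if not events:
            return "normal"
        fraction = sum(e[marker] > cutoff for e in events) / len(events)
        return "abnormal" if fraction > max_fraction else "normal"
    return classify


def conclude(results: Dict[str, str]) -> str:
    """Conclusion step: combine per-node results into one classification."""
    return "abnormal" if "abnormal" in results.values() else "normal"


tree = AnalysisElement(
    name="non_debris",
    gate=lambda e: e["FSC"] > 100,
    classify=fraction_rule("CD34", 500, 0.9),
    children=[
        AnalysisElement(
            name="lymphocytes",
            gate=lambda e: e["SSC"] < 250,
            classify=fraction_rule("CD34", 500, 0.5),
        )
    ],
)

events = [
    {"FSC": 300, "SSC": 100, "CD34": 900},
    {"FSC": 320, "SSC": 120, "CD34": 850},
    {"FSC": 700, "SSC": 600, "CD34": 20},
    {"FSC": 50, "SSC": 20, "CD34": 5},
]
results = tree.run(events)
diagnosis = conclude(results)
```

Because each child only sees the events that passed its parent's gate, adding a branch narrows the population step by step, and the conclusion step is the only place where branch results are merged.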
Brief description of the drawings
Fig. 1 is a schematic diagram of a system for automatically collecting and analyzing flow cytometry data according to the present invention.
Fig. 2 is an exemplary logarithmic display of the distributions of the relevant groups in MDS flow cytometry.
Fig. 3 is a flow chart of a data analysis method according to the present invention.
Fig. 4 is a schematic diagram of an exemplary analysis hierarchy according to an embodiment of the invention.
Fig. 5 is a block diagram of the structure of each node of the tree of Fig. 4 according to an implementation of the system of the invention.
Figs. 6A and 6B are examples of analysis results generated by the system of the present invention.
Fig. 7 is a flow chart of an exemplary branch of an analysis tree according to an embodiment of the invention.
Figs. 8A-8E are sample screenshots of an exemplary analysis sequence for the branch of Fig. 7.
Fig. 9 is a sample screenshot of a 3-dimensional plot produced by an embodiment of the flow cytometry system.
Fig. 10 is a sample screenshot of analysis results according to an embodiment of the invention.
Figs. 11A-11F are sample plots generated for six different analyses, where Figs. 11A-11C and 11F represent normal results and Figs. 11D-11E are highlighted to indicate abnormal results.
Fig. 12 is a simple spreadsheet listing measured and calculated values for different subpopulations.
Fig. 13 shows the parameters of one subpopulation and the corresponding flow cytometry data.
Fig. 14 shows the parameters of another subpopulation and the corresponding flow cytometry data.
Detailed description
According to the present invention, a method and system are provided for analyzing flow cytometry data. In particular, the method includes creating a kernel for analyzing data having a distributed nature. An input datum p in a flow cytometry application is a set of a large number of points in a space. For example, an image can be regarded as a set of points in 2-dimensional space. After appropriate normalization, p can be regarded as a probability distribution. To define a kernel on two such input data p and q that captures the distributional trend, a function must be defined on p and q that measures the similarity between the two whole distributions rather than merely between individual points within them.
One way to construct such a "distributed kernel" is to use a distance function (divergence) between the two distributions. If $\rho(p, q)$ is a distance function, then the following is a kernel:
$K(p, q) = e^{-\rho(p, q)}$. (1)
There are many distance functions that measure the difference between two probability distributions. The Kullback-Leibler divergence, Bhattacharya affinity, Jeffrey divergence, Mahalanobis distance, Kolmogorov variational distance, and expected conditional entropy are all examples of such distances. Given a distance function, a kernel can be constructed from the formula above.
For example, a specialized custom kernel can be constructed based on the Bhattacharya affinity. For normal distributions with mean M and covariance matrix Σ, the Bhattacharya affinity has the following form:
A new kernel is defined from this distance function using Equation (1) above.
This distributed kernel is computationally efficient (it has linear complexity) and can handle large amounts of input data. Typical density estimation methods have computational complexity $O(n^2)$, which may be too high for some applications.
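A small sketch can make the construction concrete. It assumes the standard closed form of the Bhattacharyya distance between two normal distributions with means $M_1, M_2$ and covariances $\Sigma_1, \Sigma_2$, which may differ in notation from the patent's own equation:

$$\rho(p, q) = \tfrac{1}{8}(M_2 - M_1)^{T}\,\Sigma^{-1}(M_2 - M_1) + \tfrac{1}{2}\ln\frac{\det\Sigma}{\sqrt{\det\Sigma_1\,\det\Sigma_2}},\qquad \Sigma = \tfrac{1}{2}(\Sigma_1 + \Sigma_2).$$

The code below is restricted to two-dimensional events (for example, one forward/side scatter pair) so that it needs only the standard library. All function names are hypothetical.

```python
# Illustrative sketch only: 2D events, standard closed form assumed,
# function names are hypothetical.
import math


def mean_cov_2d(points):
    """Mean and covariance (sxx, sxy, syy) of 2D points in one O(n) pass each."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    return (mx, my), (sxx, sxy, syy)


def det(cov):
    """Determinant of a symmetric 2x2 matrix stored as (sxx, sxy, syy)."""
    sxx, sxy, syy = cov
    return sxx * syy - sxy * sxy


def bhattacharyya_distance(p, q):
    """Distance between the normal distributions fitted to point sets p and q."""
    (m1, c1), (m2, c2) = mean_cov_2d(p), mean_cov_2d(q)
    c = tuple((a + b) / 2 for a, b in zip(c1, c2))  # averaged covariance
    dx, dy = m2[0] - m1[0], m2[1] - m1[1]
    d = det(c)  # must be non-zero; degenerate point sets are not handled
    # (dx, dy) C^{-1} (dx, dy)^T, with C^{-1} = (1/d) [[syy, -sxy], [-sxy, sxx]]
    quad = (c[2] * dx * dx - 2 * c[1] * dx * dy + c[0] * dy * dy) / d
    return quad / 8 + 0.5 * math.log(d / math.sqrt(det(c1) * det(c2)))


def distribution_kernel(p, q):
    """Kernel of Equation (1): K(p, q) = exp(-rho(p, q))."""
    return math.exp(-bhattacharyya_distance(p, q))


p = [(0, 0), (1, 0), (0, 1), (1, 1)]
q = [(2, 0), (3, 0), (2, 1), (3, 1)]  # same shape as p, shifted by 2 along x
k_pq = distribution_kernel(p, q)
```

Fitting the mean and covariance takes a single pass over each point set, which is where the linear complexity comes from; the kernel value itself is then computed from a handful of numbers regardless of how many events were measured. With this closed form, $e^{-\rho}$ equals the Bhattacharyya coefficient of the two fitted normals.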
The distributed kernel of the present invention can be applied directly to an SVM or other machine learning system to create classifiers and other predictive systems. Distributed kernels offer several distinct advantages over the standard kernels frequently used in SVMs and other kernel machines. A distributed kernel captures the similarity between the overall distributions of large data components, which may be critical in some applications.
Fig. 3 provides an exemplary process flow for analyzing flow cytometry data. As will be apparent to those skilled in the art, flow cytometry data is provided as an example of distributed data, and other types of distributed data can be processed and classified using the techniques described below.
The raw data generated by the flow cytometer 106 is input (step 302) into a computer processing system that includes at least a memory and a processor programmed to execute one or more support vector machines. A typical personal computer (PC) or Apple-type processor is suitable for this processing. The input data set is divided into two parts, one used to train the support vector machine and the other used to test the effectiveness of the training. In step 304, a feature selection algorithm is run on the training data set by executing one or more feature selection programs in the processor. In step 306, the training data set with the reduced feature set is processed using a support vector machine with a distributed kernel, such as a kernel based on the Bhattacharya affinity. In step 308, the effectiveness of the training is evaluated by extracting from the independent test data set the data corresponding to the features selected in step 304 and processing the test data using the trained SVM with the distributed kernel. If the test results are suboptimal, the SVM is retrained and retested until an optimal solution is obtained. If the training is determined to be satisfactory, live data corresponding to flow cytometry measurements performed on a patient sample is input into the processor in step 310. In step 312, the features selected in step 304 are extracted from the patient data and processed by the trained and tested SVM with the distributed kernel, and the result classifies the patient sample as normal or abnormal. In step 314, a report summarizing the results is generated; the report may be displayed on a computer monitor 122, provided as a printed report 124, and/or transmitted via e-mail or another network file transfer system to a research or clinical laboratory, hospital, or physician's office. Histograms showing one-dimensional and two-dimensional representations of the data groupings may also be displayed and/or printed. The results, raw data, histograms, and other patient data may also be stored in computer memory or in a database.
An optional additional diagnostic procedure can be combined with the automated analysis of flow cytometry data and results to provide enhanced confidence. Using a scheme similar to that disclosed in U.S. Patent No. 7,383,237, which is incorporated herein by reference, the results of flow cytometry testing can be combined with other types of tests. Fig. 3 shows an optional flow path for performing computer-aided image analysis of cytogenetic data using SVMs, in which relevant features are extracted from chromosome images generated by conventional procedures (such as human chromosome karyotyping or fluorescence in situ hybridization (FISH)) in order to identify deletions, translocations, inversions and other abnormalities. In step 320, training image data is input into the computer processor, where it is pre-processed to identify and extract relevant features. In general, the training image data is pre-processed to identify relevant features (step 322) and is then used to train the image processing SVM. Test image data is then used to verify that an optimal solution has been obtained (step 324). If not, step 324 is repeated and the SVM is retrained and retested. If the optimal solution has been achieved, live patient data is input (step 326), pre-processed (step 328) and classified (step 330).
In a preferred method, as described in U.S. Patent No. 7,383,237, each relevant feature in the image is separately pre-processed (step 322) and processed by an SVM optimized for that feature. The analysis results for all relevant features are combined in a second-level image processing SVM to generate an output classifying the entire image. The trained SVMs are tested using pre-processed image test data (step 324). If the solution is optimal, data corresponding to the live patient data (from the same patient on whom flow cytometry was performed) is input into the processor (step 326). The patient image data is pre-processed (step 328) to identify relevant features, and each relevant feature is processed by the trained first-level SVM optimized for that particular feature. The analysis results for the relevant features are combined and input into the trained second-level image processing SVM to generate an output classifying the entire image (step 330).
The result of step 330 can be transferred for storage in the patient's file in a database (step 316) and/or input into a second-level SVM for analysis in combination with the flow cytometry results from step 312. This second-level SVM will have been trained and tested using training and test data, as indicated by the dashed lines between steps 308, 324 and 340. The results of step 316 and step 330 are combined for processing by the trained second-level SVM to perform the combined analysis in step 342. The result of this combined processing is typically a binary output, e.g., normal or abnormal, diseased or disease-free. The combined result can be output for display (step 314) and/or input into a memory or database for storage (step 316). Additional optional secondary flow paths can be provided to incorporate other types of data and analysis, such as expert analysis and patient history, which can be combined to create a final diagnostic or prognostic score or other output usable for screening, monitoring and/or treatment.
Example 1: Detection of Myelodysplastic Syndrome (MDS)
The objective of this study was to investigate the potential association between the chromosomal abnormalities related to myelodysplastic syndrome (MDS) found in cytogenetics and flow cytometry data. Immunophenotyping of this kind is one of the most common applications of flow cytometry, and the protocols for sample collection and preparation are well known to those skilled in the art. Following the sequence shown in Fig. 1, a bone marrow aspirate 102 from a patient suspected of having MDS is collected in saline or sodium heparin solution to create a cell suspension in multiple tubes 104 or other containers suitable for introducing the suspension into the flow cell of flow cytometer system 106. Reagents comprising monoclonal antibodies conjugated to different fluorochromes are introduced into the tubes, with each tube receiving a different combination of antibodies, each combination paired with one of several possible fluorochromes. Flow cytometers are commercially available from many manufacturers, including the FACSCalibur™ from Becton Dickinson (Franklin Lakes, New Jersey) and the Cytoron/Absolute™ from Ortho Diagnostics (Raritan, New Jersey). For this example, a FACSCalibur™ system was used for four-color measurements. As will be apparent to those skilled in the art, such systems provide automated handling of multiple samples loaded in a carousel, so the illustration is intended to be schematic, merely indicating that a sample is present in the flow cytometer analyzer. The forward scatter detector 108 and side scatter detector 110 in flow cytometer system 106 generate electrical signals corresponding to the events detected as cells are directed through the analysis stream. Fluorescence detectors, included with the side scatter detector 110, measure the amplitude of the fluorescence signals generated by the expression of antigens, as indicated by the antibodies conjugated to the different fluorescent markers. Numerical values are generated based on the pulse height (amplitude) measured by each detector. The resulting signals are input into a processor in computer workstation 120 and are used to create histograms (single-parameter or two-parameter) corresponding to the detected events for display on graphics display monitor 122. Analysis of this data according to the invention, which involves classifying the input data as normal or abnormal based on comparison with control samples, generates a report 124, which can be printed or displayed on monitor 122. The raw data, histograms and report are also stored in the internal memory of computer workstation 120 or in a separate memory device so that they can be associated with the patient's other records; the memory device may include a database server 130, which may be part of a data warehouse in a medical laboratory or other medical facility.
In the exemplary processing sequence, the input data set included 77 cases (patients) with flow cytometry and cytogenetic data. All patients were suspected of having MDS. Of these 77 cases, 37 had chromosomal abnormalities as indicated by cytogenetic testing, which involves microscopic examination of whole chromosomes for changes in number or structure. The remaining 40 cases were considered negative by cytogenetics.
The bone marrow aspirate suspension from each patient was divided among 13 tubes. In a standard 4-color immunofluorescence protocol, forward light scatter (FSC) and right-angle light scatter (SSC) were collected together with the 4-color antibody combinations to perform seven different measurements, one of which was blank. Each case typically had 20,000 to 50,000 events for which all measurements were taken. The resulting flow cytometry data set for each case had approximately 10^6 measured values. Fig. 2 shows an exemplary histogram of side scatter versus CD45 expression, with the different cell populations labeled.
For each of the 13 tubes, FSC and SSC were measured, allowing gating to exclude cell debris, as shown in the lower left corner of Fig. 2. In addition, a different combination of antigen specificities and fluorescent markers was used for each tube. Table 1 below lists the different combinations of monoclonal antibodies and the following markers: FITC (fluorescein isothiocyanate), PE (phycoerythrin), PerCP (peridinin-chlorophyll) and APC (allophycocyanin). Monoclonal antibodies conjugated to the identified fluorescent markers are commercially available from many sources, including Becton-Dickinson Immunocytometry Systems (San Jose, California, USA), DakoCytomation (Carpinteria, California, USA), Caltag (Burlingame, California, USA) and Invitrogen Corporation (Camarillo, California, USA). A CD45 antibody, used for enumerating mature lymphocytes, was included in each combination to validate the lymphocyte gating.
Table 1
To provide data for SVM training and for evaluating the training, the entire data set of 77 cases was divided into a training set and an independent test set. Forty cases (20 positive and 20 negative, as determined by cytogenetic testing) were used to train the SVM. The remaining 37 cases (17 positive and 20 negative) formed the independent test set.
The custom kernel based on the Bhattacharyya affinity described above was used to analyze the flow cytometry data by measuring the difference between two probability distributions.
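For illustration, the following is a minimal Python sketch of a Bhattacharyya-affinity kernel between two distributions, assuming each distribution is summarized as a two-dimensional Gaussian with mean M and covariance Σ. The function names and the example values are hypothetical and are not taken from the described system.

```python
import math

def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d],
            [-m[1][0] / d, m[0][0] / d]]

def bhattacharyya_kernel(mean1, cov1, mean2, cov2):
    """Bhattacharyya affinity between two 2D Gaussians (closed form)."""
    # Average covariance (cov1 + cov2) / 2
    avg = [[(cov1[i][j] + cov2[i][j]) / 2.0 for j in range(2)] for i in range(2)]
    # Determinant term: inverse of sqrt(|avg| / sqrt(|cov1| * |cov2|))
    det_term = 1.0 / math.sqrt(det2(avg) / math.sqrt(det2(cov1) * det2(cov2)))
    # Quadratic term: (M2 - M1)^T avg^-1 (M2 - M1)
    d = [mean2[0] - mean1[0], mean2[1] - mean1[1]]
    a = inv2(avg)
    quad = sum(d[i] * a[i][j] * d[j] for i in range(2) for j in range(2))
    return det_term * math.exp(-quad / 8.0)

# Hypothetical example values
identity = [[1.0, 0.0], [0.0, 1.0]]
same = bhattacharyya_kernel([0.0, 0.0], identity, [0.0, 0.0], identity)
shifted = bhattacharyya_kernel([0.0, 0.0], identity, [2.0, 0.0], identity)
```

Identical distributions give an affinity of 1, and the value decreases as the means move apart or the covariances differ, which is the behavior expected of a similarity measure between populations.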
Data from all measurements need not be included in the classifier to produce a system with optimal performance. Therefore, feature selection was performed based on the training set. Two performance measures were applied in the feature selection step. The first feature selection method, SVM leave-one-out (LOO), involves training the SVM on the initial data set and then updating the scaling parameters by performing a gradient step so that the LOO error decreases. These steps are repeated until a minimum LOO error is reached. A stopping criterion can be applied. The second feature selection method is kernel alignment. This technique is described in U.S. Patent No. 7,299,213 to Cristianini, which is incorporated herein by reference. Kernel alignment uses only the training data and can be performed before training of the kernel machine takes place.
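As a rough illustration of the second method, the sketch below computes kernel-target alignment in its commonly published form: the normalized Frobenius inner product between a kernel matrix K and the ideal kernel formed from the labels. The procedure in the referenced patent may differ in detail; the function name and the toy matrices here are assumptions made for the example.

```python
import math

def kernel_alignment(K, y):
    """Alignment between kernel matrix K and the ideal target kernel y y^T."""
    n = len(y)
    # Frobenius inner product <K, y y^T>
    ky = sum(K[i][j] * y[i] * y[j] for i in range(n) for j in range(n))
    # Frobenius norm of K; the norm of y y^T equals n for labels in {-1, +1}
    k_norm = math.sqrt(sum(K[i][j] ** 2 for i in range(n) for j in range(n)))
    return ky / (n * k_norm)

# Toy labels and kernels (hypothetical)
labels = [1, 1, -1, -1]
ideal = [[a * b for b in labels] for a in labels]
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
a_ideal = kernel_alignment(ideal, labels)
a_identity = kernel_alignment(identity, labels)
```

Because the score needs only the kernel matrix and the training labels, candidate features can be ranked by the alignment of their kernels before any SVM is trained.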
During the feature selection process, it was determined that a large number of features did not contribute to accurate classification of the data. The results of the feature selection procedure are given in Table 2.
Table 2
A value of "1" in an entry of Table 2 indicates that the particular measurement (tube/measurement combination) was selected; "0" indicates that the measurement was not selected. This reduced the number of features to be considered for each case from the original 91 to 26. Data from the reduced number of measurements were then used to train the SVM with the distributional kernel.
Using the selected measurements, the trained SVM was then tested on the 37 independent cases. The results were summarized using conventional statistical measures of binary classification test performance, with a cutoff of 0. Sensitivity, or recall, is the proportion of correctly classified positives out of the total number of positives as determined by cytogenetic testing. Specificity is the proportion of negatives that are correctly identified. The analysis results for the test data were as follows:
Sensitivity: 15/17 = 88%; specificity: 19/20 = 95%
This yields an overall error rate of 3/37 = 8%. Using the estimated standard deviation of the binomial distribution, σ = 0.0449, the test error rate is less than 15% at the 95% confidence level.
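The figures above can be reproduced with a few lines of Python. This is only a worked arithmetic check using the counts reported in the text; the variable names are illustrative.

```python
import math

tp, fn = 15, 2   # positives: 15 of 17 correctly classified
tn, fp = 19, 1   # negatives: 19 of 20 correctly classified

n = tp + fn + tn + fp
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
error_rate = (fn + fp) / n
# Standard deviation of the error estimate under a binomial model
sigma = math.sqrt(error_rate * (1 - error_rate) / n)

print(f"sensitivity={sensitivity:.0%} specificity={specificity:.0%}")
print(f"error={error_rate:.0%} sigma={sigma:.4f}")
```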
Fig. 4 shows the hierarchical structure of the system of the invention, represented by rooted tree 400. Each node 410 of the tree represents a basic analysis element that performs various tasks related to specifically gated flow data. Depending on the analysis performed at a given node, multiple branches can grow from the node. In the example shown, the initial node 410 splits into three branches 402, 404 and 406. The number of nodes and branches on the tree will vary according to the parameters to be analyzed. For example, in branch 402, the second node splits into branches 402a and 402b. Branch 404 splits into three branches 404a, 404b and 404c at its second node, and branch 404b then splits into branches 404ba and 404bb at the third node. The tree structure reflects hierarchical gating: the input data at each node is the gated result of its parent node.
Fig. 5 shows the structure of each node 410 on the tree shown in Fig. 4. Each node includes a gate definition 502, a gated data set 504, a graphical plot 506 of the data, an SVM configuration 508 and a trained SVM data set 510.
Example 2: Sample Results for a Standard Leukemia/Lymphoma Panel
Figs. 6A and 6B show example results produced by the system of the invention. The analysis software includes functions for reading data files in the standard FCS format and can also export results in various formats. Fig. 6A is split across multiple pages to provide sufficient resolution. In each case, the first page of the figure corresponds to the left panel 520 of the screenshot, the second page to the center panel 522, and the third page to the right panel 524. The left panel 520 shows the files corresponding to the gated data. As shown, the first gating parameter 526 is the sample tube number (Tube 1, Tube 2, ..., Tube x). This gating operation would correspond, for example, to the first node 410 in Fig. 4. The next gate 528 (sub-gate) is non-debris and non-debris + debris, which would be, for example, the second node on branch 402a. Non-debris is then further sub-gated into monocytes and lymphocytes. Continuing the preceding example, this gating 530 and analysis occur at the third node on branch 402a.
The center panel 522 of Fig. 6A shows the flow cytometry data with the different subgroups identified by the parameters. In this case, the marker is CD45KO plotted against SS INT LIN (side scatter intensity, linear). The right panel 524 of Fig. 6A provides a table listing the parameters used in the gating and SVM analysis. As shown, the parameters SS INT LIN and CD45KO are checked under the heading "In SVM", indicating that the SVM analysis is performed on these parameters, which provide the data for p and q in the distributional kernel of Equation (3) above.
The bottom of the screenshot of Fig. 6B provides a list of exemplary possible markers (antibodies) in the screening panel for the test shown. Here, 24 markers are indicated: CD2, CD3, CD4, CD5, CD7, CD8, CD10, CD11c, CD13, CD14, CD16, CD19, CD20, CD23, CD33, CD34, CD38, CD45, CD56, CD64, CD117, HLA-DR, kappa and lambda, which represent a standard leukemia/lymphoma panel used to assist in leukemia and lymphoma diagnosis and post-treatment follow-up. Although not all of the markers may be shown in this screenshot, Fig. 6B shows a sample screenshot of analysis results, including two 2D flow cytometry plots: CD45KO versus SS INT LIN (upper left quadrant) and SS INT LIN versus FS INT LIN (upper right quadrant). In addition, as will be apparent to those skilled in the art, the selection of appropriate markers depends on the abnormality that is known or suspected to be present. For example, an extended leukemia/lymphoma panel may add CD11b, CD41, CD138, CD235a and FMC-7 to the markers listed for the standard panel. Smaller panels of selected markers can be used for prognosis and treatment monitoring. Regardless of which markers are used, the same basic procedure is followed to extract information about the relevant subspaces from the large volume of data.
Part of the software system assists in designing the gating structure, SVM configuration and training, and default settings. Gating is defined as any process of selecting a cell subgroup based on given criteria on the observed parameters. Gating is an effective technique for reducing the complexity of the data and focusing the analysis on specific subgroups of the data. However, to address all aspects of the analysis, there will usually be a large number of gates, and the gating structure itself can be complicated.
The hierarchical structure of the system makes it possible to define a very general class of gates flexibly and easily.
At each node, in step 502, a 2D gate is defined based on a selection of any two parameters. The 2D plot 506 is the basis for defining the gate.
The gated data 504 at a node is the cumulative result of the chain of gates at the series of nodes preceding the current node. Because each node defines a 2D gate on an arbitrary combination of parameters, the hierarchical scheme allows virtually any gating configuration to be defined.
For example, gating on FS (forward scatter) and SS (side scatter) can filter out debris. On the non-debris, another gate on FS and the CD45 marker can be defined to separate five subgroups: CD45-Dim (dim marker), monocytes, CD45-Neg (negative marker), granulocytes and lymphocytes. The monocytes can be further gated to feed a new node.
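The hierarchical gating just described can be sketched as a small tree structure in Python. This is a minimal illustration rather than the system's implementation: the class name, the rectangular gate predicates, the threshold values and the sample events are all hypothetical.

```python
class GateNode:
    """One analysis element: a 2D gate applied to its parent's gated events."""

    def __init__(self, name, x_param, y_param, predicate):
        self.name = name
        self.x_param = x_param
        self.y_param = y_param
        self.predicate = predicate  # function (x, y) -> bool
        self.children = []

    def add_child(self, node):
        self.children.append(node)
        return node

    def gate(self, events):
        """Return the events that pass this node's 2D gate."""
        return [e for e in events
                if self.predicate(e[self.x_param], e[self.y_param])]

    def run(self, events, path=""):
        """Apply gates down the tree; each child sees only its parent's output."""
        gated = self.gate(events)
        label = path + "/" + self.name
        results = {label: gated}
        for child in self.children:
            results.update(child.run(gated, label))
        return results

# Hypothetical thresholds and events, for illustration only
root = GateNode("non-debris", "FS", "SS", lambda fs, ss: fs > 100 and ss > 100)
root.add_child(GateNode("lymphocytes", "CD45", "SS",
                        lambda cd45, ss: cd45 > 500 and ss < 300))
events = [
    {"FS": 50, "SS": 40, "CD45": 10},      # debris
    {"FS": 400, "SS": 200, "CD45": 800},   # passes both gates
    {"FS": 500, "SS": 600, "CD45": 700},   # non-debris, fails second gate
]
results = root.run(events)
```

Each entry in `results` is keyed by the chain of gates that produced it, so the data at any node is by construction the cumulative result of all gates above it.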
Fig. 7 provides a flow chart of a possible gating sequence in one branch of a tree such as tree 400 shown in Fig. 4. The branch shown includes three nodes, each having the structure of node 410 shown in Fig. 5, including an SVM processing step that separates the event data into selected groups. For example, in step 650, side scatter (SS) and forward scatter (FS) events are detected and then plotted in step 652, producing a 2D image of the data distribution. Using the plot of the SS/FS data, in step 654, node #1 performs a gating operation to separate non-debris from debris. Fig. 8A shows this separation, where the plot in the center panel of the screenshot shows the line between non-debris and debris. In step 656, the non-debris is selected, and the plot of the non-debris data evaluated for CD45 and SS INT LIN is then analyzed. The center panel of Fig. 8B shows this plot. In step 658, node #2 separates the non-debris data into 5 groups: granulocytes, monocytes, lymphocytes, CD45-Dim and CD45-Neg. The plot in the center panel of Fig. 8C shows the groupings identified by plotting the SS INT LIN data against the CD45KO marker. (Note the parameters checked under "In SVM" in the right panel of Fig. 8C: "SS INT LIN" and "CD45KO".) For the next step 660, the granulocyte data is excluded, and in node #3 the remaining monocyte data, plotted in the center panel of Fig. 8D, is gated (step 662) to separate the CD3 and CD5 cell surface receptors. Fig. 8E provides the resulting plot, showing the flow cytometry data gated into quadrants based on % positive on X and Y, % negative on X and Y, % dual positive and % dual negative. This decomposition is generated by SVM analysis of the data in the plot using the distributional kernel. The top of the right panel of Fig. 8E provides the numerical values of the distributional analysis.
This process is repeated for each tube of the patient sample. Additional branches with different gate definitions can be run in parallel; for example, a branch can fork from node #1 to perform a different separation. An optional final step is to combine the results of each tree branch to generate a diagnosis that takes into account the results obtained at the end of each branch. In a preferred embodiment, this final analysis step is performed by a support vector machine, which generates a diagnostic score, a binary (e.g., positive or negative) result, a probability, a prognostic prediction, or another appropriate indicator of the subject's diagnosis or prognosis.
The following is an exemplary algorithm for automatic gate detection according to an embodiment of the invention:
The system automatically detects gate definitions from points and lines specified by the user. Pseudocode for the algorithm is given below:
In some cases, gates may need some adjustment for individual cases. Because of the large number of gates involved in the analysis, this can be a tedious process.
The system of the invention provides an automatic gate adjustment function based on clustering. Gates in flow cytometry data are generally associated with cell clusters. Automatic clustering of the actual data provides a neutral way of appropriately adjusting the default gating panel.
A Gaussian mixture model (GMM) is a probability distribution that is a weighted sum of Gaussian distributions:
The parameters of a GMM can be determined by a learning algorithm known as the expectation-maximization (EM) algorithm. In statistics, the expectation-maximization algorithm is an iterative algorithm for finding maximum likelihood or maximum a posteriori (MAP) estimates of the parameters of a statistical model, where the model depends on unobserved latent variables.
The system applies a GMM to detect clusters in the flow data at a node. The cluster information is then used to adjust the gating template. The user can also adjust the gates manually.
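To make the EM procedure concrete, here is a simplified one-dimensional sketch in pure Python. Flow data is multi-dimensional, so this is only a reduced illustration of the technique; the function names, the initialization scheme and the synthetic data are assumptions made for the example, not details of the described system.

```python
import math
import random

def normal_pdf(x, mean, var):
    """Density of a 1D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_gmm_1d(data, k=2, iterations=50):
    """Fit a k-component (k >= 2) 1D Gaussian mixture with the EM algorithm."""
    lo, hi = min(data), max(data)
    # Simple initialization: means spread evenly over the data range
    means = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    variances = [((hi - lo) / k) ** 2] * k
    weights = [1.0 / k] * k
    for _ in range(iterations):
        # E-step: responsibility of each component for each point
        resp = []
        for x in data:
            p = [w * normal_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
            total = sum(p)
            resp.append([pi / total for pi in p])
        # M-step: re-estimate weights, means and variances
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # guard against collapse
    return weights, means, variances

# Synthetic data: two well-separated clusters (hypothetical values)
random.seed(0)
data = ([random.gauss(0.0, 1.0) for _ in range(200)]
        + [random.gauss(10.0, 1.0) for _ in range(200)])
weights, means, variances = fit_gmm_1d(data)
```

The fitted component means and variances describe where each cluster sits and how wide it is, which is the kind of information that could be used to shift or resize a default gate for an individual case.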
After gating, the features (parameters) that characterize each subgroup are analyzed. Each node in the gating tree has an associated SVM, which is defined on the gated data present at that node. The SVM associated with a particular subgroup is trained to analyze the distribution pattern in the data of that subgroup and to provide a quantitative assessment of whether the data in the subgroup is normal or abnormal.
SVM input is not limited to 2D plots. Any combination of the parameters and the gated population at each node can be used for SVM learning and subsequent SVM classification. The system can use different types of SVM, such as C-SVM, nu-SVM and one-class SVM.
Additional features of the software system include the following functions: importing data, making gating adjustments, performing SVM analysis and presenting results graphically.
The distributed system of SVM-based analysis nodes provides a quantitative indication of abnormality for the entire case.
In embodiments of the software system, different visualization methods for displaying data can be included. In addition to traditional 2D plots, 3D plots are also available, as shown in Fig. 9, where the X axis is CD45KO (CD45-Krome Orange dye), the Y axis is SS INT LIN (side scatter intensity, linear), and the Z axis is FS INT LIN (forward scatter intensity, linear). Any three parameters can be selected for a 3D plot. The user can optionally move, rotate and scale the 3D plot. The 3D functionality provides a significantly enhanced representation of the structure of the flow data.
Example 3: Highlighting Abnormal Results
A key goal of the automated flow cytometry analysis system is to allow laboratory technicians to more easily identify cases that require review by a pathologist.
This is achieved in part by displaying abnormal plots and values in the display of analysis results using visually distinguishable features, such as a particular font color or highlighting (e.g., red).
Fig. 10 provides an example of a screen display 600 on the monitor of a user workstation. In this example, flow cytometry analysis was performed on a patient sample. In one part of the analysis, plot 610 is generated to show the subgroups identified by gating on SS and CD45, separating the subgroups and giving the relative percentages of CD45-Neg (0.93%), granulocytes (50.58%), monocytes (3.78%), CD45-Dim (2.00%) and lymphocytes (42.70%), plotted with CD45KO (CD45-Krome Orange) on the X axis and SS INT LIN on the Y axis. In this example, the lymphocyte count exceeds the normal range of 20% to 40%, so the plot is highlighted to signal to the user that an abnormal value was measured. On a color display, the upper bar 612 of the plot may be red, or the entire plot may be marked in red. For illustration, the upper bar 612 of the plot is highlighted with wavy lines.
Plot 614 shows the results of gating on FS INT LIN and SS INT LIN. Because the results of this gating do not show an abnormal result, the plot is not highlighted, as indicated by the clear upper bar 616 of the plot. Table 618 in the display provides the numerical results for each subgroup. Again, because of the abnormal lymphocyte value, the displayed value is highlighted to indicate to the user that an abnormal value was measured. On a color display, the number "42.70" may appear in red or some other color to distinguish it from the other values. For illustration, the value is shown in underlined, bold, italic type. The analysis of the subgroups shown in plot 610 includes further gating of the lymphocytes, the numerical results of which are shown in table 620 of the display. As described above, each subgroup is analyzed by a separate node that branches off from the node that performed the initial gating and analysis. In this example, the lymphocytes are gated into the following subgroups: T cells (CD2, CD3), B cells (CD19, CD20), NK cells (CD16, (CD3-CD56)) and pre-B cells (CD10+CD19). The resulting numerical results are entered into table 620, where abnormal results relating to B cells are indicated by highlighted values 622 and 624. In table 630 of the display, another abnormal value, for CD4-CD8, is highlighted.
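The highlighting rule in this example amounts to a range check per subgroup. The sketch below illustrates it using the percentages from the example above and the one normal range stated there (lymphocytes, 20% to 40%); the function name and data layout are hypothetical, and subgroups without a stated range are simply left unflagged.

```python
# Normal range taken from the example above (lymphocytes: 20% to 40%).
# Subgroups without a listed range are not flagged in this sketch.
NORMAL_RANGES = {"Lymphocytes": (20.0, 40.0)}

def flag_abnormal(percentages, normal_ranges):
    """Return the subgroups whose percentage falls outside its normal range."""
    flagged = []
    for name, value in percentages.items():
        if name in normal_ranges:
            low, high = normal_ranges[name]
            if value < low or value > high:
                flagged.append(name)
    return flagged

measured = {
    "CD45-Neg": 0.93,
    "Granulocytes": 50.58,
    "Monocytes": 3.78,
    "CD45-Dim": 2.00,
    "Lymphocytes": 42.70,
}
abnormal = flag_abnormal(measured, NORMAL_RANGES)
highlight_plot = bool(abnormal)  # e.g., draw the plot's upper bar in red
```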
Figs. 11A-11F provide a further illustration of the display feature indicating the presence of abnormal results following analysis of a second sample from a patient. Fig. 11A plots Kappa FITC versus FS INT LIN. The clear upper bar indicates a normal result. Similarly, the results plotted in Fig. 11B (Lambda PE versus FS INT LIN) and Fig. 11C (CD23 ECD versus FS INT LIN) are normal. However, Fig. 11D (CD19 PC5.5 versus FS INT LIN) and Fig. 11E (CD11c PC7 versus FS INT LIN) are abnormal, as indicated by the highlighting in the bar above each plot. Fig. 11F (CD10 APC versus FS INT LIN) indicates a normal result for this parameter.
Fig. 12 shows an exemplary spreadsheet 700 of the parameters used to capture and quantify each subgroup. The spreadsheet columns include the node number (column C), the gated parameter (e.g., tube number, non-debris) (column D), and the sub-gate feature (e.g., non-debris, debris, Gate 1, CD4APCA) (column E). Column F corresponds to the X-axis parameter, and column G provides the Y-axis parameter. Columns H through M provide the weight, the X mean and Y mean, and the covariance of each group, all of which are used with the distributional kernel for SVM analysis.
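As an illustration of the per-group quantities listed in the spreadsheet (weight, X mean, Y mean and covariance), the following sketch computes them for one gated subgroup. The function name, the choice of population covariance and the sample points are assumptions made for the example.

```python
def summarize_subgroup(points, total_events):
    """Weight, means and 2x2 covariance of the (x, y) points in one gated subgroup."""
    n = len(points)
    weight = n / total_events  # fraction of events falling in this subgroup
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Population covariance (divide by n); a sample estimate would divide by n - 1
    cxx = sum((x - mean_x) ** 2 for x, _ in points) / n
    cyy = sum((y - mean_y) ** 2 for _, y in points) / n
    cxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    return {"weight": weight, "mean": (mean_x, mean_y),
            "cov": [[cxx, cxy], [cxy, cyy]]}

# Hypothetical events: four of ten fall inside the gate
subgroup = [(1.0, 2.0), (3.0, 2.0), (1.0, 4.0), (3.0, 4.0)]
summary = summarize_subgroup(subgroup, total_events=10)
```

A mean vector and covariance matrix per group are the inputs a Gaussian-based distributional kernel needs, so a summary of this shape can be passed directly to the kernel computation.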
Fig. 13 provides additional detail of a process involved in flow cytometry data analysis according to an embodiment of the invention. Plot 712 shows the flow cytometry data plotted by gating monocyte 2 using the X marker and Y marker (CD20 V450 and CD23 ECD, respectively). The spreadsheet data 710 for the node performing this analysis (sample node number 65, from column C of Fig. 12) gates monocyte 2 and then sub-gates it into 4 quadrants: % positive on X and Y; % negative on X and Y; % dual positive; and % dual negative. Sub-gating into quadrants provides weights corresponding to the counts (percentages) of cells falling into the different quadrants. The spreadsheet provides the calculated mean of each marker as well as the distribution (covariance) of each group. Because these results lie outside the normal values, the upper band 714 of plot 712 is highlighted to indicate to the user that an abnormal result has been identified.
Fig. 14 provides another example of a process involved in flow cytometry data analysis according to an embodiment of the invention. Plot 812 shows the flow cytometry data obtained by gating lymphocyte 2 using the X marker CD20 V450 and the Y marker Kappa FITC. The spreadsheet data 810 for sample node number 77 (from column C of Fig. 12) is gated and sub-gated into 4 quadrants: % positive on X and Y; % negative on X and Y; % dual positive; and % dual negative. The spreadsheet provides the calculated mean of each marker as well as the distribution (covariance) of each group. Because these results lie outside the normal values, the upper band 814 is highlighted to indicate to the user that an abnormal result has been identified.
As will be evident from the foregoing examples and figures, any combination of parameters can be used for automated analysis of flow cytometry data. Each parameter is separately
In some embodiments, the system is configured to maintain a database for collecting data from the analyzed cases (see, e.g., database 130 in Fig. 1). All relevant data, reported statistics and SVM-evaluated features are stored in this database. There is broad consensus among flow cytometry experts that flow cytometry data contains more useful information than is currently known. This database will help facilitate further research to discover new patterns and diagnostic information in flow data.
The software preferably includes a prompt reminding the user to save the data at the end of the analysis. For multiple analyses of the same case, the old data can optionally be overwritten or both versions of the data can be saved.
To ensure the integrity and security of the software system, a preferred embodiment of the software system includes a real-time authentication function. An authentication server is set up to handle authentication requests. The client software communicates with the server over the Internet using a secure protocol.
In some embodiments, analysis can be performed on a client computer located in a laboratory remote from the flow cytometer. For example, the raw data can be processed and transmitted via a network to one or more remote locations. Flow cytometry analysis software running on a client will need to complete authentication before being allowed to begin normal operation.
In one embodiment, the client transmits an encrypted message to the server that includes the following fields:
Nonce
Timestamp
Account
Purpose
Software signature
Hardware signature
On receiving an authentication request, the server verifies each field. If authentication succeeds, the server sends an encrypted authentication message matching the request back to the client. The protocol is designed to prevent "replay attacks." The use of a nonce and a timestamp ensures that the messages are unique, even for the same client.
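The role of the nonce and timestamp in preventing replay can be sketched as follows. This is an illustrative simplification, not the described protocol: it signs the message with an HMAC instead of encrypting it, and the shared key, freshness window, field names and example values are all hypothetical.

```python
import hashlib
import hmac
import json
import secrets
import time

SHARED_KEY = b"demo-key"   # hypothetical shared secret
MAX_AGE_SECONDS = 300      # hypothetical freshness window
seen_nonces = set()        # nonces already accepted by the server

def build_request(account, purpose, software_sig, hardware_sig):
    """Client side: assemble the fields and sign them."""
    fields = {
        "nonce": secrets.token_hex(16),
        "timestamp": time.time(),
        "account": account,
        "purpose": purpose,
        "software_signature": software_sig,
        "hardware_signature": hardware_sig,
    }
    payload = json.dumps(fields, sort_keys=True).encode()
    tag = hmac.new(SHARED_KEY, payload, hashlib.sha256).hexdigest()
    return payload, tag

def verify_request(payload, tag):
    """Server side: check integrity, freshness and nonce uniqueness."""
    expected = hmac.new(SHARED_KEY, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, tag):
        return False  # message altered or wrong key
    fields = json.loads(payload)
    if abs(time.time() - fields["timestamp"]) > MAX_AGE_SECONDS:
        return False  # stale message
    if fields["nonce"] in seen_nonces:
        return False  # replayed message
    seen_nonces.add(fields["nonce"])
    return True

payload, tag = build_request("lab-001", "analysis", "sw-sig", "hw-sig")
first = verify_request(payload, tag)
replay = verify_request(payload, tag)
```

The first submission is accepted, while resending the identical message is rejected because its nonce has already been seen; the timestamp bounds how long the server has to remember nonces.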
The authentication function helps to ensure that the software has not been maliciously altered, that the software is properly licensed, that the system is properly configured in a legitimate environment, and that all analyzed cases are accounted for.
Flow cytometry immunophenotyping is an accurate and highly sensitive method for detecting qualitative and quantitative abnormalities in blood cells, even when combined morphology and cytogenetics are non-diagnostic. The automated flow cytometry data analysis system disclosed herein provides the ability to automatically analyze the large volume of data generated during flow cytometry measurements, enhancing the accuracy, repeatability and versatility of flow cytometry methods. The capabilities provided by the methods disclosed herein not only increase the diagnostic value of flow cytometry but also extend research in the field beyond the currently limited methods, through data mining and pattern recognition on massive flow cytometry data collected and analyzed from many patients.

Claims (22)

1. A method for analyzing and classifying flow cytometry data, wherein the flow cytometry data comprises a plurality of features describing the data, the method comprising:
downloading an input data set comprising flow cytometry events for a cell population into a computer system comprising a processor and a storage device, wherein the processor is programmed to execute at least one support vector machine and to perform the following steps:
defining a hierarchy of analysis elements, each analysis element corresponding to a different gating definition, wherein each analysis element uses a gating algorithm to classify a cell subpopulation according to predetermined criteria for a combination of parameters, and wherein the classification is performed using a support vector machine with a distributional kernel; and
generating, at a display device, an output display with an identification of the flow cytometry data classification.
2. The method of claim 1, further comprising selecting a cell subpopulation and analyzing the selected cell subpopulation using a different analysis element that applies a different gating algorithm to further classify the subpopulation.
3. The method of claim 1, wherein the distributional kernel comprises a Bhattacharyya affinity having the form:
$$k(p,q) = e^{-\rho(p,q)} = \left(\sqrt{\frac{\left|(\Sigma_1+\Sigma_2)/2\right|}{\sqrt{|\Sigma_1|\cdot|\Sigma_2|}}}\right)^{-1} \exp\left\{-\frac{1}{8}(M_2-M_1)^{T}\left[\frac{\Sigma_1+\Sigma_2}{2}\right]^{-1}(M_2-M_1)\right\},$$
where p and q are input data points, M is the mean of a normal distribution, and Σ is a covariance matrix.
4. The method of claim 1, wherein the hierarchy comprises a tree having a plurality of branches, and the method further comprises a conclusion analysis step for combining the results produced by each branch into a diagnostic classification.
5. The method of claim 4, wherein the diagnostic classification comprises the presence or absence of a disease.
6. The method of claim 1, wherein the different gating definitions are selected from the group consisting of: sample tube identification, debris versus non-debris, granulocytes, monocytes, lymphocytes, negative marker intensity, and missing marker intensity.
7. The method of claim 1, wherein generating the output display comprises highlighting abnormal results to facilitate visual detection by a user.
8. A method for automatically analyzing flow cytometry data, comprising:
detecting side scatter and forward scatter events of a sample comprising a plurality of cells;
generating a plurality of plots of the side scatter and forward scatter events in two or three dimensions, the plurality of plots comprising flow cytometry data;
processing the plurality of plots using a hierarchy of analysis elements, each analysis element corresponding to a different gating definition, wherein each analysis element applies a gating algorithm to classify a cell subpopulation according to predetermined criteria for a combination of parameters, and wherein the classification is performed using a distributional kernel; and
generating, at a display device, an output with an identification of one or more flow cytometry data classifications.
9. The method of claim 8, further comprising selecting a cell subpopulation and analyzing the selected cell subpopulation using a different analysis element that applies a different gating algorithm to further classify the subpopulation.
10. The method of claim 8, wherein the distributional kernel comprises a Bhattacharyya affinity having the form:
$$k(p,q) = e^{-\rho(p,q)} = \left(\sqrt{\frac{\left|(\Sigma_1+\Sigma_2)/2\right|}{\sqrt{|\Sigma_1|\cdot|\Sigma_2|}}}\right)^{-1} \exp\left\{-\frac{1}{8}(M_2-M_1)^{T}\left[\frac{\Sigma_1+\Sigma_2}{2}\right]^{-1}(M_2-M_1)\right\},$$
where p and q are input data points, M is the mean of a normal distribution, and Σ is a covariance matrix.
11. The method of claim 8, wherein the hierarchy comprises a tree having a plurality of branches, and the method further comprises a conclusion analysis step for combining the results produced by each branch into a diagnostic classification.
12. The method of claim 11, wherein the diagnostic classification comprises the presence or absence of a disease.
13. The method of claim 8, wherein the different gating definitions are selected from the group consisting of: sample tube identification, debris versus non-debris, granulocytes, monocytes, lymphocytes, negative marker intensity, and missing marker intensity.
14. The method of claim 8, wherein generating the output display comprises highlighting abnormal results to facilitate visual detection by a user.
15. A system for automatically analyzing flow cytometry data, the system comprising:
a computer processor in communication with a memory, the memory having stored therein flow cytometry data from a plurality of assays performed on a plurality of samples comprising cells, the flow cytometry data comprising side scatter and forward scatter events; and
a computer program product embodied in a non-transitory computer-readable medium, the computer program product comprising instructions for causing the computer processor to perform the following operations:
receiving the flow cytometry data;
generating a plurality of plots of the side scatter and forward scatter events in two or three dimensions;
processing the plurality of plots using a hierarchy of analysis elements, each analysis element corresponding to a different gating definition, wherein each analysis element applies a gating algorithm to classify cell subpopulations in the sample according to predetermined criteria for a combination of parameters, and wherein the classification is performed using a distributional kernel; and
generating, at a display device, an output with an identification of one or more flow cytometry data classifications of the cells.
16. The system of claim 15, wherein the computer program product further comprises instructions for causing the computer processor to perform the following operations: selecting a cell subpopulation; and analyzing the selected cell subpopulation using a different analysis element that applies a different gating algorithm to further classify the subpopulation.
17. The system of claim 15, wherein the distributional kernel comprises a Bhattacharyya affinity having the form:
$$k(p,q) = e^{-\rho(p,q)} = \left(\sqrt{\frac{\left|(\Sigma_1+\Sigma_2)/2\right|}{\sqrt{|\Sigma_1|\cdot|\Sigma_2|}}}\right)^{-1} \exp\left\{-\frac{1}{8}(M_2-M_1)^{T}\left[\frac{\Sigma_1+\Sigma_2}{2}\right]^{-1}(M_2-M_1)\right\},$$
where p and q are input data points, M is the mean of a normal distribution, and Σ is a covariance matrix.
18. The system of claim 15, wherein the hierarchy comprises a tree having a plurality of branches, and the system further comprises a conclusion analysis step for combining the results produced by each branch into a diagnostic classification.
19. The system of claim 18, wherein the diagnostic classification comprises the presence or absence of a disease.
20. The system of claim 15, wherein the different gating definitions are selected from the group consisting of: sample tube identification, debris versus non-debris, granulocytes, monocytes, lymphocytes, negative marker intensity, and missing marker intensity.
21. The system of claim 15, wherein the memory is associated with a flow cytometry instrument, and the flow cytometry data is specific to a single subject.
22. The system of claim 15, wherein the memory comprises a database configured to store accumulated flow cytometry data generated from samples collected from a plurality of subjects.
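The Bhattacharyya affinity recited in claims 3, 10, and 17 can be evaluated numerically as in the following sketch. It assumes that each input (p or q) is summarized by a normal distribution with mean M and covariance Σ fitted to a set of events; the function names `fit_gaussian` and `bhattacharyya_kernel` are hypothetical and are not taken from the patent. Log-determinants are used instead of raw determinants purely for numerical stability.

```python
import numpy as np


def fit_gaussian(events):
    """Summarize an (n_events, n_params) array, n_params >= 2, by mean and covariance."""
    events = np.asarray(events, dtype=float)
    return events.mean(axis=0), np.cov(events, rowvar=False)


def bhattacharyya_kernel(m1, s1, m2, s2):
    """Return k(p, q) = exp(-rho(p, q)) for two Gaussians (M1, Sigma1), (M2, Sigma2)."""
    m1 = np.atleast_1d(np.asarray(m1, dtype=float))
    m2 = np.atleast_1d(np.asarray(m2, dtype=float))
    s1 = np.atleast_2d(np.asarray(s1, dtype=float))
    s2 = np.atleast_2d(np.asarray(s2, dtype=float))

    s = (s1 + s2) / 2.0  # averaged covariance (Sigma1 + Sigma2) / 2
    diff = m2 - m1

    # log of the prefactor: (|S| / sqrt(|S1| * |S2|)) ** (-1/2)
    _, logdet_s = np.linalg.slogdet(s)
    _, logdet_1 = np.linalg.slogdet(s1)
    _, logdet_2 = np.linalg.slogdet(s2)
    log_prefactor = -0.5 * (logdet_s - 0.5 * (logdet_1 + logdet_2))

    # quadratic form (M2 - M1)^T S^{-1} (M2 - M1), solved without forming the inverse
    quad = diff @ np.linalg.solve(s, diff)

    return float(np.exp(log_prefactor - quad / 8.0))
```

The kernel equals 1 for identical distributions and decays toward 0 as the means separate or the covariances diverge. A matrix of such values computed between pairs of samples could be supplied to a support vector machine that accepts a precomputed kernel.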
CN201580075757.XA 2014-12-10 2015-12-10 Automate flow cytometry method and system Pending CN107430587A (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US201462090316P 2014-12-10 2014-12-10
US62/090,316 2014-12-10
PCT/US2015/065095 WO2016094720A1 (en) 2014-12-10 2015-12-10 Automated flow cytometry analysis method and system

Publications (1)

Publication Number Publication Date
CN107430587A true CN107430587A (en) 2017-12-01

Family

ID=59810972

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201580075757.XA Pending CN107430587A (en) 2014-12-10 2015-12-10 Automate flow cytometry method and system

Country Status (2)

Country Link
EP (1) EP3230887A4 (en)
CN (1) CN107430587A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101981446A (en) * 2008-02-08 2011-02-23 医疗探索公司 Method and system for analysis of flow cytometry data using support vector machines
CN102625932A (en) * 2009-09-08 2012-08-01 诺达利蒂公司 Analysis of cell networks
US20140228233A1 (en) * 2011-06-07 2014-08-14 Traci Pawlowski Circulating biomarkers for cancer

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
KAREL FISER ET AL.: "Detection and Monitoring of Normal and Leukemic Cell Populations with Hierarchical Clustering of Flow Cytometry Data", 《CYTOMETRY PART A》 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113228049A (en) * 2018-11-07 2021-08-06 福斯分析仪器公司 Milk analyzer for classifying milk
CN113228049B (en) * 2018-11-07 2024-02-02 福斯分析仪器公司 Milk analyzer for classifying milk
CN109655616A (en) * 2018-12-19 2019-04-19 广州金域医学检验中心有限公司 Combined reagent and system for detecting acute myeloid leukemia cells
CN109655616B (en) * 2018-12-19 2022-05-06 广州金域医学检验中心有限公司 Combined reagent and system for detecting acute myeloid leukemia cells
CN114323162A (en) * 2022-03-14 2022-04-12 青岛清万水技术有限公司 Shell vegetable growth data online monitoring method and device and electronic equipment
CN114323162B (en) * 2022-03-14 2022-06-28 青岛清万水技术有限公司 Method and device for on-line monitoring of shell vegetable growth data and electronic equipment

Also Published As

Publication number Publication date
EP3230887A4 (en) 2018-08-01
EP3230887A1 (en) 2017-10-18

Similar Documents

Publication Publication Date Title
US20160169786A1 (en) Automated flow cytometry analysis method and system
CN101981446B (en) Method and system for analysis of flow cytometry data using support vector machines
US20240044904A1 (en) System, method, and article for detecting abnormal cells using multi-dimensional analysis
US20130060775A1 (en) Spanning-tree progression analysis of density-normalized events (spade)
EP3631417B1 (en) Visualization, comparative analysis, and automated difference detection for large multi-parameter data sets
WO2020081292A1 (en) Adaptive sorting for particle analyzers
CN107430587A (en) Automate flow cytometry method and system
US20230393048A1 (en) Optimized Sorting Gates
Azad et al. Immunophenotype discovery, hierarchical organization, and template-based classification of flow cytometry samples
EP2332073B1 (en) Shape parameter for hematology instruments
US20230215571A1 (en) Automated classification of immunophenotypes represented in flow cytometry data
US10235495B2 (en) Method for analysis and interpretation of flow cytometry data
Bashashati et al. A pipeline for automated analysis of flow cytometry data: preliminary results on lymphoma sub-type diagnosis
TWI792751B (en) Medical image project management platform
TW202311742A (en) Automated classification of immunophenotypes represented in flow cytometry data
Cordeiro et al. Autogating in Flow Cytometry Data Using SVM Classifiers for Bacterioplankton Identification
Mohamed Using Probability Binning and Bayesian Inference to measure Euclidean Distance of Flow Cytometric data
Lin Bayesian variable selection in clustering and hierarchical mixture modeling

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20171201
