CN107389536A - Flow cytometry particle classification and counting method based on a density-distance center algorithm - Google Patents

Publication number
CN107389536A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710641341.0A
Other languages
Chinese (zh)
Other versions
CN107389536B (en
Inventor
陶靖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Nano Derivatives Technology Co Ltd
Original Assignee
Shanghai Nano Derivatives Technology Co Ltd

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N15/00 Investigating characteristics of particles; Investigating permeability, pore-volume or surface-area of porous materials
    • G01N15/10 Investigating individual particles
    • G01N15/14 Optical investigation techniques, e.g. flow cytometry
    • G01N2015/1486 Counting the particles

Abstract

The present invention relates to a flow cytometry particle classification and counting method based on a density-distance center algorithm, comprising the following steps: 1) acquiring, with a flow cytometer, a flow data set of the cell particles to be classified and counted, the flow data set containing multidimensional data for each particle; 2) computing the local density and the distance parameter of every particle in the flow data set according to the density-distance center algorithm, then screening and sorting the particles to obtain the initial cluster centers; 3) using the initial cluster centers as the initial values of a mixture model algorithm, clustering the particles according to the mixture model to obtain multiple classified particle clusters, and counting them. Compared with the prior art, the invention offers high accuracy, good stability, adaptability to the distribution of flow data, suitability for classifying small-sample populations, and fast computation.

Description

Flow cytometry particle classification and counting method based on a density-distance center algorithm
Technical field
The present invention relates to the field of cell particle classification and measurement, and in particular to a flow cytometry particle classification and counting method based on a density-distance center algorithm.
Background
Flow cytometry (FCM) is a technique for quantitative analysis using a flow cytometer. Using the principle of hydrodynamic focusing, it lines up the cells or particles under analysis in single file so that they pass rapidly through the detection beam one by one. A high-precision optical system, electrical signal processing and computer data analysis then measure the multi-angle scattered light and multicolor fluorescence that each cell or particle produces, so that physical and chemical characteristics such as size, internal structure, nucleic acid content and protein content can be obtained for tens of thousands of individual cells or particles in a short time. Because it is fast, accurate, high-throughput and multi-parametric, the flow cytometer is an important basic instrument for frontier research in biomedicine, as well as an important piece of clinical laboratory equipment.
The multi-angle scattered light and multicolor fluorescence produced by each cell or particle are collected by the optical system and converted into electrical signals by photoelectric sensors. The electrical signals are processed and sampled into digital signals, which are stored and analyzed by a computer. The characteristic data of all the cells or particles acquired by a flow cytometer are called flow data.
Traditionally, flow data are analyzed by experienced operators who project the data onto two-dimensional scatter plots and then analyze the populations of interest, for example by classifying and counting them, using gates drawn around regions; this is known as manual gating. With the continuing development of flow cytometry, the volume of flow data has multiplied, and automated data analysis has become the main direction of future development. Several automated methods have been proposed for the cluster analysis of flow data. They fall mainly into clustering methods based on probability distributions and clustering methods based on spatial information.
Clustering methods based on probability distributions are mainly finite mixture model algorithms. The Gaussian mixture model algorithm based on the Bayesian information criterion handles cell populations whose data are normally or nearly normally distributed well. The t-distribution mixture model algorithm transforms non-Gaussian data toward a near-normal distribution and is used in place of the Gaussian mixture model for cluster analysis of flow data. The skew t-distribution mixture model algorithm handles asymmetrically distributed data better. These mixture model clustering algorithms have developed steadily, improving how well the models adapt to different data distributions. However, the solutions obtained by Gaussian, t-distribution and skew t-distribution mixture models are only locally optimal, so clustering algorithms based on finite mixture models depend on the position of the initial points (that is, the cluster centers). Real data are often complicated, for example with many noise points, in which case mixture model clustering can misclassify, so the stability of these algorithms is not high.
Clustering methods based on spatial information are the other main class of methods for flow data analysis, for example the K-means and DBSCAN algorithms, but their ability to cluster flow data is limited. Clustering algorithms based on finite mixture models are better suited to flow data and are more widely used. Because such algorithms depend on the position of the initial points (the cluster centers), they are very sensitive to the model's initial values. Clustering algorithms based on K-means and on mixture models usually choose the initial cluster centers at random, and it is customary to place the initial centers as far from each other as possible. However, the K-means algorithm itself finds only a locally optimal solution, so a random initialization can still become trapped in a local optimum. The choice of initial values is therefore highly unstable, and neither the accuracy nor the stability of the result can be guaranteed.
In practice, flow data are often complicated, and clustering them under adverse conditions is very challenging. When there are many noise points, for example, earlier methods sometimes mistakenly assign the noise points to a separate cluster. In addition, there is no good solution for clusters that have few samples and are sparsely distributed. For example, in leukocyte differential analysis of human peripheral blood, monocytes typically account for 2% to 10% of total leukocytes and eosinophils for 1% to 6%, whereas lymphocytes account for 40% and granulocytes for 50%, together making up the majority. In such multi-class clustering, the large-sample and small-sample clusters differ greatly in size and lie close to each other, and the difficulty is locating and separating the small-sample clusters. Because a small-sample cluster has few samples and is sparsely distributed, it is easily disturbed by neighboring dominant clusters and mistakenly absorbed into another cluster. Small-sample clusters therefore place very high demands on the sensitivity and stability of the algorithm.
Summary of the invention
The object of the present invention is to overcome the above shortcomings of the prior art by providing a flow cytometry particle classification and counting method based on a density-distance center algorithm.
The object of the invention can be achieved through the following technical solution:
A flow cytometry particle classification and counting method based on a density-distance center algorithm, comprising the following steps:
1) acquiring, with a flow cytometer, a flow data set of the cell particles to be classified and counted, the flow data set containing multidimensional data for each particle;
2) computing the local density and the distance parameter of every particle in the flow data set according to the density-distance center algorithm, then screening and sorting the particles to obtain the initial cluster centers;
3) using the initial cluster centers as the initial values of a mixture model algorithm, clustering the particles according to the mixture model to obtain multiple classified particle clusters, and counting them.
In step 1), when the flow data set is two-dimensional, a two-dimensional scatter plot is formed with the forward-scatter channel data on the y-axis and the side-scatter channel data on the x-axis, or with the side-scatter channel data on the y-axis and the fluorescence channel data on the x-axis. When the flow data set is three-dimensional, a three-dimensional scatter plot is formed with the forward-scatter channel data on the x-axis, the side-scatter channel data on the y-axis and the fluorescence channel data on the z-axis.
Step 2) specifically comprises the following steps:
21) for a flow data set $S=\{x_1,x_2,\dots,x_i,\dots,x_n\}$, the local density $\rho_i$ and the distance parameter $\delta_i$ of the $i$-th particle $x_i$ are defined respectively as

$$\rho_i=\sum_{j\neq i}\chi(d_{ij}-d_c),\qquad \delta_i=\min_{j:\,\rho_j > \rho_i} d_{ij},$$

where $d_{ij}$ is the Euclidean distance from $x_i$ to $x_j$, $d_c$ is the cutoff distance, and $\chi(x)$ is the function that equals 1 when $x < 0$ and 0 otherwise;
22) a local density threshold $\rho_0$ is set, and particles whose local density is below the threshold are excluded;
23) the remaining particles are sorted by distance $\delta_i$ in descending order;
24) the number of clusters k is set, and the first k particles in the sorted sequence are selected as the initial cluster centers.
In step 21), when the $i$-th particle is the point of maximum local density, $\delta_i$ is assigned the maximum of the distances from the $i$-th particle to all other points, that is:

$$\delta_i=\max_{j} d_{ij}$$
In step 21), when several particles have the same local density, an increment approaching 0 is added to that local density, and the local density and distance parameter of each particle are then recalculated.
In step 24), when the Euclidean distance between two cluster centers is less than a set threshold, the two are regarded as the same cluster; either of the two centers, or the one with the larger local density, is taken as the new cluster center.
In step 3), the mixture model algorithms include the Gaussian mixture model, the t-distribution mixture model and the skew t-distribution mixture model.
Compared with the prior art, the present invention has the following advantages:
1. High accuracy and good stability: the initial center of each particle cluster is first located by the density-distance center algorithm, so the subsequent clustering is accurate and stable, and misclassification caused by a locally optimal solution does not arise.
2. Adapts to the distribution of flow data: clustering with a mixture model (such as the Gaussian, t-distribution or skew t-distribution mixture model) adapts effectively to the distribution characteristics of flow data.
3. Suitable for classifying small-sample populations: the method handles small-sample populations effectively, locating and classifying them with high accuracy.
4. Fast computation: the initial cluster centers determined by the density-distance center algorithm serve as the initial center values of the mixture model clustering algorithm, which speeds up the computation.
Brief description of the drawings
Fig. 1 is a flowchart of the method of the invention.
Fig. 2 is a schematic diagram of Embodiment I of the invention, in which Fig. (2a) is the distance-density distribution plot, Fig. (2b) is the two-dimensional scatter plot, and Fig. (2c) is the clustering result.
Fig. 3 is a schematic diagram of Embodiment II of the invention, in which Fig. (3a) is the distance-density distribution plot, Fig. (3b) is the two-dimensional scatter plot, and Fig. (3c) is the clustering result.
Fig. 4 is a schematic diagram of Embodiment III of the invention, in which Fig. (4a) is the distance-density distribution plot, Fig. (4b) is the two-dimensional scatter plot, and Fig. (4c) is the clustering result.
Detailed description of the embodiments
The invention is described in detail below with reference to the drawings and specific embodiments.
The invention proposes a mixture model clustering method for flow data based on density-distance centers. The density-distance center algorithm is applied to locate the initial cluster centers of the flow data, thereby ensuring the stability and accuracy of the finite mixture model result. The method combines approaches based on probability distributions with approaches based on spatial information (density and distance), so that small-sample clusters are separated better, while noise resistance, stability and accuracy are all high.
Fig. 1 shows the specific flow of the clustering method used by the invention to process flow data. The clustering steps are described in detail below with reference to Fig. 1:
In step 401, the flow data set to be analyzed, namely the characteristic data of the cell particles, including the measured multi-angle scattered light and multicolor fluorescence, is acquired with a flow cytometer. The flow data set contains multidimensional data for each particle. When the data are two-dimensional, for example forward-scatter channel data and side-scatter channel data, a two-dimensional scatter plot can be formed with the forward-scatter channel data on the y-axis and the side-scatter channel data on the x-axis, as shown in Fig. (2b). If the data comprise side-scatter channel data and fluorescence channel data, the side-scatter channel data can be placed on the y-axis and the fluorescence channel data on the x-axis. When the data are three-dimensional, for example forward-scatter, side-scatter and fluorescence channel data, a three-dimensional scatter plot can be formed with the forward-scatter channel data on the x-axis, the side-scatter channel data on the y-axis and the fluorescence channel data on the z-axis.
In step 402, for the flow data set to be analyzed, the local density and distance parameter of each particle are obtained by the density-distance center algorithm and plotted in a distance-density distribution plot, as shown in Fig. (2a).
For a data set to be clustered $S=\{x_1,x_2,\dots,x_n\}$, two parameters are defined for the $i$-th particle $x_i$: the local density $\rho_i$ and the distance $\delta_i$ ($i\in[1,n]$). The local density reflects how dense the data are within a certain neighborhood and is defined as

$$\rho_i=\sum_{j\neq i}\chi(d_{ij}-d_c) \tag{1}$$

where the function

$$\chi(x)=\begin{cases}1, & x < 0\\ 0, & x \ge 0\end{cases} \tag{2}$$

The parameter $d_{ij}$ is the distance from $x_i$ to $x_j$, for example the Euclidean distance. The parameter $d_c > 0$ is the cutoff distance, preset according to the actual sample data, for example $d_c=5$. By formula (1), the local density $\rho_i$ is the number of data points in the data set (excluding $x_i$ itself) whose distance to $x_i$ is less than $d_c$.
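The counting interpretation of formula (1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `local_density` and the four sample points are hypothetical, Euclidean distance is used, and the brute-force double loop is chosen for clarity rather than speed.

```python
import math

def local_density(points, d_c):
    """Return rho_i: the number of other points closer than d_c to point i."""
    rho = []
    for i, p in enumerate(points):
        count = 0
        for j, q in enumerate(points):
            # chi(d_ij - d_c) equals 1 exactly when d_ij < d_c; the point itself is skipped.
            if i != j and math.dist(p, q) < d_c:
                count += 1
        rho.append(count)
    return rho

# Hypothetical data: three points close together and one far away.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
rho = local_density(points, d_c=5.0)
```

With `d_c = 5`, each of the three nearby points counts the other two as neighbors, while the isolated point counts none, which is the behavior that later lets a density threshold discard it as noise.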
To the distance δ of certain pointiDefinition be calculate it arrive all points bigger than its local density distance, take therein Minimum value, specific formula are as follows:
If this point has been the maximum point of local density, then δiBe entered as it to distance a little maximum Value.
According to formula (1)-(4), each point xiA local density ρ can be obtainediWith a distance value δi
Especially, if multiple local density's identical particle points be present, 0 is leveled off to plus one to this local density Increment, then recalculate local density and the distance parameter of each particle.
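The distance parameter and the tie-breaking rule can be sketched as follows. The names `break_ties` and `distance_parameter` are hypothetical, and the specific tie-breaking scheme (a tiny increment proportional to the particle index) is an assumption made for illustration; the text only requires that the increment approach 0.

```python
import math

def break_ties(rho, eps=1e-9):
    """Add a tiny index-dependent increment so that no two densities are equal.

    Assumes eps * len(rho) stays well below 1, so distinct integer densities keep their order.
    """
    return [r + eps * i for i, r in enumerate(rho)]

def distance_parameter(points, rho):
    """delta_i: min distance to a denser point, or max distance to any point for the densest one."""
    n = len(points)
    delta = []
    for i in range(n):
        to_denser = [math.dist(points[i], points[j]) for j in range(n) if rho[j] > rho[i]]
        if to_denser:
            delta.append(min(to_denser))
        else:
            # No denser point exists: this is the global density maximum.
            delta.append(max(math.dist(points[i], points[j]) for j in range(n)))
    return delta

# Hypothetical one-dimensional layout embedded in 2-D; points 1 and 2 tie on density.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
rho = break_ties([3, 2, 2, 1])
delta = distance_parameter(points, rho)
```

A point that is dense and also far from any denser point gets a large `delta`, which is what marks it as a candidate cluster center.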
In step 403, a local density threshold $\rho_0$ is set and each particle is tested against it: if the local density of a particle is less than $\rho_0$, the particle is deleted from the data set.
In step 404, the remaining particles are sorted by distance $\delta_i$ in descending order.
In step 405, the number of clusters k is set, and the first k particles in the sorted sequence are selected as the initial cluster centers.
For a given flow assay, the number of clusters to be classified is known a priori and is the same for all experimental samples of the same type, so the number of clusters is preset to a fixed value k, for example k=4.
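Let the cluster centers be $x_{c_j}$ ($j\in[1,k]$), where $c_j$ is the index of the $j$-th cluster center (the index $i$ of the successively selected $\delta_i$) and $D$ is the set of indices of the cluster centers already selected. The selection is then:

$$c_j=\arg\max_{i\notin D}\delta_i,\qquad D\leftarrow D\cup\{c_j\},$$

where $i$ ranges over the particles retained in step 403.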
In particular, if the Euclidean distance between two cluster centers is less than a set threshold, the two are regarded as the same cluster; either of the two centers, or the one with the larger local density, is taken as the new cluster center.
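Steps 403 to 405 and the merge rule can be sketched together. The function `select_centers` and its parameter names are hypothetical. Two choices here are assumptions that the text leaves open: of two centers that are too close, the one with the larger distance parameter is kept, and the next candidate in the sorted sequence is then used to fill up to k centers.

```python
import math

def select_centers(points, rho, delta, rho_0, k, merge_dist):
    """Return the indices of up to k initial cluster centers."""
    # Step 403: drop particles whose local density is below the threshold.
    kept = [i for i in range(len(points)) if rho[i] >= rho_0]
    # Step 404: sort the remaining particles by distance parameter, largest first.
    kept.sort(key=lambda i: delta[i], reverse=True)
    centers = []
    for i in kept:
        # Merge rule: a candidate within merge_dist of an accepted center is the same cluster.
        if all(math.dist(points[i], points[c]) >= merge_dist for c in centers):
            centers.append(i)
        # Step 405: stop once k centers have been selected.
        if len(centers) == k:
            break
    return centers

# Hypothetical values: index 3 is an isolated noise point with a large delta but zero density.
points = [(0.0, 0.0), (0.5, 0.0), (10.0, 0.0), (20.0, 20.0)]
rho = [5, 4, 6, 0]
delta = [10.0, 0.5, 22.4, 22.4]
centers = select_centers(points, rho, delta, rho_0=1, k=2, merge_dist=2.0)
```

The noise point is removed by the density threshold before sorting, so its large distance value cannot promote it to a cluster center.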
In step 406, the initial cluster centers are used as the initial values of the mixture model algorithm, that is, as the location parameters $\mu_j$ of the t-distribution component density functions. The particles are then clustered according to the mixture model, with the parameters estimated by the maximum likelihood method.
Given the distribution characteristics of flow data, clustering algorithms based on finite mixture models are comparatively well suited. The Gaussian mixture model algorithm handles cell populations whose data are normally or nearly normally distributed well; the t-distribution mixture model algorithm suits non-Gaussian data; the skew t-distribution mixture model algorithm handles asymmetrically distributed data better. These mixture model clustering algorithms have developed steadily, improving how well the models adapt to different data distributions. The method of obtaining initial cluster centers by the density-distance center algorithm can be applied to all of these mixture models (Gaussian, t-distribution and skew t-distribution mixture models). However, in view of the distribution characteristics of blood cell flow data, and considering implementation complexity and running efficiency, the t-distribution mixture model is used for cluster analysis here.
The specific algorithm of the mixture model is described below:
1) Mixture model
Let $X$ be a p-dimensional random vector, and let $x_1,x_2,\dots,x_n$ be n mutually independent p-dimensional random sample observations of $X$. The probability density function of the mixture model of $X$ composed of k components is defined as

$$f(x;\Theta)=\sum_{i=1}^{k}\pi_i\,f(x;\theta_i) \tag{5}$$

where k is the number of components of the mixture model; $\Theta=(\pi_1,\dots,\pi_{k-1},\theta_1,\dots,\theta_k)$ collects the unknown parameters; $f(x;\theta_i)$ is the probability density function of the $i$-th component and $\theta_i$ is its unknown parameter vector; $\pi_i$ is the mixing proportion, the share of the $i$-th component density in the mixture model, which satisfies $\pi_i \ge 0$ and $\sum_{i=1}^{k}\pi_i=1$.
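2) t-mixture model
If $f(x;\theta_i)$ in formula (5) is a t-distribution, then $f(x;\Theta)$ is a t-mixture model. The probability density function of the p-dimensional t-distribution has the form:

$$f(x;\mu,\Sigma,\upsilon)=\frac{\Gamma\!\left(\frac{\upsilon+p}{2}\right)|\Sigma|^{-1/2}}{(\pi\upsilon)^{p/2}\,\Gamma\!\left(\frac{\upsilon}{2}\right)\left(1+\frac{\delta(x;\mu,\Sigma)}{\upsilon}\right)^{(\upsilon+p)/2}} \tag{6}$$

where $\mu$ is the location parameter, $\Sigma$ is a positive definite matrix, $\upsilon$ is the degrees of freedom, $\delta(x;\mu,\Sigma)=(x-\mu)^T\Sigma^{-1}(x-\mu)$ is the squared Mahalanobis distance between $x$ and $\mu$, $\pi$ in the factor $(\pi\upsilon)^{p/2}$ is the circle constant, and $\Gamma(x)$ is the Gamma function, defined as $\Gamma(x)=\int_0^{\infty}s^{x-1}e^{-s}\,ds$. For the t-mixture model, every component density function is a p-dimensional t-distribution density function, and the mixture takes the form:

$$f(x;\Theta)=\sum_{j=1}^{k}\pi_j\,f(x;\mu_j,\Sigma_j,\upsilon_j) \tag{7}$$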
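As a numerical illustration of the t-distribution density and the weighted sum that forms the mixture, the sketch below evaluates both for the two-dimensional case (p = 2) using only the standard library. The function names, the closed-form 2x2 matrix inverse and the example parameters are assumptions made for illustration; a general p-dimensional version would need a linear algebra routine.

```python
import math

def t_density_2d(x, mu, sigma, nu):
    """Bivariate (p = 2) t density; sigma is a 2x2 positive-definite matrix as nested tuples."""
    p = 2
    (a, b), (c, d) = sigma
    det = a * d - b * c
    dx, dy = x[0] - mu[0], x[1] - mu[1]
    # Squared Mahalanobis distance, using the closed-form inverse of a 2x2 matrix.
    maha = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    # Work in log space so the Gamma functions do not overflow for large nu.
    log_norm = (math.lgamma((nu + p) / 2) - math.lgamma(nu / 2)
                - (p / 2) * math.log(math.pi * nu) - 0.5 * math.log(det))
    return math.exp(log_norm - ((nu + p) / 2) * math.log1p(maha / nu))

def t_mixture_density(x, weights, components):
    """Weighted sum of component densities; components is a list of (mu, sigma, nu)."""
    return sum(w * t_density_2d(x, mu, sigma, nu)
               for w, (mu, sigma, nu) in zip(weights, components))

# Hypothetical two-component mixture with identity scale matrices.
identity = ((1.0, 0.0), (0.0, 1.0))
components = [((0.0, 0.0), identity, 4.0), ((5.0, 5.0), identity, 4.0)]
density_at_origin = t_mixture_density((0.0, 0.0), [0.7, 0.3], components)
```

A quick sanity check: for p = 2 and an identity scale matrix, the density at the location parameter reduces to 1/(2π) regardless of the degrees of freedom.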
For flow data that can be divided into k clusters, the t-mixture model assumes that the data are composed of k t-distributions. The final clustering result consists of the k flow cytometry populations corresponding to the k t-distributions. A maximum likelihood estimate is set up from the flow data sample, and the mixture parameters that maximize the likelihood can be obtained with the EM algorithm. Let $X_i=(x_{i1},x_{i2},\dots,x_{ip})^T$ be a p-dimensional sample value in the flow data. Introduce the component label vector $Z_i=(z_{i1},z_{i2},\dots,z_{ik})^T$ of $X_i$, with $z_{ij}=1$ when $X_i$ belongs to the $j$-th t-distribution and $z_{ij}=0$ otherwise; that is, $Z_i$ indicates which t-distribution the sample value $X_i$ belongs to. The complete data vector is then $X_c=(X^T,Z_1^T,Z_2^T,\dots,Z_n^T)^T$, where $X=(X_1^T,X_2^T,\dots,X_n^T)^T$. The corresponding log-likelihood function can be written as:

$$\ln L_c(\Theta\mid X_c)=\sum_{i=1}^{n}\sum_{j=1}^{k}z_{ij}\,\ln\!\big(\pi_j\,f(X_i;\mu_j,\Sigma_j,\upsilon_j)\big) \tag{8}$$
3) Estimation with the EM algorithm
For the t-mixture model, parameter estimation with the EM algorithm proceeds as follows:
(1) E-step: let $\Theta^{(t)}$ be the estimate at the t-th iteration; the conditional expectation of the log-likelihood function given $\Theta^{(t)}$ is

$$Q(\Theta;\Theta^{(t)})=E\big(\ln L_c(\Theta\mid X_c);\Theta^{(t)}\big) \tag{9}$$

(2) M-step: based on formula (8), find the $\Theta^{(t+1)}$ that maximizes $Q(\Theta;\Theta^{(t)})$, that is,

$$\Theta^{(t+1)}=\arg\max_{\Theta}\,Q(\Theta;\Theta^{(t)}) \tag{10}$$

(3) Formulas (9) and (10) are iterated until the parameters converge, giving the estimate of the parameter $\Theta$.
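The E-step and M-step loop can be illustrated with a deliberately simplified sketch. It is not the patent's procedure: it assumes one-dimensional data, keeps the degrees of freedom `nu` fixed instead of re-estimating them, runs a fixed number of iterations instead of testing convergence, and uses the standard textbook updates for a t-mixture with known degrees of freedom. The function names and sample data are hypothetical. The point of the example is the role of `init_centers`, which stands in for the cluster centers found by the density-distance step.

```python
import math

def t_pdf(x, mu, var, nu):
    """Univariate t density with location mu, squared scale var and nu degrees of freedom."""
    log_norm = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                - 0.5 * math.log(math.pi * nu * var))
    return math.exp(log_norm - ((nu + 1) / 2) * math.log1p((x - mu) ** 2 / (nu * var)))

def em_t_mixture(data, init_centers, nu=4.0, n_iter=50):
    """Fit a 1-D t-mixture by EM, starting from the given centers (nu is held fixed)."""
    k, n = len(init_centers), len(data)
    pi = [1.0 / k] * k          # equal initial mixing proportions (assumption)
    mu = list(init_centers)     # initial locations come from the center-selection step
    var = [1.0] * k             # unit initial squared scales (assumption)
    tau = []
    for _ in range(n_iter):
        # E-step: posterior membership tau[i][j] and down-weighting factor u[i][j].
        tau, u = [], []
        for x in data:
            dens = [pi[j] * t_pdf(x, mu[j], var[j], nu) for j in range(k)]
            total = sum(dens)
            tau.append([d / total for d in dens])
            u.append([(nu + 1) / (nu + (x - mu[j]) ** 2 / var[j]) for j in range(k)])
        # M-step: update proportions, locations and squared scales.
        for j in range(k):
            t_sum = sum(tau[i][j] for i in range(n))
            w_sum = sum(tau[i][j] * u[i][j] for i in range(n))
            pi[j] = t_sum / n
            mu[j] = sum(tau[i][j] * u[i][j] * data[i] for i in range(n)) / w_sum
            var[j] = sum(tau[i][j] * u[i][j] * (data[i] - mu[j]) ** 2 for i in range(n)) / t_sum
    return pi, mu, var, tau

# Hypothetical data: two well-separated groups of four values each.
data = [0.0, 0.2, -0.1, 0.1, 10.0, 10.3, 9.8, 10.1]
pi, mu, var, tau = em_t_mixture(data, init_centers=[0.1, 10.1])
```

Because EM only converges to a local optimum, starting each component near a true cluster center, as here, is what keeps the components from drifting onto the wrong group.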
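The EM algorithm yields iterative update formulas for the relevant parameters (the formulas are not reproduced in this text). The degrees of freedom $\upsilon_j^{(t+1)}$ are obtained as the solution of a nonlinear equation (likewise not reproduced in this text).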
In step 407, the clustering yields multiple particle clusters, which can be marked with different colors and counted by class, as shown in Fig. (2c).
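The counting in step 407 can be sketched as follows, assuming the mixture model has already produced a posterior membership probability for every particle and every component. The function name, the population names and the posterior values are all hypothetical.

```python
from collections import Counter

def count_populations(posteriors, names):
    """Assign each particle to its most probable component and count per population."""
    # Index of the largest posterior probability in each row.
    labels = [max(range(len(row)), key=row.__getitem__) for row in posteriors]
    counts = Counter(names[label] for label in labels)
    total = len(labels)
    # Map each population name to (count, percentage of all particles).
    return {name: (counts[name], 100.0 * counts[name] / total) for name in names}

# Hypothetical posteriors for four particles and two components.
posteriors = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3], [0.6, 0.4]]
summary = count_populations(posteriors, ["lymphocyte", "monocyte"])
```

The same labels could also drive the coloring of the scatter plot, with one color per component.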
Embodiment I:
As shown in Fig. 2, this is specific Embodiment I of the method. For the flow data sample to be processed, a two-dimensional scatter plot is built from the measurement data of the forward-scatter channel (FSC) and the side-scatter channel (SSC), as shown in Fig. (2b) (horizontal axis: side-scatter channel; vertical axis: forward-scatter channel). This is a normal sample in which the monocyte population accounts for 5% and the clusters are clearly separated: lymphocytes at the upper left, red blood cell debris at the lower left, monocytes at the upper middle and granulocytes on the right.
The distance and local density parameters of each particle obtained by the density-distance center algorithm are plotted in the distance-density distribution plot, as shown in Fig. (2a), where the horizontal axis is local density and the vertical axis is distance.
A local density threshold is set and particles whose local density is below the threshold are excluded; the remaining particles are sorted by distance in descending order; the number of clusters is set to k=4, and the first k particles in the sorted sequence are selected as the initial cluster centers. The selected cluster centers are marked in Fig. (2b) with "o", "+", "Δ" and " " respectively.
The 1st initial cluster center selected is X2719 in the data set, denoted Xc1;
The 2nd initial cluster center selected is X102 in the data set, denoted Xc2;
The 3rd initial cluster center selected is X3546 in the data set, denoted Xc3;
The 4th initial cluster center selected is X1568 in the data set, denoted Xc4.
The initial cluster centers obtained are used as the initial values of the mixture model, and the flow data are solved iteratively according to the mixture model, with the parameters estimated by the maximum likelihood method. The result of cluster analysis with the t-distribution mixture model is shown in Fig. (2c). Each particle cluster is marked with a different color and counted by class. The data in Fig. 2 contain many noise points; solving with the mixture model alone would easily misclassify and become trapped in a locally optimal solution. Determining the initial cluster centers with the density-distance center algorithm ensures the stability and accuracy of the finite mixture model result.
Taking the classification result of manual gating as the standard, the sample is divided into 4 populations after clustering by this algorithm: red blood cell debris, lymphocytes, monocytes and granulocytes. Compared with the manual gating result, the error of this algorithm for the monocytes, the smallest of these populations, is 0.33%.
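The text does not state how the error against manual gating is computed. One plausible reading, shown here purely as an assumption, is the absolute difference between the population's share of all particles under the two methods, expressed in percentage points. The function names and the counts below are illustrative and are not taken from the embodiment.

```python
def population_percentage(count, total):
    """Share of one population among all particles, in percent."""
    return 100.0 * count / total

def percentage_point_error(algorithm_count, manual_count, total):
    """Absolute difference between the two population shares, in percentage points."""
    return abs(population_percentage(algorithm_count, total)
               - population_percentage(manual_count, total))

# Illustrative numbers only; they are not taken from the embodiment.
error = percentage_point_error(algorithm_count=53, manual_count=50, total=1000)
```

Other definitions, such as the relative difference between the two counts, would give different figures, so the definition should be fixed before comparing results across samples.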
Embodiment II:
As shown in Fig. 3, this is specific Embodiment II of the method. For the flow data sample to be processed, a two-dimensional scatter plot is built from the forward-scatter channel (FSC) and side-scatter channel (SSC) data, as shown in Fig. (3b) (horizontal axis: side scatter; vertical axis: forward scatter). The monocyte population of this sample is very small, accounting for 2%, which corresponds to a patient sample or an extreme case.
The distance and local density parameters of each particle obtained by the density-distance center algorithm are plotted in the distance-density distribution plot, as shown in Fig. (3a), where the horizontal axis is local density and the vertical axis is distance.
A local density threshold is set and particles whose local density is below the threshold are excluded; the remaining particles are sorted by distance in descending order; the number of clusters is set to k=4, and the first k particles in the sorted sequence are selected as the initial cluster centers. The selected cluster centers are marked in Fig. (3b) with "o", "+", "Δ" and " " respectively.
The initial cluster centers obtained are used as the initial values of the mixture model, and the flow data are solved iteratively according to the mixture model, with the parameters estimated by the maximum likelihood method. The result of cluster analysis with the t-distribution mixture model is shown in Fig. (3c). Each particle cluster is marked with a different color and counted by class. The monocyte population of this sample is very small and sparsely distributed, so it is easily disturbed by neighboring dominant clusters and mistakenly absorbed into another cluster. Determining the initial cluster centers with the density-distance center algorithm ensures the stability and accuracy of the finite mixture model result.
Taking the classification result of manual gating as the standard, the sample is divided into 4 populations after clustering by this algorithm: red blood cell debris, lymphocytes, monocytes and granulocytes. Compared with the manual gating result, the error of this algorithm for the monocytes, the smallest of these populations, is 0.19%.
Embodiment III:
As shown in figure 4, the specific implementation case III for context of methods.Pending stream data sample, according to preceding to scattered The data for penetrating optical channel (FSC) and lateral scattering optical channel (SSC) establish two-dimentional scatter diagram, and such as (transverse axis is side shown in figure (4b) To scattering light, the longitudinal axis is forward scattering light).Not only sample size is few (accounting for 2%) for the monocyte group of this sample, and and lymph Cell mass is in close proximity, part aliasing.
Distance and local density's parameter corresponding to each particle obtained as density-distance center algorithm, represent away from From in-density profile, such as scheme shown in (4a), transverse axis is local density, and the longitudinal axis is distance.
Local density threshold is set, and excludes the particle that local density is less than threshold value;By remaining all particles according to Sequence is arranged in apart from descending order;Monoid number k=4 is set, k particle, which is used as, before being chosen successively according to sequence waits to gather The initial classes group center of class.The class group center of selection is represented with " o ", "+", " Δ " and " " respectively in (4b) is schemed.
Initial value of the initial classes group center obtained as mixed model, is iterated according to mixed model streaming data and asks Solution, wherein carrying out parameter Estimation with reference to maximum likelihood algorithm.The result of cluster analysis is carried out as schemed with t- Distribution Mixed Models Shown in (4c).Each particle monoid is identified with different colours, and carries out differential counting statistics.The monocyte group sample of this sample Amount is seldom, and, part aliasing in close proximity with lymphocyte populations, it is easy to is disturbed by adjacent dominant groups, and is divided into by mistake A part for lymphocyte populations.Initial classes group center is determined with density-distance center algorithm, so as to ensure Finite mixture model As a result stability and accuracy.
Taking the classification result of manual gating as the reference, the sample clustered by this algorithm is divided into four populations: red blood cell debris, lymphocytes, monocytes and granulocytes. Compared with the manual gating result, the error of this algorithm for the small monocyte population is 0.27%.
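The text does not state how the error against manual gating is computed. One plausible reading, assumed here, is the absolute difference between the percentage of events that each method assigns to a population. The sketch below implements that reading; the function names and the label strings used to call it are hypothetical.

```python
from collections import Counter

def population_percentages(labels):
    """Percentage of events assigned to each population label."""
    counts = Counter(labels)
    total = len(labels)
    return {name: 100.0 * n / total for name, n in counts.items()}

def percentage_errors(auto_labels, manual_labels):
    """Absolute difference, in percentage points, for each population."""
    auto = population_percentages(auto_labels)
    manual = population_percentages(manual_labels)
    # A population missing from one result counts as 0% in that result.
    return {name: abs(auto.get(name, 0.0) - manual.get(name, 0.0))
            for name in set(auto) | set(manual)}
```

Under this definition the comparison needs only the per-population counts from each method, not an event-by-event match.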
To summarize the embodiments above, the density-distance center algorithm gives very stable results under difficult distributions such as small-sample populations and populations that lie close to each other. Determining the initial population centers with the density-distance center algorithm therefore yields accurate and reliable centers, handles the localization and classification of small-sample populations well, and effectively excludes the interference of noise points, which ensures the stability and accuracy of the finite mixture model result. Used as the initial center values of the mixture model clustering algorithm, these centers also speed up the computation.

Claims (7)

1. A flow cytometry particle classification and counting method based on a density-distance center algorithm, characterized by comprising the following steps:
1) acquiring, with a flow cytometer, a flow data set of the cell particles to be classified and counted, the flow data set containing multidimensional data of the particles;
2) obtaining the local density and distance parameters of each particle in the flow data set according to the density-distance center algorithm, then screening and sorting the particles to obtain the initial population centers for clustering;
3) using the initial population centers as the initial values of a mixture model algorithm, clustering the particles according to the mixture model to obtain multiple classified particle populations, and computing the count statistics.
2. The flow cytometry particle classification and counting method based on a density-distance center algorithm according to claim 1, characterized in that, in step 1), when the data in the flow data set are two-dimensional, the forward-scatter channel data are used as the y-axis and the side-scatter channel data as the x-axis to form a two-dimensional scatter plot, or the side-scatter channel data are used as the y-axis and the fluorescence channel data as the x-axis to form a two-dimensional scatter plot; when the data in the flow data set are three-dimensional, the forward-scatter channel data are used as the x-axis, the side-scatter channel data as the y-axis and the fluorescence channel data as the z-axis to form a three-dimensional scatter plot.
3. The flow cytometry particle classification and counting method based on a density-distance center algorithm according to claim 1, characterized in that step 2) specifically comprises the following steps:
21) for a flow data set S = {x_1, x_2, ..., x_i, ..., x_n}, defining the local density ρ_i and the distance δ_i of the i-th particle x_i as:
$$\rho_i = \sum_j \chi(d_{ij} - d_c)$$
$$\chi(x) = \begin{cases} 1 & x < 0 \\ 0 & x \geq 0 \end{cases}$$
$$\delta_i = \min_{j:\, \rho_j > \rho_i} (d_{ij})$$
where d_ij is the Euclidean distance from x_i to x_j, d_c is the cutoff distance, and χ(x) is an indicator function;
22) setting a local density threshold ρ_0 and excluding the particles whose local density is below the threshold;
23) arranging all remaining particles in a sequence in descending order of distance;
24) setting the number of populations k and choosing the first k particles of the sequence, in order, as the initial population centers for clustering.
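Steps 21) to 24) can be sketched in a few lines. This is a minimal brute-force illustration under stated assumptions, not the patented implementation: the function name `density_distance_centers` is hypothetical, a particle is not counted as its own neighbor, and the densest particle is given its largest distance to any other point so that its δ is defined.

```python
import math

def density_distance_centers(points, d_c, rho_0, k):
    """Return (indices of k initial centers, local densities, distances)."""
    n = len(points)
    # Pairwise Euclidean distances d_ij (O(n^2), fine for a small sketch).
    d = [[math.dist(p, q) for q in points] for p in points]
    # Local density: number of other particles closer than the cutoff d_c.
    rho = [sum(1 for j in range(n) if j != i and d[i][j] < d_c)
           for i in range(n)]
    # Distance: gap to the nearest particle with strictly higher density.
    delta = []
    for i in range(n):
        higher = [d[i][j] for j in range(n) if rho[j] > rho[i]]
        # Assumed convention for the densest particle: its maximum distance.
        delta.append(min(higher) if higher else max(d[i]))
    # Step 22: drop particles whose local density is below the threshold.
    kept = [i for i in range(n) if rho[i] >= rho_0]
    # Steps 23-24: sort by distance, descending, and take the first k.
    kept.sort(key=lambda i: delta[i], reverse=True)
    return kept[:k], rho, delta
```

An isolated noise point has a large δ but a low ρ, so the density threshold in step 22) is what keeps it out of the candidate list.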
4. The flow cytometry particle classification and counting method based on a density-distance center algorithm according to claim 3, characterized in that, in step 21),
when the i-th particle is the point of maximum local density, δ_i is assigned the maximum of the distances from the i-th particle to all other points, that is:
$$\delta_i = \max_j (d_{ij}).$$
5. The flow cytometry particle classification and counting method based on a density-distance center algorithm according to claim 3, characterized in that, in step 21),
when multiple particles have the same local density, an increment approaching 0 is added to that local density, and the local density and distance parameters of each particle are then recalculated.
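The tie-breaking rule can be illustrated as follows. This is a sketch under assumptions the claim does not spell out: densities are integer counts, the increment grows with the order in which equal values appear, and `break_density_ties` and `eps` are hypothetical names. The increment must stay small enough that it never reorders particles whose densities originally differed.

```python
def break_density_ties(rho, eps=1e-6):
    """Return densities in which no two values are equal."""
    seen = {}   # how many times each density value has appeared so far
    out = []
    for value in rho:
        count = seen.get(value, 0)
        # The first occurrence is unchanged; later ones get a tiny increment.
        out.append(value + count * eps)
        seen[value] = count + 1
    return out
```

After this adjustment only one particle holds the maximum density, so the strict comparison ρ_j > ρ_i in the definition of δ_i no longer leaves several tied particles without a denser neighbor.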
6. The flow cytometry particle classification and counting method based on a density-distance center algorithm according to claim 3, characterized in that, in step 24), when the Euclidean distance between two population centers is less than a set threshold, the two centers are regarded as belonging to the same population, and either one of the two centers is taken as the new population center, or the one with the larger local density is taken as the new population center.
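A short sketch of this merging rule, using the second option (keep the center with the larger local density). The function name `merge_close_centers` and the parameter `min_sep` are hypothetical, and `centers` is assumed to hold indices into `points` and `rho`.

```python
import math

def merge_close_centers(centers, rho, points, min_sep):
    """Drop any center closer than min_sep to a denser, already kept center."""
    kept = []
    # Visit candidates from densest to sparsest so the denser one survives.
    for c in sorted(centers, key=lambda i: rho[i], reverse=True):
        if all(math.dist(points[c], points[m]) >= min_sep for m in kept):
            kept.append(c)
    return kept
```

Merging can leave fewer than k centers; how the missing ones are replaced is not specified in the claim.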
7. The flow cytometry particle classification and counting method based on a density-distance center algorithm according to claim 1, characterized in that, in step 3), the mixture model algorithm includes Gaussian mixture models, t-distribution mixture models and skew t-distribution mixture models.
CN201710641341.0A 2017-07-31 2017-07-31 Flow cell particle classification counting method based on density-distance center algorithm Active CN107389536B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710641341.0A CN107389536B (en) 2017-07-31 2017-07-31 Flow cell particle classification counting method based on density-distance center algorithm

Publications (2)

Publication Number Publication Date
CN107389536A true CN107389536A (en) 2017-11-24
CN107389536B CN107389536B (en) 2020-03-31

Family

ID=60343087

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710641341.0A Active CN107389536B (en) 2017-07-31 2017-07-31 Flow cell particle classification counting method based on density-distance center algorithm

Country Status (1)

Country Link
CN (1) CN107389536B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102680379A (en) * 2012-05-31 2012-09-19 长春迪瑞医疗科技股份有限公司 Device for classifying and counting white cells by using even high-order aspherical laser shaping system
US20130226469A1 (en) * 2008-04-01 2013-08-29 Purdue Research Foundation Gate-free flow cytometry data analysis
CN103562920A (en) * 2011-03-21 2014-02-05 贝克顿迪金森公司 Neighborhood thresholding in mixed model density gating
CN103942415A (en) * 2014-03-31 2014-07-23 中国人民解放军军事医学科学院卫生装备研究所 Automatic data analysis method of flow cytometer
CN105424560A (en) * 2015-11-24 2016-03-23 苏州创继生物科技有限公司 Automatic quantitative analysis method for data of flow-type particle instrument

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ALEX RODRIGUEZ et al.: "Clustering by fast search and find of density peaks", SCIENCE *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110954465A (en) * 2018-09-26 2020-04-03 希森美康株式会社 Flow cytometer, data transmission method, and information processing system
CN110516584A (en) * 2019-08-22 2019-11-29 杭州图谱光电科技有限公司 A kind of Auto-counting of Cells method based on dynamic learning of microscope
CN110516584B (en) * 2019-08-22 2021-10-08 杭州图谱光电科技有限公司 Cell automatic counting method based on dynamic learning for microscope
CN112507991A (en) * 2021-02-04 2021-03-16 季华实验室 Method and system for setting gate of flow cytometer data, storage medium and electronic equipment
CN113380318A (en) * 2021-06-07 2021-09-10 天津金域医学检验实验室有限公司 Artificial intelligence assisted flow cytometry 40CD immunophenotyping detection method and system
CN114136868A (en) * 2021-12-03 2022-03-04 浙江博真生物科技有限公司 Flow cytometry full-automatic clustering method based on density and nonparametric clustering
CN116401567A (en) * 2023-06-02 2023-07-07 支付宝(杭州)信息技术有限公司 Clustering model training, user clustering and information pushing method and device
CN116401567B (en) * 2023-06-02 2023-09-08 支付宝(杭州)信息技术有限公司 Clustering model training, user clustering and information pushing method and device

Also Published As

Publication number Publication date
CN107389536B (en) 2020-03-31

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant