WO2014112567A1 - Apparatus for classifying cell groups and method for classifying cell groups - Google Patents

Apparatus for classifying cell groups and method for classifying cell groups

Info

Publication number
WO2014112567A1
WO2014112567A1 PCT/JP2014/050720 JP2014050720W WO2014112567A1 WO 2014112567 A1 WO2014112567 A1 WO 2014112567A1 JP 2014050720 W JP2014050720 W JP 2014050720W WO 2014112567 A1 WO2014112567 A1 WO 2014112567A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
cell group
population
group
age
Application number
PCT/JP2014/050720
Other languages
French (fr)
Japanese (ja)
Inventor
真也 黒田
大樹 浅野
新介 宇田
峰夫 黒川
裕 矢冨
Original Assignee
国立大学法人 東京大学
Application filed by 国立大学法人 東京大学
Publication of WO2014112567A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/69 Microscopic objects, e.g. biological cells or cellular parts
    • G06V20/698 Matching; Classification
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N15/00 Investigating characteristics of particles; Investigating permeability, pore-volume or surface-area of porous materials
    • G01N15/10 Investigating individual particles
    • G01N2015/1006 Investigating individual particles for cytology
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N15/00 Investigating characteristics of particles; Investigating permeability, pore-volume or surface-area of porous materials
    • G01N15/10 Investigating individual particles
    • G01N15/14 Optical investigation techniques, e.g. flow cytometry
    • G01N2015/1402 Data analysis by thresholding or gating operations performed on the acquired signals or stored data
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N15/00 Investigating characteristics of particles; Investigating permeability, pore-volume or surface-area of porous materials
    • G01N15/10 Investigating individual particles
    • G01N15/14 Optical investigation techniques, e.g. flow cytometry
    • G01N2015/1477 Multiparameters
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N15/00 Investigating characteristics of particles; Investigating permeability, pore-volume or surface-area of porous materials
    • G01N15/10 Investigating individual particles
    • G01N15/14 Optical investigation techniques, e.g. flow cytometry
    • G01N2015/1488 Methods for deciding

Definitions

  • The present invention relates to a cell group classification apparatus and a cell group classification method for classifying a cell group to be processed.
  • In recent years, analysis using flow cytometry (FCM) has been performed for myelodysplastic syndrome (MDS), and prognosis prediction models based on this analysis have been proposed (Non-patent Document 1).
  • The present invention has been made in view of the above circumstances, and one of its objects is to provide a cell group classification device and a cell group classification method capable of performing analysis that takes a realistic pathological condition into account.
  • The present invention, which solves the problems of the conventional example described above, is a cell group classification device that includes: holding means for holding population data obtained from measurement data of cell group samples diagnosed as normal; means for generating, from measurement data obtained for the cell group to be processed, processing data that can be tested for similarity to the population data; test means for testing the similarity between the generated processing data and the population data held in the holding means; and classification means for classifying the cell group to be processed into one of a plurality of predetermined groups according to a predetermined criterion based on the result of the test.
  • The holding means may hold, for each predetermined age group, age-group-specific population data obtained from measurement data of cell group samples that were provided by providers whose age belongs to that age group and that were diagnosed as normal. The test means may then test the similarity between the processing data and the age-group-specific population data for the age group to which the age of the provider of the cell group to be processed belongs.
  • The measurement data may be a plurality of parameters obtained by flow cytometry, and the population data and the processing data may be m-dimensional (m is a natural number) distribution data obtained from those parameters.
  • The cell group classification method includes: a step of acquiring population data based on measurement data of cell group samples diagnosed as normal; a step of generating, from measurement data obtained for the cell group to be processed, processing data that can be tested for similarity to the population data; a step of testing the similarity between the generated processing data and the population data held in the holding means; and a step of classifying the cell group to be processed into one of a plurality of predetermined groups according to a predetermined criterion based on the result of the test.
  • As illustrated in FIG. 1, the cell group classification apparatus 1 includes a control unit 11, a storage unit 12, an operation unit 13, an output unit 14, and an analysis information input unit 15, and is connected to the flow cytometry device 2.
  • The cell group classification apparatus 1 executes the cell group classification method according to the embodiment of the present invention.
  • The control unit 11 is a program control device such as a CPU (Central Processing Unit), and operates according to a program stored in the storage unit 12.
  • The control unit 11 performs processing using population data, obtained from measurement data of cell group samples diagnosed as normal, that is stored in the storage unit 12 described later.
  • The control unit 11 generates, from the measurement data obtained for the cell group to be processed, processing data that can be tested for similarity to the population data, and tests the similarity between the generated processing data and the population data.
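  • The control unit 11 then classifies the cell group to be processed into one of a plurality of predetermined groups according to a predetermined criterion based on the result of the test.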
  • The storage unit 12 includes a memory and a disk device.
  • The storage unit 12 stores population data obtained from measurement data of cell group samples diagnosed as normal (samples of a cell group of the same tissue as the cell group to be processed).
  • The measurement data is a plurality of parameters obtained by running a cell group collected from a specimen diagnosed as normal through a flow cytometry device, and the population data is m-dimensional (m is a natural number) distribution data obtained from those parameters.
  • As an example, this population data is obtained as follows: for each cell group obtained from a plurality of specimens diagnosed as normal, measurement data for at least FSC (forward scattered light), SSC (side scattered light), and a plurality of predetermined cell surface markers (information indicating fluorescence intensity) is obtained by flow cytometry and accumulated, and a density distribution function of the population is estimated from the accumulated result.
  • The density distribution function can be estimated by methods such as the adaptive partitioning method or kernel density estimation (Parzen, E. (1962). On estimation of a probability density function and mode. Ann. Math. Stat. 33, pp. 1065-1076).
  • The population data is, for example, the result of estimating these density distribution functions. Because this estimation is widely available in statistical software, a detailed description is omitted.
  • Specifically, for each of the cell surface markers, three data sets are generated: first two-dimensional distribution data with FSC on the X axis and SSC on the Y axis, second two-dimensional distribution data with FSC on the X axis and the cell surface marker measurement data on the Y axis, and third two-dimensional distribution data with SSC on the X axis and the cell surface marker measurement data on the Y axis. A density distribution function is estimated for each of these two-dimensional distribution data by a method such as kernel density estimation, and the results are stored in the storage unit 12 as population data. That is, when r types of cell surface markers are measured, 3r two-dimensional distribution data are generated and 3r population data are stored in the storage unit 12.
  • The storage unit 12 also stores information on test results (known result information) for each of a plurality of cell groups whose disease progress is known in advance. This known result information will be described later.
  • The storage unit 12 holds the program executed by the control unit 11.
  • The program may be provided on a computer-readable recording medium such as a DVD-ROM and then stored in the storage unit 12.
  • The storage unit 12 also operates as a work memory for the control unit 11.
  • The operation unit 13 includes a keyboard and a mouse.
  • The operation unit 13 accepts an operation from the user and outputs information representing the content of the operation to the control unit 11.
  • The output unit 14 is a display, a printer, or another output device, and outputs information according to instructions input from the control unit 11.
  • The analysis information input unit 15 is an interface connected to the flow cytometry device 2, and outputs the measurement data input from the flow cytometry device 2 to the control unit 11.
  • The control unit 11 of the present embodiment executes the program stored in the storage unit 12 and, as shown in FIG. 2, functionally includes a processing data generation unit 21, a population data acquisition unit 22, a test unit 23, a classification unit 24, and a result presentation unit 25.
  • The processing data generation unit 21 acquires the measurement data obtained for the cell group to be processed from the flow cytometry device 2 via the analysis information input unit 15. It then generates processing data that can be tested for similarity with the population data stored in the storage unit 12. Specifically, this processing data can be generated as follows. First, m-dimensional distribution data is generated from the measurement data (a plurality of parameters) obtained by the flow cytometry device 2.
  • The distribution data is, for example, at least (2 + r)-dimensional distribution data containing 2 + r items of information: at least FSC (forward scattered light), SSC (side scattered light), and measurement data for the same predetermined r types of cell surface markers that were used to produce the population data.
  • Based on this distribution data, the processing data generation unit 21 generates, for each of the r types of cell surface markers, first two-dimensional distribution data with FSC on the X axis and SSC on the Y axis, second two-dimensional distribution data with FSC on the X axis and the cell surface marker measurement data on the Y axis, and third two-dimensional distribution data with SSC on the X axis and the cell surface marker measurement data on the Y axis. It then estimates a density distribution function for each of the 3r two-dimensional distribution data and outputs the results to the test unit 23 as processing data.
  • For this estimation, various methods such as the adaptive partitioning method and kernel density estimation can be used, in the same way as when obtaining the population data.
  • The population data acquisition unit 22 reads, from the storage unit 12, the population data corresponding to each of the first to third two-dimensional distribution data for each of the r types of cell surface markers, and outputs it to the test unit 23.
  • The test unit 23 tests the similarity between each two-dimensional distribution data of the processing data input from the processing data generation unit 21 (hereinafter, this processing data is written as dp) and the corresponding two-dimensional distribution data of the population data input from the population data acquisition unit 22 (hereinafter, this population data is written as de). Specifically, for the pair of distribution data dp(x1, x2, ..., xm) and de(x1, x2, ..., xm), the test unit 23 calculates likelihood ratio statistics λn_1, λn_2, and λn_3, one for each of the first to third two-dimensional distribution data.
  • The likelihood ratio statistics λn_1, λn_2, and λn_3 each asymptotically follow a chi-square distribution with k-1 degrees of freedom when the sample size is large, and the absolute value of each statistic increases as the two distributions differ more.
  • The test unit 23 obtains the likelihood ratio statistics λn_1, λn_2, and λn_3 for the first to third two-dimensional distribution data of each of the r types of cell surface markers.
  • The test unit 23 then accumulates the likelihood ratio statistics λn_1, λn_2, and λn_3 obtained for each cell surface marker, yielding r likelihood ratio statistics λn, one for each of the r types of cell surface markers.
  • Next, the known result information stored in the storage unit 12 will be described.
  • The known result information associates, for each of a plurality of cell groups whose disease progress is known in advance, information (I) indicating the progress of the disease with the likelihood ratio statistic λn.
  • This likelihood ratio statistic λn is obtained for each of the r types of cell surface markers.
  • The classification unit 24 performs processing using this known result information.
  • The classification unit 24 classifies the cell group to be processed into one of a plurality of predetermined groups according to a predetermined criterion based on the calculation result of the test unit 23. Specifically, the classification unit 24 performs clustering on r-dimensional vectors whose elements are likelihood ratio statistics λn: the vector of λn values for the r types of cell surface markers generated from the processing data, and, for each of the plurality of cell groups included in the known result information, the vector of that cell group's λn values for the r types of cell surface markers. As the clustering method, various widely known methods can be used, such as the longest distance method based on the distance between r-dimensional vectors, or the median method.
  • By this clustering, the classification unit 24 divides, for example, the cell group to be processed and the cell groups in the known result information into a plurality of groups.
  • The result presentation unit 25 outputs, via the output unit 14, the information on the cell groups in the known result information that belong to each of the groups obtained by the classification unit 24 (for example, information indicating the progress of the disease), together with information indicating into which group the likelihood ratio statistic λn generated from the processing data has been classified (that is, into which group the cell group to be processed has been classified).
  • Using the result of this classification, the result presentation unit 25 may further present a survival curve (Kaplan-Meier survival curve) for the cell groups in the known result information that belong to each group.
  • This embodiment has the above configuration and operates as follows.
  • A cell group collected from a subject as the processing target is measured using the flow cytometry device 2.
  • The cell group classification apparatus 1 obtains the measurement data for the cell group to be processed from the flow cytometry device 2 and, based on the measurement data (a plurality of parameters), generates m-dimensional distribution data as processing data.
  • Likelihood ratio statistics λn_1, λn_2, and λn_3 are then calculated for the two-dimensional distribution data corresponding to each of the first to third two-dimensional distribution data.
  • For each of the r types of cell surface markers, the cell group classification device 1 further accumulates the likelihood ratio statistics λn_1, λn_2, and λn_3 to obtain r likelihood ratio statistics λn, one per cell surface marker.
  • The cell group classification device 1 performs clustering by a widely known method using the r-dimensional vector whose elements are the likelihood ratio statistics λn for the r types of cell surface markers generated from the processing data, and the r-dimensional vectors whose elements are the likelihood ratio statistics λn for the r types of cell surface markers of each of the plurality of cell groups included in the known result information stored in the storage unit 12.
  • The cell group classification device 1 outputs, via the output unit 14, the information on the cell groups in the known result information that belong to each of the plurality of groups (for example, information indicating the progress of the disease) and information indicating into which group the likelihood ratio statistic λn generated from the processing data has been classified.
  • The measurement data of the cell group samples diagnosed as normal may be divided according to the age group of the provider of each sample. In that case, as illustrated in the drawings, the storage unit 12 holds information representing each age group (information specifying the lowest and highest ages belonging to the age group) in association with the population data generated from the measurement data for that age group (age-group-specific population data).
  • The control unit 11 accepts the age of the provider of the cell group to be processed from the operation unit 13 or the like.
  • The population data acquisition unit 22 acquires the age-group-specific population data for the age group to which the accepted age belongs, and outputs it to the test unit 23.
  • The test unit 23 tests the similarity between the acquired age-group-specific population data and the processing data obtained from the cell group to be processed.
  • In the description so far, 3r two-dimensional distribution data were used, but the present embodiment is not limited to this.
  • For example, r three-dimensional distribution data may be obtained with FSC on the X axis, SSC on the Y axis, and each cell surface marker on the Z axis, and the similarity of each to the population data for the corresponding cell surface marker may be tested.
  • Distribution data that includes data for a plurality of cell surface markers may also be used.
  • Processing may also be performed using the m-dimensional distribution data as it is. Specifically, in this example, population data corresponding to the m-dimensional distribution data for the r types of cell surface markers is generated and stored in the storage unit 12.
  • The cell group classification device 1 tests the similarity between the m-dimensional distribution data of the generated processing data dp and the m-dimensional distribution data of the population data de read from the storage unit 12. This test is similar to that already described and is performed on the pair of distribution data dp(x1, x2, ..., xm) and de(x1, x2, ..., xm) in the m-dimensional space.
  • The known result information likewise associates one likelihood ratio statistic λn, based on the m-dimensional distribution data, with each cell group. The cell group classification device 1 then performs clustering by a known method (for example, the longest distance method) using the known result information and the likelihood ratio statistic λn generated from the processing data.
  • The cell group classification device 1 outputs, via the output unit 14, the information on the cell groups in the known result information that belong to each of the plurality of groups (for example, information indicating the progress of the disease) and information indicating into which group the likelihood ratio statistic λn generated from the processing data has been classified.
  • In this example as well, the storage unit 12 may hold information indicating each age group (information specifying the minimum and maximum ages belonging to the age group) in association with the population data generated from the measurement data for that age group.
  • In that case, the control unit 11 receives the age of the provider of the cell group to be processed from the operation unit 13 or the like, and the population data acquisition unit 22 acquires the population data for the age group to which the received age belongs and outputs it to the test unit 23.
  • The test unit 23 tests the similarity between the acquired age-group-specific population data and the processing data obtained from the cell group to be processed.
  • The cell group classification device 1 does not perform gating to extract specific cells. Instead, it uses the variation in the signal intensity distribution of cell surface markers across all collected cells, and quantifies the difference from the variation in the signal intensity distribution of cell surface markers obtained from cell groups diagnosed as normal.
  • The prognosis can then be predicted based on the correlation between this quantified result and the prognosis.
  • The two-dimensional distribution data generated here are obtained by projecting three-dimensional distribution data, in which the measured FSC, SSC, and cell surface marker fluorescence intensity (measurement data) are plotted on the axes of orthogonal three-dimensional coordinates, onto the plane containing the FSC and SSC axes, the plane containing the FSC axis and the cell surface marker measurement data axis, and the plane containing the SSC axis and the cell surface marker measurement data axis, respectively. Each is a density plot showing the number of data points at each point on the plane. Specifically, as shown in FIG. 5, the second and third two-dimensional distribution data and the like are obtained from the three-dimensional distribution data. The corresponding two-dimensional distribution data for the samples diagnosed as normal are then accumulated (the density values at the same point are added) to obtain the accumulated first to third two-dimensional distribution data.
  • The adaptive partitioning method is as follows. As illustrated in FIG. 6, the two-dimensional distribution data to be processed is virtually divided into 2 × 2 congruent regions, and a chi-square test is used to test whether the densities in the divided regions are equal to each other (first hypothesis) (S1). Similarly, the two-dimensional distribution data to be processed is virtually divided into 4 × 4 congruent regions, and a chi-square test is used to test whether the densities in the divided regions are equal to each other (second hypothesis) (S2).
  • Depending on the results of these tests, the two-dimensional distribution data to be processed is divided into 2 × 2 congruent regions to generate the respective divided two-dimensional distribution data (S3), and the processes of steps S1, S2, and S3 are recursively repeated for each of the generated two-dimensional distribution data (S4).
  • The processing results (density distribution functions) obtained for the first to third two-dimensional distribution data were used as the population data for CD34.
  • The above processing was performed in the same way for the cell surface marker CD41a to obtain the population data for CD41a. As shown in FIG. 7, the distribution in the resulting data is smoothed and the data noise is reduced.
  • FSC (forward scattered light), SSC (side scattered light), and the fluorescence intensity of the cell surface marker CD34 were measured with a flow cytometry device to obtain measurement data. From this, three density plots were generated: first two-dimensional distribution data with FSC on the X axis and SSC on the Y axis, second two-dimensional distribution data with FSC on the X axis and the cell surface marker measurement data on the Y axis, and third two-dimensional distribution data with SSC on the X axis and the cell surface marker measurement data on the Y axis.
  • The adaptive partitioning method was applied to each of the first to third two-dimensional distribution data to estimate the density distribution function.
  • The adaptive partitioning processing here is the same as the method used when generating the population data, so a repeated description is omitted.
  • The fluorescence intensity measurement data for the cell surface marker CD41a was processed in the same way, and the density distribution functions for the first to third two-dimensional distribution data for CD41a were obtained.
  • Likelihood ratio statistics were then calculated between each two-dimensional distribution data dp(x1, x2, ..., x6) based on the cell group obtained from the patient (x1 to x3 relating to CD34 and x4 to x6 relating to CD41a) and the population data de(x1, x2, ..., x6) (likewise, x1 to x3 relating to CD34 and x4 to x6 relating to CD41a), with each two-dimensional space divided into a plurality of regions (bins) R1, R2, ...
  • FIG. 8 shows an example in which the likelihood ratio statistics λn_2 and λn_3 are calculated from the density distribution functions based on the second and third two-dimensional distribution data for CD34 of the cell group obtained from the patient (the upper left and right panels) and the density distribution functions based on the second and third two-dimensional distribution data for CD34 of the population data (the lower left and right panels). In this example, λn_2 = 0.12 and λn_3 = 0.153.
  • FIG. 9 shows, for each of 59 patients, the distribution of the likelihood ratio statistics λn_2 and λn_3 obtained from the density distribution functions based on the second and third two-dimensional distribution data for CD34, and the likelihood ratio statistics λn_5 and λn_6 obtained from the density distribution functions based on the second and third two-dimensional distribution data for CD41a.
  • The horizontal axis represents the patient number (1 to 59), and the vertical axis represents the likelihood ratio statistic.
  • Some patients have a distribution with a relatively large median (the median of λn_2, λn_3, λn_5, and λn_6). Using 0.5 as the threshold, the patients were divided into two clusters: those whose median exceeds 0.5 and those whose median is 0.5 or less.
  • Hereinafter, the cluster with a median greater than 0.5 is referred to as group 1, and the cluster with a median of 0.5 or less is referred to as group 2.
  • FIG. 10 shows a survival curve (Kaplan-Meier survival curve) representing the overall survival rate for each of groups 1 and 2.
  • Here p = 0.0408, which was recognized as significant at the 5% level.
  • The median survival was 1.88 years for group 1 and 4.66 years for group 2.
  • The 5-year survival rate was 30% in group 1 and 43.7% in group 2.
  • Thus, even when classification uses the variation in the signal intensity of cell surface markers across all collected cells, the likelihood ratio statistic, which quantifies the difference from the variation in the signal intensity distribution of cell surface markers obtained from cell groups diagnosed as normal, is seen to correlate with the prognosis. That is, by performing the above processing for other patients, it becomes possible to determine to which of the groups with different prognoses a patient belongs according to whether the median of the likelihood ratio statistics exceeds 0.5.
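The median-based grouping in the example can be sketched as follows. This is only an illustration under stated assumptions: the patient identifiers and statistic values are made up, and the function name `assign_group` is hypothetical. Only the 0.5 threshold and the group 1 / group 2 naming come from the example above.

```python
from statistics import median

# Threshold taken from the worked example: median > 0.5 -> group 1, else group 2.
THRESHOLD = 0.5


def assign_group(likelihood_ratio_stats):
    """Return 1 if the median of the statistics exceeds the threshold, else 2."""
    return 1 if median(likelihood_ratio_stats) > THRESHOLD else 2


# Hypothetical per-patient statistics (standing in for lambda_n_2, lambda_n_3,
# lambda_n_5, lambda_n_6); these numbers are invented for illustration.
patients = {
    "patient_a": [0.10, 0.15, 0.20, 0.30],
    "patient_b": [0.70, 0.90, 0.40, 0.80],
}

groups = {name: assign_group(stats) for name, stats in patients.items()}
for name, group in groups.items():
    print(name, "-> group", group)
```

A fixed threshold keeps the rule easy to apply to a new patient, whereas the clustering described earlier needs the whole set of known cell groups each time.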

Landscapes

  • Engineering & Computer Science
  • Health & Medical Sciences
  • Life Sciences & Earth Sciences
  • Biomedical Technology
  • General Health & Medical Sciences
  • Molecular Biology
  • Physics & Mathematics
  • General Physics & Mathematics
  • Multimedia
  • Theoretical Computer Science
  • Apparatus Associated With Microorganisms And Enzymes
  • Investigating Or Analysing Biological Materials

Abstract

This method for classifying cell groups includes: retaining population data obtained on the basis of measured data of cell group samples which have been diagnosed as normal; creating processing data, the similarity of which to the population data can be assayed, on the basis of measured data obtained about cell groups which are to be processed; assaying the similarity of the processing data to the population data retained by a retaining means; and classifying the cell groups which are to be processed into multiple predefined groups according to prescribed criteria based on the results of assays.

Description

Cell group classification device and cell group classification method
The present invention relates to a cell group classification apparatus and a cell group classification method for classifying a cell group to be processed.
In recent years, analysis using flow cytometry (FCM) has been performed for myelodysplastic syndrome (MDS), and prognosis prediction models based on this analysis have been proposed (Non-patent Document 1).
However, the conventional analysis using flow cytometry targets only the cells that have become abnormal. In a realistic disease state such as MDS, on the other hand, it is necessary to consider not only the abnormal cells but also their relationship with the immune cells and normal cells around them. In other words, the conventional technique does not adequately analyze the relationship to a realistic disease state.
The present invention has been made in view of the above circumstances, and one of its objects is to provide a cell group classification device and a cell group classification method capable of performing analysis that takes a realistic pathological condition into account.
The present invention, which solves the problems of the conventional example described above, is a cell group classification device that includes: holding means for holding population data obtained from measurement data of cell group samples diagnosed as normal; means for generating, from measurement data obtained for the cell group to be processed, processing data that can be tested for similarity to the population data; test means for testing the similarity between the generated processing data and the population data held in the holding means; and classification means for classifying the cell group to be processed into one of a plurality of predetermined groups according to a predetermined criterion based on the result of the test.
Here, the holding means may hold, for each predetermined age group, age-group population data obtained based on measurement data of cell group samples that were provided by donors whose age belongs to that age group and that were diagnosed as normal, and the test means may test the similarity between the processing data and the age-group population data corresponding to the age group to which the age of the donor of the cell group to be processed belongs.
The measurement data may be a plurality of parameters obtained by flow cytometry, and the population data and the processing data may be m-dimensional (m is a natural number) distribution data obtained based on those parameters.
A cell group classification method according to one aspect of the present invention includes the steps of: acquiring population data based on measurement data of cell group samples diagnosed as normal; generating, based on measurement data obtained for a cell group to be processed, processing data whose similarity to the population data can be tested; testing the similarity between the generated processing data and the population data held in the holding means; and classifying the cell group to be processed into one of a plurality of predetermined groups according to a predetermined criterion based on the result of the test.
According to the present invention, analysis that takes the actual disease state into account can be performed.
A block diagram showing a configuration example of the cell group classification device according to an embodiment of the present invention.
A functional block diagram showing an example of the cell group classification device according to the embodiment of the present invention.
An explanatory diagram showing an example of the known result information that the cell group classification device according to the embodiment of the present invention uses in the classification process.
An explanatory diagram showing an example of the population data by age group used by the cell group classification device according to an example of the embodiment of the present invention.
An explanatory diagram showing an example of the two-dimensional distribution data generated by the cell group classification device according to the embodiment of the present invention.
An explanatory diagram showing an outline of the adaptive partitioning process performed by the cell group classification device according to the embodiment of the present invention.
An explanatory diagram showing an example of the density distribution function generated by the cell group classification device according to the embodiment of the present invention.
An explanatory diagram showing an example of the calculation of the likelihood ratio statistic by the cell group classification device according to the embodiment of the present invention.
An explanatory diagram showing an example of the test of the likelihood ratio statistic by the cell group classification device according to the embodiment of the present invention.
An explanatory diagram showing an example of survival curves based on results obtained by the cell group classification device according to the embodiment of the present invention.
An embodiment of the present invention will be described with reference to the drawings. As illustrated in FIG. 1, a cell group classification device 1 according to the embodiment includes a control unit 11, a storage unit 12, an operation unit 13, an output unit 14, and an analysis information input unit 15, and is connected to a flow cytometry instrument 2. In one example of the present embodiment, the cell group classification device 1 executes the cell group classification method according to the embodiment of the present invention.
The control unit 11 is a program-controlled device such as a CPU (Central Processing Unit) and operates according to a program stored in the storage unit 12. In the present embodiment, the control unit 11 performs processing using population data that is stored in the storage unit 12 (described later) and obtained based on measurement data of cell group samples diagnosed as normal. Based on measurement data obtained for the cell group to be processed, the control unit 11 generates processing data whose similarity to the population data can be tested, and tests the similarity between the generated processing data and the population data. The control unit 11 then classifies the cell group to be processed into one of a plurality of predetermined groups according to a predetermined criterion based on the result of this test, and outputs information representing the classification result. The processing of the control unit 11 is described in detail later.
The storage unit 12 includes a memory and a disk device. The storage unit 12 stores population data obtained based on measurement data of cell group samples diagnosed as normal (samples of cell groups from the same tissue as the cell group to be processed). In one example of the present embodiment, the measurement data is a plurality of parameters obtained by running cell groups collected from specimens diagnosed as normal through a flow cytometry instrument, and the population data is m-dimensional (m is a natural number) distribution data obtained based on those parameters.
Specifically, in one example of the present embodiment, the population data is obtained as follows. For each cell group obtained from a plurality of specimens diagnosed as normal, flow cytometry yields at least FSC (forward scattered light), SSC (side scattered light), and measurement data (information representing fluorescence intensity) for a plurality of predetermined cell surface markers. These data are accumulated, and the density distribution function of the population is estimated from the accumulated result. The density distribution function can be estimated by various methods, such as adaptive partitioning or kernel density estimation (Parzen E. (1962). On estimation of a probability density function and mode, Ann. Math. Stat. 33, pp. 1065-1076.). In other words, the population data is, for example, the result of such a density estimation. Because widely known estimation methods, including those available in various statistical software packages, can be used, a detailed description is omitted here.
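As an illustration of the kernel density estimation mentioned above, the following is a minimal sketch assuming a product Gaussian kernel with a single fixed bandwidth. The function name `gaussian_kde_2d` and the `bandwidth` parameter are hypothetical and do not appear in the source, which leaves the choice of estimator and its settings open.

```python
import math


def gaussian_kde_2d(points, bandwidth):
    """Return a function estimating the 2D density at (x, y).

    Assumption: an isotropic Gaussian kernel with one fixed bandwidth;
    the source only says that kernel density estimation may be used.
    """
    if not points:
        raise ValueError("at least one point is required")
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    n = len(points)
    # Normalization of a 2D Gaussian kernel, averaged over n points.
    norm = 1.0 / (n * 2.0 * math.pi * bandwidth * bandwidth)

    def density(x, y):
        total = 0.0
        for px, py in points:
            dx = (x - px) / bandwidth
            dy = (y - py) / bandwidth
            total += math.exp(-0.5 * (dx * dx + dy * dy))
        return norm * total

    return density
```

In practice a statistical package would normally supply this estimator; the sketch only makes the underlying computation explicit.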
In one example of the present embodiment, FSC (forward scattered light), SSC (side scattered light), and measurement data (information representing fluorescence intensity) of predetermined cell surface markers are obtained by flow cytometry for each cell group obtained from a plurality of specimens diagnosed as normal. Then, for each type of cell surface marker, three sets of two-dimensional distribution data are generated: first two-dimensional distribution data with FSC on the X axis and SSC on the Y axis, second two-dimensional distribution data with FSC on the X axis and the cell surface marker measurement on the Y axis, and third two-dimensional distribution data with SSC on the X axis and the cell surface marker measurement on the Y axis. A density distribution function is estimated for each of these two-dimensional distribution data, for example by kernel density estimation, and the results are stored in the storage unit 12 as population data. Thus, when r types of cell surface markers are measured, 3r sets of two-dimensional distribution data are generated and 3r sets of population data are held in the storage unit 12.
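The construction of the 3r two-dimensional distributions can be sketched as follows. This is an illustrative sketch only: the event representation (one dictionary per measured cell with keys "FSC", "SSC", and one key per marker) and the function name `two_dimensional_views` are assumptions, not part of the source.

```python
def two_dimensional_views(events, marker_names):
    """Build the first, second, and third 2D point sets for each marker.

    events: list of dicts, one per measured cell, holding "FSC", "SSC",
    and one fluorescence value per marker name (hypothetical layout).
    Returns a dict keyed by (marker_name, view_number), view_number 1..3.
    """
    views = {}
    for name in marker_names:
        # View 1: X = FSC, Y = SSC (repeated per marker, as in the text).
        views[(name, 1)] = [(e["FSC"], e["SSC"]) for e in events]
        # View 2: X = FSC, Y = marker fluorescence intensity.
        views[(name, 2)] = [(e["FSC"], e[name]) for e in events]
        # View 3: X = SSC, Y = marker fluorescence intensity.
        views[(name, 3)] = [(e["SSC"], e[name]) for e in events]
    return views
```

With r marker names the returned dictionary has 3r entries, matching the count of population data sets described above.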
The storage unit 12 also holds test result information (known result information) for each cell group in a plurality of cell groups whose disease course is known in advance. This known result information is described later.
The storage unit 12 further holds the program executed by the control unit 11. This program may be provided stored on a computer-readable recording medium such as a DVD-ROM and then stored in the storage unit 12. The storage unit 12 also operates as a work memory for the control unit 11.
The operation unit 13 includes a keyboard, a mouse, and the like. The operation unit 13 accepts an operation from the user and outputs information representing the content of the operation to the control unit 11. The output unit 14 is a display, a printer, or another output device, and outputs information according to instructions input from the control unit 11. The analysis information input unit 15 is an interface connected to the flow cytometry instrument 2 and outputs measurement data input from the flow cytometry instrument 2 to the control unit 11.
Next, the operation of the control unit 11 of the present embodiment will be described. The control unit 11 executes the program stored in the storage unit 12 and, as shown in FIG. 2, functionally includes a processing data generation unit 21, a population data acquisition unit 22, a test unit 23, a classification unit 24, and a result presentation unit 25.
The processing data generation unit 21 acquires the measurement data obtained for the cell group to be processed from the flow cytometry instrument 2 via the analysis information input unit 15. The processing data generation unit 21 then generates processing data whose similarity to the population data stored in the storage unit 12 can be tested. Specifically, the processing data can be generated as follows. Based on the measurement data (a plurality of parameters) obtained by the flow cytometry instrument 2, m-dimensional distribution data is generated. This distribution data is, for example, distribution data of at least (2 + r) dimensions that includes 2 + r items of information: at least FSC (forward scattered light), SSC (side scattered light), and measurement data for the same predetermined r types of cell surface markers as those from which the population data was derived.
In the same way as when the population data was obtained, the processing data generation unit 21 generates from this distribution data, for each of the r types of cell surface markers, first two-dimensional distribution data with FSC on the X axis and SSC on the Y axis, second two-dimensional distribution data with FSC on the X axis and the cell surface marker measurement on the Y axis, and third two-dimensional distribution data with SSC on the X axis and the cell surface marker measurement on the Y axis. It then estimates a density distribution function for each of these 3r sets of two-dimensional distribution data and outputs the results to the test unit 23 as processing data. As when the population data was obtained, the density distribution function can be estimated by various methods such as adaptive partitioning or kernel density estimation.
The population data acquisition unit 22 reads, for each of the first to third two-dimensional distribution data of each of the r types of cell surface markers, the corresponding population data stored in the storage unit 12 and outputs it to the test unit 23. The test unit 23 tests the similarity between each two-dimensional distribution data of the processing data input from the processing data generation unit 21 (hereinafter written as dp) and the corresponding two-dimensional distribution data of the population data input from the population data acquisition unit 22 (hereinafter written as de). Specifically, for a pair of distribution data dp(x1, x2, …, xm) and de(x1, x2, …, xm) in an m-dimensional space (m is a natural number; here m = 2), the test unit 23 partitions the m-dimensional space into a plurality of regions (bins) R1, R2, … and computes the sums Dpi and Dei (i = 1, 2, …, k) of the respective data within each region (bin) Ri (i = 1, 2, …, k).
As an example, each region Ri can be the region delimited by xs_j_min < xs_j < xs_j_max (s = 1, 2, …, m; j = 1, 2, …), and the regions are set so that they do not overlap one another.
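The per-bin sums over non-overlapping rectangular regions can be sketched as below for the two-dimensional case. The sketch assumes a regular grid given by sorted edge lists and half-open intervals [min, max), which is one way to guarantee that the regions do not overlap; the source states strict inequalities and does not fix the grid. The function name `bin_counts` is hypothetical.

```python
from bisect import bisect_right


def bin_counts(points, x_edges, y_edges):
    """Count 2D points per rectangular bin.

    x_edges and y_edges are sorted lists of bin boundaries (assumed grid).
    Bin (ix, iy) covers x_edges[ix] <= x < x_edges[ix + 1] and likewise
    for y; points outside the grid are ignored.
    Returns a flat list of k = nx * ny counts, indexed by ix * ny + iy.
    """
    nx, ny = len(x_edges) - 1, len(y_edges) - 1
    if nx < 1 or ny < 1:
        raise ValueError("each axis needs at least two edges")
    counts = [0] * (nx * ny)
    for x, y in points:
        ix = bisect_right(x_edges, x) - 1
        iy = bisect_right(y_edges, y) - 1
        if 0 <= ix < nx and 0 <= iy < ny:
            counts[ix * ny + iy] += 1
    return counts
```

Applying the same edges to the processing data and to the population data yields the paired sums Dpi and Dei over identical regions Ri.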
For the sums Dpi and Dei of the data in each region (bin) Ri (i = 1, 2, …, k), the test unit 23 uses the following equation (1) to compute the likelihood ratio statistics κn_1, κn_2, and κn_3 for each cell surface marker and for each two-dimensional distribution data.
(Equation (1): Figure JPOXMLDOC01-appb-M000001)
When the sample size is large, each of these likelihood ratio statistics κn_1, κn_2, and κn_3 asymptotically follows a chi-square distribution with k-1 degrees of freedom, and the more the distributions differ, the larger the absolute value of the statistic becomes. In this example, the test unit 23 obtains, for each of the r types of cell surface markers, the likelihood ratio statistics κn_1, κn_2, and κn_3 for each pair of corresponding first to third two-dimensional distribution data. The test unit 23 then accumulates the likelihood ratio statistics κn_1, κn_2, and κn_3 obtained for each of the r types of cell surface markers to obtain r likelihood ratio statistics κn, one per cell surface marker.
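Equation (1) is available only as an image in this text, so its exact form cannot be confirmed here. The following sketch assumes the standard multinomial likelihood ratio (G) statistic, 2 Σ Dp_i ln(Dp_i / E_i) with E_i = N_p · De_i / N_e, which is asymptotically chi-square with k-1 degrees of freedom when the reference proportions are treated as known. The function names are hypothetical, and the patent's actual equation may differ.

```python
import math


def likelihood_ratio_statistic(dp_sums, de_sums):
    """G statistic comparing processed-sample bin sums with reference sums.

    Assumption: G = 2 * sum(Dp_i * ln(Dp_i / E_i)), where E_i rescales the
    reference sums De_i to the total of the processed sample.
    """
    if len(dp_sums) != len(de_sums):
        raise ValueError("both inputs must cover the same k bins")
    n_p, n_e = sum(dp_sums), sum(de_sums)
    if n_p <= 0 or n_e <= 0:
        raise ValueError("both inputs must have a positive total")
    stat = 0.0
    for dp, de in zip(dp_sums, de_sums):
        if dp == 0:
            continue  # the term 0 * ln(0) is taken as 0
        if de == 0:
            raise ValueError("reference bin is empty where sample is not")
        expected = n_p * de / n_e
        stat += dp * math.log(dp / expected)
    return 2.0 * stat


def marker_statistic(view_pairs):
    """Accumulate the statistics of the three views of one marker."""
    return sum(likelihood_ratio_statistic(dp, de) for dp, de in view_pairs)
```

A statistic of zero means the binned proportions coincide; larger values indicate a larger departure from the normal population.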
The known result information stored in the storage unit 12 will now be described. As illustrated in FIG. 3, the known result information associates, for each of a plurality of cell groups whose disease course is known in advance, information (I) representing the disease course with the likelihood ratio statistics κn. In this example of the present embodiment, the likelihood ratio statistics κn are those obtained for each of the r types of cell surface markers. The classification unit 24 performs its processing using this known result information.
That is, the classification unit 24 classifies the cell group to be processed into one of a plurality of predetermined groups according to a predetermined criterion based on the calculation result of the test unit 23. Specifically, the classification unit 24 treats the likelihood ratio statistics κn for the r types of cell surface markers generated from the processing data, and the likelihood ratio statistics κn for the r types of cell surface markers of each of the cell groups included in the known result information, as r-dimensional vectors whose elements are the likelihood ratio statistics κn, and performs clustering on these vectors. Various widely known clustering methods can be used, such as the longest distance method (complete linkage), which uses the distance between r-dimensional vectors, or the median method.
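The clustering step can be sketched with the longest distance (complete linkage) method as follows. This is a minimal, unoptimized sketch; the Euclidean distance, the stopping rule (a fixed number of clusters), and the function name `complete_linkage` are assumptions, since the source does not specify them.

```python
import math


def complete_linkage(vectors, n_clusters):
    """Agglomerative clustering with the longest distance method.

    vectors: list of equal-length sequences (the r-dimensional vectors).
    Returns a list of clusters, each a list of indices into vectors.
    """
    if not 1 <= n_clusters <= len(vectors):
        raise ValueError("n_clusters must be between 1 and len(vectors)")
    clusters = [[i] for i in range(len(vectors))]

    def linkage(a, b):
        # Complete linkage: distance between the two farthest members.
        return max(math.dist(vectors[i], vectors[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                d = linkage(clusters[p], clusters[q])
                if best is None or d < best[0]:
                    best = (d, p, q)
        _, p, q = best
        clusters[p] = clusters[p] + clusters[q]  # merge the closest pair
        del clusters[q]
    return clusters
```

If the vector of the cell group to be processed is appended as the last element of `vectors`, the cluster containing its index identifies the group of known-outcome samples it was classified with.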
Through this clustering, for example, the classification unit 24 divides the cell group to be processed and the cell groups of the known result information into a plurality of groups. The result presentation unit 25 outputs, via the output unit 14, the information on the cell groups of the known result information belonging to each of the groups obtained by the classification unit 24 (for example, information representing the disease course), together with information indicating the group into which the likelihood ratio statistics κn generated from the processing data were classified (that is, the group into which the cell group to be processed was classified).
The result presentation unit 25 may further use the classification result to present survival curves (Kaplan-Meier survival curves) for the cell groups of the known result information belonging to each group.
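A Kaplan-Meier survival curve for one group can be computed as in the sketch below. The input layout (a follow-up time per sample plus a flag that is true when death was observed and false when the sample was censored) and the function name `kaplan_meier` are assumptions; the source does not describe how the disease-course information is encoded.

```python
def kaplan_meier(durations, observed):
    """Kaplan-Meier estimate of the survival function.

    durations: follow-up time of each sample.
    observed: True if the event (death) was observed, False if censored.
    Returns a list of (time, survival_probability) at each event time.
    """
    if len(durations) != len(observed):
        raise ValueError("durations and observed must have equal length")
    data = sorted(zip(durations, observed))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all samples that share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                deaths += 1
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve
```

Calling the function once per group obtained from the clustering gives one curve per group for comparison.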
The present embodiment has the above configuration and operates as follows. A cell group collected from a subject as the processing target is measured using the flow cytometry instrument 2. When the cell group classification device 1 acquires from the flow cytometry instrument 2 the measurement data obtained for the cell group to be processed, it generates m-dimensional distribution data as processing data based on this measurement data (a plurality of parameters).
The cell group classification device 1 reads, for each of the first to third two-dimensional distribution data of the r types of cell surface markers, the corresponding population data stored in the storage unit 12. The cell group classification device 1 then tests the similarity between each two-dimensional distribution data of the generated processing data dp and the corresponding two-dimensional distribution data of the population data de read from the storage unit 12. Specifically, this test is performed as follows. For a pair of distribution data dp(x1, x2, …, xm) and de(x1, x2, …, xm) in an m-dimensional space (here m = 2), the cell group classification device 1 partitions the m-dimensional space into a plurality of regions (bins) R1, R2, … and computes the sums Dpi and Dei (i = 1, 2, …, k) of the respective data within each region (bin) Ri (i = 1, 2, …, k). The cell group classification device 1 then computes the likelihood ratio statistic from the sums Dpi and Dei in each region (bin) Ri (i = 1, 2, …, k) using equation (1). In this example, for each of the r types of cell surface markers, the likelihood ratio statistics κn_1, κn_2, and κn_3 are computed for each pair of corresponding first to third two-dimensional distribution data.
Having obtained, for each of the r types of cell surface markers, the likelihood ratio statistics κn_1, κn_2, and κn_3 for the corresponding first to third two-dimensional distribution data, the cell group classification device 1 accumulates them to obtain r likelihood ratio statistics κn, one per cell surface marker.
Next, the cell group classification device 1 performs clustering by a widely known method, using the r-dimensional vector whose elements are the likelihood ratio statistics κn for the r types of cell surface markers generated from the processing data, and the r-dimensional vectors whose elements are the likelihood ratio statistics κn for the r types of cell surface markers of each of the cell groups included in the known result information stored in the storage unit 12.
The cell group classification device 1 then outputs, via the output unit 14, the information on the cell groups of the known result information belonging to each of the groups (for example, information representing the disease course), together with information indicating the group into which the likelihood ratio statistics κn generated from the processing data were classified (that is, the group into which the cell group to be processed was classified).
Furthermore, in the cell group classification device 1 of the present embodiment, the measurement data of the cell group samples diagnosed as normal may be divided by the age group of the donor of each cell group sample. In that case, as illustrated in FIG. 4, the storage unit 12 may hold information representing each age group (information that can be specified by the minimum and maximum ages belonging to the age group) in association with population data generated from the measurement data for that age group (age-group population data).
In this case, the control unit 11 accepts the age of the donor of the cell group to be processed from the operation unit 13 or the like. The population data acquisition unit 22 acquires the age-group population data corresponding to the age group to which the accepted age belongs and outputs it to the test unit 23. The test unit 23 then tests the similarity between the acquired age-group population data and the processing data obtained from the cell group to be processed.
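The age-group lookup can be sketched as follows. The table layout (a list of tuples holding the minimum age, the maximum age, and the associated population data) and the function name `select_population_data` are hypothetical; the source only states that each age group is identifiable by its minimum and maximum ages.

```python
def select_population_data(age, age_group_table):
    """Return the population data of the age group containing `age`.

    age_group_table: list of (min_age, max_age, population_data) tuples,
    with both bounds inclusive (assumed layout).
    """
    for min_age, max_age, population_data in age_group_table:
        if min_age <= age <= max_age:
            return population_data
    raise LookupError(f"no age group covers age {age}")
```

For example, with a table such as `[(0, 39, data_a), (40, 64, data_b), (65, 120, data_c)]` (illustrative boundaries only), a donor aged 40 would be tested against `data_b`.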
In the description so far, the cell group classification device 1 of the present embodiment used, as the distribution data based on the measurement data and as the population data, 3r sets of two-dimensional data obtained from m-dimensional distribution data (m = r + 2 when r types of cell surface markers are used). However, the present embodiment is not limited to this. For example, r sets of three-dimensional data (distribution data) may be obtained with FSC on the X axis, SSC on the Y axis, and each cell surface marker on the Z axis, and the test unit 23 may test the similarity between the distribution data based on the measurement data and the population data for the corresponding cell surface marker. Distribution data that includes data on a plurality of cell surface markers may also be used. Furthermore, in the present embodiment, the m-dimensional distribution data may be processed as it is. Specifically, in this example, population data corresponding to the m-dimensional distribution data for the r types of cell surface markers is generated in advance and held in the storage unit 12.
The cell group classification device 1 tests the similarity between the m-dimensional distribution data of the generated processing data dp and the m-dimensional distribution data of the population data de read from the storage unit 12. This test is the same as that already described: for a pair of distribution data dp(x1, x2, …, xm) and de(x1, x2, …, xm) in an m-dimensional space, the cell group classification device 1 partitions the m-dimensional space into a plurality of regions (bins) R1, R2, … and computes the sums Dpi and Dei (i = 1, 2, …, k) of the respective data within each region (bin) Ri (i = 1, 2, …, k). The cell group classification device 1 then computes the likelihood ratio statistic κn from the sums Dpi and Dei in each region (bin) Ri (i = 1, 2, …, k) using equation (1).
When such processing is performed, the known result information likewise associates a single likelihood ratio statistic κn, based on the m-dimensional distribution data, with each cell group. The cell group classification device 1 then performs clustering by a widely known method (for example, the longest distance method) using this known result information and the likelihood ratio statistic κn generated from the processing data.
The cell group classification device 1 then outputs, via the output unit 14, the information on the cell groups of the known result information belonging to each of the groups (for example, information representing the disease course), together with information indicating the group into which the likelihood ratio statistic κn generated from the processing data was classified (that is, the group into which the cell group to be processed was classified).
 この例による場合も、記憶部12は、図4に例示したるように、年齢層を表す情報(年齢層に属する最低年齢と最高年齢とで特定可能な情報)と、各年齢層別の測定データに基づいて生成した母集団データ(年齢層ごと母集団データ)とを関連付けて保持してもよい。 Also in this example, as illustrated in FIG. 4, the storage unit 12 includes information indicating an age group (information that can be specified by the minimum age and the maximum age belonging to the age group), and measurement for each age group. You may hold | maintain in association with the population data (population data for every age group) produced | generated based on data.
In that case, the control unit 11 accepts the age of the provider of the cell group to be processed from the operation unit 13 or the like, and the population data acquisition unit 22 acquires the per-age-group population data corresponding to the age group to which the accepted age belongs and outputs it to the test unit 24. The test unit 24 then tests the similarity between the acquired per-age-group population data and the processing data obtained from the cell group being processed.
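The age-group lookup described above might look like the following minimal sketch. The record type `AgeGroupPopulation`, the function `select_population`, and the placeholder population values are hypothetical names introduced for illustration.

```python
from typing import NamedTuple

class AgeGroupPopulation(NamedTuple):
    min_age: int              # lowest age belonging to the age group
    max_age: int              # highest age belonging to the age group
    population_data: object   # per-age-group population data (opaque here)

def select_population(groups, age):
    """Return the record whose age range contains the given age."""
    for group in groups:
        if group.min_age <= age <= group.max_age:
            return group
    raise LookupError(f"no age group covers age {age}")

# Hypothetical table; the strings stand in for real distribution data.
table = [
    AgeGroupPopulation(0, 39, "population-young"),
    AgeGroupPopulation(40, 64, "population-middle"),
    AgeGroupPopulation(65, 120, "population-older"),
]
```

The selected `population_data` is what would then be passed to the similarity test in place of a single pooled population.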
According to this embodiment, the variation in cell types is quantified and analyzed, covering not only abnormal cells but also the other kinds of cells around them that interact with the abnormal cells, so the analysis can reflect realistic pathological conditions. In addition, generating population data for each age group allows the analysis to account for changes in the variation of cell types across age groups.
That is, the cell group classification device 1 according to the embodiment of the present invention performs classification using the variation in the distribution of cell surface marker signal intensities over all collected cells, without gating that extracts specific cells, and quantifies the difference from the corresponding variation obtained from cell groups diagnosed as normal. Prognosis prediction is made possible on the basis of the correlation between this quantified result and prognosis.
Next, a working example of the present invention is described. The following example uses CD34 and CD41a as cell surface markers on data from patients with myelodysplastic syndrome (MDS).
First, 40 samples of cell groups that had been diagnosed as normal in advance, taken from the same tissue as the cell group to be processed, were prepared. For each sample, measurement data for FSC (forward scattered light), SSC (side scattered light), and the fluorescence intensity of the cell surface marker CD34 were obtained with a flow cytometer. From these, three data sets were generated: first two-dimensional distribution data with FSC on the X axis and SSC on the Y axis, second two-dimensional distribution data with FSC on the X axis and the cell surface marker measurement on the Y axis, and third two-dimensional distribution data with SSC on the X axis and the cell surface marker measurement on the Y axis.
The two-dimensional distribution data generated here are obtained from three-dimensional distribution data, in which the measured data are plotted with FSC, SSC, and the fluorescence intensity of the cell surface marker (the measurement data) assigned to the axes of an orthogonal three-dimensional coordinate system. The three-dimensional data are projected onto the plane containing the FSC and SSC axes, the plane containing the FSC axis and the marker axis, and the plane containing the SSC axis and the marker axis. Each projection is a density plot giving the number of data points at each point on the plane. Specifically, as shown in FIG. 5, the second and third two-dimensional distribution data and so on are obtained from the three-dimensional distribution data. The corresponding two-dimensional distribution data of the samples diagnosed as normal were then accumulated (the density values at the same point were added) to obtain accumulated first to third two-dimensional distribution data.
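The projection and accumulation just described can be sketched as below, assuming each measured event is a triple of integer channel values (FSC, SSC, marker). The function names `project_events` and `accumulate` and the toy events are hypothetical.

```python
from collections import Counter

def project_events(events):
    """Project (fsc, ssc, marker) events onto the three coordinate planes.

    Returns three density plots as Counters mapping a 2-D point to the
    number of events at that point:
    FSC-SSC (first), FSC-marker (second), SSC-marker (third).
    """
    fsc_ssc, fsc_marker, ssc_marker = Counter(), Counter(), Counter()
    for fsc, ssc, marker in events:
        fsc_ssc[(fsc, ssc)] += 1
        fsc_marker[(fsc, marker)] += 1
        ssc_marker[(ssc, marker)] += 1
    return fsc_ssc, fsc_marker, ssc_marker

def accumulate(per_sample_plots):
    """Add up the density values at the same point across samples."""
    total = Counter()
    for plot in per_sample_plots:
        total.update(plot)  # Counter.update adds counts rather than replacing
    return total

# Toy events for two hypothetical normal samples.
sample_a = [(1, 2, 3), (1, 2, 4)]
sample_b = [(1, 2, 3)]
first_plots = [project_events(s)[0] for s in (sample_a, sample_b)]
accumulated_first = accumulate(first_plots)
```

The same accumulation would be applied separately to the second and third projections.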
Next, the adaptive partitioning method was applied to each of the accumulated first to third two-dimensional distribution data to estimate a density distribution function. The adaptive partitioning method proceeds as follows. As illustrated in FIG. 6, the two-dimensional distribution data being processed are virtually divided into 2 × 2 mutually congruent regions, and whether the densities in the regions are equal to one another (first hypothesis) is tested with a chi-square test (S1). Likewise, the data are virtually divided into 4 × 4 mutually congruent regions, and whether the densities in the regions are equal to one another (second hypothesis) is tested with a chi-square test (S2).
If the chi-square test rejects the first or the second hypothesis, the two-dimensional distribution data being processed are divided into 2 × 2 mutually congruent regions to generate separate two-dimensional distribution data (S3), and steps S1, S2, and S3 are repeated recursively with each of the generated two-dimensional distribution data as the processing target (S4).
If, on the other hand, neither the first nor the second hypothesis for the two-dimensional distribution data being processed is rejected by the chi-square test in steps S1 and S2, those data are not divided (S5).
A region of the two-dimensional distribution data that is not to be divided is regarded as having uniform density, and the values of all points in the region are replaced with the mean of the density values in the region. The processing is repeated until every region has been determined not to be divided. The results (density distribution functions) obtained in this way for the first to third two-dimensional distribution data were used as the population data for CD34. The same processing was carried out for the cell surface marker CD41a to obtain the population data for CD41a. As outlined in FIG. 7, the distribution in the resulting data is smoothed and noise in the data is reduced.
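Steps S1 to S5 can be sketched as a recursive function. The sketch assumes a square grid of counts whose side is a power of two, a 5% significance level (the text does not state the level), and that a 2 × 2 block is tested with the first hypothesis only because it cannot be split into 4 × 4 regions. All function names are hypothetical.

```python
# 5% chi-square critical values for df = 3 (2x2 split) and df = 15 (4x4 split).
# ASSUMPTION: the significance level is not given in the text.
CHI2_CRIT = {3: 7.815, 15: 24.996}

def block_sum(grid, r0, c0, size):
    return sum(grid[r][c] for r in range(r0, r0 + size)
               for c in range(c0, c0 + size))

def uniform_rejected(grid, r0, c0, size, parts):
    """Chi-square test of equal density over parts x parts congruent sub-blocks."""
    step = size // parts
    observed = [block_sum(grid, r0 + i * step, c0 + j * step, step)
                for i in range(parts) for j in range(parts)]
    total = sum(observed)
    if total == 0:
        return False  # an empty block is trivially uniform
    expected = total / len(observed)
    chi2 = sum((o - expected) ** 2 / expected for o in observed)
    return chi2 > CHI2_CRIT[len(observed) - 1]

def adaptive_partition(grid, r0=0, c0=0, size=None, out=None):
    """Return a smoothed copy of grid in which every undivided region
    holds the mean density of that region."""
    if size is None:
        size = len(grid)
    if out is None:
        out = [[0.0] * len(grid) for _ in grid]
    split = size >= 2 and uniform_rejected(grid, r0, c0, size, 2)       # S1
    if not split and size >= 4:
        split = uniform_rejected(grid, r0, c0, size, 4)                  # S2
    if split:
        half = size // 2                                                 # S3
        for dr in (0, half):
            for dc in (0, half):
                adaptive_partition(grid, r0 + dr, c0 + dc, half, out)    # S4
    else:
        mean = block_sum(grid, r0, c0, size) / (size * size)             # S5
        for r in range(r0, r0 + size):
            for c in range(c0, c0 + size):
                out[r][c] = mean
    return out
```

A grid with only small fluctuations is flattened to its overall mean, while a grid with a dense quadrant is split so that the dense quadrant keeps its own level.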
Next, for a cell group obtained from a patient suffering from a tumor, measurement data for FSC (forward scattered light), SSC (side scattered light), and the fluorescence intensity of the cell surface marker CD34 were obtained with a flow cytometer. From these, three data sets were generated: first two-dimensional distribution data (density plot) with FSC on the X axis and SSC on the Y axis, second two-dimensional distribution data (density plot) with FSC on the X axis and the cell surface marker measurement on the Y axis, and third two-dimensional distribution data (density plot) with SSC on the X axis and the cell surface marker measurement on the Y axis.
Next, the adaptive partitioning method was applied to each of these first to third two-dimensional distribution data to estimate a density distribution function. The processing is the same as that used to generate the population data, so its description is not repeated.
The fluorescence intensity measurement data for the cell surface marker CD41a from this patient's cell group were processed in the same way, yielding density distribution functions for the first to third two-dimensional distribution data for CD41a.
Then, for the two-dimensional distribution data dp(x1, x2, …, x6) based on the cell group obtained from the patient (data x1 to x3 for CD34 and data x4 to x6 for CD41a) and the population data de(x1, x2, …, x6) (likewise, data x1 to x3 for CD34 and data x4 to x6 for CD41a), each two-dimensional space is divided into a plurality of regions (bins) R1, R2, …, and the sums Dpi and Dei (i = 1, 2, …, k) of the density data within each region (bin) Ri (i = 1, 2, …, k) are computed. Each region Ri was taken to be, for example, the region bounded by xj_min < xj < xj_max (j = 1, 2, …), and the regions were set so as not to overlap one another.
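Computing the per-bin sums for one two-dimensional density plot can be sketched as follows. The sketch assumes rectangular bins and uses half-open intervals (min inclusive, max exclusive) so that points on a shared boundary fall into exactly one bin, which is a small departure from the strict inequalities in the text. The function name `bin_sums` and the toy data are hypothetical.

```python
def bin_sums(density, bins):
    """Sum density values inside each rectangular, non-overlapping bin.

    density: dict mapping a point (x, y) to its density value.
    bins: list of (x_min, x_max, y_min, y_max) rectangles R_1..R_k.
    Returns the list of sums, one per bin.
    """
    sums = [0.0] * len(bins)
    for (x, y), value in density.items():
        for i, (x_min, x_max, y_min, y_max) in enumerate(bins):
            # ASSUMPTION: half-open intervals instead of strict inequalities.
            if x_min <= x < x_max and y_min <= y < y_max:
                sums[i] += value
                break  # bins do not overlap, so at most one match
    return sums

# Toy density plot on a 4 x 4 grid split into four 2 x 2 bins.
toy_density = {(0, 0): 1.0, (1, 0): 2.0, (2, 3): 4.0, (3, 3): 0.5}
toy_bins = [(0, 2, 0, 2), (2, 4, 0, 2), (0, 2, 2, 4), (2, 4, 2, 4)]
dp_sums = bin_sums(toy_density, toy_bins)
```

Running the same function on the patient data and on the population data with the same bins yields the paired sums Dpi and Dei.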
Next, using the sums Dpi and Dei of the density data in each region (bin) Ri (i = 1, 2, …, k) for each of the first to third two-dimensional distribution data, the likelihood ratio statistics κn_1, κn_2, …, κn_6 for the first to third two-dimensional distribution data of CD34 and of CD41a were computed according to equation (1).
FIG. 8 shows an example in which the likelihood ratio statistics κn_2 and κn_3 were computed from the density distribution functions based on the second and third two-dimensional distribution data for CD34 of the cell group obtained from a patient (the left and right panels in the upper row) and the density distribution functions based on the second and third two-dimensional distribution data for CD34 of the population data (the left and right panels in the lower row). Here κn_2 = 0.12 and κn_3 = 0.153.
FIG. 9 shows an example in which the above processing was performed for each of 59 patients. For each patient, it shows the distribution of the likelihood ratio statistics κn_2 and κn_3, obtained from the density distribution functions based on the second and third two-dimensional distribution data for CD34, and κn_5 and κn_6, obtained from the density distribution functions based on the second and third two-dimensional distribution data for CD41a. In the graph of FIG. 9, the horizontal axis is the patient number (1 to 59) and the vertical axis is the likelihood ratio statistic.
Some patients had a relatively large median of this distribution (the median of κn_2, κn_3, κn_5, and κn_6). Using 0.5 as the threshold, the patients were classified into two clusters: those whose median exceeds 0.5 and those whose median is 0.5 or less. Hereinafter, the cluster with a median above 0.5 is called group 1 and the cluster with a median of 0.5 or less is called group 2.
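The grouping rule above reduces to a median and a threshold comparison, sketched here. The function name `classify_by_median` is hypothetical, and the example values other than κn_2 = 0.12 and κn_3 = 0.153 are made up for illustration.

```python
from statistics import median

def classify_by_median(kappas, threshold=0.5):
    """Return 1 (group 1) if the median of the likelihood ratio statistics
    exceeds the threshold, otherwise 2 (group 2)."""
    return 1 if median(kappas) > threshold else 2

# kappa_n_2, kappa_n_3 from the FIG. 8 example; the last two values are hypothetical.
group = classify_by_median([0.12, 0.153, 0.2, 0.3])
```

A median of exactly 0.5 falls into group 2, matching the "0.5 or less" wording.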
FIG. 10 shows survival curves (Kaplan-Meier survival curves) representing overall survival for groups 1 and 2. Testing the difference between the survival curves in FIG. 10 (horizontal axis: years; vertical axis: overall survival rate) with the generalized Wilcoxon test gave p = 0.0408, which is significant at the 5% level. Median survival was 1.88 years in group 1 and 4.66 years in group 2. The 5-year survival rate was 30% in group 1 and 43.7% in group 2.
This shows that the likelihood ratio statistic, which quantifies the difference between the variation in the cell surface marker signal intensity distribution over all collected cells and the corresponding variation obtained from cell groups diagnosed as normal, correlates with prognosis. That is, by applying the above processing to other patients, it becomes possible to determine which of the groups with different prognoses a patient belongs to according to whether the median of the likelihood ratio statistics exceeds 0.5.
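For readers unfamiliar with the survival curves referred to in FIG. 10, the following is a minimal sketch of the standard Kaplan-Meier product-limit estimator. It is a generic textbook computation, not code from the specification, and it does not implement the generalized Wilcoxon test. The function name `kaplan_meier` and the toy follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate of the survival function.

    times: follow-up time of each patient (for example, in years).
    events: True if death was observed at that time, False if censored.
    Returns [(t, S(t))] at each distinct time with at least one death.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed  # deaths and censored cases both leave the risk set
    return curve

# Toy data only; not patient data from the specification.
toy_curve = kaplan_meier([1, 2, 2, 3, 4], [True, True, False, True, False])
```

Computing one such curve per group gives the two curves whose difference is then tested.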

Claims (4)

  1.  A cell group classification device comprising:
     holding means for holding population data based on measurement data of cell group samples diagnosed as normal;
     means for generating, based on measurement data obtained for a cell group to be processed, processing data whose similarity to the population data can be tested;
     testing means for testing the similarity between the generated processing data and the population data held in the holding means; and
     classification means for classifying the cell group to be processed into one of a plurality of predetermined groups according to a predetermined criterion based on the result of the test.
  2.  The cell group classification device according to claim 1, wherein
     the holding means holds, for each predetermined age group, per-age-group population data obtained from measurement data of cell group samples that were provided by providers whose ages belong to that age group and that were diagnosed as normal, and
     the testing means tests the similarity between the processing data and the per-age-group population data corresponding to the age group to which the age of the provider of the cell group to be processed belongs.
  3.  The cell group classification device according to claim 1 or 2, wherein
     the measurement data are a plurality of parameters obtained by flow cytometry, and the population data and the processing data are m-dimensional distribution data (m being a natural number) obtained based on those parameters.
  4.  A cell group classification method comprising:
     a step of acquiring population data based on measurement data of cell group samples diagnosed as normal;
     a step of generating, based on measurement data obtained for a cell group to be processed, processing data whose similarity to the population data can be tested;
     a step of testing the similarity between the generated processing data and the population data held in the holding means; and
     a step of classifying the cell group to be processed into one of a plurality of predetermined groups according to a predetermined criterion based on the result of the test.
PCT/JP2014/050720 2013-01-17 2014-01-16 Apparatus for classifying cell groups and method for classifying cell groups WO2014112567A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2013006783 2013-01-17
JP2013-006783 2013-01-17

Publications (1)

Publication Number Publication Date
WO2014112567A1 true WO2014112567A1 (en) 2014-07-24

Family

ID=51209656

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2014/050720 WO2014112567A1 (en) 2013-01-17 2014-01-16 Apparatus for classifying cell groups and method for classifying cell groups

Country Status (1)

Country Link
WO (1) WO2014112567A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS63191044A (en) * 1987-02-03 1988-08-08 Omron Tateisi Electronics Co Cell analyzer
WO2005050479A1 (en) * 2003-11-21 2005-06-02 National University Corporation Kochi University Similar pattern searching apparatus, method of similar pattern searching, program for similar pattern searching, and fractionation apparatus

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14740665

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14740665

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP