CN117377982A - Tumor immunophenotyping based on spatial distribution analysis - Google Patents


Publication number
CN117377982A
CN117377982A
Authority
CN
China
Prior art keywords: digital pathology, pathology image, tumor, biological, biological object
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202280037416.3A
Other languages
Chinese (zh)
Inventor
J. R. Eastham
H. Koeppen
Xiao Li
D. Y. Orlova
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Genentech Inc
Original Assignee
Genentech Inc
Priority date
Filing date
Publication date
Application filed by Genentech Inc filed Critical Genentech Inc
Priority claimed from PCT/US2022/031220 external-priority patent/WO2022251556A1/en
Publication of CN117377982A publication Critical patent/CN117377982A/en


Landscapes

  • Investigating Or Analysing Biological Materials (AREA)

Abstract

The present disclosure provides systems and methods related to processing digital pathology images. More specifically, the technique includes accessing a digital pathology image depicting a slice of a biological sample, wherein the digital pathology image includes areas that display reactivity to a plurality of stains. For each of a plurality of tiles of the digital pathology image, a local density measurement is calculated for each of a plurality of biological object types. One or more spatial distribution metrics may be generated for the biological object type based at least in part on the calculated local density measurements. A tumor immunophenotype may then be generated for the digital pathology image based at least in part on the local density measurements or the one or more spatial distribution metrics.

Description

Tumor immunophenotyping based on spatial distribution analysis
Cross Reference to Related Applications
The present application claims the benefit of and priority to U.S. provisional application No. 63/194,009, entitled "Automated tumor immunophenotyping," filed May 27, 2021; U.S. provisional application No. 63/279,946, entitled "Tumor immunophenotyping based on spatial distribution analysis," filed November 16, 2021; and U.S. provisional application No. 63/308,491, entitled "Tumor immunophenotyping based on spatial distribution analysis," filed February 9, 2022, the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates generally to image processing of digital pathology images to generate an output of spatial information characterizing a particular type of object in the image.
Background
Image analysis includes processing individual images to generate image-level results. For example, the result may be a binary result indicating whether the image includes a particular type of object, or a classification of the image into one of a set of object types. As another example, the results may include an image-level count of the number of objects of a particular type detected within the image, or a density of the distribution of objects of a particular type. In the context of digital pathology, the results may include a display of counts or specific indications of cells of a particular type detected within an image of the sample, a ratio of the count of cells of one type to the count of cells of another type throughout the image, and/or a density of cells of a particular type.
This image-level approach is convenient because it facilitates metadata storage and because the manner in which the results are generated is easy to understand. However, such image-level methods may discard details from the image, which may prevent detection of details of the depicted scene and/or environment. This simplification can be particularly consequential in the context of digital pathology, as the current or future potential activity of a particular type of cell may depend largely on its microenvironment.
Therefore, it would be highly advantageous to develop techniques that process digital pathology images to generate an output reflecting the density and spatial distribution of depicted biological objects (such as different types of cells).
Disclosure of Invention
With the success of immuno-oncology treatments, analysis of immune infiltrates in human tumors has shifted from an interest in their prognostic effect to the identification of predictive factors. The predictive and prognostic capacity of the density and spatial distribution of tumor-infiltrating lymphocytes (TILs) has been empirically demonstrated. However, this biomarker has yet to see widespread adoption in clinical decision-making. In most cases, the assessment of the pattern and density of immune infiltration is based on visual inspection of stained tissue sections by a pathologist. This form of manual analysis is labor-intensive, subjective, error-prone, and associated with poor inter- and intra-observer consistency. Semi-automated or fully automated methods have been proposed as potential solutions. However, these automated solutions may perpetuate the lack of standardization observed in manual methods, as they are intended to mimic the way pathologists assess immune infiltration. The present disclosure describes an automated method that, by using the set of derived spatial features described herein, reduces the effects of this lack of standardization and facilitates the widespread use of the spatial distribution of TILs as a predictive or prognostic biomarker.
In some embodiments, a computer-implemented method is provided that includes: a digital pathology image processing system accessing a digital pathology image depicting a section of a biological sample taken from a subject having a given medical condition. The digital pathology image includes areas that show reactivity to two or more stains. The digital pathology image processing system identifies one or more tumor-associated regions in the digital pathology image. The digital pathology image processing system subdivides the digital pathology image into a plurality of tiles. For each of the plurality of tiles, the digital pathology image processing system calculates a local density measurement for each of a plurality of biological object types. The digital pathology image processing system generates one or more spatial distribution metrics for a biological object type in the digital pathology image based at least in part on the local density measurements calculated for each of the plurality of tiles. The digital pathology image processing system classifies the biological sample depicted in the digital pathology image as having a particular immunophenotype based at least in part on the local density measurements and the one or more spatial distribution metrics. Classifying the biological sample depicted in the digital pathology image as having the particular immunophenotype includes: projecting a representation of the digital pathology image into a feature space having axes based on the one or more spatial distribution metrics; and classifying the biological sample depicted in the digital pathology image based on the location of the digital pathology image within the feature space.
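The tiling and local-density steps above can be sketched in code. This is a minimal illustration, not the disclosed implementation: the function name `tile_local_densities`, the use of a per-pixel object-type label map as input, and the choice of pixel fraction as the density measure are all assumptions made for the example.

```python
import numpy as np

def tile_local_densities(label_map, tile_size, object_labels):
    """Subdivide a per-pixel object-type label map into square tiles and
    compute, for each tile, the fraction of its pixels assigned to each
    biological object type (a simple local density measurement)."""
    h, w = label_map.shape
    densities = {}  # (tile_row, tile_col) -> {object label: local density}
    for r in range(0, h - tile_size + 1, tile_size):
        for c in range(0, w - tile_size + 1, tile_size):
            tile = label_map[r:r + tile_size, c:c + tile_size]
            densities[(r // tile_size, c // tile_size)] = {
                lbl: float(np.mean(tile == lbl)) for lbl in object_labels
            }
    return densities
```

For a 4x4 label map split into 2x2 tiles, a tile filled entirely with label 1 yields a local density of 1.0 for that label, and the per-tile dictionaries can then feed the spatial distribution metrics described below.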
The classification of the biological sample depicted in the digital pathology image may be further based on the proximity of the location of the digital pathology image within the feature space to the locations of one or more other digital pathology image representations having specified immunophenotype classifications. For each of the plurality of tiles, the local density measurement for each of the plurality of biological object types may include a representation of an absolute or relative amount of that biological object type identified as being located within the tile, and a representation of an absolute or relative amount of a second biological object type identified as being located within the tile.
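The proximity-based classification in feature space can be illustrated with a simple nearest-neighbour rule. The function name `nearest_phenotype` and the two-metric feature vectors are hypothetical; a real system might use a richer classifier over the same feature space.

```python
import math

def nearest_phenotype(query_features, labeled_images):
    """Classify an image by the immunophenotype of its nearest neighbour in
    the spatial-metric feature space. `labeled_images` is a list of
    (feature_vector, immunophenotype) pairs for already-classified images."""
    _, label = min(
        labeled_images,
        key=lambda pair: math.dist(query_features, pair[0]),  # Euclidean
    )
    return label
```

A query point near a cluster of "desert"-labeled representations is assigned "desert", mirroring the proximity criterion described above.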
Each of the plurality of biological object types may be responsive to one of the two or more stains. Calculating the local density measurements for each of the plurality of biological object types for each tile may include: for each tile, partitioning the tile into a plurality of regions according to reactivity to the two or more stains; classifying the regions of the tile by the reactivity of the biological sample to each of the two or more stains; and, for the tile, calculating a local density measurement for each of the plurality of biological object types based on the number of regions of the tile classified with each of the two or more stains. A region of a tile may be a pixel of the digital pathology image located within the tile. The regions of a tile may be classified based on a dominant color of the region, the color being based on the reaction of each of the plurality of biological object types to one of the two or more stains. The plurality of biological object types may include cytokeratin and cytotoxic structures.
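Classifying tile regions (here, individual pixels) by dominant colour might look like the following sketch. The reference colours in `STAIN_REFS` are invented for illustration; production pipelines typically use colour deconvolution with calibrated stain vectors rather than nearest-colour matching.

```python
import numpy as np

# Illustrative reference colours only; real pipelines use colour
# deconvolution with calibrated stain vectors, not nearest-colour matching.
STAIN_REFS = {
    "CK+": np.array([160.0, 82.0, 45.0]),      # brownish chromogen (assumed)
    "CD8+": np.array([128.0, 0.0, 128.0]),     # purple chromogen (assumed)
    "background": np.array([240.0, 240.0, 240.0]),
}

def classify_pixels(rgb_tile):
    """Assign each pixel of an RGB tile the stain class whose reference
    colour is nearest in RGB space."""
    names = list(STAIN_REFS)
    refs = np.stack([STAIN_REFS[n] for n in names])     # (k, 3)
    pixels = rgb_tile.reshape(-1, 3).astype(float)      # (n, 3)
    # squared Euclidean distance from every pixel to every reference colour
    dists = ((pixels[:, None, :] - refs[None, :, :]) ** 2).sum(axis=2)
    nearest = dists.argmin(axis=1)                      # (n,)
    return np.array(names, dtype=object)[nearest].reshape(rgb_tile.shape[:2])
```

Counting the pixels assigned to each class within a tile then yields the per-stain local density measurements described above.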
One or more tumor-associated regions in the digital pathology image may be identified by a machine learning model trained to identify tumor-associated regions within digital pathology images. Identifying the one or more tumor-associated regions in the digital pathology image may include: providing a user interface for display, the user interface comprising the digital pathology image and one or more interactive elements; and receiving a selection of the one or more tumor-associated regions through interaction with the one or more interactive elements. The one or more spatial distribution metrics may characterize a degree to which at least a portion of a first set of biological object depictions is interspersed with at least a portion of a second set of biological object depictions. The one or more spatial distribution metrics may include a Jaccard index, a Sørensen index, a Bhattacharyya coefficient, a Moran index, a Geary contiguity ratio, a Morisita-Horn index, a colocation quotient, or a metric defined based on hot/cold-spot analysis. The digital pathology image processing system generates a result corresponding to at least: an assessment of the medical condition of the subject, including a prognosis of the outcome of the medical condition. The digital pathology image processing system may generate a display comprising an indication of the assessment and prognosis of the medical condition of the subject. Generating the result may include processing the immunophenotype and the one or more spatial distribution metrics using a trained machine learning model that has been trained using a set of training elements, each training element corresponding to another subject having a similar medical condition for which the outcome of the medical condition is known.
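Of the listed metrics, the Morisita-Horn index is straightforward to compute from per-tile counts. The sketch below uses the standard formula, not necessarily the disclosure's exact variant:

```python
def morisita_horn(x, y):
    """Morisita-Horn overlap between two per-tile count vectors x and y,
    where x[i] and y[i] are the counts of two biological object types in
    tile i. Returns 1.0 for identical spatial distributions and 0.0 when
    the two types never share a tile."""
    X, Y = sum(x), sum(y)
    if X == 0 or Y == 0:
        return 0.0
    dx = sum(v * v for v in x) / (X * X)
    dy = sum(v * v for v in y) / (Y * Y)
    return 2.0 * sum(a * b for a, b in zip(x, y)) / ((dx + dy) * X * Y)
```

Applied to CD8+ and CK+ per-tile counts, a value near 1.0 suggests the two cell types co-occur across tiles (consistent with an inflamed pattern), while a value near 0.0 suggests spatial separation.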
The digital pathology image processing system generates a result corresponding to: a prediction of the extent to which a given treatment that modulates an immune response will be effective to treat a given medical condition in a subject. The digital pathology image processing system determines that the subject is eligible for a clinical trial based on the results. The digital pathology image processing system generates a display comprising instructions for: the subject is eligible for a clinical trial.
In some embodiments, a system is provided, comprising: one or more data processors; and a non-transitory computer-readable storage medium containing instructions that, when executed on the one or more data processors, cause the one or more data processors to perform a portion or all of one or more methods disclosed herein.
In some embodiments, a computer program product is provided that is tangibly embodied in a non-transitory machine-readable storage medium and includes instructions configured to cause one or more data processors to perform some or all of one or more methods disclosed herein.
Some embodiments of the present disclosure include a system comprising one or more data processors. In some embodiments, the system includes a non-transitory computer-readable storage medium containing instructions that, when executed on one or more data processors, cause the one or more data processors to perform a portion or all of one or more methods disclosed herein and/or a portion or all of one or more processes disclosed herein. Some embodiments of the present disclosure include a computer program product tangibly embodied in a non-transitory machine-readable storage medium, comprising instructions configured to cause one or more data processors to perform a portion or all of one or more methods disclosed herein and/or a portion or all of one or more processes disclosed herein.
The terms and expressions which have been employed are used as terms of description and not of limitation, and there is no intention in the use of such terms and expressions of excluding any equivalents of the features shown and described or portions thereof, but it is recognized that various modifications are possible within the scope of the invention claimed. Accordingly, it should be understood that although the claimed invention has been specifically disclosed by embodiments and optional features, modification and variation of the concepts herein disclosed may be resorted to by those skilled in the art, and that such modifications and variations are considered to be within the scope of this invention as defined by the appended claims.
Drawings
FIG. 1 illustrates an interactive system for generating and processing digital pathology images.
FIG. 2 shows an illustrative system for processing object description data to generate a spatial distribution metric.
Fig. 3 illustrates a process for providing a health-related assessment based on image processing of digital pathology images.
Fig. 4A-4F show exemplary digital pathology images showing various immunophenotypes.
Fig. 5A to 5B show examples of pixel-based segmentation of digital pathology images.
FIG. 6 illustrates an example of tile-based local density measurement calculation.
Fig. 7 shows an exemplary annotated digital pathology image.
Fig. 8A-8B illustrate exemplary heatmaps of densities for particular biological object types.
FIG. 9 shows a plot of biological object density bins by immunophenotype.
Fig. 10A to 10B illustrate a process for processing an image using a lattice-based spatial region analysis framework.
Fig. 11A to 11D illustrate a process for classifying a sample using an immunophenotype.
Fig. 12A shows a process for assigning predicted outcome tags to each subject in a study cohort using a nested monte carlo cross-validation modeling strategy.
Fig. 12B and 12C show exemplary overall survival plots of classified immunophenotypes for different treatments.
Fig. 12D and 12E show exemplary progression-free survival plots of classified immunophenotypes for different treatments.
FIG. 13 illustrates an exemplary computer system.
In the drawings, similar components and/or features may have the same reference numeral. Furthermore, various components of the same type may be distinguished by following the reference label with a dash and a second label that distinguishes among the similar components. If only the first reference label is used in the specification, the description is applicable to any one of the similar components having the same first reference label, irrespective of the second reference label.
Detailed Description
Digital images are increasingly used in medical environments to facilitate clinical evaluations, such as diagnosis, prognosis, treatment selection, and treatment evaluation, among various other uses. In the field of digital pathology, processing of digital pathology images may be performed to estimate whether a given image includes a depiction of a particular type or class of biological object. For example, a section of a tissue sample may be stained such that a specific type of biological object (e.g., a specific type of cell, a specific type of organelle, or a blood vessel) is depicted with a higher intensity of a specific color by preferentially absorbing the stain. By way of example and not limitation, two different immunohistochemical (IHC) stains may be applied to help identify different types of cells: a pan-cytokeratin ("PanCK") stain can highlight cytokeratin-positive (CK+) regions (e.g., regions depicting tumor cells that are responsive to the PanCK stain), and a CD8 stain can highlight CD8+ regions containing T cells (also known as T lymphocytes) that express the CD8 co-receptor. Tissue samples may be imaged according to the techniques disclosed herein. The digital pathology image may then be processed to detect biological object depictions. The detection of a depiction of a biological object may be based on the depiction meeting certain criteria corresponding to the staining profile, such as high-intensity pixels having at least a defined amount of continuity, a size within a defined range, a shape of a defined type, etc. In particular, one or more slices (also referred to as "tiles") of an image may be classified based on an analysis of the digital pathology image. A clinical assessment, a classification of the underlying sample corresponding to the digital pathology image, or advice may be made based on the image analysis.
Tumors, particularly solid tumors, can be classified according to the density and spatial distribution of certain immune cell components. In certain embodiments, classification of tumors and/or tumor samples may be based on the presence and specific location of CD8+ T cells within the tumor bed, as CD8+ T cells are known to kill cancer cells and other infected or damaged cells. In particular, the assessment may be based on the interaction between CD8+ T cells and CK+ tumor cells, as cytokeratin is a known marker of epithelial cancer cells. As explained in detail herein, samples and tumors can be classified according to the extent to which CD8+ T cells infiltrate into CK+ tumor cells and the density of CD8+ T cells among CK+ tumor cells.
With advances in imaging technology, digital imaging of tumor tissue slides is becoming a routine clinical procedure for managing multiple types of disorders. Digital pathology images can capture depictions of biological objects at high resolution. It may be advantageous to characterize the degree and/or density of spatial heterogeneity of the biological objects captured in digital pathology images, as well as the degree of spatial aggregation and/or distribution of objects of a given type relative to each other and/or relative to objects of different types, such as in the context of assessing CD8+ T cell infiltration into tumor beds. The locations and relationships of biological object depictions in a digital pathology image can be correlated with the locations and relationships of the corresponding biological objects in the tissue sample of the subject. Objectively characterizing the density and relationships of depictions of particular types of biological objects can significantly impact the quality of current diagnosis, prognosis, treatment assessment, treatment selection, and/or treatment qualification. As disclosed herein, such objective spatial characterization may be performed by detecting a set of biological object depictions from a digital pathology image and generating specified metrics based on the biological object depictions. The objects may be represented according to one or more spatial analysis frameworks including, but not limited to, a spatial region analysis framework. In some cases, metadata may be stored for each of a set of regions within an image and for each of one or more particular types of objects, the metadata indicating a number or density of depictions of each particular type of biological object predicted or determined to be located within the region.
Spatial aggregation may include measurements of how objects within a digital pathology image are spatially aggregated or distributed across the entire digital pathology image or across a region of the digital pathology image. For example, it may be advantageous to determine the extent to which one type or class of biological object (e.g., lymphocytes, CD8+ T cells) is spatially mixed with another type or class of biological object (e.g., tumor cells, CK+ cells). To illustrate, intratumoral tumor-infiltrating lymphocytes ("TILs") are located within and interact directly with tumor cells, while stromal TILs are located in the tumor stroma and do not interact directly with tumor cells. Not only do intratumoral TILs have different activity patterns than stromal TILs, but each cell type can be associated with a different type of microenvironment, further contributing to the behavioral differences between these types of TILs. If lymphocytes are detected at a particular location (e.g., within a tumor), the fact that the lymphocytes were able to infiltrate the tumor may convey information about the activity of the lymphocytes and/or the tumor cells. In addition, the microenvironment may affect the current and future activity of the lymphocytes. Identifying the relative locations of particular types of biological objects can therefore provide particularly rich information for predictive applications such as identifying prognoses and treatment regimens, assessing a patient's eligibility for a clinical trial, and representing the immunological characteristics of the subject and the subject's condition.
As another form of objective characterization of the locations and relationships of detected biological object depictions, the detected biological object depictions may be used to generate one or more spatial distribution metrics that characterize, at a region, image, and/or subject level, the extent to which biological objects of a given type or class are predicted to be interspersed with objects of other types and/or aggregated with other objects of the same type. For example, the digital pathology image processing system may detect a first set of biological object depictions and a second set of biological object depictions in the digital pathology image. The system can predict that each of the first set of biological object depictions depicts a first type of biological object (e.g., a lymphocyte), and that each of the second set of biological object depictions depicts a second type of biological object (e.g., a tumor cell). The digital pathology image processing system may perform an aggregation-based assessment to generate a spatial distribution metric indicative of a degree to which each biological object depiction in the first set is spatially combined with or separated from each biological object depiction in the second set, and/or a degree to which the first set of biological object depictions as a whole is spatially combined (e.g., colocated) with the second set of biological object depictions. As disclosed herein, a variety of spatial distribution metrics have been developed and applied for this purpose.
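As one concrete interspersion measure, a Jaccard index can be computed over the tiles in which each object type is present. The `threshold` parameter and the dict-of-tile-densities input shape are assumptions of this sketch, not the disclosed design.

```python
def jaccard_cooccurrence(densities_a, densities_b, threshold=0.0):
    """Jaccard index over tiles: the number of tiles in which both object
    types exceed `threshold`, divided by the number of tiles in which at
    least one does. `densities_*` map a tile index to a local density."""
    a = {t for t, d in densities_a.items() if d > threshold}
    b = {t for t, d in densities_b.items() if d > threshold}
    union = a | b
    return len(a & b) / len(union) if union else 0.0
```

A value near 1.0 indicates that the two depiction sets occupy largely the same tiles, while a value near 0.0 indicates spatial separation.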
Continuing with the example of evaluating CD8+ T cells within a tumor bed, the analysis may begin with exposing a particular sample to one or more stains known to react with CD8+ T cells and CK+ tumor cells, including but not limited to a dual-chromogen immunohistochemical assay to detect and delineate epithelial cells. Once exposed, a digital pathology image of the sample may be captured, such as according to the techniques described herein. Digital pathology images of samples can be classified based on the density of CD8+ T cells within the image, particularly in the areas of the image that appear to include CK+ tumor cells. Classification may be robustly automated and repeatable, or performed in a subjective manual manner. The classification may include subjective measures, for example, from 0 (corresponding to very few valid cells) to 3 or more (corresponding to dense immune infiltrates). In certain embodiments, the classification may be based on the pattern of infiltrating cells, such as cells located away from CK+ tumor cells or partial or complete overlap between CD8+ T cells and CK+ tumor cells. The resulting classifications may include desert (e.g., sparse CD8+ infiltration, independent of spatial distribution), excluded (e.g., CD8+ T cells and CK+ tumor cells rarely overlap, with the distribution of CD8+ T cells limited to the CK- stromal compartment, as depicted by the spatial separation of tumor cells and immune cells), or inflamed (e.g., CD8+ T cells colocalize with CK+ tumor cells, with substantial overlap). As described above, while the evaluation process may be performed manually, manual evaluation is fraught with sources of error. First, the assessment is performed subjectively and is subject to the subjective performance of human evaluators even when attempts are made to add rigor to the assessment (such as a cut-off metric for each class type).
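A toy version of the three-way desert/excluded/inflamed call could threshold the CD8+ density inside versus outside CK+ regions. The cutoff values below are placeholders invented for the example, not values taken from the disclosure.

```python
def classify_immunophenotype(cd8_density_tumor, cd8_density_stroma,
                             desert_cutoff=0.01, infiltration_cutoff=0.05):
    """Toy three-way immunophenotype call from the CD8+ density inside CK+
    tumor regions versus the surrounding stroma. Cutoffs are placeholders."""
    if max(cd8_density_tumor, cd8_density_stroma) < desert_cutoff:
        return "desert"       # sparse CD8+ infiltration everywhere
    if cd8_density_tumor >= infiltration_cutoff:
        return "inflamed"     # CD8+ T cells colocated with tumor cells
    return "excluded"         # CD8+ T cells confined to the stroma
```

In the automated method, such cutoffs would be derived from the spatial distribution metrics rather than fixed by hand, which is what makes the classification repeatable.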
Second, the assessment is affected by intra-tumor heterogeneity, wherein multiple images generated from the same sample, or multiple samples imaged from the same tumor, are classified differently according to the different depictions of biological objects therein. In manual methods, particularly with large samples, it may be difficult to reconcile or otherwise account for these differences in classification. In contrast, the automated digital method discussed herein considers more variables with repeatable results, widening the set of analyzed factors through the use of spatial distribution metrics. Although described in the context of assessing immunophenotype based on the presence and density of CD8+ T cells among CK+ tumor cells, similar principles apply to any case in which exposure of a sample to one or more stains allows different types of biological objects to be distinguished in a digital pathology image of the sample.
Principles and quantitative methods from advanced analytics (e.g., spatial statistics) can be applied to generate novel solutions for the analysis of digital pathology images for classification and prediction purposes. The techniques provided herein may be used to process digital pathology images to generate results that characterize the spatial distribution and/or spatial pattern of one or more particular types or classes of depicted objects (e.g., biological objects). The processing may include detecting depictions of biological objects (e.g., biological cells corresponding to each of a plurality of types) and/or performing specialized image segmentation at the region or pixel level for each of a plurality of specific types. Object detection may include: for each region of a set of regions within the digital pathology image and for each of a plurality of particular biological object types, identifying a lower-order metric (e.g., a count, density, or image intensity inferred to represent the number of biological objects of the particular type present within the corresponding image region) or a higher-order metric defined as depending on and associated with such counts or lower-order metrics. Furthermore, the spatial distribution metrics may be used in combination with other data modalities (e.g., RNA sequencing, radiological imaging (CT, MRI, etc.)) to increase their predictive ability or to discover new biomarkers addressing unmet medical needs.
Image locations of one or more biological object depictions may be determined. The image locations may be determined and represented according to one or more spatial analysis frameworks, such as a spatial region analysis framework. As an example, a biological object depiction may, jointly with one or more other biological object depictions, be represented or indicated as contributing to: a count of objects detected within a specific region of the image, a density of biological objects detected within a specific region of the image, a pattern of biological objects detected within a specific region of the image, and the like.
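Binning detected object centroids into a spatial-region grid, the lower-order count from which densities and spatial statistics are derived, can be sketched as follows; the grid parameters are illustrative.

```python
def bin_detections(points, grid_size, cell_size):
    """Bin (x, y) object-detection centroids into a grid_size x grid_size
    spatial-region grid, returning the per-cell counts (the lower-order
    metric from which densities and spatial statistics are derived)."""
    counts = [[0] * grid_size for _ in range(grid_size)]
    for x, y in points:
        col = min(int(x // cell_size), grid_size - 1)
        row = min(int(y // cell_size), grid_size - 1)
        counts[row][col] += 1
    return counts
```

Running this once per biological object type yields the aligned per-region count vectors consumed by overlap metrics such as the Morisita-Horn index.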
The digital pathology image processing system may use the spatial distribution metrics to facilitate identifications such as: diagnosis, prognosis, treatment assessment, treatment selection, and/or treatment qualification (e.g., qualification that a subject be accepted or recommended for a clinical trial or a particular clinical trial group). For example, a particular prognosis may be identified in response to detecting that CD8+ T cells infiltrate into CK+ tumor cells to some extent. As another example, a diagnosis of a tumor or cancer stage may be informed based on the degree to which immune cells (e.g., CD8+ T cells) spatially bind to cancer cells (e.g., CK+ tumor cells). As yet another example, the effect of a treatment may be determined to be higher when the spatial proximity of CD8+ T cells relative to CK+ tumor cells after initiation of the treatment is smaller relative to pre-treatment, or relative to an expected proximity based on one or more previous assessments performed on the given subject.
Biological object detection (or detection of depicted biological objects) may be used to generate results that may include or may be based on spatial distribution metrics, which may indicate proximity between depictions of the same or different types of biological objects and/or the extent of co-localization of depictions of one or more types of biological objects. Co-localization of depictions of biological objects may represent similar locations of multiple cell types within each of one or more regions of a digital pathology image. The results may indicate and/or predict interactions between different biological objects and types of biological objects that may occur within the microenvironment of structures in the subject or patient, as indicated by samples collected from the subject or patient. Such interactions may support and/or be critical to biological processes such as tissue formation, homeostasis, regenerative processes, or immune responses. Thus, the spatial information conveyed by the results may provide information about the function and activity of a particular biological structure, and thus may be used as a quantitative basis for classifying a sample or characterizing a disease state and prognosis or predicting therapeutic effect and other subject outcomes.
A plurality of spatial distribution metrics may be generated. For example, one or more metrics may be generated using a spatial region analysis framework. The metric may characterize a count or density of depictions of a biological object of a first type relative to other depictions of a biological object of a second type within various image regions.
A machine learning model or rule may be used to process one or more metrics, each corresponding to one of the one or more metric types, to generate results such as: diagnosis, prognosis, treatment assessment, treatment selection, treatment qualification (e.g., qualification to be accepted or recommended for a clinical trial or a particular clinical trial group), and/or prediction of gene mutations, gene alterations, biomarker expression levels (including but not limited to genes or proteins), and the like. By way of example and not limitation, the machine learning model may include classification, regression, decision tree, or neural network techniques trained to learn one or more weights to be used in processing the metrics to produce results. Further, machine learning models or rules may be trained to predict or suggest modifications to the procedure used to classify or divide samples, such as modifications to the infiltration and density metrics that form the cutoff values and/or thresholds for classification into one or more immunophenotypes.
The digital pathology image processing system may further identify and learn patterns of the locations and relationships of the detected biological object depictions based in part on one or more spatial distribution metrics. For example, the digital pathology image processing system may detect a pattern of locations, densities, and relationships of the detected biological object depictions in a digital pathology image of a first sample. The digital pathology image processing system may generate a mask or other pattern-storage data structure from the identified patterns. The digital pathology image processing system may use image analysis principles and spatial distribution metrics as described herein to predict diagnosis, prognosis, treatment assessment, treatment selection, and/or treatment qualification. The digital pathology image processing system may store a predicted prognosis, etc., in association with the detected pattern and/or the generated mask. The digital pathology image processing system may receive subject outcomes to verify the predicted prognosis, etc.
When processing a second digital pathology image from a second sample, the digital pathology image processing system may detect a pattern of locations and relationships of the detected biological object depictions in the second digital pathology image. The digital pathology image processing system may identify similarities between the pattern of detected locations and relationships in the second digital pathology image and the mask or stored, detected pattern from the first digital pathology image. The digital pathology image processing system may inform a predicted prognosis, treatment recommendation, or treatment qualification determination based on the identified similarity and/or subject outcome. As an example, the digital pathology image processing system may compare the stored mask with the pattern of locations and relationships of the biological object depictions detected in the second digital pathology image. The digital pathology image processing system may determine one or more spatial distribution metrics for the second digital pathology image and base the comparison of the stored mask with the pattern identified in the second digital pathology image on a comparison of the spatial distribution metrics of the detected biological object depictions in the first and second digital pathology images.
Additionally or alternatively, the digital pathology image processing system may further use the spatial distribution metrics and/or the immunophenotype assigned to the digital pathology image and/or sample to facilitate identification of treatment options. For example, immunotherapy may be selectively recommended after a particular immunophenotype is determined. The suggested therapy may be recommended after comparison to studies of long-term and/or overall survival statistics associated with other subjects having similar immunophenotypes. Furthermore, the details of the spatial distribution metrics used to draw conclusions about the immunophenotype can be used to refine the recommendation, increasing its specificity.
Facilitating identification of a diagnosis, prognosis, treatment assessment, treatment selection, and/or treatment qualification may include automatically generating a potential diagnosis, prognosis, treatment assessment, and/or treatment selection. The automatic identification may be based on one or more learned and/or static rules. The rules may have an if-then format, and a condition may include an inequality and/or one or more thresholds, indicating, for example, that a metric above the threshold is associated with the suitability of a particular treatment. The rules may alternatively or additionally include functions, such as a function that maps a numerical metric to a severity score for the disease or a quantified score for treatment qualification. The digital pathology image processing system may output the potential diagnosis, prognosis, treatment assessment, treatment selection, and/or treatment qualification determination as recommendations and/or predictions. For example, the digital pathology image processing system may provide the output to a locally coupled display, transmit the output to a remote device or an access terminal of the remote device, store the results in a local or remote data storage device, or the like. In this way, a human user (e.g., a physician and/or healthcare provider) may use the automatically generated output or form a different assessment informed by the quantitative metrics discussed herein.
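A minimal sketch (in Python) of rules in both forms described above follows; the threshold values, score mapping, and label strings are illustrative assumptions, not values from this disclosure:

```python
# Hypothetical if-then rule: a spatial distribution metric above a threshold
# is associated with suitability of a particular treatment. Threshold and
# labels are illustrative assumptions.
def treatment_rule(infiltration_metric, threshold=0.4):
    if infiltration_metric > threshold:
        return "candidate for immunotherapy"
    return "not indicated by this rule"

# Hypothetical rule expressed as a function: map a numerical metric to a
# severity score clamped to [0, 1].
def severity_score(metric):
    return min(1.0, max(0.0, metric / 2.0))

print(treatment_rule(0.65))   # above the threshold
print(severity_score(1.2))
```

Learned rules may take the same shape, with the threshold or mapping function fit to training data rather than fixed in advance.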
Facilitating identification of diagnosis, prognosis, treatment assessment, treatment selection, and/or treatment qualification determination may include outputting spatial distribution metrics consistent with the disclosed subject matter. For example, the output may include an identifier of the subject (e.g., a subject's name), stored clinical data related to the subject (e.g., past diagnosis, possible diagnosis, current treatment, symptoms, examination results, and/or vital signs), and the determined spatial distribution metric. The output may include a digital pathology image from which the spatial distribution metric is derived and/or a modified version thereof. For example, a modified version of the digital pathology image may include an overlay and/or marker that identifies each biological object depiction detected in the digital pathology image and/or that identifies the density of biological object depictions detected in the digital pathology image and/or in one or more regions of the image. The modified version of the digital pathology image may further provide information about the depiction of the detected biological object. The output (including the spatial distribution metrics) may then be used by a human user (e.g., physician and/or healthcare provider) to identify or verify a recommended diagnosis, prognosis, treatment assessment, treatment selection, or treatment qualification.
Biological object depictions detected from a single digital pathology image may be used to generate multiple types of spatial distribution metrics. Multiple types of spatial distribution metrics may be used in combination in accordance with the subject matter disclosed herein. The multiple types of spatial distribution metrics may correspond to the same or different frameworks related to, for example, how the location of each biological object depiction is characterized. The multiple types of spatial distribution metrics may include different variable types (e.g., calculated using different algorithms) and may be presented on different value scales. Rules or machine learning models may be used to jointly process multiple types of spatial distribution metrics to generate labels. The labels may correspond to a predicted diagnosis, prognosis, treatment assessment, treatment selection, and/or treatment qualification.
As referred to herein, the term "biological object" may refer to a biological unit. By way of example and not limitation, a biological object may include a cell, an organelle (e.g., a nucleus), a cell membrane, a stroma, a tumor, or a blood vessel. It should be understood that a biological object may comprise a three-dimensional object, and that a digital pathology image may capture only a single two-dimensional slice of the object, which need not even extend entirely through the entire object along the plane of the two-dimensional slice. Nonetheless, references herein may refer to such captured portions as depicting biological objects.
As referred to herein, the term "type of biological object" or biological object type may refer to a category of biological units. By way of example and not limitation, a type of biological object may refer to a cell (in general), a particular type of cell (e.g., a lymphocyte or a tumor cell), a particular classification of cell types (e.g., a CD8+ T cell or a CK+ tumor cell), a cell membrane (in general), and the like. Some disclosure may refer to detecting biological object depictions corresponding to a first type of biological object and other biological object depictions corresponding to a second type of biological object. The first and second types of biological objects may have similar, identical, or different levels of specificity and/or generality. For example, the first and second types of biological objects may be identified as lymphocyte types and tumor cell types, respectively. As another example, a first type of biological object may be identified as lymphocytes and a second type of biological object may be identified as a tumor.
As referred to herein, the term "spatial distribution metric" may refer to a metric that characterizes the spatial arrangement of particular biological object depictions in an image relative to each other and/or relative to other particular biological object depictions. The spatial distribution metric may characterize the extent to which one type of biological object (e.g., a lymphocyte) has infiltrated another type of biological object (e.g., a tumor), is interspersed with another type of object (e.g., tumor cells), is physically proximate to another type of object (e.g., tumor cells), and/or is co-localized with another type of object (e.g., tumor cells).
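One infiltration-style metric of this kind can be sketched as the fraction of lymphocyte centroids falling inside a tumor region. This is a simplified illustration: the rectangular tumor region, coordinates, and function name are hypothetical, and real tumor regions would be arbitrary masks rather than rectangles:

```python
# Illustrative infiltration metric: fraction of lymphocyte centroids inside
# a (hypothetical, rectangular) tumor region.
def infiltration_fraction(lymphocyte_points, tumor_region):
    """tumor_region is (x_min, y_min, x_max, y_max); points are (x, y) centroids."""
    x0, y0, x1, y1 = tumor_region
    inside = sum(1 for x, y in lymphocyte_points if x0 <= x <= x1 and y0 <= y <= y1)
    return inside / len(lymphocyte_points) if lymphocyte_points else 0.0

points = [(2, 2), (3, 4), (9, 9), (1, 8)]
print(infiltration_fraction(points, tumor_region=(0, 0, 5, 5)))  # 2 of 4 inside
```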
FIG. 1 illustrates an interactive system or network 100 (e.g., a specially configured computer system) that can be used to generate and process digital pathology images to characterize relative spatial information of biological objects in accordance with the disclosed subject matter.
The digital pathology image generation system 120 may generate one or more digital images corresponding to a particular sample. For example, the image generated by the digital pathology image generation system 120 may include a stained section of a biopsy sample. As another example, the image generated by the digital pathology image generation system 120 may include a slide image of a liquid sample (e.g., a blood smear). As another example, the image generated by the digital pathology image generation system 120 may include a fluorescence micrograph, such as a slide image depicting Fluorescence In Situ Hybridization (FISH) after a fluorescent probe has bound to a target DNA or RNA sequence.
Some types of samples (e.g., biopsies, solid samples, and/or samples including tissue) may be processed by the sample preparation system 121 to immobilize and/or embed the sample. The sample preparation system 121 may facilitate infiltration of the sample with a fixative (e.g., a liquid fixative such as a formaldehyde solution) and/or an embedding substance (e.g., a histological wax). For example, the fixation subsystem may fix the sample by exposing the sample to the fixative for at least a threshold amount of time (e.g., at least 3 hours, at least 6 hours, or at least 12 hours). The dehydration subsystem may dehydrate the sample (e.g., by exposing the fixed sample and/or a portion of the fixed sample to one or more ethanol solutions) and potentially clear the dehydrated sample using a clearing intermediate (e.g., one that includes ethanol and histological wax). The embedding subsystem may infiltrate the sample with heated (e.g., thus liquid) histological wax (e.g., one or more times for a corresponding predefined period of time). The histological wax may comprise paraffin wax and potentially one or more resins (e.g., styrene or polyethylene). The sample and wax may then be cooled, and the wax-infiltrated sample may then be sealed.
Sample slicer 122 may receive both the fixed and embedded samples and may generate a set of slices (sections). The sample slicer 122 may expose the fixed and embedded samples to cool or cold temperatures. The sample slicer 122 may then cut the frozen sample (or a trimmed version thereof) to produce a set of slices. Each slice may have a thickness of, for example, less than 100 μm, less than 50 μm, less than 10 μm, or less than 5 μm. Each slice may have a thickness of, for example, greater than 0.1 μm, greater than 1 μm, greater than 2 μm, or greater than 4 μm. The cutting of the frozen sample may be performed in a warm water bath (e.g., at a temperature of at least 30 ℃, at least 35 ℃, or at least 40 ℃).
Automated staining system 123 can facilitate staining of one or more sample sections by exposing each section to one or more stains (e.g., hematoxylin and eosin, immunohistochemical stains, or specialty stains). Each slice may be exposed to a predefined volume of stain for a predefined period of time. In certain cases, individual sections are exposed to multiple staining agents simultaneously or sequentially.
Each of the one or more stained sections may be presented to an image scanner 124, which may capture a digital image of the section. The image scanner 124 may include a microscope camera. The image scanner 124 may capture digital images at multiple magnification levels (e.g., using a 10x objective lens, a 20x objective lens, a 40x objective lens, etc.). Manipulation of the image may be used to capture a selected portion of the sample within a desired magnification range. The image scanner 124 may further capture annotations and/or morphological metrics identified by a human operator. In some cases, after capturing one or more images, the section is returned to the automated staining system 123 so that the section may be washed, exposed to one or more other stains, and imaged again. When multiple stains are used, the stains may be selected to have different color profiles such that a first region of the image corresponding to a first section portion that absorbed a first amount of a first stain is distinguishable from a second region of the image (or of a different image) corresponding to a second section portion that absorbed a second amount of a second stain.
It should be appreciated that one or more components of the digital pathology image generation system 120 may operate in conjunction with a human operator. For example, a human operator may move samples across various subsystems (e.g., subsystems of sample preparation system 121 or digital pathology image generation system 120) and/or initiate or terminate operation of one or more subsystems, systems, or components of digital pathology image generation system 120. As another example, some or all of one or more components of the digital pathology image generation system (e.g., one or more subsystems of the sample preparation system 121) may be replaced in part or in whole with actions of a human operator.
Further, it should be appreciated that while the various described and depicted functions and components of the digital pathology image generation system 120 relate to the processing of solid and/or biopsy samples, other embodiments may relate to liquid samples (e.g., blood samples). For example, the digital pathology image generation system 120 may receive a liquid sample (e.g., blood or urine) slide including a base slide, a smeared liquid sample, and a cover slip. The image scanner 124 may then capture an image of the sample slide. Other embodiments of the digital pathology image generation system 120 may involve capturing an image of a sample using advanced imaging techniques such as FISH as described herein. For example, once the fluorescent probe has been introduced into the sample and allowed to bind to the target sequence, an image of the sample can be captured for further analysis using appropriate imaging.
A given sample may be associated with one or more users (e.g., one or more physicians, laboratory technicians, and/or medical providers). The associated user may include a person ordering a test or biopsy that produced the sample being imaged and/or a person having access to receive the results of the test or biopsy. For example, the user may correspond to a physician, pathologist, clinician, or subject from whom the sample is obtained. A user may use one or more devices 130, for example, to initially submit one or more of the following requests (e.g., that identify a subject): the sample is processed by a digital pathology image generation system 120 and the resulting image is processed by a digital pathology image processing system 110.
In some embodiments, the digital pathology image generation system 120 transmits the digital pathology image produced by the image scanner 124 to the user device 130, and the user device 130 communicates with the digital pathology image processing system 110 to initiate automated processing of the digital pathology image. In other embodiments, the digital pathology image generation system 120 provides the digital pathology image produced by the image scanner 124 directly to the digital pathology image processing system 110, for example at the direction of a user of the user device 130. Although not shown, other intermediary devices (e.g., a data storage device connected to a server of the digital pathology image generation system 120 or the digital pathology image processing system 110) may also be used. In addition, for simplicity, only one digital pathology image processing system 110, digital pathology image generation system 120, and user device 130 are shown in network 100. The present disclosure contemplates the use of one or more of each type of system and its components without departing from the teachings of the present disclosure.
The digital pathology image processing system 110 may analyze the received digital pathology image, identify spatial characteristics of the digital pathology image, characterize spatial distribution of biological object depictions therein, and/or provide classification of the corresponding sample based on the analysis of the digital pathology image and the spatial characteristics of the digital pathology image.
The image annotation module 111 may generate and/or receive annotations of the digital pathology image. As an example, the image annotation module may include one or more machine learning models and/or rule-based models for annotating digital pathology images. Annotation of the digital pathology image may include identifying the primary structure shown within the digital pathology image (e.g., tumor beds including stroma and inflammation). The primary structure may be used to calibrate or normalize the processing of digital pathology images by the remaining components of the digital pathology image processing system. In some embodiments, the image annotation module 111 may generate a user interface for presentation to a user (such as a pathologist) to evaluate the digital pathology image and provide annotation of the image or to provide verification of the annotation performed by the annotation model.
The image tiling module 112 may generate tiles from the digital pathology image. The digital pathology image may be provided in the form of a full slide image. Typically, a full slide image is significantly larger than a standard image (e.g., approximately 100,000 pixels by 100,000 pixels) and is much larger than would generally be feasible for standard image recognition and analysis. For ease of analysis, the image tiling module 112 subdivides each full slide image into tiles. The size and shape of the tiles are typically uniform for analysis purposes, although they may be varied. In some embodiments, tiles may overlap to increase the chance that image context near tile borders is properly analyzed by the digital pathology image processing system 110. To avoid redundant computation, non-overlapping tiles may be preferred.
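The tiling step can be sketched as follows (in Python); the image dimensions and tile size here are illustrative, and real full slide images are on the order of 100,000 x 100,000 pixels:

```python
# Sketch of subdividing a full slide image into fixed-size tiles, with an
# optional overlap as discussed above.
def tile_origins(width, height, tile_size, overlap=0):
    """Return (x, y) upper-left corners of tiles covering a width x height image."""
    step = tile_size - overlap
    origins = []
    for y in range(0, height, step):
        for x in range(0, width, step):
            origins.append((x, y))
    return origins

# A 1000x1000 image with 500-pixel non-overlapping tiles -> a 2x2 grid.
print(len(tile_origins(1000, 1000, tile_size=500)))               # 4
print(len(tile_origins(1000, 1000, tile_size=500, overlap=100)))  # 9 (overlapping)
```

Overlapping tiles increase the tile count (and thus computation) in exchange for redundant coverage at tile borders, which is the trade-off noted above.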
The pixel-based segmentation module 113 may generate image segmentations for digital pathology images and/or tiles generated from digital pathology images. In particular embodiments, segmentation may be performed on a per-pixel basis, although areas of the digital pathology image larger than a single pixel and smaller than a tile may also be used. The pixel-based segmentation module 113 may use properties of the digital pathology image to accurately segment tiles generated from the digital pathology image, ideally into segments corresponding to the various biological objects of interest. The pixel-based segmentation module 113 may use the intensity of color channels associated with one or more known effects of the stains used to process the sample corresponding to the digital pathology image. Various stains may be associated with specific color values, and these color values may correspond to certain known biological objects. By way of example, a first stain may cause CD8+ T cells to appear dark brown due to reactivity with the CD8 IHC stain. The pixel-based segmentation module 113 may thus identify pixels that have high intensities in color channels mapped to brown and relatively low intensities in other channels. These identified pixels may form part of the segmentation channel associated with CD8+ T cells. Similarly, a second stain may cause CK+ tumor cells to appear intensely magenta due to reactivity with the PanCK IHC stain. The pixel-based segmentation module 113 can thus identify pixels that have high intensities in the color channels mapped to magenta and relatively low intensities in other channels, and associate these pixels with the segmentation channel associated with CK+ tumor cells. Although specific colors and color channels are discussed, other suitable color characterizations and color identifications are contemplated, particularly when different stains and different types of stains are used.
The pixel-based segmentation module 113 may also segment tiles using shapes, edges, patterns, and other attributes of the digital pathology image. Additional morphological operations may be performed to consolidate and identify regions associated with a first biological object type (e.g., regions associated with CD8+ T cells) and regions associated with a second biological object type (e.g., regions associated with CK+ tumor cells).
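The per-pixel color-channel logic above can be sketched as follows (in Python). The RGB thresholds and label names are illustrative assumptions, not calibrated values for actual CD8 or PanCK IHC stains, and production systems would typically use color deconvolution rather than raw RGB thresholds:

```python
# Minimal sketch of per-pixel segmentation by stain color: pixels with high
# intensity in the channels mapped to a stain's color and low intensity
# elsewhere are assigned to that stain's segment. Thresholds are assumptions.
def segment_pixels(pixels, high=150, low=100):
    """pixels: {(x, y): (r, g, b)}. Returns {(x, y): segment label}."""
    labels = {}
    for xy, (r, g, b) in pixels.items():
        if r > high and b > high and g < low:    # magenta-leaning (red + blue)
            labels[xy] = "tumor_cell_segment"
        elif r > high and g > low and b < low:   # brown-leaning (red + green, low blue)
            labels[xy] = "t_cell_segment"
        else:
            labels[xy] = "background"
    return labels

sample = {(0, 0): (200, 50, 200), (0, 1): (180, 120, 40), (1, 1): (90, 90, 90)}
print(segment_pixels(sample))
```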
The density assessment module 114 may automatically detect the density of depictions of one or more particular types of objects (e.g., biological objects) in each segmented tile of the digital pathology image. As described herein, an object type may include, for example, a type of biological structure, such as a cell. For example, a first set of biological objects may correspond to a first cell type (e.g., CD8+ T cells, etc.), and a second set of biological objects may correspond to a second cell type (e.g., tumor cells, etc.) or a type of biological structure (e.g., a tumor, malignancy, etc.). The density detection may be based on the pixel-based segmentation of individual tiles. As an example, density detection may include binning the pixel-segmented regions of a tile and counting or otherwise measuring the magnitude of pixels associated with certain biological objects relative to the pixel-segmented regions of the tile associated with other biological objects. Additionally or alternatively, the density assessment module 114 may generate tile-based local density measurements by comparing the intensity or magnitude level of biological objects detected within the tile to a threshold level. Based on the density assessment, the density assessment module 114 or one or more other modules of the digital pathology image processing system 110 may assign each tile a classification corresponding to whether one or more of the types of biological objects (e.g., CD8+ T cells, CK+ tumor cells) is dominant in the particular tile. The classification may indicate, for example, the presence or absence of a type of biological object (e.g., a tile may be classified as CD8+ T cell positive or negative and CK+ tumor cell positive or negative). The density values generated for each tile may be provided to the object distribution detector 115.
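The tile-level density measurement and positive/negative classification can be sketched as follows (in Python); the threshold fraction and label strings are illustrative assumptions:

```python
# Sketch of tile-level density and classification: count segmented pixels of
# a given object type within a tile and compare against a threshold fraction.
def classify_tile(pixel_labels, label, threshold=0.1):
    """pixel_labels: list of per-pixel segment labels within one tile.
    Returns (density, 'positive'/'negative') for the given object type."""
    density = pixel_labels.count(label) / len(pixel_labels)
    return density, ("positive" if density >= threshold else "negative")

# A hypothetical 100-pixel tile: 30% T-cell pixels, 10% tumor-cell pixels.
tile = ["t_cell"] * 30 + ["tumor_cell"] * 10 + ["background"] * 60
print(classify_tile(tile, "t_cell"))       # (0.3, 'positive')
print(classify_tile(tile, "tumor_cell"))   # (0.1, 'positive')
```

Each tile's pair of classifications (e.g., T-cell positive/negative, tumor-cell positive/negative) then feeds the downstream spatial distribution analysis.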
The object distribution detector 115 may analyze the raw density values for the digital pathology image and generate one or more spatial distribution metrics. The object distribution detector 115 may use static rules and/or trained models to detect and characterize biological objects depicted in tiles of the digital pathology image based at least in part on the density values. The object distribution detector 115 may generate and/or characterize a spatial distribution of one or more objects associated with a particular density of a particular type of biological object and/or a tile of the digital pathology image. The distribution may be generated, for example, using one or more static rules (e.g., rules that identify how to use absolute or smoothed counts or densities of biological objects within a grid region of the digital pathology image) and/or using a trained machine learning model (e.g., one that may predict how initial object depiction data should be adjusted according to a predicted quality of the one or more digital pathology images). For example, the characterization may indicate the extent to which depictions of particular types of biological objects are densely clustered relative to each other or are distributed across all or a portion of the image; how closely together (relative to each other) depictions of particular types of biological objects are compared to depictions of other types of biological objects; how close depictions of one or more particular types of biological objects are to depictions of one or more other types of biological objects; and/or the extent to which depictions of one or more particular types of biological objects are located within and/or near an area defined by one or more depictions of one or more other types of biological objects. As described in more detail below in connection with FIG. 2, the object distribution detector 115 may initially generate a representation of the biological objects using a particular framework (e.g., a spatial region analysis framework, etc.).
In addition to using the density values prepared by the density assessment module 114, the object distribution detector 115 may also detect biological objects depicted in one or more tiles of the digital pathology image and prepare spatial distribution metrics therefrom. Rule-based biological object detection may include: detecting one or more edges; identifying a sufficiently connected and shape-closed subset of edges; and/or detecting one or more high intensity regions or pixels. If, for example, the area of the region within the closed edge is within a predefined range and/or if the high intensity region has a size within a predefined range, a portion of the digital pathology image may be determined to depict the biological object. Detecting biological object delineations using the trained model may include: neural networks are employed, such as convolutional neural networks, deep convolutional neural networks, and/or graph-based convolutional neural networks. The model may have been trained using annotated images that include annotations indicating the location and/or boundaries of objects. Annotated images may have been received from a data store (e.g., a public data store) and/or from one or more devices associated with one or more human annotators. The model may have been trained using generic or natural images (e.g., generally, not just images captured for digital pathology or medical use). This may extend the ability of the model to distinguish between different types of biological objects. The model may have been trained using a training set of specialized images (such as digital pathology images) that have been selected for training the model to detect a particular type of object.
Rule-based biological object detection and trained model biological object detection may be used in any combination. For example, rule-based biological object detection may detect the delineation of one type of biological object, while a trained model is used to detect the delineation of another type of biological object. Another example may include: the results from the rule-based biological object detection are verified using the biological object output by the trained model, or the results of the trained model are verified using a rule-based method. Yet another example may include: using rule-based biological object detection as initial object detection, then using a trained model for finer biological object analysis; or applying a rule-based object detection method to the image after detecting the depiction of the set of initial biological objects via the trained network.
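One of the combinations described above, rule-based verification of trained-model output, can be sketched as follows (in Python). The model is stubbed as a list of candidate detections, and the area bounds and field names are illustrative assumptions:

```python
# Sketch: candidate detections from a (stubbed) trained model are verified
# with a rule that the enclosed area must fall within a predefined range,
# as described for rule-based detection above. Bounds are assumptions.
def rule_check(candidate, min_area=20, max_area=500):
    """candidate: dict with an 'area' field in pixels."""
    return min_area <= candidate["area"] <= max_area

def verified_detections(model_candidates):
    return [c for c in model_candidates if rule_check(c)]

# Stubbed model output: the middle candidate is too large to be a single cell.
candidates = [{"id": 1, "area": 60}, {"id": 2, "area": 900}, {"id": 3, "area": 35}]
print([c["id"] for c in verified_detections(candidates)])  # [1, 3]
```

The reverse combination (rule-based detection first, model refinement second) follows the same pattern with the roles of the two stages swapped.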
For each detected biological object and/or each tile of the digital pathology image for which a density value has been generated, the object distribution detector 115 may identify and store a representative location (e.g., centroid or midpoint) of the depicted biological object, a set of pixels or voxels corresponding to edges of the depicted object, and/or a set of pixels or voxels corresponding to an area of the depicted biological object. The biological object data may be stored with metadata of the biological object, which may include, by way of example and not limitation, an identifier of the biological object (e.g., a digital identifier), an identifier of a corresponding digital pathology image, an identifier of a corresponding region within a corresponding digital pathology image, an identifier of a corresponding subject, and/or an identifier of a type of object.
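A record for a detected biological object depiction and its metadata can be sketched as a simple data structure (in Python); the field names are illustrative assumptions rather than a prescribed schema:

```python
# Sketch of storing a detected biological object depiction with its metadata:
# identifiers, object type, a representative location, and edge pixels.
from dataclasses import dataclass, field

@dataclass
class DetectedObject:
    object_id: str
    object_type: str              # e.g., "CD8+ T cell"
    image_id: str                 # corresponding digital pathology image
    region_id: str                # corresponding region/tile within the image
    centroid: tuple               # representative location (x, y)
    edge_pixels: list = field(default_factory=list)

obj = DetectedObject("obj-001", "CD8+ T cell", "img-42", "tile-7", (120.5, 88.0))
print(obj.object_type, obj.centroid)
```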
The immunophenotyping module 116 may predict an immunophenotype for a sample depicted in the digital pathology image based at least in part on the tile classifications generated by the density assessment module 114 and the spatial distribution metrics generated by the object distribution detector 115 for the digital pathology image. The immunophenotyping module 116 may include and/or use one or more machine learning models or rule systems in performing the evaluation. As an example, the model may be learned by a supervised learning process based on training data including a set of digital pathology images, their associated tile classifications and spatial distribution metrics, and a known or pre-assigned immunophenotype for each image. As described herein, the object distribution detector 115 may generate tens of spatial distribution metrics based on the density data of the tiles. The classifier may identify one or more hyperplanes that separate sets of digital pathology images (and underlying samples) in a feature space spanned by the various spatial distribution metrics. The process is supervised in that the set of candidate immunophenotypes is known in advance and may be limited to immunophenotypes specified for a particular type of tumor or biological structure. In some embodiments, learning may instead be an unsupervised process in which the model learns to categorize the training data itself into groups, without a prespecified number or set of groups. Once trained, the immunophenotyping module 116 may use the model to evaluate new input data and assign the input digital pathology images to the appropriate immunophenotype.
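A toy sketch of supervised immunophenotyping over spatial distribution metrics follows (in Python). A nearest-centroid classifier is used here purely for illustration in place of the hyperplane-based classifier described above; the phenotype names and feature values are illustrative assumptions:

```python
# Toy nearest-centroid classifier in a feature space of spatial distribution
# metrics. Phenotype labels ("inflamed", "desert") are illustrative.
import math

def train_centroids(examples):
    """examples: list of (feature_vector, phenotype). Returns per-class centroids."""
    sums, counts = {}, {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, v in enumerate(features):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def predict(centroids, features):
    return min(centroids, key=lambda label: math.dist(centroids[label], features))

training = [([0.8, 0.7], "inflamed"), ([0.9, 0.6], "inflamed"),
            ([0.1, 0.2], "desert"), ([0.2, 0.1], "desert")]
model = train_centroids(training)
print(predict(model, [0.75, 0.65]))  # "inflamed"
```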
The response assessment module 117 can use the spatial distribution metrics and assigned immunophenotype to generate one or more subject-level labels. The subject-level labels may include labels determined for a single subject (e.g., a patient), a defined group of subjects (e.g., patients with similar characteristics), a clinical study group, and so forth. For example, a label may correspond to a potential diagnosis, prognosis, treatment assessment, treatment recommendation, or treatment qualification determination. The labels may be generated using predefined or learned rules. For example, a rule may indicate that certain immunophenotypes are associated with a particular treatment recommendation, or that a particular spatial distribution metric above a predefined threshold is associated with a particular medical condition (e.g., as a potential diagnosis), while a metric below the threshold is not. As another example, a rule may indicate that a particular treatment is to be recommended when the spatial distribution metric is within a predefined range (and not other ranges) and the assigned immunophenotype is one of two categories. As yet another example, the rules may determine different therapeutic effect bands based on a ratio of a spatial distribution metric corresponding to the most recently acquired digital pathology image to a stored baseline spatial distribution metric corresponding to a previously acquired digital pathology image.
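The ratio-based effect-band rule described last can be sketched as follows (in Python); the band boundaries and band names are illustrative assumptions:

```python
# Sketch of ratio-based response assessment: compare a current spatial
# distribution metric to a stored baseline metric and map the ratio to a
# therapeutic effect band. Boundaries and names are assumptions.
def response_band(current_metric, baseline_metric):
    ratio = current_metric / baseline_metric
    if ratio >= 1.5:
        return "strong response"
    if ratio >= 1.1:
        return "partial response"
    if ratio >= 0.9:
        return "stable"
    return "no response"

print(response_band(current_metric=0.6, baseline_metric=0.3))    # ratio 2.0
print(response_band(current_metric=0.28, baseline_metric=0.3))   # ratio ~0.93
```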
The output generation module 118 may generate a plurality of user interfaces, reports, and graphics corresponding to the digital pathology image, the underlying sample, the patient, or other unique associations to convey various assessments made by the digital pathology image processing system. As an example, the output generation module 118 may generate a user interface or graphic corresponding to the annotation generated or received by the image annotation module 111. The annotations may be presented as an overlay of the original or edited digital pathology image. As another example, the output generation module 118 may generate an annotated or interactive representation of the digital pathology image based on the segmentation generated by the pixel-based segmentation module 113. Similarly, the output generation module 118 may generate a heat map corresponding to the density values for each tile of the digital pathology image generated by the density assessment module 114. In addition, the output generation module 118 may prepare a report of the immunophenotype predicted by the immunophenotyping module 116 for the digital pathology image and/or a response assessment prepared by the response assessment module 117. In general, the output generation module 118 may provide insight into the operation of the digital pathology image processing system 110, thereby helping to review the accuracy of the system and helping pathologists or other researchers understand the mechanisms and reasons for a particular assessment. The output may include a local presentation or transmission (e.g., to the user device 130).
The training controller 119 of the digital pathology image processing system 110 may control the training of one or more machine learning models and/or functions used by the digital pathology image processing system 110. In some cases, some or all of the models and functions are trained together by the training controller 119. In some cases, the training controller 119 may selectively train a model for use by the digital pathology image processing system 110. As embodied herein, the training controller 119 may select, retrieve, and/or access training data comprising a set of digital pathology images and spatial distribution metrics generated from the digital pathology images. The training data may further include a set of immunophenotypes assigned to each of the digital pathology images. During training operations, the training controller 119 may cause the digital pathology image processing system to process and assign immunophenotypes and/or response assessments for a subset of the digital pathology images in the training data. The output for each of the digital pathology images may be compared to the predetermined immunophenotype and/or outcome in the training data. Based on the comparison, one or more scoring functions may be used to evaluate the accuracy and precision of the machine learning model under test. The training process may be repeated multiple times and may be performed using one or more subsets or segments of the training data. For example, during each training epoch, a random sampling of digital pathology images from the training data may be provided as input.
As an example, the training controller 119 may use a scoring function that penalizes variability or differences between the provided immunophenotypes and outcomes and the output generated by the immunophenotyping module 116 and the response assessment module 117. Scoring functions may be designed, for example, to motivate the system to learn to recognize a particular immunophenotype or outcome and/or to indicate particular criteria for a particular immunophenotype or outcome as described herein. The results of the scoring function may be provided to the machine learning model being trained, which applies or saves modifications to the model to optimize the score. After the model is modified, another training epoch may begin with a new random sample of the input training data.
The training controller 119 further determines when to stop training. For example, the training controller 119 may determine to train a machine learning model or other algorithm used by the digital pathology image processing system for a set number of cycles. As another example, the training controller 119 may determine to train the digital pathology image processing system until the scoring function indicates that the model has passed a threshold of success. As another example, the training controller 119 may periodically pause training and provide a test set of digital pathology images for which the results are known. The training controller 119 may evaluate the output of the digital pathology image processing system 110 against the known results to determine the accuracy of the digital pathology image processing system. Once the accuracy reaches a set threshold, the training controller 119 may stop training.
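The stop-training criterion based on a held-out accuracy threshold might be sketched as follows; `evaluate_epoch`, the threshold, and the epoch budget are hypothetical placeholders, not the controller's actual interface:

```python
def train_until_accurate(evaluate_epoch, accuracy_threshold=0.95, max_epochs=100):
    """Run training epochs until a held-out accuracy threshold is met or
    the epoch budget is exhausted; returns (epochs_run, last_accuracy).

    evaluate_epoch(epoch) is assumed to train one epoch and return the
    accuracy of the model on a test set with known results."""
    accuracy = 0.0
    for epoch in range(1, max_epochs + 1):
        accuracy = evaluate_epoch(epoch)
        if accuracy >= accuracy_threshold:
            # Accuracy reached the set threshold: stop training early.
            return epoch, accuracy
    # Budget exhausted: stop after a set number of cycles.
    return max_epochs, accuracy
```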
Each component and/or system in fig. 1 may include, for example, one or more computers, one or more servers, one or more processors, and/or one or more computer-readable media. A single computing system (having one or more computers, one or more servers, one or more processors, and/or one or more computer-readable media) may include the various components depicted in fig. 1. For example, the digital pathology image processing system 110 may include a single server and/or collection of servers that collectively implement an image annotation module 111, an image segmentation module 112, a pixel-based segmentation module 113, a density assessment module 114, an object distribution detector 115, an immunophenotyping module 116, a response assessment module 117, an output generation module 118, and/or a training controller 119.
FIG. 2 shows an illustrative biological object model computing system 200 for processing object data to generate spatial distribution metrics. The object distribution detector 115 may comprise a portion or all of the system 200.
Biological object model computing system 200 may include multiple subsystems, but only the region processing subsystem 210 is shown and described here for emphasis. Each of the subsystems may correspond to and use a different framework to generate spatial distribution metrics or their constituent data, such as a region analysis framework 230, a point process analysis framework, a geostatistical framework, a graph framework, and the like. The region analysis framework 230 may be a framework in which data (e.g., the location of the depicted biological objects or the density of the biological objects) is indexed using a coordinate and/or spatial lattice (e.g., tiles) rather than by individual biological object depictions. The region analysis framework 230 may support the generation of one or more metrics that characterize the spatial pattern and/or distribution formed across the depictions of one or more biological objects of each of one or more types.
The region analysis framework 230 may index the data using coordinates and/or a spatial lattice. The region processing subsystem 210 may apply a region analysis framework 230 to identify or reference a density for each of a set of coordinates and/or regions associated with an image area. Density may be identified using one or more of lattice-based partitioner 265, grid-based cluster generator 270, and/or hotspot monitor 275 or other techniques described herein.
The lattice-based partitioner 265 may apply a spatial lattice to the image, including a representation of the location of the depicted biological object on the image. Applying the spatial lattice to the image may include dividing the image into a plurality of tiles, such as by the image segmentation module 112. The spatial lattice (comprising a set of rows and a set of columns) may define a set of regions (e.g., tiles), where each region corresponds to a row-column combination. Each row may have a defined height and each column may have a defined width such that each region of the space lattice may have a defined area.
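Applying a spatial lattice of fixed-height rows and fixed-width columns to an image can be sketched as follows; this is a minimal illustration in which tile dimensions are arbitrary and edge tiles are simply truncated to the image boundary:

```python
def lattice_regions(width, height, tile_w, tile_h):
    """Enumerate the regions of a spatial lattice laid over an image.

    Returns a list of (row, col, x0, y0, x1, y1) tuples, one per
    row-column combination; edge tiles may be smaller than tile_w x tile_h."""
    regions = []
    for row, y0 in enumerate(range(0, height, tile_h)):
        for col, x0 in enumerate(range(0, width, tile_w)):
            regions.append((row, col, x0, y0,
                            min(x0 + tile_w, width), min(y0 + tile_h, height)))
    return regions
```

Each returned tuple identifies a tile by its row-column combination and pixel bounds, which is the indexing the region analysis framework uses in place of individual object depictions.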
The lattice-based partitioner 265 may determine an intensity metric using the spatial lattice and the locations associated with each tile within the lattice. For example, for each lattice region, the intensity metric may indicate and/or may be based on an absolute or relative number or density of biological object depictions of each of one or more types within the region. The intensity metrics (e.g., density) may be normalized and/or weighted based on the total number of biological objects (e.g., of a given type or all types) detected within the tile, the digital pathology image, and/or the scale of the sample and/or digital pathology image. In particular embodiments, the intensity metrics are smoothed and/or otherwise transformed. For example, the initial count may be thresholded such that the final intensity metric is binary or presented on a normalized scale (e.g., 0 to 1, inclusive). The binary metric may include determining whether the lattice region is associated with a density that meets a threshold (e.g., whether at least fifty percent of the tile includes pixels segmented as associated with a particular stain). The lattice-based partitioner 265 may use the region data to generate one or more spatial distribution metrics by, for example, comparing intensity metrics between different types of biological objects (e.g., comparing the density of CD8+ T cells and the density of CK+ tumor cells across tiles).
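A minimal sketch of such an intensity metric, assuming the input is a per-tile boolean mask of stain-reactive pixels and a fifty-percent binarization threshold:

```python
def tile_intensity(stain_mask, threshold=0.5):
    """Compute a per-tile intensity metric from a boolean pixel mask.

    stain_mask is a list of rows, where each pixel is True if it was
    segmented as associated with a particular stain. Returns the
    normalized fraction of positive pixels and its binarized value."""
    total = sum(len(row) for row in stain_mask)
    positive = sum(sum(1 for px in row if px) for row in stain_mask)
    fraction = positive / total if total else 0.0
    # Binary metric: does at least `threshold` of the tile depict the stain?
    return fraction, fraction >= threshold
```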
Grid-based cluster generator 270 may generate one or more spatial distribution metrics based on cluster-related data related to one or more biological object types. For example, for each of one or more biological object types, a clustering technique and/or fitting technique may be applied to determine the extent to which depictions of biological objects of that type (e.g., CD8+ T cells) spatially aggregate with each other and/or with depictions of biological objects of another type (e.g., CK+ tumor cells). Clustering techniques and/or fitting techniques may further be applied to determine the extent to which the depictions of biological objects are spatially dispersed and/or randomly distributed. For example, the grid-based cluster generator 270 may determine a Morisita-Horn index and/or Moran's I. For example, a single metric may indicate how closely depictions of one type of biological object spatially aggregate with and/or approximate depictions of another type of object.
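The Morisita-Horn index mentioned above can be computed from two per-tile density vectors as in the following sketch; the vectors are assumed to hold, for example, CD8+ and CK+ stained area per tile:

```python
def morisita_horn(x, y):
    """Morisita-Horn overlap between two per-tile density vectors.

    Returns 0 for fully disjoint distributions and 1 for identical
    relative distributions across the tiles."""
    X, Y = sum(x), sum(y)
    if X == 0 or Y == 0:
        return 0.0
    # Simpson-style concentration of each distribution.
    dx = sum(v * v for v in x) / (X * X)
    dy = sum(v * v for v in y) / (Y * Y)
    # Cross-product term rewards co-occurrence in the same tiles.
    return 2 * sum(a * b for a, b in zip(x, y)) / ((dx + dy) * X * Y)
```

A value near 1 would suggest the two object types are distributed over the same tiles (e.g., co-localized), while a value near 0 suggests spatial separation.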
The hot/cold spot monitor 275 may perform an analysis to detect any "hot spot" locations of the digital pathology image where a depiction of one or more particular types of biological objects may be present or any "cold spot" locations where a depiction of one or more particular types of biological objects may not be present. The intensity metrics of the lattice partitions may be used, for example, to identify local intensity extrema (e.g., maxima or minima) and/or to fit one or more peaks (which may be characterized as hot spots) or one or more valleys (which may be characterized as cold spots). The Getis-Ord hotspot algorithm may be used to identify any hotspots (e.g., intensities across a set of adjacent pixels that are high enough to be significantly different from other intensities in the digital pathology image) or any cold spots (e.g., intensities across a set of adjacent pixels that are low enough to be significantly different from other intensities in the digital pathology image). In particular embodiments, "significantly different" may correspond to a determination of statistical significance. Once the object type specific hot and cold spots are identified, the hot/cold spot monitor 275 may compare the location, magnitude and/or width of any hot or cold spot detected for one biological object type to the location, magnitude and/or width of any hot/cold spot detected for another biological object type.
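A simplified sketch of a Getis-Ord Gi*-style statistic over a 2-D grid of tile intensities, using binary weights over a tile and its orthogonal neighbors; a real implementation would use a calibrated weight matrix and formal significance testing rather than this toy neighborhood:

```python
import math

def getis_ord_gstar(grid, i, j):
    """Simplified Getis-Ord Gi* z-score for tile (i, j) of a 2-D grid.

    Large positive values flag hot spots (locally high intensity),
    large negative values flag cold spots."""
    values = [v for row in grid for v in row]
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    # Binary weights: the tile itself plus in-bounds orthogonal neighbors.
    neigh = [(i, j), (i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]
    local = [grid[a][b] for a, b in neigh
             if 0 <= a < len(grid) and 0 <= b < len(grid[0])]
    w = len(local)  # number of weights equal to 1
    num = sum(local) - mean * w
    den = s * math.sqrt((n * w - w * w) / (n - 1))
    return num / den if den else 0.0
```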
It should be understood that the various subsystems may include components not depicted and may perform processes not explicitly described. For example, the region processing subsystem 210 may generate a spatial distribution metric corresponding to an entropy-based mutual information measurement, indicating the degree to which information about the location of a depiction of a first type of biological object within a given region reduces uncertainty as to whether a depiction of another biological object (of the same or another type) is present at a location within another region. For example, the mutual information metric may indicate that the location of one biological object type provides information about the location of another biological object type (and thereby reduces entropy). This mutual information may be associated with situations in which cells of one cell type are interspersed with cells of another cell type (e.g., tumor-infiltrating lymphocytes are interspersed within tumor cells).
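The entropy-based mutual information measurement can be sketched for two binarized per-tile presence maps (e.g., tumor-positive tiles and lymphocyte-positive tiles) as follows:

```python
import math

def mutual_information(a, b):
    """Mutual information (in bits) between two binary per-tile maps.

    a and b are equal-length lists of 0/1 values, one entry per tile.
    MI = H(a) + H(b) - H(a, b); higher values mean knowing one map
    reduces more uncertainty about the other."""
    n = len(a)

    def h(probs):
        return -sum(p * math.log2(p) for p in probs if p > 0)

    pa = [a.count(0) / n, a.count(1) / n]
    pb = [b.count(0) / n, b.count(1) / n]
    joint = [sum(1 for x, y in zip(a, b) if (x, y) == (i, j)) / n
             for i in (0, 1) for j in (0, 1)]
    return h(pa) + h(pb) - h(joint)
```

Identical maps yield the full entropy of one map; statistically independent maps yield zero, matching the interspersed-lymphocyte intuition above.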
Biological object model computing system 200 may use various combinations of multiple (e.g., two or more, three or more, four or more, or five or more) spatial distribution metrics (e.g., such as those disclosed herein) to generate a result (which may itself be a spatial distribution metric). The plurality of spatial distribution metrics may include metrics generated using different frameworks (e.g., the region analysis framework 230) and/or metrics generated by different subsystems. For example, a Jaccard index, a Sørensen index, a Bhattacharyya coefficient, a Moran's I computation, a Geary's C computation, a Morisita-Horn index, a Getis-Ord G index, or a co-location quotient may be used to generate a spatial distribution metric, as may other similar spatial distribution metrics based on a framework of lattices and tiles.
The multiple metrics may be combined using one or more user-defined and/or predefined rules and/or using a trained model. For example, a Machine Learning (ML) model controller 295 (separate from and/or integrated into the training controller 119) may train a machine learning model to learn one or more parameters (e.g., weights) that specify how the various lower-level metrics are to be processed together to generate a combined spatial distribution metric. The combined spatial distribution metric may be more accurate, in aggregate, than the individual metrics alone. Additionally or alternatively, the machine learning model controller 295 may train a machine learning model to classify or otherwise make decisions about the provided digital pathology images. As an example, and as described herein, a machine learning model may be trained to learn distance metrics and embeddings that separate classes of immunophenotype based on a training set of data including spatial features and provided immunophenotype classifications. The machine learning model may generate an embedding and feature space that separates categories of immunophenotypes based on the calculated spatial distribution metrics. The architecture of the machine learning model may be stored in the ML model architecture data store 296. For example, the machine learning model may include logistic regression, linear regression, decision trees, random forests, support vector machines, neural networks (e.g., feed-forward neural networks), etc., and the ML model architecture data store 296 may store one or more equations defining the model. Optionally, the ML model hyper-parameter data store 297 stores one or more hyper-parameters that are used to define the model and/or its training but are not learned. For example, the hyper-parameters may specify the number of hidden layers, the dropout rate, the learning rate, etc.
The learned parameters (e.g., corresponding to one or more weights, thresholds, coefficients, etc.) may be stored in ML model parameters data store 298.
Although not shown in fig. 2, biological object model computing system 200 may further include one or more components to aggregate spatial distribution metrics across slices of a subject's sample and generate one or more aggregated spatial distribution metrics. Such aggregated metrics may be generated, for example, by components within a subsystem (e.g., by the hot/cold spot monitor 275), by the subsystem (e.g., by the region processing subsystem 210), by the ML model controller 295, and/or by the biological object model computing system 200. The aggregated spatial distribution metric may comprise, for example, a sum, median, average, maximum, or minimum of a set of slice-specific metrics.
Fig. 3 illustrates a process 300 for classifying biological samples according to immunophenotype and using spatial distribution metrics to provide health-related assessment based on image processing of digital pathology images. More specifically, the digital pathology image may be processed, for example, by a digital pathology image processing system, to generate one or more metrics that characterize the spatial pattern and/or distribution of one or more types of biological objects. The spatial distribution metric may then provide information for diagnosis, prognosis, treatment assessment, or treatment qualification decision.
The process begins at step 310, where digital pathology image processing system 110 may access one or more digital pathology images of a stained tissue sample. For example, the digital pathology image processing system 110 may receive a subject-related identifier. The subject-related identifier may include an identifier of the subject, sample, slice, and/or digital pathology image. The subject-related identifier may be provided by a user (e.g., a medical provider of the subject and/or a physician of the subject). The user may provide the identifier as an input to the user device, which may transmit the identifier to the digital pathology image processing system 110. The digital pathology image processing system may use the identifier to query a local or remote data store to retrieve the digital pathology image. Additionally or alternatively, the digital pathology image processing system 110 may receive images directly from, for example, the user device 130. As another example, a request including a subject-related identifier may be transmitted to another system (e.g., digital pathology image generation system 120), and the response may include a digital pathology image.
The digital pathology image may depict a stained section of a sample from a subject exhibiting a medical condition. The sample may be stained with more than one stain, as described herein, each selected based on known reactivity properties with one or more types of biological objects (e.g., tumor cells and lymphocytes). By way of example, samples may be stained with specific stains or other treatments known to react with tumor cells and lymphocytes in order to enhance the detectability of these biological objects and the relevant areas of the digital pathology image.
At step 320, the digital pathology image processing system 110 may identify a tumor-associated region in the digital pathology image. In particular embodiments, digital pathology image processing system 110 uses a machine learning model trained to identify tumor-related regions within digital pathology images to identify tumor-related regions in digital pathology images. In particular embodiments, digital pathology image processing system 110 identifies tumor-associated regions in digital pathology images through interaction by a pathologist or other user. As an example, the digital pathology image processing system 110 may provide a user interface for display that includes a digital pathology image and one or more interactive elements. The user interface may be provided, for example, to the user device 130 or through a user input device of the digital pathology image processing system 110. The digital pathology image processing system 110 may then receive a selection of one or more tumor-associated regions through interaction with one or more interactive elements.
At step 330, the digital pathology image processing system 110 may subdivide the digital pathology image into a plurality of tiles. The digital pathology image may be provided in the format of a whole slide image or other large-format image. Because a whole slide image is significantly larger than a standard image, the digital pathology image processing system 110 subdivides it, for ease of analysis, into more manageable portions referred to as tiles. The size and shape of the tiles may be uniform or may vary based on the needs of a particular analysis. Additionally, while in some embodiments the tiles do not overlap (e.g., they are mutually exclusive), in other embodiments the tiles may overlap to increase the chance that image context is properly analyzed by the digital pathology image processing system 110. The size and shape of the tiles may be determined automatically by the digital pathology image processing system 110, or may be predetermined by or upon request of one or more users (e.g., by input to the digital pathology image processing system 110 from the user device 130). In particular, the size and shape of the tiles, and of the lattice they form, may be determined based on the type of analysis being performed, including the final outcome assessment, the type of biological object sought, the type of tissue from which the biological sample was extracted, the type of medical condition, or other relevant variables.
At step 340, the digital pathology image processing system 110 may segment each of the tiles into regions based on the reactivity of the biological objects depicted in the tiles to the two or more stains with which the biological sample has been treated. In a particular embodiment, the digital pathology image includes depictions of a plurality of biological object types, and each of the plurality of biological object types is reactive to one of the stains. The digital pathology image processing system 110 may apply a pixel-based segmentation method to segment and classify regions of tiles based on reactivity to a stain. As an example, each of the regions of a tile may be set as the pixels that make up the tile. The digital pathology image processing system 110 may classify each pixel as belonging to or containing one or more of the depicted biological object types based on the color of the region. For example, the digital pathology image processing system 110 may associate pixels having threshold intensities in one or more first color channels (e.g., depicting areas that are reactive to the CD8 IHC stain) with a first biological object type (e.g., CD8+ T cells) and pixels having threshold intensities in one or more second color channels (e.g., depicting magenta in areas that are reactive to the PanCK IHC stain) with a second biological object type (e.g., CK+ tumor cells). The threshold intensity and the specific color channels may be based on the specific stains applied to the biological sample depicted in the digital pathology image, the color being based on the reaction of each of the plurality of biological object types to one of the two or more stains. The association of regions with particular biological object types may be further based on confidence scores of the image segmentation algorithm.
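A toy sketch of such pixel-based color-channel thresholding; the channel rules and the threshold value of 150 are illustrative assumptions, not the calibrated values for the CD8/PanCK stains:

```python
def classify_pixel(rgb, threshold=150):
    """Classify one RGB pixel by illustrative stain-color rules.

    Magenta-dominant pixels (high R and B, low G) are treated as CK+
    tumor stain; red-dominant pixels (high R, low G and B) as CD8 stain.
    These rules are assumptions for illustration only."""
    r, g, b = rgb
    if r >= threshold and b >= threshold and g < threshold:
        return "CK+"
    if r >= threshold and g < threshold and b < threshold:
        return "CD8+"
    return "unstained"
```

A production system would instead derive channel thresholds from stain calibration (e.g., color deconvolution) and attach a confidence score to each classification.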
At step 350, for each tile of the digital pathology image, the digital pathology image processing system 110 may calculate a local density measurement for each of the biological object types. In some embodiments, the digital pathology image processing system 110 may classify a single tile as CK+ (e.g., when the area of tumor cells depicted in the tile, as indicated by pixels depicting the PanCK IHC stain, is greater than 25% of the window area) and/or CK- (e.g., when the presence of T cells, as indicated by pixels depicting the CD8 IHC stain, is greater than 25% of the window area), which allows classification of a given tile as both CK+ and CK-, or as CK+ or CK- alone. From the local density measurements, the digital pathology image processing system 110 may generate a data structure including object information characterizing the depictions of biological objects. The data structure may identify, for example, the location of a depiction of a biological object and/or the location of a tile within the lattice of the digital pathology image. The data structure may further identify the type of biological object (e.g., lymphocyte, tumor cell, etc.) corresponding to the depicted biological object. The calculation may be based on the number of regions (e.g., pixel areas) of the tile classified as associated with each of the two or more stains (e.g., associated with each of the biological object types). In a particular embodiment, for each tile, the local density measurement for each of the plurality of biological object types includes a representation of an absolute or relative amount of depictions of a first biological object type identified as being located within the tile and a representation of an absolute or relative amount of depictions of a second biological object type identified as being located within the tile.
For example, in examples where there are two biological object types of interest, the local density measurement may reflect the absolute number or percentage of regions of each tile associated with each of the two biological object types. The value may be divided by the total number of regions within the tile to yield the percentage of the tile associated with each of the biological object types. Additionally or alternatively, the local density measurement may be expressed as an area value based on a known transformation between the pixel size of the digital pathology image and the corresponding size of the biological sample. In some embodiments, a two-dimensional density profile for each biological object type (e.g., CK+ tumor cells and CD8+ T cells) can be obtained from the local density measurements.
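Computing the relative local density measurement for one tile from its per-region classifications might look like the following sketch:

```python
def local_densities(tile_labels):
    """Per-tile local density measurement.

    tile_labels holds one classification per region (e.g., per pixel) of
    the tile; the result maps each biological object type to the fraction
    of the tile's regions associated with it."""
    total = len(tile_labels)
    counts = {}
    for label in tile_labels:
        counts[label] = counts.get(label, 0) + 1
    # Divide by the total number of regions to obtain relative densities.
    return {label: c / total for label, c in counts.items()}
```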
At step 360, the digital pathology image processing system 110 may generate spatial distribution metrics for the biological object types in the digital pathology image based on the local density measurements calculated for each tile. Each of the spatial distribution metrics characterizes the degree to which at least a portion of the depictions of a first set of biological objects is dispersed relative to at least a portion of the depictions of a second set of biological objects. As described herein, the generated spatial distribution metrics may include one or more of the following: a Jaccard index, a Sørensen index, a Bhattacharyya coefficient, a Moran's I, a Geary's contiguity ratio, a Morisita-Horn index, a co-location quotient, a metric defined based on hot spot/cold spot analysis, or a variation or modification thereof.
In some embodiments, the one or more spatial distribution metrics may quantify spatial patterns of the biological objects, such as co-localization of biological object types, the presence of hot spots of one or more biological object types (including cells), and the like. Such spatial patterns may help assess the level of lymphocyte infiltration into a tumor area (e.g., tumor-infiltrating lymphocytes, or TILs). The spatial patterns may be quantified using one or more spatial analysis methods based on the lattice of the digital pathology image. Each tile may be treated as a spatial unit and its center coordinates may be extracted. The total area of the regions including a particular type of biological object may be calculated for each tile (e.g., based on the number of pixels depicting a staining color that indicates the presence of that type of biological object) and then normalized by dividing by the sum of all areas of that biological type across all tiles of the same slide (e.g., the sum of all areas depicting a tumor region across all tiles of the slide, or the sum of all areas depicting T cells across all tiles of the slide). One or more prevalence maps may be created, where the prevalence value for each tile may be based on the calculated normalized area of a biological object type. The spatial distribution metrics may be derived from the prevalence maps. The spatial distribution metrics may represent, among other things, the co-localization of two biological object types (such as TILs embedded in tumor cells), and/or the spatial distribution of one biological object type (such as TILs in tumor regions).
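The slide-wide normalization that produces a prevalence value per tile can be sketched as:

```python
def prevalence_map(tile_areas):
    """Normalize per-tile stained areas for one biological object type.

    tile_areas holds the stained area (e.g., pixel count) per tile for one
    type; each value is divided by the slide-wide sum, yielding a
    prevalence value per tile that sums to 1 across the slide."""
    total = sum(tile_areas)
    if total == 0:
        return [0.0] * len(tile_areas)
    return [a / total for a in tile_areas]
```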
As described herein, the one or more spatial distribution metrics may include one or more of the following: a Jaccard index, a Sørensen index, a Bhattacharyya coefficient, a Moran's I (including a bivariate Moran's I, a Moran's I for CD8, and/or a Moran's I for CK), a Geary's contiguity ratio or C index (including a Geary's C for CD8 and/or a Geary's C for CK), a Morisita-Horn index, a metric defined based on hot spot/cold spot analysis (e.g., Getis-Ord hot spots, including co-located Getis-Ord hot spots, Getis-Ord hot spots for CD8, and/or Getis-Ord hot spots for CK), a ratio of areas of biological objects (e.g., a ratio of the total area of CD8+ regions to the total area of CK+ regions), a co-location quotient, or variations or modifications thereof.
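Two of the listed metrics, the Jaccard index (over binarized tile maps) and the Bhattacharyya coefficient (over normalized prevalence maps), can be sketched as:

```python
import math

def jaccard_binary(a, b):
    """Jaccard index between two binary per-tile presence maps:
    |intersection| / |union| of the positive tiles."""
    inter = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    return inter / union if union else 0.0

def bhattacharyya(p, q):
    """Bhattacharyya coefficient between two normalized prevalence maps
    (each summing to 1); 1 means identical distributions."""
    return sum(math.sqrt(x * y) for x, y in zip(p, q))
```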
At step 370, the digital pathology image processing system 110 may determine a particular immunophenotype for the digital pathology image based on the local density measurements generated for the tiles of the digital pathology image and the spatial distribution metrics generated for the digital pathology image. The digital pathology image processing system 110 may use the local density measurements and the spatial distribution metrics as inputs to generate an embedding or other representation of the digital pathology image. As an example, the digital pathology image processing system 110 may project a representation of the digital pathology image into an embedding space or feature space defined by the spatial distribution metrics (e.g., having axes based on one or more spatial distribution metrics). The projection and the feature space may be based on a machine learning model trained to generate embeddings in the appropriate feature space. The digital pathology image processing system 110 may then classify the biological sample based on the location of the digital pathology image within the feature space. In particular, the digital pathology image processing system 110 may classify the digital pathology image based on the proximity of its representation to the representations of one or more other digital pathology images in the feature space. These neighboring digital pathology image representations may have pre-assigned or predetermined immunophenotype classifications. The digital pathology image may be assigned an immunophenotype based on the immunophenotypes of its nearest neighbors in the feature space.
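The nearest-neighbor assignment in the feature space might be sketched as follows, with each reference given as an (embedding, immunophenotype) pair; the distance shown is plain Euclidean distance rather than the learned distance metric the text describes:

```python
import math

def nearest_phenotype(query, references):
    """Assign the immunophenotype of the nearest reference embedding.

    query is a point in the spatial-distribution-metric feature space;
    references is a list of (embedding, immunophenotype) pairs with
    pre-assigned classifications."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    return min(references, key=lambda r: dist(query, r[0]))[1]
```

A fuller implementation would vote over the k nearest neighbors and use the embedding model's learned metric instead of raw Euclidean distance.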
At step 380, the digital pathology image processing system 110 may generate a health-related assessment result based on the immunophenotype classification and the one or more spatial distribution metrics. The health-related assessment result may correspond to, for example, a diagnosis, prognosis, treatment assessment, or treatment qualification with respect to a medical condition associated with the subject from which the biological sample was taken. The digital pathology image processing system may use a trained machine learning model to process the immunophenotype and the one or more spatial distribution metrics. As described herein, a machine learning model may be trained to generate assessment results using a set of training elements, each corresponding to another subject having a similar medical condition and for whom a health-related assessment result or classification is known. For example, if the health-related assessment is related to a prediction of overall survival (or survival over a specified period of time), each of the subjects' known outcomes may include information related to the subject's viability. As another example, if the health-related assessment is related to the availability or qualification of a particular treatment (including, for example, a clinical trial), each of the subjects' known outcomes may include information related to inclusion or exclusion criteria, or to survival and recovery outcomes following treatment of the subject.
In some embodiments, the digital pathology image processing system 110 may generate one or more outputs based on the local density measurements, spatial distribution metrics, immunophenotype, or health-related assessment results. The outputs may include one or more visualizations of the digital pathology image enhanced based on the interim calculations or determinations made by the digital pathology image processing system 110. As an example, a first output may include a heat map visualization of the digital pathology image showing the prevalence of one or more biological object types observed by the digital pathology image processing system, based on the local density measurements calculated for each tile of the digital pathology image. As another example, a second output may include an overlay based on the calculated spatial distribution metrics, wherein the overlay presents information associated with relationships between the distributions of the plurality of biological object types. As another example, a third output may include the results of the health-related assessment. The outputs may be provided, for example, through a user interface displayed on the user device 130. One or more outputs may be provided directly to the subject, while some outputs may be limited to medical professionals or to clinical or research environments.
Fig. 4A shows several examples of immunophenotypes that may be associated with digital pathology images. As described herein, the immunophenotypes, and specifically the immunophenotypes associated with CD8+ T-cell infiltration into CK+ tumor cells, may include desert, repulsive, and inflammatory types. Another type of digital pathology image may be referred to as uncertain, because the digital pathology image from the sample does not follow the other known patterns. A digital pathology image may be classified as the desert immunophenotype when only sparse CD8+ infiltrates exist (e.g., the local density measurements of immune cells are less than an immune cell density threshold for multiple tiles). A digital pathology image may be classified as repulsive when there is very little overlap of CD8+ T cells and CK+ tumor cells, or the distribution of CD8+ T cells is limited to the CK- stromal compartments (e.g., the local density measurement of tumor cells is less than a tumor cell density threshold and the local density measurement of immune cells is greater than or equal to the immune cell density threshold for one or more of the plurality of tiles; one or more spatial distribution metrics indicate spatial separation of tumor cells and immune cells for one or more of the plurality of tiles). Alternatively, a digital pathology image may be classified as inflammatory when CD8+ T cells are co-localized with CK+ tumor cells and have substantial overlap (e.g., the local density measurement of tumor cells is greater than or equal to the tumor cell density threshold and the local density measurement of immune cells is greater than or equal to the immune cell density threshold for one or more of the plurality of tiles; one or more spatial distribution metrics indicate co-localization of tumor cells and immune cells for one or more of the plurality of tiles).
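A minimal, illustrative rule-based sketch of these three categories, assuming per-tile (tumor density, immune density) pairs; the thresholds and the exact tile logic are placeholders, not values from the disclosure:

```python
def classify_tile_based(tiles, immune_thr=0.1, tumor_thr=0.1):
    """Toy rule-based immunophenotyping over per-tile local density measurements.

    `tiles` is a list of (tumor_density, immune_density) pairs; both thresholds
    are illustrative placeholders.
    """
    # Desert: sparse CD8+ infiltrate across all tiles.
    if all(immune < immune_thr for _, immune in tiles):
        return "desert"
    # Inflammatory: some tiles show co-localized tumor and immune cells.
    if any(tumor >= tumor_thr and immune >= immune_thr for tumor, immune in tiles):
        return "inflammatory"
    # Repulsive: immune cells present, but confined to low-tumor (stromal) tiles.
    return "repulsive"
```

The disclosed system additionally uses spatial distribution metrics rather than densities alone; this sketch only mirrors the density conditions quoted parenthetically above.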
The digital pathology images of fig. 4A-4F may be images of samples from subjects having non-small cell lung cancer (NSCLC). These samples are formalin-fixed, paraffin-embedded (FFPE) sections that have been subjected to immunohistochemical (IHC) staining, using CD8 for T lymphocyte recognition and PanCK for indicating areas of malignant (and benign) epithelium. For example, the image may show the sections reacting to the CD8 stain in brown (shown in black and white with medium shading) and the sections reacting to the PanCK stain in magenta (shown in black and white with the darkest shading). In some embodiments, the image may show sections that are reactive to a third stain (such as hematoxylin), displayed in blue (shown in a black-and-white image with the lightest shading). The tumor-associated areas (in the digital pathology image) that include viable malignant epithelium are identified by the digital pathology image processing system 110. The digital pathology image is then segmented into a plurality of tiles. The biological samples are classified with a tumor immunophenotype based on the spatial distribution and density of CD8+ T cells.
Digital pathology image 410 (fig. 4A) depicts an example of the desert-type immunophenotype. Although regions of CK+ tumor cells, such as region 411, can be identified, the regions of CD8+ T cells are extremely sparse. Digital pathology image 420 (fig. 4B) depicts an example of the repulsive immunophenotype. In contrast to the desert-type immunophenotype associated with the digital pathology image 410, the repulsive immunophenotype includes regions of CD8+ T cells (e.g., region 421) and regions of CK+ tumor cells (e.g., region 422). However, these regions are relatively separated. The CD8+ T cells aggregate together but generally are not depicted infiltrating the areas of tumor cells. As shown in exemplary image 420, the distribution of CD8+ cells may be limited to the CK-negative stromal compartments.
The digital pathology image 430 (fig. 4C) depicts a first instance of the inflammatory immunophenotype. As with the repulsive immunophenotype, the delineation of the CD8+ T cells from the regions of tumor cells is readily identifiable. However, examination reveals that the T cells have begun to infiltrate more readily into the areas of tumor cells, demonstrating co-localization. In particular, the T cells are shown as being distributed throughout substantially the entire digital pathology image 430. The "type 1" tumor shown in the digital pathology image 430 may show diffuse infiltration involving CK+ regions, with or without involvement of the stromal compartments.
The digital pathology image 440 (fig. 4D) depicts a second instance of the inflammatory immunophenotype. In this example, the CD8+ T cells have begun to aggregate, such as in region 441, although CD8+ T cells remain prevalent, infiltrating throughout the tumor cells. This degree of aggregation may be a distinguishing factor between the inflammatory and repulsive immunophenotypes. The "type 2" tumor shown in the digital pathology image 440 may display a predominant stromal pattern of CD8+ infiltration that "spills" into the CK+ tumor cell aggregates. In some embodiments, the digital pathology image processing system 110 may classify biological samples according to subtype (such as the "type 1" tumor or "type 2" tumor).
Fig. 4E and 4F illustrate other examples of immunophenotypes that may be associated with digital pathology images. In particular, the digital pathology images 450 and 460 are taken from the same sample. Digital pathology images 450 and 460 show intra-tumor heterogeneity in immunophenotype presentation and in the density and pattern of infiltration, even for the same sample. The digital pathology image 450 shows an example of the repulsive immunophenotype. As can be seen by comparison with the other examples, the digital pathology image 450 includes relatively small numbers of tumor cells but large numbers of CD8+ T cells. The digital pathology image 460 (fig. 4F) shows an example of the desert immunophenotype, with only extremely sparse CD8+ regions.
In some embodiments, a digital pathology image may be classified as having a certain tumor immunophenotype when a threshold percentage of tumor regions (e.g., CK+ regions) have a given pattern. For example, if the percentage of tumor regions that show an inflammatory immunophenotype is greater than a pattern threshold (e.g., 20%), the digital pathology image may be classified as inflammatory.
In certain embodiments, digital pathology images may be classified as having a certain tumor immunophenotype based on a co-location quotient (CLQ). The CLQ can assess co-occurrence or avoidance of a pair of cell types by measuring the local density of a target cell type within a fixed radius of each cell of the sample belonging to a reference cell type. For example, when applied to digital pathology images, a CLQ assessment can facilitate determining whether CD8+ T cells (the target cell type) co-localize with (inflammatory type) or avoid (repulsive type) tumor cells (the reference cell type) by measuring the local density of CD8+ T cells within a fixed radius of each of the tumor cells. Thus, a high average local density of CD8+ T cells within a short fixed radius of each of the tumor cells corresponds to co-localization, as observed in the inflammatory class.
The digital pathology image may be classified as desert-type using a hard cutoff threshold, a maximum percentage area occupied by immune cells, or any other suitable measurement indicative of immune cell scarcity. For example, when the sample has fewer than a specified number of immune cells (e.g., 200 cells), the sample may be labeled as desert-type. The remaining samples can be partitioned into two clusters using a binary Gaussian mixture model over CLQ_immune→tumor (the local density of tumor cells in the neighborhood of each immune cell) and CLQ_tumor→immune (the local density of immune cells in the neighborhood of each tumor cell) to distinguish between the repulsive and inflammatory categories. The cluster with the larger mean CLQ_immune→tumor and CLQ_tumor→immune values can be classified as inflammatory. The remaining cluster may be classified as repulsive.
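A simplified sketch of this workflow: it applies the hard desert cutoff and then splits the remaining samples at the midpoint of their mean CLQ values, as an illustrative stand-in for fitting the binary Gaussian mixture model described in the text (sample tuples and the split rule are assumptions):

```python
def phenotype_samples(samples, desert_cells=200):
    """Each sample is (n_immune_cells, clq_immune_to_tumor, clq_tumor_to_immune).

    Desert: hard cell-count cutoff. Remaining samples: two-group split on mean
    CLQ (a midpoint split here; the described method fits a 2-component GMM).
    """
    # Pool the mean CLQ of non-desert samples to pick a split point.
    non_desert = [(a + b) / 2 for n, a, b in samples if n >= desert_cells]
    split = (min(non_desert) + max(non_desert)) / 2 if non_desert else 0.0
    labels = []
    for n, a, b in samples:
        if n < desert_cells:
            labels.append("desert")
        elif (a + b) / 2 >= split:
            labels.append("inflammatory")  # larger-mean CLQ cluster
        else:
            labels.append("repulsive")
    return labels
```

A production implementation would replace the midpoint split with a fitted two-component Gaussian mixture (e.g., scikit-learn's `GaussianMixture`), assigning the cluster with the larger CLQ means to the inflammatory class.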
Fig. 5A to 5B show examples of pixel-based segmentation of CK+ tumor cells and CD8+ T cells. Digital pathology image 500 (fig. 5A) shows a sample that has been treated with one or more stains that react with CK+ tumor cells (which are reactive with the magenta PanCK stain) and CD8+ T cells (which are reactive with the brown CD8 stain). The stains cause the biological objects to be expressed through their color upon review. By analyzing the colors of individual pixels of the digital pathology image and/or tiles of the digital pathology image, the digital pathology image processing system may divide the image into regions associated with CD8+ T cells, CK+ tumor cells, other biological structures, or no biological structures. As an example, the digital pathology image processing system 110, or one or more components thereof, including but not limited to the pixel-based segmentation module 113, may perform color thresholding on color channels known to be associated with application of a stain effective on a biological structure of interest. Additional morphological operations may be performed to consolidate and identify regions associated with a first biological object (e.g., regions associated with CD8+ T cells) and regions associated with a second biological object (e.g., regions associated with CK+ tumor cells). After segmentation, an output may be provided for the digital pathology image and/or its tiles, such as a score indicating the number or percentage of pixels of the digital pathology image and/or each tile that have been segmented into each of the various segments for the present analysis. Fig. 5B shows an overlaid view 550 of the digital pathology image 500, wherein the regions associated with CK+ tumor cells are highlighted (e.g., as in region 555), and the regions not associated with CK+ tumor cells are de-emphasized (e.g., as in region 557).
The final association of pixels of the digital pathology image and/or its tiles with a particular segment associated with a particular biological object may be further based on a thresholding operation, wherein a certain number of pixels of each tile must exceed a particular intensity for any portion to be segmented as associated with the particular biological object. This thresholding operation may be particularly important in cases where a stain used on the digital pathology image is responsive to more than one type of biological object.
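The two-level thresholding described above (a pixel-level intensity threshold, then a per-tile pixel-count threshold) might be sketched as follows; the channel data, tile size, and both thresholds are illustrative assumptions:

```python
import numpy as np

def segment_by_stain(channel, pixel_thr, tile=2, frac_thr=0.25):
    """Threshold a stain's color-channel intensities per pixel, then associate a
    tile with the biological object only if enough of its pixels pass.

    `channel` is a 2D array of stain-channel intensities whose dimensions are
    divisible by `tile`; returns a boolean tile-level association mask.
    """
    binary = channel >= pixel_thr  # pixel-level stain reactivity
    h, w = binary.shape
    # Fraction of positive pixels per tile, then the tile-level threshold.
    frac = binary.reshape(h // tile, tile, w // tile, tile).mean(axis=(1, 3))
    return frac >= frac_thr
```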
FIG. 6 depicts an example of tile-based local density measurements. Fig. 6 shows three masks 610, 620, and 630 that have been generated from a digital pathology image. The masks may be generated, for example, by the pixel-based segmentation module 113, the density assessment module 114, or other suitable components of the digital pathology image processing system 110. Mask 610 is a staining intensity mask for the digital pathology image. The digital pathology image and mask 610 have been divided into four tiles. Each tile includes four pixels. Each pixel is associated with a staining intensity value corresponding to the intensity of a particular stain (e.g., the intensity of a color channel known to reflect stain expression). Northwest tile 611 includes staining intensity values 3, 25, 6, and 30. Southwest tile 612 includes staining intensity values 5, 8, 7, and 9. Northeast tile 613 includes staining intensity values 35, 30, 25, and 3. Southeast tile 614 includes staining intensity values 4, 20, 8, and 5. Since each of the staining intensity values reflects the expression of the stain (e.g., the rate at which biological objects depicted in the corresponding pixels of the digital pathology image absorb or express the staining agent), the staining intensity values may be used to determine which biological objects are displayed in the tiles and their frequency of occurrence.
Mask 620 is a stain-thresholded binary mask derived from the staining intensity mask 610. The individual pixel values of the staining intensity mask 610 have been compared to a predetermined and customizable threshold for the stain of interest. The threshold may be selected according to a scheme reflecting the expected level of staining intensity that corresponds to a confirmed depiction of the correct biological object. The staining intensity values and the threshold may be absolute (e.g., staining intensity values above 20) or relative (e.g., setting the threshold at the top 30% of staining intensity values). Additionally, the staining intensity values may be normalized based on historical values (e.g., based on the overall performance of the stain in many previous analyses) or based on the digital pathology image at hand (e.g., to account for differences in brightness and other imaging variations that may result in the image not accurately displaying the correct staining intensity). In the stain-thresholded binary mask 620, the threshold has been set to a staining intensity value of 20 and applied to all pixels within the staining intensity mask 610. The result is a pixel-level binary mask, where a "1" indicates that the pixel's staining intensity equals or exceeds the threshold, and a "0" indicates that the pixel does not meet the necessary staining intensity.
The mask 630 is an object density mask at the tile level. Based on the assumption that a staining intensity level above the threshold corresponds to the depiction of a particular biological object within the digital pathology image, an operation is performed on the stain-thresholded binary mask 620 to reflect the density of the biological object within each tile. In the exemplary object density mask 630, the operation includes summing the values of the stain-thresholded binary mask 620 within each tile and dividing by the number of pixels within the tile. The northwest tile contains two pixels at or above the threshold staining intensity value out of a total of four pixels, so the value in the object density mask for the northwest tile is 0.5. Similar operations apply to all tiles. Additional operations may be performed to preserve, for example, the locality of each tile, such as sub-tile segmentation and preserving the coordinates of each sub-tile within the lattice. As described herein, the object density mask 630 may be provided to the object distribution detector 115 as a basis for computing spatial distribution metrics. It should be understood that the example depicted in fig. 6 is simplified for discussion purposes only. The number of pixels within each tile and the number of tiles within each digital pathology image can be greatly expanded and adjusted as needed based on computational efficiency and accuracy requirements.
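The Fig. 6 pipeline can be reproduced numerically; the array below encodes the four tiles' staining intensity values from the example, with the threshold of 20:

```python
import numpy as np

# Staining intensity mask 610 from Fig. 6: four 2x2-pixel tiles
# (NW 611, NE 613 in the top row; SW 612, SE 614 in the bottom row).
intensity = np.array([
    [3, 25, 35, 30],
    [6, 30, 25, 3],
    [5, 8, 4, 20],
    [7, 9, 8, 5],
])

# Stain-thresholded binary mask 620: 1 where intensity >= threshold (20 here).
binary = (intensity >= 20).astype(int)

# Tile-level object density mask 630: positive pixels per tile / pixels per tile.
density = binary.reshape(2, 2, 2, 2).mean(axis=(1, 3))
# Northwest tile: pixels (3, 25, 6, 30) -> two of four pass -> density 0.5.
```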
Fig. 7 shows an exemplary annotated digital pathology image 700. In particular, annotated digital pathology image 700 shows a line 710 separating the bottom portion of digital pathology image 700 from the top portion. This separation indicates that the bottom portion of the digital pathology image 700 is associated with a tumor bed (e.g., comprising tumor tissue and stroma), while the top portion of the digital pathology image 700 above line 710 is not associated with the tumor bed. As described herein, the segmentation lines may be generated by the digital pathology image processing system 110 or received by the digital pathology image processing system 110 from a manual evaluation. Further, the annotated digital pathology image 700 may be provided as an output of the digital pathology image processing system 110. As described herein, various forms of output may be provided as a mechanism that allows a reviewer to better understand the methods employed by the digital pathology image processing system 110 and how it reached its final conclusion. An output indicating the segmentation into tumor tissue and other regions is a first step in ensuring that the digital pathology image processing system 110 properly interprets the digital pathology image and the sample.
Fig. 8A and 8B illustrate exemplary heat maps of biological object densities for a particular type of biological object. Using the digital pathology image provided to the digital pathology image processing system 110, the digital pathology image processing system 110 has segmented the digital pathology image into tiles, performed pixel-based segmentation, and generated initial density metrics (e.g., generated an object density mask for the digital pathology image). In the example depicted in fig. 8A and 8B, the digital pathology image processing system 110 has identified the density of CD8+ T cells in both CK- and CK+ regions. To aid in reviewing the output of the digital pathology image processing system 110, the output generation module 118 has created heat map visualizations 800 and 850 based on the corresponding object density metrics. The heat map visualization 800 shows the density of CD8+ T cells in CK- regions (e.g., in the tumor stroma). The heat map visualization 850 shows the density of CD8+ T cells in CK+ regions (e.g., within the tumor tissue). The visualizations may help a pathologist systematically classify the samples shown in the digital pathology images and also help explain the immunophenotype assigned by the digital pathology image processing system 110.
Fig. 9 shows a plot of biological object densities grouped by immunophenotype. In particular, FIG. 9 shows a plot 900 in which CK+ and CK- density values are plotted against CD8 density values. The plot 900 represents a first, simple interpretation of the density scores that may be generated by the density assessment module 114. While it is possible to determine some trends from such simple graphs, such as clusters of desert-type immunophenotypes lying lower on the y-axis and inflammatory immunophenotypes being more prevalent higher on both the x-axis and y-axis, other clusters cannot be distinguished, and it is difficult to draw further conclusions. Thus, the plot 900 demonstrates the limitations of previous forms of analysis and the motivation to develop other techniques to automatically classify digital pathology images and tiles derived therefrom. According to embodiments described herein, these other techniques include the integration of advanced spatial distribution metrics derived from the density values.
Fig. 10A depicts an application of the region analysis framework 230. In particular, the region analysis framework 230 is used to process digital pathology images of stained sample sections. As described above, the densities of particular types of biological objects (e.g., tumor cells and T cells) are detected to generate biological object data, examples of which are shown in table 1000. In some embodiments, the output biological object data includes the coordinates of individual tiles within the lattice formed by the image segmentation module 112 and the areas of the tiles associated with each of the biological objects of interest. As an example, when the biological objects of interest include CD8+ T cells and CK+ or CK- tumor cells, the output biological data includes the tile area associated with CK+ tumor cells, the tile area associated with CK- tumor cells, the tile area associated with CK+ tumor cells and CD8+ T cells, and the tile area associated with CK- tumor cells and CD8+ T cells.
As described, a spatial lattice having a defined number of columns and a defined number of rows may be used to divide a digital pathology image into tiles. For each tile, the number or density of biological object depictions within the region may be identified, such as by using the density calculation techniques described herein. For each biological object type, a set of region-specific biological object densities (a mapping of which tiles contain which density values at which locations) may be defined as the lattice data for that biological object type.
FIG. 10A shows a particular embodiment of lattice data 1010 depicting a second type of biological object (CK+ tumor cells) and lattice data 1015 depicting a first type of biological object (CD8+ T cells). For illustration purposes, each set of lattice data is displayed overlaid on a representation of the digital pathology image of the stained section. In some embodiments, each tile may be a spatial unit, and the center coordinates of the tile may be extracted to form a spatial lattice. For each region in the lattice, the lattice data may be defined to include a prevalence value defined as the count for that region divided by the total count across all regions. Thus, a region within which no biological object of a given type is present will have a prevalence value of 0, while a region within which at least one biological object of the given type is present will have a positive, non-zero prevalence value.
One or more prevalence graphs may be created using the prevalence values. For example, for a CK/CD8 prevalence graph, the CK to CD8 area ratio (e.g., the ratio of the number of CK+ pixels to the number of CD8+ pixels) may be calculated for each tile. The CK and CD8 areas of each tile can be normalized by the sum of the CK areas and the sum of the CD8 areas, respectively, over all tiles of the same slide, thereby reducing slide size effects. One or more spatial distribution metrics may be derived from the prevalence graph. The spatial distribution metrics may represent, among other things, the co-localization of two biological object types (such as CK+ tumor cells and CD8+ T cells) and/or the spatial distribution of one biological object type (such as CD8+ T cells in CK+ or CK-negative regions), respectively.
The same amount of a biological object (e.g., lymphocytes) in two different contexts (e.g., tumors) does not necessarily indicate the same characterization or the same degree of a characteristic (e.g., the same immune infiltration). Rather, the distribution of depictions of a first type of biological object relative to depictions of a second type of biological object may indicate a functional state. Thus, proximity metrics characterizing depictions of biological objects of the same type and of different types may reflect more information. The Morisita-Horn index is an ecological measure of similarity (e.g., overlap) between biological populations or ecosystems. The Morisita-Horn index (MH), characterizing the bivariate relationship or co-localization between two populations of biological object depictions (e.g., of two types), can be defined as:
MH = 2 Σ_i x_i y_i / (Σ_i x_i^2 + Σ_i y_i^2)

wherein x_i and y_i represent the prevalence of the first type of biological object depiction and the second type of biological object depiction, respectively, at square grid i. In fig. 10A, lattice data 1010 and lattice data 1015 show exemplary prevalence values for the two types of biological object depictions across the grid points.
The Morisita-Horn index is defined as 0 when no lattice region includes both types of biological object depictions (indicating that the distributions of the different biological object types are spatially separated). For example, when considering the illustrative spatially separated, or isolated, distribution shown in illustrative first scenario 1020, the Morisita-Horn index would be 0. The Morisita-Horn index is defined as 1 when the distribution of the first biological object type across the lattice regions matches (or is a scaled version of) the distribution of the second biological object type across the lattice regions. For example, when considering the illustrative highly co-localized distribution shown in the illustrative second scenario 1025, the Morisita-Horn index would be close to 1.
In the example shown in fig. 10A, the Morisita-Horn index calculated using the lattice data 1010 and the lattice data 1015 is 0.47. A higher Morisita-Horn index value indicates that the depictions of the biological objects of the first type and the second type are more highly co-localized.
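A minimal implementation of the Morisita-Horn index over per-region prevalence vectors (assuming each type's prevalences are normalized, so that scaled distributions compare as identical):

```python
def morisita_horn(x, y):
    """Morisita-Horn co-localization index over per-region prevalence values:
    MH = 2 * sum(x_i * y_i) / (sum(x_i^2) + sum(y_i^2)).

    Returns 0 for fully separated distributions, 1 for matching distributions.
    """
    num = 2 * sum(a * b for a, b in zip(x, y))
    den = sum(a * a for a in x) + sum(b * b for b in y)
    return num / den if den else 0.0
```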
The Jaccard index (J) and Sorensen index (L) are similar to and closely related to each other. They can be defined as:
J = Σ_i min(x_i, y_i) / (Σ_i x_i + Σ_i y_i - Σ_i min(x_i, y_i))

L = 2 Σ_i min(x_i, y_i) / (Σ_i x_i + Σ_i y_i)

wherein x_i and y_i represent the prevalence of the first type of biological object depiction and the second type of biological object depiction, respectively, at square grid i, and min(a, b) returns the minimum of a and b. The Jaccard index and the Sorensen index may be used to represent the spatial co-location of biological object types.
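Minimal implementations of abundance-based Jaccard and Sorensen indices over per-region prevalence vectors, using the min(a, b) operation referenced in the text:

```python
def jaccard(x, y):
    """Abundance-based Jaccard index: shared mass over the 'union' mass."""
    m = sum(min(a, b) for a, b in zip(x, y))
    den = sum(x) + sum(y) - m
    return m / den if den else 0.0

def sorensen(x, y):
    """Abundance-based Sorensen index: twice the shared mass over total mass."""
    m = sum(min(a, b) for a, b in zip(x, y))
    den = sum(x) + sum(y)
    return 2 * m / den if den else 0.0
```

Both indices equal 1 for identical prevalence vectors and 0 for spatially separated ones, paralleling the Morisita-Horn interpretation.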
Another measure that can characterize the spatial distribution of a biological object depiction is the Moran index, which is a measure of spatial autocorrelation. In general, the Moran index is a correlation coefficient of a relationship between a first variable and a second variable at adjacent spatial units. The first variable may be defined as the prevalence of the depiction of the biological object of the first type and the second variable may be defined as the prevalence of the depiction of the biological object of the second type in order to quantify the extent to which the two types of biological object depictions are spread in the digital pathology image. Moran index I can be defined as:
I = (n / Σ_i Σ_j ω_ij) · (Σ_i Σ_j ω_ij x_i y_j) / ((Σ_i x_i^2) (Σ_j y_j^2))^(1/2)

wherein x_i and y_j represent the normalized (mean-centered) prevalence of the first type of biological object depiction (e.g., lymphocytes) at area unit i and of the second type of biological object depiction (e.g., tumor cells) at area unit j, and ω_ij is the binary weight for area units i and j, equal to 1 if the two units are adjacent and 0 otherwise; a first-order pattern may be used to define the neighborhood structure. A Moran index can also be derived separately for the biological object depictions of each biological object type.
The Moran index is defined as equal to -1 when the biological object depictions are completely dispersed across the lattice (thus having negative spatial autocorrelation), and is defined as 1 when the biological object depictions are tightly aggregated (thus having positive spatial autocorrelation). The Moran index is defined as 0 when the object distribution matches a random distribution. Thus, the region-based representation of each biological object type facilitates the generation of a grid that supports the calculation of a Moran index for each biological object type. In embodiments in which two or more types of biological object depictions are identified and tracked, the difference between the Moran indices calculated for each of the two or more types of biological object depictions may provide an indication of co-location between the types of biological object depictions (e.g., a difference near zero indicates co-location).
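A sketch of a bivariate Moran-style index consistent with the description (mean-centered prevalences, binary adjacency weights); comparing a perfect checkerboard with itself returns -1, matching the fully dispersed case:

```python
def bivariate_moran(x, y, w):
    """Bivariate Moran index over prevalence vectors x, y and a binary
    adjacency matrix w (list of lists); an illustrative form, normalizing by
    the product of the two variances."""
    n = len(x)
    xc = [a - sum(x) / n for a in x]  # mean-center each variable
    yc = [b - sum(y) / n for b in y]
    w_sum = sum(sum(row) for row in w)
    cross = sum(w[i][j] * xc[i] * yc[j] for i in range(n) for j in range(n))
    norm = (sum(a * a for a in xc) * sum(b * b for b in yc)) ** 0.5
    return (n / w_sum) * cross / norm
```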
Geary C (also known as the Geary contiguity ratio) is a measure of spatial autocorrelation, or an attempt to determine whether adjacent observations of the same phenomenon are correlated. Geary C is inversely related to Moran I, but the two are not identical: while Moran I is a measure of global spatial autocorrelation, Geary C is more sensitive to local spatial autocorrelation. Geary C can be defined as:

C = ((n - 1) Σ_i Σ_j ω_ij (z_i - z_j)^2) / (2 (Σ_i Σ_j ω_ij) Σ_i (z_i - z̄)^2)

wherein z_i represents the prevalence of the first type or second type of biological object depiction at square grid i, z̄ is the mean prevalence across all grids, and ω_ij is the same as defined above.
Another metric that can characterize the spatial distribution of biological object depictions is the Bhattacharyya coefficient ("B coefficient"), which is an approximate measure of the overlap between two statistical samples. In general, B coefficients can be used to determine the relative proximity of two statistical samples (e.g., biological objects or biological object types), such as the spatial co-location of CK+ pixels and CD8+ pixels in a CK+ tile. It is also used to measure the separability of classes in a classification.
Given probability distributions p and q over the same domain X (e.g., delineated distributions of two types of biological objects within the same digital pathology image), the B-coefficient is defined as:
BC(p, q) = Σ_{x∈X} (p(x) q(x))^(1/2), and D_B(p, q) = -ln(BC(p, q))

wherein 0 ≤ BC ≤ 1 and 0 ≤ D_B ≤ ∞. Note that D_B does not obey the triangle inequality, but the Hellinger distance (1 - BC)^(1/2) does obey the triangle inequality. The B coefficient increases with the number of partitions in the domain that have members from both samples (e.g., with the number of tiles in the digital pathology image having depictions, or an appropriate density of depictions, of two or more types of biological objects). Thus, the B coefficient is larger when the samples overlap significantly in each partition, e.g., when each partition contains a large number of members of both samples. The choice of the number of partitions is variable and can be tailored to the number of members in each sample. To maintain accuracy, note that it is desirable to avoid selecting too few partitions (which overestimates the overlap region) and to avoid selecting so many partitions that some partitions have no members despite a densely populated sample space. If there is no overlap at all between the two samples of biological object depictions, the B coefficient will be 0.
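A minimal B coefficient implementation over two discrete distributions sharing the same partitions:

```python
import math

def bhattacharyya(p, q):
    """B coefficient over two discrete distributions on the same partitions:
    BC = sum_i sqrt(p_i * q_i), with distance D_B = -ln(BC).

    Returns (BC, D_B); D_B is infinite when the samples do not overlap at all.
    """
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))
    d_b = -math.log(bc) if bc > 0 else math.inf
    return bc, d_b
```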
Lattice data 1010 and lattice data 1015 may be further processed to generate hotspot data 1030 corresponding to the detected depictions of the first type of biological object and hotspot data 1035 corresponding to the detected depictions of the second type of biological object, respectively. In fig. 10B, hotspot data 1030 and hotspot data 1035 indicate regions determined to be hotspots of the detected depictions for the corresponding type of biological object. Regions detected as hotspots are shown in red, and regions determined not to be hotspots are shown with an 'x'. The hotspot data 1030, 1035 is defined for each region associated with a non-zero object count. The hotspot data 1030, 1035 may also include a binary value indicating whether a given region is identified as a hotspot. In addition to hotspot data and analysis, coldspot data and analysis may also be generated.
For biological object depictions, hotspot data 1030, 1035 may be generated for each biological object type by determining a Getis-Ord local statistic for each region associated with a non-zero object count for that biological object type. A Getis-Ord hotspot/coldspot analysis can be used to identify statistically significant hotspots/coldspots of tumor cells or lymphocytes, where a hotspot is an area unit in which the prevalence of biological object depictions has a statistically significantly higher value compared to adjacent area units, and a coldspot is an area unit in which the prevalence of biological object depictions has a statistically significantly lower value compared to adjacent area units. The significance value, and the manner of contrasting hotspot/coldspot regions with adjacent regions, may be selected according to user preferences and, in particular, may be selected according to a rule-based method or a learned model. For example, the number and/or type of detected biological object depictions, the absolute number of depictions, and other factors may be considered. The Getis-Ord local statistic is a z-score and, for a square grid i, can be defined as:
Where i represents a single region (a particular row and column combination) in the lattice, n is the number of row and column combinations (i.e., the number of regions) in the lattice, ω i,j Is the spatial weight between i and j, and z j The prevalence of depictions for a given type of biological object in an area,average object prevalence for a given type across regions, and: />
The Getis-Ord local statistic may be transformed into a binary value by determining whether each statistic exceeds a threshold. For example, the threshold may be set to 0.16. The threshold may be selected according to user preference and, in particular, may be set according to a rule-based method or a machine learning method.
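As an illustrative sketch (not the claimed implementation), the Getis-Ord computation and the subsequent binarization can be expressed for a small tile grid as follows. The queen-contiguity binary weights, the example counts, and the 1.96 cutoff are assumptions for demonstration only; as noted above, the weights and the threshold are configurable choices.

```python
import numpy as np

def getis_ord_gi_star(counts):
    """Getis-Ord Gi* z-score for each cell of a 2-D count grid, using binary
    queen-contiguity weights that include the cell itself (an assumption)."""
    n = counts.size
    z = counts.astype(float)
    z_bar = z.mean()
    s = np.sqrt((z ** 2).mean() - z_bar ** 2)
    rows, cols = counts.shape
    gi = np.zeros_like(z)
    for r in range(rows):
        for c in range(cols):
            # neighborhood window: the cell plus its adjacent cells
            r0, r1 = max(r - 1, 0), min(r + 2, rows)
            c0, c1 = max(c - 1, 0), min(c + 2, cols)
            w_sum = (r1 - r0) * (c1 - c0)  # binary weights: sum(w) == sum(w^2)
            num = z[r0:r1, c0:c1].sum() - z_bar * w_sum
            den = s * np.sqrt((n * w_sum - w_sum ** 2) / (n - 1))
            gi[r, c] = num / den
    return gi

# hypothetical per-tile object counts for one biological object type
counts = np.array([[9, 8, 1, 0],
                   [8, 9, 0, 0],
                   [1, 0, 0, 1],
                   [0, 0, 1, 0]])
gi = getis_ord_gi_star(counts)
hotspots = gi > 1.96  # binarize; the cutoff is a configurable choice
```

The dense block of counts in the upper-left corner is flagged as a hot spot region, while the sparse regions are not; a cold spot analysis would analogously threshold on strongly negative z-scores.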
Another metric that may characterize the spatial distribution of biological object depictions is the colocation quotient (CLQ), which is a ratio of ratios and may be used to measure the local density of a particular type of biological object divided by its global density. The CLQ measures co-occurrence or avoidance of pairs of biological object types. In particular, the CLQ method may examine the local density of a target biological object type within a fixed radius of each occurrence of the reference biological object type. The CLQ may be defined as:
$$\mathrm{CLQ}_{A \to B} = \frac{1}{N_A} \sum_{i \in A} \mathrm{LCLQ}_{A_i \to B}, \qquad \mathrm{LCLQ}_{A_i \to B} = \frac{\sum_{j \neq i} w_{ij}\, \delta_{ij}}{N_B / (N - 1)}$$

where $\mathrm{CLQ}_{A \to B}$ is the global CLQ of cell type $A$ with respect to cell type $B$, $\mathrm{LCLQ}_{A_i \to B}$ is the local CLQ at the $i$-th cell of type $A$, $N$ is the total number of cells in the image, $N_A$ and $N_B$ are the numbers of cells of types $A$ and $B$, $\delta_{ij}$ is the Kronecker delta indicating whether cell $j$ is a type-$B$ cell, and $w_{ij}$ is a spatial weight, equal to $1/N$ for the unweighted version and given by a Gaussian distance-decay kernel for the weighted version.
For example, the local density may be calculated as the fraction of cell type B in a neighborhood of a given radius centered on a cell of type A. The global density may be the proportion of cell type B in the whole slide image. The CLQ may be greater than 1 when the density of cell type B within the neighborhood of cell type A is greater than the global density of cell type B. The CLQ may be less than 1 when the neighborhood of cell type A contains mostly cell types other than cell type B. A CLQ value of 1 may mean that there is no spatial relationship between cell type A and cell type B.
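The following sketch, offered as a hedged illustration rather than the patented method, computes a global CLQ as this local-to-global density ratio directly from cell coordinates and type labels. The coordinates, labels, and radius are invented for the example.

```python
import numpy as np

def colocation_quotient(xy, types, a, b, radius):
    """Sketch of a global colocation quotient: for every cell of type `a`,
    take the fraction of type-`b` cells inside `radius` (local density),
    divide by the slide-wide fraction of type `b` (global density), and
    average over all type-`a` cells. CLQ > 1 suggests co-occurrence,
    CLQ < 1 avoidance, CLQ near 1 no spatial relationship."""
    xy = np.asarray(xy, float)
    types = np.asarray(types)
    global_b = (types == b).mean()
    local = []
    for i in np.where(types == a)[0]:
        d = np.linalg.norm(xy - xy[i], axis=1)
        nbr = (d <= radius) & (np.arange(len(types)) != i)
        if nbr.any():
            local.append((types[nbr] == b).mean())
    return np.mean(local) / global_b

# hypothetical cell positions and type labels
xy = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 10), (11, 10)]
types = ["A", "A", "B", "B", "C", "C"]
clq = colocation_quotient(xy, types, "A", "B", radius=2.0)  # -> 2.0 here
```

In this toy data the type-B cells sit inside the radius of every type-A cell, so the CLQ comes out above 1, indicating co-occurrence.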
In addition, the CLQ method relies on continuous summary statistics. Thus, the CLQ method may have capabilities beyond the three immunophenotype categories (e.g., desert, excluded, and inflamed) described herein, and may highlight immunophenotypes or conditions that lie at the boundaries of the immunophenotype categories.
A logical AND function may be used to identify regions identified as hot spots of depictions for more than one type of biological object. For example, co-located hot spot data 1040 indicates regions (shown as circle symbols) identified as hot spots of depictions for both types of biological objects. A high ratio of the number of regions identified as co-located hot spots to the number of hot spot regions identified for a given object type (e.g., for tumor cell objects) may indicate that depictions of the given type of biological object share spatial characteristics with the other object type. Conversely, a low ratio at or near zero is consistent with spatial separation of the different types of biological objects.
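A minimal sketch of this co-location step, assuming two hypothetical binary hot spot masks on the same tile grid:

```python
import numpy as np

# Hypothetical binary hot spot masks for two object types on the same grid
# (e.g., tumor-cell hot spots and lymphocyte hot spots); values are invented.
tumor_hot = np.array([[1, 1, 0],
                      [1, 0, 0],
                      [0, 0, 0]], bool)
lymph_hot = np.array([[1, 0, 0],
                      [1, 0, 0],
                      [0, 1, 0]], bool)

colocated = tumor_hot & lymph_hot          # logical AND of the two masks
ratio = colocated.sum() / tumor_hot.sum()  # co-located fraction of tumor hot spots
```

Here two of the three tumor hot spot regions are also lymphocyte hot spots, giving a high ratio (2/3) consistent with shared spatial characteristics; a ratio near zero would instead be consistent with spatial separation.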
Once the spatial distribution metrics are generated, the spatial distribution metrics, density values, and other generated data can be used to assign an immunophenotype to the sample. As described herein, the assignment of immunophenotypes may be provided by a machine learning model trained in a supervised training process in which labeled digital pathology images and their spatial distribution metrics are provided. Through the training process, the digital pathology image processing system 110, or one or more modules thereof including the immunophenotyping module 116, may learn to classify digital pathology images and their corresponding samples into selected immunophenotype groups.
FIG. 11A illustrates a visualization of training and use of a machine learning model that constitutes one embodiment of the immunophenotyping module 116. Data generated by the various components of the digital pathology image processing system 110 may be collected into a training data set 1110. The training data set includes values for the various spatial distribution metrics discussed herein. The training data set 1110 also includes immunophenotypes that have been manually assigned to the samples, such as by a pathologist, for training purposes. Each digital pathology image may be projected into a multivariate space in which each of the spatial distribution metrics, and/or transformations or derivatives thereof, defines an axis. With the provided labels, a machine learning model can be trained to identify clusters of digital pathology images (and corresponding samples) within the multivariate space. In this formulation, the task of labeling previously unseen data points can be approximated by determining to which cluster a new data point belongs.
FIG. 11A further illustrates the challenge of identifying an appropriate mechanism for establishing clustering criteria. Plot 1120 shows data points of several digital pathology images plotted on a two-dimensional Cartesian grid. The circular dots 1121 designate a first type of label, and the square dots designate two different labels, each different from the first type of label. The labels may correspond to immunophenotypes. A first attempt to group the points may involve, for example, a Euclidean nearest-neighbor method, in which all points within a particular radius 1124 are marked with a target label type. Although in this example the neighborhood does capture all points 1121 associated with the first type of label, it also captures two artifact data points 1122 and 1123 (shown as squares). To accurately capture only the points associated with the first label type within the neighborhood, other measures of similarity may be used. In one example, this may include taking other metrics into account (e.g., adding other axes of similarity). Thus, in plot 1130, the target points 1121 can be effectively distinguished from the artifact data points 1122 and 1123 by a hyperplane in the multivariate space represented by the data (e.g., in the training data set 1110). Furthermore, distance metrics other than the Euclidean distance metric may be used to define the nearest neighbors and the resulting clusters.
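The limitation illustrated by plots 1120 and 1130 can be sketched numerically. The points and the added third metric below are invented solely to show why a fixed Euclidean radius in two dimensions fails where an extra similarity axis succeeds:

```python
import numpy as np

# Hypothetical 2-D metric values: three target points and two artifacts that
# all fall inside the same Euclidean radius of the query point.
pts_2d = np.array([[1.0, 1.0], [1.2, 0.9], [0.9, 1.1],   # target label
                   [1.1, 1.0], [0.8, 0.9]])              # artifacts
query = np.array([1.0, 1.0])
in_radius = np.linalg.norm(pts_2d - query, axis=1) <= 0.5
# all five points are captured, so the radius alone cannot separate them

# adding a third metric (an extra similarity axis) separates the groups
extra = np.array([0.1, 0.2, 0.1, 5.0, 4.8])
pts_3d = np.column_stack([pts_2d, extra])
query_3d = np.array([1.0, 1.0, 0.1])
in_radius_3d = np.linalg.norm(pts_3d - query_3d, axis=1) <= 0.5
```

With the third axis included, the same radius captures only the three target points, mirroring how a hyperplane in the higher-dimensional metric space distinguishes targets from artifacts.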
FIG. 11B shows plot 1140, which illustrates idealized results, particularly in comparison to plot 900 shown in FIG. 9. In plot 1140, a clean grouping of the data has been identified, in this example distinguishing the desert, excluded, and inflamed immunophenotypes based on the input data. Compared to using only a single density measurement (as shown in plot 900), the relationship between the spatial distribution metrics used to create these groupings and the groupings themselves can be seen much more clearly.
A machine learning model may be trained to process digital pathology images, for example from biopsies of a subject, in order to predict an assessment of a condition of the subject from the digital pathology images. As an example, using the techniques described herein, a digital pathology image processing system may generate a variety of spatial distribution metrics and predict an immunophenotype for a digital pathology image. Based on this input, a regression machine learning model may be trained to predict, for example, likely patient outcomes, assessments of relevant patient condition factors, and suitability or eligibility for selected treatments, along with other relevant recommendations.
Biopsies can be collected from each of a plurality of subjects suffering from a disorder. Samples may be fixed, embedded, sectioned, stained, and imaged according to the subject matter disclosed herein. Depictions and densities of specific types of biological objects (e.g., tumor cells and lymphocytes) can be detected. A digital pathology image processing system may process the images using a set of trained machine learning models to quantify the densities of the biological objects of interest. For each of the plurality of subjects, a label may be generated to indicate whether the disorder exhibits a specified characteristic and/or to indicate certain secondary labels (e.g., immunophenotype) assigned by the digital pathology image processing system. In the context of predicting an overall assessment of a subject's condition, labels such as immunophenotype are considered secondary in that they inform the overall assessment.
For each subject, an input vector may be defined to include a set of spatial distribution metrics. The set of spatial distribution metrics may include a selection of the metrics described herein. The set of spatial distribution metrics may capture co-location of one or more biological object types (e.g., CD8+ T cells) in CK+ or CK− tiles, the spatial distribution of CK+ tumor cells in CK+ tiles or CK− tiles, or both. As an example, the metrics to be included in the input vector may include the following:
-intratumoral lymphocyte ratio;
-Bhattacharyya coefficients;
Morisita-Horn index;
-Jaccard index;
-an index;
-B coefficient;
- Moran index;
- Geary's C;
-CD8-CK area ratio;
- a ratio of the number of co-located points (e.g., hot spots, cold spots, non-significant points) for two types of biological object depictions to the number of points (e.g., hot spots, cold spots, non-significant points) for a first type of biological object depiction, wherein the points are defined using Getis-Ord local statistics; and/or
- features obtained by fitting a variogram to depictions of two types of biological objects (e.g., tumor cells and lymphocytes).
The selected metrics may correspond to multiple frameworks (e.g., an areal process analysis framework). For each subject, a label may be defined to indicate secondary determinations, such as subject density metrics and/or an assigned immunophenotype. Machine learning models, including but not limited to logistic regression models, may be trained and tested with the paired input data and labels using repeated nested cross-validation. As an example, for each of 5 data folds, the model may be trained on the other 4 folds and tested on the held-out fold to calculate the area under the ROC curve.
In embodiments where the sample size is limited, adaptable techniques may be used to evaluate model performance. As a non-limiting example, nested Monte Carlo cross-validation (nMCCV) may be used to evaluate model performance. By randomly splitting the data into training, validation, and test sets in the same ratio each time, the same enrichment process can be repeated B times to generate a set of score functions and thresholds. For the i-th subject, overall responder status can be assessed by averaging the subject's membership in the responder group over the repetitions in which subject i was randomly assigned to the test set, with a threshold of 0.5. The risk or odds ratio, together with the 95% confidence interval and p-value, may be calculated for the aggregated test subjects.
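The subject-level vote aggregation described above can be sketched as follows. The vote matrix is hypothetical, with NaN marking repetitions in which a subject was not assigned to the test set:

```python
import numpy as np

# votes[b, i]: repetition b's responder call for subject i (NaN = subject i
# was not in repetition b's test set). Values are invented for illustration.
votes = np.array([[1.0, 0.0, np.nan],
                  [1.0, np.nan, 0.0],
                  [np.nan, 1.0, 0.0]])

mean_vote = np.nanmean(votes, axis=0)  # average over test-set appearances only
responder = mean_vote >= 0.5           # threshold of 0.5, as described above
```

For this toy matrix the mean votes are 1.0, 0.5, and 0.0, so the first two subjects are assessed as responders and the third is not.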
In some embodiments, clusters of data points that distinguish tumor immunophenotypes may be learned from one or more spatial distribution metrics and used to assign tumor immunophenotypes to unlabeled data. Data labeled with tumor immunophenotypes can be projected into a space such that the labeled data are spatially separated into clusters by their corresponding spatial distribution metrics. In some embodiments, the spatial separation of the inflamed, excluded, and desert immunophenotypes can be used to identify clusters of data points. For example, as shown in FIG. 11C, the labeled data in plot 1140 (shown against a white background in FIG. 11C) may be co-embedded in the same space as the unlabeled data in plot 1160. The spatial separation of the immunophenotype clusters learned from the data in plot 1140 may then be used as a distance measure for the unlabeled data in plot 1160. In some embodiments, a biological sample in the unlabeled data can be assigned an immunophenotype (e.g., desert, excluded, or inflamed) based on the distance in the space from the centroid of each labeled-data cluster (one cluster per immunophenotype, in plot 1140) to the unlabeled data point (plot 1160). A tumor immunophenotype for a biological sample corresponding to one or more unlabeled data points may be determined based on the minimum such distance.
In some embodiments, a cluster matching process (e.g., K-means clustering) may be used to match one or more data points (one or more spatial distribution metrics) in the unlabeled data (in plot 1160) to clusters of data points in the labeled data (in plot 1140). The labeled data may be tagged with the corresponding immunophenotypes. Biological samples corresponding to one or more unlabeled data points may be assigned an immunophenotype label (e.g., desert, excluded, or inflamed) based on the matched cluster, as shown in plot 1180 of FIG. 11D. In some embodiments, therapy response information may be overlaid on the clusters in the space, and a recommended therapy may be determined based on the clusters. In some embodiments, each unlabeled data point may be projected into a space trained on the labeled data set, and the immunophenotype label of each unlabeled data point may be assigned based on the shortest distance to the centroid of each immunophenotype cluster from the labeled data set (the most likely immunophenotype class).
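A nearest-centroid assignment of this kind can be sketched as below. The 2-D embedding and centroid positions are hypothetical stand-ins for the learned spatial-distribution-metric space:

```python
import numpy as np

# hypothetical centroids of the labeled immunophenotype clusters in a 2-D
# embedding (axes stand in for illustrative spatial distribution metrics)
centroids = {"desert":   np.array([0.0, 0.0]),
             "excluded": np.array([5.0, 0.0]),
             "inflamed": np.array([0.0, 5.0])}

def assign_phenotype(point):
    """Label an unlabeled point with the immunophenotype whose labeled-cluster
    centroid lies at the shortest Euclidean distance."""
    return min(centroids, key=lambda name: np.linalg.norm(point - centroids[name]))

label = assign_phenotype(np.array([0.5, 4.0]))  # nearest centroid: "inflamed"
```

The same helper applied to a point near (5, 0) would return "excluded"; replacing the fixed centroids with centroids fitted by K-means on the labeled data would give the cluster-matching variant described above.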
The overall workflow of the predictive analysis is summarized in the flow chart of FIG. 12A. More specifically, to assign a label to each subject in the study cohort, a nested Monte Carlo cross-validation (nMCCV) modeling strategy can be used to mitigate overfitting.
Specifically, for each subject, at block 1205, the data set may be divided into a training data portion, a validation data portion, and a test data portion at a ratio of 60:20:20. At block 1210, 10-fold cross-validated Ridge-Cox regression (an L2-regularized Cox model) can be performed using the training set to generate 10 models (with the same model architecture). A particular model of the 10 generated models may be selected and stored based on the 10-fold training data. At block 1215, the particular model may then be applied to the validation set to tune specified variables. For example, a variable may identify a threshold for the risk score. At block 1220, the threshold and the particular model may then be applied to the independent test set to generate a vote for the subject, predicting whether the subject is stratified into a longer or shorter survival (e.g., overall survival or progression-free survival) group. The data splitting, training, threshold identification, and vote generation (blocks 1205 through 1220) may be repeated N (e.g., N = 1000) times. At block 1225, each subject is then assigned to either the longer survival group or the shorter survival group based on the votes; for example, this may include determining which group is associated with the majority of the subject's votes. At block 1230, a survival analysis may then be performed on the longer/shorter survival group subjects. It should be appreciated that a similar process of applying labels to data based on outcomes of interest may be applied to any suitable clinical assessment or eligibility study.
FIGS. 12B and 12C show overall survival for whole slide images classified according to the desert, excluded, and inflamed immunophenotypes, and FIGS. 12D and 12E show progression-free survival for whole slide images classified according to the desert, excluded, and inflamed immunophenotypes. These figures show that the disclosed methods can yield a clear separation of the classified immunophenotypes in groups receiving one treatment (e.g., atezolizumab) as compared to groups receiving a different treatment (e.g., docetaxel). In this example, atezolizumab improved overall and progression-free survival compared to docetaxel.
By modeling histopathology images as spatial data with the aid of pixel-based segmentation, the comprehensive model based on spatial statistics and spatial distribution metrics used in this example analysis enhances the ability of the analysis pipeline to generate (in this case) system-level knowledge of immunophenotypes beyond intratumoral density determinations alone. This effect is not limited to a particular treatment assessment, but can be applied to many scenarios in which the necessary ground truth data can be obtained. The use of spatial statistics to characterize histopathology images and other digital pathology images can be applied in clinical settings to predict treatment outcomes, thereby informing treatment selection.
Some embodiments of the present disclosure include a system comprising one or more data processors. In some embodiments, the system includes a non-transitory computer-readable storage medium containing instructions that, when executed on one or more data processors, cause the one or more data processors to perform a portion or all of one or more methods disclosed herein and/or a portion or all of one or more processes disclosed herein. Some embodiments of the present disclosure include a computer program product tangibly embodied in a non-transitory machine-readable storage medium, comprising instructions configured to cause one or more data processors to perform a portion or all of one or more methods disclosed herein and/or a portion or all of one or more processes disclosed herein.
Fig. 13 illustrates an exemplary computer system 1300. In particular embodiments, one or more computer systems 1300 perform one or more steps of one or more methods described or illustrated herein. In particular embodiments, one or more computer systems 1300 provide the functionality described or illustrated herein. In particular embodiments, software running on one or more computer systems 1300 performs one or more steps of one or more methods described or illustrated herein, or provides the functionality described or illustrated herein. Particular embodiments include one or more portions of one or more computer systems 1300. Herein, references to a computer system may include computing devices, and vice versa, where appropriate. Further, references to computer systems may include one or more computer systems, where appropriate.
The present disclosure contemplates any suitable number of computer systems 1300. The present disclosure contemplates computer system 1300 taking any suitable physical form. By way of example, and not limitation, computer system 1300 may be an embedded computer system, a system on a chip (SOC), a single board computer system (SBC) (such as, for example, a computer on a module (COM) or a system on a module (SOM)), a desktop computer system, a laptop or notebook computer system, an interactive kiosk, a mainframe, a grid of computer systems, a mobile phone, a Personal Digital Assistant (PDA), a server, a tablet computer system, an augmented/virtual reality device, or a combination of two or more thereof. Computer system 1300 may include one or more computer systems 1300 where appropriate; may be unitary or distributed; may span multiple locations; may span multiple machines; may span multiple data centers; or may reside in a cloud, which may include one or more cloud components in one or more networks. Where appropriate, one or more computer systems 1300 may perform one or more steps of one or more methods described or illustrated herein without substantial spatial or temporal limitation. By way of example, and not limitation, one or more computer systems 1300 may perform in real-time or in batch mode one or more steps of one or more methods described or illustrated herein. Where appropriate, one or more computer systems 1300 may perform one or more steps of one or more methods described or illustrated herein at different times or at different locations.
In a particular embodiment, the computer system 1300 includes a processor 1302, a memory 1304, a storage device 1306, an input/output (I/O) interface 1308, a communication interface 1310, and a bus 1312. Although this disclosure describes and illustrates a particular computer system having a particular number of particular components in a particular arrangement, this disclosure contemplates any suitable computer system having any suitable number of any suitable components in any suitable arrangement.
In a particular embodiment, the processor 1302 includes hardware for executing instructions, such as those comprising a computer program. By way of example, and not limitation, to execute instructions, processor 1302 may retrieve (or fetch) instructions from internal registers, internal caches, memory 1304, or storage 1306; these instructions may be decoded and executed; and then one or more results may be written to an internal register, internal cache, memory 1304, or storage 1306. In a particular embodiment, the processor 1302 may include one or more internal caches for data, instructions, or addresses. The present disclosure contemplates processor 1302 including any suitable number of any suitable internal caches, where appropriate. By way of example, and not limitation, processor 1302 may include one or more instruction caches, one or more data caches, and one or more Translation Lookaside Buffers (TLBs). The instructions in the instruction cache may be copies of instructions in the memory 1304 or the storage 1306, and the instruction cache may speed retrieval of those instructions by the processor 1302. The data in the data cache may be: a copy of the data in the memory 1304 or storage 1306 for operation by instructions executing at the processor 1302; results of previous instructions executed at processor 1302 for access by, or writing to, memory 1304 or storage 1306 by subsequent instructions executed at processor 1302; or other suitable data. The data cache may speed up read or write operations of the processor 1302. The TLB may accelerate virtual address translation for the processor 1302. In a particular embodiment, the processor 1302 may include one or more internal registers for data, instructions, or addresses. The present disclosure contemplates processor 1302 including any suitable number of any suitable internal registers, where appropriate. 
The processor 1302 may include one or more Arithmetic Logic Units (ALUs), where appropriate; may be a multi-core processor; or may include one or more processors 1302. Although this disclosure describes and illustrates a particular processor, this disclosure contemplates any suitable processor.
In a particular embodiment, the memory 1304 includes a main memory for storing instructions for execution by the processor 1302 or data for operation by the processor 1302. By way of example, and not limitation, computer system 1300 may load instructions from storage 1306 or another source (such as, for example, another computer system 1300) into memory 1304. The processor 1302 may then load the instructions from the memory 1304 into an internal register or internal cache. To execute instructions, the processor 1302 may retrieve instructions from internal registers or internal caches and decode the instructions. During or after instruction execution, the processor 1302 may write one or more results (which may be intermediate results or final results) to an internal register or internal cache. The processor 1302 may then write one or more of those results to the memory 1304. In a particular embodiment, the processor 1302 executes only instructions in one or more internal registers or internal caches or in the memory 1304 (rather than the storage 1306 or elsewhere) and operates only on data in one or more internal registers or internal caches or in the memory 1304 (rather than the storage 1306 or elsewhere). One or more memory buses (which may each include an address bus and a data bus) may couple processor 1302 to memory 1304. Bus 1312 can include one or more memory buses, as described below. In a particular embodiment, one or more Memory Management Units (MMUs) reside between the processor 1302 and the memory 1304 and facilitate access to the memory 1304 requested by the processor 1302. In a particular embodiment, the memory 1304 includes Random Access Memory (RAM). The RAM may be volatile memory, where appropriate. The RAM may be Dynamic RAM (DRAM) or Static RAM (SRAM), where appropriate. Further, the RAM may be single-port or multi-port RAM, where appropriate. The present disclosure contemplates any suitable RAM. 
The memory 1304 may include one or more memories 1304, where appropriate. Although this disclosure describes and illustrates particular memory, this disclosure contemplates any suitable memory.
In a particular embodiment, the storage 1306 includes a mass storage device for data or instructions. By way of example, and not limitation, storage 1306 may include a Hard Disk Drive (HDD), a floppy disk drive, flash memory, an optical disk, a magneto-optical disk, a magnetic tape, or a Universal Serial Bus (USB) drive, or a combination of two or more thereof. Storage 1306 may include removable or non-removable (or fixed) media, where appropriate. Storage 1306 may be internal or external to computer system 1300, where appropriate. In a particular embodiment, the storage device 1306 is a non-volatile solid state memory. In a particular embodiment, the storage 1306 includes Read Only Memory (ROM). The ROM may be a mask-programmed ROM, a Programmable ROM (PROM), an Erasable PROM (EPROM), an Electrically Erasable PROM (EEPROM), an electrically rewritable ROM (EAROM), or a flash memory, or a combination of two or more thereof, where appropriate. The present disclosure contemplates mass storage device 1306 in any suitable physical form. Where appropriate, the storage 1306 may include one or more memory control units that facilitate communication between the processor 1302 and the storage 1306. Storage 1306 may include one or more storage devices 1306, where appropriate. Although this disclosure describes and illustrates particular storage devices, this disclosure contemplates any suitable storage devices.
In particular embodiments, I/O interface 1308 comprises hardware, software, or both, which provides one or more interfaces for communicating between computer system 1300 and one or more I/O devices. Computer system 1300 may include one or more of these I/O devices, where appropriate. One or more of these I/O devices may enable communications between a person and computer system 1300. By way of example, and not limitation, an I/O device may include a keyboard, a keypad, a microphone, a monitor, a mouse, a printer, a scanner, a speaker, a still camera, a stylus, a tablet computer, a touch screen, a trackball, a video camera, another suitable I/O device, or a combination of two or more thereof. The I/O device may include one or more sensors. The present disclosure contemplates any suitable I/O devices and any suitable I/O interfaces 1308 for them. The I/O interface 1308 may include one or more devices or software drivers, where appropriate, to enable the processor 1302 to drive one or more of these I/O devices. The I/O interface 1308 may include one or more I/O interfaces 1308, where appropriate. Although this disclosure describes and illustrates particular I/O interfaces, this disclosure encompasses any suitable I/O interfaces.
In particular embodiments, communication interface 1310 includes hardware, software, or both that provides one or more interfaces for communication (such as, for example, packet-based communication) between computer system 1300 and one or more other computer systems 1300 or one or more networks. By way of example and not limitation, communication interface 1310 may include a Network Interface Controller (NIC) or network adapter for communicating with an ethernet or other wire-based network, or a Wireless NIC (WNIC) or wireless adapter for communicating with a wireless network, such as a WI-FI network. The present disclosure contemplates any suitable network and any suitable communication interface 1310 therefor. By way of example, and not limitation, computer system 1300 may communicate with one or more portions of an ad hoc network, a Personal Area Network (PAN), a Local Area Network (LAN), a Wide Area Network (WAN), a Metropolitan Area Network (MAN), or the Internet, or a combination of two or more thereof. One or more portions of one or more of these networks may be wired or wireless. As an example, computer system 1300 may communicate with a Wireless PAN (WPAN) (such as, for example, a BLUETOOTH WPAN), a WI-FI network, a WI-MAX network, a cellular telephone network (such as, for example, a global system for mobile communications (GSM) network), or other suitable wireless network, or a combination of two or more thereof. Computer system 1300 may include any suitable communication interface 1310 for any of these networks, where appropriate. Communication interface 1310 may include one or more communication interfaces 1310, where appropriate. Although this disclosure describes and illustrates a particular communication interface, this disclosure encompasses any suitable communication interface.
In particular embodiments, bus 1312 includes hardware, software, or both coupling components of computer system 1300 to each other. By way of example, and not limitation, bus 1312 may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a Front Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an INFINIBAND interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association local (VLB) bus, or another suitable bus or a combination of two or more thereof. Bus 1312 may include one or more buses 1312, where appropriate. Although this disclosure describes and illustrates a particular bus, this disclosure contemplates any suitable bus.
Herein, one or more computer-readable non-transitory storage media may include one or more semiconductor-based or other Integrated Circuits (ICs) (such as, for example, field Programmable Gate Arrays (FPGAs) or Application Specific ICs (ASICs)), a Hard Disk Drive (HDD), a hybrid hard disk drive (HHD), an Optical Disk Drive (ODD), a magneto-optical disk drive, a Floppy Disk Drive (FDD), a magnetic tape, a Solid State Drive (SSD), a RAM drive, a secure digital card or drive, any other suitable computer-readable non-transitory storage media, or any suitable combination of two or more thereof. The computer-readable non-transitory storage medium may be a volatile storage medium, a non-volatile storage medium, or a combination of volatile and non-volatile storage media, where appropriate.
Herein, "or" is inclusive and not exclusive, unless explicitly indicated otherwise or the context indicates otherwise. Thus, herein, "A or B" means "A, B, or both," unless explicitly stated otherwise or the context indicates otherwise. Moreover, herein, "and" is both joint and several, unless explicitly stated otherwise or the context indicates otherwise. Thus, herein, "A and B" means "A and B, jointly or severally," unless explicitly stated otherwise or the context indicates otherwise.
The terms and expressions which have been employed are used as terms of description and not of limitation, and there is no intention in the use of such terms and expressions of excluding any equivalents of the features shown and described or portions thereof, but it is recognized that various modifications are possible within the scope of the invention claimed. Accordingly, it should be understood that although the claimed invention has been specifically disclosed by embodiments and optional features, modification and variation of the concepts herein disclosed may be resorted to by those skilled in the art, and that such modifications and variations are considered to be within the scope of this invention as defined by the appended claims.
The following description merely provides preferred exemplary embodiments and is not intended to limit the scope, applicability, or configuration of the disclosure. Rather, the ensuing description of the preferred exemplary embodiments will provide those skilled in the art with an enabling description for implementing various embodiments. It should be understood that various changes may be made in the function and arrangement of elements without departing from the spirit and scope as set forth in the appended claims.
In the following description, specific details are given to provide a thorough understanding of the embodiments. It may be evident, however, that the embodiments may be practiced without these specific details. For example, circuits, systems, networks, processes, and other components may be shown in block diagram form in order not to obscure the embodiments in unnecessary detail. In other instances, well-known circuits, processes, algorithms, structures, and techniques may be shown without unnecessary detail in order to avoid obscuring the embodiments.
Various embodiments of the invention may include:
1. a method, comprising:
accessing, by a digital pathology image processing system, a digital pathology image depicting a slice of a biological sample, wherein the digital pathology image includes areas exhibiting reactivity to two or more stains;
Subdividing, by the digital pathology image processing system, the digital pathology image into a plurality of tiles;
calculating, for each of the plurality of tiles, a local density measurement for each of a plurality of biological object types by the digital pathology image processing system;
generating, by the digital pathology image processing system, one or more spatial distribution metrics for the plurality of biological object types in the digital pathology image based at least in part on the calculated local density measurements; and
determining, by the digital pathology image processing system, a tumor immunophenotype of the digital pathology image based at least in part on the local density measurement or the one or more spatial distribution metrics.
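The tiling and per-tile density steps of embodiment 1 can be sketched as plain array operations over a per-pixel object-label image. This is an illustrative sketch only, not the disclosed implementation; the label encoding and tile size are assumptions:

```python
import numpy as np

def tile_image(label_image, tile_size):
    """Subdivide a 2-D per-pixel object-label image into square tiles."""
    h, w = label_image.shape
    return [label_image[y:y + tile_size, x:x + tile_size]
            for y in range(0, h - tile_size + 1, tile_size)
            for x in range(0, w - tile_size + 1, tile_size)]

def local_density(tile, object_label):
    """Fraction of a tile's pixels assigned to one biological object type."""
    return float(np.count_nonzero(tile == object_label)) / tile.size
```

Each tile then contributes one density value per object type, and those per-tile vectors feed the spatial distribution metrics of the later embodiments.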
2. The method of claim 1, wherein each of the local density measurements comprises a representation of absolute or relative quantity, area, or density.
3. The method of claim 1 or 2, wherein the plurality of biological object types comprises tumor cells and immune cells, the tumor immunophenotype comprising:
a desert type, provided that, for the plurality of tiles, the local density measurement of the immune cells is less than an immune cell density threshold;
an excluded type, provided that, for one or more of the plurality of tiles, the local density measurement of the tumor cells is less than a tumor cell density threshold and the local density measurement of the immune cells is greater than or equal to the immune cell density threshold; or
an inflamed type, provided that, for one or more of the plurality of tiles, the local density measurement of the tumor cells is greater than or equal to the tumor cell density threshold and the local density measurement of the immune cells is greater than or equal to the immune cell density threshold.
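The threshold logic of embodiment 3 can be read as a small decision rule over per-tile densities. In the sketch below, the labels correspond to the desert, excluded (rejection), and inflamed (inflammatory) types; the threshold values are hypothetical placeholders, and the precedence among tile-level criteria (a slide could satisfy the second and third conditions on different tiles) is a design choice the embodiment leaves open:

```python
def classify_immunophenotype(tiles, tumor_thresh=0.10, immune_thresh=0.05):
    """tiles: list of dicts holding per-tile 'tumor' and 'immune' densities.

    Returns 'desert', 'excluded', or 'inflamed'; thresholds are
    illustrative only, not values from the disclosure.
    """
    # Desert: no tile reaches the immune-cell density threshold.
    if all(t["immune"] < immune_thresh for t in tiles):
        return "desert"
    # Inflamed: some tile co-locates tumor and immune cells above threshold.
    if any(t["immune"] >= immune_thresh and t["tumor"] >= tumor_thresh
           for t in tiles):
        return "inflamed"
    # Otherwise immune cells are present only where tumor cells are sparse.
    return "excluded"
```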
4. A method according to any one of claims 1 to 3, wherein the one or more spatial distribution metrics characterize the extent to which a first biological object type is depicted interspersed with a second biological object type.
5. The method of any of claims 1 to 4, wherein the one or more spatial distribution metrics comprise:
a Jaccard index;
an index;
a Bhattacharyya coefficient;
a Moran index;
a Geary contiguity ratio;
a Morisita-Horn index; or
a metric defined based on hot/cold spot analysis.
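Two of the listed metrics are easy to state over per-tile density vectors: the Jaccard index compares the sets of tiles where each object type is present, and the Morisita-Horn index compares the two relative-abundance profiles. A sketch, with the presence threshold as an assumption:

```python
import numpy as np

def jaccard_index(dens_a, dens_b, presence_thresh=0.0):
    """Jaccard overlap of the tiles where each object type is present."""
    pa = np.asarray(dens_a) > presence_thresh
    pb = np.asarray(dens_b) > presence_thresh
    union = np.logical_or(pa, pb).sum()
    return float(np.logical_and(pa, pb).sum()) / union if union else 0.0

def morisita_horn(dens_a, dens_b):
    """Morisita-Horn overlap of two per-tile density vectors (1 = identical)."""
    a = np.asarray(dens_a, dtype=float)
    b = np.asarray(dens_b, dtype=float)
    if a.sum() == 0 or b.sum() == 0:
        return 0.0
    pa, pb = a / a.sum(), b / b.sum()  # relative abundance per tile
    return 2.0 * float(np.dot(pa, pb)) / float(np.dot(pa, pa) + np.dot(pb, pb))
```

Both metrics approach 1 when tumor and immune densities rise and fall together tile by tile, and 0 when the two object types occupy disjoint tiles, which matches the interspersion reading of embodiment 4.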
6. The method of any one of claims 1 to 5, wherein the plurality of biological object types comprises tumor cells and immune cells, the tumor immunophenotype comprising:
an excluded type, provided that, for one or more of the plurality of tiles, the one or more spatial distribution metrics indicate spatial separation of the tumor cells and the immune cells; or
an inflamed type, provided that, for one or more of the plurality of tiles, the one or more spatial distribution metrics indicate co-localization of the tumor cells and the immune cells.
7. The method of any one of claims 1-6, wherein calculating the local density measurement for each of the plurality of biological object types comprises:
for each of the plurality of tiles:
dividing the tile into a plurality of regions according to the two or more stains, wherein each of the biological object types is responsive to one of the stains;
classifying each of the regions according to reactivity to the stain; and
calculating the local density measurement for each of the plurality of biological object types located within the tile based on a number of the regions of the tile classified as reactive to each of the two or more stains.
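The per-tile computation of embodiment 7 amounts to thresholding each stain channel and counting reactive regions. In this sketch, regions are taken to be pixels and the intensity thresholds are assumptions:

```python
import numpy as np

def densities_from_stains(tile_channels, reactivity_thresh):
    """Per-tile density of each object type from stain reactivity.

    tile_channels: dict mapping stain name -> 2-D intensity array for one tile.
    reactivity_thresh: dict mapping stain name -> minimum reactive intensity.
    """
    out = {}
    for stain, intensity in tile_channels.items():
        # Classify each region by whether it reacts to this stain.
        reactive = intensity >= reactivity_thresh[stain]
        out[stain] = float(np.count_nonzero(reactive)) / intensity.size
    return out
```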
8. The method of claim 7, wherein each of the regions of the tile is determined based on a staining intensity value of the region, the staining intensity value being based on a reactivity of each of the plurality of biological object types to one of the two or more stains.
9. The method of claim 7 or 8, wherein the regions of the tile are determined as tumor-related regions and non-tumor-related regions.
10. The method of claim 9, wherein each of the tumor-associated region and non-tumor-associated region is determined to be an immune cell-associated region and a non-immune cell-associated region.
11. The method of any one of claims 1-10, wherein determining the tumor immunophenotype of the image comprises:
projecting a representation of the digital pathology image into a feature space having an axis based on the one or more spatial distribution metrics; and
determining the tumor immunophenotype of the image based on a location of the digital pathology image within the feature space.
12. The method of claim 11, wherein determining the tumor immunophenotype of the image is further based on a proximity of the location of the digital pathology image within the feature space to a location represented by one or more other digital pathology images having a specified tumor immunophenotype.
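Embodiments 11 and 12 describe a nearest-neighbor style assignment in the spatial-metric feature space. A plain k-NN sketch, where the Euclidean distance and the value of k are assumptions not stated in the disclosure:

```python
import numpy as np

def phenotype_by_neighbors(query_point, ref_points, ref_labels, k=3):
    """Assign the majority phenotype of the k closest reference slides.

    query_point: the slide's coordinates in the spatial-metric feature space.
    ref_points/ref_labels: previously phenotyped slides and their labels.
    """
    ref = np.asarray(ref_points, dtype=float)
    dists = np.linalg.norm(ref - np.asarray(query_point, dtype=float), axis=1)
    nearest = np.argsort(dists)[:k]
    votes = [ref_labels[i] for i in nearest]
    return max(set(votes), key=votes.count)
```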
13. The method of any one of claims 1 to 12, wherein the plurality of biological object types comprises cytokeratin and cytotoxic structures.
14. The method of any one of claims 1 to 13, further comprising:
identifying one or more tumor regions in the digital pathology image, comprising:
providing a user interface for display, the user interface comprising the digital pathology image and one or more interactive elements; and
receiving a selection of the one or more tumor regions through interaction with the one or more interactive elements.
15. The method of any one of claims 1 to 14, further comprising:
generating a result corresponding to: an assessment of a medical condition of a subject comprising a prognosis for the outcome of the medical condition; and
generating a display comprising an indication of the assessment of the medical condition and the prognosis of the subject.
16. The method of claim 15, wherein determining the tumor immunophenotype of the image and generating the one or more spatial distribution metrics uses a trained machine learning model that has been trained using a set of training elements, each of the set of training elements corresponding to another subject having a similar medical condition and for which a result of the medical condition is known.
17. The method of any one of claims 1 to 16, further comprising:
generating, based at least in part on the one or more spatial distribution metrics, a result corresponding to: a prediction of the extent to which a given treatment that modulates an immune response will be effective in treating a medical condition in a subject;
determining that the subject is eligible for a clinical trial based on the results; and
generating a display comprising an indication that the subject is eligible for the clinical trial.
18. A digital pathology image processing system, comprising:
one or more data processors; and
a non-transitory computer-readable storage medium communicatively coupled to the one or more data processors and comprising instructions that, when executed by the one or more data processors, cause the one or more data processors to perform operations comprising:
accessing a digital pathology image depicting a slice of a biological sample, wherein the digital pathology image includes areas exhibiting reactivity to two or more stains;
subdividing the digital pathology image into a plurality of tiles;
For each of the plurality of tiles, calculating a local density measurement for each of a plurality of biological object types identified within the tile;
generating one or more spatial distribution metrics for the plurality of biological object types in the digital pathology image based at least in part on the calculated local density measurements; and
determining a tumor immunophenotype of the digital pathology image based at least in part on the local density measurement and the one or more spatial distribution metrics.
19. A digital pathology image processing system according to claim 18, wherein each of the local density measurements comprises a representation of an absolute or relative quantity, area or density.
20. A digital pathology image processing system according to claim 18 or 19, wherein the plurality of biological object types comprises tumor cells and immune cells, the tumor immunophenotype comprising:
a desert type, provided that, for the plurality of tiles, the local density measurement of the immune cells is less than an immune cell density threshold;
an excluded type, provided that, for one or more of the plurality of tiles, the local density measurement of the tumor cells is less than a tumor cell density threshold and the local density measurement of the immune cells is greater than or equal to the immune cell density threshold; or
an inflamed type, provided that, for one or more of the plurality of tiles, the local density measurement of the tumor cells is greater than or equal to the tumor cell density threshold and the local density measurement of the immune cells is greater than or equal to the immune cell density threshold.
21. A digital pathology image processing system according to any one of claims 18 to 20, wherein the one or more spatial distribution metrics characterize the extent to which a first biological object type is depicted interspersed with a second biological object type.
22. A digital pathology image processing system according to any one of claims 18 to 21, wherein the one or more spatial distribution metrics comprise:
a Jaccard index;
an index;
a Bhattacharyya coefficient;
a Moran index;
a Geary contiguity ratio;
a Morisita-Horn index; or
a metric defined based on hot/cold spot analysis.
23. A digital pathology image processing system according to any one of claims 18 to 22, wherein the plurality of biological object types comprises tumor cells and immune cells, the tumor immunophenotype comprising:
an excluded type, provided that, for one or more of the plurality of tiles, the one or more spatial distribution metrics indicate spatial separation of the tumor cells and the immune cells; or
an inflamed type, provided that, for one or more of the plurality of tiles, the one or more spatial distribution metrics indicate co-localization of the tumor cells and the immune cells.
24. A digital pathology image processing system according to any one of claims 18 to 23, wherein calculating the local density measurement for each of the plurality of biological object types comprises:
for each of the plurality of tiles:
dividing the tile into a plurality of regions according to the two or more stains, wherein each of the biological object types is responsive to one of the stains;
classifying each of the regions according to reactivity to the stain; and
calculating the local density measurement for each of the plurality of biological object types located within the tile based on a number of the regions of the tile classified as reactive to each of the two or more stains.
25. The digital pathology image processing system of claim 24, wherein each of the regions of the tile is determined based on a staining intensity value of the region, the staining intensity value being based on a reactivity of each of the plurality of biological object types to one of the two or more stains.
26. A digital pathology image processing system according to claim 24 or 25, wherein the regions of the tiles are determined as tumor-associated regions and non-tumor-associated regions.
27. The digital pathology image processing system of claim 26, wherein each of the tumor-associated region and non-tumor-associated region is determined as an immune cell-associated region and a non-immune cell-associated region.
28. A digital pathology image processing system according to any one of claims 18 to 27, wherein determining the tumor immunophenotype of the image comprises:
projecting a representation of the digital pathology image into a feature space having an axis based on the one or more spatial distribution metrics; and
determining the tumor immunophenotype of the image based on a location of the digital pathology image within the feature space.
29. A digital pathology image processing system according to claim 28, wherein determining the tumor immunophenotype of the image is further based on a proximity of the location of the digital pathology image within the feature space to a location represented by one or more other digital pathology images having a specified tumor immunophenotype.
30. A digital pathology image processing system according to any one of claims 18 to 29, wherein the plurality of biological object types include cytokeratin and cytotoxic structures.
31. A digital pathology image processing system according to any one of claims 18 to 30, wherein the operations further comprise:
identifying one or more tumor regions in the digital pathology image, comprising:
providing a user interface for display, the user interface comprising the digital pathology image and one or more interactive elements; and
receiving a selection of the one or more tumor regions through interaction with the one or more interactive elements.
32. A digital pathology image processing system according to any one of claims 18 to 31, wherein the operations further comprise:
generating a result corresponding to: an assessment of a medical condition of a subject comprising a prognosis for the outcome of the medical condition; and
generating a display comprising an indication of the assessment of the medical condition and the prognosis of the subject.
33. A digital pathology image processing system according to claim 32, wherein determining the tumor immunophenotype of the image and generating the one or more spatial distribution metrics uses a trained machine learning model that has been trained using a set of training elements, each of the set of training elements corresponding to another subject having a similar medical condition and for which the outcome of the medical condition is known.
34. A digital pathology image processing system according to any one of claims 18 to 33, wherein the operations further comprise:
generating, based at least in part on the one or more spatial distribution metrics, a result corresponding to: a prediction of the extent to which a given treatment that modulates an immune response will be effective in treating a medical condition in a subject;
determining that the subject is eligible for a clinical trial based on the results; and
generating a display comprising an indication that the subject is eligible for the clinical trial.
35. A non-transitory computer-readable medium comprising instructions that, when executed by one or more data processors of one or more computing devices, cause the one or more processors to:
receiving a digital pathology image depicting a slice of a biological sample, wherein the digital pathology image includes areas exhibiting reactivity to two or more stains;
segmenting the digital pathology image into a plurality of tiles;
for each of the plurality of tiles, calculating a local density measurement for each of a plurality of biological object types identified within the tile;
generating one or more spatial distribution metrics for the plurality of biological object types in the digital pathology image based at least in part on the calculated local density measurements; and
determining a tumor immunophenotype of the digital pathology image based at least in part on the local density measurement or the one or more spatial distribution metrics.
36. The non-transitory computer-readable medium of claim 35, wherein each of the local density measurements comprises a representation of absolute or relative quantity, area, or density.
37. The non-transitory computer-readable medium of claim 35 or 36, wherein the plurality of biological object types comprises tumor cells and immune cells, the tumor immunophenotype comprising:
a desert type, provided that, for the plurality of tiles, the local density measurement of the immune cells is less than an immune cell density threshold;
an excluded type, provided that, for one or more of the plurality of tiles, the local density measurement of the tumor cells is less than a tumor cell density threshold and the local density measurement of the immune cells is greater than or equal to the immune cell density threshold; or
an inflamed type, provided that, for one or more of the plurality of tiles, the local density measurement of the tumor cells is greater than or equal to the tumor cell density threshold and the local density measurement of the immune cells is greater than or equal to the immune cell density threshold.
38. The non-transitory computer-readable medium of any one of claims 35 to 37, wherein the one or more spatial distribution metrics characterize a degree to which a first biological object type is depicted interspersed with a second biological object type.
39. The non-transitory computer-readable medium of any one of claims 35 to 38, wherein the one or more spatial distribution metrics comprise:
a Jaccard index;
an index;
a Bhattacharyya coefficient;
a Moran index;
a Geary contiguity ratio;
a Morisita-Horn index; or
a metric defined based on hot/cold spot analysis.
40. The non-transitory computer-readable medium of any one of claims 35 to 39, wherein the plurality of biological object types comprises tumor cells and immune cells, the tumor immunophenotype comprising:
an excluded type, provided that, for one or more of the plurality of tiles, the one or more spatial distribution metrics indicate spatial separation of the tumor cells and the immune cells; or
an inflamed type, provided that, for one or more of the plurality of tiles, the one or more spatial distribution metrics indicate co-localization of the tumor cells and the immune cells.
41. The non-transitory computer-readable medium of any one of claims 35 to 40, wherein calculating the local density measurement for each of the plurality of biological object types comprises:
for each of the plurality of tiles:
dividing the tile into a plurality of regions according to the two or more stains, wherein each of the biological object types is responsive to one of the stains;
classifying each of the regions according to reactivity to the stain; and
calculating the local density measurement for each of the plurality of biological object types located within the tile based on a number of the regions of the tile classified as reactive to each of the two or more stains.
42. The non-transitory computer-readable medium of claim 41, wherein each of the regions of the tile is determined based on a staining intensity value of the region, the staining intensity value being based on a reactivity of each of the plurality of biological object types to one of the two or more stains.
43. The non-transitory computer-readable medium of claim 41 or 42, wherein the regions of the tile are determined as tumor-related regions and non-tumor-related regions.
44. The non-transitory computer-readable medium of claim 43, wherein each of the tumor-associated region and non-tumor-associated region is determined as an immune cell-associated region and a non-immune cell-associated region.
45. The non-transitory computer-readable medium of any one of claims 35 to 44, wherein determining the tumor immunophenotype of the image comprises:
projecting a representation of the digital pathology image into a feature space having an axis based on the one or more spatial distribution metrics; and
determining the tumor immunophenotype of the image based on a location of the digital pathology image within the feature space.
46. The non-transitory computer-readable medium of claim 45, wherein determining the tumor immunophenotype of the image is further based on a proximity of the location of the digital pathology image within the feature space to a location represented by one or more other digital pathology images having a specified tumor immunophenotype.
47. The non-transitory computer-readable medium of any one of claims 35 to 46, wherein the plurality of biological object types comprises cytokeratin and a cytotoxic structure.
48. The non-transitory computer-readable medium of any one of claims 35 to 47, further comprising:
identifying one or more tumor regions in the digital pathology image, comprising:
providing a user interface for display, the user interface comprising the digital pathology image and one or more interactive elements; and
receiving a selection of the one or more tumor regions through interaction with the one or more interactive elements.
49. The non-transitory computer-readable medium of any one of claims 35 to 48, further comprising:
generating a result corresponding to: an assessment of a medical condition of a subject comprising a prognosis for the outcome of the medical condition; and
generating a display comprising an indication of the assessment of the medical condition and the prognosis of the subject.
50. The non-transitory computer-readable medium of claim 49, wherein determining the tumor immunophenotype of the image and generating the one or more spatial distribution metrics uses a trained machine learning model that has been trained using a set of training elements, each of the set of training elements corresponding to another subject having a similar medical condition and for which a result of the medical condition is known.

Claims (20)

1. A method, comprising:
accessing, by a digital pathology image processing system, a digital pathology image depicting a slice of a biological sample, wherein the digital pathology image includes areas exhibiting reactivity to a plurality of stains;
subdividing, by the digital pathology image processing system, the digital pathology image into a plurality of tiles;
calculating, for each of the tiles, local density measurements for each of a plurality of biological object types by the digital pathology image processing system;
generating, by the digital pathology image processing system, one or more spatial distribution metrics for the biological object type in the digital pathology image based at least in part on the calculated local density measurements; and
determining, by the digital pathology image processing system, a tumor immunophenotype of the digital pathology image based at least in part on the local density measurement or the one or more spatial distribution metrics.
2. The method of claim 1, wherein each of the local density measurements comprises a representation of absolute or relative quantity, area, or density.
3. The method of claim 1, wherein the biological object types include tumor cells and immune cells, the tumor immunophenotype comprising:
a desert type, provided that, for each of the tiles, the local density measurement of the immune cells is less than an immune cell density threshold;
an excluded type, provided that, for one or more of the tiles, the local density measurement of the tumor cells is less than a tumor cell density threshold and the local density measurement of the immune cells is greater than or equal to the immune cell density threshold; or
an inflamed type, provided that, for one or more of the tiles, the local density measurement of the tumor cells is greater than or equal to the tumor cell density threshold and the local density measurement of the immune cells is greater than or equal to the immune cell density threshold.
4. The method of claim 1, wherein the one or more spatial distribution metrics characterize a degree to which a first biological object type is depicted interspersed with a second biological object type.
5. The method of claim 1, wherein the one or more spatial distribution metrics comprise:
a Jaccard index;
an index;
a Bhattacharyya coefficient;
a Moran index;
a Geary contiguity ratio;
a Morisita-Horn index; or
a metric defined based on hot/cold spot analysis.
6. The method of claim 1, wherein the biological object types include tumor cells and immune cells, the tumor immunophenotype comprising:
an excluded type, provided that, for one or more of the tiles, the one or more spatial distribution metrics are indicative of spatial separation of the tumor cells and the immune cells; or
an inflamed type, provided that, for one or more of the tiles, the one or more spatial distribution metrics indicate co-localization of the tumor cells and the immune cells.
7. The method of claim 1, wherein calculating the local density measurements for each of the biological object types comprises:
for each of the tiles:
dividing the tile into a plurality of regions according to the stain, wherein each of the biological object types is responsive to one of the stains;
classifying each of the regions according to reactivity to the stain; and
calculating the local density measurement for each of the biological object types located within the tile based on the number of the regions of the tile classified as reactive to each of the stains.
8. The method of claim 7, wherein each of the regions of the tile is determined based on a staining intensity value of the region, the staining intensity value being based on a reactivity of each of the biological object types to one of the stains.
9. The method of claim 7, wherein the regions of the tile are determined as tumor-related regions and non-tumor-related regions.
10. The method of claim 9, wherein each of the tumor-associated region and non-tumor-associated region is determined to be an immune cell-associated region and a non-immune cell-associated region.
11. The method of claim 1, wherein determining the tumor immunophenotype of the image comprises:
projecting a representation of the digital pathology image into a feature space having an axis based on the one or more spatial distribution metrics; and
determining the tumor immunophenotype of the image based on a location of the digital pathology image within the feature space.
12. The method of claim 11, wherein determining the tumor immunophenotype of the image is further based on a proximity of the location of the digital pathology image within the feature space to a location represented by one or more other digital pathology images having a specified tumor immunophenotype.
13. The method of claim 1, wherein the biological object types include cytokeratin and cytotoxic structures.
14. The method as recited in claim 1, further comprising:
identifying one or more tumor regions in the digital pathology image, comprising:
providing a user interface for display, the user interface comprising the digital pathology image and one or more interactive elements; and
receiving a selection of the one or more tumor regions through interaction with the one or more interactive elements.
15. The method as recited in claim 1, further comprising:
generating a result corresponding to: an assessment of a medical condition of a subject comprising a prognosis for the outcome of the medical condition; and
generating a display comprising an indication of the assessment of the medical condition and the prognosis of the subject.
16. The method of claim 15, wherein determining the tumor immunophenotype of the image and generating the one or more spatial distribution metrics uses a trained machine learning model that has been trained using a set of training elements, each of the set of training elements corresponding to another subject having a similar medical condition and for which a result of the medical condition is known.
17. The method as recited in claim 1, further comprising:
generating, based at least in part on the one or more spatial distribution metrics, a result corresponding to: a prediction of the extent to which a given treatment that modulates an immune response will be effective in treating a medical condition in a subject;
determining that the subject is eligible for a clinical trial based on the results; and
generating a display comprising an indication that the subject is eligible for the clinical trial.
18. A digital pathology image processing system, comprising:
one or more data processors; and
A non-transitory computer-readable storage medium communicatively coupled to the one or more data processors and comprising instructions that, when executed by the one or more data processors, cause the one or more data processors to perform operations comprising:
accessing a digital pathology image depicting a slice of a biological sample, wherein the digital pathology image includes areas exhibiting reactivity to a plurality of stains;
subdividing the digital pathology image into a plurality of tiles;
for each of the tiles, calculating a local density measurement for each of a plurality of biological object types identified within the tile;
generating one or more spatial distribution metrics for the biological object type in the digital pathology image based at least in part on the calculated local density measurements; and
determining a tumor immunophenotype of the digital pathology image based at least in part on the local density measurement and the one or more spatial distribution metrics.
19. A digital pathology image processing system according to claim 18, wherein the biological object types include tumor cells and immune cells, the tumor immunophenotype comprising:
a desert type, provided that, for each of the tiles, the local density measurement of the immune cells is less than an immune cell density threshold;
an excluded type, provided that, for one or more of the tiles, the local density measurement of the tumor cells is less than a tumor cell density threshold and the local density measurement of the immune cells is greater than or equal to the immune cell density threshold; or
an inflamed type, provided that, for one or more of the tiles, the local density measurement of the tumor cells is greater than or equal to the tumor cell density threshold and the local density measurement of the immune cells is greater than or equal to the immune cell density threshold.
20. A non-transitory computer-readable medium comprising instructions that, when executed by one or more data processors of one or more computing devices, cause the one or more processors to:
receiving a digital pathology image depicting a slice of a biological sample, wherein the digital pathology image includes areas exhibiting reactivity to a plurality of stains;
segmenting the digital pathology image into a plurality of tiles;
For each of the tiles, calculating a local density measurement for each of a plurality of biological object types identified within the tile;
generating one or more spatial distribution metrics for the biological object type in the digital pathology image based at least in part on the calculated local density measurements; and
determining a tumor immunophenotype of the digital pathology image based at least in part on the local density measurement or the one or more spatial distribution metrics.
CN202280037416.3A 2021-05-27 2022-05-26 Tumor immunophenotyping based on spatial distribution analysis Pending CN117377982A (en)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US63/194,009 2021-05-27
US63/279,946 2021-11-16
US202263308491P 2022-02-09 2022-02-09
US63/308,491 2022-02-09
PCT/US2022/031220 WO2022251556A1 (en) 2021-05-27 2022-05-26 Tumor immunophenotyping based on spatial distribution analysis

Publications (1)

Publication Number Publication Date
CN117377982A true CN117377982A (en) 2024-01-09

Family

ID=89391487

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202280037416.3A Pending CN117377982A (en) 2021-05-27 2022-05-26 Tumor immunophenotyping based on spatial distribution analysis

Country Status (1)

Country Link
CN (1) CN117377982A (en)

Similar Documents

Publication Publication Date Title
KR102583103B1 (en) Systems and methods for processing electronic images for computational detection methods
Yang et al. Deep learning-based six-type classifier for lung cancer and mimics from histopathological whole slide images: a retrospective study
US20220156930A1 (en) Cancer risk stratification based on histopathological tissue slide analysis
US20210073986A1 (en) Systems and methods for processing images of slides to infer biomarkers
US20230140977A1 (en) Spatial feature analysis for digital pathology images
Sobhani et al. Artificial intelligence and digital pathology: Opportunities and implications for immuno-oncology
CN116157834A (en) Assessing heterogeneity of features in digital pathology images using machine learning techniques
CN114981899A (en) System and method for processing electronic images for biomarker localization
US20240087122A1 (en) Detecting tertiary lymphoid structures in digital pathology images
Ding et al. Deep learning‐based classification and spatial prognosis risk score on whole‐slide images of lung adenocarcinoma
CN117377982A (en) Tumor immunophenotyping based on spatial distribution analysis
US20240104948A1 (en) Tumor immunophenotyping based on spatial distribution analysis
Eastwood et al. MesoGraph: Automatic profiling of mesothelioma subtypes from histological images
US20240087726A1 (en) Predicting actionable mutations from digital pathology images
CN117378015A (en) Predicting actionable mutations from digital pathology images
Hue et al. High content image analysis in routine diagnostic histopathology predicts outcomes in HPV-associated oropharyngeal squamous cell carcinomas
Zehra et al. Use of Novel Open-Source Deep Learning Platform for Quantification of Ki-67 in Neuroendocrine Tumors–Analytical Validation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination