WO2024107784A2 - Repair of disease-associated single nucleotide variants via interallelic gene conversion - Google Patents

Publication number
WO2024107784A2
Authority
WO
WIPO (PCT)
Prior art keywords
mutation
composition
snv
grna
cells
Prior art date
Application number
PCT/US2023/079724
Other languages
French (fr)
Other versions
WO2024107784A3 (en
Inventor
Michael Robert SAVONA
Alexander J. SILVER
Original Assignee
Vanderbilt University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Vanderbilt University
Publication of WO2024107784A2
Publication of WO2024107784A3

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/10 Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/102 Mutagenizing nucleic acids
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K31/00 Medicinal preparations containing organic active ingredients
    • A61K31/70 Carbohydrates; Sugars; Derivatives thereof
    • A61K31/7088 Compounds having three or more nucleosides or nucleotides
    • A61K31/711 Natural deoxyribonucleic acids, i.e. containing only 2'-deoxyriboses attached to adenine, guanine, cytosine or thymine and having 3'-5' phosphodiester links
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113 Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/10 Type of nucleic acid
    • C12N2310/20 Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2320/00 Applications; Uses
    • C12N2320/30 Special therapeutic applications
    • C12N2320/34 Allele or polymorphism specific uses
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases RNAses, DNAses

Definitions

  • a major aim of anti-cancer therapies is to target cancer cells effectively while sparing healthy tissue.
  • Somatic mutations that arise in neoplastic clones constitute one class of salient distinguishing features with the potential to be targeted (Hanahan, D. and Weinberg, R.A. Cell 2000 100:57-70).
  • Single nucleotide variants (SNVs) are highly represented in sequenced cancer genomes (Martincorena, I. and Campbell Peter, J. Science 2015 349:1483-1489).
  • SNVs Single nucleotide variants
  • each SNV represents only a slight deviation from the wild-type sequence, there are inherent challenges with ensuring proposed therapeutic modalities maintain a high degree of specificity (Rabinowitz, R.
  • HDR homology-directed repair
  • CRISPR KO CRISPR knockout
  • CRISPR-Cas9 is a useful tool for creating precise genetic knock-in alterations through homology-directed repair (HDR), although all current methods rely on provision of an exogenous repair template.
  • Disclosed herein are compositions, systems, and methods for repairing heterozygous single nucleotide variants (SNVs) using the cell’s own wild-type allele rather than an exogenous template. This technique can reduce cost and complexity for experiments modeling phenotypic consequences of SNVs.
  • IGC interallelic gene conversion
  • compositions or system for repairing heterozygous single nucleotide variants in a cell by CRISPR-mediated interallelic gene conversion (IGC), the composition involving an isolated nucleic acid encoding a Clustered Regularly Interspaced Short Palindromic Repeat (CRISPR)-associated endonuclease; and an isolated nucleic acid sequence encoding an SNV-specific guide RNA (gRNA) 17 to 24 nucleotides in length complementary to a target gene sequence comprising the SNV, wherein the composition does not comprise an exogenous donor repair template.
  • the target gene can be any gene in a cell that is not dividing, wherein the gene is targetable by a CRISPR-associated endonuclease (Cas9) (e.g. has a PAM sequence).
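The requirement that an SNV-specific gRNA of 17 to 24 nucleotides cover the variant and sit next to a PAM can be illustrated with a minimal sketch. It assumes an S. pyogenes-style NGG PAM immediately 3' of the protospacer and scans both strands of a short mutant-allele sequence. The function names, the default guide length, and the toy sequence are hypothetical and are not taken from the disclosure. The sketch does not cover an SNV that falls inside the PAM itself, and it does not assess allele specificity.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_snv_guides(mutant_seq: str, snv_index: int, guide_len: int = 20):
    """List (strand, protospacer, pam) candidates whose protospacer covers the SNV.

    mutant_seq: mutant-allele DNA, 5'->3', upper case (hypothetical input).
    snv_index:  0-based position of the variant base on the given strand.
    Assumes an NGG PAM directly 3' of the protospacer.
    """
    if not 17 <= guide_len <= 24:
        raise ValueError("guide length outside the 17 to 24 nt range")
    n = len(mutant_seq)
    hits = []
    strands = (
        ("+", mutant_seq, snv_index),
        ("-", reverse_complement(mutant_seq), n - 1 - snv_index),
    )
    for strand, seq, pos in strands:
        for start in range(n - guide_len - 3 + 1):
            pam = seq[start + guide_len:start + guide_len + 3]
            covers_snv = start <= pos < start + guide_len
            if pam[1:] == "GG" and covers_snv:
                hits.append((strand, seq[start:start + guide_len], pam))
    return hits


# Toy example: a 20-nt protospacer followed by AGG, with the variant at index 15.
toy = "TTTTT" + "ACGT" * 5 + "AGG" + "TTTTT"
print(find_snv_guides(toy, 15))
```

The protospacer returned here is written as DNA; the corresponding gRNA targeting domain would carry uracil in place of thymine.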
  • Cas9 CRISPR-associated endonuclease
  • the target gene is selected from the group consisting of ASXL1, DNMT3A, GNAS, GNB1, IDH1, IDH2, KIT, NRAS, PPM1D, SF3B1, SRSF2, TET2, TP53, and U2AF1.
  • composition or system of claim 1 wherein the SNV is a Y591, Q733, or L775 mutation in ASXL1, such as a Y591X, Q733X, or L775X mutation in ASXL1.
  • composition or system of claim 1 wherein the SNV is a R326, R635, V657, R729, Y735, R736, R749, F755, R771, I780, R882, W860, or P904 mutation in DNMT3A, such as a R326C, R635Q, R635W, V657M, R729W, Y735C, R736C, R736H, R749C, F755S, R771X, I780T, R882C, R882H, W860X, or P904L mutation in DNMT3A.
  • the composition or system of claim 1 wherein the SNV is a R201 mutation in GNAS, such as a R201H mutation in GNAS.
  • composition or system of claim 1 wherein the SNV is a K57 mutation in GNB1, such as a K57E mutation in GNB1.
  • composition or system of claim 1 wherein the SNV is a R132 mutation in IDH1, such as a R132H or R132C mutation in IDH1.
  • composition or system of claim 1 wherein the SNV is a R140 mutation in IDH2, such as a R140Q or R140L mutation in IDH2.
  • composition or system of claim 1 wherein the SNV is a D816 mutation in KIT, such as a D816V mutation in KIT.
  • composition or system of claim 1 wherein the SNV is a G12 mutation in NRAS, such as a G12D mutation in NRAS.
  • composition or system of claim 1 wherein the SNV is a R552 mutation in PPM1D, such as a R552X mutation in PPM1D.
  • composition or system of claim 1 wherein the SNV is a R387, H662, T663, K700, G740, or A744 mutation in SF3B1, such as a R387W, H662D, H662Q, T663I, K700E, G740E, G740R, or A744P mutation in SF3B1.
  • composition or system of claim 1 wherein the SNV is a P95 mutation in SRSF2, such as a P95H, P95L, or P95R mutation in SRSF2.
  • composition or system of claim 1 wherein the SNV is a R544, Q803, Q916, C1135, Q1191, R1216, R1261, R1359, R1465, S1486, R1516, or I1873 mutation in TET2, such as a R544X, Q803X, Q916X, C1135Y, Q1191X, R1216Q, R1216X, R1261C, R1261H, R1359C, R1359H, R1465X, S1486X, R1516X, or I1873T mutation in TET2.
  • composition or system of claim 1 wherein the SNV is a G108, R110, P177, H179, Y220, M237, C238, C242, M246, R248, R273, R306, or R342 mutation in TP53, such as a G108S, R110L, P177R, H179Y, Y220C, M237I, C238Y, C242Y, M246V, R248Q, R273H, R306X, or R342X mutation in TP53.
  • composition or system of claim 1 wherein the SNV is a S34 mutation in U2AF1, such as a S34F mutation in U2AF1.
  • composition or system of claim 1 wherein the SNV is selected from the group consisting of rs377577594, rs147001633, rs200018028, rs121913237, rs371369583, rs371369583, rs387907078, rs149095705, rs147828672, rs144689354, rs370751539, rs779626155, rs200018028, rs761934754, rs751562376, rs747448117, rs751713049, rs141326438, rs559063155, rs121913495, rs779070661, rs121913500, rs121913499, rs121913502, rs121913502, rs121913507, rs371769427, rs752263134, rs28934576, rs7
  • composition or system of claim 1 wherein the gRNA comprises a polynucleotide selected from the group consisting of SEQ ID NO: 1-216.
  • a method for repairing heterozygous single nucleotide variants (SNVs) in a subject in need thereof comprising administering to the subject a therapeutically effective amount of the composition or system of any one of claims 1 to 20.
  • FIGs. 1A to 1J show interallelic gene conversion can increase wild-type allele fraction in hematopoietic cell lines.
  • FIG. 1A shows the allele-specific gRNA sequences designed against DNMT3A p.R882C (wt: SEQ ID NO:223, Mut: SEQ ID NO:224, gRNA: SEQ ID NO:1), DNMT3A p.R882H (wt: SEQ ID NO:225, Mut: SEQ ID NO:226, gRNA: SEQ ID NO:4), NRAS p.G12D (wt: SEQ ID NO:227, Mut: SEQ ID NO:228, gRNA: SEQ ID NO:9), ASXL1 p.Y591X (c.C1773G) (wt: SEQ ID NO:229, Mut: SEQ ID NO:230, gRNA: SEQ ID NO:12), and ASXL1 p.Y591X (c.C1773A) (wt: SEQ ID NO:231, Mut: SEQ ID NO:232, gRNA: SEQ ID NO:15).
  • FIG. 1B shows cells were electroporated with ribonucleoprotein containing SNV-specific gRNA, then DNA was collected and compared to cells receiving scramble gRNA as a control.
  • FIG. 1C shows Sanger trace (SEQ ID NO:233) showing increase in the wild-type allele fraction and decrease in the mutant allele fraction of DNMT3A p.R882C in OCI-AML3 cells.
  • FIGs. 1D to 1H show the wild-type (WT) allelic fraction following control treatment (red), treatment with targeted gRNA plus homology-directed repair (HDR) enhancer, or gRNA without HDR enhancer for the DNMT3A p.R882C locus in OCI-AML3 cells (FIG.
  • FIGs. 1I and 1J show total indels in samples receiving mutant-targeting gRNA plus HDR enhancer or gRNA without HDR enhancer in OCI-AML3 cells (FIG. 1I) and SET-2 cells (FIG. 1J). Bar plots and error bars represent mean and standard deviation.
  • FIGs. 2A to 2H show reversion of ASXL1 p.Y591X results in transcriptional downregulation of pro-growth pathways.
  • FIG. 2A shows Western blot of scramble gRNA or Y591X-gRNA treated K562, showing full-length and truncated ASXL1 and β-actin.
  • FIG. 2E shows most significant enriched and de-enriched Hallmark pathways in preranked gene set enrichment analysis.
  • FIG. 2G shows cell divisions (mean +/- SD) as measured by CellTrace Violet fluorescence of K562 treated with scramble or mutant-targeting gRNA after
  • FIGs. 3A to 3C show bulk interallelic gene conversion is sufficient to prolong survival in cell-line derived mouse xenograft model.
  • FIG. 3A is a schematic of xenograft model.
  • FIGs. 4A to 4D show interallelic gene conversion works in primary cells.
  • FIG. 4A shows the sequence of the ASXL1 p.Q748X targeted gRNA (wt: SEQ ID NO:234, Mut: SEQ ID NO:235, gRNA: SEQ ID NO:236).
  • FIG. 4B shows the wild-type (WT) allele fraction of scramble- or targeted-gRNA treated patient sample. Bar plots and error bars represent mean and standard deviation.
  • FIG. 4C shows the number of reads containing indels for each of the experimental replicates.
  • FIG. 4D shows mean allele frequency (AF) for the reference allele (GRCh38/hg38) at heterozygous SNPs (purple) on the chromosome harboring the CRISPR-targeted SNV (orange).
  • AF mean allele frequency
  • FIG. 5A shows representative Sanger traces for DNMT3A p.R882H in SET-2 cells (SEQ ID NO:237), NRAS p.G12D in THP-1 cells (SEQ ID NO:238), ASXL1 p.Y591X in OCI-AML5 cells (SEQ ID NO:239), and ASXL1 p.Y591X in K562 cells (SEQ ID NO:240).
  • FIG. 5B shows proportion of all wild-type, mutant, or unphased reads (empty bars) and the proportion of indel-containing wild-type, mutant, or unphased reads (shaded bars) for OCI-AML3, SET-2, THP-1, OCI-AML5, and K562 cells. Bar plots and error bars represent mean and standard deviation.
  • FIG. 5C shows Integrated Genome Viewer locus plot depicting indels in down-sampled NGS reads at the DNMT3A p.R882C locus (SEQ ID NO:241) following OCI-AML3 treatment with mutant-specific gRNA without homology-directed repair (HDR) enhancer.
  • FIG. 6A shows Western blot of scramble gRNA or Y591X-gRNA treated OCI-AML5, showing full-length and truncated ASXL1 and β-actin.
  • FIG. 6I shows ratio of control- or ASXL1-treated K562 cells in G1 phase to S/G2/M phases. Bar plots and error bars represent mean and standard deviation.
  • FIG. 7C shows FACS gating strategy for Ki-67 cell cycle assay.
  • FIG. 8B shows survival curves of mice receiving K562 (solid) and OCI-AML5 (dashed) cells treated with scramble or mutant-targeting gRNA. Significance determined by Cox proportional hazards model stratified by cell line.
  • FIGs. 9A to 9E show mean allele frequency (AF) for the reference allele (GRCh38/hg38) at heterozygous SNPs on the chromosome harboring the CRISPR-targeted SNV (orange) for OCI-AML3 (FIG. 9A), SET-2 (FIG. 9B), THP-1 (FIG. 9C), OCI-AML5 (FIG. 9D), and K562 (FIG. 9E).
  • treated samples includes all samples irrespective of the addition of HDR enhancer.
  • FIG. 10A depicts animal weights for an OCI-AML5 transplant model.
  • FIG. 10B depicts survival curves for K562 and OCI-AML5 transplant models, with combined treatment effect determined by Cox proportional hazards model stratified by cell type.
  • Embodiments of the present disclosure will employ, unless otherwise indicated, techniques of chemistry, biology, and the like, which are within the skill of the art.
  • gRNA guide RNA molecules for use in the disclosed compositions, systems, and methods.
  • a gRNA refers to a nucleic acid that promotes the specific targeting or homing of a gRNA molecule/Cas9 molecule complex to a target nucleic acid.
  • gRNA molecules can be unimolecular (having a single RNA molecule), sometimes referred to herein as “chimeric” gRNAs, or modular (comprising more than one, and typically two, separate RNA molecules).
  • gRNAs are a synthetic fusion of the endogenous bacterial crRNA and tracrRNA.
  • gRNAs provide both targeting specificity and scaffolding/binding ability for Cas9 nuclease. They do not exist in nature.
  • gRNAs are sometimes referred to as “single guide RNA” or “sgRNA”.
  • a gRNA molecule comprises a number of domains, which are described in more detail below.
  • the gRNA comprises, preferably from 5' to 3': a targeting domain (which is complementary to a target nucleic acid); a first complementarity domain; a linking domain; a second complementarity domain (which is complementary to the first complementarity domain); a proximal domain; and optionally, a tail domain.
  • the targeting domain comprises a nucleotide sequence that is complementary, e.g., at least 80%, 85%, 90%, 95%, or 100% complementary, e.g., fully complementary, to the target sequence on the target nucleic acid.
  • the targeting domain is part of an RNA molecule and will therefore comprise the base uracil (U), while any DNA encoding the gRNA molecule will comprise the base thymine (T). While not wishing to be bound by theory, it is believed that the complementarity of the targeting domain with the target sequence contributes to specificity of the interaction of the gRNA molecule/Cas9 molecule complex with a target nucleic acid.
  • the target domain itself comprises, in the 5' to 3' direction, an optional secondary domain, and a core domain.
  • the core domain is fully complementary with the target sequence.
  • the targeting domain is 5 to 50, 10 to 40, e.g., 10 to 30, e.g., 15 to 30, e.g., 15 to 25 nucleotides in length.
  • the targeting domain is 15, 16, 17, 18, 19, 20, 21, 22, 23, 24 or 25 nucleotides in length.
  • the strand of the target nucleic acid with which the targeting domain is complementary is referred to herein as the complementary strand. Some or all of the nucleotides of the domain can have a modification.
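The relationship between the targeting domain and its target can be made concrete with a small sketch. It derives the fully complementary RNA targeting domain for a DNA target strand (using uracil rather than thymine) and reports the percent complementarity of a candidate targeting domain, which could then be compared against thresholds such as 80% or 100%. The function names and example sequences are hypothetical; only standard Watson-Crick pairing is assumed.

```python
# DNA base on the target strand -> RNA base that pairs with it.
PAIRING = {"A": "U", "C": "G", "G": "C", "T": "A"}


def targeting_domain_for(target_dna: str) -> str:
    """RNA (5'->3') fully complementary to a DNA target strand (5'->3')."""
    return "".join(PAIRING[base] for base in reversed(target_dna.upper()))


def percent_complementarity(targeting_rna: str, target_dna: str) -> float:
    """Percent of positions at which the RNA pairs with the DNA target strand."""
    ideal = targeting_domain_for(target_dna)
    if len(ideal) != len(targeting_rna):
        raise ValueError("targeting domain and target must be the same length")
    matches = sum(a == b for a, b in zip(targeting_rna.upper(), ideal))
    return 100.0 * matches / len(ideal)


print(targeting_domain_for("ATGC"))             # GCAU
print(percent_complementarity("GCAA", "ATGC"))  # 75.0
```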
  • the first complementarity domain is complementary with the second complementarity domain, and, in some embodiments, has sufficient complementarity to the second complementarity domain to form a duplexed region under at least some physiological conditions.
  • the first complementarity domain is 5 to 30 nucleotides in length. In some embodiments, the first complementarity domain is 5 to 25 nucleotides in length. In some embodiments, the first complementarity domain is 7 to 25 nucleotides in length. In some embodiments, the first complementarity domain is 7 to 22 nucleotides in length. In some embodiments, the first complementarity domain is 7 to 18 nucleotides in length. In some embodiments, the first complementarity domain is 7 to 15 nucleotides in length. In some embodiments, the first complementarity domain is 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 nucleotides in length.
  • the first complementarity domain comprises 3 subdomains, which, in the 5' to 3' direction are: a 5' subdomain, a central subdomain, and a 3' subdomain.
  • the 5' subdomain is 4-9, e.g., 4, 5, 6, 7, 8 or 9 nucleotides in length.
  • the central subdomain is 1, 2, or 3, e.g., 1, nucleotide in length.
  • the 3' subdomain is 3 to 25, e.g., 4 to 22, 4 to 18, or 4 to 10, or 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25, nucleotides in length.
  • the first complementarity domain can share homology with, or be derived from, a naturally occurring first complementarity domain. In some embodiments, it has at least 50% homology with a first complementarity domain disclosed herein, e.g., a Streptococcus pyogenes (S. pyogenes) or Streptococcus thermophilus (S. thermophilus) first complementarity domain.
  • a linking domain serves to link the first complementarity domain with the second complementarity domain of a unimolecular gRNA.
  • the linking domain can link the first and second complementarity domains covalently or non-covalently.
  • the linkage is covalent.
  • the linking domain covalently couples the first and second complementarity domains.
  • the linking domain is, or comprises, a covalent bond interposed between the first complementarity domain and the second complementarity domain.
  • the linking domain comprises one or more, e.g., 2, 3, 4, 5, 6, 7, 8, 9, or 10 nucleotides. In modular gRNA molecules the two molecules can be associated by virtue of the hybridization of the complementarity domains.
  • linking domains are suitable for use in unimolecular gRNA molecules.
  • Linking domains can consist of a covalent bond, or be as short as one or a few nucleotides, e.g., 1, 2, 3, 4, or 5 nucleotides in length.
  • a linking domain is 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, or 25 or more nucleotides in length. In some embodiments, a linking domain is 2 to 50, 2 to 40, 2 to 30, 2 to 20, 2 to 10, or 2 to 5 nucleotides in length. In some embodiments, a linking domain shares homology with, or is derived from, a naturally occurring sequence, e.g., the sequence of a tracrRNA that is 5' to the second complementarity domain. In some embodiments, the linking domain has at least 50% homology with a linking domain disclosed herein.
  • a modular gRNA can comprise additional sequence, 5' to the second complementarity domain, referred to herein as the 5' extension domain.
  • the 5' extension domain is 2-10, 2-9, 2-8, 2-7, 2-6, 2-5, or 2-4 nucleotides in length.
  • the 5' extension domain is 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more nucleotides in length.
  • the second complementarity domain is complementary with the first complementarity domain, and, in some embodiments, has sufficient complementarity to the first complementarity domain to form a duplexed region under at least some physiological conditions.
  • the second complementarity domain can include sequence that lacks complementarity with the first complementarity domain, e.g., sequence that loops out from the duplexed region.
  • the second complementarity domain is 5 to 27 nucleotides in length. In some embodiments, it is longer than the first complementarity region.
  • the second complementarity domain is 7 to 27 nucleotides in length. In some embodiments, the second complementarity domain is 7 to 25 nucleotides in length. In some embodiments, the second complementarity domain is 7 to 20 nucleotides in length. In some embodiments, the second complementarity domain is 7 to 17 nucleotides in length. In some embodiments, the second complementarity domain is 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24 or 25 nucleotides in length.
  • the second complementarity domain comprises 3 subdomains, which, in the 5' to 3' direction are: a 5' subdomain, a central subdomain, and a 3' subdomain.
  • the 5' subdomain is 3 to 25, e.g., 4 to 22, 4 to 18, or 4 to 10, or 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 nucleotides in length.
  • the central subdomain is 1, 2, 3, 4 or 5, e.g., 3, nucleotides in length.
  • the 3' subdomain is 4 to 9, e.g., 4, 5, 6, 7, 8 or 9 nucleotides in length.
  • the 5' subdomain and the 3' subdomain of the first complementarity domain are respectively, complementary, e.g., fully complementary, with the 3' subdomain and the 5' subdomain of the second complementarity domain.
  • the second complementarity domain can share homology with or be derived from a naturally occurring second complementarity domain. In some embodiments, it has at least 50% homology with a second complementarity domain disclosed herein, e.g., an S. pyogenes or S. thermophilus second complementarity domain.
  • nucleotides of the domain can have a modification.
  • the proximal domain is 5 to 20 nucleotides in length. In some embodiments, the proximal domain can share homology with or be derived from a naturally occurring proximal domain. In some embodiments, it has at least 50% homology with a proximal domain disclosed herein, e.g., an S. pyogenes, or S. thermophilus, proximal domain.
  • tail domains are suitable for use in gRNA molecules.
  • the tail domain is 0 (absent), 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 nucleotides in length.
  • the tail domain nucleotides are from or share homology with sequence from the 5' end of a naturally occurring tail domain.
  • the tail domain includes sequences that are complementary to each other and which, under at least some physiological conditions, form a duplexed region.
  • the tail domain is absent or is 1 to 50 nucleotides in length.
  • the tail domain can share homology with or be derived from a naturally occurring proximal tail domain. In some embodiments, it has at least 50% homology with a tail domain disclosed herein, e.g., an S. pyogenes, or S. thermophilus, tail domain.
  • the tail domain includes nucleotides at the 3' end that are related to the method of in vitro or in vivo transcription.
  • these nucleotides may be any nucleotides present before the 3' end of the DNA template.
  • these nucleotides may be the sequence UUUUUU.
  • when alternate pol-III promoters are used, these nucleotides may be various numbers of uracil bases or may include alternate bases.
  • a software tool can be used to optimize the choice of sgRNA within a user's target sequence, e.g., to minimize total off-target activity across the genome.
  • Off target activity may be other than cleavage.
  • the tool can identify all off-target sequences (e.g., preceding either NAG or NGG PAMs) across the genome that contain up to a certain number (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10) of mismatched base-pairs.
  • the cleavage efficiency at each off-target sequence can be predicted, e.g., using an experimentally-derived weighting scheme.
  • Each possible gRNA is then ranked according to its total predicted off-target cleavage; the top-ranked gRNAs represent those that are likely to have the greatest on-target and the least off-target cleavage.
  • Other functions, e.g., automated reagent design for CRISPR construction, primer design for the on-target Surveyor assay, and primer design for high-throughput detection and quantification of off-target cleavage via next-gen sequencing, can also be included in the tool.
  • Candidate gRNA molecules can be evaluated by art-known methods.
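The off-target ranking workflow described above (collect off-target sites up to a mismatch limit, predict cleavage at each with a weighting scheme, sum, and rank) can be sketched as follows. The position weights and PAM factors below are invented placeholders, not an experimentally derived scheme, and the function names and input layout are hypothetical. Off-target sites are assumed to have been found already by a genome search that is outside the scope of the sketch.

```python
from typing import Dict, List, Tuple

# Placeholder PAM factors: NGG treated as fully permissive, NAG as weakly permissive.
PAM_FACTOR = {"GG": 1.0, "AG": 0.2}


def mismatch_penalty(position: int, length: int) -> float:
    """Placeholder weight: mismatches nearer the PAM (3' end) count more."""
    return 0.2 + 0.7 * (position + 1) / length


def predicted_cleavage(guide: str, site: str, pam: str) -> float:
    """Toy cleavage score in [0, 1] for one off-target site."""
    if len(guide) != len(site):
        raise ValueError("guide and site must be the same length")
    score = PAM_FACTOR.get(pam[1:], 0.0)
    for i, (g, s) in enumerate(zip(guide, site)):
        if g != s:
            score *= 1.0 - mismatch_penalty(i, len(guide))
    return score


def rank_guides(off_targets: Dict[str, List[Tuple[str, str]]],
                max_mismatches: int = 4) -> List[Tuple[str, float]]:
    """Rank guides by total predicted off-target cleavage, lowest first.

    off_targets maps each guide to its (site, pam) off-target candidates.
    """
    totals = []
    for guide, sites in off_targets.items():
        total = 0.0
        for site, pam in sites:
            mismatches = sum(g != s for g, s in zip(guide, site))
            if mismatches <= max_mismatches:
                total += predicted_cleavage(guide, site, pam)
        totals.append((guide, total))
    return sorted(totals, key=lambda item: item[1])
```

A real tool would replace the placeholder weights with measured values and would also score on-target activity, which this sketch leaves out.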
  • Cas9 molecules of a variety of species can be used in the methods and compositions described herein. While the S. pyogenes and S. thermophilus Cas9 molecules are typically used, Cas9 molecules of, derived from, or based on the Cas9 proteins of other species can be used, e.g., Staphylococcus aureus, Neisseria meningitidis.
  • a Cas9 molecule refers to a molecule that can interact with a sgRNA molecule and, in concert with the sgRNA molecule, localize (e.g., target or home) to a site which comprises a target domain and PAM sequence.
  • the Cas9 molecule is capable of cleaving a target nucleic acid molecule.
  • Exemplary naturally occurring Cas9 molecules are described in Chylinski et al., RNA Biology 2013; 10:5, 727-737.
  • Naturally occurring Cas9 molecules possess a number of properties, including: nickase activity, nuclease activity (e.g., endonuclease and/or exonuclease activity); helicase activity; the ability to associate functionally with a gRNA molecule; and the ability to target (or localize to) a site on a nucleic acid (e.g., PAM recognition and specificity).
  • a Cas9 molecule can include all or a subset of these properties.
  • Cas9 molecules have the ability to interact with a gRNA molecule and, in concert with the gRNA molecule, localize to a site in a nucleic acid.
  • Other activities, e.g., PAM specificity, cleavage activity, or helicase activity, can vary more widely in Cas9 molecules.
  • Cas9 molecules with desired properties can be made in a number of ways, e.g., by alteration of a parental, e.g., naturally occurring, Cas9 molecule to provide an altered Cas9 molecule having a desired property.
  • one or more mutations or differences relative to a parental Cas9 molecule can be introduced. Such mutations and differences comprise: substitutions (e.g., conservative substitutions or substitutions of non-essential amino acids); insertions; or deletions.
  • a Cas9 molecule can comprise one or more mutations or differences, e.g., at least 1, 2, 3, 4, 5, 10, 15, 20, 30, 40 or 50 mutations but less than 200, 100, or 80 mutations relative to a reference Cas9 molecule.
  • a mutation or mutations do not have a substantial effect on a Cas9 activity, e.g. a Cas9 activity described herein. In some embodiments, a mutation or mutations have a substantial effect on a Cas9 activity, e.g. a Cas9 activity described herein.
  • exemplary activities comprise one or more of PAM specificity, cleavage activity, and helicase activity.
  • a mutation(s) can be present, e.g., in: one or more RuvC-like domains, e.g., an N-terminal RuvC-like domain; an HNH-like domain; a region outside the RuvC-like domains and the HNH-like domain.
  • a mutation(s) is present in an N-terminal RuvC-like domain. In some embodiments, a mutation(s) is present in an HNH-like domain. In some embodiments, mutations are present in both an N-terminal RuvC-like domain and an HNH-like domain.
  • a “non-essential” amino acid residue is a residue that can be altered from the wild-type sequence of a Cas9 molecule, e.g., a naturally occurring Cas9 molecule, e.g., an eaCas9 molecule, without abolishing or more preferably, without substantially altering a Cas9 activity (e.g., cleavage activity), whereas changing an “essential” amino acid residue results in a substantial loss of activity (e.g., cleavage activity).
  • Naturally occurring Cas9 molecules can recognize specific PAM sequences, for example the PAM recognition sequences for S. pyogenes, S. thermophilus, S. mutans, S. aureus and N. meningitidis.
  • a Cas9 molecule has the same PAM specificities as a naturally occurring Cas9 molecule.
  • a Cas9 molecule has a PAM specificity not associated with a naturally occurring Cas9 molecule, or a PAM specificity not associated with the naturally occurring Cas9 molecule to which it has the closest sequence homology.
  • a naturally occurring Cas9 molecule can be altered, e.g., to alter PAM recognition, e.g., to alter the PAM sequence that the Cas9 molecule recognizes to decrease off target sites and/or improve specificity; or eliminate a PAM recognition requirement.
  • a Cas9 molecule can be altered, e.g., to increase length of PAM recognition sequence and/or improve Cas9 specificity to high level of identity to decrease off target sites and increase specificity.
  • the length of the PAM recognition sequence is at least 4, 5, 6, 7, 8, 9, 10 or 15 amino acids in length.
  • Cas9 molecules that recognize different PAM sequences and/or have reduced off-target activity can be generated using directed evolution.
  • a Cas9 molecule comprises a cleavage property that differs from naturally occurring Cas9 molecules, e.g., that differs from the naturally occurring Cas9 molecule having the closest homology.
  • a Cas9 molecule can differ from naturally occurring Cas9 molecules, e.g., a Cas9 molecule of S. pyogenes, as follows: its ability to modulate, e.g., decreased or increased, cleavage of a double stranded break (endonuclease and/or exonuclease activity), e.g., as compared to a naturally occurring Cas9 molecule (e.g., a Cas9 molecule of S. pyogenes); its ability to modulate, e.g., decreased or increased, cleavage of a single strand of a nucleic acid, e.g., a non-complementary strand of a nucleic acid molecule or a complementary strand of a nucleic acid molecule (nickase activity), e.g., as compared to a naturally occurring Cas9 molecule (e.g., a Cas9 molecule of S. pyogenes); or the ability to cleave a nucleic acid molecule, e.g., a double stranded or single stranded nucleic acid molecule, can be eliminated.
  • the Cas9 is a “high fidelity” spCas9 variant (HF-Cas9), such as those designed according to principles disclosed by Joung and colleagues (Kleinstiver, et al., 2016, which is incorporated herein in its entirety).
  • Gene Targets Disclosed herein are isolated nucleic acid sequences encoding an SNV-specific guide RNA (gRNA). Examples of gene SNVs and corresponding gRNAs are provided in Table 1.
  • nuclease-induced homology directed repair can be used to alter a target sequence and correct (e.g., repair or edit) a mutation in the genome. While not wishing to be bound by theory, it is believed that alteration of the target sequence occurs by homology-directed repair (HDR). Normally this is done using a donor template or template nucleic acid. For example, the donor template or the template nucleic acid provides for alteration of the target sequence.
  • compositions, systems, and methods for repairing heterozygous single nucleotide variants (SNVs) using the cell’s own wild-type allele rather than an exogenous donor template. This is referred to herein as interallelic gene conversion (IGC).
  • IGC alteration of a target sequence depends on cleavage by a Cas9 molecule.
  • Cleavage by Cas9 can comprise a double strand break or two single strand breaks.
  • a mutation can be corrected by either a single double-strand break or two single strand breaks. In some embodiments, a mutation can be corrected by: (1) a single double-strand break, (2) two single strand breaks, (3) two double stranded breaks with a break occurring on each side of the target sequence, (4) one double stranded break and two single strand breaks with the double strand break and two single strand breaks occurring on each side of the target sequence or (5) four single stranded breaks with a pair of single stranded breaks occurring on each side of the target sequence.
  • the disclosed compositions, systems, and methods do not include the use of an exogenous donor template nucleic acid.
  • a “template nucleic acid,” as used herein, refers to an endogenous or exogenous nucleic acid sequence comprising the wildtype nucleic acid sequence of the target nucleic acid, i.e. lacking the SNV. Therefore, in some embodiments, the disclosed compositions and systems do not have or need an exogenous donor template nucleic acid for HDR, but instead rely on the cell’s own wild-type allele as the template nucleic acid.
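The template-free repair described above can be illustrated with a toy string model. This is a minimal sketch and not the disclosed method: it assumes the two alleles differ only at the SNV, that the break falls on the mutant allele, and that a fixed window around the break is copied from the wild-type allele. The function name and the `tract` parameter are hypothetical, and real conversion tract lengths are not specified here.

```python
def interallelic_gene_conversion(mutant: str, wild_type: str,
                                 cut_index: int, tract: int = 10) -> str:
    """Copy a window of the wild-type allele into the mutant allele around a cut.

    cut_index: 0-based position in the mutant allele where the break falls.
    tract: bases copied on each side of the cut (hypothetical parameter).
    """
    if len(mutant) != len(wild_type):
        raise ValueError("toy model assumes alleles of equal length (SNV only)")
    if not 0 <= cut_index <= len(mutant):
        raise ValueError("cut_index lies outside the allele")
    start = max(0, cut_index - tract)
    end = min(len(mutant), cut_index + tract)
    # Bases outside the window stay as they were on the cut (mutant) allele.
    return mutant[:start] + wild_type[start:end] + mutant[end:]


wild_type = "ATGGCCTACGGATTCAGGCT"
mutant = "ATGGCCTAAGGATTCAGGCT"  # C>A at index 8 (illustrative sequence)

# A cut next to the SNV restores the wild-type base from the other allele.
repaired = interallelic_gene_conversion(mutant, wild_type, cut_index=9, tract=3)

# A cut far from the SNV leaves the variant in place in this toy model.
unrepaired = interallelic_gene_conversion(mutant, wild_type, cut_index=18, tract=2)
```

In this model the only input besides the cut position is the cell's second allele, which mirrors the point that no exogenous donor template is supplied.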
  • the components can be delivered, formulated, or administered in a variety of forms.
  • the DNA will typically include a control region, e.g., comprising a promoter, to effect expression.
  • Useful promoters for Cas9 molecule sequences include CMV, EF-1α, MSCV, PGK, and CAG control promoters.
  • Useful promoters for sgRNAs include H1, EF-1α and U6 promoters. Promoters with similar or dissimilar strengths can be selected to tune the expression of components.
  • Sequences encoding a Cas9 molecule can comprise a nuclear localization signal (NLS), e.g., an SV40 NLS.
  • a promoter for a Cas9 molecule or a sgRNA molecule can be, independently, inducible, tissue specific, or cell specific.
  • DNA encoding Cas9 molecules and/or gRNA molecules can be administered to subjects or delivered into cells by art-known methods or as described herein.
  • Cas9-encoding and/or gRNA-encoding DNA can be delivered, e.g., by vectors (e.g., viral or non-viral vectors), non-vector based methods (e.g., using naked DNA or DNA complexes), or a combination thereof.
  • the Cas9- and/or gRNA-encoding DNA is delivered by a vector (e.g., viral vector/virus or plasmid).
  • a vector can comprise a sequence that encodes a Cas9 molecule and/or a gRNA molecule.
  • a vector can also comprise a sequence encoding a signal peptide (e.g., for nuclear localization, nucleolar localization, mitochondrial localization), fused, e.g., to a Cas9 molecule sequence.
  • a vector can comprise a nuclear localization sequence (e.g., from SV40) fused to the sequence encoding the Cas9 molecule.
  • the vector or delivery vehicle is a viral vector (e.g., for generation of recombinant viruses).
  • the virus is a DNA virus (e.g., dsDNA or ssDNA virus).
  • the virus is an RNA virus (e.g., an ssRNA virus).
  • Exemplary viral vectors/viruses include, e.g., retroviruses, lentiviruses, adenovirus, adeno-associated virus (AAV), vaccinia viruses, poxviruses, and herpes simplex viruses.
  • the virus infects dividing cells. In other embodiments, the virus infects non-dividing cells.
  • the virus infects both dividing and non-dividing cells.
  • the virus can integrate into the host genome.
  • the virus is engineered to have reduced immunogenicity, e.g., in humans.
  • the virus is replication-competent.
  • the virus is replication-defective, e.g., having one or more coding regions for the genes necessary for additional rounds of virion replication and/or packaging replaced with other genes or deleted.
  • the virus causes transient expression of the Cas9 molecule and/or the gRNA molecule.
  • the virus causes long-lasting, e.g., at least 1 week, 2 weeks, 1 month, 2 months, 3 months, 6 months, 9 months, 1 year, 2 years, or permanent expression, of the Cas9 molecule and/or the gRNA molecule.
  • the packaging capacity of the viruses may vary, e.g., from at least about 4 kb to at least about 30 kb, e.g., at least about 5 kb, 10 kb, 15 kb, 20 kb, 25 kb, 30 kb, 35 kb, 40 kb, 45 kb, or 50 kb.
  • the Cas9- and/or gRNA-encoding DNA is delivered by a recombinant lentivirus.
  • the lentivirus is replication-defective, e.g., does not comprise one or more genes required for viral replication.
  • the Cas9- and/or gRNA-encoding DNA is delivered by a recombinant adenovirus.
  • the adenovirus is engineered to have reduced immunogenicity in humans.
  • the Cas9- and/or gRNA-encoding DNA is delivered by a recombinant AAV.
  • the AAV can incorporate its genome into that of a host cell, e.g., a target cell as described herein.
  • the AAV is a self-complementary adeno-associated virus (scAAV), e.g., a scAAV that packages both strands which anneal together to form double stranded DNA.
  • AAV serotypes that may be used in the disclosed methods include, e.g., AAV1, AAV2, modified AAV2 (e.g., modifications at Y444F, Y500F, Y730F and/or S662V), AAV3, modified AAV3 (e.g., modifications at Y705F, Y731F and/or T492V), AAV4, AAV5, AAV6, modified AAV6 (e.g., modifications at S663V and/or T492V), AAV8, AAV 8.2, AAV9, and AAV rh10; pseudotyped AAV, such as AAV2/8, AAV2/5 and AAV2/6, can also be used in the disclosed methods.
  • the Cas9- and/or gRNA-encoding DNA is delivered by a hybrid virus, e.g., a hybrid of one or more of the viruses described herein.
  • a packaging cell is used to form a virus particle that is capable of infecting a host or target cell.
  • such a cell includes a 293 cell, which can package adenovirus, and a ψ2 cell or a PA317 cell, which can package retrovirus.
  • a viral vector used in gene therapy is usually generated by a producer cell line that packages a nucleic acid vector into a viral particle.
  • the vector typically contains the minimal viral sequences required for packaging and subsequent integration into a host or target cell (if applicable), with other viral sequences being replaced by an expression cassette encoding the protein to be expressed.
  • an AAV vector used in gene therapy typically only possesses inverted terminal repeat (ITR) sequences from the AAV genome which are required for packaging and gene expression in the host or target cell.
  • the missing viral functions are supplied in trans by the packaging cell line.
  • the viral DNA is packaged in a cell line, which contains a helper plasmid encoding the other AAV genes, namely rep and cap, but lacking ITR sequences.
  • the cell line is also infected with adenovirus as a helper.
  • the helper virus promotes replication of the AAV vector and expression of AAV genes from the helper plasmid.
  • the helper plasmid is not packaged in significant amounts due to a lack of ITR sequences. Contamination with adenovirus can be reduced by, e.g., heat treatment to which adenovirus is more sensitive than AAV.
  • the viral vector has the ability of cell type and/or tissue type recognition.
  • the viral vector can be pseudotyped with a different/alternative viral envelope glycoprotein; engineered with a cell type-specific receptor (e.g., genetic modification of the viral envelope glycoproteins to incorporate targeting ligands such as a peptide ligand, a single chain antibody, a growth factor); and/or engineered to have a molecular bridge with dual specificities with one end recognizing a viral glycoprotein and the other end recognizing a moiety of the target cell surface (e.g., ligand-receptor, monoclonal antibody, avidin-biotin and chemical conjugation).
  • the viral vector achieves cell type specific expression.
  • a tissue-specific promoter can be constructed to restrict expression of the transgene (Cas9 and gRNA) in only the target cell.
  • the specificity of the vector can also be mediated by microRNA-dependent control of transgene expression.
  • the viral vector has increased efficiency of fusion of the viral vector and a target cell membrane.
  • a fusion protein such as fusion-competent hemagglutinin (HA) can be incorporated to increase viral uptake into cells.
  • the viral vector has the ability of nuclear localization.
  • a virus that requires the breakdown of the cell wall (during cell division) and therefore will not infect a non-dividing cell can be altered to incorporate a nuclear localization peptide in the matrix protein of the virus thereby enabling the transduction of non-proliferating cells.
  • the Cas9- and/or gRNA-encoding DNA is delivered by a non-vector based method (e.g., using naked DNA or DNA complexes).
  • the DNA can be delivered, e.g., by organically modified silica or silicate (Ormosil), electroporation, gene gun, sonoporation, magnetofection, lipid-mediated transfection, dendrimers, inorganic nanoparticles, calcium phosphates, or a combination thereof.
  • the Cas9- and/or gRNA-encoding DNA is delivered by a combination of a vector and a non-vector based method.
  • a virosome comprises a liposome combined with an inactivated virus (e.g., HIV or influenza virus), which can result in more efficient gene transfer, e.g., in a respiratory epithelial cell than either a viral or a liposomal method alone.
  • the delivery vehicle is a non-viral vector.
  • the non-viral vector is an inorganic nanoparticle (e.g., with the payload attached to the surface of the nanoparticle).
  • exemplary inorganic nanoparticles include, e.g., magnetic nanoparticles (e.g., Fe3MnO2), or silica.
  • the outer surface of the nanoparticle can be conjugated with a positively charged polymer (e.g., polyethylenimine, polylysine, polyserine) which allows for attachment (e.g., conjugation or entrapment) of payload.
  • the non-viral vector is an organic nanoparticle (e.g., entrapment of the payload inside the nanoparticle).
  • organic nanoparticles include, e.g., SNALP liposomes that contain cationic lipids together with neutral helper lipids which are coated with polyethylene glycol (PEG) and protamine and nucleic acid complex coated with lipid coating.
  • the vehicle has targeting modifications to increase target cell uptake of nanoparticles and liposomes, e.g., cell specific antigens, monoclonal antibodies, single chain antibodies, aptamers, polymers, sugars, and cell penetrating peptides.
  • the vehicle uses fusogenic and endosome-destabilizing peptides/polymers.
  • the vehicle undergoes acid-triggered conformational changes (e.g., to accelerate endosomal escape of the cargo).
  • a stimuli-cleavable polymer is used, e.g., for release in a cellular compartment.
  • disulfide-based cationic polymers that are cleaved in the reducing cellular environment can be used.
  • the delivery vehicle is a biological non-viral delivery vehicle.
  • the vehicle is an attenuated bacterium (e.g., naturally or artificially engineered to be invasive but attenuated to prevent pathogenesis and expressing the transgene (e.g., Listeria monocytogenes, certain Salmonella strains, Bifidobacterium longum, and modified Escherichia coli), bacteria having nutritional and tissue-specific tropism to target specific tissues, bacteria having modified surface proteins to alter target tissue specificity).
  • the vehicle is a genetically modified bacteriophage (e.g., engineered phages having large packaging capacity, less immunogenic, containing mammalian plasmid maintenance sequences and having incorporated targeting ligands).
  • the vehicle is a mammalian virus-like particle.
  • modified viral particles can be generated (e.g., by purification of the “empty” particles followed by ex vivo assembly of the virus with the desired cargo).
  • the vehicle can also be engineered to incorporate targeting ligands to alter target tissue specificity.
  • the vehicle is a biological liposome.
  • the biological liposome is a phospholipid-based particle derived from human cells (e.g., erythrocyte ghosts, which are red blood cells broken down into spherical structures derived from the subject (e.g., tissue targeting can be achieved by attachment of various tissue or cell-specific ligands), or secretory exosomes: subject (i.e., patient) derived membrane-bound nanovesicles (30-100 nm) of endocytic origin (e.g., can be produced from various cell types and can therefore be taken up by cells without the need for targeting ligands)).
  • one or more nucleic acid molecules are delivered.
  • the nucleic acid molecule is delivered at the same time as one or more of the components of the Cas system are delivered.
  • the nucleic acid molecule is delivered before or after (e.g., less than about 30 minutes, 1 hour, 2 hours, 3 hours, 6 hours, 9 hours, 12 hours, 1 day, 2 days, 3 days, 1 week, 2 weeks, or 4 weeks) one or more of the components of the Cas system are delivered.
  • the nucleic acid molecule is delivered by a different means than one or more of the components of the Cas system, e.g., the Cas9 molecule component and/or the gRNA molecule component, are delivered.
  • the nucleic acid molecule can be delivered by any of the delivery methods described herein.
  • the nucleic acid molecule can be delivered by a viral vector, e.g., an integration-deficient lentivirus, and the Cas9 molecule component and/or the gRNA molecule component can be delivered by electroporation, e.g., such that the toxicity caused by nucleic acids (e.g., DNAs) can be reduced.
  • the nucleic acid molecule encodes a therapeutic protein, e.g., a protein described herein. In some embodiments, the nucleic acid molecule encodes an RNA molecule, e.g., an RNA molecule described herein.
  • RNA encoding Cas9 and/or gRNA molecules can be delivered into cells, e.g., target cells described herein, by art-known methods or as described herein.
  • Cas9-encoding and/or gRNA-encoding RNA can be delivered, e.g., by microinjection, electroporation, lipid-mediated transfection, peptide-mediated delivery, or a combination thereof.
  • Systemic modes of administration include oral and parenteral routes.
  • Parenteral routes include, by way of aerosol, intravenous, intraarterial, intraosseous, intramuscular, intradermal, subcutaneous, intranasal and intraperitoneal routes.
  • This Example shows that acquired SNVs associated with clonal hematopoiesis and hematologic malignancy can be efficiently restored to their wild-type sequences using widely available CRISPR reagents without the need for exogenous HDR templates.
  • High-fidelity Cas9 allows for the specific targeting of mutant alleles, and incorporation of the SNV into a standard 20-base pair protospacer sequence is sufficient to prevent off-target cutting of the wild-type allele.
  • use of a commercially available HDR enhancer can drastically reduce insertions/deletions (indels) generated during this process.
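The idea of placing the SNV inside a standard 20-base protospacer can be sketched as a simple scan of the mutant allele. The code below is an illustration under stated assumptions, not the authors' design procedure: it assumes an NGG PAM (as for S. pyogenes Cas9), scans only the strand given, and uses a made-up sequence. The function names are hypothetical.

```python
def mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def snv_protospacers(allele: str, snv_index: int, length: int = 20):
    """List (start, protospacer, pam) for NGG sites whose protospacer spans the SNV.

    Only the strand given is scanned; a fuller search would also scan the
    reverse complement. snv_index is a 0-based position in `allele`.
    """
    hits = []
    for start in range(len(allele) - length - 2):
        pam = allele[start + length:start + length + 3]
        if pam[1:] == "GG" and start <= snv_index < start + length:
            hits.append((start, allele[start:start + length], pam))
    return hits


# Illustrative sequence: 11 nt of flank, a 20-nt protospacer, a TGG PAM, 5 nt of flank.
wild_type = "GATTACAGCTT" + "GACCTACGATCCGTTAGCAA" + "TGG" + "CTAAC"
snv_index = 17  # C>A at this position turns TAC into the stop codon TAA
mutant = wild_type[:snv_index] + "A" + wild_type[snv_index + 1:]

hits = snv_protospacers(mutant, snv_index)
start, protospacer, pam = hits[0]
# The mutant-specific guide differs from the wild-type allele at exactly one base.
wt_mismatches = mismatches(protospacer, wild_type[start:start + len(protospacer)])
```

The single mismatch against the wild-type allele is the only feature separating the two targets in this sketch, which is why the text pairs this design with a high-fidelity Cas9.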
  • the ASXL1 protein is a part of several histone-modifying complexes, including polycomb repressive complex 2 (PRC2) and the BAP1 histone H2AK119Ub deubiquitinase (DUB) complex (Fujino, T., et al. Exp. Hematol. 2020 83:74-84). Stopgain mutations in ASXL1, including p.Y591X, are commonly observed in clonal hematopoiesis (Bick, A.G., et al. Nature 2020 586:763- 768; Jaiswal, S., et al. N. Engl. J. Med.
  • Oncogene 2005 24:4472-4476 has been shown to promote proliferation in a K562 model (Sakajiri, S., et al. Leukemia 2005 19:1404-1410) and primary patient samples (Zhang, W., et al. Oncol. Lett. 2013 6:203-206).
  • RNAseq was conducted after similar CRISPR experiments in the OCI-AML5 cell line (Figure 6D-6E), and, when combined with the K562 data using a linear mixed model (Figure 6F-6G), DLK1 was again found to be significantly downregulated, increasing our interest in examining this result.
  • the next goal was to determine whether bulk-editing of cells to correct truncating mutations in ASXL1 could improve survival in xenograft models. It has been previously reported that complete removal of truncated ASXL1 in a subcloned KBM5 cell line delayed mortality in a mouse model (Valletta, S., et al. Oncotarget 2015 6:44061-71), but it was unclear whether a survival benefit would be seen if instead the xenograft were comprised of a more heterogenous bulk-edited population in which some residual fraction of cells retain mutant protein, or even if the KBM5 results were generalizable beyond that model. K562 cells treated with either scramble or ASXL1 p.
  • Primary hematopoietic cells are amenable to interallelic gene conversion
  • peripheral blood mononuclear cells were used that were obtained via leukapheresis of a patient with acute myeloid leukemia (AML).
  • This sample included an ASXL1 p.Q748X mutation with an allele fraction of 48%.
  • the IGC method presented here offers an alternative to exogenous-template HDR in a highly circumscribed context: when the purpose is to replace a SNV existing in a cell with its alternate allele.
  • the method offers several potential benefits: 1) higher theoretical efficiency due to needing successful delivery and action of only a single reagent (RNP) to a given cell rather than two reagents (RNP and template), 2) reduced experimental complexity, cost, and faster startup compared to those needing exogenous template, and 3) lack of artefactual insertions which may occur with exogenous templates (Boel, A., et al. Dis. Model. Mech. 2018 11:dmm035352).
  • IGC does not carry the risk of editing nearby bases as base editors do, and, for another, it is a much more compact system to deliver to cells than prime editors, which require both Cas and reverse transcriptase proteins (Anzalone, A.V., et al. Nat. Biotechnol. 2020 38:824-844).
  • IGC is an experimentally straightforward new approach that complements existing CRISPR-based methods and which may have particular utility in specific scenarios, primarily when the object is to restore a disease-associated SNV to its wild-type form.
  • interallelic gene conversion is a simple CRISPR-based approach to revert a specific SNV allele to its counterpart base in heterozygous cells. This can find broad application in basic science studies as well as in translational applications aimed at reducing disease burden from deleterious SNVs.
  • K562 cells (ATCC, Manassas, VA, USA) were cultured in RPMI 1640 (Corning, Corning, NY) supplemented with 20% fetal bovine serum (FBS; R&D Systems, Minneapolis, MN, USA) and 1% penicillin/streptomycin (P/S; Thermo Fisher Scientific, Waltham, MA, USA).
  • OCI-AML3 (DSMZ, Germany) cells were cultured in MEM α (Thermo Fisher Scientific) with 20% FBS and 1% P/S.
  • OCI-AML5 (DSMZ) cells were cultured in MEM α supplemented with 20% heat-inactivated FBS, 1% P/S, and 10 ng/mL of GM-CSF (PeproTech, Rocky Hill, NJ, USA).
  • THP-1 cells (ATCC) were cultured in RPMI 1640 with 10% FBS, 1% P/S, and 0.05 mM β-mercaptoethanol (MilliporeSigma, Burlington, MA, USA).
  • SET-2 (DSMZ) cells were cultured in RPMI 1640 with 20% FBS and 1% P/S.
  • Leukapheresis samples were cultured in RPMI 1640 with 5% FBS, 0.1 mM β-mercaptoethanol, 10 ng/mL IL-1β (PeproTech), 10 ng/mL IL-3 (PeproTech), and 10 ng/mL GM-CSF.
  • Next-generation sequencing was performed at the VUMC VANTAGE core using the VUMC Clonal Hematopoiesis Sequencing Assay v2.0, a custom capture protocol which covers 24 frequently mutated CH genes plus several dozen germline SNPs; this assay covers the full exonic sequences of all of the genes examined in this study.
  • Samples were sequenced to a depth of between 0.5 × 10^6 and 1 × 10^6 reads per sample on a NovaSeq sequencer (PE150).
  • the DRAGEN Somatic pipeline (v3.10.4, Illumina) was used to validate SNVs and generate BAM files mapped to GRCh38/hg38 that were then analyzed in R (v4.1.1) with Rsamtools (v2.10.0).
  • Prior to final quantification of allele frequency and indels at the targeted SNV, reads were filtered to include only those with unique mapping (MAPQ > 60) and coverage of both the SNV and at least four bases to either side of the predicted cut site 3 bp upstream of the PAM, in order to reduce the likelihood of underestimating indel proportions. Data were then reviewed manually in IGV (v2.14.0, Broad Institute). Allele fractions for non-targeted heterozygous SNPs were derived from IGV coverage statistics for reads with MAPQ > 60. Ideograms were generated with RIdeogram (v0.2.2) mapping to GRCh38/hg38. Animal models
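The read filter described above can be sketched on plain records. The analysis itself was done in R with Rsamtools; the Python below is only an illustration, and the `Read` record, the function names, and the coordinates are hypothetical. The mapping-quality cutoff is passed as a parameter and compared inclusively, which is an assumption about how the stated threshold is applied. Indels and soft clipping are ignored.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Read:
    start: int     # 0-based inclusive reference start
    end: int       # 0-based exclusive reference end
    mapq: int      # mapping quality
    snv_base: str  # base called at the SNV position


def passes_filter(read: Read, snv_pos: int, cut_site: int,
                  min_mapq: int = 60, flank: int = 4) -> bool:
    """Keep reads that map well and span both the SNV and the cut-site flanks.

    cut_site is the index of the first base 3' of the predicted break, so the
    required window is [cut_site - flank, cut_site + flank).
    """
    covers_snv = read.start <= snv_pos < read.end
    covers_cut = read.start <= cut_site - flank and cut_site + flank <= read.end
    return read.mapq >= min_mapq and covers_snv and covers_cut


def allele_fractions(reads, snv_pos: int, cut_site: int, **kwargs) -> dict:
    """Fraction of each base at the SNV among reads that pass the filter."""
    kept = [r for r in reads if passes_filter(r, snv_pos, cut_site, **kwargs)]
    counts = Counter(r.snv_base for r in kept)
    total = sum(counts.values())
    return {base: n / total for base, n in counts.items()} if total else {}


reads = [
    Read(50, 200, 60, "C"),
    Read(50, 200, 60, "A"),
    Read(50, 200, 60, "C"),
    Read(50, 200, 20, "A"),   # dropped: low mapping quality
    Read(50, 108, 60, "A"),   # dropped: ends before the 3' flank of the cut
    Read(101, 200, 60, "A"),  # dropped: starts after the SNV
]
fractions = allele_fractions(reads, snv_pos=100, cut_site=106)
```

Requiring coverage on both sides of the cut site means a read that stops at the break cannot be counted as unedited, which is the stated reason for the filter.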
  • Protein lysates were extracted from 1-5 million cells using RIPA buffer (Thermo Fisher Scientific) supplemented with complete Protease Inhibitor Cocktail (MilliporeSigma) and phosphatase inhibitor cocktail set V (Calbiochem, San Diego, CA, USA). Lysates were mixed 1:1 with 2X Laemmli Buffer (Bio-Rad, Hercules, CA, USA) supplemented with β-mercaptoethanol and denatured at 95°C for 10 minutes. Samples were run on 4-20% TGX gels (Bio-Rad) at 100V for 5 min, then 80V for 2 additional hours.
  • Transfer to PVDF membrane (MilliporeSigma) was performed at 100V for 60 min on ice using transfer buffer prepared in house (20% v/v methanol, 0.3% w/v Tris, 1.44% w/v glycine). After overnight blocking in 5% milk in TBST, membranes were probed with primary antibody for 60 min at room temperature, followed by 3xTBST washes, then probed with goat anti-rabbit secondary antibody at 1:5000 for 45 minutes at room temperature, followed by 3xTBST washes.
  • Chemiluminescent detection using autoradiography film was performed using SuperSignal West Dura (Thermo Fisher Scientific) for ASXL1 and SuperSignal West Pico PLUS (Thermo Fisher Scientific) for β-actin.
  • the primary antibodies used in this study were polyclonal anti-ASXL1 raised in rabbit (PA5-68360, Thermo Fisher Scientific) and polyclonal anti-β-actin raised in rabbit (A2066, MilliporeSigma).
  • cells were plated at an initial density of 100,000 cells/mL and cultured for 72 hours. Counting of cells stained with trypan blue 0.4% (Thermo Fisher Scientific) was performed every 24 hours using a Countess II FL automated cell counter (Thermo Fisher Scientific). For cell division quantification, cells were incubated in CellTrace Violet (Thermo Fisher Scientific) in PBS for 20 minutes at room temperature in the dark, followed by three washes in PBS. Immediately following, an aliquot of cells was fixed in 4% paraformaldehyde, while the remaining cells were cultured for 72 hours followed by paraformaldehyde fixation.
  • 0.1 mL of cell suspension was incubated with human APC-conjugated Ki-67 antibody (Cat. 350514, BioLegend, San Diego, CA, USA) at room temperature in the dark for 30 minutes. Cells were washed twice and resuspended in 0.2 mL BD Perm/Wash Buffer. Samples were incubated with 5 µL Propidium Iodide Staining Solution (Cat. 556463, BD Biosciences) and analyzed on a 5-laser BD LSR II (BD Biosciences). Three independent experiments were performed for each assay. FACS data were analyzed using FlowJo software (v10).
  • RNAseq data used in this study are available in GEO under GSE212730.
  • the cell line DNAseq data used in this study are available under NCBI SRA accession number PRJNA880841.
  • the human subject DNAseq data are available upon reasonable request.

Abstract

CRISPR-Cas9 is a useful tool for creating precise genetic knock-in alterations through homology-directed repair (HDR), although all current methods rely on provision of an exogenous repair template. Disclosed herein are compositions, systems, and methods for repairing heterozygous single nucleotide variants (SNVs) using the cell's own wild-type allele rather than an exogenous template. This technique can reduce cost and complexity for experiments modeling phenotypic consequences of SNVs. Furthermore, because it only requires Cas enzyme and gRNA, interallelic gene conversion (IGC) has unique potential to move toward therapeutic use more rapidly than HDR approaches that require template delivery.

Description

REPAIR OF DISEASE-ASSOCIATED SINGLE NUCLEOTIDE VARIANTS VIA INTERALLELIC GENE CONVERSION
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims benefit of U.S. Provisional Application No. 63/383,776, filed November 15, 2022, which is hereby incorporated herein by reference in its entirety.
SEQUENCE LISTING
This application contains a sequence listing filed in ST.26 format entitled “222230_2220_Sequence_Listing” created on November 14, 2023, having 213,130 bytes. The content of the sequence listing is incorporated herein in its entirety.
BACKGROUND
A major aim of anti-cancer therapies is to target cancer cells effectively while sparing healthy tissue. Somatic mutations that arise in neoplastic clones constitute one class of salient distinguishing features with the potential to be targeted (Hanahan, D. and Weinberg, R.A. Cell 2000 100:57-70). Moreover, there are many specific mutations which are known to contribute to disease pathogenesis or disease severity. Single nucleotide variants (SNVs) are highly represented in sequenced cancer genomes (Martincorena, I. and Campbell, P.J. Science 2015 349:1483-1489). Yet, because each SNV represents only a slight deviation from the wild-type sequence, there are inherent challenges with ensuring proposed therapeutic modalities maintain a high degree of specificity (Rabinowitz, R. and Offen, D. Mol. Ther. 2021 29:937-948). On the other hand, the small footprint of SNV lesions makes them attractive targets for genetic manipulation via homology-directed repair (HDR), as the success rate of HDR has an inverse relationship with the size of the alteration to be made (Li, K., et al. PLOS ONE 2014 9:e105779). While HDR is traditionally carried out by providing an exogenous repair template, it should also be possible to use HDR to replace a heterozygous SNV if a double-stranded break in DNA can be induced on the mutant allele while leaving the wild-type allele intact to act as a template. To this end, sporadic cell-intrinsic HDR following the targeting of a heterozygous SNV by CRISPR knockout (CRISPR KO) has been observed in zygotes (Yoshimi, K., et al. Nat. Commun. 2014 5:4240; Wu, Y., et al. Cell Stem Cell 2013 13:659-662; Ma, H., et al. Nature 2017 548:413-419). Many cancer-associated SNVs are heterozygous, but so far there exist no studies examining whether interallelic gene conversion (IGC) aimed at missense/nonsense SNVs could intentionally be used to reliably repair deleterious point mutations in somatic human cells.
SUMMARY
CRISPR-Cas9 is a useful tool for creating precise genetic knock-in alterations through homology-directed repair (HDR), although all current methods rely on provision of an exogenous repair template. Disclosed herein are compositions, systems, and methods for repairing heterozygous single nucleotide variants (SNVs) using the cell’s own wild-type allele rather than an exogenous template. This technique can reduce cost and complexity for experiments modeling phenotypic consequences of SNVs. Furthermore, because it only requires Cas enzyme and gRNA, interallelic gene conversion (IGC) has unique potential to move toward therapeutic use more rapidly than HDR approaches that require template delivery.
Disclosed herein is a composition or system for repairing heterozygous single nucleotide variants (SNVs) in a cell by CRISPR-mediated interallelic gene conversion (IGC), the composition involving an isolated nucleic acid encoding a Clustered Regularly Interspaced Short Palindromic Repeat (CRISPR)-associated endonuclease; and an isolated nucleic acid sequence encoding an SNV-specific guide RNA (gRNA) 17 to 24 nucleotides in length complementary to a target gene sequence comprising the SNV, wherein the composition does not comprise an exogenous donor repair template.
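The two constraints stated for the guide, a length of 17 to 24 nucleotides and a match to target sequence that includes the SNV, can be checked with a short validator. This is a sketch under assumptions, not part of the disclosure: the function names are hypothetical, the sequences are invented, matching is exact, and both strands of the supplied target are searched because strand orientation is not fixed here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def spacer_covers_snv(spacer: str, target: str, snv_index: int,
                      min_len: int = 17, max_len: int = 24) -> bool:
    """True if the spacer is 17-24 nt and matches target sequence spanning the SNV.

    `spacer` may be written as RNA (U) or DNA (T). `target` is one strand of
    the SNV-bearing allele and snv_index is a 0-based position on that strand.
    """
    dna = spacer.upper().replace("U", "T")
    if not min_len <= len(dna) <= max_len or set(dna) - set("ACGT"):
        return False
    for query in (dna, reverse_complement(dna)):
        pos = target.find(query)
        while pos != -1:
            if pos <= snv_index < pos + len(query):
                return True
            pos = target.find(query, pos + 1)
    return False


# Illustrative mutant allele with the variant base (A) at index 17.
mutant = "GATTACAGCTT" + "GACCTAAGATCCGTTAGCAA" + "TGG" + "CTAAC"

mutant_guide_ok = spacer_covers_snv("GACCUAAGAUCCGUUAGCAA", mutant, 17)
wild_type_guide_ok = spacer_covers_snv("GACCUACGAUCCGUUAGCAA", mutant, 17)
short_guide_ok = spacer_covers_snv("GACCUAAG", mutant, 17)
```

A guide written against the wild-type base fails the check on the mutant allele, which is the sense in which the gRNA is SNV-specific.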
In some embodiments, the target gene can be any gene in a cell that is not dividing, wherein the gene is targetable by a CRISPR-associated endonuclease (Cas9) (e.g., has a PAM sequence).
In some embodiments, the target gene is selected from the group consisting of ASXL1, DNMT3A, GNAS, GNB1, IDH1, IDH2, KIT, NRAS, PPM1D, SF3B1, SRSF2, TET2, TP53, and U2AF1. Examples of gene SNVs and corresponding gRNAs are provided in Table 1.
The composition or system of claim 1 , wherein the SNV is a Y591 , Q733, or L775 mutation in asx/1, such as a Y591X, Q733X, or L775X mutation in ASXL1.
The composition or system of claim 1 , wherein the SNV is a R326, R635, V657, R729, Y735, R736, R749, F755, R771 , I780, R882, W860, or P904 mutation in DNMT3A, such as a R326C, R635Q, R635W, V657M, R729W, Y735C, R736C, R736H, R749C, F755S, R771X, I780T, R882C, R882H, W860X, or P904L mutation in DNMT3A. The composition or system of claim 1 , wherein the SNV is a R201 mutation in gnas, such as a R201 H mutation in GNAS.
The composition or system of claim 1, wherein the SNV is a K57 mutation in GNB1, such as a K57E mutation in GNB1.
The composition or system of claim 1, wherein the SNV is a R132 mutation in IDH1, such as a R132H or R132C mutation in IDH1.
The composition or system of claim 1, wherein the SNV is a R140 mutation in IDH2, such as a R140Q or R140L mutation in IDH2.
The composition or system of claim 1, wherein the SNV is a D816 mutation in KIT, such as a D816V mutation in KIT.
The composition or system of claim 1, wherein the SNV is a G12 mutation in NRAS, such as a G12D mutation in NRAS.
The composition or system of claim 1, wherein the SNV is a R552 mutation in PPM1D, such as a R552X mutation in PPM1D.
The composition or system of claim 1, wherein the SNV is a R387, H662, T663, K700, G740, or A744 mutation in SF3B1, such as a R387W, H662D, H662Q, T663I, K700E, G740E, G740R, or A744P mutation in SF3B1.
The composition or system of claim 1, wherein the SNV is a P95 mutation in SRSF2, such as a P95H, P95L, or P95R mutation in SRSF2.
The composition or system of claim 1, wherein the SNV is a R544, Q803, Q916, C1135, Q1191, R1216, R1261, R1359, R1465, S1486, R1516, or I1873 mutation in TET2, such as a R544X, Q803X, Q916X, C1135Y, Q1191X, R1216Q, R1216X, R1261C, R1261H, R1359C, R1359H, R1465X, S1486X, R1516X, or I1873T mutation in TET2.
The composition or system of claim 1, wherein the SNV is a G108, R110, P177, H179, Y220, M237, C238, C242, M246, R248, R273, R306, or R342 mutation in TP53, such as a G108S, R110L, P177R, H179Y, Y220C, M237I, C238Y, C242Y, M246V, R248Q, R273H, R306X, or R342X mutation in TP53.
The composition or system of claim 1, wherein the SNV is a S34 mutation in U2AF1, such as a S34F mutation in U2AF1.
The composition or system of claim 1, wherein the SNV is selected from the group consisting of rs377577594, rs147001633, rs200018028, rs121913237, rs371369583, rs371369583, rs387907078, rs149095705, rs147828672, rs144689354, rs370751539, rs779626155, rs200018028, rs761934754, rs751562376, rs747448117, rs751713049, rs141326438, rs559063155, rs121913495, rs779070661, rs121913500, rs121913499, rs121913502, rs121913502, rs121913507, rs371769427, rs752263134, rs28934576, rs730882005, rs11540652, rs121912666, rs751477326, rs483352695, rs121912655, rs587782664, rs587780070, rs11540654, rs587782461, rs730882029, rs121913344, rs778467242, rs1239341681, rs776846119, rs745511585, rs1009194427, rs116519313, rs370735654, rs780710758, rs771761785, rs1235228377, rs769422572, rs1729381211, rs562667223, rs898441677, rs759658003, rs775677220, rs1376289450, rs1440692352, and rs368508787.
The composition or system of claim 1, wherein the gRNA comprises a polynucleotide selected from the group consisting of SEQ ID NO: 1-216.
The composition or system of any one of claims 1 to 18, wherein the CRISPR-associated endonuclease is a high fidelity Cas9.
A method for repairing heterozygous single nucleotide variants (SNVs) in a subject in need thereof, the method comprising administering to the subject a therapeutically effective amount of the composition or system of any one of claims 1 to 20.
The details of one or more embodiments of the invention are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of the invention will be apparent from the description and drawings, and from the claims.
DESCRIPTION OF DRAWINGS
FIGs. 1A to 1J show interallelic gene conversion can increase wild-type allele fraction in hematopoietic cell lines. FIG. 1A shows the allele-specific gRNA sequences designed against DNMT3A p.R882C (wt: SEQ ID NO:223, Mut: SEQ ID NO:224, gRNA: SEQ ID NO:1), DNMT3A p.R882H (wt: SEQ ID NO:225, Mut: SEQ ID NO:226, gRNA: SEQ ID NO:4), NRAS p.G12D (wt: SEQ ID NO:227, Mut: SEQ ID NO:228, gRNA: SEQ ID NO:9), ASXL1 p.Y591X (c.C1773G) (wt: SEQ ID NO:229, Mut: SEQ ID NO:230, gRNA: SEQ ID NO:12), and ASXL1 p.Y591X (c.C1773A) (wt: SEQ ID NO:231, Mut: SEQ ID NO:232, gRNA: SEQ ID NO:15). FIG. 1B shows that cells were electroporated with ribonucleoprotein containing SNV-specific gRNA, then DNA was collected and compared to cells receiving scramble gRNA as a control. FIG. 1C shows a Sanger trace (SEQ ID NO:233) showing an increase in the wild-type allele fraction and a decrease in the mutant allele fraction of DNMT3A p.R882C in OCI-AML3 cells. FIGs. 1D to 1H show the wild-type (WT) allelic fraction following control treatment (red), treatment with targeted gRNA plus homology-directed repair (HDR) enhancer, or gRNA without HDR enhancer for the DNMT3A p.R882C locus in OCI-AML3 cells (FIG. 1D), the DNMT3A p.R882H locus in SET-2 cells (FIG. 1E), the NRAS p.G12D locus in THP-1 cells (FIG. 1F), the ASXL1 p.Y591X (c.C1773G) locus in OCI-AML5 cells (FIG. 1G), and the ASXL1 p.Y591X (c.C1773A) locus in K562 cells (FIG. 1H). FIGs. 1I and 1J show total indels in samples receiving mutant-targeting gRNA plus HDR enhancer or gRNA without HDR enhancer in OCI-AML3 cells (FIG. 1I) and SET-2 cells (FIG. 1J). Bar plots and error bars represent mean and standard deviation.
FIGs. 2A to 2H show reversion of ASXL1 p.Y591X results in transcriptional downregulation of pro-growth pathways. FIG. 2A shows a Western blot of scramble gRNA or Y591X-gRNA treated K562, showing full-length and truncated ASXL1 and β-actin. FIGs. 2B and 2C show protein quantification for full-length ASXL1 (FIG. 2B) and truncated ASXL1 (FIG. 2C) (N = 3 experimental replicates). FIG. 2D is a volcano plot showing RNAseq results of differentially expressed genes (FDR = 0.05) in Y591X-treated vs. scramble-treated conditions. FIG. 2E shows the most significant enriched and de-enriched Hallmark pathways in preranked gene set enrichment analysis. FIG. 2F shows fold expansion (mean +/- SEM) of cell count over 72 hours for K562 treated with scramble or mutant-targeting gRNA (N = 3 experimental replicates). FIG. 2G shows cell divisions (mean +/- SD) as measured by CellTrace Violet fluorescence of K562 treated with scramble or mutant-targeting gRNA after 72 hours of growth (N = 3 experimental replicates). FIG. 2H shows the proportion of cells in G0, G1, and S/G2/M (mean +/- SD) as quantified by Ki-67 and propidium iodide fluorescence of K562 treated with scramble or mutant-targeting gRNA (N = 3 experimental replicates).
FIGs. 3A to 3C show bulk interallelic gene conversion is sufficient to prolong survival in cell-line derived mouse xenograft model. FIG. 3A is a schematic of xenograft model. FIGs. 3B and 3C show measured weights (mean +/- SD) (FIG. 3B) and survival curves (FIG. 3C) of NSGS mice receiving human K562 AML cells treated with either scramble gRNA or mutant-targeted gRNA. Results based on two independent experiments (N = 3 and 5 per condition). Significance was determined using a Cox proportional hazards model.
FIGs. 4A to 4D show interallelic gene conversion works in primary cells. FIG. 4A shows the sequence of the ASXL1 p.Q748X targeted gRNA (wt: SEQ ID NO:234, Mut: SEQ ID NO:235, gRNA: SEQ ID NO:236). FIG. 4B shows the wild-type (WT) allele fraction of scramble- or targeted-gRNA treated patient sample. Bar plots and error bars represent mean and standard deviation. FIG. 4C shows the number of reads containing indels for each of the experimental replicates. FIG. 4D shows mean allele frequency (AF) for the reference allele (GRCh38/hg38) at heterozygous SNPs (purple) on the chromosome harboring the CRISPR-targeted SNV (orange). In FIG. 4D: not significant (ns), p-value < 0.05 (*), p-value < 0.01 (**), and p-value < 0.001 (***).
FIG. 5A shows representative Sanger traces of DNMT3A p.R882H in SET-2 cells (SEQ ID NO:237), NRAS p.G12D in THP-1 cells (SEQ ID NO:238), ASXL1 p.Y591X in OCI-AML5 cells (SEQ ID NO:239), and ASXL1 p.Y591X in K562 cells (SEQ ID NO:240). FIG. 5B shows the proportion of all wild-type, mutant, or unphased reads (empty bars) and the proportion of indel-containing wild-type, mutant, or unphased reads (shaded bars) for OCI-AML3, SET-2, THP-1, OCI-AML5, and K562 cells. Bar plots and error bars represent mean and standard deviation. FIG. 5C shows an Integrative Genomics Viewer locus plot depicting indels in down-sampled NGS reads at the DNMT3A p.R882C locus (SEQ ID NO:241) following OCI-AML3 treatment with mutant-specific gRNA without homology-directed repair (HDR) enhancer.
FIG. 6A shows a Western blot of scramble gRNA or Y591X-gRNA treated OCI-AML5, showing full-length and truncated ASXL1 and β-actin. FIGs. 6B and 6C show protein quantification for full-length ASXL1 (FIG. 6B) and truncated ASXL1 (FIG. 6C) (N = 3 experimental replicates). FIGs. 6D and 6E show a volcano plot (FDR = 0.05) and the top enriched/de-enriched GSEA Hallmark gene sets of DESeq2 results of ASXL1-treated vs. control-treated OCI-AML5. FIGs. 6F and 6G show a volcano plot (FDR = 0.05) and the top enriched/de-enriched GSEA Hallmark gene sets of differential expression for repeated measures (DREAM) results of ASXL1-treated vs. control-treated K562 and OCI-AML5 cells where [expression ~ treatment + (1 | cell line)]. FIG. 6H shows DLK1 expression (RPKM) among ASXL1-mutant (N = 29) and non-ASXL1-mutant (N = 376) samples in the BEAT AML cohort. Significance determined by one-tailed Wilcoxon rank sum test. FIG. 6I shows the ratio of control- or ASXL1-treated K562 cells in G1 phase to S/G2/M phases. Bar plots and error bars represent mean and standard deviation.
FIGs. 7A and 7B show the FACS gating strategy for the CellTrace assay, showing a sample fixed at time = 0 (FIG. 7A) and fixed at time = 72 hours (FIG. 7B). FIG. 7C shows the FACS gating strategy for the Ki-67 cell cycle assay.
FIG. 8A shows weights (mean +/- SD) of OCI-AML5 mice (N = 5) starting on day of transplantation. FIG. 8B shows survival curves of mice receiving K562 (solid) and OCI-AML5 (dashed) cells treated with scramble or mutant-targeting gRNA. Significance determined by Cox proportional hazards model stratified by cell line.
FIGs. 9A to 9E show mean allele frequency (AF) for the reference allele (GRCh38/hg38) at heterozygous SNPs on the chromosome harboring the CRISPR-targeted SNV (orange) for OCI-AML3 (FIG. 9A), SET-2 (FIG. 9B), THP-1 (FIG. 9C), OCI-AML5 (FIG. 9D), and K562 (FIG. 9E). For FIG. 9A and FIG. 9B, treated samples include all samples irrespective of the addition of HDR enhancer. In FIG. 9B, sequencing data revealed a haplotype spanning three SNPs (from Chr2:227,032,170 to Chr2:227,032,360), so for significance testing, the average AF of these three SNPs was taken as the haplotype AF for each sample. Significance determined by Welch’s t-test; not significant at p = 0.05 (ns) or p-value < 0.05 (*), p-value < 0.01 (**), and p-value < 0.001 (***).
FIG. 10A depicts animal weights for an OCI-AML5 transplant model. FIG. 10B depicts survival curves for K562 and OCI-AML5 transplant models, with combined treatment effect determined by Cox proportional hazards model stratified by cell type.
DETAILED DESCRIPTION
Before the present disclosure is described in greater detail, it is to be understood that this disclosure is not limited to particular embodiments described, and as such may, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting, since the scope of the present disclosure will be limited only by the appended claims.
Where a range of values is provided, it is understood that each intervening value, to the tenth of the unit of the lower limit unless the context clearly dictates otherwise, between the upper and lower limit of that range and any other stated or intervening value in that stated range, is encompassed within the disclosure. The upper and lower limits of these smaller ranges may independently be included in the smaller ranges and are also encompassed within the disclosure, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the disclosure.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. Although any methods and materials similar or equivalent to those described herein can also be used in the practice or testing of the present disclosure, the preferred methods and materials are now described.
All publications and patents cited in this specification are herein incorporated by reference as if each individual publication or patent were specifically and individually indicated to be incorporated by reference and are incorporated herein by reference to disclose and describe the methods and/or materials in connection with which the publications are cited. The citation of any publication is for its disclosure prior to the filing date and should not be construed as an admission that the present disclosure is not entitled to antedate such publication by virtue of prior disclosure. Further, the dates of publication provided could be different from the actual publication dates that may need to be independently confirmed.
As will be apparent to those of skill in the art upon reading this disclosure, each of the individual embodiments described and illustrated herein has discrete components and features which may be readily separated from or combined with the features of any of the other several embodiments without departing from the scope or spirit of the present disclosure. Any recited method can be carried out in the order of events recited or in any other order that is logically possible.
Embodiments of the present disclosure will employ, unless otherwise indicated, techniques of chemistry, biology, and the like, which are within the skill of the art.
The following examples are put forth so as to provide those of ordinary skill in the art with a complete disclosure and description of how to perform the methods and use the probes disclosed and claimed herein. Efforts have been made to ensure accuracy with respect to numbers (e.g., amounts, temperature, etc.), but some errors and deviations should be accounted for. Unless indicated otherwise, parts are parts by weight, temperature is in °C, and pressure is at or near atmospheric. Standard temperature and pressure are defined as 20 °C and 1 atmosphere.
Before the embodiments of the present disclosure are described in detail, it is to be understood that, unless otherwise indicated, the present disclosure is not limited to particular materials, reagents, reaction materials, manufacturing processes, or the like, as such can vary. It is also to be understood that the terminology used herein is for purposes of describing particular embodiments only, and is not intended to be limiting. It is also possible in the present disclosure that steps can be executed in different sequence where this is logically possible. It must be noted that, as used in the specification and the appended claims, the singular forms “a,” “an,” and “the” include plural referents unless the context clearly dictates otherwise.
Guide RNA
Disclosed herein are “guide RNA” (gRNA) molecules for use in the disclosed compositions, systems, and methods. A gRNA, as used herein, refers to a nucleic acid that promotes the specific targeting or homing of a gRNA molecule/Cas9 molecule complex to a target nucleic acid. gRNA molecules can be unimolecular (having a single RNA molecule), sometimes referred to herein as “chimeric” gRNAs, or modular (comprising more than one, and typically two, separate RNA molecules). gRNA are a synthetic fusion of the endogenous bacterial crRNA and tracrRNA. gRNA provide both targeting specificity and scaffolding/binding ability for Cas9 nuclease. They do not exist in nature. gRNA are sometimes referred to as “single guide RNA” or “sgRNA”. A gRNA molecule comprises a number of domains, which are described in more detail below.
In some embodiments, the gRNA comprises, preferably from 5' to 3': a targeting domain (which is complementary to a target nucleic acid); a first complementarity domain; a linking domain; a second complementarity domain (which is complementary to the first complementarity domain); a proximal domain; and optionally, a tail domain.
In some embodiments, the targeting domain comprises a nucleotide sequence that is complementary, e.g., at least 80%, 85%, 90%, 95%, or 100% complementary, e.g., fully complementary, to the target sequence on the target nucleic acid. The targeting domain is part of an RNA molecule and will therefore comprise the base uracil (U), while any DNA encoding the gRNA molecule will comprise the base thymine (T). While not wishing to be bound by theory, it is believed that the complementarity of the targeting domain with the target sequence contributes to specificity of the interaction of the gRNA molecule/Cas9 molecule complex with a target nucleic acid. It is understood that in a targeting domain and target sequence pair, the uracil bases in the targeting domain will pair with the adenine bases in the target sequence. In some embodiments, the targeting domain itself comprises, in the 5' to 3' direction, an optional secondary domain, and a core domain. In some embodiments, the core domain is fully complementary with the target sequence. In some embodiments, the targeting domain is 5 to 50, 10 to 40, e.g., 10 to 30, e.g., 15 to 30, e.g., 15 to 25 nucleotides in length. In some embodiments, the targeting domain is 15, 16, 17, 18, 19, 20, 21, 22, 23, 24 or 25 nucleotides in length. The strand of the target nucleic acid with which the targeting domain is complementary is referred to herein as the complementary strand. Some or all of the nucleotides of the domain can have a modification.
The first complementarity domain is complementary with the second complementarity domain, and, in some embodiments, has sufficient complementarity to the second complementarity domain to form a duplexed region under at least some physiological conditions. In some embodiments, the first complementarity domain is 5 to 30 nucleotides in length. In some embodiments, the first complementarity domain is 5 to 25 nucleotides in length. In some embodiments, the first complementarity domain is 7 to 25 nucleotides in length. In some embodiments, the first complementarity domain is 7 to 22 nucleotides in length. In some embodiments, the first complementarity domain is 7 to 18 nucleotides in length. In some embodiments, the first complementarity domain is 7 to 15 nucleotides in length. In some embodiments, the first complementarity domain is 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 nucleotides in length.
In some embodiments, the first complementarity domain comprises 3 subdomains, which, in the 5' to 3' direction are: a 5' subdomain, a central subdomain, and a 3' subdomain. In some embodiments, the 5' subdomain is 4 to 9, e.g., 4, 5, 6, 7, 8 or 9 nucleotides in length. In some embodiments, the central subdomain is 1, 2, or 3, e.g., 1, nucleotide in length. In some embodiments, the 3' subdomain is 3 to 25, e.g., 4 to 22, 4 to 18, or 4 to 10, or 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25, nucleotides in length.
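By way of non-limiting illustration, the relationship between a DNA target sequence, the RNA targeting domain, and percent complementarity described above can be sketched as follows. The function names and the 7-nucleotide example sequence are hypothetical and chosen only for illustration; the sketch assumes unmodified A/C/G/T(U) bases and equal-length sequences, and is not the method used to design the gRNAs disclosed herein.

```python
# Illustrative sketch only; names and example sequences are hypothetical.

# Watson-Crick partner of each DNA base, used to build the opposite strand.
DNA_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
# RNA base that pairs with each DNA base (uracil pairs with adenine).
RNA_PARTNER = {"A": "U", "C": "G", "G": "C", "T": "A"}


def targeting_domain_rna(protospacer_dna: str) -> str:
    """Return the RNA targeting domain for a protospacer written 5'->3'.

    The protospacer is read on the non-complementary strand, so the
    targeting domain has the same sequence with T replaced by U.
    """
    return protospacer_dna.upper().replace("T", "U")


def complementary_strand(protospacer_dna: str) -> str:
    """Return the complementary DNA strand, written 5'->3'."""
    return "".join(DNA_COMPLEMENT[b] for b in reversed(protospacer_dna.upper()))


def percent_complementarity(targeting_rna: str, complementary_dna: str) -> float:
    """Percent of targeting-domain positions that pair with the DNA strand.

    Both sequences are written 5'->3' and pair antiparallel, so the RNA is
    compared against the DNA strand read in reverse.
    """
    if len(targeting_rna) != len(complementary_dna):
        raise ValueError("sequences must have equal length")
    paired = sum(
        RNA_PARTNER[dna_base] == rna_base
        for rna_base, dna_base in zip(targeting_rna, reversed(complementary_dna))
    )
    return 100.0 * paired / len(targeting_rna)
```

For the hypothetical sequence GATTACA, `targeting_domain_rna` returns GAUUACA, and that RNA is fully complementary to the strand returned by `complementary_strand`; a single mismatched base lowers the value below the 100% of a fully complementary targeting domain.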
The first complementarity domain can share homology with, or be derived from, a naturally occurring first complementarity domain. In some embodiments, it has at least 50% homology with a first complementarity domain disclosed herein, e.g., a Streptococcus pyogenes (S. pyogenes) or Streptococcus thermophilus (S. thermophilus) first complementarity domain.
A linking domain serves to link the first complementarity domain with the second complementarity domain of a unimolecular gRNA. The linking domain can link the first and second complementarity domains covalently or non-covalently. In some embodiments, the linkage is covalent. In some embodiments, the linking domain covalently couples the first and second complementarity domains. In some embodiments, the linking domain is, or comprises, a covalent bond interposed between the first complementarity domain and the second complementarity domain. Typically, the linking domain comprises one or more, e.g., 2, 3, 4, 5, 6, 7, 8, 9, or 10 nucleotides. In modular gRNA molecules the two molecules can be associated by virtue of the hybridization of the complementarity domains.
A wide variety of linking domains are suitable for use in unimolecular gRNA molecules. Linking domains can consist of a covalent bond, or be as short as one or a few nucleotides, e.g., 1, 2, 3, 4, or 5 nucleotides in length.
In some embodiments, a linking domain is 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, or 25 or more nucleotides in length. In some embodiments, a linking domain is 2 to 50, 2 to 40, 2 to 30, 2 to 20, 2 to 10, or 2 to 5 nucleotides in length. In some embodiments, a linking domain shares homology with, or is derived from, a naturally occurring sequence, e.g., the sequence of a tracrRNA that is 5' to the second complementarity domain. In some embodiments, the linking domain has at least 50% homology with a linking domain disclosed herein.
In some embodiments, a modular gRNA can comprise additional sequence, 5' to the second complementarity domain, referred to herein as the 5' extension domain. In some embodiments, the 5' extension domain is 2-10, 2-9, 2-8, 2-7, 2-6, 2-5, or 2-4 nucleotides in length. In some embodiments, the 5' extension domain is 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more nucleotides in length.
The second complementarity domain is complementary with the first complementarity domain, and, in some embodiments, has sufficient complementarity to the first complementarity domain to form a duplexed region under at least some physiological conditions. In some embodiments, the second complementarity domain can include sequence that lacks complementarity with the first complementarity domain, e.g., sequence that loops out from the duplexed region.
In some embodiments, the second complementarity domain is 5 to 27 nucleotides in length. In some embodiments, it is longer than the first complementarity region.
In some embodiments, the second complementarity domain is 7 to 27 nucleotides in length. In some embodiments, the second complementarity domain is 7 to 25 nucleotides in length. In some embodiments, the second complementarity domain is 7 to 20 nucleotides in length. In some embodiments, the second complementarity domain is 7 to 17 nucleotides in length. In some embodiments, the second complementarity domain is 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24 or 25 nucleotides in length.
In some embodiments, the second complementarity domain comprises 3 subdomains, which, in the 5' to 3' direction are: a 5' subdomain, a central subdomain, and a 3' subdomain. In some embodiments, the 5' subdomain is 3 to 25, e.g., 4 to 22, 4 to 18, or 4 to 10, or 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 nucleotides in length. In some embodiments, the central subdomain is 1, 2, 3, 4 or 5, e.g., 3, nucleotides in length. In some embodiments, the 3' subdomain is 4 to 9, e.g., 4, 5, 6, 7, 8 or 9 nucleotides in length.
In some embodiments, the 5' subdomain and the 3' subdomain of the first complementarity domain, are respectively, complementary, e.g., fully complementary, with the 3' subdomain and the 5' subdomain of the second complementarity domain.
The second complementarity domain can share homology with or be derived from a naturally occurring second complementarity domain. In some embodiments, it has at least 50% homology with a second complementarity domain disclosed herein, e.g., an S. pyogenes or S. thermophilus second complementarity domain.
Some or all of the nucleotides of the domain can have a modification.
In some embodiments, the proximal domain is 5 to 20 nucleotides in length. In some embodiments, the proximal domain can share homology with or be derived from a naturally occurring proximal domain. In some embodiments, it has at least 50% homology with a proximal domain disclosed herein, e.g., an S. pyogenes, or S. thermophilus, proximal domain.
A broad spectrum of tail domains are suitable for use in gRNA molecules. In some embodiments, the tail domain is 0 (absent), 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 nucleotides in length. In some embodiments, the tail domain nucleotides are from or share homology with sequence from the 5' end of a naturally occurring tail domain. In some embodiments, the tail domain includes sequences that are complementary to each other and which, under at least some physiological conditions, form a duplexed region.
In some embodiments, the tail domain is absent or is 1 to 50 nucleotides in length. In some embodiments, the tail domain can share homology with or be derived from a naturally occurring proximal tail domain. In some embodiments, it has at least 50% homology with a tail domain disclosed herein, e.g., an S. pyogenes, or S. thermophilus, tail domain.
In some embodiments, the tail domain includes nucleotides at the 3' end that are related to the method of in vitro or in vivo transcription. When a T7 promoter is used for in vitro transcription of the gRNA, these nucleotides may be any nucleotides present before the 3' end of the DNA template. When a U6 promoter is used for in vivo transcription, these nucleotides may be the sequence UUUUUU. When alternate pol-III promoters are used, these nucleotides may be various numbers of uracil bases or may include alternate bases.
Methods for Designing gRNAs
Methods for selection and validation of target sequences as well as off-target analyses are described, e.g., in Mali et al., 2013 SCIENCE 339(6121): 823-826; Hsu et al., 2013 NAT BIOTECHNOL, 31(9): 827-32; Fu et al., 2014 NAT BIOTECHNOL, doi: 10.1038/nbt.2808. PubMed PMID: 24463574; Heigwer et al., 2014 NAT METHODS 11(2):122-3. doi: 10.1038/nmeth.2812. PubMed PMID: 24481216; Bae et al., 2014 BIOINFORMATICS PubMed PMID: 24463181; Xiao A et al., 2014 BIOINFORMATICS PubMed PMID: 24389662.
For example, a software tool can be used to optimize the choice of sgRNA within a user's target sequence, e.g., to minimize total off-target activity across the genome. Off-target activity may be other than cleavage. For each possible gRNA choice, e.g., using S. pyogenes Cas9, the tool can identify all off-target sequences (e.g., preceding either NAG or NGG PAMs) across the genome that contain up to a certain number (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10) of mismatched base-pairs. The cleavage efficiency at each off-target sequence can be predicted, e.g., using an experimentally-derived weighting scheme. Each possible gRNA is then ranked according to its total predicted off-target cleavage; the top-ranked gRNAs represent those that are likely to have the greatest on-target and the least off-target cleavage. Other functions, e.g., automated reagent design for CRISPR construction, primer design for the on-target Surveyor assay, and primer design for high-throughput detection and quantification of off-target cleavage via next-generation sequencing, can also be included in the tool. Candidate gRNA molecules can be evaluated by art-known methods.
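By way of non-limiting illustration, the off-target search described above can be sketched as follows. This is a simplified sketch under stated assumptions, not any particular published tool: the function names are hypothetical, the "genome" is a short string supplied by the user, sites are required to precede an NGG or NAG PAM, perfect matches are treated as the on-target site, and candidate gRNAs are ranked by an unweighted count of off-target sites rather than by an experimentally-derived weighting scheme.

```python
# Illustrative sketch only; function names and scoring are assumptions.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def candidate_sites(sequence: str, guide_len: int = 20):
    """Yield (strand, start, site) for each site followed by an NGG or NAG PAM.

    Both strands are scanned; start is the offset on the scanned strand.
    """
    for strand, seq in (("+", sequence), ("-", reverse_complement(sequence))):
        for i in range(len(seq) - guide_len - 2):
            pam = seq[i + guide_len : i + guide_len + 3]
            if pam[1:] in ("GG", "AG"):
                yield strand, i, seq[i : i + guide_len]


def off_target_hits(guide: str, sequence: str, max_mismatches: int = 3):
    """List sites that differ from the guide by 1..max_mismatches bases."""
    hits = []
    for strand, start, site in candidate_sites(sequence, len(guide)):
        mm = mismatches(guide, site)
        if 0 < mm <= max_mismatches:  # mm == 0 is treated as the on-target site
            hits.append((strand, start, site, mm))
    return hits


def rank_guides(guides, sequence: str, max_mismatches: int = 3):
    """Order guides from fewest to most off-target sites (unweighted count)."""
    scored = [(len(off_target_hits(g, sequence, max_mismatches)), g) for g in guides]
    return [g for _, g in sorted(scored)]
```

A weighting scheme of the kind referenced above could replace the unweighted count in `rank_guides`, e.g., by penalizing PAM-proximal mismatches differently from PAM-distal ones; no particular weights are assumed here.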
Cas9 Molecules
Cas9 molecules of a variety of species can be used in the methods and compositions described herein. While the S. pyogenes and S. thermophilus Cas9 molecules are typically used, Cas9 molecules of, derived from, or based on the Cas9 proteins of other species can be used, e.g., Staphylococcus aureus or Neisseria meningitidis.
A Cas9 molecule, as that term is used herein, refers to a molecule that can interact with a sgRNA molecule and, in concert with the sgRNA molecule, localize (e.g., target or home) to a site which comprises a target domain and PAM sequence. In some embodiments, the Cas9 molecule is capable of cleaving a target nucleic acid molecule. Exemplary naturally occurring Cas9 molecules are described in Chylinski et al., RNA Biology 2013; 10:5, 727-737. Naturally occurring Cas9 molecules possess a number of properties, including: nickase activity; nuclease activity (e.g., endonuclease and/or exonuclease activity); helicase activity; the ability to associate functionally with a gRNA molecule; and the ability to target (or localize to) a site on a nucleic acid (e.g., PAM recognition and specificity). In some embodiments, a Cas9 molecule can include all or a subset of these properties. In typical embodiments, Cas9 molecules have the ability to interact with a gRNA molecule and, in concert with the gRNA molecule, localize to a site in a nucleic acid. Other activities, e.g., PAM specificity, cleavage activity, or helicase activity can vary more widely in Cas9 molecules.
Cas9 molecules with desired properties can be made in a number of ways, e.g., by alteration of a parental, e.g., naturally occurring, Cas9 molecule to provide an altered Cas9 molecule having a desired property. For example, one or more mutations or differences relative to a parental Cas9 molecule can be introduced. Such mutations and differences comprise: substitutions (e.g., conservative substitutions or substitutions of non-essential amino acids); insertions; or deletions. In some embodiments, a Cas9 molecule can comprise one or more mutations or differences, e.g., at least 1, 2, 3, 4, 5, 10, 15, 20, 30, 40 or 50 mutations but less than 200, 100, or 80 mutations relative to a reference Cas9 molecule.
In some embodiments, a mutation or mutations do not have a substantial effect on a Cas9 activity, e.g., a Cas9 activity described herein. In some embodiments, a mutation or mutations have a substantial effect on a Cas9 activity, e.g., a Cas9 activity described herein. In some embodiments, exemplary activities comprise one or more of PAM specificity, cleavage activity, and helicase activity. A mutation(s) can be present, e.g., in: one or more RuvC-like domain, e.g., an N-terminal RuvC-like domain; an HNH-like domain; a region outside the RuvC-like domains and the HNH-like domain. In some embodiments, a mutation(s) is present in an N-terminal RuvC-like domain. In some embodiments, a mutation(s) is present in an HNH-like domain. In some embodiments, mutations are present in both an N-terminal RuvC-like domain and an HNH-like domain.
Whether or not a particular sequence, e.g., a substitution, may affect one or more activity, such as targeting activity, cleavage activity, etc., can be evaluated or predicted, e.g., by evaluating whether the mutation is conservative. In some embodiments, a “non-essential” amino acid residue, as used in the context of a Cas9 molecule, is a residue that can be altered from the wild-type sequence of a Cas9 molecule, e.g., a naturally occurring Cas9 molecule, e.g., an eaCas9 molecule, without abolishing or, more preferably, without substantially altering a Cas9 activity (e.g., cleavage activity), whereas changing an “essential” amino acid residue results in a substantial loss of activity (e.g., cleavage activity).
Naturally occurring Cas9 molecules can recognize specific PAM sequences, for example the PAM recognition sequences for S. pyogenes, S. thermophilus, S. mutans, S. aureus and N. meningitidis.
In some embodiments, a Cas9 molecule has the same PAM specificities as a naturally occurring Cas9 molecule. In other embodiments, a Cas9 molecule has a PAM specificity not associated with a naturally occurring Cas9 molecule, or a PAM specificity not associated with the naturally occurring Cas9 molecule to which it has the closest sequence homology. For example, a naturally occurring Cas9 molecule can be altered, e.g., to alter PAM recognition, e.g., to alter the PAM sequence that the Cas9 molecule recognizes to decrease off target sites and/or improve specificity; or eliminate a PAM recognition requirement. In some embodiments, a Cas9 molecule can be altered, e.g., to increase length of PAM recognition sequence and/or improve Cas9 specificity to high level of identity to decrease off target sites and increase specificity. In some embodiments, the length of the PAM recognition sequence is at least 4, 5, 6, 7, 8, 9, 10 or 15 amino acids in length. Cas9 molecules that recognize different PAM sequences and/or have reduced off-target activity can be generated using directed evolution.
In some embodiments, a Cas9 molecule comprises a cleavage property that differs from naturally occurring Cas9 molecules, e.g., that differs from the naturally occurring Cas9 molecule having the closest homology. For example, a Cas9 molecule can differ from naturally occurring Cas9 molecules, e.g., a Cas9 molecule of S. pyogenes, as follows: its ability to modulate, e.g., decreased or increased, cleavage of a double-stranded break (endonuclease and/or exonuclease activity), e.g., as compared to a naturally occurring Cas9 molecule (e.g., a Cas9 molecule of S. pyogenes); its ability to modulate, e.g., decreased or increased, cleavage of a single strand of a nucleic acid, e.g., a non-complementary strand of a nucleic acid molecule or a complementary strand of a nucleic acid molecule (nickase activity), e.g., as compared to a naturally occurring Cas9 molecule (e.g., a Cas9 molecule of S. pyogenes); or the ability to cleave a nucleic acid molecule, e.g., a double-stranded or single-stranded nucleic acid molecule, can be eliminated. In some embodiments, the Cas9 is a “high fidelity” SpCas9 variant (HF-Cas9), such as those designed according to principles disclosed by Joung and colleagues (Kleinstiver, et al., 2016, which is incorporated herein in its entirety).
Gene Targets
Disclosed herein are isolated nucleic acid sequences encoding an SNV-specific guide RNA (gRNA). Examples of gene SNVs and corresponding gRNAs are provided in Table 1.
[Table 1: Gene SNVs and corresponding gRNAs (image in original)]
Homology-Directed Repair (HDR)
As described herein, nuclease-induced homology directed repair (HDR) can be used to alter a target sequence and correct (e.g., repair or edit) a mutation in the genome. While not wishing to be bound by theory, it is believed that alteration of the target sequence occurs by homology-directed repair (HDR). Normally this is done using a donor template or template nucleic acid. For example, the donor template or the template nucleic acid provides for alteration of the target sequence.
However, disclosed herein are compositions, systems, and methods for repairing heterozygous single nucleotide variants (SNVs) using the cell’s own wild-type allele rather than an exogenous donor template. This is referred to herein as interallelic gene conversion (IGC). IGC alteration of a target sequence depends on cleavage by a Cas9 molecule. Cleavage by Cas9 can comprise a double strand break or two single strand breaks.
In some embodiments, a mutation can be corrected by either a single double-strand break or two single-strand breaks. In some embodiments, a mutation can be corrected by: (1) a single double-strand break, (2) two single-strand breaks, (3) two double-strand breaks with a break occurring on each side of the target sequence, (4) one double-strand break and two single-strand breaks, with the double-strand break and the two single-strand breaks occurring on each side of the target sequence, or (5) four single-strand breaks with a pair of single-strand breaks occurring on each side of the target sequence.
In some embodiments, the disclosed compositions, systems, and methods do not include the use of an exogenous donor template nucleic acid. A “template nucleic acid,” as used herein, refers to an endogenous or exogenous nucleic acid sequence comprising the wild-type nucleic acid sequence of the target nucleic acid, i.e., lacking the SNV. Therefore, in some embodiments, the disclosed compositions and systems do not have or need an exogenous donor template nucleic acid for HDR, but instead rely on the cell’s own wild-type allele as the template nucleic acid.
Constructs/Components
The components, e.g., a Cas9 molecule or gRNA molecule, or both, can be delivered, formulated, or administered in a variety of forms. When a component is delivered encoded in DNA, the DNA will typically include a control region, e.g., comprising a promoter, to effect expression. Useful promoters for Cas9 molecule sequences include CMV, EF-1α, MSCV, PGK, and CAG control promoters. Useful promoters for sgRNAs include H1, EF-1α and U6 promoters. Promoters with similar or dissimilar strengths can be selected to tune the expression of components. Sequences encoding a Cas9 molecule can comprise a nuclear localization signal (NLS), e.g., an SV40 NLS. In some embodiments, a promoter for a Cas9 molecule or a sgRNA molecule can be, independently, inducible, tissue specific, or cell specific.
DNA encoding Cas9 molecules and/or gRNA molecules can be administered to subjects or delivered into cells by art-known methods or as described herein. For example, Cas9-encoding and/or gRNA-encoding DNA can be delivered, e.g., by vectors (e.g., viral or non-viral vectors), non-vector based methods (e.g., using naked DNA or DNA complexes), or a combination thereof.
In some embodiments, the Cas9- and/or gRNA-encoding DNA is delivered by a vector (e.g., viral vector/virus or plasmid).
A vector can comprise a sequence that encodes a Cas9 molecule and/or a gRNA molecule. A vector can also comprise a sequence encoding a signal peptide (e.g., for nuclear localization, nucleolar localization, mitochondrial localization), fused, e.g., to a Cas9 molecule sequence. For example, a vector can comprise a nuclear localization sequence (e.g., from SV40) fused to the sequence encoding the Cas9 molecule.
One or more regulatory/control elements, e.g., a promoter, an enhancer, an intron, a polyadenylation signal, a Kozak consensus sequence, internal ribosome entry sites (IRES), a 2A sequence, and a splice acceptor or donor can be included in the vectors. In some embodiments, the promoter is recognized by RNA polymerase II (e.g., a CMV promoter). In other embodiments, the promoter is recognized by RNA polymerase III (e.g., a U6 promoter). In some embodiments, the promoter is a regulated promoter (e.g., inducible promoter). In other embodiments, the promoter is a constitutive promoter. In some embodiments, the promoter is a tissue specific promoter. In some embodiments, the promoter is a viral promoter. In other embodiments, the promoter is a non-viral promoter.
In some embodiments, the vector or delivery vehicle is a viral vector (e.g., for generation of recombinant viruses). In some embodiments, the virus is a DNA virus (e.g., dsDNA or ssDNA virus). In other embodiments, the virus is an RNA virus (e.g., an ssRNA virus). Exemplary viral vectors/viruses include, e.g., retroviruses, lentiviruses, adenovirus, adeno-associated virus (AAV), vaccinia viruses, poxviruses, and herpes simplex viruses. In some embodiments, the virus infects dividing cells. In other embodiments, the virus infects non-dividing cells. In some embodiments, the virus infects both dividing and non-dividing cells. In some embodiments, the virus can integrate into the host genome. In some embodiments, the virus is engineered to have reduced immunity, e.g., in humans. In some embodiments, the virus is replication-competent. In other embodiments, the virus is replication-defective, e.g., having one or more coding regions for the genes necessary for additional rounds of virion replication and/or packaging replaced with other genes or deleted. In some embodiments, the virus causes transient expression of the Cas9 molecule and/or the gRNA molecule. In other embodiments, the virus causes long-lasting, e.g., at least 1 week, 2 weeks, 1 month, 2 months, 3 months, 6 months, 9 months, 1 year, 2 years, or permanent expression, of the Cas9 molecule and/or the gRNA molecule. The packaging capacity of the viruses may vary, e.g., from at least about 4 kb to at least about 30 kb, e.g., at least about 5 kb, 10 kb, 15 kb, 20 kb, 25 kb, 30 kb, 35 kb, 40 kb, 45 kb, or 50 kb.
In some embodiments, the Cas9- and/or gRNA-encoding DNA is delivered by a recombinant retrovirus. In some embodiments, the retrovirus (e.g., Moloney murine leukemia virus) comprises a reverse transcriptase, e.g., that allows integration into the host genome. In some embodiments, the retrovirus is replication-competent. In other embodiments, the retrovirus is replication-defective, e.g., having one or more coding regions for the genes necessary for additional rounds of virion replication and packaging replaced with other genes, or deleted.
In some embodiments, the Cas9- and/or gRNA-encoding DNA is delivered by a recombinant lentivirus. For example, the lentivirus is replication-defective, e.g., does not comprise one or more genes required for viral replication.
In some embodiments, the Cas9- and/or gRNA-encoding DNA is delivered by a recombinant adenovirus. In some embodiments, the adenovirus is engineered to have reduced immunity in humans.
In some embodiments, the Cas9- and/or gRNA-encoding DNA is delivered by a recombinant AAV. In some embodiments, the AAV can incorporate its genome into that of a host cell, e.g., a target cell as described herein. In some embodiments, the AAV is a self-complementary adeno-associated virus (scAAV), e.g., a scAAV that packages both strands which anneal together to form double-stranded DNA. AAV serotypes that may be used in the disclosed methods include, e.g., AAV1, AAV2, modified AAV2 (e.g., modifications at Y444F, Y500F, Y730F and/or S662V), AAV3, modified AAV3 (e.g., modifications at Y705F, Y731F and/or T492V), AAV4, AAV5, AAV6, modified AAV6 (e.g., modifications at S663V and/or T492V), AAV8, AAV8.2, AAV9, and AAVrh10. Pseudotyped AAV, such as AAV2/8, AAV2/5 and AAV2/6, can also be used in the disclosed methods.
In some embodiments, the Cas9- and/or gRNA-encoding DNA is delivered by a hybrid virus, e.g., a hybrid of one or more of the viruses described herein.
A packaging cell is used to form a virus particle that is capable of infecting a host or target cell. Such a cell includes a 293 cell, which can package adenovirus, and a ψ2 cell or a PA317 cell, which can package retrovirus. A viral vector used in gene therapy is usually generated by a producer cell line that packages a nucleic acid vector into a viral particle. The vector typically contains the minimal viral sequences required for packaging and subsequent integration into a host or target cell (if applicable), with other viral sequences being replaced by an expression cassette encoding the protein to be expressed. For example, an AAV vector used in gene therapy typically only possesses inverted terminal repeat (ITR) sequences from the AAV genome, which are required for packaging and gene expression in the host or target cell. The missing viral functions are supplied in trans by the packaging cell line. Thus, the viral DNA is packaged in a cell line that contains a helper plasmid encoding the other AAV genes, namely rep and cap, but lacking ITR sequences. The cell line is also infected with adenovirus as a helper. The helper virus promotes replication of the AAV vector and expression of AAV genes from the helper plasmid. The helper plasmid is not packaged in significant amounts due to a lack of ITR sequences. Contamination with adenovirus can be reduced by, e.g., heat treatment, to which adenovirus is more sensitive than AAV.
In some embodiments, the viral vector has the ability of cell type and/or tissue type recognition. For example, the viral vector can be pseudotyped with a different/alternative viral envelope glycoprotein; engineered with a cell type-specific receptor (e.g., genetic modification of the viral envelope glycoproteins to incorporate targeting ligands such as a peptide ligand, a single chain antibody, a growth factor); and/or engineered to have a molecular bridge with dual specificities with one end recognizing a viral glycoprotein and the other end recognizing a moiety of the target cell surface (e.g., ligand-receptor, monoclonal antibody, avidin-biotin and chemical conjugation).
In some embodiments, the viral vector achieves cell type specific expression. For example, a tissue-specific promoter can be constructed to restrict expression of the transgene (Cas9 and gRNA) to only the target cell. The specificity of the vector can also be mediated by microRNA-dependent control of transgene expression. In some embodiments, the viral vector has increased efficiency of fusion of the viral vector and a target cell membrane. For example, a fusion protein such as fusion-competent hemagglutinin (HA) can be incorporated to increase viral uptake into cells. In some embodiments, the viral vector has the ability of nuclear localization. For example, a virus that requires the breakdown of the nuclear envelope (during cell division), and therefore will not infect a non-dividing cell, can be altered to incorporate a nuclear localization peptide in the matrix protein of the virus, thereby enabling the transduction of non-proliferating cells.
In some embodiments, the Cas9- and/or gRNA-encoding DNA is delivered by a non-vector based method (e.g., using naked DNA or DNA complexes). For example, the DNA can be delivered, e.g., by organically modified silica or silicate (Ormosil), electroporation, gene gun, sonoporation, magnetofection, lipid-mediated transfection, dendrimers, inorganic nanoparticles, calcium phosphates, or a combination thereof.
In some embodiments, the Cas9- and/or gRNA-encoding DNA is delivered by a combination of a vector and a non-vector based method. For example, a virosome comprises a liposome combined with an inactivated virus (e.g., HIV or influenza virus), which can result in more efficient gene transfer, e.g., in a respiratory epithelial cell than either a viral or a liposomal method alone.
In some embodiments, the delivery vehicle is a non-viral vector. In some embodiments, the non-viral vector is an inorganic nanoparticle (e.g., with the payload attached to the surface of the nanoparticle). Exemplary inorganic nanoparticles include, e.g., magnetic nanoparticles (e.g., Fe3MnO2), or silica. The outer surface of the nanoparticle can be conjugated with a positively charged polymer (e.g., polyethylenimine, polylysine, polyserine) which allows for attachment (e.g., conjugation or entrapment) of payload. In some embodiments, the non-viral vector is an organic nanoparticle (e.g., with the payload entrapped inside the nanoparticle). Exemplary organic nanoparticles include, e.g., SNALP liposomes, which contain cationic lipids together with neutral helper lipids and are coated with polyethylene glycol (PEG), and protamine-nucleic acid complexes coated with a lipid coating.
In some embodiments, the vehicle has targeting modifications to increase target cell uptake of nanoparticles and liposomes, e.g., cell specific antigens, monoclonal antibodies, single chain antibodies, aptamers, polymers, sugars, and cell penetrating peptides. In some embodiments, the vehicle uses fusogenic and endosome-destabilizing peptides/polymers. In some embodiments, the vehicle undergoes acid-triggered conformational changes (e.g., to accelerate endosomal escape of the cargo). In some embodiments, a stimuli-cleavable polymer is used, e.g., for release in a cellular compartment. For example, disulfide-based cationic polymers that are cleaved in the reducing cellular environment can be used.
In some embodiments, the delivery vehicle is a biological non-viral delivery vehicle. In some embodiments, the vehicle is an attenuated bacterium (e.g., naturally or artificially engineered to be invasive but attenuated to prevent pathogenesis and expressing the transgene (e.g., Listeria monocytogenes, certain Salmonella strains, Bifidobacterium longum, and modified Escherichia coli), bacteria having nutritional and tissue-specific tropism to target specific tissues, bacteria having modified surface proteins to alter target tissue specificity). In some embodiments, the vehicle is a genetically modified bacteriophage (e.g., engineered phages having large packaging capacity, less immunogenic, containing mammalian plasmid maintenance sequences and having incorporated targeting ligands). In some embodiments, the vehicle is a mammalian virus-like particle. For example, modified viral particles can be generated (e.g., by purification of the “empty” particles followed by ex vivo assembly of the virus with the desired cargo). The vehicle can also be engineered to incorporate targeting ligands to alter target tissue specificity. In some embodiments, the vehicle is a biological liposome. For example, the biological liposome is a phospholipid-based particle derived from human cells (e.g., erythrocyte ghosts, which are red blood cells derived from the subject and broken down into spherical structures (e.g., tissue targeting can be achieved by attachment of various tissue or cell-specific ligands), or secretory exosomes, which are subject-derived (i.e., patient-derived) membrane-bound nanovesicles (30-100 nm) of endocytic origin (e.g., can be produced from various cell types and can therefore be taken up by cells without the need for targeting ligands)).
In some embodiments, one or more nucleic acid molecules (e.g., DNA molecules) other than the components of a Cas system, e.g., the Cas9 molecule component and/or the gRNA molecule component described herein, are delivered. In some embodiments, the nucleic acid molecule is delivered at the same time as one or more of the components of the Cas system are delivered. In some embodiments, the nucleic acid molecule is delivered before or after (e.g., less than about 30 minutes, 1 hour, 2 hours, 3 hours, 6 hours, 9 hours, 12 hours, 1 day, 2 days, 3 days, 1 week, 2 weeks, or 4 weeks) one or more of the components of the Cas system are delivered. In some embodiments, the nucleic acid molecule is delivered by a different means than one or more of the components of the Cas system, e.g., the Cas9 molecule component and/or the gRNA molecule component, are delivered. The nucleic acid molecule can be delivered by any of the delivery methods described herein. For example, the nucleic acid molecule can be delivered by a viral vector, e.g., an integration-deficient lentivirus, and the Cas9 molecule component and/or the gRNA molecule component can be delivered by electroporation, e.g., such that the toxicity caused by nucleic acids (e.g., DNAs) can be reduced. In some embodiments, the nucleic acid molecule encodes a therapeutic protein, e.g., a protein described herein. In some embodiments, the nucleic acid molecule encodes an RNA molecule, e.g., an RNA molecule described herein.
RNA encoding Cas9 and/or gRNA molecules can be delivered into cells, e.g., target cells described herein, by art-known methods or as described herein. For example, Cas9-encoding and/or gRNA-encoding RNA can be delivered, e.g., by microinjection, electroporation, lipid-mediated transfection, peptide-mediated delivery, or a combination thereof.
Systemic modes of administration include oral and parenteral routes. Parenteral routes include, by way of aerosol, intravenous, intraarterial, intraosseous, intramuscular, intradermal, subcutaneous, intranasal and intraperitoneal routes.
A number of embodiments of the invention have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the invention. Accordingly, other embodiments are within the scope of the following claims.
EXAMPLES
Example 1:
This Example shows that acquired SNVs associated with clonal hematopoiesis and hematologic malignancy can be efficiently restored to their wild-type sequences using widely available CRISPR reagents without the need for exogenous HDR templates. It further shows that high-fidelity Cas9 allows for the specific targeting of mutant alleles and that incorporation of the SNV into a standard 20-base pair protospacer sequence is sufficient to prevent off-target cutting of the wild-type allele. Furthermore, use of a commercially available HDR enhancer can drastically reduce insertions/deletions (indels) generated during this process. In sum, these experiments show that efficient reversion of heterozygous gain-of-function or loss-of-function SNVs in hematopoietic cell lines and primary patient samples can be accomplished using routine CRISPR workflows. The simplicity of this approach should enable easy adoption into any number of research applications, especially within the portion of the scientific community that has experience with CRISPR KO. This method, which in its simplest form needs only high-fidelity Cas9 and a gRNA, could furthermore be moved rapidly toward use as a genetic therapy, particularly in light of current human trials that are evaluating safety of Cas9/gRNA treatments.
Results
SNV-directed CRISPR cutting enables interallelic gene conversion
Working under the hypothesis that IGC will occur in a situation where a heterozygous mutant allele is cut while the wild-type (WT) allele remains intact, the goal was to test if a strategy using high-fidelity Cas9 and SNV-specific gRNAs without exogenous template could lead to correction of a mutant sequence. To test this, several hematopoietic cell lines were examined, identifying different heterozygous malignancy-associated SNVs that could serve as potential targets. Allele-specific gRNAs were then designed against these sequences, including the mutant SNV in the protospacer sequence (Figure 1A). The parental cell lines were then electroporated with ribonucleoprotein (RNP) consisting of the SNV-specific guide and a high-fidelity Cas9, followed by collection of DNA 48 hours post-electroporation (Figure 1B). There was both a relative increase in the measured amount of the WT allele fraction and a decrease in the mutant allele fraction as assessed by Sanger sequencing (Figure 1C). To confirm this, custom-capture next generation sequencing (NGS) was performed with >300X coverage of target loci in order to assess the relative abundance of WT and mutant alleles. In five tested cell lines, there was a significant increase in the abundance of WT allele following treatment (Figure 1D-1H), accompanied by a relative decrease in the mutant allele fraction (Figure 5A-5B). These results demonstrate that HDR can occur at heterozygous loci in cell lines even in the absence of exogenous repair templates.
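The design step described above, in which the mutant SNV is placed inside the protospacer, can be illustrated with a short sketch. This is a minimal illustration under stated assumptions and is not the design procedure actually used: it assumes an NGG PAM, a 20-nucleotide protospacer, and a mutant-allele sequence supplied as a plain string. The function names `snv_guides` and `_scan` and the toy sequence are hypothetical.

```python
def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def _scan(seq, idx, strand, spacer_len):
    """Scan one strand for protospacers that contain position idx and are
    immediately followed by an NGG PAM (assumed PAM, for illustration)."""
    hits = []
    # p is the index of the first PAM base; the protospacer is seq[p - spacer_len:p].
    for p in range(spacer_len, len(seq) - 2):
        if seq[p + 1:p + 3] == "GG" and p - spacer_len <= idx < p:
            # The last element is the SNV position counted from the PAM
            # (1 = base adjacent to the PAM).
            hits.append((strand, seq[p - spacer_len:p], seq[p:p + 3], p - idx))
    return hits


def snv_guides(mutant_seq, snv_index, spacer_len=20):
    """Return (strand, protospacer, PAM, distance_from_PAM) for every candidate
    guide on either strand whose protospacer includes the mutant base."""
    seq = mutant_seq.upper()
    rc_index = len(seq) - 1 - snv_index  # same base, indexed on the reverse strand
    return (_scan(seq, snv_index, "+", spacer_len)
            + _scan(revcomp(seq), rc_index, "-", spacer_len))


if __name__ == "__main__":
    toy = "ATCATCATCATCATCATCAT" + "TGG" + "ATAT"  # hypothetical mutant allele
    for hit in snv_guides(toy, snv_index=14):
        print(hit)
```

Because the protospacers are taken from the mutant sequence, each candidate differs from the wild-type allele by a single base; the reported distance allows candidates to be ranked by how close the SNV lies to the PAM.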
Although the intent was to test whether IGC could be performed at heterozygous SNVs in somatic cells, and an inhibitor of non-homologous end joining (“HDR enhancer”) was included in all of the experiments from the outset, there was also a goal to test whether IGC would occur at meaningful levels, or whether indel formation would predominate, in the absence of HDR enhancer. Thus, in the first two cell lines tested, experiments were performed in the presence and absence of HDR enhancer. The WT allele fraction was significantly increased even without HDR enhancer (Figure 1D-1E). The NGS reads were then phased at each locus to be able to quantify the overall and allele-specific burden of indels (Table 2 and Figure 5C). This revealed that the total number of indels was significantly higher in the absence of HDR enhancer (Figure 1I-1J). Moreover, across all of the cell line experiments, only a small minority of reads containing indels also contained the WT genotype (range 0.0%-13.8% of total indels; Table 2). This indicates that WT-sparing IGC using high-fidelity Cas9 and mutant-specific gRNAs can be accomplished with or without HDR enhancer, although the use of enhancer minimizes the amount of indel formation in targeted cells.
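The quantities discussed above (WT allele fraction, overall indel burden, and the share of indel-containing reads that carry the WT genotype) can be expressed as simple ratios over phased reads. The sketch below is illustrative only: it assumes each read has already been reduced to a pair (genotype at the SNV, whether an indel is present), which is a hypothetical representation and not the pipeline actually used, and the example counts are invented.

```python
def summarize_phased_reads(reads):
    """reads: iterable of (genotype, has_indel) pairs, where genotype is
    "WT" or "MUT" at the targeted SNV and has_indel is a bool."""
    reads = list(reads)
    total = len(reads)
    if total == 0:
        raise ValueError("no reads supplied")
    wt = sum(1 for genotype, _ in reads if genotype == "WT")
    indel = sum(1 for _, has_indel in reads if has_indel)
    wt_indel = sum(1 for genotype, has_indel in reads
                   if genotype == "WT" and has_indel)
    return {
        "wt_fraction": wt / total,
        "mut_fraction": (total - wt) / total,
        "indel_fraction": indel / total,
        # Share of indel-bearing reads that carry the WT genotype; a low value
        # is what would be expected if the WT allele is largely spared.
        "wt_share_of_indels": wt_indel / indel if indel else 0.0,
    }


if __name__ == "__main__":
    # Invented counts, for illustration only.
    example = ([("WT", False)] * 60 + [("MUT", False)] * 30
               + [("MUT", True)] * 9 + [("WT", True)] * 1)
    print(summarize_phased_reads(example))
```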
Bulk editing of ASXL1 truncating mutations reduces pro-growth transcriptional signatures and cellular proliferation
Having observed genetic evidence of IGC in several cell lines at various loci, the next goal was to understand how bulk editing using this method might impact cellular phenotypes of genetically complex cancer cells. The focus was placed on correction of truncating mutations in ASXL1, due to this gene’s importance in myeloid cancers but also because correction of a stopgain SNV would be a logical and attractive early step for any future trials in human disease: restoration of a wild-type allele may offer benefit, while introducing stopgain-causing frameshifts into an allele already harboring a stopgain mutation should not present harm. The ASXL1 protein is a part of several histone-modifying complexes, including polycomb repressive complex 2 (PRC2) and the BAP1 histone H2AK119Ub deubiquitinase (DUB) complex (Fujino, T., et al. Exp. Hematol. 2020 83:74-84). Stopgain mutations in ASXL1, including p.Y591X, are commonly observed in clonal hematopoiesis (Bick, A.G., et al. Nature 2020 586:763-768; Jaiswal, S., et al. N. Engl. J. Med. 2017 337:111) and myeloid malignancies, being associated with poor prognosis in the latter (Asada, S., et al. Cell. Mol. Life Sci. 2019 76:2511-2523; Bernard, E., et al. NEJM Evidence 2022 1:EVIDoa2200008; Wang, L., et al. Nat. Cancer 2021 2:515-526). Heterozygous ASXL1 p.Y591X mutations are present in both the K562 and OCI-AML5 cell lines, which allowed ample opportunity to test IGC correction of a stopgain mutation. When K562 cells were treated with ASXL1 p.Y591X-targeting gRNA, there was a significant reduction in the amount of mutant protein but no fluctuation in the amount of full-length ASXL1 as compared to controls (Figure 2A-2C), with qualitatively similar results when repeated in OCI-AML5 cells (Figure 6A-6C). These results indicate that IGC of ASXL1 stopgain SNVs preserves the expression of the WT protein and can lead to reduced levels of the mutant protein.
The next goal was to examine how reducing the burden of ASXL1 mutant protein would affect the transcriptome. To characterize the effect of targeting ASXL1 stopgains, we performed bulk RNAseq on K562 cells treated with scramble gRNA or mutant-specific gRNA. Here, a far greater number of genes experienced a relative decrease in transcription than a relative increase (one-prop z-test, p = 0.0), suggesting at least partial restoration of ASXL1-mediated epigenetic regulation of transcription (Figure 2D). One of the genes found to have significantly lower expression after IGC was DLK1 (Figure 2D), a gene frequently upregulated in myeloid malignancy (Sakajiri, S., et al. Leukemia 2005 19:1404-1410). DLK1 expression is known to be positively correlated with stemness/differentiation block in the hematopoietic niche (Li, L., et al. Oncogene 2005 24:4472-4476) and has been shown to promote proliferation in a K562 model (Sakajiri, S., et al. Leukemia 2005 19:1404-1410) and primary patient samples (Zhang, W., et al. Oncol. Lett. 2013 6:203-206). Moreover, RNAseq was conducted after similar CRISPR experiments in the OCI-AML5 cell line (Figure 6D-6E), and, when combined with the K562 data using a linear mixed model (Figure 6F-6G), DLK1 was again found to be significantly downregulated, increasing our interest in examining this result. An association between ASXL1 mutation and DLK1 expression has not previously been reported, so it was of interest to find that ASXL1 mutation does indeed have a significant association with higher DLK1 transcription in the BEAT AML cohort (Figure 6H) (Cerami, et al. Cancer Discov. 2012 2:401-404; Gao, J., et al. Sci. Signal. 2013 6:pl1; Tyner, J.W., et al. Nature 2018 562:526-531). Looking further at our differential expression analyses, pre-ranked gene set enrichment analysis (GSEA) using the Hallmark gene sets (Liberzon, A., et al. Cell Syst. 2015 1:417-425) revealed significant de-enrichment in pro-growth/pro-proliferation signaling pathways in both cell lines (Figures 2E and 6). Meanwhile, there was enrichment of immune signaling and heme metabolism pathways, potentially indicative of partial restoration of differentiation towards mature hematopoietic cells. These results show that bulk editing of ASXL1 p.Y591X via IGC leads to the downregulation of proliferative transcriptional programs.
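For readers unfamiliar with the one-proportion z-test mentioned above, the following sketch shows how such a test can be computed, and why a strongly lopsided split of decreased versus increased genes can yield a p-value so small that it is reported as 0.0. The null proportion of 0.5 and the gene counts in the example are assumptions chosen for illustration; they are not values from these experiments.

```python
import math


def one_proportion_z_test(successes, n, p0=0.5):
    """Two-sided one-proportion z-test using the normal approximation.

    successes: e.g., number of differentially expressed genes that decreased
    n:         total number of differentially expressed genes
    p0:        null proportion (0.5 assumes increases and decreases are equally likely)
    """
    if n <= 0 or not 0 < p0 < 1:
        raise ValueError("n must be positive and p0 must lie strictly between 0 and 1")
    p_hat = successes / n
    se = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se
    # Two-sided tail probability of a standard normal: P(|Z| >= |z|).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical counts: 900 of 1000 changed genes are decreased.
    z, p = one_proportion_z_test(900, 1000)
    print(f"z = {z:.2f}, p = {p:.3g}")
```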
The next question was whether these transcriptional changes would lead to decreased proliferation. K562 cell numbers were assessed over time for scramble- or mutant-gRNA treated samples seeded at equal densities, and a significantly lower fold expansion of the experimental samples was observed (two-way ANOVA, p = 0.006; Figure 2F). To determine if this decreased expansion was the result of a diminished rate of replication, a CellTrace assay was conducted, showing that the population of experimentally treated cells was left-shifted compared to controls (Figures 2G and 7A-7B). Consistent with these findings, dual Ki-67/propidium iodide staining performed to look at stages of the cell cycle showed that, while both conditions had a similar proportion of cells in G0, there were significantly fewer experimental cells in G1 phase, coupled with a relative increase in S/G2/M phases (Figures 2H, 6I, 7C). Altogether, these data suggest that IGC correction of the ASXL1 p.Y591X SNV in K562s leads to a diminished proliferative drive and alters the cycling of these cells in a manner consistent with a delayed progression through cellular replication checkpoints.
Correction of ASXL1 truncating mutations prolongs survival in myeloid malignancy xenograft model
The next goal was to determine whether bulk-editing of cells to correct truncating mutations in ASXL1 could improve survival in xenograft models. It has been previously reported that complete removal of truncated ASXL1 in a subcloned KBM5 cell line delayed mortality in a mouse model (Valletta, S., et al. Oncotarget 2015 6:44061-71), but it was unclear whether a survival benefit would be seen if instead the xenograft were comprised of a more heterogeneous bulk-edited population in which some residual fraction of cells retain mutant protein, or even if the KBM5 results were generalizable beyond that model. K562 cells treated with either scramble or ASXL1 p.Y591X-targeting gRNA were therefore transplanted into sublethally irradiated NSGS mice (Figure 3A). No significant difference was found in animal weights over time between conditions (Figure 3B). In a Cox proportional hazards analysis, it was found that the ASXL1-targeted group had significantly greater survival than the control group, with a hazard ratio of 0.16 (95% CI = [0.04, 0.59]; p = 0.0060) for the animals receiving the treated cells (Figure 3C). When an ancillary experiment was performed using OCI-AML5 cells, there was a consistent trend for increased survival. Utilizing survival data from both cell lines, survival was improved with a hazard ratio of 0.31 (0.13, 0.76; p = 0.010) in a stratified Cox proportional hazards model (Figure 8). These results indicate that bulk CRISPR editing of truncating mutations in ASXL1 is sufficient to impart a survival advantage in the xenotransplant setting.
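As an aid to interpreting the hazard ratios reported above, the sketch below shows how a hazard ratio, its confidence interval, and a Wald-type p-value relate to one another on the log scale. It is a consistency check under stated assumptions, not a re-analysis: it assumes the intervals are symmetric 95% Wald intervals on the log hazard ratio (the second interval is not labeled as such above), and because the published bounds are rounded, the recovered p-values are expected to be only of the same order as those reported. The function name `wald_from_ci` is hypothetical.

```python
import math


def wald_from_ci(hazard_ratio, lower, upper, z_crit=1.959964):
    """Recover the log-scale standard error, Wald z statistic, and two-sided
    p-value implied by a hazard ratio and its confidence interval.

    z_crit is the normal quantile for the interval (1.959964 for 95%)."""
    if not 0 < lower < upper:
        raise ValueError("interval bounds must satisfy 0 < lower < upper")
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hazard_ratio) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p_value


if __name__ == "__main__":
    for label, hr, lo, hi in [("K562", 0.16, 0.04, 0.59),
                              ("stratified, both lines", 0.31, 0.13, 0.76)]:
        se, z, p = wald_from_ci(hr, lo, hi)
        print(f"{label}: SE(log HR) = {se:.3f}, z = {z:.2f}, p = {p:.4f}")
```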
Primary hematopoietic cells are amenable to interallelic gene conversion
Having successfully tested IGC in multiple cell lines, the next question was whether it would be possible to invoke HDR without an exogenous repair template in primary human cells. To examine this, peripheral blood mononuclear cells (PBMCs) were used that were obtained via leukapheresis of a patient with acute myeloid leukemia (AML). This sample included an ASXL1 p.Q748X mutation with an allele fraction of 48%. Single treatment with a Q748X-targeting gRNA (Figure 4A) demonstrated an average increase in the WT allele fraction from 52% to 62% (p = 0.0095) compared to scramble gRNA (Figure 4B). Notably, there were very few indels (mean = 0.3% of reads) following treatment with the mutant-specific gRNA (Figure 4C). These results indicate that IGC may also be accomplished in primary human hematopoietic cells.
While large-scale deletions and copy number changes have not been reported for allele-specific CRISPR in somatic cells, allele-specific editing in human embryos has been shown to lead to some chromosomal loss (Zuccaro, M.V., et al. Cell 2020 183:1650-1664), so the possibility that large-scale genomic events were driving the change in allele fraction at the targeted SNV was investigated. To do this, the allele fraction at all heterozygous SNPs on Chr20 was examined, reasoning that copy number alterations would affect allelic ratios extending beyond the targeted SNV. There were no significant changes in allele fraction at any of the five heterozygous Chr20 SNPs measured in the sequencing, which covered both chromosomal arms and flanked the targeted SNV (Figure 4D). While not ruling out the possibility of copy number changes affecting a small fraction of cells, this result demonstrates that the preponderance of the change in allele fraction at the targeted SNV is attributable to IGC. Likewise, no indication of copy number alterations was observed when looking back at the cell line data, with the exception of one K562 SNP, located ~5 kb proximal to the targeted ASXL1 SNV, which had a small but statistically significant increase in allele fraction (Figure 9). A more distal K562 SNP did not have a significant change, which leads to the belief that IGC extending beyond the cut site could be a more plausible explanation than chromosomal loss, especially given that kilobase-magnitude IGC has been observed previously (Yoshimi, K., et al. Nat. Commun. 2014 5:4240). Altogether, these data demonstrate that intentional IGC in human hematopoietic cells is not confounded by an appreciable degree of copy number loss.
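The allele fractions reported above can be translated into an approximate fraction of mutant alleles converted to wild-type. The sketch below is a back-of-the-envelope calculation only, and it rests on assumptions not established by the data: that total copy number at the locus is unchanged, that indel-bearing reads are negligible, and that all of the gain in WT fraction comes from conversion of mutant alleles. The function name is hypothetical.

```python
def converted_mutant_fraction(wt_before, wt_after):
    """Fraction of pre-treatment mutant alleles that would need to be converted
    to wild-type to move the WT allele fraction from wt_before to wt_after,
    assuming constant copy number and no indel-bearing alleles."""
    if not 0 <= wt_before < 1 or not 0 <= wt_after <= 1:
        raise ValueError("allele fractions must lie in [0, 1], with wt_before < 1")
    mutant_before = 1 - wt_before
    return (wt_after - wt_before) / mutant_before


if __name__ == "__main__":
    # WT allele fraction rising from 52% to 62%, as reported above.
    frac = converted_mutant_fraction(0.52, 0.62)
    print(f"approximate fraction of mutant alleles converted: {frac:.1%}")
```

Under these assumptions, a ten-point rise in WT fraction from a 48% mutant baseline corresponds to roughly one in five mutant alleles being converted.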
Discussion
In this study, it was demonstrated that intentional IGC can be used to repair heterozygous mutations. An IGC approach was successfully tested on numerous disease-associated SNVs across cancer cell lines and primary patient samples and went on to show the utility of this approach as a preclinical research tool by targeting truncating ASXL1 mutations. Targeting mutant ASXL1 led to a partial amelioration of proliferative cellular phenotypes, and a potential downstream target in DLK1 was identified. Finally, a bulk IGC correction of a single SNV in a cancer cell line was shown to be sufficient to significantly prolong survival in a mouse xenograft model. The primary importance of these findings is that IGC is a highly straightforward and streamlined approach to revert mutant SNVs to their wild-type sequence that has potential utility in creation of research models and treatment of human disease.
Several additional points deserve emphasis. In these experiments using a high-fidelity Cas9 enzyme, the results provide no evidence of significant cutting of the wild-type allele, which is believed to be crucial to the success of IGC. This is true despite the use of protospacer sequences that differ from the wild-type sequence by only a single base (ranging in position from 3 to 10 bp upstream of the PAM). It was shown that IGC can occur with RNP treatment alone, but that the mutant allele may instead be repaired with indels, an unintended outcome that can be minimized by the use of a small molecule NHEJ inhibitor. It was shown that IGC can be accomplished even when there is polyploidy for an allelic locus, a situation that is not uncommon in malignant cells. Lastly, it was shown that chromosomal copy number loss does not appear to be a major confounder of IGC.
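The guide placement described above, with a single mismatched base a few positions upstream of the PAM, can be sketched as a simple search. This is an illustrative example only, not the design procedure used in the study. It assumes an NGG PAM, searches the forward strand only, uses a 20-nt protospacer by default, and counts distance so that 1 is the base immediately 5' of the PAM; the function name and all parameters are hypothetical.

```python
def candidate_protospacers(seq, snv_index, length=20, min_dist=3, max_dist=10):
    """List forward-strand protospacers that are followed by an NGG PAM and
    place the SNV between min_dist and max_dist bases upstream of the PAM.

    seq        mutant-allele sequence (string of A/C/G/T)
    snv_index  0-based index of the variant base within seq
    Returns a list of (protospacer, pam, distance_to_pam) tuples.
    """
    seq = seq.upper()
    hits = []
    # pam_start is the index of the "N" in the NGG PAM
    for pam_start in range(length, len(seq) - 2):
        if seq[pam_start + 1:pam_start + 3] != "GG":
            continue
        dist = pam_start - snv_index  # 1 = base immediately 5' of the PAM
        if min_dist <= dist <= max_dist:
            protospacer = seq[pam_start - length:pam_start]
            hits.append((protospacer, seq[pam_start:pam_start + 3], dist))
    return hits


# Toy sequence (not a real locus): one CGG PAM starting at index 30
toy = "AT" * 15 + "CGG" + "ATAT"
for protospacer, pam, dist in candidate_protospacers(toy, snv_index=25):
    print(protospacer, pam, dist)
```

A complete design tool would also scan the reverse strand and check that the wild-type allele differs from the guide at the variant position.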
There is much promise in the use of CRISPR-based methodologies to directly fix disease-causing mutations, and this work extends progress in this field. Previously described CRISPR HDR approaches have relied on exogenously provided repair templates, and this approach has led to new insights generated from a broad range of genetic knock-in models (Bonafont, J., et al. Mol. Ther. 2021 29:2008-2018; Zhang, Y., et al. Sci. Adv. 2017 3:e1602814; Hoban, M.D., et al. Mol. Ther. 2016 24:1561-1569; Boettcher, S., et al. Science 2019 365:599-604; Nishiga, M., et al. Nat. Rev. Cardiol. 2022 19:505-521). The IGC method presented here offers an alternative to exogenous-template HDR in a highly circumscribed context: when the purpose is to replace a SNV existing in a cell with its alternate allele. In this arena, the method offers several potential benefits: 1) higher theoretical efficiency, because only a single reagent (RNP) rather than two reagents (RNP and template) must be successfully delivered to and act in a given cell, 2) reduced experimental complexity and cost, and faster startup, compared to methods needing an exogenous template, and 3) lack of artefactual insertions, which may occur with exogenous templates (Boel, A., et al. Dis. Model Mech. 2018 11:dmm035352). The simplicity of our system may facilitate translation to human disease: in vivo delivery of Cas9 and gRNA (the minimally required components of our method) has already been demonstrated in the clinical trial setting (Gillmore, J.D., et al. N. Engl. J. Med. 2021 385:493-502; Kan, M.J., et al. JAMA 2022 328:980-981). Other CRISPR-based technologies, namely base editors and prime editors, have also shown great promise in single-base editing applications. Unlike IGC, these systems can introduce a wide variety of novel base substitutions with relatively high efficiency at the user’s discretion (Anzalone, A.V., et al. Nat. Biotechnol. 2020 38:824-844).
These may be the best option for base reversion in some scenarios; however, features of IGC may make it a preferred approach in other situations. For one, IGC does not carry the risk of editing nearby bases as base editors do, and, for another, it is a much more compact system to deliver to cells than prime editors, which require both Cas and reverse transcriptase proteins (Anzalone, A.V., et al. Nat. Biotechnol. 2020 38:824-844). Taken together, IGC is an experimentally straightforward new approach that complements existing CRISPR-based methods and that may have particular utility in specific scenarios, primarily when the object is to restore a disease-associated SNV to its wild-type form.
In conclusion, interallelic gene conversion is a simple CRISPR-based approach to revert a specific SNV allele to its counterpart base in heterozygous cells. This can find broad application in basic science studies as well as in translational applications aimed at reducing disease burden from deleterious SNVs.
Materials & Methods
Ethical statement
Experiments were conducted on deidentified primary patient samples collected and distributed by the Vanderbilt-Ingram Cancer Center Hematopoietic Malignancies Repository following acquisition of written informed consent, in accordance with the tenets of the Declaration of Helsinki, and with approval by the Vanderbilt University Medical Center Institutional Review Board (#151710).
Cell culture
K562 cells (ATCC, Manassas, VA, USA) were cultured in RPMI 1640 (Corning, Corning, NY) supplemented with 20% fetal bovine serum (FBS; R&D Systems, Minneapolis, MN, USA) and 1% penicillin/streptomycin (P/S; Thermo Fisher Scientific, Waltham, MA, USA). OCI-AML3 (DSMZ, Germany) cells were cultured in MEM α (Thermo Fisher Scientific) with 20% FBS and 1% P/S. OCI-AML5 (DSMZ) cells were cultured in MEM α supplemented with 20% heat-inactivated FBS, 1% P/S, and 10 ng/mL of GM-CSF (PeproTech, Rocky Hill, NJ, USA). THP-1 cells (ATCC) were cultured in RPMI 1640 with 10% FBS, 1% P/S, and 0.05 mM β-mercaptoethanol (MilliporeSigma, Burlington, MA, USA). SET-2 (DSMZ) cells were cultured in RPMI 1640 with 20% FBS and 1% P/S. Leukapheresis samples were cultured in RPMI 1640 with 5% FBS, 0.1 mM β-mercaptoethanol, 10 ng/mL IL-1β (PeproTech), 10 ng/mL IL-3 (PeproTech), and 10 ng/mL GM-CSF.
CRISPR/Cas9
Using a Neon transfection system (Thermo Fisher Scientific), cells were electroporated with HiFi Cas9 (IDT, Newark, NJ, USA) at 150 µg/mL complexed with sgRNA (IDT) at a 1:2.5 ratio. For electroporation, cell lines were resuspended in Buffer R, while primary cells were resuspended in Buffer T, all at ~100 x 10^6 cells/mL. Settings were as follows: K562, 1450 V, 10 ms pulse width, 3 pulses; OCI-AML3, 1600 V, 10 ms pulse width, 3 pulses; OCI-AML5, 1500 V, 30 ms pulse width, 1 pulse; THP-1, 1700 V, 20 ms pulse width, 1 pulse; SET-2, 1400 V, 20 ms pulse width, 2 pulses; primary cells, 1650 V, 10 ms pulse width, 3 pulses. For experiments including HDR Enhancer V2 (IDT), cells were resuspended in media containing 1 µM of the compound for overnight culture, followed by media exchange the following morning. Genomic DNA was isolated using DNA Blood Mini Kit (Qiagen, Hilden, Germany) at 48 hours post-electroporation.
Measurement of post-CRISPR allele fractions
Sanger sequencing of PCR-amplified CRISPR loci was performed through Genewiz/Azenta using the primers listed in Table 3. Primers were purchased through IDT. Sanger trace files were visualized with SnapGene Viewer.
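For the next-generation sequencing data in this section, reads are filtered by mapping quality and by coverage around the predicted cut site before allele fractions and indels are quantified. The following is a minimal sketch of that kind of filter, not the pipeline used in the study (which was run in R with Rsamtools). The `Read` record, the coordinate convention, and the treatment of the threshold as "MAPQ at least 60" are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Read:
    start: int       # 0-based reference start (inclusive)
    end: int         # 0-based reference end (exclusive)
    mapq: int        # mapping quality
    snv_base: str    # base called at the SNV position
    has_indel: bool  # True if the alignment contains an insertion or deletion


def passes_filter(read, snv_pos, cut_pos, min_mapq=60, flank=4):
    """Keep reads that map with high confidence and cover both the SNV and at
    least `flank` bases on each side of the cut site. cut_pos is the index of
    the first base on the 3' side of the predicted cut (assumed convention)."""
    covers_snv = read.start <= snv_pos < read.end
    covers_cut = read.start <= cut_pos - flank and read.end >= cut_pos + flank
    return read.mapq >= min_mapq and covers_snv and covers_cut


def summarize(reads, snv_pos, cut_pos, wt_base, mut_base):
    """Return fractions of wild-type, mutant, and indel reads among kept reads."""
    kept = [r for r in reads if passes_filter(r, snv_pos, cut_pos)]
    counts = {"wt": 0, "mut": 0, "indel": 0, "other": 0}
    for r in kept:
        if r.has_indel:
            counts["indel"] += 1
        elif r.snv_base == wt_base:
            counts["wt"] += 1
        elif r.snv_base == mut_base:
            counts["mut"] += 1
        else:
            counts["other"] += 1
    n = len(kept)
    fractions = {k: (v / n if n else 0.0) for k, v in counts.items()}
    return n, fractions
```

Requiring coverage on both sides of the cut site means that a read ending near the cut, which could not reveal an indel there, is not counted as an unedited allele.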
Next-generation sequencing was performed at the VUMC VANTAGE core using the VUMC Clonal Hematopoiesis Sequencing Assay v2.0, a custom capture protocol which covers 24 frequently mutated CH genes plus several dozen germline SNPs; this assay covers the full exonic sequences of all of the genes examined in this study. Samples were sequenced to a depth of between 0.5 x 10^6 and 1 x 10^6 reads per sample on a NovaSeq sequencer (PE150). The DRAGEN Somatic pipeline (v3.10.4, Illumina) was used to validate SNVs and generate BAM files mapped to GRCh38/hg38 that were then analyzed in R (v4.1.1) with Rsamtools (v2.10.0). Prior to final quantification of allele frequency and indels at the targeted SNV, reads were filtered to include only those with unique mapping (MAPQ ≥ 60) and coverage of both the SNV and at least four bases to either side of the predicted cut site 3 bp upstream of the PAM, in order to reduce the likelihood of underestimating indel proportions. Data were then reviewed manually in IGV (v2.14.0, Broad Institute). Allele fractions for non-targeted heterozygous SNPs were derived from IGV coverage statistics for reads with MAPQ ≥ 60. Ideograms were generated with RIdeogram (v0.2.2) mapping to GRCh38/hg38.
Animal models
Animal experiments were performed in accordance with guidelines approved by the Institutional Animal Care and Use Committee (IACUC) at Vanderbilt University Medical Center. Experiments used female NSGS mice, aged 6 to 8 weeks old, which were irradiated with 1 Gy radiation. For the K562 xenograft model, each mouse received 5 x 10^6 cells. In the OCI-AML5 model, each animal received 2 x 10^6 cells. Survival analysis was performed in R using the survival package (v3.3.1).
Western blot
Protein lysates were extracted from 1-5 million cells using RIPA buffer (Thermo Fisher Scientific) supplemented with complete Protease Inhibitor Cocktail (MilliporeSigma) and phosphatase inhibitor cocktail set V (Calbiochem, San Diego, CA, USA). Lysates were mixed 1:1 with 2X Laemmli Buffer (Bio-Rad, Hercules, CA, USA) supplemented with β-mercaptoethanol and denatured at 95°C for 10 minutes. Samples were run on 4-20% TGX gels (Bio-Rad) at 100 V for 5 min, then 80 V for 2 additional hours. Wet transfer to PVDF membrane (MilliporeSigma) was performed at 100 V for 60 min on ice using transfer buffer prepared in house (20% v/v methanol, 0.3% w/v Tris, 1.44% w/v glycine). After overnight blocking in 5% milk in TBST, membranes were probed with primary antibody for 60 min at room temperature, followed by 3x TBST washes, then probed with goat anti-rabbit secondary antibody at 1:5000 for 45 minutes at room temperature, followed by 3x TBST washes. Chemiluminescent detection using autoradiography film (Thomas Scientific, Swedesboro, NJ, USA) was performed using SuperSignal West Dura (Thermo Fisher Scientific) for ASXL1 and SuperSignal West Pico PLUS (Thermo Fisher Scientific) for β-actin. The primary antibodies used in this study were polyclonal anti-ASXL1 raised in rabbit (PA5-68360, Thermo Fisher Scientific) and polyclonal anti-β-actin raised in rabbit (A2066, MilliporeSigma).
RNAseq
Two million cells from each condition (scramble or Y591X-targeted gRNA) were harvested from three independent CRISPR experiments on K562 and OCI-AML5 cell lines. RNA was extracted using the RNeasy Mini Kit (Qiagen), treated with DNase I (NEB, Ipswich, MA, USA) and subsequently purified with the Monarch RNA Cleanup Kit (10 µg) (NEB). Library preparation using NEBNext® Poly(A) selection and sequencing were performed at the VUMC sequencing core. Samples were sequenced to a depth of 50 million reads on a NovaSeq 6000 sequencer, PE150. Preprocessing was performed on the DRAGEN RNA Pipeline (v3.6.3). Raw counts were normalized using DESeq2 (v1.34.0) for individual analyses and with edgeR (v3.36.0) for combined analysis. Differential expression analysis was conducted with DESeq2 or DREAM in variancePartition (v1.24.1). Volcano plots were generated using EnhancedVolcano (v1.12.0). Enrichment analysis and plotting were performed with fgsea (v1.20.0) using the MSigDB hallmark gene sets (v7.5.1).
Cell number and proliferation assays
For cell expansion experiments, cells were plated at an initial density of 100,000 cells/mL and cultured for 72 hours. Counting of cells stained with trypan blue 0.4% (Thermo Fisher Scientific) was performed every 24 hours using a Countess II FL automated cell counter (Thermo Fisher Scientific). For cell division quantification, cells were incubated in CellTrace Violet (Thermo Fisher Scientific) in PBS for 20 minutes at room temperature in the dark, followed by three washes in PBS. Immediately following, an aliquot of cells was fixed in 4% paraformaldehyde, while the remaining cells were cultured for 72 hours followed by paraformaldehyde fixation. Samples were analyzed on a 3-laser BD LSRFortessa (BD Biosciences, Franklin Lakes, NJ, USA). Cell division bins were determined by taking the mean day 0 fluorescence, dividing sequentially by 2, and then calculating the midpoints between the halved intensity values. For cell cycle analysis, cells were plated at an initial density of 200,000 cells/mL. After 24 hours in culture, 7.5 x 10^5 cells were washed twice in PBS, then fixed by the dropwise addition of 3 mL cold 70% ethanol. Cells were then incubated at -20°C for 60 minutes, washed three times in PBS, and resuspended in 0.5 mL BD Perm/Wash Buffer (BD Biosciences). Following this, 0.1 mL of cell suspension was incubated with human APC-conjugated Ki-67 antibody (Cat. 350514, BioLegend, San Diego, CA, USA) at room temperature in the dark for 30 minutes. Cells were washed twice and resuspended in 0.2 mL BD Perm/Wash Buffer. Samples were incubated with 5 µL Propidium Iodide Staining Solution (Cat. 556463, BD Biosciences) and analyzed on a 5-laser BD LSR II (BD Biosciences). Three independent experiments were performed for each assay. FACS data were analyzed using FlowJo software (v10).
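The cell division binning described above can be sketched as follows. This is an illustrative reading of the procedure rather than the analysis code used in the study: it assumes that each division halves the dye intensity, that the bin boundaries are the arithmetic midpoints between successive halved values, and that any cell dimmer than the last boundary is assigned the maximum division count. The function names and the example intensity are hypothetical.

```python
def division_boundaries(day0_mean, max_divisions):
    """Boundaries between division bins: midpoints between successive
    halvings of the mean day 0 fluorescence."""
    peaks = [day0_mean / 2 ** k for k in range(max_divisions + 1)]
    return [(peaks[k] + peaks[k + 1]) / 2 for k in range(max_divisions)]


def count_divisions(intensity, boundaries):
    """Number of divisions assigned to a cell with the given dye intensity."""
    for k, boundary in enumerate(boundaries):
        if intensity > boundary:
            return k
    return len(boundaries)


# Hypothetical mean day 0 intensity of 1000 (arbitrary units), up to 3 divisions
bounds = division_boundaries(1000.0, 3)
print(bounds)                        # [750.0, 375.0, 187.5]
print(count_divisions(500, bounds))  # 1
```

Geometric midpoints would be an alternative when intensities are analyzed on a log scale; the text does not specify which was used.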
Visualization
Except as noted, plots were generated using ggplot2 (v3.3.6), ggpubr (v0.4.0), and RColorBrewer (v1.1.3).
Statistical analyses
For analyses of variant allele fractions, two-sided Welch’s t-tests were used. To test for differential expression in the BEAT AML cohort, a one-tailed Wilcoxon rank sum test was used.
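For reference, the statistic behind Welch's t-test can be written out in a few lines. The sketch below computes only the t statistic and the Welch-Satterthwaite degrees of freedom using the Python standard library; it is not the code used in the study, and the function name is hypothetical. A two-sided p-value would then be taken from the t distribution with these degrees of freedom, for example with `scipy.stats.ttest_ind(a, b, equal_var=False)`.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom for two
    independent samples with possibly unequal variances."""
    va = variance(a) / len(a)  # squared standard error of the mean of a
    vb = variance(b) / len(b)  # squared standard error of the mean of b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df
```

Unlike Student's t-test, this form does not pool the two sample variances, which suits replicate allele-fraction measurements whose spread may differ between treated and control conditions.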
Data availability
The RNAseq data used in this study are available in GEO under GSE212730.
The cell line DNAseq data used in this study are available under NCBI SRA accession number PRJNA880841. The human subject DNAseq data is available upon reasonable request.
Unless defined otherwise, all technical and scientific terms used herein have the same meanings as commonly understood by one of skill in the art to which the disclosed invention belongs. Publications cited herein and the materials for which they are cited are specifically incorporated by reference. Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the invention described herein. Such equivalents are intended to be encompassed by the following claims.

Claims

WHAT IS CLAIMED IS:
1. A composition or system for repairing heterozygous single nucleotide variants (SNVs) in a cell by CRISPR-mediated interallelic gene conversion (IGC), the composition comprising: an isolated nucleic acid encoding a Clustered Regularly Interspaced Short Palindromic Repeat (CRISPR)-associated endonuclease; and an isolated nucleic acid sequence encoding an SNV-specific guide RNA (gRNA) 17 to 24 nucleotides in length complementary to a target gene sequence comprising the SNV, wherein the composition does not comprise an exogenous donor repair template.
2. The composition or system of claim 1, wherein the target gene is selected from the group consisting of ASXL1, DNMT3A, GNAS, GNB1, IDH1, IDH2, KIT, NRAS, PPM1D, SF3B1, SRSF2, TET2, TP53, and U2AF1.
3. The composition or system of claim 1, wherein the SNV is a Y591, Q733, or L775 mutation in ASXL1, such as a Y591X, Q733X, or L775X mutation in ASXL1.
4. The composition or system of claim 1, wherein the SNV is a R326, R635, V657, R729, Y735, R736, R749, F755, R771, I780, R882, W860, or P904 mutation in DNMT3A, such as a R326C, R635Q, R635W, V657M, R729W, Y735C, R736C, R736H, R749C, F755S, R771X, I780T, R882C, R882H, W860X, or P904L mutation in DNMT3A.
5. The composition or system of claim 1, wherein the SNV is a R201 mutation in GNAS, such as a R201H mutation in GNAS.
6. The composition or system of claim 1, wherein the SNV is a K57 mutation in GNB1, such as a K57E mutation in GNB1.
7. The composition or system of claim 1, wherein the SNV is a R132 mutation in IDH1, such as a R132H or R132C mutation in IDH1.
8. The composition or system of claim 1, wherein the SNV is a R140 mutation in IDH2, such as a R140Q or R140L mutation in IDH2.
9. The composition or system of claim 1, wherein the SNV is a D816 mutation in KIT, such as a D816V mutation in KIT.
10. The composition or system of claim 1, wherein the SNV is a G12 mutation in NRAS, such as a G12D mutation in NRAS.
11. The composition or system of claim 1, wherein the SNV is a R552 mutation in PPM1D, such as a R552X mutation in PPM1D.
12. The composition or system of claim 1, wherein the SNV is a R387, H662, T663, K700, G740, or A744 mutation in SF3B1, such as a R387W, H662D, H662Q, T663I, K700E, G740E, G740R, or A744P mutation in SF3B1.
13. The composition or system of claim 1, wherein the SNV is a P95 mutation in SRSF2, such as a P95H, P95L, or P95R mutation in SRSF2.
14. The composition or system of claim 1, wherein the SNV is a R544, Q803, Q916, C1135, Q1191, R1216, R1261, R1359, R1465, S1486, R1516, or I1873 mutation in TET2, such as a R544X, Q803X, Q916X, C1135Y, Q1191X, R1216Q, R1216X, R1261C, R1261H, R1359C, R1359H, R1465X, S1486X, R1516X, or I1873T mutation in TET2.
15. The composition or system of claim 1, wherein the SNV is a G108, R110, P177, H179, Y220, M237, C238, C242, M246, R248, R273, R306, or R342 mutation in TP53, such as a G108S, R110L, P177R, H179Y, Y220C, M237I, C238Y, C242Y, M246V, R248Q, R273H, R306X, or R342X mutation in TP53.
16. The composition or system of claim 1, wherein the SNV is a S34 mutation in U2AF1, such as a S34F mutation in U2AF1.
17. The composition or system of claim 1, wherein the SNV is selected from the group consisting of rs377577594, rs147001633, rs200018028, rs121913237, rs371369583, rs371369583, rs387907078, rs149095705, rs147828672, rs144689354, rs370751539, rs779626155, rs200018028, rs761934754, rs751562376, rs747448117, rs751713049, rs141326438, rs559063155, rs121913495, rs779070661, rs121913500, rs121913499, rs121913502, rs121913502, rs121913507, rs371769427, rs752263134, rs28934576, rs730882005, rs11540652, rs121912666, rs751477326, rs483352695, rs121912655, rs587782664, rs587780070, rs11540654, rs587782461, rs730882029, rs121913344, rs778467242, rs1239341681, rs776846119, rs745511585, rs1009194427, rs116519313, rs370735654, rs780710758, rs771761785, rs1235228377, rs769422572, rs1729381211, rs562667223, rs898441677, rs759658003, rs775677220, rs1376289450, rs1440692352, and rs368508787.
18. The composition or system of claim 1, wherein the gRNA comprises a polynucleotide selected from the group consisting of SEQ ID NO: 1-216.
19. The composition or system of any one of claims 1 to 18, wherein the CRISPR-associated endonuclease is a high fidelity Cas9.
20. A method for repairing heterozygous single nucleotide variants (SNVs) in a subject in need thereof, the method comprising administering to the subject a therapeutically effective amount of the composition or system of any one of claims 1
PCT/US2023/079724 2022-11-15 2023-11-15 Repair of disease-associated single nucleotide variants via interallelic gene conversion WO2024107784A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202263383776P 2022-11-15 2022-11-15
US63/383,776 2022-11-15

Publications (2)

Publication Number Publication Date
WO2024107784A2 true WO2024107784A2 (en) 2024-05-23
WO2024107784A3 WO2024107784A3 (en) 2024-07-04

Family

ID=91085405

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2023/079724 WO2024107784A2 (en) 2022-11-15 2023-11-15 Repair of disease-associated single nucleotide variants via interallelic gene conversion

Country Status (1)

Country Link
WO (1) WO2024107784A2 (en)

Also Published As

Publication number Publication date
WO2024107784A3 (en) 2024-07-04

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23892445

Country of ref document: EP

Kind code of ref document: A2