WO2021175288A1 - Improved cytosine base editing*** - Google Patents

Publication number
WO2021175288A1
Authority
WO
WIPO (PCT)
Prior art keywords
amino acid
apobec3b deaminase
ha3bctd
ha3b
base editing
Application number
PCT/CN2021/079086
Inventor
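Caixia Gao (高彩霞)
Yanpeng Wang (王延鹏)
Shuai Jin (靳帅)
Yuan Zong (宗媛)
Original Assignee
Institute of Genetics and Developmental Biology, Chinese Academy of Sciences (中国科学院遗传与发育生物学研究所)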
Application filed by Institute of Genetics and Developmental Biology, Chinese Academy of Sciences (中国科学院遗传与发育生物学研究所)
Priority to JP2022553071A (published as JP2023517890A)
Priority to KR1020227034519A (published as KR20220150363A)
Priority to BR112022017732A (published as BR112022017732A2)
Priority to AU2021229415A (published as AU2021229415A1)
Priority to EP21764693.4A (published as EP4130257A4)
Priority to CN202180019220.7A (published as CN115427564A)
Priority to CA3174615A (published as CA3174615A1)
Priority to US17/909,570 (published as US20230313234A1)
Publication of WO2021175288A1

Classifications

    • C12N15/90 Stable introduction of foreign DNA into chromosome
    • C12N9/22 Ribonucleases RNAses, DNAses
    • C12N15/102 Mutagenizing nucleic acids
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113 Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C12N15/62 DNA sequences coding for fusion proteins
    • C12N15/8216 Methods for controlling, regulating or enhancing expression of transgenes in plant cells
    • C12N9/78 Hydrolases (3) acting on carbon to nitrogen bonds other than peptide bonds (3.5)
    • C12Y305/04001 Cytosine deaminase (3.5.4.1)
    • C12Y305/04005 Cytidine deaminase (3.5.4.5)
    • C07K2319/80 Fusion polypeptide containing a DNA binding domain, e.g. Lacl or Tet-repressor
    • C12N2310/20 Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]

Definitions

  • The invention belongs to the field of gene editing. Specifically, the present invention relates to an improved cytosine base editing system with significantly reduced genome-wide off-target effects and a narrowed editing window.
  • Genome editing technology is a genetic engineering technology based on the targeted modification of the genome by artificial nucleases, and it is playing an increasingly powerful role in agricultural and medical research.
  • Clustered regularly interspaced short palindromic repeats (CRISPR) and their associated (Cas) system are currently the most widely used genome editing tools.
  • Under the guidance of an artificially designed guide RNA, the Cas protein can be targeted to any location in the genome.
  • The base editing system is a new gene editing technology developed from the CRISPR system. It is divided into the cytosine base editing system and the adenine base editing system.
  • In these systems, cytosine deaminase and adenine deaminase, respectively, are fused with a Cas9 single-strand nickase. Under the targeting action of the guide RNA, the Cas9 nickase produces a single-stranded DNA region, so the deaminase can efficiently deaminate the C and A nucleotides on the single-stranded DNA at the targeted position, converting them into U and I bases, which are then repaired into T and G bases during the cell's own repair process.
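The conversion described above can be illustrated with a minimal Python sketch. It assumes a hypothetical 20-nt target sequence and an illustrative editing window (positions 4 to 8, counted from the 5' end); neither value comes from this document, and real windows depend on the deaminase and Cas9 variant used. The function name `simulate_edit` is likewise invented for illustration.

```python
# Base conversions described in the text: C -> U -> T for cytosine editors,
# A -> I -> G for adenine editors.
CONVERSIONS = {"CBE": ("C", "T"), "ABE": ("A", "G")}


def simulate_edit(protospacer, editor="CBE", window=(4, 8)):
    """Convert every editable base inside the window (1-based, inclusive).

    The default window is an illustrative assumption, not a value from
    this document.
    """
    source, target = CONVERSIONS[editor]
    start, end = window
    edited = []
    for position, base in enumerate(protospacer.upper(), start=1):
        in_window = start <= position <= end
        edited.append(target if (base == source and in_window) else base)
    return "".join(edited)


if __name__ == "__main__":
    site = "GACCATCCGATTGCAAGCTA"  # hypothetical target sequence
    print(simulate_edit(site, "CBE"))  # GACTATTTGATTGCAAGCTA
    print(simulate_edit(site, "ABE"))  # GACCGTCCGATTGCAAGCTA
```

This models only the fully edited outcome on the deaminated strand; it ignores editing efficiency and repair outcomes other than the intended one.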
  • The cytosine base editing system has been found to produce unpredictable off-target edits in the genome. This is probably caused by overexpression of the cytosine deaminase and its random deamination of highly active regions of the genome. In addition, if there are multiple Cs in the working window of the target site, existing high-efficiency base editing systems often yield products in which several Cs are changed at the same time, and cannot yield products carrying only a single C mutation. Genome-wide specificity and accuracy at the target site therefore greatly influence the usability of cytosine base editing systems.
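To make the multiple-C problem concrete, the following sketch enumerates every C-to-T product that is possible when several Cs fall inside the working window, and then picks out the single-C products. The target sequence and the window (positions 4 to 8) are hypothetical values chosen for illustration, and the function names are not from this document.

```python
from itertools import combinations


def possible_products(protospacer, window=(4, 8)):
    """Map each non-empty set of edited C positions to its product sequence.

    Positions are 1-based and inclusive; the window is an assumed example.
    """
    start, end = window
    seq = protospacer.upper()
    c_positions = [
        pos for pos, base in enumerate(seq, start=1)
        if base == "C" and start <= pos <= end
    ]
    products = {}
    for size in range(1, len(c_positions) + 1):
        for subset in combinations(c_positions, size):
            bases = list(seq)
            for pos in subset:
                bases[pos - 1] = "T"  # C -> T at this position
            products[subset] = "".join(bases)
    return products


def single_c_products(products):
    """Keep only products in which exactly one C was converted."""
    return {pos: seq for pos, seq in products.items() if len(pos) == 1}


if __name__ == "__main__":
    all_products = possible_products("GACCATCCGATTGCAAGCTA")
    print(len(all_products))                      # 7
    print(sorted(single_c_products(all_products)))  # [(4,), (7,), (8,)]
```

With three Cs in the window there are seven possible products, of which only three carry a single C change; a narrower window reduces the number of multi-C products that can arise.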
  • The specificity and accuracy of the cytosine base editing system may be related to the ability of cytosine deaminase to bind single-stranded DNA. By changing or weakening the deaminase's ability to bind single-stranded DNA without reducing its deamination ability, it may be possible to obtain a cytosine base editing system that is efficient, specific and precise.
  • The inventors optimized Loop1 and Loop7, which bind single-stranded DNA, in the human-derived hA3Bctd (APOBEC3B C-terminal domain), tested the efficiency and accuracy of the resulting variants through rice protoplast transformation, and tested their specificity, thereby obtaining a series of highly efficient, highly specific, and highly accurate base editing systems.
  • Figure 1 shows the selection of A3Bctd mutation sites.
  • Figure 2 shows the targeting efficiency and off-target efficiency of the base editing systems tested.
  • Figure 3 shows the average targeting efficiency and average off-target efficiency of the base editing systems tested.
  • Figure 4 shows the combinations used for the double and triple mutants.
  • Figure 5 shows verification, by protoplast transformation, of the targeting and off-target efficiency of the double and triple mutants.
  • Figure 6 shows the average targeting efficiency and average off-target efficiency of the double and triple mutants tested.
  • Figure 7 shows the working efficiency of different base editing systems for different Cs at the four targeted sites.
  • Figure 8 shows the average mutation types of the editing products of different base editing systems at the four targeted sites.
  • The term “and/or” encompasses all combinations of items connected by the term, and should be treated as if each combination has been individually listed herein.
  • “A and/or B” encompasses “A”, “A and B”, and “B”.
  • “A, B, and/or C” encompasses “A”, “B”, “C”, “A and B”, “A and C”, “B and C”, and “A and B and C”.
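Read this way, “and/or” denotes every non-empty combination of the listed items. A small Python sketch (the helper name `and_or` is invented for illustration) reproduces the two enumerations given above:

```python
from itertools import combinations


def and_or(items):
    """Return every non-empty combination of the items, smallest first."""
    items = list(items)
    return [
        combo
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]


if __name__ == "__main__":
    print(and_or(["A", "B"]))            # [('A',), ('B',), ('A', 'B')]
    print(len(and_or(["A", "B", "C"])))  # 7
```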
  • The protein or nucleic acid may consist of the sequence, or may have additional amino acids or nucleotides at one or both ends of the protein or nucleic acid, while still having the activity described in the present invention.
  • The methionine encoded by the start codon at the N-terminus of a polypeptide will be retained under certain practical conditions (for example, when expressed in a specific expression system), but this does not substantially affect the function of the polypeptide.
  • “CRISPR effector protein” generally refers to a nuclease present in a naturally occurring CRISPR system, as well as its modified forms, variants, catalytically active fragments, and the like.
  • The term covers any effector protein based on the CRISPR system that can achieve gene targeting (such as gene editing, targeted gene regulation, etc.) in cells.
  • Examples of CRISPR effector proteins include Cas9 nuclease or variants thereof.
  • The Cas9 nuclease may be a Cas9 nuclease from different species, such as SpCas9 from S. pyogenes or SaCas9 from S. aureus.
  • “Cas9 nuclease” and “Cas9” are used interchangeably herein and refer to an RNA-guided nuclease comprising the Cas9 protein or a fragment thereof (for example, a protein containing the active DNA cleavage domain of Cas9 and/or the gRNA binding domain of Cas9).
  • Cas9 is a component of the CRISPR/Cas (clustered regularly interspaced short palindromic repeats and associated systems) genome editing system, which can target and cleave a DNA target sequence under the guidance of a guide RNA to form a DNA double-strand break (DSB).
  • CRISPR effector proteins may also include Cpf1 nuclease or variants thereof, such as highly specific variants.
  • The Cpf1 nuclease may be a Cpf1 nuclease from different species, for example, from Francisella novicida U112, Acidaminococcus sp. BV3L6 and Lachnospiraceae bacterium ND2006.
  • The CRISPR effector protein can also be derived from Cas3, Cas8a, Cas5, Cas8b, Cas8c, Cas10d, Cse1, Cse2, Csy1, Csy2, Csy3, GSU0054, Cas10, Csm2, Cmr5, Cas10, Csx11, Csx10, Csf1, Csn2, Cas4, C2c1, C2c3 or C2c2 nucleases; for example, it may include these nucleases or functional variants thereof.
  • “Genome” as used herein covers not only chromosomal DNA present in the nucleus, but also organelle DNA present in subcellular components of the cell (such as mitochondria and plastids).
  • “Organism” includes any organism suitable for genome editing, preferably a eukaryote.
  • Examples of organisms include, but are not limited to, mammals such as humans, mice, rats, monkeys, dogs, pigs, sheep, cattle, and cats; poultry such as chickens, ducks, and geese; and plants, including monocots and dicots, for example rice, corn, wheat, sorghum, barley, soybean, peanut, Arabidopsis and so on.
  • “Genetically modified organism” or “genetically modified cell” means an organism or cell that contains exogenous polynucleotides or modified genes or expression control sequences in its genome.
  • Exogenous polynucleotides can be stably integrated into the genome of organisms or cells and inherited over successive generations.
  • The exogenous polynucleotide can be integrated into the genome alone or as part of a recombinant DNA construct.
  • The modified gene or expression control sequence contains single or multiple deoxynucleotide substitutions, deletions and additions in the genome of the organism or cell.
  • “Exogenous”, in reference to a sequence, means a sequence from a foreign species or, if from the same species, a sequence that has undergone significant changes in composition and/or locus from its natural form through deliberate human intervention.
  • The term “nucleic acid sequence” and its synonyms are used interchangeably and refer to single-stranded or double-stranded RNA or DNA polymers, optionally containing synthetic, non-natural or altered nucleotide bases.
  • Nucleotides are referred to by their single-letter names as follows: “A” is adenosine or deoxyadenosine (for RNA or DNA, respectively), “C” is cytidine or deoxycytidine, “G” is guanosine or deoxyguanosine, “U” is uridine, “T” is deoxythymidine, “R” is a purine (A or G), “Y” is a pyrimidine (C or T), “K” is G or T, “H” is A or C or T, “I” is inosine, and “N” is any nucleotide.
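The degenerate letters defined above can be expanded mechanically. The sketch below covers only the DNA letters listed in this paragraph (A, C, G, T, R, Y, K, H, N); treating “N” as A, C, G or T, and omitting “U” and “I”, are simplifying assumptions for a DNA-only example, and the function name `expand` is invented for illustration.

```python
from itertools import product

# Single-letter codes as defined in the text, restricted to DNA bases.
CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "K": "GT",
    "H": "ACT",
    "N": "ACGT",  # any nucleotide (DNA-only assumption)
}


def expand(pattern):
    """List every concrete DNA sequence matched by a degenerate pattern."""
    pools = [CODES[letter] for letter in pattern.upper()]
    return ["".join(bases) for bases in product(*pools)]


if __name__ == "__main__":
    print(expand("RY"))        # ['AC', 'AT', 'GC', 'GT']
    print(len(expand("NH")))   # 12
```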
  • “Polypeptide”, “peptide”, and “protein” are used interchangeably in the present invention and refer to a polymer of amino acid residues.
  • The term applies to amino acid polymers in which one or more amino acid residues are corresponding artificial chemical analogs of naturally occurring amino acids, as well as to naturally occurring amino acid polymers.
  • The terms “polypeptide”, “peptide”, “amino acid sequence” and “protein” may also include modified forms, including but not limited to glycosylation, lipid linkage, sulfation, gamma-carboxylation of glutamic acid residues, hydroxylation and ADP-ribosylation.
  • Suitable conservative amino acid substitutions are known to those skilled in the art and can generally be made without changing the biological activity of the resulting molecule.
  • Those skilled in the art recognize that a single amino acid substitution in a non-essential region of a polypeptide does not substantially change the biological activity (see, for example, Watson et al., Molecular Biology of the Gene, 4th Edition, 1987, The Benjamin/Cummings Pub. Co., p. 224).
  • “Expression construct” refers to a vector suitable for expression of a nucleotide sequence of interest in an organism, such as a recombinant vector.
  • “Expression” refers to the production of a functional product.
  • The expression of a nucleotide sequence may refer to the transcription of the nucleotide sequence (such as transcription to generate mRNA or functional RNA) and/or the translation of RNA into a precursor or mature protein.
  • The “expression construct” of the present invention can be a linear nucleic acid fragment, a circular plasmid, a viral vector, or, in some embodiments, an RNA (such as mRNA) that can be translated.
  • The “expression construct” of the present invention may contain regulatory sequences and nucleotide sequences of interest from different sources, or regulatory sequences and nucleotide sequences of interest from the same source but arranged in a manner different from that normally occurring in nature.
  • “Regulatory sequence” and “regulatory element” are used interchangeably and refer to nucleotide sequences located upstream (5' non-coding sequence), within, or downstream (3' non-coding sequence) of a coding sequence that affect the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences may include, but are not limited to, promoters, translation leader sequences, introns, and polyadenylation recognition sequences.
  • Promoter refers to a nucleic acid fragment capable of controlling the transcription of another nucleic acid fragment.
  • The promoter is a promoter capable of controlling gene transcription in a cell, regardless of whether it is derived from that cell.
  • The promoter can be a constitutive promoter, a tissue-specific promoter, a developmentally regulated promoter or an inducible promoter.
  • “Tissue-specific promoter” and “tissue-preferred promoter” are used interchangeably and refer to a promoter that is expressed mainly, but not necessarily exclusively, in a tissue or organ, and that may also be expressed in a specific cell or cell type.
  • “Developmentally regulated promoter” refers to a promoter whose activity is determined by developmental events.
  • Inducible promoters selectively express operably linked DNA sequences in response to endogenous or exogenous stimuli (environment, hormones, chemical signals, etc.).
  • “Operably linked” refers to the connection of regulatory elements (for example, but not limited to, promoter sequences, transcription termination sequences, etc.) to nucleic acid sequences (for example, coding sequences or open reading frames) such that transcription of the nucleotide sequence is controlled and regulated by the transcription control element.
  • “Introducing” nucleic acid molecules (such as plasmids, linear nucleic acid fragments, RNA, etc.) or proteins into an organism refers to transforming the cells of the organism with the nucleic acid or protein so that the nucleic acid or protein can function in the cell.
  • The “transformation” used in the present invention includes stable transformation and transient transformation.
  • “Stable transformation” refers to the introduction of an exogenous nucleotide sequence into the genome, resulting in the stable inheritance of the exogenous nucleotide sequence. Once stably transformed, the exogenous nucleic acid sequence is stably integrated into the genome of the organism and any successive generations thereof.
  • “Transient transformation” refers to the introduction of nucleic acid molecules or proteins into cells to perform functions without stable inheritance of exogenous nucleotide sequences. In transient transformation, the foreign nucleic acid sequence is not integrated into the genome.
  • “Traits” refer to the physiological, morphological, biochemical or physical characteristics of cells or organisms.
  • “Agronomic traits” especially refer to the measurable index parameters of crop plants, including but not limited to: leaf greenness, grain yield, growth rate, total biomass or accumulation rate, fresh weight at maturity, dry weight at maturity, fruit yield, seed yield, plant total nitrogen content, fruit nitrogen content, seed nitrogen content, plant vegetative tissue nitrogen content, plant total free amino acid content, fruit free amino acid content, seed free amino acid content, plant vegetative tissue free amino acid content, plant total protein content, fruit protein content, seed protein content, plant vegetative tissue protein content, herbicide resistance, drought resistance, nitrogen absorption, root lodging, harvest index, stem lodging, plant height, ear height, ear length, disease resistance, cold resistance, salt resistance and tiller number.
  • The present invention provides a base editing fusion protein, which comprises APOBEC3B deaminase or a mutant of APOBEC3B deaminase fused with a CRISPR effector protein.
  • “Base editing fusion protein” and “base editor” can be used interchangeably.
  • The base editing fusion protein of the present invention containing APOBEC3B deaminase or its mutants can perform efficient base editing on target sequences, while having significantly reduced genome-wide random off-target effects compared with other base editors.
  • The base editing fusion protein of the present invention comprising APOBEC3B deaminase or a mutant thereof has a shortened editing window for the target sequence, and can achieve more precise base editing.
  • The APOBEC3B deaminase mutant is, or is derived from, human APOBEC3B deaminase.
  • An exemplary wild-type human APOBEC3B deaminase comprises the amino acid sequence shown in SEQ ID NO: 19.
  • The APOBEC3B deaminase mutant is, or is derived from, the C-terminal domain (hA3Bctd, APOBEC3B C-terminal domain) of human APOBEC3B deaminase.
  • An exemplary wild-type hA3Bctd includes the amino acid sequence of SEQ ID NO: 2.
  • The APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain (hA3Bctd) of human APOBEC3B deaminase, and contains, relative to wild-type hA3B or hA3Bctd, amino acid substitutions at one or more of the following positions: 210, 211, 214, 230, 240, 281, 308, 311, 313, 314 and 315, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
  • The APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain (hA3Bctd) of human APOBEC3B deaminase, and contains, relative to wild-type hA3B or hA3Bctd, amino acid substitutions at one or more of the following positions: 211, 214, 308, 311, 313, 314 and 315, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
  • The APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain (hA3Bctd) of human APOBEC3B deaminase, and contains, relative to wild-type hA3B or hA3Bctd, amino acid substitutions at positions 211 and 311, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
  • The APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain (hA3Bctd) of human APOBEC3B deaminase, and contains, relative to wild-type hA3B or hA3Bctd, amino acid substitutions at positions 211 and 313, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
  • The APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain (hA3Bctd) of human APOBEC3B deaminase, and contains, relative to wild-type hA3B or hA3Bctd, amino acid substitutions at positions 211 and 314, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
  • The APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain (hA3Bctd) of human APOBEC3B deaminase, and contains, relative to wild-type hA3B or hA3Bctd, amino acid substitutions at positions 311 and 313, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
  • The APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain (hA3Bctd) of human APOBEC3B deaminase, and contains, relative to wild-type hA3B or hA3Bctd, amino acid substitutions at positions 214 and 314, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
  • The APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain (hA3Bctd) of human APOBEC3B deaminase, and contains, relative to wild-type hA3B or hA3Bctd, amino acid substitutions at positions 314 and 315, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
  • The APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain (hA3Bctd) of human APOBEC3B deaminase, and contains, relative to wild-type hA3B or hA3Bctd, amino acid substitutions at positions 211, 311 and 314, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
  • The APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain (hA3Bctd) of human APOBEC3B deaminase, and contains, relative to wild-type hA3B or hA3Bctd, amino acid substitutions at positions 211, 214 and 313, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
  • The APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain (hA3Bctd) of human APOBEC3B deaminase, and contains, relative to wild-type hA3B or hA3Bctd, amino acid substitutions at positions 214, 314 and 315, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
  • the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain (hA3Bctd) of human APOBEC3B deaminase, and contains relative to wild-type hA3B or hA3Bctd
  • One or more amino acid substitutions selected from: R210A, R210K3, R211K, T214C, T214G, T214S, T214V, L230K, N240A, W281H, F308K, R311K, Y313F, D314R, D314H, Y315M, wherein the amino acid positions refer to SEQ ID NO: 19 confirmed.
  • the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain (hA3Bctd) of human APOBEC3B deaminase, and contains relative to wild-type hA3B or hA3Bctd
  • One or more amino acid substitutions selected from the following: R211K, T214V, F308K, R311K, Y313F, D314R, D314H, and Y315M, wherein the position of the amino acid is determined with reference to SEQ ID NO: 19.
  • the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain (hA3Bctd) of human APOBEC3B deaminase, and contains relative to wild-type hA3B or hA3Bctd Amino acid substitutions R211K and R311K, wherein the position of the amino acid is determined with reference to SEQ ID NO: 19.
  • the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain (hA3Bctd) of human APOBEC3B deaminase, and contains relative to wild-type hA3B or hA3Bctd Amino acid substitutions R211K and Y313F, wherein the position of the amino acid is determined with reference to SEQ ID NO: 19.
  • the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain (hA3Bctd) of human APOBEC3B deaminase, and contains relative to wild-type hA3B or hA3Bctd Contains amino acid substitutions R211K and D314R, wherein the position of the amino acid is determined with reference to SEQ ID NO: 19.
  • the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain (hA3Bctd) of human APOBEC3B deaminase, and contains relative to wild-type hA3B or hA3Bctd Amino acid substitutions R311K and Y313F, wherein the position of the amino acid is determined with reference to SEQ ID NO: 19.
  • the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain (hA3Bctd) of human APOBEC3B deaminase, and contains relative to wild-type hA3B or hA3Bctd Amino acid substitutions T214V and D314R, wherein the position of the amino acid is determined with reference to SEQ ID NO: 19.
  • the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain (hA3Bctd) of human APOBEC3B deaminase, and contains relative to wild-type hA3B or hA3Bctd Amino acid substitutions D314R and Y315M, wherein the position of the amino acid is determined with reference to SEQ ID NO: 19.
  • the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain (hA3Bctd) of human APOBEC3B deaminase, and contains relative to wild-type hA3B or hA3Bctd Amino acid substitutions R211K, R311K and D314R, wherein the position of the amino acid is determined with reference to SEQ ID NO: 19.
  • the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain (hA3Bctd) of human APOBEC3B deaminase, and contains relative to wild-type hA3B or hA3Bctd Amino acid substitutions R211K, T214V and Y313F, wherein the position of the amino acid is determined with reference to SEQ ID NO: 19.
  • the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain (hA3Bctd) of human APOBEC3B deaminase, and contains relative to wild-type hA3B or hA3Bctd Amino acid substitutions T214V, D314H and Y315M, wherein the position of the amino acid is determined with reference to SEQ ID NO: 19.
  • the APOBEC3B deaminase mutant comprises an amino acid sequence selected from SEQ ID NO: 3-18, 26-31, and 32-34.
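The substitution labels used above (for example R211K) follow the usual "wild-type residue, position, new residue" convention, with positions counted on the full-length reference (SEQ ID NO: 19) even when only the C-terminal domain is used. A minimal Python sketch of how such labels could be parsed and applied is given below. The toy sequence and the `first_position` offset are invented for illustration and are not the real hA3Bctd sequence or its real starting position.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    wild_type: str
    position: int  # numbered on the full-length reference
    mutant: str


def parse_substitution(label: str) -> Substitution:
    """Parse a label such as 'R211K' into its three parts."""
    m = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if m is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return Substitution(m.group(1), int(m.group(2)), m.group(3))


def apply_substitutions(seq: str, labels, first_position: int = 1) -> str:
    """Apply substitutions to seq, whose first residue carries the
    reference number first_position (an assumption supplied by the caller)."""
    residues = list(seq)
    for label in labels:
        sub = parse_substitution(label)
        idx = sub.position - first_position
        if not 0 <= idx < len(residues):
            raise ValueError(f"{label}: position outside the sequence")
        if residues[idx] != sub.wild_type:
            raise ValueError(f"{label}: found {residues[idx]} at that position")
        residues[idx] = sub.mutant
    return "".join(residues)


# Toy fragment only: pretend its residues are numbered 210..216.
toy_fragment = "RRAATDY"
double_mutant = apply_substitutions(toy_fragment, ["R211K", "T214V"], first_position=210)
```

Checking the wild-type residue before substituting catches numbering mistakes, which are easy to make when a truncated domain is numbered against the full-length protein.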
  • the CRISPR effector protein is a "nuclease-inactivated CRISPR effector protein".
  • "nuclease-inactivated CRISPR effector protein" refers to a CRISPR effector protein that has lost its double-stranded nucleic acid cleavage activity but still retains its gRNA-directed DNA targeting ability.
  • CRISPR effector proteins lacking double-stranded nucleic acid cleavage activity also encompass nickases, which form a nick in the double-stranded nucleic acid molecule, but do not completely cut the double-stranded nucleic acid.
  • the nuclease-inactivated CRISPR effector protein of the present invention has nickase activity. Without being limited by any theory, it is believed that mismatch repair in eukaryotes guides the removal and repair of mismatched bases in the DNA strand through nicks in the strand.
  • the U:G mismatch formed by the action of cytidine deaminase may be repaired back to C:G. Introducing a nick in the strand containing the unedited G makes it possible to preferentially repair the U:G mismatch to the desired U:A or T:A.
  • the nuclease-inactivated CRISPR effector protein is nuclease-inactivated Cas9.
  • the DNA cleavage domain of Cas9 nuclease is known to contain two subdomains: HNH nuclease subdomain and RuvC subdomain.
  • the HNH subdomain cleaves the strand complementary to gRNA, while the RuvC subdomain cleaves the non-complementary strand. Mutations in these subdomains can inactivate the nuclease activity of Cas9, forming "nuclease-inactivated Cas9".
  • the nuclease-inactivated Cas9 still retains the gRNA-directed DNA binding ability. Therefore, in principle, when fused with another protein, the nuclease-inactivated Cas9 can target the additional protein to almost any DNA sequence simply by co-expression with a suitable guide RNA.
  • the nuclease-inactivated Cas9 of the present invention can be derived from the Cas9 of different species, for example from S. pyogenes Cas9 (SpCas9) or from Staphylococcus aureus (S. aureus) Cas9 (SaCas9). Simultaneously mutating the HNH nuclease subdomain and the RuvC subdomain of Cas9 (for example, with mutations D10A and H840A) inactivates the nuclease activity of Cas9 and yields nuclease-dead Cas9 (dCas9). Mutating and inactivating only one of the subdomains gives Cas9 nickase activity, that is, yields a Cas9 nickase (nCas9), for example nCas9 carrying only the D10A mutation.
  • the nuclease-inactivated Cas9 of the present invention contains the amino acid substitution D10A and/or H840A relative to the wild-type Cas9.
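The relationship between the D10A and H840A substitutions and the resulting Cas9 form can be summarized in a few lines of Python. This is only an illustrative sketch for SpCas9. The function name is hypothetical, and the assignment of D10A to the RuvC subdomain and H840A to the HNH subdomain is general background knowledge about SpCas9 rather than something stated in this text.

```python
def classify_cas9(substitutions) -> str:
    """Classify an SpCas9 variant from its set of substitution labels.

    Assumes D10A inactivates the RuvC subdomain and H840A inactivates
    the HNH subdomain (standard SpCas9 background, not from this text).
    """
    subs = set(substitutions)
    ruvc_inactive = "D10A" in subs
    hnh_inactive = "H840A" in subs
    if ruvc_inactive and hnh_inactive:
        return "dCas9"  # both subdomains inactive: no cleavage at all
    if ruvc_inactive or hnh_inactive:
        return "nCas9"  # one subdomain still active: nicks one strand
    return "Cas9"       # wild-type nuclease: double-strand break


variant = classify_cas9({"D10A"})
```

With only D10A present, the HNH subdomain remains active, so the nick falls on the strand complementary to the gRNA, which is the strand carrying the unedited G.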
  • the nuclease-inactivated Cas9 may also contain additional mutations.
  • SpCas9 with nuclease inactivation may also include EQR, VQR, or VRER mutations
  • SaCas9 may also include KKH mutations (Kim et al. Nat. Biotechnol. 35, 371-376.).
  • the nuclease-inactivated SpCas9 includes the amino acid sequence shown in SEQ ID NO:35.
  • the nuclease-inactivated CRISPR effector protein is nuclease-inactivated Cpf1.
  • Cpf1 contains a DNA cleavage domain (RuvC), which can be mutated to abolish the DNA cleavage activity of Cpf1, forming "Cpf1 lacking DNA cleavage activity".
  • the Cpf1 lacking DNA cleavage activity still retains the DNA binding ability guided by gRNA. Therefore, in principle, when fused with another protein, Cpf1 lacking DNA cleavage activity can target the additional protein to almost any DNA sequence simply by co-expression with a suitable guide RNA.
  • the Cpf1 lacking DNA cleavage activity of the present invention can be derived from Cpf1 of different species, for example, Cpf1 proteins derived from Francisella novicida U112, Acidaminococcus sp. BV3L6 and Lachnospiraceae bacterium ND2006 called FnCpf1, AsCpf1 and LbCpf1, respectively.
  • the Cpf1 lacking DNA cleavage activity is FnCpf1 lacking DNA cleavage activity. In some embodiments, the FnCpf1 lacking DNA cleavage activity comprises a D917A mutation relative to the wild-type FnCpf1.
  • the Cpf1 lacking DNA cleavage activity is AsCpf1 lacking DNA cleavage activity. In some embodiments, the AsCpf1 lacking DNA cleavage activity comprises a D908A mutation relative to the wild-type AsCpf1.
  • the Cpf1 lacking DNA cleavage activity is LbCpf1 lacking DNA cleavage activity.
  • the LbCpf1 lacking DNA cleavage activity comprises a D832A mutation relative to the wild-type LbCpf1.
  • the APOBEC3B deaminase or APOBEC3B deaminase mutant is fused to the N-terminus of the CRISPR effector protein (for example, a nuclease-inactivated CRISPR effector protein, such as Cas9 or Cpf1).
  • the APOBEC3B deaminase or APOBEC3B deaminase mutant and the CRISPR effector protein are fused via a linker.
  • the linker can be a non-functional amino acid sequence without secondary or higher-order structure that is 1-50 (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or 20-25, 25-50) or more amino acids long.
  • the linker may be a flexible linker.
  • the linker is 16 or 32 amino acids long.
  • the linker is the XTEN linker shown in SEQ ID NO: 36 or 37.
  • uracil DNA glycosylase catalyzes the removal of U from DNA and initiates base excision repair (BER), resulting in the repair of U:G to C:G. Therefore, without being limited by any theory, the inclusion of a uracil DNA glycosylase inhibitor in the base editing fusion protein of the present invention will be able to increase the efficiency of base editing.
  • the base editing fusion protein further comprises Uracil DNA Glycosylase Inhibitor (UGI).
  • the uracil DNA glycosylase inhibitor comprises the amino acid sequence shown in SEQ ID NO: 38.
  • the base editing fusion protein of the present invention further comprises a nuclear localization sequence (NLS).
  • one or more NLS in the base editing fusion protein should have sufficient strength to drive the base editing fusion protein to accumulate in the nucleus of the cell in an amount that can realize its base editing function.
  • the strength of nuclear localization activity is determined by the number and position of NLS in the base editing fusion protein, one or more specific NLS used, or a combination of these factors.
  • the NLS of the base editing fusion protein of the present invention may be located at the N-terminus and/or C-terminus. In some embodiments of the present invention, the NLS of the base editing fusion protein of the present invention may be located between the APOBEC3B deaminase or APOBEC3B deaminase mutant and the CRISPR effector protein. In some embodiments, the base editing fusion protein comprises about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more NLS. In some embodiments, the base editing fusion protein comprises about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more NLS at or near the N-terminus.
  • the base editing fusion protein comprises about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more NLS at or near the C-terminus. In some embodiments, the base editing fusion protein includes a combination of these, such as one or more NLS at the N-terminus and one or more NLS at the C-terminus. When there is more than one NLS, each can be selected independently of the others. In some preferred embodiments of the present invention, the base editing fusion protein comprises at least 2 NLS, for example, the at least 2 NLS are located at the C-terminus. In some embodiments, the NLS is located at the C-terminus of the base editing fusion protein. In some embodiments, the base editing fusion protein comprises at least 3 NLS.
  • NLS consists of one or more short sequences of positively charged lysine or arginine exposed on the surface of the protein, but other types of NLS are also known.
  • Non-limiting examples of NLS include: PKKKRKV or KRPAATKKAGQAKKKK.
  • the N-terminus of the base editing fusion protein comprises an NLS having the amino acid sequence PKKKRKV. In some embodiments of the present invention, the C-terminus of the base editing fusion protein comprises an NLS having the amino acid sequence KRPAATKKAGQAKKKK. In some embodiments of the present invention, the C-terminus of the base editing fusion protein comprises an NLS having the amino acid sequence PKKKRKV.
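As a small illustration of the NLS arrangement described above, the following Python sketch scans a protein sequence for the two example NLS motifs and reports where they occur. The helper name `find_nls` and the toy protein (two motifs separated by a glycine spacer) are hypothetical and stand in for a real fusion protein sequence.

```python
# The two non-limiting NLS examples named in the text.
NLS_MOTIFS = ("PKKKRKV", "KRPAATKKAGQAKKKK")


def find_nls(protein: str):
    """Return a sorted list of (start_index, motif) for every NLS occurrence."""
    hits = []
    for motif in NLS_MOTIFS:
        start = protein.find(motif)
        while start != -1:
            hits.append((start, motif))
            start = protein.find(motif, start + 1)
    return sorted(hits)


# Toy construct: N-terminal NLS, a 30-residue placeholder body, C-terminal NLS.
toy_protein = "PKKKRKV" + "G" * 30 + "KRPAATKKAGQAKKKK"
nls_hits = find_nls(toy_protein)
```

A hit at index 0 corresponds to an N-terminal NLS, and a hit that ends at the last residue corresponds to a C-terminal NLS.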
  • the base editing fusion protein of the present invention may also include other localization sequences, such as a cytoplasmic localization sequence, a chloroplast localization sequence, a mitochondrial localization sequence, and the like.
  • the present invention also provides the use of the base editing fusion protein of the present invention to base edit the target sequence in the cell genome.
  • the present invention also provides a system for base editing a target sequence in a cell genome, which comprises at least one of the following i) to v):
  • the base editing fusion protein of the present invention and an expression construct containing a nucleotide sequence encoding a guide RNA;
  • An expression construct comprising a nucleotide sequence encoding the base editing fusion protein of the present invention, and an expression construct comprising a nucleotide sequence encoding a guide RNA;
  • the guide RNA can target the base editing fusion protein to the target sequence in the cell genome.
  • base editing system refers to a combination of components required for base editing of the genome of a cell or organism.
  • the various components of the system such as base editing fusion protein and one or more guide RNAs, may exist independently of each other, or may exist in any combination as a composition.
  • "guide RNA" and "gRNA" are used interchangeably, and refer to an RNA molecule that can form a complex with the CRISPR effector protein and can target the complex to the target sequence by virtue of having a certain degree of identity with the target sequence.
  • the guide RNA targets the target sequence by base pairing with the complementary strand of the target sequence.
  • the gRNA used by Cas9 nuclease or its functional variants is usually composed of crRNA and tracrRNA molecules that are partially complementary and form a complex, wherein the crRNA contains a guide sequence (also called a seed sequence) that has sufficient identity with the target sequence to hybridize with the complementary strand of the target sequence and direct the CRISPR complex (Cas9+crRNA+tracrRNA) to bind specifically to the target sequence.
  • the gRNA used by Cpf1 nuclease or its functional variants is usually composed of mature crRNA molecules only, which can also be called sgRNA. Designing a suitable gRNA based on the CRISPR effector protein used and the target sequence to be edited is within the abilities of those skilled in the art.
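To make the idea of choosing a target sequence concrete, here is a minimal Python sketch that lists candidate SpCas9 target sites on one strand of a DNA sequence. It assumes the commonly used 20-nt protospacer followed by an NGG PAM. The PAM requirement and the protospacer length are background assumptions that this text does not state, and the function name is hypothetical.

```python
import re


def find_spcas9_targets(seq: str, protospacer_len: int = 20):
    """Return (start, protospacer, pam) for each NGG-adjacent site on the given strand."""
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = rf"(?=([ACGT]{{{protospacer_len}}})([ACGT]GG))"
    return [(m.start(), m.group(1), m.group(2)) for m in re.finditer(pattern, seq)]


# Toy sequence: 5-nt flank, 20-nt protospacer, TGG PAM, 3-nt flank.
toy_dna = "AAAAA" + "ACGTACGTACGTACGTACGT" + "TGG" + "AAA"
targets = find_spcas9_targets(toy_dna)
```

A practical design tool would also scan the reverse-complement strand and score candidates for off-target similarity; this sketch does neither.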
  • the base editing system of the present invention contains more than one guide RNA, so that more than one target sequence can be base edited at the same time.
  • the nucleotide sequence encoding the base editing fusion protein is codon-optimized for the organism from which the cell to be base edited is derived.
  • Codon optimization refers to modifying a nucleic acid sequence to enhance expression in the host cell of interest by replacing at least one codon of the native sequence (e.g., about or more than about 1, 2, 3, 4, 5, 10, 15, 20, 25, 50 or more codons) with a codon that is used more frequently or most frequently in the genes of the host cell, while maintaining the native amino acid sequence.
  • Codon preference is often related to the translation efficiency of messenger RNA (mRNA), and the translation efficiency is considered to depend on, among other things, the nature of the codon being translated and the availability of particular transfer RNA (tRNA) molecules.
  • genes can therefore be tailored for optimal gene expression in a given organism based on codon optimization. Codon usage tables are readily available, for example the "Codon Usage Database" at www.kazusa.or.jp/codon/, and these tables can be adapted in different ways. See Nakamura Y. et al., "Codon usage tabulated from the international DNA sequence databases: status for the year 2000." Nucl. Acids Res., 28:292 (2000).
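The codon optimization described above can be sketched in Python as a codon-by-codon replacement that leaves the encoded protein unchanged. The standard genetic code is built from the conventional TCAG ordering. The `PREFERRED` table is a made-up placeholder: real preferences would be taken from a codon usage table for the host organism.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code: codons enumerated in TCAG order map onto AMINO_ACIDS.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Hypothetical host preferences, for illustration only.
PREFERRED = {"K": "AAG", "A": "GCC", "L": "CTG"}


def translate(cds: str) -> str:
    """Translate a coding sequence whose length is a multiple of three."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def codon_optimize(cds: str, preferred: dict) -> str:
    """Swap each codon for the host-preferred synonymous codon, if one is listed."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    optimized = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        new_codon = preferred.get(amino_acid, codon)
        if CODON_TABLE[new_codon] != amino_acid:
            raise ValueError(f"{new_codon} does not encode {amino_acid}")
        optimized.append(new_codon)
    return "".join(optimized)


native = "ATGAAAGCTTTA"
optimized = codon_optimize(native, PREFERRED)
```

The synonymy check inside the loop guards against a faulty preference table silently changing the amino acid sequence.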
  • the guide RNA is a single guide RNA (sgRNA).
  • methods of constructing a suitable sgRNA based on a given target sequence are known in the art. For example, see: Wang, Y. et al. Simultaneous editing of three homoeoalleles in hexaploid bread wheat confers heritable resistance to powdery mildew. Nat. Biotechnol. 32, 947-951 (2014); Shan, Q.et gen. Target modified.
  • the nucleotide sequence encoding the base editing fusion protein and/or the nucleotide sequence encoding the guide RNA are operably linked to an expression control element such as a promoter.
  • examples of promoters include, but are not limited to, polymerase (pol) I, pol II, or pol III promoters.
  • examples of pol I promoters include the chicken RNA pol I promoter.
  • pol II promoters include, but are not limited to, cytomegalovirus immediate early (CMV) promoter, Rous sarcoma virus long terminal repeat (RSV-LTR) promoter, and simian virus 40 (SV40) immediate early promoter.
  • pol III promoters include U6 and H1 promoters.
  • An inducible promoter such as a metallothionein promoter can be used.
  • promoters include T7 phage promoter, T3 phage promoter, β-galactosidase promoter, and Sp6 phage promoter.
  • the promoter can be cauliflower mosaic virus 35S promoter, maize Ubi-1 promoter, wheat U6 promoter, rice U3 promoter, maize U3 promoter, rice actin promoter.
  • Organisms whose genome can be modified by the base editing system of the present invention include any organisms suitable for base editing, preferably eukaryotes.
  • organisms include, but are not limited to, mammals such as humans, mice, rats, monkeys, dogs, pigs, sheep, cattle, and cats; poultry such as chickens, ducks, and geese; plants, including monocots and dicots
  • the plant is a crop plant, including but not limited to wheat, rice, corn, soybean, sunflower, sorghum, rape, alfalfa, cotton, barley, millet, sugar cane, tomato, tobacco, cassava, and potato.
  • the organism is a plant. More preferably, the organism is rice.
  • the present invention provides a method for producing a genetically modified organism, comprising introducing into cells of the organism the base editing fusion protein of the present invention, an expression construct encoding the base editing fusion protein of the present invention, or the system of the present invention for base editing a target sequence in a cell genome.
  • the guide RNA targets the base editing fusion protein to the target sequence in the genome of the cells of the organism, resulting in one or more C's in the target sequence being replaced by T.
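The C-to-T outcome described above can be pictured with a short Python sketch that converts the C's falling inside an editing window of a protospacer. The window boundaries used here are arbitrary illustrative values, not the window of any editor described in this document, and real editing usually converts only a subset of the C's in the window.

```python
def edit_c_to_t(protospacer: str, window=(4, 8)) -> str:
    """Convert every C inside the 1-based, inclusive window to T.

    The default window is a hypothetical value chosen for illustration.
    """
    start, end = window
    return "".join(
        "T" if base == "C" and start <= pos <= end else base
        for pos, base in enumerate(protospacer.upper(), start=1)
    )


edited = edit_c_to_t("ACCACCACCA", window=(4, 6))
```

Narrowing the window in this toy model reduces the number of bystander C's that change, which is the sense in which a narrow editing window gives a more precise product.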
  • the organism is a plant.
  • designing target sequences that can be recognized and targeted by the complex of the CRISPR effector protein and the guide RNA is within the skill of those of ordinary skill in the art.
  • the method of the present invention also includes screening for organisms such as plants with the desired nucleotide substitutions.
  • the nucleotide substitution in organisms such as plants can be detected by T7EI, PCR/RE or sequencing methods, for example, see Shan, Q., Wang, Y., Li, J. & Gao, C. Genome editing in rice and wheat using the CRISPR/Cas system. Nat. Protoc. 9, 2395-2410 (2014).
  • the target sequence to be modified can be located anywhere in the genome, for example, in a functional gene such as a protein-coding gene, or, for example, in a gene expression regulatory region such as a promoter region or an enhancer region, so as to achieve modification of gene function or modification of gene expression.
  • the C to T base editing in the cell target sequence can be detected by T7EI, PCR/RE or sequencing methods.
  • the base editing system can be introduced into cells by various methods well known to those skilled in the art.
  • Methods that can be used to introduce the genome editing system of the present invention into cells include, but are not limited to: calcium phosphate transfection, protoplast fusion, electroporation, liposome transfection, microinjection, viral infection (such as baculovirus, vaccinia virus, adenovirus, adeno-associated virus, lentivirus and other viruses), gene gun bombardment, PEG-mediated transformation of protoplasts, and Agrobacterium-mediated transformation.
  • Cells that can be genome edited by the method of the present invention can be derived from, for example, mammals such as humans, mice, rats, monkeys, dogs, pigs, sheep, cows, and cats; poultry such as chickens, ducks, and geese; and plants, including monocotyledonous and dicotyledonous plants such as rice, corn, wheat, sorghum, barley, soybean, peanut, Arabidopsis, etc.
  • the method of the present invention is particularly suitable for producing genetically modified plants, such as crop plants.
  • the base editing system can be introduced into the plant by various methods well known to those skilled in the art.
  • the methods that can be used to introduce the base editing system of the present invention into plants include, but are not limited to: gene gun bombardment, PEG-mediated transformation of protoplasts, Agrobacterium-mediated transformation, plant virus-mediated transformation, the pollen tube pathway method, and ovary injection.
  • the base editing system is introduced into the plant by transient transformation.
  • the target sequence can be modified simply by introducing or producing the base editing fusion protein and guide RNA in plant cells, and the modification can be stably inherited, without the need to stably transform the plant with the base editing system. This avoids the potential off-target effects of a stably present base editing system, and also avoids the integration of exogenous nucleotide sequences into the plant genome, thereby giving higher biological safety.
  • the introduction is performed in the absence of selective pressure, so as to avoid the integration of foreign nucleotide sequences in the plant genome.
  • the introduction includes transforming the base editing system of the present invention into an isolated plant cell or tissue, and then regenerating the transformed plant cell or tissue into a whole plant.
  • the regeneration is performed in the absence of selective pressure, that is, no selective agent for the selective gene carried on the expression vector is used during the tissue culture process. Not using selection agents can improve plant regeneration efficiency and obtain modified plants that do not contain exogenous nucleotide sequences.
  • the base editing system of the present invention can be transformed into specific parts on the whole plant, such as leaves, stem tips, pollen tubes, young ears or hypocotyls. This is particularly suitable for the transformation of plants that are difficult to undergo tissue culture regeneration.
  • the protein expressed in vitro and/or the RNA molecule transcribed in vitro is directly transformed into the plant.
  • the protein and/or RNA molecule can realize base editing in plant cells and then be degraded by the cell, avoiding the integration of foreign nucleotide sequences in the plant genome.
  • genetic modification and breeding of plants using the method of the present invention can obtain plants without foreign DNA integration, that is, transgene-free modified plants.
  • the base editing system of the present invention has high specificity (low off-target rate) when performing base editing in plants, which also improves biological safety.
  • Plants that can be base edited by the method of the present invention include monocotyledonous plants and dicotyledonous plants.
  • the plant may be a crop plant such as wheat, rice, corn, soybean, sunflower, sorghum, rape, alfalfa, cotton, barley, millet, sugarcane, tomato, tobacco, cassava, or potato.
  • the target sequence is related to plant traits such as agronomic traits, whereby the base editing causes the plant to have an altered trait relative to a wild-type plant.
  • the target sequence to be modified can be located anywhere in the genome, for example, in a functional gene such as a protein-coding gene, or, for example, in a gene expression regulatory region such as a promoter region or an enhancer region, so as to achieve modification of gene function or modification of gene expression.
  • the C to T substitution results in an amino acid substitution in the target protein.
  • the C to T substitution results in a change in the expression of the target gene.
  • the method further includes obtaining progeny of the genetically modified plant.
  • the present invention also provides a genetically modified plant or its progeny or part thereof, wherein the plant is obtained by the above-mentioned method of the present invention.
  • the genetically modified plant or progeny or part thereof is non-transgenic.
  • the present invention also provides a plant breeding method, comprising crossing the genetically modified first plant obtained by the above-mentioned method of the present invention with a second plant not containing the genetic modification, thereby introducing the genetic modification into the second plant.
  • Example 1 Selection of A3Bctd mutation site based on protein structure information
  • the candidate base editing systems were optimized on the A3A-BE3 vector backbone (SEQ ID NO: 1, a base editor containing human APOBEC3A). An artificially synthesized A3Bctd DNA fragment (SEQ ID NO: 2) was used to replace the APOBEC3A sequence in the A3A-BE3 vector by the Gibson method, giving the A3Bctd-BE3 vector.
  • starting from the A3Bctd-BE3 vector, fusion PCR and the Gibson method were used to introduce point mutations into the amino acids encoded by A3Bctd, giving the point-mutant vectors A3Bctd-R210A-BE3, A3Bctd-R210K-BE3, A3Bctd-R211K-BE3, A3Bctd-T214C-BE3, A3Bctd-T214G-BE3, A3Bctd-T214S-BE3, A3Bctd-T214V-BE3, A3Bctd-L230K-BE3, A3Bctd-N240A-BE3, A3Bctd-W281H-BE3, A3Bctd-F308K-BE3, A3Bctd-R311K-BE3, A3Bctd-Y313F-BE3, A3Bctd-D314R-BE3, A3Bctd-D314H-BE3 and A3Bctd-Y315M-BE3.
  • the control plasmids constructed include A3A-BE3, YEE-BE3, RK-BE3, eA3A-BE3, A3A-R128A-BE3, A3A-Y130F-BE3, and untruncated APOBEC3B-BE3 (for the deaminase sequences, see SEQ ID NO: 19-25), where YEE and RK are two variants of the APOBEC1 deaminase on the BE3 vector, constructed by fusion PCR and the Gibson method.
  • the A3A deaminase sequence was artificially synthesized, and the R128A and Y130F variants of A3A were constructed by fusion PCR and the Gibson method.
  • the guide RNA vectors used in this experiment include pSp-sgRNA and pSa-sgRNA vectors.
  • guide RNA vectors were constructed for each of the eight targets shown in Table 1.
  • targets whose names end in -T1 were cloned into the pSp-sgRNA vector by restriction enzyme digestion and ligation, to serve as guide RNA vectors for measuring on-target efficiency; targets whose names end in -SaT1 or -SaT2 were cloned into the pSa-sgRNA vector by restriction enzyme digestion and ligation, to serve as vectors for detecting random off-target activity by the TA-AS method.
  • the principle of the TA-AS method is to co-transfect the base editing system to be tested (such as the nSpCas9-based base editing system in this experiment) together with another CRISPR system that is orthogonal to it (that is, the two cannot share a gRNA) and that can produce a single-stranded region, such as the nSaCas9 system.
  • the orthogonal CRISPR system produces a long-lived, stable single-stranded region at a selected location in the genome. If the base editing system to be tested has a random off-target effect in the genome, it will deaminate C bases in this single-stranded region and cause undesired editing. The random off-target effect of the base editing system can therefore be detected efficiently and simply by high-throughput sequencing of amplicons from the selected sites.
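The amplicon readout used by the TA-AS method can be illustrated with a simple Python sketch that computes the fraction of reads carrying at least one C-to-T change relative to the reference amplicon. This is a deliberately simplified model: it assumes reads are already aligned and trimmed to the reference, skips reads of a different length (possible indels), and considers only one strand. The function name and toy reads are hypothetical.

```python
def c_to_t_read_fraction(reference: str, reads) -> float:
    """Fraction of full-length reads with at least one C->T substitution."""
    ref = reference.upper()
    counted = 0
    edited = 0
    for read in reads:
        read = read.upper()
        if len(read) != len(ref):
            continue  # skip reads that cannot be compared position by position
        counted += 1
        if any(r == "C" and b == "T" for r, b in zip(ref, read)):
            edited += 1
    return edited / counted if counted else 0.0


toy_reference = "ACGTCA"
toy_reads = ["ACGTCA", "ATGTCA", "ACGTTA", "ACGTCA", "ACGTC"]
off_target_fraction = c_to_t_read_fraction(toy_reference, toy_reads)
```

Applied to the site opened by the orthogonal nickase, a higher fraction indicates more gRNA-independent deamination by the editor under test.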
  • each base editing system, together with its own guide RNA vector pSp-sgRNA and the pnSaCas9 and corresponding pSa-sgRNA vectors of the TA-AS system, was used to transform rice protoplasts. After two days of culture, the target site amplicons were sequenced, and the averages over the four on-target sites and the four off-target sites were used to evaluate on-target efficiency and off-target efficiency.
  • Each single-base editing system has at least three biological repetitions for each target site.
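The averaging step described above (biological replicates per site, then an average over sites) can be written as a short Python sketch. The site names and percentages below are invented placeholders, not data from this document.

```python
from statistics import mean


def average_efficiency(per_site: dict) -> float:
    """Average replicate values within each site, then average across sites."""
    site_means = [mean(replicates) for replicates in per_site.values()]
    return mean(site_means)


# Hypothetical editing frequencies (%), three biological replicates per site.
on_target = {"site1": [10.0, 12.0, 14.0], "site2": [20.0, 20.0, 20.0]}
off_target = {"site1": [1.0, 2.0, 3.0], "site2": [4.0, 4.0, 4.0]}

mean_on = average_efficiency(on_target)
mean_off = average_efficiency(off_target)
```

Averaging within each site first keeps a site with extra replicates from dominating the overall mean.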
  • the results are shown in Figure 2 and Figure 3. It was found that there are eight point mutations R211K, T214V, F308K, R311K, Y313F, D314R, D314H, and Y315M that can maintain high mutation efficiency while reducing off-target efficiency. Seven of them are on Loop1 and Loop7. These seven variants were combined to further improve specificity.
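Combining single substitutions into double and triple mutants can be sketched with `itertools.combinations`. The example below starts from the eight substitutions named above, because the text does not say which seven were carried forward, and it excludes combinations that would place two different substitutions at the same position (D314R together with D314H). It only enumerates candidates and says nothing about which combinations were actually built or tested.

```python
from itertools import combinations

SINGLES = ["R211K", "T214V", "F308K", "R311K", "Y313F", "D314R", "D314H", "Y315M"]


def position(label: str) -> int:
    """Extract the residue number from a label such as 'D314R'."""
    return int(label[1:-1])


def combine(singles, size: int):
    """All combinations of the given size with no position used twice."""
    return [
        combo
        for combo in combinations(singles, size)
        if len({position(s) for s in combo}) == size
    ]


doubles = combine(SINGLES, 2)
triples = combine(SINGLES, 3)
```

The double and triple mutants listed among the embodiments, such as R211K with D314R or T214V with D314H and Y315M, appear in these candidate lists.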

Abstract

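Provided is an improved cytosine base editing system that has reduced genome-wide off-target effects and a narrow editing window. The system comprises a base editing fusion protein, which comprises an APOBEC3B deaminase or an APOBEC3B deaminase mutant fused to a CRISPR effector protein.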

Description

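Improved cytosine base editing system
Technical Field
The present invention belongs to the field of gene editing. Specifically, the present invention relates to an improved cytosine base editing system that has significantly reduced genome-wide off-target effects and a narrow editing window.
Background of the Invention
Genome editing is a genetic engineering technology that makes targeted modifications to the genome using engineered nucleases, and it is playing an increasingly powerful role in agricultural and medical research. The clustered regularly interspaced short palindromic repeats and associated system (Clustered regularly interspaced short palindromic repeats/CRISPR associated, CRISPR) is currently the most widely used genome editing tool; guided by an artificially designed guide RNA, the Cas protein can target any position in the genome. Base editing systems are a new gene editing technology developed from the CRISPR system and are divided into cytosine base editing systems and adenine base editing systems, in which a cytosine deaminase or an adenine deaminase, respectively, is fused to a Cas9 nickase. Under the targeting action of the guide RNA, the Cas9 nickase produces a single-stranded DNA region, so that the deaminase can efficiently deaminate the C or A nucleotides, respectively, on the single-stranded DNA at the targeted position, converting them into U or I bases, which are then repaired to T or G bases during the cell's own repair process.
Cytosine base editing systems have been found to produce unpredictable off-target events across the genome, most likely because the cytosine deaminase is overexpressed and randomly deaminates highly transcribed regions of the genome. In addition, when several C's lie within the activity window at a target site, existing high-efficiency base editing systems often yield products in which multiple C's are changed simultaneously, and products with only a single C mutated cannot be obtained. Genome-wide specificity and precision at the target site therefore greatly affect the use of cytosine base editing systems.
Summary of the Invention
Both the specificity and the precision of a cytosine base editing system may be related to the ability of the cytosine deaminase to bind single-stranded DNA. Altering or weakening the binding of the deaminase to single-stranded DNA without reducing its deamination activity may yield a cytosine base editing system that is efficient while also being specific and precise. The present inventors optimized Loop1 and Loop7 in the single-stranded DNA binding domain of human hA3Bctd (APOBEC3B C-terminal domain), tested the resulting variants by rice protoplast transformation, measured the efficiency and precision of the variants, and tested their specificity, thereby obtaining a series of highly efficient, highly specific and highly precise base editing systems.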
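Brief Description of the Drawings
Figure 1 shows the selection of A3Bctd mutation sites.
Figure 2 shows the on-target and off-target efficiencies of the base editing systems tested.
Figure 3 shows the average on-target and average off-target efficiencies of the base editing systems tested.
Figure 4 shows the combinations of double and triple mutants.
Figure 5 shows verification of the on-target and off-target efficiencies of the double and triple mutants by protoplast transformation.
Figure 6 shows the average on-target and average off-target efficiencies of the double and triple mutants tested.
Figure 7 shows the editing efficiency of different base editing systems at the different C's in four target sites.
Figure 8 shows the average mutation types of the editing products of different base editing systems at four target sites.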
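Detailed Description of the Invention
I. Definitions
In the present invention, unless otherwise specified, the scientific and technical terms used herein have the meanings commonly understood by those skilled in the art. In addition, the terms and laboratory procedures related to protein and nucleic acid chemistry, molecular biology, cell and tissue culture, microbiology and immunology used herein are terms and routine procedures widely used in the corresponding fields. For example, the standard recombinant DNA and molecular cloning techniques used in the present invention are well known to those skilled in the art and are described more fully in: Sambrook, J., Fritsch, E.F. and Maniatis, T., Molecular Cloning: A Laboratory Manual; Cold Spring Harbor Laboratory Press: Cold Spring Harbor, 1989 (hereinafter "Sambrook"). Meanwhile, for a better understanding of the present invention, definitions and explanations of related terms are provided below.
As used herein, the term "and/or" covers all combinations of the items connected by the term, and each combination should be regarded as having been individually listed herein. For example, "A and/or B" covers "A", "A and B" and "B". For example, "A, B and/or C" covers "A", "B", "C", "A and B", "A and C", "B and C" and "A and B and C".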
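The "and/or" definition above amounts to listing every non-empty combination of the connected items. A minimal Python sketch of that enumeration follows; the function name is hypothetical.

```python
from itertools import combinations


def and_or(items):
    """All non-empty combinations covered by 'X, Y and/or Z'."""
    return [
        combo
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]


covered = and_or(["A", "B", "C"])
```

For n items this gives 2^n - 1 combinations, which is 3 for "A and/or B" and 7 for "A, B and/or C", in line with the examples in the definition.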
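When the word "comprising" is used herein to describe the sequence of a protein or nucleic acid, the protein or nucleic acid may consist of that sequence, or may have additional amino acids or nucleotides at one or both ends of the protein or nucleic acid while still having the activity described in the present invention. In addition, those skilled in the art know that the methionine encoded by the start codon at the N-terminus of a polypeptide is retained in some practical situations (for example, when expressed in a particular expression system) without substantially affecting the function of the polypeptide. Therefore, when a specific polypeptide amino acid sequence is described in the specification and claims of this application, although it may not include the methionine encoded by the start codon at the N-terminus, sequences containing that methionine are also covered, and correspondingly the encoding nucleotide sequence may also contain a start codon; and vice versa.
As used herein, the term "CRISPR effector protein" generally refers to a nuclease present in a naturally occurring CRISPR system, as well as modified forms, variants and catalytically active fragments thereof. The term covers any effector protein based on a CRISPR system that can achieve gene targeting (for example gene editing or targeted gene regulation) in cells.
Examples of "CRISPR effector proteins" include Cas9 nuclease or variants thereof. The Cas9 nuclease may be a Cas9 nuclease from different species, for example SpCas9 from Streptococcus pyogenes (S. pyogenes) or SaCas9 derived from Staphylococcus aureus (S. aureus). "Cas9 nuclease" and "Cas9" are used interchangeably herein and refer to an RNA-guided nuclease comprising a Cas9 protein or a fragment thereof (for example a protein comprising the active DNA cleavage domain of Cas9 and/or the gRNA binding domain of Cas9). Cas9 is a component of the CRISPR/Cas (clustered regularly interspaced short palindromic repeats and associated system) genome editing system, and under the guidance of a guide RNA it can target and cleave a DNA target sequence to form a DNA double-strand break (DSB).
Examples of "CRISPR effector proteins" may also include Cpf1 nuclease or variants thereof, such as high-specificity variants. The Cpf1 nuclease may be a Cpf1 nuclease from different species, for example from Francisella novicida U112, Acidaminococcus sp. BV3L6 and Lachnospiraceae bacterium ND2006.
The "CRISPR effector protein" may also be derived from Cas3, Cas8a, Cas5, Cas8b, Cas8c, Cas10d, Cse1, Cse2, Csy1, Csy2, Csy3, GSU0054, Cas10, Csm2, Cmr5, Cas10, Csx11, Csx10, Csf1, Csn2, Cas4, C2c1, C2c3 or C2c2 nucleases, for example including these nucleases or functional variants thereof.
"Genome" as used herein covers not only the chromosomal DNA present in the cell nucleus, but also the organelle DNA present in subcellular components of the cell (such as mitochondria and plastids).
As used herein, "organism" includes any organism suitable for genome editing, preferably a eukaryote. Examples of organisms include, but are not limited to, mammals such as humans, mice, rats, monkeys, dogs, pigs, sheep, cattle and cats; poultry such as chickens, ducks and geese; and plants, including monocotyledonous and dicotyledonous plants, such as rice, corn, wheat, sorghum, barley, soybean, peanut, Arabidopsis and the like.
"Genetically modified organism" or "genetically modified cell" means an organism or cell that contains within its genome an exogenous polynucleotide or a modified gene or expression regulatory sequence. For example, the exogenous polynucleotide can be stably integrated into the genome of the organism or cell and inherited over successive generations. The exogenous polynucleotide can be integrated into the genome alone or as part of a recombinant DNA construct. A modified gene or expression regulatory sequence is one in which the sequence in the genome of the organism or cell contains single or multiple deoxynucleotide substitutions, deletions and additions.
"Exogenous" with respect to a sequence means a sequence from a foreign species, or, if from the same species, a sequence whose composition and/or genomic locus has been significantly changed from its natural form by deliberate human intervention.
"Polynucleotide", "nucleic acid sequence", "nucleotide sequence" and "nucleic acid fragment" are used interchangeably and refer to a single-stranded or double-stranded RNA or DNA polymer, optionally containing synthetic, non-natural or altered nucleotide bases. Nucleotides are referred to by their single-letter names as follows: "A" is adenosine or deoxyadenosine (for RNA or DNA, respectively), "C" is cytidine or deoxycytidine, "G" is guanosine or deoxyguanosine, "U" is uridine, "T" is deoxythymidine, "R" is a purine (A or G), "Y" is a pyrimidine (C or T), "K" is G or T, "H" is A or C or T, "I" is inosine, and "N" is any nucleotide.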
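The single-letter codes defined above can be turned into a small lookup table for checking whether a concrete DNA sequence matches a pattern written with ambiguity codes. The Python sketch below covers only the DNA-relevant codes listed in the definition (A, C, G, T, R, Y, K, H, N) and omits U and I; the function name is hypothetical.

```python
# Ambiguity codes as defined in the text, restricted to DNA bases.
NUCLEOTIDE_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "K": "GT",
    "H": "ACT",
    "N": "ACGT",  # any nucleotide
}


def matches(pattern: str, sequence: str) -> bool:
    """True if each base of sequence is allowed by the code at that position."""
    pattern, sequence = pattern.upper(), sequence.upper()
    return len(pattern) == len(sequence) and all(
        base in NUCLEOTIDE_CODES[code] for code, base in zip(pattern, sequence)
    )


is_match = matches("NGG", "TGG")
```

A pattern containing a code outside this table raises a `KeyError`, which keeps unsupported symbols from being silently accepted.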
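"Polypeptide", "peptide" and "protein" are used interchangeably in the present invention and refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which one or more amino acid residues are artificial chemical analogs of the corresponding naturally occurring amino acids, as well as to naturally occurring amino acid polymers. The terms "polypeptide", "peptide", "amino acid sequence" and "protein" may also include modified forms, including but not limited to glycosylation, lipid attachment, sulfation, γ-carboxylation of glutamic acid residues, hydroxylation and ADP-ribosylation. Suitable conservative amino acid substitutions in peptides or proteins are known to those skilled in the art and can generally be made without changing the biological activity of the resulting molecule. In general, those skilled in the art recognize that single amino acid substitutions in non-essential regions of a polypeptide do not substantially change biological activity (see, for example, Watson et al., Molecular Biology of the Gene, 4th Edition, 1987, The Benjamin/Cummings Pub. Co., p. 224).
As used in the present invention, "expression construct" refers to a vector, such as a recombinant vector, suitable for expressing a nucleotide sequence of interest in an organism. "Expression" refers to the production of a functional product. For example, expression of a nucleotide sequence may refer to transcription of the nucleotide sequence (such as transcription to produce mRNA or functional RNA) and/or translation of RNA into a precursor or mature protein.
The "expression construct" of the present invention may be a linear nucleic acid fragment, a circular plasmid or a viral vector, or, in some embodiments, may be an RNA that can be translated (such as mRNA).
The "expression construct" of the present invention may contain regulatory sequences and a nucleotide sequence of interest from different sources, or regulatory sequences and a nucleotide sequence of interest from the same source but arranged in a manner different from that normally found in nature.
"Regulatory sequence" and "regulatory element" are used interchangeably and refer to a nucleotide sequence that is located upstream (5' non-coding sequence) of, within, or downstream (3' non-coding sequence) of a coding sequence and that affects the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences may include, but are not limited to, promoters, translation leader sequences, introns and polyadenylation recognition sequences.
"Promoter" refers to a nucleic acid fragment capable of controlling the transcription of another nucleic acid fragment. In some embodiments of the present invention, the promoter is a promoter capable of controlling gene transcription in a cell, whether or not it is derived from that cell. The promoter may be a constitutive promoter, a tissue-specific promoter, a developmentally regulated promoter or an inducible promoter.
"Constitutive promoter" refers to a promoter that generally causes a gene to be expressed in most cell types under most circumstances. "Tissue-specific promoter" and "tissue-preferred promoter" are used interchangeably and refer to a promoter that is expressed mainly, but not necessarily exclusively, in one tissue or organ, and that may also be expressed in a specific cell or cell type. "Developmentally regulated promoter" refers to a promoter whose activity is determined by developmental events. An "inducible promoter" selectively expresses an operably linked DNA sequence in response to an endogenous or exogenous stimulus (environment, hormone, chemical signal, etc.).
As used herein, the term "operably linked" refers to the linkage of a regulatory element (for example, but not limited to, a promoter sequence or a transcription termination sequence) to a nucleic acid sequence (for example, a coding sequence or an open reading frame) such that transcription of the nucleotide sequence is controlled and regulated by the transcriptional regulatory element. Techniques for operably linking a regulatory element region to a nucleic acid molecule are known in the art.
"Introducing" a nucleic acid molecule (such as a plasmid, a linear nucleic acid fragment or RNA) or a protein into an organism means transforming cells of the organism with the nucleic acid or protein so that the nucleic acid or protein can function in the cells. "Transformation" as used in the present invention includes stable transformation and transient transformation.
"Stable transformation" refers to the introduction of an exogenous nucleotide sequence into the genome, resulting in stable inheritance of the exogenous nucleotide sequence. Once stably transformed, the exogenous nucleic acid sequence is stably integrated into the genome of the organism and of any successive generations thereof.
"Transient transformation" refers to the introduction of a nucleic acid molecule or protein into a cell where it performs its function without stable inheritance of an exogenous nucleotide sequence. In transient transformation, the exogenous nucleic acid sequence is not integrated into the genome.
"Trait" refers to a physiological, morphological, biochemical or physical characteristic of a cell or organism.
"Agronomic trait" refers in particular to a measurable parameter of a crop plant, including but not limited to: leaf greenness, grain yield, growth rate, total biomass or rate of accumulation, fresh weight at maturity, dry weight at maturity, fruit yield, seed yield, total plant nitrogen content, fruit nitrogen content, seed nitrogen content, nitrogen content of plant vegetative tissue, total plant free amino acid content, fruit free amino acid content, seed free amino acid content, free amino acid content of plant vegetative tissue, total plant protein content, fruit protein content, seed protein content, protein content of plant vegetative tissue, herbicide resistance, drought resistance, nitrogen uptake, root lodging, harvest index, stalk lodging, plant height, ear height, ear length, disease resistance, cold resistance, salt resistance, tiller number and the like.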
二、改进的碱基编辑***
首先,本发明提供一种碱基编辑融合蛋白,其包含与CRISPR效应蛋白融合的APOBEC3B脱氨酶或APOBEC3B脱氨酶突变体。
在本文实施方案中,“碱基编辑融合蛋白”和“碱基编辑器”可互换使用。本发明的包含APOBEC3B脱氨酶或其突变体的碱基编辑融合蛋白能够对靶序列进行高效的碱基编辑,同时与其他碱基编辑器相比,具有显著降低的基因组范围的随机脱靶效应。在一些实施方案中,本发明的包含APOBEC3B脱氨酶或其突变体的碱基编辑融合蛋白对靶序列具有缩短的编辑窗口,能够实现更精准额碱基编辑。
在一些实施方案中,所述APOBEC3B脱氨酶突变体是或衍生自人APOBEC3B脱氨酶。示例性的野生型人APOBEC3B脱氨酶包含SEQ ID NO:19所示氨基酸序列。
在一些实施方案中,所述APOBEC3B脱氨酶突变体是或衍生自人APOBEC3B脱氨酶的C-末端结构域(hA3Bctd,APOBEC3B C-terminal domain)。示例性的野生型hA3Bctd包含SEQ ID NO:2的氨基酸序列。
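In some embodiments, the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises an amino acid substitution at one or more of the following positions: 210, 211, 214, 230, 240, 281, 308, 311, 313, 314 and 315, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
In some embodiments, the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises an amino acid substitution at one or more of the following positions: 211, 214, 308, 311, 313, 314 and 315, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
In some embodiments, the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises amino acid substitutions at positions 211 and 311, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
In some embodiments, the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises amino acid substitutions at positions 211 and 313, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
In some embodiments, the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises amino acid substitutions at positions 211 and 314, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
In some embodiments, the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises amino acid substitutions at positions 311 and 313, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
In some embodiments, the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises amino acid substitutions at positions 214 and 314, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
In some embodiments, the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises amino acid substitutions at positions 314 and 315, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
In some embodiments, the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises amino acid substitutions at positions 211, 311 and 314, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
In some embodiments, the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises amino acid substitutions at positions 211, 214 and 313, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
In some embodiments, the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises amino acid substitutions at positions 214, 314 and 315, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.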
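In some embodiments, the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises one or more amino acid substitutions selected from: R210A, R210K, R211K, T214C, T214G, T214S, T214V, L230K, N240A, W281H, F308K, R311K, Y313F, D314R, D314H and Y315M, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
In some embodiments, the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises one or more amino acid substitutions selected from: R211K, T214V, F308K, R311K, Y313F, D314R, D314H and Y315M, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
In some embodiments, the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises the amino acid substitutions R211K and R311K, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
In some embodiments, the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises the amino acid substitutions R211K and Y313F, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
In some embodiments, the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises the amino acid substitutions R211K and D314R, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
In some embodiments, the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises the amino acid substitutions R311K and Y313F, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
In some embodiments, the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises the amino acid substitutions T214V and D314R, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
In some embodiments, the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises the amino acid substitutions D314R and Y315M, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
In some embodiments, the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises the amino acid substitutions R211K, R311K and D314R, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
In some embodiments, the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises the amino acid substitutions R211K, T214V and Y313F, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
In some embodiments, the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises the amino acid substitutions T214V, D314H and Y315M, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.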
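The substitution notation used in these embodiments (for example R211K: wild-type residue, position, replacement residue), together with the convention that positions are numbered against the full-length reference even when the truncated C-terminal domain is used, can be illustrated with a minimal Python sketch. The function name `apply_substitutions`, the `first_residue_number` parameter and the 8-residue toy fragment are assumptions made purely for illustration; the fragment is not the real hA3B sequence.

```python
import re


def apply_substitutions(seq, substitutions, first_residue_number=1):
    """Apply substitutions written like 'R211K' to an amino acid sequence.

    first_residue_number (hypothetical parameter) is the reference-numbering
    position of seq[0], so a truncated domain can keep full-length numbering.
    """
    residues = list(seq)
    for sub in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError(f"unrecognized substitution: {sub}")
        wild_type, position, mutant = match.group(1), int(match.group(2)), match.group(3)
        index = position - first_residue_number
        if not 0 <= index < len(residues):
            raise ValueError(f"position {position} lies outside the sequence")
        # Guard against numbering mistakes: the stated wild-type residue must match.
        if residues[index] != wild_type:
            raise ValueError(f"expected {wild_type} at {position}, found {residues[index]}")
        residues[index] = mutant
    return "".join(residues)


# Toy fragment (NOT the real hA3B sequence); its first residue is treated as position 209.
toy_fragment = "ARRGGTLW"
print(apply_substitutions(toy_fragment, ["R211K", "T214V"], first_residue_number=209))
```

The wild-type check is a design choice: it makes an off-by-one in the numbering offset fail loudly instead of silently mutating the wrong residue.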
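In some specific embodiments, the APOBEC3B deaminase mutant comprises an amino acid sequence selected from SEQ ID NOs: 3-18, 26-31 and 32-34.
In some embodiments, the CRISPR effector protein is a "nuclease-inactivated CRISPR effector protein".
A "nuclease-inactivated CRISPR effector protein" is a CRISPR effector protein that lacks double-stranded nucleic acid cleavage activity but retains gRNA-guided DNA targeting ability. CRISPR effector proteins lacking double-stranded nucleic acid cleavage activity also encompass nickases, which form a nick in a double-stranded nucleic acid molecule without completely cutting the double-stranded nucleic acid.
In some preferred embodiments of the invention, the nuclease-inactivated CRISPR effector protein of the invention has nickase activity. Without being bound by any theory, it is believed that eukaryotic mismatch repair uses a nick in a DNA strand to direct the removal and repair of the mismatched base on that strand. The U:G mismatch formed by the action of the cytidine deaminase may be repaired to C:G. Introducing a nick on the strand containing the unedited G makes it possible to preferentially repair the U:G mismatch to the desired U:A or T:A.
In some embodiments, the nuclease-inactivated CRISPR effector protein is a nuclease-inactivated Cas9. The DNA cleavage domain of the Cas9 nuclease is known to comprise two subdomains: the HNH nuclease subdomain and the RuvC subdomain. The HNH subdomain cleaves the strand complementary to the gRNA, whereas the RuvC subdomain cleaves the non-complementary strand. Mutations in these subdomains can inactivate the nuclease activity of Cas9, forming a "nuclease-inactivated Cas9". The nuclease-inactivated Cas9 still retains gRNA-guided DNA binding ability. Thus, in principle, when fused to another protein, a nuclease-inactivated Cas9 can target that protein to almost any DNA sequence simply by co-expression with a suitable guide RNA.
The nuclease-inactivated Cas9 of the invention may be derived from Cas9 of different species, for example from S. pyogenes Cas9 (SpCas9) or from S. aureus Cas9 (SaCas9). Mutating both the HNH nuclease subdomain and the RuvC subdomain of Cas9 (e.g., with the mutations D10A and H840A) abolishes the nuclease activity of Cas9, giving a nuclease-dead Cas9 (dCas9). Inactivating only one of the subdomains by mutation gives Cas9 nickase activity, i.e., yields a Cas9 nickase (nCas9), for example an nCas9 carrying only the D10A mutation.
Accordingly, in some embodiments of the invention, the nuclease-inactivated Cas9 of the invention comprises the amino acid substitutions D10A and/or H840A relative to wild-type Cas9.
In some specific embodiments of the invention, the nuclease-inactivated Cas9 may also comprise additional mutations. For example, a nuclease-inactivated SpCas9 may further comprise the EQR, VQR or VRER mutations, and SaCas9 may further comprise the KKH mutation (Kim et al. Nat. Biotechnol. 35, 371-376).
In some specific embodiments of the invention, the nuclease-inactivated SpCas9 comprises the amino acid sequence shown in SEQ ID NO: 35.
In some embodiments, the nuclease-inactivated CRISPR effector protein is a nuclease-inactivated Cpf1. Cpf1 comprises one DNA cleavage domain (RuvC); mutating it abolishes the DNA cleavage activity of Cpf1, forming a "Cpf1 lacking DNA cleavage activity". The Cpf1 lacking DNA cleavage activity still retains gRNA-guided DNA binding ability. Thus, in principle, when fused to another protein, a Cpf1 lacking DNA cleavage activity can target that protein to almost any DNA sequence simply by co-expression with a suitable guide RNA.
The Cpf1 lacking DNA cleavage activity of the invention may be derived from Cpf1 of different species, for example the Cpf1 proteins from Francisella novicida U112, Acidaminococcus sp. BV3L6 and Lachnospiraceae bacterium ND2006, referred to as FnCpf1, AsCpf1 and LbCpf1, respectively.
In some embodiments, the Cpf1 lacking DNA cleavage activity is an FnCpf1 lacking DNA cleavage activity. In some specific embodiments, the FnCpf1 lacking DNA cleavage activity comprises the D917A mutation relative to wild-type FnCpf1.
In some embodiments, the Cpf1 lacking DNA cleavage activity is an AsCpf1 lacking DNA cleavage activity. In some specific embodiments, the AsCpf1 lacking DNA cleavage activity comprises the D908A mutation relative to wild-type AsCpf1.
In some embodiments, the Cpf1 lacking DNA cleavage activity is an LbCpf1 lacking DNA cleavage activity. In some specific embodiments, the LbCpf1 lacking DNA cleavage activity comprises the D832A mutation relative to wild-type LbCpf1.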
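In some embodiments of the invention, the APOBEC3B deaminase or APOBEC3B deaminase mutant is fused to the N-terminus of the CRISPR effector protein (e.g., a nuclease-inactivated CRISPR effector protein such as Cas9 or Cpf1).
In some embodiments of the invention, the APOBEC3B deaminase or APOBEC3B deaminase mutant and the CRISPR effector protein (e.g., a nuclease-inactivated CRISPR effector protein such as Cas9 or Cpf1) are fused via a linker. The linker may be a non-functional amino acid sequence of 1-50 (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, or 20-25, 25-50) or more amino acids, with no secondary or higher-order structure. For example, the linker may be a flexible linker. Preferably, the linker is 16 or 32 amino acids long. In some specific embodiments, the linker is the XTEN linker shown in SEQ ID NO: 36 or 37.
In cells, uracil DNA glycosylase catalyzes the removal of U from DNA and initiates base excision repair (BER), resulting in repair of U:G to C:G. Therefore, without being bound by any theory, including a uracil DNA glycosylase inhibitor in the base editing fusion protein of the invention can increase the efficiency of base editing.
Accordingly, in some embodiments of the invention, the base editing fusion protein further comprises a uracil DNA glycosylase inhibitor (UGI). In some specific embodiments, the uracil DNA glycosylase inhibitor comprises the amino acid sequence shown in SEQ ID NO: 38.
In some embodiments of the invention, the base editing fusion protein of the invention further comprises a nuclear localization sequence (NLS). In general, the one or more NLSs in the base editing fusion protein should be strong enough to drive accumulation of the base editing fusion protein in the cell nucleus in an amount that allows it to perform its base editing function. In general, the strength of nuclear localization activity is determined by the number and position of NLSs in the base editing fusion protein, the particular NLS or NLSs used, or a combination of these factors.
In some embodiments of the invention, the NLS of the base editing fusion protein of the invention may be located at the N-terminus and/or the C-terminus. In some embodiments of the invention, the NLS of the base editing fusion protein of the invention may be located between the APOBEC3B deaminase or APOBEC3B deaminase mutant and the CRISPR effector protein. In some embodiments, the base editing fusion protein comprises about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more NLSs. In some embodiments, the base editing fusion protein comprises about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more NLSs at or near the N-terminus. In some embodiments, the base editing fusion protein comprises about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more NLSs at or near the C-terminus. In some embodiments, the base editing fusion protein comprises a combination of these, such as one or more NLSs at the N-terminus and one or more NLSs at the C-terminus. When more than one NLS is present, each may be selected independently of the others. In some preferred embodiments of the invention, the base editing fusion protein comprises at least 2 NLSs, for example with the at least 2 NLSs located at the C-terminus. In some embodiments, the NLS is located at the C-terminus of the base editing fusion protein. In some embodiments, the base editing fusion protein comprises at least 3 NLSs.
In general, an NLS consists of one or more short sequences of positively charged lysines or arginines exposed on the protein surface, but other types of NLS are also known. Non-limiting examples of NLSs include PKKKRKV and KRPAATKKAGQAKKKK.
In some embodiments of the invention, the N-terminus of the base editing fusion protein comprises an NLS having the amino acid sequence PKKKRKV. In some embodiments of the invention, the C-terminus of the base editing fusion protein comprises an NLS having the amino acid sequence KRPAATKKAGQAKKKK. In some embodiments of the invention, the C-terminus of the base editing fusion protein comprises an NLS having the amino acid sequence PKKKRKV.
In addition, depending on the location of the DNA to be edited, the base editing fusion protein of the invention may also include other localization sequences, such as a cytoplasmic localization sequence, a chloroplast localization sequence or a mitochondrial localization sequence.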
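In another aspect, the invention also provides use of the base editing fusion protein of the invention for base editing of a target sequence in the genome of a cell.
In another aspect, the invention also provides a system for base editing of a target sequence in the genome of a cell, comprising at least one of the following i) to v):
i) the base editing fusion protein of the invention, and a guide RNA;
ii) an expression construct comprising a nucleotide sequence encoding the base editing fusion protein of the invention, and a guide RNA;
iii) the base editing fusion protein of the invention, and an expression construct comprising a nucleotide sequence encoding a guide RNA;
iv) an expression construct comprising a nucleotide sequence encoding the base editing fusion protein of the invention, and an expression construct comprising a nucleotide sequence encoding a guide RNA;
v) an expression construct comprising a nucleotide sequence encoding the base editing fusion protein of the invention and a nucleotide sequence encoding a guide RNA;
wherein the guide RNA is capable of targeting the base editing fusion protein to the target sequence in the genome of the cell.
As used herein, a "base editing system" refers to the combination of components required for base editing of the genome in a cell or organism. The individual components of the system, such as the base editing fusion protein and one or more guide RNAs, may each be present independently, or may be present in any combination in the form of a composition.
As used herein, "guide RNA" and "gRNA" are used interchangeably and refer to an RNA molecule that can form a complex with a CRISPR effector protein and, because it shares a certain identity with the target sequence, can target the complex to the target sequence. The guide RNA targets the target sequence by base pairing with the complementary strand of the target sequence. For example, the gRNA used by the Cas9 nuclease or a functional variant thereof typically consists of a crRNA and a tracrRNA that are partially complementary and form a complex, wherein the crRNA comprises a guide sequence (also called a seed sequence) that has sufficient identity with the target sequence to hybridize with the complementary strand of the target sequence and direct sequence-specific binding of the CRISPR complex (Cas9 + crRNA + tracrRNA) to the target sequence. However, it is known in the art that a single guide RNA (sgRNA) can be designed that combines the features of the crRNA and the tracrRNA. The gRNA used by the Cpf1 nuclease or a functional variant thereof typically consists only of a mature crRNA molecule, which may also be called an sgRNA. Designing a suitable gRNA based on the CRISPR effector protein used and the target sequence to be edited is within the ability of those skilled in the art.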
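As a rough illustration of the target-selection step that precedes gRNA design, the following Python sketch scans one DNA strand for candidate SpCas9 target sites, that is, a protospacer immediately followed by an NGG PAM. The function name `find_spcas9_targets` and the toy sequence are assumptions for illustration; a practical design tool would also scan the reverse strand and score off-target risk, neither of which is attempted here.

```python
import re


def find_spcas9_targets(dna, protospacer_len=20):
    """Return (start, protospacer, pam) for every protospacer + NGG site.

    Only the given (forward) strand is scanned; start is a 0-based index.
    """
    dna = dna.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{protospacer_len}}})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]


# Toy sequence: 2 nt of flank, a 20-nt protospacer, a TGG PAM, 2 nt of flank.
toy_dna = "TT" + "ACGT" * 5 + "TGG" + "AA"
for start, protospacer, pam in find_spcas9_targets(toy_dna):
    print(start, protospacer, pam)
```

The protospacer length is left as a parameter because other effector proteins, such as SaCas9 or Cpf1, use different PAMs and guide lengths; the pattern would need to be adapted for them.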
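In some embodiments, the base editing system of the invention comprises more than one guide RNA, so that more than one target sequence can be base edited at the same time.
To obtain efficient expression in cells, in some embodiments of the invention the nucleotide sequence encoding the base editing fusion protein is codon-optimized for the organism from which the cell to be base edited is derived.
Codon optimization refers to a method of modifying a nucleic acid sequence to enhance expression in a host cell of interest by replacing at least one codon of the native sequence (e.g., about or more than about 1, 2, 3, 4, 5, 10, 15, 20, 25, 50 or more codons) with codons that are more frequently or most frequently used in the genes of that host cell, while maintaining the native amino acid sequence. Different species exhibit particular biases for certain codons of a particular amino acid. Codon bias (differences in codon usage between organisms) often correlates with the translation efficiency of messenger RNA (mRNA), which in turn is believed to depend on the properties of the codons being translated and the availability of particular transfer RNA (tRNA) molecules. The predominance of selected tRNAs in a cell generally reflects the codons most frequently used in peptide synthesis. Genes can therefore be tailored for optimal expression in a given organism based on codon optimization. Codon usage tables are readily available, for example in the Codon Usage Database at www.kazusa.or.jp/codon/, and these tables can be adapted in different ways. See Nakamura Y. et al., "Codon usage tabulated from the international DNA sequence databases: status for the year 2000", Nucl. Acids Res., 28:292 (2000).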
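The idea of swapping synonymous codons while keeping the encoded amino acid sequence unchanged can be sketched in a few lines of Python. The `PREFERRED_CODONS` table below is a made-up placeholder, not data for any real host; in practice it would be filled from a codon usage table for the organism of interest. The function names are likewise illustrative assumptions.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Hypothetical preferred codons for an imaginary host (illustration only).
PREFERRED_CODONS = {"K": "AAG", "R": "CGC", "L": "CTG"}


def translate(cds):
    """Translate a coding sequence whose length is a multiple of three."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def optimize_codons(cds, preferred=PREFERRED_CODONS):
    """Replace each codon with the preferred synonymous codon, if one is listed."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    optimized = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        # Codons for amino acids without a listed preference are left unchanged.
        optimized.append(preferred.get(amino_acid, codon))
    return "".join(optimized)


original = "ATGAAACGTTTATAA"
optimized = optimize_codons(original)
print(optimized)
# The protein sequence must be identical before and after optimization.
assert translate(original) == translate(optimized)
```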
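In some embodiments of the invention, the guide RNA is a single guide RNA (sgRNA). Methods for constructing a suitable sgRNA for a given target sequence are known in the art. See, for example: Wang, Y. et al. Simultaneous editing of three homoeoalleles in hexaploid bread wheat confers heritable resistance to powdery mildew. Nat. Biotechnol. 32, 947-951 (2014); Shan, Q. et al. Targeted genome modification of crop plants using a CRISPR-Cas system. Nat. Biotechnol. 31, 686-688 (2013); Liang, Z. et al. Targeted mutagenesis in Zea mays using TALENs and the CRISPR/Cas system. J Genet Genomics. 41, 63-68 (2014).
In some embodiments of the invention, the nucleotide sequence encoding the base editing fusion protein and/or the nucleotide sequence encoding the guide RNA is operably linked to an expression regulatory element such as a promoter.
Examples of promoters that can be used in the invention include, but are not limited to, polymerase (pol) I, pol II and pol III promoters. Examples of pol I promoters include the chicken RNA pol I promoter. Examples of pol II promoters include, but are not limited to, the cytomegalovirus immediate early (CMV) promoter, the Rous sarcoma virus long terminal repeat (RSV-LTR) promoter and the simian virus 40 (SV40) immediate early promoter. Examples of pol III promoters include the U6 and H1 promoters. Inducible promoters such as the metallothionein promoter may be used. Other examples of promoters include the T7 phage promoter, the T3 phage promoter, the β-galactosidase promoter and the Sp6 phage promoter. For use in plants, the promoter may be the cauliflower mosaic virus 35S promoter, the maize Ubi-1 promoter, the wheat U6 promoter, the rice U3 promoter, the maize U3 promoter or the rice actin promoter.
Organisms whose genome can be modified by the base editing system of the invention include any organism suitable for base editing, preferably eukaryotes. Examples of organisms include, but are not limited to, mammals such as humans, mice, rats, monkeys, dogs, pigs, sheep, cattle and cats; poultry such as chickens, ducks and geese; and plants, including monocotyledons and dicotyledons, for example crop plants, including but not limited to wheat, rice, maize, soybean, sunflower, sorghum, rapeseed, alfalfa, cotton, barley, millet, sugarcane, tomato, tobacco, cassava and potato. Preferably, the organism is a plant. More preferably, the organism is rice.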
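III. Methods for producing genetically modified organisms
In another aspect, the invention provides a method for producing a genetically modified organism, comprising introducing into a cell of the organism the base editing fusion protein of the invention, or an expression construct encoding the base editing fusion protein of the invention, or the system of the invention for base editing of a target sequence in the genome of a cell.
By introducing the system of the invention for base editing of a target sequence in the genome of a cell, the guide RNA can target the base editing fusion protein to the target sequence in the genome of the cell of the organism, resulting in the replacement of one or more Cs in the target sequence by T. In some preferred embodiments, the organism is a plant.
The selection and design of target sequences that can be recognized and targeted by the complex of a CRISPR effector protein and a guide RNA is within the skill of a person of ordinary skill in the art.
In some embodiments, the method of the invention further comprises screening for organisms, such as plants, having the desired nucleotide substitution. Nucleotide substitutions in organisms such as plants can be detected by T7EI, PCR/RE or sequencing methods; see, for example, Shan, Q., Wang, Y., Li, J. & Gao, C. Genome editing in rice and wheat using the CRISPR/Cas system. Nat. Protoc. 9, 2395-2410 (2014).
In the invention, the target sequence to be modified may be located anywhere in the genome, for example within a functional gene such as a protein-coding gene, or within a gene expression regulatory region such as a promoter region or an enhancer region, thereby modifying the function of the gene or the expression of the gene.
C-to-T base editing in the target sequence of the cell can be detected by T7EI, PCR/RE or sequencing methods.
In the method of the invention, the base editing system can be introduced into cells by various methods well known to those skilled in the art. Methods that can be used to introduce the genome editing system of the invention into cells include, but are not limited to: calcium phosphate transfection, protoplast fusion, electroporation, lipofection, microinjection, viral infection (e.g., baculovirus, vaccinia virus, adenovirus, adeno-associated virus, lentivirus and other viruses), biolistic (gene gun) delivery, PEG-mediated protoplast transformation and Agrobacterium-mediated transformation.
Cells whose genome can be edited by the method of the invention may be derived from, for example, mammals such as humans, mice, rats, monkeys, dogs, pigs, sheep, cattle and cats; poultry such as chickens, ducks and geese; and plants, including monocotyledons and dicotyledons, such as rice, maize, wheat, sorghum, barley, soybean, peanut and Arabidopsis.
The method of the invention is particularly suitable for producing genetically modified plants, for example crop plants. In the method of the invention for producing a genetically modified plant, the base editing system can be introduced into the plant by various methods well known to those skilled in the art. Methods that can be used to introduce the base editing system of the invention into plants include, but are not limited to: biolistic (gene gun) delivery, PEG-mediated protoplast transformation, Agrobacterium-mediated transformation, plant virus-mediated transformation, the pollen tube pathway method and the ovary injection method. Preferably, the base editing system is introduced into the plant by transient transformation.
In the method of the invention, the target sequence can be modified simply by introducing or producing the base editing fusion protein and the guide RNA in a plant cell, and the modification can be stably inherited without stably transforming the plant with the base editing system. This avoids the potential off-target effects of a stably present base editing system and also avoids integration of exogenous nucleotide sequences into the plant genome, and thus offers higher biosafety.
In some preferred embodiments, the introduction is carried out in the absence of selection pressure, thereby avoiding integration of exogenous nucleotide sequences into the plant genome.
In some embodiments, the introduction comprises transforming the base editing system of the invention into isolated plant cells or tissues and then regenerating the transformed plant cells or tissues into whole plants. Preferably, the regeneration is carried out in the absence of selection pressure; that is, no selection agent against the selectable gene carried on the expression vector is used during tissue culture. Omitting the selection agent can improve the regeneration efficiency of the plant and yields modified plants free of exogenous nucleotide sequences.
In other embodiments, the base editing system of the invention can be transformed into a specific part of an intact plant, such as a leaf, shoot apex, pollen tube, young ear or hypocotyl. This is particularly suitable for transforming plants that are difficult to regenerate by tissue culture.
In some embodiments of the invention, a protein expressed in vitro and/or an RNA molecule transcribed in vitro is transformed directly into the plant. The protein and/or RNA molecule can carry out base editing in the plant cell and is subsequently degraded by the cell, avoiding integration of exogenous nucleotide sequences into the plant genome.
Thus, in some embodiments, genetic modification and breeding of plants using the method of the invention can yield plants without integration of exogenous DNA, i.e., transgene-free modified plants. In addition, the base editing system of the invention has high specificity (a low off-target rate) when used for base editing in plants, which also improves biosafety.
Plants that can be base edited by the method of the invention include monocotyledons and dicotyledons. For example, the plant may be a crop plant such as wheat, rice, maize, soybean, sunflower, sorghum, rapeseed, alfalfa, cotton, barley, millet, sugarcane, tomato, tobacco, cassava or potato.
In some embodiments of the invention, the target sequence is associated with a plant trait such as an agronomic trait, whereby the base editing results in the plant having an altered trait relative to a wild-type plant. In the invention, the target sequence to be modified may be located anywhere in the genome, for example within a functional gene such as a protein-coding gene, or within a gene expression regulatory region such as a promoter region or an enhancer region, thereby modifying the function of the gene or the expression of the gene. Accordingly, in some embodiments of the invention, the C-to-T substitution results in an amino acid substitution in the target protein. In other embodiments of the invention, the C-to-T substitution results in a change in expression of the target gene.
In some embodiments of the invention, the method further comprises obtaining progeny of the genetically modified plant.
In another aspect, the invention also provides a genetically modified plant or a progeny or part thereof, wherein the plant is obtained by the method of the invention described above. In some embodiments, the genetically modified plant or progeny or part thereof is transgene-free.
In another aspect, the invention also provides a plant breeding method, comprising crossing a genetically modified first plant obtained by the method of the invention described above with a second plant that does not contain the genetic modification, thereby introducing the genetic modification into the second plant.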
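Examples
To facilitate understanding of the invention, the invention is described more fully below with reference to specific examples and the accompanying drawings. Preferred embodiments of the invention are shown in the drawings. However, the invention may be embodied in many different forms and is not limited to the examples described herein. Rather, these examples are provided so that the disclosure of the invention will be understood more thoroughly and completely.
Example 1. Selection of A3Bctd mutation sites based on protein structure information
Based on the published structural information for hA3Bctd (PDB: 2NBQ) and for full-length hAPOBEC3B (PDB: 5CQD, 5CQH, 5TD5), among others, amino acid point mutations were made mainly in Loop1 and Loop7, the key loop regions of hA3Bctd that are closely associated with single-stranded DNA binding, with the aim of reducing the ability to bind single-stranded DNA. The positions and types of the amino acid point mutations are shown in Figure 1.
The candidate base editing systems were developed on the A3A-BE3 vector backbone (SEQ ID NO: 1, a base editor comprising human APOBEC3A). A synthetic A3Bctd DNA fragment (SEQ ID NO: 2) was used to replace the APOBEC3A sequence in the A3A-BE3 vector by Gibson assembly, giving the A3Bctd-BE3 vector. Fusion PCR and Gibson assembly were then used to introduce point mutations into the encoded amino acids of A3Bctd in the A3Bctd-BE3 vector, giving the point-mutated base editing vectors A3Bctd-R210A-BE3, A3Bctd-R210K-BE3, A3Bctd-R211K-BE3, A3Bctd-T214C-BE3, A3Bctd-T214G-BE3, A3Bctd-T214S-BE3, A3Bctd-T214V-BE3, A3Bctd-L230K-BE3, A3Bctd-N240A-BE3, A3Bctd-W281H-BE3, A3Bctd-F308K-BE3, A3Bctd-R311K-BE3, A3Bctd-Y313F-BE3, A3Bctd-D314R-BE3, A3Bctd-D314H-BE3 and A3Bctd-Y315M-BE3 (the deaminase amino acid sequences after point mutation are SEQ ID NOs: 3-18, respectively).
In addition, the following control plasmids were constructed: A3A-BE3, YEE-BE3, RK-BE3, eA3A-BE3, A3A-R128A-BE3, A3A-Y130F-BE3 and untruncated APOBEC3B-BE3 (for the deaminase sequences, see SEQ ID NOs: 19-25). YEE and RK are two variants of the APOBEC1 deaminase on the BE3 vector and were constructed by fusion PCR and Gibson assembly. The A3A deaminase sequence was synthesized, and the A3A variants R128A and Y130F were constructed by fusion PCR and Gibson assembly.
Example 2. Validating in protoplasts the editing efficiency and specificity of A3Bctd-BE3 systems carrying single point mutations
2.1. Vector construction
The guide RNA vectors used in this experiment include the pSp-sgRNA and pSa-sgRNA vectors, which were constructed for the eight targets listed in Table 1 below. Targets ending in -T1 were cloned into the pSp-sgRNA vector by restriction digestion and ligation and served as guide RNA vectors for measuring on-target efficiency. Targets ending in -SaT1 or -SaT2 were cloned into the pSa-sgRNA vector by restriction digestion and ligation and served as vectors for detecting random off-target activity by the TA-AS method.
The principle of the TA-AS method is as follows. The base editing system to be tested (here, a base editing system based on nSpCas9) is co-transfected with another CRISPR system that is orthogonal to it (i.e., the two cannot share a gRNA) and that can generate a single-stranded region, such as an nSaCas9 system. The orthogonal CRISPR system produces a long-lived, stable single-stranded region at a selected site in the genome. If the base editing system being tested has genome-wide random off-target activity, it will deaminate C bases in this single-stranded region and cause undesired edits. High-throughput amplicon sequencing of the selected site therefore provides an efficient and simple way to detect the random off-target effects of the base editing system.
Table 1.
[Table 1 is provided as an image in the original publication: PCTCN2021079086-appb-000001]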
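The TA-AS readout described in section 2.1 amounts to asking what fraction of amplicon reads from the nSaCas9-opened site carry an unintended C-to-T change. A minimal Python sketch of that calculation follows. It assumes, for simplicity, that reads are already aligned to the reference, have the same length and are on the same strand; the function name and the toy reads are hypothetical and do not reproduce the analysis pipeline actually used in the examples.

```python
def c_to_t_read_fraction(reference, reads):
    """Fraction of reads with at least one C-to-T change relative to reference.

    Assumes pre-aligned, equal-length, same-strand reads (a simplification).
    """
    if not reads:
        raise ValueError("no reads supplied")
    if any(len(read) != len(reference) for read in reads):
        raise ValueError("every read must have the same length as the reference")
    # Only positions that are C in the reference can show C-to-T editing.
    c_positions = [i for i, base in enumerate(reference) if base == "C"]
    edited_reads = sum(1 for read in reads if any(read[i] == "T" for i in c_positions))
    return edited_reads / len(reads)


# Toy data: two of the four reads carry a C-to-T change.
toy_reference = "ACCGTC"
toy_reads = ["ACCGTC", "ATCGTC", "ACCGTT", "ACCGTC"]
print(c_to_t_read_fraction(toy_reference, toy_reads))
```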
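2.2. Protoplast transformation
With the conventional BE3, A3A-BE3, YEE-BE3, RK-BE3, eA3A-BE3, A3A-R128A-BE3, A3A-Y130F, untruncated APOBEC3B-BE3 and A3Bctd-BE3 systems as controls, each base editing system was co-transformed into rice protoplasts together with its own guide RNA vector pSp-sgRNA and with the pnSaCas9 and corresponding pSa-sgRNA of the TA-AS system. After two days of culture, amplicon sequencing of the target sites was performed, and the averages over four on-target sites and four off-target sites were used to evaluate on-target and off-target efficiency. Each target of each base editing system was tested in at least three biological replicates. The results are shown in Figures 2 and 3. Eight point mutations, R211K, T214V, F308K, R311K, Y313F, D314R, D314H and Y315M, were found to reduce off-target efficiency while maintaining relatively high editing efficiency; seven of these point mutations lie in Loop1 and Loop7. These seven variants were combined to further improve specificity.
Example 3. Combining the selected point mutations to further improve specificity
The seven amino acid mutation sites selected in the previous step were combined to give nine double and triple mutants (Figure 4). Following the same experimental procedure as above, the on-target editing efficiency and off-target efficiency of the combination variants were tested at four on-target sites and four off-target sites (Figure 4). The results showed that the two triple mutants KKR and VHM reduced off-target efficiency to a level comparable to background while maintaining efficient on-target editing (Figures 5 and 6). For the KKR variant in particular, the TA-AS results showed an average off-target efficiency of only 0.6% across the four sites tested, a 21-fold reduction compared with wild-type A3Bctd (Figures 5 and 6).
Example 4. Analysis of the editing characteristics of the new cytosine base editing systems
The characteristics of all base editing systems at the four on-target sites, including editing window, sequence preference and editing product type, were analyzed (treating the PAM sequence as positions 21-23). In terms of editing window, A3Bctd showed editing efficiency comparable to that of A3A-BE3, A3A-R128A and A3A-Y130F, but its working window was narrower than theirs. The single-amino-acid variants A3Bctd-Y313F, A3Bctd-R211K, A3Bctd-Y315M and A3Bctd-T214V narrowed the working window further, to 2-3 bp. The double and triple mutants narrowed the working window to 1-2 bp at a slight cost in editing efficiency (Figure 7).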
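The editing-window analysis in Example 4 can be pictured as computing, for each C in the protospacer (positions 1-20 when the PAM is counted as positions 21-23), the fraction of reads in which that C reads as T, and then reporting the span of positions above some cutoff. The Python sketch below shows one way to do this. The cutoff `min_fraction`, the function names and the toy reads are illustrative assumptions and are not taken from the patent.

```python
def c_to_t_fraction_by_position(protospacer, reads):
    """Map each 1-based protospacer position holding a C to its C-to-T fraction.

    Reads are assumed to be aligned to the protospacer and of equal length.
    """
    fractions = {}
    for index, base in enumerate(protospacer):
        if base == "C":
            converted = sum(1 for read in reads if read[index] == "T")
            fractions[index + 1] = converted / len(reads)
    return fractions


def editing_window(fractions, min_fraction):
    """Return (first, last) position with fraction >= min_fraction, or None."""
    positions = [pos for pos, frac in fractions.items() if frac >= min_fraction]
    return (min(positions), max(positions)) if positions else None


# Toy 20-nt protospacer with Cs at positions 4, 6 and 8, and four toy reads.
toy_protospacer = "GGACTCACGGATGGATGGAT"
toy_reads = [
    "GGACTCACGGATGGATGGAT",  # unedited
    "GGACTTACGGATGGATGGAT",  # C6 -> T
    "GGACTTATGGATGGATGGAT",  # C6 -> T and C8 -> T
    "GGACTTACGGATGGATGGAT",  # C6 -> T
]
fractions = c_to_t_fraction_by_position(toy_protospacer, toy_reads)
print(fractions)
print(editing_window(fractions, min_fraction=0.5))
print(editing_window(fractions, min_fraction=0.2))
```

In this toy data a stricter cutoff gives a one-position window and a looser cutoff a wider one, which is why any reported window width should be read together with the threshold used to define it.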
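Editing products can be classified by the number of mutated Cs into three mutation types: single, double and multiple. Figure 8 shows the average mutation types of the editing products of all base editing systems in this experiment at the four on-target sites, sorted by the efficiency of single-C mutation. Although the overall efficiency of the A3A-BE3 series of editing systems is relatively high, the proportion of products with a single C mutation is extremely low, so the chance of obtaining a single-C mutation product is very small. The A3Bctd variant Y313F has a nearly 10% chance of yielding an editing product with only one C mutated, and VHM, VR and KR likewise show high editing precision. Notably, the VHM and KKR variants give very precise products, producing essentially only editing products with one or two mutated Cs.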
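The single/double/multiple classification of editing products can be sketched as a simple tally over reads. In the Python sketch below, "Multiple" is taken to mean three or more converted Cs, which is an assumption about how the categories were bounded; the function name and toy reads are also illustrative only.

```python
from collections import Counter


def classify_products(reference, reads):
    """Count edited reads by the number of C-to-T conversions they carry.

    Reads are assumed aligned to the reference; unedited reads are skipped.
    'Multiple' is assumed here to mean three or more converted Cs.
    """
    counts = Counter()
    for read in reads:
        converted = sum(
            1 for ref_base, read_base in zip(reference, read)
            if ref_base == "C" and read_base == "T"
        )
        if converted == 0:
            continue
        if converted == 1:
            counts["Single"] += 1
        elif converted == 2:
            counts["Double"] += 1
        else:
            counts["Multiple"] += 1
    return dict(counts)


toy_reference = "CCACC"
toy_reads = ["CCACC", "TCACC", "TTACC", "TTATT", "CTACC"]
print(classify_products(toy_reference, toy_reads))
```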

Claims (39)

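  1. A base editing fusion protein comprising an APOBEC3B deaminase or an APOBEC3B deaminase mutant fused to a CRISPR effector protein.
  2. The base editing fusion protein of claim 1, wherein the APOBEC3B deaminase mutant is, or is derived from, a human APOBEC3B deaminase, for example a human APOBEC3B deaminase comprising the amino acid sequence shown in SEQ ID NO: 19.
  3. The base editing fusion protein of claim 1, wherein the APOBEC3B deaminase mutant is, or is derived from, the C-terminal domain of human APOBEC3B deaminase (hA3Bctd), for example an A3Bctd comprising the amino acid sequence of SEQ ID NO: 2.
  4. The base editing fusion protein of claim 2 or 3, wherein the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises an amino acid substitution at one or more of the following positions: 210, 211, 214, 230, 240, 281, 308, 311, 313, 314 and 315, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
  5. The base editing fusion protein of claim 2 or 3, wherein the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises an amino acid substitution at one or more of the following positions: 211, 214, 308, 311, 313, 314 and 315, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
  6. The base editing fusion protein of claim 2 or 3, wherein the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises amino acid substitutions at positions 211 and 311, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
  7. The base editing fusion protein of claim 2 or 3, wherein the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises amino acid substitutions at positions 211 and 313, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
  8. The base editing fusion protein of claim 2 or 3, wherein the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises amino acid substitutions at positions 211 and 314, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
  9. The base editing fusion protein of claim 2 or 3, wherein the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises amino acid substitutions at positions 311 and 313, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
  10. The base editing fusion protein of claim 2 or 3, wherein the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises amino acid substitutions at positions 214 and 314, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
  11. The base editing fusion protein of claim 2 or 3, wherein the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises amino acid substitutions at positions 314 and 315, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
  12. The base editing fusion protein of claim 2 or 3, wherein the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises amino acid substitutions at positions 211, 311 and 314, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
  13. The base editing fusion protein of claim 2 or 3, wherein the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises amino acid substitutions at positions 211, 214 and 313, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
  14. The base editing fusion protein of claim 2 or 3, wherein the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises amino acid substitutions at positions 214, 314 and 315, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
  15. The base editing fusion protein of claim 2 or 3, wherein the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises one or more amino acid substitutions selected from: R210A, R210K, R211K, T214C, T214G, T214S, T214V, L230K, N240A, W281H, F308K, R311K, Y313F, D314R, D314H and Y315M, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
  16. The base editing fusion protein of claim 2 or 3, wherein the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises one or more amino acid substitutions selected from: R211K, T214V, F308K, R311K, Y313F, D314R, D314H and Y315M, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
  17. The base editing fusion protein of claim 2 or 3, wherein the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises the amino acid substitutions R211K and R311K, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
  18. The base editing fusion protein of claim 2 or 3, wherein the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises the amino acid substitutions R211K and Y313F, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
  19. The base editing fusion protein of claim 2 or 3, wherein the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises the amino acid substitutions R211K and D314R, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
  20. The base editing fusion protein of claim 2 or 3, wherein the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises the amino acid substitutions R311K and Y313F, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
  21. The base editing fusion protein of claim 2 or 3, wherein the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises the amino acid substitutions T214V and D314R, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
  22. The base editing fusion protein of claim 2 or 3, wherein the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises the amino acid substitutions D314R and Y315M, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
  23. The base editing fusion protein of claim 2 or 3, wherein the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises the amino acid substitutions R211K, R311K and D314R, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
  24. The base editing fusion protein of claim 2 or 3, wherein the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises the amino acid substitutions R211K, T214V and Y313F, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
  25. The base editing fusion protein of claim 2 or 3, wherein the APOBEC3B deaminase mutant is derived from human APOBEC3B deaminase (hA3B) or the C-terminal domain of human APOBEC3B deaminase (hA3Bctd) and, relative to wild-type hA3B or hA3Bctd, comprises the amino acid substitutions T214V, D314H and Y315M, wherein the amino acid positions are determined with reference to SEQ ID NO: 19.
  26. The base editing fusion protein of claim 1, wherein the APOBEC3B deaminase mutant comprises an amino acid sequence selected from SEQ ID NOs: 3-18, 26-31 and 32-34.
  27. The base editing fusion protein of any one of claims 1-26, wherein the CRISPR effector protein is a nuclease-inactivated CRISPR effector protein, for example a CRISPR nickase.
  28. The base editing fusion protein of claim 27, wherein the nuclease-inactivated CRISPR effector protein is a nuclease-inactivated Cas9 that comprises the amino acid substitutions D10A and/or H840A relative to wild-type Cas9, the nuclease-inactivated Cas9 comprising the amino acid sequence shown in SEQ ID NO: 35.
  29. The base editing fusion protein of any one of claims 1-28, wherein the APOBEC3B deaminase or APOBEC3B deaminase mutant is fused to the N-terminus of the CRISPR effector protein.
  30. The base editing fusion protein of any one of claims 1-29, wherein the APOBEC3B deaminase or APOBEC3B deaminase mutant and the CRISPR effector protein are fused via a linker, for example the linker shown in SEQ ID NO: 36 or 37.
  31. The base editing fusion protein of any one of claims 1-30, wherein the base editing fusion protein further comprises a uracil DNA glycosylase inhibitor (UGI), for example a uracil DNA glycosylase inhibitor comprising the amino acid sequence shown in SEQ ID NO: 38.
  32. The base editing fusion protein of any one of claims 1-31, wherein the base editing fusion protein further comprises a nuclear localization sequence (NLS).
  33. A system for base editing of a target sequence in the genome of a cell, comprising at least one of the following i) to v):
    i) the base editing fusion protein of any one of claims 1-32, and a guide RNA;
    ii) an expression construct comprising a nucleotide sequence encoding the base editing fusion protein of any one of claims 1-32, and a guide RNA;
    iii) the base editing fusion protein of any one of claims 1-32, and an expression construct comprising a nucleotide sequence encoding a guide RNA;
    iv) an expression construct comprising a nucleotide sequence encoding the base editing fusion protein of any one of claims 1-32, and an expression construct comprising a nucleotide sequence encoding a guide RNA;
    v) an expression construct comprising a nucleotide sequence encoding the base editing fusion protein of any one of claims 1-32 and a nucleotide sequence encoding a guide RNA;
    wherein the guide RNA is capable of targeting the base editing fusion protein to the target sequence in the genome of the cell.
  34. The system of claim 33, comprising more than one guide RNA or expression construct thereof, so that more than one target sequence can be base edited at the same time.
  35. The system of claim 33 or 34, wherein the nucleotide sequence encoding the base editing fusion protein is codon-optimized for the organism from which the cell to be base edited is derived.
  36. The system of any one of claims 33-35, wherein the guide RNA is a single guide RNA (sgRNA).
  37. The system of any one of claims 33-36, wherein the nucleotide sequence encoding the base editing fusion protein and/or the nucleotide sequence encoding the guide RNA is operably linked to an expression regulatory element such as a promoter.
  38. A method for producing a genetically modified organism, comprising introducing into a cell of the organism the base editing fusion protein of any one of claims 1-32, or an expression construct comprising a nucleotide sequence encoding the base editing fusion protein of any one of claims 1-32, or the system of any one of claims 33-36 for base editing of a target sequence in the genome of a cell.
  39. The method of claim 38, wherein the organism is a plant.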
PCT/CN2021/079086, priority date 2020-03-04, filing date 2021-03-04: Improved cytosine base editing system, WO2021175288A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010145047.2 2020-03-04
CN202010145047 2020-03-04
