WO2021098709A1 - Gene editing *** derived from Flavobacterium - Google Patents

Gene editing *** derived from Flavobacterium

Info

Publication number
WO2021098709A1
Authority
WO
WIPO (PCT)
Prior art keywords
guide rna
sequence
cas12a protein
ribozyme
seq
Prior art date
Application number
PCT/CN2020/129665
Inventor
Caixia Gao (高彩霞)
Shuai Jin (靳帅)
Original Assignee
Institute of Genetics and Developmental Biology, Chinese Academy of Sciences (中国科学院遗传与发育生物学研究所)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Genetics and Developmental Biology, Chinese Academy of Sciences
Priority to US17/777,936 (US20230002453A1)
Priority to BR112022009584A (BR112022009584A2)
Priority to CN202080080579.0A (CN115052980A)
Priority to EP20890516.6A (EP4063500A4)
Publication of WO2021098709A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases RNAses, DNAses
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/195 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/10 Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/102 Mutagenizing nucleic acids
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113 Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/87 Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
    • C12N15/90 Stable introduction of foreign DNA into chromosome
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/87 Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
    • C12N15/90 Stable introduction of foreign DNA into chromosome
    • C12N15/902 Stable introduction of foreign DNA into chromosome using homologous recombination
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N5/00 Undifferentiated human, animal or plant cells, e.g. cell lines; Tissues; Cultivation or maintenance thereof; Culture media therefor
    • C12N5/10 Cells modified by introduction of foreign genetic material
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C07K2319/01 Fusion polypeptide containing a localisation/targetting motif
    • C07K2319/09 Fusion polypeptide containing a localisation/targetting motif containing a nuclear localisation signal
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/10 Type of nucleic acid
    • C12N2310/12 Type of nucleic acid catalytic nucleic acids, e.g. ribozymes
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/10 Type of nucleic acid
    • C12N2310/20 Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/30 Chemical structure
    • C12N2310/35 Nature of the modification
    • C12N2310/351 Conjugate
    • C12N2310/3519 Fusion with another nucleic acid

Definitions

  • The invention belongs to the field of genetic engineering. Specifically, the present invention relates to a gene editing system derived from Flavobacterium and its applications.
  • Genome editing technology is a genetic engineering technology based on the targeted modification of the genome by artificial nucleases, and it plays an increasingly important role in agricultural and medical research.
  • In the clustered regularly interspaced short palindromic repeats/CRISPR-associated (CRISPR/Cas) system, the Cas protein can be targeted, under the guidance of RNA, to any position in the genome, where it produces a double-strand break (DSB) in the targeted sequence and activates non-homologous end joining (NHEJ) or homology-directed repair (HDR); mutations are introduced through these two repair pathways.
  • The most commonly used Cas protein is the Cas9 protein derived from Streptococcus pyogenes, which belongs to the Type II-A subtype of the Class 2 CRISPR systems (Cong et al., Multiplex Genome Engineering Using CRISPR/Cas Systems, Science, 2013; Mali et al., RNA-Guided Human Genome Engineering via Cas9, Science, 2013).
  • Both the CRISPR/Cas12a system and the CRISPR/Cas9 system belong to the Class 2 CRISPR systems.
  • Zetsche et al. applied the Cas12a proteins (formerly known as Cpf1) derived from Acidaminococcus and Lachnospiraceae to gene editing in animal cells (Cpf1 Is a Single RNA-Guided Endonuclease of a Class 2 CRISPR-Cas System, Cell, 2015).
  • The CRISPR/Cas12a system belongs to Type V. Its advantages include a shorter crRNA sequence, higher specificity, a 5'-TTTN PAM sequence that complements the 3'-NGG PAM of Cas9, and cleavage that more readily produces sticky ends, further expanding the gene editing toolbox of CRISPR systems.
  • CRISPR/Cas9 and CRISPR/Cas12a have been successfully applied in animal cell lines, whole animals, plant cells, whole plants and microorganisms. Because of their high efficiency and ease of use, they have caused a worldwide revolution in the field of gene editing.
  • The working efficiency of the CRISPR/Cas12a system varies greatly between target sites and is low at certain sites in plant genomes. This may be because existing Cas12a systems are mainly derived from bacteria that are pathogenic to humans or animals, whose suitable working temperature is higher than that of plants. It is therefore necessary to identify and develop a CRISPR/Cas12a system that works stably at temperatures suitable for plants.
  • Through homology and similarity comparison, the inventors identified a previously unreported FbCas12a protein in a plant symbiotic bacterium, predicted the mature form of its native crRNA, and compared its native crRNA with the crRNA of LbCas12a in vivo. FbCas12a was found to work in plant cells and to have higher editing efficiency when used with the LbCas12a crRNA.
  • FIG. 1 Schematic diagram of the vectors used in the examples.
  • FIG. 1 Editing of rice endogenous gene OsEPSPS by the combination of FbCas12a and FbcrRNA.
  • FIG. Editing results of rice endogenous genes by the combination of FbCas12a and FbcrRNA or LbcrRNA.
  • the term “and/or” encompasses all combinations of items connected by the term, and should be treated as if each combination has been individually listed herein.
  • “A and/or B” encompasses “A”, “A and B”, and “B”.
  • “A, B, and/or C” encompasses "A”, “B”, “C”, “A and B”, “A and C”, “B and C”, and "A and B and C”.
  • The protein or nucleic acid may consist of the sequence, or may have additional amino acids or nucleotides at one or both ends of the protein or nucleic acid, while still having the activity described in the present invention.
  • The methionine encoded by the start codon at the N-terminus of the polypeptide may be retained under certain practical conditions (for example, when expressed in a specific expression system), but this does not substantially affect the function of the polypeptide.
  • "Genome" as used herein covers not only chromosomal DNA present in the nucleus, but also organelle DNA present in subcellular components of the cell (such as mitochondria and plastids).
  • "Organism" includes any organism suitable for genome editing, preferably eukaryotes.
  • Examples of organisms include, but are not limited to, mammals such as humans, mice, rats, monkeys, dogs, pigs, sheep, cattle and cats; poultry such as chickens, ducks and geese; and plants, including monocots and dicots, for example rice, corn, wheat, sorghum, barley, soybeans, peanuts and Arabidopsis.
  • "Genetically modified organism" or "genetically modified cell" means an organism or cell that contains exogenous polynucleotides or modified genes or expression control sequences in its genome.
  • exogenous polynucleotides can be stably integrated into the genome of organisms or cells, and inherited for successive generations.
  • the exogenous polynucleotide can be integrated into the genome alone or as part of a recombinant DNA construct.
  • the modified gene or expression control sequence contains single or multiple deoxynucleotide substitutions, deletions and additions in the organism or cell genome.
  • "Exogenous" in reference to a sequence means a sequence from a foreign species or, if from the same species, a sequence that has undergone significant changes in composition and/or locus from its natural form through deliberate human intervention.
  • The term "nucleic acid sequence" and its synonyms are used interchangeably and refer to single-stranded or double-stranded RNA or DNA polymers, optionally containing synthetic, non-natural or altered nucleotide bases.
  • Nucleotides are referred to by their single-letter names as follows: "A" is adenosine or deoxyadenosine (for RNA or DNA, respectively), "C" is cytidine or deoxycytidine, "G" is guanosine or deoxyguanosine, "U" is uridine, "T" is deoxythymidine, "R" is purine (A or G), "Y" is pyrimidine (C or T), "K" is G or T, "H" is A or C or T, "I" is inosine, and "N" is any nucleotide.
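The single-letter codes above can be turned into a small matcher for degenerate DNA patterns such as a PAM. The following is a minimal Python sketch, not part of the patent text: the names `CODES`, `to_regex` and `matches` are hypothetical, only the codes listed above are covered, and "I" (inosine) is deliberately left out because it denotes a modified base rather than a set of standard bases.

```python
import re

# Codes as listed in the text above (a subset of the IUPAC alphabet).
# "I" (inosine) is omitted on purpose: it is a modified base, not a set.
CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "U",
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "K": "GT",
    "H": "ACT",
    "N": "ACGT",  # any nucleotide
}


def to_regex(pattern: str) -> re.Pattern:
    """Compile a degenerate DNA pattern into a regular expression."""
    parts = []
    for code in pattern.upper():
        bases = CODES[code]  # raises KeyError for codes not listed above
        parts.append(bases if len(bases) == 1 else f"[{bases}]")
    return re.compile("".join(parts))


def matches(pattern: str, sequence: str) -> bool:
    """Return True if the whole sequence fits the degenerate pattern."""
    return to_regex(pattern).fullmatch(sequence.upper()) is not None
```

For example, `matches("TTTN", "TTTG")` is true, while `matches("TTTN", "TTAG")` is false.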
  • "Polypeptide", "peptide", and "protein" are used interchangeably in the present invention and refer to a polymer of amino acid residues.
  • The terms apply to amino acid polymers in which one or more amino acid residues are artificial chemical analogs of the corresponding naturally occurring amino acids, as well as to naturally occurring amino acid polymers.
  • The terms "polypeptide", "peptide", "amino acid sequence" and "protein" may also include modified forms, including but not limited to glycosylation, lipid linkage, sulfation, gamma-carboxylation of glutamic acid residues, hydroxylation and ADP-ribosylation.
  • Sequence "identity” has the art-recognized meaning, and the percentage of sequence identity between two nucleic acid or polypeptide molecules or regions can be calculated using published techniques. Sequence identity can be measured along the entire length of a polynucleotide or polypeptide or along a region of the molecule.
  • The term "identity" is well known to the skilled person (Carrillo, H. & Lipman, D., SIAM J Applied Math 48: 1073 (1988)).
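As an illustration of how a percentage of sequence identity can be calculated, here is a minimal Python sketch. It assumes the two sequences have already been aligned (equal length, with "-" marking gaps) and uses one common convention: identical columns divided by all aligned columns, gaps included. Other conventions use a different denominator, and the text above does not prescribe one; the function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment.

    Assumption: inputs are pre-aligned, equal-length strings using "-"
    for gaps. Columns where both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # an empty column carries no information
        columns += 1
        if a == b:
            identical += 1  # a gap never equals a residue, so gaps count as mismatches
    if columns == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * identical / columns
```

Measuring identity along a region rather than the entire length amounts to passing only the aligned slice of interest.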
  • Suitable conservative amino acid substitutions are known to those skilled in the art and can generally be made without changing the biological activity of the resulting molecule.
  • Those skilled in the art recognize that a single amino acid substitution in a non-essential region of a polypeptide does not substantially change its biological activity (see, for example, Watson et al., Molecular Biology of the Gene, 4th Edition, 1987, The Benjamin/Cummings Pub. Co., p. 224).
  • expression construct refers to a vector suitable for expression of a nucleotide sequence of interest in an organism, such as a recombinant vector.
  • “Expression” refers to the production of a functional product.
  • the expression of a nucleotide sequence may refer to the transcription of the nucleotide sequence (such as transcription to generate mRNA or functional RNA) and/or the translation of RNA into a precursor or mature protein.
  • the "expression construct" of the present invention can be a linear nucleic acid fragment, a circular plasmid, a viral vector, or, in some embodiments, can be an RNA (such as mRNA) that can be translated.
  • the "expression construct" of the present invention may comprise regulatory sequences and nucleotide sequences of interest from different sources, or regulatory sequences and nucleotide sequences of interest from the same source but arranged in a manner different from those normally occurring in nature.
  • "Regulatory sequence" and "regulatory element" are used interchangeably and refer to a nucleotide sequence that is located upstream (5' non-coding sequence), within, or downstream (3' non-coding sequence) of a coding sequence and that affects the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences may include, but are not limited to, promoters, translation leader sequences, introns, and polyadenylation recognition sequences.
  • Promoter refers to a nucleic acid fragment capable of controlling the transcription of another nucleic acid fragment.
  • The promoter is one capable of controlling gene transcription in a cell, regardless of whether it is derived from that cell.
  • The promoter can be a constitutive promoter, a tissue-specific promoter, a developmentally regulated promoter or an inducible promoter.
  • "Tissue-specific promoter" and "tissue-preferred promoter" are used interchangeably and refer to a promoter that is expressed mainly, but not necessarily exclusively, in one tissue or organ, and that may also be expressed in a specific cell or cell type.
  • "Developmentally regulated promoter" refers to a promoter whose activity is determined by developmental events.
  • Inducible promoters selectively express operably linked DNA sequences in response to endogenous or exogenous stimuli (environment, hormones, chemical signals, etc.).
  • "Operably linked" refers to the connection of regulatory elements (for example, but not limited to, promoter sequences, transcription termination sequences, etc.) to nucleic acid sequences (for example, coding sequences or open reading frames) such that the transcription of the nucleotide sequence is controlled and regulated by the transcription regulatory element.
  • "Introducing" nucleic acid molecules (such as plasmids, linear nucleic acid fragments, RNA, etc.) or proteins into an organism refers to transforming the cells of the organism with the nucleic acid or protein so that the nucleic acid or protein can function in the cell.
  • the "transformation” used in the present invention includes stable transformation and transient transformation.
  • “Stable transformation” refers to the introduction of an exogenous nucleotide sequence into the genome, resulting in the stable inheritance of the exogenous nucleotide sequence. Once stably transformed, the exogenous nucleic acid sequence is stably integrated into the genome of the organism and any successive generations thereof.
  • Transient transformation refers to the introduction of nucleic acid molecules or proteins into cells to perform functions without stable inheritance of exogenous nucleotide sequences. In transient transformation, the foreign nucleic acid sequence is not integrated into the genome.
  • "Trait" refers to the physiological, morphological, biochemical or physical characteristics of cells or organisms.
  • "Agronomic traits" refer in particular to measurable parameters of crop plants, including but not limited to: leaf greenness, grain yield, growth rate, total biomass or accumulation rate, fresh weight at maturity, dry weight at maturity, fruit yield, seed yield, plant total nitrogen content, fruit nitrogen content, seed nitrogen content, plant vegetative tissue nitrogen content, plant total free amino acid content, fruit free amino acid content, seed free amino acid content, plant vegetative tissue free amino acid content, plant total protein content, fruit protein content, seed protein content, plant vegetative tissue protein content, herbicide resistance, drought resistance, nitrogen absorption, root lodging, harvest index, stem lodging, plant height, ear height, ear length, disease resistance, cold resistance, salt resistance and tiller number.
  • Genome editing system based on Flavobacterium Cas12a protein
  • the present invention provides a new Cas12a protein, which
  • "Cas12a protein", "Cas12a nuclease" and "Cas12a" are used interchangeably herein and refer to an RNA-guided nuclease comprising a Cas12a protein or a fragment thereof, or a variant of such a nuclease.
  • Cas12a is a component of the CRISPR-Cas12a genome editing system, which can target and/or cleave DNA target sequences to form DNA double-strand breaks (DSB) under the guidance of guide RNA (crRNA).
  • the Cas12a protein of the present invention is derived from plant symbiotic bacteria, and therefore, is particularly suitable for genome editing in plants.
  • the Cas12a protein is derived from a species of the genus Flavobacterium. In some embodiments, the Cas12a protein is derived from Flavobacterium branchiophilum. Those skilled in the art will understand that the Cas12a protein of different strains of the same bacterial species may have certain differences in amino acid sequence, but can achieve substantially the same function.
  • the Cas12a protein is produced recombinantly.
  • the Cas12a protein further contains a fusion tag, for example, a tag used for the separation and/or purification of the Cas12a protein.
  • Methods of recombinantly producing proteins are known in the art.
  • Tags that can be used to separate and/or purify proteins are known in the art, including but not limited to His tags, GST tags, and the like. Generally speaking, these tags do not change the activity of the target protein.
  • the Cas12a protein is also fused with other functional proteins, such as deaminase, transcription activation/repressor protein, etc., so as to realize base editing or transcription regulation functions.
  • the Cas12a protein of the present invention further comprises a nuclear localization sequence (NLS), for example, connected to the nuclear localization sequence via a linker.
  • The linker can be a non-functional amino acid sequence, 1-50 (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, or 20-25, 25-50) or more amino acids in length, without secondary or higher-order structure.
  • the linker may be a flexible linker, such as SGGS (SEQ ID NO: 3).
  • one or more NLS in the Cas12a protein should have sufficient strength to drive the Cas12a protein to accumulate in the nucleus in an amount that can achieve its genome editing function.
  • the strength of nuclear localization activity is determined by the number and location of NLS in the Cas12a protein, one or more specific NLS used, or a combination of these factors.
  • Exemplary nuclear localization sequences include, but are not limited to, SV40 nuclear localization signal sequence (for example, shown in SEQ ID NO: 4), and nucleoplasmin nuclear localization signal sequence (for example, shown in SEQ ID NO: 5).
  • The Cas12a protein of the present invention may also include other localization sequences, such as a cytoplasmic localization sequence, a chloroplast localization sequence, a mitochondrial localization sequence, etc.
  • Multiple localization sequences may be connected by a linker.
  • the Cas12a protein comprises the amino acid sequence shown in SEQ ID NO:6.
  • the present invention provides the use of the Cas12a protein of the present invention in genome editing of cells, preferably eukaryotic cells, and more preferably plant cells.
  • The present invention provides a genome editing system for site-directed modification of a target nucleic acid sequence in a cell genome, which comprises the Cas12a protein of the present invention and/or an expression construct comprising a nucleotide sequence encoding the Cas12a protein of the present invention.
  • "Genome editing system" and "gene editing system" are used interchangeably and refer to the combination of components required for editing the genome of an organism's cells, wherein the various components of the system, for example the Cas12a protein, the gRNA, or the corresponding expression constructs, may exist independently of each other or in the form of a composition in any combination.
  • The genome editing system further includes at least one guide RNA (gRNA) and/or an expression construct comprising a nucleotide sequence encoding the at least one guide RNA.
  • The guide RNA of the CRISPR-Cas12a genome editing system is usually composed of a crRNA molecule only, where the crRNA contains a sequence with sufficient identity to the target sequence to hybridize with the complement of the target sequence and to direct sequence-specific binding of the CRISPR complex (Cas12a + crRNA) to the target sequence.
  • the guide RNA is crRNA.
  • the guide RNA includes the crRNA backbone sequence shown in SEQ ID NO: 10 or 11.
  • the crRNA backbone sequence is SEQ ID NO: 11.
  • The crRNA sequence further includes a sequence (i.e., a spacer sequence) that specifically hybridizes with the complementary sequence of the target sequence, located 3' of the crRNA backbone sequence.
  • the crRNA comprises the following sequence:
  • The sequence Nx (the spacer sequence) can specifically hybridize to the complementary sequence of the target sequence.
  • The 5' end of the target sequence targeted by the genome editing system of the present invention needs to include a protospacer adjacent motif (PAM).
  • The PAM may be, for example, 5'-TTTN, where N represents A, G, C, or T.
  • Different PAM sequences can also be used.
  • Those skilled in the art can easily determine target sequences in the genome that can be targeted and optionally edited, and design a suitable guide RNA accordingly. For example, if the genome contains the PAM sequence 5'-TTTG-3', the about 18 to about 35, preferably 20, 21, 22 or 23, consecutive nucleotides immediately 3' of it can be used as the target sequence.
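The target selection rule described above (a 5'-TTTN PAM followed immediately on its 3' side by the target sequence) can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `find_targets` and the default length of 23 nucleotides are assumptions, only the strand that is passed in is scanned, and no filtering for efficiency or off-targets is attempted.

```python
import re


def find_targets(genome: str, target_len: int = 23):
    """List (pam_start, pam, target) for every 5'-TTTN PAM on this strand.

    The target is the target_len nucleotides immediately 3' of the PAM.
    A lookahead is used so that overlapping PAM sites are all reported.
    """
    genome = genome.upper()
    pattern = re.compile(rf"(?=(TTT[ACGT])([ACGT]{{{target_len}}}))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(genome)]


def reverse_complement(dna: str) -> str:
    """Reverse complement, used to scan the opposite strand."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]
```

To cover both strands, the same function can be called on `reverse_complement(genome)`; positions returned for that call are relative to the reversed strand.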
  • the at least one guide RNA is encoded by different expression constructs. In some embodiments, the at least one guide RNA is encoded by the same expression construct. In some embodiments, the at least one guide RNA and the Cas12a protein of the invention are encoded by the same expression construct.
  • the genome editing system may comprise any one selected from the following:
  • the Cas12a protein of the present invention and an expression construct comprising a nucleotide sequence encoding the at least one guide RNA;
  • an expression construct comprising a nucleotide sequence encoding the Cas12a protein of the present invention, and an expression construct comprising a nucleotide sequence encoding the at least one guide RNA;
  • the nucleotide sequence encoding the Cas12a protein is codon-optimized for the organism from which the cell to be genome edited is derived.
  • Codon optimization refers to modifying a nucleic acid sequence to enhance expression in the host cell of interest by replacing at least one codon of the natural sequence (e.g., about or more than about 1, 2, 3, 4, 5, 10, 15, 20, 25, 50 or more codons) with codons that are used more frequently or most frequently in the genes of the host cell, while maintaining the natural amino acid sequence.
  • Different species display a particular preference for certain codons of a specific amino acid. Codon preference (the difference in codon usage between organisms) is often related to the translation efficiency of messenger RNA (mRNA), which in turn is considered to depend on the nature of the codons being translated and the availability of specific transfer RNA (tRNA) molecules.
  • Codon usage tables are readily available, for example in the Codon Usage Database at www.kazusa.or.jp/codon/, and these tables can be adapted in different ways. See Nakamura Y. et al., "Codon usage tabulated from the international DNA sequence databases: status for the year 2000", Nucl. Acids Res., 28:292 (2000).
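The idea of codon optimization described above can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions, not the procedure used in the patent: each codon is replaced by the synonymous codon with the highest frequency in a usage table supplied by the caller, and the amino acid sequence is checked to be unchanged. The names `optimize` and `translate` are hypothetical, and the frequencies in the usage example are made-up numbers, not values from any real codon usage table.

```python
from itertools import product

# Standard genetic code, built in TCAG order ("*" marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate a coding sequence whose length is a multiple of three."""
    cds = cds.upper()
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def optimize(cds: str, usage: dict) -> str:
    """Swap each codon for a more frequently used synonymous codon.

    usage maps codons to relative frequencies in the host; codons that
    are missing from it count as 0, so they are never preferred.
    """
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        amino = CODON_TABLE[codon]
        synonyms = [c for c, a in CODON_TABLE.items() if a == amino]
        best = max(synonyms, key=lambda c: usage.get(c, 0.0))
        # Keep the original codon unless a synonym is strictly more frequent.
        if usage.get(best, 0.0) > usage.get(codon, 0.0):
            codon = best
        out.append(codon)
    optimized = "".join(out)
    assert translate(optimized) == translate(cds)  # protein must be unchanged
    return optimized
```

With the illustrative table `{"CTG": 0.5, "CTT": 0.1, "AAG": 0.7, "AAA": 0.3}`, the input `ATGCTTAAATAA` becomes `ATGCTGAAGTAA`; both encode the same peptide. Real optimization tools typically also weigh GC content, repeats and unwanted motifs, which this sketch ignores.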
  • The organisms whose cells can be genome edited with the Cas12a protein or genome editing system of the present invention are preferably eukaryotes, including but not limited to mammals such as humans, mice, rats, monkeys, dogs, pigs, sheep, cattle and cats; poultry such as chickens, ducks and geese; and plants, including monocotyledonous and dicotyledonous plants, such as rice, corn, wheat, sorghum, barley, soybeans, peanuts and Arabidopsis.
  • the Cas12a protein or genome editing system of the present invention is particularly suitable for genome editing in plants.
  • the nucleotide sequence encoding the Cas12a protein is codon-optimized for plants such as rice.
  • the nucleotide sequence encoding the Cas12a protein is selected from SEQ ID NO: 2 and SEQ ID NO: 7.
  • the nucleotide sequence encoding the Cas12a protein and/or the nucleotide sequence encoding the at least one guide RNA are operably linked to an expression control element such as a promoter.
  • Examples of promoters include, but are not limited to, polymerase (pol) I, pol II, or pol III promoters.
  • Examples of pol I promoters include the chicken RNA pol I promoter.
  • Examples of pol II promoters include, but are not limited to, the cytomegalovirus immediate early (CMV) promoter, the Rous sarcoma virus long terminal repeat (RSV-LTR) promoter, and the simian virus 40 (SV40) immediate early promoter.
  • Examples of pol III promoters include the U6 and H1 promoters.
  • An inducible promoter such as a metallothionein promoter can be used.
  • Other examples of promoters include the T7 phage promoter, the T3 phage promoter, the β-galactosidase promoter, and the Sp6 phage promoter.
  • The promoter can be the cauliflower mosaic virus 35S promoter, the maize Ubi-1 promoter, the wheat U6 promoter, the rice U3 promoter, the maize U3 promoter, or the rice actin promoter.
  • The 5' end of the guide RNA coding sequence can be connected to the 3' end of a first ribozyme coding sequence. The first ribozyme is designed to cleave, at the 5' end of the guide RNA, the first ribozyme-guide RNA fusion produced by transcription in the cell, thereby forming a guide RNA that carries no additional nucleotides at the 5' end.
  • The 3' end of the guide RNA coding sequence can be connected to the 5' end of a second ribozyme coding sequence. The second ribozyme is designed to cleave, at the 3' end of the guide RNA, the guide RNA-second ribozyme fusion produced by transcription in the cell, thereby forming a guide RNA that carries no additional nucleotides at the 3' end.
  • The 5' end of the guide RNA coding sequence can be connected to the 3' end of the first ribozyme coding sequence while the 3' end of the guide RNA coding sequence is connected to the 5' end of the second ribozyme coding sequence. The first ribozyme is designed to cleave the first ribozyme-guide RNA-second ribozyme fusion produced by transcription in the cell at the 5' end of the guide RNA, and the second ribozyme is designed to cleave the same fusion at the 3' end of the guide RNA, thereby forming a guide RNA that carries no additional nucleotides at the 5' and 3' ends.
  • The design of the first or second ribozyme is within the abilities of those skilled in the art. For example, see Gao et al., JIPB, Apr, 2014; Vol 56, Issue 4, 343-349.
  • The first ribozyme is encoded by the following sequence: 5'-(N)6CTGATGAGTCCGTGAGGACGAAACGAGTAAGCTCGTC-3' (SEQ ID NO: 31), wherein N is independently selected from A, G, C, and T, and (N)6 represents the sequence reverse complementary to the first 6 nucleotides at the 5' end of the guide RNA.
  • The second ribozyme is encoded by the following sequence: 5'-GGCCGGCATGGTCCCAGCCTCCTCGCTGGCGCCGGCTGGGCAACATGCTTCGGCATGGCGAATGGGAC-3' (SEQ ID NO: 32).
  • the 5' end of the guide RNA coding sequence is connected to the 3' end of the first tRNA coding sequence, and the first tRNA is designed so that the first tRNA-guide RNA fusion produced by transcription in the cell is cleaved at the 5' end of the guide RNA (that is, cleaved by the precise tRNA processing mechanism present in the cell, which precisely removes the 5' and 3' additional sequences of the precursor tRNA to form a mature tRNA), thereby forming a guide RNA that carries no extra nucleotides at the 5' end.
  • the 3' end of the guide RNA coding sequence is connected to the 5' end of the second tRNA coding sequence, and the second tRNA is designed so that the guide RNA-second tRNA fusion produced by transcription in the cell is cleaved at the 3' end of the guide RNA, thereby forming a guide RNA that carries no extra nucleotides at the 3' end.
  • the 5' end of the guide RNA coding sequence is connected to the 3' end of the first tRNA coding sequence, and the 3' end of the guide RNA coding sequence is connected to the 5' end of the second tRNA coding sequence.
  • the first tRNA is designed so that the first tRNA-guide RNA-second tRNA fusion produced by transcription in the cell is cleaved at the 5' end of the guide RNA.
  • the second tRNA is designed so that the first tRNA-guide RNA-second tRNA fusion produced by transcription in the cell is cleaved at the 3' end of the guide RNA, thereby forming a guide RNA that carries no extra nucleotides at the 5' and 3' ends.
  • designing the tRNA-guide RNA fusion is within the ability of those skilled in the art. For example, see Xie et al., PNAS, Mar 17, 2015; vol. 112, no. 11, 3570-3575.
  • the present invention provides a method for site-directed modification of a target nucleic acid sequence in a cell genome, including introducing the genome editing system of the present invention into the cell.
  • the introduction of the genome editing system results in a double-strand break (DSB) in the target nucleic acid sequence. Subsequently, through the repair function of the cell, the substitution, deletion and/or addition of one or more nucleotides in the target nucleic acid sequence or its nearby sequence is realized.
  • the present invention also provides a method for producing genetically modified cells, including introducing the genome editing system of the present invention into the cells.
  • the present invention also provides a genetically modified organism, which comprises a genetically modified cell or its progeny cells produced by the method of the present invention.
  • the target sequence to be modified can be located anywhere in the genome, for example in a functional gene such as a protein-coding gene, or in a gene expression regulatory region such as a promoter region or an enhancer region, so as to modify the function of the gene or its expression.
  • the modification in the cell target sequence can be detected by T7EI, PCR/RE or sequencing methods.
  • the gene editing system can be introduced into cells by various methods well known to those skilled in the art.
  • Methods that can be used to introduce the gene editing system of the present invention into cells include, but are not limited to: calcium phosphate transfection, protoplast fusion, electroporation, liposome transfection, microinjection, viral infection (such as baculovirus, vaccinia virus, adenovirus, adeno-associated virus, lentivirus and other viruses), particle bombardment (gene gun), PEG-mediated transformation of protoplasts, and Agrobacterium-mediated transformation.
  • the methods of the invention are performed in vitro.
  • the cell is an isolated cell, or a cell in an isolated tissue or organ.
  • the method of the present invention can also be performed in vivo.
  • the cell is a cell in an organism, and the system of the present invention can be introduced into the cell in vivo by a method mediated by, for example, a virus or Agrobacterium.
  • Cells that can be genome edited by the method of the present invention can be derived from, for example, mammals such as humans, mice, rats, monkeys, dogs, pigs, sheep, cows, and cats; poultry such as chickens, ducks, and geese; and plants, including monocotyledonous and dicotyledonous plants such as rice, corn, wheat, sorghum, barley, soybean, peanut, and Arabidopsis.
  • the Cas12a protein or genome editing system of the present invention is particularly suitable for genome editing in plants.
  • the present invention provides a method for producing a genetically modified plant, comprising introducing the genome editing system of the present invention into at least one of the plants, thereby causing a modification in the genome of the at least one plant.
  • the modification includes substitution, deletion and/or addition of one or more nucleotides.
  • the genome editing system can be introduced into plants by various methods well known to those skilled in the art.
  • Methods that can be used to introduce the genome editing system of the present invention into plants include, but are not limited to: particle bombardment (gene gun), PEG-mediated transformation of protoplasts, Agrobacterium-mediated transformation, plant virus-mediated transformation, the pollen tube pathway method, and the ovary injection method.
  • the target sequence can be modified simply by introducing or producing the Cas12a protein and guide RNA in plant cells, and the modification can be stably inherited, without the need to stably transform the genome editing system into plants.
  • This avoids the potential off-target effects of a stably present genome editing system and also avoids the integration of exogenous nucleotide sequences in the plant genome, giving higher biological safety.
  • the introduction is performed in the absence of selective pressure, so as to avoid the integration of foreign nucleotide sequences in the plant genome.
  • the introduction includes transforming the genome editing system of the present invention into an isolated plant cell or tissue, and then regenerating the transformed plant cell or tissue into a whole plant.
  • the regeneration is performed in the absence of selective pressure, that is, no selective agent for the selective gene carried on the expression vector is used during the tissue culture process. Not using selection agents can improve plant regeneration efficiency and obtain herbicide-resistant plants without exogenous nucleotide sequences.
  • the genome editing system of the present invention can be transformed to specific parts on the whole plant, such as leaves, stem tips, pollen tubes, young ears or hypocotyls. This is particularly suitable for the transformation of plants that are difficult to regenerate from tissue culture.
  • the protein expressed in vitro and/or the RNA molecule transcribed in vitro is directly transformed into the plant.
  • the protein and/or RNA molecule can realize genome editing in plant cells and then be degraded by the cell, avoiding the integration of foreign nucleotide sequences in the plant genome.
  • genetic modification of plants using the method of the present invention can obtain plants whose genomes have no exogenous polynucleotide integration, that is, transgene-free modified plants.
  • the modification is related to plant traits such as agronomic traits
  • the modification causes the plant to have an altered (preferably improved) trait, such as agronomic trait, relative to a wild-type plant.
  • the method further includes the step of screening for plants with desired modifications and/or desired traits such as agronomic traits.
  • the method further includes obtaining progeny of the genetically modified plant.
  • the genetically modified plant or its progeny have desired modifications and/or desired traits such as agronomic traits.
  • the present invention also provides a genetically modified plant or its progeny or part thereof, wherein the plant is obtained by the above-mentioned method of the present invention.
  • the genetically modified plant or progeny or part thereof is non-transgenic.
  • the genetically modified plant or its progeny have desired genetic modification and/or desired traits such as agronomic traits.
  • the present invention also provides a plant breeding method, comprising crossing a genetically modified first plant obtained by the above-mentioned method of the present invention with a second plant that does not contain the modification, thereby introducing the modification into the second plant.
  • the genetically modified first plant has desired traits such as agronomic traits.
  • the present invention also includes a kit used in the method of the present invention, which includes the genome editing system of the present invention, and instructions for use.
  • the kit generally includes a label indicating the intended use and/or method of use of the contents of the kit.
  • the term label includes any written or recorded material provided on or with the kit or otherwise provided with the kit.
  • Example 1 Using homology and similarity comparison to find the CRISPR/Cas12a system in plant symbiotic bacteria
  • the protein is 1318 aa long (SEQ ID NO: 1), but there are no other Cas protein sequences near it.
  • CRISPR repeat sequences begin 1509 bp downstream in the genome. There are 37 spacer sequences in total.
  • Its direct repeat is GTTTAAAACCACTTTAAAATTTCTACTATTGTAGAT (SEQ ID NO: 9). A comparison with the direct repeats of the commonly used Cas12a proteins FnCas12a, LbCas12a, and AsCas12a is shown in Figure 1a.
  • the protein sequence alignment results of this protein with the commonly used FnCas12a, LbCas12a, and AsCas12a are shown in Figure 1b.
  • the sequence similarity alignment uses the NCBI blastp program.
  • the coding sequence of FbCas12a derived from Flavobacterium branchiophilum was codon-optimized, and two nuclear localization signals (NLS) were added to its 3'end, and BamHI/SmaI restriction sites were added at both ends.
  • the optimized FbCas12a protein can be better expressed and localized in rice.
  • the nucleotide coding sequence of FbCas12a with NLS added and codon optimized is shown in SEQ ID NO: 7 in the sequence table.
  • positions 3967-3987 are the SV40 nuclear localization signal sequence
  • positions 3988-3999 are the SGGS linker between the two nuclear localization signal sequences
  • positions 4000-4047 are the nucleoplasmin nuclear localization signal sequence.
  • Positions 1-3966 are the coding sequence of FbCas12a protein.
  • SEQ ID NO: 7 encodes the protein shown in SEQ ID NO: 6, that is, the FbCas12a nuclease with nuclear localization signal.
  • the DNA sequence of LbCas12a commonly used in laboratory genome editing was ligated into the pJIT163 vector to obtain the pJIT163-UBI-LbCas12a vector.
  • the construction method of the vector is similar to that of pJIT163-UBI-FbCas12a.
  • the nucleotide coding sequence of the codon-optimized LbCas12a is shown in SEQ ID NO: 8 in the sequence list.
  • the pJIT163-FbCas12a and pJIT163-LbCas12a vectors contain the UBI promoter, the plant codon-optimized FbCas12a or LbCas12a coding sequence, and, at the 3' end, the SV40 nuclear localization signal coding sequence and the nucleoplasmin nuclear localization signal coding sequence. Their structures are shown schematically in Figure 2a and Figure 2c.
  • Cas12a has a crRNA self-maturation function.
  • unexpectedly, when the inventors used the full-length FbCas12a crRNA backbone sequence (the full-length direct repeat), genome editing was not achieved. It therefore appears that FbCas12a cannot mature its natural crRNA backbone, and the crRNA of FbCas12a needed to be explored to determine whether it can achieve genome editing.
  • in the synthetic fragment of SEQ ID NO: 14, positions 69-88 contain two BsaI restriction sites, through which the recognition sequence for the rice target sequence to be mutated can be ligated into the vector pJIT163-FbcrRNA; positions 89-156 are the HDV ribozyme sequence, and positions 157-162 are the SmaI restriction site sequence.
  • the synthetic DNA fragment of SEQ ID NO: 14 was ligated into the expression vector pJIT163 to obtain the pJIT163-FbcrRNA vector.
  • the vector contains UBI promoter, HH ribozyme, truncated FbcrRNA sequence, HDV ribozyme and CaMV terminator. Its structure is shown in Figure 2b.
  • the pJIT163-FbcrRNA vector uses a ribozyme-based crRNA maturation strategy to obtain precisely processed crRNA sequences.
  • Example 4 Site-directed mutation of rice endogenous gene EPSPS by FbCas12a system and mutation of four endogenous gene targets in rice using FbCas12a protein and LbCas12a crRNA
  • target-EPSPS05: TTTG GTACTAAATATACAATCCCTTGG (SEQ ID NO: 16; nucleotides 956-982 of the OsEPSPS gene, LOC_Os06g04280.1; the leading TTTG, underlined in the original, is the PAM sequence).
  • target-OsCDC48: TTTA TTCAGATTACATATGGTTAG (SEQ ID NO: 17; nucleotides 582-605 of the OsCDC48 gene, LOC_Os03g05730; the leading TTTA, underlined in the original, is the PAM sequence).
  • target-OsDEP1T3: TTTC AAATGGATCTAAACAGGGCCTTA (SEQ ID NO: 18; nucleotides 1919-1945 of the OsDEP1 gene, LOC_Os09g26999; the leading TTTC, underlined in the original, is the PAM sequence).
  • target-OsPDS: TTTG GAGTGAAATCTCTTGTCTTA (SEQ ID NO: 19; nucleotides 136-159 of the OsPDS gene, LOC_Os03g08570; the leading TTTG, underlined in the original, is the PAM sequence).
  • target-OsEpspsC02: TTTA TGAAAATATGTATGGAATTCATG (SEQ ID NO: 20; nucleotides 1294-1320 of the OsEPSPS gene, LOC_Os06g04280.1; the leading TTTA, underlined in the original, is the PAM sequence).
  • SP1 is the coding DNA of RNA that can complementally bind to the target-EPSPS05
  • SP1-F AGAT GTACTAAATATACAATCCCTTGG (SEQ ID NO: 21)
  • SP1-R AAAC CCAAGGGATTGTATATTTAGTAC (SEQ ID NO: 22)
  • after SP1-F and SP1-R are annealed, double-stranded DNA with sticky ends is formed, which is inserted between the two BsaI restriction sites of pJIT163-FbcrRNA to obtain the pJIT163-FbcrRNA plasmid containing SP1.
  • the plasmid is verified as a positive plasmid by sequencing.
  • SP2 to SP5 are the coding DNAs of RNAs that can complementarily bind to target-OsCDC48, target-OsDEP1T3, target-OsPDS and target-OsEpspsC02, respectively.
  • SP2-F AGAT TTCAGATTACATATGGTTAG (SEQ ID NO: 23)
  • SP2-R AAAC CTAACCATATGTAATCTGAA (SEQ ID NO: 24)
  • SP3-F AGAT AAATGGATCTAAACAGGGCCTTA (SEQ ID NO: 25)
  • SP3-R AAAC ATTGGCCCTGTTTAGATCCATTT (SEQ ID NO: 26)
  • SP4-F AGAT GAGTGAAATCTCTTGTCTTA (SEQ ID NO: 27)
  • SP4-R AAAC TAAGACAAGAGATTTCACTC (SEQ ID NO: 28)
  • SP5-F AGAT TGAAAATATGTATGGAATTCATG (SEQ ID NO: 29)
  • SP5-R AAAC CATGAATTCCATACATATTTTCA (SEQ ID NO: 30)
  • after each primer pair is annealed, double-stranded DNA with sticky ends is formed and inserted between the two BsaI restriction sites of pJIT163-FbcrRNA and pJIT163-LbcrRNA to obtain pJIT163-FbcrRNA and pJIT163-LbcrRNA plasmids containing SP1 to SP5.
  • the plasmid was sequenced and verified as a positive plasmid.
  • the plasmids pJIT163-UBI-FbCas12a and pJIT163-UBI-LbCas12a, together with pJIT163-FbcrRNA and pJIT163-LbcrRNA containing SP1 to SP5, were transformed into protoplasts of the rice cultivar Nipponbare.
  • for the specific procedure of rice protoplast transformation, see Shan, Q. et al., Rapid and efficient gene modification in rice and Brachypodium using TALENs. Molecular Plant (2013).
  • the genomic DNA was extracted 48 hours after the transformation of rice protoplasts, and the DNA was used as a template to conduct amplicon high-throughput sequencing experiments to analyze its editing efficiency.
  • for the specific procedure of amplicon high-throughput sequencing, see Zhang et al., Perfectly matched 20-nucleotide guide RNA sequences enable robust genome editing using high-fidelity SpCas9 nucleases. Genome Biology, 2017.
  • the high-throughput sequencing results for artificially matured FbcrRNA:FbCas12a are shown in Figure 3a.
  • the results show that compared with the control group, the FbCas12a treatment group has a mutation at the target site of the OsEPSPS gene, and the mutation efficiency is about 6.25%.
  • Figure 3b shows the ratio of mutation types obtained from high-throughput sequencing data analysis. The results show that most of the mutation types generated by FbCas12a at the target site are deletions of DNA fragments.
  • Fb means FbcrRNA:FbCas12a
  • FbLb means LbcrRNA:FbCas12a
  • Lb means LbcrRNA:LbCas12a.
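The SP1 to SP5 primer pairs listed above share one visible pattern: the forward oligo is AGAT followed by the protospacer, and the reverse oligo is AAAC followed by the reverse complement of the protospacer. The following is a minimal sketch of that pattern, inferred from the listed SP1 and SP2 oligos rather than from a stated design rule; the function names are illustrative only.

```python
def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(dna))


def design_spacer_oligos(protospacer: str) -> tuple[str, str]:
    """Build the forward and reverse oligos for one protospacer.

    The AGAT and AAAC prefixes are copied from the SP1-F/SP1-R oligos
    listed above; they are an observed pattern, not a general rule.
    """
    forward = "AGAT" + protospacer
    reverse = "AAAC" + reverse_complement(protospacer)
    return forward, reverse


# Protospacer of target-EPSPS05 (the target without its TTTG PAM).
sp1_f, sp1_r = design_spacer_oligos("GTACTAAATATACAATCCCTTGG")
print(sp1_f)  # AGATGTACTAAATATACAATCCCTTGG
print(sp1_r)  # AAACCCAAGGGATTGTATATTTAGTAC
```

Annealing two such oligos would leave 4-nt overhangs on each side, which is consistent with the "sticky ends" that are ligated between the two BsaI sites.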

Abstract

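The present invention belongs to the field of genetic engineering. Specifically, the present invention discloses a gene editing system derived from Flavobacterium and uses thereof.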

Description

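Gene editing system derived from Flavobacterium
Technical Field
The present invention belongs to the field of genetic engineering. Specifically, the present invention relates to a gene editing system derived from Flavobacterium and uses thereof.
Background of the Invention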
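Genome editing is a genetic engineering technology that uses engineered nucleases to make targeted modifications to the genome, and it plays an increasingly powerful role in agricultural and medical research. The clustered regularly interspaced short palindromic repeats and associated system (Clustered regularly interspaced short palindromic repeats/CRISPR associated, CRISPR) is currently the most widely used genome editing tool. Guided by an artificially designed guide RNA, the Cas protein can target any position in the genome and generate a double-strand break (DSB) in the targeted sequence, which activates the non-homologous end joining (NHEJ) or homology-directed repair (HDR) pathway in the cell; mutations are introduced through these two pathways. The most commonly used Cas protein is the Cas9 protein from Streptococcus pyogenes, which belongs to the Type II-A subtype of Class II CRISPR systems. Cong et al. (Multiplex Genome Engineering Using CRISPR/Cas Systems, Science, 2013) and Mali et al. (RNA-guided human genome engineering via Cas9, Science, 2013) successfully applied the CRISPR/Cas9 system in human cell lines.
Both the CRISPR/Cas12a system and the CRISPR/Cas9 system are Class II CRISPR systems. Zetsche et al. were the first to apply Cas12a proteins (formerly called Cpf1) derived from Acidaminococcus and Lachnospiraceae bacteria to gene editing in animal cells (Cpf1 is a Single RNA-Guided Endonuclease of a Class 2 CRISPR-Cas System, Cell, 2015). The difference is that the CRISPR/Cas12a system belongs to Type V. Its advantages include a shorter crRNA sequence, higher specificity, a 5'-TTTN PAM sequence that complements the 3'-NGG PAM of Cas9, and a greater tendency to generate sticky ends, which further expands the CRISPR gene editing toolbox.
To date, gene editing tools based on CRISPR/Cas9 and CRISPR/Cas12 have been widely and successfully applied in animal cell lines, whole animals, plant cells, whole plants and microorganisms. Because they are highly efficient and simple to use, they have triggered a worldwide revolution in the field of gene editing. However, the efficiency of the CRISPR/Cas12a system varies considerably between target sites and is low at some sites in plant genomes. This may be because existing Cas12a systems are derived mainly from bacteria that are pathogenic to humans or animals, whose optimal working temperature is higher than the temperatures suitable for plants. It is therefore necessary to identify and develop a CRISPR/Cas12a system that works stably at temperatures suitable for plants.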
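Summary of the Invention
Through homology-based similarity comparison, the present inventors found a previously unreported FbCas12a protein in plant symbiotic bacteria, artificially predicted the mature form of its own crRNA, and compared the in vivo efficiency of its own crRNA with that of the LbCas12a crRNA. They found that FbCas12a can work in plant cells and has higher editing efficiency when the LbCas12a crRNA is used.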
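Brief Description of the Drawings
Figure 1. Alignment of the DR sequences and protein sequences of Cas12a from different sources.
Figure 2. Schematic diagrams of the vectors used in the Examples.
Figure 3. Editing of the endogenous rice gene OsEPSPS by the combination of FbCas12a and FbcrRNA.
Figure 4. Results of editing endogenous rice genes with FbCas12a combined with FbcrRNA or LbcrRNA.
Detailed Description of the Invention
I. Definitions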
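In the present invention, unless otherwise stated, the scientific and technical terms used herein have the meanings commonly understood by those skilled in the art. In addition, the terms and laboratory procedures relating to protein and nucleic acid chemistry, molecular biology, cell and tissue culture, microbiology and immunology used herein are terms and routine procedures widely used in the corresponding fields. For example, the standard recombinant DNA and molecular cloning techniques used in the present invention are well known to those skilled in the art and are described more fully in: Sambrook, J., Fritsch, E.F. and Maniatis, T., Molecular Cloning: A Laboratory Manual; Cold Spring Harbor Laboratory Press: Cold Spring Harbor, 1989 (hereinafter "Sambrook"). To aid understanding of the present invention, definitions and explanations of relevant terms are provided below.
As used herein, the term "and/or" covers all combinations of the items connected by the term, and each combination should be regarded as having been individually listed herein. For example, "A and/or B" covers "A", "A and B", and "B". For example, "A, B and/or C" covers "A", "B", "C", "A and B", "A and C", "B and C", and "A and B and C".
When the term "comprising" is used herein to describe a protein or nucleic acid sequence, the protein or nucleic acid may consist of the sequence, or may have additional amino acids or nucleotides at one or both ends, while still having the activity described in the present invention. In addition, those skilled in the art know that the methionine encoded by the start codon at the N-terminus of a polypeptide is retained in some practical situations (for example, when expressed in a particular expression system) without substantially affecting the function of the polypeptide. Therefore, when a specific polypeptide amino acid sequence is described in the specification and claims of this application, although it may not include the N-terminal methionine encoded by the start codon, the sequence containing that methionine is also covered, and correspondingly its coding nucleotide sequence may also contain the start codon; and vice versa.
"Genome", as used herein, covers not only the chromosomal DNA present in the cell nucleus but also the organelle DNA present in subcellular components of the cell (such as mitochondria and plastids).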
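As used herein, "organism" includes any organism suitable for genome editing, preferably a eukaryote. Examples of organisms include, but are not limited to, mammals such as humans, mice, rats, monkeys, dogs, pigs, sheep, cattle and cats; poultry such as chickens, ducks and geese; and plants, including monocotyledonous and dicotyledonous plants, for example rice, maize, wheat, sorghum, barley, soybean, peanut and Arabidopsis.
"Genetically modified organism" or "genetically modified cell" means an organism or cell that contains within its genome an exogenous polynucleotide or a modified gene or expression regulatory sequence. For example, the exogenous polynucleotide can be stably integrated into the genome of the organism or cell and inherited over successive generations. The exogenous polynucleotide can be integrated into the genome alone or as part of a recombinant DNA construct. A modified gene or expression regulatory sequence is one in which the sequence in the genome of the organism or cell contains one or more deoxynucleotide substitutions, deletions and additions.
"Exogenous", with respect to a sequence, means a sequence from a foreign species or, if from the same species, a sequence whose composition and/or locus has been significantly altered from its native form by deliberate human intervention.
"Polynucleotide", "nucleic acid sequence", "nucleotide sequence" and "nucleic acid fragment" are used interchangeably and refer to a single- or double-stranded RNA or DNA polymer, optionally containing synthetic, non-natural or altered nucleotide bases. Nucleotides are referred to by their single-letter designations as follows: "A" is adenosine or deoxyadenosine (for RNA or DNA, respectively), "C" is cytidine or deoxycytidine, "G" is guanosine or deoxyguanosine, "U" is uridine, "T" is deoxythymidine, "R" is a purine (A or G), "Y" is a pyrimidine (C or T), "K" is G or T, "H" is A or C or T, "I" is inosine, and "N" is any nucleotide.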
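The single-letter ambiguity codes defined above (R, Y, K, H, N) map directly onto character classes, which is convenient when a motif such as the 5'-TTTN PAM has to be matched against DNA. The following is a minimal sketch; the helper name `iupac_to_regex` is illustrative, and only the codes defined in this paragraph are covered (for DNA, "N" is taken to mean A, C, G or T).

```python
import re

# Ambiguity codes as defined in the paragraph above (DNA alphabet only).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",    # purine
    "Y": "[CT]",    # pyrimidine
    "K": "[GT]",
    "H": "[ACT]",
    "N": "[ACGT]",  # any nucleotide
}


def iupac_to_regex(motif: str) -> re.Pattern:
    """Compile a motif written with the codes above into a regular expression."""
    return re.compile("".join(IUPAC[symbol] for symbol in motif.upper()))


pam = iupac_to_regex("TTTN")
print(bool(pam.fullmatch("TTTG")))  # True
print(bool(pam.fullmatch("TTAG")))  # False
```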
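"Polypeptide", "peptide" and "protein" are used interchangeably in the present invention and refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which one or more amino acid residues are artificial chemical analogues of the corresponding naturally occurring amino acids, as well as to naturally occurring amino acid polymers. The terms "polypeptide", "peptide", "amino acid sequence" and "protein" may also include modified forms, including but not limited to glycosylation, lipid attachment, sulfation, γ-carboxylation of glutamic acid residues, hydroxylation and ADP-ribosylation.
Sequence "identity" has its art-recognized meaning, and the percentage of sequence identity between two nucleic acid or polypeptide molecules or regions can be calculated using published techniques. Sequence identity can be measured along the full length of a polynucleotide or polypeptide or along a region of the molecule. (See, for example: Computational Molecular Biology, Lesk, A.M., ed., Oxford University Press, New York, 1988; Biocomputing: Informatics and Genome Projects, Smith, D.W., ed., Academic Press, New York, 1993; Computer Analysis of Sequence Data, Part I, Griffin, A.M., and Griffin, H.G., eds., Humana Press, New Jersey, 1994; Sequence Analysis in Molecular Biology, von Heijne, G., Academic Press, 1987; and Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M Stockton Press, New York, 1991). Although there are many methods for measuring the identity between two polynucleotides or polypeptides, the term "identity" is well known to the skilled person (Carrillo, H. & Lipman, D., SIAM J Applied Math 48:1073 (1988)).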
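As the paragraph above notes, there are many ways to measure identity. The following is a minimal sketch of one common convention, assuming the two sequences have already been aligned (for example by blastp): identical columns divided by the number of alignment columns. The function name and the choice of denominator are assumptions for illustration; other tools may use the shorter sequence length or exclude gapped columns.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment.

    Both inputs must already be aligned (same length, '-' for gaps).
    Columns in which both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment has no comparable columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Five identical columns out of six alignment columns.
print(round(percent_identity("ACGT-A", "ACGTTA"), 2))
```

Under this convention, a threshold such as "at least 80% sequence identity to SEQ ID NO:1" would be checked by aligning the candidate against SEQ ID NO:1 and comparing the returned value with 80.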
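In peptides or proteins, suitable conservative amino acid substitutions are known to those skilled in the art and can generally be made without altering the biological activity of the resulting molecule. In general, those skilled in the art recognize that single amino acid substitutions in non-essential regions of a polypeptide do not substantially alter biological activity (see, for example, Watson et al., Molecular Biology of the Gene, 4th Edition, 1987, The Benjamin/Cummings Pub. Co., p. 224).
As used in the present invention, "expression construct" refers to a vector, such as a recombinant vector, suitable for expressing a nucleotide sequence of interest in an organism. "Expression" refers to the production of a functional product. For example, expression of a nucleotide sequence can refer to transcription of the nucleotide sequence (for example, transcription to produce mRNA or functional RNA) and/or translation of the RNA into a precursor or mature protein.
The "expression construct" of the present invention may be a linear nucleic acid fragment, a circular plasmid, a viral vector or, in some embodiments, an RNA that can be translated (such as mRNA).
The "expression construct" of the present invention may contain regulatory sequences and nucleotide sequences of interest from different sources, or regulatory sequences and nucleotide sequences of interest from the same source but arranged in a manner different from that normally found in nature.
"Regulatory sequence" and "regulatory element" are used interchangeably and refer to a nucleotide sequence that is located upstream of (5' non-coding sequence), within, or downstream of (3' non-coding sequence) a coding sequence and that affects the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences may include, but are not limited to, promoters, translation leader sequences, introns and polyadenylation recognition sequences.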
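"Promoter" refers to a nucleic acid fragment capable of controlling the transcription of another nucleic acid fragment. In some embodiments of the present invention, the promoter is a promoter capable of controlling gene transcription in a cell, whether or not it is derived from that cell. The promoter may be a constitutive promoter, a tissue-specific promoter, a developmentally regulated promoter or an inducible promoter.
"Constitutive promoter" refers to a promoter that generally causes a gene to be expressed in most cell types under most circumstances. "Tissue-specific promoter" and "tissue-preferred promoter" are used interchangeably and refer to a promoter that is expressed mainly, but not necessarily exclusively, in one tissue or organ, and that may also be expressed in one specific cell or cell type. "Developmentally regulated promoter" refers to a promoter whose activity is determined by developmental events. An "inducible promoter" selectively expresses an operably linked DNA sequence in response to an endogenous or exogenous stimulus (environment, hormones, chemical signals, etc.).
As used herein, the term "operably linked" means that a regulatory element (for example, but not limited to, a promoter sequence or a transcription termination sequence) is linked to a nucleic acid sequence (for example, a coding sequence or an open reading frame) such that transcription of the nucleotide sequence is controlled and regulated by the transcriptional regulatory element. Techniques for operably linking a regulatory element region to a nucleic acid molecule are known in the art.
"Introducing" a nucleic acid molecule (for example a plasmid, a linear nucleic acid fragment or RNA) or a protein into an organism means transforming cells of the organism with the nucleic acid or protein so that the nucleic acid or protein can function in the cell. "Transformation" as used in the present invention includes stable transformation and transient transformation.
"Stable transformation" refers to introducing an exogenous nucleotide sequence into the genome, resulting in stable inheritance of the exogenous nucleotide sequence. Once stably transformed, the exogenous nucleic acid sequence is stably integrated into the genome of the organism and of any successive generations.
"Transient transformation" refers to introducing a nucleic acid molecule or protein into a cell, where it performs its function without stable inheritance of an exogenous nucleotide sequence. In transient transformation, the exogenous nucleic acid sequence is not integrated into the genome.
"Trait" refers to a physiological, morphological, biochemical or physical characteristic of a cell or organism.
"Agronomic trait" refers in particular to a measurable parameter of a crop plant, including but not limited to: leaf greenness, grain yield, growth rate, total biomass or rate of accumulation, fresh weight at maturity, dry weight at maturity, fruit yield, seed yield, total plant nitrogen content, fruit nitrogen content, seed nitrogen content, nitrogen content of vegetative tissue, total plant free amino acid content, fruit free amino acid content, seed free amino acid content, free amino acid content of vegetative tissue, total plant protein content, fruit protein content, seed protein content, protein content of vegetative tissue, herbicide resistance, drought resistance, nitrogen uptake, root lodging, harvest index, stalk lodging, plant height, ear height, ear length, disease resistance, cold resistance, salt resistance and tiller number.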
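II. Genome editing system based on a Flavobacterium Cas12a protein
In one aspect, the present invention provides a novel Cas12a protein, which
(i) comprises an amino acid sequence having at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.1%, at least 99.2%, at least 99.3%, at least 99.4%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or even 100% sequence identity to SEQ ID NO:1, or
(ii) comprises an amino acid sequence having one or more, for example 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10, amino acid substitutions, deletions or additions relative to SEQ ID NO:1.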
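"Cas12a protein", "Cas12a nuclease" and "Cas12a" are used interchangeably herein and refer to an RNA-guided nuclease comprising a Cas12a protein or a fragment thereof, or a variant thereof. Cas12a is a component of the CRISPR-Cas12a genome editing system and, guided by a guide RNA (crRNA), can target and/or cleave a DNA target sequence to form a DNA double-strand break (DSB). The Cas12a protein of the present invention is derived from a plant symbiotic bacterium and is therefore particularly suitable for genome editing in plants.
In some embodiments of the aspects herein, the Cas12a protein is derived from a species of the genus Flavobacterium. In some embodiments, the Cas12a protein is derived from Flavobacterium branchiophilum. Those skilled in the art will understand that the Cas12a proteins of different strains of the same bacterial species may differ somewhat in amino acid sequence yet perform substantially the same function.
In some embodiments of the aspects of the present invention, the Cas12a protein is recombinantly produced. In some embodiments of the aspects of the present invention, the Cas12a protein further contains a fusion tag, for example a tag for isolation and/or purification of the Cas12a protein. Methods for recombinant production of proteins are known in the art, as are many tags that can be used to isolate and/or purify proteins, including but not limited to His tags and GST tags. In general, these tags do not alter the activity of the protein of interest. In some embodiments, the Cas12a protein is further fused to another functional protein, such as a deaminase or a transcriptional activator/repressor protein, so as to enable base editing or transcriptional regulation.
In some embodiments of the aspects of the present invention, the Cas12a protein of the present invention further comprises a nuclear localization sequence (NLS), for example connected to the nuclear localization sequence through a linker. The linker may be a non-functional amino acid sequence that is 1-50 (for example 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, or 20-25, 25-50) or more amino acids long and has no secondary or higher-order structure. For example, the linker may be a flexible linker such as SGGS (SEQ ID NO:3). In general, the one or more NLSs in the Cas12a protein should be strong enough to drive accumulation of the Cas12a protein in the cell nucleus in an amount that allows it to perform its genome editing function. In general, the strength of the nuclear localization activity is determined by the number and position of the NLSs in the Cas12a protein, the particular NLS or NLSs used, or a combination of these factors. Exemplary nuclear localization sequences include, but are not limited to, the SV40 nuclear localization signal sequence (for example as shown in SEQ ID NO:4) and the nucleoplasmin nuclear localization signal sequence (for example as shown in SEQ ID NO:5). In addition, depending on the location of the DNA to be edited, the Cas12a protein of the present invention may also include other localization sequences, such as a cytoplasmic localization sequence, a chloroplast localization sequence or a mitochondrial localization sequence. In some embodiments, the multiple localization sequences may be connected through linkers. In some specific embodiments, the Cas12a protein comprises the amino acid sequence shown in SEQ ID NO:6.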
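In one aspect, the present invention provides use of the Cas12a protein of the present invention for genome editing of a cell, preferably a eukaryotic cell, more preferably a plant cell.
In one aspect, the present invention provides a genome editing system for site-directed modification of a target nucleic acid sequence in the genome of a cell, comprising the Cas12a protein of the present invention and/or an expression construct comprising a nucleotide sequence encoding the Cas12a protein of the present invention.
Herein, the terms "genome editing system" and "gene editing system" are used interchangeably and refer to the combination of components required for genome editing of the genome in cells of an organism, where the individual components of the system, such as the Cas12a protein, the gRNA or the corresponding expression constructs, may each exist independently or may exist in any combination as a composition.
In some embodiments, the genome editing system further comprises at least one guide RNA (gRNA) and/or an expression construct comprising a nucleotide sequence encoding the at least one guide RNA.
"Guide RNA" and "gRNA" are used interchangeably herein. The guide RNA of the CRISPR-Cas12a genome editing system usually consists only of a crRNA molecule, where the crRNA contains a sequence that is sufficiently identical to the target sequence to hybridize with the complement of the target sequence and direct sequence-specific binding of the CRISPR complex (Cas12a + crRNA) to the target sequence.
In some embodiments of the method of the present invention, the guide RNA is a crRNA. In some embodiments, the guide RNA comprises the crRNA backbone sequence shown in SEQ ID NO:10 or 11. In some preferred embodiments, the crRNA backbone sequence is SEQ ID NO:11. In some embodiments, the crRNA sequence further includes, 3' of the crRNA backbone sequence, a sequence that specifically hybridizes to the complement of the target sequence (that is, the spacer sequence).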
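In some embodiments, the crRNA comprises the following sequence:
i) 5'-AAUUUCUACUAUUGUAGAU (SEQ ID NO:10)-N_X-3'; or
ii) 5'-UAAUUUCUACUAAGUGUAGAU (SEQ ID NO:11)-N_X-3';
where N_X denotes a nucleotide sequence consisting of X consecutive nucleotides, each N being independently selected from A, G, C and U; X is an integer with 18 ≤ X ≤ 35, preferably X = 20, 21, 22 or 23. In some embodiments, the sequence N_X (the spacer sequence) can specifically hybridize to the complement of the target sequence.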
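The crRNA layout above (backbone followed by an 18 to 35 nt spacer) can be written down directly. The following is a minimal sketch that assembles a crRNA string from one of the two backbones and a spacer; the function name `build_crrna` and its default argument are illustrative choices, and the spacer is taken in the same sense as the target sequence, so that it pairs with the complement of the target.

```python
# crRNA backbone sequences quoted in the text (RNA, 5' to 3').
SCAFFOLDS = {
    "SEQ ID NO:10": "AAUUUCUACUAUUGUAGAU",
    "SEQ ID NO:11": "UAAUUUCUACUAAGUGUAGAU",
}


def build_crrna(spacer: str, scaffold_id: str = "SEQ ID NO:11") -> str:
    """Return backbone + spacer as an RNA string (5' to 3')."""
    spacer = spacer.upper().replace("T", "U")  # accept DNA or RNA input
    if not 18 <= len(spacer) <= 35:
        raise ValueError("spacer length must be between 18 and 35 nt")
    if set(spacer) - set("ACGU"):
        raise ValueError("spacer may only contain A, C, G and U")
    return SCAFFOLDS[scaffold_id] + spacer


# 20-nt spacer used here purely as an example input.
print(build_crrna("GAGTGAAATCTCTTGTCTTA", "SEQ ID NO:10"))
# AAUUUCUACUAUUGUAGAUGAGUGAAAUCUCUUGUCUUA
```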
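In general, the 5' end of a target sequence targeted by the genome editing system of the present invention needs to contain a protospacer adjacent motif (PAM). The PAM may be, for example, 5'-TTTN, where N is A, G, C or T. However, different PAM sequences can also be used. Based on the presence of a PAM, those skilled in the art can readily identify target sequences in the genome that can be targeted and optionally edited, and design suitable guide RNAs accordingly. For example, if a PAM sequence 5'-TTTG-3' is present in the genome, the about 18 to about 35, preferably 20, 21, 22 or 23, consecutive nucleotides immediately 3' of it can serve as the target sequence.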
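The target selection rule in this paragraph (a 5'-TTTN PAM followed immediately by the protospacer) lends itself to a simple scan. The following is a minimal sketch that lists candidate sites on the strand it is given; the function name, the default spacer length of 23 and the decision to scan only one strand are assumptions for illustration, and a practical design tool would also scan the reverse complement and screen for off-targets.

```python
import re


def find_targets(dna: str, spacer_len: int = 23) -> list[tuple[int, str, str]]:
    """List (1-based PAM start, PAM, protospacer) for TTTN sites on the given strand."""
    if not 18 <= spacer_len <= 35:
        raise ValueError("spacer_len must be between 18 and 35")
    # A lookahead keeps overlapping candidate sites from hiding one another.
    pattern = re.compile(rf"(?=(TTT[ACGT])([ACGT]{{{spacer_len}}}))")
    return [
        (m.start() + 1, m.group(1), m.group(2))
        for m in pattern.finditer(dna.upper())
    ]


# A short made-up context (CC...GG) around a TTTG PAM and a 20-nt protospacer.
print(find_targets("CCTTTGGAGTGAAATCTCTTGTCTTAGG", spacer_len=20))
# [(3, 'TTTG', 'GAGTGAAATCTCTTGTCTTA')]
```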
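In some embodiments, the at least one guide RNA is encoded by different expression constructs. In some embodiments, the at least one guide RNA is encoded by the same expression construct. In some embodiments, the at least one guide RNA and the Cas12a protein of the present invention are encoded by the same expression construct.
For example, in some embodiments, the genome editing system may comprise any one selected from the following:
i) the Cas12a protein of the present invention and the at least one guide RNA, where optionally the Cas12a protein and the at least one guide RNA form a complex;
ii) an expression construct comprising a nucleotide sequence encoding the Cas12a protein of the present invention, and the at least one guide RNA;
iii) the Cas12a protein of the present invention, and an expression construct comprising a nucleotide sequence encoding the at least one guide RNA;
iv) an expression construct comprising a nucleotide sequence encoding the Cas12a protein of the present invention, and an expression construct comprising a nucleotide sequence encoding the at least one guide RNA;
v) an expression construct comprising a nucleotide sequence encoding the Cas12a protein of the present invention and a nucleotide sequence encoding the at least one guide RNA.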
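To obtain efficient expression in cells, in some embodiments of the present invention the nucleotide sequence encoding the Cas12a protein is codon-optimized for the organism from which the cell to be genome-edited is derived.
Codon optimization refers to a method of modifying a nucleic acid sequence to enhance expression in a host cell of interest by replacing at least one codon of the native sequence (for example about or more than about 1, 2, 3, 4, 5, 10, 15, 20, 25, 50 or more codons) with a codon that is more frequently or most frequently used in the genes of the host cell, while maintaining the native amino acid sequence. Different species exhibit particular preferences for certain codons of a given amino acid. Codon bias (differences in codon usage between organisms) often correlates with the translation efficiency of messenger RNA (mRNA), which in turn is believed to depend on the properties of the codons being translated and the availability of particular transfer RNA (tRNA) molecules. The predominance of selected tRNAs in a cell generally reflects the codons used most frequently in peptide synthesis. Genes can therefore be tailored for optimal expression in a given organism on the basis of codon optimization. Codon usage tables are readily available, for example in the Codon Usage Database at www.kazusa.or.jp/codon/, and these tables can be adapted in different ways. See Nakamura Y. et al., "Codon usage tabulated from the international DNA sequence databases: status for the year 2000", Nucl. Acids Res., 28:292 (2000).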
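The simplest form of the codon replacement described above is to swap every codon for the synonymous codon that is used most often in the host. The following is a minimal sketch of that "most frequent codon" strategy, assuming the standard genetic code; it is not the optimization procedure used for SEQ ID NO:7, which is not described, and the usage values in the example are made-up numbers for illustration only, not real rice codon frequencies.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code: codon -> one-letter amino acid ('*' marks a stop).
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate an in-frame DNA coding sequence with the standard code."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def codon_optimize(cds: str, usage: dict[str, float]) -> str:
    """Swap each codon for the synonymous codon with the highest usage value."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    optimized = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        synonyms = [c for c, aa in CODON_TABLE.items() if aa == CODON_TABLE[codon]]
        # Codons missing from the usage table count as 0; ties keep the original codon.
        best = max(synonyms, key=lambda c: (usage.get(c, 0.0), c == codon))
        optimized.append(best)
    return "".join(optimized)


# Hypothetical usage values, for illustration only.
TOY_USAGE = {"GCC": 0.4, "GCT": 0.2, "AAG": 0.7, "AAA": 0.3}
print(codon_optimize("ATGGCTAAA", TOY_USAGE))  # ATGGCCAAG
```

Because only synonymous codons are exchanged, the encoded amino acid sequence is unchanged, which is the constraint stated in the definition above.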
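The organism from which cells that can be genome-edited by the Cas12a protein or genome editing system of the present invention are derived is preferably a eukaryote, including but not limited to mammals such as humans, mice, rats, monkeys, dogs, pigs, sheep, cattle and cats; poultry such as chickens, ducks and geese; and plants, including monocotyledonous and dicotyledonous plants, for example rice, maize, wheat, sorghum, barley, soybean, peanut and Arabidopsis. Particularly preferably, because it is derived from a plant symbiotic bacterium, the Cas12a protein or genome editing system of the present invention is especially suitable for genome editing in plants.
In some specific embodiments of the present invention, the nucleotide sequence encoding the Cas12a protein is codon-optimized for a plant such as rice.
In some specific embodiments, the nucleotide sequence encoding the Cas12a protein is selected from SEQ ID NO:2 and SEQ ID NO:7.
In some embodiments of the present invention, the nucleotide sequence encoding the Cas12a protein and/or the nucleotide sequence encoding the at least one guide RNA is operably linked to an expression regulatory element such as a promoter.
Examples of promoters that can be used in the present invention include, but are not limited to, polymerase (pol) I, pol II or pol III promoters. Examples of pol I promoters include the chicken RNA pol I promoter. Examples of pol II promoters include, but are not limited to, the cytomegalovirus immediate early (CMV) promoter, the Rous sarcoma virus long terminal repeat (RSV-LTR) promoter and the simian virus 40 (SV40) immediate early promoter. Examples of pol III promoters include the U6 and H1 promoters. Inducible promoters such as the metallothionein promoter can be used. Other examples of promoters include the T7 phage promoter, T3 phage promoter, β-galactosidase promoter and Sp6 phage promoter. For use in plants, the promoter may be the cauliflower mosaic virus 35S promoter, maize Ubi-1 promoter, wheat U6 promoter, rice U3 promoter, maize U3 promoter or rice actin promoter.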
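In some embodiments, to produce the guide RNA precisely in the cell, in the expression construct comprising the nucleotide sequence encoding the at least one guide RNA, the 5' end of the guide RNA coding sequence is linked to the 3' end of a first ribozyme coding sequence, the first ribozyme being designed to cleave, at the 5' end of the guide RNA, the first ribozyme-guide RNA fusion transcribed in the cell, thereby forming a guide RNA that carries no extra nucleotides at the 5' end. In one embodiment, the 3' end of the guide RNA coding sequence is linked to the 5' end of a second ribozyme coding sequence, the second ribozyme being designed to cleave, at the 3' end of the guide RNA, the guide RNA-second ribozyme fusion transcribed in the cell, thereby forming a guide RNA that carries no extra nucleotides at the 3' end. In some embodiments, the 5' end of the guide RNA coding sequence is linked to the 3' end of the first ribozyme coding sequence and the 3' end of the guide RNA coding sequence is linked to the 5' end of the second ribozyme coding sequence; the first ribozyme is designed to cleave the first ribozyme-guide RNA-second ribozyme fusion transcribed in the cell at the 5' end of the guide RNA, and the second ribozyme is designed to cleave that fusion at the 3' end of the guide RNA, thereby forming a guide RNA that carries no extra nucleotides at the 5' and 3' ends.
Designing the first or second ribozyme is within the ability of those skilled in the art. See, for example, Gao et al., JIPB, Apr, 2014; Vol 56, Issue 4, 343-349.
In a specific embodiment, the first ribozyme is encoded by the following sequence: 5'-(N)6CTGATGAGTCCGTGAGGACGAAACGAGTAAGCTCGTC-3' (SEQ ID NO:31), where each N is independently selected from A, G, C and T, and (N)6 denotes the sequence reverse-complementary to the first 6 nucleotides at the 5' end of the guide RNA. In a specific embodiment, the second ribozyme is encoded by the following sequence: 5'-GGCCGGCATGGTCCCAGCCTCCTCGCTGGCGCCGGCTGGGCAACATGCTTCGGCATGGCGAATGGGAC-3' (SEQ ID NO:32).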
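The first ribozyme depends on the guide RNA it flanks, because its (N)6 arm must be the reverse complement of the first six nucleotides of the guide. The following is a minimal sketch that builds the DNA coding sequence for a first ribozyme-guide RNA-second ribozyme fusion from the two sequences quoted above; the function names are illustrative, and promoter, terminator and cloning sites are deliberately left out.

```python
HH_CORE = "CTGATGAGTCCGTGAGGACGAAACGAGTAAGCTCGTC"  # SEQ ID NO:31 without (N)6
HDV = "GGCCGGCATGGTCCCAGCCTCCTCGCTGGCGCCGGCTGGGCAACATGCTTCGGCATGGCGAATGGGAC"  # SEQ ID NO:32


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(dna))


def ribozyme_flanked_guide(guide_dna: str) -> str:
    """Return first ribozyme + guide + second ribozyme as one DNA coding sequence."""
    guide_dna = guide_dna.upper()
    if len(guide_dna) < 6:
        raise ValueError("guide must be at least 6 nt long")
    hh_arm = reverse_complement(guide_dna[:6])  # the (N)6 of SEQ ID NO:31
    return hh_arm + HH_CORE + guide_dna + HDV


# Guide = DNA form of the SEQ ID NO:10 backbone plus an example 20-nt spacer.
guide = "AATTTCTACTATTGTAGAT" + "GAGTGAAATCTCTTGTCTTA"
cassette = ribozyme_flanked_guide(guide)
print(cassette[:6])   # GAAATT
print(len(cassette))  # 150
```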
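In some embodiments, to produce the guide RNA precisely in the cell, in the expression construct comprising the nucleotide sequence encoding the at least one guide RNA, the 5' end of the guide RNA coding sequence is linked to the 3' end of a first tRNA coding sequence, the first tRNA being designed so that the first tRNA-guide RNA fusion transcribed in the cell is cleaved at the 5' end of the guide RNA (that is, cleaved by the cell's own precise tRNA processing machinery, which precisely removes the 5' and 3' extra sequences of precursor tRNA to form mature tRNA), thereby forming a guide RNA that carries no extra nucleotides at the 5' end. In one embodiment, the 3' end of the guide RNA coding sequence is linked to the 5' end of a second tRNA coding sequence, the second tRNA being designed so that the guide RNA-second tRNA fusion transcribed in the cell is cleaved at the 3' end of the guide RNA, thereby forming a guide RNA that carries no extra nucleotides at the 3' end. In some embodiments, the 5' end of the guide RNA coding sequence is linked to the 3' end of the first tRNA coding sequence and the 3' end of the guide RNA coding sequence is linked to the 5' end of the second tRNA coding sequence; the first tRNA is designed so that the first tRNA-guide RNA-second tRNA fusion transcribed in the cell is cleaved at the 5' end of the guide RNA, and the second tRNA is designed so that this fusion is cleaved at the 3' end of the guide RNA, thereby forming a guide RNA that carries no extra nucleotides at the 5' and 3' ends.
Designing the tRNA-guide RNA fusion is within the ability of those skilled in the art. See, for example, Xie et al., PNAS, Mar 17, 2015; vol. 112, no. 11, 3570-3575.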
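III. Method for site-directed modification of a target nucleic acid sequence in the genome of a cell
In another aspect, the present invention provides a method for site-directed modification of a target nucleic acid sequence in the genome of a cell, comprising introducing the genome editing system of the present invention into the cell.
In some embodiments, introduction of the genome editing system results in a double-strand break (DSB) in the target nucleic acid sequence. Subsequently, through the repair function of the cell, substitution, deletion and/or addition of one or more nucleotides in the target nucleic acid sequence or a nearby sequence is achieved.
In another aspect, the present invention also provides a method for producing a genetically modified cell, comprising introducing the genome editing system of the present invention into the cell.
In another aspect, the present invention also provides a genetically modified organism comprising a genetically modified cell produced by the method of the present invention or a progeny cell thereof.
In the present invention, the target sequence to be modified can be located anywhere in the genome, for example in a functional gene such as a protein-coding gene, or in a gene expression regulatory region such as a promoter region or an enhancer region, so as to modify the function of the gene or its expression. The modification in the target sequence of the cell can be detected by T7EI, PCR/RE or sequencing methods.
In the method of the present invention, the gene editing system can be introduced into the cell by various methods well known to those skilled in the art.
Methods that can be used to introduce the gene editing system of the present invention into a cell include, but are not limited to: calcium phosphate transfection, protoplast fusion, electroporation, liposome transfection, microinjection, viral infection (such as baculovirus, vaccinia virus, adenovirus, adeno-associated virus, lentivirus and other viruses), the gene gun method, PEG-mediated protoplast transformation, and Agrobacterium-mediated transformation.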
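In some embodiments, the method of the present invention is performed in vitro. For example, the cell is an isolated cell, or a cell in an isolated tissue or organ.
In other embodiments, the method of the present invention can also be performed in vivo. For example, the cell is a cell within an organism, and the system of the present invention can be introduced into the cell in vivo by, for example, a virus-mediated or Agrobacterium-mediated method.
Cells that can be genome-edited by the method of the present invention can be derived from, for example, mammals such as humans, mice, rats, monkeys, dogs, pigs, sheep, cattle and cats; poultry such as chickens, ducks and geese; and plants, including monocotyledonous and dicotyledonous plants, for example rice, maize, wheat, sorghum, barley, soybean, peanut and Arabidopsis.
Particularly preferably, because it is derived from a plant symbiotic bacterium, the Cas12a protein or genome editing system of the present invention is especially suitable for genome editing in plants.
Accordingly, the present invention provides a method for producing a genetically modified plant, comprising introducing the genome editing system of the present invention into at least one plant, thereby causing a modification in the genome of the at least one plant. The modification includes substitution, deletion and/or addition of one or more nucleotides.
In the method of the present invention, the genome editing system can be introduced into the plant by various methods well known to those skilled in the art. Methods that can be used to introduce the genome editing system of the present invention into a plant include, but are not limited to: the gene gun method, PEG-mediated protoplast transformation, Agrobacterium-mediated transformation, plant virus-mediated transformation, the pollen tube pathway method and the ovary injection method.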
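In the method of the present invention, the target sequence can be modified simply by introducing or producing the Cas12a protein and the guide RNA in plant cells, and the modification can be stably inherited, without the need to stably transform the plant with the genome editing system. This avoids the potential off-target effects of a stably present genome editing system and also avoids integration of exogenous nucleotide sequences into the plant genome, giving higher biosafety.
In some preferred embodiments, the introduction is performed in the absence of selection pressure, thereby avoiding integration of exogenous nucleotide sequences into the plant genome.
In some embodiments, the introduction comprises transforming the genome editing system of the present invention into an isolated plant cell or tissue and then regenerating the transformed plant cell or tissue into a whole plant. Preferably, the regeneration is performed in the absence of selection pressure, that is, no selection agent for the selection gene carried on the expression vector is used during tissue culture. Not using a selection agent can improve the regeneration efficiency of the plant and yield herbicide-resistant plants free of exogenous nucleotide sequences.
In other embodiments, the genome editing system of the present invention can be transformed into a specific part of a whole plant, such as a leaf, shoot tip, pollen tube, young ear or hypocotyl. This is particularly suitable for transforming plants that are difficult to regenerate by tissue culture.
In some embodiments of the present invention, a protein expressed in vitro and/or an RNA molecule transcribed in vitro is transformed directly into the plant. The protein and/or RNA molecule can achieve genome editing in the plant cell and is then degraded by the cell, avoiding integration of exogenous nucleotide sequences into the plant genome.
Therefore, in some embodiments, genetic modification of a plant using the method of the present invention can yield a plant whose genome has no integrated exogenous polynucleotide, that is, a transgene-free modified plant.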
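In some embodiments of the present invention, the modification is associated with a plant trait such as an agronomic trait; for example, the modification causes the plant to have an altered (preferably improved) trait, such as an agronomic trait, relative to a wild-type plant.
In some embodiments, the method further comprises a step of screening for plants having the desired modification and/or a desired trait such as an agronomic trait.
In some embodiments of the present invention, the method further comprises obtaining progeny of the genetically modified plant. Preferably, the genetically modified plant or its progeny has the desired modification and/or a desired trait such as an agronomic trait.
In another aspect, the present invention also provides a genetically modified plant or a progeny or part thereof, wherein the plant is obtained by the above method of the present invention. In some embodiments, the genetically modified plant or the progeny or part thereof is transgene-free. Preferably, the genetically modified plant or its progeny has the desired genetic modification and/or a desired trait such as an agronomic trait.
In another aspect, the present invention also provides a plant breeding method, comprising crossing a genetically modified first plant obtained by the above method of the present invention with a second plant that does not contain the modification, thereby introducing the modification into the second plant. Preferably, the genetically modified first plant has a desired trait such as an agronomic trait.
IV. Kit
The present invention also includes a kit for use in the method of the present invention, the kit comprising the genome editing system of the present invention and instructions for use. The kit generally includes a label indicating the intended use and/or method of use of the kit contents. The term label includes any written or recorded material provided on or with the kit or otherwise accompanying the kit.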
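Examples
Example 1. Searching for CRISPR/Cas12a systems in plant symbiotic bacteria by homology-based similarity comparison
Based on the reports on plant symbiotic bacteria by Bai et al. (Functional overlap of the Arabidopsis leaf and root microbiota) and Levy et al. (Genomic features of bacterial adaptation to plants), the inventors collected genome sequences of plant symbiotic bacteria and used the CRISPRdisco software to collate and analyze the CRISPR systems in 4269 plant symbiotic bacterial genomes. CRISPR systems were found to be abundant in plant symbiotic bacteria, but most were Type I systems of Class I; only one Cas12a protein, of Type V in Class II, was found. This Cas12a protein is derived from Flavobacterium branchiophilum; its NCBI contig ID is FQ859183.1 and its GenBank protein ID is CCB70584.1. It is hereinafter referred to as FbCas12a.
The protein is 1318 aa long (SEQ ID NO:1), but there are no other Cas protein sequences near it. CRISPR repeat sequences begin 1509 bp downstream in the genome, with 37 spacer sequences in total. Its direct repeat is GTTTAAAACCACTTTAAAATTTCTACTATTGTAGAT (SEQ ID NO:9); a comparison with the direct repeats of the commonly used Cas12a proteins FnCas12a, LbCas12a and AsCas12a is shown in Figure 1a. The protein sequence alignment of this protein with the commonly used FnCas12a, LbCas12a and AsCas12a is shown in Figure 1b; the sequence similarity comparison used the NCBI blastp program.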
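Example 2. Preparation of FbCas12a and LbCas12a expression vectors
The following vectors were constructed for plant protoplast transformation: pJIT163-UBI-FbCas12a, pJIT163-UBI-LbCas12a, pJIT163-UBI-FbcrRNA and pJIT163-UBI-LbcrRNA.
The coding sequence of FbCas12a from Flavobacterium branchiophilum was codon-optimized, two nuclear localization signals (NLS) were added at its 3' end, and BamHI/SmaI restriction sites were added at the two ends, so that the optimized FbCas12a protein is better expressed and localized in rice. The NLS-tagged, codon-optimized nucleotide coding sequence of FbCas12a is shown in SEQ ID NO:7 of the sequence listing. In SEQ ID NO:7, positions 3967-3987 are the SV40 nuclear localization signal sequence, positions 3988-3999 are the SGGS linker between the two nuclear localization signal sequences, positions 4000-4047 are the nucleoplasmin nuclear localization signal sequence, and positions 1-3966 are the coding sequence of the FbCas12a protein. SEQ ID NO:7 encodes the protein shown in SEQ ID NO:6, that is, the FbCas12a nuclease with nuclear localization signals.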
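The coordinates quoted for SEQ ID NO:7 can be sanity-checked mechanically: the features should follow one another without gaps, and each should span a whole number of codons so that the NLS fusion stays in frame. The following is a minimal sketch of such a check; the feature labels and the function name are illustrative, and only the positions come from the text.

```python
# 1-based, inclusive coordinates quoted in the paragraph above.
FEATURES = [
    ("FbCas12a coding sequence", 1, 3966),
    ("SV40 NLS", 3967, 3987),
    ("SGGS linker", 3988, 3999),
    ("nucleoplasmin NLS", 4000, 4047),
]


def check_features(features):
    """Verify the features tile the sequence in frame; return codons per feature."""
    codons = {}
    expected_start = 1
    for name, start, end in features:
        if start != expected_start:
            raise ValueError(f"{name} does not start right after the previous feature")
        length = end - start + 1
        if length % 3:
            raise ValueError(f"{name} is not a whole number of codons")
        codons[name] = length // 3
        expected_start = end + 1
    return codons


def extract(sequence: str, start: int, end: int) -> str:
    """Slice a 1-based inclusive range out of a Python string."""
    return sequence[start - 1:end]


print(check_features(FEATURES))
```

The 12-nt linker corresponds to the four residues of SGGS, which is a quick consistency check on the quoted positions.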
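The DNA shown in SEQ ID NO:7, flanked by BamHI/SmaI sites, was synthesized. After BamHI/SmaI double digestion, the DNA fragment was ligated into the expression vector pJIT163 (Guerineau, F., Lucy, A. & Mullineaux, P., Effect of two consensus sequences preceding the translation initiator codon on gene expression in plant protoplasts. Plant Molecular Biology 18, 815-818, 1992; the vector is available to the public from the Institute of Genetics and Developmental Biology, Chinese Academy of Sciences), and the resulting construct was named pJIT163-FbCas12a. Sequencing confirmed that a nucleotide fragment having the sequence shown in SEQ ID NO:7 had been inserted between the BamHI and SmaI sites of the pJIT163 expression vector.
The DNA sequence of LbCas12a, which is commonly used for genome editing in the laboratory, was ligated into the pJIT163 vector to give the pJIT163-UBI-LbCas12a vector; the construction was similar to that of pJIT163-UBI-FbCas12a. The codon-optimized nucleotide coding sequence of LbCas12a is shown as SEQ ID NO:8 in the sequence listing.
In summary, the pJIT163-FbCas12a and pJIT163-LbCas12a vectors contain the UBI promoter, the plant codon-optimized FbCas12a or LbCas12a coding sequence, and, at the 3' end, the SV40 nuclear localization signal coding sequence and the nucleoplasmin nuclear localization signal coding sequence. Their structures are shown schematically in Figures 2a and 2c.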
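Example 3. Preparation of the vector pJIT163-FbcrRNA, carrying the coding sequence of the artificially predicted mature form of the FbCas12a crRNA scaffold, and of pJIT163-LbcrRNA
Cas12a is able to process its own crRNA into the mature form. Unexpectedly, however, when the inventors used the full-length FbCas12a crRNA scaffold sequence (the full-length direct repeat), no genome editing was achieved. It therefore appears that FbCas12a cannot mature its native crRNA scaffold, and the crRNA of FbCas12a had to be explored to determine whether it can support genome editing.
A DNA fragment with the nucleotide sequence shown in SEQ ID NO:14 was synthesized. The fragment contains a hammerhead ribozyme (HH ribozyme) and a hepatitis delta virus ribozyme (HDV ribozyme), which cleave out the artificially predicted mature direct repeat (DR) corresponding to FbCas12a. In this fragment, positions 1-6 are the HindIII restriction site, positions 7-12 are the reverse-complementary sequence required for the HH ribozyme to function, positions 13-49 are the HH ribozyme sequence, positions 50-68 are the artificially truncated DR sequence, positions 69-88 contain two BsaI restriction sites through which the recognition sequence for the rice target sequence to be mutated can be ligated into the vector pJIT163-FbcrRNA, positions 89-156 are the HDV ribozyme sequence, and positions 157-162 are the SmaI restriction site sequence.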
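The layout of SEQ ID NO:14 described above can be written down as a small feature map. This is a minimal sketch using only the coordinates stated in the text; the feature labels and the helpers `feature_lengths` and `feature_at` are hypothetical names chosen for illustration.

```python
# Feature map of SEQ ID NO:14 as stated in the text (1-based, inclusive).
FEATURES = [
    ("HindIII site", 1, 6),
    ("HH reverse-complement arm", 7, 12),
    ("HH ribozyme", 13, 49),
    ("truncated DR", 50, 68),
    ("BsaI cloning region", 69, 88),
    ("HDV ribozyme", 89, 156),
    ("SmaI site", 157, 162),
]


def feature_lengths(features=FEATURES):
    """Return {feature name: length in nucleotides}."""
    return {name: end - start + 1 for name, start, end in features}


def feature_at(position, features=FEATURES):
    """Return the name of the feature covering a 1-based position."""
    for name, start, end in features:
        if start <= position <= end:
            return name
    raise ValueError(f"position {position} lies outside the fragment")


lengths = feature_lengths()
total = sum(lengths.values())
print(lengths)
print("total length:", total)
```

The stated ranges tile the fragment without gaps, giving a 19-nt truncated DR, a 20-nt BsaI cloning region and a total length of 162 nt. The spacer is inserted directly after the DR, so that the two ribozymes release a transcript consisting of the DR followed by the spacer.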
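After HindIII/SmaI double digestion, the synthesized DNA fragment of SEQ ID NO:14 was ligated into the expression vector pJIT163 to give the pJIT163-FbcrRNA vector. This vector contains the UBI promoter, the HH ribozyme, the truncated FbcrRNA sequence, the HDV ribozyme and the CaMV terminator; its structure is shown schematically in Figure 2b. The pJIT163-FbcrRNA vector uses a ribozyme-based crRNA maturation strategy to produce a precisely processed crRNA sequence.
A DNA fragment with the nucleotide sequence shown in SEQ ID NO:15 was synthesized; this fragment differs from the FbCas12a fragment only in the DR sequence. Likewise, after HindIII/SmaI double digestion, the synthesized DNA fragment of SEQ ID NO:15 was ligated into the expression vector pJIT163 to give the pJIT163-LbcrRNA vector. The structure of this vector is shown schematically in Figure 2d.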
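Example 4. Site-directed mutagenesis of the endogenous rice gene EPSPS by the FbCas12a system, and mutation of four endogenous rice gene target sites using the FbCas12a protein with the LbCas12a crRNA
(1) Design of the target fragments
target-EPSPS05: TTTGGTACTAAATATACAATCCCTTGG (SEQ ID NO:16; nucleotides 956-982 of the OsEPSPS gene, LOC_Os06g04280.1. The 5'-terminal TTTG is the PAM sequence).
target-OsCDC48: TTTATTCAGATTACATATGGTTAG (SEQ ID NO:17; nucleotides 582-605 of the OsCDC48 gene, LOC_Os03g05730. The 5'-terminal TTTA is the PAM sequence).
target-OsDEP1T3: TTTCAAATGGATCTAAACAGGGCCTTA (SEQ ID NO:18; nucleotides 1919-1945 of the OsDEP1 gene, LOC_Os09g26999. The 5'-terminal TTTC is the PAM sequence).
target-OsPDS: TTTGGAGTGAAATCTCTTGTCTTA (SEQ ID NO:19; nucleotides 136-159 of the OsPDS gene, LOC_Os03g08570. The 5'-terminal TTTG is the PAM sequence).
target-OsEpspsC02: TTTATGAAAATATGTATGGAATTCATG (SEQ ID NO:20; nucleotides 1294-1320 of the OsEPSPS gene, LOC_Os06g04280.1. The 5'-terminal TTTA is the PAM sequence).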
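The five target fragments listed above can be split into PAM and protospacer programmatically. The sketch below assumes that the PAM is the 5'-terminal four nucleotides of each fragment and checks it against the TTTV pattern (V = A, C or G) commonly reported for Cas12a nucleases; that pattern is an assumption brought in for illustration, not a statement from this text about FbCas12a. `split_target` and `is_tttv` are hypothetical helper names.

```python
# Target fragments as listed in the text (SEQ ID NO:16-20).
TARGETS = {
    "target-EPSPS05": "TTTGGTACTAAATATACAATCCCTTGG",
    "target-OsCDC48": "TTTATTCAGATTACATATGGTTAG",
    "target-OsDEP1T3": "TTTCAAATGGATCTAAACAGGGCCTTA",
    "target-OsPDS": "TTTGGAGTGAAATCTCTTGTCTTA",
    "target-OsEpspsC02": "TTTATGAAAATATGTATGGAATTCATG",
}


def split_target(seq, pam_len=4):
    """Split a target into (PAM, protospacer), assuming a 5' PAM of pam_len nt."""
    return seq[:pam_len], seq[pam_len:]


def is_tttv(pam):
    """True if the PAM matches TTTV, i.e. TTT followed by A, C or G."""
    return len(pam) == 4 and pam.startswith("TTT") and pam[3] in "ACG"


parsed = {name: split_target(seq) for name, seq in TARGETS.items()}
for name, (pam, protospacer) in parsed.items():
    print(f"{name}: PAM={pam} TTTV={is_tttv(pam)} protospacer={len(protospacer)} nt")
```

Under these assumptions all five fragments start with a TTTV PAM, and the protospacers are 20 or 23 nt long.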
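(2) pJIT163-FbcrRNA and pJIT163-LbcrRNA plasmids containing SP1, SP2, SP3, SP4 and SP5
SP1 is the DNA encoding an RNA that can bind complementarily to the target target-EPSPS05.
The following single-stranded primers with sticky ends (the 5'-terminal AGAT and AAAC, respectively) were synthesized:
SP1-F: AGATGTACTAAATATACAATCCCTTGG (SEQ ID NO:21)
SP1-R: AAACCCAAGGGATTGTATATTTAGTAC (SEQ ID NO:22)
The primers were annealed to form double-stranded DNA with sticky ends, which was inserted between the two BsaI restriction sites of pJIT163-FbcrRNA to give the pJIT163-FbcrRNA plasmid containing SP1. The plasmid was confirmed as positive by sequencing.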
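The relationship between a target fragment and its SP primer pair can be reproduced in code. The sketch below infers the design rule from the SP1 sequences themselves: the forward primer is the overhang AGAT followed by the protospacer (the target without its 5'-terminal four nucleotides), and the reverse primer is the overhang AAAC followed by the reverse complement of the protospacer. The function names `revcomp` and `design_oligos` are hypothetical.

```python
# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_oligos(target, pam_len=4, fwd_overhang="AGAT", rev_overhang="AAAC"):
    """Build the annealing primer pair for a target given as PAM + protospacer.

    The overhangs default to the sticky ends seen in SP1-F and SP1-R.
    """
    protospacer = target[pam_len:]
    forward = fwd_overhang + protospacer
    reverse = rev_overhang + revcomp(protospacer)
    return forward, reverse


sp1_f, sp1_r = design_oligos("TTTGGTACTAAATATACAATCCCTTGG")  # target-EPSPS05
print("SP1-F:", sp1_f)
print("SP1-R:", sp1_r)
```

Applied to target-EPSPS05, this rule yields exactly the SP1-F and SP1-R sequences given in SEQ ID NO:21 and SEQ ID NO:22. When the two primers anneal, the 4-nt overhangs remain single-stranded and direct the insert into the BsaI-cut vector in a fixed orientation.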
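SP2 to SP5 are the DNAs encoding RNAs that can bind complementarily to the targets target-OsCDC48, target-OsDEP1T3, target-OsPDS and target-OsEpspsC02, respectively.
The following single-stranded primers with sticky ends (the 5'-terminal AGAT and AAAC, respectively) were synthesized:
SP2-F: AGATTTCAGATTACATATGGTTAG (SEQ ID NO:23)
SP2-R: AAACCTAACCATATGTAATCTGAA (SEQ ID NO:24)
SP3-F: AGATAAATGGATCTAAACAGGGCCTTA (SEQ ID NO:25)
SP3-R: AAACATTGGCCCTGTTTAGATCCATTT (SEQ ID NO:26)
SP4-F: AGATGAGTGAAATCTCTTGTCTTA (SEQ ID NO:27)
SP4-R: AAACTAAGACAAGAGATTTCACTC (SEQ ID NO:28)
SP5-F: AGATTGAAAATATGTATGGAATTCATG (SEQ ID NO:29)
SP5-R: AAACCATGAATTCCATACATATTTTCA (SEQ ID NO:30)
The primers were annealed to form double-stranded DNAs with sticky ends, which were inserted between the two BsaI restriction sites of pJIT163-FbcrRNA and of pJIT163-LbcrRNA to give the pJIT163-FbcrRNA and pJIT163-LbcrRNA plasmids containing SP1 to SP5. The plasmids were confirmed as positive by sequencing.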
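(3) Transformation of FbcrRNA:FbCas12a, LbcrRNA:FbCas12a and LbcrRNA:LbCas12a into rice protoplasts
pJIT163-UBI-FbCas12a or pJIT163-UBI-LbCas12a was transformed, together with the pJIT163-FbcrRNA or pJIT163-LbcrRNA plasmids containing SP1 to SP5, into protoplasts of the rice cultivar Nipponbare. The rice protoplast transformation followed the method disclosed in Shan, Q. et al., Rapid and efficient gene modification in rice and Brachypodium using TALENs. Molecular Plant (2013).
Genomic DNA was extracted 48 hours after protoplast transformation and used as the template for high-throughput amplicon sequencing to determine the editing efficiency. The amplicon sequencing followed the method described in Zhang et al., Perfectly matched 20-nucleotide guide RNA sequences enable robust genome editing using high-fidelity SpCas9 nucleases. Genome Biology, 2017.
The high-throughput sequencing results for the artificially matured FbcrRNA:FbCas12a are shown in Figure 3a. Compared with the control group, the FbCas12a-treated group carried mutations at the target site in the OsEPSPS gene, with a mutation efficiency of approximately 6.25%. Figure 3b shows the proportions of mutation types derived from the high-throughput sequencing data; most of the mutations produced by FbCas12a at the target site are deletions of DNA fragments.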
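The mutation efficiency and the mutation-type proportions reported above are both simple read-count ratios. The sketch below illustrates the arithmetic only: the reference and read sequences are toy strings invented for the example, not data from this work, and the classification by read length is a deliberate simplification of what an amplicon analysis pipeline does. `classify_read` and `summarize` are hypothetical names.

```python
from collections import Counter


def classify_read(read, reference):
    """Classify a read against the reference amplicon by a crude length rule."""
    if read == reference:
        return "wild-type"
    if len(read) < len(reference):
        return "deletion"
    if len(read) > len(reference):
        return "insertion"
    return "substitution"


def summarize(reads, reference):
    """Return (mutation efficiency in percent, Counter of read classes)."""
    counts = Counter(classify_read(read, reference) for read in reads)
    edited = len(reads) - counts["wild-type"]
    efficiency = 100.0 * edited / len(reads) if reads else 0.0
    return efficiency, counts


# Toy data (hypothetical): 15 unedited reads and 1 read with a 2-nt deletion.
reference = "ACGTACGTAC"
reads = [reference] * 15 + ["ACGTGTAC"]

efficiency, counts = summarize(reads, reference)
print(f"mutation efficiency: {efficiency:.2f}%")
print(dict(counts))
```

In practice the treated sample would be compared with the untreated control, as in Figure 3a, so that sequencing errors are not counted as edits.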
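The results of a further protoplast transformation experiment are shown in Figure 4, where Fb denotes FbcrRNA:FbCas12a, FbLb denotes LbcrRNA:FbCas12a and Lb denotes LbcrRNA:LbCas12a. FbcrRNA:FbCas12a is active at all four rice target sites, but FbCas12a edits more efficiently when LbcrRNA is used.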

Claims (16)

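  1. A Cas12a protein, which
    (i) comprises an amino acid sequence having at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.1%, at least 99.2%, at least 99.3%, at least 99.4%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or even 100% sequence identity to SEQ ID NO:1, or
    (ii) comprises an amino acid sequence having one or more, for example 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10, amino acid substitutions, deletions or additions relative to SEQ ID NO:1.
  2. The Cas12a protein of claim 1, wherein the Cas12a protein is derived from a species of the genus Flavobacterium, for example from Flavobacterium branchiophilum.
  3. The Cas12a protein of claim 1 or 2, wherein the Cas12a protein further comprises a nuclear localization sequence (NLS).
  4. The Cas12a protein of claim 3, which comprises the amino acid sequence shown in SEQ ID NO:6.
  5. Use of the Cas12a protein of any one of claims 1-4 for genome editing of a cell, preferably a eukaryotic cell, more preferably a plant cell.
  6. A genome editing system for site-directed modification of a target nucleic acid sequence in the genome of a cell, comprising the Cas12a protein of any one of claims 1-4 and/or an expression construct comprising a nucleotide sequence encoding the Cas12a protein of any one of claims 1-4.
  7. The genome editing system of claim 6, further comprising at least one guide RNA (gRNA) and/or an expression construct comprising a nucleotide sequence encoding the at least one guide RNA.
  8. The genome editing system of claim 7, wherein the guide RNA is a crRNA and comprises the crRNA scaffold sequence shown in SEQ ID NO:10 or 11.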
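  9. The genome editing system of claim 7 or 8, which comprises any one selected from the following i) to v):
    i) the Cas12a protein of any one of claims 1-4 and the at least one guide RNA, wherein, optionally, the Cas12a protein and the at least one guide RNA form a complex;
    ii) an expression construct comprising a nucleotide sequence encoding the Cas12a protein of any one of claims 1-4, and the at least one guide RNA;
    iii) the Cas12a protein of any one of claims 1-4, and an expression construct comprising a nucleotide sequence encoding the at least one guide RNA;
    iv) an expression construct comprising a nucleotide sequence encoding the Cas12a protein of any one of claims 1-4, and an expression construct comprising a nucleotide sequence encoding the at least one guide RNA;
    v) an expression construct comprising a nucleotide sequence encoding the Cas12a protein of any one of claims 1-4 and a nucleotide sequence encoding the at least one guide RNA.
  10. The genome editing system of any one of claims 6-9, wherein the nucleotide sequence encoding the Cas12a protein is codon-optimized for a plant such as rice.
  11. The genome editing system of claim 10, wherein the nucleotide sequence encoding the Cas12a protein is selected from SEQ ID NO:2 and SEQ ID NO:7.
  12. The genome editing system of any one of claims 7-11, wherein the nucleotide sequence encoding the Cas12a protein and/or the nucleotide sequence encoding the at least one guide RNA is operably linked to an expression regulatory element such as a promoter.
  13. The genome editing system of any one of claims 7-11, wherein the 5' end of the guide RNA coding sequence is linked to the 3' end of a first ribozyme coding sequence, and the 3' end of the guide RNA coding sequence is linked to the 5' end of a second ribozyme coding sequence; the first ribozyme is designed to cleave, at the 5' end of the guide RNA, the first ribozyme-guide RNA-second ribozyme fusion transcribed in the cell, and the second ribozyme is designed to cleave that fusion at the 3' end of the guide RNA, thereby forming a guide RNA that carries no additional nucleotides at its 5' and 3' ends.
  14. The genome editing system of claim 13, wherein the first ribozyme is encoded by the sequence shown in SEQ ID NO:31 and the second ribozyme is encoded by the sequence shown in SEQ ID NO:32.
  15. A method of producing a genetically modified cell, comprising introducing the genome editing system of any one of claims 6-14 into the cell.
  16. The method of claim 15, wherein the cell is from a mammal such as human, mouse, rat, monkey, dog, pig, sheep, cattle or cat; poultry such as chicken, duck or goose; or a plant, including monocotyledonous and dicotyledonous plants, for example rice, maize, wheat, sorghum, barley, soybean, peanut or Arabidopsis thaliana.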
PCT/CN2020/129665 2019-11-18 2020-11-18 Gene editing system derived from flavobacteria WO2021098709A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
US17/777,936 US20230002453A1 (en) 2019-11-18 2020-11-18 Gene editing system derived from flavobacteria
BR112022009584A BR112022009584A2 (pt) 2019-11-18 2020-11-18 Sistema de edição de genes derivado de flavobacterium
CN202080080579.0A CN115052980A (zh) 2019-11-18 2020-11-18 Gene editing system derived from flavobacteria
EP20890516.6A EP4063500A4 (en) 2019-11-18 2020-11-18 GENE EDITING SYSTEM DERIVED FROM BACTERIA OF THE GENUS FLAVOBACTERIUM

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911126348.4 2019-11-18

Publications (1)

Publication Number Publication Date
WO2021098709A1 (zh) 2021-05-27

Family

ID=75981339

Country Status (5)

Country Link
US (1) US20230002453A1 (zh)
EP (1) EP4063500A4 (zh)
CN (1) CN115052980A (zh)
BR (1) BR112022009584A2 (zh)
WO (1) WO2021098709A1 (zh)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108513582A (zh) * 2015-06-18 2018-09-07 The Broad Institute, Inc. Novel CRISPR enzymes and systems
WO2019051428A1 (en) * 2017-09-11 2019-03-14 The Regents Of The University Of California CAS9 ANTIBODY MEDIA ADMINISTRATION TO MAMMALIAN CELLS
CN110117621A (zh) * 2019-05-24 2019-08-13 Qingdao Agricultural University Base editor and preparation method and application thereof

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10513711B2 (en) * 2014-08-13 2019-12-24 Dupont Us Holding, Llc Genetic targeting in non-conventional yeast using an RNA-guided endonuclease
EA038321B1 (ru) * 2014-11-06 2021-08-09 Е.И. Дюпон Де Немур Энд Компани Опосредуемая пептидом доставка направляемой рнк эндонуклеазы в клетки
US10648020B2 (en) * 2015-06-18 2020-05-12 The Broad Institute, Inc. CRISPR enzymes and systems
US9896696B2 (en) * 2016-02-15 2018-02-20 Benson Hill Biosystems, Inc. Compositions and methods for modifying genomes
US20200263190A1 (en) * 2016-04-19 2020-08-20 The Broad Institute, Inc. Novel crispr enzymes and systems
WO2017223538A1 (en) * 2016-06-24 2017-12-28 The Regents Of The University Of Colorado, A Body Corporate Methods for generating barcoded combinatorial libraries
US20190330659A1 (en) * 2016-07-15 2019-10-31 Zymergen Inc. Scarless dna assembly and genome editing using crispr/cpf1 and dna ligase
JP2020507312A (ja) * 2017-02-10 2020-03-12 ザイマージェン インコーポレイテッド 複数の宿主用の複数のdnaコンストラクトのアセンブリ及び編集のためのモジュラーユニバーサルプラスミド設計戦略
WO2018226972A2 (en) * 2017-06-09 2018-12-13 Vilmorin & Cie Compositions and methods for genome editing
US20210071174A1 (en) * 2018-05-09 2021-03-11 Dsm Ip Assets B.V. Crispr transient expression construct (ctec)

Non-Patent Citations (20)

* Cited by examiner, † Cited by third party
Title
"Biocomputing: Informatics and Genome Projects", 1993, ACADEMIC PRESS
"Computer Analysis of Sequence Data", 1994, HUMANA PRESS
"GeneBank", Database accession no. CCB70584.1
"NCBI", Database accession no. FQ859183.1
"Sequence Analysis Primer", 1991, M STOCKTON PRESS
BAI ET AL., FUNCTIONAL OVERLAP OF THE ARABIDOPSIS LEAF AND ROOT MICROBIOTA
CARRILLO, H.LIPMAN, D., SIAM J APPLIED MATH, vol. 48, 1988, pages 1073
CONG ET AL.: "Multiplex Genome Engineering Using CRISPR/Cas Systems", SCIENCE, 2013
DATABASE Protein GenPept; ANONYMOUS: "type V CRISPR-associated protein Cas12a/Cpf1 [Flavobacterium branchiopphilum]", XP055822487, retrieved from NCBI *
GAO ET AL., JIPB, vol. 56, no. 4, April 2014 (2014-04-01), pages 343 - 349
LEVY ET AL., GENOMIC FEATURES OF BACTERIAL ADAPTATION TO PLANTS
MALI ET AL.: "RNA-guided human genome engineering via Cas9", SCIENCE, 2013
NAKAMURA Y. ET AL.: "Codon usage tabulated from the international DNA sequence databases: status for the year 2000", NUCL. ACIDS RES., vol. 28, 2000, pages 292, XP002941557, DOI: 10.1093/nar/28.1.292
PLANT MOLECULAR BIOLOGY, vol. 18, 1992, pages 815 - 818
SAMBROOK, J.FRITSCH, E.FMANIATIS, T.: "Molecular Cloning: A Laboratory Manual", 1989, COLD SPRING HARBOR LABORATORY PRESS
See also references of EP4063500A4
SHAN, Q. ET AL.: "Rapid and efficient gene modification in rice and Brachypodium using TALENs", METHOD DISCLOSED IN MOLECULAR PLANT, 2013
WATSON ET AL.: "Sequence Analysis in Molecular Biology", 1987, THE BENJAMIN/CUMMINGS PUB. CO., pages: 224
XIE ET AL., PNAS, vol. 112, no. 11, 17 March 2015 (2015-03-17), pages 3570 - 3575
ZHANG ET AL.: "Perfectly matched 20 -nucleotide guide RNA sequences enable robust genome editing using high-fidelity SpCas9 nucleases", METHODS DESCRIBED IN GENOME BIOLOGY, 2017

Also Published As

Publication number Publication date
EP4063500A1 (en) 2022-09-28
US20230002453A1 (en) 2023-01-05
EP4063500A4 (en) 2023-12-27
CN115052980A (zh) 2022-09-13
BR112022009584A2 (pt) 2022-10-04

Legal Events

Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 20890516. Country of ref document: EP. Kind code of ref document: A1.
REG Reference to national code. Ref country code: BR. Ref legal event code: B01A. Ref document number: 112022009584. Country of ref document: BR.
NENP Non-entry into the national phase. Ref country code: DE.
ENP Entry into the national phase. Ref document number: 2020890516. Country of ref document: EP. Effective date: 20220620.
REG Reference to national code. Ref country code: BR. Ref legal event code: B01E. Ref document number: 112022009584. Country of ref document: BR. Free format text: Submit new electronic content of the biological sequence listing, since the content submitted diverges from the application in the national phase with respect to mandatory fields (field 110).
ENP Entry into the national phase. Ref document number: 112022009584. Country of ref document: BR. Kind code of ref document: A2. Effective date: 20220517.