WO2020177751A1 - A nucleic acid construct for gene editing - Google Patents

A nucleic acid construct for gene editing

Info

Publication number
WO2020177751A1
Authority
WO
WIPO (PCT)
Prior art keywords
coding sequence
nucleic acid
acid construct
sequence
gene
Prior art date
Application number
PCT/CN2020/078079
Other languages
English (en)
French (fr)
Inventor
李峰
梁亚峰
Original Assignee
山东舜丰生物科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from CN201910839046.5A (CN110527695B)
Priority claimed from CN201910838286.3A (CN110526993B)
Application filed by 山东舜丰生物科技有限公司
Publication of WO2020177751A1

Classifications

    • A HUMAN NECESSITIES
    • A01 AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
    • A01H NEW PLANTS OR NON-TRANSGENIC PROCESSES FOR OBTAINING THEM; PLANT REPRODUCTION BY TISSUE CULTURE TECHNIQUES
    • A01H5/00 Angiosperms, i.e. flowering plants, characterised by their plant parts; Angiosperms characterised otherwise than by their botanic taxonomy
    • A HUMAN NECESSITIES
    • A01 AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
    • A01H NEW PLANTS OR NON-TRANSGENIC PROCESSES FOR OBTAINING THEM; PLANT REPRODUCTION BY TISSUE CULTURE TECHNIQUES
    • A01H6/00 Angiosperms, i.e. flowering plants, characterised by their botanic taxonomy
    • A01H6/20 Brassicaceae, e.g. canola, broccoli or rucola
    • A HUMAN NECESSITIES
    • A01 AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
    • A01H NEW PLANTS OR NON-TRANSGENIC PROCESSES FOR OBTAINING THEM; PLANT REPRODUCTION BY TISSUE CULTURE TECHNIQUES
    • A01H6/00 Angiosperms, i.e. flowering plants, characterised by their botanic taxonomy
    • A01H6/46 Gramineae or Poaceae, e.g. ryegrass, rice, wheat or maize
    • A HUMAN NECESSITIES
    • A01 AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
    • A01H NEW PLANTS OR NON-TRANSGENIC PROCESSES FOR OBTAINING THEM; PLANT REPRODUCTION BY TISSUE CULTURE TECHNIQUES
    • A01H6/00 Angiosperms, i.e. flowering plants, characterised by their botanic taxonomy
    • A01H6/54 Leguminosae or Fabaceae, e.g. soybean, alfalfa or peanut
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K19/00 Hybrid peptides, i.e. peptides covalently bound to nucleic acids, or non-covalently bound protein-protein complexes
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82 Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N5/00 Undifferentiated human, animal or plant cells, e.g. cell lines; Tissues; Cultivation or maintenance thereof; Culture media therefor
    • C12N5/10 Cells modified by introduction of foreign genetic material
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases RNAses, DNAses
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/78 Hydrolases (3) acting on carbon to nitrogen bonds other than peptide bonds (3.5)

Definitions

  • the present invention relates to the field of biotechnology, in particular, to a nucleic acid construct for gene editing.
  • CRISPR-Cas9 gene editing technology has been widely used in gene editing research in animals and plants. Base editors currently fall into two main classes: the cytosine base editor (CBE) and the adenine base editor (ABE).
  • CBE cytosine base editor
  • ABE adenine base editor
  • the base editor can make accurate base changes (such as substitutions) in the genome without causing DNA double-strand breaks (DSB).
  • the early CBE and ABE systems were built from rat cytidine deaminase APOBEC1, lamprey cytidine deaminase PmCDA1, and TadA, a tRNA adenine deaminase, and have been applied in many plant species, such as rice, wheat, corn, tomato, Arabidopsis and Brassica napus.
  • Kang et al. optimized the promoter
  • Zong et al. optimized the deaminase.
  • Cas9 protein variants can recognize PAM sequences that are different from the classic NGG motif.
  • the efficiency of single-base editing is still very low.
  • the base editors reported so far can only recognize a limited number of PAM sequences, which greatly limits the editable range of plant genomes.
  • the purpose of the present invention is to provide a new nucleic acid construct for gene editing that can achieve A-to-G conversion in plant cells efficiently, accurately and over a wide range, while significantly reducing the risk of insertion or deletion mutations.
  • the first aspect of the present invention provides a nucleic acid construct having a 5'-3' (5' to 3') formula I structure:
  • I1 is the first integrated component
  • I2 is the second integrated component
  • Z1 is the first expression cassette
  • Z2 is the second expression cassette
  • one of the expression cassettes of Z1 and Z2 has the structure of Ia or Ia', and the other expression cassette has the structure of formula Ib:
  • P1, S1, X1, L1, X2, X4, L2, X3, P2, Y1 are elements used to form the construct, respectively;
  • P1 is a first promoter, and the first promoter is an RNA polymerase II dependent promoter;
  • S1 is the coding sequence of the first nuclear localization signal
  • X1 is the coding sequence of adenine deaminase (such as wild-type and/or mutant TadA) and/or the coding sequence of cytosine deaminase;
  • L1 is absent or is the coding sequence of the first connecting peptide
  • X2 is the coding sequence of Cas9 nuclease, which has no cleavage activity or single-stranded cleavage activity;
  • X4 is absent or is the coding sequence of the uracil glycosylase inhibitor UGI;
  • L2 is absent or is the coding sequence of the second connecting peptide
  • X3 is the coding sequence of the second nuclear localization signal
  • P2 is the second promoter
  • Y1 is the coding sequence of gRNA
  • each "-" is a bond or a nucleotide connection sequence
  • the additional condition is that when X1 is the coding sequence of adenine deaminase, X4 is absent, and when X1 is the coding sequence of cytosine deaminase, X4 is the coding sequence of the uracil glycosylase inhibitor UGI.
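  • The X1/X4 condition above can be expressed as a small consistency check. This is a minimal sketch, not part of the patent: the class `FusionCassette`, its field names and the function `satisfies_x1_x4_condition` are hypothetical, and the sketch only covers the two cases the condition spells out (X1 encoding a single adenine deaminase or a single cytosine deaminase).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FusionCassette:
    """Hypothetical record of the two elements the condition constrains."""
    x1_deaminase: str          # "adenine" or "cytosine"
    x4_ugi: Optional[str]      # UGI coding sequence, or None when X4 is absent


def satisfies_x1_x4_condition(cassette: FusionCassette) -> bool:
    """Return True when the X1/X4 pairing matches the stated condition."""
    if cassette.x1_deaminase == "adenine":
        # Adenine deaminase: X4 must be absent.
        return cassette.x4_ugi is None
    if cassette.x1_deaminase == "cytosine":
        # Cytosine deaminase: X4 must be the UGI coding sequence.
        return cassette.x4_ugi is not None
    raise ValueError(f"unsupported X1 type: {cassette.x1_deaminase!r}")
```

  • In this sketch an adenine-deaminase cassette carrying a UGI element, or a cytosine-deaminase cassette without one, is reported as not satisfying the condition.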
  • the gRNA includes crRNA, tracrRNA, and sgRNA.
  • the first promoter is derived from one or more plants selected from the group consisting of corn, rice, soybean, Arabidopsis or tomato.
  • the first promoter is derived from one or more microorganisms selected from the group consisting of Streptomyces and Escherichia coli.
  • the first promoter is derived from one or more viruses selected from the group consisting of tobacco mosaic virus, yellow leaf curl virus, cauliflower mosaic virus, and cotton leaf curl virus.
  • the first promoter includes a maize ubiquitin promoter.
  • the ubiquitin promoter includes UBI promoter.
  • the first promoter is selected from the group consisting of UBI, UBQ, 35S, Actin, SPL, CmYLCV, YAO, CDC45, rbcS, rbcL, PsGNS2, UEP1, TobRB7, Cab, or a combination thereof.
  • the length of the nucleotide sequence of L1 and L2 is independently 3-120 nt, preferably 3-96 nt, and preferably a multiple of 3.
  • the lengths of the amino acid sequences encoded by L1 and L2 are independently 3-40aa, preferably 6-32aa, preferably 18-32aa, and preferably 24-32aa.
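  • The nucleotide and amino acid ranges for L1 and L2 are linked by the codon length: 120 nt / 3 = 40 aa and 96 nt / 3 = 32 aa. A minimal sketch of a length check follows. The function name `linker_length_ok` is hypothetical, and two choices are assumptions of the sketch rather than statements of the patent: the "multiple of 3" preference is treated as a requirement, and the broadest nucleotide range (3-120 nt) and amino acid range (3-40 aa) are applied together.

```python
def linker_length_ok(nt_length: int) -> bool:
    """Check a linker coding sequence length against the broadest stated ranges.

    Assumption: the length must be a multiple of 3 so the linker stays in frame.
    """
    if nt_length % 3 != 0:
        return False
    aa_length = nt_length // 3  # one codon encodes one amino acid
    return 3 <= nt_length <= 120 and 3 <= aa_length <= 40
```

  • Under these assumptions a 3 nt linker passes the nucleotide range but encodes only one amino acid, so the combined check rejects it; the shortest accepted length is 9 nt (3 aa).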
  • the length of the nucleotide linking sequence is 1-300 nt, preferably 1-100 nt.
  • nucleotide linking sequence does not affect the normal transcription and translation of each element.
  • the Cas9 nuclease is selected from the group consisting of nCas9, Cas9NG, nCas9NG, or a combination thereof.
  • the Cas9 nuclease is selected from the group consisting of nSpCas9 (D10A), nSpCas9NG, nSaCas9, nScCas9, nSqCas9, nXCas9, or a combination thereof.
  • the Cas9 nuclease includes a mutant Cas9 nuclease.
  • the identity of the Cas9 nuclease and the mutant Cas9 nuclease is ≥80%, preferably ≥90%, more preferably ≥95%, and still more preferably ≥98% or 99%.
  • the activity of the mutant Cas9 nuclease is equivalent to or significantly better than that of the wild-type Cas9 enzyme.
  • the mutant Cas9 nuclease is formed from the wild-type Cas9 nuclease by substitution or deletion of one or more (preferably 1-15, preferably 1-10, preferably 1-7, more preferably 2-5) amino acids, and/or by addition of 1-5 (preferably 1-4, more preferably 1-3, most preferably 1-2) amino acids.
  • the mutation is D10A in the Cas9 nuclease, and the amino acid sequence of the mutant is shown in SEQ ID NO.: 2.
  • the X2 element is mutated at one or more sites selected from the following group in the Cas9 nuclease corresponding to SEQ ID NO.: 2:
  • Threonine (T) at position 1337.
  • the arginine (R) at position 1335 is mutated to alanine (A); and/or
  • Leucine (L) at position 1111 is mutated to arginine (R); and/or
  • Threonine (T) at position 1337 is mutated to arginine (R).
  • the mutation is selected from the following group: R1335A; L1111R; D1135V; G1218R; E1219F; A1322R; T1337R.
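  • The mutation labels above follow the usual "original residue, position, new residue" notation (for example, R1335A means arginine at position 1335 replaced by alanine). A minimal sketch of parsing such labels and applying them to a protein sequence is given below. The helper names `parse_mutation` and `apply_mutations` are hypothetical, positions are taken as 1-based, and the sequences of SEQ ID NO.: 2 and 3 are not reproduced here, so any real use would need the actual sequence as input.

```python
import re

# One-letter original residue, 1-based position, one-letter new residue.
_MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(label: str) -> tuple[str, int, str]:
    """Split a label such as 'R1335A' into ('R', 1335, 'A')."""
    match = _MUTATION.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_mutations(protein: str, labels: list[str]) -> str:
    """Apply substitutions, checking that each original residue is present."""
    residues = list(protein)
    for label in labels:
        original, position, new = parse_mutation(label)
        if position < 1 or position > len(residues):
            raise ValueError(f"{label}: position outside the sequence")
        if residues[position - 1] != original:
            raise ValueError(f"{label}: found {residues[position - 1]!r} instead")
        residues[position - 1] = new
    return "".join(residues)
```

  • Checking the original residue before substituting guards against applying a label to a sequence with different numbering, which is a common source of error when variants are described relative to a reference such as SEQ ID NO.: 2.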
  • amino acid sequence of the X2 element is shown in SEQ ID NO.: 3.
  • the X2 element is derived from bacteria.
  • the source of the X2 element is selected from the group consisting of Streptococcus pyogenes, Staphylococcus aureus, Streptococcus canis, or a combination thereof.
  • the coding sequence of the first connecting peptide and the coding sequence of the second connecting peptide each independently include XTEN.
  • the coding sequence of the first connecting peptide and the coding sequence of the second connecting peptide are shown in SEQ ID NO.: 4 or 7.
  • the nuclear localization signal is a codon-optimized nuclear localization signal.
  • the nuclear localization signal is a plant codon optimized nuclear localization signal.
  • the nuclear localization signal includes bpNLS.
  • the nuclear localization signal is bpNLS.
  • nucleotide sequences of the S1 element and the X3 element are shown in SEQ ID NO.: 5 or 20, respectively.
  • the adenine deaminase includes wild type and mutant type.
  • the adenine deaminase includes wild-type and/or mutant TadA.
  • the adenine deaminase includes TadA.
  • the mutant type of adenine deaminase includes TadA7-10.
  • the adenine deaminase is a tandem adenine deaminase, and the structure of the tandem adenine deaminase is shown in formula II:
  • Z8 is the amino acid sequence of wild-type adenine deaminase TadA
  • L8 is an optional connecting peptide sequence
  • Z9 is the amino acid sequence of the mutant adenine deaminase TadA7-10.
  • the adenine deaminase is a codon-optimized adenine deaminase.
  • the adenine deaminase is a plant codon optimized adenine deaminase.
  • the coding sequence of the adenine deaminase is selected from the following group:
  • the coding sequence of the adenine deaminase is shown in SEQ ID NO.1.
  • amino acid sequence of the adenine deaminase is shown in SEQ ID NO.: 8.
  • the cytosine deaminase includes wild type and mutant type.
  • the cytosine deaminase includes APOBEC.
  • the APOBEC is selected from the following group: APOBEC1 (A1), APOBEC2 (A2), APOBEC3A, APOBEC3B, APOBEC3C, APOBEC3D, APOBEC3E, APOBEC3F, APOBEC3H, APOBEC4 (A4), activation-induced cytidine deaminase (AID), or a combination thereof.
  • the mutants of the cytosine deaminase include CBE2.0, CBE2.1, CBE2.2, CBE2.3, and CBE2.4.
  • the cytosine deaminase is a codon-optimized cytosine deaminase.
  • the cytosine deaminase is a plant codon optimized cytosine deaminase.
  • the coding sequence of the cytosine deaminase is selected from the following group:
  • the coding sequence of the cytosine deaminase is shown in SEQ ID NO.19.
  • amino acid sequence of the cytosine deaminase is shown in SEQ ID NO.:21.
  • nucleotide sequence of the X4 element is shown in SEQ ID NO.: 22.
  • the second promoter is derived from one or more plants selected from the group consisting of rice, corn, soybean, Arabidopsis or tomato.
  • the second promoter is derived from one or more microorganisms selected from the group consisting of Streptomyces and Escherichia coli.
  • the second promoter is derived from one or more viruses selected from the group consisting of tobacco mosaic virus, yellow leaf curl virus, cauliflower mosaic virus, and cotton leaf curl virus.
  • the second promoter includes an RNA polymerase III-dependent promoter.
  • the second promoter is an RNA polymerase III-dependent promoter.
  • the second promoter is selected from the group consisting of U6, U3, U6a, U6b, U6c, U6-1, U3b, U3d, U6-26, U6-29, H1, or a combination thereof.
  • the second promoter includes U6 promoter.
  • the second promoter is selected from the group consisting of OsU6, OsU3, OsU6a, OsU6b, OsU6c, AtU6-1, AtU3b, AtU3d, AtU6-26, AtU6-29, or a combination thereof.
  • the "no cleavage activity or single-strand cleavage activity" refers to the Cas9 nuclease having no cleavage activity for the single-stranded target site T.
  • the "no cleavage activity or single-strand cleavage activity" refers to the Cas9 nuclease has no cleavage activity for the single-stranded target site G.
  • nucleotide elements of the present invention are connected in-frame to express a fusion protein with the correct amino acid sequence.
  • the construct has a structure of formula IIa or IIa' or formula IIb or IIb':
  • both the first expression cassette and the second expression cassette have a terminator.
  • first expression cassette and the second expression cassette share the same terminator.
  • the terminator includes a terminator suitable for plant gene editing.
  • the terminator is selected from the group consisting of NOS, Poly A, T-UBQ, rbcS, or a combination thereof.
  • nucleotide sequence of the terminator is shown in SEQ ID NO.:6.
  • the first integration element includes a 5' homology arm sequence.
  • the second integration element includes a 3' homology arm sequence.
  • the length of the nucleic acid construct is 3000-10000 bp, preferably 4000-8500 bp, more preferably 4000-6000 bp.
  • one or more additional expression cassettes are additionally inserted.
  • the additional expression cassette is independent of the first expression cassette and the second expression cassette.
  • the additional expression cassette expresses a substance selected from the following group:
  • the marker gene includes a resistance gene (such as a hygromycin resistance gene, a herbicide resistance gene), a fluorescent gene, or a combination thereof.
  • the second aspect of the present invention provides a fusion protein, the fusion protein includes (a) adenine deaminase and/or cytosine deaminase; and (b) Cas9 nuclease, and the fusion protein is encoded by the nucleic acid construct described in the first aspect of the present invention.
  • the third aspect of the present invention provides a vector containing the nucleic acid construct according to the first aspect of the present invention.
  • the vector is a plant expression vector.
  • the vector is an expression vector that can be transfected or transformed into plant cells.
  • the vector is an Agrobacterium Ti vector.
  • the construct is integrated into the T-DNA region of the vector.
  • the vector is circular or linear.
  • the fourth aspect of the present invention provides a genetically engineered cell containing the nucleic acid construct according to the first aspect of the present invention, or its genome integrates one or more nucleic acid constructs according to the first aspect of the present invention.
  • the cell is a plant cell.
  • the plant is selected from the group consisting of monocots, dicots, gymnosperms, or combinations thereof.
  • the plant is selected from the group consisting of gramineous plants, legumes, cruciferous plants, Solanaceae, Umbelliferae, or a combination thereof.
  • the plants include: Arabidopsis, wheat, barley, oats, corn, rice, sorghum, millet, soybean, peanut, tobacco, tomato, cabbage, rape, spinach, lettuce, cucumber, chrysanthemum, water spinach, celery, lettuce, or combinations thereof.
  • the genetically engineered cell is obtained by introducing the nucleic acid construct of claim 1 into the cell by a method selected from the group consisting of: Agrobacterium transformation method, gene gun method, microinjection method, electroporation method, ultrasonic method and polyethylene glycol (PEG) mediation method.
  • the fifth aspect of the present invention provides a reagent combination for gene editing, including:
  • P1 is a first promoter, and the first promoter is an RNA polymerase II dependent promoter;
  • S1 is the coding sequence of the first nuclear localization signal
  • X1 is the coding sequence of adenine deaminase (such as wild-type and/or mutant TadA) and/or the coding sequence of cytosine deaminase;
  • L1 is absent or is the coding sequence of the first connecting peptide
  • X2 is the coding sequence of Cas9 nuclease, said Cas9 nuclease has no cleavage activity or single-stranded cleavage activity;
  • X4 is absent or is the coding sequence of the uracil glycosylase inhibitor UGI;
  • L2 is absent or is the coding sequence of the second connecting peptide
  • X3 is the coding sequence of the nuclear localization signal
  • the additional condition is that when X1 is the coding sequence of adenine deaminase, X4 is absent, and when X1 is the coding sequence of cytosine deaminase, X4 is the coding sequence of the uracil glycosylase inhibitor UGI; and
  • P2 is the second promoter
  • Y1 is the coding sequence of gRNA
  • the first vector and the second vector are different vectors.
  • first nucleic acid construct and the second nucleic acid construct are located on different vectors.
  • the first vector and the second vector are the same vector.
  • first nucleic acid construct and the second nucleic acid construct are located on the same vector.
  • the sixth aspect of the present invention provides a kit containing the reagent combination according to the fifth aspect of the present invention.
  • the kit further contains a label or instructions.
  • the seventh aspect of the present invention provides a method for gene editing of plants, including the steps:
  • the nucleic acid construct according to the first aspect of the present invention, the vector according to the third aspect of the present invention, or the reagent combination according to the fifth aspect of the present invention is introduced into a plant cell of the plant to be edited, so that gene editing takes place in the plant cell.
  • the introduction is by Agrobacterium.
  • the introduction is by gene gun introduction.
  • the gene editing is site-directed base substitution (or mutation).
  • the site-directed substitution includes mutating A to G.
  • the site-directed substitution includes mutating C to T.
  • the plants include any higher plant types that can be transformed, including monocots, dicots and gymnosperms.
  • the plant is selected from the group consisting of gramineous plants, legumes, cruciferous plants, Solanaceae, Umbelliferae, or a combination thereof.
  • the plants include: Arabidopsis, wheat, barley, oats, corn, rice, sorghum, millet, soybean, peanut, tobacco, tomato, cabbage, rape, spinach, lettuce, cucumber, chrysanthemum, water spinach, celery, lettuce, or combinations thereof.
  • the eighth aspect of the present invention provides a method for preparing gene-edited plant cells, including the steps:
  • the nucleic acid construct of the first aspect of the present invention, the vector of the third aspect of the present invention, or the reagent combination of the fifth aspect of the present invention is transfected into plant cells, so that the chromosomes in the plant cells undergo site-directed substitution (or mutation), thereby preparing the gene-edited plant cell.
  • the transfection adopts the Agrobacterium transformation method or the gene gun bombardment method.
  • the ninth aspect of the present invention provides use of the nucleic acid construct according to the first aspect of the present invention, the fusion protein according to the second aspect of the present invention, the vector according to the third aspect of the present invention, the genetically engineered cell according to the fourth aspect of the present invention, the reagent combination according to the fifth aspect of the present invention, or the kit according to the sixth aspect of the present invention for gene editing of plants.
  • the tenth aspect of the present invention provides a method for preparing a gene-edited plant, including the steps:
  • the gene-edited plant cell prepared by the method of the eighth aspect of the present invention or the ninth aspect of the present invention is regenerated into a plant body, thereby obtaining the gene-edited plant.
  • the eleventh aspect of the present invention provides a gene-edited plant prepared by the method of the tenth aspect of the present invention.
  • the twelfth aspect of the present invention provides a complex comprising the following two components:
  • a nucleic acid component, wherein the nucleic acid is a gRNA
  • Figure 1 shows the editing efficiency of ABE-nCas9 and ABEmax-nCas9 on ALS genes.
  • Figure 2 shows the editing efficiency of ABE-nCas9, ABEmax-nCas9 and ABEmax-nCas9NG on ALS genes at different PAM sites.
  • Figure 3 shows the base editing efficiency of CBE2.0-nCas9 for replacing the target site C with T in the rice NRT1.1B gene.
  • the abscissa represents the sgRNA-PAM sequence, and the ordinate represents the substitution efficiency.
  • Figure 4 shows the ratio of homozygous to other non-homozygous in rice NRT1.1B gene mutant plants.
  • Figures 5 and 6 show the difference in traits between rice SLR1 gene mutant plants and wild-type plants.
  • Figure 7 shows the base editing efficiency of CBE2.0-nCas9 for replacing the target site C of the rice SLR1 gene with T.
  • the abscissa represents the sgRNA-PAM sequence, and the ordinate represents the substitution efficiency.
  • Figure 8 shows the ratio of homozygous to other non-homozygous mutation types in rice SLR1 gene mutants.
  • Figure 9 shows the base editing efficiency of CBE2.0-nCas9 for replacing the target site C with T in the rice ALS gene.
  • Figure 10 shows the ratio of homozygous to other non-homozygous plants in mutant plants of the rice ALS gene.
  • Figure 11 shows the base editing efficiency of CBE2.0-nCas9NG for replacing the target site C with T in rice ALS gene.
  • Figure 12 shows the difference in the growth phenotype of rice ALS gene mutant plants and wild-type plants sprayed with imidazolium herbicide.
  • Figure 13 shows the mutation sites of rice ALS gene mutant plants relative to wild-type plants.
  • Figure 14 shows the mutation sites of the rice EPSPS gene.
  • Figure 15 shows the editing efficiency of CBE2.0-nCas9 and CBE2.0-nCas9NG for different target genes.
  • After extensive and in-depth research, the inventors optimized the quality and quantity of the adenine base editor elements and/or cytosine base editor elements and, using dual nuclear localization signals, optimized adenine deaminase and/or cytosine deaminase, and different Cas9 nucleases, constructed an adenine and/or cytosine base editing tool that is more efficient and has a wider recognition range in plant gene editing.
  • the present invention achieves, for the first time in plants, sgRNA-guided site-directed base mutation (such as A to G or C to T). The mutation efficiency of the adenine base editor element is very high (≥40% or higher), more PAM sites can be recognized, and at the same time the indel ratio is very low.
  • the inventors have also significantly improved the mutation efficiency (≥80% or higher) by optimizing the cytosine deaminase. By optimizing the Cas9 protein, the editor can recognize more PAM sites (including NGG and NG), and the indel ratio is very low (≤7%), which improves the accuracy of editing.
  • the present invention has been completed on this basis.
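  • The effect of a relaxed PAM (NG instead of NGG) on the editable range can be illustrated by scanning a sequence for candidate sites. This is a minimal sketch under stated assumptions, not the inventors' method: it scans only the strand supplied, assumes a 20 nt protospacer lying immediately 5' of the PAM, and the names `pam_matches` and `find_target_sites` are hypothetical.

```python
def pam_matches(pattern: str, candidate: str) -> bool:
    """Compare a PAM pattern with a candidate, treating 'N' as any base."""
    return len(pattern) == len(candidate) and all(
        p == "N" or p == c for p, c in zip(pattern, candidate)
    )


def find_target_sites(dna: str, pam_pattern: str, protospacer_len: int = 20):
    """Return (start, protospacer, pam) for each site on the given strand.

    Assumption: the PAM sits immediately 3' of the protospacer.
    """
    dna = dna.upper()
    sites = []
    last_start = len(dna) - protospacer_len - len(pam_pattern)
    for start in range(last_start + 1):
        pam_start = start + protospacer_len
        pam = dna[pam_start:pam_start + len(pam_pattern)]
        if pam_matches(pam_pattern, pam):
            sites.append((start, dna[start:pam_start], pam))
    return sites
```

  • Because every NGG site also contains an NG match, the set of candidate sites found with the pattern `"NG"` is at least as large as the set found with `"NGG"` on the same sequence.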
  • homology arm refers to the flanking sequences, located on both sides of the foreign sequence to be inserted on the targeting vector, that are identical to the genome sequence and are used to identify the region where recombination occurs.
  • plant promoter refers to a nucleic acid sequence capable of initiating transcription of nucleic acid in plant cells.
  • the plant promoter can be derived from plants, microorganisms (such as bacteria, viruses), animals, etc., or synthetic or engineered promoters.
  • the term "gene editing" or "base mutation" or "base editing" refers to a substitution, insertion, and/or deletion of a base at a certain position in a nucleotide sequence.
  • the "edit” or “mutation” in the present invention is preferably a single base mutation.
  • base substitution refers to the mutation of a base at a certain position in the nucleotide sequence to another different base, such as the mutation of A to G.
  • A.T to G.C refers to the mutation or replacement of an A-T base pair at a certain position with a G-C base pair in a double-stranded nucleic acid sequence (especially a genomic sequence).
  • C.G to T.A refers to the mutation or replacement of a C-G base pair at a certain position with a T-A base pair in a double-stranded nucleic acid sequence (especially a genomic sequence).
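  • The two base-pair conversions defined above look different depending on which strand is read: an A.T to G.C conversion appears as A to G when the A lies on the strand being read and as T to C when the A lies on the complementary strand. A minimal sketch follows; the names `PAIR`, `CONVERSIONS` and `convert_pair` are hypothetical and the four-base example sequence used in testing is illustrative only.

```python
# Watson-Crick partners for DNA bases.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Conversion name -> (base that is deaminated, base it is read as afterwards).
CONVERSIONS = {"A.T to G.C": ("A", "G"), "C.G to T.A": ("C", "T")}


def convert_pair(top_strand: str, index: int, conversion: str):
    """Apply a base-pair conversion at a 0-based index of the top strand.

    Returns (new_top, new_bottom); the bottom strand is written base-by-base
    under the top strand, i.e. in 3' to 5' orientation.
    """
    source, product = CONVERSIONS[conversion]
    base = top_strand[index]
    if base == source:            # edited base lies on the top strand
        new_base = product
    elif base == PAIR[source]:    # edited base lies on the bottom strand
        new_base = PAIR[product]
    else:
        raise ValueError(f"{conversion} does not apply to base {base!r}")
    new_top = top_strand[:index] + new_base + top_strand[index + 1:]
    new_bottom = "".join(PAIR[b] for b in new_top)
    return new_top, new_bottom
```

  • This is why sequencing reads from plants edited with an adenine base editor may show either A>G or T>C changes relative to the reference strand, and reads from cytosine base editing may show either C>T or G>A.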
  • Cas protein refers to a nuclease.
  • a preferred Cas protein is the Cas9 protein.
  • Typical Cas9 proteins include (but are not limited to): Cas9 derived from Streptococcus pyogenes.
  • the Cas9 protein is a mutant Cas9 protein, specifically, a mutant Cas9 protein that has no cleavage activity or only a single-stranded cleavage activity.
  • the Cas9 protein of the present invention includes SpCas9n (D10A), nSpCas9NG, SaCas9n, ScCas9n, XCas9n.
  • the term "coding sequence of Cas protein” refers to a nucleotide sequence encoding Cas protein.
  • the skilled person will realize that, because of the degeneracy of the codon, a large number of polynucleotide sequences can encode the same polypeptide.
  • technicians will also realize that different species have certain preferences for codons, and the codons of the Cas protein may be optimized according to the needs of expression in different species. These variants are all specifically covered by the term "coding sequence of Cas protein".
  • the term specifically includes a full-length sequence that is substantially the same as the Cas gene sequence, and a sequence encoding a protein that retains the function of the Cas protein.
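  • The degeneracy argument above can be made concrete by counting how many distinct coding sequences encode a given peptide. This is a minimal sketch using the standard genetic code (NCBI translation table 1); the names `CODON_TABLE`, `SYNONYMOUS_CODONS` and `count_coding_sequences` are hypothetical, and the count ignores stop codons and any species-specific codon preference.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, codons enumerated in T, C, A, G order per position.
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}

# Number of codons available for each amino acid (stop codons excluded).
SYNONYMOUS_CODONS = Counter(aa for aa in CODON_TABLE.values() if aa != "*")


def count_coding_sequences(peptide: str) -> int:
    """Number of distinct nucleotide sequences that encode the peptide."""
    for aa in peptide:
        if aa not in SYNONYMOUS_CODONS:
            raise ValueError(f"unknown amino acid: {aa!r}")
    return prod(SYNONYMOUS_CODONS[aa] for aa in peptide)
```

  • Since the count is a product over residues, it grows exponentially with peptide length, which is what leaves room for codon optimization of a protein as large as Cas9 without changing its amino acid sequence.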
  • gRNA is also called guide RNA, and has the meaning commonly understood by those skilled in the art.
  • guide RNAs can include direct repeats and guide sequences, or consist essentially of direct repeats and guide sequences (also called spacers in the context of endogenous CRISPR systems).
  • gRNA can include crRNA and tracrRNA, or only crRNA, depending on the Cas protein used.
  • crRNA and tracrRNA can be artificially modified and fused to form single guide RNA (sgRNA).
  • the gRNA of the present invention may be natural, or artificially modified or designed and synthesized.
  • the targeting sequence is any polynucleotide sequence that has sufficient complementarity with the target sequence to hybridize with the target sequence and guide the specific binding of the CRISPR/Cas complex to the target sequence, usually with a sequence length of 17-23 nt.
  • the degree of complementarity between the targeting sequence and its corresponding target sequence is at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or at least 99%. Determining the best alignment is within the abilities of those of ordinary skill in the art. For example, there are published and commercially available alignment algorithms and programs, such as but not limited to ClustalW, the Smith-Waterman algorithm in MATLAB, Bowtie, Geneious, Biopython, and SeqMan.
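  • The degree of complementarity described above can be estimated for the simplest case, an ungapped comparison of two sequences of equal length. This is a minimal sketch and deliberately does not use any of the alignment programs named in the text; the names `reverse_complement`, `percent_complementarity` and `has_typical_length` are hypothetical, both sequences are written 5' to 3', and the target is the strand that the targeting sequence hybridizes with.

```python
# Complement map; U is accepted so that an RNA targeting sequence can be used.
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def reverse_complement(seq: str) -> str:
    """Return the DNA reverse complement of a DNA or RNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def percent_complementarity(targeting_seq: str, target_strand: str) -> float:
    """Percentage of positions that base-pair in an ungapped antiparallel duplex."""
    if len(targeting_seq) != len(target_strand):
        raise ValueError("sequences must have equal length (no gaps supported)")
    perfect_partner = reverse_complement(targeting_seq)
    paired = sum(a == b for a, b in zip(perfect_partner, target_strand.upper()))
    return 100.0 * paired / len(targeting_seq)


def has_typical_length(targeting_seq: str) -> bool:
    """True when the length falls in the 17-23 nt range given in the text."""
    return 17 <= len(targeting_seq) <= 23
```

  • A single mismatch in a 20 nt targeting sequence gives 95% in this sketch; sequences that need gaps or differ in length would call for a proper alignment tool rather than this position-by-position comparison.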
  • nucleotide sequence is from 5' to 3', unless otherwise specified.
  • adenine deaminase refers to TadA adenine deaminase, derived from Escherichia coli, originally acting on tRNA and capable of deaminating specific adenines in tRNA.
  • the applicable TadA includes both the wild-type form and its specific mutant form TadA7-10, or a combination of the wild-type form and the mutant form.
  • TadA7-10 can perform deamination reaction with DNA as a substrate.
  • the coding sequence of the adenine deaminase of the present invention is codon-optimized, so that it can be expressed in plants more efficiently.
  • cytosine deaminase refers to the cytosine deaminase APOBEC, derived from Escherichia coli, originally acting on tRNA, capable of deaminating specific cytosines in tRNA.
  • the applicable cytosine deaminase includes both wild-type and specific mutant forms (such as CBE2.0, CBE2.1, CBE2.2, CBE2.3, CBE2.4), or contains a combination of wild-type and mutant forms.
  • the mutant form of cytosine deaminase can perform deamination reaction with DNA as a substrate.
  • the coding sequence of the cytosine deaminase of the present invention is codon-optimized, so that it can be expressed in plants more efficiently.
  • the preferred cytosine deaminase is CBE2.0.
  • amino acid sequence of CBE2.0 is shown in SEQ ID NO.: 23.
  • the amino acid sequence of CBE2.1 is shown in SEQ ID NO.: 24.
  • the amino acid sequence of CBE2.2 is shown in SEQ ID NO.: 25.
  • amino acid sequence of CBE2.3 is shown in SEQ ID NO.: 26.
  • the amino acid sequence of CBE2.4 is shown in SEQ ID NO.: 27.
  • the present invention provides a nucleic acid construct for gene editing of plants; the nucleic acid construct has, from 5' to 3', the structure of formula I: I1-Z1-Z2-I2 (I)
  • I1 is the first integration element
  • I2 is the second integration element
  • Z1 is the first expression cassette
  • Z2 is the second expression cassette
  • one of the expression cassettes Z1 and Z2 has the structure of formula Ia or Ia', and the other has the structure of formula Ib: P1-S1-X1-L1-X2-X4-L2-X3 (Ia) or P1-S1-X2-L1-X1-X4-L2-X3 (Ia'); P2-Y1 (Ib)
  • I1, P1, S1, X1, L1, X2, X4, L2, X3, P2, Y1, and I2 are each elements that make up the construct, and are defined as described in the first aspect of the present invention
  • each "-" is a bond or a nucleotide linking sequence
  • with the proviso that when X1 is the coding sequence of adenine deaminase, X4 is absent, and when X1 is the coding sequence of cytosine deaminase, X4 is the coding sequence of the uracil glycosylase inhibitor UGI.
  • the I1 element (the left integration element) and the I2 element (the right integration element) act together to integrate the elements located between them (that is, the nucleotide sequence from P1 to Y1) into the genome of plant cells.
  • I1 and I2 are Ti elements from Agrobacterium. Of course, other elements that can play a similar integration role can also be used in the present invention.
  • the various elements used in the construct of the present invention are either known in the art or can be prepared by methods known to those skilled in the art.
  • the corresponding elements can be obtained by conventional methods, such as PCR methods, fully artificial chemical synthesis methods, and enzyme digestion methods, and then connected together by well-known DNA ligation techniques to form the construct of the present invention.
  • Inserting the construct of the present invention into an exogenous vector constitutes the vector of the present invention.
  • the vector of the present invention is transformed into plant cells so as to mediate the integration of the vector of the present invention into the chromosomes of plant cells and express in the plant body to prepare gene-edited plant cells.
  • the gene-edited plant cell of the present invention is regenerated into a plant body, thereby obtaining a gene-edited plant.
  • the nucleic acid construct of the present invention can be introduced into plant cells by conventional plant recombination techniques (for example, Agrobacterium-mediated transfer), yielding plant cells that carry the nucleic acid construct (or a vector bearing it) or plant cells that have the nucleic acid construct integrated into the genome.
  • in the progeny of a plant that has the nucleic acid construct integrated, the construct can be segregated away or removed by routine screening or other means known in the art, yielding a gene-edited plant that no longer contains the nucleic acid construct.
  • specifically, the present invention builds an optimized adenine deaminase expression cassette and/or cytosine deaminase expression cassette into the CRISPR/Cas9 system of plants.
  • Adenine deaminase deaminates adenine (A) in the target DNA to inosine (I). Inosine pairs with cytosine and is read and copied as guanine (G) at the DNA level, producing an A-to-G mutation within the mutation window;
  • Cytosine deaminase deaminates cytosine (C) in the target DNA to uracil (U), and uracil is recognized as T during DNA replication, producing a C-to-T mutation.
  • the basic structure of the nucleic acid construct of the adenine base editor element is as follows:
  • in the original ABE system, the deaminase expression cassette is ZmUbi-ecTadA-32aa linker-ecTadA(7.10)-32aa linker-nSpCas9/nSaCas9-SV40NLS-NOS
  • the basic structure of the nucleic acid construct of the cytosine base editor element is as follows:
  • in the original CBE system, the deaminase expression cassette is ZmUbi-rAPOBEC1-XTEN-nCas9-UGI-SV40NLS-NOS
  • the main feature of the vector is that the coding sequences of adenine deaminase and/or cytosine deaminase, of the Cas protein of the CRISPR/Cas system, and of the nuclear localization signal bpNLS are linked together to form the coding sequence of a fusion protein.
  • after the fusion protein encoded by this coding sequence is expressed in the cytoplasm, it is transferred to the nucleus very efficiently and guided to the target site in the genome by the guide RNA encoded by the formula I construct, where it performs an A.T to G.C or C.G to T.A base substitution while essentially avoiding or eliminating the risk of insertions/deletions.
  • the Cas protein is a mutant Cas protein with no cleavage activity or a single-strand cleavage activity.
  • the Cas protein of the present invention may be Cas9 (D10A), and its amino acid sequence is shown in SEQ ID NO.:2.
  • the Cas protein of the present invention may be nCas9NG, and its amino acid sequence is shown in SEQ ID NO.:3.
  • the proteins are usually connected by some flexible short peptides, namely Linker (connecting peptide sequence).
  • the Linker can use XTEN, and its coding sequence is shown in SEQ ID NO.:4.
  • suitable promoters include constitutive and/or inducible promoters.
  • a strong promoter suitable for plant cells can be selected. Representative examples include (but not limited to): CaMV 35S promoter or UBI promoter or Actin promoter.
  • once the adenine deaminase and/or cytosine deaminase has been guided to the target site by the CRISPR/Cas9 system, the region on which the deaminase acts is fixed.
  • for example, when (a) adenine deaminase TadA or the mutant adenine deaminase TadA7-10, and/or (b) cytosine deaminase APOBEC or the mutant cytosine deaminase (CBE2.0) is linked to the N-terminus of Cas9 through a 32-amino-acid XTEN linker, the experimental results obtained by the present invention show that the editing windows of the various Cas proteins of the present invention differ little: all lie within the 20 bases preceding the PAM site, and the preferred hot-spot region spans bases 3-10.
  • there is no particular limitation on the method for introducing the construct of formula I of the present invention into cells or integrating it into the genome. Conventional methods can be used; for example, the construct of formula I or the corresponding vector is introduced into plant cells by a suitable method.
  • Representative introduction methods include but are not limited to: Agrobacterium transfection method, gene gun method, microinjection method, electric shock method, ultrasonic method, and polyethylene glycol (PEG)-mediated method.
  • there is no particular limitation on the recipient plants, which include various crop plants (such as grasses), forestry plants, horticultural plants (such as floral plants) and the like.
  • Representative examples include, but are not limited to: rice, soybeans, tomatoes, corn, tobacco, wheat, sorghum, potatoes and the like.
  • after the DNA vector or fragment has been introduced into plant cells, the DNA in the transformed plant cell expresses the fusion protein and the gRNA.
  • under the guidance of the corresponding gRNA, the Cas protein fused to the deaminase mutates the A at the target position to G (and thereby the T of the complementary strand to C), or mutates the C at the target position to T (and thereby the G of the complementary strand to A).
  • codon optimization is performed on the coding sequence of adenine deaminase and/or cytosine deaminase, especially plant codon optimization.
  • codon optimization: the genetic code has 64 codons, but most organisms tend to use only some of them. Those used most frequently are called preferred or optimal codons, and those used infrequently are called rare or low-usage codons. In practice, every organism used for protein expression or production shows some degree of difference or preference in codon usage. Redesigning a gene so that it is synthesized with preferred codons and avoids low-usage or rare codons is called codon optimization.
  • codon optimization is performed on the sequences encoding adenine deaminase and/or cytosine deaminase and the nuclear localization signal, using codons preferred by eukaryotic cells, preferably codons preferred by animal cells, and more preferably codons preferred by plant cells.
  • the invention can be used in the field of plant genetic engineering, for plant research and breeding, especially for genetic improvement of agricultural crops, forestry crops or horticultural plants with economic value.
  • the present invention is the first to combine a Cas9 nuclease (such as nCas9 or nCas9NG) with an optimized adenine deaminase and a dual nuclear localization signal into a fusion protein, and it successfully achieves sgRNA-guided site-directed base mutation (such as A to G) in plants with very high mutation efficiency (up to ≥40% or higher), while significantly reducing or substantially eliminating the risk of insertions and/or deletions (indels) at the target site.
  • the present invention is the first to combine a Cas9 nuclease (such as nCas9 or nCas9NG) with an optimized cytosine deaminase, UGI, and a dual nuclear localization signal into a fusion protein, and it successfully achieves sgRNA-guided site-directed base mutation (such as C to T) in plants with very high mutation efficiency (up to ≥80% or higher), while significantly reducing or substantially eliminating the risk of insertions and/or deletions (indels) at the target site; the indel ratio can be reduced to less than 7%.
  • the present invention can expand the range of targeted editing in the plant genome by using different forms of Cas9.
  • the present invention finds for the first time that the nCas9NG of the present invention can recognize non-NGG PAMs (such as NG) and achieves very efficient base editing.
  • ALS-sg3 CGCATTCAAGGACATGATCCTGG (SEQ ID NO.: 9)
  • ALS-sg1 GCGCCCCCACTTGGGATCATAGG (SEQ ID NO.: 10)
  • the nucleic acid construct (pCambia1300) containing the base editor and hygromycin resistance gene was introduced into Agrobacterium EHA105, and then transformed into rice callus.
  • the transformation, tissue culture, and plant growth of rice were performed according to the procedures recorded in the literature. (Nishimura et al., 2006; Wang et al., 2015).
  • the efficiency of ABE-nCas9 to generate A to G mutations at the target site is about 20%, while the efficiency of ABEmax-nCas9 is about 40%-50%.
  • the specific results are shown in Figure 1.
  • the editing efficiency of ABEmax-nCas9 is as high as 40%, which is double the editing efficiency of ABE-nCas9.
  • the results show that the optimized ABEmax-nCas9 base editor can be efficiently applied to site-specific base replacement in plant genomes.
  • the Mut MultiS multi-site-directed mutagenesis kit was used to derive nSpCas9-NG from nSpCas9 (D10A); it contains 7 amino acid substitutions: R1335A/L1111R/D1135V/G1218R/E1219F/A1322R/T1337R.
  • the mutated sequence replaced the nSpCas9(D10A) fragment in the ABEmax-nCas9 editor with BamHI and SpeI restriction enzyme sites to obtain the ABEmax-nCas9NG editor.
  • Example 3 The base editing effect of CBE2.0-nCas9 on rice NRT1.1B and SLR1 genes
  • NRT1.1B and SLR1 genes were selected, and the previously tested sgRNA was used for editing.
  • NRT1.1B controls the absorption of nitrogen as a nutrient element in rice.
  • the change of amino acid at position 327 from Thr to Met can increase yield;
  • SLR1 controls the synthesis of gibberellin and affects rice plant height.
  • the sequence of OsU6-sgRNA (NRT1.1B) is shown in SEQ ID NO.: 28.
  • the sequence of OsU6-sgRNA (SLR1) is shown in SEQ ID NO.: 29.
  • the nucleic acid construct (pCambia1300) containing the base editor and hygromycin resistance gene was introduced into Agrobacterium EHA105, and then transformed into rice callus.
  • the transformation, tissue culture, and plant growth of rice were performed according to the procedures recorded in the literature. (Nishimura et al., 2006; Wang et al., 2015).
  • the first-generation CBE base editor achieved base substitution efficiencies of only 2.7% (NRT1.1B) and 13.3% (SLR1) at these two sites (Lu and Zhu, 2017); by comparison, the optimized CBE base editor of the present invention is 26-fold and 6-fold more efficient, respectively.
  • Example 4 The base editing effect of CBE2.0-nCas9 on rice acetolactate synthase gene (ALS)
  • Acetolactate synthase is a key enzyme for the synthesis of valine, leucine and isoleucine in plants.
  • ALS inhibitors are often used as herbicides. However, in addition to inhibiting the growth of weeds, these herbicides can also inhibit the growth of crops. Studies have shown that mutating Ser at position 627 of the ALS protein sequence to Asn (corresponding to a G-to-A mutation at position 1880 of the DNA sequence) confers tolerance to imidazolinone herbicides (Piao et al., 2018).
  • Example 5 The base editing effect of CBE2.0-nCas9NG on acetolactate synthase gene (ALS) and 5-enolpyruvylshikimate-3-phosphate synthase gene (EPSPS) in rice
  • the Mut MultiS multi-site-directed mutagenesis kit (purchased from Vazyme, Nanjing) was used to derive nSpCas9-NG from nSpCas9 (D10A); it contains 7 amino acid substitutions: R1335A/L1111R/D1135V/G1218R/E1219F/A1322R/T1337R.
  • the mutated sequence replaced the nSpCas9(D10A) fragment in the CBE2.0-nCas9 editor with BamHI and SpeI restriction enzyme sites to obtain the CBE2.0-nCas9NG editor.
  • CBE2.0-nCas9NG base editor can recognize the PAM motif of NGN.
  • the sgRNA sequence for ALS was redesigned (ALSsg2, with AGC as the PAM motif: CCCCACTTGGGATCATAGGCAGC (SEQ ID NO.: 44))
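
The position arithmetic behind the Ser627Asn and Thr327Met examples mentioned above can be checked with a short Python sketch. It assumes that position 1880 is counted from the first base of the coding sequence, and it uses only standard codon assignments (Ser AGT/AGC, Asn AAT/AAC, Thr ACG, Met ATG). Which serine codon the rice ALS gene actually carries is not stated in the text, so both are shown; the function names are illustrative.

```python
# Minimal sketch: map a 1-based coding-sequence position to its codon,
# then show which single-base changes give the amino acid changes in the text.

# Only the codons needed here (standard genetic code).
CODONS = {"AGT": "Ser", "AGC": "Ser", "AAT": "Asn", "AAC": "Asn",
          "ACG": "Thr", "ATG": "Met"}


def codon_of(nt_position: int) -> tuple[int, int]:
    """Return (codon number, position within codon), both 1-based."""
    if nt_position < 1:
        raise ValueError("positions are 1-based")
    return (nt_position - 1) // 3 + 1, (nt_position - 1) % 3 + 1


def substitute(codon: str, pos_in_codon: int, new_base: str) -> str:
    """Replace one base of a codon (pos_in_codon is 1-based)."""
    return codon[:pos_in_codon - 1] + new_base + codon[pos_in_codon:]


codon_number, pos_in_codon = codon_of(1880)
print(codon_number, pos_in_codon)  # 627 2

# G -> A at the second codon position turns either assumed Ser codon into Asn.
for ser_codon in ("AGT", "AGC"):
    edited = substitute(ser_codon, pos_in_codon, "A")
    print(ser_codon, "->", edited, CODONS[edited])

# C -> T at the second position of ACG gives ATG (Thr -> Met).
print(CODONS[substitute("ACG", 2, "T")])
```

Because Met has a single codon (ATG), a Thr-to-Met change by one C-to-T edit implies the threonine codon is ACG; the serine case is compatible with either AGT or AGC.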

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Organic Chemistry (AREA)
  • Engineering & Computer Science (AREA)
  • Zoology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Wood Science & Technology (AREA)
  • Biomedical Technology (AREA)
  • Biotechnology (AREA)
  • Biochemistry (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Developmental Biology & Embryology (AREA)
  • Microbiology (AREA)
  • Environmental Sciences (AREA)
  • Physiology (AREA)
  • Botany (AREA)
  • Medicinal Chemistry (AREA)
  • Natural Medicines & Medicinal Plants (AREA)
  • Biophysics (AREA)
  • Cell Biology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Physics & Mathematics (AREA)
  • Plant Pathology (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)

Abstract

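Provided is a nucleic acid construct for gene editing. This nucleic acid construct, which has a specific structure, enables gRNA-guided site-directed base mutation in plants.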

Description

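A nucleic acid construct for gene editing
Technical Field
The present invention relates to the field of biotechnology and, in particular, to a nucleic acid construct for gene editing.
Background
Many trait differences in plants are caused by variation in one or a few DNA bases, and mutating certain specific bases can enhance, weaken, or suppress the expression of a trait. CRISPR-Cas9 gene editing technology has been widely used in gene editing research in animals and plants. Current base editors mainly comprise cytosine base editors (CBE) and adenine base editors (ABE), which can make precise base changes (such as substitutions) in the genome without causing DNA double-strand breaks (DSBs). The CBE and ABE systems developed early on, built respectively from the rat cytidine deaminase APOBEC1 or the lamprey-derived cytidine deaminase PmCDA1, and from TadA evolved from a tRNA adenine deaminase, have been applied in many plant species, such as rice, wheat, maize, tomato, Arabidopsis, and Brassica napus. To further improve base editing efficiency in plants, Kang et al. optimized the promoter and Zong et al. optimized the deaminase. To broaden the range of gene editing, Qin et al. and Hua et al. used Cas9 proteins other than SpCas9; these Cas9 variants recognize PAM sequences that differ from the canonical NGG motif. Nevertheless, compared with the gene knockout techniques now in wide use, the efficiency of single-base editing remains low. In addition, the base editors reported so far recognize only a limited number of PAM sequences, which greatly restricts the editable range in plant genomes.
Therefore, there is an urgent need in the art for a new nucleic acid construct for gene editing that achieves A-to-G conversion in plant cells efficiently, precisely, and over a broad range, while significantly reducing the risk of insertion or deletion mutations.
Summary of the Invention
The object of the present invention is to provide a new nucleic acid construct for gene editing that achieves A-to-G conversion in plant cells efficiently, precisely, and over a broad range, while significantly reducing the risk of insertion or deletion mutations.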
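A first aspect of the present invention provides a nucleic acid construct having, from 5' to 3', the structure of formula I:
I1-Z1-Z2-I2 (I)
wherein
I1 is a first integration element;
I2 is a second integration element;
Z1 is a first expression cassette;
Z2 is a second expression cassette;
and one of the expression cassettes Z1 and Z2 has the structure of formula Ia or Ia', while the other has the structure of formula Ib:
P1-S1-X1-L1-X2-X4-L2-X3 (Ia) or
P1-S1-X2-L1-X1-X4-L2-X3 (Ia')
P2-Y1 (Ib);
wherein
P1, S1, X1, L1, X2, X4, L2, X3, P2, and Y1 are each elements that make up the construct;
P1 is a first promoter, the first promoter being an RNA polymerase II-dependent promoter;
S1 is the coding sequence of a first nuclear localization signal;
X1 is the coding sequence of an adenine deaminase (such as wild-type and/or mutant TadA) and/or the coding sequence of a cytosine deaminase;
L1 is absent or is the coding sequence of a first linker peptide;
X2 is the coding sequence of a Cas9 nuclease, the Cas9 nuclease having no cleavage activity or only single-strand cleavage activity;
X4 is absent or is the coding sequence of the uracil glycosylase inhibitor UGI;
L2 is absent or is the coding sequence of a second linker peptide;
X3 is the coding sequence of a second nuclear localization signal;
P2 is a second promoter;
Y1 is the coding sequence of a gRNA;
and each "-" is a bond or a nucleotide linking sequence;
with the proviso that when X1 is the coding sequence of an adenine deaminase, X4 is absent, and when X1 is the coding sequence of a cytosine deaminase, X4 is the coding sequence of the uracil glycosylase inhibitor UGI.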
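
The element order of formula I and its proviso can be sketched in Python as below. This is an illustration only: the class and function names are hypothetical, the elements are represented by their symbols rather than by sequences, and the sketch simplifies the text by treating X1 as either an adenine or a cytosine deaminase (not both) and by always including L1 and L2, which the text allows to be absent.

```python
from dataclasses import dataclass


@dataclass
class EditorCassette:
    """Formula Ia / Ia' cassette; element symbols follow the text."""
    deaminase: str            # "adenine" or "cytosine" (what X1 encodes)
    with_ugi: bool            # True if X4 (UGI coding sequence) is present
    cas9_first: bool = False  # False: Ia (X1 before X2); True: Ia' (X2 before X1)

    def elements(self) -> list[str]:
        if self.deaminase not in ("adenine", "cytosine"):
            raise ValueError("deaminase must be 'adenine' or 'cytosine'")
        # Proviso: adenine deaminase -> X4 absent; cytosine deaminase -> X4 = UGI.
        if self.with_ugi != (self.deaminase == "cytosine"):
            raise ValueError("X4 (UGI) must be present only with cytosine deaminase")
        core = ["X2", "L1", "X1"] if self.cas9_first else ["X1", "L1", "X2"]
        x4 = ["X4"] if self.with_ugi else []
        return ["P1", "S1", *core, *x4, "L2", "X3"]


def formula_i(cassette: EditorCassette, grna_first: bool = False) -> str:
    """Join I1, the two cassettes (in either order) and I2 with '-'."""
    editor = cassette.elements()
    grna = ["P2", "Y1"]  # formula Ib
    inner = grna + editor if grna_first else editor + grna
    return "-".join(["I1", *inner, "I2"])


print(formula_i(EditorCassette("cytosine", with_ugi=True)))
print(formula_i(EditorCassette("adenine", with_ugi=False), grna_first=True))
```

The first call prints the cytosine-editor layout with X4 present, and the second prints an adenine-editor layout with the gRNA cassette placed first and X4 omitted.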
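In another preferred embodiment, the gRNA includes crRNA, tracrRNA, or sgRNA.
In another preferred embodiment, the first promoter is derived from one or more plants selected from the group consisting of maize, rice, soybean, Arabidopsis, and tomato.
In another preferred embodiment, the first promoter is derived from one or more microorganisms selected from the group consisting of Streptomyces and Escherichia coli.
In another preferred embodiment, the first promoter is derived from one or more viruses selected from the group consisting of tobacco mosaic virus, yellow leaf curl virus, cauliflower mosaic virus, and cotton leaf curl virus.
In another preferred embodiment, the first promoter includes the maize ubiquitin promoter.
In another preferred embodiment, the ubiquitin promoter includes the UBI promoter.
In another preferred embodiment, the first promoter is selected from the group consisting of UBI, UBQ, 35S, Actin, SPL, CmYLCV, YAO, CDC45, rbcS, rbcL, PsGNS2, UEP1, TobRB7, Cab, or a combination thereof.
In another preferred embodiment, the nucleotide sequences of L1 and L2 are each independently 3-120 nt long, preferably 3-96 nt, and the length is preferably a multiple of 3.
In another preferred embodiment, the amino acid sequences encoded by L1 and L2 are each independently 3-40 aa long, preferably 6-32 aa, more preferably 18-32 aa, and most preferably 24-32 aa.
In another preferred embodiment, the nucleotide linking sequence is 1-300 nt long, preferably 1-100 nt.
In another preferred embodiment, the nucleotide linking sequence does not affect the normal transcription and translation of the elements.
In another preferred embodiment, the Cas9 nuclease is selected from the group consisting of nCas9, Cas9NG, nCas9NG, or a combination thereof.
In another preferred embodiment, the Cas9 nuclease is selected from the group consisting of nSpCas9(D10A), nSpCas9NG, nSaCas9, nScCas9, nSqCas9, nXCas9, or a combination thereof.
In another preferred embodiment, the Cas9 nuclease includes a mutated Cas9 nuclease.
In another preferred embodiment, the identity between the Cas9 nuclease and the mutated Cas9 nuclease is ≥80%, preferably ≥90%, more preferably ≥95%, and most preferably ≥98% or 99%.
In another preferred embodiment, the activity of the mutated Cas9 nuclease is comparable to, or significantly better than, that of the wild-type Cas9 enzyme.
In another preferred embodiment, the mutated Cas9 nuclease is formed from the wild-type Cas9 nuclease by substitution or deletion of one or more, preferably 1-15, preferably 1-10, preferably 1-7, more preferably 2-5 amino acids, and/or by addition of 1-5, preferably 1-4, more preferably 1-3, most preferably 1-2 amino acids.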
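In another preferred embodiment, in the X2 element the mutation is D10A of the Cas9 nuclease, and its amino acid sequence is shown in SEQ ID NO.: 2.
In another preferred embodiment, the X2 element is mutated at one or more positions of the Cas9 nuclease, corresponding to SEQ ID NO.: 2, selected from the group consisting of:
arginine (R) at position 1335;
leucine (L) at position 1111;
aspartic acid (D) at position 1135;
glycine (G) at position 1218;
glutamic acid (E) at position 1219;
alanine (A) at position 1322;
threonine (T) at position 1337.
In another preferred embodiment, arginine (R) at position 1335 is mutated to alanine (A); and/or
leucine (L) at position 1111 is mutated to arginine (R); and/or
aspartic acid (D) at position 1135 is mutated to valine (V); and/or
glycine (G) at position 1218 is mutated to arginine (R); and/or
glutamic acid (E) at position 1219 is mutated to phenylalanine (F); and/or
alanine (A) at position 1322 is mutated to arginine (R); and/or
threonine (T) at position 1337 is mutated to arginine (R).
In another preferred embodiment, the mutation is selected from the group consisting of R1335A; L1111R; D1135V; G1218R; E1219F; A1322R; T1337R.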
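
The substitution notation used above (wild-type residue, 1-based position, new residue) can be parsed and applied mechanically. The Python sketch below is illustrative only: the helper names are hypothetical, and the 12-residue peptide in the usage lines is a toy input rather than any SEQ ID NO. from this document. Checking the wild-type residue before substituting guards against numbering that is offset from the reference sequence.

```python
import re

# The seven substitutions listed in the text for the NG variant.
NG_SUBSTITUTIONS = ["R1335A", "L1111R", "D1135V", "G1218R",
                    "E1219F", "A1322R", "T1337R"]


def parse_substitution(code: str) -> tuple[str, int, str]:
    """Split e.g. 'R1335A' into ('R', 1335, 'A')."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code)
    if match is None:
        raise ValueError(f"not a substitution code: {code!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_substitutions(protein: str, codes: list[str]) -> str:
    """Apply each substitution after checking the expected wild-type residue."""
    residues = list(protein)
    for code in codes:
        wild_type, position, new = parse_substitution(code)
        if position > len(residues):
            raise ValueError(f"{code}: position beyond sequence length")
        if residues[position - 1] != wild_type:
            raise ValueError(f"{code}: found {residues[position - 1]} at {position}")
        residues[position - 1] = new
    return "".join(residues)


# Toy peptide (hypothetical input) with D at position 10.
toy = "MDKKYSIGLDIG"
print(apply_substitutions(toy, ["D10A"]))  # MDKKYSIGLAIG
print([parse_substitution(c) for c in NG_SUBSTITUTIONS][:2])
```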
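In another preferred embodiment, the amino acid sequence of the X2 element is shown in SEQ ID NO.: 3.
In another preferred embodiment, the X2 element is derived from bacteria.
In another preferred embodiment, the source of the X2 element is selected from the group consisting of Streptococcus pyogenes, Staphylococcus aureus, Streptococcus canis, or a combination thereof.
In another preferred embodiment, the coding sequence of the first linker peptide and the coding sequence of the second linker peptide each independently include XTEN.
In another preferred embodiment, the coding sequence of the first linker peptide and the coding sequence of the second linker peptide are shown in SEQ ID NO.: 4 or 7.
In another preferred embodiment, the nuclear localization signal is a codon-optimized nuclear localization signal.
In another preferred embodiment, the nuclear localization signal is a plant codon-optimized nuclear localization signal.
In another preferred embodiment, the nuclear localization signal includes bpNLS.
In another preferred embodiment, the nuclear localization signal is bpNLS.
In another preferred embodiment, the nucleotide sequences of the S1 element and the X3 element are each shown in SEQ ID NO.: 5 or 20.
In another preferred embodiment, the adenine deaminase includes wild-type and mutant forms.
In another preferred embodiment, the adenine deaminase includes wild-type and/or mutant TadA.
In another preferred embodiment, the adenine deaminase includes TadA.
In another preferred embodiment, the mutant form of the adenine deaminase includes TadA7-10.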
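In another preferred embodiment, the adenine deaminase is a tandem adenine deaminase having the structure of formula II:
Z8-L8-Z9 (II)
wherein
Z8 is the amino acid sequence of the wild-type adenine deaminase TadA;
L8 is an optional linker peptide sequence;
Z9 is the amino acid sequence of the mutant adenine deaminase TadA7-10.
In another preferred embodiment, the adenine deaminase is a codon-optimized adenine deaminase.
In another preferred embodiment, the adenine deaminase is a plant codon-optimized adenine deaminase.
In another preferred embodiment, the coding sequence of the adenine deaminase is selected from the group consisting of:
(i) a polynucleotide whose sequence is shown in SEQ ID NO.: 1;
(ii) a polynucleotide whose nucleotide sequence has ≥75% (preferably ≥85%, more preferably ≥90%, ≥95%, ≥98%, or ≥99%) homology to the sequence shown in SEQ ID NO.: 1;
(iii) a polynucleotide in which 1-60 (preferably 1-30, more preferably 1-10) nucleotides are truncated from or added to the 5' end and/or 3' end of the polynucleotide shown in SEQ ID NO.: 1;
(iv) a polynucleotide complementary to any of the polynucleotides of (i)-(iii).
In another preferred embodiment, the coding sequence of the adenine deaminase is shown in SEQ ID NO.: 1.
In another preferred embodiment, the amino acid sequence of the adenine deaminase is shown in SEQ ID NO.: 8.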
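In another preferred embodiment, the cytosine deaminase includes wild-type and mutant forms.
In another preferred embodiment, the cytosine deaminase includes APOBEC.
In another preferred embodiment, the APOBEC is selected from the group consisting of APOBEC1 (A1), APOBEC2 (A2), APOBEC3A, APOBEC3B, APOBEC3C, APOBEC3D, APOBEC3E, APOBEC3F, APOBEC3H, APOBEC4 (A4), activation-induced cytidine deaminase (AID), or a combination thereof.
In another preferred embodiment, the mutant form of the cytosine deaminase includes CBE2.0, CBE2.1, CBE2.2, CBE2.3, and CBE2.4.
In another preferred embodiment, the cytosine deaminase is a codon-optimized cytosine deaminase.
In another preferred embodiment, the cytosine deaminase is a plant codon-optimized cytosine deaminase.
In another preferred embodiment, the coding sequence of the cytosine deaminase is selected from the group consisting of:
(i) a polynucleotide whose sequence is shown in SEQ ID NO.: 19;
(ii) a polynucleotide whose nucleotide sequence has ≥75% (preferably ≥85%, more preferably ≥90%, ≥95%, ≥98%, or ≥99%) homology to the sequence shown in SEQ ID NO.: 19;
(iii) a polynucleotide in which 1-60 (preferably 1-30, more preferably 1-10) nucleotides are truncated from or added to the 5' end and/or 3' end of the polynucleotide shown in SEQ ID NO.: 19;
(iv) a polynucleotide complementary to any of the polynucleotides of (i)-(iii).
In another preferred embodiment, the coding sequence of the cytosine deaminase is shown in SEQ ID NO.: 19.
In another preferred embodiment, the amino acid sequence of the cytosine deaminase is shown in SEQ ID NO.: 21.
In another preferred embodiment, the nucleotide sequence of the X4 element is shown in SEQ ID NO.: 22.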
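In another preferred embodiment, the second promoter is derived from one or more plants selected from the group consisting of rice, maize, soybean, Arabidopsis, and tomato.
In another preferred embodiment, the second promoter is derived from one or more microorganisms selected from the group consisting of Streptomyces and Escherichia coli.
In another preferred embodiment, the second promoter is derived from one or more viruses selected from the group consisting of tobacco mosaic virus, yellow leaf curl virus, cauliflower mosaic virus, and cotton leaf curl virus.
In another preferred embodiment, the second promoter includes an RNA polymerase III-dependent promoter.
In another preferred embodiment, the second promoter is an RNA polymerase III-dependent promoter.
In another preferred embodiment, the second promoter is selected from the group consisting of U6, U3, U6a, U6b, U6c, U6-1, U3b, U3d, U6-26, U6-29, H1, or a combination thereof.
In another preferred embodiment, the second promoter includes the U6 promoter.
In another preferred embodiment, the second promoter is selected from the group consisting of OsU6, OsU3, OsU6a, OsU6b, OsU6c, AtU6-1, AtU3b, AtU3d, AtU6-1, AtU6-26, AtU6-29, or a combination thereof.
In another preferred embodiment, "no cleavage activity or single-strand cleavage activity" means that the Cas9 nuclease has no cleavage activity on the single strand containing the target site T.
In another preferred embodiment, "no cleavage activity or single-strand cleavage activity" means that the Cas9 nuclease has no cleavage activity on the single strand containing the target site G.
In another preferred embodiment, the above nucleotide elements of the present invention are linked in-frame, so that a fusion protein with the correct amino acid sequence is expressed.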
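In another preferred embodiment, the construct has the structure of formula IIa, IIa', IIb, or IIb':
I1-P1-S1-X1-L1-X2-X4-L2-X3-P2-Y1-I2 (IIa); or
I1-P1-S1-X2-L1-X1-X4-L2-X3-P2-Y1-I2 (IIa');
I1-P2-Y1-P1-S1-X1-L1-X2-X4-L2-X3-I2 (IIb); or
I1-P2-Y1-P1-S1-X2-L1-X1-X4-L2-X3-I2 (IIb');
wherein each element is as defined above.
In another preferred embodiment, the first expression cassette and the second expression cassette each have a terminator.
In another preferred embodiment, the first expression cassette and the second expression cassette share the same terminator.
In another preferred embodiment, the terminator includes a terminator suitable for plant gene editing.
In another preferred embodiment, the terminator is selected from the group consisting of NOS, Poly A, T-UBQ, rbcS, or a combination thereof.
In another preferred embodiment, the nucleotide sequence of the terminator is shown in SEQ ID NO.: 6.
In another preferred embodiment, the first integration element includes a 5' homology arm sequence.
In another preferred embodiment, the second integration element includes a 3' homology arm sequence.
In another preferred embodiment, the nucleic acid construct is 3000-10000 bp long, preferably 4000-8500 bp, more preferably 4000-6000 bp.
In another preferred embodiment, one or more additional expression cassettes are further inserted between the I1 and I2 elements.
In another preferred embodiment, the additional expression cassette is independent of the first expression cassette and the second expression cassette.
In another preferred embodiment, the additional expression cassette expresses a substance selected from the group consisting of:
(a1) a marker gene;
(a2) one or more gRNAs different from the gRNA encoded by Y1.
In another preferred embodiment, the marker gene includes a resistance gene (such as a hygromycin resistance gene or a herbicide resistance gene), a fluorescent gene, or a combination thereof.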
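A second aspect of the present invention provides a fusion protein comprising (a) an adenine deaminase and/or a cytosine deaminase; and (b) a Cas9 nuclease, the fusion protein being encoded by the nucleic acid construct of the first aspect of the present invention.
A third aspect of the present invention provides a vector containing the nucleic acid construct of the first aspect of the present invention.
In another preferred embodiment, the vector is a plant expression vector.
In another preferred embodiment, the vector is an expression vector capable of transfecting or transforming plant cells.
In another preferred embodiment, the vector is an Agrobacterium Ti vector.
In another preferred embodiment, the construct is integrated into the T-DNA region of the vector.
In another preferred embodiment, the vector is circular or linear.
A fourth aspect of the present invention provides a genetically engineered cell that contains the nucleic acid construct of the first aspect of the present invention, or that has one or more nucleic acid constructs of the first aspect of the present invention integrated into its genome.
In another preferred embodiment, the cell is a plant cell.
In another preferred embodiment, the plant is selected from the group consisting of monocotyledons, dicotyledons, gymnosperms, or a combination thereof.
In another preferred embodiment, the plant is selected from the group consisting of Poaceae, Fabaceae, Brassicaceae, Solanaceae, Apiaceae, or a combination thereof.
In another preferred embodiment, the plant includes Arabidopsis, wheat, barley, oats, maize, rice, sorghum, millet, soybean, peanut, tobacco, tomato, Chinese cabbage, rapeseed, spinach, lettuce, cucumber, garland chrysanthemum, water spinach, celery, Chinese leaf lettuce (youmaicai), or a combination thereof.
In another preferred embodiment, the genetically engineered cell is obtained by introducing the nucleic acid construct of claim 1 into the cell by a method selected from the group consisting of Agrobacterium-mediated transformation, gene gun bombardment, microinjection, electroporation, sonication, and polyethylene glycol (PEG)-mediated transformation.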
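A fifth aspect of the present invention provides a reagent combination for gene editing, comprising:
(i) a first nucleic acid construct, or a first vector containing the first nucleic acid construct, the first nucleic acid construct having, from 5' to 3', the structure of formula Ia or Ia':
P1-S1-X1-L1-X2-X4-L2-X3 (Ia) or
P1-S1-X2-L1-X1-X4-L2-X3 (Ia')
wherein
P1 is a first promoter, the first promoter being an RNA polymerase II-dependent promoter;
S1 is the coding sequence of a first nuclear localization signal;
X1 is the coding sequence of an adenine deaminase (such as wild-type and/or mutant TadA) and/or the coding sequence of a cytosine deaminase;
L1 is absent or is the coding sequence of a first linker peptide;
X2 is the coding sequence of a Cas9 nuclease, the Cas9 nuclease having no cleavage activity or only single-strand cleavage activity;
X4 is absent or is the coding sequence of the uracil glycosylase inhibitor UGI;
L2 is absent or is the coding sequence of a second linker peptide;
X3 is the coding sequence of a nuclear localization signal;
and "-" is a bond or a nucleotide linking sequence;
with the proviso that when X1 is the coding sequence of an adenine deaminase, X4 is absent, and when X1 is the coding sequence of a cytosine deaminase, X4 is the coding sequence of the uracil glycosylase inhibitor UGI; and
(ii) a second nucleic acid construct, or a second vector containing the second nucleic acid construct, the second nucleic acid construct having, from 5' to 3', the structure of formula Ib:
P2-Y1 (Ib);
wherein P2 is a second promoter;
Y1 is the coding sequence of a gRNA;
and "-" is a bond or a nucleotide linking sequence.
In another preferred embodiment, the first vector and the second vector are different vectors.
In another preferred embodiment, the first nucleic acid construct and the second nucleic acid construct are located on different vectors.
In another preferred embodiment, the first vector and the second vector are the same vector.
In another preferred embodiment, the first nucleic acid construct and the second nucleic acid construct are located on the same vector.
A sixth aspect of the present invention provides a kit containing the reagent combination of the fifth aspect of the present invention.
In another preferred embodiment, the kit further contains a label or instructions.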
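A seventh aspect of the present invention provides a method for gene editing of a plant, comprising the steps of:
(i) providing a plant to be edited; and
(ii) introducing the nucleic acid construct of the first aspect, the vector of the third aspect, or the reagent combination of the fifth aspect of the present invention into plant cells of the plant to be edited, so that gene editing takes place in the plant cells.
In another preferred embodiment, the introduction is by Agrobacterium.
In another preferred embodiment, the introduction is by gene gun.
In another preferred embodiment, the gene editing is site-directed base substitution (or mutation).
In another preferred embodiment, the site-directed substitution (or mutation) includes mutating A to G.
In another preferred embodiment, the site-directed substitution (or mutation) includes mutating C to T.
In another preferred embodiment, the plant includes any higher plant type amenable to transformation techniques, including monocotyledons, dicotyledons, and gymnosperms.
In another preferred embodiment, the plant is selected from the group consisting of Poaceae, Fabaceae, Brassicaceae, Solanaceae, Apiaceae, or a combination thereof.
In another preferred embodiment, the plant includes Arabidopsis, wheat, barley, oats, maize, rice, sorghum, millet, soybean, peanut, tobacco, tomato, Chinese cabbage, rapeseed, spinach, lettuce, cucumber, garland chrysanthemum, water spinach, celery, Chinese leaf lettuce (youmaicai), or a combination thereof.
An eighth aspect of the present invention provides a method for preparing a gene-edited plant cell, comprising the step of:
transfecting a plant cell with the nucleic acid construct of the first aspect, the vector of the third aspect, or the reagent combination of the fifth aspect of the present invention, so that a site-directed substitution (or mutation) occurs in a chromosome of the plant cell, thereby producing the gene-edited plant cell.
In another preferred embodiment, the transfection uses Agrobacterium-mediated transformation or gene gun bombardment.
A ninth aspect of the present invention provides a use of the nucleic acid construct of the first aspect, the fusion protein of the second aspect, the vector of the third aspect, the genetically engineered cell of the fourth aspect, the reagent combination of the fifth aspect, or the kit of the sixth aspect of the present invention for gene editing of plants.
A tenth aspect of the present invention provides a method for preparing a gene-edited plant, comprising the step of:
regenerating the gene-edited plant cell prepared by the method of the eighth aspect or the ninth aspect of the present invention into a plant body, thereby obtaining the gene-edited plant.
An eleventh aspect of the present invention provides a gene-edited plant prepared by the method of the tenth aspect of the present invention.
A twelfth aspect of the present invention provides a complex comprising the following two components:
(a) a nucleic acid component, the nucleic acid being a gRNA;
(b) a protein component, the protein being the fusion protein of the second aspect of the present invention.
It should be understood that, within the scope of the present invention, the technical features described above and the technical features specifically described below (for example, in the Examples) can be combined with one another to form new or preferred technical solutions. For reasons of space, they are not enumerated here one by one.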
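Brief Description of the Drawings
Figure 1 shows the editing efficiency of ABE-nCas9 and ABEmax-nCas9 on the ALS gene.
Figure 2 shows the editing efficiency of ABE-nCas9, ABEmax-nCas9, and ABEmax-nCas9NG on the ALS gene at different PAM sites.
Figure 3 shows the C-to-T base editing efficiency of CBE2.0-nCas9 at the target site of the rice NRT1.1B gene; the horizontal axis shows the sgRNA-PAM sequence and the vertical axis shows the substitution efficiency.
Figure 4 shows the ratio of homozygotes to other, non-homozygous plants among rice NRT1.1B mutant plants.
Figures 5 and 6 show the trait differences between rice SLR1 mutant plants and wild-type plants.
Figure 7 shows the C-to-T base editing efficiency of CBE2.0-nCas9 at the target site of the rice SLR1 gene; the horizontal axis shows the sgRNA-PAM sequence and the vertical axis shows the substitution efficiency.
Figure 8 shows the ratio of homozygous to other, non-homozygous mutation types among rice SLR1 mutants.
Figure 9 shows the C-to-T base editing efficiency of CBE2.0-nCas9 at the target site of the rice ALS gene.
Figure 10 shows the ratio of homozygotes to other, non-homozygous plants among rice ALS mutant plants.
Figure 11 shows the C-to-T base editing efficiency of CBE2.0-nCas9NG at the target site of the rice ALS gene.
Figure 12 shows the difference in growth phenotype between rice ALS mutant plants and wild-type plants sprayed with the herbicide imazethapyr.
Figure 13 shows the mutation sites of rice ALS mutant plants relative to wild-type plants.
Figure 14 shows the mutation sites of the rice EPSPS gene.
Figure 15 shows the editing efficiency of CBE2.0-nCas9 and CBE2.0-nCas9NG on different target genes.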
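Detailed Description
After extensive and in-depth research, the inventors optimized the quality and quantity of the adenine base editor elements and/or cytosine base editor elements and used a dual nuclear localization signal, an optimized adenine deaminase and/or cytosine deaminase, and different Cas9 nucleases to construct an adenine and/or cytosine base editing tool that is more efficient in plant gene editing and has a wider recognition range. The present invention is the first to successfully achieve sgRNA-guided site-directed base mutation (such as A to G or C to T) in plants. The adenine base editor achieves very high mutation efficiency (up to ≥40% or higher) and recognizes more PAM sites (including NGG and NG), while the indel ratio is very low. In addition, by optimizing the cytosine deaminase, the inventors also significantly increased the mutation efficiency (up to ≥80% or higher); by optimizing the Cas9 protein, more PAM sites (including NGG and NG) can be recognized, while the indel ratio is very low (less than 7%), improving editing precision. The present invention was completed on this basis.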
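Terms
As used herein, the term "homology arm" refers to the flanking sequences on both sides of the exogenous sequence to be inserted on a targeting vector that are identical to the genomic sequence; they are the regions that are recognized and undergo recombination.
As used herein, the term "plant promoter" refers to a nucleic acid sequence capable of initiating nucleic acid transcription in plant cells. The plant promoter may be derived from plants, microorganisms (such as bacteria or viruses), animals, and so on, or may be a synthetic or engineered promoter.
As used herein, the terms "gene editing", "base mutation", and "base editing" refer to a base substitution, insertion, and/or deletion at a position in a nucleotide sequence. In the present invention, "editing" or "mutation" is preferably a single-base mutation.
As used herein, the term "base substitution" refers to the mutation of a base at a position in a nucleotide sequence into a different base, for example A to G.
As used herein, the term "A.T to G.C" means that, in a double-stranded nucleic acid sequence (especially a genomic sequence), an A-T base pair at a given position is mutated into or replaced by a G-C base pair.
As used herein, the term "C.G to T.A" means that, in a double-stranded nucleic acid sequence (especially a genomic sequence), a C-G base pair at a given position is mutated into or replaced by a T-A base pair.
As used herein, the term "Cas protein" refers to a nuclease. A preferred Cas protein is the Cas9 protein. Typical Cas9 proteins include (but are not limited to) Cas9 derived from Streptococcus pyogenes. In the present invention, the Cas9 protein is a mutated Cas9 protein, specifically a mutated Cas9 protein that has no cleavage activity or only single-strand cleavage activity. In a preferred embodiment, the Cas9 protein of the present invention includes SpCas9n(D10A), nSpCas9NG, SaCas9n, ScCas9n, and XCas9n.
As used herein, the term "coding sequence of a Cas protein" refers to a nucleotide sequence encoding a Cas protein. Where an inserted polynucleotide sequence is transcribed and translated to produce a functional Cas protein, the skilled person will recognize that, because of codon degeneracy, a large number of polynucleotide sequences can encode the same polypeptide. The skilled person will also recognize that different species have particular codon preferences, and that the codons of the Cas protein may be optimized as needed for expression in different species; all of these variants are specifically covered by the term "coding sequence of a Cas protein". In addition, the term specifically includes full-length sequences that are substantially identical to the Cas gene sequence, as well as sequences that encode proteins retaining the function of the Cas protein.
As used herein, "gRNA", also called guide RNA, has the meaning commonly understood by those skilled in the art. In general, a guide RNA may comprise a direct repeat sequence and a guide sequence, or may consist essentially of, or consist of, a direct repeat sequence and a guide sequence (also called a spacer in the context of an endogenous CRISPR system). Depending on the Cas protein on which it relies, the gRNA in different CRISPR systems may include crRNA and tracrRNA, or may contain only crRNA. crRNA and tracrRNA can be artificially engineered and fused to form a single guide RNA (sgRNA). The gRNA of the present invention may be natural or may be artificially engineered or designed and synthesized. In some cases, the guide sequence is any polynucleotide sequence that has sufficient complementarity with the target sequence to hybridize with it and to guide the specific binding of the CRISPR/Cas complex to the target sequence; it is usually 17-23 nt long. In certain embodiments, when optimally aligned, the degree of complementarity between the guide sequence and its corresponding target sequence is at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or at least 99%. Determining the optimal alignment is within the abilities of those of ordinary skill in the art. For example, there are published and commercially available alignment algorithms and programs, such as but not limited to ClustalW, the Smith-Waterman algorithm in matlab, Bowtie, Geneious, Biopython, and SeqMan.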
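
As a rough illustration of the "degree of complementarity" thresholds above, the Python sketch below scores an ungapped, equal-length pairing between a guide sequence and the strand it hybridizes to. This is a simplification and an assumption of this sketch: the text refers to optimal alignment, which in general requires a proper alignment tool such as those it lists. The guide is written with DNA letters (T instead of U), the function names are illustrative, and the 20-nt guide in the usage lines is the ALS-sg1 sequence from Example 1 without its 3-nt PAM.

```python
# Watson-Crick complement table for DNA letters.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def percent_complementarity(guide: str, target_strand: str) -> float:
    """Percent of guide positions that pair with the target strand.

    Both sequences are written 5' to 3'; no gaps are considered.
    """
    if len(guide) != len(target_strand):
        raise ValueError("this ungapped sketch needs equal-length sequences")
    expected = reverse_complement(target_strand)
    matches = sum(g == e for g, e in zip(guide.upper(), expected))
    return 100.0 * matches / len(guide)


guide = "GCGCCCCCACTTGGGATCAT"            # ALS-sg1 without the AGG PAM
perfect_target = reverse_complement(guide)
print(percent_complementarity(guide, perfect_target))   # 100.0

# Introduce one mismatch at the 5' end of the target strand.
one_mismatch = ("A" if perfect_target[0] != "A" else "C") + perfect_target[1:]
print(percent_complementarity(guide, one_mismatch))     # 95.0
```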
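In the present invention, nucleotide sequences are written in the 5' to 3' direction unless otherwise noted.
Adenine deaminase
As used herein, the term "adenine deaminase" refers to the TadA adenine deaminase, which is derived from Escherichia coli, originally acts on tRNA, and can deaminate specific adenines in tRNA.
In the present invention, applicable TadA includes both the wild-type form and its specific mutant form TadA7-10, and may also include a combination of the wild-type and mutant forms. TadA7-10 can carry out the deamination reaction with DNA as the substrate.
In another preferred embodiment, the coding sequence of the adenine deaminase of the present invention is codon-optimized so that it is expressed more efficiently in plants.
Cytosine deaminase
As used herein, the term "cytosine deaminase" refers to the cytosine deaminase APOBEC, which is derived from Escherichia coli, originally acts on tRNA, and can deaminate specific cytosines in tRNA.
In the present invention, applicable cytosine deaminases include both the wild-type form and specific mutant forms (such as CBE2.0, CBE2.1, CBE2.2, CBE2.3, CBE2.4), and may also include a combination of the wild-type and mutant forms. The mutant forms of cytosine deaminase can carry out the deamination reaction with DNA as the substrate.
In another preferred embodiment, the coding sequence of the cytosine deaminase of the present invention is codon-optimized so that it is expressed more efficiently in plants.
In a preferred embodiment of the present invention, the preferred cytosine deaminase is CBE2.0.
In the present invention, the amino acid sequence of CBE2.0 is shown in SEQ ID NO.: 23.
The amino acid sequence of CBE2.1 is shown in SEQ ID NO.: 24.
The amino acid sequence of CBE2.2 is shown in SEQ ID NO.: 25.
The amino acid sequence of CBE2.3 is shown in SEQ ID NO.: 26.
The amino acid sequence of CBE2.4 is shown in SEQ ID NO.: 27.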
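Construct of the present invention
The present invention provides a nucleic acid construct for gene editing of plants, the nucleic acid construct having, from 5' to 3', the structure of formula I:
I1-Z1-Z2-I2 (I)
wherein
I1 is a first integration element;
I2 is a second integration element;
Z1 is a first expression cassette;
Z2 is a second expression cassette;
and one of the expression cassettes Z1 and Z2 has the structure of formula Ia or Ia', while the other has the structure of formula Ib:
P1-S1-X1-L1-X2-X4-L2-X3 (Ia) or
P1-S1-X2-L1-X1-X4-L2-X3 (Ia')
P2-Y1 (Ib);
I1, P1, S1, X1, L1, X2, X4, L2, X3, P2, Y1, and I2 are each elements that make up the construct, and are defined as described in the first aspect of the present invention;
and each "-" is a bond or a nucleotide linking sequence;
with the proviso that when X1 is the coding sequence of an adenine deaminase, X4 is absent, and when X1 is the coding sequence of a cytosine deaminase, X4 is the coding sequence of the uracil glycosylase inhibitor UGI.
In the above formula I structure, the I1 element (the left integration element) and the I2 element (the right integration element) act together to integrate the elements located between them (that is, the nucleotide sequence from P1 to Y1) into the genome of plant cells.
Representative I1 and I2 elements are the Ti elements from Agrobacterium. Other elements that perform a similar integration function can of course also be used in the present invention.
The various elements used in the construct of the present invention are either known in the art or can be prepared by methods known to those skilled in the art. For example, the corresponding elements can be obtained by conventional methods, such as PCR, total chemical synthesis, or restriction enzyme digestion, and then joined together by well-known DNA ligation techniques to form the construct of the present invention.
Inserting the construct of the present invention into an exogenous vector (especially a vector suitable for transgenic plant work) yields the vector of the present invention.
The vector of the present invention is transformed into plant cells so that it mediates integration into the chromosomes of the plant cells and expression in the plant, producing gene-edited plant cells.
The gene-edited plant cells of the present invention are regenerated into plant bodies, thereby obtaining gene-edited plants.
The nucleic acid construct of the present invention can be introduced into plant cells by conventional plant recombination techniques (for example, Agrobacterium-mediated transfer), yielding plant cells that carry the nucleic acid construct (or a vector bearing it) or plant cells that have the nucleic acid construct integrated into the genome.
In the progeny of a plant that has the nucleic acid construct integrated, the construct can be segregated away or removed by routine screening or other means known in the art, yielding a gene-edited plant that no longer contains the nucleic acid construct.
Specifically, the present invention builds an optimized adenine deaminase expression cassette and/or cytosine deaminase expression cassette into the CRISPR/Cas9 system of plants. Adenine deaminase deaminates adenine (A) in the target DNA to inosine (I); inosine pairs with cytosine and is read and copied as guanine (G) at the DNA level, producing an A-to-G mutation within the mutation window;
cytosine deaminase deaminates cytosine (C) in the target DNA to uracil (U), and uracil is recognized as T during DNA replication, producing a C-to-T mutation.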
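
The net outcome of the two deamination routes described above (A to G via inosine, C to T via uracil) and the matching change on the complementary strand can be sketched in Python. The sketch models only the final sequence after replication or repair, not the inosine or uracil intermediates; the function name, the editor labels, and the five-base input are illustrative assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Net base change produced by each editor type on the edited strand.
OUTCOME = {"ABE": ("A", "G"), "CBE": ("C", "T")}


def edit_base(top: str, position: int, editor: str) -> tuple[str, str]:
    """Return (edited top strand, edited bottom strand), both 5' to 3'.

    `position` is 1-based on the top strand and must hold the editor's
    substrate base (A for ABE, C for CBE).
    """
    substrate, product = OUTCOME[editor]
    if top[position - 1] != substrate:
        raise ValueError(f"{editor} needs {substrate} at position {position}")
    edited = top[:position - 1] + product + top[position:]
    bottom = edited.translate(COMPLEMENT)[::-1]
    return edited, bottom


print(edit_base("TTAGC", 3, "ABE"))  # ('TTGGC', 'GCCAA')
print(edit_base("TTAGC", 5, "CBE"))  # ('TTAGT', 'ACTAA')
```

In the first call the A-T pair at position 3 becomes G-C, and in the second the C-G pair at position 5 becomes T-A, matching the "A.T to G.C" and "C.G to T.A" terms defined earlier.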
In a preferred embodiment, the basic structure of the nucleic acid construct of the adenine base editor elements is as follows:
In the original ABE system, the deaminase expression cassette is ZmUbi-ecTadA-32aa linker-ecTadA(7.10)-32aa linker-nSpCas9/nSaCas9-SV40NLS-NOS
In the present invention:
Structure of the ABEmax-nCas9 expression cassette:
ZmUbi-bpNLS-OptimizedABE7.10-linker-nCas9-bpNLS-NOS
or ZmUbi-bpNLS-nCas9-linker-OptimizedABE7.10-bpNLS-NOS
Structure of the ABEmax-nCas9 base editor expression cassette:
OsU6-sgRNA-ZmUbi-bpNLS-OptimizedABE7.10-linker-nCas9-bpNLS-NOS or ZmUbi-bpNLS-nCas9-linker-OptimizedABE7.10-bpNLS-OsU6-sgRNA-NOS or ZmUbi-bpNLS-OptimizedABE7.10-linker-nCas9-bpNLS-OsU6-sgRNA-NOS or OsU6-sgRNA-ZmUbi-bpNLS-nCas9-linker-OptimizedABE7.10-bpNLS-NOS
Structure of the ABEmax-nCas9NG expression cassette:
ZmUbi-bpNLS-OptimizedABE7.10-linker-nCas9NG-bpNLS-NOS
or ZmUbi-bpNLS-nCas9NG-linker-OptimizedABE7.10-bpNLS-NOS
Structure of the ABEmax-nCas9NG base editor expression cassette:
OsU6-sgRNA-ZmUbi-bpNLS-OptimizedABE7.10-linker-nCas9NG-bpNLS-NOS or ZmUbi-bpNLS-nCas9NG-linker-OptimizedABE7.10-bpNLS-OsU6-sgRNA-NOS or ZmUbi-bpNLS-OptimizedABE7.10-linker-nCas9NG-bpNLS-OsU6-sgRNA-NOS or OsU6-sgRNA-ZmUbi-bpNLS-nCas9NG-linker-OptimizedABE7.10-bpNLS-NOS
In a preferred embodiment, the basic structure of the nucleic acid construct of the cytosine base editor elements is as follows:
In the original CBE system, the deaminase expression cassette is ZmUbi-rAPOBEC1-XTEN-nCas9-UGI-SV40NLS-NOS
In the present invention:
Structure of the CBE2.0-nCas9 expression cassette:
ZmUbi-bpNLS-CBE2.0-linker-nCas9-UGI-bpNLS-NOS
or ZmUbi-bpNLS-nCas9-linker-CBE2.0-UGI-bpNLS-NOS
Structure of the CBE2.0-nCas9 base editor expression cassette:
OsU6-sgRNA-ZmUbi-bpNLS-CBE2.0-linker-nCas9-UGI-bpNLS-NOS or ZmUbi-bpNLS-nCas9-linker-CBE2.0-UGI-bpNLS-OsU6-sgRNA-NOS or ZmUbi-bpNLS-CBE2.0-linker-nCas9-UGI-bpNLS-OsU6-sgRNA-NOS or OsU6-sgRNA-ZmUbi-bpNLS-nCas9-linker-CBE2.0-UGI-bpNLS-NOS
Structure of the CBE2.0-nCas9NG expression cassette:
ZmUbi-bpNLS-CBE2.0-linker-nCas9NG-UGI-bpNLS-NOS
or ZmUbi-bpNLS-nCas9NG-linker-CBE2.0-UGI-bpNLS-NOS
Structure of the CBE2.0-nCas9NG base editor expression cassette:
OsU6-sgRNA-ZmUbi-bpNLS-CBE2.0-linker-nCas9NG-UGI-bpNLS-NOS or ZmUbi-bpNLS-nCas9NG-linker-CBE2.0-UGI-bpNLS-OsU6-sgRNA-NOS or ZmUbi-bpNLS-CBE2.0-linker-nCas9NG-UGI-bpNLS-OsU6-sgRNA-NOS or OsU6-sgRNA-ZmUbi-bpNLS-nCas9NG-linker-CBE2.0-UGI-bpNLS-NOS
Figure PCTCN2020078079-appb-000001
Figure PCTCN2020078079-appb-000002
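Vector construction
The main feature of the vector is that the coding sequences of adenine deaminase and/or cytosine deaminase, of the Cas protein of the CRISPR/Cas system, and of the nuclear localization signal bpNLS are linked together to form the coding sequence of a fusion protein. After the fusion protein encoded by this coding sequence is expressed in the cytoplasm, it is transferred to the nucleus very efficiently and guided to the target site in the genome by the guide RNA encoded by the formula I construct, where it performs an A.T to G.C or C.G to T.A base substitution while essentially avoiding or eliminating the risk of insertions/deletions.
Mutating A to G with adenine deaminase, or C to T with cytosine deaminase, does not require the DNA double-strand cleavage activity of the Cas protein. In the present invention, therefore, the Cas protein is a mutated Cas protein that has no cleavage activity or only single-strand cleavage activity. In a preferred embodiment, the Cas protein of the present invention may be Cas9 (D10A), whose amino acid sequence is shown in SEQ ID NO.: 2. In a preferred embodiment, the Cas protein of the present invention may be nCas9NG, whose amino acid sequence is shown in SEQ ID NO.: 3. In general, to increase the activity of the fusion protein, the proteins are joined by short flexible peptides, that is, linkers (linker peptide sequences). Preferably, the linker may be XTEN, whose coding sequence is shown in SEQ ID NO.: 4.
In the present invention, suitable promoters include constitutive and/or inducible promoters. Preferably, to increase efficiency, a strong promoter suitable for plant cells can be selected; representative examples include (but are not limited to) the CaMV 35S promoter, the UBI promoter, and the Actin promoter.
An expression cassette for the guide RNA that is suitable for plant cells is selected and built into the same vector as the open reading frame (ORF) of the above fusion protein.
Target design
In the present invention, once the adenine deaminase and/or cytosine deaminase has been guided to the target site by the CRISPR/Cas9 system, the region on which the deaminase acts is fixed.
For example, when (a) adenine deaminase TadA or the mutant adenine deaminase TadA7-10, and/or (b) cytosine deaminase APOBEC or the mutant cytosine deaminase (CBE2.0) is linked to the N-terminus of Cas9 through a 32-amino-acid XTEN linker, the experimental results obtained by the present invention show that the editing windows of the various Cas proteins of the present invention differ little: all lie within the 20 bases preceding the PAM site, and the preferred hot-spot region spans bases 3-10.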
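
To make the window description concrete, the Python sketch below lists the editable bases of a 20-nt protospacer that fall within positions 3-10. The numbering direction is an assumption of this sketch (position 1 is taken as the PAM-distal 5' end of the protospacer), since the text does not state it explicitly; the function name is illustrative. The input is the ALS-sg3 sequence from Example 1 with its 3-nt TGG PAM removed.

```python
def editable_positions(protospacer: str, editor: str,
                       window: tuple[int, int] = (3, 10)) -> list[int]:
    """1-based positions of substrate bases inside the editing window.

    Assumption: position 1 is the 5' (PAM-distal) end of the protospacer.
    ABE acts on A, CBE acts on C.
    """
    substrate = {"ABE": "A", "CBE": "C"}[editor]
    start, end = window
    return [i for i, base in enumerate(protospacer.upper(), start=1)
            if base == substrate and start <= i <= end]


als_sg3 = "CGCATTCAAGGACATGATCCTGG"
protospacer, pam = als_sg3[:-3], als_sg3[-3:]
print(pam)                                     # TGG
print(editable_positions(protospacer, "ABE"))  # [4, 8, 9]
print(editable_positions(protospacer, "CBE"))  # [3, 7]
```

Widening `window` to `(1, 20)` lists every candidate base within the 20 bases preceding the PAM, which corresponds to the broader range mentioned in the text.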
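Genetic transformation
In the present invention, there is no particular limitation on the method for introducing the construct of formula I into cells or integrating it into the genome. Conventional methods can be used; for example, the construct of formula I or the corresponding vector is introduced into plant cells by a suitable method. Representative introduction methods include, but are not limited to, Agrobacterium-mediated transfection, gene gun bombardment, microinjection, electroporation, sonication, and polyethylene glycol (PEG)-mediated transformation.
In the present invention, there is no particular limitation on the recipient plants, which include various crop plants (such as grasses), forestry plants, horticultural plants (such as floral plants), and the like. Representative examples include, but are not limited to, rice, soybean, tomato, maize, tobacco, wheat, sorghum, and potato.
After the above DNA vector or fragment has been introduced into plant cells, the DNA in the transformed plant cells expresses the fusion protein and the gRNA. Under the guidance of the corresponding gRNA, the Cas protein fused to the deaminase mutates the A at the target position to G (and thereby the T of the complementary strand to C), or mutates the C at the target position to T (and thereby the G of the complementary strand to A).
Plant cells, tissues, or organs that have undergone site-directed substitution in the plant genome by the method of the present invention can be regenerated into the corresponding gene-edited plants by conventional methods, for example by tissue culture.
Codon optimization
In the present invention, the coding sequence of adenine deaminase and/or cytosine deaminase is codon-optimized, in particular plant codon-optimized.
The genetic code has 64 codons, but most organisms tend to use only some of them. Those used most frequently are called preferred or optimal codons, and those used infrequently are called rare or low-usage codons. In practice, every organism used for protein expression or production shows some degree of difference or preference in codon usage. Redesigning a gene so that it is synthesized with preferred codons and avoids low-usage or rare codons is called codon optimization.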
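
The idea of swapping each codon for a preferred synonymous codon can be sketched in Python as follows. The standard genetic code table is built from the conventional TCAG ordering; the `PREFERRED` map, however, is a hypothetical placeholder and is not real plant codon-usage data, and the short input sequence is a toy example. Real optimization would use a usage table for the host species and usually weighs further constraints.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Hypothetical preferred codons (placeholders, not measured usage data).
PREFERRED = {"L": "CTC", "R": "CGC", "A": "GCC"}


def translate(cds: str) -> str:
    """Translate a coding sequence whose length is a multiple of 3."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def optimize(cds: str, preferred: dict[str, str]) -> str:
    """Replace each codon with the preferred synonymous codon, if one is given."""
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        new_codon = preferred.get(amino_acid, codon)
        # A preferred codon must encode the same amino acid.
        assert CODON_TABLE[new_codon] == amino_acid
        out.append(new_codon)
    return "".join(out)


toy_cds = "ATGTTAAGAGCATAA"          # Met-Leu-Arg-Ala-stop
optimized = optimize(toy_cds, PREFERRED)
print(optimized)                      # ATGCTCCGCGCCTAA
print(translate(toy_cds), translate(optimized))
```

The encoded protein is unchanged; only the choice among synonymous codons differs.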
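In the present invention, the sequences encoding adenine deaminase and/or cytosine deaminase and the nuclear localization signal are codon-optimized using codons preferred by eukaryotic cells, preferably codons preferred by animal cells, and more preferably codons preferred by plant cells.
Applications
The present invention can be used in the field of plant genetic engineering, for plant research and breeding, and especially for the genetic improvement of economically valuable agricultural crops, forestry crops, or horticultural plants.
The main advantages of the present invention include:
(1) The present invention is the first to combine a Cas9 nuclease (such as nCas9 or nCas9NG) with an optimized adenine deaminase and a dual nuclear localization signal into a fusion protein, and it successfully achieves sgRNA-guided site-directed base mutation (such as A to G) in plants with very high mutation efficiency (up to ≥40% or higher), while significantly reducing or substantially eliminating the risk of insertions and/or deletions (indels) at the target site.
(2) The present invention is the first to combine a Cas9 nuclease (such as nCas9 or nCas9NG) with an optimized cytosine deaminase, UGI, and a dual nuclear localization signal into a fusion protein, and it successfully achieves sgRNA-guided site-directed base mutation (such as C to T) in plants with very high mutation efficiency (up to ≥80% or higher), while significantly reducing or substantially eliminating the risk of insertions and/or deletions (indels) at the target site; the indel ratio can be reduced to less than 7%.
(3) By using different forms of Cas9, the present invention can expand the range of the plant genome that is amenable to site-directed editing.
(4) The present invention finds for the first time that the nCas9NG of the present invention can recognize non-NGG PAMs (such as NG) and achieves very efficient base editing.
The present invention is further described below with reference to specific examples. It should be understood that these examples serve only to illustrate the present invention and not to limit its scope. Experimental methods for which no specific conditions are given in the following examples generally follow conventional conditions, such as those described in Sambrook et al., Molecular Cloning: A Laboratory Manual (New York: Cold Spring Harbor Laboratory Press, 1989), or the conditions recommended by the manufacturer. Unless otherwise stated, percentages and parts are by weight. Unless otherwise specified, the experimental materials and reagents used in the present invention are commercially available.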
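Example 1 Base editing with ABEmax-nCas9
The procedure was as follows:
1.1 Vector construction
Construction of the ABE-nCas9 editor
See Hua K, Tao X, Yuan F, Wang D and Zhu J (2018) Precise A·T to G·C Base Editing in the Rice Genome. MOL PLANT 11:627-630.
Construction of the ABEmax-nCas9 editor
(1) bis-bpNLS, codon-optimized ABE7.10, and a 96-base bridging sequence were synthesized by GenScript (Nanjing).
(2) Using the KpnI and HindIII restriction sites, the synthesized sequence was used to replace ecTadA-linker-ecTadA(7.10)-linker-Cas9-SV40NLS in the original ABE editor (Hua et al., 2018).
1.2 sgRNA design
sgRNAs targeting the rice ALS gene were designed and synthesized:
ALS-sg3: CGCATTCAAGGACATGATCCTGG (SEQ ID NO.:9)
ALS-sg1: GCGCCCCCACTTGGGATCATAGG (SEQ ID NO.:10)
1.3 Rice genetic transformation
The nucleic acid construct (pCambia1300) containing the base editor and the hygromycin resistance gene was introduced into Agrobacterium strain EHA105, which was then used to transform rice callus. Rice transformation, tissue culture, and plant growth followed published procedures (Nishimura et al., 2006; Wang et al., 2015).
1.4 Plant processing and measurement of base editing efficiency
Two to three leaves were taken from each transgenic plant, and genomic DNA was isolated from them. The target site was amplified by PCR and analyzed by Sanger sequencing. Sequencing chromatograms were decoded into specific DNA sequences using the DSDecodeM website. Samples with complex chromatograms were further verified by TA cloning and sequencing. Base editing efficiency is the number of plants carrying the expected base substitution at the target site, expressed as a percentage of the total number of plants genotyped. The primers used for PCR and sequencing are listed in Tables 1-2.
Table 1 PCR primers used to amplify the target sites in regenerated rice seedlings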
Figure PCTCN2020078079-appb-000003
Figure PCTCN2020078079-appb-000004
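The efficiency definition in section 1.4 (plants with the expected substitution divided by all plants genotyped) can be written out as a small Python sketch. The genotype labels, the function name `summarize`, and the example counts below are hypothetical and chosen only to show the arithmetic; they are not data from this application.

```python
from collections import Counter

# Hypothetical genotype labels that count as "expected base substitution".
EDITED_CALLS = {"heterozygous", "homozygous"}


def summarize(calls):
    """Summarize per-plant genotype calls into editing and indel percentages."""
    total = len(calls)
    if total == 0:
        raise ValueError("no plants were genotyped")
    counts = Counter(calls)
    edited = sum(counts[label] for label in EDITED_CALLS)
    return {
        "total": total,
        "edited": edited,
        "efficiency_pct": 100.0 * edited / total,
        "indel_pct": 100.0 * counts["indel"] / total,
    }


# Made-up example: 10 plants, one genotype call per plant.
calls = ["wild-type"] * 6 + ["heterozygous"] * 2 + ["homozygous"] + ["indel"]
print(summarize(calls))
# {'total': 10, 'edited': 3, 'efficiency_pct': 30.0, 'indel_pct': 10.0}
```

Counting plants rather than sequencing reads matches the per-plant definition given in the text.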
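1.5 Results
ABE-nCas9 produced A-to-G mutations at the target site with an efficiency of about 20%, whereas ABEmax-nCas9 reached about 40%-50%; see Figure 1 for details.
1.6 Conclusion
The editing efficiency of ABEmax-nCas9 reached 40% or higher, about double that of ABE-nCas9. These results show that the optimized ABEmax-nCas9 base editor can be used efficiently for site-specific base substitution in plant genomes.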
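Example 2 Base editing with ABEmax-nCas9NG
1. Vector construction
Using the Mut Express MultiS multi-site mutagenesis kit (Vazyme, Nanjing), nSpCas9-NG was derived from nSpCas9(D10A); it contains seven amino acid substitutions in total: R1335A/L1111R/D1135V/G1218R/E1219F/A1322R/T1337R. The mutated sequence was used to replace the nSpCas9(D10A) fragment in the ABEmax-nCas9 editor via the BamHI and SpeI restriction sites, yielding the ABEmax-nCas9NG editor.
To verify that the ABEmax-nCas9NG editor can edit at non-canonical PAM motifs (that is, PAMs other than NGG), one sgRNA was designed in each of the rice EPSPS and ALS genes (EPSPS-sg2: GAGAAGGATGCGAAAGAGGAAGT (SEQ ID NO.:17) and ALS-sg4: TAACAAAGAAGAGTGAAGTCCGT (SEQ ID NO.:18)), with AGT and CGT as the respective PAM motifs. Genetic transformation and measurement of base editing efficiency followed sections 1.3 and 1.4 of Example 1. Genotyping of the transgenic plants showed that more than 40% of plants carried an A-to-G substitution at the EPSPS target site, and nearly 10% carried an A-to-G substitution at the ALS target site (Figure 2). These results show that the ABEmax-nCas9NG base editor can carry out base substitution effectively with NGN as the PAM motif, greatly expanding the range of base editing in plant genomes.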
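The difference between NGG and NGN PAM recognition can be checked mechanically. The Python sketch below splits each 23-nt target listed in Examples 1 and 2 into a 20-nt protospacer and a 3-nt PAM and tests the PAM against both patterns, treating N as any base. The helper names are hypothetical; the text states the PAM explicitly for EPSPS-sg2 and ALS-sg4 (and for ALS-sg1 in Example 4), while treating the last three bases of ALS-sg3 as its PAM is an assumption.

```python
# Minimal IUPAC lookup: only the codes needed for NGG / NGN patterns.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT"}


def pam_matches(pam, pattern):
    """Return True if `pam` matches an IUPAC `pattern` of the same length."""
    if len(pam) != len(pattern):
        return False
    return all(nt in IUPAC[code] for nt, code in zip(pam.upper(), pattern.upper()))


def split_target(target):
    """Split a 23-nt target into (protospacer, PAM), assuming a 3-nt 3' PAM."""
    return target[:-3], target[-3:]


targets = {
    "ALS-sg3": "CGCATTCAAGGACATGATCCTGG",
    "ALS-sg1": "GCGCCCCCACTTGGGATCATAGG",
    "EPSPS-sg2": "GAGAAGGATGCGAAAGAGGAAGT",
    "ALS-sg4": "TAACAAAGAAGAGTGAAGTCCGT",
}

for name, seq in targets.items():
    protospacer, pam = split_target(seq)
    print(name, pam, pam_matches(pam, "NGG"), pam_matches(pam, "NGN"))
# ALS-sg3 TGG True True
# ALS-sg1 AGG True True
# EPSPS-sg2 AGT False True
# ALS-sg4 CGT False True
```

The two Example 2 targets fail the NGG test but pass NGN, which is why they are usable only with the NG-recognizing variant.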
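Example 3 Base editing of the rice NRT1.1B and SLR1 genes with CBE2.0-nCas9
To compare the optimized CBE base editor CBE2.0 directly with the first-generation CBE, the NRT1.1B and SLR1 genes were selected and edited using previously tested sgRNAs. NRT1.1B controls uptake of the nutrient nitrogen in rice; a change of amino acid 327 from Thr to Met (a C-to-T change in the DNA sequence) can increase yield. SLR1 regulates gibberellin signaling and affects rice plant height.
The OsU6-sgRNA(NRT1.1B) sequence is shown in SEQ ID NO.:28.
The OsU6-sgRNA(SLR1) sequence is shown in SEQ ID NO.:29.
The procedure was as follows:
1. Vector construction
Construction of the CBE2.0-nCas9 editor
(1) bis-bpNLS, codon-optimized CBE2.0, and a 96-base bridging sequence were synthesized by GenScript (Nanjing).
(2) Using the KpnI and HindIII restriction sites, the synthesized sequence was used to replace SV40NLS-rAPOBEC1-XTEN in the original CBE editor (Lu and Zhu, 2017).
(3) Using the EasyGeno fragment recombination kit (TIANGEN), a second synthesized sequence containing UGI and bis-bpNLS was fused to the 3' end of nCas9 by a recombination reaction, finally yielding CBE2.0-nCas9.
2. Rice genetic transformation
The nucleic acid construct (pCambia1300) containing the base editor and the hygromycin resistance gene was introduced into Agrobacterium strain EHA105, which was then used to transform rice callus. Rice transformation, tissue culture, and plant growth followed published procedures (Nishimura et al., 2006; Wang et al., 2015).
3. Plant processing and measurement of base editing efficiency
Two to three leaves were taken from each transgenic plant, and genomic DNA was isolated from them. The target site was amplified by PCR and analyzed by Sanger sequencing. Sequencing chromatograms were decoded into specific DNA sequences using the DSDecodeM website. Samples with complex chromatograms were further verified by TA cloning and sequencing. Base editing efficiency is the number of plants carrying the expected base substitution at the target site, expressed as a percentage of the total number of plants genotyped. The primers used for PCR and sequencing are listed in Tables 3-4.
Table 3. PCR primers used to amplify the target sites in regenerated rice seedlings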
Figure PCTCN2020078079-appb-000006
Table 4. Primers used to sequence the on-target PCR products.
Figure PCTCN2020078079-appb-000007
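4. Results
As shown in Figure 3, nearly 80% of the transgenic rice lines carried the expected C-to-T substitution at the NRT1.1B target site; 76% of these lines (55% of all transgenic lines) were homozygous, and the indel rate was low, only 6.9% (Figure 4). At the SLR1 site, most transgenic plants showed a clear dwarf phenotype (Figures 5 and 6). Sequencing showed that more than 80% of the transgenic lines carried a C-to-T substitution at the target site, 72% of the transgenic lines were homozygous or biallelic mutants, and the indel rate was again low, only 3.4% (Figures 7 and 8). The first-generation CBE base editor achieved base substitution efficiencies of only 2.7% (NRT1.1B) and 13.3% (SLR1) at these two sites (Lu and Zhu, 2017); by comparison, the optimized CBE base editor of the invention was 26-fold and 6-fold more efficient, respectively.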
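Example 4 Base editing of the rice acetolactate synthase gene (ALS) with CBE2.0-nCas9
Acetolactate synthase is a key enzyme in the biosynthesis of valine, leucine, and isoleucine in plants. ALS inhibitors are commonly used as herbicides, but in use they inhibit the growth of crops as well as weeds. Studies have shown that mutating Ser627 of the ALS protein to Asn (G1880 to A in the corresponding DNA sequence) confers tolerance to imidazolinone herbicides (Piao et al., 2018). An sgRNA was designed for this site (ALSsg1, with AGG as the PAM motif: GCGCCCCCACTTGGGATCATAGG (SEQ ID NO.:42)), and base editing was carried out with CBE2.0-nCas9; genetic transformation and measurement of base editing efficiency followed Example 3. More than 70% of the transgenic plants carried C-to-T substitutions at the target site, and most of them were homozygous or biallelic (Figures 9 and 10). This example further shows that CBE2.0-nCas9 has high base editing efficiency. However, because of the PAM constraint, G1880 does not lie in the hotspot of the editing window, and only two plants carried the G1880-to-A mutation (Figure 9).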
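Example 5 Base editing of the rice acetolactate synthase gene (ALS) and 5-enolpyruvylshikimate-3-phosphate synthase gene (EPSPS) with CBE2.0-nCas9NG
1. Construction of the CBE2.0-nCas9NG editor
Using the Mut Express MultiS multi-site mutagenesis kit (purchased from Vazyme, Nanjing), nSpCas9-NG was derived from nSpCas9(D10A); it contains seven amino acid substitutions in total: R1335A/L1111R/D1135V/G1218R/E1219F/A1322R/T1337R. The mutated sequence was used to replace the nSpCas9(D10A) fragment in the CBE2.0-nCas9 editor via the BamHI and SpeI restriction sites, yielding the CBE2.0-nCas9NG editor.
2. Editing efficiency of the CBE2.0-nCas9NG base editor at ALS
The CBE2.0-nCas9NG base editor recognizes NGN PAM motifs. Taking advantage of this, a new sgRNA targeting ALS was designed (ALSsg2, with AGC as the PAM motif: CCCCACTTGGGATCATAGGCAGC (SEQ ID NO.:44)) to place G1880 in the hotspot of the editing window; genetic transformation and measurement of base editing efficiency followed Example 3. Nearly 60% of the transgenic plants carried the G1880-to-A substitution, with no bystander mutations at other, non-target bases in the target region (Figure 11). Two wild-type plants and two homozygous base-edited mutant plants were sprayed with the imazethapyr herbicide sold under the "豆说好" brand (Shandong Cynda). The wild-type plants stopped growing and gradually withered, whereas the mutants were unaffected (Figures 12 and 13).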
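How far the redesigned sgRNA moves the editing window can be read directly from the two target sequences. The Python sketch below finds the offset at which ALSsg2 overlaps ALSsg1 on the same strand and converts a protospacer position in ALSsg1 to the corresponding position in ALSsg2. The function names are hypothetical, positions are assumed to be counted from the PAM-distal end, and the sketch does not attempt to say which protospacer base corresponds to G1880.

```python
def shift_between(old_target, new_target, min_overlap=10):
    """Return how many bases downstream `new_target` starts relative to `old_target`.

    Both sequences must be written on the same strand. Returns None if no
    overlap of at least `min_overlap` bases is found.
    """
    for k in range(len(old_target) - min_overlap + 1):
        if new_target.startswith(old_target[k:]):
            return k
    return None


def remap_position(old_position, shift):
    """Convert a 1-based position in the old protospacer to the new one (None if outside)."""
    new_position = old_position - shift
    return new_position if 1 <= new_position <= 20 else None


als_sg1 = "GCGCCCCCACTTGGGATCATAGG"  # ALSsg1, AGG PAM (Example 4)
als_sg2 = "CCCCACTTGGGATCATAGGCAGC"  # ALSsg2, AGC PAM (Example 5)

shift = shift_between(als_sg1, als_sg2)
print(shift)                      # 4
print(remap_position(10, shift))  # 6
```

Under these assumptions ALSsg2 starts four bases downstream of ALSsg1, so a base at the edge of the 3-10 region in ALSsg1 (position 10) lands near its middle (position 6) in ALSsg2.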
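3. Editing efficiency of the CBE2.0-nCas9NG base editor at EPSPS
Studies have shown that a single-base substitution (C317 to T) in the coding sequence of the rice 5-enolpyruvylshikimate-3-phosphate synthase gene (EPSPS) confers resistance to the herbicide glyphosate (Zhou et al., 2006). Because no NGG PAM motif is available near this site, an sgRNA with TGA as the PAM was designed (EPSPS-sg1: GCGACCATTGACAGCAGCCGTGA (SEQ ID NO.:43)), and the CBE2.0-nCas9NG base editor was used to edit this site. Genetic transformation and measurement of base editing efficiency followed Example 3. Genotyping of the transgenic plants showed that 17% of plants carried the C317-to-T substitution (Figures 14 and 15).
The results of these two experiments show that the CBE2.0-nCas9NG base editor effectively expands the range of base editing.
In addition, the studies of the present invention found that swapping the positions of the adenine deaminase or cytosine deaminase and the Cas9 nuclease in the base editor also allows base editing at the target site and likewise effectively expands the range of base editing.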
References
1. Hua K, Tao X, Yuan F, Wang D and Zhu J (2018) Precise A·T to G·C Base Editing in the Rice Genome. MOL PLANT 11:627-630.
2. Koblan LW, Doman JL, Wilson C, Levy JM, Tay T, Newby GA, Maianti JP, Raguram A and Liu DR (2018) Improving cytidine and adenine base editors by expression optimization and ancestral reconstruction. NAT BIOTECHNOL 36:843-846.
3. Nishimasu H, Shi X, Ishiguro S, Gao L, Hirano S, Okazaki S, Noda T, Abudayyeh OO, Gootenberg JS, Mori H, Oura S, Holmes B, Tanaka M, Seki M, Hirano H, Aburatani H, Ishitani R, Ikawa M, Yachie N, Zhang F and Nureki O (2018) Engineered CRISPR-Cas9 nuclease with expanded targeting space. SCIENCE 361:1259-1262.
4. Hu JH, Miller SM, Geurts MH, Tang W, Chen L, Sun N, Zeina CM, Gao X, Rees HA, Lin Z, et al. (2018) Evolved Cas9 variants with broad PAM compatibility and high DNA specificity. NATURE 556:57-63.
All documents mentioned in the present invention are incorporated by reference in this application, as if each document were individually incorporated by reference. It should also be understood that, after reading the above teachings of the invention, those skilled in the art may make various changes or modifications to the invention, and such equivalent forms likewise fall within the scope defined by the claims appended to this application.

Claims (11)

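  1. A nucleic acid construct, characterized in that the nucleic acid construct has the structure of Formula I in the 5'-3' (5' to 3') direction:
    I1-Z1-Z2-I2  (I)
    wherein,
    I1 is a first integration element;
    I2 is a second integration element;
    Z1 is a first expression cassette;
    Z2 is a second expression cassette;
    and one of the expression cassettes Z1 and Z2 has the structure of Formula Ia or Ia', while the other expression cassette has the structure of Formula Ib:
    P1-S1-X1-L1-X2-X4-L2-X3  (Ia) or
    P1-S1-X2-L1-X1-X4-L2-X3  (Ia')
    P2-Y1  (Ib);
    wherein,
    P1, S1, X1, L1, X2, X4, L2, X3, P2, and Y1 are each elements used to make up the construct;
    P1 is a first promoter, the first promoter being an RNA polymerase II-dependent promoter;
    S1 is the coding sequence of a first nuclear localization signal;
    X1 is the coding sequence of an adenine deaminase and/or the coding sequence of a cytosine deaminase;
    L1 is absent or is the coding sequence of a first linker peptide;
    X2 is the coding sequence of a Cas9 nuclease, the Cas9 nuclease having no cleavage activity or having single-strand cleavage activity;
    X4 is absent or is the coding sequence of the uracil glycosylase inhibitor UGI;
    L2 is absent or is the coding sequence of a second linker peptide;
    X3 is the coding sequence of a second nuclear localization signal;
    P2 is a second promoter;
    Y1 is the coding sequence of a gRNA;
    and each "-" is a bond or a nucleotide linking sequence;
    with the proviso that when X1 is the coding sequence of an adenine deaminase, X4 is absent, and when X1 is the coding sequence of a cytosine deaminase, X4 is the coding sequence of the uracil glycosylase inhibitor UGI.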
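  2. A fusion protein, characterized in that the fusion protein comprises (a) an adenine deaminase and/or a cytosine deaminase and (b) a Cas9 nuclease, and the fusion protein is encoded by the nucleic acid construct of claim 1.
  3. A vector, characterized in that the vector contains the nucleic acid construct of claim 1.
  4. A genetically engineered cell, characterized in that the cell contains the nucleic acid construct of claim 1, or has one or more nucleic acid constructs of claim 1 integrated into its genome.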
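  5. A reagent combination for gene editing, characterized in that it comprises:
    (i) a first nucleic acid construct, or a first vector containing the first nucleic acid construct, the first nucleic acid construct having the structure of Formula Ia or Ia' in the 5'-3' direction:
    P1-S1-X1-L1-X2-X4-L2-X3  (Ia) or
    P1-S1-X2-L1-X1-X4-L2-X3  (Ia')
    wherein,
    P1 is a first promoter, the first promoter being an RNA polymerase II-dependent promoter;
    S1 is the coding sequence of a first nuclear localization signal;
    X1 is the coding sequence of an adenine deaminase (such as wild-type and/or mutant TadA) and/or the coding sequence of a cytosine deaminase;
    L1 is absent or is the coding sequence of a first linker peptide;
    X2 is the coding sequence of a Cas9 nuclease, the Cas9 nuclease having no cleavage activity or having single-strand cleavage activity;
    X4 is absent or is the coding sequence of the uracil glycosylase inhibitor UGI;
    L2 is absent or is the coding sequence of a second linker peptide;
    X3 is the coding sequence of a nuclear localization signal;
    and "-" is a bond or a nucleotide linking sequence;
    with the proviso that when X1 is the coding sequence of an adenine deaminase, X4 is absent, and when X1 is the coding sequence of a cytosine deaminase, X4 is the coding sequence of the uracil glycosylase inhibitor UGI; and
    (ii) a second nucleic acid construct, or a second vector containing the second nucleic acid construct, the second nucleic acid construct having the structure of Formula Ib in the 5'-3' direction:
    P2-Y1  (Ib);
    wherein P2 is a second promoter;
    Y1 is the coding sequence of a gRNA;
    and "-" is a bond or a nucleotide linking sequence.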
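  6. A kit, characterized in that the kit contains the reagent combination of claim 5.
  7. A method for gene editing of a plant, characterized in that it comprises the steps of:
    (i) providing a plant to be edited; and
    (ii) introducing the nucleic acid construct of claim 1, the vector of claim 3, or the reagent combination of claim 5 into a plant cell of the plant to be edited, thereby carrying out gene editing in the plant cell.
  8. A method for preparing a gene-edited plant cell, characterized in that it comprises the step of:
    transfecting a plant cell with the nucleic acid construct of claim 1, the vector of claim 3, or the reagent combination of claim 5, so that a site-specific substitution (or mutation) occurs in a chromosome of the plant cell, thereby producing the gene-edited plant cell.
  9. Use of the nucleic acid construct of claim 1, the fusion protein of claim 2, the vector of claim 3, the genetically engineered cell of claim 4, the reagent combination of claim 5, or the kit of claim 6, characterized in that it is used for gene editing of a plant.
  10. A method for preparing a gene-edited plant, characterized in that it comprises the step of:
    regenerating the gene-edited plant cell prepared by the method of claim 8 into a plant, thereby obtaining the gene-edited plant.
  11. A complex, characterized in that it comprises the following two components:
    (1) a nucleic acid component, the nucleic acid being a gRNA;
    (2) a protein component, the protein being the fusion protein of claim 2.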
PCT/CN2020/078079 2019-03-06 2020-03-05 Nucleic acid construct for gene editing WO2020177751A1 (zh)

Applications Claiming Priority (8)

Application Number Priority Date Filing Date Title
CN201910169340.X 2019-03-06
CN201910169340 2019-03-06
CN201910173148 2019-03-07
CN201910173148.8 2019-03-07
CN201910839046.5 2019-09-05
CN201910838286.3 2019-09-05
CN201910839046.5A CN110527695B (zh) 2019-03-07 2019-09-05 Nucleic acid construct for site-directed gene mutation
CN201910838286.3A CN110526993B (zh) 2019-03-06 2019-09-05 Nucleic acid construct for gene editing

Publications (1)

Publication Number Publication Date
WO2020177751A1 true WO2020177751A1 (zh) 2020-09-10

Family

ID=72337612

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/078079 WO2020177751A1 (zh) 2019-03-06 2020-03-05 Nucleic acid construct for gene editing

Country Status (1)

Country Link
WO (1) WO2020177751A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
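CN104293828A (zh) * 2013-07-16 2015-01-21 中国科学院上海生命科学研究院 Method for site-directed modification of plant genomes
CN106609282A (zh) * 2016-12-02 2017-05-03 中国科学院上海生命科学研究院 Vector for site-directed base substitution in plant genomes
CN110157726A (zh) * 2018-02-11 2019-08-23 中国科学院上海生命科学研究院 Method for site-directed substitution in plant genomes
CN110526993A (zh) * 2019-03-06 2019-12-03 山东舜丰生物科技有限公司 Nucleic acid construct for gene editing
CN110527695A (zh) * 2019-03-07 2019-12-03 山东舜丰生物科技有限公司 Nucleic acid construct for site-directed gene mutation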

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
KAI HUA: "Precise A·T to G·C Base Editing in the Rice Genome", MOLECULAR PLANT, vol. 11, no. 4, 21 February 2018 (2018-02-21), pages 627 - 630, XP055655070, ISSN: 1752-9867 *
ZAFRA MARIA PAZ, SCHATOFF EMMA M, KATTI ALYNA, FORONDA MIGUEL, BREINIG MARCO, SCHWEITZER ANABEL Y, SIMON AMBER, HAN TENG, GOSWAMI : "Optimized Base Editors Enable Efficient Editing in Cells, Organoids and Mice", NAT BIOTECHNOL., vol. 36, no. 9, 3 July 2018 (2018-07-03), pages 888 - 893, XP036929662, ISSN: 1546-1696 *

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20765898

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20765898

Country of ref document: EP

Kind code of ref document: A1