WO2023092731A1 - Mad7-nls fusion protein, and nucleic acid construct for site-directed editing of plant genome and application thereof

Publication number
WO2023092731A1
Authority
WO
WIPO (PCT)
Prior art keywords
mad7
sequence
nucleic acid
plant
acid construct
Prior art date
Application number
PCT/CN2021/138184
Other languages
French (fr)
Chinese (zh)
Inventor
周红菊
李相敢
郑华颖
裴睿丽
刘政
刘子嘉
李莹莹
Original Assignee
科稷达隆(北京)生物技术有限公司
Priority date
Filing date
Publication date
Application filed by 科稷达隆(北京)生物技术有限公司
Publication of WO2023092731A1

Classifications

    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14Hydrolases (3)
    • C12N9/16Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22Ribonucleases RNAses, DNAses
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09Recombinant DNA-technology
    • C12N15/63Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8216Methods for controlling, regulating or enhancing expression of transgenes in plant cells
    • C12N15/8218Antisense, co-suppression, viral induced gene silencing [VIGS], post-transcriptional induced gene silencing [PTGS]
    • CCHEMISTRY; METALLURGY
    • C07ORGANIC CHEMISTRY
    • C07KPEPTIDES
    • C07K2319/00Fusion polypeptide
    • C07K2319/01Fusion polypeptide containing a localisation/targetting motif
    • C07K2319/02Fusion polypeptide containing a localisation/targetting motif containing a signal sequence
    • CCHEMISTRY; METALLURGY
    • C07ORGANIC CHEMISTRY
    • C07KPEPTIDES
    • C07K2319/00Fusion polypeptide
    • C07K2319/01Fusion polypeptide containing a localisation/targetting motif
    • C07K2319/09Fusion polypeptide containing a localisation/targetting motif containing a nuclear localisation signal

Definitions

  • the present invention relates to the field of biotechnology. It specifically relates to a MAD7-NLS fusion protein, a nucleic acid construct for site-directed editing of plant genomes, and an efficient site-directed gene editing method for plant genomes based on the novel nuclease MAD7 using RNA guidance.
  • CRISPR: clustered regularly interspaced short palindromic repeats
  • Cas9 recognizes a G-rich site at the 3' end
  • Cpf1 (Cas12a)
  • the Cas12a protein has both DNA endonuclease and RNase activity: it can not only cut double-stranded target DNA, but also process the corresponding immature crRNA (pre-crRNA) into its mature form.
  • the Cas12a protein is relatively small and more specific, and has clear advantages over Cas9 for multi-target gene editing; in addition, whereas Cas9 cuts the genome to form blunt ends, Cas12a cutting forms sticky ends, which are more conducive to the directional insertion of foreign genes.
  • MAD7 belongs to the class 2 type V-A Cpf1-like family. It was discovered by Inscripta in the genus Eubacterium and optimized, and it is free to use in both scientific research institutes and commercial research. Among known proteins, MAD7 shares its highest homology with the AsCpf1 protein, yet that homology is only 31%. Its PAM recognition site is YTTN. The MAD7 system has so far been verified to have high editing activity in bacteria, yeast, zebrafish, mice and human cells. To extend its application to plants, MAD7 was codon-optimized for rice and its editing efficiency in plants was studied. Although the application of the MAD7 system in rice has also been reported recently, the reported mutation efficiency is 49-65.6%. In summary, to meet the needs of plant genetic engineering and enrich the plant gene editing toolbox, a more efficient gene editing system needs to be developed.
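The YTTN PAM stated above can be illustrated with a short target-site scan. This is a minimal sketch, not the applicant's design tool: the function name `find_mad7_sites` and the default guide length of 21 are illustrative choices, and it assumes (as for Cas12a-type nucleases in general) that the protospacer lies immediately 3' of the PAM. Only the strand passed in is scanned; the reverse complement would need a second call.

```python
import re

# IUPAC codes: Y = C or T, N = any base, so "YTTN" becomes this pattern.
PAM_REGEX = "[CT]TT[ACGT]"


def find_mad7_sites(seq: str, guide_len: int = 21) -> list[tuple[int, str, str]]:
    """Return (pam_start, pam, protospacer) for each YTTN PAM on the given strand.

    Assumption: the protospacer sits directly 3' of the PAM.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(rf"(?=({PAM_REGEX})([ACGT]{{{guide_len}}}))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


# Toy sequence (not a real rice locus): one CTTA PAM followed by 21 bases.
toy = "GG" + "CTTA" + "ACGTACGTACGTACGTACGTA" + "GG"
sites = find_mad7_sites(toy)
```

Candidate sites found this way would still need to be checked for uniqueness in the genome before use.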
  • the purpose of the present invention is to provide a MAD7-based CRISPR/MAD7 system for efficient site-directed editing of plant genomes, which can simply and efficiently achieve single-gene knockout, multi-gene knockout, and homologous recombination or site-directed knock-in of foreign fragments in monocotyledonous plants.
  • the present invention firstly provides a MAD7-NLS fusion protein, which has the following structure:
  • C is MAD7 protein
  • B1 and B2 are independent nuclear localization signal sequences (NLS).
  • the nuclear localization signal sequence is selected from one, or a combination of any two or more, of: SV40, KRP2 (Kip-related protein 2), MDM2, Cdc25C, DPP9, MTA1, CBP80, AreA, M9, Rev, hTAP, MyRF, EBNA-6, TERT or Tfam.
  • the N-terminal of the MAD7-NLS fusion protein further includes a signal peptide and/or protein tag sequence.
  • a nucleic acid construct for site-directed editing of plant genomes, comprising a first expression cassette that comprises, connected in sequence, a first promoter, the nucleotide sequence encoding the above-mentioned MAD7-NLS fusion protein, and a first terminator;
  • the first promoter is a Pol II type promoter, preferably one, or a combination of any two, of Ubi, Actin, CmYLCV, UBQ, 35S, SPL, the tissue-specific promoters YAO, CDC45 and rbcS, and the inducible promoter XEV.
  • the nucleic acid construct further comprises a second expression cassette, said second expression cassette comprising a second promoter connected in sequence, several tandem repeat sequences;
  • the repeat sequence is one or both of a mature direct repeat (DR) sequence and an immature direct repeat sequence; preferably, the second promoter is a Pol II type or Pol III type promoter; more preferably, the second promoter is selected from one, two or more of OsU3, OsU6a, OsU6b, OsU6c, Actin, 35S, Ubi, UBQ, SPL, CmYLCV, the tissue-specific promoters YAO, CDC45 and rbcS, or the inducible promoter XEV.
  • the second expression cassette further comprises a second termination sequence connected to the end of the repeat sequence
  • the second termination sequence is selected from polyT, NOS, polyA or a combination thereof.
  • the repeat sequence further includes a target site guide sequence sg; preferably, the length of the sg sequence is 17-35bp, preferably 19-28bp, more preferably 19-25bp ;
  • the repeat number of the repeat sequence is 2-50, preferably 2-10, more preferably 3-15, further preferably 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 or 12.
  • the nucleic acid construct is selected from one or more of SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6 or SEQ ID NO:21.
  • the nucleic acid construct is a vector comprising both the first expression cassette and the second expression cassette, or,
  • a combination of vectors consisting of a first vector comprising the first expression cassette and a second vector comprising the second expression cassette.
  • the present invention also provides a kit for gene editing in plants, which is characterized in that it includes the above-mentioned nucleic acid construct; preferably, it also includes an auxiliary vector carrying a donor DNA expression cassette.
  • Another aspect of the present invention provides a method for plant gene editing, comprising:
  • step (iii) regenerating or culturing the plant cell, plant tissue or plant body identified in step (ii) as having undergone the gene editing;
  • the gene editing includes one of gene knockout, site-directed insertion or gene replacement, or an optional combination of two or three;
  • the gene editing is single-site or multi-site gene editing
  • the introduction is achieved by any method selected from Agrobacterium transformation, gene gun method, microinjection method, electric shock method, ultrasonic method or polyethylene glycol (PEG)-mediated method;
  • the plant is selected from any one of grasses, leguminous plants, solanaceae or cruciferous plants;
  • the plant is selected from any one of Arabidopsis thaliana, wheat, barley, oat, corn, rice, sorghum, millet, soybean, peanut, tobacco or tomato.
  • the MAD7-based CRISPR/MAD7 system of the present invention for efficient site-directed editing of plant genomes can simply and efficiently achieve single-gene knockout, multi-gene knockout, and homologous recombination or site-directed knock-in of foreign fragments in monocotyledonous plants.
  • compared with the CRISPR/Cas9 and CRISPR/Cpf1 gene editing systems, the CRISPR/MAD7 system provided by the present invention has higher specificity and allows more flexible selection of target sites.
  • Figure 1 is a schematic diagram of the structural composition of each element in the related sequence (SEQ ID NO.2) of the rice crRNA expression cassette of MAD7 according to the present invention.
  • Figure 2 is a schematic diagram of the structural composition of each element in the related sequence (SEQ ID NO.3) of the rice multi-gene site editing sg-DR crRNA expression cassette of MAD7 according to the present invention.
  • FIG. 3 is a schematic diagram of the structural composition of each element in the related sequence (SEQ ID NO.4) of the rice multi-gene site editing tRNA-sg-DR crRNA expression cassette of MAD7 according to the present invention.
  • FIG. 4 is a schematic diagram of the structural composition of each element in the related sequence (SEQ ID NO.5) of the rice multi-gene locus editing miniOsU3/U6-DR-sg crRNA expression cassette according to the present invention.
  • FIG. 5 is a schematic diagram of the structural composition of each element in the related sequence (SEQ ID NO.6) of the rice multi-gene site editing HH-DR-sg-HDV crRNA expression cassette of MAD7 according to the present invention.
  • FIG. 6 is a schematic diagram of the structural composition of each element in the related sequence (SEQ ID NO.21) of the maize crRNA expression cassette of MAD7 according to the present invention.
  • Figure 7 is a schematic diagram of the construction of the rice single-site knockout CRISPR-MAD7 (ncNLS) expression vector.
  • Figure 8 is a schematic diagram of the construction of the expression vector under the three placement modes of the NLS design when the OsHD3A site is knocked out at a single site in rice.
  • Figure 9 is a schematic diagram of the construction of the rice multi-site knockout CRISPR-MAD7 expression vector comprising the DR-sg sequence string.
  • Figure 10 is a schematic diagram of the construction of the rice multi-site knockout CRISPR-MAD7 expression vector with tRNA tandem DR-sg sequence.
  • Figure 11 is a schematic diagram of the construction of CRISPR-MAD7 expression vectors in which each DR-sg sequence is driven separately by miniOsU3/miniOsU6.
  • Figure 12 is a schematic diagram of the construction of a CRISPR-MAD7 expression vector driven by a Pol III type promoter and HH-HDV tandem DR-sg sequence.
  • Figure 13 is a schematic diagram of the construction of a CRISPR-MAD7 expression vector driven by a Pol II type promoter and HH-HDV tandem DR-sg sequence.
  • Fig. 14 is a schematic diagram of the construction of a maize single-site knockout CRISPR-MAD7 (ncNLS) expression vector.
  • After extensive and in-depth research, the inventors have constructed a highly versatile and highly specific MAD7-based CRISPR/MAD7 system for efficient site-directed editing of plant genomes, together with nucleic acid constructs, vectors or vector combinations for site-directed editing of plant genomes, and methods for site-directed editing of plant genomes. Specifically, with the method of the present invention, single-gene knockout, multi-gene knockout, or homologous recombination and directional insertion of foreign fragments can be performed easily and efficiently at predetermined sites in the plant genome. The present invention has been accomplished on this basis.
  • the CRISPR/MAD7 editing system of the present invention uses mature crRNA. Since the mature crRNA is shorter, it is convenient for artificial synthesis and easier to transform into cells, so it will be more advantageous to develop a mature crRNA-based CRISPR-MAD7 system. On this basis, a technical solution for placing the nuclear localization signal sequence (NLS) at the N-terminus and C-terminus of the MAD7 protein was also developed. Experimental results show that the above-mentioned technical scheme has successfully completed the gene editing of rice with an efficiency as high as 95%.
  • NLS nuclear localization signal sequence
  • operably linked refers to the condition that some portion of a linear DNA sequence is capable of modulating or controlling the activity of other portions of the same linear DNA sequence.
  • a promoter is operably linked to a coding sequence if it controls the transcription of the sequence.
  • the present invention provides a nucleic acid construct for high-efficiency site-directed editing of plant genomes, the nucleic acid construct comprising a first expression cassette and an optional second expression cassette;
  • the first expression cassette is a MAD7-NLS fusion protein expression cassette, wherein the MAD7-NLS fusion protein has a formula I structure:
  • P1 is the first promoter
  • A is absent, or is a signal peptide and/or protein tag sequence
  • B1 is absent or is a nuclear localization signal sequence (NLS)
  • B2 is absent or is a nuclear localization signal sequence (NLS)
  • with the proviso that at most one of B1 and B2 is absent;
  • C is MAD7 protein
  • E1 is the first terminator
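The element definitions above, including the proviso that at most one of B1 and B2 is absent, can be captured in a small validation sketch. The formula itself is not reproduced in this text, so the element order P1-A-B1-C-B2-E1 is an assumption (it matches the Ubi / nNLS / MAD7 / cNLS / NOS cassette described later in the document); the class and method names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Mad7Cassette:
    """Element labels of the first expression cassette (assumed order P1-A-B1-C-B2-E1)."""

    p1: str                   # first promoter
    c: str = "MAD7"           # MAD7 coding sequence
    e1: str = "NOS"           # first terminator
    a: Optional[str] = None   # optional signal peptide and/or protein tag
    b1: Optional[str] = None  # NLS at the 5' end of the MAD7 coding sequence
    b2: Optional[str] = None  # NLS at the 3' end of the MAD7 coding sequence

    def __post_init__(self) -> None:
        # Proviso from the text: at most one of B1 and B2 may be absent.
        if self.b1 is None and self.b2 is None:
            raise ValueError("at least one of B1 and B2 must be an NLS")

    def layout(self) -> str:
        parts = [self.p1, self.a, self.b1, self.c, self.b2, self.e1]
        return "-".join(p for p in parts if p is not None)

    def nls_mode(self) -> str:
        if self.b1 is not None and self.b2 is not None:
            return "ncNLS"
        return "nNLS" if self.b1 is not None else "cNLS"


cassette = Mad7Cassette(p1="Ubi", b1="NLS", b2="NLS")
```

Here `cassette.layout()` yields the label string for an ncNLS design, and constructing a cassette with neither B1 nor B2 raises `ValueError`.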
  • the second expression cassette is a crRNA expression cassette, and the crRNA expression cassette contains coding sequences corresponding to mature crRNA or immature pre-crRNA.
  • the above-mentioned elements can be prepared by conventional methods (such as PCR method, artificial total synthesis), and then connected by conventional methods to form the nucleic acid construct of the present invention.
  • An enzyme cleavage reaction may optionally be performed before the ligation reaction, if desired.
  • nucleic acid construct of the present invention can be linear or circular.
  • the nucleic acid construct of the present invention can be single-stranded or double-stranded.
  • the nucleic acid construct of the present invention can be DNA, RNA, or DNA/RNA hybrid.
  • nNLS refers to the nuclear localization signal sequence NLS located at the 5' end of the MAD7 protein coding sequence
  • cNLS refers to the nuclear localization signal sequence NLS located at the 3' end of the MAD7 protein coding sequence
  • ncNLS refers to nuclear localization signal sequences (NLS) connected to both the 5' end and the 3' end of the MAD7 protein coding sequence.
  • exogenous gene refers to an exogenous DNA molecule that acts in stages.
  • the exogenous genes that can be used in this application are not particularly limited, and include various exogenous genes commonly used in the field of transgenic animals. Representative examples include (but are not limited to): β-glucuronidase gene, red fluorescent protein gene, green fluorescent protein gene, lysozyme gene, salmon calcitonin gene, lactoferrin, or serum albumin gene and the like.
  • selectable marker gene refers to a gene used to screen transgenic cells or transgenic animals during the transgenic process.
  • the selectable marker gene that can be used in this application is not particularly limited, and includes the various selectable marker genes commonly used in the transgenic field. Representative examples include, but are not limited to: hygromycin resistance gene (Hyg), kanamycin resistance gene (NPTII), neomycin resistance gene, or puromycin resistance gene.
  • the term "expression cassette” refers to a polynucleotide sequence that contains the sequence components of the gene to be expressed and the elements required for expression.
  • the term “selectable marker expression cassette” refers to a polynucleotide sequence comprising a sequence encoding a selectable marker and sequence modules of elements required for expression. Components required for expression include a promoter and polyadenylation signal sequence.
  • the screening marker expression cassette may or may not contain other sequences, including (but not limited to): enhancers, secretion signal peptide sequences, and the like.
  • plant promoter refers to a nucleic acid sequence capable of initiating transcription of a nucleic acid in a plant cell.
  • the plant promoter may be derived from plants, microorganisms (such as bacteria, viruses) or animals, or artificially synthesized or modified.
  • plant terminator refers to a terminator capable of stopping transcription in a plant cell.
  • the plant transcription terminator may be derived from plants, microorganisms (such as bacteria, viruses) or animals, or artificially synthesized or modified terminators. Representative examples include (but are not limited to): Nos terminators.
  • MAD7 protein refers to a nuclease. Typical MAD7 proteins include (but are not limited to):
  • the term "coding sequence of MAD7 protein” refers to a nucleotide sequence encoding MAD7 protein having cleavage activity.
  • the skilled artisan will recognize that, because of codon degeneracy, there are a large number of polynucleotide sequences that can encode the same polypeptide.
  • the skilled artisan will also recognize that different species have certain codon preferences, and that the codons of the MAD7 protein may be optimized according to the needs of expression in different species. These variants are all specifically covered by the term "coding sequence of MAD7 protein".
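The scale of the codon degeneracy mentioned above can be made concrete by counting how many distinct coding sequences encode a given peptide. The sketch below uses the standard genetic code (61 sense codons); the function name is illustrative, and it says nothing about which codons are actually preferred in rice.

```python
from math import prod

# Number of synonymous codons per amino acid in the standard genetic code.
SYNONYMOUS_CODONS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_coding_sequences(peptide: str) -> int:
    """Number of distinct nucleotide sequences that encode `peptide`."""
    return prod(SYNONYMOUS_CODONS[aa] for aa in peptide.upper())


# Three residues (Met-Lys-Arg) already allow 1 * 2 * 6 = 12 coding sequences.
n_variants = count_coding_sequences("MKR")
```

Because the count is a product over residues, it grows exponentially with protein length, which is why a codon-optimized sequence is one choice among a very large number of equivalent coding sequences.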
  • the term specifically includes a full-length sequence that is substantially identical to the MAD7 gene sequence, as well as a sequence that encodes a protein that retains the function of the MAD7 protein.
  • said C corresponds to the full-length or fusion protein of MAD7 protein
  • said second expression cassette is a crRNA expression cassette having the structure of formula II:
  • P2 is the second promoter
  • Each R is independently corresponding to a mature or immature direct repeat sequence (direct repeat, DR)
  • each S is independently absent or a target site guide sequence sg;
  • q is a positive integer ≥ 1;
  • T is none or polyT or Nos or polyA sequence.
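The repeat-and-guide layout defined above can be sketched as a string assembly. Formula II is not reproduced in this text, so the order used here (each DR followed by its sg, then the terminator) is an assumption; the DR string in the example is a placeholder and is not the MAD7 direct repeat, and the function name is hypothetical. The 17-35 bp guide-length check follows the range stated earlier in the document, and the default terminator is the seven-T sequence given for SEQ ID NO:2.

```python
def build_crrna_array(dr: str, guides: list[str], terminator: str = "TTTTTTT") -> str:
    """Concatenate (DR + sg) units followed by a terminator.

    Only the transcribed region is modelled; the promoter P2 is left out.
    """
    if not guides:
        raise ValueError("q must be at least 1")
    for sg in guides:
        if not 17 <= len(sg) <= 35:
            raise ValueError(f"guide length {len(sg)} is outside 17-35 bp")
    return "".join(dr + sg for sg in guides) + terminator


PLACEHOLDER_DR = "ACACACACACACACACACAC"  # hypothetical, NOT the MAD7 DR
array = build_crrna_array(PLACEHOLDER_DR, ["G" * 20, "C" * 21])
```

With real sequences, each guide would be the sg of one target site, so a four-site array corresponds to q = 4.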
  • the present invention also provides a vector or a combination of vectors, which contains the nucleic acid construct of the present invention.
  • the vector combination of the present invention further includes an auxiliary vector carrying a donor DNA expression cassette.
  • some elements are operably linked.
  • when a promoter is operably linked to a coding sequence, it means that the promoter is capable of initiating the transcription of the coding sequence.
  • the present invention also provides a reagent combination and a kit containing the above-mentioned vector or vector combination, which can be used in the plant gene editing method of the present invention.
  • the present invention also provides a method for gene editing of plants, comprising the steps of:
  • the gene editing includes gene knockout, site-directed insertion, gene replacement, or a combination thereof.
  • the plant gene editing method of the present invention can be used to improve various plants, especially crops.
  • plant includes whole plants, plant organs (eg, leaves, stems, roots, etc.), seeds and plant cells, as well as their progeny.
  • the types of plants that can be used in the method of the present invention are not particularly limited, and generally include any type of higher plants that can be subjected to transformation techniques, including monocots, dicots and gymnosperms.
  • the invention can be used in the field of plant genetic engineering, such as the study of plant gene function and crop genetic improvement.
  • the coding sequence of the MAD7 protein is a codon-optimized coding sequence for rice, and the specific sequence is shown in SEQ ID NO.1.
  • Sequence 2: the related sequence (SEQ ID NO.2) of the rice crRNA expression cassette of MAD7. The element structure of this expression cassette is shown in Figure 1. The underlined stretches at the start and end of the sequence mark the AvrII and AfeI restriction sites, respectively, which are included for convenient cloning into the pCAMBIA expression vector; black shading marks the mature DR sequence corresponding to MAD7; the sg site is boxed, and when a gene knockout vector is constructed this sequence is synthesized as complementary double strands and the full length is amplified by overlapping PCR with the PCR product carrying the OsU3 promoter; bold letters are the transcription terminator sequence; the rest is the sequence of the OsU3 promoter.
  • Sequence 3: the related sequence (SEQ ID NO.3) of the rice multi-gene site editing sg-DR crRNA expression cassette of MAD7. The element structure of the expression cassette is shown in Figure 2. The underline indicates the BsaI restriction site, which is included for convenient cloning into the pCAMBIA expression vector; black shading marks the mature DR sequence corresponding to MAD7; the sg sequences of the four targeting sites are boxed, and they are OsDEP1-g6, OsBEL260-g1, OsRoc5-g1 and OsHD3A; bold letters are the transcription terminator sequence; the rest is the sequence of the OsU3 promoter.
  • Sequence 4: the related sequence (SEQ ID NO.4) of the rice multi-gene site editing tRNA-sg-DR crRNA expression cassette of MAD7. The element structure of the expression cassette is shown in Figure 3. The underlined stretches at the start and end mark the AvrII and BsaI restriction sites, respectively, which are included for convenient cloning into the pCAMBIA expression vector; black shading marks the mature DR sequence corresponding to MAD7; the sg sequences of the four targeting sites are boxed and, in order from 5' to 3', are OsDEP1-g6, OsBEL260-g1, OsRoc5-g1 and OsHD3A; double underlines are tRNA sequences; bold letters are the transcription terminator sequence; the rest is the sequence of the OsU3 promoter.
  • Sequence 5: the related sequence (SEQ ID NO.5) of the rice multi-gene site editing miniOsU3/U6-DR-sg crRNA expression cassette of MAD7. The element structure of the expression cassette is shown in Figure 4. The underlined stretches at the start and end mark the AvrII and BsaI restriction sites, respectively, which are included for convenient cloning into the pCAMBIA expression vector; black shading marks the mature DR sequence corresponding to MAD7; the sg sequences of the four targeting sites are boxed and, in order from 5' to 3', are OsDEP1-g6, OsBEL260-g1, OsRoc5-g1 and OsHD3A; bold letters are the transcription terminator sequence; the rest is the sequence of the miniOsU3/U6 promoters.
  • Sequence 6: the related sequence (SEQ ID NO.6) of the rice multi-gene site editing HH-DR-sg-HDV crRNA expression cassette of MAD7. The element structure of the expression cassette is shown in Figure 5. The underline indicates the BsaI restriction site, which is included for convenient cloning into the pCAMBIA expression vector; black shading marks the mature DR sequence corresponding to MAD7; the sg sequences of the four targeting sites are boxed and, in order from 5' to 3', are OsDEP1-g6, OsBEL260-g1, OsRoc5-g1 and OsHD3A; bold letters are the transcription terminator sequence; double underlines and dots mark the hammerhead (HH) and Hepatitis delta virus (HDV) ribozyme sequences, respectively.
  • the 5' end of the HH nucleic acid sequence is complementary to the first 6 bases of the DR.
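The complementarity requirement between the HH ribozyme and the DR can be sketched as a reverse-complement calculation. This assumes ordinary antiparallel base pairing, so the 5' arm of the HH sequence is taken as the reverse complement of the first six DR bases; the function name and the example DR string are hypothetical, and the DNA alphabet is used because the cassette is described at the DNA level.

```python
# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def hh_pairing_arm(dr: str, n: int = 6) -> str:
    """Reverse complement of the first `n` bases of the DR (DNA alphabet)."""
    return dr[:n].upper().translate(_COMPLEMENT)[::-1]


# Placeholder DR used for illustration only, not the MAD7 direct repeat.
arm = hh_pairing_arm("AATTTCTACT")
```

For a multi-unit array, the arm only needs to be computed once when every unit starts with the same DR.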
  • Sequence 21: the related sequence (SEQ ID NO.21) of the maize crRNA expression cassette of MAD7. The element structure of this expression cassette is shown in Figure 6. The underlined stretches at the start and end each mark an ApaI restriction site, included for convenient cloning into the pCAMBIA expression vector; black shading marks the mature DR sequence corresponding to MAD7; the sg site is boxed, and when a gene knockout vector is constructed this sequence is synthesized as complementary double strands and the full length is amplified by overlapping PCR with the PCR product carrying the ZmU3 promoter; bold letters are the transcription terminator sequence; the rest is the sequence of the ZmU3 promoter.
  • MAD7 was codon-optimized for rice, and two expression cassettes, one for MAD7 and one for crRNA, were constructed and cloned into the pCAMBIA expression vector.
  • the crRNA expression cassette (SEQ ID NO: 2) has the following four elements from 5' to 3': the OsU6 or OsU3 promoter, the mature direct repeat (DR) corresponding to MAD7, the sg sequence, and the transcription terminator sequence (TTTTTTT).
  • the MAD7 expression cassette has the following elements from 5' to 3': the Ubi promoter from maize, an NLS nuclear localization signal sequence (nNLS), the coding sequence of MAD7, a second NLS nuclear localization signal sequence (cNLS), and the NOS transcription terminator sequence (Figure 7).
  • Another design of the present invention is to change the length of sg, or to remove the NLS nuclear localization signal sequence at the 5' or 3' end of the MAD7 coding sequence while keeping the other elements of the above construct (Figure 8), in order to explore the in vivo cleavage activity of MAD7 under different combinations.
  • Because MAD7 itself has RNase activity, it can cut and process the transcribed precursor crRNA sequence by itself. It is speculated that if multiple DR-sg sequences are connected in series in one crRNA expression cassette, the transcript can be cut by MAD7 into single DR-sg sequences, thereby achieving multi-site knockout. This design requires the construction of a multi-site crRNA expression cassette.
  • RNA sequence corresponding to MAD7 (including DR sequence), sg1 corresponding to target site 1 sequence, crRNA sequence, sg2 sequence corresponding to target site 2, crRNA sequence, sg3 sequence corresponding to target site 3, crRNA sequence, sgN sequence corresponding to target site N, transcription terminator sequence (TTTTTTTT).
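The idea of one transcript being cut into single DR-sg units can be illustrated at the string level. This is a deliberately idealized sketch: it splits the transcript at the start of every DR occurrence, whereas the real cleavage position within the repeat is not specified in this text. The function name and the toy sequences are hypothetical.

```python
def split_pre_crrna(transcript: str, dr: str) -> list[str]:
    """Split a tandem array into units that each begin with the DR.

    Idealization: the cut is placed exactly at the start of each DR occurrence.
    """
    units = []
    start = transcript.find(dr)
    while start != -1:
        # Look for the next repeat after the current one to find the unit end.
        nxt = transcript.find(dr, start + len(dr))
        end = nxt if nxt != -1 else len(transcript)
        units.append(transcript[start:end])
        start = nxt
    return units


# Toy array with a placeholder 4-base "DR" and two 4-base "guides".
units = split_pre_crrna("ACACGGGGACACCCCC", "ACAC")
```

In this model a transcript carrying N DR-sg units yields N fragments, one per target site; any trailing terminator bases stay attached to the last unit.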
  • Another design of the present invention is that multiple DR-sg sequences are each driven by miniOsU3/U6 promoters, or that different DR-sg sequences are connected by tRNA processing recognition sequences or by HH-HDV spacers; the MAD7 expression cassette is constructed in the same way as for the single-site knockout.
  • the above two expression cassettes of crRNA and MAD7 nuclease were cloned into the pCAMBIA expression vector (Fig. 9-13).
  • MAD7 cuts like Cpf1 to produce sticky ends, which is theoretically more suitable for directional insertion of foreign fragments.
  • Using CRISPR-MAD7 to create a DSB near the target site, and using gene gun bombardment or DNA virus replication to introduce a large number of foreign fragments, can efficiently achieve homologous recombination or directional insertion in plant cells.
  • Example 1 Using the CRISPR-MAD7 (ncNLS) system to perform single-site knockout of endogenous genes in rice
  • A targeting site in the OsHD3A promoter that LbCpf1 edits efficiently (LOC_Os06g06320, editing efficiency 81.5%) was used, and it was connected to the pCAMBIA-CRISPR-MAD7 expression vector through the AvrII and AfeI restriction sites.
  • the constructed vector was transformed into Agrobacterium EHA105, and then the callus of rice variety Nanjing 46 (Oryza sativa ssp japonica cv. The transformed callus was transferred to the selection medium containing hygromycin, and after 28-30 days of culture, it was transferred to the differentiation medium containing hygromycin to regenerate plants.
  • the regenerated plants were sampled to extract DNA, MAD7-positive individual plants were detected by Taqman, and the target sites of the positive individual plants were amplified and sequenced. The results show that the system can produce targeted mutations in rice cells and generate mutant plants.
  • the editing efficiency at the OsHD3A site was 89.9% and 94.9%, respectively, and the frequency of both alleles being edited at the same time (biallelic, including homozygous) was as high as 77.5% and 78.1%, which is comparable to the editing efficiency of LbCpf1; there was no significant difference between the editing efficiencies under the two different temperature treatments (Table 1).
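The two percentages reported above are simple ratios, sketched below. The denominator is assumed to be the number of MAD7-positive T0 plants, since sequencing was performed on positive plants; the function name and the counts in the example are hypothetical and are not data from this document.

```python
def editing_rates(n_positive: int, n_edited: int, n_biallelic: int) -> tuple[float, float]:
    """Return (editing %, biallelic %) relative to MAD7-positive plants."""
    if n_positive <= 0:
        raise ValueError("n_positive must be positive")
    if not 0 <= n_biallelic <= n_edited <= n_positive:
        raise ValueError("expected 0 <= n_biallelic <= n_edited <= n_positive")
    editing = round(100 * n_edited / n_positive, 1)
    biallelic = round(100 * n_biallelic / n_positive, 1)
    return editing, biallelic


# Hypothetical counts for illustration only.
rates = editing_rates(n_positive=80, n_edited=72, n_biallelic=60)
```

Biallelic plants are a subset of edited plants, which is why the second percentage can never exceed the first.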
  • Table 1 The PAM-sg sequence of rice endogenous gene single-site knockout in CRISPR/MAD7(ncNLS) system and the identification results of T0 generation plants
  • Example 2 Using the CRISPR-MAD7 (ncNLS) system to perform single-site knockout of multiple endogenous genes in rice
  • five Cpf1 editing sites in rice reported in the literature were selected (OsRLK-799-g1, LOC_Os02g07960; OsDEP1-g6, LOC_Os09g26999; OsALS-g7, LOC_Os02g30630; OsBEL260-g1, LOC_Os03g55260; OsRoc5-g1, LOC_Os02g45250), together with newly designed CRISPR-MAD7 targeting sites (OsDEP1-g7; OsPDS1-g3, LOC_Os03g08570; OsRoc5-g2 and OsBEL260-g4); DR-sgRNA was artificially synthesized, and the overlap PCR amplification product was connected to the pCAMBIA-CRISPR-MAD7 expression vector through restriction sites.
  • the constructed vector was transformed into Agrobacterium EHA105, and the callus of rice variety Nanjing 46 was then infected with this strain. After 3 days of co-cultivation, the culture was recovered at 30°C for 4 days, and the transformed callus was transferred to selection medium containing hygromycin. After 28-30 days of culture, it was transferred to differentiation medium containing hygromycin to regenerate plants; the regenerated plants were sampled to extract DNA, MAD7-positive individual plants were detected by Taqman, and the target sites of the positive individual plants were amplified and sequenced.
  • As in Example 1, the CRISPR-MAD7 system PCR amplification products were connected to the pCAMBIA-CRISPR-MAD7 expression vector through the BsrGI and AvrII restriction sites, or through the ApaI restriction site, to construct the pCAMBIA-CRISPR-nNLS-MAD7 system (C-terminal NLS nuclear localization sequence of the MAD7 nuclease removed) and the pCAMBIA-CRISPR-MAD7-cNLS system (N-terminal NLS nuclear localization sequence of the MAD7 nuclease removed).
  • The constructed vector was transformed into Agrobacterium EHA105, and callus of rice variety Nanjing 46 was then infected with this strain.
  • After 4 days of recovery culture at 30°C, the transformed callus was transferred to selection medium containing hygromycin.
  • After 28-30 days of selection culture, it was transferred to differentiation medium containing hygromycin to regenerate plants.
  • DNA was extracted from samples of the regenerated plants, MAD7-positive individual plants were identified by TaqMan assay, and the target sites of the positive plants were amplified and sequenced.
  • Table 3 The identification results of PAM-sg sequence and T0 generation plants of rice endogenous gene single site knockout in CRISPR-MAD7 system with different NLS positions and numbers
  • The constructed vector was transformed into Agrobacterium EHA105, and callus of rice variety Nanjing 46 was then infected with this strain. After 3 days of co-cultivation and 4 days of recovery culture at 30°C, the transformed callus was transferred to selection medium containing hygromycin. After 28-30 days of selection culture, it was transferred to differentiation medium containing hygromycin to regenerate plants.
  • DNA was extracted from samples of the regenerated plants, MAD7-positive individual plants were identified by TaqMan assay, and the target sites of the positive plants were amplified and sequenced.
  • the results showed that as the length of sg increased from 19bp, 21bp, 23bp to 25bp, the editing frequency and biallelic (including homo) frequency in rice cells decreased from 95.7% and 89.2% to 85.9% and 56.5%, respectively. Except for the significant reduction in the biallelic frequency of the length, there was no significant difference in the editing rate and biallelic frequency between sg vectors of different lengths (Table 4).
  • Example 5 The CRISPR-MAD7 vector comprising the DR-sg sequence string is used for multi-site knockout in rice
  • MAD7 has the ability to autonomously cut and process pre-crRNA
  • This example uses four previously reported Cpf1 editing sites from Example 2 (OsDEP1-g6, OsBEL260-g1, OsRoc5-g1, OsHD3A-g22), interspaced by the mature DR sequence of MAD7 and placed under the control of a single OsU3 promoter (SEQ ID NO.3).
  • This expression cassette was then ligated to the MAD7 expression cassette and placed within the LB and RB sequences of pCAMBIA.
  • The constructed vector was transformed into Agrobacterium EHA105, and callus of rice variety Nanjing 46 was then infected with this strain.
  • After 4 days of recovery culture at 30°C, the transformed callus was transferred to selection medium containing hygromycin.
  • After 28-30 days of selection culture, it was transferred to differentiation medium containing hygromycin to regenerate plants.
  • DNA was extracted from samples of the regenerated plants, MAD7-positive individual plants were identified by TaqMan assay, and the target sites of the positive plants were amplified and sequenced.
  • the results showed that the mutation frequencies of the four genes in this system were 34.0%, 80.9%, 3.2% and 3.2%, respectively, and the simultaneous editing efficiencies of two alleles were 11.7%, 59.6%, 0% and 0%, respectively.
  • the first and second genes of this system can be efficiently edited in rice cells.
  • The DR-sg sequence string of Example 5 is joined at intervals through the RNase recognition sites of the endogenous tRNA processing system, and is under the control of a single OsU3 promoter (SEQ ID NO.4).
  • This expression cassette was then ligated to the MAD7 expression cassette and placed within the LB and RB sequences of pCAMBIA.
  • The constructed vector was transformed into Agrobacterium EHA105, and callus of rice variety Nanjing 46 was then infected with this strain. After 3 days of co-cultivation and 4 days of recovery culture at 30°C, the transformed callus was transferred to selection medium containing hygromycin.
  • After 28-30 days of selection culture, it was transferred to differentiation medium containing hygromycin to regenerate plants.
  • DNA was extracted from samples of the regenerated plants, MAD7-positive individual plants were identified by TaqMan assay, and the target sites of the positive plants were amplified and sequenced.
  • the results showed that the mutation frequencies of the four genes in this system were 49.4%, 91.0%, 71.9% and 68.2%, respectively, and the simultaneous editing efficiencies of two alleles were 14.6%, 78.7%, 58.4% and 47.7%, respectively.
  • the frequency of simultaneous mutations accounted for 38.2% of positive individual plants, and the system can efficiently edit multiple genes simultaneously in rice cells (Table 6).
  • Example 7 The use of miniOsU3/miniOsU6 to drive the CRISPR-MAD7 vector of the DR-sg sequence for multi-site knockout in rice
  • the DR-sg sequence strings in Example 5 were respectively driven by miniOsU3/miniOsU6 (SEQ ID NO.5).
  • This expression cassette was then ligated to the MAD7 expression cassette and placed within the LB and RB sequences of pCAMBIA.
  • The constructed vector was transformed into Agrobacterium EHA105, and callus of rice variety Nanjing 46 was then infected with this strain. After 3 days of co-cultivation and 4 days of recovery culture at 30°C, the transformed callus was transferred to selection medium containing hygromycin. After 28-30 days of selection culture, it was transferred to differentiation medium containing hygromycin to regenerate plants.
  • DNA was extracted from samples of the regenerated plants, MAD7-positive individual plants were identified by TaqMan assay, and the target sites of the positive plants were amplified and sequenced.
  • the results show that the system can efficiently edit multiple genes simultaneously in rice cells.
  • The mutation frequencies of the four genes were 44.4%, 94.4%, 92.2% and 90.0%, respectively; the efficiencies of simultaneous editing of both alleles were 22.2%, 93.3%, 86.7% and 87.8%, respectively; and simultaneous mutation of the four genes was found in 42.2% of the positive individual plants.
  • This system can efficiently perform multi-gene editing in rice cells simultaneously (Table 7).
  • Example 8 The CRISPR-MAD7 vector driven by the Pol II type promoter and the HH-HDV tandem DR-sg sequence is used for multi-site knockout in rice
  • The DR-guide array driven by the OsU3 or miniOsU3/miniOsU6 promoter can achieve efficient knockout of four gene loci in rice. However, U3 and U6 are both Pol III type promoters: their capacity to drive long transcripts is limited, and they cannot be activated in a condition-specific or tissue-specific manner. Pol II type promoters can effectively overcome these shortcomings.
  • A crRNA expression cassette driven by the Pol II type maize Ubiquitin promoter was constructed, in which two ribozymes with RNA self-cleavage activity, the hammerhead ribozyme (HH) and the hepatitis delta virus ribozyme (HDV), were used to separate the transcribed DR-guide target sequences (SEQ ID NO.6).
  • An OsU3 promoter-driven crRNA expression cassette was constructed as a control, and the two crRNA expression cassettes were cloned into the pCAMBIA expression vector carrying the original MAD7 expression cassette.
  • The constructed vector was transformed into Agrobacterium EHA105, and callus of rice variety Nanjing 46 was then infected with this strain.
  • After 4 days of recovery culture at 30°C, the transformed callus was transferred to selection medium containing hygromycin.
  • After 28-30 days of selection culture, it was transferred to differentiation medium containing hygromycin to regenerate plants.
  • DNA was extracted from samples of the regenerated plants, MAD7-positive individual plants were identified by TaqMan assay, and the target sites of the positive plants were amplified and sequenced.
  • The mutation frequencies of the four genes in the ZmUbi-HH-HDV system were 82.6%, 92.1%, 88.8% and 93.3%, respectively, and the efficiencies of simultaneous editing of both alleles were 62.8%, 84.3%, 84.3% and 92.1%, respectively (Table 8). This performance even exceeds the editing efficiency of single-site knockout, and simultaneous mutation of the four genes was found in 77.5% of the positive individual plants.
  • This system can perform multi-gene editing in rice cells at the same time with very high efficiency (Table 8).
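The per-target frequencies reported above for the four multiplex designs can be placed side by side. The following is a minimal Python sketch that only re-tabulates the percentages quoted in the text; the dictionary labels and the function names are illustrative shorthand, and averaging across the four targets is an assumption made here for a rough comparison, not a statistic used in the source.

```python
# Mutation and biallelic frequencies (%) for the four target genes, copied from
# the results quoted in the text. The keys are shorthand labels (assumptions).
REPORTED = {
    "OsU3 DR-sg array": {"mutation": [34.0, 80.9, 3.2, 3.2],
                         "biallelic": [11.7, 59.6, 0.0, 0.0]},
    "OsU3 tRNA DR-sg": {"mutation": [49.4, 91.0, 71.9, 68.2],
                        "biallelic": [14.6, 78.7, 58.4, 47.7]},
    "miniOsU3/miniOsU6": {"mutation": [44.4, 94.4, 92.2, 90.0],
                          "biallelic": [22.2, 93.3, 86.7, 87.8]},
    "ZmUbi HH-HDV": {"mutation": [82.6, 92.1, 88.8, 93.3],
                     "biallelic": [62.8, 84.3, 84.3, 92.1]},
}


def mean(values):
    """Arithmetic mean of a non-empty list of percentages."""
    return sum(values) / len(values)


def rank_by_mean(data, key="mutation"):
    """Return design labels sorted from highest to lowest mean frequency."""
    return sorted(data, key=lambda name: mean(data[name][key]), reverse=True)


if __name__ == "__main__":
    for name in rank_by_mean(REPORTED):
        row = REPORTED[name]
        print(f"{name:20s} mutation {mean(row['mutation']):5.1f}%  "
              f"biallelic {mean(row['biallelic']):5.1f}%")
```

Under this rough summary the ribozyme-flanked Pol II design ranks first and the plain OsU3 DR-sg array last, which matches the ordering described in the text.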
  • Example 9 Using the CRISPR-MAD7 (ncNLS) system to perform single-site knockout of endogenous genes in maize
  • CRISPR-MAD7 ncNLS
  • the editing site of Cpf1 in maize glossy2, Zm00001d002353
  • DR-sgRNA was artificially synthesized
  • the overlap PCR amplification product was ligated into the pCAMBIA-CRISPR-MAD7 expression vector through the ApaI restriction site.
  • the constructed vector was transformed into Agrobacterium LBA4404, and then the immature embryos of maize variety B104 were infected by this strain. After 7 days of co-cultivation, the culture was resumed at 28°C for two weeks. The transformed immature embryos were transferred to the selection medium containing mannose and cultured.

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Engineering & Computer Science (AREA)
  • Chemical & Material Sciences (AREA)
  • Zoology (AREA)
  • Organic Chemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Wood Science & Technology (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Biotechnology (AREA)
  • Microbiology (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Medicinal Chemistry (AREA)
  • Virology (AREA)
  • Cell Biology (AREA)
  • Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Plant Pathology (AREA)
  • Breeding Of Plants And Reproduction By Means Of Culturing (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)

Abstract

An MAD7-NLS fusion protein, having the following structure: B1-C-B2, B1-C, or C-B2, wherein C is an MAD7 protein, and B1 and B2 are independently nuclear localization signal (NLS) sequences. A nucleic acid construct for site-directed editing of a plant genome, comprising a first expression cassette, the first expression cassette comprising a first promoter, a coding nucleotide sequence of the MAD7-NLS fusion protein, and a first terminator which are connected in sequence. An application of the nucleic acid construct in editing of a plant gene. Single-gene knockout, multi-gene knockout, or homologous recombination and directional insertion of an exogenous fragment can be simply, conveniently, and efficiently performed at a predetermined plant genome site by using MAD7.

Description

MAD7-NLS fusion protein, nucleic acid construct for site-directed editing of plant genomes, and application thereof
Technical Field
The present invention relates to the field of biotechnology. Specifically, it relates to a MAD7-NLS fusion protein, a nucleic acid construct for site-directed editing of plant genomes, and an efficient RNA-guided method for site-directed gene editing of plant genomes based on the novel nuclease MAD7.
Background
Gene-targeted editing tools such as zinc finger nucleases (ZFN), transcription activator-like effector nucleases (TALEN) and the clustered regularly interspaced short palindromic repeats (CRISPR) system have come into widespread use. The CRISPR system in particular, owing to its high activity, good specificity, and simple and rapid operation, has developed very quickly in recent years and has been applied successfully in a large number of species. The most widely used nucleases at present are the type II nuclease Cas9 and Cpf1 (Cas12a). Cas9 recognizes 3′ G-rich sites, whereas Cpf1 (Cas12a) recognizes 5′ A-T-rich sites. Compared with Cas9, the Cas12a protein functions as both a DNA-cleaving enzyme and an RNA-processing enzyme: it not only cleaves double-stranded DNA at the target, but also processes the corresponding immature crRNA (pre-crRNA) into mature crRNA (Fonfara et al., 2016), without requiring a tracrRNA. In addition, the Cas12a protein is relatively smaller and more specific, which gives it an advantage over Cas9 for multi-target gene editing. Furthermore, whereas Cas9 cleavage of the genome produces blunt ends, Cas12a cleavage produces sticky ends, which are more favorable for the directional insertion of exogenous genes.
MAD7 belongs to the class 2, type V-A Cpf1-like family. It was discovered in the genus Eubacterium and optimized by Inscripta, and is available free of charge to research institutes and for commercial research. Of the known proteins, MAD7 is most homologous to AsCpf1, yet the homology is only 31%; its PAM recognition site is YTTN. The MAD7 system has so far been shown to have high editing activity in bacteria, yeast, zebrafish, mouse and human cells. To extend its range of application to plants, it was codon-optimized for rice and its editing efficiency in plants was studied. Although application of the MAD7 system in rice has recently been reported, the mutation efficiency was 49-65.6%. In summary, to meet the needs of plant genetic engineering and to enrich the plant gene editing toolbox, it is necessary to develop a more efficient gene editing system.
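To illustrate the YTTN PAM rule, the following minimal Python sketch scans one DNA strand for candidate MAD7 target sites. It is an illustration under stated assumptions rather than the applicant's design procedure: the function name is hypothetical, the guide is assumed to lie immediately 3′ of the PAM, the default guide length of 21 nt is an arbitrary choice, and only the strand supplied is scanned (a full search would also scan the reverse complement).

```python
import re


def find_yttn_sites(seq, guide_len=21):
    """Return (pam_start, pam, protospacer) for every YTTN PAM on the given strand.

    Y is the IUPAC code for C or T; N is any base. The protospacer is assumed
    to start directly after the 4-nt PAM (an assumption of this sketch).
    """
    if guide_len < 1:
        raise ValueError("guide_len must be positive")
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([CT]TT[ACGT])([ACGT]{%d}))" % guide_len)
    return [(m.start(), m.group(1), m.group(2))
            for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    # Toy sequence, not a rice gene: one TTTC PAM followed by a 21-nt protospacer.
    toy = "AATTTC" + "GAC" * 7 + "GG"
    for start, pam, guide in find_yttn_sites(toy):
        print(start, pam, guide)
```

Candidates returned this way would still need to be checked for specificity against the whole genome before use.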
Summary of the Invention
The object of the present invention is to provide a MAD7-based CRISPR/MAD7 system for efficient site-directed editing of plant genomes, which can simply and efficiently achieve single-gene knockout, multi-gene knockout, and homologous recombination or site-directed knock-in of exogenous fragments in monocotyledonous and dicotyledonous plants.
The present invention first provides a MAD7-NLS fusion protein having the following structure:
B1-C-B2, B1-C or C-B2;
wherein
C is the MAD7 protein;
B1 and B2 are each independently a nuclear localization signal (NLS) sequence.
In one embodiment of the present invention, the nuclear localization signal (NLS) sequence is selected from one of, or a combination of any two or more of: SV40, KRP2 (Kip-related protein gene No. 2), MDM2, CDc25C, DPP9, MTA1, CBP80, AreA, M9, Rev, hTAP, MyRF, EBNA-6, TERT and Tfam.
In one embodiment of the present invention, the N-terminus of the MAD7-NLS fusion protein further comprises a signal peptide and/or a protein tag sequence.
Another aspect of the present invention provides a nucleic acid construct for site-directed editing of plant genomes, comprising a first expression cassette, the first expression cassette comprising, connected in sequence, a first promoter, a nucleotide sequence encoding the MAD7-NLS fusion protein described above, and a first terminator.
Preferably, the first promoter is a Pol II type promoter, preferably one of, or a combination of any two of: Ubi, Actin, CmYLCV, UBQ, 35S, SPL, the tissue-specific promoters YAO, CDC45 and rbcS, and the inducible promoter XEV.
In one embodiment of the present invention, the nucleic acid construct further comprises a second expression cassette, the second expression cassette comprising, connected in sequence, a second promoter and several tandem repeat sequences.
The repeat sequence is one or both of a mature direct repeat (DR) sequence and an immature direct repeat sequence. Preferably, the second promoter is a Pol II type or Pol III type promoter; more preferably, the second promoter is selected from one, two or more of OsU3, OsU6a, OsU6b, OsU6c, Actin, 35S, Ubi, UBQ, SPL, CmYLCV, the tissue-specific promoters YAO, CDC45 and rbcS, and the inducible promoter XEV.
In one embodiment of the present invention, the second expression cassette further comprises a second termination sequence connected to the end of the repeat sequences.
Preferably, the second termination sequence is selected from polyT, NOS, polyA, or a combination thereof.
In one embodiment of the present invention, the repeat sequence further comprises a target-site guide sequence sg; preferably, the sg sequence is 17-35 bp long, preferably 19-28 bp, more preferably 19-25 bp.
Preferably, the number of repeats of the repeat sequence is 2-50, preferably 2-10, more preferably 3-15, and further preferably 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 or 12.
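The layout of the second expression cassette described above (tandem DR-sg units followed by a termination sequence) can be sketched in Python as follows. This is a hedged illustration only: the function name is hypothetical, the DR used in the usage example is a dummy placeholder and not the MAD7 direct repeat, and the seven-T terminator merely stands in for the polyT option. The length and repeat-number checks use the broadest ranges stated in the text (sg of 17-35 bp, 2-50 repeats).

```python
def build_dr_sg_array(dr, guides, terminator="TTTTTTT"):
    """Concatenate DR-sg units in order and append a termination sequence.

    dr         -- direct repeat sequence (supplied by the caller)
    guides     -- list of guide (sg) sequences, one per target site
    terminator -- termination sequence; seven T's is an assumed polyT stand-in
    """
    if not 2 <= len(guides) <= 50:
        raise ValueError("number of repeats must be between 2 and 50")
    for guide in guides:
        if not 17 <= len(guide) <= 35:
            raise ValueError("sg length must be between 17 and 35 bp: %r" % guide)
    # Each unit is one direct repeat followed by one guide sequence.
    return "".join(dr + guide for guide in guides) + terminator


if __name__ == "__main__":
    # Dummy sequences for illustration; not real DR or guide sequences.
    array = build_dr_sg_array("AAAA", ["C" * 21, "G" * 21])
    print(array)
```

The promoter is deliberately left out of the sketch, since the text allows either Pol II or Pol III type promoters upstream of the same array.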
Especially preferably, the nucleic acid construct is one or more selected from SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6 and SEQ ID NO:21.
In one embodiment of the present invention, the nucleic acid construct is a vector comprising both the first expression cassette and the second expression cassette, or
a vector combination consisting of a first vector comprising the first expression cassette and a second vector comprising the second expression cassette.
The present invention also provides a kit for gene editing in plants, characterized in that it comprises the nucleic acid construct described above; preferably, it further comprises a helper vector carrying a donor DNA expression cassette.
A further aspect of the present invention provides a method for plant gene editing, comprising:
(i) introducing the nucleic acid construct described above, and optionally a donor nucleic acid fragment, into a plant cell, plant tissue or plant body, and carrying out gene editing in the plant cell, plant tissue or plant body;
(ii) screening and identifying the plant cell, plant tissue or plant body in which the gene editing has occurred;
(iii) regenerating or culturing the plant cell, plant tissue or plant body identified in step (ii) as having undergone the gene editing.
Preferably, the gene editing comprises one of gene knockout, site-directed insertion and gene replacement, or any combination of two or three thereof.
Preferably, the gene editing is single-site or multi-site gene editing.
Preferably, the introduction is achieved by any method selected from Agrobacterium-mediated transformation, the gene gun method, microinjection, electroporation, the ultrasonic method and the polyethylene glycol (PEG)-mediated method.
Preferably, the plant is selected from any of the grass family (Poaceae), the legume family, the Solanaceae and the cruciferous plants.
More preferably, the plant is selected from any of Arabidopsis thaliana, wheat, barley, oat, maize, rice, sorghum, millet, soybean, peanut, tobacco and tomato.
The beneficial effects of the above technical solution of the present invention are as follows:
The MAD7-based CRISPR/MAD7 system of the present invention for efficient site-directed editing of plant genomes can simply and efficiently achieve single-gene knockout, multi-gene knockout, and homologous recombination or site-directed knock-in of exogenous fragments in monocotyledonous and dicotyledonous plants. Compared with the conventional CRISPR/Cas9 and CRISPR/Cpf1 gene editing systems, the CRISPR/MAD7 system provided by the present invention has higher specificity and also allows more flexible selection of target sites.
Brief Description of the Drawings
Figure 1 is a schematic diagram of the structural composition of the elements in the sequence of the rice crRNA expression cassette for MAD7 (SEQ ID NO.2) according to the present invention.
Figure 2 is a schematic diagram of the structural composition of the elements in the sequence of the rice multi-locus editing sg-DR crRNA expression cassette for MAD7 (SEQ ID NO.3) according to the present invention.
Figure 3 is a schematic diagram of the structural composition of the elements in the sequence of the rice multi-locus editing tRNA-sg-DR crRNA expression cassette for MAD7 (SEQ ID NO.4) according to the present invention.
Figure 4 is a schematic diagram of the structural composition of the elements in the sequence of the rice multi-locus editing miniOsU3/U6-DR-sg crRNA expression cassette for MAD7 (SEQ ID NO.5) according to the present invention.
Figure 5 is a schematic diagram of the structural composition of the elements in the sequence of the rice multi-locus editing HH-DR-sg-HDV crRNA expression cassette for MAD7 (SEQ ID NO.6) according to the present invention.
Figure 6 is a schematic diagram of the structural composition of the elements in the sequence of the maize crRNA expression cassette for MAD7 (SEQ ID NO.21) according to the present invention.
Figure 7 is a schematic diagram of the construction of the CRISPR-MAD7 (ncNLS) expression vector for single-site knockout in rice.
Figure 8 is a schematic diagram of the construction of the expression vectors with the three NLS placement designs used for single-site knockout of the OsHD3A site in rice.
Figure 9 is a schematic diagram of the construction of the CRISPR-MAD7 expression vector comprising the DR-sg sequence string for multi-site knockout in rice.
Figure 10 is a schematic diagram of the construction of the CRISPR-MAD7 expression vector with tRNA-linked tandem DR-sg sequences for multi-site knockout in rice.
Figure 11 is a schematic diagram of the construction of the CRISPR-MAD7 expression vector in which miniOsU3/miniOsU6 separately drive the DR-sg sequences.
Figure 12 is a schematic diagram of the construction of the CRISPR-MAD7 expression vector driven by a Pol III type promoter, with HH-HDV tandem DR-sg sequences.
Figure 13 is a schematic diagram of the construction of the CRISPR-MAD7 expression vector driven by a Pol II type promoter, with HH-HDV tandem DR-sg sequences.
Figure 14 is a schematic diagram of the construction of the CRISPR-MAD7 (ncNLS) expression vector for single-site knockout in maize.
Detailed Description
To make the technical problems to be solved, the technical solutions and the advantages of the present invention clearer, a detailed description is given below with reference to the drawings and specific examples.
After extensive and in-depth research, the inventors constructed a MAD7-based, highly versatile and highly specific CRISPR/MAD7 system for efficient site-directed editing of plant genomes, together with nucleic acid constructs, vectors or vector combinations for site-directed editing of plant genomes, and a method for site-directed editing of plant genomes. Specifically, with the method of the present invention, single-gene knockout, multi-gene knockout, or homologous recombination and directional insertion of exogenous fragments can be carried out simply and efficiently at predetermined sites in a plant genome. The present invention was completed on this basis.
Specifically, the CRISPR/MAD7 editing system of the present invention uses mature crRNA. Because mature crRNA is shorter, easier to synthesize and more readily delivered into cells, a CRISPR-MAD7 system based on mature crRNA is more advantageous. On this basis, technical solutions placing a nuclear localization signal (NLS) sequence at the N-terminus and the C-terminus of the MAD7 protein were also developed. Experimental results show that gene editing in rice was successfully achieved with these technical solutions, with an efficiency of up to 95%.
Terms
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. All publications and other references mentioned herein are incorporated herein by reference.
As used herein, “containing”, “having” or “including” encompasses “comprising”, “consisting mainly of”, “consisting essentially of”, and “consisting of”.
As used herein, the term “operably linked” or “operably linked to” refers to a situation in which certain parts of a linear DNA sequence can regulate or control the activity of other parts of the same linear DNA sequence. For example, a promoter is operably linked to a coding sequence if it controls transcription of that sequence.
MAD7-based nucleic acid constructs and methods for efficient site-directed editing of plant genomes
The present invention provides a nucleic acid construct for efficient site-directed editing of plant genomes, the nucleic acid construct comprising a first expression cassette and an optional second expression cassette;
wherein the first expression cassette is a MAD7-NLS fusion protein expression cassette, in which the MAD7-NLS fusion protein has the structure of Formula I:
P1-A-B1-C-B2-E1      (I)
wherein
P1 is the first promoter;
A is absent, or is a signal peptide and/or a protein tag sequence;
B1 is absent or is a nuclear localization signal (NLS) sequence;
B2 is absent or is a nuclear localization signal (NLS) sequence;
with the proviso that at most one of B1 and B2 is absent;
C is the MAD7 protein;
E1 is the first terminator;
the second expression cassette is a crRNA expression cassette, which contains a coding sequence corresponding to mature crRNA or immature pre-crRNA.
In the present invention, each of the above elements can be prepared by conventional methods (such as PCR or total artificial synthesis) and then joined by conventional methods to form the nucleic acid construct of the present invention. If necessary, a restriction digestion may optionally be performed before the ligation reaction.
In addition, the nucleic acid construct of the present invention may be linear or circular, and may be single-stranded or double-stranded. The nucleic acid construct of the present invention may be DNA, RNA, or a DNA/RNA hybrid.
As used herein, nNLS means that the nuclear localization signal (NLS) sequence is located at the 5' end of the MAD7 protein coding sequence; cNLS means that the NLS sequence is located at the 3' end of the MAD7 protein coding sequence; and ncNLS means that an NLS sequence may be linked to both the 5' end and the 3' end of the MAD7 protein coding sequence.
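The three NLS arrangements and the proviso of Formula I (at most one of B1 and B2 absent) can be captured in a few lines. The following Python sketch is an illustration only; the function name and its boolean parameters are hypothetical and simply restate the nNLS, cNLS and ncNLS definitions given in the text.

```python
def nls_layout(has_n_terminal_nls, has_c_terminal_nls):
    """Name the NLS arrangement of a MAD7-NLS fusion (B1-C-B2, B1-C or C-B2)."""
    if has_n_terminal_nls and has_c_terminal_nls:
        return "ncNLS"  # B1-C-B2: NLS at both ends of the MAD7 coding sequence
    if has_n_terminal_nls:
        return "nNLS"   # B1-C: NLS at the 5' end only
    if has_c_terminal_nls:
        return "cNLS"   # C-B2: NLS at the 3' end only
    # Formula I requires that at most one of B1 and B2 is absent.
    raise ValueError("at least one of B1 and B2 must be an NLS")


if __name__ == "__main__":
    for flags in [(True, True), (True, False), (False, True)]:
        print(flags, nls_layout(*flags))
```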
如本文所用,“外源基因”指作用是阶段性作用的外源DNA分子。可用于本申请的外源基因没有特别限制,包括转基因动物领域常用的各种外源基因。代表性例子包括(但并不限于):β-葡萄糖苷酸酶基因、红色荧光蛋白基因、绿色荧光蛋白基因、溶菌酶基因、鲑鱼降钙素基因、乳铁蛋白、或血清白蛋白基因等。As used herein, "exogenous gene" refers to an exogenous DNA molecule that acts in stages. The exogenous genes that can be used in this application are not particularly limited, and include various exogenous genes commonly used in the field of transgenic animals. Representative examples include (but are not limited to): β-glucuronidase gene, red fluorescent protein gene, green fluorescent protein gene, lysozyme gene, salmon calcitonin gene, lactoferrin, or serum albumin gene and the like.
如本文所用,“筛选标记基因”指转基因过程中用来筛选转基因细胞或转基因动物的基因,可用于本申请的筛选标记基因没有特别限制,包括转基因领域常用的各种筛选标记基因,代表性例子包括(但并不限于):潮霉素抗性基因(Hyg)、卡那霉素抗性基因(NPTII)、新霉素、或嘌呤霉素抗性基因。As used herein, "selectable marker gene" refers to a gene used to screen transgenic cells or transgenic animals during the transgenic process. The selectable marker gene that can be used in this application is not particularly limited, including various commonly used in the field of transgenic. Representative examples Including, but not limited to: hygromycin resistance gene (Hyg), kanamycin resistance gene (NPTII), neomycin, or puromycin resistance gene.
As used herein, the term "expression cassette" refers to a polynucleotide sequence containing a gene to be expressed together with the sequence components of the elements required for its expression. For example, in the present invention, the term "selectable marker expression cassette" refers to a polynucleotide sequence containing a sequence encoding a selectable marker and the sequence components of the elements required for expression. The components required for expression include a promoter and a polyadenylation signal sequence. In addition, the selectable marker expression cassette may or may not contain other sequences, including (but not limited to) enhancers and secretion signal peptide sequences.
As used herein, the term "plant promoter" refers to a nucleic acid sequence capable of initiating transcription of a nucleic acid in a plant cell. The plant promoter may be derived from a plant, a microorganism (such as a bacterium or virus), or an animal, or may be an artificially synthesized or modified promoter.
As used herein, the term "plant terminator" refers to a terminator capable of stopping transcription in a plant cell. The plant transcription terminator may be derived from a plant, a microorganism (such as a bacterium or virus), or an animal, or may be an artificially synthesized or modified terminator. Representative examples include (but are not limited to): the Nos terminator.
As used herein, the term "MAD7 protein" refers to a nuclease. Typical MAD7 proteins include (but are not limited to):
ErCas12a (Eubacterium rectale)
As used herein, the term "coding sequence of the MAD7 protein" refers to a nucleotide sequence encoding a MAD7 protein having cleavage activity. Where an inserted polynucleotide sequence is transcribed and translated to produce a functional MAD7 protein, the skilled person will recognize that, because of codon degeneracy, a large number of polynucleotide sequences can encode the same polypeptide. The skilled person will also recognize that different species have codon preferences, so the codons of the MAD7 protein may be optimized for expression in a given species; all such variants are specifically covered by the term "coding sequence of the MAD7 protein". In addition, the term specifically includes full-length sequences substantially identical to the MAD7 gene sequence, as well as sequences encoding proteins that retain the function of the MAD7 protein.
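The codon degeneracy mentioned above can be illustrated with a minimal sketch. The two short coding sequences below are made-up examples, not taken from SEQ ID NO.1, and the codon table lists only the standard-code codons that the example uses.

```python
# Subset of the standard genetic code (only the codons used in this example).
CODON_TABLE = {
    "ATG": "M",
    "GCT": "A", "GCC": "A",   # alanine
    "AAA": "K", "AAG": "K",   # lysine
    "CTT": "L", "CTG": "L",   # leucine
    "TAA": "*", "TGA": "*",   # stop codons
}

def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon, stopping at the first stop codon."""
    peptide = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)

# Hypothetical synonymous variants: different nucleotides, same peptide.
variant_a = "ATGGCTAAACTTTAA"
variant_b = "ATGGCCAAGCTGTGA"

print(translate(variant_a), translate(variant_b))
```

Both variants translate to the same four-residue peptide although they differ at four nucleotide positions, which is why a codon-optimized sequence can still fall under the same term.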
In the present invention, said C corresponds to the full-length MAD7 protein or a fusion protein, and the second expression cassette is a crRNA expression cassette having the structure of Formula II:
P2-(R-S)q-T   (II)
wherein,
P2 is the second promoter;
each R is independently a mature or immature direct repeat (DR) sequence;
each S is independently absent or a target-site guide sequence sg;
q is a positive integer ≥ 1;
T is absent or is a polyT, Nos, or polyA sequence.
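As a rough sketch of how Formula II composes, the following concatenates the elements in order. The bracketed tokens are symbolic placeholders rather than real sequences, and the function name is an illustrative assumption.

```python
def build_crrna_cassette(promoter, repeat, spacers, terminator="TTTTTTT"):
    """Concatenate P2-(R-S)q-T, where q is the number of (R-S) units.

    An empty string in `spacers` models an S element that is absent;
    an empty `terminator` models a T element that is absent.
    """
    if len(spacers) < 1:
        raise ValueError("q must be a positive integer >= 1")
    units = "".join(repeat + spacer for spacer in spacers)
    return promoter + units + terminator

# Symbolic placeholders (assumptions for illustration only).
cassette = build_crrna_cassette("[P2]", "[DR]", ["[sg1]", "[sg2]"])
print(cassette)
```

With two spacers (q = 2) the result reads `[P2][DR][sg1][DR][sg2]TTTTTTT`, matching the 5' to 3' element order of Formula II with a polyT terminator.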
The present invention also provides a vector or vector combination containing the nucleic acid construct of the present invention.
Preferably, the vector combination of the present invention further includes a helper vector carrying a donor DNA expression cassette.
In the nucleic acid construct and/or vector of the present invention, some elements (in particular the corresponding elements within each expression cassette) are operably linked. For example, a promoter operably linked to a coding sequence is one that is capable of initiating transcription of that coding sequence.
The present invention also provides reagent combinations and kits containing the above vector or vector combination, which can be used in the plant gene editing method of the present invention.
The present invention also provides a method for gene editing in plants, comprising the steps of:
(i) introducing (a) the vector or vector combination of the present invention and (b) optionally a donor nucleic acid fragment into a plant cell, plant tissue, or plant, thereby producing a gene edit in the plant cell, plant tissue, or plant; and
(ii) optionally, detecting, screening, or identifying the plant cell or plant in which the gene edit has occurred.
In the present invention, the gene editing includes gene knockout, site-directed insertion, gene replacement, or a combination thereof.
The plant gene editing method of the present invention can be used to improve all kinds of plants, especially crops.
As used herein, the term "plant" includes whole plants, plant organs (such as leaves, stems, and roots), seeds, and plant cells, as well as their progeny. The types of plants usable in the method of the present invention are not particularly limited and generally include any higher plant type amenable to transformation techniques, including monocotyledons, dicotyledons, and gymnosperms.
The present invention can be used in the field of plant genetic engineering, for example in plant gene function research and crop genetic improvement.
The present invention is further described below with reference to specific examples. It should be understood that these examples serve only to illustrate the present invention and not to limit its scope. Experimental methods in the following examples for which no specific conditions are given generally follow conventional conditions, such as those described in Sambrook et al., Molecular Cloning: A Laboratory Manual (New York: Cold Spring Harbor Laboratory Press, 1989), or the conditions recommended by the manufacturer. Unless otherwise specified, the experimental materials used in the present invention are commercially available.
Materials
The MAD7 coding sequence is codon-optimized for rice; the sequence is given in SEQ ID NO.1.
Sequence 2: sequence of the rice crRNA expression cassette for MAD7 (SEQ ID NO.2). The element structure of this expression cassette is shown in Figure 1. The underlined stretches at the start and end of the sequence mark the AvrII and AfeI restriction sites, respectively, which were included to facilitate cloning into the pCAMBIA expression vector; black shading marks the mature DR sequence for MAD7; the boxed region is the sg site. When constructing a gene knockout vector, this sequence is synthesized as complementary double strands, and the full-length cassette is amplified together with the PCR product of the OsU3 promoter by overlapping PCR; bold letters mark the transcription terminator sequence; the remainder is the OsU3 promoter sequence.
Sequence 3: sequence of the sg-DR crRNA expression cassette for multi-gene editing in rice with MAD7 (SEQ ID NO.3). The element structure of this expression cassette is shown in Figure 2. Underlining marks the BsaI restriction sites, which were included to facilitate cloning into the pCAMBIA expression vector; black shading marks the mature DR sequence for MAD7; the boxed regions are the sg sequences of the four target sites, which in 5' to 3' order are OsDEP1-g6, OsBEL260-g1, OsRoc5-g1, and OsHD3A; bold letters mark the transcription terminator sequence; the remainder is the OsU3 promoter sequence.
Sequence 4: sequence of the tRNA-sg-DR crRNA expression cassette for multi-gene editing in rice with MAD7 (SEQ ID NO.4). The element structure of this expression cassette is shown in Figure 3. The underlined stretches at the start and end mark the AvrII and BsaI restriction sites, respectively, which were included to facilitate cloning into the pCAMBIA expression vector; black shading marks the mature DR sequence for MAD7; the boxed regions are the sg sequences of the four target sites, which in 5' to 3' order are OsDEP1-g6, OsBEL260-g1, OsRoc5-g1, and OsHD3A; double underlining marks the tRNA sequences; bold letters mark the transcription terminator sequence; the remainder is the OsU3 promoter sequence.
Sequence 5: sequence of the miniOsU3/U6-DR-sg crRNA expression cassette for multi-gene editing in rice with MAD7 (SEQ ID NO.5). The element structure of this expression cassette is shown in Figure 4. The underlined stretches at the start and end mark the AvrII and BsaI restriction sites, respectively, which were included to facilitate cloning into the pCAMBIA expression vector; black shading marks the mature DR sequence for MAD7; the boxed regions are the sg sequences of the four target sites, which in 5' to 3' order are OsDEP1-g6, OsBEL260-g1, OsRoc5-g1, and OsHD3A; bold letters mark the transcription terminator sequence; the remainder is the miniOsU3/U6 promoter sequence.
Sequence 6: sequence of the HH-DR-sg-HDV crRNA expression cassette for multi-gene editing in rice with MAD7 (SEQ ID NO.6). The element structure of this expression cassette is shown in Figure 5. Underlining marks the BsaI restriction sites, which were included to facilitate cloning into the pCAMBIA expression vector; black shading marks the mature DR sequence for MAD7; the boxed regions are the sg sequences of the four target sites, which in 5' to 3' order are OsDEP1-g6, OsBEL260-g1, OsRoc5-g1, and OsHD3A; bold letters mark the transcription terminator sequence; double underlining and dots mark the hammerhead (HH) and hepatitis delta virus (HDV) ribozyme sequences, respectively.
Here, the 5' stretch of the HH sequence
Figure PCTCN2021138184-appb-000001
is complementary to the first 6 bases of the DR.
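The pairing requirement above can be sketched as a reverse-complement calculation, assuming ordinary antiparallel Watson-Crick pairing between the 5' stretch of the HH sequence and the first 6 bases of the DR. The DR string below is a made-up placeholder, not the MAD7 DR, and the function name is illustrative.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def hh_pairing_arm(direct_repeat: str, n: int = 6) -> str:
    """Return, written 5'->3', the reverse complement of the first n bases of an RNA repeat."""
    if len(direct_repeat) < n:
        raise ValueError("direct repeat is shorter than the pairing length")
    return "".join(RNA_COMPLEMENT[base] for base in reversed(direct_repeat[:n]))

placeholder_dr = "GAUCCAUGGA"  # hypothetical sequence for illustration only
arm = hh_pairing_arm(placeholder_dr)
print(arm)
```

If the DR were changed, this 6-base stretch would have to be recomputed so that the ribozyme can still fold against the start of the repeat.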
Sequence 21: sequence of the maize crRNA expression cassette for MAD7 (SEQ ID NO.21). The element structure of this expression cassette is shown in Figure 6. The underlined stretches at the start and end mark the ApaI restriction sites, which were included to facilitate cloning into the pCAMBIA expression vector; black shading marks the mature DR sequence for MAD7; the boxed region is the sg site. When constructing a gene knockout vector, this sequence is synthesized as complementary double strands, and the full-length cassette is amplified together with the PCR product of the ZmU3 promoter by overlapping PCR; bold letters mark the transcription terminator sequence; the remainder is the ZmU3 promoter sequence.
Methods
1) Construction of a CRISPR-MAD7 expression vector for single-site knockout
MAD7 was codon-optimized for rice, and two expression cassettes, one for MAD7 and one for the crRNA, were constructed and cloned into the pCAMBIA expression vector. Taking rice as an example, the crRNA expression cassette (SEQ ID NO: 2) contains the following four elements from 5' to 3': the OsU6 or OsU3 promoter, the mature direct repeat (DR) for MAD7, the sg sequence, and a transcription terminator sequence (TTTTTTT). The MAD7 expression cassette contains the following elements from 5' to 3': the maize Ubi promoter, an NLS nuclear localization signal sequence (nNLS), the MAD7 coding sequence, a second NLS nuclear localization signal sequence (cNLS), and the NOS transcription terminator sequence (Figure 7). In an alternative design of the present invention, the sg length is varied, or the NLS sequence at the 5' or 3' end of the MAD7 coding sequence is removed from the above construct while the other elements are retained (Figure 8), in order to examine the in vivo cleavage activity of MAD7 with the different combinations.
2) Construction of a CRISPR-MAD7 vector for multi-site knockout
Because MAD7 itself has RNase activity and can process the transcribed precursor crRNA by self-cleavage, it is inferred that if one crRNA expression cassette carries multiple DR-sg sequences in tandem, the transcript can be cleaved by MAD7 into individual DR-sg units, conveniently achieving multi-site knockout. This design requires a multi-site crRNA expression cassette. Taking rice as an example, it contains the following elements from 5' to 3': the OsU6 or OsU3 promoter, the crRNA sequence for MAD7 (containing the DR sequence), the sg1 sequence for target site 1, a crRNA sequence, the sg2 sequence for target site 2, a crRNA sequence, the sg3 sequence for target site 3, a crRNA sequence, the sgN sequence for target site N, and a transcription terminator sequence (TTTTTTT). In an alternative design of the present invention, the individual DR-sg units are each driven by a miniOsU3/U6 promoter, or the different DR-sg sequences are separated by tRNA processing recognition sequences or by HH-HDV ribozymes, while the MAD7 expression cassette is the same as in the single-site knockout construct. Finally, the crRNA and MAD7 nuclease expression cassettes are cloned into the pCAMBIA expression vector (Figures 9-13).
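The tandem-array idea can be sketched as a toy string model: a transcript made of repeated DR-sg units is cut at the start of every repeat. This deliberately ignores where within the repeat the nuclease actually cleaves. The repeat and spacer strings are made-up placeholders; only the target names come from the text.

```python
def build_array(repeat, spacers):
    """Join DR-sg units head to tail to model the pre-crRNA transcript."""
    return "".join(repeat + spacer for spacer in spacers)

def split_array(transcript, repeat):
    """Split a tandem transcript into units, each starting at one occurrence of the repeat."""
    starts = []
    pos = transcript.find(repeat)
    while pos != -1:
        starts.append(pos)
        pos = transcript.find(repeat, pos + len(repeat))
    ends = starts[1:] + [len(transcript)]
    return [transcript[s:e] for s, e in zip(starts, ends)]

PLACEHOLDER_DR = "AAUUGGCC"  # hypothetical repeat, not the MAD7 DR
PLACEHOLDER_SPACERS = {      # hypothetical spacer strings keyed by target name
    "OsDEP1-g6": "ACGUACGUAC",
    "OsBEL260-g1": "UUGACUGACA",
    "OsRoc5-g1": "CAGUCAGUUG",
    "OsHD3A": "GACUUGACAC",
}

pre_crrna = build_array(PLACEHOLDER_DR, PLACEHOLDER_SPACERS.values())
units = split_array(pre_crrna, PLACEHOLDER_DR)
for name, unit in zip(PLACEHOLDER_SPACERS, units):
    print(name, unit)
```

Each recovered unit is one repeat followed by one spacer, in the same 5' to 3' order in which the targets were placed in the array.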
3) Use of CRISPR-MAD7 for homologous recombination or for site-directed insertion of exogenous fragments mediated by non-homologous end joining (NHEJ)
Whereas Cas9 cleavage produces blunt ends, MAD7, like Cpf1, produces sticky ends and is therefore in theory better suited to directional insertion of exogenous fragments. By using CRISPR-MAD7 to create a double-strand break (DSB) near the target site while introducing a large quantity of exogenous fragments by particle bombardment or DNA virus replication, homologous recombination or directional insertion can be achieved efficiently in plant cells.
Example 1 Single-site knockout of a rice endogenous gene using the CRISPR-MAD7 (ncNLS) system
A target site in the OsHD3A promoter that is efficiently edited by LbCpf1 (LOC_Os06g06320, editing efficiency 81.5%) was used and ligated into the pCAMBIA-CRISPR-MAD7 expression vector via the AvrII and AfeI restriction sites. The constructed vector was transformed into Agrobacterium strain EHA105, which was then used to infect callus of the rice variety Nanjing 46 (Oryza sativa ssp. japonica cv. Nangeng46). After 3 days of co-cultivation, the callus was recovered for 4 days at either 30°C or 34°C. The transformed callus was transferred to selection medium containing hygromycin, cultured for 28-30 days, and then transferred to differentiation medium containing hygromycin to regenerate plants. DNA was extracted from samples of the regenerated plants, MAD7-positive plants were identified by TaqMan assay, and the target sites of the positive plants were amplified and sequenced. The results show that the system can cleave its target in rice cells to produce mutations and yield mutant plants. With recovery culture at 30°C and 34°C, the editing efficiency at the OsHD3A site was 89.9% and 94.9%, respectively, and the frequency at which both alleles were edited (including homozygous mutants) reached 77.5% and 78.1%, comparable to the editing efficiency of LbCpf1. There was no significant difference in editing efficiency between the two temperature treatments (Table 1).
Table 1 PAM-sg sequence and T0 plant identification results for single-site knockout of a rice endogenous gene with the CRISPR/MAD7 (ncNLS) system
Figure PCTCN2021138184-appb-000002
Figure PCTCN2021138184-appb-000003
Note: Numbers in parentheses are the numbers of plants without sequencing results; underlining indicates the PAM sequence; WT, wild type; He, heterozygous mutant; Bi, biallelic mutant (including Homo, homozygous mutants). Editing efficiency (%) = number of mutant plants / (number of MAD7-positive plants - number of plants with failed sequencing) × 100; biallelic frequency (%) = number of biallelic plants / (number of MAD7-positive plants - number of plants with failed sequencing) × 100.
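The two formulas in the note can be sketched as a small function. It assumes that the number of mutant plants is the sum of the He and Bi classes; the counts in the usage example are hypothetical and are not taken from any table of this document.

```python
def editing_stats(positive, failed, heterozygous, biallelic):
    """Return (editing efficiency %, biallelic frequency %) as defined in the table note.

    positive: MAD7-positive plants; failed: plants with failed sequencing;
    heterozygous / biallelic: He and Bi counts (Bi includes homozygous mutants).
    """
    scored = positive - failed
    if scored <= 0:
        raise ValueError("no plants with usable sequencing results")
    mutants = heterozygous + biallelic  # assumption: mutants = He + Bi
    return (round(100 * mutants / scored, 1),
            round(100 * biallelic / scored, 1))

# Hypothetical counts for illustration only.
efficiency, biallelic_freq = editing_stats(positive=100, failed=4,
                                           heterozygous=12, biallelic=72)
print(efficiency, biallelic_freq)
```

Plants with failed sequencing are removed from the denominator, so both percentages refer only to plants with a usable genotype call.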
Example 2 Single-site knockout of multiple rice endogenous genes using the CRISPR-MAD7 (ncNLS) system
To further verify the efficiency of targeted cleavage by the CRISPR-MAD7 (ncNLS) system in plant cells, five Cpf1 editing sites in rice reported in the literature were selected (OsRLK-799-g1, LOC_Os02g07960; OsDEP1-g6, LOC_Os09g26999; OsALS-g7, LOC_Os02g30630; OsBEL260-g1, LOC_Os03g55260; OsRoc5-g1, LOC_Os02g45250), and new CRISPR-MAD7 target sites were designed from known gene sequences (OsDEP1-g7; OsPDS1-g3, LOC_Os03g08570; OsRoc5-g2; and OsBEL260-g4). The DR-sgRNA was synthesized, and the overlap PCR products were ligated into the pCAMBIA-CRISPR-MAD7 expression vector via restriction sites. The constructed vector was transformed into Agrobacterium strain EHA105, which was then used to infect callus of the rice variety Nanjing 46. After 3 days of co-cultivation and 4 days of recovery culture at 30°C, the transformed callus was transferred to selection medium containing hygromycin, cultured for 28-30 days, and then transferred to differentiation medium containing hygromycin to regenerate plants. DNA was extracted from samples of the regenerated plants, MAD7-positive plants were identified by TaqMan assay, and the target sites of the positive plants were amplified and sequenced.
The results show that the system cleaves multiple targets in multiple endogenous genes efficiently in rice cells, with editing efficiencies of 59.6%-96.9% and frequencies at which both alleles were edited (including homozygous mutants) of 34.8%-95.9%. The nucleic acid construct or combination constructed according to the present invention thus allows efficient CRISPR-MAD7 editing in plant cells (Table 2).
Table 2 PAM-sg sequences and T0 plant identification results for single-site knockout of multiple rice endogenous genes with the CRISPR/MAD7 (ncNLS) system
Figure PCTCN2021138184-appb-000004
Figure PCTCN2021138184-appb-000005
Note: Numbers in parentheses are the numbers of plants without sequencing results; underlining indicates the PAM sequence; WT, wild type; He, heterozygous mutant; Bi, biallelic mutant (including Homo, homozygous mutants). Editing efficiency (%) = number of mutant plants / (number of MAD7-positive plants - number of plants with failed sequencing) × 100; biallelic frequency (%) = number of biallelic plants / (number of MAD7-positive plants - number of plants with failed sequencing) × 100.
Example 3 Effect of NLS position and number in the CRISPR-MAD7 system on single-site knockout of a rice endogenous gene
The PCR products of the CRISPR-MAD7 system of Example 1 were ligated into the pCAMBIA-CRISPR-MAD7 expression vector via the BsrGI and AvrII restriction sites or via the ApaI restriction site, to construct the pCAMBIA-CRISPR-nNLS-MAD7 system (lacking the C-terminal NLS of the MAD7 nuclease) and the pCAMBIA-CRISPR-MAD7-cNLS system (lacking the N-terminal NLS of the MAD7 nuclease). The constructed vectors were transformed into Agrobacterium strain EHA105, which was then used to infect callus of the rice variety Nanjing 46. After 3 days of co-cultivation and 4 days of recovery culture at 30°C, the transformed callus was transferred to selection medium containing hygromycin, cultured for 28-30 days, and then transferred to differentiation medium containing hygromycin to regenerate plants. DNA was extracted from samples of the regenerated plants, MAD7-positive plants were identified by TaqMan assay, and the target sites of the positive plants were amplified and sequenced.
The results show that the editing efficiencies of the CRISPR-MAD7 (ncNLS), CRISPR-MAD7 (nNLS), and CRISPR-MAD7 (cNLS) systems in rice cells were 89.9%, 89.0%, and 78.7%, respectively, and the biallelic (including homozygous) frequencies were 77.5%, 85.4%, and 54.7%, respectively. The vector lacking the N-terminal NLS of the MAD7 protein showed a slight decrease in editing efficiency in rice cells and a significant decrease in biallelic editing frequency (Table 3).
Table 3 PAM-sg sequence and T0 plant identification results for single-site knockout of a rice endogenous gene with CRISPR-MAD7 systems differing in NLS position and number
Figure PCTCN2021138184-appb-000006
Note: Numbers in parentheses are the numbers of plants without sequencing results; underlining indicates the PAM sequence; WT, wild type; He, heterozygous mutant; Bi, biallelic mutant (including Homo, homozygous mutants). Editing efficiency (%) = number of mutant plants / (number of MAD7-positive plants - number of plants with failed sequencing) × 100; biallelic frequency (%) = number of biallelic plants / (number of MAD7-positive plants - number of plants with failed sequencing) × 100.
Example 4 Effect of sg length in the CRISPR-MAD7 system on single-site knockout of a rice endogenous gene
Starting from the sg sequence of Example 1, the length was shortened by 2 bp or 4 bp or lengthened by 2 bp, and the constructs were ligated into the pCAMBIA-CRISPR-MAD7 expression vector via the AvrII and RsrII restriction sites. The constructed vectors were transformed into Agrobacterium strain EHA105, which was then used to infect callus of the rice variety Nanjing 46. After 3 days of co-cultivation and 4 days of recovery culture at 30°C, the transformed callus was transferred to selection medium containing hygromycin, cultured for 28-30 days, and then transferred to differentiation medium containing hygromycin to regenerate plants. DNA was extracted from samples of the regenerated plants, MAD7-positive plants were identified by TaqMan assay, and the target sites of the positive plants were amplified and sequenced. The results show that as the sg length increased from 19 bp through 21 bp and 23 bp to 25 bp, the editing frequency and the biallelic (including homozygous) frequency in rice cells tended to decrease, from 95.7% to 85.9% and from 89.2% to 56.5%, respectively. Apart from the significantly lower biallelic frequency of the 25 bp sg compared with the other lengths, there were no significant differences in editing rate or biallelic frequency among vectors with different sg lengths (Table 4).
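The four sg lengths can be derived with a short sketch. It rests on two assumptions not stated in the text: that the starting sg is 23 bp (inferred from the reported lengths of 19, 21, 23, and 25 bp) and that bases are removed from or added at the 3' end of the sg. The sequence context is a made-up placeholder.

```python
def spacer_variants(context, base_len=23, deltas=(-4, -2, 0, 2)):
    """Return {length: spacer}, taking the first (base_len + delta) bases of `context`.

    `context` is the sequence read 5'->3' from the first base of the spacer
    and must extend far enough to cover the longest variant.
    """
    variants = {}
    for delta in deltas:
        length = base_len + delta
        if length > len(context):
            raise ValueError("context too short for requested spacer length")
        variants[length] = context[:length]
    return variants

# Hypothetical 25-nt context for illustration only.
placeholder_context = "ACGTACGTACGTACGTACGTACGTA"
variants = spacer_variants(placeholder_context)
for length, spacer in sorted(variants.items()):
    print(length, spacer)
```

Because every variant is a prefix of the same context, all four guides share the same PAM-proximal bases under these assumptions and differ only at the far end.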
Table 4 PAM-sg sequences and T0 plant identification results for single-site knockout of a rice endogenous gene with CRISPR-MAD7 systems differing in sg length
Figure PCTCN2021138184-appb-000007
Note: Numbers in parentheses are the numbers of plants without sequencing results; underlining indicates the PAM sequence; WT, wild type; He, heterozygous mutant; Bi, biallelic mutant (including Homo, homozygous mutants). Editing efficiency (%) = number of mutant plants / (number of MAD7-positive plants - number of plants with failed sequencing) × 100; biallelic frequency (%) = number of biallelic plants / (number of MAD7-positive plants - number of plants with failed sequencing) × 100.
实施例5 包含DR-sg序列串的CRISPR-MAD7载体用于水稻多位点敲除Example 5 The CRISPR-MAD7 vector comprising the DR-sg sequence string is used for multi-site knockout in rice
因MAD7具有自主剪切、加工pre-crRNA的能力,本实施例选取实施例2中4个文章报道的Cpf1的编辑位点(OsDEP1-g6,OsBEL260-g1,OsRoc5-g1, OsHD3A-g22),通过MAD7的成熟DR序列间隔连接,并处于同一个OsU3启动子的控制(SEQ ID NO.3)。随后将该表达盒与MAD7表达盒连接并置于pCAMBIA的LB与RB序列内。构建好的载体转化到农杆菌EHA105,然后通过该菌株侵染水稻品种南粳46的愈伤组织,共培养3天后30℃恢复培养4天,所转化的愈伤组织转移到含有潮霉素的筛选培养基,培养28-30天后转移到含有潮霉素的分化培养基中再生植株。再生植株取样提取DNA,Taqman检测MAD7阳性单株,并对阳性单株靶标位点进行扩增测序。结果显示该***4个基因突变频率分别为34.0%、80.9%、3.2%和3.2%,两个等位基因同时被编辑的效率分别为11.7%、59.6%、0%和0%。该***第一、第二个基因能高效率地在水稻细胞内进行基因编辑,因U3属于Pol III类型启动子,驱动长链的能力有限,该***的第三、第四个基因编辑效率显著下降,且突变基因型全部为杂合;未发现两个等位基因同时被编辑的单株和多个基因同时被编辑的单株(表5)。Because MAD7 has the ability to autonomously cut and process pre-crRNA, this example selects the editing sites of Cpf1 reported in 4 articles in Example 2 (OsDEP1-g6, OsBEL260-g1, OsRoc5-g1, OsHD3A-g22), Interspaced by the mature DR sequence of MAD7 and under the control of the same OsU3 promoter (SEQ ID NO.3). This expression cassette was then ligated to the MAD7 expression cassette and placed within the LB and RB sequences of pCAMBIA. The constructed vector was transformed into Agrobacterium EHA105, and then the callus of rice variety Nanjing 46 was infected by this strain. After 3 days of co-cultivation, the culture was resumed at 30°C for 4 days, and the transformed callus was transferred to a culture medium containing hygromycin. The medium was selected, and after 28-30 days of culture, it was transferred to a differentiation medium containing hygromycin to regenerate plants. The regenerated plants were sampled to extract DNA, Taqman detected MAD7-positive individual plants, and the target sites of positive individual plants were amplified and sequenced. The results showed that the mutation frequencies of the four genes in this system were 34.0%, 80.9%, 3.2% and 3.2%, respectively, and the simultaneous editing efficiencies of two alleles were 11.7%, 59.6%, 0% and 0%, respectively. The first and second genes of this system can be efficiently edited in rice cells. 
Because U3 belongs to the Pol III type promoter, the ability to drive long chains is limited, and the third and fourth genes of this system have significant editing efficiency. decreased, and all mutant genotypes were heterozygous; no individual plants with two alleles edited at the same time and individual plants with multiple genes edited at the same time were not found (Table 5).
Table 5. PAM-sg sequences and T0 plant identification results for knockout of different rice genes with the CRISPR-MAD7 (DR-tandem sgRNA) multi-gene editing system
Figure PCTCN2021138184-appb-000008
Note: Numbers in parentheses are the numbers of plants without sequencing results; underlining indicates the PAM sequence; WT, wild type; He, heterozygous mutant; Bi, biallelic mutant (including Homo, homozygous mutants). Editing efficiency (%) = number of mutant plants / (number of MAD7-positive plants - number of plants that failed sequencing) × 100; biallelic frequency (%) = number of biallelic plants / (number of MAD7-positive plants - number of plants that failed sequencing) × 100.
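The two formulas in the note can be written as a short calculation. This is a minimal sketch: the function names are hypothetical, and the plant counts in the usage lines are invented for illustration and are not taken from Table 5.

```python
def editing_efficiency(mutants: int, positives: int, failed: int) -> float:
    """Editing efficiency (%) = mutants / (MAD7-positive - sequencing failures) * 100."""
    denominator = positives - failed
    if denominator <= 0:
        raise ValueError("no successfully sequenced MAD7-positive plants")
    return mutants / denominator * 100


def biallelic_frequency(biallelic: int, positives: int, failed: int) -> float:
    """Biallelic frequency (%) = biallelic plants / (MAD7-positive - sequencing failures) * 100."""
    denominator = positives - failed
    if denominator <= 0:
        raise ValueError("no successfully sequenced MAD7-positive plants")
    return biallelic / denominator * 100


# Hypothetical counts: 50 MAD7-positive plants, 2 without sequencing results,
# 12 mutants of which 6 are biallelic.
eff = editing_efficiency(mutants=12, positives=50, failed=2)
bi = biallelic_frequency(biallelic=6, positives=50, failed=2)
print(f"editing efficiency: {eff:.1f}%, biallelic frequency: {bi:.1f}%")
```

Both rates share the same denominator, so the biallelic frequency can never exceed the editing efficiency for the same target.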
Example 6 Multi-site knockout in rice using a CRISPR-MAD7 vector with tRNA-linked tandem DR-sg sequences
The DR-sg sequences of Example 5 were joined with recognition sites for the RNases of the endogenous tRNA processing system as spacers and placed under the control of a single OsU3 promoter (SEQ ID NO.4). This expression cassette was then ligated to the MAD7 expression cassette and placed between the LB and RB sequences of pCAMBIA. The resulting vector was transformed into Agrobacterium strain EHA105, which was used to infect callus of the rice variety Nanjing 46. After 3 days of co-cultivation and 4 days of recovery culture at 30°C, the transformed callus was transferred to selection medium containing hygromycin, cultured for 28-30 days, and then transferred to differentiation medium containing hygromycin to regenerate plants. DNA was extracted from samples of the regenerated plants, MAD7-positive plants were identified by TaqMan assay, and the target sites of the positive plants were amplified and sequenced. The mutation frequencies of the four genes were 49.4%, 91.0%, 71.9% and 68.2%, respectively; the frequencies at which both alleles were edited were 14.6%, 78.7%, 58.4% and 47.7%, respectively; and plants with all four genes mutated simultaneously accounted for 38.2% of the positive plants. The system can therefore efficiently edit multiple genes simultaneously in rice cells (Table 6).
Table 6. PAM-sg sequences and T0 plant identification results for knockout of different rice genes with the CRISPR-MAD7 (tRNA-tandem DR-sgRNA) multi-gene editing system
Figure PCTCN2021138184-appb-000009
Note: Numbers in parentheses are the numbers of plants without sequencing results; underlining indicates the PAM sequence; WT, wild type; He, heterozygous mutant; Bi, biallelic mutant (including Homo, homozygous mutants). Editing efficiency (%) = number of mutant plants / (number of MAD7-positive plants - number of plants that failed sequencing) × 100; biallelic frequency (%) = number of biallelic plants / (number of MAD7-positive plants - number of plants that failed sequencing) × 100.
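The share of plants with all four genes mutated, reported for this example as a percentage of positive plants, could be derived from per-plant genotype calls along the following lines. This is a sketch under assumptions: the function name and the genotype records are hypothetical, the WT/He/Bi codes follow the table note, and the text does not say whether plants without sequencing results are excluded from the denominator, so here every listed plant is counted.

```python
from typing import Dict, List


def all_targets_mutated_pct(plants: List[Dict[str, str]]) -> float:
    """Percentage of plants whose genotype call is non-WT at every target gene."""
    if not plants:
        raise ValueError("no plants supplied")
    hits = sum(1 for calls in plants if all(g != "WT" for g in calls.values()))
    return hits / len(plants) * 100


# Hypothetical genotype calls for four MAD7-positive plants (not data from Table 6).
plants = [
    {"OsDEP1": "He", "OsBEL260": "Bi", "OsRoc5": "Bi", "OsHD3A": "He"},
    {"OsDEP1": "WT", "OsBEL260": "Bi", "OsRoc5": "Bi", "OsHD3A": "Bi"},
    {"OsDEP1": "Bi", "OsBEL260": "Bi", "OsRoc5": "Bi", "OsHD3A": "Bi"},
    {"OsDEP1": "WT", "OsBEL260": "WT", "OsRoc5": "WT", "OsHD3A": "WT"},
]
quad = all_targets_mutated_pct(plants)
print(f"plants mutated at all targets: {quad:.1f}%")
```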
Example 7 Multi-site knockout in rice using a CRISPR-MAD7 vector in which miniOsU3/miniOsU6 separately drive the DR-sg sequences
The DR-sg sequences of Example 5 were each driven separately by miniOsU3/miniOsU6 promoters (SEQ ID NO.5). This expression cassette was then ligated to the MAD7 expression cassette and placed between the LB and RB sequences of pCAMBIA. The resulting vector was transformed into Agrobacterium strain EHA105, which was used to infect callus of the rice variety Nanjing 46. After 3 days of co-cultivation and 4 days of recovery culture at 30°C, the transformed callus was transferred to selection medium containing hygromycin, cultured for 28-30 days, and then transferred to differentiation medium containing hygromycin to regenerate plants. DNA was extracted from samples of the regenerated plants, MAD7-positive plants were identified by TaqMan assay, and the target sites of the positive plants were amplified and sequenced. The mutation frequencies of the four genes were 44.4%, 94.4%, 92.2% and 90.0%, respectively; the frequencies at which both alleles were edited were 22.2%, 93.3%, 86.7% and 87.8%, respectively; and plants with all four genes mutated simultaneously accounted for 42.2% of the positive plants. The system can therefore efficiently edit multiple genes simultaneously in rice cells (Table 7).
Table 7. PAM-sg sequences and T0 plant identification results for knockout of different rice genes with the CRISPR-MAD7 (miniOsU3/miniOsU6 separately driving DR-sgRNA) multi-gene editing system
Figure PCTCN2021138184-appb-000010
Note: Numbers in parentheses are the numbers of plants without sequencing results; underlining indicates the PAM sequence; WT, wild type; He, heterozygous mutant; Bi, biallelic mutant (including Homo, homozygous mutants). Editing efficiency (%) = number of mutant plants / (number of MAD7-positive plants - number of plants that failed sequencing) × 100; biallelic frequency (%) = number of biallelic plants / (number of MAD7-positive plants - number of plants that failed sequencing) × 100.
Example 8 Multi-site knockout in rice using a CRISPR-MAD7 vector with a Pol II-type promoter and HH-HDV-linked tandem DR-sg sequences
In Examples 5-7, DR-guide arrays driven by the OsU3 or miniOsU3/miniOsU6 promoters achieved efficient knockout of four loci in rice. However, U3 and U6 are both Pol III-type promoters: their capacity to drive long transcripts is limited, and they cannot be activated in a condition-specific or tissue-specific manner. Pol II-type promoters can overcome both limitations. In this example, a crRNA expression cassette driven by the Pol II-type maize Ubiquitin promoter was constructed, and two ribozymes with RNA self-cleavage activity, the hammerhead ribozyme (HH) and the hepatitis delta virus ribozyme (HDV), were used to release the DR-guide targeting sequences from the transcript (SEQ ID NO.6). A control crRNA expression cassette driven by the OsU3 promoter was constructed in parallel. Each of the two crRNA expression cassettes was cloned, together with the original MAD7 expression cassette, into the pCAMBIA expression vector. The resulting vectors were transformed into Agrobacterium strain EHA105, which was used to infect callus of the rice variety Nanjing 46. After 3 days of co-cultivation and 4 days of recovery culture at 30°C, the transformed callus was transferred to selection medium containing hygromycin, cultured for 28-30 days, and then transferred to differentiation medium containing hygromycin to regenerate plants. DNA was extracted from samples of the regenerated plants, MAD7-positive plants were identified by TaqMan assay, and the target sites of the positive plants were amplified and sequenced. For the OsU3-HH-HDV system, the mutation frequencies of the four genes were 7.6%, 6.7%, 0% and 3.3%, respectively, and apart from one mutant plant with both alleles edited at the second gene, all mutant plants were heterozygous (Table 8). For the ZmUbi-HH-HDV system, the mutation frequencies of the four genes were 82.6%, 92.1%, 88.8% and 93.3%, respectively, and the frequencies at which both alleles were edited were 62.8%, 84.3%, 84.3% and 92.1%, respectively (Table 8). This performance even exceeded the editing efficiency of single-site knockout, and plants with all four genes mutated simultaneously accounted for 77.5% of the positive plants. The system can therefore edit multiple genes simultaneously in rice cells with very high efficiency (Table 8).
Table 8. PAM-sg sequences and T0 plant identification results for knockout of different rice genes with the CRISPR-MAD7 (HH-HDV-tandem DR-sgRNA) multi-gene editing system
Figure PCTCN2021138184-appb-000011
Figure PCTCN2021138184-appb-000012
Note: Numbers in parentheses are the numbers of plants without sequencing results; underlining indicates the PAM sequence; WT, wild type; He, heterozygous mutant; Bi, biallelic mutant (including Homo, homozygous mutants). Editing efficiency (%) = number of mutant plants / (number of MAD7-positive plants - number of plants that failed sequencing) × 100; biallelic frequency (%) = number of biallelic plants / (number of MAD7-positive plants - number of plants that failed sequencing) × 100.
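The per-gene mutation frequencies reported in Examples 5-8 can be placed side by side with a few lines of code. The values below are the ones stated in the example paragraphs (gene order: OsDEP1, OsBEL260, OsRoc5, OsHD3A); the unweighted mean and the "weakest target" ranking are summary statistics added here for illustration and are not figures reported by the application.

```python
# Reported mutation frequencies (%) per target gene, in the order
# OsDEP1, OsBEL260, OsRoc5, OsHD3A, as stated in Examples 5-8.
reported = {
    "Example 5: OsU3, DR-spaced array": [34.0, 80.9, 3.2, 3.2],
    "Example 6: OsU3, tRNA-spaced array": [49.4, 91.0, 71.9, 68.2],
    "Example 7: miniOsU3/miniOsU6, separate": [44.4, 94.4, 92.2, 90.0],
    "Example 8: OsU3, HH-HDV": [7.6, 6.7, 0.0, 3.3],
    "Example 8: ZmUbi, HH-HDV": [82.6, 92.1, 88.8, 93.3],
}

for name, freqs in reported.items():
    # The unweighted mean is our own summary, not a value from the source.
    mean = sum(freqs) / len(freqs)
    print(f"{name}: mean {mean:.1f}%, weakest target {min(freqs):.1f}%")

# Rank designs by their weakest target, since multiplex editing needs every site to work.
best = max(reported, key=lambda name: min(reported[name]))
print("most uniform design:", best)
```

Ranking by the weakest target rather than the mean reflects the goal of simultaneous multi-gene editing, where a single poorly edited site limits the fraction of plants mutated at all sites.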
Example 9 Single-site knockout of an endogenous maize gene using the CRISPR-MAD7 (ncNLS) system
To further verify the efficiency of targeted cleavage by the CRISPR-MAD7 (ncNLS) system in plant cells, a Cpf1 editing site in maize reported in the literature (glossy2, Zm00001d002353) was selected. The DR-sgRNA was synthesized, and the overlap PCR product was ligated into the pCAMBIA-CRISPR-MAD7 expression vector via the ApaI restriction site. The resulting vector was transformed into Agrobacterium strain LBA4404, which was used to infect immature embryos of the maize variety B104. After 7 days of co-cultivation and two weeks of recovery culture at 28°C, the transformed immature embryos were transferred to selection medium containing mannose, cultured for two weeks, and then transferred to differentiation medium containing mannose to regenerate plants. DNA was extracted from samples of the regenerated plants, MAD7-positive plants were identified by TaqMan assay, and the target sites of the positive plants were amplified and sequenced. The results showed that the system can introduce mutations by targeted cleavage in maize cells and yield mutant plants, with an editing efficiency of 6.5%. When one additional base, C, was added to the 5′ end of the mature MAD7 DR sequence (Figure 14), the editing efficiency rose to 13.7%, which suggests a new approach for applying MAD7 editing in maize.
Table 9. PAM-sg sequence and T0 plant identification results for single-site knockout of an endogenous maize gene with the CRISPR/MAD7 (ncNLS) system
Figure PCTCN2021138184-appb-000013
Note: Vector numbers are given in parentheses; underlining indicates the PAM sequence. Editing efficiency (%) = number of mutant plants / (number of MAD7-positive plants - number of plants that failed sequencing) × 100; biallelic frequency (%) = number of biallelic plants / (number of MAD7-positive plants - number of plants that failed sequencing) × 100.
The above describes preferred embodiments of the present invention. It should be noted that those of ordinary skill in the art can make further improvements and refinements without departing from the principles of the present invention, and such improvements and refinements shall also be regarded as falling within the scope of protection of the present invention.

Claims (14)

  1. A MAD7-NLS fusion protein, characterized in that it has one of the following structures:
    B1-C-B2, B1-C or C-B2;
    wherein
    C is the MAD7 protein;
    B1 and B2 are each independently a nuclear localization signal sequence.
  2. The MAD7-NLS fusion protein according to claim 1, characterized in that the nuclear localization signal sequence is selected from one of SV40, KRP2, MDM2, CDc25C, DPP9, MTA1, CBP80, AreA, M9, Rev, hTAP, MyRF, EBNA-6, TERT or Tfam, or a combination of any two or more thereof.
  3. The MAD7-NLS fusion protein according to claim 1 or 2, characterized in that the N-terminus of the MAD7-NLS fusion protein further comprises a signal peptide and/or a protein tag sequence.
  4. A nucleic acid construct for site-directed editing of plant genomes, characterized in that it comprises a first expression cassette, the first expression cassette comprising, connected in sequence, a first promoter, a nucleotide sequence encoding the MAD7-NLS fusion protein according to any one of claims 1-3, and a first terminator.
  5. The nucleic acid construct according to claim 4, characterized in that the first promoter is a Pol II-type promoter selected from one of, or a combination of any two of, Ubi, Actin, CmYLCV, UBQ, 35S, SPL, the tissue-specific promoters YAO, CDC45 and rbcS, and the inducible promoter XEV.
  6. The nucleic acid construct according to claim 4, characterized in that it further comprises a second expression cassette, the second expression cassette comprising, connected in sequence, a second promoter and several tandem repeat sequences;
    the repeat sequences are one or both of mature direct repeat sequences and immature direct repeat sequences.
  7. The nucleic acid construct according to claim 6, characterized in that the second promoter is a Pol II-type or Pol III-type promoter; the second promoter is selected from one, two or more of OsU3, OsU6a, OsU6b, OsU6c, Actin, 35S, Ubi, UBQ, SPL, CmYLCV, the tissue-specific promoters YAO, CDC45 and rbcS, or the inducible promoter XEV.
  8. The nucleic acid construct according to claim 6, characterized in that the second expression cassette further comprises a second termination sequence connected to the end of the repeat sequences; the second termination sequence is selected from polyT, NOS, polyA or a combination thereof.
  9. The nucleic acid construct according to claim 6, characterized in that the repeat sequence further comprises a target-site guide sequence sg; the length of the target-site guide sequence sg is 17-35 bp and/or the number of repeats of the repeat sequence is 2-50.
  10. The nucleic acid construct according to claim 9, characterized in that the nucleic acid construct is selected from one or more of SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6 or SEQ ID NO:21.
  11. The nucleic acid construct according to any one of claims 4-10, characterized in that the nucleic acid construct is a vector comprising both the first expression cassette and the second expression cassette, or
    a vector combination consisting of a first vector comprising the first expression cassette and a second vector comprising the second expression cassette.
  12. A kit for gene editing in plants, characterized in that it comprises the nucleic acid construct according to any one of claims 4-11.
  13. The kit according to claim 12, characterized in that it further comprises an auxiliary vector carrying a donor DNA expression cassette.
  14. A method for gene editing in plants, characterized in that it comprises:
    (i) introducing the nucleic acid construct according to any one of claims 4-11, and optionally a donor nucleic acid fragment, into a plant cell, plant tissue or plant, and performing gene editing in the plant cell, plant tissue or plant;
    (ii) screening and identifying the plant cell, plant tissue or plant in which the gene editing has occurred;
    (iii) regenerating or culturing the plant cell, plant tissue or plant identified in step (ii) as having undergone the gene editing;
    the plant is selected from any one of Gramineae, Leguminosae, Solanaceae or Cruciferae plants.
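The structural limits stated in claims 1 and 9 can be expressed as simple checks, which may help when screening candidate designs. This is only a sketch: the function names are hypothetical, the NLS and guide-length inputs are illustrative, and `check_array` requires both numeric limits of claim 9 at once, which is stricter than the "and/or" wording of the claim.

```python
from typing import List, Optional

# Fusion layouts named in claim 1 (B1/B2 = nuclear localization signals, C = MAD7).
ALLOWED_LAYOUTS = {"B1-C-B2", "B1-C", "C-B2"}


def fusion_layout(n_nls: Optional[str], c_nls: Optional[str]) -> str:
    """Return the claim-1 layout for the given N- and C-terminal NLS choices."""
    parts = []
    if n_nls:
        parts.append("B1")
    parts.append("C")
    if c_nls:
        parts.append("B2")
    layout = "-".join(parts)
    if layout not in ALLOWED_LAYOUTS:
        # MAD7 without any NLS is not one of the three claimed structures.
        raise ValueError("at least one NLS is required")
    return layout


def check_array(guide_lengths: List[int]) -> bool:
    """True if the array has 2-50 repeat units and every guide is 17-35 bp.

    Simplification: both limits are enforced together, although claim 9
    joins them with "and/or".
    """
    return 2 <= len(guide_lengths) <= 50 and all(17 <= n <= 35 for n in guide_lengths)


# Illustrative inputs only; these are not designs disclosed in the application.
print(fusion_layout("SV40", "SV40"))
print(check_array([23, 23, 23, 23]))
```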
PCT/CN2021/138184 2021-11-29 2021-12-15 Mad7-nls fusion protein, and nucleic acid construct for site-directed editing of plant genome and application thereof WO2023092731A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111427444.X 2021-11-29
CN202111427444.XA CN113846075A (en) 2021-11-29 2021-11-29 MAD7-NLS fusion protein, nucleic acid construct for site-directed editing of plant genome and application thereof

Publications (1)

Publication Number Publication Date
WO2023092731A1 true WO2023092731A1 (en) 2023-06-01

Family

ID=78982191

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/138184 WO2023092731A1 (en) 2021-11-29 2021-12-15 Mad7-nls fusion protein, and nucleic acid construct for site-directed editing of plant genome and application thereof

Country Status (2)

Country Link
CN (1) CN113846075A (en)
WO (1) WO2023092731A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113846075A (en) * 2021-11-29 2021-12-28 科稷达隆(北京)生物技术有限公司 MAD7-NLS fusion protein, nucleic acid construct for site-directed editing of plant genome and application thereof
CN114438123B (en) * 2022-03-07 2024-04-02 中量大黄山高质量发展研究院有限公司 Dicotyledon polygene editing vector and construction method thereof

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105821073A (en) * 2015-01-27 2016-08-03 中国科学院遗传与发育生物学研究所 Method of site-directed modification for intact plant by means of gene transient expression
CN108130342A (en) * 2016-12-01 2018-06-08 中国科学院上海生命科学研究院 Plant Genome fixed point edit methods based on Cpf1
CN109750062A (en) * 2019-03-12 2019-05-14 湖南杂交水稻研究中心 A kind of rice breeding method
CN111621515A (en) * 2020-05-14 2020-09-04 中国计量大学 Method for enhancing gene editing efficiency of CRISPR/Cas9 system
CN112251419A (en) * 2019-11-07 2021-01-22 青岛清原化合物有限公司 Method for generating new mutation in organism and application
WO2021074191A1 (en) * 2019-10-14 2021-04-22 KWS SAAT SE & Co. KGaA Mad7 nuclease in plants and expanding its pam recognition capability
WO2021081384A1 (en) * 2019-10-25 2021-04-29 Greenvenus, Llc Synthetic nucleases
US20210130838A1 (en) * 2019-11-05 2021-05-06 University Of Maryland, College Park SYSTEMS AND METHODS FOR PLANT GENOME EDITING USING CAS 12a ORTHOLOGS
CN113846075A (en) * 2021-11-29 2021-12-28 科稷达隆(北京)生物技术有限公司 MAD7-NLS fusion protein, nucleic acid construct for site-directed editing of plant genome and application thereof

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
NZ760730A (en) * 2017-06-23 2023-04-28 Inscripta Inc Nucleic acid-guided nucleases
US10011849B1 (en) * 2017-06-23 2018-07-03 Inscripta, Inc. Nucleic acid-guided nucleases
AU2020310837A1 (en) * 2019-07-08 2022-02-24 Inscripta, Inc. Increased nucleic acid-guided cell editing via a LexA-Rad51 fusion protein

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105821073A (en) * 2015-01-27 2016-08-03 中国科学院遗传与发育生物学研究所 Method of site-directed modification for intact plant by means of gene transient expression
CN108130342A (en) * 2016-12-01 2018-06-08 中国科学院上海生命科学研究院 Plant Genome fixed point edit methods based on Cpf1
CN109750062A (en) * 2019-03-12 2019-05-14 湖南杂交水稻研究中心 A kind of rice breeding method
WO2021074191A1 (en) * 2019-10-14 2021-04-22 KWS SAAT SE & Co. KGaA Mad7 nuclease in plants and expanding its pam recognition capability
WO2021081384A1 (en) * 2019-10-25 2021-04-29 Greenvenus, Llc Synthetic nucleases
US20210130838A1 (en) * 2019-11-05 2021-05-06 University Of Maryland, College Park SYSTEMS AND METHODS FOR PLANT GENOME EDITING USING CAS 12a ORTHOLOGS
CN112251419A (en) * 2019-11-07 2021-01-22 青岛清原化合物有限公司 Method for generating new mutation in organism and application
CN111621515A (en) * 2020-05-14 2020-09-04 中国计量大学 Method for enhancing gene editing efficiency of CRISPR/Cas9 system
CN113846075A (en) * 2021-11-29 2021-12-28 科稷达隆(北京)生物技术有限公司 MAD7-NLS fusion protein, nucleic acid construct for site-directed editing of plant genome and application thereof

Also Published As

Publication number Publication date
CN113846075A (en) 2021-12-28

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21965452

Country of ref document: EP

Kind code of ref document: A1