CN112979821B - Fusion protein for improving gene editing efficiency and application thereof - Google Patents

Info

Publication number
CN112979821B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911310969.8A
Other languages
Chinese (zh)
Other versions
CN112979821A (en)
Inventor
李大力
张晓辉
刘明耀
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
East China Normal University
Bioray Laboratories Inc
Original Assignee
East China Normal University
Bioray Laboratories Inc
Application filed by East China Normal University and Bioray Laboratories Inc
Priority to CN201911310969.8A (CN112979821B)
Priority to EP20903960.1A (EP4079765A4)
Priority to JP2022538379A (JP2023507034A)
Priority to PCT/CN2020/137239 (WO2021121321A1)
Publication of CN112979821A
Application granted
Publication of CN112979821B
Priority to US17/843,462 (US20220364072A1)
Legal status: Active

Classifications

    • C12N9/78 Hydrolases (3) acting on carbon to nitrogen bonds other than peptide bonds (3.5)
    • C12N15/113 Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C12N9/22 Ribonucleases (RNases); DNases
    • C12Y305/04001 Cytosine deaminase (3.5.4.1)
    • C07K2319/80 Fusion polypeptide containing a DNA binding domain, e.g. LacI or Tet-repressor
    • C12N2310/10 Type of nucleic acid


Abstract

The application discloses a fusion protein for improving gene editing efficiency and an application thereof. The fusion protein comprises a single-stranded DNA binding protein functional domain, a nucleoside deaminase and a nuclease. When CBEs convert C-G base pairs to T-A, the nucleoside deaminase (for example, cytosine deaminase) uses single-stranded DNA as its deamination substrate. Fusing a single-stranded DNA binding protein functional domain to the fusion protein of the nucleoside deaminase and the nuclease greatly increases the chance that the single-stranded DNA is exposed to the nucleoside deaminase, and thus markedly improves base editing efficiency. The invention makes a breakthrough improvement to single-base gene editing technology and can greatly promote its application in gene editing, gene therapy, cell therapy, animal model creation, crop genetic breeding and the like.

Description

Fusion protein for improving gene editing efficiency and application thereof
Technical Field
The invention relates to the field of biotechnology, in particular to a fusion protein for improving gene editing efficiency and application thereof.
Background
Since 2013, a new generation of gene editing technology represented by CRISPR/Cas9 has entered experiments across the field of biology and changed traditional methods of genetic manipulation. Single-base gene editing was first reported by David Liu's laboratory in 2016, after which other single-base gene editing techniques based on cytosine deaminases (e.g., cytosine deaminases from lamprey or human fused in different ways to dCas9 or Cas9n) were reported in succession. The technique uses spCas9 from Streptococcus pyogenes, the Cas9 of the CRISPR/Cas9 system that takes NGG as its PAM, to recognize and specifically bind DNA, and thereby achieves C-to-T or G-to-A single-base mutations upstream of the NGG.
Single-base gene editing techniques have been reported to be useful for efficient gene mutation or repair in genomes, creation of animal disease models, and gene therapy. Among the single-base gene editing tools reported so far, BE3 (base editor 3) is the most widely used. BE3 shows great potential for single-base modification of the genome and for therapy of single-base mutations, with base substitution efficiencies of up to 37%, much higher than those achieved by homologous recombination, while maintaining low off-target effects. Later work found that adding two or more additional copies of UGI (uracil DNA glycosylase inhibitor) to BE3 further enhances its editing efficiency and product purity. Introducing a bipartite NLS (nuclear localization signal) and codon optimization, which gave BE4max, improved editing efficiency further. Each of these methods improves efficiency to some degree, but the gains are limited.
Disclosure of Invention
Aiming at the defects in the prior art, the invention aims to provide a fusion protein for improving the gene editing efficiency and application thereof.
In one aspect, the invention provides a fusion protein for increasing gene editing efficiency, comprising a single-stranded DNA binding protein domain, a nucleoside deaminase, and a nuclease.
Specifically, the components of the fusion protein are arranged as follows: the nucleoside deaminase is positioned at the N-terminus or C-terminus of the nuclease, and the single-stranded DNA binding protein functional domain is positioned N-terminal to, C-terminal to, and/or between the nucleoside deaminase and the nuclease;
preferably, the nucleoside deaminase is located at the N-terminus of the nuclease;
more preferably, the single-stranded DNA-binding protein functional domain is located between the nucleoside deaminase and the nuclease.
In the above fusion protein, the single-stranded DNA binding protein includes a sequence-specific single-stranded DNA binding protein and/or a non-sequence-specific single-stranded DNA binding protein, preferably a non-sequence-specific single-stranded DNA binding protein;
preferably, the non-sequence-specific single-stranded DNA binding protein is selected from any one or more of RPA70 (70 kDa subunit of human replication protein A), RPA32 (32 kDa subunit of human replication protein A), BRCA2 (breast cancer gene 2), hnRNPK (heterogeneous nuclear ribonucleoprotein K), PUF60 (poly-U binding splicing factor 60 kDa) and Rad51 (a homologous recombination repair protein);
preferably, the sequence-specific single-stranded DNA binding protein is selected from any one or more of TEBP (telomere binding protein), Teb1 (a constituent protein of telomerase) and POT1 (human protection of telomeres protein 1);
preferably, the single-stranded DNA binding protein functional domain comprises at least one (any one, any two, any three or all four) of the following four domains of the single-stranded DNA binding protein, or a partial polypeptide fragment of these domains that retains the function of binding single-stranded DNA, or any combination thereof: OB fold (oligonucleotide/oligosaccharide-binding fold), KH domain (K homology domain), RRM (RNA recognition motif) and Whirly domain;
more preferably, the single-stranded DNA binding protein functional domain comprises the DNA Binding Domain (DBD) of Rad51, more preferably, the amino acid sequence of the DNA binding domain of Rad51 comprises the sequence shown in SEQ ID No.1, more preferably, the coding sequence of the DNA binding domain of Rad51 comprises the sequence shown in SEQ ID No. 2;
more preferably, the amino acid sequence of the DNA binding domain of RPA70 comprises the sequence shown in SEQ ID No.19, and even more preferably, the coding sequence of the DNA binding domain of RPA70 comprises the sequence shown in SEQ ID No. 20.
In the above fusion protein, the deaminase comprises cytosine deaminase (APOBEC) and/or adenosine deaminase, preferably cytosine deaminase, which can be derived from different organisms,
more preferably, the cytosine deaminase is rat-derived cytosine deaminase, more preferably, the amino acid sequence of the rat-derived cytosine deaminase comprises the sequence shown in SEQ ID No.3, and more preferably, the coding sequence of the rat-derived cytosine deaminase comprises the sequence shown in SEQ ID No. 4;
the nuclease is selected from one or more of Cas9, Cas3, Cas8a, Cas8b, Cas10d, Cse1, Csy1, Csn2, Cas4, Cas10, Csm2, Cmr5, Fok1 and Cpf1; preferably, the nuclease is Cas9; more preferably, the Cas9 is selected from Cas9 derived from Streptococcus pneumoniae, Staphylococcus aureus, Streptococcus pyogenes or Streptococcus thermophilus; more preferably, the Cas9 is selected from the Cas9 mutants VQR-spCas9, VRER-spCas9 and spCas9n, more preferably spCas9n; more preferably, the amino acid sequence of the spCas9n comprises the sequence shown in SEQ ID No.5; more preferably, the coding sequence of the spCas9n comprises the sequence shown in SEQ ID No.6.
In the above fusion protein, an NLS (nuclear localization signal) is further included; preferably, the NLS is located at at least one end (C-terminus and/or N-terminus) of the fusion protein; more preferably, the amino acid sequence of the NLS comprises the sequence shown in SEQ ID No.7, and more preferably, the coding sequence of the NLS comprises the sequence shown in SEQ ID No.8;
the fusion protein further comprises two copies of UGI (uracil DNA glycosylase inhibitor); preferably, the UGI is located at at least one end (C-terminus and/or N-terminus) of the fusion protein; more preferably, the amino acid sequence of the UGI comprises the sequence shown in SEQ ID No.9, and more preferably, the coding sequence of the UGI comprises the sequence shown in SEQ ID No.10.
In another aspect, the present invention also provides any one of the following biomaterials A)-C):
A) a gene encoding a fusion protein as described in any one of the above; the gene is DNA or RNA (such as mRNA);
B) a recombinant vector comprising the gene of A); the recombinant vector comprises a viral vector and/or a non-viral vector; the viral vector comprises an adeno-associated virus vector, an adenovirus vector, a lentivirus vector, a retrovirus vector and/or an oncolytic virus vector, and the non-viral vector comprises a cationic polymer, a plasmid vector and/or a liposome;
C) a recombinant cell or recombinant bacterium containing the fusion protein or the gene of A), wherein the recombinant bacterium can be an engineering bacterium, and the recombinant cell can be a target cell to be edited, such as an immune cell (such as a T cell), a hematopoietic stem cell, a red blood cell and the like.
In another aspect, the present invention provides a single-base gene editing system comprising any one of the above fusion proteins or biological materials, and an sgRNA, wherein the sgRNA guides the fusion protein to perform single-base gene editing on a target gene in a target cell;
preferably, the target sequence of the sgRNA includes at least one of SEQ ID Nos. 11 to 18.
In a specific implementation, the target sequence of the sgRNA includes any one, any two, any three, any four, any five, any six, any seven, or all eight of SEQ ID Nos. 11 to 18.
In another aspect, the invention provides the use of any of the fusion proteins, the biological materials, and the single base gene editing systems described above in the preparation of gene editing products, disease treatment and/or prevention products, animal models, or new plant varieties.
In another aspect, the present invention provides a method for improving single-base gene editing efficiency, including the steps of introducing a fusion protein and sgRNA of any one of the above into a cell, and performing gene editing on a target gene, wherein the sgRNA guides the fusion protein to perform single-base gene editing on the target gene.
In the above method, preferably, the target sequence of the sgRNA includes at least one of SEQ ID Nos. 11 to 18.
The invention has the following beneficial effects:
According to the invention, when CBEs (cytosine base editors, a pyrimidine base conversion technology) convert C-G base pairs to T-A, the nucleoside deaminase (for example, cytosine deaminase) uses single-stranded DNA as its deamination substrate. Fusing a single-stranded DNA binding protein functional domain to the fusion protein of the nucleoside deaminase and the nuclease greatly increases the chance that the single-stranded DNA is exposed to the nucleoside deaminase, and thus markedly improves base editing efficiency.
By screening 10 non-sequence-specific single-stranded DNA binding protein functional domains for fusion with BE4max, the invention found that fusing a single-stranded DNA binding domain of human Rad51 (amino acids 1-114) between Apobec1 and Cas9n gives the greatest improvement in efficiency; this construct is named hyBE4max. Compared with BE4max, the C-G to T-A editing efficiency of hyBE4max is improved by up to 16-fold, the improvement being most pronounced at sites close to the PAM region, while indels (insertions or deletions) remain low.
The invention makes breakthrough improvement on the single-base gene editing technology and can greatly promote the application of the single-base gene editing technology in the aspects of gene editing, gene therapy, cell therapy, animal model making, crop genetic breeding and the like.
Drawings
FIG. 1 is a schematic diagram of the structures in which different single-stranded DNA binding protein functional domains are fused with BE4max. NLS is the nuclear localization signal (amino acid sequence shown as SEQ ID No.7, coding sequence shown as SEQ ID No.8); rA1 is the cytidine deaminase APOBEC1 (amino acid sequence shown as SEQ ID No.3, coding sequence shown as SEQ ID No.4); spCas9n is Cas9n derived from Streptococcus pyogenes (amino acid sequence shown as SEQ ID No.5, coding sequence shown as SEQ ID No.6); UGI is the uracil DNA glycosylase inhibitor (amino acid sequence shown as SEQ ID No.9, coding sequence shown as SEQ ID No.10); and SSDBD is the single-stranded DNA binding protein functional domain.
FIG. 2 compares the C-to-T base editing efficiency (ordinate, in %) of hyBE4max and BE4max at 8 targets in 293T cells.
FIG. 3 compares the average C-to-T base editing efficiency (ordinate, in %) of hyBE4max and BE4max across 8 targets in 293T cells.
FIG. 4 compares the indel frequency (ordinate, in %) produced by hyBE4max and BE4max at 8 targets in 293T cells.
FIG. 5 is a schematic structural diagram of the fusion proteins A3A-BE4max and hyA3A-BE4max. hA3A is the human cytidine deaminase APOBEC3A (amino acid sequence shown as SEQ ID No.21, coding sequence shown as SEQ ID No.22); NLS, spCas9n and UGI are as in FIG. 1.
FIG. 6 compares the C-to-T base editing efficiency (ordinate, in %) of hyA3A-BE4max and A3A-BE4max at 8 endogenous targets in 293T cells.
FIG. 7 compares the C-to-T base editing efficiency (ordinate, in %) of hyA3A-BE4max and A3A-BE4max at 8 endogenous targets in 293T cells.
FIG. 8 compares the indel frequency (ordinate, in %) produced by hyA3A-BE4max and A3A-BE4max at 8 endogenous targets in 293T cells.
FIG. 9 is a schematic structural diagram of the fusion proteins eA3A-BE4max and hyeA3A-BE4max. A3A N57G is the N57G mutant of the hA3A used in FIG. 5; NLS, spCas9n and UGI are as in FIG. 1.
FIG. 10 compares the C-to-T base editing efficiency (ordinate, in %) of hyeA3A-BE4max and eA3A-BE4max at 11 endogenous targets in 293T cells.
FIG. 11 compares the C-to-T base editing efficiency (ordinate, in %) of hyeA3A-BE4max and eA3A-BE4max at 11 endogenous targets in 293T cells.
FIG. 12 compares the indel frequency (ordinate, in %) produced by hyeA3A-BE4max and eA3A-BE4max at 11 endogenous targets in 293T cells.
In FIGS. 2, 3, 5, 6 and 11, the abscissa label C followed by a number indicates the position of the C edited to T in the corresponding target sequence; e.g., C5 denotes the editing efficiency of the C at position 5, counted from the 5' end of the corresponding target sequence.
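The Cn labels used on the abscissas can be sketched in Python as follows. This is only a minimal illustration, assuming that each 23-nt target sequence in Table 2 is written 5' to 3' with a 3-nt PAM at its 3' end; the function name `cytosine_positions` is hypothetical and is not part of the original disclosure.

```python
def cytosine_positions(target, pam_len=3):
    """Return labels such as 'C5' for every C in the protospacer.

    Positions are 1-based and counted from the 5' end of the target.
    The last `pam_len` bases (assumed to be the PAM) are excluded.
    """
    protospacer = target.upper()[:len(target) - pam_len]
    return [f"C{i}" for i, base in enumerate(protospacer, start=1) if base == "C"]


# EMX1 site1 from Table 2 (SEQ ID No. 11)
print(cytosine_positions("GAGTCCGAGCAGAAGAAGAAGGG"))
```

For EMX1 site1 the sketch lists C5, C6 and C10 as the cytosines that could appear on the abscissa.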
Detailed Description
The present invention is described in further detail below with reference to specific examples and the drawings, but is not limited to these examples. Variations and advantages that may occur to those skilled in the art without departing from the spirit and scope of the inventive concept are included in the invention, whose scope of protection is defined by the appended claims. Except where specifically noted below, the procedures, conditions, reagents and experimental methods used to practice the invention are general knowledge in the art, and the invention places no particular limitation on them; for example, they follow Sambrook et al., Molecular Cloning: A Laboratory Manual (New York: Cold Spring Harbor Laboratory Press, 1989), or the manufacturer's recommendations.
First, fusing the Rad51 DBD (1-114 aa) single-stranded DNA binding protein functional domain gives the most pronounced improvement in BE4max editing efficiency
1.1 plasmid design and construction
1.1.1 Because Apobec1 in CBEs uses single-stranded DNA as its substrate in single-base editing, we designed 10 different human-derived, non-sequence-specific single-stranded DNA binding protein functional domains (namely RPA70 (630 aa)-A, RPA70-B, RPA70-AB, RPA70-C, RPA32-D, BRCA2-OB2, BRCA2-OB3, HNRNPK KH domain, PUF60 RRM and Rad51 DBD) (Table 1). Since fusions placed at the C-terminus of BE4max (first diagram from the top in FIG. 1) have been reported to tend to be inactive, these functional domains were fused at the N-terminus of BE4max (second diagram from the top in FIG. 1). Two human endogenous targets, EMX1 site1 and Tim3-sg1 (sequences shown in Table 2), were designed at the same time.
1.1.2 DNAs encoding the 10 different human-derived, non-sequence-specific single-stranded DNA binding protein functional domains shown in Table 1 were synthesized and then seamlessly assembled at the N-terminus of BE4max in plasmid pCMV-BE4max (Addgene, #112093) to construct 10 recombinant plasmids (FIG. 1): pRPA70-A-BE4max, pRPA70-B-BE4max, pRPA70-AB-BE4max, pRPA70-C-BE4max, pRPA32-D-BE4max, pBRCA2-OB2-BE4max, pBRCA2-OB3-BE4max, pKH-BE4max, pRRM-BE4max and pRad51DBD-BE4max.
DNAs of the targets EMX1 site1 and Tim3-sg1 shown in Table 2 were synthesized and ligated separately into the BbsI site of the sgRNA expression plasmid U6-sgRNA-EF1α-GFP (which expresses the sgRNA of the corresponding target) to obtain recombinant plasmids pE and pT.
1.1.3 The plasmids constructed in 1.1.1 and 1.1.2 were verified by Sanger sequencing to ensure that they were completely correct.
TABLE 1 sequences of different functional domains of the Single-stranded DNA binding proteins used
[Table 1 is reproduced as images in the original publication: Figures BDA0002324514780000041, BDA0002324514780000051 and BDA0002324514780000061.]
TABLE 2 targets and sequences used
Target name     Sequence (5'-3')            SEQ ID No.
EMX1 site1      GAGTCCGAGCAGAAGAAGAAGGG     11
Tim3-sg1        TTCTACACCCCAGCCGCCCCAGG     12
VEGFA site2     GACCCCCTCCACCCCGCCTCCGG     13
Lag3-sg2        CGCTACACGGTGCTGAGCGTGGG     14
HEK3            GGCCCAGACTGAGCACGTGATGG     15
HEK4            GGCACTGCGGCTGGAGGTGGGGG     16
EMX1-sg2p       GACATCGATGTCCTCCCCATTGG     17
Nme1-sg1        AGGGATCGTCTTTCAAGGCGAGG     18
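The background section states that spCas9 takes NGG as its PAM, and each 23-nt entry in Table 2 appears to consist of a 20-nt protospacer followed by a 3-nt PAM. The Python sketch below checks that reading for the eight targets. The 20 + 3 split is an assumption made here for illustration, and the helper names `split_target` and `has_ngg_pam` are hypothetical.

```python
# Target sequences copied from Table 2 (SEQ ID Nos. 11-18).
TARGETS = {
    "EMX1 site1": "GAGTCCGAGCAGAAGAAGAAGGG",
    "Tim3-sg1": "TTCTACACCCCAGCCGCCCCAGG",
    "VEGFA site2": "GACCCCCTCCACCCCGCCTCCGG",
    "Lag3-sg2": "CGCTACACGGTGCTGAGCGTGGG",
    "HEK3": "GGCCCAGACTGAGCACGTGATGG",
    "HEK4": "GGCACTGCGGCTGGAGGTGGGGG",
    "EMX1-sg2p": "GACATCGATGTCCTCCCCATTGG",
    "Nme1-sg1": "AGGGATCGTCTTTCAAGGCGAGG",
}


def split_target(seq, pam_len=3):
    """Split a target into (protospacer, PAM), assuming the PAM is at the 3' end."""
    seq = seq.upper()
    return seq[:-pam_len], seq[-pam_len:]


def has_ngg_pam(seq):
    """True if the last three bases match the NGG pattern (any base, then GG)."""
    _, pam = split_target(seq)
    return len(pam) == 3 and pam[1:] == "GG"


for name, seq in TARGETS.items():
    protospacer, pam = split_target(seq)
    print(f"{name:12s} {protospacer} {pam} NGG={has_ngg_pam(seq)}")
```

Under this assumption every target in Table 2 ends in an NGG PAM, which is consistent with the use of spCas9n in the fusion proteins.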
1.2 transfection of cells
HEK293T cells (5 × 10^5) were plated in 24-well plates. When the cells reached 70%-80% confluence, plasmid combinations were transfected at pssDBD-BE4max : pE (or pT) = 750 ng : 250 ng, with 3 replicate wells per plasmid combination and 2 × 10^5 cells per well. A blank control without any plasmid transfection was also included.
pssDBD-BE4max represents: any one of plasmids pRPA70-A-BE4max, pRPA70-B-BE4max, pRPA70-AB-BE4max, pRPA70-C-BE4max, pRPA32-D-BE4max, pBRCA2-OB2-BE4max, pBRCA2-OB3-BE4max, pKH-BE4max, pRRM-BE4max and pRad51DBD-BE4max, with plasmid pCMV-BE4max as a negative control.
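The transfection layout described in 1.2 can be tallied with a short Python sketch. It assumes that every editor plasmid, including the pCMV-BE4max control, is paired with both pE and pT in triplicate; the text does not spell this out for the control, so treat the well count as illustrative. The function name `transfection_plan` is hypothetical.

```python
EDITOR_NG = 750   # editor plasmid per well, from the 750 ng : 250 ng ratio
SGRNA_NG = 250    # sgRNA plasmid per well
REPLICATES = 3    # replicate wells per plasmid combination

EDITORS = [
    "pRPA70-A-BE4max", "pRPA70-B-BE4max", "pRPA70-AB-BE4max",
    "pRPA70-C-BE4max", "pRPA32-D-BE4max", "pBRCA2-OB2-BE4max",
    "pBRCA2-OB3-BE4max", "pKH-BE4max", "pRRM-BE4max",
    "pRad51DBD-BE4max",
    "pCMV-BE4max",  # control without a fused single-stranded DNA binding domain
]
SGRNA_PLASMIDS = ["pE", "pT"]


def transfection_plan(editors, sgrnas, replicates=REPLICATES):
    """List one entry per transfected well (blank control wells are not included)."""
    plan = []
    for editor in editors:
        for sgrna in sgrnas:
            for rep in range(1, replicates + 1):
                plan.append({
                    "editor": editor,
                    "sgRNA": sgrna,
                    "replicate": rep,
                    "editor_ng": EDITOR_NG,
                    "sgrna_ng": SGRNA_NG,
                })
    return plan


plan = transfection_plan(EDITORS, SGRNA_PLASMIDS)
per_editor_ng = sum(w["editor_ng"] for w in plan if w["editor"] == "pRad51DBD-BE4max")
print(len(plan), "transfected wells;", per_editor_ng, "ng of each editor plasmid")
```

Under these assumptions the plan covers 66 transfected wells and needs 4500 ng of each editor plasmid, before any allowance for pipetting loss.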
1.3 genome extraction and preparation of amplicon libraries
At 72 h after transfection, cell genomic DNA was extracted with the Tiangen cell genome extraction kit (DP304). Identification primers (Table 3) were then designed following the Hi-TOM kit workflow: the bridging sequence 5'-ggagtgagtacggtgtgc-3' was added to the 5' end of each forward identification primer, and the bridging sequence 5'-gagttggatgctggatgg-3' was added to the 5' end of each reverse identification primer. These primers gave the first-round PCR products, which then served as templates for the second-round PCR. The second-round PCR products were pooled, gel-purified, and sent to a company for deep sequencing.
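Adding the bridging sequences to the identification primers is a simple string operation, sketched below in Python. The two bridging sequences are taken from the text; the target-specific primer sequences in the example are placeholders invented for illustration (the actual primers are in Table 3), and the function name `add_bridging_sequences` is hypothetical.

```python
FWD_BRIDGE = "ggagtgagtacggtgtgc"  # added to the 5' end of the forward primer
REV_BRIDGE = "gagttggatgctggatgg"  # added to the 5' end of the reverse primer


def add_bridging_sequences(fwd_primer, rev_primer):
    """Return (forward, reverse) first-round primers, each written 5' to 3'.

    The bridge is kept in lower case and the target-specific part in
    upper case so that the two segments stay visually distinct.
    """
    return FWD_BRIDGE + fwd_primer.upper(), REV_BRIDGE + rev_primer.upper()


# Placeholder target-specific primers, NOT the primers of Table 3.
fwd, rev = add_bridging_sequences("ACGTACGTACGTACGTACGT", "TGCATGCATGCATGCATGCA")
print(fwd)
print(rev)
```

Both primers are written 5' to 3', so prepending the bridge places it at the 5' end as the workflow requires.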
TABLE 3 identifying primers for target used
[Table 3 is reproduced as images in the original publication: Figures BDA0002324514780000071 and BDA0002324514780000081.]
1.4 deep sequencing result analysis and statistics
The deep sequencing results from step 1.3 were analyzed with the BE-Analyzer website, and the C-to-T and indel frequencies were tallied; the results are shown in Tables 4 and 5.
The results show that, relative to BE4max, BE4max fused with the Rad51 single-stranded DNA binding protein functional domain (Rad51DBD-N-BE4max, also written Rad51DBD-BE4max) and BE4max fused with the RPA70-C single-stranded DNA binding protein functional domain gave the most pronounced improvements in C-to-T editing efficiency at the targets.
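The kind of tally performed in step 1.4 can be illustrated with a toy Python sketch. It is not BE-Analyzer's algorithm: it assumes the reads have already been trimmed to the protospacer, treats any read whose length differs from the reference as an indel, and uses the indel-free reads as the denominator for C-to-T frequencies. The reads below are invented for illustration, and the function name `summarize_reads` is hypothetical.

```python
def summarize_reads(reference, reads):
    """Return (C-to-T % for each reference C, indel %) for a toy read set."""
    ref = reference.upper()
    reads = [r.upper() for r in reads]
    if not reads:
        raise ValueError("no reads supplied")
    # Crude indel call: any read whose length differs from the reference.
    same_length = [r for r in reads if len(r) == len(ref)]
    indel_pct = 100.0 * (len(reads) - len(same_length)) / len(reads)
    c_to_t = {}
    for pos, base in enumerate(ref, start=1):
        if base != "C":
            continue
        t_reads = sum(1 for r in same_length if r[pos - 1] == "T")
        # Denominator: indel-free reads only (a simplifying choice).
        c_to_t[f"C{pos}"] = 100.0 * t_reads / len(same_length) if same_length else 0.0
    return c_to_t, indel_pct


reference = "GAGTCCGAGCAGAAGAAGAA"   # EMX1 site1 without its PAM
toy_reads = [
    "GAGTCCGAGCAGAAGAAGAA",          # unedited
    "GAGTTCGAGCAGAAGAAGAA",          # C5 -> T
    "GAGTTTGAGCAGAAGAAGAA",          # C5 and C6 -> T
    "GAGTCCGAGAAGAAGAA",             # 3-nt deletion, counted as an indel
]
c_to_t, indel_pct = summarize_reads(reference, toy_reads)
print(c_to_t, indel_pct)
```

A real analysis would align reads to the amplicon before counting; the sketch only shows how per-position percentages of the kind reported in Tables 4 and 5 could be derived from read counts.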
Second, hyBE4max has the best editing efficiency
To determine the best fusion position for the Rad51 single-stranded DNA binding protein functional domain, which gave the highest C-to-T editing efficiency at the targets in step one, Rad51DBD was also fused at two other positions in BE4max. The three recombinant plasmids expressing Rad51DBD-fused BE4max (third to fifth diagrams from the top in FIG. 1) were each co-transfected with recombinant plasmid pE or pT by the method of 1.2 in step one, and editing efficiencies were obtained by the methods of 1.3 and 1.4 in step one (Tables 4 and 5).
The three types of Rad51DBD-fused BE4max shown in the third to fifth diagrams from the top in FIG. 1 are as follows:
Rad51DBD-N-BE4max: Rad51DBD is fused between NLS and rA1 in BE4max, i.e., Rad51DBD is located N-terminal to rA1 and spCas9n;
Rad51DBD-C-BE4max: Rad51DBD is fused between spCas9n and UGI in BE4max, i.e., Rad51DBD is located C-terminal to rA1 and spCas9n;
hyBE4max: Rad51DBD is fused between rA1 and spCas9n in BE4max.
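The three insertion positions can be written out as ordered domain lists, as in the Python sketch below. The BE4max layout used here (an NLS at each end and two UGI copies, with linkers omitted) is an assumption based on the components named in the disclosure and in FIG. 1, not a sequence-level description; the function name `insert_between` is hypothetical.

```python
# Assumed N- to C-terminal domain order of BE4max; linkers are omitted.
BE4MAX = ["NLS", "rA1", "spCas9n", "UGI", "UGI", "NLS"]


def insert_between(layout, left, right, domain="Rad51DBD"):
    """Insert `domain` at the first place where `left` is directly followed by `right`."""
    for i in range(len(layout) - 1):
        if layout[i] == left and layout[i + 1] == right:
            return layout[:i + 1] + [domain] + layout[i + 1:]
    raise ValueError(f"{left} is not directly followed by {right}")


CONSTRUCTS = {
    "Rad51DBD-N-BE4max": insert_between(BE4MAX, "NLS", "rA1"),
    "Rad51DBD-C-BE4max": insert_between(BE4MAX, "spCas9n", "UGI"),
    "hyBE4max": insert_between(BE4MAX, "rA1", "spCas9n"),
}

for name, layout in CONSTRUCTS.items():
    print(f"{name:18s} {'-'.join(layout)}")
```

Only hyBE4max places Rad51DBD directly between the deaminase and the nickase, which is the arrangement the disclosure prefers.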
TABLE 4 editing efficiency results (unit,%) for target EMX1 site1
[Table 4 is reproduced as images in the original publication: Figures BDA0002324514780000082 and BDA0002324514780000091.]
TABLE 5 editing efficiency results (unit,%) for target Tim3-sg1
[Table 5 is reproduced as images in the original publication: Figures BDA0002324514780000092 and BDA0002324514780000101.]
The results in Tables 4 and 5 show that, compared with fusing Rad51DBD between NLS and rA1 in BE4max (i.e., Rad51DBD-N-BE4max), fusing Rad51DBD between rA1 and spCas9n in BE4max (i.e., hyBE4max) gives the most pronounced improvement in C-to-T editing efficiency at the targets.
Third, working characteristics of hyBE4max
To characterize the performance of hyBE4max more fairly, 6 additional targets were designed, namely VEGFA site2, Lag3-sg2, HEK3, HEK4, EMX1-sg2p and Nme1-sg1 (sequences as in Table 2), and ligated into the BbsI site of plasmid U6-sgRNA-EF1α-GFP to give recombinant plasmids pV, pL, pH3, pH4, pEP and pN. The plasmids were verified by Sanger sequencing to ensure that they were completely correct.
The recombinant plasmid expressing hyBE4max from step two was co-transfected into cells with recombinant plasmid pE, pT, pV, pL, pH3, pH4, pEP or pN by the method of 1.2 in step one. Editing efficiencies were obtained by the methods of 1.3 and 1.4 in step one, and statistical plots were drawn with GraphPad Prism 8.0.
As shown in FIGS. 2 and 3, within editing window C3-C8 the C-to-T editing efficiency of hyBE4max was 19-71%, against 13-47% for BE4max; within editing window C9-C12 it was 19-55% for hyBE4max, against 1.4-17% for BE4max. On average, the C-to-T editing efficiency of hyBE4max was 1.6-2.2 times that of BE4max within C3-C8 and 3.3-17 times that of BE4max within C9-C12. Meanwhile, indel production by hyBE4max remained low (FIG. 4).
Fourth, effect of fusion proteins containing different cytosine deaminases
(I) Working characteristics of the fusion protein hyA3A-BE4max
4.1.1 Rad51-DBD was synthesized according to the coding sequence in Table 1 and then assembled by seamless cloning between hA3A and spCas9n in plasmid pCMV-A3A-BE4max, which expresses protein A3A-BE4max (FIG. 5), to construct recombinant plasmid pA expressing the fusion protein hyA3A-BE4max (FIG. 5).
4.1.2 Eight human endogenous targets were synthesized: the target sequences of EMX1 site1, Tim3-sg1, VEGFA site2, EMX1-sg2p and Nme1-sg1 are shown in Table 2, and those of FANCF site1, EGFR-sg5 and EGFR-sg21 are shown in Table 6. Each was ligated into the BbsI site of the sgRNA expression plasmid pU6-sgRNA-EF1α-GFP to obtain recombinant plasmids pB1, pB2, …… pB8, which express the sgRNAs of the corresponding targets.
4.1.3 The plasmids constructed in 4.1.1 and 4.1.2 were verified by Sanger sequencing to ensure that they were completely correct.
TABLE 6 targets and sequences used
Target name     Sequence (5'-3')            SEQ ID No.
FANCF site1     GGAATCCCTTCTGCAGCACCTGG     23
EGFR-sg5        GTGCTGGGCTCCGGTGCGTTCGG     24
EGFR-sg21       CAAAGCAGAAACTCACATCGAGG     25
4.1.4 transfection of cells
HEK293T cells (5 × 10^5) were plated in 24-well plates. When the cells reached 70%-80% confluence, plasmid combinations were transfected at pA (or plasmid pCMV-A3A-BE4max) : pB1 (or pB2, pB3, …… pB8) = 750 ng : 250 ng, with 3 replicate wells per plasmid combination and 2 × 10^5 cells per well. A blank control without any plasmid transfection was also included.
4.1.5 genome extraction and preparation of amplicon libraries
The procedure followed step 1.3. The identification primers for the targets FANCF site1, EGFR-sg5 and EGFR-sg21 are shown in Table 7, and those for the remaining targets are shown in Table 3.
TABLE 7 identifying primers for target used
[Table 7 is reproduced as an image in the original publication: Figure BDA0002324514780000111.]
4.1.6 deep sequencing result analysis and statistics
The procedure was as in step 1.4.
The results show that, compared with protein A3A-BE4max, the fusion protein hyA3A-BE4max markedly improved single-base C-to-T editing efficiency at different positions (C3-C15) of each target (FIG. 6). Compared with A3A-BE4max, the high-activity window of hyA3A-BE4max expanded from C3-C11 to C3-C15. At C3-C11, distal to the PAM region, the C-to-T editing efficiency of hyA3A-BE4max was 1.1-2.3 times that of A3A-BE4max; at C12-C15, proximal to the PAM region, it was 3.1-4.1 times that of A3A-BE4max. That is, the improvement was more pronounced at C12-C15 near the PAM region (FIG. 7). Meanwhile, hyA3A-BE4max maintained low indel levels (FIG. 8).
(II) Working characteristics of the fusion protein hyeA3A-BE4max
4.2.1 working System plasmid construction
Rad51-DBD was synthesized according to the coding sequence in Table 1 and then assembled by seamless cloning between eA3A and spCas9n in plasmid pCMV-eA3A-BE4max, which expresses protein eA3A-BE4max (FIG. 9), to construct recombinant plasmid pAe expressing the fusion protein hyeA3A-BE4max (FIG. 9).
4.2.2 Construction of target plasmids
In parallel, 11 human endogenous targets were designed and synthesized: the target sequences of EMX1-sg2p, EMX1 site1 and Nme1-sg1 are shown in Table 2, the target sequence of EGFR-sg21 is shown in Table 6, and the remaining target sequences are shown in Table 8. Each was ligated into the BbsI site of the sgRNA expression plasmid U6-sgRNA-EF1α-GFP, which expresses the sgRNA of the corresponding target, to obtain recombinant plasmids pC1, pC2, …… pC11.
4.2.3 The plasmids constructed in 4.2.1 and 4.2.2 were verified by Sanger sequencing to ensure that they were completely correct.
Table 8. Targets and sequences used
Target name    Sequence (5'-3')           SEQ ID No.
CTLA-sg1       CTCCCTCAAGCAGGCCCCGCTGG    26
EGFR-sg5       GTGCTGGGCTCCGGTGCGTTCGG    27
CDK10-sg1      TTCTCGGAGGCTCAGGTGCGTGG    28
EMX1-sg1       GCTCCCATCACATCAACCGGTGG    29
HPRT1-sg6      GCCCTCTGTGTGCTCAAGGGGGG    30
EGFR-sg26      CATGCCCTTCGGCTGCCTCCTGG    31
CCR5-sg1       TAATAATTGATGTCATAGATTGG    32
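Each 23-nt entry in Table 8 can be read as a 20-nt protospacer followed by a 3-nt PAM. The Python sketch below splits each target and lists the cytosines that fall inside positions 3-15, flagging those preceded by T (a TC motif). The sequences are copied from Table 8. The function names, the NGG check, and the numbering convention (position 1 at the PAM-distal end, which is consistent with the text calling C12-C15 the positions near the PAM) are assumptions made for illustration.

```python
# Target sequences copied from Table 8 (5'-3').
TARGETS = {
    "CTLA-sg1": "CTCCCTCAAGCAGGCCCCGCTGG",
    "EGFR-sg5": "GTGCTGGGCTCCGGTGCGTTCGG",
    "CDK10-sg1": "TTCTCGGAGGCTCAGGTGCGTGG",
    "EMX1-sg1": "GCTCCCATCACATCAACCGGTGG",
    "HPRT1-sg6": "GCCCTCTGTGTGCTCAAGGGGGG",
    "EGFR-sg26": "CATGCCCTTCGGCTGCCTCCTGG",
    "CCR5-sg1": "TAATAATTGATGTCATAGATTGG",
}


def split_target(seq):
    """Split a 23-nt target into (protospacer, PAM), assuming an NGG PAM."""
    seq = seq.upper()
    if len(seq) != 23:
        raise ValueError("expected a 23-nt target")
    protospacer, pam = seq[:20], seq[20:]
    if pam[1:] != "GG":
        raise ValueError("PAM is not NGG: " + pam)
    return protospacer, pam


def cytosines(protospacer, first=3, last=15):
    """List (position, in_TC_motif) for every C at positions first..last.

    Position 1 is the PAM-distal (5') base of the protospacer.
    """
    hits = []
    for pos in range(first, last + 1):
        if protospacer[pos - 1] == "C":
            in_tc = pos > 1 and protospacer[pos - 2] == "T"
            hits.append((pos, in_tc))
    return hits


for name, seq in TARGETS.items():
    protospacer, pam = split_target(seq)
    print(name, pam, cytosines(protospacer))
```

For CCR5-sg1, for example, the only cytosine in the window is C14, and it lies in a TC motif.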
4.2.4 Cell transfection: validation of the hyeA3A-BE4max working system
5 × 10^5 HEK293T cells were plated in 24-well plates. When the cells had grown to 70%-80% confluence, they were transfected with the plasmid combination pA (or plasmid pCMV-eA3A-BE4max) : pC1 (or pC2, pC3, ... pC11) = 750 ng : 250 ng. Each plasmid combination was transfected in 3 replicate wells, with 2 × 10^5 cells per well. A blank control without any plasmid transfection was set up at the same time.
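The amounts above imply simple bookkeeping for the experiment. The Python sketch below tallies it, assuming two editor plasmids (the fusion-protein plasmid and its pCMV-eA3A-BE4max control), the eleven target plasmids pC1 to pC11, and three replicate wells per combination. The function name and the returned keys are hypothetical, and the blank-control wells are left out because the text does not state how many there were.

```python
import math

EDITOR_NG = 750       # editor plasmid per well (from the text)
SGRNA_NG = 250        # sgRNA plasmid per well (from the text)
REPLICATES = 3        # wells per plasmid combination (from the text)
N_EDITORS = 2         # assumption: fusion-protein plasmid plus its control
N_TARGETS = 11        # pC1 ... pC11
WELLS_PER_PLATE = 24  # 24-well plates (from the text)


def transfection_plan(n_editors=N_EDITORS, n_targets=N_TARGETS,
                      replicates=REPLICATES):
    """Tally wells and plasmid mass for all editor/target combinations."""
    wells = n_editors * n_targets * replicates
    return {
        "transfected_wells": wells,
        "plates_needed": math.ceil(wells / WELLS_PER_PLATE),
        "dna_ng_per_well": EDITOR_NG + SGRNA_NG,
        "editor_ng_per_editor": EDITOR_NG * n_targets * replicates,
        "sgrna_ng_per_target": SGRNA_NG * n_editors * replicates,
    }


print(transfection_plan())
```

Under these assumptions there are 66 transfected wells, which need at least three 24-well plates before any blank-control wells are added.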
4.2.5 Genome extraction and preparation of amplicon libraries
The procedure followed step 1.3. The identification primers for EMX1-sg2p, EMX1 site1 and Nme1-sg1 are shown in Table 3, the identification primers for EGFR-sg21 are shown in Table 7, and the identification primers for the remaining targets are shown in Table 9.
Table 9. Identification primers for the targets used
Figure BDA0002324514780000121
4.2.6 Analysis and statistics of deep sequencing results
The procedure was as in step 1.4.
The results show that, compared with the protein eA3A-BE4max, the fusion protein hyeA3A-BE4max edited single C bases to T with markedly higher efficiency at most of the positions (C3-C15) of each target, that its high-activity window expanded from C3-C11 to C3-C15, and that it could specifically target a single C within a TC motif to achieve C-to-T conversion (FIG. 10). At C3-C11, distal to the PAM, the C-to-T editing efficiency of hyeA3A-BE4max was 1.6-2.8 times that of eA3A-BE4max; at C12-C15, proximal to the PAM, it was 4.5-31.9 times that of eA3A-BE4max. The improvement was therefore more pronounced near the PAM (FIG. 11). At the same time, hyeA3A-BE4max maintained a low level of indels (FIG. 12).
Matters not described in detail in this specification are within the knowledge of those skilled in the art. The above description is only an example of the present application and is not intended to limit it. Various modifications and changes will be apparent to those skilled in the art. Any modification, equivalent replacement or improvement made within the spirit and principle of the present application shall fall within the scope of the claims of the present application.
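The statement that the high-activity window expands from C3-C11 to C3-C15 can be illustrated as a threshold test on per-position efficiencies. The text does not define how the window was determined, so the cut-off, the function name and the numbers below are assumptions chosen only to show the idea; they are not measurements from this application.

```python
def activity_window(efficiency, threshold):
    """Return (first, last) positions whose efficiency is >= threshold.

    `efficiency` maps a protospacer position to a C-to-T editing
    efficiency in percent. Returns None if no position passes. Positions
    between first and last are not required to pass individually.
    """
    active = sorted(p for p, e in efficiency.items() if e >= threshold)
    if not active:
        return None
    return active[0], active[-1]


# Placeholder profiles, invented only to exercise the function.
reference_editor = {3: 20.0, 7: 35.0, 11: 22.0, 13: 4.0, 15: 1.0}
fusion_editor = {3: 30.0, 7: 50.0, 11: 40.0, 13: 28.0, 15: 21.0}

print(activity_window(reference_editor, threshold=20.0))  # (3, 11)
print(activity_window(fusion_editor, threshold=20.0))     # (3, 15)
```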
Sequence listing
<110> Bioray Laboratories Inc.; East China Normal University
<120> fusion protein for improving gene editing efficiency and application thereof
<130> JH-CNP191374
<160> 32
<170> PatentIn version 3.5
<210> 1
<211> 114
<212> PRT
<213> human (Homo sapiens)
<400> 1
Met Ala Met Gln Met Gln Leu Glu Ala Asn Ala Asp Thr Ser Val Glu
1 5 10 15
Glu Glu Ser Phe Gly Pro Gln Pro Ile Ser Arg Leu Glu Gln Cys Gly
20 25 30
Ile Asn Ala Asn Asp Val Lys Lys Leu Glu Glu Ala Gly Phe His Thr
35 40 45
Val Glu Ala Val Ala Tyr Ala Pro Lys Lys Glu Leu Ile Asn Ile Lys
50 55 60
Gly Ile Ser Glu Ala Lys Ala Asp Lys Ile Leu Ala Glu Ala Ala Lys
65 70 75 80
Leu Val Pro Met Gly Phe Thr Thr Ala Thr Glu Phe His Gln Arg Arg
85 90 95
Ser Glu Ile Ile Gln Ile Thr Thr Gly Ser Lys Glu Leu Asp Lys Leu
100 105 110
Leu Gln
<210> 2
<211> 342
<212> DNA
<213> human (Homo sapiens)
<400> 2
atggcaatgc agatgcagct tgaagcaaat gcagatactt cagtggaaga agaaagcttt 60
ggcccacaac ccatttcacg gttagagcag tgtggcataa atgccaacga tgtgaagaaa 120
ttggaagaag ctggattcca tactgtggag gctgttgcct atgcgccaaa gaaggagcta 180
ataaatatta agggaattag tgaagccaaa gctgataaaa ttctggctga ggcagctaaa 240
ttagttccaa tgggtttcac cactgcaact gaattccacc aaaggcggtc agagatcata 300
cagattacta ctggctccaa agagcttgac aaactacttc aa 342
<210> 3
<211> 228
<212> PRT
<213> rat (Rattus norvegicus)
<400> 3
Ser Ser Glu Thr Gly Pro Val Ala Val Asp Pro Thr Leu Arg Arg Arg
1 5 10 15
Ile Glu Pro His Glu Phe Glu Val Phe Phe Asp Pro Arg Glu Leu Arg
20 25 30
Lys Glu Thr Cys Leu Leu Tyr Glu Ile Asn Trp Gly Gly Arg His Ser
35 40 45
Ile Trp Arg His Thr Ser Gln Asn Thr Asn Lys His Val Glu Val Asn
50 55 60
Phe Ile Glu Lys Phe Thr Thr Glu Arg Tyr Phe Cys Pro Asn Thr Arg
65 70 75 80
Cys Ser Ile Thr Trp Phe Leu Ser Trp Ser Pro Cys Gly Glu Cys Ser
85 90 95
Arg Ala Ile Thr Glu Phe Leu Ser Arg Tyr Pro His Val Thr Leu Phe
100 105 110
Ile Tyr Ile Ala Arg Leu Tyr His His Ala Asp Pro Arg Asn Arg Gln
115 120 125
Gly Leu Arg Asp Leu Ile Ser Ser Gly Val Thr Ile Gln Ile Met Thr
130 135 140
Glu Gln Glu Ser Gly Tyr Cys Trp Arg Asn Phe Val Asn Tyr Ser Pro
145 150 155 160
Ser Asn Glu Ala His Trp Pro Arg Tyr Pro His Leu Trp Val Arg Leu
165 170 175
Tyr Val Leu Glu Leu Tyr Cys Ile Ile Leu Gly Leu Pro Pro Cys Leu
180 185 190
Asn Ile Leu Arg Arg Lys Gln Pro Gln Leu Thr Phe Phe Thr Ile Ala
195 200 205
Leu Gln Ser Cys His Tyr Gln Arg Leu Pro Pro His Ile Leu Trp Ala
210 215 220
Thr Gly Leu Lys
225
<210> 4
<211> 684
<212> DNA
<213> rat (Rattus norvegicus)
<400> 4
tcctcagaga ctgggcctgt cgccgtcgat ccaaccctgc gccgccggat tgaacctcac 60
gagtttgaag tgttctttga cccccgggag ctgagaaagg agacatgcct gctgtacgag 120
atcaactggg gaggcaggca ctccatctgg aggcacacct ctcagaacac aaataagcac 180
gtggaggtga acttcatcga gaagtttacc acagagcggt acttctgccc caataccaga 240
tgtagcatca catggtttct gagctggtcc ccttgcggag agtgtagcag ggccatcacc 300
gagttcctgt ccagatatcc acacgtgaca ctgtttatct acatcgccag gctgtatcac 360
cacgcagacc caaggaatag gcagggcctg cgcgatctga tcagctccgg cgtgaccatc 420
cagatcatga cagagcagga gtccggctac tgctggcgga acttcgtgaa ttattctcct 480
agcaacgagg cccactggcc taggtaccca cacctgtggg tgcgcctgta cgtgctggag 540
ctgtattgca tcatcctggg cctgccccct tgtctgaata tcctgcggag aaagcagccc 600
cagctgacct tctttacaat cgccctgcag tcttgtcact atcagaggct gccaccccac 660
atcctgtggg ccacaggcct gaag 684
<210> 5
<211> 1367
<212> PRT
<213> Streptococcus pyogenes (Streptococcus pyogenes)
<400> 5
Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val Gly
1 5 10 15
Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe Lys
20 25 30
Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile Gly
35 40 45
Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu Lys
50 55 60
Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys Tyr
65 70 75 80
Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser Phe
85 90 95
Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys His
100 105 110
Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr His
115 120 125
Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp Ser
130 135 140
Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His Met
145 150 155 160
Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro Asp
165 170 175
Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr Asn
180 185 190
Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala Lys
195 200 205
Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn Leu
210 215 220
Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn Leu
225 230 235 240
Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe Asp
245 250 255
Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp Asp
260 265 270
Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp Leu
275 280 285
Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp Ile
290 295 300
Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser Met
305 310 315 320
Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys Ala
325 330 335
Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe Asp
340 345 350
Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser Gln
355 360 365
Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp Gly
370 375 380
Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg Lys
385 390 395 400
Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu Gly
405 410 415
Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe Leu
420 425 430
Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile Pro
435 440 445
Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp Met
450 455 460
Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu Val
465 470 475 480
Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr Asn
485 490 495
Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser Leu
500 505 510
Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys Tyr
515 520 525
Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln Lys
530 535 540
Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr Val
545 550 555 560
Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp Ser
565 570 575
Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly Thr
580 585 590
Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp Asn
595 600 605
Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr Leu
610 615 620
Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala His
625 630 635 640
Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr Thr
645 650 655
Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp Lys
660 665 670
Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe Ala
675 680 685
Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe Lys
690 695 700
Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu His
705 710 715 720
Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly Ile
725 730 735
Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly Arg
740 745 750
His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln Thr
755 760 765
Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile Glu
770 775 780
Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro Val
785 790 795 800
Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu Gln
805 810 815
Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg Leu
820 825 830
Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys Asp
835 840 845
Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg Gly
850 855 860
Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys Asn
865 870 875 880
Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys Phe
885 890 895
Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp Lys
900 905 910
Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr Lys
915 920 925
His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp Glu
930 935 940
Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser Lys
945 950 955 960
Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg Glu
965 970 975
Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val Val
980 985 990
Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe Val
995 1000 1005
Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala Lys
1010 1015 1020
Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe Tyr
1025 1030 1035
Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala Asn
1040 1045 1050
Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu Thr
1055 1060 1065
Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val Arg
1070 1075 1080
Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr Glu
1085 1090 1095
Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys Arg
1100 1105 1110
Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro Lys
1115 1120 1125
Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val Leu
1130 1135 1140
Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys Ser
1145 1150 1155
Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser Phe
1160 1165 1170
Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys Glu
1175 1180 1185
Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu Phe
1190 1195 1200
Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Gly Glu
1205 1210 1215
Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val Asn
1220 1225 1230
Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser Pro
1235 1240 1245
Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys His
1250 1255 1260
Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys Arg
1265 1270 1275
Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala Tyr
1280 1285 1290
Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn Ile
1295 1300 1305
Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala Phe
1310 1315 1320
Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser Thr
1325 1330 1335
Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr Gly
1340 1345 1350
Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp
1355 1360 1365
<210> 6
<211> 4101
<212> DNA
<213> Streptococcus pyogenes (Streptococcus pyogenes)
<400> 6
gacaagaagt acagcatcgg cctggccatc ggcaccaact ctgtgggctg ggccgtgatc 60
accgacgagt acaaggtgcc cagcaagaaa ttcaaggtgc tgggcaacac cgaccggcac 120
agcatcaaga agaacctgat cggagccctg ctgttcgaca gcggcgaaac agccgaggcc 180
acccggctga agagaaccgc cagaagaaga tacaccagac ggaagaaccg gatctgctat 240
ctgcaagaga tcttcagcaa cgagatggcc aaggtggacg acagcttctt ccacagactg 300
gaagagtcct tcctggtgga agaggataag aagcacgagc ggcaccccat cttcggcaac 360
atcgtggacg aggtggccta ccacgagaag taccccacca tctaccacct gagaaagaaa 420
ctggtggaca gcaccgacaa ggccgacctg cggctgatct atctggccct ggcccacatg 480
atcaagttcc ggggccactt cctgatcgag ggcgacctga accccgacaa cagcgacgtg 540
gacaagctgt tcatccagct ggtgcagacc tacaaccagc tgttcgagga aaaccccatc 600
aacgccagcg gcgtggacgc caaggccatc ctgtctgcca gactgagcaa gagcagacgg 660
ctggaaaatc tgatcgccca gctgcccggc gagaagaaga atggcctgtt cggaaacctg 720
attgccctga gcctgggcct gacccccaac ttcaagagca acttcgacct ggccgaggat 780
gccaaactgc agctgagcaa ggacacctac gacgacgacc tggacaacct gctggcccag 840
atcggcgacc agtacgccga cctgtttctg gccgccaaga acctgtccga cgccatcctg 900
ctgagcgaca tcctgagagt gaacaccgag atcaccaagg cccccctgag cgcctctatg 960
atcaagagat acgacgagca ccaccaggac ctgaccctgc tgaaagctct cgtgcggcag 1020
cagctgcctg agaagtacaa agagattttc ttcgaccaga gcaagaacgg ctacgccggc 1080
tacattgacg gcggagccag ccaggaagag ttctacaagt tcatcaagcc catcctggaa 1140
aagatggacg gcaccgagga actgctcgtg aagctgaaca gagaggacct gctgcggaag 1200
cagcggacct tcgacaacgg cagcatcccc caccagatcc acctgggaga gctgcacgcc 1260
attctgcggc ggcaggaaga tttttaccca ttcctgaagg acaaccggga aaagatcgag 1320
aagatcctga ccttccgcat cccctactac gtgggccctc tggccagggg aaacagcaga 1380
ttcgcctgga tgaccagaaa gagcgaggaa accatcaccc cctggaactt cgaggaagtg 1440
gtggacaagg gcgcttccgc ccagagcttc atcgagcgga tgaccaactt cgataagaac 1500
ctgcccaacg agaaggtgct gcccaagcac agcctgctgt acgagtactt caccgtgtat 1560
aacgagctga ccaaagtgaa atacgtgacc gagggaatga gaaagcccgc cttcctgagc 1620
ggcgagcaga aaaaggccat cgtggacctg ctgttcaaga ccaaccggaa agtgaccgtg 1680
aagcagctga aagaggacta cttcaagaaa atcgagtgct tcgactccgt ggaaatctcc 1740
ggcgtggaag atcggttcaa cgcctccctg ggcacatacc acgatctgct gaaaattatc 1800
aaggacaagg acttcctgga caatgaggaa aacgaggaca ttctggaaga tatcgtgctg 1860
accctgacac tgtttgagga cagagagatg atcgaggaac ggctgaaaac ctatgcccac 1920
ctgttcgacg acaaagtgat gaagcagctg aagcggcgga gatacaccgg ctggggcagg 1980
ctgagccgga agctgatcaa cggcatccgg gacaagcagt ccggcaagac aatcctggat 2040
ttcctgaagt ccgacggctt cgccaacaga aacttcatgc agctgatcca cgacgacagc 2100
ctgaccttta aagaggacat ccagaaagcc caggtgtccg gccagggcga tagcctgcac 2160
gagcacattg ccaatctggc cggcagcccc gccattaaga agggcatcct gcagacagtg 2220
aaggtggtgg acgagctcgt gaaagtgatg ggccggcaca agcccgagaa catcgtgatc 2280
gaaatggcca gagagaacca gaccacccag aagggacaga agaacagccg cgagagaatg 2340
aagcggatcg aagagggcat caaagagctg ggcagccaga tcctgaaaga acaccccgtg 2400
gaaaacaccc agctgcagaa cgagaagctg tacctgtact acctgcagaa tgggcgggat 2460
atgtacgtgg accaggaact ggacatcaac cggctgtccg actacgatgt ggaccatatc 2520
gtgcctcaga gctttctgaa ggacgactcc atcgacaaca aggtgctgac cagaagcgac 2580
aagaaccggg gcaagagcga caacgtgccc tccgaagagg tcgtgaagaa gatgaagaac 2640
tactggcggc agctgctgaa cgccaagctg attacccaga gaaagttcga caatctgacc 2700
aaggccgaga gaggcggcct gagcgaactg gataaggccg gcttcatcaa gagacagctg 2760
gtggaaaccc ggcagatcac aaagcacgtg gcacagatcc tggactcccg gatgaacact 2820
aagtacgacg agaatgacaa gctgatccgg gaagtgaaag tgatcaccct gaagtccaag 2880
ctggtgtccg atttccggaa ggatttccag ttttacaaag tgcgcgagat caacaactac 2940
caccacgccc acgacgccta cctgaacgcc gtcgtgggaa ccgccctgat caaaaagtac 3000
cctaagctgg aaagcgagtt cgtgtacggc gactacaagg tgtacgacgt gcggaagatg 3060
atcgccaaga gcgagcagga aatcggcaag gctaccgcca agtacttctt ctacagcaac 3120
atcatgaact ttttcaagac cgagattacc ctggccaacg gcgagatccg gaagcggcct 3180
ctgatcgaga caaacggcga aaccggggag atcgtgtggg ataagggccg ggattttgcc 3240
accgtgcgga aagtgctgag catgccccaa gtgaatatcg tgaaaaagac cgaggtgcag 3300
acaggcggct tcagcaaaga gtctatcctg cccaagagga acagcgataa gctgatcgcc 3360
agaaagaagg actgggaccc taagaagtac ggcggcttcg acagccccac cgtggcctat 3420
tctgtgctgg tggtggccaa agtggaaaag ggcaagtcca agaaactgaa gagtgtgaaa 3480
gagctgctgg ggatcaccat catggaaaga agcagcttcg agaagaatcc catcgacttt 3540
ctggaagcca agggctacaa agaagtgaaa aaggacctga tcatcaagct gcctaagtac 3600
tccctgttcg agctggaaaa cggccggaag agaatgctgg cctctgccgg cgaactgcag 3660
aagggaaacg aactggccct gccctccaaa tatgtgaact tcctgtacct ggccagccac 3720
tatgagaagc tgaagggctc ccccgaggat aatgagcaga aacagctgtt tgtggaacag 3780
cacaagcact acctggacga gatcatcgag cagatcagcg agttctccaa gagagtgatc 3840
ctggccgacg ctaatctgga caaagtgctg tccgcctaca acaagcaccg ggataagccc 3900
atcagagagc aggccgagaa tatcatccac ctgtttaccc tgaccaatct gggagcccct 3960
gccgccttca agtactttga caccaccatc gaccggaaga ggtacaccag caccaaagag 4020
gtgctggacg ccaccctgat ccaccagagc atcaccggcc tgtacgagac acggatcgac 4080
ctgtctcagc tgggaggtga c 4101
<210> 7
<211> 18
<212> PRT
<213> Artificial sequence
<400> 7
Lys Arg Thr Ala Asp Gly Ser Glu Phe Glu Ser Pro Lys Lys Lys Arg
1 5 10 15
Lys Val
<210> 8
<211> 54
<212> DNA
<213> Artificial sequence
<400> 8
aaacggacag ccgacggaag cgagttcgag tcaccaaaga agaagcggaa agtc 54
<210> 9
<211> 176
<212> PRT
<213> Bacillus subtilis bacteriophage
<400> 9
Thr Asn Leu Ser Asp Ile Ile Glu Lys Glu Thr Gly Lys Gln Leu Val
1 5 10 15
Ile Gln Glu Ser Ile Leu Met Leu Pro Glu Glu Val Glu Glu Val Ile
20 25 30
Gly Asn Lys Pro Glu Ser Asp Ile Leu Val His Thr Ala Tyr Asp Glu
35 40 45
Ser Thr Asp Glu Asn Val Met Leu Leu Thr Ser Asp Ala Pro Glu Tyr
50 55 60
Lys Pro Trp Ala Leu Val Ile Gln Asp Ser Asn Gly Glu Asn Lys Ile
65 70 75 80
Lys Met Leu Ser Gly Gly Ser Gly Gly Ser Gly Gly Ser Thr Asn Leu
85 90 95
Ser Asp Ile Ile Glu Lys Glu Thr Gly Lys Gln Leu Val Ile Gln Glu
100 105 110
Ser Ile Leu Met Leu Pro Glu Glu Val Glu Glu Val Ile Gly Asn Lys
115 120 125
Pro Glu Ser Asp Ile Leu Val His Thr Ala Tyr Asp Glu Ser Thr Asp
130 135 140
Glu Asn Val Met Leu Leu Thr Ser Asp Ala Pro Glu Tyr Lys Pro Trp
145 150 155 160
Ala Leu Val Ile Gln Asp Ser Asn Gly Glu Asn Lys Ile Lys Met Leu
165 170 175
<210> 10
<211> 528
<212> DNA
<213> Bacillus subtilis bacteriophage
<400> 10
actaatctga gcgacatcat tgagaaggag actgggaaac agctggtcat tcaggagtcc 60
atcctgatgc tgcctgagga ggtggaggaa gtgatcggca acaagccaga gtctgacatc 120
ctggtgcaca ccgcctacga cgagtccaca gatgagaatg tgatgctgct gacctctgac 180
gcccccgagt ataagccttg ggccctggtc atccaggatt ctaacggcga gaataagatc 240
aagatgctga gcggaggatc cggaggatct ggaggcagca ccaacctgtc tgacatcatc 300
gagaaggaga caggcaagca gctggtcatc caggagagca tcctgatgct gcccgaagaa 360
gtcgaagaag tgatcggaaa caagcctgag agcgatatcc tggtccatac cgcctacgac 420
gagagtaccg acgaaaatgt gatgctgctg acatccgacg ccccagagta taagccctgg 480
gctctggtca tccaggattc caacggagag aacaaaatca aaatgctg 528
<210> 11
<211> 23
<212> DNA
<213> human (Homo sapiens)
<400> 11
gagtccgagc agaagaagaa ggg 23
<210> 12
<211> 23
<212> DNA
<213> human (Homo sapiens)
<400> 12
ttctacaccc cagccgcccc agg 23
<210> 13
<211> 23
<212> DNA
<213> human (Homo sapiens)
<400> 13
gaccccctcc accccgcctc cgg 23
<210> 14
<211> 23
<212> DNA
<213> human (Homo sapiens)
<400> 14
cgctacacgg tgctgagcgt ggg 23
<210> 15
<211> 23
<212> DNA
<213> human (Homo sapiens)
<400> 15
ggcccagact gagcacgtga tgg 23
<210> 16
<211> 23
<212> DNA
<213> human (Homo sapiens)
<400> 16
ggcactgcgg ctggaggtgg ggg 23
<210> 17
<211> 23
<212> DNA
<213> human (Homo sapiens)
<400> 17
gacatcgatg tcctccccat tgg 23
<210> 18
<211> 23
<212> DNA
<213> human (Homo sapiens)
<400> 18
agggatcgtc tttcaaggcg agg 23
<210> 19
<211> 181
<212> PRT
<213> human (Homo sapiens)
<400> 19
Gly Gly Ser Asn Thr Asn Trp Lys Thr Leu Tyr Glu Val Lys Ser Glu
1 5 10 15
Asn Leu Gly Gln Gly Asp Lys Pro Asp Tyr Phe Ser Ser Val Ala Thr
20 25 30
Val Val Tyr Leu Arg Lys Glu Asn Cys Met Tyr Gln Ala Cys Pro Thr
35 40 45
Gln Asp Cys Asn Lys Lys Val Ile Asp Gln Gln Asn Gly Leu Tyr Arg
50 55 60
Cys Glu Lys Cys Asp Thr Glu Phe Pro Asn Phe Lys Tyr Arg Met Ile
65 70 75 80
Leu Ser Val Asn Ile Ala Asp Phe Gln Glu Asn Gln Trp Val Thr Cys
85 90 95
Phe Gln Glu Ser Ala Glu Ala Ile Leu Gly Gln Asn Ala Ala Tyr Leu
100 105 110
Gly Glu Leu Lys Asp Lys Asn Glu Gln Ala Phe Glu Glu Val Phe Gln
115 120 125
Asn Ala Asn Phe Arg Ser Phe Ile Phe Arg Val Arg Val Lys Val Glu
130 135 140
Thr Tyr Asn Asp Glu Ser Arg Ile Lys Ala Thr Val Met Asp Val Lys
145 150 155 160
Pro Val Asp Tyr Arg Glu Tyr Gly Arg Arg Leu Val Met Ser Ile Arg
165 170 175
Arg Ser Ala Leu Met
180
<210> 20
<211> 543
<212> DNA
<213> human (Homo sapiens)
<400> 20
ggagggagta acaccaactg gaaaaccttg tatgaggtca aatccgagaa cctgggccaa 60
ggcgacaagc cggactactt tagttctgtg gccacagtgg tgtatcttcg caaagagaac 120
tgcatgtacc aagcctgccc gactcaggac tgcaataaga aagtgattga tcaacagaat 180
ggattgtacc gctgtgagaa gtgcgacacc gaatttccca atttcaagta ccgcatgatc 240
ctgtcagtaa atattgcaga ttttcaagag aatcagtggg tgacttgttt ccaggagtct 300
gctgaagcta tccttggaca aaatgctgct tatcttgggg aattaaaaga caagaatgaa 360
caggcatttg aagaagtttt ccagaatgcc aacttccgat ctttcatatt cagagtcagg 420
gtcaaagtgg agacctacaa cgacgagtct cgaattaagg ccactgtgat ggacgtgaag 480
cccgtggact acagagagta tggccgaagg ctggtcatga gcatcaggag aagtgcattg 540
atg 543
<210> 21
<211> 198
<212> PRT
<213> human (Homo sapiens)
<400> 21
Glu Ala Ser Pro Ala Ser Gly Pro Arg His Leu Met Asp Pro His Ile
1 5 10 15
Phe Thr Ser Asn Phe Asn Asn Gly Ile Gly Arg His Lys Thr Tyr Leu
20 25 30
Cys Tyr Glu Val Glu Arg Leu Asp Asn Gly Thr Ser Val Lys Met Asp
35 40 45
Gln His Arg Gly Phe Leu His Asn Gln Ala Lys Asn Leu Leu Cys Gly
50 55 60
Phe Tyr Gly Arg His Ala Glu Leu Arg Phe Leu Asp Leu Val Pro Ser
65 70 75 80
Leu Gln Leu Asp Pro Ala Gln Ile Tyr Arg Val Thr Trp Phe Ile Ser
85 90 95
Trp Ser Pro Cys Phe Ser Trp Gly Cys Ala Gly Glu Val Arg Ala Phe
100 105 110
Leu Gln Glu Asn Thr His Val Arg Leu Arg Ile Phe Ala Ala Arg Ile
115 120 125
Tyr Asp Tyr Asp Pro Leu Tyr Lys Glu Ala Leu Gln Met Leu Arg Asp
130 135 140
Ala Gly Ala Gln Val Ser Ile Met Thr Tyr Asp Glu Phe Lys His Cys
145 150 155 160
Trp Asp Thr Phe Val Asp His Gln Gly Cys Pro Phe Gln Pro Trp Asp
165 170 175
Gly Leu Asp Glu His Ser Gln Ala Leu Ser Gly Arg Leu Arg Ala Ile
180 185 190
Leu Gln Asn Gln Gly Asn
195
<210> 22
<211> 594
<212> DNA
<213> human (Homo sapiens)
<400> 22
gaggcatctc cagcaagcgg accaaggcac ctgatggacc cccacatctt cacctctaac 60
tttaacaatg gcatcggcag gcacaagaca tacctgtgct atgaggtgga gcgcctggac 120
aacggcacca gcgtgaagat ggatcagcac agaggcttcc tgcacaacca ggccaagaat 180
ctgctgtgcg gcttctacgg ccggcacgca gagctgagat ttctggacct ggtgcctagc 240
ctgcagctgg atccagccca gatctatagg gtgacctggt tcatcagctg gtccccatgc 300
ttttcctggg gatgtgcagg agaggtgcgc gccttcctgc aggagaatac acacgtgcgg 360
ctgagaatct ttgccgcccg gatctacgac tatgatcctc tgtacaagga ggccctgcag 420
atgctgagag acgcaggagc ccaggtgtcc atcatgacct atgatgagtt caagcactgc 480
tgggacacat ttgtggatca ccagggctgt ccctttcagc cttgggacgg actggatgag 540
cactcccagg ccctgtctgg caggctgagg gccatcctgc agaaccaggg caat 594
<210> 23
<211> 23
<212> DNA
<213> human (Homo sapiens)
<400> 23
ggaatccctt ctgcagcacc tgg 23
<210> 24
<211> 23
<212> DNA
<213> human (Homo sapiens)
<400> 24
gtgctgggct ccggtgcgtt cgg 23
<210> 25
<211> 23
<212> DNA
<213> human (Homo sapiens)
<400> 25
caaagcagaa actcacatcg agg 23
<210> 26
<211> 23
<212> DNA
<213> human (Homo sapiens)
<400> 26
ctccctcaag caggccccgc tgg 23
<210> 27
<211> 23
<212> DNA
<213> human (Homo sapiens)
<400> 27
gtgctgggct ccggtgcgtt cgg 23
<210> 28
<211> 23
<212> DNA
<213> human (Homo sapiens)
<400> 28
ttctcggagg ctcaggtgcg tgg 23
<210> 29
<211> 23
<212> DNA
<213> human (Homo sapiens)
<400> 29
gctcccatca catcaaccgg tgg 23
<210> 30
<211> 23
<212> DNA
<213> human (Homo sapiens)
<400> 30
gccctctgtg tgctcaaggg ggg 23
<210> 31
<211> 23
<212> DNA
<213> human (Homo sapiens)
<400> 31
catgcccttc ggctgcctcc tgg 23
<210> 32
<211> 23
<212> DNA
<213> human (Homo sapiens)
<400> 32
taataattga tgtcatagat tgg 23
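The listing above pairs each protein sequence with its coding sequence, so the pairs can be checked against each other. The Python sketch below translates the NLS coding sequence (SEQ ID No. 8) with the standard genetic code and compares the result with SEQ ID No. 7, then confirms that each DNA length given in a <211> field is three times the length of its protein. The sequences and lengths are copied from the listing; the helper names are hypothetical, and this check is offered as an illustration, not as part of the application.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(_BASES, repeat=3), _AMINO)}

THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def translate(dna):
    """Translate a coding sequence; whitespace and case are ignored."""
    dna = "".join(dna.split()).upper()
    if len(dna) % 3:
        raise ValueError("length is not a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def one_letter(three_letter):
    """Convert 'Lys Arg Thr ...' into 'KRT...'."""
    return "".join(THREE_TO_ONE[res] for res in three_letter.split())


# SEQ ID No. 7 (protein) and SEQ ID No. 8 (DNA), copied from the listing.
SEQ_ID_7 = "Lys Arg Thr Ala Asp Gly Ser Glu Phe Glu Ser Pro Lys Lys Lys Arg Lys Val"
SEQ_ID_8 = "aaacggacag ccgacggaag cgagttcgag tcaccaaaga agaagcggaa agtc"

# (protein length, DNA length) from the <211> fields of each pair.
LENGTH_PAIRS = {
    (1, 2): (114, 342), (3, 4): (228, 684), (5, 6): (1367, 4101),
    (7, 8): (18, 54), (9, 10): (176, 528), (19, 20): (181, 543),
    (21, 22): (198, 594),
}

print(translate(SEQ_ID_8) == one_letter(SEQ_ID_7))
print(all(3 * aa == nt for aa, nt in LENGTH_PAIRS.values()))
```

The same `translate` call can be applied to the longer pairs by pasting in the corresponding sequences.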

Claims (23)

1. A fusion protein for increasing gene editing efficiency, comprising a single-stranded DNA binding protein domain, a nucleoside deaminase, and a nuclease;
the components of the fusion protein are connected in the following order: the nucleoside deaminase is located N-terminal to the nuclease, and the single-stranded DNA binding protein domain is located between the nucleoside deaminase and the nuclease;
the single-stranded DNA binding protein domain comprises the DNA binding domain of Rad51 and/or the DNA binding domain of RPA70;
the amino acid sequence of the DNA binding domain of Rad51 is shown in SEQ ID No. 1;
the amino acid sequence of the DNA binding domain of RPA70 is shown in SEQ ID No. 19;
the fusion protein further comprises an NLS;
the NLS is located at at least one end of the fusion protein;
the fusion protein further comprises two copies of UGI;
the UGI is located at at least one end of the fusion protein.
2. The fusion protein of claim 1, wherein the single-stranded DNA binding protein domain is the DNA binding domain of Rad51.
3. The fusion protein of claim 1, wherein the coding sequence for the DNA binding domain of Rad51 is set forth in SEQ ID No. 2.
4. The fusion protein of claim 1, wherein the coding sequence for the DNA binding domain of RPA70 is set forth in SEQ ID No. 20.
5. The fusion protein of claim 1, wherein the nucleoside deaminase comprises a cytosine deaminase and/or an adenosine deaminase.
6. The fusion protein of claim 5, wherein the nucleoside deaminase is a cytosine deaminase.
7. The fusion protein of claim 6, wherein the cytosine deaminase is rat-derived.
8. The fusion protein of claim 7, wherein the amino acid sequence of the rat cytosine deaminase is as set forth in SEQ ID No. 3.
9. The fusion protein of claim 8, wherein the coding sequence of the rat cytosine deaminase is as set forth in SEQ ID No. 4.
10. The fusion protein of claim 1, wherein the nuclease is selected from one or more of Cas9, Cas3, Cas8a, Cas8b, Cas10d, Cse1, Csy1, Csn2, Cas4, Cas10, Csm2, Cmr5, Fok1 and Cpf1.
11. The fusion protein of claim 10, wherein the nuclease is Cas9.
12. The fusion protein of claim 11, wherein the Cas9 is selected from Cas9 derived from Streptococcus pneumoniae, Staphylococcus aureus, Streptococcus pyogenes or Streptococcus thermophilus.
13. The fusion protein of claim 12, wherein the Cas9 is selected from the Cas9 mutants VQR-spCas9, VRER-spCas9 and spCas9n.
14. The fusion protein of claim 13, wherein the Cas9 mutant is spCas9n.
15. The fusion protein of claim 14, wherein the spCas9n has the amino acid sequence shown as SEQ ID No. 5.
16. The fusion protein of claim 15, wherein the coding sequence of spCas9n is set forth in SEQ ID No. 6.
17. The fusion protein of claim 1, wherein the amino acid sequence of NLS is shown in SEQ ID No. 7.
18. The fusion protein of claim 1, wherein the coding sequence of NLS is shown in SEQ ID No. 8.
19. The fusion protein of claim 1, wherein the UGI has an amino acid sequence as set forth in SEQ ID No. 9.
20. The fusion protein of claim 1, wherein the coding sequence of the UGI is set forth in SEQ ID No. 10.
21. A biological material, being any one of the following A) to C):
A) a gene encoding the fusion protein of any one of claims 1-20;
B) a recombinant vector comprising the gene of A);
C) a recombinant cell or recombinant bacterium comprising the fusion protein of any one of claims 1 to 20, or comprising the gene of A).
22. A single base gene editing system comprising the fusion protein of any one of claims 1-20 and/or the biological material of claim 21, and an sgRNA, the sgRNA directing the fusion protein to perform single base gene editing of a target gene in a target cell;
the target sequence of the sgRNA includes at least one of SEQ ID Nos. 11 to 18.
23. Use of the fusion protein according to any one of claims 1 to 20, the biological material according to claim 21 or the single base gene editing system according to claim 22 for the preparation of a gene editing product, a disease treatment and/or prevention product, an animal model or a new plant variety.
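The domain order recited in claim 1 can be written down as a small data structure with a check. The Python sketch below is illustrative only. The class name, the function name and the example layout are hypothetical, and two simplifications are assumed: each core component appears exactly once, and "located at at least one end" is read as "outside the deaminase-to-nuclease core". It is not a legal construction of the claim.

```python
from dataclasses import dataclass

CORE_ORDER = ("deaminase", "ssDNA_binding", "nuclease")


@dataclass(frozen=True)
class Domain:
    name: str
    kind: str  # "deaminase", "ssDNA_binding", "nuclease", "NLS" or "UGI"


def matches_claim_1_layout(domains):
    """Check an N-to-C terminal domain list against the order in claim 1."""
    kinds = [d.kind for d in domains]
    # Simplification: exactly one of each core component.
    if any(kinds.count(k) != 1 for k in CORE_ORDER):
        return False
    dea, ssb, nuc = (kinds.index(k) for k in CORE_ORDER)
    # Deaminase N-terminal to the nuclease, ssDNA-binding domain in between.
    if not dea < ssb < nuc:
        return False
    # Two UGI copies and at least one NLS.
    if kinds.count("UGI") != 2 or "NLS" not in kinds:
        return False
    # Assumed reading of "at an end": outside the deaminase...nuclease core.
    return all(i < dea or i > nuc
               for i, k in enumerate(kinds) if k in ("NLS", "UGI"))


# Hypothetical example layout, N-terminus first.
example = [
    Domain("NLS", "NLS"),
    Domain("eA3A", "deaminase"),
    Domain("Rad51-DBD", "ssDNA_binding"),
    Domain("spCas9n", "nuclease"),
    Domain("UGI", "UGI"),
    Domain("UGI", "UGI"),
    Domain("NLS", "NLS"),
]
print(matches_claim_1_layout(example))  # True
```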
CN201911310969.8A 2019-12-18 2019-12-18 Fusion protein for improving gene editing efficiency and application thereof Active CN112979821B (en)

Priority Applications (5)

Application Number Priority Date Filing Date Title
CN201911310969.8A CN112979821B (en) 2019-12-18 2019-12-18 Fusion protein for improving gene editing efficiency and application thereof
EP20903960.1A EP4079765A4 (en) 2019-12-18 2020-12-17 Fusion protein that improves gene editing efficiency and application thereof
JP2022538379A JP2023507034A (en) 2019-12-18 2020-12-17 Fusion protein that improves genome editing efficiency and use thereof
PCT/CN2020/137239 WO2021121321A1 (en) 2019-12-18 2020-12-17 Fusion protein that improves gene editing efficiency and application thereof
US17/843,462 US20220364072A1 (en) 2019-12-18 2022-06-17 Fusion protein that improves gene editing efficiency and application thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911310969.8A CN112979821B (en) 2019-12-18 2019-12-18 Fusion protein for improving gene editing efficiency and application thereof

Publications (2)

Publication Number Publication Date
CN112979821A CN112979821A (en) 2021-06-18
CN112979821B true CN112979821B (en) 2022-02-08

Family

ID=76343949

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911310969.8A Active CN112979821B (en) 2019-12-18 2019-12-18 Fusion protein for improving gene editing efficiency and application thereof

Country Status (1)

Country Link
CN (1) CN112979821B (en)


Also Published As

Publication number Publication date
CN112979821A (en) 2021-06-18

Similar Documents

Publication Publication Date Title
CN112979821B (en) Fusion protein for improving gene editing efficiency and application thereof
CN109021111A (en) Gene base editor
EP3676287A1 (en) Fusion proteins for improved precision in base editing
WO2019161783A1 (en) Fusion proteins for base editing
CN114438110B (en) Precise adenine base editor without PAM limitation and construction method thereof
JPH02182188A (en) Preparation of factor viii and related product
CN110551761B (en) CRISPR/Sa-SepCas9 gene editing system and application thereof
CN110205318A (en) Metagenome extraction method based on CRISPR-Cas removal of host genomic DNA
WO2023023515A1 (en) Persistent allogeneic modified immune cells and methods of use thereof
Karagyaur et al. Practical recommendations for improving efficiency and accuracy of the CRISPR/Cas9 genome editing system
KR20220151175A (en) RNA-guided genomic recombination at the kilobase scale
CN110499335B (en) CRISPR/SauriCas9 gene editing system and application thereof
CN110551762B (en) CRISPR/ShaCas9 gene editing system and application thereof
CN116656649A (en) IS200/IS605 transposon IscB mutant protein and application thereof
CN110499334A (en) CRISPR/SlugCas9 gene editing system and its application
CN112979823B (en) Product and fusion protein for treating and/or preventing beta-hemoglobinopathy
KR102648886B1 (en) Method for modifying a target nucleic acid in the genome of a cell
CN113564145B (en) Fusion protein for cytosine base editing and application thereof
CN115703842A (en) Base editor for efficient and highly accurate cytosine C to guanine G conversion
CN116217733A (en) Base editing fusion protein and application thereof
CN110577970B (en) CRISPR/Sa-SlutCas9 gene editing system and application thereof
CN112979822A (en) Construction method of disease animal model and fusion protein
CN111454367B (en) Base editing molecule and application thereof
CN113073094B (en) Single base mutation system based on cytidine deaminase LjCDA1L1_4a and mutants thereof
CN116200382A (en) Novel gene editing system for mediating A-to-C mutation or T-to-G mutation and application thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant