CN108949794B - TALE expression vector and rapid construction method and application thereof

Info

Publication number
CN108949794B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810723261.4A
Other languages
Chinese (zh)
Other versions
CN108949794A (en
Inventor
王进科
张书衍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southeast University
Priority to CN201810723261.4A
Publication of CN108949794A
Application granted
Publication of CN108949794B
Legal status: Active
Anticipated expiration

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/63: Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/70: Vectors or expression systems specially adapted for E. coli
    • A: HUMAN NECESSITIES
    • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K: PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K48/00: Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
    • A61K48/0008: Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy characterised by an aspect of the 'non-active' part of the composition delivered, e.g. wherein such 'non-active' part is not delivered simultaneously with the 'active' part of the composition
    • A61K48/0025: Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy characterised by an aspect of the 'non-active' part of the composition delivered, e.g. wherein such 'non-active' part is not delivered simultaneously with the 'active' part of the composition wherein the non-active part clearly interacts with the delivered nucleic acid
    • A: HUMAN NECESSITIES
    • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P: SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P35/00: Antineoplastic agents
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/63: Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/66: General methods for inserting a gene into a vector to form a recombinant vector using cleavage and ligation; Use of non-functional linkers or adaptors, e.g. linkers containing the sequence for a restriction endonuclease

Abstract

The invention discloses a TALE expression vector, a rapid method for constructing it, and applications thereof. The TALE vector is a final functional TALE expression vector assembled from assembled cyclic 5-mers and a TALE backbone vector, where the cyclic 5-mers are formed from 62 monomers by enzyme digestion-ligation reactions. A customized TALE expression vector can be assembled in one day, which removes an obstacle to TALE application in the prior art, and the resulting vectors can activate expression of both an exogenous reporter gene and endogenous genes. The construction method relies on 62 newly prepared monomers. Its feasibility was verified by designing and assembling 9 TALE vectors targeting the promoters of the transcription factor genes HNF4α and E47, and two TALE vectors that can be used for differentiation therapy of liver cancer and pancreatic cancer were identified.

Description

TALE expression vector and rapid construction method and application thereof
Technical Field
The invention belongs to the technical field of gene regulation and gene editing, and particularly relates to a TALE expression vector and a rapid construction method and application thereof.
Background
Currently, the most widely used gene manipulation technologies are zinc fingers (ZFs), transcription activator-like effectors (TALEs), and the clustered regularly interspaced short palindromic repeats (CRISPR) system. ZFs were the earliest programmable DNA-binding domains; ZF modules can be assembled into tandem arrays and targeted to new DNA binding sites in the genome. Each finger in a ZF array targets three DNA bases. TALEs are second-generation programmable DNA-binding domains. Each TALE DNA-binding monomer targets a single nucleotide, which makes TALEs more modular than ZFs. TALEs can in theory target any sequence and have been applied with great success in many organisms. More importantly, TALE-based DNA-binding modules have higher sequence specificity, lower off-target rates, and lower cytotoxicity than ZFs.
TALEs are type III effector proteins from pathogenic bacteria of the genus Xanthomonas. All TALEs consist of an N-terminal translocation domain, a C-terminal region containing a nuclear localization signal and an acidic transcription activation domain, and a central DNA-binding domain made up of tandem repeats of 34 amino acids (called monomers) that recognize and bind DNA. Naturally occurring TALEs contain a variable number of monomers, ranging from 1.5 to 33.5. Although the sequence of each monomer is highly conserved, monomers differ mainly at two positions, called the repeat variable di-residues (RVDs, positions 12 and 13). Reports have shown that the identity of these two residues determines the nucleotide-binding specificity of each TALE repeat, and that a simple code specifies the target base of each RVD (NI-A, HD-C, NG-T, NN-G). Thus each monomer targets one nucleotide, and the linear order of monomers in a TALE specifies the target DNA sequence in the 5' to 3' direction. Based on this TALE-DNA code, biologists have assembled TALE DNA-binding domains and fused them to other functional protein domains to target the genome, for purposes that include transcriptional regulation of endogenous genes and genome editing. TALEs are powerful tools for genome manipulation.
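The RVD code described above can be illustrated with a short sketch. The function names and the example sequence are illustrative assumptions and do not come from the patent; only the four RVD-to-base pairs (NI-A, HD-C, NG-T, NN-G) are taken from the text.

```python
# RVD code as stated in the text: NI-A, HD-C, NG-T, NN-G.
RVD_FOR_BASE = {"A": "NI", "C": "HD", "T": "NG", "G": "NN"}


def target_to_rvds(target):
    """Return one RVD per base of a target sequence read 5' to 3'.

    Hypothetical helper for illustration only.
    """
    rvds = []
    for position, base in enumerate(target.upper(), start=1):
        if base not in RVD_FOR_BASE:
            raise ValueError(f"unsupported base {base!r} at position {position}")
        rvds.append(RVD_FOR_BASE[base])
    return rvds


def rvds_to_target(rvds):
    """Inverse lookup: recover the target sequence from an ordered RVD list."""
    base_for_rvd = {rvd: base for base, rvd in RVD_FOR_BASE.items()}
    return "".join(base_for_rvd[rvd] for rvd in rvds)


if __name__ == "__main__":
    example_target = "ACGT"  # made-up sequence, not from the patent
    print(target_to_rvds(example_target))
```

Because each monomer reads exactly one base, the two functions are inverses of each other for any sequence over A, C, G, and T.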
After 2013, CRISPR/Cas9 became the most widely used gene manipulation technology because it is easy to design and prepare. Typically, only one single guide RNA (sgRNA) expression vector needs to be prepared to target the DNA of interest. However, off-target effects are a major problem of CRISPR/Cas9 genome editing [20]. First, because the Cas9 protein is insensitive to mismatches between the 5'-terminal sequence of the sgRNA and the target site, the off-target risk of CRISPR/Cas9 is higher than that of ZF nucleases (ZFNs) and TALE nucleases (TALENs). Second, the protospacer adjacent motif (PAM) sequence also affects the efficiency and specificity of CRISPR/Cas9-mediated genome editing: the action of CRISPR/Cas9 on a target DNA sequence depends not only on guidance by the sgRNA but also on the 2-5 nt PAM sequence adjacent to the crRNA target sequence in the invading DNA. CRISPR/Cas9 is far less accurate and efficient than TALENs. In addition, because of CRISPR-associated cytotoxicity, it is currently used mainly for functional genomics studies. TALEs are currently the leading technology for biomedical therapy. In fact, the first patients who received TALEN-engineered products showed significant improvement, and these patients, treated in 2015, have remained in complete remission to date. Since then, clinical trials have shown the clinical potential of TALEN products.
Currently, the biggest obstacle to the use of TALEs is that their construction is complicated and time-consuming. Many assembly methods have been reported, including a series of methods derived from Golden Gate cloning, conventional cloning methods, and high-throughput methods, but constructing the array of DNA-binding monomers remains cumbersome because of the repetitive nature of TALE sequences. Many groups have used a hierarchical ligation strategy to overcome the difficulty of assembling monomers into ordered multimeric arrays; this strategy exploits codon degeneracy around the monomer junctions together with type IIS restriction enzymes. However, these strategies are still cumbersome and time-consuming. Therefore, to make full use of the advantages of TALEs, a new TALE vector construction method is still needed that makes TALE construction as simple and fast as CRISPR construction.
Transcription activator-like effectors (TALEs) were a promising gene editing and regulation tool that replaced zinc fingers. However, with the advent of the simpler CRISPR technology, TALEs were quickly abandoned because of their complex, cumbersome, and time-consuming construction. With the widespread use of CRISPR, its critical drawback, off-target effects, has gradually become apparent and limits its medical applications. TALEs, by contrast, have been permitted for in vivo gene editing and therapy because of their high targeting specificity.
The Golden Gate TALEN and TAL Effector Kit 2.0 (Addgene) is currently the most widely used commercial kit for assembling TALE vectors. However, assembling a TALE expression vector with this kit takes up to 5 days and can be completed only by skilled technicians trained in TALE research. This tedious, time-consuming, and inefficient process is an important reason why TALE technology was rapidly abandoned despite its important advantages over CRISPR technology.
TALEs are a promising second-generation gene regulation and editing tool that binds its DNA targets with a high degree of sequence specificity. Because of this high targeting specificity, Cellectis (France) has used TALEs to produce universal chimeric antigen receptor T (UCAR-T) cells, and such genetically engineered cells have been FDA approved for clinical immunotherapy of cancer. However, with the advent of CRISPR and its simplicity of use, TALEs were soon almost completely replaced. Nevertheless, with the widespread use of CRISPR, its critical disadvantage, a high off-target rate, has gradually been discovered, which limits its clinical use. TALEs were found to have lower off-target rates than CRISPR. The CRISPR/dCas9 system was also found to be less effective than TALEs in activating endogenous gene expression and in reprogramming somatic cells into iPS cells. However, assembling a TALE expression vector is complex, cumbersome, and time-consuming, so vector assembly remains a significant technical obstacle to the wide application of TALE technology.
Disclosure of Invention
Purpose of the invention: to address the problems in the prior art, the invention provides a TALE vector with which a customized TALE expression vector can be constructed rapidly, within one day, removing an obstacle to TALE application in the prior art. The customized TALE expression vector so constructed can be used to activate expression of exogenous reporter genes and endogenous genes.
The invention also provides a rapid construction method for the TALE expression vector and applications thereof. The invention provides a new method for rapidly constructing TALE (transcription activator-like effector) vectors: 62 new monomers are prepared, each of which can be amplified with a single pair of universal primers in a 96-well plate, and a new TALE vector assembly scheme is developed. With these monomers and this assembly scheme, customized TALE expression vectors can be assembled within one day. The feasibility of the new method was verified by designing and assembling 9 TALE vectors targeting the promoters of the transcription factor genes HNF4α and E47, and two TALE vectors that can be used for differentiation therapy of liver cancer and pancreatic cancer were identified.
Technical scheme: to achieve the above object, the TALE vector of the invention is a final functional TALE expression vector assembled from assembled cyclic 5-mers and a TALE backbone vector, wherein the assembled cyclic 5-mers are formed from 62 monomers by enzyme digestion-ligation reactions.
The 62 monomers are prepared by high-throughput PCR amplification in a 96-well plate using one pair of universal primers. The universal primers anneal to the constant sequences at the ends of each of the 62 monomers and are used to amplify all 62 monomers by PCR.
Preferably, the universal primer sequences are TATCATCATGCCTCCTCTAGAG and TTGGTCATGGGTGGCTCGAGG.
Further, the 62 monomers are linear double-stranded DNA fragments without redundant sequences.
Each of the 62 monomers ends in a pair of constant sequences, which are the annealing sites for the pair of universal primers. The universal primers are TATCATCATGCCTCCTCTAGAG and TTGGTCATGGGTGGCTCGAGG.
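As a simple illustration, basic properties of the two universal primers can be checked with a few lines of Python. The two sequences are taken from the text; the labels `primer_1` and `primer_2` and the helper functions are assumptions for illustration, since the text does not say which primer is forward and which is reverse.

```python
# Universal primer sequences as given in the text; the labels are arbitrary.
UNIVERSAL_PRIMERS = {
    "primer_1": "TATCATCATGCCTCCTCTAGAG",
    "primer_2": "TTGGTCATGGGTGGCTCGAGG",
}

# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_count(seq):
    """Count G and C bases in a DNA sequence."""
    seq = seq.upper()
    return seq.count("G") + seq.count("C")


def summarize(seq):
    """Collect length, GC count, GC fraction, and reverse complement."""
    return {
        "length": len(seq),
        "gc_count": gc_count(seq),
        "gc_fraction": gc_count(seq) / len(seq),
        "reverse_complement": reverse_complement(seq),
    }


if __name__ == "__main__":
    for name, seq in UNIVERSAL_PRIMERS.items():
        print(name, summarize(seq))
```

The reverse complement is the sequence that each primer would anneal to on the opposite strand of a monomer's constant end.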
The 62 monomers consist of 60 base-determining monomers and 2 connecting monomers. The amino acid sequence encoded by each base-determining monomer binds one base, whereas the connecting monomers serve only a connecting role in TALE vector assembly and do not bind bases.
The 62 monomers contain different type IIS restriction enzyme cleavage sites, so they can be joined in a set order by an enzyme digestion-ligation reaction, a process also known as Golden Gate assembly.
Preferably, the assembled cyclic 5-mers comprise 4 cyclic 5-mers. The first cyclic 5-mer is assembled from 5 base-determining monomers by an enzyme digestion-ligation reaction and encodes the amino acid sequence that binds bases 1 to 5 of the TALE binding target. The second cyclic 5-mer is assembled from 5 base-determining monomers in the same way and encodes the amino acid sequence that binds bases 6 to 10. The third cyclic 5-mer is assembled from four base-determining monomers and one connecting monomer and encodes the amino acid sequence that binds bases 11 to 14. The fourth cyclic 5-mer is assembled from four base-determining monomers and one connecting monomer and encodes the amino acid sequence that binds bases 15 to 18. In each of the 4 reactions, the type IIS restriction enzyme BsaI and DNA ligase are added in addition to the monomers; after multiple cycles of digestion and ligation, the assembled cyclic 5-mers are formed, and the remaining free linear fragments are then removed with Plasmid-Safe nuclease.
The final functional TALE expression vector is formed by a further enzyme digestion-ligation reaction that contains the 4 prepared cyclic 5-mers and the TALE backbone vector. The type IIS restriction enzyme BsmBI and DNA ligase are added to the reaction, and after multiple cycles of digestion and ligation, the circular final functional TALE expression vector is formed.
Furthermore, the TALE backbone vector contains a promoter, the constant TALE N-terminal sequence, an insertion site for the TALE DNA-binding sequence, the constant TALE C-terminal sequence, and a functional domain such as a gene expression activation or repression domain, an epigenetic modification domain, or a DNA endonuclease.
Further, the final functional TALE expression vector can be transformed into bacteria; after colony growth, colony PCR identification, propagation of positive clones, and plasmid DNA purification, a customized TALE expression vector suitable for transfecting cells is obtained.
The rapid construction method of the TALE vector comprises the following steps:
(1) Construction of the cyclic 5-mers: this step consists of 4 monomer enzyme digestion-ligation reactions that can be run in parallel. The first reaction contains 5 base-determining monomers, the type IIS restriction enzyme BsaI, and DNA ligase, and yields the first cyclic 5-mer. The second reaction contains 5 base-determining monomers, BsaI, and DNA ligase, and yields the second cyclic 5-mer. The third reaction contains four base-determining monomers, one connecting monomer, BsaI, and DNA ligase, and yields the third cyclic 5-mer. The fourth reaction contains four base-determining monomers, the other connecting monomer, BsaI, and DNA ligase, and yields the fourth cyclic 5-mer. The 60 base-determining monomers and 2 connecting monomers are used in this step. The base-determining monomers are chosen according to the target sequence of the customized TALE: if the first base of the binding target is C, monomer HD1 (FIG. 2) is used in the reaction for the first cyclic 5-mer, and if the first base is T, monomer NG1 (FIG. 2) is used. The 2 connecting monomers are used in the construction of every customized TALE vector, to prepare the third and fourth cyclic 5-mers. They are named dsDNA10.5 and dsDNA17.5; dsDNA10.5 is used to prepare the third cyclic 5-mer and dsDNA17.5 is used to prepare the fourth.
(2) Construction of the final functional TALE expression vector: the 4 cyclic 5-mers prepared in step (1), the TALE backbone vector, the type IIS restriction enzyme BsmBI, and DNA ligase are combined in an enzyme digestion-ligation reaction that efficiently forms the desired final functional TALE expression vector.
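The two-step procedure can be summarized as a small planning sketch that splits an 18-base target into the four reactions of step (1) and lists the inputs of step (2). This is only an illustration under stated assumptions: the text names monomers HD1 and NG1 for position 1, and extending that "RVD plus position" naming to positions 2 to 18 is an assumption, as are the function name, the dictionary layout, and the example target. The order of entries in each monomer list is not meant to reflect the physical order of ligation.

```python
# RVD code from the text: NI-A, HD-C, NG-T, NN-G.
RVD_FOR_BASE = {"A": "NI", "C": "HD", "T": "NG", "G": "NN"}

# Target positions covered by each cyclic 5-mer (0-based, end exclusive):
# bases 1-5, 6-10, 11-14, and 15-18, as described in the text.
GROUP_SLICES = [(0, 5), (5, 10), (10, 14), (14, 18)]

# Connecting monomer used in each reaction, per the text (none in 1 and 2).
CONNECTING_MONOMERS = [None, None, "dsDNA10.5", "dsDNA17.5"]


def plan_assembly(target):
    """Return (stage1, stage2) reaction plans for an 18-base target.

    Hypothetical helper: monomer names beyond HD1/NG1 are assumed.
    """
    target = target.upper()
    if len(target) != 18:
        raise ValueError("this sketch only handles 18-base targets")
    if set(target) - set(RVD_FOR_BASE):
        raise ValueError("target may contain only A, C, G, and T")

    stage1 = []
    for number, ((start, end), connector) in enumerate(
        zip(GROUP_SLICES, CONNECTING_MONOMERS), start=1
    ):
        # Assumed naming: RVD followed by the 1-based target position.
        monomers = [f"{RVD_FOR_BASE[target[i]]}{i + 1}" for i in range(start, end)]
        if connector is not None:
            monomers.append(connector)
        stage1.append({"reaction": number, "enzyme": "BsaI", "monomers": monomers})

    stage2 = {
        "enzyme": "BsmBI",
        "inputs": [f"cyclic 5-mer {r['reaction']}" for r in stage1]
        + ["TALE backbone vector"],
    }
    return stage1, stage2


if __name__ == "__main__":
    reactions, final = plan_assembly("CATGCATGCATGCATGCA")  # made-up target
    for reaction in reactions:
        print(reaction)
    print(final)
```

Each step (1) reaction receives exactly five monomers, matching the description of four cyclic 5-mers, and DNA ligase is added to every reaction in addition to the listed enzyme.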
Preferably, the enzyme digestion-ligation reaction in step (1) alternates between the optimal reaction temperatures of the type IIS restriction enzyme BsaI and of DNA ligase, so that multiple rounds of digestion and ligation occur and the desired cyclic 5-mers form efficiently. The optimal reaction temperature is the temperature recommended in the manufacturer's instructions for the enzyme used.
Preferably, the enzyme digestion-ligation reaction in step (2) alternates between the optimal reaction temperatures of the type IIS restriction enzyme BsmBI and of DNA ligase, so that multiple rounds of digestion and ligation occur and the desired final functional TALE expression vector forms efficiently. The optimal reaction temperature is the temperature recommended in the manufacturer's instructions for the enzyme used.
The TALE vector is used to prepare TALE expression vectors that target the promoters of the transcription factor genes HNF4α and E47 for differentiation therapy of liver cancer and pancreatic cancer.
The TALE vector of the invention is used in the preparation of biomedical therapeutic reagents for gene expression regulation, gene therapy, and gene editing.
The new TALE vector construction method can be used to provide commercial technical services in fields such as gene expression regulation, gene therapy, and gene editing.
The invention substantially improves the monomer library and assembly process of the Golden Gate TALEN and TAL Effector Kit 2.0 (referred to here as the Addgene kit), which is currently the main commercial kit for constructing customized TALEs and TALENs.
Specifically, two major improvements were made. First, a library of 60 base-determining monomers and two connecting monomers (dsDNA10.5 and dsDNA17.5) was prepared, constituting a new monomer library. All base-determining monomers were prepared by PCR amplification using the plasmid monomers in the Addgene kit as templates. The two connecting monomers were prepared by chemical synthesis. Most importantly, all of these new monomers end in two constant sequences, so they can be amplified by PCR with a high-fidelity enzyme using one pair of universal primers. The greatest advantage of this plasmid-free monomer library is that it can be regenerated easily and in high throughput by high-fidelity PCR in 96-well PCR plates. Another advantage is that, because the monomers are clean linear fragments, more monomer molecules can be included in the first digestion-ligation reaction, so that more cyclic 5-mers form. Second, a new procedure for assembling customized TALEs was created using the new monomers. A customized TALE can be assembled quickly and conveniently in one day, making TALE assembly faster than current CRISPR preparation. Indeed, with a prepared monomer library, TALE preparation can be completed in as little as 2 days starting from a given target sequence, whereas CRISPR preparation takes 3 days: day 1 for ordering the oligonucleotides for the sgRNA expression vector, day 2 for constructing the sgRNA expression vector, and day 3 for screening colonies and extracting plasmid. The new monomer library and assembly procedure can therefore promote renewed, widespread use of TALEs as a more reliable gene regulation and editing tool.
The invention can construct TALE vectors with different transcription activation activities. Two different transcription activation domains, VP64 and VPR, are used. The results show that VPR consistently has higher transcriptional activation activity than VP64, consistent with results from CRISPR studies. TALE-VPR was therefore used to activate endogenous genes in subsequent experiments. More powerful activation systems such as SunTag could also be used with TALEs. To knock down gene expression, transcription repression domains (RDs) such as KRAB (Krüppel-associated box) can be fused to TALEs. Other epigenetic functional domains, such as DNMT3a (DNA methyltransferase), EZH2 (histone H3 lysine 27 methyltransferase), and LSD1 (histone demethylase), can also be fused to TALEs to make various gene regulators (GRs). These GRs have a wide range of applications in molecular biology research and biomedicine. The experiments of the invention also show that the level of transcriptional activation depends on both the transcription activation domain and the position targeted by the TALE. The position of the TALE target relative to the transcription start site (TSS) has a significant effect on activation, so several TALE-VPRs are typically constructed for each genomic site and screened. These TALEs are designed to bind adjacent regions around a given target site, because some binding sites may be bound more readily than others. Because epigenetic status differs between cell types, a TALE that does not work in one cell type may work in another. Although three TALE sequences were designed for each of the HNF4α and E47 genes, the TALEs that activated well were all located in the proximal promoter region upstream of the start codon.
Transfecting cells with a TALE-VPR expression vector constructed by this method against an endogenous target gene can change the cell phenotype. HNF4α and E47 were selected as target genes and activated with the constructed TALEs, because activation of these two transcription factors has been reported to induce differentiation of the cancer cell lines HepG2 and PANC1. To observe the functional effect of the constructed TALEs, HepG2 and PANC1 cells were transfected with HNF4α-TALE-VPR and E47-TALE-VPR, respectively. The HNF4α and E47 genes were significantly activated by the transfected TALE-VPRs. Correspondingly, the TALE-VPRs also altered the expression of some target genes of the two transcription factors, indicating that the activated transcription factors were functional. Assays of cell viability, cell cycle, and migration showed that TALE-transfected cells underwent significant phenotypic changes, including cell cycle arrest, reduced viability, and reduced migratory capacity, consistent with the changes in expression of the HNF4α and E47 genes themselves, their target genes, and stemness genes. HNF4α is a central regulator of the differentiated hepatocyte phenotype, and forced re-expression of HNF4α is sufficient to overcome the repression of this phenotype in dedifferentiated hepatoma cells. A TALE-VPR targeting the HNF4α promoter markedly up-regulates HNF4α expression in HepG2 cells, which in turn up-regulates hepatocyte marker genes and down-regulates stemness genes. For example, CD133 is currently regarded as a marker for identifying a subset of cancer stem/progenitor cells in hepatocellular carcinoma (HCC).
Up-regulation of HNF4α expression induces down-regulation of CD133 and of certain genes that are enriched in human embryonic stem cells and are involved in establishing or maintaining pluripotency, including OCT3/4, BMI, SOX2, KLF4, LIN28, and ESG1 (embryonic cell-specific gene 1). In PANC1 cells, E47 regulates cancer and cell cycle genes such as p21 and TP53INP1, which are target genes of the E47 transcription factor. E47 has been reported to up-regulate expression of the p21 and TP53INP1 genes, which are associated with G0/G1 arrest, in the PANC1 cell line. These data indicate that TALE-VPRs targeting the HNF4α and E47 promoters have potential future clinical application in gene therapy.
Beneficial effects: compared with the prior art, the invention has the following advantages:
The invention can construct TALE vectors with different transcription activation activities. Customized TALE expression vectors can be assembled rapidly, within one day, which removes an obstacle to TALE application in the prior art, and the vectors can activate expression of exogenous reporter genes and endogenous genes. Two different transcription activation domains, VP64 and VPR, are used. The results show that VPR consistently has higher transcriptional activation activity than VP64, consistent with results from CRISPR studies.
To overcome the key technical bottleneck in current TALE application, the invention establishes and verifies a quick and effective method that lets researchers assemble TALEs and TALENs in one day. To this end, a new monomer library was prepared, containing 60 base-determining monomers and two new connecting monomers, all of which can be amplified by high-throughput PCR. With the new monomers and the new procedure, any TALE or TALEN targeting an 18-bp site can be assembled within one day to form a new TALE vector. The feasibility of the new method was verified by designing and assembling 9 TALE vectors targeting the promoters of the transcription factor genes HNF4α and E47, and two TALE vectors that can be used for differentiation therapy of liver cancer and pancreatic cancer were identified.
The method promotes the wide application of TALEs and shows that the TALE vector is a more reliable tool for gene regulation and editing. The vectors activate their target genes and have potential therapeutic value for liver cancer and pancreatic cancer; they can be used to prepare biomedical therapeutic reagents for gene expression regulation, gene therapy, and gene editing, and the method can support commercial technical services in fields such as gene expression regulation, gene therapy, and gene editing.
Drawings
FIG. 1 shows the process of constructing a TALE expression vector with the Golden Gate TALEN and TAL Effector Kit 2.0 (Addgene);
FIG. 2 is a schematic of the monomers in a 96-well PCR plate; each monomer was prepared by PCR amplification using the monomer plasmids in the Addgene kit as templates, with different primers designed to amplify each monomer variant, giving a total of 60 base-determining monomers and two connecting monomers (dsDNA10.5 and dsDNA17.5); the position of each monomer in the plate is shown; the monomer plate can easily be regenerated by 96-well PCR amplification; the plate also contains two TALE backbone vectors, TALE-VP64 and TALE-VPR, which can be produced by transformation of E. coli DH5α and plasmid extraction and then added to the wells;
FIG. 3 is a schematic diagram of the TALE structure, the monomers, and the TALE backbone vectors used to assemble customized TALEs; (a) natural structure of a TALE from Xanthomonas; each DNA-binding module consists of 34 amino acids, and the repeat variable di-residue (RVD) at amino acid positions 12 and 13 of each repeat specifies the targeted DNA base according to the code NG = T, HD = C, NI = A, NN = G; (b) monomers and TALE backbones for constructing customized TALEs; each monomer ends in two constant sequences that provide the annealing sites for a pair of universal primers, so the monomers can be used to regenerate the monomer library; CMV: cytomegalovirus promoter; N: non-repetitive amino terminus of the TALE; C: non-repetitive carboxy terminus of the TALE; BsmBI: type IIS restriction site for insertion of the custom TALE DNA-binding domain; LacZα: LacZα gene for blue-white screening; NLS: nuclear localization signal; VP64: synthetic transcriptional activator derived from the herpes simplex virus VP16 protein; VPR: fusion of VP64, p65/RelA, and Rta; (c) schematic of gene expression activation by a TALE; TALE-TF: TALE transcription factor;
FIG. 4 is a map of the PIE-VP64 vector;
FIG. 5 is a PIE-VPR vector map;
FIG. 6 is a TALE-VP64 map;
FIG. 7 is a TALE-VPR map;
FIG. 8 is an HNF4α-TALE-reporter map;
FIG. 9 is an E47-TALE-reporter map;
FIG. 10 is a flow diagram for constructing TALEs with the new monomers; the steps of TALE construction are summarized together with the time required for each step, showing that the entire procedure can be completed in one day; after colony PCR screening, plasmid extraction and EcoRI digestion confirmation, the usable final TALE is obtained on the next day; the time required for this procedure is the same as for current CRISPR;
FIG. 11 shows the construction of customized TALEs using the new monomers and the new procedure; (a and b) E. coli colonies transformed with HNF4α/E47-TALE1-VPR on solid agar; (c and d) colony PCR detection of 4 groups of randomly selected white colonies, each group comprising 8 colonies; the full-length PCR product should be 2051 bp long; the full-length product is often less prominent, while the "ladder effect" is a strong indicator of successful assembly; (e and f) colony identification by EcoRI digestion; all selected colonies were cultured and their plasmids were extracted and digested with EcoRI, which excises the full-length TALE as a 3537-bp fragment, whereas an incorrectly assembled TALE produces a 2143-bp band; the colony PCR identification agrees with the EcoRI digestion;
FIG. 12 shows final TALE-VPR colonies detected by colony PCR; the full-length PCR product is usually less prominent, while the "ladder effect" is a strong indicator of successful assembly; DNA marker: DL2000; detection of randomly picked white colonies by colony PCR indicates that most white colonies contain correctly assembled TALEs;
FIG. 13 is a schematic representation of the activation of an exogenous reporter gene by TALEs; 293T cells were co-transfected with a TALE-VPR/VP64 plasmid and the corresponding ZsGreen-expressing reporter plasmid; (a) positions of the TALE targets in the HNF4α and E47 gene promoters; (b) cells co-transfected with HNF4α-TALE1/2/3-VP64 and HNF4α-TALE-reporter; (c) cells co-transfected with HNF4α-TALE1/2/3-VPR and HNF4α-TALE-reporter; (d) cells co-transfected with E47-TALE1/2/3-VPR and E47-TALE-reporter;
FIG. 14 is a schematic of the activation of endogenous genes by TALEs; HepG2 and PANC1 cells were transfected with HNF4α/E47-TALE1-VPR; (a) expression of HNF4α and hepatocyte marker genes in HepG2 cells; (b) expression of stem cell genes in HepG2 cells; (c) expression of E47, cell cycle arrest-related genes (p21 and TP53INP1) and E47 target genes (HNF6, Sox9, CX32 and MIST1) in PANC1 cells;
FIG. 15 is a schematic of the analysis of cell viability, cell cycle and migration; HepG2 and PANC1 cells were transfected with HNF4α/E47-TALE1-VPR; (a) HepG2 cell viability, (b) HepG2 cell cycle, (c) HepG2 cell migration, (d) PANC1 cell viability, (e) PANC1 cell cycle, (f) PANC1 cell migration; CCK-8 was used to measure cell viability at different time points; flow cytometry was used to analyze the cell cycle and to determine the percentage of cells in each cell cycle phase; Transwell assays were used to measure cell migration;
FIG. 16 is a schematic representation of cell cycle detection by flow cytometry; (a) HepG2 treated with liposomes; (b) HepG2 treated with liposomes and HNF4α-TALE1-VPR; (c) PANC1 treated with liposomes; (d) PANC1 treated with liposomes and E47-TALE1-VPR.
Detailed Description
The present invention will be further described with reference to the following examples and the accompanying drawings.
Example 1
Preparation of monomers and TALE-TF framework vectors
The experimental method comprises the following steps:
Monomers with ligation linkers and universal primer annealing sites were prepared.
The monomers were amplified by PCR with a high-fidelity polymerase according to the following protocol.
Step 1: Prepare diluted mixtures of the forward and reverse monomer primers. In a 96-well PCR plate, primer mixtures of SEQ ID NO. 1-12 (Table 1) were prepared for amplification of the TALE monomers (the TALE monomer library). For each of the 18 positions, the forward and reverse primers needed to amplify each monomer were mixed. The concentration of each primer in the mixture was 20 μM.
TABLE 1 primers for TALE monomer PCR amplification and colony PCR amplification.
Figure BDA0001718996500000091
Figure BDA0001718996500000101
Step 2: Set up two 96-well monomer library plates. Each plate contained a total of 60 PCR reactions. Although smaller PCR volumes are acceptable, the monomers are typically prepared in larger amounts because one monomer library plate can be reused to construct many TALEs. Each PCR reaction had a total volume of 240 μL and contained 48 ng of monomer template plasmid, 10 μM of forward primer, 10 μM of reverse primer,
Figure BDA0001718996500000103
a polymerase. The PCR program was: pre-denaturation at 96 ℃ for 3 minutes; 28 cycles of denaturation at 96 ℃ for 20 seconds, annealing at the optimal annealing temperature (℃) for 20 seconds and extension at 72 ℃ for 20 seconds; final extension at 72 ℃ for 3 minutes.
The linker monomers dsDNA10.5 and dsDNA17.5 (SEQ ID NO. 13-14; Table 2) were chemically synthesized.
TABLE 2 chemically synthesized linker monomers.
Figure BDA0001718996500000102
Step 3: After the reaction is complete, confirm by gel electrophoresis whether monomer amplification was successful. The gel should have enough lanes to run 240 μL of each PCR product from step 2. It is not necessary to check all 60 reactions in this step; it is sufficient to examine the 18 reactions of one monomer template. Successful amplification should show a product of about 100 bp (the expected length of each monomer). Because the outer primers are longer, the monomers at the beginning or end of each pentamer (monomers 1, 5, 6, 10, 14 and 15) should be slightly longer than the other monomers.
Step 4: The combined reactions were purified using the MinElute gel extraction kit according to the manufacturer's instructions. The DNA was eluted from each well with 30 μL of buffer EB.
Step 5: Adjust the concentration of the purified PCR products by adding buffer EB so that every monomer has the same molarity. Since monomers 1, 5, 6, 10, 14 and 15 are longer than the other monomers, they must be adjusted to slightly higher mass concentrations. For example, monomers 1, 5, 6, 10, 14 and 15 were adjusted to 80 ng/μL and the other monomers to 60 ng/μL.
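The reasoning behind step 5 (longer monomers need a higher mass concentration to reach the same molarity) can be sketched in a few lines of Python. This is only an illustration: the average mass of 650 g/mol per base pair is a common approximation, and the monomer lengths of 102 bp and 136 bp are hypothetical values chosen for the example, not lengths stated in this specification.

```python
# Average molar mass of one base pair of double-stranded DNA (approximation, assumed).
AVG_BP_MASS = 650.0  # g/mol per bp


def molarity_nM(ng_per_ul: float, length_bp: int) -> float:
    """Convert a dsDNA mass concentration (ng/uL) into a molar concentration (nM)."""
    # 1 ng/uL equals 1e-3 g/L; dividing by g/mol gives mol/L; 1e9 converts to nM.
    return ng_per_ul * 1e-3 / (AVG_BP_MASS * length_bp) * 1e9


def matching_ng_per_ul(ref_ng_per_ul: float, ref_len_bp: int, other_len_bp: int) -> float:
    """Mass concentration a fragment of other_len_bp needs to match the reference molarity."""
    return ref_ng_per_ul * other_len_bp / ref_len_bp


# Hypothetical lengths: a regular monomer of 102 bp and an end monomer of 136 bp.
print(round(molarity_nM(60, 102), 1), "nM for the shorter monomer at 60 ng/uL")
print(matching_ng_per_ul(60, 102, 136), "ng/uL needed for the longer monomer")
```

The required mass concentration scales linearly with fragment length, so the exact ratio between the two working concentrations depends on the real monomer lengths.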
The experimental results are as follows:
The monomers of the Golden Gate TALEN and TAL Effector Kit 2.0 (Addgene) are supplied as plasmids contained in E. coli. To construct a customized TALE, the bacteria must be expanded in culture and the various plasmids purified. Constructing a custom TALE from the purified plasmids with the Golden Gate procedure takes up to five days (FIG. 1). This procedure requires two rounds of bacterial transformation, colony selection and positive colony culture to obtain the final TALE, which is the main rate-limiting step of the protocol and must be eliminated if the procedure is to be simplified and accelerated. Moreover, the digestion-ligation reaction with plasmids is inefficient, because the large amount of unneeded DNA sequence in the plasmid backbone limits the number of monomer molecules that can be added to the reaction. In addition, plasmid DNA is digested inefficiently because of its compact supercoiled structure.
In view of these shortcomings of the Addgene kit and its TALE construction process, a new TALE construction scheme based on plasmid-free monomers and a bacteria-free procedure was developed. A set of PCR primers was designed that can be used to amplify the various monomers from the Addgene monomer plasmids. Importantly, all PCR products can be amplified with a pair of universal primers in a 96-well PCR plate. To eliminate the intermediate bacterial transformation step, two novel linker molecules, dsDNA10.5 and dsDNA17.5 (Table 2), were introduced; they can also be amplified with the universal primers in 96-well PCR plates. Once the 96-well monomer PCR plate has been constructed (FIG. 2), it can be stored permanently and used to construct 18-bp TALE DNA-binding domains targeting various sequences. The original monomer plate can be stored as a template, and replica monomer plates can be generated by high-fidelity PCR amplification. The quality of a replica PCR plate can be checked by simple sequencing of the PCR products. To construct the final functional TALE, two TALE backbone vectors, TALE-VP64 and TALE-VPR, were also constructed; both still contain the LacZ expression cassette, so that final positive TALE colonies can easily be selected by blue-white screening (FIG. 3). To enhance expression of the final TALE in mammalian cells, the TALE backbone vectors use the common strong CMV promoter (FIG. 3).
Example 2
Construction and preparation of TALE framework vector and fluorescent report vector
The experimental method comprises the following steps:
To prepare the TALE backbone vectors, VP64 and VPR fragments were obtained from pcDNA-dCas9-VP64 and pcDNA-dCas9-VPR by PCR amplification, and EcoRI and NotI cleavage sites were introduced upstream and downstream of VP64 and VPR, respectively. The PCR reaction (30 μL) contained 1 ng of pcDNA-dCas9-VP64 or pcDNA-dCas9-VPR, 10 μM of VP64-F or VPR-F, 10 μM of VP64-R or VPR-R (SEQ ID NO. 15-18; Table 3) and 1× PrimeSTAR HS DNA polymerase. The PCR program was: pre-denaturation at 96 ℃ for 3 minutes; 28 cycles of denaturation at 96 ℃ for 20 seconds, annealing at 58 ℃ for 20 seconds and extension at 72 ℃ for 20 seconds; final extension at 72 ℃ for 3 minutes. The amplified fragments were ligated into PIRES2-EGFP, and the resulting vectors were named PIE-VP64 (FIG. 4) and PIE-VPR (FIG. 5). Both vectors were verified by DNA sequencing. Next, the TAL (LacZα) fragment obtained by digesting the pTAL2 vector of the Golden Gate TALEN and TAL Effector Kit 2.0 with EcoRI was ligated into PIE-VP64 and PIE-VPR, and the resulting vectors were named the TALE-VP64 backbone vector (FIG. 6) and the TALE-VPR backbone vector (FIG. 7).
TABLE 3 PCR primers for construction of the TALE backbone vectors.
Figure BDA0001718996500000121
To prepare high-quality TALE backbone vector plasmids for the TALE monomer library plates, competent E. coli DH5α cells were transformed with the constructed TALE-VP64 and TALE-VPR backbone vectors, respectively. All plasmids were extracted from cultures of positive colonies verified by colony PCR, using the EndoFree plasmid kit according to the manufacturer's instructions. All plasmids were further verified by DNA sequencing.
The experimental results are as follows:
Clone sequencing shows that the sequence and elements of the TALE-VP64 vector are as shown in SEQ ID NO. 19:
Figure BDA0001718996500000122
Figure BDA0001718996500000131
wherein the lowercase sequence is the CMV promoter sequence, the uppercase sequences are the TALE N-terminal sequence (first) and C-terminal sequence (second), the underlined sequence is LacZα, the lowercase sequence is the replacement region into which the TALE DNA-binding domain coding sequence is to be inserted, and the italic sequence is the VP64 coding sequence.
Clone sequencing shows that the sequence and elements of the TALE-VPR vector are as shown in SEQ ID NO. 20:
Figure BDA0001718996500000132
Figure BDA0001718996500000141
Figure BDA0001718996500000151
wherein the lowercase sequence is the CMV promoter sequence, the uppercase sequences are the TALE N-terminal sequence (first) and C-terminal sequence (second), the underlined sequence is LacZα, the lowercase sequence is the replacement region into which the TALE DNA-binding domain coding sequence is to be inserted, and the italic sequence is the VPR coding sequence.
Example 3
Construction and preparation of TALE (transcription activator like effector) targeted fluorescent reporter vector
The experimental method comprises the following steps:
To prepare the reporter vectors, promoter fragments (1000 bp) of the HNF4α and E47 genes were obtained by PCR amplification from human genomic DNA (gDNA), and XhoI and HindIII cleavage sites were introduced upstream and downstream of the HNF4α and E47 promoter fragments, respectively. The PCR reaction (30 μL) contained 100 ng of human gDNA extracted from HepG2 and PANC1 cells, 10 μM of HNF4α-F or E47-F (Table 4), 10 μM of HNF4α-R or E47-R (Table 4) and 1× PrimeSTAR HS DNA polymerase. The PCR program was: pre-denaturation at 96 ℃ for 3 minutes; 28 cycles of denaturation at 96 ℃ for 20 seconds, annealing at 58 ℃ for 20 seconds and extension at 72 ℃ for 1 minute; final extension at 72 ℃ for 5 minutes. All PCR programs were run on the same PCR instrument (MasterCycler Pro, Eppendorf). The amplified promoter fragments were ligated into the pEZX-miniCMV-ZsGreen vector to construct the fluorescent reporter vectors HNF4α-TALE-reporter (FIG. 8) and E47-TALE-reporter (FIG. 9), respectively.
TABLE 4 PCR primers for construction of the TALE-targeted fluorescent reporter vectors.
Name | SEQ ID NO. | Sequence (5'-3') | Use
HNF4α-F | SEQ ID NO. 21 | CCGCTCGAGTGAGATCCAAAACTGAGAC | HNF4α gene promoter amplification
HNF4α-R | SEQ ID NO. 22 | CCCAAGCTTAAGCCCACCCAGCCGGAGAG | HNF4α gene promoter amplification
E47-F | SEQ ID NO. 23 | CCGCTCGAGGCTCAGTAGCCACCAACCACC | E47 gene promoter amplification
E47-R | SEQ ID NO. 24 | CCCAAGCTTTCTGTGGAGGGGAGCTGGTAAG | E47 gene promoter amplification
To prepare plasmids for transfection of mammalian cells, competent E. coli DH5α cells were transformed with the constructed HNF4α-TALE-reporter and E47-TALE-reporter plasmids, respectively. All plasmids were extracted from cultures of positive colonies verified by colony PCR, using the EndoFree plasmid kit according to the manufacturer's instructions. All plasmids were further verified by DNA sequencing.
The experimental results are as follows:
Clone sequencing shows that the sequence and elements of the HNF4α-TALE-reporter vector are as shown in SEQ ID NO. 25:
Figure BDA0001718996500000161
wherein the lowercase sequence is the HNF4α promoter sequence, the bold sequence is the TALE binding target, the italic sequence is the miniCMV sequence, and the underlined sequence is the ZsGreen coding sequence.
Clone sequencing shows that the sequence and elements of the E47-TALE-reporter vector are as shown in SEQ ID NO. 26:
Figure BDA0001718996500000162
Figure BDA0001718996500000171
wherein the lowercase sequence is the E47 promoter sequence, the bold sequence is the TALE binding target, the italic sequence is the miniCMV sequence, and the underlined sequence is the ZsGreen coding sequence.
Example 4
Construction of artificial TALEs targeting 18-bp sequences
The experimental method comprises the following steps:
Step 1: Select the target sequence. A typical TALE recognition sequence is written in the 5' to 3' direction and begins with a 5' thymine. The following procedure describes the construction of a TALE that binds a 19-bp target sequence:
5'-T0-N1-N2-N3-N4-N5-N6-N7-N8-N9-N10-N11-N12-N13-N14-N15-N16-N17-N18-3'
(where N = A, G, T or C), in which T0 is the first base (typically thymine). The 18 target bases N1-N18 are recognized by the RVDs in the intermediate tandem repeat of 18 monomers according to the code NI = A, HD = C, NG = T and NN = G.
Step 2: Divide the target sequence into pentamers. N1-N18 are divided into four pentamers (N1-N2-N3-N4-N5, N6-N7-N8-N9-N10, dsDNA10.5-N11-N12-N13-N14 and N15-N16-N17-dsDNA17.5-N18). For example, the target of HNF4α-TALE3, 5'-TGTCCTTCAGTCCCTTCAT-3', is divided into (T)GTCCT, TCAGT, dsDNA10.5-CCCT and TCA-dsDNA17.5-T. The four pentamers (pentamers 1 to 4) are therefore:
NN1-NG2-HD3-HD4-NG5, NG6-HD7-NI8-NN9-NG10,
dsDNA10.5-HD11-HD12-HD13-NG14 and NG15-HD16-NI17-dsDNA17.5-NG18.
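The mapping from a 19-bp target to the four pentamers described in steps 1 and 2 can be sketched as follows. This is a minimal illustration rather than part of the claimed method; the function name `split_into_pentamers` and the label format are assumptions introduced for the example, while the RVD code and the positions of the two linker monomers follow the description above.

```python
# RVD code from the description: NI = A, HD = C, NG = T, NN = G.
RVD_CODE = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}


def split_into_pentamers(target: str) -> list[list[str]]:
    """Return the four pentamers (lists of monomer labels) for a 19-nt target T0 N1..N18."""
    target = target.upper()
    if len(target) != 19 or set(target) - set("ACGT"):
        raise ValueError("expected a 19-nt target made of A, C, G and T")
    # Position 0 is T0 and is not encoded by a monomer; positions 1..18 are.
    rvd = {i: f"{RVD_CODE[target[i]]}{i}" for i in range(1, 19)}
    return [
        [rvd[i] for i in range(1, 6)],                       # pentamer 1: N1-N5
        [rvd[i] for i in range(6, 11)],                      # pentamer 2: N6-N10
        ["dsDNA10.5"] + [rvd[i] for i in range(11, 15)],     # pentamer 3: linker + N11-N14
        [rvd[15], rvd[16], rvd[17], "dsDNA17.5", rvd[18]],   # pentamer 4: N15-N17, linker, N18
    ]


# Target of HNF4α-TALE3 (SEQ ID NO. 29).
for pentamer in split_into_pentamers("TGTCCTTCAGTCCCTTCAT"):
    print("-".join(pentamer))
```

Each printed line lists, in order, the five monomers to combine in one pentamer digestion-ligation reaction.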
In the experimental work of the invention, a total of 9 TALE vectors were constructed. Six TALE vectors target the HNF4α gene promoter, namely HNF4α-TALE1/2/3-VP64 and HNF4α-TALE1/2/3-VPR, and their target sequences are SEQ ID NO. 27: 5'-TGCCCCCAGCTCTCCGGCT-3' (TALE1), SEQ ID NO. 28: 5'-TCCGGCCCTGTCCTCAAAT-3' (TALE2) and SEQ ID NO. 29: 5'-TGTCCTTCAGTCCCTTCAT-3' (TALE3). Three TALE vectors target the E47 gene promoter, namely E47-TALE1/2/3-VPR, and their target sequences are SEQ ID NO. 30: 5'-TCCCCACTCGGCCCCCACC-3' (TALE1), SEQ ID NO. 31: 5'-TCGCCAGGGACGGTAGGGC-3' (TALE2) and SEQ ID NO. 32: 5'-TACCCGTCCGCCAAGACCC-3' (TALE3).
Step 3: Perform the digestion-ligation (Golden Gate) reactions in parallel to assemble each pentamer. Each pentamer ligation reaction (20 μL) was prepared by mixing the following reagents in a tube: 10 U BsaI, 400 U T4 DNA ligase, 1× T4 DNA ligase buffer, 2 μg BSA and 200 ng of each of the 5 monomers. Each pentamer reaction tube was placed in a thermal cycler and the Golden Gate reaction was run for about 2.5 hours using the following cycling conditions: 10 cycles of 5 minutes at 37 ℃ and 10 minutes at 16 ℃; then 5 minutes at 50 ℃ and 5 minutes at 80 ℃; the product was held at 4 ℃ until use.
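The stated run time of about 2.5 hours can be checked from the cycling conditions with a short calculation. The sketch below is illustrative only; the helper name `program_minutes` is an assumption, and ramp times between temperatures and the final 4 ℃ hold are ignored.

```python
def program_minutes(cycles: int, cycle_steps: list[float], final_steps: list[float]) -> float:
    """Total hold time (minutes) of a thermal cycler program, ignoring ramp times."""
    return cycles * sum(cycle_steps) + sum(final_steps)


# Golden Gate program from step 3: 10 x (5 min at 37 C + 10 min at 16 C),
# followed by 5 min at 50 C and 5 min at 80 C.
cycling_only = program_minutes(10, [5, 10], [])
total = program_minutes(10, [5, 10], [5, 5])
print(f"cycling: {cycling_only} min, total: {total} min ({total / 60:.2f} h)")
```

The ten cycles alone account for 150 minutes, which matches the "about 2.5 hours" in the text; the two final holds add another 10 minutes.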
Step 4: Treat with exonuclease to degrade non-circularized ligation products. During the Golden Gate reaction, only completely ligated pentamers can circularize. Plasmid-Safe nuclease selectively degrades non-circular (incomplete) ligation products. μL of Plasmid-Safe nuclease (10 U/μL) and 1 μL of ATP (10 mM) were added to each pentamer reaction (20 μL). Each pentamer reaction tube was incubated at 37 ℃ for 30 minutes and then inactivated at 70 ℃ for 30 minutes.
Step 5: Run the pentamer reactions on a 1.5% agarose gel in 1× TAE electrophoresis buffer. The pentamer gel bands were purified using the MinElute gel extraction kit according to the manufacturer's instructions. The DNA of each reaction was eluted with 30 μL of buffer EB pre-warmed to 65 ℃, and the concentration of each purified pentamer was quantified. The concentration of each pentamer was adjusted to 100 ng/μL by adding buffer EB.
Step 6: Ligate the assembled pentamers into the TALE backbone in a second Golden Gate reaction. The pentamers and the appropriate TALE backbone vector (transcription factor TALE-TF or nuclease TALEN) were combined in a Golden Gate digestion-ligation reaction. The reaction (20 μL) for each TALE was set up by mixing the following reagents in a reaction tube: 75 ng TALE backbone vector (TALE-VP64 or TALE-VPR backbone vector), 10 U BsmBI, 400 U T4 DNA ligase, 1× T4 DNA ligase buffer and 200 ng of each of the four pentamers. In addition, a negative control ligation was set up containing the TALE backbone vector without any pentamers. The reaction tube was placed in a thermal cycler and the Golden Gate reaction was run for about 2.5 hours using the following cycling conditions: 10 cycles of 5 minutes at 37 ℃ and 10 minutes at 16 ℃; then 5 minutes at 50 ℃ and 5 minutes at 80 ℃; the product was held at 4 ℃ until use.
Step 7: Transformation. The ligation product from step 6 (20 μL) was transformed into competent E. coli DH5α cells. μL of the transformed culture (1 mL in total) was plated on 90-mm LB plates containing 100 μg/mL kanamycin and coated with 40 μL X-gal (20 mg/mL) and 8 μL IPTG (200 mg/mL), and the plates were incubated overnight at 37 ℃.
Step 8: Identify positive colonies. For each TALE plate, a sterile 20-μL pipette tip was touched to a single white colony and then swirled in 12 μL of sterile distilled water to suspend the colony. Colonies were then identified by colony PCR using the primers TAL-F and TAL-R (Table 1). Each colony PCR reaction (30 μL) contained 3 μL of colony suspension, 1× premixed Taq, 10 μM TAL-F and 10 μM TAL-R. The PCR program was: pre-denaturation at 96 ℃ for 10 minutes; 35 cycles of denaturation at 96 ℃ for 20 seconds, annealing at 55 ℃ for 20 seconds and extension at 72 ℃ for 3 minutes; final extension at 72 ℃ for 10 minutes. To check the colony PCR results, the gel should have enough lanes to run 10 μL of each PCR product. One lane contained 1 μg of DL2000 DNA marker. Gel electrophoresis was run until the 2000-bp band was clearly separated.
Step 9: Plasmid extraction and validation. Each colony identified by colony PCR, whether it showed the correct or an incorrect band size, was inoculated into 10 mL of LB medium containing 100 μg/mL kanamycin and incubated at 37 ℃ for 1 hour in a shaking incubator. Plasmid DNA was extracted from the cultures using the QIAprep Spin Miniprep Kit according to the manufacturer's instructions. Colonies were verified by digesting the purified final TALE plasmid with EcoRI. The reaction contained 1 μg of TALE plasmid, 15 U of EcoRI and 1× EcoRI buffer and was incubated at 37 ℃ for 30 minutes. The digest was run on a 1% agarose gel in 1× TAE.
Step 10: Prepare plasmids for transfection of mammalian cells. Larger amounts of plasmid were extracted from cultures of validated positive clones using the EndoFree plasmid kit according to the manufacturer's instructions. In the present invention, a total of 9 TALEs were assembled, namely HNF4α-TALE1/2/3-VP64, HNF4α-TALE1/2/3-VPR and E47-TALE1/2/3-VPR, targeting the promoter of the HNF4α or E47 gene, respectively.
The experimental results are as follows:
Procedure for constructing TALEs using the new plasmid-free monomers:
Using the newly designed and produced monomers (FIG. 2 and FIG. 3) and TALE backbone vectors (FIG. 4), a new procedure was designed to rapidly construct customized TALEs within one day (FIG. 10). This procedure eliminates the time-consuming and error-prone steps of the Addgene kit 2.0 and is designed to be as readily applicable in any biological laboratory as CRISPR. The procedure assembles a custom TALE by the Golden Gate method, in which two type IIS restriction enzymes, BsaI and BsmBI, are used to assemble the monomers into pentamers and the pentamers into the TALE backbone vector. To construct a TALE that binds an 18-bp target site, four digestion-ligation reactions are first set up with the monomers, BsaI and T4 DNA ligase, producing four pentamers. Another digestion-ligation reaction is then set up with the four pentamers, the TALE backbone vector (TALE-VP64/VPR), BsmBI and T4 DNA ligase, producing the final TALE plasmid vector, which can be used directly as an artificial transcription factor to activate gene expression. The final TALE plasmid vector is obtained by bacterial transformation, colony PCR identification and plasmid extraction. Positive TALE plasmid vectors can be further rapidly confirmed by EcoRI digestion.
Construction of TALEs targeting promoters of interest:
To validate the newly generated monomers and the TALE assembly procedure, 9 TALEs targeting the human HNF4α and E47 promoters were assembled following the assembly steps described in the methods, and were named:
HNF4α-TALE1/2/3-VP64, HNF4α-TALE1/2/3-VPR and E47-TALE1/2/3-VPR.
The final TALEs were transformed into E. coli and cultured on solid agar. Many white colonies were obtained (FIG. 11a and 11b). No blue colonies were seen on the agar plates, indicating that the protocol has high digestion-ligation efficiency. Colony PCR of randomly picked colonies confirmed the size of the insert and showed that typically more than 80% of the colonies were true positives (FIG. 11c and 11d), indicating the high efficiency of the monomers (FIG. 2), the backbone vectors (FIG. 3) and the procedure (FIG. 10) used to construct the custom TALEs. The colony PCR results were further confirmed by EcoRI digestion of the subsequently extracted plasmids, which indicated that typically 80% of the white colonies contained a correctly and successfully assembled final TALE (FIG. 11e and 11f). In the assembly of the six vectors HNF4α-TALE1/2/3-VPR and E47-TALE1/2/3-VPR, a high positive rate was also obtained by colony PCR identification (FIG. 12), further showing that the new method assembles customized TALEs with high efficiency. In addition, another TALE, HNF4α-TALE-VP64, was assembled targeting the same site as HNF4α-TALE-VPR in order to compare the activation abilities of the two transcriptional activation domains, VP64 and VPR.
Example 5
Activation of an exogenous reporter gene by the constructed TALEs
The experimental method comprises the following steps:
Cell culture and transfection: 293T, HepG2 and PANC1 cells were cultured in DMEM containing 10% (v/v) fetal bovine serum (FBS), 100 U/mL penicillin and 100 μg/mL streptomycin in an incubator at 37 ℃ with 5% (v/v) CO2. Cells were seeded at a density of 3000 cells/well in 96-well plates, 5 × 10^4 cells/well in 24-well plates or 2 × 10^5 cells/well in 6-well plates and cultured for 24 hours. The cells were then transfected with plasmid using Lipofectamine 2000 transfection reagent. All plasmid vectors used for transfection were extracted using the EndoFree plasmid extraction kit (purchased from CWBio). Before transfection, the medium was carefully removed. HepG2 and PANC1 cells were then washed with PBS, whereas 293T cells were not washed with PBS. Subsequently, 100 μL of opti-MEM containing 0.5 μg of Lipofectamine 2000 and 200 ng of plasmid was added to each well of a 96-well plate, 500 μL of opti-MEM containing 2 μg of Lipofectamine 2000 and 800 ng of plasmid to each well of a 24-well plate, or 2 mL of opti-MEM containing 10 μg of Lipofectamine 2000 and 2000 ng of plasmid to each well of a 6-well plate. The cells were incubated for 4 hours. The transfection medium was then removed and fresh complete medium was added. The cells were cultured for another 48 hours or 72 hours.
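The per-well transfection amounts listed above can be tabulated and scaled to several wells, as in the following sketch. The per-well values are taken from the text; the function name `transfection_mix` and the optional `excess` parameter (a pipetting overage) are hypothetical conveniences that are not part of the described method.

```python
# Per-well amounts from the description (opti-MEM in uL, Lipofectamine 2000 in ug, plasmid in ng).
PER_WELL = {
    "96-well": {"opti_mem_ul": 100, "lipofectamine_ug": 0.5, "plasmid_ng": 200},
    "24-well": {"opti_mem_ul": 500, "lipofectamine_ug": 2, "plasmid_ng": 800},
    "6-well": {"opti_mem_ul": 2000, "lipofectamine_ug": 10, "plasmid_ng": 2000},
}


def transfection_mix(plate: str, wells: int, excess: float = 0.0) -> dict[str, float]:
    """Scale the per-well amounts to `wells` wells, optionally adding a fractional excess."""
    if plate not in PER_WELL:
        raise KeyError(f"unknown plate format: {plate}")
    factor = wells * (1.0 + excess)
    return {name: amount * factor for name, amount in PER_WELL[plate].items()}


# Three wells of a 24-well plate, no overage.
print(transfection_mix("24-well", 3))
```

For a co-transfection in a 24-well plate, the 800 ng of plasmid per well is the combined amount of all plasmids added to that well.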
Detection of exogenous gene expression with the reporter vectors: 293T cells in 24-well plates were co-transfected with 400 ng of TALE vector (HNF4α-TALE1/2/3-VP64, HNF4α-TALE1/2/3-VPR or E47-TALE1/2/3-VPR) and 400 ng of ZsGreen reporter vector (HNF4α-TALE-reporter or E47-TALE-reporter). The cells were photographed with a fluorescence microscope IX51-DP71 (Olympus), and the fluorescence intensity of the cells was quantified with a Calibur flow cytometer (BD).
The experimental results are as follows:
To determine whether the constructed TALEs function in mammalian cells, fluorescent reporter vectors (HNF4α/E47-TALE-reporter) were prepared by cloning the HNF4α or E47 promoter containing the TALE binding sites, together with a minimal CMV promoter, upstream of the reporter gene ZsGreen. To generate the TALEs, the endogenous nuclear localization signal (NLS) and the acidic transcriptional activation domain of wild-type Hax3 were replaced with a mammalian NLS derived from the simian virus 40 large T antigen and the synthetic transcriptional activation domain VP64 or VPR (VP64-p65-RTA) (FIG. 3b). Three TALEs targeting different sites in each of the HNF4α and E47 promoters were assembled (FIG. 13a). 293T cells were then co-transfected with each assembled TALE and the corresponding TALE reporter. The results show that all TALEs activated expression of the corresponding TALE reporter, although the level of activation varied (FIG. 13b to 13d). TALEs targeting the vicinity of the transcription start site (TSS) showed the highest transcriptional activation activity (HNF4α-TALE1-VP64/VPR and E47-TALE1-VPR; FIG. 13b to 13d). In comparison, TALE-VPR activated higher ZsGreen expression than TALE-VP64 (FIG. 13b and FIG. 13c). These data indicate that TALEs assembled using the monomer library, the TALE backbone vectors and the construction procedure have a target gene activation function.
Example 6
Activation of endogenous genes with constructed TALE
The experimental method comprises the following steps:
Cell culture and transfection: 293T, HepG2 and PANC1 cells were cultured in DMEM containing 10% (v/v) fetal bovine serum (FBS), 100 U/mL penicillin and 100 μg/mL streptomycin in an incubator at 37 ℃ with 5% (v/v) CO2. Cells were seeded at a density of 3000 cells/well in 96-well plates, 5 × 10^4 cells/well in 24-well plates or 2 × 10^5 cells/well in 6-well plates and cultured for 24 hours. The cells were then transfected with plasmid using Lipofectamine 2000 transfection reagent. All plasmid vectors used for transfection were extracted using the EndoFree plasmid extraction kit. Before transfection, the medium was carefully removed. HepG2 and PANC1 cells were then washed with PBS, whereas 293T cells were not washed with PBS. Subsequently, 100 μL of opti-MEM containing 0.5 μg of Lipofectamine 2000 and 200 ng of plasmid was added to each well of a 96-well plate, 500 μL of opti-MEM containing 2 μg of Lipofectamine 2000 and 800 ng of plasmid to each well of a 24-well plate, or 2 mL of opti-MEM containing 10 μg of Lipofectamine 2000 and 2000 ng of plasmid to each well of a 6-well plate. The cells were incubated for 4 hours. The transfection medium was then removed and fresh complete medium was added. The cells were cultured for another 48 hours or 72 hours.
Cultured HepG2 and PANC1 cells in 24-well plates were transfected with HNF4α-TALE1-VPR and E47-TALE1-VPR, respectively. Elution buffer from the EndoFree plasmid extraction kit was used as a blank control. After 48 hours, total RNA was extracted with Trizol and used as the template for reverse transcription to synthesize complementary DNA (cDNA). gDNA was removed by incubating 2 μg of total RNA in 10 μL of 1× gDNA buffer from the FastKing RT kit (with gDNase). The reaction was incubated at 42 ℃ for 3 minutes and then placed on ice. μL of 10× King RT buffer, 1 μL of FastKing RT enzyme mixture, 2 μL of FQ-RT primer mixture and 5 μL of H2O were added to the reaction, which was incubated at 42 ℃ for 15 minutes and then at 95 ℃ for 3 minutes. The resulting cDNA was analyzed by fluorescent quantitative PCR (qPCR).
Detection of endogenous gene expression by qPCR: The expression of HNF4α and E47 and of their target genes was quantified by qPCR. β-actin was used as the internal reference gene to analyze the relative mRNA expression of the different genes. Each qPCR reaction (10 μL) contained 1 μL of cDNA, 2 μM forward primer (table S4), 2 μM reverse primer; SEQ ID NO. 29-90 (Table 5) and 1× Fast Green Master Mix. The PCR program was: pre-denaturation at 95 ℃ for 3 minutes; 40 cycles of denaturation at 95 ℃ for 15 seconds and extension at 60 ℃ for 1 minute. The qPCR program was run on a StepOne Plus real-time PCR instrument (Applied Biosystems). At least three technical replicates were performed per qPCR assay. Melting curve analysis revealed a single PCR product. Data analysis was performed using Applied Biosystems StepOne software v2.3, and Ct values were normalized to β-actin. The relative expression levels of the target mRNAs were calculated as relative quantity (RQ) according to the following formula: RQ = 2^(-ΔΔCt)
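The relative quantity formula can be made concrete with a short calculation. The sketch below uses the standard 2^(-ΔΔCt) definition, with ΔCt taken as the target Ct minus the β-actin Ct and ΔΔCt as the treated ΔCt minus the control ΔCt. The function name and the Ct values are hypothetical and serve only as an example; they are not measured data from this work.

```python
from statistics import mean


def relative_quantity(target_treated, ref_treated, target_control, ref_control) -> float:
    """Compute RQ = 2^(-ddCt) from lists of technical-replicate Ct values."""
    d_ct_treated = mean(target_treated) - mean(ref_treated)   # dCt in the TALE-transfected sample
    d_ct_control = mean(target_control) - mean(ref_control)   # dCt in the blank control
    dd_ct = d_ct_treated - d_ct_control
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values for three technical replicates (illustration only).
rq = relative_quantity(
    target_treated=[24.0, 24.5, 23.5],   # target gene, TALE-transfected cells
    ref_treated=[16.0, 16.0, 16.0],      # beta-actin, TALE-transfected cells
    target_control=[27.0, 27.5, 26.5],   # target gene, blank control
    ref_control=[16.0, 16.0, 16.0],      # beta-actin, blank control
)
print(rq)
```

With these example values, the target Ct in the transfected sample is three cycles lower than in the control after normalization, which corresponds to an eightfold increase in relative expression.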
TABLE 5 primers for quantitative PCR detection of gene expression
(Table 5 is provided as images in the original publication.)
The experimental results are as follows:
To determine whether the constructed TALE can activate endogenous gene expression in mammalian cells, HepG2 cells were first transfected with HNF4α-TALE-VPR. qPCR detection of HNF4α gene expression showed that it was highly activated by HNF4α-TALE-VPR (FIG. 14a). In addition, with the up-regulation of the endogenous HNF4α gene in HepG2 cells, the expression of various hepatocyte marker genes, including APOCIII, CYP1A2, G-6-P, GYS2, APOAI and HPD, was induced (FIG. 14a). Thus, HNF4α re-established the marker expression profile characteristic of hepatocytes. With HNF4α up-regulated, the expression of various stemness genes, including CD133, OCT3/4, BMI, SOX2, KLF4 and LIN28, was significantly down-regulated (FIG. 14b). Thus, HNF4α activation decreases the stem/progenitor ratio in hepatoma cells.
At the same time, PANC1 cells were transfected with E47-TALE-VPR. qPCR detection of E47 gene expression showed that it was highly activated by E47-TALE-VPR (FIG. 14c). Furthermore, with the up-regulation of E47, the expression of several E47 target genes (P21, TP53INP1, HNF6, SOX9, CX32 and MIST1) was also significantly up-regulated. In addition, the expression of two cell cycle arrest-related genes (P21 and TP53INP1) was significantly up-regulated. The P21 gene is a member of the CIP/KIP family and encodes a cyclin-dependent kinase inhibitor that acts downstream of the P53 gene. P21 and P53 together form the cell cycle G1 checkpoint: cells whose DNA is damaged cannot pass the checkpoint without repair, which reduces the replication and accumulation of damaged DNA and thereby exerts a cancer-suppressing effect. Following E47-TALE-VPR transfection, increased expression of the endogenous E47 gene in PANC1 cells resulted in up-regulation of a cluster of pancreatic marker genes, including P21, TP53INP1 and CX32. Expression of E47 reduced the expression of cell cycle genes that are highly expressed in PANC1 cells, such as the TOP2A and AURKA genes. The TOP2A gene encodes DNA topoisomerase II alpha, a nuclear matrix component that acts in the nucleus, regulates dynamic changes in the spatial structure of nucleic acids, and participates in DNA replication, transcription, recombination and repair. The AURKA gene encodes an evolutionarily conserved serine/threonine kinase, a member of the Aurora kinase family. During mitosis, AURKA plays an important role in the cell cycle by participating in centrosome separation and maturation and in the establishment of the two spindle poles, ensuring correct chromosome segregation and the completion of cytokinesis. Aberrant amplification and/or high expression of AURKA is common in a variety of human tumors.
In addition, E47 induced the expression of the ductal genes SOX9 and HNF6 in PANC1 cells. Meanwhile, E47 induced the expression of MIST1, a gene that regulates the expression of CX32 in PANC1 cells. These changes in gene expression indicate that transfecting PANC1 cells with the E47-TALE-VPR vector constructed by this method induces benign differentiation of the pancreatic cancer cells, shifting their characteristics from those of malignant tumor cells toward those of normal pancreatic cells. The E47-TALE-VPR vector therefore has potential value for pancreatic cancer therapy.
Example 7
Transfection of the constructed TALE expression vectors promotes cancer cell differentiation (tumor differentiation therapy)
The experimental method comprises the following steps:
Cell culture and transfection: 293T, HepG2 and PANC1 cells were cultured in DMEM containing 10% (v/v) fetal bovine serum (FBS), 100 U/mL penicillin and 100 μg/mL streptomycin in an incubator at 37 °C with 5% (v/v) CO₂. Cells were seeded at a density of 3000 cells/well in 96-well plates, 5 × 10⁴ cells/well in 24-well plates, or 2 × 10⁵ cells/well in 6-well plates and cultured for 24 hours. The cells were then transfected with plasmid by lipofection using Lipofectamine 2000 transfection reagent. All plasmid vectors used for transfection were extracted using the EndoFree plasmid extraction kit. Before transfection, the medium was carefully removed. HepG2 and PANC1 cells were then washed with PBS; 293T cells were not washed. Subsequently, each well received Opti-MEM containing Lipofectamine 2000 and plasmid: 100 μL containing 0.5 μg of Lipofectamine 2000 and 200 ng of plasmid per well of a 96-well plate, 500 μL containing 2 μg of Lipofectamine 2000 and 800 ng of plasmid per well of a 24-well plate, or 2 mL containing 10 μg of Lipofectamine 2000 and 2000 ng of plasmid per well of a 6-well plate. Cells were incubated for 4 hours. The transfection medium was then removed and fresh complete medium was added. The cells were cultured for another 48 hours or 72 hours.
Cell viability and cell cycle were measured by CCK8 and flow cytometry, respectively. Cell viability was measured using CCK8. HepG2 and PANC1 cells were seeded at a density of 3000 cells/well in 96-well plates and cultured overnight, then transfected with 200 ng of the plasmid HNF4α-TALE1-VPR or E47-TALE1-VPR and 0.5 μg of Lipofectamine 2000. After 4 hours, the medium was replaced with fresh complete medium. At a fixed time point each day, 10 μL of CCK8 solution was added per well and the plate was incubated at 37 °C for 4 hours, after which the optical density (OD) of each well was read at a wavelength of 450 nm with a microplate reader (BioTek).
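The text does not state how the OD450 readings were converted into viability. One common convention, shown here purely as an assumption, is to express the background-subtracted OD of transfected wells as a percentage of that of control wells. The function name, the blank-well correction and the OD values below are all hypothetical.

```python
def relative_viability(od_treated, od_control, od_blank=0.0):
    """Percent viability of treated wells relative to control wells.

    od_treated, od_control: lists of replicate OD450 readings.
    od_blank: OD450 of medium plus CCK8 without cells (assumed correction).
    """
    treated = sum(od_treated) / len(od_treated) - od_blank
    control = sum(od_control) / len(od_control) - od_blank
    if control <= 0:
        raise ValueError("control OD must exceed the blank OD")
    return 100.0 * treated / control


if __name__ == "__main__":
    # Hypothetical readings, not data from this document.
    print(relative_viability([0.75, 0.75], [1.25, 1.25], od_blank=0.25))  # 50.0
```

Under this convention, a value of 50 means that the transfected wells gave half the background-corrected signal of the control wells.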
Cell cycle analysis by flow cytometry: HepG2 and PANC1 cells were seeded at a density of 3000 cells/well in 96-well plates and cultured overnight, transfected with 200 ng of the plasmid HNF4α-TALE1-VPR or E47-TALE1-VPR and 0.5 μg of Lipofectamine 2000 for 4 hours, and cultured for a further 24 hours after the medium was replaced with fresh complete medium. Cells were harvested by trypsinization and resuspended in 40 μL of 4% paraformaldehyde. DAPI (100 μg/mL) was added to the cells, which were incubated at room temperature for 30 min. The cells were pelleted by centrifugation, washed once with PBS, and resuspended in PBS. Cells were analyzed on a Calibur flow cytometer (BD). The data for the G0/G1, S and G2/M phases were generated by modeling with ModFit LT software (Verity Software House).
Cell migration was detected by transwell assay: HepG2 and PANC1 cells cultured in 24-well plates were transfected with 800 ng of HNF4α-TALE1-VPR and E47-TALE1-VPR, respectively. Lipofectamine™ 2000 alone was used as a blank control. After 48 hours, cells were washed with serum-free DMEM, collected by trypsinization, and resuspended in serum-free DMEM at a final concentration of 1 × 10⁵ cells/mL. The cell suspension was seeded into the upper transwell chamber at a volume of 200 μL per well. DMEM containing 10% FBS and 50 μg/mL fibronectin was added to the lower transwell chamber at a volume of 600 μL. Cells were incubated at 37 °C for 48 hours. The transwell insert was then removed. The lower chamber was washed once with PBS, and 100 μL of PBS containing 10 mg/mL acridine orange was added. After staining for 30 min at room temperature, cells were washed three times with PBS. Cells were photographed using a fluorescence microscope (IX51-DP71, Olympus).
The experimental results are as follows:
To explore whether the TALE-induced changes in gene expression lead to phenotypic changes in cells, cell viability, cell cycle and migration were examined. Time-course measurement of cell viability with CCK8 showed that transfection of HNF4α-TALE-VPR reduced the viability of HepG2 cells from the third day after transfection (FIG. 15a). Notably, cell viability had decreased by more than 50% by the fifth day after transfection. Cell cycle measurements indicated that transfection of HNF4α-TALE-VPR caused significant cell cycle arrest in HepG2 cells (FIG. 15b). On day 3 post-transfection, only 9.66% of HNF4α-TALE-VPR-transfected HepG2 cells were in S phase, whereas 40.36% of control cells were in S phase (FIG. 15b). The cell migration assay showed that transfection of HNF4α-TALE-VPR significantly inhibited HepG2 cell migration (FIG. 15c). These phenotypic changes indicate that HNF4α activation by HNF4α-TALE-VPR transfection induced differentiation of HepG2 cells toward normal hepatocytes, consistent with the significant up-regulation of characteristic hepatocyte markers (FIG. 14a) and the significant down-regulation of stemness genes (FIG. 14b).
At the same time, the viability, cell cycle and migration of PANC1 cells were also examined. Similarly, the results indicate that transfection of E47-TALE-VPR significantly reduced cell viability (FIG. 15c), caused cell cycle arrest (FIG. 15d) and reduced migration (FIG. 15e) in PANC1 cells. FIG. 16 shows the raw data from the cell cycle assays. These phenotypic changes were consistent with the significant up-regulation of P21, TP53INP1, acinar enzymes and MIST1, suggesting that E47 reprograms PANC1 cells to a quiescent acinar state by restoring the expression of P21, TP53INP1, acinar enzymes and MIST1.
Sequence listing
<110> Southeast University
<120> TALE expression vector and rapid construction method and application thereof
<160> 94
<170> SIPOSequenceListing 1.0
<210> 1
<211> 22
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 1
tatcatcatg cctcctctag ag 22
<210> 2
<211> 21
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 2
ttggtcatgg gtggctcgag g 21
<210> 3
<211> 99
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 3
atcatcatgc ctcctctaga ggtctcccta tcttaaaccg gccaacatac ccgtctcccc 60
ctgaacctga ccccggacca agtggtggct atcgccagc 99
<210> 4
<211> 65
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 4
ttggtcatgg gtggctcgag ggtctccata gagtctgtct ttcccctttc ccgtctcctg 60
caccg 65
<210> 5
<211> 68
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 5
tatcatcatg cctcctctag aggtctccct atcttaaacc ggccaacata cccgtctcgt 60
gcagcggc 68
<210> 6
<211> 62
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 6
ttggtcatgg gtggctcgag ggtctccata gagtctgtct ttcccctttc ccgtctcccg 60
cc 62
<210> 7
<211> 66
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 7
ttggtcatgg gtggctcgag ggtctcctcg aagtctgtct ttcccctttc ccgtctccaa 60
cagccg 66
<210> 8
<211> 68
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 8
tatcatcatg cctcctctag aggtctcctc gacttaaacc ggccaacata cccgtctcct 60
gttgccgg 68
<210> 9
<211> 39
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 9
tatcatcatg cctcctctag aggtctcgct atcgccagc 39
<210> 10
<211> 69
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 10
ttggtcatgg gtggctcgag ggtctcttcg aagtctgtct ttcccctttc ccgtctctcg 60
ttggtcaac 69
<210> 11
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 11
ttggcgtcgg caaacagtgg 20
<210> 12
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 12
ggcgacgagg tggtcgttgg 20
<210> 13
<211> 149
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 13
tatcatcatg cctcctctag aggtctcctc gacttaaacc ggccaacata cccgtctctg 60
gcggcaagca agcgctcgaa acggtgcagc ggctgttgcc ggtgctgtgc caggaccatg 120
gccgagaccc tcgagccacc catgaccaa 149
<210> 14
<211> 126
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 14
tatcatcatg cctcctctag aggtctccga aacggtgcag cggctgttgc cggtgctgtg 60
ccaggaccat ggcctgaccc cggaccaagt ggtggctatc gagaccctcg agccacccat 120
gaccaa 126
<210> 15
<211> 29
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 15
ccggaattca gcgctggagg aggtggaag 29
<210> 16
<211> 36
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 16
aaggaaaaaa gcggccgctc agttaatcag catgtc 36
<210> 17
<211> 49
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 17
ccggaattcg acgcattgga cgattttgat ctggatatgc tgggaagtg 49
<210> 18
<211> 55
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 18
aaggaaaaaa gcggccgcaa acagagatgt gtcgaagatg gacagtcctg tgctg 55
<210> 19
<211> 2840
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 19
cgttacataa cttacggtaa atggcccgcc tggctgaccg cccaacgacc cccgcccatt 60
gacgtcaata atgacgtatg ttcccatagt aacgccaata gggactttcc attgacgtca 120
atgggtggag tatttacggt aaactgccca cttggcagta catcaagtgt atcatatgcc 180
aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatt atgcccagta 240
catgacctta tgggactttc ctacttggca gtacatctac gtattagtca tcgctattac 300
catggtgatg cggttttggc agtacatcaa tgggcgtgga tagcggtttg actcacgggg 360
atttccaagt ctccacccca ttgacgtcaa tgggagtttg ttttggcacc aaaatcaacg 420
ggactttcca aaatgtcgta acaactccgc cccattgacg caaatgggcg gtaggcgtgt 480
acggtgggag gtctatataa gcagagctat ggatcccatt cgtccgcgca ggccaagtcc 540
tgcccgcgag cttctgcccg gaccccaacc ggatagggtt cagccgactg cagatcgtgg 600
ggtgtctgcg cctgctggca gccctctgga tggcttgccc gctcggcgga cggtgtcccg 660
gacccggctg ccatctcccc ctgcgccctc acctgcgttc tcggcgggca gcttcagcga 720
tctgctccgt ccgttcgatc cgtcgcttct tgatacatcg cttcttgatt cgatgcctgc 780
cgtcggcacg ccgcatacag cggctgcccc agcagagtgg gatgaggcgc aatcggctct 840
gcgtgcagcc gatgacccgc cacccaccgt gcgtgtcgct gtcactgccg cgcggccgcc 900
gcgcgccaag ccggccccgc gacggcgtgc tgcgcaaccc tccgacgctt cgccggccgc 960
gcaggtggat ctacgcacgc tcggctacag tcagcagcag caagagaaga tcaaaccgaa 1020
ggtgcgttcg acagtggcgc agcaccacga ggcactggtg ggccatgggt ttacacacgc 1080
gcacatcgtt gcgctcagcc aacacccggc agcgttaggg accgtcgctg tcacgtatca 1140
gcacataatc acggcgttgc cagaggcgac acacgaagac atcgttggcg tcggcaaaca 1200
gtggtccggc gcacgcgccc tggaggcctt gctcacggat gcgggggagt tgagaggtcc 1260
gccgttacag ttggacacag gccaacttgt gaagattgca aaacgtggcg gcgtgaccgc 1320
aatggaggca gtgcatgccc tggagacggg cgccgctaca gggcgcgtcc cattcgccat 1380
tcaggctgcg caactgttgg gaagggcgat cggtgcgggc ctcttcgcta ttacgccagc 1440
tggcgaaagg gggatgtgct gcaaggcgat taagttgggt aacgccaggg ttttcccagt 1500
cacgacgttg taaaacgacg gccagtgagc gcgcgtaata cgactcacta tagggcgaat 1560
tgggtaccgg gccccccctc gaggtcctcc agcttttgtt ccctttagtg agggttaatt 1620
gcgcgcttgg cgtaatcatg gtcatagctg tttcctgtgt gaaattgtta tccgctcaca 1680
attccacaca acatacgagc cggaagcata aagtgtaaag cctggggtgc ctaatgagtg 1740
agctaactca cattaattgc gttgcgctca ctgcccgctt tccaccggtc gtctccaacg 1800
accacctcgt cgccttggcc tgcctcggcg gacgtcctgc catggatgca gtgaaaaagg 1860
gattgccgca cgcgccggaa ttgatcagaa gagtcaatcg ccgtattggc gaacgcacgt 1920
cccatcgcgt tgccgactac gcgcaagtgg ttcgcgtgct ggagtttttc cagtgccact 1980
cccacccagc gtacgcattt gatgaggcca tgacgcagtt cgggatgagc aggaacgggt 2040
tggtacagct ctttcgcaga gtgggcgtca ccgaactcga agcccgcggt ggaacgctcc 2100
ccccagcctc gcagcgttgg gaccgtatcc tccaggcatc agggatgaaa agggccaaac 2160
cgtcccctac ttcagctcaa acaccggatc aggcgtcttt gcatgcattc gccgattcgc 2220
tggagcgtga ccttgatgcg cccagcccaa tgcacgaggg agatcagacg cgggcaagca 2280
gccgtaaacg gtcccgatcg gatcgtgctg tcaccggccc ctccgcacag caggctgtcg 2340
aggtgcgcgt tcccgaacag cgcgatgcgc tgcatttgcc cctcagctgg agggtaaaac 2400
gcccgcgtac caggatctgg ggcggcctcc cggatcctgg tacgcccatg gctgccgacc 2460
tggcagcgtc cagcaccgtg atgtgggaac aagatgcgga ccccttcgca ggggcagcgg 2520
atgatttccc ggcattcaac gaagaggaac tcgcatggtt gatggagcta ttgcctcaga 2580
agagcgctgg aggaggtgga agcggaggag gaggaagcgg aggaggaggt agcggaccta 2640
agaaaaagag gaaggtggcc gctgctggct ctggacgggc tgacgcattg gacgattttg 2700
atctggatat gctgggaagt gacgccctcg atgattttga ccttgacatg cttggttcgg 2760
atgcccttga tgactttgac ctcgacatgc tcggcagtga cgcccttgat gatttcgacc 2820
tggacatgct gattaactga 2840
<210> 20
<211> 5870
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 20
cgttacataa cttacggtaa atggcccgcc tggctgaccg cccaacgacc cccgcccatt 60
gacgtcaata atgacgtatg ttcccatagt aacgccaata gggactttcc attgacgtca 120
atgggtggag tatttacggt aaactgccca cttggcagta catcaagtgt atcatatgcc 180
aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatt atgcccagta 240
catgacctta tgggactttc ctacttggca gtacatctac gtattagtca tcgctattac 300
catggtgatg cggttttggc agtacatcaa tgggcgtgga tagcggtttg actcacgggg 360
atttccaagt ctccacccca ttgacgtcaa tgggagtttg ttttggcacc aaaatcaacg 420
ggactttcca aaatgtcgta acaactccgc cccattgacg caaatgggcg gtaggcgtgt 480
acggtgggag gtctatataa gcagagctat ggatcccatt cgtccgcgca ggccaagtcc 540
tgcccgcgag cttctgcccg gaccccaacc ggatagggtt cagccgactg cagatcgtgg 600
ggtgtctgcg cctgctggca gccctctgga tggcttgccc gctcggcgga cggtgtcccg 660
gacccggctg ccatctcccc ctgcgccctc acctgcgttc tcggcgggca gcttcagcga 720
tctgctccgt ccgttcgatc cgtcgcttct tgatacatcg cttcttgatt cgatgcctgc 780
cgtcggcacg ccgcatacag cggctgcccc agcagagtgg gatgaggcgc aatcggctct 840
gcgtgcagcc gatgacccgc cacccaccgt gcgtgtcgct gtcactgccg cgcggccgcc 900
gcgcgccaag ccggccccgc gacggcgtgc tgcgcaaccc tccgacgctt cgccggccgc 960
gcaggtggat ctacgcacgc tcggctacag tcagcagcag caagagaaga tcaaaccgaa 1020
ggtgcgttcg acagtggcgc agcaccacga ggcactggtg ggccatgggt ttacacacgc 1080
gcacatcgtt gcgctcagcc aacacccggc agcgttaggg accgtcgctg tcacgtatca 1140
gcacataatc acggcgttgc cagaggcgac acacgaagac atcgttggcg tcggcaaaca 1200
gtggtccggc gcacgcgccc tggaggcctt gctcacggat gcgggggagt tgagaggtcc 1260
gccgttacag ttggacacag gccaacttgt gaagattgca aaacgtggcg gcgtgaccgc 1320
aatggaggca gtgcatgccc tggagacggg cgccgctaca gggcgcgtcc cattcgccat 1380
tcaggctgcg caactgttgg gaagggcgat cggtgcgggc ctcttcgcta ttacgccagc 1440
tggcgaaagg gggatgtgct gcaaggcgat taagttgggt aacgccaggg ttttcccagt 1500
cacgacgttg taaaacgacg gccagtgagc gcgcgtaata cgactcacta tagggcgaat 1560
tgggtaccgg gccccccctc gaggtcctcc agcttttgtt ccctttagtg agggttaatt 1620
cgttacataa cttacggtaa atggcccgcc tggctgaccg cccaacgacc cccgcccatt 1680
gacgtcaata atgacgtatg ttcccatagt aacgccaata gggactttcc attgacgtca 1740
atgggtggag tatttacggt aaactgccca cttggcagta catcaagtgt atcatatgcc 1800
aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatt atgcccagta 1860
catgacctta tgggactttc ctacttggca gtacatctac gtattagtca tcgctattac 1920
catggtgatg cggttttggc agtacatcaa tgggcgtgga tagcggtttg actcacgggg 1980
atttccaagt ctccacccca ttgacgtcaa tgggagtttg ttttggcacc aaaatcaacg 2040
ggactttcca aaatgtcgta acaactccgc cccattgacg caaatgggcg gtaggcgtgt 2100
acggtgggag gtctatataa gcagagctat ggatcccatt cgtccgcgca ggccaagtcc 2160
tgcccgcgag cttctgcccg gaccccaacc ggatagggtt cagccgactg cagatcgtgg 2220
ggtgtctgcg cctgctggca gccctctgga tggcttgccc gctcggcgga cggtgtcccg 2280
gacccggctg ccatctcccc ctgcgccctc acctgcgttc tcggcgggca gcttcagcga 2340
tctgctccgt ccgttcgatc cgtcgcttct tgatacatcg cttcttgatt cgatgcctgc 2400
cgtcggcacg ccgcatacag cggctgcccc agcagagtgg gatgaggcgc aatcggctct 2460
gcgtgcagcc gatgacccgc cacccaccgt gcgtgtcgct gtcactgccg cgcggccgcc 2520
gcgcgccaag ccggccccgc gacggcgtgc tgcgcaaccc tccgacgctt cgccggccgc 2580
gcaggtggat ctacgcacgc tcggctacag tcagcagcag caagagaaga tcaaaccgaa 2640
ggtgcgttcg acagtggcgc agcaccacga ggcactggtg ggccatgggt ttacacacgc 2700
gcacatcgtt gcgctcagcc aacacccggc agcgttaggg accgtcgctg tcacgtatca 2760
gcacataatc acggcgttgc cagaggcgac acacgaagac atcgttggcg tcggcaaaca 2820
gtggtccggc gcacgcgccc tggaggcctt gctcacggat gcgggggagt tgagaggtcc 2880
gccgttacag ttggacacag gccaacttgt gaagattgca aaacgtggcg gcgtgaccgc 2940
aatggaggca gtgcatgccc tggagacggg cgccgctaca gggcgcgtcc cattcgccat 3000
tcaggctgcg caactgttgg gaagggcgat cggtgcgggc ctcttcgcta ttacgccagc 3060
tggcgaaagg gggatgtgct gcaaggcgat taagttgggt aacgccaggg ttttcccagt 3120
cacgacgttg taaaacgacg gccagtgagc gcgcgtaata cgactcacta tagggcgaat 3180
tgggtaccgg gccccccctc gaggtcctcc agcttttgtt ccctttagtg agggttaatt 3240
gcgcgcttgg cgtaatcatg gtcatagctg tttcctgtgt gaaattgtta tccgctcaca 3300
attccacaca acatacgagc cggaagcata aagtgtaaag cctggggtgc ctaatgagtg 3360
agctaactca cattaattgc gttgcgctca ctgcccgctt tccaccggtc gtctccaacg 3420
accacctcgt cgccttggcc tgcctcggcg gacgtcctgc catggatgca gtgaaaaagg 3480
gattgccgca cgcgccggaa ttgatcagaa gagtcaatcg ccgtattggc gaacgcacgt 3540
cccatcgcgt tgccgactac gcgcaagtgg ttcgcgtgct ggagtttttc cagtgccact 3600
cccacccagc gtacgcattt gatgaggcca tgacgcagtt cgggatgagc aggaacgggt 3660
tggtacagct ctttcgcaga gtgggcgtca ccgaactcga agcccgcggt ggaacgctcc 3720
ccccagcctc gcagcgttgg gaccgtatcc tccaggcatc agggatgaaa agggccaaac 3780
cgtcccctac ttcagctcaa acaccggatc aggcgtcttt gcatgcattc gccgattcgc 3840
tggagcgtga ccttgatgcg cccagcccaa tgcacgaggg agatcagacg cgggcaagca 3900
gccgtaaacg gtcccgatcg gatcgtgctg tcaccggccc ctccgcacag caggctgtcg 3960
aggtgcgcgt tcccgaacag cgcgatgcgc tgcatttgcc cctcagctgg agggtaaaac 4020
gcccgcgtac caggatctgg ggcggcctcc cggatcctgg tacgcccatg gctgccgacc 4080
tggcagcgtc cagcaccgtg atgtgggaac aagatgcgga ccccttcgca ggggcagcgg 4140
atgatttccc ggcattcaac gaagaggaac tcgcatggtt gatggagcta ttgcctcaga 4200
aggacgcatt ggacgatttt gatctggata tgctgggaag tgacgccctc gatgattttg 4260
accttgacat gcttggttcg gatgcccttg atgactttga cctcgacatg ctcggcagtg 4320
acgcccttga tgatttcgac ctggacatgc tgattaactc tagaagttcc ggatctccga 4380
aaaagaaacg caaagttggt agccagtacc tgcccgacac cgacgaccgg caccggatcg 4440
aggaaaagcg gaagcggacc tacgagacat tcaagagcat catgaagaag tcccccttca 4500
gcggccccac cgaccctaga cctccaccta gaagaatcgc cgtgcccagc agatccagcg 4560
ccagcgtgcc aaaacctgcc ccccagcctt accccttcac cagcagcctg agcaccatca 4620
actacgacga gttccctacc atggtgttcc ccagcggcca gatctctcag gcctctgctc 4680
tggctccagc ccctcctcag gtgctgcctc aggctcctgc tcctgcacca gctccagcca 4740
tggtgtctgc actggctcag gcaccagcac ccgtgcctgt gctggctcct ggacctccac 4800
aggctgtggc tccaccagcc cctaaaccta cacaggccgg cgagggcaca ctgtctgaag 4860
ctctgctgca gctgcagttc gacgacgagg atctgggagc cctgctggga aacagcaccg 4920
atcctgccgt gttcaccgac ctggccagcg tggacaacag cgagttccag cagctgctga 4980
accagggcat ccctgtggcc cctcacacca ccgagcccat gctgatggaa taccccgagg 5040
ccatcacccg gctcgtgaca ggcgctcaga ggcctcctga tccagctcct gcccctctgg 5100
gagcaccagg cctgcctaat ggactgctgt ctggcgacga ggacttcagc tctatcgccg 5160
atatggattt ctcagccttg ctgggctctg gcagcggcag ccgggattcc agggaaggga 5220
tgtttttgcc gaagcctgag gccggctccg ctattagtga cgtgtttgag ggccgcgagg 5280
tgtgccagcc aaaacgaatc cggccatttc atcctccagg aagtccatgg gccaaccgcc 5340
cactccccgc cagcctcgca ccaacaccaa ccggtccagt acatgagcca gtcgggtcac 5400
tgaccccggc accagtccct cagccactgg atccagcgcc cgcagtgact cccgaggcca 5460
gtcacctgtt ggaggatccc gatgaggaga ctagccaggc tgtcaaagcc cttcgggaga 5520
tggccgatac tgtgattccc cagaaggaag aggctgcaat ctgtggccaa atggaccttt 5580
cccatccgcc cccaaggggc catctggatg agctgacaac cacacttgag tccatgaccg 5640
aggatctgaa cctggactca cccctgaccc cggaattgaa cgagattctg gataccttcc 5700
tgaacgacga gtgcctcttg catgccatgc atatcagcac aggactgtcc atcttcgaca 5760
catctctgtt tgcggccgcg actctagatc ataatcagcc ataccacatt tgtagaggtt 5820
ttacttgctt taaaaaacct cccacacctc cccctgaacc tgaaacataa 5870
<210> 21
<211> 28
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 21
ccgctcgagt gagatccaaa actgagac 28
<210> 22
<211> 29
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 22
cccaagctta agcccaccca gccggagag 29
<210> 23
<211> 30
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 23
ccgctcgagg ctcagtagcc accaaccacc 30
<210> 24
<211> 31
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 24
cccaagcttt ctgtggaggg gagctggtaa g 31
<210> 25
<211> 1728
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 25
tgagatccaa aactgagaca aaagaaacgg ggctgttcca aaaaaaaagc taggtggcag 60
gtgtctaaca tgccagggag ctaaaacaga gtgtgtgagt ttcagcagca ggttgaattt 120
agaatgggga aggagaccag aggagacgcc agacaggatg actttgtccc attggcctgg 180
aggcagcccc atgtttctcc acccctcata tcactcacca gtttgtaata gtatctttga 240
atgacgatct gattaaggtc cgtctcctcc attagtccac aagtttcggg ggtacatcta 300
ctttgctcat ttccatatcc ccagagtcta gcacaaggcc tggtacatag taggtgctca 360
ataaatatgt tagatgaaag gaagataaca cctctatgta ctagcagtga gactccaggc 420
atgcaatttc tctctgtcct tcagtccctt catctcaagg tttaatttaa atatggtaac 480
gcctgtatgc aactcccagc atccagtagg cactcactaa acacagttct ccaccctcct 540
tttttcctct gcccctccct cggttttccc actacttcct gcatggtgac acacccatag 600
tttggagcca taaaacccaa cccaggttgg actctcacct ctccagcccc ttctgctccg 660
gccctgtcct caaattgggg ggctgatgtc cccatacacc tggctctggg ttcccctaac 720
cccagagtgc aggactagga cccgagtgga cctcaggtct ggccaggtcg ccattgccat 780
ggagacagca acagtcccca gccgcgggtt ccctaagtga ctggttactc tttaacgtat 840
ccacccacct tgggtgatta gaagaatcaa taagataacc gggcggtggc agctggccgc 900
actcaccgcc ttcctggtgg acgggctcct ggtggctgtg ctgctgctgt gagcgggccc 960
ctgctcctcc atgcccccag ctctccggct gggtgggctt tagagggtat ataatggaag 1020
ctcgacttcc agatggccca gtccaagcac ggcctgacca aggagatgac catgaagtac 1080
cgcatggagg gctgcgtgga cggccacaag ttcgtgatca ccggcgaggg catcggctac 1140
cccttcaagg gcaagcaggc catcaacctg tgcgtggtgg agggcggccc cttgcccttc 1200
gccgaggaca tcttgtccgc cgccttcatg tacggcaacc gcgtgttcac cgagtacccc 1260
caggacatcg tcgactactt caagaactcc tgccccgccg gctacacctg ggaccgctcc 1320
ttcctgttcg aggacggcgc cgtgtgcatc tgcaacgccg acatcaccgt gagcgtggag 1380
gagaactgca tgtaccacga gtccaagttc tacggcgtga acttccccgc cgacggcccc 1440
gtgatgaaga agatgaccga caactgggag ccctcctgcg agaagatcat ccccgtgccc 1500
aagcagggca tcttgaaggg cgacgtgagc atgtacctgc tgctgaagga cggtggccgc 1560
ttgcgctgcc agttcgacac cgtgtacaag gccaagtccg tgccccgcaa gatgcccgac 1620
tggcacttca tccagcacaa gctgacccgc gaggaccgca gcgacgccaa gaaccagaag 1680
tggcacctga ccgagcacgc catcgcctcc ggctccgcct tgccctaa 1728
<210> 26
<211> 1728
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 26
gctcagtagc caccaaccac cctctgccca gtgcctgtgc tggccccaag agccctgggc 60
atacatctcc tcacccctcc agccgaccag ctccccacag gcctgggttc ggggttagag 120
tcaggttgac ttcccaggcc aaaggcgtgg tcaggctctg ctcccaacag tatcactcac 180
tcccctcctt ccccagggtg cactttctct gccctgaaac cgccccccag tttattacaa 240
catcaggccc acccagcata cttcccgctc cagcacagca gcagacgtta ccaacaggcc 300
ccagccccac agtccaggca gtgccaacca gtcagagctg gcgctactca cagacccctc 360
ttcctgttac tttagtgaag gacgaatgtt gctctgggac caagggaagc tcagcactgg 420
aagattgatt ccagctgggg aacaagggag aaggatggga tggggtgttg gagctggggt 480
tttgatgggt cgataggagt tttctggatt aaggaggaaa aaggtttagt aggcagagcc 540
agatccaagt cctcaagtac aaaggtcaga agttcaggga ggcgggaggt gggggctgga 600
aaaggagtag gagagagaga ccaggacatc ataagaacct caaacttggc caggggctgg 660
gatttaccta aaggggggtg gctttacagg gaagggaaac ggggtggtag atgtgctacc 720
aggaggggac aggaggccgg caaggtgcat gggctgatcg tggtccctcc gtcctgactg 780
caccccccac cgcccacccg cccgcaggtg tacccaccca gctcaggtga ggactacggc 840
agggatgcca ccgcctaccc gtccgccaag acccccagca gcacctatcc cgcccccttc 900
tacgtggcag gtacatggca gggcgggggc tcgccaggga cggtagggca gggctggggt 960
tccccactcg gcccccacct taccagctcc cctccacaga tagagggtat ataatggaag 1020
ctcgacttcc agatggccca gtccaagcac ggcctgacca aggagatgac catgaagtac 1080
cgcatggagg gctgcgtgga cggccacaag ttcgtgatca ccggcgaggg catcggctac 1140
cccttcaagg gcaagcaggc catcaacctg tgcgtggtgg agggcggccc cttgcccttc 1200
gccgaggaca tcttgtccgc cgccttcatg tacggcaacc gcgtgttcac cgagtacccc 1260
caggacatcg tcgactactt caagaactcc tgccccgccg gctacacctg ggaccgctcc 1320
ttcctgttcg aggacggcgc cgtgtgcatc tgcaacgccg acatcaccgt gagcgtggag 1380
gagaactgca tgtaccacga gtccaagttc tacggcgtga acttccccgc cgacggcccc 1440
gtgatgaaga agatgaccga caactgggag ccctcctgcg agaagatcat ccccgtgccc 1500
aagcagggca tcttgaaggg cgacgtgagc atgtacctgc tgctgaagga cggtggccgc 1560
ttgcgctgcc agttcgacac cgtgtacaag gccaagtccg tgccccgcaa gatgcccgac 1620
tggcacttca tccagcacaa gctgacccgc gaggaccgca gcgacgccaa gaaccagaag 1680
tggcacctga ccgagcacgc catcgcctcc ggctccgcct tgccctaa 1728
<210> 27
<211> 19
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 27
tgcccccagc tctccggct 19
<210> 28
<211> 19
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 28
tccggccctg tcctcaaat 19
<210> 29
<211> 19
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 29
tgtccttcag tcccttcat 19
<210> 30
<211> 19
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 30
tccccactcg gcccccacc 19
<210> 31
<211> 19
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 31
tcgccaggga cggtagggc 19
<210> 32
<211> 19
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 32
tacccgtccg ccaagaccc 19
<210> 33
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 33
cctgcttgta tgctggagtc 20
<210> 34
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 34
gaaaagtcgt tgatgttgga 20
<210> 35
<211> 19
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 35
gggtactcct tgttgttgc 19
<210> 36
<211> 21
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 36
aaatcccaga actcagagaa c 21
<210> 37
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 37
ctggcctctg ccatcttctg 20
<210> 38
<211> 21
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 38
ttagcctcct tgctcacatg c 21
<210> 39
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 39
ggctccatga ctgtgggatc 20
<210> 40
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 40
ttcagctgca cagcccagaa 20
<210> 41
<211> 22
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 41
gtgtccctct agtctatgaa gc 22
<210> 42
<211> 22
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 42
attgacttga tcctccagat ac 22
<210> 43
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 43
acaaggtgct gcgggaatca 20
<210> 44
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 44
actggtggga ggggtaggtg 20
<210> 45
<211> 21
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 45
aggaggactc ttctctccca a 21
<210> 46
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 46
gattcatctg cagccaggat 20
<210> 47
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 47
gccttgggaa aacagctaaa 20
<210> 48
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 48
aggccctctg tctccttttc 20
<210> 49
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 49
aggccctctg tctccttttc 20
<210> 50
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 50
ttctctcccc attcatctgc 20
<210> 51
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 51
ttgggaaggt gaagtttgct 20
<210> 52
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 52
gcatttgggc agtttaggaa 20
<210> 53
<211> 19
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 53
acatgaaaag acctggggg 19
<210> 54
<211> 19
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 54
gatctggtgt cccagcatg 19
<210> 55
<211> 17
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 55
cggaagaccc cagtcca 17
<210> 56
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 56
acgaaggctc tggtccacta 20
<210> 57
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 57
cgaccatctg ccgctttgag 20
<210> 58
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 58
ccccctgtcc cccattccta 20
<210> 59
<211> 25
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 59
atctccacag gagagactgg ttcgg 25
<210> 60
<211> 22
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 60
aaagtggggc cttgggaaca tg 22
<210> 61
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 61
aggtctgagg agcagcttca 20
<210> 62
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 62
attgtccacg ctggattttc 20
<210> 63
<211> 26
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 63
ggagaccagc aagtattgtc ctattt 26
<210> 64
<211> 22
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 64
cattgtcgct gggcatcgta ag 22
<210> 65
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 65
tgtaagtggt tcaacgtgcg 20
<210> 66
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 66
cctcaccctc cttcaagctc 20
<210> 67
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 67
gctgcttaga cgctggattt 20
<210> 68
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 68
ctcctcctcg tcgcagtaga 20
<210> 69
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 69
gcggcaaaac ctacacaaag 20
<210> 70
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 70
ccccgtgtgt ttacggtagt 20
<210> 71
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 71
gcgaaccatc tctgtggtct 20
<210> 72
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 72
ggaaagttgg gatcgaacaa 20
<210> 73
<211> 21
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 73
gcgcagtatc acagccttaa a 21
<210> 74
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 74
tcaatctctt ggcgatttca 20
<210> 75
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 75
gatttgtggg cctgaagaaa 20
<210> 76
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 76
ttgggactgg tggaagaatc 20
<210> 77
<211> 18
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 77
cctggcaccc agcacaat 18
<210> 78
<211> 19
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 78
gggccggact cgtcatact 19
<210> 79
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 79
ggatgtccgt cagaacccat 20
<210> 80
<211> 21
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 80
ccctccagtg gtgtctcggt g 21
<210> 81
<211> 22
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 81
aaaccttctc attgaacatc cc 22
<210> 82
<211> 19
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 82
ccattgtgct tgacttgcc 19
<210> 83
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 83
aaccctggag caaactcaaa 20
<210> 84
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 84
aagaccaacc tgggcttttt 20
<210> 85
<211> 21
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 85
ctgcagacat tctctgggaa a 21
<210> 86
<211> 21
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 86
gcaccatgat tctgaagatg a 21
<210> 87
<211> 21
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 87
gggtagcaat aatctaaacc t 21
<210> 88
<211> 21
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 88
ccagttcttc aatagtaccc t 21
<210> 89
<211> 25
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 89
tctagtcctc cttaaccact tatct 25
<210> 90
<211> 22
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 90
gacacatggc ctcttctgta tc 22
<210> 91
<211> 18
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 91
ccagcactac cagcagca 18
<210> 92
<211> 18
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 92
aggactgggc gctaggtg 18
<210> 93
<211> 21
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 93
cacccggatt acaagtacca g 21
<210> 94
<211> 21
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 94
aaggtctcga tgttggagat g 21

Claims (3)

1. A method for rapidly constructing a TALE (transcription activator-like effector) vector, characterized by comprising the following steps:
(1) construction of circular 5-mers: this step consists of four monomer digestion-ligation reactions that can be carried out in parallel;
the first reaction contains the first, second, third, fourth and fifth base-determining monomers, the type IIS restriction enzyme BsaI and DNA ligase, and forms the first circular 5-mer after the digestion-ligation reaction;
the second reaction contains the sixth, seventh, eighth, ninth and tenth base-determining monomers, the type IIS restriction enzyme BsaI and DNA ligase, and forms the second circular 5-mer after the digestion-ligation reaction;
the third reaction contains the eleventh, twelfth, thirteenth and fourteenth base-determining monomers, the linker monomer dsDNA10.5, the type IIS restriction enzyme BsaI and DNA ligase, and forms the third circular 5-mer after the digestion-ligation reaction;
the fourth reaction contains the fifteenth, sixteenth, seventeenth and eighteenth base-determining monomers, the linker monomer dsDNA17.5, the type IIS restriction enzyme BsaI and DNA ligase, and forms the fourth circular 5-mer after the digestion-ligation reaction;
(2) construction of the final functional TALE expression vector: the four circular 5-mers prepared in step (1), a TALE framework vector, the type IIS restriction enzyme BsmBI and DNA ligase are subjected to a digestion-ligation reaction, efficiently forming the required final functional TALE expression vector;
the base-determining monomers are obtained by PCR amplification using the monomer plasmids of Addgene's Golden Gate TALEN and TAL Effector Kit 2.0 as templates, wherein the upstream primer for amplifying the first base-determining monomer is SEQ ID NO.3 and the downstream primer is SEQ ID NO.2;
the upstream primer for amplifying the second to fourth base-determining monomers is SEQ ID NO.1 and the downstream primer is SEQ ID NO.2;
the upstream primer for amplifying the fifth base-determining monomer is SEQ ID NO.1 and the downstream primer is SEQ ID NO.4;
the upstream primer for amplifying the sixth base-determining monomer is SEQ ID NO.5 and the downstream primer is SEQ ID NO.2;
the upstream primer for amplifying the seventh to ninth base-determining monomers is SEQ ID NO.1 and the downstream primer is SEQ ID NO.2;
the upstream primer for amplifying the tenth base-determining monomer is SEQ ID NO.1 and the downstream primer is SEQ ID NO.6;
the upstream primer for amplifying the eleventh to thirteenth base-determining monomers is SEQ ID NO.1 and the downstream primer is SEQ ID NO.2;
the upstream primer for amplifying the fourteenth base-determining monomer is SEQ ID NO.1 and the downstream primer is SEQ ID NO.7;
the upstream primer for amplifying the fifteenth base-determining monomer is SEQ ID NO.8 and the downstream primer is SEQ ID NO.2;
the upstream primer for amplifying the sixteenth and seventeenth base-determining monomers is SEQ ID NO.1 and the downstream primer is SEQ ID NO.2;
the upstream primer for amplifying the eighteenth base-determining monomer is SEQ ID NO.9 and the downstream primer is SEQ ID NO.10;
the sequences of the linker monomers dsDNA10.5 and dsDNA17.5 are shown in SEQ ID NO.13 and SEQ ID NO.14, respectively; the TALE framework vector is a TALE-VP64 framework vector or a TALE-VPR framework vector, whose sequences are shown in SEQ ID NO.19 and SEQ ID NO.20, respectively.
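The primer assignments and reaction groupings recited in claim 1 can be summarized as a small lookup. The following is only an illustrative sketch: the function and variable names are hypothetical and do not appear in the patent, and the numbers are SEQ ID NO. identifiers transcribed from the claim text, not sequences.

```python
# Illustrative lookup of claim 1; all names here are hypothetical.
# Values are SEQ ID NO. numbers as (upstream primer, downstream primer).
DEFAULT_PAIR = (1, 2)

# Monomer positions whose primers differ from the default pair.
SPECIAL_PAIRS = {
    1: (3, 2),
    5: (1, 4),
    6: (5, 2),
    10: (1, 6),
    14: (1, 7),
    15: (8, 2),
    18: (9, 10),
}

# Monomer positions pooled in each of the four BsaI digestion-ligation reactions.
REACTION_MONOMERS = {
    1: range(1, 6),
    2: range(6, 11),
    3: range(11, 15),
    4: range(15, 19),
}

# Reactions 3 and 4 contain a linker monomer in place of a fifth monomer.
REACTION_LINKER = {3: "dsDNA10.5", 4: "dsDNA17.5"}


def primer_pair(position):
    """Return (upstream, downstream) SEQ ID NO. for monomer 1..18."""
    if not 1 <= position <= 18:
        raise ValueError("monomer position must be between 1 and 18")
    return SPECIAL_PAIRS.get(position, DEFAULT_PAIR)


def reaction_components(reaction):
    """List the five DNA parts pooled in one of the four reactions."""
    parts = [f"monomer {p}" for p in REACTION_MONOMERS[reaction]]
    if reaction in REACTION_LINKER:
        parts.append(REACTION_LINKER[reaction])
    return parts


if __name__ == "__main__":
    for r in sorted(REACTION_MONOMERS):
        print(r, reaction_components(r))
```

Each reaction pools exactly five DNA parts, which matches the four 5-mers that step (2) then combines with the framework vector.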
2. The method for rapidly constructing a TALE vector according to claim 1, wherein the digestion-ligation reaction in step (1) is carried out by alternating multiple times between the optimal reaction temperatures of the type IIS restriction enzyme BsaI and the DNA ligase, so that multiple rounds of digestion-ligation occur and the required circular 5-mers are formed efficiently.
3. The method for rapidly constructing a TALE vector according to claim 1, wherein the digestion-ligation reaction in step (2) is carried out by alternating multiple times between the optimal reaction temperatures of the type IIS restriction enzyme BsmBI and the DNA ligase, so that multiple rounds of digestion-ligation occur and the required final functional TALE expression vector is formed efficiently.
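The temperature alternation described in claims 2 and 3 can be pictured as a simple repeating program. The sketch below is hypothetical: the claims do not state temperatures, hold times, round counts, or which DNA ligase is used, so every numeric value in the usage lines is a placeholder assumption rather than a condition taken from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    purpose: str    # "digest" or "ligate"
    temp_c: float   # hold temperature in degrees Celsius
    minutes: float  # hold time


def alternating_program(digest_temp_c, ligate_temp_c, rounds, digest_min, ligate_min):
    """Build a program that alternates digestion and ligation holds."""
    if rounds < 1:
        raise ValueError("at least one round is required")
    steps = []
    for _ in range(rounds):
        steps.append(Step("digest", digest_temp_c, digest_min))
        steps.append(Step("ligate", ligate_temp_c, ligate_min))
    return steps


# Placeholder values only (assumed, not from the claims): 37 C for the
# restriction enzyme and 16 C for the ligase, three rounds.
program = alternating_program(37.0, 16.0, rounds=3, digest_min=5.0, ligate_min=10.0)

if __name__ == "__main__":
    for step in program:
        print(f"{step.purpose:7s} {step.temp_c:5.1f} C for {step.minutes} min")
```

The same helper would cover step (2) by passing the optimal temperature of BsmBI instead of that of BsaI.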
CN201810723261.4A 2018-07-04 2018-07-04 TALE expression vector and rapid construction method and application thereof Active CN108949794B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810723261.4A CN108949794B (en) 2018-07-04 2018-07-04 TALE expression vector and rapid construction method and application thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810723261.4A CN108949794B (en) 2018-07-04 2018-07-04 TALE expression vector and rapid construction method and application thereof

Publications (2)

Publication Number Publication Date
CN108949794A (en) 2018-12-07
CN108949794B (en) 2021-06-01

Family

ID=64485722

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810723261.4A Active CN108949794B (en) 2018-07-04 2018-07-04 TALE expression vector and rapid construction method and application thereof

Country Status (1)

Country Link
CN (1) CN108949794B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2020352552A1 (en) * 2019-09-23 2022-03-17 Omega Therapeutics, Inc. Compositions and methods for modulating hepatocyte nuclear factor 4-alpha (HNF4α) gene expression
CN111235183B (en) * 2020-01-13 2021-07-27 Southeast University TALEN expression vector, rapid preparation method thereof, target gene and cell double-marker system and application
CN113584064B (en) * 2021-07-01 2023-07-21 Wuyi University Construction method of rapid TALE expression vector based on codon degeneracy
CN114717252A (en) * 2022-04-22 2022-07-08 Shenyang University TALE library expression vector and preparation method and application thereof

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103146735A (en) * 2012-12-28 2013-06-12 Northwest A&F University Construction method of TALE repetitive unit tetramer library and construction method and application of TALEN expression vector
CN103497966A (en) * 2013-09-27 2014-01-08 Shanghai Sidansai Biotechnology Co., Ltd. Single-module DNA (deoxyribonucleic acid) library and connecting method for TALENs (transcription activator-like effector nucleases) identification modules

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
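Design, Assembly, and Characterization of TALE-Based Transcriptional Activators and Repressors; Pratiksha I Thakore et al.; Methods Mol Biol; 2017-01-01; Vol. 1338; pp. 71-88 *
Transcription activator-like effector nuclease (TALEN)-mediated site-specific genome modification technology; Shen Yan et al.; Hereditas (Beijing) (《遗传》); 2013-02-22; Vol. 35, No. 4; pp. 395-409 *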

Similar Documents

Publication Publication Date Title
CN108949794B (en) TALE expression vector and rapid construction method and application thereof
CN114269919B (en) Targeted RNA editing with engineered RNA using endogenous ADAR
CN111448318A (en) Methods of modifying specificity of non-coding RNA molecules for silencing gene expression in eukaryotic cells
EP1100900B1 (en) Xiap ires and uses thereof
CA2708766A1 (en) Compositions and methods related to mrna translational enhancer elements
CN110295149B (en) Mutant strain 3 type duck hepatitis A virus CH-P60-117C strain and construction method thereof
KR20220044811A (en) Targeted trans-sequencing using CRISPR/CAS13
KR20220047623A (en) Compositions and methods for identifying modulators of cell type fate specification
KR20230045612A (en) KRAB fusion inhibitors and methods and compositions for inhibiting gene expression
WO2009092042A1 (en) Reprogramming of differentiated progenitor or somatic cells using homologous recombination
US20230212615A1 (en) Grna targeting ctgf gene and use thereof
CN114350615B (en) STAT2 gene deletion cell strain and preparation method and application thereof
CN109055375B (en) Method for activating gene expression by CRISPR (clustered regularly interspaced short palindromic repeats) auxiliary trans-enhancer and application of method
CN109321571A (en) A method of utilizing CRISPR/Cas9 preparation and reorganization porcine pseudorabies virus
CN109456991B (en) Protocatechuic acid regulated switch system, regulating method and application thereof
CN106591366A (en) Gene knockout test kit and method for rapidly screening sgRNA
CN114369619A (en) Reporter vector for gene knockout, vector system and application
CN111088282B (en) Application of AAVS1 and H11 safe harbor sites in recombinant expression protein
CN113549656A (en) Lentiviral vector expression system for polygene transformation
CN117015605A (en) Targeted RNA editing using engineered RNAs by utilizing endogenous ADAR
CN109456992B (en) Protocatechuic acid regulated multifunctional gene expression platform and application thereof
CN109371167A (en) Genetic elements and the application of frameshift mutation are generated for detecting CRISPR/Cas9 gene editing system cutting gene
CN114395579B (en) Gene I and gene III Japanese encephalitis virus infectious clone and construction method and application thereof
CN111662904A (en) Preparation of sgRNA and HaCaT cell models of human genome sequences at high-frequency integration sites of targeted high-risk HPV
KR102590276B1 (en) Fusion promoter for tissue-specific expression, and use Thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant