CN109971775A

CN109971775A - Nucleic acid construct and method for regulating protein synthesis using the same

Info

Publication number: CN109971775A
Application number: CN201810797876.1A
Authority: CN
Inventors: 郭敏; 王海鹏; 王静; 战荣荣; 徐开; 于雪
Original assignee: Kang Code (shanghai) Biological Technology Co Ltd
Current assignee: Kang Code (shanghai) Biological Technology Co Ltd
Priority date: 2017-12-27
Filing date: 2018-07-19
Publication date: 2019-07-05
Anticipated expiration: 2038-07-19
Also published as: CN109971775B

Abstract

The present invention provides a nucleic acid construct and a method for regulating protein synthesis using it. The DNA sequences encoding the 5'UTR and 3'UTR of an mRNA are modified by inserting a specific 3'CITE and its corresponding 5'-terminal sequence, respectively, so that the transcribed mRNA forms a cyclic structure. This helps ribosomes that have terminated translation reach the 5'UTR and quickly start a new round of translation. The hairpin structures at both ends also enhance the stability of the mRNA and act synergistically with the nucleic acid elements on the 5'UTR and the poly(A) structure on the 3'UTR, improving the efficiency of protein synthesis and translation.

Description

Nucleic acid construct and method for regulating protein synthesis using the same

Technical field

The present invention relates to the technical field of bioengineering, and in particular to a nucleic acid construct and a method for regulating protein synthesis using it.

Background technique

Proteins are important molecules in the cell and take part in almost all cellular functions. Differences in protein sequence and structure determine differences in function. In the cell, proteins act as enzymes that catalyze biochemical reactions, as signaling molecules that coordinate the activities of the organism, and as biological scaffolds that store energy, transport molecules, and enable the organism to move. In biomedicine, protein antibodies serve as targeted drugs and are an important means of treating diseases such as cancer.

Protein translation comprises four stages: translation initiation, translation elongation, translation termination, and ribosome recycling. Of these, initiation is the most heavily regulated and the rate-limiting step [1]. During translation initiation in bacteria, the 30S small subunit, mediated by three initiation factors, forms an initiation complex directly near the AUG initiation codon. Eukaryotic translation initiation is more complex: it requires 11 initiation factors, and the site at which the ribosome binds the mRNA when protein synthesis starts differs from that in prokaryotes [2]. Translation initiation of eukaryotic mRNAs bearing a "cap sequence" (cap structure, m⁷GpppN) proceeds mainly through a scanning mechanism that depends on the "cap sequence" [1-3]. The "cap sequence" recruits the translation initiation factor eIF4F, which in turn recruits the downstream 43S translation pre-initiation complex (PIC), comprising the ribosomal 40S small subunit and the translation initiation factors eIF1, eIF3 and eIF5. A ternary complex containing the translation initiation factor eIF2 and the initiator tRNA (Met-tRNAi) is recruited to the mRNA at the same time. The recruited complex scans in the 5'-to-3' direction and begins translation after finding the first AUG initiation codon.
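The scanning step described above can be illustrated with a minimal Python sketch. It is a deliberate simplification: it only locates the first AUG reading 5' to 3' and collects codons up to the first stop codon, ignoring sequence context, initiation factors and leaky scanning. The function names and the toy sequence are hypothetical and not taken from the patent.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def first_aug(mrna: str) -> int:
    """Return the 0-based index of the first AUG reading 5'->3', or -1 if absent."""
    seq = mrna.upper().replace("T", "U")  # accept DNA-style input as well
    return seq.find("AUG")


def open_reading_frame(mrna: str) -> str:
    """Return the codons from the first AUG up to (not including) the first in-frame stop."""
    seq = mrna.upper().replace("T", "U")
    start = seq.find("AUG")
    if start < 0:
        return ""
    codons = []
    for i in range(start, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            break
        codons.append(codon)
    return "".join(codons)


if __name__ == "__main__":
    toy_mrna = "GGCAAUGGCUUAAGG"  # hypothetical toy sequence
    print(first_aug(toy_mrna), open_reading_frame(toy_mrna))
```

The sketch treats the mRNA as a plain string, which is enough to show why the position of the first AUG relative to the 5' end matters for a scanning complex.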

In addition to the "cap sequence", most eukaryotic mRNAs carry a polyadenosine chain [poly(A)] at the 3' end, which can be bound by poly(A)-binding protein (pAB1). Besides binding the 3' poly(A) of the mRNA, pAB1 can also bind the translation initiation factor eIF4G. The pAB1 binding site on eIF4G differs from the binding site of eIF4E, which binds the 5' "cap sequence" of the mRNA, so pAB1 and eIF4E can bind eIF4G simultaneously. Mediated by eIF4E, eIF4G and pAB1, the 5' and 3' ends of the mRNA are brought into close spatial proximity and form a cyclic structure, which agrees with electron micrographs showing circular polysomes [4]. The functional significance of eukaryotic mRNA cyclization has not been settled. One possible explanation is that it helps the ribosome remain associated with the mRNA after completing translation, so that the ribosome transfers directly from the 3' end to the 5' end, continues scanning, and restarts translation, which reduces the need to recycle and reassemble ribosomal subunits and translation initiation factor complexes [2].

mRNA translation in vivo is regulated by many internal factors (such as energy charge and amino acids) and external factors (such as nutrition and stress). Part of this regulation is transmitted through different signaling pathways, which regulate the phosphorylation level of the translation initiation factors (eIFs) and the activity of related proteins. Take the target of rapamycin (TOR) pathway, a key signaling pathway in the regulation of protein translation and synthesis, as an example: one of the substrates of mammalian target of rapamycin complex 1 (mTORC1) is the family of translation repressors known as 4E-binding proteins (4E-BPs), small proteins that compete with eIF4G for binding to eIF4E. Activated mTORC1 phosphorylates specific threonine residues on 4E-BP1, causing 4E-BP1 to dissociate from eIF4E and thereby promoting formation of the eIF4F complex [9]. mRNA cyclization mediated by eIF4F involves many factors, and the regulation it is subject to from different pathways limits the initiation rate of protein synthesis and translation.

In yeast, a major mRNA degradation pathway first deadenylates the mRNA, then decaps it, and finally degrades the mRNA from the 5' end to the 3' end. In addition, the 3' end of a deadenylated mRNA is exposed and can therefore be degraded from the 3' end to the 5' end [10,11]. For many eukaryotic mRNAs, the first step of degradation is deadenylation. A template carrying poly(A) may therefore be rather unstable in a cell extract, because it is easily deadenylated and then degraded.

An in vitro biosynthesis system is a lysate extract of bacterial, fungal, plant or animal cells to which exogenous coding nucleic acid (DNA or RNA), substrates and an energy source are added, so that a specific chemical molecule or biological macromolecule (DNA, RNA or protein) is synthesized efficiently in vitro. In an in vitro protein synthesis system, a cell extract is supplemented with an exogenous mRNA or DNA template to express a foreign recombinant protein rapidly and efficiently. In vitro protein synthesis and expression systems prepared from prokaryotic or eukaryotic host cells have both been commercialized. Because 7-methylguanosine, the raw material for synthesizing the 5' cap, is expensive and capping efficiency is low, the mRNA used as the translation template in eukaryotic protein synthesis systems is usually designed without a 5' "cap sequence". In that case a specific RNA structure in the 5' untranslated region (5'UTR) is generally required to initiate translation.

Currently, the RNA structures commonly used for this purpose are internal ribosome entry sites (IRES) and certain translational enhancers of viral origin, such as the Omega sequence (Ω sequence) of tobacco mosaic virus. However, because the 5' "cap sequence" is missing, the mRNA used for protein synthesis in an in vitro biosynthesis system can hardly form a cyclic structure. The ribosome must therefore dissociate from the mRNA after each round of synthesis and be recruited again through the 5'UTR before the next round of translation can start. Moreover, mRNA carrying poly(A) is unstable: enzymes in the cell lysate can deadenylate it, after which it is easily degraded. Both effects seriously reduce the efficiency of protein synthesis.

In conclusion the efficiency of the protein translation synthesis of the current external biological synthesis system in this field is also lower, it is difficult to Satisfactory, therefore, this field needs to develop a kind of new nucleic acid constructs, by the transcription of the nucleic acid constructs and turns over It translates, regulates and controls the synthesis of foreign protein, improve the efficiency of the foreign protein translation synthesis.

Summary of the invention

The purpose of the present invention is to provide a new nucleic acid construct that can significantly improve the efficiency of protein translation and synthesis in an in vitro biosynthesis system. Through RNA-RNA interaction, the nucleic acid construct of the invention produces an mRNA that cyclizes directly, i.e. a crRNA (circular RNA), thereby significantly improving the efficiency of protein synthesis in the in vitro biosynthesis system.

A first aspect of the present invention provides a nucleic acid construct having, from 5' to 3', the structure shown in formula A:

Z1-Z2-Y1-Z3-Z4-Y2-Z5 (A)

In the formula,

Z1 is a promoter element;

Z2 and Z4 are cyclisation elements which, through direct RNA-RNA interaction, cause the mRNA transcription product corresponding to formula A to form a cyclized structure; the cyclisation elements are selected from the group consisting of: TED, BTE, PTE, I-type structural elements, Y-type structural elements, T-type structural elements, other elements capable of direct RNA-RNA interaction, and combinations thereof;

Y1 is absent or is the element shown as "Z6-Z7-Z8";

Z3 is the coding sequence of a foreign protein;

Y2 is absent or is Z9;

Z5 is a terminator element;

Z6 is absent or is an enhancer element;

Z7 is absent or is the 5' leader sequence of tobacco mosaic virus (the Ω sequence);

Z8 is absent or is a Kozak sequence;

Z9 is a poly(A) chain (a chain of adenine deoxynucleotides);

Each "-" is independently a bond or a nucleotide linker sequence.
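As a reading aid, the element order of formula A can be sketched in Python. This is only an illustration under stated assumptions: the dictionary keys reuse the Z1-Z9 labels from the formula, the helper names `construct_layout` and `assemble` are hypothetical, and the sequences in the usage example are toy placeholders rather than real promoter, cyclisation or terminator sequences.

```python
from typing import Dict, List

# Order implied by formula A once Y1 is expanded to Z6-Z7-Z8 and Y2 to Z9.
FORMULA_A_ORDER: List[str] = ["Z1", "Z2", "Z6", "Z7", "Z8", "Z3", "Z4", "Z9", "Z5"]
# Elements that formula A never allows to be absent.
REQUIRED = {"Z1", "Z2", "Z3", "Z4", "Z5"}
# Upper bound on linker length stated in the text (1-20 nt).
MAX_LINKER_NT = 20


def construct_layout(elements: Dict[str, str]) -> List[str]:
    """Return the names of the elements that are present, in 5'->3' order."""
    unknown = set(elements) - set(FORMULA_A_ORDER)
    if unknown:
        raise ValueError(f"unknown elements: {sorted(unknown)}")
    present = {name for name, seq in elements.items() if seq}
    missing = REQUIRED - present
    if missing:
        raise ValueError(f"missing required elements: {sorted(missing)}")
    return [name for name in FORMULA_A_ORDER if name in present]


def assemble(elements: Dict[str, str], linker: str = "") -> str:
    """Join the present elements 5'->3', with an optional linker between neighbours."""
    if len(linker) > MAX_LINKER_NT:
        raise ValueError("linker longer than 20 nt")
    return linker.join(elements[name] for name in construct_layout(elements))


if __name__ == "__main__":
    toy = {"Z1": "TTT", "Z2": "GGG", "Z3": "ATGAAATAA", "Z4": "CCC", "Z5": "TTT"}
    print(construct_layout(toy), assemble(toy))
```

With only the five required elements the layout matches formula I; adding Z6, Z7, Z8 and Z9 gives the layout of formula II.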

In another preferred example, the direct RNA-RNA interaction includes a base-pairing interaction.

In another preferred example, the base pairing interaction includes complete base pairing or incomplete base pairing.

In another preferred example, the poly chain is 10-200 adenine bases long (10A-200A).

In another preferred example, the poly chain is selected from the group: 50A, 70A, 90A.

In another preferred example, each nucleotide linker sequence is 1-20 nt long, preferably 1-10 nt.

In another preferred example, the T-type structure element is Kl-TSS.

In another preferred example, the Y-type structure element is YSS.

In another preferred example, the promoter is selected from the group consisting of: T7 promoter, T3 promoter, SP6 promoter, or a combination thereof.

In another preferred example, the terminator element is selected from the group consisting of: T7 terminator, T3 terminator, SP6 terminator, or a combination thereof.

In another preferred example, Z6 and Z8 are derived from yeast.

In another preferred example, the yeast is selected from the group: saccharomyces cerevisiae, Kluyveromyces yeast, or combinations thereof.

In another preferred example, the Kluyveromyces yeast is selected from the group consisting of: Kluyveromyces lactis, Kluyveromyces marxianus, Kluyveromyces dobzhanskii, or a combination thereof.

In another preferred example, the enhancer is selected from the group consisting of: IRES element, RBS element, non-coding sequence, or a combination thereof.

In another preferred example, the enhancer element includes IRES element.

In another preferred example, the IRES element is derived from one or more cells selected from the group consisting of: prokaryotic cells and eukaryotic cells.

In another preferred example, the eukaryocyte includes higher eucaryotic cells.

In another preferred example, the IRES element includes endogenous IRES element and exogenous IRES element.

In another preferred example, the IRES element is derived from one or more cells selected from the group consisting of: human cells, Chinese hamster ovary (CHO) cells, insect cells, wheat germ cells, and rabbit reticulocytes.

In another preferred example, the IRES element is selected from the group: ScBOI1, ScFLO8, ScNCE102, ScMSN1, KlFLO8, KlNCE102, KlMSN1, KlBOI1, or combinations thereof.

In another preferred example, the Ω sequence includes a direct repeat module (ACAATTAC)_m and a (CAA)_p module.

In another preferred example, m is 1-6, preferably 2-4.

In another preferred example, p is 6-12, preferably 8-10.

In another preferred example, the number of (CAA)_p modules is 1-5, preferably 1-3.

In another preferred example, the (CAA)_p module further includes an optimized (CAA)_p module.
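The repeat notation (ACAATTAC)_m and (CAA)_p can be made concrete with a short Python sketch. It assumes, purely for illustration, a single (ACAATTAC)_m module followed directly by a single (CAA)_p module; the text does not fix that arrangement, and the function name `omega_modules` is hypothetical.

```python
def omega_modules(m: int, p: int) -> str:
    """Build (ACAATTAC)_m followed by (CAA)_p as a DNA string.

    The ranges checked here are the ones stated in the text (m: 1-6, p: 6-12).
    Placing the two modules back to back is an assumption of this sketch.
    """
    if not 1 <= m <= 6:
        raise ValueError("m must be between 1 and 6")
    if not 6 <= p <= 12:
        raise ValueError("p must be between 6 and 12")
    return "ACAATTAC" * m + "CAA" * p


if __name__ == "__main__":
    seq = omega_modules(m=2, p=8)
    print(len(seq), seq)
```

For m = 2 and p = 8 the resulting string is 8 × 2 + 3 × 8 = 40 nt long.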

In another preferred example, Z2 is a 3'CITE, and Z4 is the 5'-terminal sequence corresponding to that 3'CITE.

In another preferred example, the cyclisation element is from virus.

In another preferred example, the virus is selected from the group: PEMV, TBSV, HCRSV, STNV, or combinations thereof.

In another preferred example, the nucleic acid construct has the structure of formula I or formula II, where formula I and formula II are as described below.

In another preferred example, the sequence of the nucleic acid constructs is as shown in SEQ ID NO.:2-13,51-53.

In another preferred example, the sequence of the nucleic acid constructs is as shown in SEQ ID NO.:3,7-9,51-53.

In aspect 1a of the present invention, the nucleic acid construct contains a nucleic acid sequence having the structure shown in formula I:

Z1-Z2-Z3-Z4-Z5 (I)

In formula,

Z1 to Z5 are each elements constituting the construct;

Each "-" is independently a bond or a nucleotide linker sequence;

Z1 is a promoter element, preferably selected from the group consisting of: T7 promoter, T3 promoter, SP6 promoter, or a combination thereof;

Z4 and Z2 are cyclisation elements, preferably selected from the group consisting of: TED, BTE, PTE, I-type structural elements, Y-type structural elements, T-type structural elements, other elements capable of direct RNA-RNA interaction, or a combination thereof;

Z3 is the coding sequence of a foreign protein;

Z5 is a terminator element, preferably selected from the group consisting of: T7 terminator, T3 terminator, SP6 terminator, or a combination thereof.

In another preferred example, the coding sequence of the foreign protein comes from a prokaryote or a eukaryote.

In another preferred example, the coding sequence of the foreign protein comes from an animal, a plant, or a pathogen.

In another preferred example, the coding sequence of the foreign protein comes from a mammal, preferably a primate or a rodent, including human, mouse, and rat.

In another preferred example, the coding sequence of the foreign protein encodes a foreign protein selected from the group consisting of: luciferin protein or luciferase (such as firefly luciferase), green fluorescent protein, yellow fluorescent protein, aminoacyl-tRNA synthetase, glyceraldehyde-3-phosphate dehydrogenase, catalase, actin, variable regions of antibodies, luciferase mutants, α-amylase, enterocin A, hepatitis C virus E2 glycoprotein, insulin precursor, interferon αA, cytokines, interferon α2b, interleukin-1β, lysozyme, serum albumin, single-chain antibody fragment (scFv), transthyretin, tyrosinase, xylanase, or a combination thereof.

In another preferred example, the foreign protein is selected from the group consisting of: fluorescent protein or luciferase (such as firefly luciferase), green fluorescent protein, yellow fluorescent protein, aminoacyl-tRNA synthetase, glyceraldehyde-3-phosphate dehydrogenase, catalase, actin, variable regions of antibodies, luciferase mutants, α-amylase, enterocin A, hepatitis C virus E2 glycoprotein, insulin precursor, interferon αA, cytokines, interferon α2b, interleukin-1β, lysozyme, serum albumin, single-chain antibody fragment (scFv), transthyretin, tyrosinase, xylanase, or a combination thereof.

In aspect 1b of the present invention, a nucleic acid construct is provided, which contains a nucleic acid sequence having the structure shown in formula II:

Z1-Z2-Z6-Z7-Z8-Z3-Z4-Z9-Z5 (II)

In formula,

Z1 to Z9 are each elements constituting the construct;

Each "-" is independently a bond or a nucleotide linker sequence;

Z1 is a promoter element, preferably selected from the group consisting of: T7 promoter, T3 promoter, SP6 promoter, or a combination thereof;

Z4 and Z2 are cyclisation elements, preferably selected from the group consisting of: TED, BTE, PTE, I-type structural elements, Y-type structural elements, T-type structural elements, other elements capable of direct RNA-RNA interaction, or a combination thereof;

Z6 is an enhancer element, which preferably includes an IRES element;

Z7 is the 5' leader sequence of tobacco mosaic virus (the Ω sequence);

Z8 is a Kozak sequence, preferably derived from yeast;

Z3 is the coding sequence of a foreign protein;

Z9 is a poly chain of adenine deoxynucleotides [poly(A)], preferably selected from the group consisting of: 50A, 70A, 90A, or a combination thereof;

Z5 is a terminator element, preferably selected from the group consisting of: T7 terminator, T3 terminator, SP6 terminator, or a combination thereof.

A second aspect of the present invention provides an in vitro foreign protein synthesis system, comprising:

(a) the nucleic acid construct of the first aspect of the present invention; and

(b) a yeast cell extract.

In another preferred example, in the foreign protein synthesis system, the concentration of component (a) is 5-30 ng/μL, preferably 8-20 ng/μL, more preferably 10-16 ng/μL, based on the total volume of the foreign protein synthesis system.

In another preferred example, in the foreign protein synthesis system, the concentration (v/v) of component (b) is 20%-70%, preferably 30%-60%, more preferably 40%-50%, based on the total volume of the foreign protein synthesis system.
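The two concentration ranges above translate into pipetting volumes by simple arithmetic, sketched below in Python. The function name `reaction_plan`, the default values, and in particular the template stock concentration are hypothetical assumptions of this sketch; only the 5-30 ng/μL and 20%-70% (v/v) ranges come from the text.

```python
from typing import Dict


def reaction_plan(
    total_volume_ul: float,
    template_ng_per_ul: float = 12.0,        # final template concentration (assumed default)
    extract_fraction: float = 0.45,          # yeast extract as a v/v fraction (assumed default)
    template_stock_ng_per_ul: float = 300.0,  # hypothetical stock concentration
) -> Dict[str, float]:
    """Compute volumes (in microlitres) for one in vitro synthesis reaction."""
    if not 5.0 <= template_ng_per_ul <= 30.0:
        raise ValueError("template concentration outside 5-30 ng/uL")
    if not 0.20 <= extract_fraction <= 0.70:
        raise ValueError("extract fraction outside 20%-70% (v/v)")
    template_mass_ng = total_volume_ul * template_ng_per_ul
    template_volume = template_mass_ng / template_stock_ng_per_ul
    extract_volume = total_volume_ul * extract_fraction
    remaining = total_volume_ul - template_volume - extract_volume
    if remaining < 0:
        raise ValueError("template stock too dilute for this reaction volume")
    return {
        "template_ul": template_volume,
        "extract_ul": extract_volume,
        "other_components_ul": remaining,  # buffer, salts, energy system, and so on
    }


if __name__ == "__main__":
    print(reaction_plan(30.0, template_ng_per_ul=10.0, extract_fraction=0.5))
```

For a 30 μL reaction at 10 ng/μL template and 50% extract, the sketch allocates 1 μL of the assumed 300 ng/μL stock, 15 μL of extract, and 14 μL for the remaining components.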

In another preferred example, the yeast or yeast cells are selected from the group consisting of: Saccharomyces cerevisiae, Pichia yeast, Kluyveromyces, or a combination thereof; preferably, the yeast cells include Kluyveromyces, more preferably Kluyveromyces lactis.

In another preferred example, the yeast cell extract is an aqueous extract of yeast cells.

In another preferred example, the yeast cell extract is free of endogenous long-chain yeast nucleic acid molecules.

In another preferred example, the foreign protein synthetic system further includes one or more components selected from the group below:

(c1) magnesium ion；

(c2) potassium ion；

(c3) buffer；

(c4) energy-regenerating system；

(c5) polyethylene glycol；

(c6) for synthesizing the substrate of RNA；

(c7) it is used for the substrate of synthetic proteins；

(c8) RNA polymerase；

(c9) optional Exogenous Sucrose；

(c10) optional solvent, the solvent are water or aqueous solvent.

In another preferred example, the magnesium ions are derived from a magnesium ion source selected from the group consisting of: magnesium acetate, magnesium glutamate, or a combination thereof.

In another preferred example, the potassium ions are derived from a potassium ion source selected from the group consisting of: potassium acetate, potassium glutamate, or a combination thereof.

In another preferred example, the buffer is selected from the group consisting of: 4-hydroxyethyl piperazine ethanesulfonic acid (HEPES), tris(hydroxymethyl)aminomethane (Tris), or a combination thereof.

In another preferred example, the energy-regenerating system is selected from the group consisting of: the phosphocreatine/creatine phosphokinase system, the glycolysis pathway and its intermediate-product energy system, or a combination thereof.

In another preferred example, the substrates for synthesizing RNA include: nucleoside monophosphates, nucleoside triphosphates, or a combination thereof.

In another preferred example, the substrates for synthesizing protein include: 1-20 natural amino acids and unnatural amino acids.

In another preferred example, the nucleoside triphosphates are selected from the group consisting of: adenosine triphosphate, guanosine triphosphate, cytidine triphosphate, uridine triphosphate, or a combination thereof.

In another preferred example, the amino acids are selected from the group consisting of: glycine, alanine, valine, leucine, isoleucine, phenylalanine, proline, tryptophan, serine, tyrosine, cysteine, methionine, asparagine, glutamine, threonine, aspartic acid, glutamic acid, lysine, arginine, histidine, or a combination thereof.

In another preferred example, the amino acid includes D type amino acid and/or L-type amino acid.

In another preferred example, the polyethylene glycol is selected from the group: PEG3000, PEG8000, PEG6000, PEG3350, Or combinations thereof.

In another preferred example, the polyethylene glycol includes polyethylene glycol with a molecular weight (Da) of 200-10000, preferably polyethylene glycol with a molecular weight of 3000-10000.

In another preferred example, the RNA polymerase is T7 RNA polymerase.

In another preferred example, the yeast cell extract is prepared with method comprising the following steps:

(i) yeast cells is provided；

(ii) washing the yeast cells to obtain washed yeast cells;

(iii) disrupting the washed yeast cells to obtain a crude yeast extract; and

(iv) subjecting the crude yeast extract to solid-liquid separation and collecting the liquid fraction, which is the yeast cell extract.

In another preferred example, the foreign protein synthesis system includes ingredients selected from the group consisting of: polyethylene glycol, 4-hydroxyethyl piperazine ethanesulfonic acid, potassium acetate, magnesium acetate, nucleoside triphosphates, amino acids, phosphocreatine, dithiothreitol (DTT), creatine phosphokinase, sucrose, the nucleic acid construct of the first aspect of the present invention (with firefly luciferase or green fluorescent protein as the foreign protein), T7 RNA polymerase, yeast cell extract, or a combination thereof.

A third aspect of the present invention provides a vector or vector combination containing the nucleic acid construct of the first aspect of the present invention.

A fourth aspect of the present invention provides a genetically engineered cell, wherein the construct of the first aspect of the present invention is integrated at one or more sites of the genome of the genetically engineered cell, or the genetically engineered cell contains the vector or vector combination of the third aspect of the present invention.

In another preferred example, the genome includes Matrix attachment region, cytoplasmic skeleton, and/or organelle gene group.

In another preferred example, the organelle includes mitochondria and/or chloroplasts.

In another preferred example, the genetically engineered cell includes prokaryotic cell, eukaryocyte.

In another preferred example, the eukaryocyte includes higher eucaryotic cells.

In another preferred example, the genetically engineered cell is selected from the group consisting of: human cells (such as HeLa cells), Chinese hamster ovary cells, insect cells, wheat germ cells, rabbit reticulocytes, yeast cells, or a combination thereof.

In another preferred example, the genetically engineered cell is yeast cells.

In another preferred example, the yeast cells are selected from the group consisting of: Saccharomyces cerevisiae, Kluyveromyces yeast, or a combination thereof.

In another preferred example, the Kluyveromyces yeast is selected from the group consisting of: Kluyveromyces lactis, Kluyveromyces marxianus, Kluyveromyces dobzhanskii, or a combination thereof.

A fifth aspect of the present invention provides a kit, wherein the reagents included in the kit are one or more selected from the group consisting of:

(a) the construct of the first aspect of the present invention;

(b) the vector or vector combination of the third aspect of the present invention;

(c) the genetically engineered cell of the fourth aspect of the present invention.

In another preferred example, the kit further includes (d) a eukaryotic in vitro biosynthesis system (such as a eukaryotic in vitro protein synthesis system).

In another preferred example, the eukaryotic in vitro biosynthesis system is selected from the group consisting of: a yeast in vitro biosynthesis system, a Chinese hamster ovary cell in vitro biosynthesis system, an insect cell in vitro biosynthesis system, a HeLa cell in vitro biosynthesis system, or a combination thereof.

In another preferred example, the eukaryotic in vitro biosynthesis system includes a eukaryotic in vitro protein synthesis system.

In another preferred example, the eukaryotic in vitro protein synthesis system is selected from the group consisting of: a yeast in vitro protein synthesis system, a Chinese hamster ovary cell in vitro protein synthesis system, an insect cell in vitro protein synthesis system, a HeLa cell in vitro protein synthesis system, or a combination thereof.

In another preferred example, the kit further includes (e) a yeast in vitro biosynthesis system (such as a yeast in vitro protein synthesis system).

In another preferred example, the yeast in vitro biosynthesis system (such as a yeast in vitro protein synthesis system) is a Kluyveromyces in vitro biosynthesis system (such as a Kluyveromyces in vitro protein synthesis system), preferably a Kluyveromyces lactis in vitro biosynthesis system (such as a Kluyveromyces lactis in vitro protein synthesis system).

A sixth aspect of the present invention provides a use of the construct of the first aspect or second aspect of the present invention, the vector or vector combination of the third aspect of the present invention, the genetically engineered cell of the fourth aspect of the present invention, or the kit of the fifth aspect of the present invention, for high-throughput protein synthesis in vitro.

A seventh aspect of the present invention provides a high-throughput in vitro method for synthesizing a foreign protein, comprising the steps of:

(i) providing the nucleic acid construct of the first aspect of the present invention in the presence of a eukaryotic in vitro biosynthesis system; and

(ii) incubating the eukaryotic in vitro biosynthesis system of step (i) under suitable conditions for a period of time T1, thereby synthesizing the foreign protein.

In another preferred example, the method further comprises: (iii) optionally separating or detecting the foreign protein from the eukaryotic in vitro biosynthesis system.

In another preferred example, the eukaryotic in vitro biosynthesis system is a yeast in vitro biosynthesis system (such as a yeast in vitro protein synthesis system).

In another preferred example, the foreign protein synthesis method includes a high-throughput in vitro foreign protein synthesis method.

In another preferred example, in step (ii), the reaction temperature is 20-37 °C, preferably 22-35 °C.

In another preferred example, in step (ii), the reaction time is 1-72 h, preferably 2-23 h.
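The temperature and time windows for step (ii) can be captured in a small Python check, shown here as a sketch. The class name `IncubationConditions` and its methods are hypothetical; the numeric bounds are the ones stated above (20-37 °C and 1-72 h, with 22-35 °C and 2-23 h preferred).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IncubationConditions:
    """Incubation settings for step (ii); names are illustrative only."""
    temperature_c: float
    time_h: float

    def in_stated_range(self) -> bool:
        # Broad ranges given in the text: 20-37 degrees C and 1-72 h.
        return 20.0 <= self.temperature_c <= 37.0 and 1.0 <= self.time_h <= 72.0

    def in_preferred_range(self) -> bool:
        # Preferred ranges given in the text: 22-35 degrees C and 2-23 h.
        return 22.0 <= self.temperature_c <= 35.0 and 2.0 <= self.time_h <= 23.0


if __name__ == "__main__":
    run = IncubationConditions(temperature_c=30.0, time_h=3.0)
    print(run.in_stated_range(), run.in_preferred_range())
```

A setting such as 37 °C for 48 h falls inside the broad window but outside the preferred one, which the two methods report separately.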

It should be understood that, within the scope of the present invention, the technical features described above and the technical features specifically described below (e.g. in the embodiments) can be combined with one another to form new or preferred technical solutions. For reasons of space, they are not described one by one here.

Description of the drawings

Fig. 1 shows the key role of the 5'UTR sequence in protein translation initiation during biosynthesis. It is a schematic diagram of the Omega sequence in the 5'UTR recruiting the 40S ribosomal subunit with the assistance of rho factor during cap-independent translation initiation.

Fig. 2 shows how different types of 3' cap-independent translation enhancers (CITE) differ in their effect on protein synthesis efficiency in an in vitro protein synthesis system. Four 3'CITEs selected from plant viruses (kl-TSS, YSS, PTE, TED) were inserted into the 3'UTR of the pD2P-1.0 plasmid (obtained from Kang Code (Shanghai) Biological Technology Co Ltd; sequence shown in SEQ ID NO.:1), giving four plasmid templates: pD2P-1.0C001, pD2P-1.0C002, pD2P-1.0C003 and pD2P-1.0C004 (plasmid template sequences SEQ ID NO.:2-5). These were applied in the yeast in vitro protein synthesis system, and the relative light unit value (RLU) of the luciferase synthesized was measured. PC is the RLU of luciferase synthesized from the pD2P-1.0 plasmid without an added 3'CITE.

Fig. 3 shows the difference in in vitro protein synthesis efficiency when 3'CITEs and 5' hairpin structure elements are applied together. PC is the relative light unit value (RLU) of luciferase synthesized when the pD2P-1.0 plasmid is applied in the yeast in vitro protein synthesis system. The four 5' hairpin structure sequences were added at the front end and the rear end of the Ω sequence of the four template plasmids already carrying a 3'CITE, giving eight plasmid templates, pD2P-1.0C005~pD2P-1.0C0012 (plasmid template sequences SEQ ID NO.:6-13).

Fig. 4 shows the difference in in vitro protein synthesis efficiency when 3'CITEs and 5' hairpin structure elements are applied together. PC is the relative fluorescence unit value (RFU) of enhanced green fluorescent protein (eGFP) synthesized when the pD2P-1.1 plasmid (obtained from Kang Code (Shanghai) Biological Technology Co Ltd; sequence shown in SEQ ID NO.:50) is applied in the yeast in vitro protein synthesis system. Using pD2P-1.0C006, pD2P-1.0C007 and pD2P-1.0C008 as cloning vector templates, the encoded protein sequence was changed from firefly luciferase to green fluorescent protein, giving the three plasmids pD2P-1.1C001, pD2P-1.1C002 and pD2P-1.1C003 (plasmid template sequences SEQ ID NO.:51-53). These were applied in the yeast in vitro protein synthesis system, and the RFU of the eGFP synthesized was measured.

Fig. 5 is a schematic diagram of the principle by which the nucleic acid construct of the invention improves in vitro protein translation initiation efficiency. In addition to the 5'UTR elements, the 3'CITE can also recruit the eukaryotic translation initiation factor complex eIF4F at the 3' end, so that the small ribosomal subunit reaches the 5' end of the RNA. The 3'CITE also interacts with its corresponding 5' hairpin structure, so that the mRNA is cyclized.

Specific embodiment

After extensive and in-depth research, and through extensive screening and exploration, the inventors unexpectedly discovered for the first time a new method that circularizes mRNA and thereby enhances the efficiency of protein synthesis in an in vitro biosynthesis system. Inserting cyclization elements (such as a 3'CITE and its corresponding 5' hairpin structure) at the two ends of the mRNA in the in vitro biosynthesis system lets the mRNA form a more stable circular structure. This not only allows ribosomes that have finished translation to reach the 5'UTR and quickly start a new round of translation, but also uses the hairpin structures at both ends to enhance mRNA stability, and can act synergistically with the RNA elements on the 5'UTR that recruit translation initiation factors, further increasing the efficiency of protein translation. On this basis, the inventors completed the present invention.

Terms

As used herein, the terms "poly(A)" and "poly(A)n" are used interchangeably and refer to a poly-adenine deoxynucleotide chain or polyA tail, where n is a positive integer ≥ 10, preferably n ≥ 30. Typically n is 10-200, preferably 30-90.

As used herein, the term "nucleic acid construct of the present invention" refers to a nucleotide sequence having the structure of Formula A, Formula I or Formula II. The nucleic acid construct of the present invention may be DNA, RNA or another nucleic acid analog.

As used herein, the phrase "the cyclization elements Z2 and Z4 act directly through RNA-RNA interaction so that the mRNA forms a cyclized structure" means: (i) when the nucleic acid construct of the present invention is RNA, Z2 and Z4 are cyclization elements in RNA form, and they therefore act directly through RNA-RNA interaction so that the mRNA forms a cyclized structure; (ii) when the nucleic acid construct of the present invention is DNA, Z2 and Z4 are cyclization elements in DNA form, and after the DNA-form nucleic acid construct is transcribed, the RNA-form cyclization elements corresponding to Z2 and Z4 act directly through RNA-RNA interaction so that the mRNA forms a cyclized structure. It should be understood that, in the present invention, the Z2 and Z4 elements are a pair of elements that together, through direct RNA-RNA interaction, cause the transcription product (mRNA) to form a cyclized structure.
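The length ranges in this definition can be checked mechanically. The following is a minimal Python sketch, not part of the disclosed method; the function names and the range labels are illustrative assumptions, and only the numeric bounds (n ≥ 10, usually 10-200, preferably 30-90) come from the definition above.

```python
import re


def poly_a_length(seq: str) -> int:
    """Return the length of the run of A residues at the 3' end of a sequence."""
    match = re.search(r"A+$", seq.upper())
    return len(match.group(0)) if match else 0


def classify_poly_a(n: int) -> str:
    """Map a tail length n onto the ranges given in the definition (labels are hypothetical)."""
    if n < 10:
        return "too short (n < 10)"
    if 30 <= n <= 90:
        return "preferred (30-90)"
    if n <= 200:
        return "usual (10-200)"
    return "n >= 10, outside the usual range"


# Toy sequence: an arbitrary 3-nt stub followed by 40 adenines.
tail = poly_a_length("GGC" + "A" * 40)
print(tail, classify_poly_a(tail))
```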

Construction of the theoretical framework:

1. Scientific background

Besides the "cap structure"-dependent protein translation of eukaryotes, "cap structure"-independent modes of translation initiation also exist in nature. The ribosome binding site (RBS) on bacterial mRNA recruits ribosomes directly with the help of translation initiation factors. Certain special RNA structures on the mRNAs of some viruses and of certain eukaryotic cells can recruit the translation initiation factors of the eukaryotic cell and thereby use its protein translation machinery; examples include internal ribosome entry sites (IRES) and the Ω sequence of tobacco mosaic virus. These special RNA elements are usually located at the 5' end of the mRNA, typically in the 5'UTR [6,7].

The 3'UTR also plays an important regulatory role in mRNA stability and translational control. The 5' end of the genomic RNA of many plant positive-strand RNA viruses carries no special modification and is simply a free phosphate group. An important feature of this class of viral genomic RNA is that its 3'UTR contains cap-independent translation enhancers (CITE) [8]. A 3'CITE can functionally substitute for the 5' "cap structure", allowing ribosomes to enter at the 5' end of the mRNA and scan.

2. Basic principle

A 3'CITE has two important features. It can recruit subunits of the translation initiation factor complex eIF4F in eukaryotic cells, and it can interact, through a high-affinity RNA-RNA interaction (kissing-loop interaction), with a hairpin structure (hairpin loop) located in the 5'UTR. This brings the recruited translation initiation factors to the 5' end of the RNA and thereby initiates protein translation [8].

According to differences in sequence and secondary structure, the 3'CITEs found so far fall into six types: the translation enhancer domain (TED), first found in satellite tobacco necrosis virus (STNV); the barley yellow dwarf virus-like element (BTE); the panicum mosaic virus-like element (PTE); the I-shaped structure element; the Y-shaped structure element; and the T-shaped structure element [8]. Except for the T-shaped structure element, the hairpin structure of a 3'CITE sequence generally interacts directly with the corresponding hairpin structure sequence in the 5'UTR, so that the mRNA forms a relatively stable circular structure.

In an in vitro biosynthesis system, integrating a 3'CITE and its corresponding 5'UTR hairpin structure at the two ends of an RNA lacking a "cap structure" circularizes the RNA directly through RNA-RNA interaction. The newly formed circular RNA (crRNA) allows ribosomes that have finished translation to reach the 5' end of the RNA again directly, without dissociating. The stable circular structure can also protect the RNA from degradation by RNases, increasing its stability and further improving the efficiency of protein synthesis. In addition, by binding the translation initiation factor complex eIF4F, certain 3'CITEs can recruit other translation initiation factors and complexes and can cooperate with other RNA elements on the 5'UTR and with the poly(A) element on the 3'UTR to increase the efficiency of translation initiation [5].
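The RNA-RNA interaction between the 3'CITE hairpin and its 5' partner relies on complementary loop bases. As a rough illustration only, the Python sketch below finds the longest stretch of one loop that can form Watson-Crick pairs with the other. The two loop sequences are hypothetical and are not taken from the sequence listing; the sketch ignores G-U wobble pairs and three-dimensional structure, so it is no substitute for a proper RNA structure prediction tool.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA string (Watson-Crick pairs only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def longest_pairing(loop_5p: str, loop_3p: str) -> str:
    """Longest segment of loop_5p whose reverse complement occurs in loop_3p."""
    a = loop_5p.upper().replace("T", "U")
    b = loop_3p.upper().replace("T", "U")
    best = ""
    for i in range(len(a)):
        for j in range(i + 1, len(a) + 1):
            segment = a[i:j]
            if len(segment) > len(best) and reverse_complement(segment) in b:
                best = segment
    return best


# Hypothetical loop sequences, chosen only to demonstrate the check.
print(longest_pairing("AAGCCAGU", "UUCUGGCAA"))  # → GCCAG
```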

In vitro exogenous protein synthesis system

In the present invention, the in vitro exogenous protein synthesis system is a coupled transcription-translation system based on eukaryotic cells. It can synthesize RNA starting from a DNA template, or complete the in vitro synthesis of protein using DNA or RNA as a template. The eukaryotic cells include yeast cells, rabbit reticulocytes, wheat germ cells, insect cells, human cells and the like. A eukaryotic in vitro biosynthesis system has the advantages of being able to synthesize RNA or proteins with complex structures and to carry out post-translational modification of proteins.

In the present invention, the in vitro exogenous protein synthesis system is not particularly limited. A preferred in vitro exogenous protein synthesis system is a yeast in vitro biosynthesis system, preferably a yeast in vitro protein synthesis system, more preferably a Kluyveromyces expression system, and most preferably a Kluyveromyces lactis expression system.

Yeast combines the advantages of simple culture, efficient protein folding and post-translational modification. Among yeasts, Saccharomyces cerevisiae and Pichia pastoris are model organisms for expressing complex eukaryotic proteins and membrane proteins, and yeast can also serve as the raw material for preparing in vitro translation systems.

Kluyveromyces is a genus of ascosporogenous yeast; Kluyveromyces marxianus and Kluyveromyces lactis are yeasts widely used in industry. Compared with other yeasts, Kluyveromyces lactis has many advantages, such as a very strong secretion capacity, good large-scale fermentation characteristics, food-grade safety, and the capacity for post-translational modification of proteins.

In the present invention, the in vitro exogenous protein synthesis system comprises:

(a) the nucleic acid construct according to the first aspect of the present invention；and

(b) yeast cell extract.

In a preferred embodiment, the in vitro exogenous protein synthesis system further comprises one or more components selected from the following group:

(c1) magnesium ion；

(c2) potassium ion；

(c3) buffer；

(c4) energy-regenerating system；

(c5) polyethylene glycol；

(c6) for synthesizing the substrate of RNA；

(c7) it is used for the substrate of synthetic proteins；

(c8) RNA polymerase；

(c9) optional exogenous sucrose；

(c10) optional solvent, the solvent are water or aqueous solvent.

In the present invention, the RNA polymerase is not particularly limited and may be one or more RNA polymerases; a typical RNA polymerase is T7 RNA polymerase.

In the present invention, the proportion of the yeast cell extract in the in vitro biosynthesis system is not particularly limited. Typically the yeast cell extract accounts for 20-70% of the in vitro biosynthesis system, preferably 30-60%, more preferably 40-50%.

In the present invention, the yeast cell extract contains no intact cells. A typical yeast cell extract includes the various RNA polymerases needed for RNA synthesis, together with the ribosomes, tRNAs and aminoacyl-tRNA synthetases used for protein translation, and the initiation factors, elongation factors and termination release factors needed for protein synthesis. The yeast cell extract also contains some other proteins from the cytoplasm of the eukaryotic cell, especially soluble proteins.

In the present invention, the protein content of the yeast cell extract is 20-100 mg/mL, preferably 50-100 mg/mL. Protein content is determined by the Coomassie brilliant blue assay.

In the present invention, the method of preparing the yeast cell extract is not limited. A preferred preparation method comprises the following steps:

(i) providing yeast cells；

(ii) washing the yeast cells to obtain washed yeast cells；

(iii) disrupting the washed yeast cells to obtain a crude yeast cell extract；

(iv) subjecting the crude yeast cell extract to solid-liquid separation; the liquid fraction obtained is the yeast cell extract.

In the present invention, the magnesium ions are derived from a magnesium ion source selected from the following group: magnesium acetate, magnesium glutamate, or a combination thereof. The potassium ions are derived from a potassium ion source selected from the following group: potassium acetate, potassium glutamate, or a combination thereof. The buffer is selected from the following group: 4-hydroxyethylpiperazine ethanesulfonic acid (HEPES), tris(hydroxymethyl)aminomethane (Tris), or a combination thereof.

In the present invention, the energy regeneration system is selected from the following group: a phosphocreatine/creatine phosphokinase system, the glycolytic pathway and its intermediate-product energy systems, or a combination thereof.

In the present invention, the substrates for synthesizing RNA include nucleoside monophosphates, nucleoside triphosphates, or a combination thereof. The substrate for synthesizing protein is an amino acid mixture comprising 1-20 natural amino acids and unnatural amino acids.

The nucleoside triphosphates are selected from the following group: adenosine triphosphate, guanosine triphosphate, cytidine triphosphate, uridine triphosphate, or a combination thereof. In the present invention, the concentration of each mononucleotide is not particularly limited; typically the concentration of each mononucleotide is 0.5-5 mM, preferably 1.0-2.0 mM.

The amino acid mixture in the in vitro exogenous protein synthesis system may include natural or unnatural amino acids, and may include D- or L-amino acids. Representative amino acids include (but are not limited to) the 20 natural amino acids: glycine, alanine, valine, leucine, isoleucine, phenylalanine, proline, tryptophan, serine, tyrosine, cysteine, methionine, asparagine, glutamine, threonine, aspartic acid, glutamic acid, lysine, arginine and histidine. The concentration of each amino acid is usually 0.01-0.5 mM, preferably 0.02-0.2 mM, such as 0.05 mM, 0.06 mM, 0.07 mM or 0.08 mM.

In a preferred example, the in vitro exogenous protein synthesis system also contains polyethylene glycol or an analog thereof. The concentration of the polyethylene glycol or analog is not particularly limited; typically the concentration (w/v) is 0.1-8%, preferably 0.5-4%, more preferably 1-2%, based on the total weight of the biosynthesis system. PEG3000, PEG3350, PEG6000 or PEG8000 is preferred. It should be understood that the system of the present invention may also include polyethylene glycols of various other molecular weights, such as PEG200, 400, 1500, 2000, 4000, 10000 and so on.

In a preferred example, the in vitro exogenous protein synthesis system also contains sucrose. The concentration of sucrose is not particularly limited; typically it is 0.03-40 wt%, preferably 0.08-10 wt%, more preferably 0.1-5 wt%, based on the total weight of the protein synthesis system.

A particularly preferred in vitro exogenous protein synthesis system contains the following components: 22 mM 4-hydroxyethylpiperazine ethanesulfonic acid at pH 7-8, 30-150 mM potassium acetate, 1.0-5.0 mM magnesium acetate, 1.5-4 mM nucleoside triphosphate mixture, 0.08-0.24 mM amino acid mixture, 25 mM phosphocreatine, 1.7 mM dithiothreitol, 0.27 mg/mL creatine phosphokinase, 1%-4% polyethylene glycol, 0.5%-2% sucrose, 8-20 ng/μL of the nucleic acid construct of the present invention (with firefly luciferase or green fluorescent protein as the exogenous protein), 0.027-0.054 mg/mL T7 RNA polymerase, and 50% by volume yeast cell extract.
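Setting up such a reaction means converting final concentrations into pipetting volumes with C1·V1 = C2·V2. The Python sketch below illustrates this for four of the components with fixed final concentrations. The final concentrations and the 50% extract fraction come from the composition above; every stock concentration and the 20 μL reaction volume are assumptions chosen for illustration, and the components given as ranges are lumped into the remaining volume.

```python
def stock_volume_ul(final_conc: float, stock_conc: float, reaction_ul: float) -> float:
    """Volume of stock needed to reach final_conc in reaction_ul (C1*V1 = C2*V2)."""
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be positive and at least as concentrated as the target")
    return final_conc / stock_conc * reaction_ul


# name: (final concentration from the text, ASSUMED stock concentration), same unit per pair
COMPONENTS = {
    "HEPES": (22.0, 1000.0),                 # mM
    "phosphocreatine": (25.0, 500.0),        # mM
    "DTT": (1.7, 100.0),                     # mM
    "creatine phosphokinase": (0.27, 10.0),  # mg/mL
}
EXTRACT_FRACTION = 0.50  # yeast cell extract, 50% by volume


def plan_reaction(reaction_ul: float) -> dict:
    """Return pipetting volumes in microliters for one reaction."""
    volumes = {
        name: stock_volume_ul(final, stock, reaction_ul)
        for name, (final, stock) in COMPONENTS.items()
    }
    volumes["yeast cell extract"] = EXTRACT_FRACTION * reaction_ul
    remaining = reaction_ul - sum(volumes.values())
    if remaining < 0:
        raise ValueError("stocks are too dilute for this reaction volume")
    # Salts, NTPs, amino acids, PEG, sucrose, template, polymerase and water share this volume.
    volumes["remaining volume"] = remaining
    return volumes


for name, vol in plan_reaction(20.0).items():
    print(f"{name}: {vol:.2f} uL")
```

Because the extract alone takes half the volume, the remaining components have to be supplied from fairly concentrated stocks, which is why the sketch raises an error when the volumes no longer fit.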

Ω sequence

As used herein, the term "Ω sequence" refers to the 5' leader sequence of the tobacco mosaic virus genome, which is the translational enhancer of this virus. The DNA sequence of Ω contains 68 base pairs and consists of 1-6 (preferably 2-4, more preferably 3) 8-base-pair direct repeat modules (ACAATTAC) and 1-5 (preferably 1-3, more preferably 1) (CAA)p modules, where p is 6-12, preferably 8-10. These two modules are critical for the translation-enhancing function of the Ω sequence. In the yeast in vitro protein synthesis system of the present invention, the Ω sequence can initiate "cap structure"-independent protein translation, a function that may be achieved by recruiting the translation initiation factor eIF4G. However, the efficiency with which the Ω sequence initiates protein translation is relatively low, so its composition needs to be optimized and combined with other DNA elements or proteins to improve the efficiency of protein translation.
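The two module types described here are simple enough to locate by pattern matching. The following Python sketch counts the ACAATTAC direct repeats and reports any (CAA)p run with p between 6 and 12. The test sequence is a hypothetical toy string, not the 68-bp Ω sequence, and the function name is an assumption.

```python
import re

REPEAT_MODULE = "ACAATTAC"  # 8-bp direct repeat module named in the text


def omega_modules(dna: str):
    """Return (number of ACAATTAC repeats, list of p values for (CAA)p runs with 6 <= p <= 12)."""
    seq = dna.upper()
    repeat_count = seq.count(REPEAT_MODULE)  # counts non-overlapping occurrences
    caa_runs = [len(m.group(0)) // 3 for m in re.finditer(r"(?:CAA)+", seq)]
    caa_modules = [p for p in caa_runs if 6 <= p <= 12]
    return repeat_count, caa_modules


# Hypothetical toy sequence: three direct repeats followed by one (CAA)8 run.
toy = "GT" + REPEAT_MODULE * 3 + "T" + "CAA" * 8 + "GG"
print(omega_modules(toy))  # → (3, [8])
```

Single CAA triplets that happen to occur inside the ACAATTAC repeats are filtered out by the lower bound on p.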

Kozak sequence

By analyzing the sequences upstream and downstream of the translation initiation codon (AUG) in known eukaryotic mRNA molecules, a consensus sequence was identified, which is called the Kozak sequence. The Kozak sequence has been shown to enhance the translation initiation efficiency of mRNA. Kozak sequences differ between species; for example, those of Saccharomyces cerevisiae cells and mammalian cells differ markedly.

In the present invention, the Kozak sequence used comprises an oligomeric chain of 6-12 (preferably 8-10) adenine deoxynucleotides, a translation initiation codon (such as ATG, ATA, ATT, GTG or TTG, preferably ATG) and a serine codon (such as TCT, TCC, TCA, TCG, AGT or AGC, preferably TCT), and is derived from Kluyveromyces (preferably Kluyveromyces lactis).
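The layout described above (an A run, then an initiation codon, then a serine codon) can be expressed as a regular expression. The Python sketch below is illustrative only: it uses just the codons listed explicitly in the text, although the text says "such as" and so allows others, and the example sequence and function name are hypothetical.

```python
import re

START_CODONS = ("ATG", "ATA", "ATT", "GTG", "TTG")
SERINE_CODONS = ("TCT", "TCC", "TCA", "TCG", "AGT", "AGC")

# 6-12 adenines, then an initiation codon, then a serine codon.
KOZAK_PATTERN = re.compile(
    r"(?P<a_run>A{6,12})"
    r"(?P<start>" + "|".join(START_CODONS) + r")"
    r"(?P<serine>" + "|".join(SERINE_CODONS) + r")"
)


def find_kozak(dna: str):
    """Return details of the first Kozak-like match, or None if there is none."""
    match = KOZAK_PATTERN.search(dna.upper())
    if match is None:
        return None
    return {
        "position": match.start(),
        "a_run_length": len(match.group("a_run")),
        "start_codon": match.group("start"),
        "serine_codon": match.group("serine"),
    }


# Hypothetical example with the preferred choices: 8 adenines, ATG, TCT.
print(find_kozak("GGC" + "A" * 8 + "ATG" + "TCT" + "GGT"))
```

Note that the first A of ATG is adjacent to the A run; the regex engine backtracks so that the run is reported as 8 adenines and the codon as ATG.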

Exogenous coding sequence (exogenous DNA)

As used herein, the terms "exogenous coding sequence" and "exogenous DNA" are used interchangeably, and both refer to an exogenous DNA molecule that directs the synthesis of RNA or protein. In general, the DNA molecule is linear or circular. The DNA molecule contains a sequence encoding an exogenous RNA or an exogenous protein.

In the present invention, the exogenous coding sequence includes (but is not limited to) genomic sequences and cDNA sequences. The exogenous coding sequence also contains a promoter sequence, a 5' untranslated sequence and a 3' untranslated sequence.

In the present invention, the choice of exogenous DNA is not particularly limited. In general, the exogenous DNA is selected from DNA encoding the following group: small non-coding RNA (sncRNA), long non-coding RNA (lncRNA), transfer RNA (tRNA), ribozymes such as the glucosamine-6-phosphate synthase (glmS) ribozyme, small nuclear RNA (snRNA), complexes of RNA and protein such as the spliceosome, various other non-coding RNAs, or a combination thereof.

The exogenous DNA may also be selected from the following group: exogenous DNA encoding luciferin protein or luciferase (such as firefly luciferase), green fluorescent protein, yellow fluorescent protein, aminoacyl-tRNA synthetase, glyceraldehyde-3-phosphate dehydrogenase, catalase, actin or the variable region of an antibody; DNA of a luciferase mutant; or a combination thereof.

The exogenous DNA may also be selected from the following group: exogenous DNA encoding α-amylase, enterocin A, hepatitis C virus E2 glycoprotein, insulin precursor, interferon αA, interleukin-1β, lysozyme, serum albumin, single-chain antibody fragment (scFv), transthyretin, tyrosinase or xylanase, or a combination thereof.

In a preferred embodiment, the exogenous DNA encodes a protein selected from the following group: enhanced green fluorescent protein (enhanced GFP, eGFP), yellow fluorescent protein (YFP), Escherichia coli β-galactosidase (LacZ), human lysine-tRNA synthetase, human leucine-tRNA synthetase, Arabidopsis glyceraldehyde-3-phosphate dehydrogenase, mouse catalase, or a combination thereof.

Nucleic acid constructs

The present invention provides a nucleic acid construct having, from 5' to 3', the structure shown in Formula A:

Z1-Z2-Y1-Z3-Z4-Y2-Z5 (A)

In the formula,

Z1 is a promoter element；

Y1 is absent or an element of structure "Z6-Z7-Z8"；

Z3 is the coding sequence of an exogenous protein；

Y2 is absent or Z9；

Z5 is a terminator element；

Z6 is absent or an enhancer element；

Z8 is absent or a Kozak sequence；

Z9 is a poly-adenine deoxynucleotide chain；

each "-" is independently a bond or a nucleotide linker sequence.

In a preferred embodiment, the construct of the present invention has, from 5' to 3', the structure of Formula I:

Z1-Z2-Z3-Z4-Z5 (I)

In the formula,

each "-" is independently a bond or a nucleotide linker sequence；

Z3 is the coding sequence of an exogenous protein；

In another preferred embodiment, the nucleic acid construct of the present invention contains a nucleic acid sequence with the structure shown in Formula II:

Z1-Z2-Z6-Z7-Z8-Z3-Z4-Z9-Z5 (II)

In the formula,

each "-" is independently a bond or a nucleotide linker sequence；

Z3 is the coding sequence of an exogenous protein；
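Formula I and Formula II are both instances of Formula A: Formula I leaves Y1 and Y2 absent, while Formula II fills Y1 with Z6-Z7-Z8 and Y2 with Z9. A minimal Python sketch of this assembly logic follows. The bracketed placeholder strings stand in for real element sequences and are purely illustrative; treating the five Formula I elements as mandatory is an assumption of the sketch, not a statement of the claims.

```python
# Element order of Formula II; Formula I is the same order with Z6, Z7, Z8 and Z9 absent.
FORMULA_II_ORDER = ["Z1", "Z2", "Z6", "Z7", "Z8", "Z3", "Z4", "Z9", "Z5"]
REQUIRED = {"Z1", "Z2", "Z3", "Z4", "Z5"}  # elements appearing in Formula I (assumed mandatory)


def assemble(elements: dict, linker: str = "") -> str:
    """Join the supplied elements 5'->3' in formula order, skipping absent ones."""
    present = {name for name, seq in elements.items() if seq}
    missing = REQUIRED - present
    if missing:
        raise ValueError(f"missing required elements: {sorted(missing)}")
    parts = [elements[name] for name in FORMULA_II_ORDER if elements.get(name)]
    return linker.join(parts)


# Placeholder strings instead of real sequences.
formula_i = {name: f"[{name}]" for name in ["Z1", "Z2", "Z3", "Z4", "Z5"]}
formula_ii = {name: f"[{name}]" for name in FORMULA_II_ORDER}

print(assemble(formula_i))               # [Z1][Z2][Z3][Z4][Z5]
print(assemble(formula_ii, linker="-"))  # [Z1]-[Z2]-[Z6]-[Z7]-[Z8]-[Z3]-[Z4]-[Z9]-[Z5]
```

The `linker` argument mirrors the statement that each "-" may be a bond (empty string) or a nucleotide linker sequence; the sketch uses one linker for all junctions for brevity.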

In the present invention, the choice of coding sequence for the exogenous protein is not particularly limited. In general, the coding sequence of the exogenous protein is selected from the following group: exogenous DNA encoding luciferin protein or luciferase (such as firefly luciferase), green fluorescent protein, yellow fluorescent protein, aminoacyl-tRNA synthetase, glyceraldehyde-3-phosphate dehydrogenase, catalase, actin or the variable region of an antibody, and DNA of a luciferase mutant.

The coding sequence of the exogenous protein may also encode a protein selected from the following group: α-amylase, enterocin A, hepatitis C virus E2 glycoprotein, insulin precursor, interferon αA, interleukin-1β, lysozyme, serum albumin, single-chain antibody fragment (scFv), transthyretin, tyrosinase, xylanase.

In addition, the nucleic acid construct of the present invention may be linear or circular, and may be single-stranded or double-stranded. It may be DNA, RNA, or a DNA/RNA hybrid.

In a preferred embodiment, the sequence of the nucleic acid construct of the present invention is as shown in SEQ ID NO.: 2-13 and SEQ ID NO.: 51-53. In another preferred example, the construct further includes an element selected from the following group, or a combination thereof: poly(A) element, transport element, gene targeting element, selectable marker gene, enhancer, resistance gene, transposase-encoding gene.

A variety of selectable marker genes can be used in the present invention, including but not limited to auxotrophic markers, resistance markers and reporter gene markers. Selectable markers are used to screen recombinant cells (recombinants), so that recipient cells can be clearly distinguished from untransformed cells. An auxotrophic marker is a transferred marker gene that complements a mutated gene of the recipient cell, so that the recipient cell shows wild-type growth. A resistance marker is a resistance gene transferred into the recipient cell; the transferred gene makes the recipient cell resistant to a drug at a certain concentration. As a preferred mode of the present invention, a resistance marker is used to allow convenient screening of recombinant cells.

In the present invention, applying the nucleic acid construct of the present invention in the yeast in vitro biosynthesis system of the present invention (such as a yeast in vitro protein biosynthesis system) significantly improves the efficiency of exogenous protein translation. In a preferred embodiment, the relative light unit (RLU) value of luciferase synthesized from a nucleic acid construct of the present invention (such as pD2P-1.0C008) is 2.8 times that of the pD2P-1.0 sequence, and the relative fluorescence unit (RFU) value of green fluorescent protein synthesized from pD2P-1.1C003 of the present invention is 1.8 times that of the pD2P-1.1 sequence.
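The two fold-change figures can be reproduced from the raw readings reported in the advantages section of this description (4.44 × 10⁸ versus 1.58 × 10⁸ RLU for luciferase, and 472 versus 267 RFU for eGFP). The short Python check below is only an arithmetic illustration; the function name is an assumption.

```python
def fold_change(sample: float, control: float) -> float:
    """Ratio of a sample reading to its control reading."""
    if control <= 0:
        raise ValueError("control reading must be positive")
    return sample / control


fluc = fold_change(4.44e8, 1.58e8)  # pD2P-1.0C008 vs pD2P-1.0, RLU
egfp = fold_change(472, 267)        # pD2P-1.1C003 vs pD2P-1.1, RFU
print(f"Fluc: {fluc:.1f}x, eGFP: {egfp:.1f}x")  # Fluc: 2.8x, eGFP: 1.8x
```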

Vector, genetically engineered cell

The present invention also provides a vector or combination of vectors containing the nucleic acid construct of the present invention. Preferably, the vector is selected from bacterial plasmids, bacteriophages, yeast plasmids, animal cell vectors and shuttle vectors; the vector is a transposon vector. Methods for preparing recombinant vectors are well known to those of ordinary skill in the art. Any plasmid or vector can be used as long as it can replicate and remain stable in the host.

Those of ordinary skill in the art can use well-known methods to construct expression vectors containing the promoter of the present invention and/or a target gene sequence. These methods include in vitro recombinant DNA technology, DNA synthesis technology, in vivo recombination technology and the like.

The present invention also provides a genetically engineered cell that contains the nucleic acid construct, vector or combination of vectors, or that has the nucleic acid construct or vector integrated into its chromosome.

In another preferred example, the genetically engineered cell further contains a vector carrying a transposase-encoding gene, or has a transposase gene integrated into its chromosome.

Preferably, the genetically engineered cell is a eukaryotic cell.

In another preferred example, the eukaryotic cells include, but are not limited to, higher eukaryotic cells such as human cells, Chinese hamster ovary cells, insect cells, wheat germ cells and rabbit reticulocytes.

In another preferred example, the eukaryotic cells include, but are not limited to, yeast cells, preferably Kluyveromyces cells, more preferably Kluyveromyces lactis cells.

The construct or vector of the present invention can be used to transform an appropriate genetically engineered cell. The genetically engineered cell may be a prokaryotic cell, such as Escherichia coli, Streptomyces or Agrobacterium; a lower eukaryotic cell, such as a yeast cell; or a higher animal cell, such as an insect cell. Those of ordinary skill in the art know how to select an appropriate vector and genetically engineered cell. Transformation of the genetically engineered cell with recombinant DNA can be carried out by conventional techniques well known to those skilled in the art. When the host is a prokaryote such as Escherichia coli, the CaCl₂ method or electroporation can be used. When the host is a eukaryote, the following DNA transfection methods can be used: calcium phosphate co-precipitation and conventional mechanical methods, including microinjection, electroporation and liposome packaging. Plants can be transformed by Agrobacterium-mediated transformation or gene gun transformation, for example the leaf disc method, the immature embryo transformation method or the flower bud soaking method.

Method for regulating in vitro protein synthesis

The present invention provides a method for regulating in vitro protein synthesis, comprising the steps of:

(ii) incubating the eukaryotic in vitro biosynthesis system of step (i) under suitable conditions for a period of time T1, thereby synthesizing the exogenous protein.

The suitable conditions in step (ii) are: a reaction temperature of 20-37 °C, preferably 22-35 °C; a reaction time of 1-72 h, preferably 2-23 h; and either continuous mixing of the reaction by shaking, rolling, rotating, stirring, swirling, inverting or other means, or standing after mixing.

In another preferred example, the method further comprises step (iii): isolating or detecting the exogenous protein from the eukaryotic in vitro biosynthesis system.

Kit

The present invention provides a kit for exogenous protein synthesis, comprising:

(k1) a first container, and the nucleic acid construct according to the first aspect of the present invention located in the first container；

(k2) a second container, and yeast cell extract located in the second container. In a preferred embodiment, the first container and the second container are the same container or different containers.

A particularly preferred kit for in vitro exogenous protein synthesis includes an in vitro exogenous protein synthesis system comprising yeast cell extract, the nucleic acid construct of the present invention and other suitable reagents. The suitable reagents include (but are not limited to): 10-50 mM (such as 22 mM) 4-hydroxyethylpiperazine ethanesulfonic acid at pH 7.4, 30-150 mM potassium acetate, 1.0-5.0 mM magnesium acetate, 1.5-4 mM nucleoside triphosphate mixture, 0.08-0.24 mM amino acid mixture, 25 mM phosphocreatine, 1.7 mM dithiothreitol, 0.27 mg/mL creatine phosphokinase, 1%-4% polyethylene glycol, 0.5%-2% sucrose, and 0.027-0.054 mg/mL T7 RNA polymerase.

Main advantages of the present invention include:

(1) The present invention is the first to find that using an optional promoter, cyclization elements (such as a plant-virus 3'CITE and its corresponding 5' hairpin structure), an optional Ω sequence, an optional Kozak sequence and the coding sequence of an exogenous protein as a nucleic acid construct in the eukaryotic in vitro biosynthesis system of the present invention (such as a yeast in vitro protein synthesis system) significantly improves the efficiency of exogenous protein translation.

(2) The present invention inserts RNA elements that can interact with each other into the 5'UTR and 3'UTR of the mRNA so that the mRNA forms a more stable circular structure. It also uses the hairpin structures at both ends to enhance mRNA stability, and achieves synergy with the RNA elements on the 5'UTR that recruit translation initiation factors and with the poly(A) structure on the 3'UTR, enhancing the protein translation initiation efficiency of the eukaryotic in vitro biosynthesis system. In the yeast in vitro protein synthesis system, plasmids containing the 3'CITE and 5' hairpin structure gave a firefly luciferase (Fluc, same below) relative light unit value (Relative Light Units, RLU, same below) exceeding that of the plasmid with the 5' Ω sequence alone. Specifically, the RLU of Fluc synthesized from pD2P-1.0C008 in the Kluyveromyces lactis in vitro biosynthesis system reached 4.44 × 10⁸, which is 2.8 times that of the pD2P-1.0 sequence (1.58 × 10⁸). The relative fluorescence unit value (RFU, same below) of green fluorescent protein (eGFP, same below) synthesized from pD2P-1.1C003 in the Kluyveromyces lactis in vitro biosynthesis system reached 472, which is 1.8 times that of the pD2P-1.1 sequence (267).

(3) Compared with Saccharomyces cerevisiae, Kluyveromyces lactis can be applied to protein production in the food and pharmaceutical fields because of its safety and high efficiency. Combined with the advantages of in vitro biosynthesis systems, such as suitability for high-throughput protein synthesis screening, the ability to synthesize toxic proteins, short turnaround and low cost, in vitro biosynthesis systems derived from Kluyveromyces lactis cells can also be widely used in fields related to protein synthesis.

(4) The nucleic acid construct provided by the present invention not only improves the efficiency of protein translation and synthesis in eukaryotic in vitro biosynthesis systems but, more importantly, broadens the range of proteins that the Kluyveromyces lactis in vitro biosynthesis system can synthesize.

(5) The nucleic acid construct of the present invention not only increases the efficiency of protein translation and synthesis, but also provides a new idea and a new method for designing DNA elements for eukaryotic in vitro biosynthesis systems, which can greatly extend the application of such systems in scientific research and industrial production.

(6) The present invention is the first to find that combining a strong promoter (such as the T7 promoter, T3 promoter or SP6 promoter) with the nucleic acid construct of the present invention also gives a very high protein synthesis efficiency.

The present invention is further explained below with reference to specific examples. It should be understood that these examples only illustrate the present invention and do not limit its scope. Where specific conditions are not given for an experimental method in the following examples, conventional conditions are generally followed, such as those described in Sambrook et al., Molecular Cloning: A Laboratory Manual (New York: Cold Spring Harbor Laboratory Press, 1989), or the conditions recommended by the manufacturer. Unless otherwise stated, percentages and parts are by weight.

Unless otherwise stated, the materials and reagents used in the examples of the present invention are commercially available products.

Example 1: Design of 3' cap-independent translation enhancers (CITE) of plant virus origin and their corresponding interacting 5'UTR elements

1.1 4 plasmids of building: the translational enhancer (cap-independent of the non-dependent cap sequence in 3'UTR Translation enhancers, abbreviation CITE) function that functionally can effectively substitute 5' cap sequence, so that ribose Body is entered by the end 5' of mRNA and scanning.Different types of 3'CITE (table 1) can promote the outer albumen of inchoate aspect in varying degrees Matter synthesis.It is designed containing the plasmid of 3'CITE coded sequence and reporter protein (preferably Fluc) gene: above-mentioned 4 3'CITE is compiled Code sequence is inserted into existing plasmid pD2P-1.0 plasmid and (is obtained from health code (Shanghai) Biotechnology Co., Ltd, sequence such as SEQ Shown in ID NO.:1) 3'UTR in, form 4 new plasmids, respectively pD2P-1.0C001, pD2P-1.0C002, pD2P- 1.0C003, pD2P-1.0C004, wherein with having the RLU very significant of the Fluc of 1 plasmid pD2P-1.0C002 starting synthesis (p < 0.01) is higher than the original template PC (positive control) that traditional Ω sequence is used only in template, other 3 plasmids Do not have with former PC template then significant difference (p > 0.05) (Fig. 2).4 kinds of plasmid template sequences are as shown in SEQ ID NO.:2-5.

1.2 Construction of 8 plasmids: On the four plasmid templates carrying a 3'CITE coding sequence, the coding sequence of the corresponding hairpin structure located in the 5'UTR was further added at the 5' end. By placing the 5' hairpin structure coding sequence (Table 1) either upstream or downstream of the original 5' Ω sequence, eight plasmid templates were designed in total (pD2P-1.0C005, pD2P-1.0C006, pD2P-1.0C007, pD2P-1.0C008 and pD2P-1.0C009, pD2P-1.0C010, pD2P-1.0C011, pD2P-1.0C012). The sequences of the eight plasmid templates are shown in SEQ ID NO.:6-13, in one-to-one correspondence with the order of the plasmids listed above.

1.3 Construction of 3 plasmids: To further verify the Fluc results, pD2P-1.0C006, pD2P-1.0C007 and pD2P-1.0C008, which carry the circularization elements and showed a gain effect, were used as cloning-vector templates. Replacing the encoded protein sequence Fluc with eGFP gave the three corresponding plasmids pD2P-1.1C001, pD2P-1.1C002 and pD2P-1.1C003, whose sequences are shown in SEQ ID NO.:51-53.

The constructed plasmids were transformed into competent E. coli cells and cultured overnight. Six single colonies were picked and expanded, and after sequencing confirmed that they were correct, the plasmids were extracted and stored.

Table 1. 3'CITE and corresponding 5' sequence information

Example 2: Construction of in vitro protein synthesis system plasmids containing a 3'CITE coding sequence and its corresponding 5' hairpin structure coding sequence

2.1 Plasmid construction: The primers used for the 4 plasmids of section 1.1 are listed in Table 2, those for the 8 plasmids of section 1.2 in Table 3, and those for the 3 plasmids of section 1.3 in Table 4.

The specific construction procedure is as follows:

For insertion of a 3'CITE fragment, pD2P-1.0 (SEQ ID NO.:1) or pD2P-1.1 (SEQ ID NO.:50) was used as the vector plasmid. For insertion of a 5' hairpin structure fragment, the plasmid already carrying the corresponding 3'CITE was used as the vector plasmid. The insert and the vector were each amplified by PCR with their own primer pair (two primer pairs in total), and 10 μL of each amplification product was mixed; 1 μL Dpn I was added to the 20 μL of mixed amplification products and incubated at 37 °C for 6 h; 4 μL of the Dpn I-treated product was added to 50 μL of DH5α competent cells, which were kept on ice for 30 min, heat-shocked at 42 °C for 45 s and returned to ice for 3 min; 200 μL of LB liquid medium was added and the cells were cultured with shaking at 37 °C for 4 h, then spread on LB solid medium containing the antibiotic Amp (ampicillin) and incubated overnight; six single colonies were picked and expanded, and after sequencing confirmed that they were correct, the plasmids were extracted and stored.

Table 2. PCR amplification primers for the 3'CITE plasmids

Table 3. PCR amplification primers for the 5' hairpin structure plasmids

Table 4. PCR amplification primers for the eGFP plasmids

Example 3: Application of 3'CITE and the corresponding 5'UTR elements in a yeast in vitro protein synthesis system

3.1 Experimental design: The DNA fragments pD2P-1.0 (Fluc as the foreign protein) or pD2P-1.1 (eGFP as the foreign protein), which contain no 3'CITE and/or 5' hairpin structure coding sequence, were used as positive controls; a reaction group without any DNA template served as the negative control (NC). Three independent experiments were designed for each sample.

3.2 PCR amplification: Using PCR with primers T7_pET21a_F: CGCGAAATTAATACGACTCACTATAGG (SEQ ID NO.:60) and T7ter_pET21a_R: TCCGGATATAGTTCCTCCTTTCAG (SEQ ID NO.:61), the fragment lying between the T7 transcription initiation sequence and termination sequence in each plasmid, which contains the tandem DNA elements and the foreign protein gene, was amplified.
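As a quick sanity check on the primer pair above, the following minimal Python sketch locates the forward primer in the first 60 nt of SEQ ID NO.:1 (copied from the sequence listing) and derives the top-strand sequence to which the reverse primer would anneal. The helper names `reverse_complement` and `find_primer` are illustrative and not part of the original disclosure, and only exact matches are considered.

```python
# Primer sequences are quoted from section 3.2; the template excerpt is the
# first 60 nt of SEQ ID NO.:1 as printed in the sequence listing.
FORWARD_PRIMER = "CGCGAAATTAATACGACTCACTATAGG"  # T7_pET21a_F
REVERSE_PRIMER = "TCCGGATATAGTTCCTCCTTTCAG"     # T7ter_pET21a_R
SEQ1_FIRST_60 = "cgcgaaattaatacgactcactataggggtatttttacaacaattaccaacaacaacaaa"

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_primer(template: str, primer: str) -> int:
    """Return the 0-based index of an exact top-strand match, or -1 if absent."""
    return template.upper().find(primer.upper())


# The forward primer is searched for directly on the top strand.
forward_site = find_primer(SEQ1_FIRST_60, FORWARD_PRIMER)

# The reverse primer anneals to the top strand at its reverse complement.
reverse_site_top_strand = reverse_complement(REVERSE_PRIMER)

print("forward primer index:", forward_site)
print("reverse primer binds top-strand motif:", reverse_site_top_strand)
```

With the excerpt shown, the forward primer matches at index 0, that is, at the very start of the listed template. The same `find_primer` call can be applied to a full plasmid sequence to locate the reverse-primer motif.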

3.3 Purification: The amplified DNA fragments were purified and concentrated by ethanol precipitation: 1/10 volume of 3 M sodium acetate (pH 5.2) was added to the PCR product, followed by 2.5-3 volumes of 95% ethanol (relative to the volume after sodium acetate addition), and the mixture was incubated on ice for 15 min; it was centrifuged at room temperature at more than 14000 g for 30 min and the supernatant was discarded; the pellet was washed with 70% ethanol and centrifuged again at the same speed for 15 min, the supernatant was discarded, the pellet was dissolved in ultrapure water, and the DNA concentration was measured.
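The reagent volumes in the precipitation step follow directly from the PCR product volume. A minimal sketch of that arithmetic is given below; the function name and the 100 μL example volume are assumptions made for illustration only.

```python
def ethanol_precipitation_volumes(pcr_volume_ul: float,
                                  ethanol_ratio: float = 2.5) -> dict:
    """Compute reagent volumes (in microlitres) for the precipitation in 3.3.

    pcr_volume_ul : volume of PCR product to be precipitated
    ethanol_ratio : volumes of 95% ethanol, 2.5 to 3, relative to the volume
                    AFTER sodium acetate has been added
    """
    if pcr_volume_ul <= 0:
        raise ValueError("PCR volume must be positive")
    if not 2.5 <= ethanol_ratio <= 3.0:
        raise ValueError("ethanol ratio should lie between 2.5 and 3")

    sodium_acetate_ul = pcr_volume_ul / 10            # 1/10 volume of 3 M NaOAc
    volume_after_acetate_ul = pcr_volume_ul + sodium_acetate_ul
    ethanol_ul = ethanol_ratio * volume_after_acetate_ul

    return {
        "sodium_acetate_ul": sodium_acetate_ul,
        "volume_after_acetate_ul": volume_after_acetate_ul,
        "ethanol_ul": ethanol_ul,
    }


# Hypothetical example: 100 uL of PCR product at the lower ethanol ratio.
volumes = ethanol_precipitation_volumes(100.0)
print(volumes)
```

For the hypothetical 100 μL input this gives 10 μL of sodium acetate and 275 μL of ethanol at the 2.5-volume ratio.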

3.4 Incubation and detection: Following the instructions for use, the purified DNA fragments were added to the in-house Kluyveromyces lactis in vitro protein synthesis system. The reaction mixture was then placed at 20-30 °C and incubated without shaking for about 2-8 h.

For systems using Fluc as the foreign protein, after the reaction an equal volume of the Fluc substrate luciferin was added in a 96-well or 384-well white microplate, which was immediately placed in an Envision 2120 multifunctional microplate reader (Perkin Elmer) and read to detect Fluc activity, with the relative light unit (RLU) value taken as the measure of activity, as shown in Fig. 2 and Fig. 3.

For systems using green fluorescent protein as the foreign protein, after the reaction 10 μL of the reaction solution was added to a 96-well or 384-well black plate, which was immediately placed in a 2000 multifunctional microplate reader (Tecan) and read to detect green fluorescent protein activity, with the relative fluorescence unit (RFU) value taken as the measure of activity, as shown in Fig. 4.

3.5 Experimental results:

(1) Design of plasmids containing plant-virus-derived 3'CITE and 5' hairpin structures

Four different 3'CITE coding sequences were inserted into the 3'UTR of plasmid pD2P-1.0 to give four plasmid templates; four different 5' hairpin structure coding sequences were then inserted upstream or downstream of Ω in these four plasmid templates, giving 12 plasmids in total for in vitro protein synthesis. The method used for plasmid design is described in Example 1.

Construction of 3 plasmids: To further verify the Fluc results, pD2P-1.0C006, pD2P-1.0C007 and pD2P-1.0C008, which carry the circularization elements and showed a gain effect, were used as cloning-vector templates. Replacing the encoded protein sequence Fluc with eGFP gave the three corresponding plasmids pD2P-1.1C001, pD2P-1.1C002 and pD2P-1.1C003, whose sequences are shown in SEQ ID NO.:51-53.

(2) Construction of plasmids containing plant-virus-derived 3'CITE coding sequences and 5' hairpin structure coding sequences

After repeated attempts and confirmation by gene sequencing, all 15 in vitro protein synthesis system plasmids were successfully constructed.
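The 15 modified plasmids and their two parent plasmids can be indexed against the sequence listing as sketched below. The sketch assumes that plasmid numbers and SEQ ID NOs. run in the same order within each group, which the text states explicitly only for SEQ ID NO.:6-13; the function name is illustrative.

```python
def build_plasmid_index() -> dict:
    """Map each plasmid name to its SEQ ID NO. as described in Example 1.

    Assumption: within each group the plasmid numbering and the SEQ ID NO.
    numbering run in the same order.
    """
    # Parent plasmids: pD2P-1.0 (Fluc) and pD2P-1.1 (eGFP).
    index = {"pD2P-1.0": 1, "pD2P-1.1": 50}

    # Section 1.1: pD2P-1.0C001..C004 -> SEQ ID NO.:2-5
    # Section 1.2: pD2P-1.0C005..C012 -> SEQ ID NO.:6-13
    for n in range(1, 13):
        index[f"pD2P-1.0C{n:03d}"] = n + 1

    # Section 1.3: pD2P-1.1C001..C003 -> SEQ ID NO.:51-53
    for n in range(1, 4):
        index[f"pD2P-1.1C{n:03d}"] = n + 50

    return index


plasmid_index = build_plasmid_index()
modified = [name for name in plasmid_index if "C" in name]
print(len(modified), "modified plasmids")
print(plasmid_index["pD2P-1.0C008"], plasmid_index["pD2P-1.1C003"])
```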

(3) Application of 3'CITE and 5' hairpin structure elements in the yeast in vitro protein synthesis system

As shown in Fig. 2 and Fig. 3, of the 12 screened plasmids containing a 3'CITE coding sequence and a 5' hairpin structure coding sequence, four gave Fluc RLU values in the yeast in vitro protein synthesis system that were very significantly (p < 0.01) higher than that of the unmodified pD2P-1.0 sequence: pD2P-1.0C002, pD2P-1.0C006, pD2P-1.0C007 and pD2P-1.0C008, with RLU values of 2.14 × 10⁸, 2.60 × 10⁸, 4.06 × 10⁸ and 4.44 × 10⁸, i.e. 1.4, 1.7, 2.6 and 2.8 times that of pD2P-1.0 (RLU 1.57 × 10⁸). Three modified plasmids gave RLU values very significantly lower than that of pD2P-1.0: pD2P-1.0C005, pD2P-1.0C009 and pD2P-1.0C010. Of the remaining plasmids, pD2P-1.0C011 and pD2P-1.0C012 did not differ significantly from the unmodified pD2P-1.0 (p > 0.05), as was also the case for pD2P-1.0C001, pD2P-1.0C003 and pD2P-1.0C004 (section 1.1).

Because the measurements lie within the linear range of the instrument's relationship between RLU and protein concentration, the fact that the RLU of the most active plasmid, pD2P-1.0C008, is 2.8 times that of pD2P-1.0 indicates that, in this example, a plasmid containing the 3'CITE coding sequence and 5' hairpin structure coding sequence can raise luciferase synthesis to about 2.8 times the control level.

As shown in Fig. 4, all three plasmids used to verify the gain effect of the 5' and 3' circularization elements gave eGFP RFU values in the yeast in vitro protein synthesis system that were very significantly (p < 0.01) higher than that of the unmodified pD2P-1.1 sequence: pD2P-1.1C001, pD2P-1.1C002 and pD2P-1.1C003 reached 398.33, 386.33 and 472, i.e. 1.5, 1.4 and 1.8 times that of pD2P-1.1 (RFU 267.33).

Because the measurements lie within the linear range of the instrument's relationship between relative fluorescence unit (RFU) values and protein concentration, the fact that the RFU of the most active plasmid, pD2P-1.1C003, is 1.8 times that of pD2P-1.1 indicates that, in this example, a plasmid containing the 3'CITE coding sequence and 5' hairpin structure coding sequence can raise eGFP synthesis to about 1.8 times the control level.
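The fold changes quoted above for both reporters can be reproduced from the reported signal values. The short sketch below does this; the signal values are taken from the text, while the dictionary layout and the function name `fold_changes` are illustrative choices.

```python
# Reported mean signals (RLU for Fluc, RFU for eGFP), taken from section 3.5.
RLU = {
    "pD2P-1.0": 1.57e8,       # unmodified control
    "pD2P-1.0C002": 2.14e8,
    "pD2P-1.0C006": 2.60e8,
    "pD2P-1.0C007": 4.06e8,
    "pD2P-1.0C008": 4.44e8,
}
RFU = {
    "pD2P-1.1": 267.33,       # unmodified control
    "pD2P-1.1C001": 398.33,
    "pD2P-1.1C002": 386.33,
    "pD2P-1.1C003": 472.0,
}


def fold_changes(signals: dict, control: str) -> dict:
    """Return each plasmid's signal divided by the control, to one decimal."""
    baseline = signals[control]
    return {
        name: round(value / baseline, 1)
        for name, value in signals.items()
        if name != control
    }


fluc_fold = fold_changes(RLU, "pD2P-1.0")
egfp_fold = fold_changes(RFU, "pD2P-1.1")
print(fluc_fold)
print(egfp_fold)
```

Interpreting these ratios as relative protein yield relies on the linearity between signal and protein concentration stated in the text.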

In the present invention, the specially designed nucleic acid construct of formula A (or formula I or II) allows the mRNA transcript to form a stable circular structure efficiently.

The above results show that adding a 3'CITE and a 5' hairpin structure to the two ends of an mRNA improves protein synthesis efficiency. The two elements can circularize the RNA directly through RNA-RNA interaction, and the newly formed circularized RNA allows ribosomes that have terminated translation to reach the 5' end of the RNA again directly, without dissociating (see Fig. 5).
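The RNA-RNA interaction described above depends on base-pair complementarity between the 5' element and the 3' element. As a rough illustration, the sketch below finds the longest contiguous stretch of a 5' element that can pair antiparallel with a 3' element by Watson-Crick rules. Both sequences are hypothetical toy examples and are not taken from Table 1; G-U wobble pairs and secondary structure are ignored, and the function names are assumptions.

```python
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(seq: str) -> str:
    """Reverse complement of an RNA sequence (T is read as U)."""
    return seq.upper().replace("T", "U").translate(_RNA_COMPLEMENT)[::-1]


def longest_pairing_stretch(five_prime_element: str,
                            three_prime_element: str) -> str:
    """Longest contiguous segment of the 5' element that can base-pair
    antiparallel with the 3' element (Watson-Crick pairs only).

    This is the longest common substring of the 5' element and the
    reverse complement of the 3' element, found by dynamic programming.
    """
    a = five_prime_element.upper().replace("T", "U")
    b = reverse_complement_rna(three_prime_element)
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


# Hypothetical toy sequences, for illustration only.
toy_five_prime = "GCGAUUGGCAGCGC"
toy_three_prime = "CCAUGCUGCCAAGG"
stretch = longest_pairing_stretch(toy_five_prime, toy_three_prime)
print(stretch, len(stretch))
```

A longer complementary stretch suggests a more stable contact, but a real assessment of a CITE and 5'UTR pair would need a proper RNA folding tool rather than this simple string comparison.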

In addition, a stable circular structure can protect the RNA from degradation by RNases, improving its stability and further increasing protein synthesis efficiency. The design can be applied in a yeast in vitro protein synthesis system, where it initiates protein synthesis more efficiently than a template plasmid carrying the commonly used Ω sequence plus a 90-A tail. The amount of protein synthesized from pD2P-1.0C008 was 2.8 times that from pD2P-1.0, and that from pD2P-1.1C003 was 1.8 times that from pD2P-1.1. This improves the translation efficiency of the yeast in vitro protein synthesis system, broadens the choice of translation initiation elements available to the Kluyveromyces lactis in vitro synthesis system, and greatly enhances the usability of that system.

All documents mentioned in the present invention are incorporated herein by reference, as if each were individually incorporated by reference. It should further be understood that, after reading the above teachings of the present invention, those skilled in the art may make various changes or modifications to the invention, and such equivalents likewise fall within the scope defined by the claims appended to this application.

Bibliography

1. Sonenberg & Hinnebusch, Regulation of translation initiation in eukaryotes: mechanisms and biological targets. Cell, 2009. 136(4): 731-45.

2. Dever et al., Mechanism and Regulation of Protein Synthesis in Saccharomyces cerevisiae. Genetics, 2016. 203(1): 65-107.

3. M., D.J.R.G., Nucleic Acid. Encyclopedia of Cell Biology. Elsevier, 2015.

4. Wells et al., Circularization of mRNA by eukaryotic translation initiation factors. Mol. Cell, 1998. 2: 135-140.

5. Ogawa et al., Identification of short untranslated regions that sufficiently enhance translation in high-quality wheat germ extract. Bioorganic & Medicinal Chemistry Letters, 2014. 24(16): 3724-3727.

6. Komar, A.A., and Hatzoglou, M., Cellular IRES-mediated translation: the war of ITAFs in pathophysiological states. Cell Cycle, 2011. 10: 229-240.

7. Lee, K.M., et al., Regulation Mechanisms of Viral IRES-Driven Translation. Trends in Microbiology, 2017.

8. Simon & Miller, 3' Cap-Independent translation enhancers of plant viruses. Annu Rev Microbiol, 2013. 67: 21-42.

9. Roux & Topisirovic, Regulation of mRNA translation by signaling pathways. Cold Spring Harb Perspect Biol, 2012. 4(11): 843-853.

10. Muhlrad, D., C.J. Decker, and R. Parker, Turnover mechanisms of the stable yeast PGK1 mRNA. Molecular & Cellular Biology, 1995. 15(4): 2145-2156.

11. Anderson, J.S.J. and R.P. Parker, The 3' to 5' degradation of yeast mRNAs is a general mechanism for mRNA turnover that requires the SKI2 DEVH box protein and 3' to 5' exonucleases of the exosome complex. EMBO Journal, 1998. 17(5): 1497-1506.

Sequence table

<110> Kang Code (Shanghai) Biological Technology Co., Ltd.

<120> Nucleic acid construct and method for regulating protein synthesis using the same

<130> P2018-1188

<141> 2018-07-19

<160> 61

<170> SIPOSequenceListing 1.0

<210> 1

<211> 2043

<212> DNA

<213> Artificial Sequence

<400> 1

cgcgaaatta atacgactca ctataggggt atttttacaa caattaccaa caacaacaaa 60

caacaaacaa cattacaatt actatttaca attacaaaaa aaaaaaatgt ctgaagacgc 120

caaaaacata aagaaaggcc cggcgccatt ctatcctcta gaggatggaa ccgctggaga 180

gcaactgcat aaggctatga agagatacgc cctggttcct ggaacaattg cttttacaga 240

tgcacatatc gaggtgaaca tcacgtacgc ggaatacttc gaaatgtccg ttcggttggc 300

agaagctatg aaacgatatg ggctgaatac aaatcacaga atcgtcgtat gcagtgaaaa 360

ctctcttcaa ttctttatgc cggtgttggg cgcgttattt atcggagttg cagttgcgcc 420

cgcgaacgac atttataatg aacgtgaatt gctcaacagt atgaacattt cgcagcctac 480

cgtagtgttt gtttccaaaa aggggttgca aaaaattttg aacgtgcaaa aaaaattacc 540

aataatccag aaaattatta tcatggattc taaaacggat taccagggat ttcagtcgat 600

gtacacgttc gtcacatctc atctacctcc cggttttaat gaatacgatt ttgtaccaga 660

gtcctttgat cgtgacaaaa caattgcact gataatgaat tcctctggat ctactgggtt 720

acctaagggt gtggcccttc cgcatagaac tgcctgcgtc agattctcgc atgccagaga 780

tcctattttt ggcaatcaaa tcattccgga tactgcgatt ttaagtgttg ttccattcca 840

tcacggtttt ggaatgttta ctacactcgg atatttgata tgtggatttc gagtcgtctt 900

aatgtataga tttgaagaag agctgttttt acgatccctt caggattaca aaattcaaag 960

tgcgttgcta gtaccaaccc tattttcatt cttcgccaaa agcactctga ttgacaaata 1020

cgatttatct aatttacacg aaattgcttc tgggggcgca cctctttcga aagaagtcgg 1080

ggaagcggtt gcaaaacgct tccatcttcc agggatacga caaggatatg ggctcactga 1140

gactacatca gctattctga ttacacccga gggggatgat aaaccgggcg cggtcggtaa 1200

agttgttcca ttttttgaag cgaaggttgt ggatctggat accgggaaaa cgctgggcgt 1260

taatcagaga ggcgaattat gtgtcagagg acctatgatt atgtccggtt atgtaaacaa 1320

tccggaagcg accaacgcct tgattgacaa ggatggatgg ctacattctg gagacatagc 1380

ttactgggac gaagacgaac acttcttcat agttgaccgc ttgaagtctt taattaaata 1440

caaaggatat caggtggccc ccgctgaatt ggaatcgata ttgttacaac accccaacat 1500

cttcgacgcg ggcgtggcag gtcttcccga cgatgacgcc ggtgaacttc ccgccgccgt 1560

tgttgttttg gagcacggaa agacgatgac ggaaaaagag atcgtggatt acgtcgccag 1620

tcaagtaaca accgcgaaaa agttgcgcgg aggagttgtg tttgtggacg aagtaccgaa 1680

aggtcttacc ggaaaactcg acgcaagaaa aatcagagag atcctcataa aggccaagaa 1740

gggcggaaag tccaaattgg tttaaataag gattaattac ttggatgcca ataaaaaaaa 1800

aaaagcgaca tagccaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1860

aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaactcga gcaccaccac 1920

caccaccact gagatccggc tgctaacaaa gcccgaaagg aagctgagtt ggctgctgcc 1980

accgctgagc aataactagc ataacccctt ggggcctcta aacgggtctt gaggggtttt 2040

ttg 2043

<210> 2

<211> 2040

<212> DNA

<213> Artificial Sequence

<400> 2

cgcgaaatta atacgactca ctataggggt atttttacaa caattaccaa caacaacaaa 60

caacaaacaa cattacaatt actatttaca attacaaaaa aaaaaaatgt ctgaagacgc 120

caaaaacata aagaaaggcc cggcgccatt ctatcctcta gaggatggaa ccgctggaga 180

gcaactgcat aaggctatga agagatacgc cctggttcct ggaacaattg cttttacaga 240

tgcacatatc gaggtgaaca tcacgtacgc ggaatacttc gaaatgtccg ttcggttggc 300

agaagctatg aaacgatatg ggctgaatac aaatcacaga atcgtcgtat gcagtgaaaa 360

ctctcttcaa ttctttatgc cggtgttggg cgcgttattt atcggagttg cagttgcgcc 420

cgcgaacgac atttataatg aacgtgaatt gctcaacagt atgaacattt cgcagcctac 480

cgtagtgttt gtttccaaaa aggggttgca aaaaattttg aacgtgcaaa aaaaattacc 540

aataatccag aaaattatta tcatggattc taaaacggat taccagggat ttcagtcgat 600

gtacacgttc gtcacatctc atctacctcc cggttttaat gaatacgatt ttgtaccaga 660

gtcctttgat cgtgacaaaa caattgcact gataatgaat tcctctggat ctactgggtt 720

acctaagggt gtggcccttc cgcatagaac tgcctgcgtc agattctcgc atgccagaga 780

tcctattttt ggcaatcaaa tcattccgga tactgcgatt ttaagtgttg ttccattcca 840

tcacggtttt ggaatgttta ctacactcgg atatttgata tgtggatttc gagtcgtctt 900

aatgtataga tttgaagaag agctgttttt acgatccctt caggattaca aaattcaaag 960

tgcgttgcta gtaccaaccc tattttcatt cttcgccaaa agcactctga ttgacaaata 1020

cgatttatct aatttacacg aaattgcttc tgggggcgca cctctttcga aagaagtcgg 1080

ggaagcggtt gcaaaacgct tccatcttcc agggatacga caaggatatg ggctcactga 1140

gactacatca gctattctga ttacacccga gggggatgat aaaccgggcg cggtcggtaa 1200

agttgttcca ttttttgaag cgaaggttgt ggatctggat accgggaaaa cgctgggcgt 1260

taatcagaga ggcgaattat gtgtcagagg acctatgatt atgtccggtt atgtaaacaa 1320

tccggaagcg accaacgcct tgattgacaa ggatggatgg ctacattctg gagacatagc 1380

ttactgggac gaagacgaac acttcttcat agttgaccgc ttgaagtctt taattaaata 1440

caaaggatat caggtggccc ccgctgaatt ggaatcgata ttgttacaac accccaacat 1500

cttcgacgcg ggcgtggcag gtcttcccga cgatgacgcc ggtgaacttc ccgccgccgt 1560

tgttgttttg gagcacggaa agacgatgac ggaaaaagag atcgtggatt acgtcgccag 1620

tcaagtaaca accgcgaaaa agttgcgcgg aggagttgtg tttgtggacg aagtaccgaa 1680

aggtcttacc ggaaaactcg acgcaagaaa aatcagagag atcctcataa aggccaagaa 1740

gggcggaaag tccaaattgg tttaaataag gattaattag gttgggacgc aggtgtcgga 1800

gggatcgcca gccctcagga tagtgagctc ccgcagaggg ataagctatc tccctgcgac 1860

cttggatgcc aataaaaaaa aaaaagcgac atagccaaaa aaaaaaaaaa aaaaaaaaaa 1920

aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1980

aaaaaactcg agctagcata accccttggg gcctctaaac gggtcttgag gggttttttg 2040

<210> 3

<211> 2132

<212> DNA

<213> Artificial Sequence

<400> 3

cgcgaaatta atacgactca ctataggggt atttttacaa caattaccaa caacaacaaa 60

caacaaacaa cattacaatt actatttaca attacaaaaa aaaaaaatgt ctgaagacgc 120

caaaaacata aagaaaggcc cggcgccatt ctatcctcta gaggatggaa ccgctggaga 180

gcaactgcat aaggctatga agagatacgc cctggttcct ggaacaattg cttttacaga 240

tgcacatatc gaggtgaaca tcacgtacgc ggaatacttc gaaatgtccg ttcggttggc 300

agaagctatg aaacgatatg ggctgaatac aaatcacaga atcgtcgtat gcagtgaaaa 360

ctctcttcaa ttctttatgc cggtgttggg cgcgttattt atcggagttg cagttgcgcc 420

cgcgaacgac atttataatg aacgtgaatt gctcaacagt atgaacattt cgcagcctac 480

cgtagtgttt gtttccaaaa aggggttgca aaaaattttg aacgtgcaaa aaaaattacc 540

aataatccag aaaattatta tcatggattc taaaacggat taccagggat ttcagtcgat 600

gtacacgttc gtcacatctc atctacctcc cggttttaat gaatacgatt ttgtaccaga 660

gtcctttgat cgtgacaaaa caattgcact gataatgaat tcctctggat ctactgggtt 720

acctaagggt gtggcccttc cgcatagaac tgcctgcgtc agattctcgc atgccagaga 780

tcctattttt ggcaatcaaa tcattccgga tactgcgatt ttaagtgttg ttccattcca 840

tcacggtttt ggaatgttta ctacactcgg atatttgata tgtggatttc gagtcgtctt 900

aatgtataga tttgaagaag agctgttttt acgatccctt caggattaca aaattcaaag 960

tgcgttgcta gtaccaaccc tattttcatt cttcgccaaa agcactctga ttgacaaata 1020

cgatttatct aatttacacg aaattgcttc tgggggcgca cctctttcga aagaagtcgg 1080

ggaagcggtt gcaaaacgct tccatcttcc agggatacga caaggatatg ggctcactga 1140

gactacatca gctattctga ttacacccga gggggatgat aaaccgggcg cggtcggtaa 1200

agttgttcca ttttttgaag cgaaggttgt ggatctggat accgggaaaa cgctgggcgt 1260

taatcagaga ggcgaattat gtgtcagagg acctatgatt atgtccggtt atgtaaacaa 1320

tccggaagcg accaacgcct tgattgacaa ggatggatgg ctacattctg gagacatagc 1380

ttactgggac gaagacgaac acttcttcat agttgaccgc ttgaagtctt taattaaata 1440

caaaggatat caggtggccc ccgctgaatt ggaatcgata ttgttacaac accccaacat 1500

cttcgacgcg ggcgtggcag gtcttcccga cgatgacgcc ggtgaacttc ccgccgccgt 1560

tgttgttttg gagcacggaa agacgatgac ggaaaaagag atcgtggatt acgtcgccag 1620

tcaagtaaca accgcgaaaa agttgcgcgg aggagttgtg tttgtggacg aagtaccgaa 1680

aggtcttacc ggaaaactcg acgcaagaaa aatcagagag atcctcataa aggccaagaa 1740

gggcggaaag tccaaattgg tttaaataag gattaattat cttgttgggt tctggatgtt 1800

aggatgacga gtcgactcgg gctccgcact aggtttggtc gcctagagga tggagatatg 1860

gaaagggtct cgtgtggtat cagtcggtcg aaagacgcgc ttccaacatg ggcctatggt 1920

cggataagtc ttagcaatac cagccagcat gacttggatg ccaataaaaa aaaaaaagcg 1980

acatagccaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2040

aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaact cgagctagca taaccccttg 2100

gggcctctaa acgggtcttg aggggttttt tg 2132

<210> 4

<211> 2087

<212> DNA

<213> Artificial Sequence

<400> 4

cgcgaaatta atacgactca ctataggggt atttttacaa caattaccaa caacaacaaa 60

caacaaacaa cattacaatt actatttaca attacaaaaa aaaaaaatgt ctgaagacgc 120

caaaaacata aagaaaggcc cggcgccatt ctatcctcta gaggatggaa ccgctggaga 180

gcaactgcat aaggctatga agagatacgc cctggttcct ggaacaattg cttttacaga 240

tgcacatatc gaggtgaaca tcacgtacgc ggaatacttc gaaatgtccg ttcggttggc 300

agaagctatg aaacgatatg ggctgaatac aaatcacaga atcgtcgtat gcagtgaaaa 360

ctctcttcaa ttctttatgc cggtgttggg cgcgttattt atcggagttg cagttgcgcc 420

cgcgaacgac atttataatg aacgtgaatt gctcaacagt atgaacattt cgcagcctac 480

cgtagtgttt gtttccaaaa aggggttgca aaaaattttg aacgtgcaaa aaaaattacc 540

aataatccag aaaattatta tcatggattc taaaacggat taccagggat ttcagtcgat 600

gtacacgttc gtcacatctc atctacctcc cggttttaat gaatacgatt ttgtaccaga 660

gtcctttgat cgtgacaaaa caattgcact gataatgaat tcctctggat ctactgggtt 720

acctaagggt gtggcccttc cgcatagaac tgcctgcgtc agattctcgc atgccagaga 780

tcctattttt ggcaatcaaa tcattccgga tactgcgatt ttaagtgttg ttccattcca 840

tcacggtttt ggaatgttta ctacactcgg atatttgata tgtggatttc gagtcgtctt 900

aatgtataga tttgaagaag agctgttttt acgatccctt caggattaca aaattcaaag 960

tgcgttgcta gtaccaaccc tattttcatt cttcgccaaa agcactctga ttgacaaata 1020

cgatttatct aatttacacg aaattgcttc tgggggcgca cctctttcga aagaagtcgg 1080

ggaagcggtt gcaaaacgct tccatcttcc agggatacga caaggatatg ggctcactga 1140

gactacatca gctattctga ttacacccga gggggatgat aaaccgggcg cggtcggtaa 1200

agttgttcca ttttttgaag cgaaggttgt ggatctggat accgggaaaa cgctgggcgt 1260

taatcagaga ggcgaattat gtgtcagagg acctatgatt atgtccggtt atgtaaacaa 1320

tccggaagcg accaacgcct tgattgacaa ggatggatgg ctacattctg gagacatagc 1380

ttactgggac gaagacgaac acttcttcat agttgaccgc ttgaagtctt taattaaata 1440

caaaggatat caggtggccc ccgctgaatt ggaatcgata ttgttacaac accccaacat 1500

cttcgacgcg ggcgtggcag gtcttcccga cgatgacgcc ggtgaacttc ccgccgccgt 1560

tgttgttttg gagcacggaa agacgatgac ggaaaaagag atcgtggatt acgtcgccag 1620

tcaagtaaca accgcgaaaa agttgcgcgg aggagttgtg tttgtggacg aagtaccgaa 1680

aggtcttacc ggaaaactcg acgcaagaaa aatcagagag atcctcataa aggccaagaa 1740

gggcggaaag tccaaattgg tttaaataag gattaattaa ttgtatgggc ctccgtgctg 1800

ttaggggcag tggaaacgtc ggtctagcca gccgtcccct gggtagtgtg ctccgtctaa 1860

gtacaccact actcggtttc cacaaacgat cggactccca aaatagtctt ggatgccaat 1920

aaaaaaaaaa aagcgacata gccaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1980

aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaactcgagc 2040

tagcataacc ccttggggcc tctaaacggg tcttgagggg ttttttg 2087

<210> 5

<211> 2019

<212> DNA

<213> Artificial Sequence

<400> 5

cgcgaaatta atacgactca ctataggggt atttttacaa caattaccaa caacaacaaa 60

caacaaacaa cattacaatt actatttaca attacaaaaa aaaaaaatgt ctgaagacgc 120

caaaaacata aagaaaggcc cggcgccatt ctatcctcta gaggatggaa ccgctggaga 180

gcaactgcat aaggctatga agagatacgc cctggttcct ggaacaattg cttttacaga 240

tgcacatatc gaggtgaaca tcacgtacgc ggaatacttc gaaatgtccg ttcggttggc 300

agaagctatg aaacgatatg ggctgaatac aaatcacaga atcgtcgtat gcagtgaaaa 360

ctctcttcaa ttctttatgc cggtgttggg cgcgttattt atcggagttg cagttgcgcc 420

cgcgaacgac atttataatg aacgtgaatt gctcaacagt atgaacattt cgcagcctac 480

cgtagtgttt gtttccaaaa aggggttgca aaaaattttg aacgtgcaaa aaaaattacc 540

aataatccag aaaattatta tcatggattc taaaacggat taccagggat ttcagtcgat 600

gtacacgttc gtcacatctc atctacctcc cggttttaat gaatacgatt ttgtaccaga 660

gtcctttgat cgtgacaaaa caattgcact gataatgaat tcctctggat ctactgggtt 720

acctaagggt gtggcccttc cgcatagaac tgcctgcgtc agattctcgc atgccagaga 780

tcctattttt ggcaatcaaa tcattccgga tactgcgatt ttaagtgttg ttccattcca 840

tcacggtttt ggaatgttta ctacactcgg atatttgata tgtggatttc gagtcgtctt 900

aatgtataga tttgaagaag agctgttttt acgatccctt caggattaca aaattcaaag 960

tgcgttgcta gtaccaaccc tattttcatt cttcgccaaa agcactctga ttgacaaata 1020

cgatttatct aatttacacg aaattgcttc tgggggcgca cctctttcga aagaagtcgg 1080

ggaagcggtt gcaaaacgct tccatcttcc agggatacga caaggatatg ggctcactga 1140

gactacatca gctattctga ttacacccga gggggatgat aaaccgggcg cggtcggtaa 1200

agttgttcca ttttttgaag cgaaggttgt ggatctggat accgggaaaa cgctgggcgt 1260

taatcagaga ggcgaattat gtgtcagagg acctatgatt atgtccggtt atgtaaacaa 1320

tccggaagcg accaacgcct tgattgacaa ggatggatgg ctacattctg gagacatagc 1380

ttactgggac gaagacgaac acttcttcat agttgaccgc ttgaagtctt taattaaata 1440

caaaggatat caggtggccc ccgctgaatt ggaatcgata ttgttacaac accccaacat 1500

cttcgacgcg ggcgtggcag gtcttcccga cgatgacgcc ggtgaacttc ccgccgccgt 1560

tgttgttttg gagcacggaa agacgatgac ggaaaaagag atcgtggatt acgtcgccag 1620

tcaagtaaca accgcgaaaa agttgcgcgg aggagttgtg tttgtggacg aagtaccgaa 1680

aggtcttacc ggaaaactcg acgcaagaaa aatcagagag atcctcataa aggccaagaa 1740

gggcggaaag tccaaattgg tttaaataag gattaattac gtagctaccc tcctggtgca 1800

cttcctggtg caaagcagaa ccaagggtac ggtggtacgc ttggatgcca ataaaaaaaa 1860

aaaagcgaca tagccaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1920

aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaactcga gctagcataa 1980

ccccttgggg cctctaaacg ggtcttgagg ggttttttg 2019

<210> 6

<211> 2190

<212> DNA

<213> Artificial Sequence

<400> 6

cgcgaaatta atacgactca ctatagggaa aaatcggtta aaaaaacttt tcttctgggt 60

atttatagag atcagtatga actgtgtcgc taggatcaag cggtggttca cacctgactt 120

cacccctggc gagggcgtga agtctcaaag catacctaat aacaatataa tcccataggt 180

atttttacaa caattaccaa caacaacaaa caacaaacaa cattacaatt actatttaca 240

attacaaaaa aaaaaaatgt ctgaagacgc caaaaacata aagaaaggcc cggcgccatt 300

ctatcctcta gaggatggaa ccgctggaga gcaactgcat aaggctatga agagatacgc 360

cctggttcct ggaacaattg cttttacaga tgcacatatc gaggtgaaca tcacgtacgc 420

ggaatacttc gaaatgtccg ttcggttggc agaagctatg aaacgatatg ggctgaatac 480

aaatcacaga atcgtcgtat gcagtgaaaa ctctcttcaa ttctttatgc cggtgttggg 540

cgcgttattt atcggagttg cagttgcgcc cgcgaacgac atttataatg aacgtgaatt 600

gctcaacagt atgaacattt cgcagcctac cgtagtgttt gtttccaaaa aggggttgca 660

aaaaattttg aacgtgcaaa aaaaattacc aataatccag aaaattatta tcatggattc 720

taaaacggat taccagggat ttcagtcgat gtacacgttc gtcacatctc atctacctcc 780

cggttttaat gaatacgatt ttgtaccaga gtcctttgat cgtgacaaaa caattgcact 840

gataatgaat tcctctggat ctactgggtt acctaagggt gtggcccttc cgcatagaac 900

tgcctgcgtc agattctcgc atgccagaga tcctattttt ggcaatcaaa tcattccgga 960

tactgcgatt ttaagtgttg ttccattcca tcacggtttt ggaatgttta ctacactcgg 1020

atatttgata tgtggatttc gagtcgtctt aatgtataga tttgaagaag agctgttttt 1080

acgatccctt caggattaca aaattcaaag tgcgttgcta gtaccaaccc tattttcatt 1140

cttcgccaaa agcactctga ttgacaaata cgatttatct aatttacacg aaattgcttc 1200

tgggggcgca cctctttcga aagaagtcgg ggaagcggtt gcaaaacgct tccatcttcc 1260

agggatacga caaggatatg ggctcactga gactacatca gctattctga ttacacccga 1320

gggggatgat aaaccgggcg cggtcggtaa agttgttcca ttttttgaag cgaaggttgt 1380

ggatctggat accgggaaaa cgctgggcgt taatcagaga ggcgaattat gtgtcagagg 1440

acctatgatt atgtccggtt atgtaaacaa tccggaagcg accaacgcct tgattgacaa 1500

ggatggatgg ctacattctg gagacatagc ttactgggac gaagacgaac acttcttcat 1560

agttgaccgc ttgaagtctt taattaaata caaaggatat caggtggccc ccgctgaatt 1620

ggaatcgata ttgttacaac accccaacat cttcgacgcg ggcgtggcag gtcttcccga 1680

cgatgacgcc ggtgaacttc ccgccgccgt tgttgttttg gagcacggaa agacgatgac 1740

ggaaaaagag atcgtggatt acgtcgccag tcaagtaaca accgcgaaaa agttgcgcgg 1800

aggagttgtg tttgtggacg aagtaccgaa aggtcttacc ggaaaactcg acgcaagaaa 1860

aatcagagag atcctcataa aggccaagaa gggcggaaag tccaaattgg tttaaataag 1920

gattaattag gttgggacgc aggtgtcgga gggatcgcca gccctcagga tagtgagctc 1980

ccgcagaggg ataagctatc tccctgcgac cttggatgcc aataaaaaaa aaaaagcgac 2040

atagccaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2100

aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaactcg agctagcata accccttggg 2160

gcctctaaac gggtcttgag gggttttttg 2190

<210> 7

<211> 2271

<212> DNA

<213> Artificial Sequence

<400> 7

cgcgaaatta atacgactca ctatagggaa aaatcggtta aaaaaacttt tcttctagaa 60

attctccagg atttctcgac ctagttcgtt tatctggtga cttgcgctac cgttgctttg 120

cgtagagaat ttctcaaagc atacctaata acaatataat cccataggta tttttacaac 180

aattaccaac aacaacaaac aacaaacaac attacaatta ctatttacaa ttacaaaaaa 240

aaaaaatgtc tgaagacgcc aaaaacataa agaaaggccc ggcgccattc tatcctctag 300

aggatggaac cgctggagag caactgcata aggctatgaa gagatacgcc ctggttcctg 360

gaacaattgc ttttacagat gcacatatcg aggtgaacat cacgtacgcg gaatacttcg 420

aaatgtccgt tcggttggca gaagctatga aacgatatgg gctgaataca aatcacagaa 480

tcgtcgtatg cagtgaaaac tctcttcaat tctttatgcc ggtgttgggc gcgttattta 540

tcggagttgc agttgcgccc gcgaacgaca tttataatga acgtgaattg ctcaacagta 600

tgaacatttc gcagcctacc gtagtgtttg tttccaaaaa ggggttgcaa aaaattttga 660

acgtgcaaaa aaaattacca ataatccaga aaattattat catggattct aaaacggatt 720

accagggatt tcagtcgatg tacacgttcg tcacatctca tctacctccc ggttttaatg 780

aatacgattt tgtaccagag tcctttgatc gtgacaaaac aattgcactg ataatgaatt 840

cctctggatc tactgggtta cctaagggtg tggcccttcc gcatagaact gcctgcgtca 900

gattctcgca tgccagagat cctatttttg gcaatcaaat cattccggat actgcgattt 960

taagtgttgt tccattccat cacggttttg gaatgtttac tacactcgga tatttgatat 1020

gtggatttcg agtcgtctta atgtatagat ttgaagaaga gctgttttta cgatcccttc 1080

aggattacaa aattcaaagt gcgttgctag taccaaccct attttcattc ttcgccaaaa 1140

gcactctgat tgacaaatac gatttatcta atttacacga aattgcttct gggggcgcac 1200

ctctttcgaa agaagtcggg gaagcggttg caaaacgctt ccatcttcca gggatacgac 1260

aaggatatgg gctcactgag actacatcag ctattctgat tacacccgag ggggatgata 1320

aaccgggcgc ggtcggtaaa gttgttccat tttttgaagc gaaggttgtg gatctggata 1380

ccgggaaaac gctgggcgtt aatcagagag gcgaattatg tgtcagagga cctatgatta 1440

tgtccggtta tgtaaacaat ccggaagcga ccaacgcctt gattgacaag gatggatggc 1500

tacattctgg agacatagct tactgggacg aagacgaaca cttcttcata gttgaccgct 1560

tgaagtcttt aattaaatac aaaggatatc aggtggcccc cgctgaattg gaatcgatat 1620

tgttacaaca ccccaacatc ttcgacgcgg gcgtggcagg tcttcccgac gatgacgccg 1680

gtgaacttcc cgccgccgtt gttgttttgg agcacggaaa gacgatgacg gaaaaagaga 1740

tcgtggatta cgtcgccagt caagtaacaa ccgcgaaaaa gttgcgcgga ggagttgtgt 1800

ttgtggacga agtaccgaaa ggtcttaccg gaaaactcga cgcaagaaaa atcagagaga 1860

tcctcataaa ggccaagaag ggcggaaagt ccaaattggt ttaaataagg attaattatc 1920

ttgttgggtt ctggatgtta ggatgacgag tcgactcggg ctccgcacta ggtttggtcg 1980

cctagaggat ggagatatgg aaagggtctc gtgtggtatc agtcggtcga aagacgcgct 2040

tccaacatgg gcctatggtc ggataagtct tagcaatacc agccagcatg acttggatgc 2100

caataaaaaa aaaaaagcga catagccaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2160

aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaactc 2220

gagctagcat aaccccttgg ggcctctaaa cgggtcttga ggggtttttt g 2271

<210> 8

<211> 2168

<212> DNA

<213> Artificial Sequence

<400> 8

cgcgaaatta atacgactca ctatagggaa aaatcggtta aaaaaacttt tcttctggaa 60

accgtggcaa gtttcccaaa gcatacctaa taacaatata atcccatagg tatttttaca 120

acaattacca acaacaacaa acaacaaaca acattacaat tactatttac aattacaaaa 180

aaaaaaaatg tctgaagacg ccaaaaacat aaagaaaggc ccggcgccat tctatcctct 240

agaggatgga accgctggag agcaactgca taaggctatg aagagatacg ccctggttcc 300

tggaacaatt gcttttacag atgcacatat cgaggtgaac atcacgtacg cggaatactt 360

cgaaatgtcc gttcggttgg cagaagctat gaaacgatat gggctgaata caaatcacag 420

aatcgtcgta tgcagtgaaa actctcttca attctttatg ccggtgttgg gcgcgttatt 480

tatcggagtt gcagttgcgc ccgcgaacga catttataat gaacgtgaat tgctcaacag 540

tatgaacatt tcgcagccta ccgtagtgtt tgtttccaaa aaggggttgc aaaaaatttt 600

gaacgtgcaa aaaaaattac caataatcca gaaaattatt atcatggatt ctaaaacgga 660

ttaccaggga tttcagtcga tgtacacgtt cgtcacatct catctacctc ccggttttaa 720

tgaatacgat tttgtaccag agtcctttga tcgtgacaaa acaattgcac tgataatgaa 780

ttcctctgga tctactgggt tacctaaggg tgtggccctt ccgcatagaa ctgcctgcgt 840

cagattctcg catgccagag atcctatttt tggcaatcaa atcattccgg atactgcgat 900

tttaagtgtt gttccattcc atcacggttt tggaatgttt actacactcg gatatttgat 960

atgtggattt cgagtcgtct taatgtatag atttgaagaa gagctgtttt tacgatccct 1020

tcaggattac aaaattcaaa gtgcgttgct agtaccaacc ctattttcat tcttcgccaa 1080

aagcactctg attgacaaat acgatttatc taatttacac gaaattgctt ctgggggcgc 1140

acctctttcg aaagaagtcg gggaagcggt tgcaaaacgc ttccatcttc cagggatacg 1200

acaaggatat gggctcactg agactacatc agctattctg attacacccg agggggatga 1260

taaaccgggc gcggtcggta aagttgttcc attttttgaa gcgaaggttg tggatctgga 1320

taccgggaaa acgctgggcg ttaatcagag aggcgaatta tgtgtcagag gacctatgat 1380

tatgtccggt tatgtaaaca atccggaagc gaccaacgcc ttgattgaca aggatggatg 1440

gctacattct ggagacatag cttactggga cgaagacgaa cacttcttca tagttgaccg 1500

cttgaagtct ttaattaaat acaaaggata tcaggtggcc cccgctgaat tggaatcgat 1560

attgttacaa caccccaaca tcttcgacgc gggcgtggca ggtcttcccg acgatgacgc 1620

cggtgaactt cccgccgccg ttgttgtttt ggagcacgga aagacgatga cggaaaaaga 1680

gatcgtggat tacgtcgcca gtcaagtaac aaccgcgaaa aagttgcgcg gaggagttgt 1740

gtttgtggac gaagtaccga aaggtcttac cggaaaactc gacgcaagaa aaatcagaga 1800

gatcctcata aaggccaaga agggcggaaa gtccaaattg gtttaaataa ggattaatta 1860

attgtatggg cctccgtgct gttaggggca gtggaaacgt cggtctagcc agccgtcccc 1920

tgggtagtgt gctccgtcta agtacaccac tactcggttt ccacaaacga tcggactccc 1980

aaaatagtct tggatgccaa taaaaaaaaa aaagcgacat agccaaaaaa aaaaaaaaaa 2040

aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2100

aaaaaaaaaa aaaactcgag ctagcataac cccttggggc ctctaaacgg gtcttgaggg 2160

gttttttg 2168

<210> 9

<211> 2102

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 9

cgcgaaatta atacgactca ctatagggaa aaatcggtta aaaaaacttt tcttctagta 60

aagacaggaa actttactca aagcatacct aataacaata taatcccata ggtattttta 120

caacaattac caacaacaac aaacaacaaa caacattaca attactattt acaattacaa 180

aaaaaaaaaa tgtctgaaga cgccaaaaac ataaagaaag gcccggcgcc attctatcct 240

ctagaggatg gaaccgctgg agagcaactg cataaggcta tgaagagata cgccctggtt 300

cctggaacaa ttgcttttac agatgcacat atcgaggtga acatcacgta cgcggaatac 360

ttcgaaatgt ccgttcggtt ggcagaagct atgaaacgat atgggctgaa tacaaatcac 420

agaatcgtcg tatgcagtga aaactctctt caattcttta tgccggtgtt gggcgcgtta 480

tttatcggag ttgcagttgc gcccgcgaac gacatttata atgaacgtga attgctcaac 540

agtatgaaca tttcgcagcc taccgtagtg tttgtttcca aaaaggggtt gcaaaaaatt 600

ttgaacgtgc aaaaaaaatt accaataatc cagaaaatta ttatcatgga ttctaaaacg 660

gattaccagg gatttcagtc gatgtacacg ttcgtcacat ctcatctacc tcccggtttt 720

aatgaatacg attttgtacc agagtccttt gatcgtgaca aaacaattgc actgataatg 780

aattcctctg gatctactgg gttacctaag ggtgtggccc ttccgcatag aactgcctgc 840

gtcagattct cgcatgccag agatcctatt tttggcaatc aaatcattcc ggatactgcg 900

attttaagtg ttgttccatt ccatcacggt tttggaatgt ttactacact cggatatttg 960

atatgtggat ttcgagtcgt cttaatgtat agatttgaag aagagctgtt tttacgatcc 1020

cttcaggatt acaaaattca aagtgcgttg ctagtaccaa ccctattttc attcttcgcc 1080

aaaagcactc tgattgacaa atacgattta tctaatttac acgaaattgc ttctgggggc 1140

gcacctcttt cgaaagaagt cggggaagcg gttgcaaaac gcttccatct tccagggata 1200

cgacaaggat atgggctcac tgagactaca tcagctattc tgattacacc cgagggggat 1260

gataaaccgg gcgcggtcgg taaagttgtt ccattttttg aagcgaaggt tgtggatctg 1320

gataccggga aaacgctggg cgttaatcag agaggcgaat tatgtgtcag aggacctatg 1380

attatgtccg gttatgtaaa caatccggaa gcgaccaacg ccttgattga caaggatgga 1440

tggctacatt ctggagacat agcttactgg gacgaagacg aacacttctt catagttgac 1500

cgcttgaagt ctttaattaa atacaaagga tatcaggtgg cccccgctga attggaatcg 1560

atattgttac aacaccccaa catcttcgac gcgggcgtgg caggtcttcc cgacgatgac 1620

gccggtgaac ttcccgccgc cgttgttgtt ttggagcacg gaaagacgat gacggaaaaa 1680

gagatcgtgg attacgtcgc cagtcaagta acaaccgcga aaaagttgcg cggaggagtt 1740

gtgtttgtgg acgaagtacc gaaaggtctt accggaaaac tcgacgcaag aaaaatcaga 1800

gagatcctca taaaggccaa gaagggcgga aagtccaaat tggtttaaat aaggattaat 1860

tacgtagcta ccctcctggt gcacttcctg gtgcaaagca gaaccaaggg tacggtggta 1920

cgcttggatg ccaataaaaa aaaaaaagcg acatagccaa aaaaaaaaaa aaaaaaaaaa 1980

aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2040

aaaaaaaact cgagctagca taaccccttg gggcctctaa acgggtcttg aggggttttt 2100

tg 2102

<210> 10

<211> 2190

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 10

cgcgaaatta atacgactca ctataggggt atttttacaa caattaccaa caacaacaaa 60

caacaaacaa cattacaatt actatttaca attacagaaa aatcggttaa aaaaactttt 120

cttctgggta tttatagaga tcagtatgaa ctgtgtcgct aggatcaagc ggtggttcac 180

acctgacttc acccctggcg agggcgtgaa gtctcaaagc atacctaata acaatataat 240

cccataaaaa aaaaaaatgt ctgaagacgc caaaaacata aagaaaggcc cggcgccatt 300

ctatcctcta gaggatggaa ccgctggaga gcaactgcat aaggctatga agagatacgc 360

cctggttcct ggaacaattg cttttacaga tgcacatatc gaggtgaaca tcacgtacgc 420

ggaatacttc gaaatgtccg ttcggttggc agaagctatg aaacgatatg ggctgaatac 480

aaatcacaga atcgtcgtat gcagtgaaaa ctctcttcaa ttctttatgc cggtgttggg 540

cgcgttattt atcggagttg cagttgcgcc cgcgaacgac atttataatg aacgtgaatt 600

gctcaacagt atgaacattt cgcagcctac cgtagtgttt gtttccaaaa aggggttgca 660

aaaaattttg aacgtgcaaa aaaaattacc aataatccag aaaattatta tcatggattc 720

taaaacggat taccagggat ttcagtcgat gtacacgttc gtcacatctc atctacctcc 780

cggttttaat gaatacgatt ttgtaccaga gtcctttgat cgtgacaaaa caattgcact 840

gataatgaat tcctctggat ctactgggtt acctaagggt gtggcccttc cgcatagaac 900

tgcctgcgtc agattctcgc atgccagaga tcctattttt ggcaatcaaa tcattccgga 960

tactgcgatt ttaagtgttg ttccattcca tcacggtttt ggaatgttta ctacactcgg 1020

atatttgata tgtggatttc gagtcgtctt aatgtataga tttgaagaag agctgttttt 1080

acgatccctt caggattaca aaattcaaag tgcgttgcta gtaccaaccc tattttcatt 1140

cttcgccaaa agcactctga ttgacaaata cgatttatct aatttacacg aaattgcttc 1200

tgggggcgca cctctttcga aagaagtcgg ggaagcggtt gcaaaacgct tccatcttcc 1260

agggatacga caaggatatg ggctcactga gactacatca gctattctga ttacacccga 1320

gggggatgat aaaccgggcg cggtcggtaa agttgttcca ttttttgaag cgaaggttgt 1380

ggatctggat accgggaaaa cgctgggcgt taatcagaga ggcgaattat gtgtcagagg 1440

acctatgatt atgtccggtt atgtaaacaa tccggaagcg accaacgcct tgattgacaa 1500

ggatggatgg ctacattctg gagacatagc ttactgggac gaagacgaac acttcttcat 1560

agttgaccgc ttgaagtctt taattaaata caaaggatat caggtggccc ccgctgaatt 1620

ggaatcgata ttgttacaac accccaacat cttcgacgcg ggcgtggcag gtcttcccga 1680

cgatgacgcc ggtgaacttc ccgccgccgt tgttgttttg gagcacggaa agacgatgac 1740

ggaaaaagag atcgtggatt acgtcgccag tcaagtaaca accgcgaaaa agttgcgcgg 1800

aggagttgtg tttgtggacg aagtaccgaa aggtcttacc ggaaaactcg acgcaagaaa 1860

aatcagagag atcctcataa aggccaagaa gggcggaaag tccaaattgg tttaaataag 1920

gattaattag gttgggacgc aggtgtcgga gggatcgcca gccctcagga tagtgagctc 1980

ccgcagaggg ataagctatc tccctgcgac cttggatgcc aataaaaaaa aaaaagcgac 2040

atagccaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2100

aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaactcg agctagcata accccttggg 2160

gcctctaaac gggtcttgag gggttttttg 2190

<210> 11

<211> 2271

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 11

cgcgaaatta atacgactca ctataggggt atttttacaa caattaccaa caacaacaaa 60

caacaaacaa cattacaatt actatttaca attacagaaa aatcggttaa aaaaactttt 120

cttctagaaa ttctccagga tttctcgacc tagttcgttt atctggtgac ttgcgctacc 180

gttgctttgc gtagagaatt tctcaaagca tacctaataa caatataatc ccataaaaaa 240

aaaaaatgtc tgaagacgcc aaaaacataa agaaaggccc ggcgccattc tatcctctag 300

aggatggaac cgctggagag caactgcata aggctatgaa gagatacgcc ctggttcctg 360

gaacaattgc ttttacagat gcacatatcg aggtgaacat cacgtacgcg gaatacttcg 420

aaatgtccgt tcggttggca gaagctatga aacgatatgg gctgaataca aatcacagaa 480

tcgtcgtatg cagtgaaaac tctcttcaat tctttatgcc ggtgttgggc gcgttattta 540

tcggagttgc agttgcgccc gcgaacgaca tttataatga acgtgaattg ctcaacagta 600

tgaacatttc gcagcctacc gtagtgtttg tttccaaaaa ggggttgcaa aaaattttga 660

acgtgcaaaa aaaattacca ataatccaga aaattattat catggattct aaaacggatt 720

accagggatt tcagtcgatg tacacgttcg tcacatctca tctacctccc ggttttaatg 780

aatacgattt tgtaccagag tcctttgatc gtgacaaaac aattgcactg ataatgaatt 840

cctctggatc tactgggtta cctaagggtg tggcccttcc gcatagaact gcctgcgtca 900

gattctcgca tgccagagat cctatttttg gcaatcaaat cattccggat actgcgattt 960

taagtgttgt tccattccat cacggttttg gaatgtttac tacactcgga tatttgatat 1020

gtggatttcg agtcgtctta atgtatagat ttgaagaaga gctgttttta cgatcccttc 1080

aggattacaa aattcaaagt gcgttgctag taccaaccct attttcattc ttcgccaaaa 1140

gcactctgat tgacaaatac gatttatcta atttacacga aattgcttct gggggcgcac 1200

ctctttcgaa agaagtcggg gaagcggttg caaaacgctt ccatcttcca gggatacgac 1260

aaggatatgg gctcactgag actacatcag ctattctgat tacacccgag ggggatgata 1320

aaccgggcgc ggtcggtaaa gttgttccat tttttgaagc gaaggttgtg gatctggata 1380

ccgggaaaac gctgggcgtt aatcagagag gcgaattatg tgtcagagga cctatgatta 1440

tgtccggtta tgtaaacaat ccggaagcga ccaacgcctt gattgacaag gatggatggc 1500

tacattctgg agacatagct tactgggacg aagacgaaca cttcttcata gttgaccgct 1560

tgaagtcttt aattaaatac aaaggatatc aggtggcccc cgctgaattg gaatcgatat 1620

tgttacaaca ccccaacatc ttcgacgcgg gcgtggcagg tcttcccgac gatgacgccg 1680

gtgaacttcc cgccgccgtt gttgttttgg agcacggaaa gacgatgacg gaaaaagaga 1740

tcgtggatta cgtcgccagt caagtaacaa ccgcgaaaaa gttgcgcgga ggagttgtgt 1800

ttgtggacga agtaccgaaa ggtcttaccg gaaaactcga cgcaagaaaa atcagagaga 1860

tcctcataaa ggccaagaag ggcggaaagt ccaaattggt ttaaataagg attaattatc 1920

ttgttgggtt ctggatgtta ggatgacgag tcgactcggg ctccgcacta ggtttggtcg 1980

cctagaggat ggagatatgg aaagggtctc gtgtggtatc agtcggtcga aagacgcgct 2040

tccaacatgg gcctatggtc ggataagtct tagcaatacc agccagcatg acttggatgc 2100

caataaaaaa aaaaaagcga catagccaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2160

aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaactc 2220

gagctagcat aaccccttgg ggcctctaaa cgggtcttga ggggtttttt g 2271

<210> 12

<211> 2168

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 12

cgcgaaatta atacgactca ctataggggt atttttacaa caattaccaa caacaacaaa 60

caacaaacaa cattacaatt actatttaca attacagaaa aatcggttaa aaaaactttt 120

cttctggaaa ccgtggcaag tttcccaaag catacctaat aacaatataa tcccataaaa 180

aaaaaaaatg tctgaagacg ccaaaaacat aaagaaaggc ccggcgccat tctatcctct 240

agaggatgga accgctggag agcaactgca taaggctatg aagagatacg ccctggttcc 300

tggaacaatt gcttttacag atgcacatat cgaggtgaac atcacgtacg cggaatactt 360

cgaaatgtcc gttcggttgg cagaagctat gaaacgatat gggctgaata caaatcacag 420

aatcgtcgta tgcagtgaaa actctcttca attctttatg ccggtgttgg gcgcgttatt 480

tatcggagtt gcagttgcgc ccgcgaacga catttataat gaacgtgaat tgctcaacag 540

tatgaacatt tcgcagccta ccgtagtgtt tgtttccaaa aaggggttgc aaaaaatttt 600

gaacgtgcaa aaaaaattac caataatcca gaaaattatt atcatggatt ctaaaacgga 660

ttaccaggga tttcagtcga tgtacacgtt cgtcacatct catctacctc ccggttttaa 720

tgaatacgat tttgtaccag agtcctttga tcgtgacaaa acaattgcac tgataatgaa 780

ttcctctgga tctactgggt tacctaaggg tgtggccctt ccgcatagaa ctgcctgcgt 840

cagattctcg catgccagag atcctatttt tggcaatcaa atcattccgg atactgcgat 900

tttaagtgtt gttccattcc atcacggttt tggaatgttt actacactcg gatatttgat 960

atgtggattt cgagtcgtct taatgtatag atttgaagaa gagctgtttt tacgatccct 1020

tcaggattac aaaattcaaa gtgcgttgct agtaccaacc ctattttcat tcttcgccaa 1080

aagcactctg attgacaaat acgatttatc taatttacac gaaattgctt ctgggggcgc 1140

acctctttcg aaagaagtcg gggaagcggt tgcaaaacgc ttccatcttc cagggatacg 1200

acaaggatat gggctcactg agactacatc agctattctg attacacccg agggggatga 1260

taaaccgggc gcggtcggta aagttgttcc attttttgaa gcgaaggttg tggatctgga 1320

taccgggaaa acgctgggcg ttaatcagag aggcgaatta tgtgtcagag gacctatgat 1380

tatgtccggt tatgtaaaca atccggaagc gaccaacgcc ttgattgaca aggatggatg 1440

gctacattct ggagacatag cttactggga cgaagacgaa cacttcttca tagttgaccg 1500

cttgaagtct ttaattaaat acaaaggata tcaggtggcc cccgctgaat tggaatcgat 1560

attgttacaa caccccaaca tcttcgacgc gggcgtggca ggtcttcccg acgatgacgc 1620

cggtgaactt cccgccgccg ttgttgtttt ggagcacgga aagacgatga cggaaaaaga 1680

gatcgtggat tacgtcgcca gtcaagtaac aaccgcgaaa aagttgcgcg gaggagttgt 1740

gtttgtggac gaagtaccga aaggtcttac cggaaaactc gacgcaagaa aaatcagaga 1800

gatcctcata aaggccaaga agggcggaaa gtccaaattg gtttaaataa ggattaatta 1860

attgtatggg cctccgtgct gttaggggca gtggaaacgt cggtctagcc agccgtcccc 1920

tgggtagtgt gctccgtcta agtacaccac tactcggttt ccacaaacga tcggactccc 1980

aaaatagtct tggatgccaa taaaaaaaaa aaagcgacat agccaaaaaa aaaaaaaaaa 2040

aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2100

aaaaaaaaaa aaaactcgag ctagcataac cccttggggc ctctaaacgg gtcttgaggg 2160

gttttttg 2168

<210> 13

<211> 2102

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 13

cgcgaaatta atacgactca ctataggggt atttttacaa caattaccaa caacaacaaa 60

caacaaacaa cattacaatt actatttaca attacagaaa aatcggttaa aaaaactttt 120

cttctagtaa agacaggaaa ctttactcaa agcataccta ataacaatat aatcccataa 180

aaaaaaaaaa tgtctgaaga cgccaaaaac ataaagaaag gcccggcgcc attctatcct 240

ctagaggatg gaaccgctgg agagcaactg cataaggcta tgaagagata cgccctggtt 300

cctggaacaa ttgcttttac agatgcacat atcgaggtga acatcacgta cgcggaatac 360

ttcgaaatgt ccgttcggtt ggcagaagct atgaaacgat atgggctgaa tacaaatcac 420

agaatcgtcg tatgcagtga aaactctctt caattcttta tgccggtgtt gggcgcgtta 480

tttatcggag ttgcagttgc gcccgcgaac gacatttata atgaacgtga attgctcaac 540

agtatgaaca tttcgcagcc taccgtagtg tttgtttcca aaaaggggtt gcaaaaaatt 600

ttgaacgtgc aaaaaaaatt accaataatc cagaaaatta ttatcatgga ttctaaaacg 660

gattaccagg gatttcagtc gatgtacacg ttcgtcacat ctcatctacc tcccggtttt 720

aatgaatacg attttgtacc agagtccttt gatcgtgaca aaacaattgc actgataatg 780

aattcctctg gatctactgg gttacctaag ggtgtggccc ttccgcatag aactgcctgc 840

gtcagattct cgcatgccag agatcctatt tttggcaatc aaatcattcc ggatactgcg 900

attttaagtg ttgttccatt ccatcacggt tttggaatgt ttactacact cggatatttg 960

atatgtggat ttcgagtcgt cttaatgtat agatttgaag aagagctgtt tttacgatcc 1020

cttcaggatt acaaaattca aagtgcgttg ctagtaccaa ccctattttc attcttcgcc 1080

aaaagcactc tgattgacaa atacgattta tctaatttac acgaaattgc ttctgggggc 1140

gcacctcttt cgaaagaagt cggggaagcg gttgcaaaac gcttccatct tccagggata 1200

cgacaaggat atgggctcac tgagactaca tcagctattc tgattacacc cgagggggat 1260

gataaaccgg gcgcggtcgg taaagttgtt ccattttttg aagcgaaggt tgtggatctg 1320

gataccggga aaacgctggg cgttaatcag agaggcgaat tatgtgtcag aggacctatg 1380

attatgtccg gttatgtaaa caatccggaa gcgaccaacg ccttgattga caaggatgga 1440

tggctacatt ctggagacat agcttactgg gacgaagacg aacacttctt catagttgac 1500

cgcttgaagt ctttaattaa atacaaagga tatcaggtgg cccccgctga attggaatcg 1560

atattgttac aacaccccaa catcttcgac gcgggcgtgg caggtcttcc cgacgatgac 1620

gccggtgaac ttcccgccgc cgttgttgtt ttggagcacg gaaagacgat gacggaaaaa 1680

gagatcgtgg attacgtcgc cagtcaagta acaaccgcga aaaagttgcg cggaggagtt 1740

gtgtttgtgg acgaagtacc gaaaggtctt accggaaaac tcgacgcaag aaaaatcaga 1800

gagatcctca taaaggccaa gaagggcgga aagtccaaat tggtttaaat aaggattaat 1860

tacgtagcta ccctcctggt gcacttcctg gtgcaaagca gaaccaaggg tacggtggta 1920

cgcttggatg ccaataaaaa aaaaaaagcg acatagccaa aaaaaaaaaa aaaaaaaaaa 1980

aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2040

aaaaaaaact cgagctagca taaccccttg gggcctctaa acgggtcttg aggggttttt 2100

tg 2102

<210> 14

<211> 53

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 14

cctgcgtccc aacctaatta atccttattt aaaccaattt ggactttccg ccc 53

<210> 15

<211> 36

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 15

cttggatgcc aataaaaaaa aaaaagcgac atagcc 36

<210> 16

<211> 47

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 16

ccaaattggt ttaaataagg attaattagg ttgggacgca ggtgtcg 47

<210> 17

<211> 59

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 17

ggctatgtcg cttttttttt tttattggca tccaaggtcg cagggagata gcttatccc 59

<210> 18

<211> 58

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 18

catccagaac ccaacaagat aattaatcct tatttaaacc aatttggact ttccgccc 58

<210> 19

<211> 54

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 19

gcaataccag ccagcatgac ttggatgcca ataaaaaaaa aaaagcgaca tagc 54

<210> 20

<211> 57

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 20

ccaaattggt ttaaataagg attaattatc ttgttgggtt ctggatgtta ggatgac 57

<210> 21

<211> 41

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 21

ggcatccaag tcatgctggc tggtattgct aagacttatc c 41

<210> 22

<211> 53

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 22

gaggcccata caattaatta atccttattt aaaccaattt ggactttccg ccc 53

<210> 23

<211> 55

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 23

gatcggactc ccaaaatagt cttggatgcc aataaaaaaa aaaaagcgac atagc 55

<210> 24

<211> 51

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 24

gtccaaattg gtttaaataa ggattaatta attgtatggg cctccgtgct g 51

<210> 25

<211> 37

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 25

ggcatccaag actattttgg gagtccgatc gtttgtg 37

<210> 26

<211> 55

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 26

caggagggta gctacgtaat taatccttat ttaaaccaat ttggactttc cgccc 55

<210> 27

<211> 50

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 27

gggtacggtg gtacgcttgg atgccaataa aaaaaaaaaa gcgacatagc 50

<210> 28

<211> 41

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 28

ggtttaaata aggattaatt acgtagctac cctcctggtg c 41

<210> 29

<211> 35

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 29

attggcatcc aagcgtacca ccgtaccctt ggttc 35

<210> 30

<211> 61

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 30

agaagaaaag tttttttaac cgatttttcc ctatagtgag tcgtattaat ttcgcgagat 60

c 61

<210> 31

<211> 69

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 31

gcatacctaa taacaatata atcccatagg tatttttaca acaattacca acaacaacaa 60

acaacaaac 69

<210> 32

<211> 59

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 32

ctcactatag ggaaaaatcg gttaaaaaaa cttttcttct gggtatttat agagatcag 59

<210> 33

<211> 59

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 33

gtaattgttg taaaaatacc tatgggatta tattgttatt aggtatgctt tgagacttc 59

<210> 34

<211> 57

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 34

cgactcacta tagggaaaaa tcggttaaaa aaacttttct tctagaaatt ctccagg 57

<210> 35

<211> 59

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 35

gtaattgttg taaaaatacc tatgggatta tattgttatt aggtatgctt tgagaaatt 59

<210> 36

<211> 51

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 36

cgactcacta tagggaaaaa tcggttaaaa aaacttttct tctggaaacc g 51

<210> 37

<211> 59

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 37

gtaattgttg taaaaatacc tatgggatta tattgttatt aggtatgctt tgggaaact 59

<210> 38

<211> 58

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 38

gactcactat agggaaaaat cggttaaaaa aacttttctt ctagtaaaga caggaaac 58

<210> 39

<211> 59

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 39

gtaattgttg taaaaatacc tatgggatta tattgttatt aggtatgctt tgagtaaag 59

<210> 40

<211> 61

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 40

aaagtttttt taaccgattt ttctgtaatt gtaaatagta attgtaatgt tgtttgttgt 60

t 61

<210> 41

<211> 69

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 41

caaagcatac ctaataacaa tataatccca taaaaaaaaa aaatgtctga agacgccaaa 60

aacataaag 69

<210> 42

<211> 59

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 42

cattacaatt actatttaca attacagaaa aatcggttaa aaaaactttt cttctgggt 59

<210> 43

<211> 59

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 43

cagacatttt ttttttttat gggattatat tgttattagg tatgctttga gacttcacg 59

<210> 44

<211> 59

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 44

caacattaca attactattt acaattacag aaaaatcggt taaaaaaact tttcttcta 59

<210> 45

<211> 59

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 45

cagacatttt ttttttttat gggattatat tgttattagg tatgctttga gaaattctc 59

<210> 46

<211> 59

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 46

caacattaca attactattt acaattacag aaaaatcggt taaaaaaact tttcttctg 59

<210> 47

<211> 59

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 47

cagacatttt ttttttttat gggattatat tgttattagg tatgctttgg gaaacttgc 59

<210> 48

<211> 59

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 48

caacattaca attactattt acaattacag aaaaatcggt taaaaaaact tttcttcta 59

<210> 49

<211> 59

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 49

gacatttttt ttttttatgg gattatattg ttattaggta tgctttgagt aaagtttcc 59

<210> 50

<211> 1128

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 50

cgcgaaatta atacgactca ctataggggt atttttacaa caattaccaa caacaacaaa 60

caacaaacaa cattacaatt actatttaca attacaaaaa aaaaaaatgg tgagcaaggg 120

cgaggagctg ttcaccgggg tggtgcccat cctggtcgag ctggacggcg acgtaaacgg 180

ccacaagttc agcgtgcgcg gcgagggcga gggcgatgcc accaacggca agctgaccct 240

gaagttcatc tgcaccaccg gcaagctgcc cgtgccctgg cccaccctcg tgaccaccct 300

gacctacggc gtgcagtgct tcagccgcta ccccgaccac atgaagcagc acgacttctt 360

caagtccgcc atgcccgaag gctacgtcca ggagcgcacc atctccttca aggacgacgg 420

cacctacaag acccgcgccg aggtgaagtt cgagggcgac accctggtga accgcatcga 480

gctgaagggc atcgacttca aggaggacgg caacatcctg gggcacaagc tggagtacaa 540

cttcaacagc cacaacgtct atatcacggc cgacaagcag aagaacggca tcaaggcgaa 600

cttcaagatc cgccacaacg tcgaggacgg cagcgtgcag ctcgccgacc actaccagca 660

gaacaccccc atcggcgacg gccccgtgct gctgcccgac aaccactacc tgagcaccca 720

gtccaagctg agcaaagacc ccaacgagaa gcgcgatcac atggtcctgc tggagttcgt 780

gaccgccgcc gggatcactc tcggcatgga cgagctgtac aagtaaataa ggattaatta 840

cttggatgcc aataaaaaaa aaaaagcgac atagccaaaa aaaaaaaaaa aaaaaaaaaa 900

aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 960

aaaaaactcg agcaccacca ccaccaccac tgagatccgg ctgctaacaa agcccgaaag 1020

gaagctgagt tggctgctgc caccgctgag caataactag cataacccct tggggcctct 1080

aaacgggtct tgaggggttt tttgctgaaa ggaggaacta tatccgga 1128

<210> 51

<211> 1356

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 51

cgcgaaatta atacgactca ctatagggaa aaatcggtta aaaaaacttt tcttctagaa 60

attctccagg atttctcgac ctagttcgtt tatctggtga cttgcgctac cgttgctttg 120

cgtagagaat ttctcaaagc atacctaata acaatataat cccataggta tttttacaac 180

aattaccaac aacaacaaac aacaaacaac attacaatta ctatttacaa ttacaaaaaa 240

aaaaaatggt gagcaagggc gaggagctgt tcaccggggt ggtgcccatc ctggtcgagc 300

tggacggcga cgtaaacggc cacaagttca gcgtgcgcgg cgagggcgag ggcgatgcca 360

ccaacggcaa gctgaccctg aagttcatct gcaccaccgg caagctgccc gtgccctggc 420

ccaccctcgt gaccaccctg acctacggcg tgcagtgctt cagccgctac cccgaccaca 480

tgaagcagca cgacttcttc aagtccgcca tgcccgaagg ctacgtccag gagcgcacca 540

tctccttcaa ggacgacggc acctacaaga cccgcgccga ggtgaagttc gagggcgaca 600

ccctggtgaa ccgcatcgag ctgaagggca tcgacttcaa ggaggacggc aacatcctgg 660

ggcacaagct ggagtacaac ttcaacagcc acaacgtcta tatcacggcc gacaagcaga 720

agaacggcat caaggcgaac ttcaagatcc gccacaacgt cgaggacggc agcgtgcagc 780

tcgccgacca ctaccagcag aacaccccca tcggcgacgg ccccgtgctg ctgcccgaca 840

accactacct gagcacccag tccaagctga gcaaagaccc caacgagaag cgcgatcaca 900

tggtcctgct ggagttcgtg accgccgccg ggatcactct cggcatggac gagctgtaca 960

agtaaataag gattaattat cttgttgggt tctggatgtt aggatgacga gtcgactcgg 1020

gctccgcact aggtttggtc gcctagagga tggagatatg gaaagggtct cgtgtggtat 1080

cagtcggtcg aaagacgcgc ttccaacatg ggcctatggt cggataagtc ttagcaatac 1140

cagccagcat gacttggatg ccaataaaaa aaaaaaagcg acatagccaa aaaaaaaaaa 1200

aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1260

aaaaaaaaaa aaaaaaaact cgagctagca taaccccttg gggcctctaa acgggtcttg 1320

aggggttttt tgctgaaagg aggaactata tccgga 1356

<210> 52

<211> 1253

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 52

cgcgaaatta atacgactca ctatagggaa aaatcggtta aaaaaacttt tcttctggaa 60

accgtggcaa gtttcccaaa gcatacctaa taacaatata atcccatagg tatttttaca 120

acaattacca acaacaacaa acaacaaaca acattacaat tactatttac aattacaaaa 180

aaaaaaaatg gtgagcaagg gcgaggagct gttcaccggg gtggtgccca tcctggtcga 240

gctggacggc gacgtaaacg gccacaagtt cagcgtgcgc ggcgagggcg agggcgatgc 300

caccaacggc aagctgaccc tgaagttcat ctgcaccacc ggcaagctgc ccgtgccctg 360

gcccaccctc gtgaccaccc tgacctacgg cgtgcagtgc ttcagccgct accccgacca 420

catgaagcag cacgacttct tcaagtccgc catgcccgaa ggctacgtcc aggagcgcac 480

catctccttc aaggacgacg gcacctacaa gacccgcgcc gaggtgaagt tcgagggcga 540

caccctggtg aaccgcatcg agctgaaggg catcgacttc aaggaggacg gcaacatcct 600

ggggcacaag ctggagtaca acttcaacag ccacaacgtc tatatcacgg ccgacaagca 660

gaagaacggc atcaaggcga acttcaagat ccgccacaac gtcgaggacg gcagcgtgca 720

gctcgccgac cactaccagc agaacacccc catcggcgac ggccccgtgc tgctgcccga 780

caaccactac ctgagcaccc agtccaagct gagcaaagac cccaacgaga agcgcgatca 840

catggtcctg ctggagttcg tgaccgccgc cgggatcact ctcggcatgg acgagctgta 900

caagtaaata aggattaatt aattgtatgg gcctccgtgc tgttaggggc agtggaaacg 960

tcggtctagc cagccgtccc ctgggtagtg tgctccgtct aagtacacca ctactcggtt 1020

tccacaaacg atcggactcc caaaatagtc ttggatgcca ataaaaaaaa aaaagcgaca 1080

tagccaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1140

aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaactcga gctagcataa ccccttgggg 1200

cctctaaacg ggtcttgagg ggttttttgc tgaaaggagg aactatatcc gga 1253

<210> 53

<211> 1187

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 53

cgcgaaatta atacgactca ctatagggaa aaatcggtta aaaaaacttt tcttctagta 60

aagacaggaa actttactca aagcatacct aataacaata taatcccata ggtattttta 120

caacaattac caacaacaac aaacaacaaa caacattaca attactattt acaattacaa 180

aaaaaaaaaa tggtgagcaa gggcgaggag ctgttcaccg gggtggtgcc catcctggtc 240

gagctggacg gcgacgtaaa cggccacaag ttcagcgtgc gcggcgaggg cgagggcgat 300

gccaccaacg gcaagctgac cctgaagttc atctgcacca ccggcaagct gcccgtgccc 360

tggcccaccc tcgtgaccac cctgacctac ggcgtgcagt gcttcagccg ctaccccgac 420

cacatgaagc agcacgactt cttcaagtcc gccatgcccg aaggctacgt ccaggagcgc 480

accatctcct tcaaggacga cggcacctac aagacccgcg ccgaggtgaa gttcgagggc 540

gacaccctgg tgaaccgcat cgagctgaag ggcatcgact tcaaggagga cggcaacatc 600

ctggggcaca agctggagta caacttcaac agccacaacg tctatatcac ggccgacaag 660

cagaagaacg gcatcaaggc gaacttcaag atccgccaca acgtcgagga cggcagcgtg 720

cagctcgccg accactacca gcagaacacc cccatcggcg acggccccgt gctgctgccc 780

gacaaccact acctgagcac ccagtccaag ctgagcaaag accccaacga gaagcgcgat 840

cacatggtcc tgctggagtt cgtgaccgcc gccgggatca ctctcggcat ggacgagctg 900

tacaagtaaa taaggattaa ttacgtagct accctcctgg tgcacttcct ggtgcaaagc 960

agaaccaagg gtacggtggt acgcttggat gccaataaaa aaaaaaaagc gacatagcca 1020

aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1080

aaaaaaaaaa aaaaaaaaaa aaaaaaaaac tcgagctagc ataacccctt ggggcctcta 1140

aacgggtctt gaggggtttt ttgctgaaag gaggaactat atccgga 1187

<210> 54

<211> 59

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 54

tttttttttt gtaattgtaa atagtaattg taatgttgtt tgttgtttgt tgttgttgg 59

<210> 55

<211> 55

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 55

cggcatggac gagctgtaca agtaaataag gattaattat cttgttgggt tctgg 55

<210> 56

<211> 58

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 56

caacattaca attactattt acaattacaa aaaaaaaaaa tggtgagcaa gggcgagg 58

<210> 57

<211> 31

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 57

ccttatttac ttgtacagct cgtccatgcc g 31

<210> 58

<211> 58

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 58

cggcatggac gagctgtaca agtaaataag gattaattaa ttgtatgggc ctccgtgc 58

<210> 59

<211> 59

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 59

cggcatggac gagctgtaca agtaaataag gattaattac gtagctaccc tcctggtgc 59

<210> 60

<211> 27

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 60

cgcgaaatta atacgactca ctatagg 27

<210> 61

<211> 24

<212> DNA

<213>artificial sequence (artificial sequence)

<400> 61

tccggatata gttcctcctt tcag 24
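Each entry in the listing above declares its length in the `<211>` field, and every sequence line ends with a running base count. The following is a minimal sketch, not part of the original filing, of how those two numbers could be cross-checked against the bases actually printed. The function names `parse_entry` and `length_is_consistent` are hypothetical, and the sketch assumes the layout seen here: whitespace-separated blocks of bases followed by one integer per line.

```python
def parse_entry(lines):
    """Return (sequence, last_running_count) for the sequence lines of one entry.

    Assumes each non-blank line is blocks of bases followed by a running count.
    """
    blocks = []
    last_count = 0
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue  # the listing separates sequence lines with blank lines
        last_count = int(tokens[-1])   # trailing integer = cumulative base count
        blocks.extend(tokens[:-1])     # everything before it is sequence
    return "".join(blocks), last_count


def length_is_consistent(declared_length, lines):
    """True if the <211> value, the final running count and the base count agree."""
    sequence, last_count = parse_entry(lines)
    return len(sequence) == declared_length == last_count


# Sequence lines copied from entries <210> 60 and <210> 30 of the listing.
seq60 = ["cgcgaaatta atacgactca ctatagg 27"]
seq30 = [
    "agaagaaaag tttttttaac cgatttttcc ctatagtgag tcgtattaat ttcgcgagat 60",
    "",
    "c 61",
]

print(length_is_consistent(27, seq60))
print(length_is_consistent(61, seq30))
```

A check of this kind only confirms internal consistency of the printed listing; it says nothing about whether a sequence is biologically correct.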

Claims

1. A nucleic acid construct, wherein the nucleic acid construct has, from 5' to 3', the structure shown in formula A:

Z1-Z2-Y1-Z3-Z4-Y2-Z5 (A)

In the formula,

Z1 is a promoter element;

Z2 and Z4 are cyclization elements, wherein the cyclization elements Z2 and Z4 cause the mRNA transcription product corresponding to formula A to form a cyclized structure through direct RNA-RNA interaction, and the cyclization elements are selected from the group consisting of: TED, BTE, PTE, I-type structural elements, Y-type structural elements, T-type structural elements, other elements capable of direct RNA-RNA interaction, and combinations thereof;

Y1 is absent or is the element shown as "Z6-Z7-Z8";

Z3 is the coding sequence of a foreign protein;

Y2 is absent or is Z9;

Z5 is a terminator element;

Z6 is absent or is an enhancer element;

Z8 is absent or is a Kozak sequence;

Z9 is a polyadenine DNA chain;

Each "-" is independently a bond or a nucleotide linker sequence.
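The element order of formula A can be illustrated with a short sketch. This is an illustration only and not the applicant's method: the function name `assemble_formula_a` is hypothetical, the element strings in the usage lines are toy placeholders rather than real sequences, and the sketch assumes that an absent element is represented by an empty string and that a single optional linker is shared by all "-" positions.

```python
def assemble_formula_a(z1, z2, z3, z4, z5, z6="", z7="", z8="", z9="", linker=""):
    """Concatenate elements 5'->3' in the order Z1-Z2-Y1-Z3-Z4-Y2-Z5.

    Y1 is Z6-Z7-Z8 (or nothing) and Y2 is Z9 (or nothing); absent elements
    are passed as empty strings and skipped.
    """
    required = {"Z1": z1, "Z2": z2, "Z3": z3, "Z4": z4, "Z5": z5}
    missing = [name for name, value in required.items() if not value]
    if missing:
        raise ValueError("missing required element(s): " + ", ".join(missing))

    y1 = [z6, z7, z8]   # optional block between Z2 and Z3
    y2 = [z9]           # optional poly(A) block between Z4 and Z5
    parts = [z1, z2, *y1, z3, z4, *y2, z5]
    return linker.join(part for part in parts if part)


# Toy placeholders only; these strings are not sequences from the listing.
minimal = assemble_formula_a("PROM", "CYC5", "CDS", "CYC3", "TERM", linker="-")
with_tail = assemble_formula_a("PROM", "CYC5", "CDS", "CYC3", "TERM",
                               z9="AAAA", linker="-")
print(minimal)
print(with_tail)
```

Treating every optional element as an independent empty-or-present string keeps the sketch simple; a stricter model could require that Z6, Z7 and Z8 appear only as one block.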

2. A nucleic acid construct, characterized in that the nucleic acid construct contains a nucleic acid sequence having the structure shown in formula I:

Z1-Z2-Z3-Z4-Z5 (I)

In the formula,

Z1-Z4 are each elements constituting the construct;

Each "-" is independently a bond or a nucleotide linker sequence;

Z1 is a promoter element, and preferably the promoter element is selected from the group consisting of: T7 promoter, T3 promoter, SP6 promoter, and combinations thereof;

Z4 and Z2 are cyclization elements, and the cyclization elements are selected from the group consisting of: TED, BTE, PTE, I-type structural elements, Y-type structural elements, T-type structural elements, other elements capable of direct RNA-RNA interaction, and combinations thereof;

Z3 is the coding sequence of a foreign protein;

Z5 is a terminator element, and preferably the terminator element is selected from the group consisting of: T7 terminator, T3 terminator, SP6 terminator, and combinations thereof.

3. A nucleic acid construct, characterized in that the nucleic acid construct contains a nucleic acid sequence having the structure shown in formula II:

Z1-Z2-Z6-Z7-Z8-Z3-Z4-Z9-Z5 (II)

In the formula,

Each "-" is independently a bond or a nucleotide linker sequence;

Z1 is a promoter element, and the promoter element is selected from the group consisting of: T7 promoter, T3 promoter, SP6 promoter, and combinations thereof;

Z4 and Z2 are cyclization elements, and the cyclization elements are selected from the group consisting of: TED, BTE, PTE, I-type structural elements, Y-type structural elements, T-type structural elements, other elements capable of direct RNA-RNA interaction, and combinations thereof;

Z6 is an enhancer element, and the enhancer element includes an IRES element;

Z8 is a Kozak sequence;

Z3 is the coding sequence of a foreign protein;

Z9 is a polyadenine [poly(A)] DNA chain, characterized in that the chain is selected from the group consisting of: 50A, 70A, 90A, and combinations thereof;

Z5 is a terminator element, and the terminator element is selected from the group consisting of: T7 terminator, T3 terminator, SP6 terminator, and combinations thereof.

4. An in vitro foreign protein synthesis system, characterized by comprising:

(a) the nucleic acid construct of any one of claims 1-3; and

(b) a yeast cell extract.

5. A vector or vector combination, characterized in that the vector or vector combination contains the nucleic acid construct of any one of claims 1-3.

6. A genetically engineered cell, characterized in that the nucleic acid construct of any one of claims 1-3 is integrated at one or more sites in the genome of the genetically engineered cell, or the genetically engineered cell contains the vector or vector combination of claim 5.

7. A kit, characterized in that the reagents contained in the kit are one or more selected from the group consisting of:

(a) the construct of any one of claims 1-3;

(b) the vector or vector combination of claim 5;

(c) the genetically engineered cell of claim 6.

8. Use of the construct of any one of claims 1-3, the vector or vector combination of claim 5, the genetically engineered cell of claim 6, or the kit of claim 7, characterized in that the use is for high-throughput in vitro protein synthesis.

9. A method for high-throughput in vitro synthesis of a foreign protein, characterized by comprising the steps of:

(i) providing the nucleic acid construct of any one of claims 1-3 in the presence of a eukaryotic in vitro biosynthesis system;

(ii) under suitable conditions, incubating the eukaryotic in vitro biosynthesis system of step (i) for a period of time T1, thereby synthesizing the foreign protein.

10. The method of claim 9, characterized in that, in step (ii), the reaction temperature is 20-37 °C, preferably 22-35 °C.

11. The method of claim 9, characterized in that, in step (ii), the reaction time is 1-72 h, preferably 2-23 h.

12. The method of claim 9, characterized in that the method further comprises (iii) optionally separating or detecting the foreign protein from the eukaryotic in vitro biosynthesis system.
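The numeric ranges recited in claims 10 and 11 can be captured in a small lookup for checking planned incubation conditions. This is a sketch under stated assumptions rather than anything prescribed by the claims: the names `RANGES` and `classify` are hypothetical, range endpoints are treated as inclusive, and temperature and time are classified independently because claims 10 and 11 each depend separately on claim 9.

```python
# Ranges as recited in claims 10 and 11 (endpoints assumed inclusive).
RANGES = {
    "temperature_c": {"claimed": (20.0, 37.0), "preferred": (22.0, 35.0)},
    "time_h": {"claimed": (1.0, 72.0), "preferred": (2.0, 23.0)},
}


def classify(parameter, value):
    """Return 'preferred', 'claimed' or 'outside' for one reaction parameter."""
    low, high = RANGES[parameter]["preferred"]
    if low <= value <= high:
        return "preferred"
    low, high = RANGES[parameter]["claimed"]
    if low <= value <= high:
        return "claimed"
    return "outside"


# Example: a 30 degree C incubation for 48 hours.
print(classify("temperature_c", 30))
print(classify("time_h", 48))
```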