CN113755512B - Method for preparing tandem repeat protein and application thereof - Google Patents
- Publication number
- CN113755512B (application number CN202011405477.XA)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/70—Vectors or expression systems specially adapted for E. coli
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/435—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
- C07K14/43504—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from invertebrates
- C07K14/43513—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from invertebrates from arachnidae
- C07K14/43518—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from invertebrates from arachnidae from spiders
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A50/00—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE in human health protection, e.g. against extreme weather
- Y02A50/30—Against vector-borne diseases, e.g. mosquito-borne, fly-borne, tick-borne or waterborne diseases whose impact is exacerbated by climate change
Abstract
The application provides a method for preparing tandem repeat proteins, together with related products and applications. The method comprises introducing an expression vector containing a double-stranded DNA molecule, named the single copy gene expression cassette, into recipient cells to obtain recombinant cells, extracting total RNA from the recombinant cells, and translating the total RNA to obtain the tandem repeat protein, which greatly shortens the time needed to prepare long tandem repeat proteins. Experiments show that tandem repeat MaSp1 with 40 repeats can be obtained in only 7 days, far faster than with the traditional method. The method has a short experimental cycle, saves time and cost, and is highly efficient.
Description
Technical Field
The application relates to a method for preparing tandem repeat proteins and to applications thereof, in the field of biotechnology.
Background
Tandem repeat proteins are proteins whose amino acid sequence is highly repetitive and which result from the expression of tandem repeat genes. In the past, tandem repeat proteins were prepared by constructing an expression vector containing tandem repeat DNA and then expressing the protein from it. Tandem repeat DNA expression vectors are currently constructed mainly by two methods: the asymmetric cohesive end complementation method and the isocaudomer method. The asymmetric cohesive end complementation method produces a random copy number and requires several enzymes for digestion and ligation. The isocaudomer method is also complicated and requires repeated rounds of digestion and ligation. Both methods are time-consuming and laborious.
The dragline silk protein in spider silk has high strength: at equal weight, dragline silk is 5 times as strong as steel wire and 3 times as strong as synthetic Kevlar fiber. Spider silk also has good plasticity, and these two properties give it wide application in many fields. In industry, examples include the preparation of parachutes, protective clothing, and composite materials for aircraft. In biomedicine, examples include wound sutures, carriers for the delivery of biological drugs, and scaffolds for cell culture and organ transplantation. Dragline silk is mainly composed of the spider silk proteins MaSp1 (major ampullate spidroin 1) and MaSp2 (major ampullate spidroin 2). Both are highly modular proteins with long repeats within the sequence, and their flanking sequences are approximately 100 amino acid residues in length. However, spider silk is difficult to obtain in large quantities by farming spiders, because spiders are strongly territorial and aggressive. Many studies have therefore attempted to express recombinant spider silk proteins in other hosts. Increasing the length of the recombinant dragline silk protein is one of the key factors in improving the mechanical properties of spun spider silk. Natural dragline silk proteins are 250-320 kDa in size. In 2010, researchers expressed a 284.9 kDa recombinant spider silk protein in E. coli, and the mechanical properties of the spun fiber were similar to those of natural spider silk. Expressing the 284.9 kDa recombinant spider silk protein requires first synthesizing the repeating unit MaSp1, then joining it into a MaSp2 concatemer by isocaudomer-based seamless splicing, then building MaSp4, MaSp8, MaSp16, MaSp32 and MaSp48 in turn by repeating the same method, and finally splicing MaSp96. The steps are complicated, time-consuming and laborious.
Moreover, if the spider silk sequence is to be optimized, the gene must be resynthesized, and reconstructing the whole series of concatemers takes a great deal of time.
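The stepwise build-up described above can be counted with a short sketch. It assumes, as an illustration only, that every construct is made by joining two constructs that already exist; the text lists the construct sizes but does not say which pairs were joined, so the pairings printed here are inferred, not reported.

```python
# Construct sizes (in MaSp1 repeat units) listed in the background section.
path = [1, 2, 4, 8, 16, 32, 48, 96]

def ligation_steps(sizes):
    """For each construct after the first, find two earlier constructs whose sizes add up to it.

    Assumption (not from the source): each round joins exactly two existing constructs.
    """
    built = [sizes[0]]
    steps = []
    for target in sizes[1:]:
        pair = next(
            ((a, b) for a in built for b in built if a <= b and a + b == target),
            None,
        )
        if pair is None:
            raise ValueError(f"{target}-mer cannot be made from two earlier constructs")
        steps.append((pair[0], pair[1], target))
        built.append(target)
    return steps

steps = ligation_steps(path)
for a, b, total in steps:
    print(f"{a}-mer + {b}-mer -> {total}-mer")
print(len(steps), "sequential digestion/ligation rounds")
```

Under that assumption the 96-mer needs seven sequential rounds, each of which involves digestion, ligation, transformation and verification, which is the cost the background section refers to.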
Disclosure of Invention
The problem to be solved by the present application is how to prepare tandem repeat proteins.
To solve this technical problem, the application provides a method for preparing a tandem repeat protein, comprising: introducing an expression vector containing a double-stranded DNA molecule, named the single copy gene expression cassette, into recipient cells to obtain recombinant cells; and culturing the recombinant cells so that the tandem repeat protein is expressed. The single copy gene expression cassette contains a promoter, an intron named the 3' intron linked to the promoter, a target protein coding gene named the single copy gene linked to the 3' intron, a coding sequence of a ribosome binding site (RBS) linked to the target protein coding gene, a spacer sequence (interval sequence) linked to the coding sequence of the ribosome binding site, a start codon linked to the spacer sequence, and an intron named the 5' intron linked to the start codon. The 3' intron and the 5' intron satisfy condition A: in the recombinant cell, the precursor RNA transcribed from the single copy gene expression cassette forms splice vesicles by complementary base pairing and yields a mature circular single-stranded RNA molecule through a splicing reaction. The target protein coding gene does not contain a stop codon.
In the above method, the spacer sequence is the sequence between the RBS and the ATG; it serves to bind the ribosome strongly to the mRNA. The spacer sequence may be a double-stranded DNA of 4-10 bp, e.g., a double-stranded DNA in which the nucleotide sequence of one strand is nucleotides 5535-5543 of sequence 1.
In the above method, the tandem repeat protein may contain more than 2 copies of the single copy protein, for example more than 7 copies or more than 10 copies.
In the above method, the single copy gene expression cassette is formed by ligating, in order, a promoter, the 3' intron, the target protein coding gene, the coding sequence of the ribosome binding site, the spacer sequence, the start codon and the 5' intron.
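A toy sketch may help show why this element order yields a tandem repeat protein. Once the introns are removed, the gene, RBS, spacer and start codon form a circle, so a ribosome starting at the ATG reads through the gene and, with no stop codon present, continues around the circle again. All sequences below are hypothetical placeholders, not taken from sequence 1. The requirement that the circle length be a multiple of 3 is an assumption of this sketch, made so that the reading frame is preserved from lap to lap. In a cell, translation ends when the ribosome dissociates, not after a fixed number of laps.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def circular_exon(gene, rbs, spacer, start="ATG"):
    """Exon in cassette order: gene, RBS, spacer, start codon (introns already spliced out)."""
    return gene + rbs + spacer + start

def translate_circle(exon, laps, start="ATG"):
    """Translate the circular template from the start codon for a fixed number of laps."""
    if not exon.endswith(start):
        raise ValueError("expected the start codon at the 3' end of the linear exon")
    if len(exon) % 3 != 0:
        raise ValueError("circle length is not a multiple of 3; the frame would shift each lap")
    # Joining the exon ends places the start codon directly upstream of the gene.
    rotated = start + exon[: -len(start)]
    protein = []
    for _ in range(laps):
        for k in range(0, len(rotated), 3):
            aa = CODON_TABLE[rotated[k:k + 3]]
            if aa == "*":  # a stop codon would end translation after a single copy
                return "".join(protein)
            protein.append(aa)
    return "".join(protein)

# Hypothetical placeholder sequences chosen only so that the example is in frame.
exon = circular_exon(gene="GGTGCTGGTCAA", rbs="AGGAGG", spacer="AAATAC")
one_copy = translate_circle(exon, laps=1)
three_copies = translate_circle(exon, laps=3)
print(one_copy)
print(three_copies)
```

In this toy model each lap adds one copy of the unit, and the RBS, spacer and ATG are translated as a short linker between copies.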
The single copy gene expression cassette, as an expression cassette for circular mRNA of the target protein, may include a promoter for initiating transcription of the gene encoding the target protein and a terminator for terminating that transcription. The single copy gene expression cassette may also include an enhancer sequence. Promoters useful in the present application include, but are not limited to: constitutive promoters; tissue-, organ- and development-specific promoters; and inducible promoters. Examples of promoters include, but are not limited to: the T7 promoter of T7 phage and the constitutive 35S promoter of cauliflower mosaic virus. They may be used alone or in combination with other promoters. Suitable transcription terminators include, but are not limited to: the Agrobacterium nopaline synthase terminator (NOS terminator), the cauliflower mosaic virus CaMV 35S terminator, and the tml terminator.
In the above method, the single copy gene is a target protein-encoding gene which does not contain a stop codon (TAA, TGA or TAG).
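The stop-codon condition can be checked mechanically before a gene is placed in the cassette. The sketch below scans a coding sequence codon by codon for TAA, TGA or TAG; the function name and the example sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TGA", "TAG"}

def in_frame_stops(cds):
    """Return (codon_index, codon) for every in-frame stop codon in a coding sequence."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [
        (i // 3, cds[i:i + 3])
        for i in range(0, len(cds), 3)
        if cds[i:i + 3] in STOP_CODONS
    ]

clean = in_frame_stops("GGTGCTGGTCAA")        # hypothetical gene without a stop codon
flagged = in_frame_stops("GGTGCTGGTCAATAA")   # same gene with a terminal TAA left in
print(clean)
print(flagged)
```

Only in-frame triplets are examined; a TGA or TAG that appears in another reading frame is not reported.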
In the above method, the initiation codon is ATG.
In the above method, the target gene may further comprise a replication initiation site (pMB1) gene.
In the above method, the target gene may further comprise a selectable marker gene. The selectable marker gene is a gene of known function and sequence that can serve as a specific marker. Examples include: genes encoding enzymes that produce a color change or luminescent compounds (GUS gene, luciferase gene, etc.); antibiotic marker genes (such as the nptII gene, which confers resistance to kanamycin and related antibiotics; the bar gene, which confers resistance to the herbicide phosphinothricin; the hph gene, which confers resistance to the antibiotic hygromycin; the dhfr gene, which confers resistance to methotrexate; and the EPSPS gene, which confers resistance to glyphosate); chemical agent marker genes (such as herbicide resistance genes); and the mannose-6-phosphate isomerase gene, which provides the ability to metabolize mannose.
In the above method, the target protein encoding gene encodes a target protein, and the target protein may be MaSp1; the MaSp1 is a protein with an amino acid sequence of SEQ ID No. 3.
In the above method, the recipient cell is any one of C1)-C4):
C1) prokaryotic microbial cells;
C2) Gram-negative bacterial cells;
C3) bacterial cells of the genus Escherichia;
C4) E. coli BL21(DE3) cells.
In the above method, the 3' intron and the 5' intron satisfying condition A are the following pair of introns:
the 3' intron contains 6 splice vesicles and a 3' splice site; the DNAs encoding the 6 splice vesicles are named the 3' sp1 gene, 3' sp2 gene, 3' sp3 gene, 3' sp4 gene, 3' sp5 gene and 3' sp6 gene, and the DNA encoding the 3' splice site is named the 3' ss gene. The 3' sp1 gene is a double-stranded DNA molecule in which the nucleotide sequence of one strand is positions 5193-5214 of sequence 1 in the sequence listing; the 3' sp2 gene is a double-stranded DNA molecule in which the nucleotide sequence of one strand is positions 5278-5289 of sequence 1; the 3' sp3 gene is a double-stranded DNA molecule in which the nucleotide sequence of one strand is positions 5293-5306 of sequence 1; the 3' sp4 gene is a double-stranded DNA molecule in which the nucleotide sequence of one strand is positions 5318-5337 of sequence 1; the 3' sp5 gene is a double-stranded DNA molecule in which the nucleotide sequence of one strand is positions 5352-5370 of sequence 1; the 3' sp6 gene is a double-stranded DNA molecule in which the nucleotide sequence of one strand is positions 5371-5386 of sequence 1; and the 3' splice site is a double-stranded DNA molecule in which the nucleotide sequence of one strand is positions 5419-5423 of sequence 1;
the 5' intron contains a 5' splice site and a 5' ss sequence. The nucleotide sequence of the 5' splice site is positions 5547-5556 of sequence 1. The nucleotide sequence of the 5' ss sequence is positions 5557-5721 of sequence 1; it comprises 4 splice vesicles, whose encoding DNAs are named the 5' sp1 gene, 5' sp2 gene, 5' sp3 gene and 5' sp4 gene. The 5' sp1 gene is a double-stranded DNA molecule in which the nucleotide sequence of one strand is positions 5569-5590 of sequence 1; the 5' sp2 gene is a double-stranded DNA molecule in which the nucleotide sequence of one strand is positions 5634-5643 of sequence 1; the 5' sp3 gene is a double-stranded DNA molecule in which the nucleotide sequence of one strand is positions 5648-5698 of sequence 1; and the 5' sp4 gene is a double-stranded DNA molecule in which the nucleotide sequence of one strand is positions 5671-5687 of sequence 1.
In the above method, the 3' intron is a double-stranded DNA in which the nucleotide sequence of one strand (the coding strand) is nucleotides 5190-5423 of sequence 1, and the 5' intron is a double-stranded DNA in which the nucleotide sequence of one strand (the coding strand) is nucleotides 5547-5721 of sequence 1.
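The coordinates given for the two introns and their sub-elements can be cross-checked with a small sketch. It treats every range as 1-based and inclusive, which is an assumption about the sequence listing convention, and it uses only the positions stated above. The dictionary layout and helper names are hypothetical.

```python
from itertools import combinations

# Positions in sequence 1 as stated in the text (assumed 1-based, inclusive).
INTRONS = {"3' intron": (5190, 5423), "5' intron": (5547, 5721)}
ELEMENTS = {
    "3' intron": {
        "3' sp1": (5193, 5214), "3' sp2": (5278, 5289), "3' sp3": (5293, 5306),
        "3' sp4": (5318, 5337), "3' sp5": (5352, 5370), "3' sp6": (5371, 5386),
        "3' splice site": (5419, 5423),
    },
    "5' intron": {
        "5' splice site": (5547, 5556),
        "5' sp1": (5569, 5590), "5' sp2": (5634, 5643),
        "5' sp3": (5648, 5698), "5' sp4": (5671, 5687),
    },
}

def length(span):
    """Length of a 1-based inclusive range."""
    return span[1] - span[0] + 1

def contains(outer, inner):
    return outer[0] <= inner[0] and inner[1] <= outer[1]

def overlaps(a, b):
    return a[0] <= b[1] and b[0] <= a[1]

# Elements that fall outside their parent intron.
outside = [
    name
    for intron, parts in ELEMENTS.items()
    for name, span in parts.items()
    if not contains(INTRONS[intron], span)
]

# Pairs of elements within the same intron whose ranges overlap.
overlapping = [
    (n1, n2)
    for parts in ELEMENTS.values()
    for (n1, s1), (n2, s2) in combinations(parts.items(), 2)
    if overlaps(s1, s2)
]

print("outside parent intron:", outside)
print("overlapping pairs:", overlapping)
print("3' intron length:", length(INTRONS["3' intron"]))
print("5' intron length:", length(INTRONS["5' intron"]))
```

With the stated positions, every element lies inside its intron, and the only overlapping pair is 5' sp3 and 5' sp4. That overlap may be intended or may be a transcription issue in the listed positions; the sketch only reports it.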
In the above method, the single copy gene expression cassette is a double-stranded DNA molecule in which the nucleotide sequence of one strand (the coding strand) is positions 5117-5835 of SEQ ID No. 1;
or the expression vector is a double-stranded DNA molecule (expressing the tandem repeat MaSp1 protein) in which the nucleotide sequence of one strand is SEQ ID No. 1.
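The positions quoted for the cassette, the introns and the spacer can be turned into lengths with simple arithmetic. The sketch below again assumes 1-based inclusive coordinates. Two inferences are the sketch's own and are not stated in the source: that the region between the two introns is the exon retained in the circular RNA, and that the 3 nt between the spacer and the 5' intron are the ATG start codon.

```python
# Positions in SEQ ID No. 1 as quoted in the text (assumed 1-based, inclusive).
FEATURES = {
    "expression cassette": (5117, 5835),
    "3' intron": (5190, 5423),
    "spacer sequence": (5535, 5543),
    "5' intron": (5547, 5721),
}

def length(span):
    """Length of a 1-based inclusive range."""
    return span[1] - span[0] + 1

def extract(seq, span):
    """Slice a 1-based inclusive range out of a 0-based Python string."""
    return seq[span[0] - 1 : span[1]]

lengths = {name: length(span) for name, span in FEATURES.items()}

# Inferred, not stated in the source: everything between the introns is exon.
exon = (FEATURES["3' intron"][1] + 1, FEATURES["5' intron"][0] - 1)
# Inferred: the gap between spacer and 5' intron is where the start codon sits.
gap_after_spacer = FEATURES["5' intron"][0] - FEATURES["spacer sequence"][1] - 1

for name, n in lengths.items():
    print(f"{name}: {n} nt")
print("inferred exon:", exon, length(exon), "nt")
print("gap between spacer and 5' intron:", gap_after_spacer, "nt")
print("exon length divisible by 3:", length(exon) % 3 == 0)
```

The spacer comes out at 9 bp, inside the stated 4-10 bp range. The inferred exon is 123 nt, which is a whole number of codons and so is consistent with in-frame translation around the circle.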
The application also provides any one of the following products related to the method:
A1) the double-stranded DNA molecule named the single copy gene expression cassette in the method;
A2) a vector containing the double-stranded DNA molecule of A1);
A3) a recombinant microorganism containing the double-stranded DNA molecule of A1).
The vector of A2) can be constructed from existing expression vectors, which include the pMD18-T vector, pET21b and the like. The existing expression vector may also contain the 3' untranslated region of the foreign gene, i.e., the polyadenylation signal and any other DNA fragments involved in mRNA processing or gene expression. The polyadenylation signal directs the addition of polyadenylic acid to the 3' end of the mRNA precursor. When constructing the vector of A2), enhancers such as transcription enhancers may also be used; these enhancer regions may be the ATG start codon or the start codon of an adjacent region, etc., but must be in the same reading frame as the coding sequence to ensure correct translation of the entire sequence. To facilitate identification and screening of transformants, the existing expression vector may be modified, for example by adding: genes encoding enzymes that produce a color change or luminescent compounds (GUS gene, luciferase gene, etc.); antibiotic marker genes (such as the nptII gene, which confers resistance to kanamycin and related antibiotics; the bar gene, which confers resistance to the herbicide phosphinothricin; the hph gene, which confers resistance to the antibiotic hygromycin; the dhfr gene, which confers resistance to methotrexate; and the EPSPS gene, which confers resistance to glyphosate); chemical reagent marker genes (such as herbicide resistance genes); or the mannose-6-phosphate isomerase gene, which provides the ability to metabolize mannose.
The application also provides the use of the above method or product in preparing tandem repeat proteins.
The application provides a method for preparing tandem repeat proteins, which comprises introducing an expression vector containing a double-stranded DNA molecule, named the single copy gene expression cassette, into recipient cells to obtain recombinant cells, extracting total RNA from the recombinant cells, and translating the total RNA into the tandem repeat protein, so that the time for preparing long tandem repeat proteins is greatly shortened. Experiments show that tandem repeat MaSp1 protein with 40 repeats can be obtained in only 7 days.
Drawings
FIG. 1 is a schematic diagram of the MaSp1 RNA expression cassette of example 1 of the present application. In the figure, RBS is the coding sequence of the ribosome binding site and ATG is the start codon.
FIG. 2 is a schematic diagram of the mechanism by which the introns are spliced to form circular MaSp1 RNA in example 1 of the present application. In the figure, BSJ is the back-splice junction site; RBS is the ribosome binding site; ATG is the start codon.
FIG. 3 is a schematic diagram of the mechanism of translation of the MaSp1 tandem repeat protein in example 1 of the present application. RBS is the ribosome binding site; ATG is the start codon.
FIG. 4 is an electrophoretogram verifying circularization of the MaSp1 RNA in example 1 of the present application.
FIG. 5 shows the Sanger sequencing results for the MaSp1 RNA splice junction in example 1 of the present application.
FIG. 6 is a PAGE gel of the spider silk protein obtained after translation of the circular MaSp1 RNA in example 1 of the present application, wherein M is the marker, 1 is the MaSp1 inclusion body fraction and 2 is the MaSp1 supernatant.
FIG. 7 is a Western blot of the protein obtained after translation of the circular MaSp1 RNA in example 1 of the present application, wherein 1 is the MaSp1 inclusion body fraction and 2 is the MaSp1 supernatant.
FIG. 8 is a mass spectrum of the putative MaSp1 protein in example 1 of the present application.
Detailed Description
The following detailed description of the application is provided in connection with the accompanying drawings that are presented to illustrate the application and not to limit the scope thereof. The experimental methods in the following examples are conventional methods unless otherwise specified. Materials, reagents and the like used in the examples described below are commercially available unless otherwise specified.
In the following examples, E. coli DH5α (BC102-02) is a product of Biomed; E. coli BL21 (CW0809S) is a product of Beijing Kangwei Century (CWBIO).
In the following examples, the RNAprep Pure cultured cell/bacteria total RNA extraction kit (DP430) is a product of TIANGEN; the ReverTra Ace qPCR RT Kit cDNA synthesis kit (FSQ-101) is a product of TOYOBO.
In the following examples, the 10x BSA protein solution (B9000S) is a product of NEB; 2x Es Taq MasterMix (with dye) (CW0690H) is a product of Beijing Kangwei Century (CWBIO).
In the following examples, the media used are in particular as follows:
Solid LB medium is a sterile medium prepared from tryptone, yeast extract, NaCl, agar and deionized water, containing 10 g/L tryptone, 5 g/L yeast extract, 10 g/L NaCl and 15 g/L agar.
Liquid LB medium is a sterile medium prepared from tryptone, yeast extract, NaCl and deionized water, containing 10 g/L tryptone, 5 g/L yeast extract and 10 g/L NaCl.
Solid LB medium with 100 μg/mL ampicillin is a sterile medium prepared from ampicillin, tryptone, yeast extract, NaCl, agar and deionized water, containing 100 μg/mL ampicillin, 10 g/L tryptone, 5 g/L yeast extract, 10 g/L NaCl and 15 g/L agar.
Liquid LB medium with 100 μg/mL ampicillin is a sterile medium prepared from ampicillin, tryptone, yeast extract, NaCl and deionized water, containing 100 μg/mL ampicillin, 10 g/L tryptone, 5 g/L yeast extract and 10 g/L NaCl.
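The four medium definitions above differ only in whether agar and ampicillin are present, so the weigh-out for any batch volume is simple proportional arithmetic. The following is a minimal sketch of that arithmetic; the function name `lb_recipe` and its parameters are illustrative assumptions, not part of the application.

```python
# Concentrations are taken from the medium definitions above.
LB_G_PER_L = {"tryptone": 10.0, "yeast extract": 5.0, "NaCl": 10.0}
AGAR_G_PER_L = 15.0            # solid medium only
AMPICILLIN_UG_PER_ML = 100.0   # selective medium only


def lb_recipe(volume_ml, solid=False, ampicillin=False):
    """Return the mass in grams of each component for `volume_ml` of LB medium."""
    litres = volume_ml / 1000.0
    recipe = {name: conc * litres for name, conc in LB_G_PER_L.items()}
    if solid:
        recipe["agar"] = AGAR_G_PER_L * litres
    if ampicillin:
        # ug/mL times mL gives ug; 1e-6 converts ug to g
        recipe["ampicillin"] = AMPICILLIN_UG_PER_ML * volume_ml * 1e-6
    return recipe


if __name__ == "__main__":
    for name, grams in lb_recipe(500, solid=True, ampicillin=True).items():
        print(f"{name}: {grams:.3f} g")
```

Deionized water is added to the final volume in every case, so it is not listed as a weighed component.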
Example 1. Preparation of tandem repeat MaSp1
In this example, an expression vector containing the single copy gene expression cassette, designated pMaSp1, was prepared. pMaSp1 is a double-stranded DNA, one strand of which has the nucleotide sequence of sequence 1 (SEQ ID No. 1) in the sequence listing. In sequence 1, positions 494-1459 are the ampicillin resistance gene, and positions 5117-5835 are the DNA molecule named the single copy gene expression cassette, hereinafter called the MaSp1 RNA expression cassette. The MaSp1 RNA expression cassette has the structure shown in FIG. 1 and consists of, in order: a T7 promoter (nucleotides 5117-5135 of sequence 1); an intron named the 3' intron, connected to the T7 promoter (nucleotides 5190-5423 of sequence 1, of which nucleotides 5190-5418 are the 3' ss gene and nucleotides 5419-5423 are the 3' splice site); a target protein coding gene connected to the 3' intron (hereinafter the MaSp1 gene; nucleotides 5424-5528 of sequence 1); the coding sequence of a ribosome binding site (RBS) connected to the MaSp1 gene (nucleotides 5529-5534 of sequence 1); a spacer sequence connected to the RBS coding sequence (nucleotides 5535-5543 of sequence 1); the start codon ATG connected to the spacer sequence (nucleotides 5544-5546 of sequence 1); and an intron named the 5' intron, connected to the start codon (nucleotides 5547-5721 of sequence 1, of which nucleotides 5547-5556 are the 5' splice site and nucleotides 5557-5721 are the 5' ss gene). Positions 5788-5835 of sequence 1 are the terminator, which stops transcription of the introns and the MaSp1 gene, and positions 12-467 are the replication initiation site.
The MaSp1 gene contains no stop codon (TAA, TGA or TAG). It encodes MaSp1, a protein whose amino acid sequence is sequence 2. The 3' intron and the 5' intron satisfy condition a: in the recombinant cell, the precursor RNA transcribed from the single copy gene expression cassette forms splice vesicles by complementary base pairing and yields a mature circular single-stranded RNA molecule through a splicing reaction (a G-OH-catalyzed splicing reaction; for the mechanism, see FIG. 2).
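The coordinates given for the MaSp1 gene, the RBS coding sequence, the spacer and the start codon can be checked directly against the sequence listing. The sketch below copies nucleotides 5401-5580 of sequence 1 from the listing, slices the named regions with 1-based inclusive coordinates, and translates the MaSp1 gene with the standard genetic code. The helper names `region` and `translate` are illustrative assumptions.

```python
# Nucleotides 5401-5580 of sequence 1, copied from the sequence listing.
FRAGMENT_START = 5401
FRAGMENT = (
    "ccttatctgaacataatgctaccagcggtcgcggcggtctgggtggccagggtgcaggta"
    "tggcggctgcggctgcaatgggcggtgctggccaaggtggctacggcggcctgggttctc"
    "agggtactaaggagatataccatatggatctgcgttcaattgaggcctgagtataaggtg"
)

# Standard genetic code, codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Sequence 2 from the sequence listing, in one-letter code.
SEQ2 = "SGRGGLGGQGAGMAAAAAMGGAGQGGYGGLGSQGT"


def region(start, end):
    """Slice the fragment using 1-based, inclusive sequence-1 coordinates."""
    return FRAGMENT[start - FRAGMENT_START:end - FRAGMENT_START + 1]


def translate(dna):
    """Translate a DNA string codon by codon; '*' marks a stop codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


if __name__ == "__main__":
    masp1 = region(5424, 5528)
    protein = translate(masp1)
    print("MaSp1 gene length:", len(masp1))
    print("matches sequence 2:", protein == SEQ2)
    print("contains a stop codon:", "*" in protein)
    print("RBS:", region(5529, 5534), "spacer:", region(5535, 5543),
          "start codon:", region(5544, 5546))
```

Only a 180-nucleotide window is embedded here to keep the example short; the same slicing works on the full 5860-nucleotide sequence with `FRAGMENT_START = 1`.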
The 3' intron contains 6 splice vesicles and a 3' splice site. The DNAs encoding the 6 splice vesicles are named the 3' sp1 gene, 3' sp2 gene, 3' sp3 gene, 3' sp4 gene, 3' sp5 gene and 3' sp6 gene, and the DNA encoding the 3' splice site is named the 3' ss gene. Each of these is a double-stranded DNA molecule, one strand of which has the following nucleotide sequence in sequence 1 of the sequence listing: 3' sp1 gene, positions 5193-5214; 3' sp2 gene, positions 5278-5289; 3' sp3 gene, positions 5293-5306; 3' sp4 gene, positions 5318-5337; 3' sp5 gene, positions 5352-5370; 3' sp6 gene, positions 5371-5386; 3' splice site, positions 5419-5423.
The 5' intron contains a 5' splice site and a 5' ss sequence. The nucleotide sequence of the 5' splice site is positions 5547-5556 of sequence 1. The nucleotide sequence of the 5' ss sequence is positions 5557-5721 of sequence 1; it comprises 4 splice vesicles, whose encoding DNAs are named the 5' sp1 gene, 5' sp2 gene, 5' sp3 gene and 5' sp4 gene. Each of these genes is a double-stranded DNA molecule, one strand of which has the following nucleotide sequence in sequence 1 of the sequence listing: 5' sp1 gene, positions 5569-5590; 5' sp2 gene, positions 5634-5643; 5' sp3 gene, positions 5648-5698; 5' sp4 gene, positions 5671-5687.
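Because the intron elements are defined purely by coordinate ranges, a small consistency check can confirm that each element lies inside its intron and show which ranges share positions. This is a sketch only: the dictionaries restate the coordinates listed above, and the function names `outside` and `overlapping` are illustrative assumptions.

```python
from itertools import combinations

# 1-based, inclusive coordinates in sequence 1, as listed in the text above.
INTRON_3 = (5190, 5423)
ELEMENTS_3 = {
    "3' sp1": (5193, 5214),
    "3' sp2": (5278, 5289),
    "3' sp3": (5293, 5306),
    "3' sp4": (5318, 5337),
    "3' sp5": (5352, 5370),
    "3' sp6": (5371, 5386),
    "3' splice site": (5419, 5423),
}
INTRON_5 = (5547, 5721)
ELEMENTS_5 = {
    "5' splice site": (5547, 5556),
    "5' sp1": (5569, 5590),
    "5' sp2": (5634, 5643),
    "5' sp3": (5648, 5698),
    "5' sp4": (5671, 5687),
}


def outside(parent, elements):
    """Names of elements that are not fully contained in the parent range."""
    lo, hi = parent
    return [name for name, (start, end) in elements.items()
            if start < lo or end > hi or start > end]


def overlapping(elements):
    """Pairs of element names whose ranges share at least one position."""
    return [(a, b)
            for (a, (s1, e1)), (b, (s2, e2)) in combinations(elements.items(), 2)
            if s1 <= e2 and s2 <= e1]


if __name__ == "__main__":
    print("outside 3' intron:", outside(INTRON_3, ELEMENTS_3))
    print("outside 5' intron:", outside(INTRON_5, ELEMENTS_5))
    print("overlaps in 3' intron:", overlapping(ELEMENTS_3))
    print("overlaps in 5' intron:", overlapping(ELEMENTS_5))
```

With the coordinates as listed, every element falls within its intron, and the only shared positions are between the 5' sp3 and 5' sp4 ranges, since 5671-5687 lies inside 5648-5698.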
The mechanism for preparing tandem repeat proteins with the expression vector pMaSp1, which contains the single copy gene expression cassette described above, is as follows. pMaSp1 is introduced into recipient cells to obtain recombinant cells, in which pMaSp1 is transcribed into a precursor RNA, also called precursor mRNA (pre-mRNA), shown in the left diagram of FIG. 2. In the precursor RNA, the 3' intron and the 5' intron form splice vesicles by complementary base pairing, and a G-OH-catalyzed splicing reaction closes the RNA into a ring, producing the mature circular single-stranded RNA molecule shown in the right diagram of FIG. 2, called circular MaSp1 RNA. The ribosome binds to the ribosome binding site (RBS) sequence on the circular MaSp1 RNA and initiates translation from AUG. Because the circular MaSp1 RNA contains no UAA, UGA or UAG, the ribosome continues translating around the circular MaSp1 RNA, thereby producing a MaSp1 tandem repeat protein (FIG. 3). The specific process is as follows:
1. Preparation of the expression vector pMaSp1 containing the single copy gene expression cassette
pMaSp1 is constructed in a modular manner, and the modules are joined by the Golden Gate method: protective bases, the recognition site of the restriction endonuclease BsaI and complementary sticky ends are added to both ends of each module by embedding them in the primers, specifically as follows:
1.1 Module
Construction of pMaSp1 requires module A and module B:
The deoxyribonucleotide sequence of module A is positions 5547-5423 of sequence 1 in the sequence listing, that is, positions 5547 through 5860 continuing from position 1 through 5423, since the plasmid is circular. Within it, positions 5547-5721 are the 5' intron (of which 5547-5556 are the 5' splice site and 5557-5721 are the 5' ss gene), positions 5788-5835 are the transcription terminator, positions 12-467 are the replication initiation site (pMB1) gene, positions 494-1459 are the ampicillin resistance gene, positions 5117-5135 are the T7 promoter, and positions 5190-5423 are the 3' intron (of which 5190-5418 are the 3' ss gene and 5419-5423 are the 3' splice site).
Module B contains the MaSp1 gene; its deoxyribonucleotide sequence is positions 5424-5528 of sequence 1 in the sequence listing.
1.2 processing of modules
Protective bases, the recognition site of the restriction endonuclease BsaI and a complementary sticky end were added to both ends of module A by PCR, giving a module A with BsaI sites at both ends, named module A-BsaI. The primer pair used for this PCR was PartA-F and PartA-R.
PartA-F: 5'-CCAGGTCTCAAAGGAGTACTCGATGGATCTCAGGTCAATTGAGGCCTGAGTA-3' (the BsaI recognition site is GGTCTC)
PartA-R: 5'-CCAGGTCTCAGGTAGCATTATGTTCAGATAAGGTC-3' (the BsaI recognition site is GGTCTC)
Protective bases, the recognition site of the restriction endonuclease BsaI and a complementary sticky end were added to both ends of module B by PCR, giving a module B with BsaI sites at both ends, named module B-BsaI. The primer pair used for this PCR was PartB-F and PartB-R.
PartB-F: 5'-CCAGGTCTCATACCAGCGGACGTGG-3' (the BsaI recognition site is GGTCTC)
PartB-R: 5'-CCAGGTCTCACCTTTGTTCCCTGGCTTCC-3' (the BsaI recognition site is GGTCTC)
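The way the four primers define complementary sticky ends can be sketched as follows. BsaI recognizes GGTCTC and cuts one nucleotide downstream on the same strand and five nucleotides downstream on the opposite strand, leaving a 4-nucleotide 5' overhang. The code extracts that overhang from each primer and checks that the two module junctions pair correctly. The function names are illustrative assumptions; the primer sequences are those given above.

```python
BSAI_SITE = "GGTCTC"

PRIMERS = {
    "PartA-F": "CCAGGTCTCAAAGGAGTACTCGATGGATCTCAGGTCAATTGAGGCCTGAGTA",
    "PartA-R": "CCAGGTCTCAGGTAGCATTATGTTCAGATAAGGTC",
    "PartB-F": "CCAGGTCTCATACCAGCGGACGTGG",
    "PartB-R": "CCAGGTCTCACCTTTGTTCCCTGGCTTCC",
}


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def overhang(primer):
    """4-nt sticky end left by BsaI: it starts 1 nt after the recognition site."""
    cut = primer.index(BSAI_SITE) + len(BSAI_SITE) + 1
    return primer[cut:cut + 4]


def compatible(primer_x, primer_y):
    """True if the two primer-defined sticky ends can anneal to each other."""
    return overhang(primer_x) == reverse_complement(overhang(primer_y))


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, overhang(seq))
    # Junction 1: end of module A (3' intron) to start of module B (MaSp1 gene)
    print(compatible(PRIMERS["PartA-R"], PRIMERS["PartB-F"]))
    # Junction 2: end of module B to start of module A
    print(compatible(PRIMERS["PartB-R"], PRIMERS["PartA-F"]))
```

Because the two junctions use different overhangs, the modules can only assemble in one orientation, which is the property the Golden Gate method relies on.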
1.3 construction of pMaSp1
Module A-BsaI and module B-BsaI were ligated at a molar ratio of 1:1 to construct the plasmid that produces circular MaSp1 RNA (pMaSp1), as follows. The 20 μL reaction system contained 5.05E-8 mol (about 100 ng) of module A-BsaI, 5.05E-8 mol of module B-BsaI, 1 μL BsaI enzyme, 1 μL T4 DNA ligase, 2 μL 10x T4 buffer and 2 μL 10x BSA protein solution, made up to 20 μL with deionized water. The reaction conditions were: 25 cycles of 37 °C for 3 min and 25 °C for 4 min; 50 °C for 5 min to digest unligated fragments; then 80 °C for 5 min to inactivate the enzymes. After the reaction, the Golden Gate reaction solution of pMaSp1 was obtained.
5 μL of the Golden Gate reaction solution of pMaSp1 was transformed into competent cells of E. coli DH5α, and transformants were screened on solid LB medium containing 100 μg/mL ampicillin. Colonies were picked and sequenced, a correctly constructed plasmid was selected, and the plasmid was amplified and extracted to obtain pMaSp1.
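A 1:1 molar ratio between two fragments of very different length corresponds to very different masses. The sketch below shows that conversion under two stated assumptions: an average mass of 650 g/mol per base pair of double-stranded DNA (a common approximation, not a value from the application), and fragment lengths derived from the sequence-1 coordinates of modules A and B without the primer-added tails. The function names are illustrative.

```python
AVG_BP_MASS = 650.0  # g/mol per base pair of dsDNA (approximation, assumed)

# Lengths from the sequence-1 coordinates, ignoring primer-added tails (assumed).
SEQ1_LENGTH = 5860
MODULE_A_BP = (SEQ1_LENGTH - 5547 + 1) + 5423  # positions 5547-5860 plus 1-5423
MODULE_B_BP = 5528 - 5424 + 1                  # positions 5424-5528


def dsdna_mol(mass_ng, length_bp):
    """Approximate amount in mol of a dsDNA fragment of the given mass and length."""
    return mass_ng * 1e-9 / (length_bp * AVG_BP_MASS)


def equimolar_mass_ng(ref_mass_ng, ref_bp, other_bp, ratio=1.0):
    """Mass of a second fragment giving `ratio` mol per mol of the reference fragment."""
    return ref_mass_ng * other_bp / ref_bp * ratio


if __name__ == "__main__":
    mass_b = equimolar_mass_ng(100.0, MODULE_A_BP, MODULE_B_BP)
    print(f"module A: {MODULE_A_BP} bp, module B: {MODULE_B_BP} bp")
    print(f"module B mass for 1:1 with 100 ng module A: {mass_b:.2f} ng")
    print(f"mol of module A in 100 ng: {dsdna_mol(100.0, MODULE_A_BP):.3e}")
```

The mass ratio depends only on the length ratio, so the 650 g/mol figure cancels out when comparing the two modules; it matters only for the absolute molar amount.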
2. Construction and validation of circular MaSp1 RNA
0.5 μL of pMaSp1 was transformed into competent cells of E. coli BL21(DE3), and transformants were screened on solid LB medium containing 100 μg/mL ampicillin to obtain an E. coli BL21(DE3) positive transformant (a recombinant cell carrying pMaSp1). The positive transformant was transferred into liquid LB medium containing 100 μg/mL ampicillin and cultured at 37 °C to the induction OD600nm, then induced with 1 mM isopropyl-β-D-thiogalactoside (IPTG) for 12 h. 1 mL of the culture was centrifuged at 12,000 rpm for 2 min, the supernatant was discarded, and total RNA of the E. coli BL21(DE3) positive transformant was extracted with the RNAprep Pure cultured cell/bacteria total RNA extraction kit according to the kit instructions.
Total RNA was reverse transcribed into cDNA with the ReverTra Ace qPCR RT Kit cDNA synthesis kit according to the kit instructions.
PCR reactions were performed on cDNA using primer pairs Testify cirMaSp1-F and Testify cirMaSp1-R to verify whether the MaSp1 RNA was circular.
Testify cirMaSp1-F:5’-CAGGACAGGGAGGATATGGA-3’;
Testify cirMaSp1-R: 5'-CTCCTCCCATGGCTGC-3'.
The DNA polymerase used for verification was 2x Es Taq MasterMix (with dye). The PCR products were separated by electrophoresis; the electrophoretogram is shown in FIG. 4. The band of about 100 bp was sent for sequencing, and the Sanger sequencing result for the MaSp1 RNA splice junction is shown in FIG. 5. The sequencing result shows that the 5' splice site is joined to the 3' splice site, indicating that the MaSp1 RNA has been circularized.
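The logic of this verification is that a sequence spanning the back-splice junction exists only in the circular RNA, not in the linear template. The sketch below illustrates that test under a loudly stated assumption: that the junction reads as the 5' splice site (positions 5547-5556 of sequence 1) followed directly by the 3' splice site (positions 5419-5423). The read used here is hypothetical and built only for illustration; it is not data from FIG. 5.

```python
def reverse_complement(seq):
    """Reverse complement of a lowercase DNA string."""
    return seq.translate(str.maketrans("acgt", "tgca"))[::-1]


FIVE_PRIME_SS = "gatctgcgtt"   # positions 5547-5556 of sequence 1
THREE_PRIME_SS = "ctacc"       # positions 5419-5423 of sequence 1
JUNCTION = FIVE_PRIME_SS + THREE_PRIME_SS  # assumed back-splice junction

# Positions 5541-5580 of sequence 1: the linear context of the 5' splice site.
LINEAR_TEMPLATE = "catatggatctgcgttcaattgaggcctgagtataaggtg"

# Hypothetical read spanning the junction (illustration only, not real data).
HYPOTHETICAL_READ = "accatatg" + JUNCTION + "agcggtcgc"


def junction_orientation(read):
    """Return 'forward', 'reverse', or None depending on where the junction is found."""
    read = read.lower()
    if JUNCTION in read:
        return "forward"
    if reverse_complement(JUNCTION) in read:
        return "reverse"
    return None


if __name__ == "__main__":
    print(junction_orientation(HYPOTHETICAL_READ))
    print(junction_orientation(reverse_complement(HYPOTHETICAL_READ)))
    print(junction_orientation(LINEAR_TEMPLATE))
```

Checking both orientations matters because a Sanger read may come from either primer of the pair.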
3. Preparation of MaSp1 tandem repeat proteins
The E. coli BL21(DE3) positive transformant whose MaSp1 RNA was verified as circular in step 2 was inoculated into liquid LB medium containing 100 μg/mL ampicillin, cultured at 37 °C to the induction OD600, and then induced with 1 mM IPTG at 37 °C for 6 h (labeled MaSp1 cirmRNA 6h in FIG. 6). As a control, the BL21 strain with circularized MaSp1 RNA was replaced with the BL21 strain carrying the empty vector (labeled empty vector 6h in FIG. 6). After induction, protein gel samples were prepared. The gel result is shown in FIG. 6: compared with the control, the E. coli BL21(DE3) positive transformant with circularized MaSp1 RNA shows three additional protein bands. The three bands were analyzed separately by mass spectrometry, and the result is shown in FIG. 8; the band with a molecular weight greater than 118 kD is spider silk protein.
The BL21 strain carrying the empty vector is the transformant obtained by transforming the empty vector into E. coli BL21(DE3). The empty vector is the plasmid obtained by removing the MaSp1 RNA expression cassette from pMaSp1 while keeping all other nucleotides of pMaSp1 unchanged; it differs from pMaSp1 only in lacking the MaSp1 RNA expression cassette.
Experiments show that tandem repeat MaSp1 with 40 repeats can be obtained in only 7 days, which is far shorter than with traditional methods.
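The relationship between the circular template and the repeat count can be sketched as a ribosome reading codons around a ring. The example assumes, for illustration only, that the mature circular RNA corresponds to positions 5419-5556 of sequence 1 joined head to tail (3' splice site through 5' splice site); the application does not state the exact boundaries, so the per-lap peptide below is a consequence of that assumption rather than a reported result. The names `translate_circle`, `CIRCLE` and `START` are illustrative.

```python
# Nucleotides 5401-5580 of sequence 1, copied from the sequence listing.
FRAGMENT_START = 5401
FRAGMENT = (
    "ccttatctgaacataatgctaccagcggtcgcggcggtctgggtggccagggtgcaggta"
    "tggcggctgcggctgcaatgggcggtgctggccaaggtggctacggcggcctgggttctc"
    "agggtactaaggagatataccatatggatctgcgttcaattgaggcctgagtataaggtg"
)

# Standard genetic code, codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Assumed circle: positions 5419-5556 of sequence 1 (hypothetical boundaries).
CIRCLE = FRAGMENT[5419 - FRAGMENT_START:5556 - FRAGMENT_START + 1]
START = 5544 - 5419  # index of the start codon within the circle


def translate_circle(circle, start, laps):
    """Translate around a circular template for `laps` turns or until a stop codon."""
    n = len(circle)
    protein = []
    for offset in range(0, laps * n, 3):
        codon = "".join(circle[(start + offset + k) % n] for k in range(3))
        residue = CODON_TABLE[codon]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


if __name__ == "__main__":
    lap = translate_circle(CIRCLE, START, 1)
    print("circle length:", len(CIRCLE), "in frame:", len(CIRCLE) % 3 == 0)
    print("one lap:", lap)
    print("residues after 40 laps:", len(translate_circle(CIRCLE, START, 40)))
```

Two conditions keep the repeats identical from lap to lap: the circle length must be a multiple of 3, and no stop codon may occur in the reading frame set by the start codon. Both hold for the assumed circle.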
The present application is described in detail above. It will be apparent to those skilled in the art that the present application can be practiced in a wide range of equivalent parameters, concentrations, and conditions without departing from the spirit and scope of the application and without undue experimentation. While the application has been described with respect to specific embodiments, it will be appreciated that the application may be further modified. In general, this application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains. The application of some of the basic features may be done in accordance with the scope of the claims that follow.
Sequence listing
<110> Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences
<120> a method for preparing tandem repeat protein and use thereof
<130> GNCSY200930
<160> 2
<170> SIPOSequenceListing 1.0
<210> 1
<211> 5860
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 1
tggcgaatgg gacgcgccct gtagcggcgc attaagcgcg gcgggtgtgg tggttacgcg 60
cagcgtgacc gctacacttg ccagcgccct agcgcccgct cctttcgctt tcttcccttc 120
ctttctcgcc acgttcgccg gctttccccg tcaagctcta aatcgggggc tccctttagg 180
gttccgattt agtgctttac ggcacctcga ccccaaaaaa cttgattagg gtgatggttc 240
acgtagtggg ccatcgccct gatagacggt ttttcgccct ttgacgttgg agtccacgtt 300
ctttaatagt ggactcttgt tccaaactgg aacaacactc aaccctatct cggtctattc 360
ttttgattta taagggattt tgccgatttc ggcctattgg ttaaaaaatg agctgattta 420
acaaaaattt aacgcgaatt ttaacaaaat attaacgttt acaatttcag gtggcacttt 480
tcggggaaat gtgcgcggaa cccctatttg tttatttttc taaatacatt caaatatgta 540
tccgctcatg agacaataac cctgataaat gcttcaataa tattgaaaaa ggaagagtat 600
gagtattcaa catttccgtg tcgcccttat tccctttttt gcggcatttt gccttcctgt 660
ttttgctcac ccagaaacgc tggtgaaagt aaaagatgct gaagatcagt tgggtgcacg 720
agtgggttac atcgaactgg atctcaacag cggtaagatc cttgagagtt ttcgccccga 780
agaacgtttt ccaatgatga gcacttttaa agttctgcta tgtggcgcgg tattatcccg 840
tattgacgcc gggcaagagc aactcggtcg ccgcatacac tattctcaga atgacttggt 900
tgagtactca ccagtcacag aaaagcatct tacggatggc atgacagtaa gagaattatg 960
cagtgctgcc ataaccatga gtgataacac tgcggccaac ttacttctga caacgatcgg 1020
aggaccgaag gagctaaccg cttttttgca caacatgggg gatcatgtaa ctcgccttga 1080
tcgttgggaa ccggagctga atgaagccat accaaacgac gagcgtgaca ccacgatgcc 1140
tgcagcaatg gcaacaacgt tgcgcaaact attaactggc gaactactta ctctagcttc 1200
ccggcaacaa ttaatagact ggatggaggc ggataaagtt gcaggaccac ttctgcgctc 1260
ggcccttccg gctggctggt ttattgctga taaatctgga gccggtgagc gtgggtctcg 1320
cggtatcatt gcagcactgg ggccagatgg taagccctcc cgtatcgtag ttatctacac 1380
gacggggagt caggcaacta tggatgaacg aaatagacag atcgctgaga taggtgcctc 1440
actgattaag cattggtaac tgtcagacca agtttactca tatatacttt agattgattt 1500
aaaacttcat ttttaattta aaaggatcta ggtgaagatc ctttttgata atctcatgac 1560
caaaatccct taacgtgagt tttcgttcca ctgagcgtca gaccccgtag aaaagatcaa 1620
aggatcttct tgagatcctt tttttctgcg cgtaatctgc tgcttgcaaa caaaaaaacc 1680
accgctacca gcggtggttt gtttgccgga tcaagagcta ccaactcttt ttccgaaggt 1740
aactggcttc agcagagcgc agataccaaa tactgtcctt ctagtgtagc cgtagttagg 1800
ccaccacttc aagaactctg tagcaccgcc tacatacctc gctctgctaa tcctgttacc 1860
agtggctgct gccagtggcg ataagtcgtg tcttaccggg ttggactcaa gacgatagtt 1920
accggataag gcgcagcggt cgggctgaac ggggggttcg tgcacacagc ccagcttgga 1980
gcgaacgacc tacaccgaac tgagatacct acagcgtgag ctatgagaaa gcgccacgct 2040
tcccgaaggg agaaaggcgg acaggtatcc ggtaagcggc agggtcggaa caggagagcg 2100
cacgagggag cttccagggg gaaacgcctg gtatctttat agtcctgtcg ggtttcgcca 2160
cctctgactt gagcgtcgat ttttgtgatg ctcgtcaggg gggcggagcc tatggaaaaa 2220
cgccagcaac gcggcctttt tacggttcct ggccttttgc tggccttttg ctcacatgtt 2280
ctttcctgcg ttatcccctg attctgtgga taaccgtatt accgcctttg agtgagctga 2340
taccgctcgc cgcagccgaa cgaccgagcg cagcgagtca gtgagcgagg aagcggaaga 2400
gcgcctgatg cggtattttc tccttacgca tctgtgcggt atttcacacc gcatatatgg 2460
tgcactctca gtacaatctg ctctgatgcc gcatagttaa gccagtatac actccgctat 2520
cgctacgtga ctgggtcatg gctgcgcccc gacacccgcc aacacccgct gacgcgccct 2580
gacgggcttg tctgctcccg gcatccgctt acagacaagc tgtgaccgtc tccgggagct 2640
gcatgtgtca gaggttttca ccgtcatcac cgaaacgcgc gaggcagctg cggtaaagct 2700
catcagcgtg gtcgtgaagc gattcacaga tgtctgcctg ttcatccgcg tccagctcgt 2760
tgagtttctc cagaagcgtt aatgtctggc ttctgataaa gcgggccatg ttaagggcgg 2820
ttttttcctg tttggtcact gatgcctccg tgtaaggggg atttctgttc atgggggtaa 2880
tgataccgat gaaacgagag aggatgctca cgatacgggt tactgatgat gaacatgccc 2940
ggttactgga acgttgtgag ggtaaacaac tggcggtatg gatgcggcgg gaccagagaa 3000
aaatcactca gggtcaatgc cagcgcttcg ttaatacaga tgtaggtgtt ccacagggta 3060
gccagcagca tcctgcgatg cagatccgga acataatggt gcagggcgct gacttccgcg 3120
tttccagact ttacgaaaca cggaaaccga agaccattca tgttgttgct caggtcgcag 3180
acgttttgca gcagcagtcg cttcacgttc gctcgcgtat cggtgattca ttctgctaac 3240
cagtaaggca accccgccag cctagccggg tcctcaacga caggagcacg atcatgcgca 3300
cccgtggggc cgccatgccg gcgataatgg cctgcttctc gccgaaacgt ttggtggcgg 3360
gaccagtgac gaaggcttga gcgagggcgt gcaagattcc gaataccgca agcgacaggc 3420
cgatcatcgt cgcgctccag cgaaagcggt cctcgccgaa aatgacccag agcgctgccg 3480
gcacctgtcc tacgagttgc atgataaaga agacagtcat aagtgcggcg acgatagtca 3540
tgccccgcgc ccaccggaag gagctgactg ggttgaaggc tctcaagggc atcggtcgag 3600
atcccggtgc ctaatgagtg agctaactta cattaattgc gttgcgctca ctgcccgctt 3660
tccagtcggg aaacctgtcg tgccagctgc attaatgaat cggccaacgc gcggggagag 3720
gcggtttgcg tattgggcgc cagggtggtt tttcttttca ccagtgagac gggcaacagc 3780
tgattgccct tcaccgcctg gccctgagag agttgcagca agcggtccac gctggtttgc 3840
cccagcaggc gaaaatcctg tttgatggtg gttaacggcg ggatataaca tgagctgtct 3900
tcggtatcgt cgtatcccac taccgagata tccgcaccaa cgcgcagccc ggactcggta 3960
atggcgcgca ttgcgcccag cgccatctga tcgttggcaa ccagcatcgc agtgggaacg 4020
atgccctcat tcagcatttg catggtttgt tgaaaaccgg acatggcact ccagtcgcct 4080
tcccgttccg ctatcggctg aatttgattg cgagtgagat atttatgcca gccagccaga 4140
cgcagacgcg ccgagacaga acttaatggg cccgctaaca gcgcgatttg ctggtgaccc 4200
aatgcgacca gatgctccac gcccagtcgc gtaccgtctt catgggagaa aataatactg 4260
ttgatgggtg tctggtcaga gacatcaaga aataacgccg gaacattagt gcaggcagct 4320
tccacagcaa tggcatcctg gtcatccagc ggatagttaa tgatcagccc actgacgcgt 4380
tgcgcgagaa gattgtgcac cgccgcttta caggcttcga cgccgcttcg ttctaccatc 4440
gacaccacca cgctggcacc cagttgatcg gcgcgagatt taatcgccgc gacaatttgc 4500
gacggcgcgt gcagggccag actggaggtg gcaacgccaa tcagcaacga ctgtttgccc 4560
gccagttgtt gtgccacgcg gttgggaatg taattcagct ccgccatcgc cgcttccact 4620
ttttcccgcg ttttcgcaga aacgtggctg gcctggttca ccacgcggga aacggtctga 4680
taagagacac cggcatactc tgcgacatcg tataacgtta ctggtttcac attcaccacc 4740
ctgaattgac tctcttccgg gcgctatcat gccataccgc gaaaggtttt gcgccattcg 4800
atggtgtccg ggatctcgac gctctccctt atgcgactcc tgcattagga agcagcccag 4860
tagtaggttg aggccgttga gcaccgccgc cgcaaggaat ggtgcatgca aggagatggc 4920
gcccaacagt cccccggcca cggggcctgc caccataccc acgccgaaac aagcgctcat 4980
gagcccgaag tggcgagccc gatcttcccc atcggtgatg tcggcgatat aggcgccagc 5040
aaccgcacct gtggcgccgg tgatgccggc cacgatgcgt ccggcgtaga ggatcgagat 5100
ctcgatcccg cgaaattaat acgactcact ataggggaat tgtgagcgga taacaattcc 5160
cctctagaaa taattttgtt taactttaaa attctagaga aaatttcgtc tggattagtt 5220
acttatcgtg taaaatctga taaatggaat tggttctaca taaatgccta acgactatcc 5280
ctttggggag tagggtcaag tgactcgaaa cgatagacaa cttgctttaa caagttggag 5340
atatagtctg ctctgcatgg tgacatgcag ctggatataa ttccggggta agattaacga 5400
ccttatctga acataatgct accagcggtc gcggcggtct gggtggccag ggtgcaggta 5460
tggcggctgc ggctgcaatg ggcggtgctg gccaaggtgg ctacggcggc ctgggttctc 5520
agggtactaa ggagatatac catatggatc tgcgttcaat tgaggcctga gtataaggtg 5580
acttatactt gtaatctatc taaacgggga acctctctag tagacaatcc cgtgctaaat 5640
tgtaggactg ccctttaata aatacttcta tatttaaaga ggtatttatg aaaagcggaa 5700
tttatcagat taaaaatact ttgagatccg gctgctaaca aagcccgaaa ggaagctgag 5760
ttggctgctg ccaccgctga gcaataacta gcataacccc ttggggcctc taaacgggtc 5820
ttgaggggtt ttttgctgaa aggaggaact atatccggat 5860
<210> 2
<211> 35
<212> PRT
<213> Artificial sequence (Artificial Sequence)
<400> 2
Ser Gly Arg Gly Gly Leu Gly Gly Gln Gly Ala Gly Met Ala Ala Ala
1 5 10 15
Ala Ala Met Gly Gly Ala Gly Gln Gly Gly Tyr Gly Gly Leu Gly Ser
20 25 30
Gln Gly Thr
35
Claims (12)
1. A method for producing tandem repeat proteins, characterized by: comprises introducing an expression vector containing a double-stranded DNA molecule named single copy gene expression cassette into a recipient cell to obtain a recombinant cell, culturing the recombinant cell, and expressing to obtain tandem repeat proteins; the single copy gene expression cassette comprises a promoter, an intron which is connected with the promoter and is named as a 3' intron, a target protein coding gene which is connected with the 3' intron and is named as a single copy gene, a coding sequence of a ribosome binding site which is connected with the target protein coding gene, a spacer sequence which is connected with the coding sequence of the ribosome binding site, a start codon which is connected with the spacer sequence, and an intron which is connected with the start codon and is named as a 5' intron; the 3 'intron and the 5' intron satisfy condition a that precursor RNAs transcribed from the single copy gene expression cassette form splice vesicles by base complementary pairing in the recombinant cell and produce mature circular single stranded RNA molecules by a splicing reaction; the target protein coding gene does not contain a stop codon;
the 3' intron and the 5' intron satisfying condition a are the following pair of introns:
the 3' intron contains 6 splice vesicles and a 3' splice site; the DNAs encoding the 6 splice vesicles are named the 3' sp1 gene, 3' sp2 gene, 3' sp3 gene, 3' sp4 gene, 3' sp5 gene and 3' sp6 gene, and the DNA encoding the 3' splice site is named the 3' ss gene; the 3' sp1 gene is a double-stranded DNA molecule, one strand of which has the nucleotide sequence of positions 5193-5214 of sequence 1 in the sequence listing; the 3' sp2 gene is a double-stranded DNA molecule, one strand of which has the nucleotide sequence of positions 5278-5289 of sequence 1 in the sequence listing; the 3' sp3 gene is a double-stranded DNA molecule, one strand of which has the nucleotide sequence of positions 5293-5306 of sequence 1 in the sequence listing; the 3' sp4 gene is a double-stranded DNA molecule, one strand of which has the nucleotide sequence of positions 5318-5337 of sequence 1 in the sequence listing; the 3' sp5 gene is a double-stranded DNA molecule, one strand of which has the nucleotide sequence of positions 5352-5370 of sequence 1 in the sequence listing; the 3' sp6 gene is a double-stranded DNA molecule, one strand of which has the nucleotide sequence of positions 5371-5386 of sequence 1 in the sequence listing; the 3' splice site is a double-stranded DNA molecule, one strand of which has the nucleotide sequence of positions 5419-5423 of sequence 1 in the sequence listing;
the 5' intron contains a 5' splice site and a 5' ss sequence; the nucleotide sequence of the 5' splice site is positions 5547-5556 of sequence 1; the nucleotide sequence of the 5' ss sequence is positions 5557-5721 of sequence 1 and comprises 4 splice vesicles, whose encoding DNAs are named the 5' sp1 gene, 5' sp2 gene, 5' sp3 gene and 5' sp4 gene; the 5' sp1 gene is a double-stranded DNA molecule, one strand of which has the nucleotide sequence of positions 5569-5590 of sequence 1 in the sequence listing; the 5' sp2 gene is a double-stranded DNA molecule, one strand of which has the nucleotide sequence of positions 5634-5643 of sequence 1 in the sequence listing; the 5' sp3 gene is a double-stranded DNA molecule, one strand of which has the nucleotide sequence of positions 5648-5698 of sequence 1 in the sequence listing; and the 5' sp4 gene is a double-stranded DNA molecule, one strand of which has the nucleotide sequence of positions 5671-5687 of sequence 1 in the sequence listing.
2. The method according to claim 1, characterized in that: the single copy gene expression cassette is formed by connecting a promoter, the 3 'intron, the target protein coding gene, a coding sequence of the ribosome binding site, the spacer sequence, the initiation codon and the 5' intron.
3. The method according to claim 1 or 2, characterized in that: the target protein is MaSp1.
4. A method according to claim 3, characterized in that: the MaSp1 is a protein with an amino acid sequence of a sequence 2.
5. The method as claimed in claim 4, wherein: the recipient cell is a prokaryotic microbial cell.
6. The method as claimed in claim 5, wherein: the recipient cell is a gram-negative bacterial cell.
7. The method as claimed in claim 6, wherein: the recipient cell is an Escherichia bacterial cell.
8. The method as claimed in claim 7, wherein: the recipient cell is an E. coli BL21(DE3) cell.
9. The method as claimed in claim 5, wherein: the 3 'intron is double-stranded DNA with one strand of nucleotide sequence being 5190-5423 nucleotides of the sequence 1, and the 5' intron is double-stranded DNA with one strand of nucleotide sequence being 5547-5721 nucleotides of the sequence 1.
10. The method as claimed in claim 9, wherein: the single copy gene expression cassette is a double-stranded DNA molecule with the nucleotide sequence of one strand being 5117-5835 of the sequence 1;
or the expression vector is a double-stranded DNA molecule with one strand of nucleotide sequence of sequence 1.
11. A product of any one of the following:
A1) the double-stranded DNA molecule named single copy gene expression cassette in the method of any one of claims 1-10;
A2) a vector containing the double-stranded DNA molecule of A1);
A3) a recombinant microorganism containing the double-stranded DNA molecule of A1).
12. Use of the method of any one of claims 1-10 or the product of claim 11 for the preparation of tandem repeat proteins.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011405477.XA CN113755512B (en) | 2020-12-03 | 2020-12-03 | Method for preparing tandem repeat protein and application thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113755512A CN113755512A (en) | 2021-12-07 |
CN113755512B true CN113755512B (en) | 2023-11-10 |
Family
ID=78786166
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011405477.XA Active CN113755512B (en) | 2020-12-03 | 2020-12-03 | Method for preparing tandem repeat protein and application thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113755512B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4673641A (en) * | 1982-12-16 | 1987-06-16 | Molecular Genetics Research And Development Limited Partnership | Co-aggregate purification of proteins |
KR20040083194A (en) * | 2003-03-21 | 2004-10-01 | 한국생명공학연구원 | The transformed plant cell expressing tandem repeats of β-amyloid gene and plant produced by the same |
WO2006073727A2 (en) * | 2004-12-21 | 2006-07-13 | Monsanto Technology, Llc | Recombinant dna constructs and methods for controlling gene expression |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2001068674A2 (en) * | 2000-03-13 | 2001-09-20 | Monsanto Technology Llc | Preparation of recombinant proteins containing repeating units |
US9051383B2 (en) * | 2007-06-11 | 2015-06-09 | The Regents Of The University Of California | Spider silk dragline polynucleotides, polypeptides and methods of use thereof |
RU2451023C1 (en) * | 2010-11-25 | 2012-05-20 | Владимир Григорьевич Богуш | Method of producing recombinant spider-web protein, fused protein, recombinant dna, expression vector, host cell and producer strain |
Non-Patent Citations (1)
Title |
---|
J Riet et al. Improving the PCR protocol to amplify a repetitive DNA sequence. Genet Mol Res. 2017, Vol. 16 (No. 3), pp. 1-11. * |
CN115232813A (en) | Gene editing system for constructing von willebrand model pig nuclear transplantation donor cells with vWF gene mutation and application of gene editing system | |
CN115247153A (en) | Gene editing system for constructing diabetes model pig nuclear transplantation donor cells with HNF1A gene mutation and application thereof | |
KR20220080101A (en) | Chimeric thermostable aminoacyl-tRNA synthetase for improved unnatural amino acid incorporation | |
CN115232811A (en) | Method for constructing HBB gene mutant sickle cell anemia model pig nuclear transplantation donor cell and application | |
CN115247191A (en) | Gene editing system and application thereof in construction of double-gene-mutation nevus basal cell carcinoma syndrome pig nuclear transplantation donor cell | |
CN115232812A (en) | Method for constructing nuclear transplantation donor cells of MRAP2 gene mutation severe early obesity model pigs |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||