CN114075557A - Recombinant transaminase and its use in the synthesis of (R)-2-(2,5-difluorophenyl)pyrrolidine
- Publication number
- CN114075557A (application CN202010838888.1A)
- Authority
- CN
- China
- Prior art keywords
- seq
- val
- gly
- amino acid
- acid sequence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
- 108090000340 Transaminases Proteins 0.000 title claims abstract description 208
- NCXSNNVYILYEBC-SNVBAGLBSA-N (2R)-2-(2,5-difluorophenyl)pyrrolidine Chemical compound FC1=CC=C(F)C([C@@H]2NCCC2)=C1 NCXSNNVYILYEBC-SNVBAGLBSA-N 0.000 title claims abstract description 45
- 238000003786 synthesis reaction Methods 0.000 title abstract description 15
- 230000015572 biosynthetic process Effects 0.000 title abstract description 13
- 102000003929 Transaminases Human genes 0.000 title abstract 7
- 150000001413 amino acids Chemical class 0.000 claims abstract description 63
- 230000035772 mutation Effects 0.000 claims abstract description 60
- 238000006467 substitution reaction Methods 0.000 claims abstract description 37
- 230000000694 effects Effects 0.000 claims abstract description 31
- 238000006243 chemical reaction Methods 0.000 claims abstract description 27
- MCKOYDSNCWNCHG-UHFFFAOYSA-N 4-chloro-1-(2,5-difluorophenyl)butan-1-one Chemical compound FC1=CC=C(F)C(C(=O)CCCCl)=C1 MCKOYDSNCWNCHG-UHFFFAOYSA-N 0.000 claims abstract description 16
- 102000014898 transaminase activity proteins Human genes 0.000 claims description 201
- NGVDGCNFYWLIFO-UHFFFAOYSA-N pyridoxal 5'-phosphate Chemical compound CC1=NC=C(COP(O)(O)=O)C(C=O)=C1O NGVDGCNFYWLIFO-UHFFFAOYSA-N 0.000 claims description 28
- 238000000034 method Methods 0.000 claims description 24
- IAZDPXIOMUYVGZ-UHFFFAOYSA-N Dimethylsulphoxide Chemical compound CS(C)=O IAZDPXIOMUYVGZ-UHFFFAOYSA-N 0.000 claims description 22
- 108090000623 proteins and genes Proteins 0.000 claims description 22
- 102000004190 Enzymes Human genes 0.000 claims description 21
- 108090000790 Enzymes Proteins 0.000 claims description 21
- WEVYAHXRMPXWCK-UHFFFAOYSA-N Acetonitrile Chemical compound CC#N WEVYAHXRMPXWCK-UHFFFAOYSA-N 0.000 claims description 15
- 150000007523 nucleic acids Chemical class 0.000 claims description 14
- 235000007682 pyridoxal 5'-phosphate Nutrition 0.000 claims description 14
- 239000011589 pyridoxal 5'-phosphate Substances 0.000 claims description 14
- 229960001327 pyridoxal phosphate Drugs 0.000 claims description 14
- 239000000203 mixture Substances 0.000 claims description 13
- 239000000243 solution Substances 0.000 claims description 13
- 239000013604 expression vector Substances 0.000 claims description 12
- 102000039446 nucleic acids Human genes 0.000 claims description 12
- 108020004707 nucleic acids Proteins 0.000 claims description 12
- 238000012216 screening Methods 0.000 claims description 11
- LFQSCWFLJHTTHZ-UHFFFAOYSA-N Ethanol Chemical compound CCO LFQSCWFLJHTTHZ-UHFFFAOYSA-N 0.000 claims description 10
- 239000000872 buffer Substances 0.000 claims description 10
- 238000003259 recombinant expression Methods 0.000 claims description 10
- 239000000843 powder Substances 0.000 claims description 8
- ISYORFGKSZLPNW-UHFFFAOYSA-N propan-2-ylazanium;chloride Chemical compound [Cl-].CC(C)[NH3+] ISYORFGKSZLPNW-UHFFFAOYSA-N 0.000 claims description 8
- 241001052560 Thallis Species 0.000 claims description 7
- 239000006184 cosolvent Substances 0.000 claims description 5
- JJWLVOIRVHMVIS-UHFFFAOYSA-N isopropylamine Chemical compound CC(C)N JJWLVOIRVHMVIS-UHFFFAOYSA-N 0.000 claims description 5
- 238000002360 preparation method Methods 0.000 claims description 5
- 238000007792 addition Methods 0.000 claims description 3
- 239000007853 buffer solution Substances 0.000 claims description 3
- 230000006698 induction Effects 0.000 claims description 3
- 238000002156 mixing Methods 0.000 claims description 3
- 230000001131 transforming effect Effects 0.000 claims description 3
- 108091028043 Nucleic acid sequence Proteins 0.000 claims description 2
- 230000004048 modification Effects 0.000 claims description 2
- 238000012986 modification Methods 0.000 claims description 2
- 239000008363 phosphate buffer Substances 0.000 claims description 2
- 230000035484 reaction time Effects 0.000 claims description 2
- HHLJUSLZGFYWKW-UHFFFAOYSA-N triethanolamine hydrochloride Chemical compound Cl.OCCN(CCO)CCO HHLJUSLZGFYWKW-UHFFFAOYSA-N 0.000 claims description 2
- 125000003275 alpha amino acid group Chemical group 0.000 claims 43
- 238000012217 deletion Methods 0.000 claims 1
- 230000037430 deletion Effects 0.000 claims 1
- 229920001184 polypeptide Polymers 0.000 claims 1
- 102000004196 processed proteins & peptides Human genes 0.000 claims 1
- 108090000765 processed proteins & peptides Proteins 0.000 claims 1
- 238000009776 industrial production Methods 0.000 abstract description 9
- AXXCUABIFZPKPM-BQBZGAKWSA-N Asp-Arg-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O AXXCUABIFZPKPM-BQBZGAKWSA-N 0.000 description 30
- 108010078144 glutaminyl-glycine Proteins 0.000 description 30
- 108020004414 DNA Proteins 0.000 description 17
- 229940088598 enzyme Drugs 0.000 description 17
- FUSPCLTUKXQREV-ACZMJKKPSA-N Ala-Glu-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O FUSPCLTUKXQREV-ACZMJKKPSA-N 0.000 description 15
- VGPWRRFOPXVGOH-BYPYZUCNSA-N Ala-Gly-Gly Chemical compound C[C@H](N)C(=O)NCC(=O)NCC(O)=O VGPWRRFOPXVGOH-BYPYZUCNSA-N 0.000 description 15
- KMGOBAQSCKTBGD-DLOVCJGASA-N Ala-His-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](C)N)CC1=CN=CN1 KMGOBAQSCKTBGD-DLOVCJGASA-N 0.000 description 15
- MAEQBGQTDWDSJQ-LSJOCFKGSA-N Ala-Met-His Chemical compound C[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N MAEQBGQTDWDSJQ-LSJOCFKGSA-N 0.000 description 15
- CREYEAPXISDKSB-FQPOAREZSA-N Ala-Thr-Tyr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O CREYEAPXISDKSB-FQPOAREZSA-N 0.000 description 15
- MTDDMSUUXNQMKK-BPNCWPANSA-N Ala-Tyr-Arg Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N MTDDMSUUXNQMKK-BPNCWPANSA-N 0.000 description 15
- QRIYOHQJRDHFKF-UWJYBYFXSA-N Ala-Tyr-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=C(O)C=C1 QRIYOHQJRDHFKF-UWJYBYFXSA-N 0.000 description 15
- SYAUZLVLXCDRSH-IUCAKERBSA-N Arg-Gly-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCCN=C(N)N)N SYAUZLVLXCDRSH-IUCAKERBSA-N 0.000 description 15
- GNYUVVJYGJFKHN-RVMXOQNASA-N Arg-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N GNYUVVJYGJFKHN-RVMXOQNASA-N 0.000 description 15
- LVMUGODRNHFGRA-AVGNSLFASA-N Arg-Leu-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O LVMUGODRNHFGRA-AVGNSLFASA-N 0.000 description 15
- YKZJPIPFKGYHKY-DCAQKATOSA-N Arg-Leu-Asp Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O YKZJPIPFKGYHKY-DCAQKATOSA-N 0.000 description 15
- SPCONPVIDFMDJI-QSFUFRPTSA-N Asn-Ile-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O SPCONPVIDFMDJI-QSFUFRPTSA-N 0.000 description 15
- TVIZQBFURPLQDV-DJFWLOJKSA-N Asp-His-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CC(=O)O)N TVIZQBFURPLQDV-DJFWLOJKSA-N 0.000 description 15
- QPDUWAUSSWGJSB-NGZCFLSTSA-N Asp-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)O)N QPDUWAUSSWGJSB-NGZCFLSTSA-N 0.000 description 15
- GSNRZJNHMVMOFV-ACZMJKKPSA-N Cys-Asp-Glu Chemical compound C(CC(=O)O)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CS)N GSNRZJNHMVMOFV-ACZMJKKPSA-N 0.000 description 15
- IRKLTAKLAFUTLA-KATARQTJSA-N Cys-Thr-Lys Chemical compound C[C@@H](O)[C@H](NC(=O)[C@@H](N)CS)C(=O)N[C@@H](CCCCN)C(O)=O IRKLTAKLAFUTLA-KATARQTJSA-N 0.000 description 15
- FTIJVMLAGRAYMJ-MNXVOIDGSA-N Gln-Ile-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCC(N)=O FTIJVMLAGRAYMJ-MNXVOIDGSA-N 0.000 description 15
- RGNMNWULPAYDAH-JSGCOSHPSA-N Gln-Trp-Gly Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CCC(=O)N)N RGNMNWULPAYDAH-JSGCOSHPSA-N 0.000 description 15
- MKRDNSWGJWTBKZ-GVXVVHGQSA-N Gln-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)N)N MKRDNSWGJWTBKZ-GVXVVHGQSA-N 0.000 description 15
- RUFHOVYUYSNDNY-ACZMJKKPSA-N Glu-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(O)=O RUFHOVYUYSNDNY-ACZMJKKPSA-N 0.000 description 15
- MXOODARRORARSU-ACZMJKKPSA-N Glu-Ala-Ser Chemical compound C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CCC(=O)O)N MXOODARRORARSU-ACZMJKKPSA-N 0.000 description 15
- KXTAGESXNQEZKB-DZKIICNBSA-N Glu-Phe-Val Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C(C)C)C(O)=O)CC1=CC=CC=C1 KXTAGESXNQEZKB-DZKIICNBSA-N 0.000 description 15
- UDEPRBFQTWGLCW-CIUDSAMLSA-N Glu-Pro-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O UDEPRBFQTWGLCW-CIUDSAMLSA-N 0.000 description 15
- KRRMJKMGWWXWDW-STQMWFEESA-N Gly-Arg-Phe Chemical compound NC(=N)NCCC[C@H](NC(=O)CN)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KRRMJKMGWWXWDW-STQMWFEESA-N 0.000 description 15
- YYPFZVIXAVDHIK-IUCAKERBSA-N Gly-Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)CN YYPFZVIXAVDHIK-IUCAKERBSA-N 0.000 description 15
- QPTNELDXWKRIFX-YFKPBYRVSA-N Gly-Gly-Gln Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CCC(N)=O QPTNELDXWKRIFX-YFKPBYRVSA-N 0.000 description 15
- SWQALSGKVLYKDT-UHFFFAOYSA-N Gly-Ile-Ala Natural products NCC(=O)NC(C(C)CC)C(=O)NC(C)C(O)=O SWQALSGKVLYKDT-UHFFFAOYSA-N 0.000 description 15
- MXIULRKNFSCJHT-STQMWFEESA-N Gly-Phe-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 MXIULRKNFSCJHT-STQMWFEESA-N 0.000 description 15
- YXTFLTJYLIAZQG-FJXKBIBVSA-N Gly-Thr-Arg Chemical compound NCC(=O)N[C@@H]([C@H](O)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N YXTFLTJYLIAZQG-FJXKBIBVSA-N 0.000 description 15
- JYGYNWYVKXENNE-OALUTQOASA-N Gly-Tyr-Trp Chemical compound [H]NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O JYGYNWYVKXENNE-OALUTQOASA-N 0.000 description 15
- BAYQNCWLXIDLHX-ONGXEEELSA-N Gly-Val-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)CN BAYQNCWLXIDLHX-ONGXEEELSA-N 0.000 description 15
- AFMOTCMSEBITOE-YEPSODPASA-N Gly-Val-Thr Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O AFMOTCMSEBITOE-YEPSODPASA-N 0.000 description 15
- STGQSBKUYSPPIG-CIUDSAMLSA-N His-Ser-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CN=CN1 STGQSBKUYSPPIG-CIUDSAMLSA-N 0.000 description 15
- GQKSJYINYYWPMR-NGZCFLSTSA-N Ile-Gly-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N1CCC[C@@H]1C(=O)O)N GQKSJYINYYWPMR-NGZCFLSTSA-N 0.000 description 15
- HUORUFRRJHELPD-MNXVOIDGSA-N Ile-Leu-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N HUORUFRRJHELPD-MNXVOIDGSA-N 0.000 description 15
- FTUZWJVSNZMLPI-RVMXOQNASA-N Ile-Met-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N1CCC[C@@H]1C(=O)O)N FTUZWJVSNZMLPI-RVMXOQNASA-N 0.000 description 15
- RENBRDSDKPSRIH-HJWJTTGWSA-N Ile-Phe-Met Chemical compound N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCSC)C(=O)O RENBRDSDKPSRIH-HJWJTTGWSA-N 0.000 description 15
- QGXQHJQPAPMACW-PPCPHDFISA-N Ile-Thr-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)O)N QGXQHJQPAPMACW-PPCPHDFISA-N 0.000 description 15
- ZYVTXBXHIKGZMD-QSFUFRPTSA-N Ile-Val-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(=O)N)C(=O)O)N ZYVTXBXHIKGZMD-QSFUFRPTSA-N 0.000 description 15
- HGCNKOLVKRAVHD-UHFFFAOYSA-N L-Met-L-Phe Natural products CSCCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 HGCNKOLVKRAVHD-UHFFFAOYSA-N 0.000 description 15
- LZDNBBYBDGBADK-UHFFFAOYSA-N L-valyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)C(C)C)C(O)=O)=CNC2=C1 LZDNBBYBDGBADK-UHFFFAOYSA-N 0.000 description 15
- PJYSOYLLTJKZHC-GUBZILKMSA-N Leu-Asp-Gln Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCC(N)=O PJYSOYLLTJKZHC-GUBZILKMSA-N 0.000 description 15
- ULXYQAJWJGLCNR-YUMQZZPRSA-N Leu-Asp-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O ULXYQAJWJGLCNR-YUMQZZPRSA-N 0.000 description 15
- KWLWZYMNUZJKMZ-IHRRRGAJSA-N Leu-Pro-Leu Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O KWLWZYMNUZJKMZ-IHRRRGAJSA-N 0.000 description 15
- HGLKOTPFWOMPOB-MEYUZBJRSA-N Leu-Thr-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 HGLKOTPFWOMPOB-MEYUZBJRSA-N 0.000 description 15
- YQFZRHYZLARWDY-IHRRRGAJSA-N Leu-Val-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCCN YQFZRHYZLARWDY-IHRRRGAJSA-N 0.000 description 15
- DTUZCYRNEJDKSR-NHCYSSNCSA-N Lys-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCCCN DTUZCYRNEJDKSR-NHCYSSNCSA-N 0.000 description 15
- OXHSZBRPUGNMKW-DCAQKATOSA-N Met-Gln-Arg Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O OXHSZBRPUGNMKW-DCAQKATOSA-N 0.000 description 15
- HUURTRNKPBHHKZ-JYJNAYRXSA-N Met-Phe-Val Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C(C)C)C(O)=O)CC1=CC=CC=C1 HUURTRNKPBHHKZ-JYJNAYRXSA-N 0.000 description 15
- YBAFDPFAUTYYRW-UHFFFAOYSA-N N-L-alpha-glutamyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCC(O)=O YBAFDPFAUTYYRW-UHFFFAOYSA-N 0.000 description 15
- 108010079364 N-glycylalanine Proteins 0.000 description 15
- 108010066427 N-valyltryptophan Proteins 0.000 description 15
- MGECUMGTSHYHEJ-QEWYBTABSA-N Phe-Glu-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 MGECUMGTSHYHEJ-QEWYBTABSA-N 0.000 description 15
- SSSFPISOZOLQNP-GUBZILKMSA-N Pro-Arg-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O SSSFPISOZOLQNP-GUBZILKMSA-N 0.000 description 15
- MGDFPGCFVJFITQ-CIUDSAMLSA-N Pro-Glu-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O MGDFPGCFVJFITQ-CIUDSAMLSA-N 0.000 description 15
- ZUZINZIJHJFJRN-UBHSHLNASA-N Pro-Phe-Ala Chemical compound C([C@@H](C(=O)N[C@@H](C)C(O)=O)NC(=O)[C@H]1NCCC1)C1=CC=CC=C1 ZUZINZIJHJFJRN-UBHSHLNASA-N 0.000 description 15
- WHNJMTHJGCEKGA-ULQDDVLXSA-N Pro-Phe-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O WHNJMTHJGCEKGA-ULQDDVLXSA-N 0.000 description 15
- FIDNSJUXESUDOV-JYJNAYRXSA-N Pro-Tyr-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O FIDNSJUXESUDOV-JYJNAYRXSA-N 0.000 description 15
- ZAUHSLVPDLNTRZ-QXEWZRGKSA-N Pro-Val-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O ZAUHSLVPDLNTRZ-QXEWZRGKSA-N 0.000 description 15
- IMNVAOPEMFDAQD-NHCYSSNCSA-N Pro-Val-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O IMNVAOPEMFDAQD-NHCYSSNCSA-N 0.000 description 15
- IOVHBRCQOGWAQH-ZKWXMUAHSA-N Ser-Gly-Ile Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(O)=O IOVHBRCQOGWAQH-ZKWXMUAHSA-N 0.000 description 15
- KKKVOZNCLALMPV-XKBZYTNZSA-N Ser-Thr-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O KKKVOZNCLALMPV-XKBZYTNZSA-N 0.000 description 15
- VFEHSAJCWWHDBH-RHYQMDGZSA-N Thr-Arg-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O VFEHSAJCWWHDBH-RHYQMDGZSA-N 0.000 description 15
- JEDIEMIJYSRUBB-FOHZUACHSA-N Thr-Asp-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O JEDIEMIJYSRUBB-FOHZUACHSA-N 0.000 description 15
- SHOMROOOQBDGRL-JHEQGTHGSA-N Thr-Glu-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O SHOMROOOQBDGRL-JHEQGTHGSA-N 0.000 description 15
- AAZOYLQUEQRUMZ-GSSVUCPTSA-N Thr-Thr-Asn Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(N)=O AAZOYLQUEQRUMZ-GSSVUCPTSA-N 0.000 description 15
- LNGFWVPNKLWATF-ZVZYQTTQSA-N Trp-Val-Glu Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O LNGFWVPNKLWATF-ZVZYQTTQSA-N 0.000 description 15
- KPEVFMGKBCMTJF-SZMVWBNQSA-N Trp-Val-Met Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N KPEVFMGKBCMTJF-SZMVWBNQSA-N 0.000 description 15
- PEVVXUGSAKEPEN-AVGNSLFASA-N Tyr-Asn-Glu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O PEVVXUGSAKEPEN-AVGNSLFASA-N 0.000 description 15
- AOIZTZRWMSPPAY-KAOXEZKKSA-N Tyr-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)N)O AOIZTZRWMSPPAY-KAOXEZKKSA-N 0.000 description 15
- RUCNAYOMFXRIKJ-DCAQKATOSA-N Val-Ala-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN RUCNAYOMFXRIKJ-DCAQKATOSA-N 0.000 description 15
- MHAHQDBEIDPFQS-NHCYSSNCSA-N Val-Glu-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)C(C)C MHAHQDBEIDPFQS-NHCYSSNCSA-N 0.000 description 15
- PIFJAFRUVWZRKR-QMMMGPOBSA-N Val-Gly-Gly Chemical compound CC(C)[C@H]([NH3+])C(=O)NCC(=O)NCC([O-])=O PIFJAFRUVWZRKR-QMMMGPOBSA-N 0.000 description 15
- HGJRMXOWUWVUOA-GVXVVHGQSA-N Val-Leu-Gln Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](C(C)C)N HGJRMXOWUWVUOA-GVXVVHGQSA-N 0.000 description 15
- NHXZRXLFOBFMDM-AVGNSLFASA-N Val-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)C(C)C NHXZRXLFOBFMDM-AVGNSLFASA-N 0.000 description 15
- SUGRIIAOLCDLBD-ZOBUZTSGSA-N Val-Trp-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CC(=O)O)C(=O)O)N SUGRIIAOLCDLBD-ZOBUZTSGSA-N 0.000 description 15
- 108010076324 alanyl-glycyl-glycine Proteins 0.000 description 15
- 108010060035 arginylproline Proteins 0.000 description 15
- 108010077245 asparaginyl-proline Proteins 0.000 description 15
- 108010040443 aspartyl-aspartic acid Proteins 0.000 description 15
- 108010047857 aspartylglycine Proteins 0.000 description 15
- 108010054813 diprotin B Proteins 0.000 description 15
- VPZXBVLAVMBEQI-UHFFFAOYSA-N glycyl-DL-alpha-alanine Natural products OC(=O)C(C)NC(=O)CN VPZXBVLAVMBEQI-UHFFFAOYSA-N 0.000 description 15
- 108010037850 glycylvaline Proteins 0.000 description 15
- 108010057821 leucylproline Proteins 0.000 description 15
- 108010056582 methionylglutamic acid Proteins 0.000 description 15
- 108010068488 methionylphenylalanine Proteins 0.000 description 15
- 108010074082 phenylalanyl-alanyl-lysine Proteins 0.000 description 15
- 108010073025 phenylalanylphenylalanine Proteins 0.000 description 15
- 108010031719 prolyl-serine Proteins 0.000 description 15
- 108010015796 prolylisoleucine Proteins 0.000 description 15
- 108010090894 prolylleucine Proteins 0.000 description 15
- 108010020532 tyrosyl-proline Proteins 0.000 description 15
- 108010027345 wheylin-1 peptide Proteins 0.000 description 15
- BYLPQJAWXJWUCJ-YDHLFZDLSA-N Asp-Tyr-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O BYLPQJAWXJWUCJ-YDHLFZDLSA-N 0.000 description 13
- FANFRJOFTYCNRG-JYBASQMISA-N Cys-Thr-Trp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CS)N)O FANFRJOFTYCNRG-JYBASQMISA-N 0.000 description 13
- COYSIHFOCOMGCF-UHFFFAOYSA-N Val-Arg-Gly Natural products CC(C)C(N)C(=O)NC(C(=O)NCC(O)=O)CCCN=C(N)N COYSIHFOCOMGCF-UHFFFAOYSA-N 0.000 description 13
- UIQGJYUEQDOODF-KWQFWETISA-N Gly-Tyr-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](NC(=O)CN)CC1=CC=C(O)C=C1 UIQGJYUEQDOODF-KWQFWETISA-N 0.000 description 12
- WXHHTBVYQOSYSL-FXQIFTODSA-N Met-Ala-Ser Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O WXHHTBVYQOSYSL-FXQIFTODSA-N 0.000 description 12
- WGBMNLCRYKSWAR-DCAQKATOSA-N Met-Asp-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN WGBMNLCRYKSWAR-DCAQKATOSA-N 0.000 description 12
- LJSZPMSUYKKKCP-UBHSHLNASA-N Val-Phe-Ala Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C)C(O)=O)CC1=CC=CC=C1 LJSZPMSUYKKKCP-UBHSHLNASA-N 0.000 description 12
- 108010047495 alanylglycine Proteins 0.000 description 12
- 108010016686 methionyl-alanyl-serine Proteins 0.000 description 12
- TZFQICWZWFNIKU-KKUMJFAQSA-N Asn-Leu-Tyr Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 TZFQICWZWFNIKU-KKUMJFAQSA-N 0.000 description 11
- BCADFFUQHIMQAA-KKHAAJSZSA-N Asn-Thr-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O BCADFFUQHIMQAA-KKHAAJSZSA-N 0.000 description 11
- NURNJECQNNCRBK-FLBSBUHZSA-N Ile-Thr-Thr Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NURNJECQNNCRBK-FLBSBUHZSA-N 0.000 description 10
- GAYLGYUVTDMLKC-UWJYBYFXSA-N Tyr-Asp-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 GAYLGYUVTDMLKC-UWJYBYFXSA-N 0.000 description 10
- CGWVCWFQGXOUSJ-ULQDDVLXSA-N Arg-Tyr-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O CGWVCWFQGXOUSJ-ULQDDVLXSA-N 0.000 description 9
- 238000004128 high performance liquid chromatography Methods 0.000 description 9
- 230000014759 maintenance of location Effects 0.000 description 9
- 239000012071 phase Substances 0.000 description 9
- QDRGPQWIVZNJQD-CIUDSAMLSA-N Ala-Arg-Gln Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(O)=O QDRGPQWIVZNJQD-CIUDSAMLSA-N 0.000 description 8
- GFWLIJDQILOEPP-HSCHXYMDSA-N Lys-Ile-Trp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CCCCN)N GFWLIJDQILOEPP-HSCHXYMDSA-N 0.000 description 8
- 108010054155 lysyllysine Proteins 0.000 description 8
- 239000000047 product Substances 0.000 description 7
- 239000007788 liquid Substances 0.000 description 6
- CFGHCPUPFHWMCM-FDARSICLSA-N Arg-Ile-Trp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N CFGHCPUPFHWMCM-FDARSICLSA-N 0.000 description 5
- VZQRNAYURWAEFE-KKUMJFAQSA-N Ser-Leu-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 VZQRNAYURWAEFE-KKUMJFAQSA-N 0.000 description 5
- 230000001580 bacterial effect Effects 0.000 description 5
- 238000004296 chiral HPLC Methods 0.000 description 5
- 229930027917 kanamycin Natural products 0.000 description 5
- 229960000318 kanamycin Drugs 0.000 description 5
- SBUJHOSQTJFQJX-NOAMYHISSA-N kanamycin Chemical compound O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CN)O[C@@H]1O[C@H]1[C@H](O)[C@@H](O[C@@H]2[C@@H]([C@@H](N)[C@H](O)[C@@H](CO)O2)O)[C@H](N)C[C@@H]1N SBUJHOSQTJFQJX-NOAMYHISSA-N 0.000 description 5
- 229930182823 kanamycin A Natural products 0.000 description 5
- 238000004519 manufacturing process Methods 0.000 description 5
- 230000002194 synthesizing effect Effects 0.000 description 5
- NCXSNNVYILYEBC-JTQLQIEISA-N (2S)-2-(2,5-difluorophenyl)pyrrolidine Chemical compound FC1=CC=C(F)C([C@H]2NCCC2)=C1 NCXSNNVYILYEBC-JTQLQIEISA-N 0.000 description 4
- GRZSCTXVCDUIPO-SRVKXCTJSA-N Leu-Arg-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(O)=O GRZSCTXVCDUIPO-SRVKXCTJSA-N 0.000 description 4
- CXGLFEOYCJFKPR-RCWTZXSCSA-N Pro-Thr-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O CXGLFEOYCJFKPR-RCWTZXSCSA-N 0.000 description 4
- QGMLKFGTGXWAHF-IHRRRGAJSA-N Ser-Arg-Phe Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O QGMLKFGTGXWAHF-IHRRRGAJSA-N 0.000 description 4
- VAIWUNAAPZZGRI-IHPCNDPISA-N Ser-Trp-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)NC(=O)[C@H](CO)N VAIWUNAAPZZGRI-IHPCNDPISA-N 0.000 description 4
- 238000003556 assay Methods 0.000 description 4
- 238000012258 culturing Methods 0.000 description 4
- 239000013612 plasmid Substances 0.000 description 4
- 108010073969 valyllysine Proteins 0.000 description 4
- WXLYNEHOGRYNFU-URLPEUOOSA-N Ile-Thr-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N WXLYNEHOGRYNFU-URLPEUOOSA-N 0.000 description 3
- 239000012880 LB liquid culture medium Substances 0.000 description 3
- OKKJLVBELUTLKV-UHFFFAOYSA-N Methanol Chemical compound OC OKKJLVBELUTLKV-UHFFFAOYSA-N 0.000 description 3
- HEMHJVSKTPXQMS-UHFFFAOYSA-M Sodium hydroxide Chemical compound [OH-].[Na+] HEMHJVSKTPXQMS-UHFFFAOYSA-M 0.000 description 3
- GULIUBBXCYPDJU-CQDKDKBSSA-N Tyr-Leu-Ala Chemical compound [O-]C(=O)[C@H](C)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H]([NH3+])CC1=CC=C(O)C=C1 GULIUBBXCYPDJU-CQDKDKBSSA-N 0.000 description 3
- 239000003153 chemical reaction reagent Substances 0.000 description 3
- 108010073101 phenylalanylleucine Proteins 0.000 description 3
- 239000006228 supernatant Substances 0.000 description 3
- 108010035534 tyrosyl-leucyl-alanine Proteins 0.000 description 3
- NYNZQNWKBKUAII-KBXCAEBGSA-N (3S)-N-[5-[(2R)-2-(2,5-difluorophenyl)pyrrolidin-1-yl]pyrazolo[1,5-a]pyrimidin-3-yl]-3-hydroxypyrrolidine-1-carboxamide Chemical compound C1[C@@H](O)CCN1C(=O)NC1=C2N=C(N3[C@H](CCC3)C=3C(=CC=C(F)C=3)F)C=CN2N=C1 NYNZQNWKBKUAII-KBXCAEBGSA-N 0.000 description 2
- BVBKBQRPOJFCQM-DCAQKATOSA-N Arg-Asn-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O BVBKBQRPOJFCQM-DCAQKATOSA-N 0.000 description 2
- PZBSKYJGKNNYNK-ULQDDVLXSA-N Arg-Leu-Tyr Chemical compound CC(C)C[C@H](NC(=O)[C@@H](N)CCCN=C(N)N)C(=O)N[C@@H](Cc1ccc(O)cc1)C(O)=O PZBSKYJGKNNYNK-ULQDDVLXSA-N 0.000 description 2
- QNMKWNONJGKJJC-NHCYSSNCSA-N Asp-Leu-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O QNMKWNONJGKJJC-NHCYSSNCSA-N 0.000 description 2
- NAPULYCVEVVFRB-HEIBUPTGSA-N Cys-Thr-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@@H](N)CS NAPULYCVEVVFRB-HEIBUPTGSA-N 0.000 description 2
- 241000588724 Escherichia coli Species 0.000 description 2
- 241001198387 Escherichia coli BL21(DE3) Species 0.000 description 2
- 239000007818 Grignard reagent Substances 0.000 description 2
- VEXZGXHMUGYJMC-UHFFFAOYSA-N Hydrochloric acid Chemical compound Cl VEXZGXHMUGYJMC-UHFFFAOYSA-N 0.000 description 2
- KWHFUMYCSPJCFQ-NGTWOADLSA-N Ile-Thr-Trp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)N KWHFUMYCSPJCFQ-NGTWOADLSA-N 0.000 description 2
- FBNPMTNBFFAMMH-UHFFFAOYSA-N Leu-Val-Arg Natural products CC(C)CC(N)C(=O)NC(C(C)C)C(=O)NC(C(O)=O)CCCN=C(N)N FBNPMTNBFFAMMH-UHFFFAOYSA-N 0.000 description 2
- FACUGMGEFUEBTI-SRVKXCTJSA-N Lys-Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CCCCN FACUGMGEFUEBTI-SRVKXCTJSA-N 0.000 description 2
- MIMXMVDLMDMOJD-BZSNNMDCSA-N Lys-Tyr-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O MIMXMVDLMDMOJD-BZSNNMDCSA-N 0.000 description 2
- NRKNYPRRWXVELC-NQCBNZPSSA-N Phe-Ile-Trp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CC3=CC=CC=C3)N NRKNYPRRWXVELC-NQCBNZPSSA-N 0.000 description 2
- JFWDJFULOLKQFY-QWRGUYRKSA-N Ser-Gly-Phe Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JFWDJFULOLKQFY-QWRGUYRKSA-N 0.000 description 2
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 2
- MEZCXKYMMQJRDE-PMVMPFDFSA-N Trp-Leu-Tyr Chemical compound C([C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CC=1C2=CC=CC=C2NC=1)CC(C)C)C(O)=O)C1=CC=C(O)C=C1 MEZCXKYMMQJRDE-PMVMPFDFSA-N 0.000 description 2
- 230000009286 beneficial effect Effects 0.000 description 2
- 239000003054 catalyst Substances 0.000 description 2
- 238000006555 catalytic reaction Methods 0.000 description 2
- 229940125782 compound 2 Drugs 0.000 description 2
- 238000005336 cracking Methods 0.000 description 2
- 238000001514 detection method Methods 0.000 description 2
- 238000001976 enzyme digestion Methods 0.000 description 2
- 238000001914 filtration Methods 0.000 description 2
- 150000004795 grignard reagents Chemical class 0.000 description 2
- 239000000411 inducer Substances 0.000 description 2
- 230000001939 inductive effect Effects 0.000 description 2
- BPHPUYQFMNQIOC-NXRLNHOXSA-N isopropyl beta-D-thiogalactopyranoside Chemical compound CC(C)S[C@@H]1O[C@H](CO)[C@H](O)[C@H](O)[C@H]1O BPHPUYQFMNQIOC-NXRLNHOXSA-N 0.000 description 2
- 229950003970 larotrectinib Drugs 0.000 description 2
- 108010038320 lysylphenylalanine Proteins 0.000 description 2
- 239000012528 membrane Substances 0.000 description 2
- 239000012074 organic phase Substances 0.000 description 2
- 238000007789 sealing Methods 0.000 description 2
- 238000012163 sequencing technique Methods 0.000 description 2
- 239000000126 substance Substances 0.000 description 2
- 238000001308 synthesis method Methods 0.000 description 2
- 108010080629 tryptophan-leucine Proteins 0.000 description 2
- NCXSNNVYILYEBC-UHFFFAOYSA-N 2-(2,5-difluorophenyl)pyrrolidine Chemical compound FC1=CC=C(F)C(C2NCCC2)=C1 NCXSNNVYILYEBC-UHFFFAOYSA-N 0.000 description 1
- 4-chloro-1-(2,5-difluorophenyl)butan-1-one Chemical compound 0.000 description 1
- 102000005751 Alcohol Oxidoreductases Human genes 0.000 description 1
- 108010031132 Alcohol Oxidoreductases Proteins 0.000 description 1
- 102000007698 Alcohol dehydrogenase Human genes 0.000 description 1
- 108010021809 Alcohol dehydrogenase Proteins 0.000 description 1
- 241001465318 Aspergillus terreus Species 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/10—Transferases (2.)
- C12N9/1096—Transferases (2.) transferring nitrogenous groups (2.6)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12P—FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
- C12P17/00—Preparation of heterocyclic carbon compounds with only O, N, S, Se or Te as ring hetero atoms
- C12P17/10—Nitrogen as only ring hetero atom
Landscapes
- Chemical & Material Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Organic Chemistry (AREA)
- Zoology (AREA)
- Engineering & Computer Science (AREA)
- Wood Science & Technology (AREA)
- Health & Medical Sciences (AREA)
- Genetics & Genomics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Microbiology (AREA)
- Biotechnology (AREA)
- Biochemistry (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Medicinal Chemistry (AREA)
- Chemical Kinetics & Catalysis (AREA)
- General Chemical & Material Sciences (AREA)
- Enzymes And Modification Thereof (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
Abstract
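The invention provides recombinant transaminases and their use in synthesizing (R)-2-(2,5-difluorophenyl)pyrrolidine; the transaminases are mutants of a wild-type transaminase. By introducing substitution mutations at one or more of amino acid positions 13 to 312 of the wild-type transaminase shown in SEQ ID NO:1, the invention obtains transaminase mutants with significantly improved transaminase activity that efficiently catalyze the synthesis of (R)-2-(2,5-difluorophenyl)pyrrolidine from 4-chloro-1-(2,5-difluorophenyl)butan-1-one. The reaction conditions are mild and the cost is low, which is of considerable significance for the industrial production of (R)-2-(2,5-difluorophenyl)pyrrolidine and of larotrectinib.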
Description
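Technical Field
The invention belongs to the technical fields of enzyme evolution and pharmaceutical manufacturing, and relates to recombinant transaminases and their use in synthesizing (R)-2-(2,5-difluorophenyl)pyrrolidine, a key chiral intermediate of larotrectinib.
Background
Larotrectinib, discovered by Array BioPharma and jointly developed clinically by Bayer and Loxo Oncology, is a selective tropomyosin receptor kinase (TRK) inhibitor and a broad-spectrum anticancer drug. Its structural formula is as follows:
At present, larotrectinib is usually prepared by first synthesizing its key chiral intermediate, (R)-2-(2,5-difluorophenyl)pyrrolidine, and then carrying that intermediate forward to larotrectinib. (R)-2-(2,5-difluorophenyl)pyrrolidine has CAS No. 1218935-59-1, molecular formula C10H11F2N, and molecular weight 183.20; its structural formula is as follows:
Methods for synthesizing the intermediate (R)-2-(2,5-difluorophenyl)pyrrolidine currently fall into two classes, chemical synthesis and biosynthesis:
Chemical synthesis obtains (R)-2-(2,5-difluorophenyl)pyrrolidine by means of organic synthesis. WO2010048314A1 reports the synthesis of (R)-2-(2,5-difluorophenyl)pyrrolidine at low temperature using palladium acetate and the chiral inducer (-)-sparteine. However, that method uses a noble metal as catalyst, is costly, and causes environmental pollution, so it is unsuitable for large-scale industrial production.
CN107286070A synthesizes (R)-2-(2,5-difluorophenyl)pyrrolidine using the chiral reagent (S)-tert-butanesulfinamide or its analogues together with a Grignard reagent. Grignard reactions are vigorous and place strict demands on temperature, feed rate, and the dryness of materials and equipment; if these parameters are not effectively controlled, the reaction may run away and cause fire or explosion accidents, so this method is likewise unsuitable for large-scale industrial production.
Biosynthesis mainly uses enzyme-catalyzed reactions to synthesize (R)-2-(2,5-difluorophenyl)pyrrolidine. CN108484361A uses a carbonyl reductase or an alcohol dehydrogenase to first synthesize (S)-4-chloro-1-(2,5-difluorophenyl)butan-1-ol, and then obtains (R)-2-(2,5-difluorophenyl)pyrrolidine with inversion of configuration.
In the prior art, chemical synthesis often uses expensive inducers or hazardous Grignard reagents, while biosynthesis must first prepare the chiral alcohol of opposite configuration, then invert the configuration and close the ring to obtain the target compound, which makes the route lengthy. Neither approach is suitable for industrial production.
There is therefore an urgent need for a method of synthesizing (R)-2-(2,5-difluorophenyl)pyrrolidine that avoids expensive and hazardous reagents, uses a simple process, and is suitable for industrial production.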
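Summary of the Invention
In view of the shortcomings of the prior art and practical needs, the invention provides recombinant transaminases and their use in synthesizing (R)-2-(2,5-difluorophenyl)pyrrolidine. The transaminase mutants efficiently catalyze the synthesis of (R)-2-(2,5-difluorophenyl)pyrrolidine from 4-chloro-1-(2,5-difluorophenyl)butan-1-one, with a simple process, low cost, and high safety, and are suitable for large-scale industrial production.
To this end, the invention adopts the following technical solutions:
In a first aspect, the invention provides a transaminase mutant comprising an amino acid sequence that has at least 80% identity to SEQ ID NO:1 and has the same or similar transaminase activity;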
SEQ ID NO:1:
MASMDKVFAGYAARQAILESTETTNPFAKGIAWVEGELVPLAEARIPLLDQGFMHSDLTYDVPSV WDGRFFRLDDHITRLEASCTKLRLRLPLPRDQVKQILVEMVAKSGIRDAFVELIVTRGLKGVRGTRPEDI VNNLYMFVQPYVWVMEPDMQRVGGSAVVARTVRRVPPGAIDPTVKNLQWGDLVRGMFEAADRGAT YPFLTDGDAHLTEGSGFNIVLVKDGVLYTPDRGVLQGVTRKSVINAAEAFGIEVRVEFVPVELAYRCDEI FMCTTAGGIMPITTLDGMPVNGGQIGPITKKIWDGYWAMHYDAAYSFEIDYNERN。
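In the invention, mutating the wild-type transaminase shown in SEQ ID NO:1 gives transaminase mutants that have at least 80% identity to the wild-type enzyme and significantly improved transaminase activity. They efficiently catalyze the synthesis of (R)-2-(2,5-difluorophenyl)pyrrolidine from 4-chloro-1-(2,5-difluorophenyl)butan-1-one. The reaction conditions are mild and the cost is low, which is of considerable significance for the industrial production of (R)-2-(2,5-difluorophenyl)pyrrolidine and of larotrectinib.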
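The 80% identity threshold can be illustrated with a minimal Python sketch. It assumes the simplest case, a substitution-only mutant of the same length as the reference, so that identity is the fraction of matching positions. The text does not state how identity is calculated, and sequences of different length would first need an alignment; the function name `percent_identity` and the 15-residue test fragments are illustrative assumptions.

```python
def percent_identity(reference: str, variant: str) -> float:
    """Percent of positions at which two equal-length sequences agree.

    Assumption: the variant differs from the reference only by
    substitutions, so no alignment step is needed.
    """
    if not reference:
        raise ValueError("reference sequence is empty")
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length (substitution-only case)")
    matches = sum(a == b for a, b in zip(reference, variant))
    return 100.0 * matches / len(reference)


if __name__ == "__main__":
    # First 15 residues of SEQ ID NO:1 and of SEQ ID NO:17 (carries A13L).
    wild_type_fragment = "MASMDKVFAGYAARQ"
    mutant_fragment = "MASMDKVFAGYALRQ"
    print(f"{percent_identity(wild_type_fragment, mutant_fragment):.2f}% identity")
```

On full-length sequences, a mutant carrying only a handful of substitutions stays far above the 80% threshold.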
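Preferably, the transaminase mutant comprises an amino acid sequence obtained by modifying, substituting, deleting, or adding one or several amino acids at positions 13 to 312 of the amino acid sequence shown in SEQ ID NO:1.
Preferably, the transaminase mutant comprises an amino acid sequence obtained by substitution mutation of one or several amino acids at positions 13 to 312 of the amino acid sequence shown in SEQ ID NO:1.
In the invention, substitution mutation of one or several amino acids at positions 13 to 312 of the wild-type transaminase shown in SEQ ID NO:1 significantly improves the transaminase activity of the resulting mutant, which catalyzes the synthesis of (R)-2-(2,5-difluorophenyl)pyrrolidine more efficiently, thereby raising production efficiency and lowering production cost.
Preferably, the mutation is at any one or more of positions 180, 181, 187, 216, and 275, and more preferably at any three of positions 180, 181, 187, 216, and 275.
Preferably, the mutation is any one or more of K180R, N181Y, L187Y, G216L, and T275W, and more preferably any three of K180R, N181Y, L187Y, G216L, and T275W.
In the invention, introducing the K180R, N181Y, L187Y, G216L, or T275W mutation into the wild-type transaminase (SEQ ID NO:1) gives transaminase mutants with improved transaminase activity.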
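Substitution labels such as K180R or L187Y read as wild-type residue, 1-based position, replacement residue. The following Python sketch shows one way to apply such labels to a sequence while checking that the stated wild-type residue is really present. The function name `apply_mutations` is hypothetical, and for brevity the usage example works on only the first 15 residues of SEQ ID NO:1, where A13L can be checked against the start of SEQ ID NO:17.

```python
import re

MUTATION_PATTERN = re.compile(r"([A-Z])(\d+)([A-Z])")


def apply_mutations(sequence: str, mutations: list[str]) -> str:
    """Return a copy of `sequence` with substitutions such as 'A13L' applied.

    Each label is checked against the original sequence, so a label whose
    wild-type residue does not match raises ValueError.
    """
    residues = list(sequence)
    for label in mutations:
        match = MUTATION_PATTERN.fullmatch(label)
        if match is None:
            raise ValueError(f"cannot parse mutation label: {label!r}")
        wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(sequence):
            raise ValueError(f"{label}: position {position} is outside the sequence")
        if sequence[position - 1] != wild_type:
            raise ValueError(
                f"{label}: expected {wild_type} at position {position}, "
                f"found {sequence[position - 1]}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


if __name__ == "__main__":
    n_terminus = "MASMDKVFAGYAARQ"  # residues 1-15 of SEQ ID NO:1
    print(apply_mutations(n_terminus, ["A13L"]))
```

Passing the full SEQ ID NO:1 string with a list such as `["L187Y", "G216L", "T275W"]` would work the same way, provided the sequence is supplied without the spaces introduced by line wrapping.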
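Preferably, the mutation is further selected from position 177.
Preferably, the mutation is any one or more of P177N, K180R, N181Y, L187Y, G216L, G216R, and T275W.
In the invention, introducing the P177N, K180R, N181Y, L187Y, G216L, G216R, or T275W mutation into the wild-type transaminase (SEQ ID NO:1) gives transaminase mutants with markedly improved transaminase activity.
Preferably, the mutation is further selected from position 13 and/or position 301.
Preferably, the mutation is any one or more of A13L, A13R, P177N, K180R, N181Y, L187Y, G216W, G216R, T275W, K301R, and K301F.
Preferably, the mutation is further selected from position 284 and/or position 312.
Preferably, the mutation is any one or more of A13L, A13R, P177N, K180R, N181Y, L187Y, G216W, G216R, T275W, T284F, K301R, K301F, and D312F.
In the invention, introducing the A13L, A13R, P177N, K180R, N181Y, L187Y, G216W, G216R, T275W, T284F, K301R, K301F, or D312F mutation into the wild-type transaminase (SEQ ID NO:1) gives transaminase mutants with markedly improved transaminase activity.
Preferably, the mutation is further selected from position 138; that is, the substitution mutation comprises substitution of the amino acid at any one or at least two of positions 13, 138, 177, 180, 181, 187, 216, 275, 284, 301, and 312.
Preferably, the substitution mutation comprises any one or a combination of at least two of A13L, A13R, N138W, N138R, P177N, K180R, N181Y, L187Y, G216L, G216R, G216W, T275W, T284F, T284W, K301R, K301F, D312F, D312L, and D312R.
In the invention, introducing the A13L, A13R, N138W, N138R, P177N, K180R, N181Y, L187Y, G216L, G216R, G216W, T275W, T284F, T284W, K301R, K301F, D312F, D312L, or D312R mutation into the wild-type transaminase (SEQ ID NO:1) gives transaminase mutants with significantly improved transaminase activity.
Preferably, the transaminase mutant comprises the amino acid sequence obtained from the amino acid sequence shown in SEQ ID NO:1 by the substitution mutations L187Y, G216L, and T275W.
Preferably, the transaminase mutant comprises the amino acid sequence shown in SEQ ID NO:3, whose activity is improved relative to the wild-type transaminase;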
SEQ ID NO:3:
MASMDKVFAGYAARQAILESTETTNPFAKGIAWVEGELVPLAEARIPLLDQGFMHSDLTYDVPSV WDGRFFRLDDHITRLEASCTKLRLRLPLPRDQVKQILVEMVAKSGIRDAFVELIVTRGLKGVRGTRPEDI VNNLYMFVQPYVWVMEPDMQRVGGSAVVARTVRRVPPGAIDPTVKNLQWGDYVRGMFEAADRGAT YPFLTDGDAHLTEGSLFNIVLVKDGVLYTPDRGVLQGVTRKSVINAAEAFGIEVRVEFVPVELAYRCDEI FMCTWAGGIMPITTLDGMPVNGGQIGPITKKIWDGYWAMHYDAAYSFEIDYNERN。
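Preferably, the transaminase mutant comprises the amino acid sequence obtained from the amino acid sequence shown in SEQ ID NO:1 by the substitution mutations N181Y, G216L, and T275W.
Preferably, the transaminase mutant comprises the amino acid sequence shown in SEQ ID NO:5, whose activity is improved relative to the wild-type transaminase;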
SEQ ID NO:5:
MASMDKVFAGYAARQAILESTETTNPFAKGIAWVEGELVPLAEARIPLLDQGFMHSDLTYDVPSV WDGRFFRLDDHITRLEASCTKLRLRLPLPRDQVKQILVEMVAKSGIRDAFVELIVTRGLKGVRGTRPEDI VNNLYMFVQPYVWVMEPDMQRVGGSAVVARTVRRVPPGAIDPTVKYLQWGDLVRGMFEAADRGAT YPFLTDGDAHLTEGSLFNIVLVKDGVLYTPDRGVLQGVTRKSVINAAEAFGIEVRVEFVPVELAYRCDEI FMCTWAGGIMPITTLDGMPVNGGQIGPITKKIWDGYWAMHYDAAYSFEIDYNERN。
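Preferably, the transaminase mutant comprises the amino acid sequence obtained from the amino acid sequence shown in SEQ ID NO:1 by the substitution mutations K180R, L187Y, and G216L.
Preferably, the transaminase mutant comprises the amino acid sequence shown in SEQ ID NO:7, whose activity is improved relative to the wild-type transaminase;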
SEQ ID NO:7:
MASMDKVFAGYAARQAILESTETTNPFAKGIAWVEGELVPLAEARIPLLDQGFMHSDLTYDVPSV WDGRFFRLDDHITRLEASCTKLRLRLPLPRDQVKQILVEMVAKSGIRDAFVELIVTRGLKGVRGTRPEDI VNNLYMFVQPYVWVMEPDMQRVGGSAVVARTVRRVPPGAIDPTVRNLQWGDYVRGMFEAADRGAT YPFLTDGDAHLTEGSLFNIVLVKDGVLYTPDRGVLQGVTRKSVINAAEAFGIEVRVEFVPVELAYRCDEI FMCTTAGGIMPITTLDGMPVNGGQIGPITKKIWDGYWAMHYDAAYSFEIDYNERN。
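Preferably, the transaminase mutant comprises the amino acid sequence obtained from the amino acid sequence shown in SEQ ID NO:1 by the substitution mutations P177N, K180R, N181Y, L187Y, and T275W.
Preferably, the transaminase mutant comprises the amino acid sequence shown in SEQ ID NO:9, whose activity is markedly improved relative to the wild-type transaminase;
SEQ ID NO:9: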
MASMDKVFAGYAARQAILESTETTNPFAKGIAWVEGELVPLAEARIPLLDQGFMHSDLTYDVPSV WDGRFFRLDDHITRLEASCTKLRLRLPLPRDQVKQILVEMVAKSGIRDAFVELIVTRGLKGVRGTRPEDI VNNLYMFVQPYVWVMEPDMQRVGGSAVVARTVRRVPPGAIDNTVRYLQWGDYVRGMFEAADRGAT YPFLTDGDAHLTEGSGFNIVLVKDGVLYTPDRGVLQGVTRKSVINAAEAFGIEVRVEFVPVELAYRCDEI FMCTWAGGIMPITTLDGMPVNGGQIGPITKKIWDGYWAMHYDAAYSFEIDYNERN。
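Preferably, the transaminase mutant comprises the amino acid sequence obtained from the amino acid sequence shown in SEQ ID NO:1 by the substitution mutations P177N, N181Y, L187Y, G216L, and T275W.
Preferably, the transaminase mutant comprises the amino acid sequence shown in SEQ ID NO:11, whose activity is markedly improved relative to the wild-type transaminase;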
SEQ ID NO:11:
MASMDKVFAGYAARQAILESTETTNPFAKGIAWVEGELVPLAEARIPLLDQGFMHSDLTYDVPSV WDGRFFRLDDHITRLEASCTKLRLRLPLPRDQVKQILVEMVAKSGIRDAFVELIVTRGLKGVRGTRPEDI VNNLYMFVQPYVWVMEPDMQRVGGSAVVARTVRRVPPGAIDNTVKYLQWGDYVRGMFEAADRGAT YPFLTDGDAHLTEGSLFNIVLVKDGVLYTPDRGVLQGVTRKSVINAAEAFGIEVRVEFVPVELAYRCDEI FMCTWAGGIMPITTLDGMPVNGGQIGPITKKIWDGYWAMHYDAAYSFEIDYNERN。
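Preferably, the transaminase mutant comprises the amino acid sequence obtained from the amino acid sequence shown in SEQ ID NO:1 by the substitution mutations P177N, K180R, L187Y, G216R, and T275W.
Preferably, the transaminase mutant comprises the amino acid sequence shown in SEQ ID NO:13, whose activity is markedly improved relative to the wild-type transaminase;
SEQ ID NO:13: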
MASMDKVFAGYAARQAILESTETTNPFAKGIAWVEGELVPLAEARIPLLDQGFMHSDLTYDVPSV WDGRFFRLDDHITRLEASCTKLRLRLPLPRDQVKQILVEMVAKSGIRDAFVELIVTRGLKGVRGTRPEDI VNNLYMFVQPYVWVMEPDMQRVGGSAVVARTVRRVPPGAIDNTVRNLQWGDYVRGMFEAADRGAT YPFLTDGDAHLTEGSRFNIVLVKDGVLYTPDRGVLQGVTRKSVINAAEAFGIEVRVEFVPVELAYRCDEI FMCTWAGGIMPITTLDGMPVNGGQIGPITKKIWDGYWAMHYDAAYSFEIDYNERN。
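Preferably, the transaminase mutant comprises the amino acid sequence obtained from the amino acid sequence shown in SEQ ID NO:1 by the substitution mutations P177N, K180R, N181Y, L187Y, G216L, and T275W.
Preferably, the transaminase mutant comprises the amino acid sequence shown in SEQ ID NO:15, whose activity is significantly improved relative to the wild-type transaminase;
SEQ ID NO:15: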
MASMDKVFAGYAARQAILESTETTNPFAKGIAWVEGELVPLAEARIPLLDQGFMHSDLTYDVPSV WDGRFFRLDDHITRLEASCTKLRLRLPLPRDQVKQILVEMVAKSGIRDAFVELIVTRGLKGVRGTRPEDI VNNLYMFVQPYVWVMEPDMQRVGGSAVVARTVRRVPPGAIDNTVRYLQWGDYVRGMFEAADRGAT YPFLTDGDAHLTEGSLFNIVLVKDGVLYTPDRGVLQGVTRKSVINAAEAFGIEVRVEFVPVELAYRCDEI FMCTWAGGIMPITTLDGMPVNGGQIGPITKKIWDGYWAMHYDAAYSFEIDYNERN。
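Preferably, the transaminase mutant comprises the amino acid sequence obtained from the amino acid sequence shown in SEQ ID NO:1 by the substitution mutations A13L, P177N, K180R, N181Y, L187Y, G216W, T275W, and K301R.
Preferably, the transaminase mutant comprises the amino acid sequence shown in SEQ ID NO:17, whose activity is significantly improved relative to the wild-type transaminase;
SEQ ID NO:17: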
MASMDKVFAGYALRQAILESTETTNPFAKGIAWVEGELVPLAEARIPLLDQGFMHSDLTYDVPSV WDGRFFRLDDHITRLEASCTKLRLRLPLPRDQVKQILVEMVAKSGIRDAFVELIVTRGLKGVRGTRPEDI VNNLYMFVQPYVWVMEPDMQRVGGSAVVARTVRRVPPGAIDNTVRYLQWGDYVRGMFEAADRGAT YPFLTDGDAHLTEGSWFNIVLVKDGVLYTPDRGVLQGVTRKSVINAAEAFGIEVRVEFVPVELAYRCDE IFMCTWAGGIMPITTLDGMPVNGGQIGPITKRIWDGYWAMHYDAAYSFEIDYNERN。
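Preferably, the transaminase mutant comprises the amino acid sequence obtained from the amino acid sequence shown in SEQ ID NO:1 by the substitution mutations A13R, P177N, K180R, N181Y, L187Y, G216R, T275W, T284F, and K301F.
Preferably, the transaminase mutant comprises the amino acid sequence shown in SEQ ID NO:19, whose activity is significantly improved relative to the wild-type transaminase;
SEQ ID NO:19: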
MASMDKVFAGYARRQAILESTETTNPFAKGIAWVEGELVPLAEARIPLLDQGFMHSDLTYDVPSV WDGRFFRLDDHITRLEASCTKLRLRLPLPRDQVKQILVEMVAKSGIRDAFVELIVTRGLKGVRGTRPEDI VNNLYMFVQPYVWVMEPDMQRVGGSAVVARTVRRVPPGAIDNTVRYLQWGDYVRGMFEAADRGAT YPFLTDGDAHLTEGSRFNIVLVKDGVLYTPDRGVLQGVTRKSVINAAEAFGIEVRVEFVPVELAYRCDEI FMCTWAGGIMPITFLDGMPVNGGQIGPITKFIWDGYWAMHYDAAYSFEIDYNERN。
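Preferably, the transaminase mutant comprises the amino acid sequence obtained from the amino acid sequence shown in SEQ ID NO:1 by the substitution mutations A13L, P177N, K180R, N181Y, L187Y, G216W, T275W, K301R, and D312F.
Preferably, the transaminase mutant comprises the amino acid sequence shown in SEQ ID NO:21, whose activity is significantly improved relative to the wild-type transaminase;
SEQ ID NO:21: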
MASMDKVFAGYALRQAILESTETTNPFAKGIAWVEGELVPLAEARIPLLDQGFMHSDLTYDVPSV WDGRFFRLDDHITRLEASCTKLRLRLPLPRDQVKQILVEMVAKSGIRDAFVELIVTRGLKGVRGTRPEDI VNNLYMFVQPYVWVMEPDMQRVGGSAVVARTVRRVPPGAIDNTVRYLQWGDYVRGMFEAADRGAT YPFLTDGDAHLTEGSWFNIVLVKDGVLYTPDRGVLQGVTRKSVINAAEAFGIEVRVEFVPVELAYRCDE IFMCTWAGGIMPITTLDGMPVNGGQIGPITKRIWDGYWAMHYFAAYSFEIDYNERN。
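Preferably, the transaminase mutant comprises the amino acid sequence obtained from the amino acid sequence shown in SEQ ID NO:1 by the substitution mutations A13R, N138W, P177N, K180R, N181Y, L187Y, G216R, T275W, T284W, K301R, and D312L.
Preferably, the transaminase mutant comprises the amino acid sequence shown in SEQ ID NO:23, whose activity is significantly improved relative to the wild-type transaminase;
SEQ ID NO:23: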
MASMDKVFAGYARRQAILESTETTNPFAKGIAWVEGELVPLAEARIPLLDQGFMHSDLTYDVPSV WDGRFFRLDDHITRLEASCTKLRLRLPLPRDQVKQILVEMVAKSGIRDAFVELIVTRGLKGVRGTRPEDI VNWLYMFVQPYVWVMEPDMQRVGGSAVVARTVRRVPPGAIDNTVRYLQWGDYVRGMFEAADRGAT YPFLTDGDAHLTEGSRFNIVLVKDGVLYTPDRGVLQGVTRKSVINAAEAFGIEVRVEFVPVELAYRCDEI FMCTWAGGIMPITWLDGMPVNGGQIGPITKRIWDGYWAMHYLAAYSFEIDYNERN。
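Preferably, the transaminase mutant comprises the amino acid sequence obtained from the amino acid sequence shown in SEQ ID NO:1 by the substitution mutations A13L, N138R, P177N, K180R, N181Y, L187Y, G216R, T275W, T284W, K301R, and D312L.
Preferably, the transaminase mutant comprises the amino acid sequence shown in SEQ ID NO:25, whose activity is significantly improved relative to the wild-type transaminase;
SEQ ID NO:25: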
MASMDKVFAGYALRQAILESTETTNPFAKGIAWVEGELVPLAEARIPLLDQGFMHSDLTYDVPSV WDGRFFRLDDHITRLEASCTKLRLRLPLPRDQVKQILVEMVAKSGIRDAFVELIVTRGLKGVRGTRPEDI VNRLYMFVQPYVWVMEPDMQRVGGSAVVARTVRRVPPGAIDNTVRYLQWGDYVRGMFEAADRGAT YPFLTDGDAHLTEGSRFNIVLVKDGVLYTPDRGVLQGVTRKSVINAAEAFGIEVRVEFVPVELAYRCDEI FMCTWAGGIMPITWLDGMPVNGGQIGPITKRIWDGYWAMHYLAAYSFEIDYNERN。
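Preferably, the transaminase mutant comprises the amino acid sequence obtained from the amino acid sequence shown in SEQ ID NO:1 by the substitution mutations A13R, N138R, P177N, K180R, N181Y, L187Y, G216W, T275W, T284F, K301R, and D312R.
Preferably, the transaminase mutant comprises the amino acid sequence shown in SEQ ID NO:27, whose activity is significantly improved relative to the wild-type transaminase;
SEQ ID NO:27: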
MASMDKVFAGYARRQAILESTETTNPFAKGIAWVEGELVPLAEARIPLLDQGFMHSDLTYDVPSV WDGRFFRLDDHITRLEASCTKLRLRLPLPRDQVKQILVEMVAKSGIRDAFVELIVTRGLKGVRGTRPEDI VNRLYMFVQPYVWVMEPDMQRVGGSAVVARTVRRVPPGAIDNTVRYLQWGDYVRGMFEAADRGAT YPFLTDGDAHLTEGSWFNIVLVKDGVLYTPDRGVLQGVTRKSVINAAEAFGIEVRVEFVPVELAYRCDE IFMCTWAGGIMPITFLDGMPVNGGQIGPITKRIWDGYWAMHYRAAYSFEIDYNERN。
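Preferably, the transaminase mutant comprises the amino acid sequence obtained from the amino acid sequence shown in SEQ ID NO:1 by the substitution mutations A13L, N138W, P177N, K180R, N181Y, L187Y, G216W, T275W, T284F, K301F, and D312L.
Preferably, the transaminase mutant comprises the amino acid sequence shown in SEQ ID NO:29, whose activity is significantly improved relative to the wild-type transaminase;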
SEQ ID NO:29:
MASMDKVFAGYALRQAILESTETTNPFAKGIAWVEGELVPLAEARIPLLDQGFMHSDLTYDVPSV WDGRFFRLDDHITRLEASCTKLRLRLPLPRDQVKQILVEMVAKSGIRDAFVELIVTRGLKGVRGTRPEDI VNWLYMFVQPYVWVMEPDMQRVGGSAVVARTVRRVPPGAIDNTVRYLQWGDYVRGMFEAADRGAT YPFLTDGDAHLTEGSWFNIVLVKDGVLYTPDRGVLQGVTRKSVINAAEAFGIEVRVEFVPVELAYRCDE IFMCTWAGGIMPITFLDGMPVNGGQIGPITKFIWDGYWAMHYLAAYSFEIDYNERN。
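In a second aspect, the invention provides a nucleic acid molecule comprising a gene encoding the transaminase mutant of the first aspect.
Preferably, the nucleic acid molecule comprises the nucleic acid sequence shown in SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, or SEQ ID NO:30.
SEQ ID NO:2 is the gene encoding the wild-type transaminase shown in SEQ ID NO:1: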
atggcgagcatggacaaagtgtttgcgggttatgcggcgcgtcaggcgatcctggagagcaccgaaaccaccaacccgtttgcgaaaggcattgcgtggg tggaaggtgaactggttccgctggcggaggcgcgtatcccgctgctggaccaaggtttcatgcacagcgacctgacctatgatgtgccgagcgtttgggatggccg tttctttcgtctggacgatcacatcacccgtctggaagcgagctgcaccaaactgcgtctgcgtctgccgctgccgcgtgaccaggtgaagcaaattctggtggagat ggttgcgaaaagcggtatccgtgatgcgttcgtggaactgattgttacccgtggtctgaaaggcgttcgtggtacccgtccggaggacatcgtgaacaacctgtacat gtttgttcagccgtatgtgtgggttatggaaccggatatgcagcgtgtgggtggcagcgcggtggttgcgcgtaccgtgcgtcgtgttccgccgggtgcgattgatcc gaccgtgaagaacctgcagtggggcgacctggttcgtggcatgtttgaggcggcggatcgtggcgcgacctacccgtttctgaccgacggtgatgcgcacctgac cgaaggtagcggctttaacatcgtgctggttaaggacggtgtgctgtataccccggatcgtggcgtgctgcaaggtgttacccgtaaaagcgttatcaacgcggcgg aggcgtttggtattgaagtgcgtgttgaatttgtgccggttgagctggcgtaccgttgcgacgaaattttcatgtgcaccaccgcgggtggcatcatgccgattaccacc ctggatggcatgccggttaacggtggccagatcggtccgattaccaagaaaatctgggacggctactgggcgatgcactacgatgcggcgtatagctttgagattgactataacgaacgtaactaa。
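The relationship between a coding gene and its amino acid sequence can be checked by translating the DNA with the standard genetic code. The Python sketch below is a minimal illustration under stated assumptions: the function name `translate` is hypothetical, the standard code is assumed (the text does not name a translation table), and whitespace is stripped because the listing contains spaces introduced by line wrapping. The usage example translates only the first 45 nucleotides of SEQ ID NO:2.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence, stopping at the first stop codon."""
    cleaned = "".join(dna.split()).upper()  # drop wrapping spaces, normalise case
    if len(cleaned) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    protein = []
    for start in range(0, len(cleaned), 3):
        amino_acid = CODON_TABLE[cleaned[start:start + 3]]
        if amino_acid == "*":  # stop codon ends the open reading frame
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    # First 45 nucleotides of SEQ ID NO:2.
    print(translate("atggcgagcatggacaaagtgtttgcgggttatgcggcgcgtcag"))
```

The fragment translates to the first 15 residues of SEQ ID NO:1, and the final codons of the gene (aac gaa cgt aac taa) give the C-terminal NERN followed by a stop.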
SEQ ID NO:4 is the gene encoding the transaminase mutant shown in SEQ ID NO:3:
atggcgagcatggacaaagtgtttgcgggttatgcggcgcgtcaggcgatcctggagagcaccgaaaccaccaacccgtttgcgaaaggcattgcgtggg tggaaggtgaactggttccgctggcggaggcgcgtatcccgctgctggaccaaggtttcatgcacagcgacctgacctatgatgtgccgagcgtttgggatggccg tttctttcgtctggacgatcacatcacccgtctggaagcgagctgcaccaaactgcgtctgcgtctgccgctgccgcgtgaccaggtgaagcaaattctggtggagat ggttgcgaaaagcggtatccgtgatgcgttcgtggaactgattgttacccgtggtctgaaaggcgttcgtggtacccgtccggaggacatcgtgaacaacctgtacat gtttgttcagccgtatgtgtgggttatggaaccggatatgcagcgtgtgggtggcagcgcggtggttgcgcgtaccgtgcgtcgtgttccgccgggtgcgattgatcc gaccgtgaagaacctgcagtggggcgactacgttcgtggcatgtttgaggcggcggatcgtggcgcgacctacccgtttctgaccgacggtgatgcgcacctgac cgaaggtagcctgtttaacatcgtgctggttaaggacggtgtgctgtataccccggatcgtggcgtgctgcaaggtgttacccgtaaaagcgttatcaacgcggcgga ggcgtttggtattgaagtgcgtgttgaatttgtgccggttgagctggcgtaccgttgcgacgaaattttcatgtgcacctgggcgggtggcatcatgccgattaccaccc tggatggcatgccggttaacggtggccagatcggtccgattaccaagaaaatctgggacggctactgggcgatgcactacgatgcggcgtatagctttgagattgactataacgaacgtaactaa。
SEQ ID NO:6 is the gene encoding the transaminase mutant shown in SEQ ID NO:5:
atggcgagcatggacaaagtgtttgcgggttatgcggcgcgtcaggcgatcctggagagcaccgaaaccaccaacccgtttgcgaaaggcattgcgtggg tggaaggtgaactggttccgctggcggaggcgcgtatcccgctgctggaccaaggtttcatgcacagcgacctgacctatgatgtgccgagcgtttgggatggccg tttctttcgtctggacgatcacatcacccgtctggaagcgagctgcaccaaactgcgtctgcgtctgccgctgccgcgtgaccaggtgaagcaaattctggtggagat ggttgcgaaaagcggtatccgtgatgcgttcgtggaactgattgttacccgtggtctgaaaggcgttcgtggtacccgtccggaggacatcgtgaacaacctgtacat gtttgttcagccgtatgtgtgggttatggaaccggatatgcagcgtgtgggtggcagcgcggtggttgcgcgtaccgtgcgtcgtgttccgccgggtgcgattgatcc gaccgtgaagtacctgcagtggggcgacctggttcgtggcatgtttgaggcggcggatcgtggcgcgacctacccgtttctgaccgacggtgatgcgcacctgacc gaaggtagcctgtttaacatcgtgctggttaaggacggtgtgctgtataccccggatcgtggcgtgctgcaaggtgttacccgtaaaagcgttatcaacgcggcgga ggcgtttggtattgaagtgcgtgttgaatttgtgccggttgagctggcgtaccgttgcgacgaaattttcatgtgcacctgggcgggtggcatcatgccgattaccaccc tggatggcatgccggttaacggtggccagatcggtccgattaccaagaaaatctgggacggctactgggcgatgcactacgatgcggcgtatagctttgagattgactataacgaacgtaactaa。
SEQ ID NO:8 is the gene encoding the transaminase mutant shown in SEQ ID NO:7:
atggcgagcatggacaaagtgtttgcgggttatgcggcgcgtcaggcgatcctggagagcaccgaaaccaccaacccgtttgcgaaaggcattgcgtggg tggaaggtgaactggttccgctggcggaggcgcgtatcccgctgctggaccaaggtttcatgcacagcgacctgacctatgatgtgccgagcgtttgggatggccg tttctttcgtctggacgatcacatcacccgtctggaagcgagctgcaccaaactgcgtctgcgtctgccgctgccgcgtgaccaggtgaagcaaattctggtggagat ggttgcgaaaagcggtatccgtgatgcgttcgtggaactgattgttacccgtggtctgaaaggcgttcgtggtacccgtccggaggacatcgtgaacaacctgtacat gtttgttcagccgtatgtgtgggttatggaaccggatatgcagcgtgtgggtggcagcgcggtggttgcgcgtaccgtgcgtcgtgttccgccgggtgcgattgatcc gaccgtgcgtaacctgcagtggggcgactacgttcgtggcatgtttgaggcggcggatcgtggcgcgacctacccgtttctgaccgacggtgatgcgcacctgacc gaaggtagcctgtttaacatcgtgctggttaaggacggtgtgctgtataccccggatcgtggcgtgctgcaaggtgttacccgtaaaagcgttatcaacgcggcgga ggcgtttggtattgaagtgcgtgttgaatttgtgccggttgagctggcgtaccgttgcgacgaaattttcatgtgcaccaccgcgggtggcatcatgccgattaccaccc tggatggcatgccggttaacggtggccagatcggtccgattaccaagaaaatctgggacggctactgggcgatgcactacgatgcggcgtatagctttgagattgactataacgaacgtaactaa。
SEQ ID NO:10 is the gene encoding the transaminase mutant shown in SEQ ID NO:9:
atggcgagcatggacaaagtgtttgcgggttatgcggcgcgtcaggcgatcctggagagcaccgaaaccaccaacccgtttgcgaaaggcattgcgtggg tggaaggtgaactggttccgctggcggaggcgcgtatcccgctgctggaccaaggtttcatgcacagcgacctgacctatgatgtgccgagcgtttgggatggccg tttctttcgtctggacgatcacatcacccgtctggaagcgagctgcaccaaactgcgtctgcgtctgccgctgccgcgtgaccaggtgaagcaaattctggtggagat ggttgcgaaaagcggtatccgtgatgcgttcgtggaactgattgttacccgtggtctgaaaggcgttcgtggtacccgtccggaggacatcgtgaacaacctgtacat gtttgttcagccgtatgtgtgggttatggaaccggatatgcagcgtgtgggtggcagcgcggtggttgcgcgtaccgtgcgtcgtgttccgccgggtgcgattgataa caccgtgcgttacctgcagtggggcgactacgttcgtggcatgtttgaggcggcggatcgtggcgcgacctacccgtttctgaccgacggtgatgcgcacctgacc gaaggtagcggctttaacatcgtgctggttaaggacggtgtgctgtataccccggatcgtggcgtgctgcaaggtgttacccgtaaaagcgttatcaacgcggcgga ggcgtttggtattgaagtgcgtgttgaatttgtgccggttgagctggcgtaccgttgcgacgaaattttcatgtgcacctgggcgggtggcatcatgccgattaccaccc tggatggcatgccggttaacggtggccagatcggtccgattaccaagaaaatctgggacggctactgggcgatgcactacgatgcggcgtatagctttgagattgactataacgaacgtaactaa。
SEQ ID NO:12 is the gene encoding the transaminase mutant shown in SEQ ID NO:11:
atggcgagcatggacaaagtgtttgcgggttatgcggcgcgtcaggcgatcctggagagcaccgaaaccaccaacccgtttgcgaaaggcattgcgtggg tggaaggtgaactggttccgctggcggaggcgcgtatcccgctgctggaccaaggtttcatgcacagcgacctgacctatgatgtgccgagcgtttgggatggccg tttctttcgtctggacgatcacatcacccgtctggaagcgagctgcaccaaactgcgtctgcgtctgccgctgccgcgtgaccaggtgaagcaaattctggtggagat ggttgcgaaaagcggtatccgtgatgcgttcgtggaactgattgttacccgtggtctgaaaggcgttcgtggtacccgtccggaggacatcgtgaacaacctgtacat gtttgttcagccgtatgtgtgggttatggaaccggatatgcagcgtgtgggtggcagcgcggtggttgcgcgtaccgtgcgtcgtgttccgccgggtgcgattgataa caccgtgaagtacctgcagtggggcgactacgttcgtggcatgtttgaggcggcggatcgtggcgcgacctacccgtttctgaccgacggtgatgcgcacctgacc gaaggtagcctgtttaacatcgtgctggttaaggacggtgtgctgtataccccggatcgtggcgtgctgcaaggtgttacccgtaaaagcgttatcaacgcggcgga ggcgtttggtattgaagtgcgtgttgaatttgtgccggttgagctggcgtaccgttgcgacgaaattttcatgtgcacctgggcgggtggcatcatgccgattaccaccc tggatggcatgccggttaacggtggccagatcggtccgattaccaagaaaatctgggacggctactgggcgatgcactacgatgcggcgtatagctttgagattgactataacgaacgtaactaa。
SEQ ID NO:14 is the gene encoding the transaminase mutant shown in SEQ ID NO:13:
atggcgagcatggacaaagtgtttgcgggttatgcggcgcgtcaggcgatcctggagagcaccgaaaccaccaacccgtttgcgaaaggcattgcgtggg tggaaggtgaactggttccgctggcggaggcgcgtatcccgctgctggaccaaggtttcatgcacagcgacctgacctatgatgtgccgagcgtttgggatggccg tttctttcgtctggacgatcacatcacccgtctggaagcgagctgcaccaaactgcgtctgcgtctgccgctgccgcgtgaccaggtgaagcaaattctggtggagat ggttgcgaaaagcggtatccgtgatgcgttcgtggaactgattgttacccgtggtctgaaaggcgttcgtggtacccgtccggaggacatcgtgaacaacctgtacat gtttgttcagccgtatgtgtgggttatggaaccggatatgcagcgtgtgggtggcagcgcggtggttgcgcgtaccgtgcgtcgtgttccgccgggtgcgattgataa caccgtgcgtaacctgcagtggggcgactacgttcgtggcatgtttgaggcggcggatcgtggcgcgacctacccgtttctgaccgacggtgatgcgcacctgacc gaaggtagccgttttaacatcgtgctggttaaggacggtgtgctgtataccccggatcgtggcgtgctgcaaggtgttacccgtaaaagcgttatcaacgcggcgga ggcgtttggtattgaagtgcgtgttgaatttgtgccggttgagctggcgtaccgttgcgacgaaattttcatgtgcacctgggcgggtggcatcatgccgattaccaccc tggatggcatgccggttaacggtggccagatcggtccgattaccaagaaaatctgggacggctactgggcgatgcactacgatgcggcgtatagctttgagattgactataacgaacgtaactaa。
SEQ ID NO:16 is the gene encoding the transaminase mutant shown in SEQ ID NO:15:
atggcgagcatggacaaagtgtttgcgggttatgcggcgcgtcaggcgatcctggagagcaccgaaaccaccaacccgtttgcgaaaggcattgcgtggg tggaaggtgaactggttccgctggcggaggcgcgtatcccgctgctggaccaaggtttcatgcacagcgacctgacctatgatgtgccgagcgtttgggatggccg tttctttcgtctggacgatcacatcacccgtctggaagcgagctgcaccaaactgcgtctgcgtctgccgctgccgcgtgaccaggtgaagcaaattctggtggagat ggttgcgaaaagcggtatccgtgatgcgttcgtggaactgattgttacccgtggtctgaaaggcgttcgtggtacccgtccggaggacatcgtgaacaacctgtacat gtttgttcagccgtatgtgtgggttatggaaccggatatgcagcgtgtgggtggcagcgcggtggttgcgcgtaccgtgcgtcgtgttccgccgggtgcgattgataa caccgtgcgttacctgcagtggggcgactacgttcgtggcatgtttgaggcggcggatcgtggcgcgacctacccgtttctgaccgacggtgatgcgcacctgacc gaaggtagcctgtttaacatcgtgctggttaaggacggtgtgctgtataccccggatcgtggcgtgctgcaaggtgttacccgtaaaagcgttatcaacgcggcgga ggcgtttggtattgaagtgcgtgttgaatttgtgccggttgagctggcgtaccgttgcgacgaaattttcatgtgcacctgggcgggtggcatcatgccgattaccaccc tggatggcatgccggttaacggtggccagatcggtccgattaccaagaaaatctgggacggctactgggcgatgcactacgatgcggcgtatagctttgagattgactataacgaacgtaactaa。
SEQ ID NO:18 is the gene encoding the transaminase mutant shown in SEQ ID NO:17:
atggcgagcatggacaaagtgtttgcgggttatgcgctgcgtcaggcgatcctggagagcaccgaaaccaccaacccgtttgcgaaaggcattgcgtgggt ggaaggtgaactggttccgctggcggaggcgcgtatcccgctgctggaccaaggtttcatgcacagcgacctgacctatgatgtgccgagcgtttgggatggccgt ttctttcgtctggacgatcacatcacccgtctggaagcgagctgcaccaaactgcgtctgcgtctgccgctgccgcgtgaccaggtgaagcaaattctggtggagatg gttgcgaaaagcggtatccgtgatgcgttcgtggaactgattgttacccgtggtctgaaaggcgttcgtggtacccgtccggaggacatcgtgaacaacctgtacatg tttgttcagccgtatgtgtgggttatggaaccggatatgcagcgtgtgggtggcagcgcggtggttgcgcgtaccgtgcgtcgtgttccgccgggtgcgattgataac accgtgcgttacctgcagtggggcgactacgttcgtggcatgtttgaggcggcggatcgtggcgcgacctacccgtttctgaccgacggtgatgcgcacctgaccg aaggtagctggtttaacatcgtgctggttaaggacggtgtgctgtataccccggatcgtggcgtgctgcaaggtgttacccgtaaaagcgttatcaacgcggcggag gcgtttggtattgaagtgcgtgttgaatttgtgccggttgagctggcgtaccgttgcgacgaaattttcatgtgcacctgggcgggtggcatcatgccgattaccaccct ggatggcatgccggttaacggtggccagatcggtccgattaccaagcgtatctgggacggctactgggcgatgcactacgatgcggcgtatagctttgagattgactataacgaacgtaactaa。
SEQ ID NO:20 is the gene encoding the transaminase mutant shown in SEQ ID NO:19:
atggcgagcatggacaaagtgtttgcgggttatgcgcgtcgtcaggcgatcctggagagcaccgaaaccaccaacccgtttgcgaaaggcattgcgtgggt ggaaggtgaactggttccgctggcggaggcgcgtatcccgctgctggaccaaggtttcatgcacagcgacctgacctatgatgtgccgagcgtttgggatggccgt ttctttcgtctggacgatcacatcacccgtctggaagcgagctgcaccaaactgcgtctgcgtctgccgctgccgcgtgaccaggtgaagcaaattctggtggagatg gttgcgaaaagcggtatccgtgatgcgttcgtggaactgattgttacccgtggtctgaaaggcgttcgtggtacccgtccggaggacatcgtgaacaacctgtacatg tttgttcagccgtatgtgtgggttatggaaccggatatgcagcgtgtgggtggcagcgcggtggttgcgcgtaccgtgcgtcgtgttccgccgggtgcgattgataac accgtgcgttacctgcagtggggcgactacgttcgtggcatgtttgaggcggcggatcgtggcgcgacctacccgtttctgaccgacggtgatgcgcacctgaccg aaggtagccgttttaacatcgtgctggttaaggacggtgtgctgtataccccggatcgtggcgtgctgcaaggtgttacccgtaaaagcgttatcaacgcggcggag gcgtttggtattgaagtgcgtgttgaatttgtgccggttgagctggcgtaccgttgcgacgaaattttcatgtgcacctgggcgggtggcatcatgccgattacctttctg gatggcatgccggttaacggtggccagatcggtccgattaccaagtttatctgggacggctactgggcgatgcactacgatgcggcgtatagctttgagattgactataacgaacgtaactaa。
SEQ ID NO:22 is the gene encoding the transaminase mutant shown in SEQ ID NO:21:
atggcgagcatggacaaagtgtttgcgggttatgcgctgcgtcaggcgatcctggagagcaccgaaaccaccaacccgtttgcgaaaggcattgcgtgggt ggaaggtgaactggttccgctggcggaggcgcgtatcccgctgctggaccaaggtttcatgcacagcgacctgacctatgatgtgccgagcgtttgggatggccgt ttctttcgtctggacgatcacatcacccgtctggaagcgagctgcaccaaactgcgtctgcgtctgccgctgccgcgtgaccaggtgaagcaaattctggtggagatg gttgcgaaaagcggtatccgtgatgcgttcgtggaactgattgttacccgtggtctgaaaggcgttcgtggtacccgtccggaggacatcgtgaacaacctgtacatg tttgttcagccgtatgtgtgggttatggaaccggatatgcagcgtgtgggtggcagcgcggtggttgcgcgtaccgtgcgtcgtgttccgccgggtgcgattgataac accgtgcgttacctgcagtggggcgactacgttcgtggcatgtttgaggcggcggatcgtggcgcgacctacccgtttctgaccgacggtgatgcgcacctgaccg aaggtagctggtttaacatcgtgctggttaaggacggtgtgctgtataccccggatcgtggcgtgctgcaaggtgttacccgtaaaagcgttatcaacgcggcggag gcgtttggtattgaagtgcgtgttgaatttgtgccggttgagctggcgtaccgttgcgacgaaattttcatgtgcacctgggcgggtggcatcatgccgattaccaccct ggatggcatgccggttaacggtggccagatcggtccgattaccaagcgtatctgggacggctactgggcgatgcactactttgcggcgtatagctttgagattgactataacgaacgtaactaa。
SEQ ID NO:24 is the gene encoding the transaminase mutant shown in SEQ ID NO:23:
atggcgagcatggacaaagtgtttgcgggttatgcgcgtcgtcaggcgatcctggagagcaccgaaaccaccaacccgtttgcgaaaggcattgcgtgggt ggaaggtgaactggttccgctggcggaggcgcgtatcccgctgctggaccaaggtttcatgcacagcgacctgacctatgatgtgccgagcgtttgggatggccgt ttctttcgtctggacgatcacatcacccgtctggaagcgagctgcaccaaactgcgtctgcgtctgccgctgccgcgtgaccaggtgaagcaaattctggtggagatg gttgcgaaaagcggtatccgtgatgcgttcgtggaactgattgttacccgtggtctgaaaggcgttcgtggtacccgtccggaggacatcgtgaactggctgtacatgt ttgttcagccgtatgtgtgggttatggaaccggatatgcagcgtgtgggtggcagcgcggtggttgcgcgtaccgtgcgtcgtgttccgccgggtgcgattgataaca ccgtgcgttacctgcagtggggcgactacgttcgtggcatgtttgaggcggcggatcgtggcgcgacctacccgtttctgaccgacggtgatgcgcacctgaccga aggtagccgttttaacatcgtgctggttaaggacggtgtgctgtataccccggatcgtggcgtgctgcaaggtgttacccgtaaaagcgttatcaacgcggcggaggcgtttggtattgaagtgcgtgttgaatttgtgccggttgagctggcgtaccgttgcgacgaaattttcatgtgcacctgggcgggtggcatcatgccgattacctggctg gatggcatgccggttaacggtggccagatcggtccgattaccaagcgtatctgggacggctactgggcgatgcactacctggcggcgtatagctttgagattgactataacgaacgtaactaa。
SEQ ID NO:26 is the gene encoding the transaminase mutant shown in SEQ ID NO:25:
atggcgagcatggacaaagtgtttgcgggttatgcgctgcgtcaggcgatcctggagagcaccgaaaccaccaacccgtttgcgaaaggcattgcgtgggt ggaaggtgaactggttccgctggcggaggcgcgtatcccgctgctggaccaaggtttcatgcacagcgacctgacctatgatgtgccgagcgtttgggatggccgt ttctttcgtctggacgatcacatcacccgtctggaagcgagctgcaccaaactgcgtctgcgtctgccgctgccgcgtgaccaggtgaagcaaattctggtggagatg gttgcgaaaagcggtatccgtgatgcgttcgtggaactgattgttacccgtggtctgaaaggcgttcgtggtacccgtccggaggacatcgtgaaccgtctgtacatgt ttgttcagccgtatgtgtgggttatggaaccggatatgcagcgtgtgggtggcagcgcggtggttgcgcgtaccgtgcgtcgtgttccgccgggtgcgattgataaca ccgtgcgttacctgcagtggggcgactacgttcgtggcatgtttgaggcggcggatcgtggcgcgacctacccgtttctgaccgacggtgatgcgcacctgaccga aggtagccgttttaacatcgtgctggttaaggacggtgtgctgtataccccggatcgtggcgtgctgcaaggtgttacccgtaaaagcgttatcaacgcggcggaggcgtttggtattgaagtgcgtgttgaatttgtgccggttgagctggcgtaccgttgcgacgaaattttcatgtgcacctgggcgggtggcatcatgccgattacctggctg gatggcatgccggttaacggtggccagatcggtccgattaccaagcgtatctgggacggctactgggcgatgcactacctggcggcgtatagctttgagattgactataacgaacgtaactaa。
SEQ ID NO:28 is the gene encoding the transaminase mutant shown in SEQ ID NO:27:
atggcgagcatggacaaagtgtttgcgggttatgcgcgtcgtcaggcgatcctggagagcaccgaaaccaccaacccgtttgcgaaaggcattgcgtgggt ggaaggtgaactggttccgctggcggaggcgcgtatcccgctgctggaccaaggtttcatgcacagcgacctgacctatgatgtgccgagcgtttgggatggccgt ttctttcgtctggacgatcacatcacccgtctggaagcgagctgcaccaaactgcgtctgcgtctgccgctgccgcgtgaccaggtgaagcaaattctggtggagatg gttgcgaaaagcggtatccgtgatgcgttcgtggaactgattgttacccgtggtctgaaaggcgttcgtggtacccgtccggaggacatcgtgaaccgtctgtacatgt ttgttcagccgtatgtgtgggttatggaaccggatatgcagcgtgtgggtggcagcgcggtggttgcgcgtaccgtgcgtcgtgttccgccgggtgcgattgataaca ccgtgcgttacctgcagtggggcgactacgttcgtggcatgtttgaggcggcggatcgtggcgcgacctacccgtttctgaccgacggtgatgcgcacctgaccga aggtagctggtttaacatcgtgctggttaaggacggtgtgctgtataccccggatcgtggcgtgctgcaaggtgttacccgtaaaagcgttatcaacgcggcggaggcgtttggtattgaagtgcgtgttgaatttgtgccggttgagctggcgtaccgttgcgacgaaattttcatgtgcacctgggcgggtggcatcatgccgattacctttctgg atggcatgccggttaacggtggccagatcggtccgattaccaagcgtatctgggacggctactgggcgatgcactaccgtgcggcgtatagctttgagattgactataacgaacgtaactaa。
SEQ ID NO:30 is the gene encoding the transaminase mutant shown in SEQ ID NO:29:
atggcgagcatggacaaagtgtttgcgggttatgcgctgcgtcaggcgatcctggagagcaccgaaaccaccaacccgtttgcgaaaggcattgcgtgggt ggaaggtgaactggttccgctggcggaggcgcgtatcccgctgctggaccaaggtttcatgcacagcgacctgacctatgatgtgccgagcgtttgggatggccgt ttctttcgtctggacgatcacatcacccgtctggaagcgagctgcaccaaactgcgtctgcgtctgccgctgccgcgtgaccaggtgaagcaaattctggtggagatg gttgcgaaaagcggtatccgtgatgcgttcgtggaactgattgttacccgtggtctgaaaggcgttcgtggtacccgtccggaggacatcgtgaactggctgtacatgt ttgttcagccgtatgtgtgggttatggaaccggatatgcagcgtgtgggtggcagcgcggtggttgcgcgtaccgtgcgtcgtgttccgccgggtgcgattgataaca ccgtgcgttacctgcagtggggcgactacgttcgtggcatgtttgaggcggcggatcgtggcgcgacctacccgtttctgaccgacggtgatgcgcacctgaccga aggtagctggtttaacatcgtgctggttaaggacggtgtgctgtataccccggatcgtggcgtgctgcaaggtgttacccgtaaaagcgttatcaacgcggcggaggcgtttggtattgaagtgcgtgttgaatttgtgccggttgagctggcgtaccgttgcgacgaaattttcatgtgcacctgggcgggtggcatcatgccgattacctttctgg atggcatgccggttaacggtggccagatcggtccgattaccaagtttatctgggacggctactgggcgatgcactacctggcggcgtatagctttgagattgactataacgaacgtaactaa。
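In a third aspect, the present invention provides a recombinant expression vector comprising the nucleic acid molecule of the second aspect of the present invention.
In a fourth aspect, the present invention provides a recombinant host cell that expresses the transaminase mutant of the first aspect of the present invention.
Preferably, the nucleic acid molecule of the second aspect of the present invention is integrated into the genome of the recombinant host cell.
Preferably, the recombinant host cell contains the recombinant expression vector of the third aspect of the present invention.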
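In a fifth aspect, the present invention provides a method for preparing the transaminase mutant of the first aspect, the method comprising the following steps:
(1) ligating the nucleic acid molecule of the second aspect into an expression vector to construct a recombinant expression vector;
(2) transforming competent cells with the recombinant expression vector and performing resistance screening to obtain positive clones;
(3) subjecting the positive clones to resistance screening and induced culture, then collecting and disrupting the cells to obtain a crude enzyme solution of the transaminase mutant.
Preferably, the method further comprises a step of purifying the crude enzyme solution to obtain the purified transaminase mutant.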
In a sixth aspect, the present invention provides a composition comprising any one, or a combination of at least two, of the following: the wild-type transaminase shown in SEQ ID NO:1, and the transaminase mutants shown in SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, and SEQ ID NO:29.
Preferably, the composition comprises any one of an enzyme powder, an enzyme solution, or cells.
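In a seventh aspect, the present invention provides a method for preparing (R)-2-(2,5-difluorophenyl)pyrrolidine, the method comprising:
mixing 4-chloro-1-(2,5-difluorophenyl)butan-1-one, the composition of the sixth aspect, pyridoxal phosphate (PLP), isopropylamine and/or isopropylamine hydrochloride, a cosolvent, and a buffer, and reacting to obtain (R)-2-(2,5-difluorophenyl)pyrrolidine. The reaction mechanism is shown in Figure 1: in the mixture of cosolvent and buffer, under the action of the transaminase and PLP, the amino group of isopropylamine is transferred to compound 1 to form compound 2; compound 2 cyclizes spontaneously under the reaction conditions and, under the action of the transaminase and PLP, forms compound 3.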
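Preferably, the reaction time is 12 to 48 h, for example 12 h, 18 h, 24 h, 30 h, 36 h, 42 h, or 48 h.
Preferably, the reaction temperature is 25 to 50 °C, for example 25 °C, 30 °C, 35 °C, 40 °C, 45 °C, or 50 °C.
Preferably, the concentration of 4-chloro-1-(2,5-difluorophenyl)butan-1-one is 1 to 100 g/L, for example 1 g/L, 10 g/L, 20 g/L, 30 g/L, 40 g/L, 50 g/L, 60 g/L, 70 g/L, 80 g/L, 90 g/L, or 100 g/L.
Preferably, the concentration of the composition is 1 to 100 g/L.
Preferably, when the composition is a transaminase enzyme powder, the concentration is 1 to 10 g/L, for example 1 g/L, 2 g/L, 4 g/L, 6 g/L, 8 g/L, or 10 g/L; when the composition is a transaminase enzyme solution, the concentration is 3 to 40 g/L, for example 3 g/L, 10 g/L, 15 g/L, 20 g/L, 25 g/L, 30 g/L, 35 g/L, or 40 g/L; when the composition is cells containing the transaminase, the concentration is 10 to 100 g/L, for example 10 g/L, 20 g/L, 30 g/L, 40 g/L, 50 g/L, 60 g/L, 70 g/L, 80 g/L, 90 g/L, or 100 g/L.
Preferably, the concentration of pyridoxal phosphate is 0.01 to 1 g/L, for example 0.01 g/L, 0.05 g/L, 0.1 g/L, 0.15 g/L, 0.2 g/L, 0.25 g/L, 0.3 g/L, 0.35 g/L, 0.4 g/L, 0.45 g/L, 0.5 g/L, 0.55 g/L, 0.6 g/L, 0.65 g/L, 0.7 g/L, 0.75 g/L, 0.8 g/L, 0.85 g/L, 0.9 g/L, 0.95 g/L, or 1 g/L.
Preferably, the concentration of isopropylamine and/or isopropylamine hydrochloride is 1 to 110 g/L, for example 1 g/L, 10 g/L, 20 g/L, 30 g/L, 40 g/L, 50 g/L, 60 g/L, 70 g/L, 80 g/L, 90 g/L, 100 g/L, or 110 g/L.
Preferably, the cosolvent comprises any one, or a combination of at least two, of ethanol, acetonitrile, or dimethyl sulfoxide, and is preferably dimethyl sulfoxide.
Preferably, the volume concentration of dimethyl sulfoxide is 10 to 50% V/V, for example 10% V/V, 15% V/V, 20% V/V, 25% V/V, 30% V/V, 35% V/V, 40% V/V, 45% V/V, or 50% V/V.
Preferably, the buffer comprises a phosphate buffer and/or a triethanolamine-hydrochloric acid buffer.
Preferably, the pH of the buffer is 7.0 to 9.0, for example 7.0, 7.5, 8.0, 8.5, or 9.0.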
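As an illustrative sketch only (it is not part of the described method), the preferred ranges listed above can be collected into a lookup table and used to check a planned reaction mixture. The dictionary keys and the helper name `out_of_range` are names introduced here for illustration. The sample recipe uses the quantities given in Example 5 and assumes a liquid volume of 20 mL (4 mL dimethyl sulfoxide plus 16 mL buffer, with the volume of the solids neglected).

```python
# Preferred ranges from the seventh aspect (g/L unless the key says otherwise).
PREFERRED = {
    "substrate_g_per_l": (1, 100),
    "enzyme_powder_g_per_l": (1, 10),
    "plp_g_per_l": (0.01, 1),
    "isopropylamine_g_per_l": (1, 110),
    "dmso_percent_v_v": (10, 50),
    "ph": (7.0, 9.0),
    "temperature_c": (25, 50),
    "time_h": (12, 48),
}


def out_of_range(recipe):
    """Return the names of parameters that fall outside the preferred ranges."""
    return [
        name
        for name, value in recipe.items()
        if not (PREFERRED[name][0] <= value <= PREFERRED[name][1])
    ]


# Example 5 quantities in mg and mL (mg/mL is numerically equal to g/L).
volume_ml = 20  # assumption: 4 mL DMSO + 16 mL buffer, solids neglected
example5 = {
    "substrate_g_per_l": 1000 / volume_ml,
    "enzyme_powder_g_per_l": 200 / volume_ml,
    "plp_g_per_l": 10 / volume_ml,
    "isopropylamine_g_per_l": 1100 / volume_ml,
    "dmso_percent_v_v": 100 * 4 / volume_ml,
    "ph": 8.5,
    "temperature_c": 40,
    "time_h": 24,
}

print(out_of_range(example5))
```

Under these assumptions every Example 5 parameter lies inside its preferred range, so the list printed is empty; the enzyme powder loading sits exactly at the 10 g/L upper limit.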
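In an eighth aspect, the present invention provides the use of the transaminase mutant of the first aspect, the nucleic acid molecule of the second aspect, the recombinant expression vector of the third aspect, the recombinant host cell of the fourth aspect, or the composition of the sixth aspect in the preparation of (R)-2-(2,5-difluorophenyl)pyrrolidine.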
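Compared with the prior art, the present invention has the following beneficial effects:
(1) By substitution mutation of amino acids at positions 13 to 312 of the wild-type transaminase, the present invention obtains transaminase mutants with significantly enhanced transaminase activity that efficiently catalyze the synthesis of (R)-2-(2,5-difluorophenyl)pyrrolidine from 4-chloro-1-(2,5-difluorophenyl)butan-1-one, improving the efficiency of that synthesis;
(2) The method for preparing the transaminase mutants of the present invention is simple and gives a relatively high yield; both the crude enzyme solution before purification and the high-purity transaminase mutant after purification show high transaminase activity and can be used for the synthesis of (R)-2-(2,5-difluorophenyl)pyrrolidine;
(3) The method of the present invention for synthesizing (R)-2-(2,5-difluorophenyl)pyrrolidine requires no expensive or hazardous chemical catalyst and no configuration inversion; the reaction conditions are mild and the process is simple, which improves production efficiency and safety, lowers production cost, and suits large-scale industrial production. The method has important application prospects for preparing (R)-2-(2,5-difluorophenyl)pyrrolidine, the key chiral intermediate of larotrectinib, and for preparing larotrectinib itself.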
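Brief Description of the Drawings
Figure 1 is the synthetic route to (R)-2-(2,5-difluorophenyl)pyrrolidine;
Figure 2 is the reversed-phase achiral HPLC chromatogram of the 4-chloro-1-(2,5-difluorophenyl)butan-1-one standard;
Figure 3 is the reversed-phase achiral HPLC chromatogram of the (R,S)-2-(2,5-difluorophenyl)pyrrolidine standard;
Figure 4 is the normal-phase chiral HPLC chromatogram of (R,S)-2-(2,5-difluorophenyl)pyrrolidine;
Figure 5 is the normal-phase chiral HPLC chromatogram of (S)-2-(2,5-difluorophenyl)pyrrolidine;
Figure 6 is the reversed-phase achiral HPLC chromatogram of the in-process control of the Example 2 reaction;
Figure 7 is the normal-phase chiral HPLC chromatogram of the (R)-2-(2,5-difluorophenyl)pyrrolidine synthesized in Example 5;
Figure 8 is the MS spectrum of the (R)-2-(2,5-difluorophenyl)pyrrolidine synthesized in Example 5.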
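Detailed Description
To further explain the technical means adopted by the present invention and their effects, the present invention is described below with reference to the examples and drawings. It should be understood that the specific embodiments described here serve only to explain the present invention and do not limit it.
Where an example does not state a specific technique or condition, the technique or condition described in the literature of the field, or in the product instructions, was followed. Reagents and instruments whose manufacturer is not stated are conventional products available through regular commercial channels.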
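Example 1 Expression of the R-selective transaminase
(1) The coding gene (SEQ ID NO:2) of the wild-type transaminase from Aspergillus terreus was sent to GenScript (Nanjing) for codon optimization and total gene synthesis and was ligated into plasmid pET30a(+). The recombinant plasmid was transformed into Escherichia coli BL21(DE3) to obtain a culture containing the successfully transformed recombinant strain. The culture was spread on LB agar plates containing 50 μg/mL kanamycin and incubated overnight at 37 °C. Transformants were picked and verified by sequencing, and the verified transformant was named E.coli BL21/pET30a-No.2.
(2) The recombinant strain E.coli BL21/pET30a-No.2 was inoculated into 5 mL of LB liquid medium containing 50 μg/mL kanamycin and cultured overnight at 37 °C. 1 mL of this culture was inoculated into 125 mL of LB liquid medium containing 50 μg/mL kanamycin and cultured at 37 °C for 6 h, then 125 μL of 1 M IPTG was added and induction was carried out overnight at 25 °C. The cells were collected by centrifugation (4000 rpm, 4 °C, 10 min) and resuspended in 4 volumes of PBS buffer (pH = 7.0). The resuspended cells were disrupted by sonication and the solution was freeze-dried to obtain the wild-type transaminase enzyme powder. The amino acid sequence of the wild-type transaminase is shown in SEQ ID NO:1.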
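For orientation, the IPTG addition in step (2) can be converted to a final inducer concentration with a simple dilution calculation. The function name `final_conc_mM` is introduced here for illustration; the only inputs are the stock concentration and the two volumes stated in the step.

```python
def final_conc_mM(stock_molar, stock_ul, culture_ml):
    """Final concentration (mM) after adding a stock solution to a culture."""
    stock_ml = stock_ul / 1000
    total_ml = culture_ml + stock_ml
    return stock_molar * 1000 * stock_ml / total_ml


# 125 uL of 1 M IPTG added to 125 mL of culture, as in step (2).
iptg_mM = final_conc_mM(stock_molar=1.0, stock_ul=125, culture_ml=125)
print(f"IPTG final concentration: {iptg_mM:.2f} mM")
```

The addition corresponds to a final IPTG concentration of about 1 mM.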
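Example 2 Activity assay of the R-selective transaminase
10 mg of the wild-type transaminase enzyme powder prepared in Example 1 was mixed with 1 mg pyridoxal phosphate (PLP), 11 mg isopropylamine hydrochloride (ATA), 100 μL dimethyl sulfoxide, and 900 μL PBS buffer (pH = 8.0); 10 mg of 4-chloro-1-(2,5-difluorophenyl)butan-1-one was then added, and the mixture was shaken overnight at 35 °C. The reaction route is shown in Figure 1;
100 μL of the reaction mixture was added to 900 μL of acetonitrile and shaken thoroughly, then filtered through a 0.22 μm membrane, and the filtrate was analyzed by HPLC.
In this example, the 4-chloro-1-(2,5-difluorophenyl)butan-1-one standard, (R,S)-2-(2,5-difluorophenyl)pyrrolidine, and (S)-2-(2,5-difluorophenyl)pyrrolidine were first analyzed by HPLC to establish the relevant retention times. As shown in Figures 2 and 3, in reversed-phase achiral HPLC the retention time of 4-chloro-1-(2,5-difluorophenyl)butan-1-one is 6.171 min and that of (R,S)-2-(2,5-difluorophenyl)pyrrolidine is 5.299 min. As shown in Figures 4 and 5, in normal-phase chiral HPLC the retention time of (R)-2-(2,5-difluorophenyl)pyrrolidine is 5.660 min and that of (S)-2-(2,5-difluorophenyl)pyrrolidine is 6.385 min.
The in-process control HPLC chromatogram of the reaction in this example is shown in Figure 6: the starting material 4-chloro-1-(2,5-difluorophenyl)butan-1-one has a retention time of 6.188 min, and the 2-(2,5-difluorophenyl)pyrrolidine formed in the reaction has a retention time of 5.325 min.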
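The peak assignment described above, in which in-process peaks are matched to the standards by retention time, can be sketched as follows. The 0.05 min matching tolerance and the function name `assign_peak` are assumptions made for illustration; the text does not state a tolerance.

```python
# Reversed-phase achiral HPLC retention times of the standards (min), from Figures 2 and 3.
STANDARDS_MIN = {
    "4-chloro-1-(2,5-difluorophenyl)butan-1-one": 6.171,
    "2-(2,5-difluorophenyl)pyrrolidine": 5.299,
}


def assign_peak(rt_min, tolerance_min=0.05):
    """Return the standard closest in retention time, or None if outside the tolerance.

    The default tolerance is an illustrative assumption, not a value from the text.
    """
    name, ref = min(STANDARDS_MIN.items(), key=lambda kv: abs(kv[1] - rt_min))
    return name if abs(ref - rt_min) <= tolerance_min else None


# In-process control peaks reported for Figure 6.
for rt in (6.188, 5.325):
    print(rt, "->", assign_peak(rt))
```

With this tolerance, the 6.188 min peak is assigned to the starting material and the 5.325 min peak to the pyrrolidine product, in line with the assignments stated in the example.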
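Example 3 Construction of the transaminase mutant library
In this example, the coding gene of the wild-type transaminase (SEQ ID NO:2) was used as the template for error-prone PCR with a random mutagenesis kit (Beyotime); the error-prone PCR product was recovered with a DNA fragment recovery kit (Sangon Biotech, Shanghai) according to the instructions;
The error-prone PCR product and the pET-30a(+) plasmid were each double-digested with the restriction enzymes NdeI and XhoI; the recovered digestion product was ligated into the linearized pET-30a(+) plasmid with T4 DNA ligase at 22 °C for 1 hour;
The ligation product was transformed into E. coli BL21(DE3) competent cells (Sangon Biotech, Shanghai) and cultured overnight at 37 °C to obtain the random mutation library;
After primary screening, the beneficial mutants obtained were subjected to DNA shuffling to obtain a further gene shuffling library.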
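Example 4 Screening of transaminase mutants
(1) Colonies from the library were picked into 96-well microplates, each well containing 150 μL LB liquid medium and 50 μg/mL kanamycin, and cultured overnight in a shaker at 37 °C and 220 rpm; 20 μL of culture was transferred to a 96-well plate containing 380 μL LB liquid medium and 50 μg/mL kanamycin and cultured for a further 2 to 3 h in a shaker at 37 °C and 220 rpm; IPTG was added to a final concentration of 0.4 mM for induction, and the temperature was then lowered to 25 °C for overnight culture.
(2) The 96-well plate was centrifuged for 10 min (4000 rpm, room temperature) to collect the cells; the cells were resuspended in 200 μL lysis solution (containing 1 g/L lysozyme and 0.5 g/L polymyxin B sulfate) and lysed by shaking at room temperature and 600 rpm for 2 h; the cell debris was pelleted by centrifugation (4000 rpm, 4 °C, 10 min), and the supernatant was taken for the reaction assay.
(3) A dimethyl sulfoxide solution containing 10 g/L 4-chloro-1-(2,5-difluorophenyl)butan-1-one was dispensed in 50 μL portions into a 96-deep-well plate; 250 μL isopropylamine hydrochloride (20 g/L) and 100 μL pyridoxal phosphate (1 g/L) were added to each well, followed by 100 μL of the supernatant from step (2), so that each well contained 1 g/L 4-chloro-1-(2,5-difluorophenyl)butan-1-one, 10 g/L isopropylamine hydrochloride, 0.1 g/L pyridoxal phosphate, 100 mM triethanolamine buffer (pH = 7.5), and 10% V/V dimethyl sulfoxide; the plate was heat-sealed with aluminum foil sealing film and reacted overnight in a shaker at 40 °C and 220 rpm;
(4) After the reaction, 1 mL methanol was added to each well to quench the reaction; the plate was centrifuged at 4000 rpm for 30 min, and the supernatant was taken for HPLC analysis.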
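The per-well composition in step (3) follows from the stock concentrations and the volumes dispensed. The short calculation below is a sketch of that arithmetic, assuming the four additions are the only contributions to the well volume and that the volumes are additive. With the volumes as stated, the substrate, isopropylamine hydrochloride, and dimethyl sulfoxide values reproduce the listed figures, whereas pyridoxal phosphate works out to 0.2 g/L rather than the listed 0.1 g/L; the text does not indicate which of the two figures is intended.

```python
# (stock concentration in g/L, volume added in uL) per well, taken from step (3).
# The lysate supernatant has no defined concentration here, hence None.
additions = {
    "substrate": (10.0, 50),
    "isopropylamine hydrochloride": (20.0, 250),
    "PLP": (1.0, 100),
    "lysate supernatant": (None, 100),
}

total_ul = sum(volume for _, volume in additions.values())

# Final concentration = stock concentration x added volume / total volume.
final_g_per_l = {
    name: conc * volume / total_ul
    for name, (conc, volume) in additions.items()
    if conc is not None
}

# The substrate stock is the only DMSO-containing addition.
dmso_percent_v_v = 100 * additions["substrate"][1] / total_ul

print(total_ul, final_g_per_l, dmso_percent_v_v)
```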
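Colonies with transaminase activity were sequenced, giving the coding genes of the transaminase mutants SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, or SEQ ID NO:30; the amino acid sequences of the corresponding transaminase mutants are shown in SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, or SEQ ID NO:29, respectively.
The activity results for the different transaminase mutants are shown in Table 1.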
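Table 1 Mutation sites and activity assay of transaminase mutants
Note: in the plate screening, the results are expressed as follows:
+ indicates that the conversion measured for the mutant in the plate screening was 0 to 10%;
++ indicates that the conversion measured for the mutant in the plate screening was 10% to 20%;
+++ indicates that the conversion measured for the mutant in the plate screening was 20% to 30%;
++++ indicates that the conversion measured for the mutant in the plate screening was 30% to 40%;
+++++ indicates that the conversion measured for the mutant in the plate screening was 40% to 50%.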
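The rating scale in the note can be written as a small mapping from measured conversion to the "+" symbols. This is a sketch: the function name `rating` is introduced here, and because adjacent bands share their end points (for example, 10% closes one band and opens the next), the code assumes that a value lying exactly on a boundary is placed in the lower band.

```python
def rating(conversion_percent):
    """Map a plate-screening conversion (0 to 50 %) to the '+' scale of the note."""
    if not 0 <= conversion_percent <= 50:
        raise ValueError("the scale is defined only for 0 to 50 % conversion")
    # Assumption: a value exactly on a boundary (e.g. 10 %) goes to the lower band.
    upper_limits = [10, 20, 30, 40, 50]
    for plus_count, upper in enumerate(upper_limits, start=1):
        if conversion_percent <= upper:
            return "+" * plus_count


for value in (5, 18.5, 35, 50):
    print(value, rating(value))
```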
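Example 5 Activity assay of the transaminase mutants
0.2 g of enzyme powder of any one of the transaminase mutants SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, or SEQ ID NO:29 screened in Example 4 was mixed with 10 mg pyridoxal phosphate, 1.1 g isopropylamine hydrochloride, 4 mL dimethyl sulfoxide, and 16 mL PBS buffer (pH = 8.5); 1 g of 4-chloro-1-(2,5-difluorophenyl)butan-1-one was then added, and the mixture was shaken at 40 °C for 24 to 48 h;
100 μL of the reaction mixture was added to 900 μL of acetonitrile and shaken thoroughly, then filtered through a 0.22 μm membrane and analyzed by HPLC;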
The conversion results are shown in Table 2.
Table 2
Enzyme | Conversion (%) |
SEQ ID NO:23 | 96.7 |
SEQ ID NO:25 | 98.3 |
SEQ ID NO:27 | 98.9 |
SEQ ID NO:29 | 97.3 |
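Taking the reaction catalyzed by the SEQ ID NO:27 enzyme powder as an example: once the conversion exceeded 95%, the pH was adjusted to 2 to 3 with 35% hydrochloric acid and the mixture was stirred at room temperature for 1 h. After filtration, the filtrate was adjusted to pH 9 to 10 with 2 M NaOH solution and extracted 3 times with 1 volume of isopropyl acetate. The organic phases were combined, washed with 1 volume of 10% NaCl solution, dried over anhydrous sodium sulfate, and concentrated to give 0.8 g of an oily liquid, namely (R)-2-(2,5-difluorophenyl)pyrrolidine.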
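As a rough cross-check on the work-up, the 0.8 g of oil can be compared with the theoretical amount of product obtainable from 1 g of ketone. The sketch below assumes the molecular formulas C10H9ClF2O for the ketone and C10H11F2N for the pyrrolidine (derived here from the compound names), uses rounded standard atomic weights, and treats the oil as pure product, which the text does not state; the result is therefore an apparent yield only.

```python
# Rounded standard atomic weights (g/mol).
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "F": 18.998, "Cl": 35.45}


def molar_mass(formula):
    """Molar mass from an element-count mapping such as {'C': 10, 'H': 11}."""
    return sum(ATOMIC_WEIGHT[element] * count for element, count in formula.items())


# Formulas derived from the compound names (an assumption of this sketch).
substrate = {"C": 10, "H": 9, "Cl": 1, "F": 2, "O": 1}  # 4-chloro-1-(2,5-difluorophenyl)butan-1-one
product = {"C": 10, "H": 11, "F": 2, "N": 1}            # 2-(2,5-difluorophenyl)pyrrolidine

substrate_g = 1.0  # ketone charged in Example 5
isolated_g = 0.8   # oily liquid obtained after work-up, assumed pure here

theoretical_g = substrate_g / molar_mass(substrate) * molar_mass(product)
apparent_yield_percent = 100 * isolated_g / theoretical_g

print(f"theoretical {theoretical_g:.3f} g, apparent yield {apparent_yield_percent:.1f} %")
```

Under these assumptions the theoretical amount is roughly 0.84 g, so 0.8 g corresponds to an apparent yield of about 95%.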
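A sample was analyzed by normal-phase chiral HPLC; as shown in Figure 7, the retention time of (R)-2-(2,5-difluorophenyl)pyrrolidine is 5.747 min and that of (S)-2-(2,5-difluorophenyl)pyrrolidine is 6.670 min, and the optical purity of the (R)-2-(2,5-difluorophenyl)pyrrolidine is 99.37%. A sample was also analyzed by LC-MS; as shown in Figure 8, [M+1] = 184.1 in the MS, and the molecular weight of (R)-2-(2,5-difluorophenyl)pyrrolidine is 183.20.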
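The two mass figures quoted above can be reproduced from the molecular formula. The sketch below assumes the formula C10H11F2N for 2-(2,5-difluorophenyl)pyrrolidine (derived from the name) and assumes that the [M+1] signal is the singly protonated molecule; it uses rounded average atomic weights for the molecular weight and monoisotopic masses for the ion.

```python
# Rounded average atomic weights and monoisotopic masses (g/mol or Da).
AVERAGE = {"C": 12.011, "H": 1.008, "N": 14.007, "F": 18.998}
MONOISOTOPIC = {"C": 12.000000, "H": 1.007825, "N": 14.003074, "F": 18.998403}
PROTON = 1.007276  # mass of a proton in Da

# Assumed molecular formula of 2-(2,5-difluorophenyl)pyrrolidine.
formula = {"C": 10, "H": 11, "F": 2, "N": 1}


def mass(table, counts):
    """Sum the tabulated masses over an element-count mapping."""
    return sum(table[element] * count for element, count in counts.items())


average_mw = mass(AVERAGE, formula)
mz_protonated = mass(MONOISOTOPIC, formula) + PROTON

print(f"average MW {average_mw:.2f}, [M+H]+ m/z {mz_protonated:.4f}")
```

Both values agree with the text: the average molecular weight rounds to 183.20 and the protonated ion to m/z 184.1.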
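In summary, the present invention introduces substitution mutations at one or several amino acids within positions 13 to 312 of the wild-type transaminase shown in SEQ ID NO:1. The resulting transaminase mutants efficiently catalyze the synthesis of (R)-2-(2,5-difluorophenyl)pyrrolidine from 4-chloro-1-(2,5-difluorophenyl)butan-1-one, which is of great significance for the industrial production of (R)-2-(2,5-difluorophenyl)pyrrolidine and larotrectinib.
The applicant declares that the above examples illustrate the detailed method of the present invention, but the present invention is not limited to that detailed method; that is, the present invention does not have to rely on the above detailed method in order to be carried out. Those skilled in the art should understand that any improvement of the present invention, equivalent replacement of the raw materials of the product of the present invention, addition of auxiliary components, selection of specific modes, and the like all fall within the scope of protection and disclosure of the present invention.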
SEQUENCE LISTING
<110> 上海飞腾化工科技有限公司
<120> Recombinant transaminase and its use in the synthesis of (R)-2-(2,5-difluorophenyl)pyrrolidine
<130> 20200819
<160> 30
<170> PatentIn version 3.3
<210> 1
<211> 325
<212> PRT
<213> Artificial Sequence
<400> 1
Met Ala Ser Met Asp Lys Val Phe Ala Gly Tyr Ala Ala Arg Gln Ala
1 5 10 15
Ile Leu Glu Ser Thr Glu Thr Thr Asn Pro Phe Ala Lys Gly Ile Ala
20 25 30
Trp Val Glu Gly Glu Leu Val Pro Leu Ala Glu Ala Arg Ile Pro Leu
35 40 45
Leu Asp Gln Gly Phe Met His Ser Asp Leu Thr Tyr Asp Val Pro Ser
50 55 60
Val Trp Asp Gly Arg Phe Phe Arg Leu Asp Asp His Ile Thr Arg Leu
65 70 75 80
Glu Ala Ser Cys Thr Lys Leu Arg Leu Arg Leu Pro Leu Pro Arg Asp
85 90 95
Gln Val Lys Gln Ile Leu Val Glu Met Val Ala Lys Ser Gly Ile Arg
100 105 110
Asp Ala Phe Val Glu Leu Ile Val Thr Arg Gly Leu Lys Gly Val Arg
115 120 125
Gly Thr Arg Pro Glu Asp Ile Val Asn Asn Leu Tyr Met Phe Val Gln
130 135 140
Pro Tyr Val Trp Val Met Glu Pro Asp Met Gln Arg Val Gly Gly Ser
145 150 155 160
Ala Val Val Ala Arg Thr Val Arg Arg Val Pro Pro Gly Ala Ile Asp
165 170 175
Pro Thr Val Lys Asn Leu Gln Trp Gly Asp Leu Val Arg Gly Met Phe
180 185 190
Glu Ala Ala Asp Arg Gly Ala Thr Tyr Pro Phe Leu Thr Asp Gly Asp
195 200 205
Ala His Leu Thr Glu Gly Ser Gly Phe Asn Ile Val Leu Val Lys Asp
210 215 220
Gly Val Leu Tyr Thr Pro Asp Arg Gly Val Leu Gln Gly Val Thr Arg
225 230 235 240
Lys Ser Val Ile Asn Ala Ala Glu Ala Phe Gly Ile Glu Val Arg Val
245 250 255
Glu Phe Val Pro Val Glu Leu Ala Tyr Arg Cys Asp Glu Ile Phe Met
260 265 270
Cys Thr Thr Ala Gly Gly Ile Met Pro Ile Thr Thr Leu Asp Gly Met
275 280 285
Pro Val Asn Gly Gly Gln Ile Gly Pro Ile Thr Lys Lys Ile Trp Asp
290 295 300
Gly Tyr Trp Ala Met His Tyr Asp Ala Ala Tyr Ser Phe Glu Ile Asp
305 310 315 320
Tyr Asn Glu Arg Asn
325
<210> 2
<211> 978
<212> DNA
<213> Artificial Sequence
<400> 2
atggcgagca tggacaaagt gtttgcgggt tatgcggcgc gtcaggcgat cctggagagc 60
accgaaacca ccaacccgtt tgcgaaaggc attgcgtggg tggaaggtga actggttccg 120
ctggcggagg cgcgtatccc gctgctggac caaggtttca tgcacagcga cctgacctat 180
gatgtgccga gcgtttggga tggccgtttc tttcgtctgg acgatcacat cacccgtctg 240
gaagcgagct gcaccaaact gcgtctgcgt ctgccgctgc cgcgtgacca ggtgaagcaa 300
attctggtgg agatggttgc gaaaagcggt atccgtgatg cgttcgtgga actgattgtt 360
acccgtggtc tgaaaggcgt tcgtggtacc cgtccggagg acatcgtgaa caacctgtac 420
atgtttgttc agccgtatgt gtgggttatg gaaccggata tgcagcgtgt gggtggcagc 480
gcggtggttg cgcgtaccgt gcgtcgtgtt ccgccgggtg cgattgatcc gaccgtgaag 540
aacctgcagt ggggcgacct ggttcgtggc atgtttgagg cggcggatcg tggcgcgacc 600
tacccgtttc tgaccgacgg tgatgcgcac ctgaccgaag gtagcggctt taacatcgtg 660
ctggttaagg acggtgtgct gtataccccg gatcgtggcg tgctgcaagg tgttacccgt 720
aaaagcgtta tcaacgcggc ggaggcgttt ggtattgaag tgcgtgttga atttgtgccg 780
gttgagctgg cgtaccgttg cgacgaaatt ttcatgtgca ccaccgcggg tggcatcatg 840
ccgattacca ccctggatgg catgccggtt aacggtggcc agatcggtcc gattaccaag 900
aaaatctggg acggctactg ggcgatgcac tacgatgcgg cgtatagctt tgagattgac 960
tataacgaac gtaactaa 978
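A quick consistency check between a coding sequence and its protein listing can be made by translating the DNA with the standard genetic code. The sketch below builds the codon table from the usual compact 64-letter encoding and translates the first and last rows of SEQ ID NO:2 as printed above; the function name `translate` is introduced here for illustration.

```python
from itertools import product

# Standard genetic code in the usual compact encoding (base order t, c, a, g).
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a listing row; spaces and position numbers are ignored."""
    dna = "".join(ch for ch in dna.lower() if ch in BASES)
    usable = len(dna) - len(dna) % 3  # drop an incomplete trailing codon, if any
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


first_row = "atggcgagca tggacaaagt gtttgcgggt tatgcggcgc gtcaggcgat cctggagagc 60"
last_row = "tataacgaac gtaactaa 978"

print(translate(first_row))
print(translate(last_row))
```

The first row translates to MASMDKVFAGYAARQAILES, matching residues 1 to 20 of SEQ ID NO:1, and the last row gives YNERN followed by a stop codon (`*`), matching the C-terminal Tyr Asn Glu Arg Asn.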
<210> 3
<211> 325
<212> PRT
<213> Artificial Sequence
<400> 3
Met Ala Ser Met Asp Lys Val Phe Ala Gly Tyr Ala Ala Arg Gln Ala
1 5 10 15
Ile Leu Glu Ser Thr Glu Thr Thr Asn Pro Phe Ala Lys Gly Ile Ala
20 25 30
Trp Val Glu Gly Glu Leu Val Pro Leu Ala Glu Ala Arg Ile Pro Leu
35 40 45
Leu Asp Gln Gly Phe Met His Ser Asp Leu Thr Tyr Asp Val Pro Ser
50 55 60
Val Trp Asp Gly Arg Phe Phe Arg Leu Asp Asp His Ile Thr Arg Leu
65 70 75 80
Glu Ala Ser Cys Thr Lys Leu Arg Leu Arg Leu Pro Leu Pro Arg Asp
85 90 95
Gln Val Lys Gln Ile Leu Val Glu Met Val Ala Lys Ser Gly Ile Arg
100 105 110
Asp Ala Phe Val Glu Leu Ile Val Thr Arg Gly Leu Lys Gly Val Arg
115 120 125
Gly Thr Arg Pro Glu Asp Ile Val Asn Asn Leu Tyr Met Phe Val Gln
130 135 140
Pro Tyr Val Trp Val Met Glu Pro Asp Met Gln Arg Val Gly Gly Ser
145 150 155 160
Ala Val Val Ala Arg Thr Val Arg Arg Val Pro Pro Gly Ala Ile Asp
165 170 175
Pro Thr Val Lys Asn Leu Gln Trp Gly Asp Tyr Val Arg Gly Met Phe
180 185 190
Glu Ala Ala Asp Arg Gly Ala Thr Tyr Pro Phe Leu Thr Asp Gly Asp
195 200 205
Ala His Leu Thr Glu Gly Ser Leu Phe Asn Ile Val Leu Val Lys Asp
210 215 220
Gly Val Leu Tyr Thr Pro Asp Arg Gly Val Leu Gln Gly Val Thr Arg
225 230 235 240
Lys Ser Val Ile Asn Ala Ala Glu Ala Phe Gly Ile Glu Val Arg Val
245 250 255
Glu Phe Val Pro Val Glu Leu Ala Tyr Arg Cys Asp Glu Ile Phe Met
260 265 270
Cys Thr Trp Ala Gly Gly Ile Met Pro Ile Thr Thr Leu Asp Gly Met
275 280 285
Pro Val Asn Gly Gly Gln Ile Gly Pro Ile Thr Lys Lys Ile Trp Asp
290 295 300
Gly Tyr Trp Ala Met His Tyr Asp Ala Ala Tyr Ser Phe Glu Ile Asp
305 310 315 320
Tyr Asn Glu Arg Asn
325
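The substitutions carried by a mutant can be read off by comparing its listing with the wild-type listing row by row. The sketch below does this for the three rows in which SEQ ID NO:3 differs from SEQ ID NO:1 on inspection of the listings above; the function name `substitutions` and the label format (for example `Leu187Tyr`) are choices made here for illustration.

```python
def substitutions(wild_type_row, mutant_row, first_position):
    """Compare two rows of three-letter residues and label each difference."""
    wt = wild_type_row.split()
    mut = mutant_row.split()
    if len(wt) != len(mut):
        raise ValueError("rows must contain the same number of residues")
    return [
        f"{a}{first_position + i}{b}"
        for i, (a, b) in enumerate(zip(wt, mut))
        if a != b
    ]


# (first residue number, SEQ ID NO:1 row, SEQ ID NO:3 row), copied from the listings.
rows = [
    (177, "Pro Thr Val Lys Asn Leu Gln Trp Gly Asp Leu Val Arg Gly Met Phe",
          "Pro Thr Val Lys Asn Leu Gln Trp Gly Asp Tyr Val Arg Gly Met Phe"),
    (209, "Ala His Leu Thr Glu Gly Ser Gly Phe Asn Ile Val Leu Val Lys Asp",
          "Ala His Leu Thr Glu Gly Ser Leu Phe Asn Ile Val Leu Val Lys Asp"),
    (273, "Cys Thr Thr Ala Gly Gly Ile Met Pro Ile Thr Thr Leu Asp Gly Met",
          "Cys Thr Trp Ala Gly Gly Ile Met Pro Ile Thr Thr Leu Asp Gly Met"),
]

found = [label for start, wt, mut in rows for label in substitutions(wt, mut, start)]
print(found)
```

For these rows the comparison reports Leu187Tyr, Gly216Leu, and Thr275Trp, all of which lie within positions 13 to 312.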
<210> 4
<211> 978
<212> DNA
<213> Artificial Sequence
<400> 4
atggcgagca tggacaaagt gtttgcgggt tatgcggcgc gtcaggcgat cctggagagc 60
accgaaacca ccaacccgtt tgcgaaaggc attgcgtggg tggaaggtga actggttccg 120
ctggcggagg cgcgtatccc gctgctggac caaggtttca tgcacagcga cctgacctat 180
gatgtgccga gcgtttggga tggccgtttc tttcgtctgg acgatcacat cacccgtctg 240
gaagcgagct gcaccaaact gcgtctgcgt ctgccgctgc cgcgtgacca ggtgaagcaa 300
attctggtgg agatggttgc gaaaagcggt atccgtgatg cgttcgtgga actgattgtt 360
acccgtggtc tgaaaggcgt tcgtggtacc cgtccggagg acatcgtgaa caacctgtac 420
atgtttgttc agccgtatgt gtgggttatg gaaccggata tgcagcgtgt gggtggcagc 480
gcggtggttg cgcgtaccgt gcgtcgtgtt ccgccgggtg cgattgatcc gaccgtgaag 540
aacctgcagt ggggcgacta cgttcgtggc atgtttgagg cggcggatcg tggcgcgacc 600
tacccgtttc tgaccgacgg tgatgcgcac ctgaccgaag gtagcctgtt taacatcgtg 660
ctggttaagg acggtgtgct gtataccccg gatcgtggcg tgctgcaagg tgttacccgt 720
aaaagcgtta tcaacgcggc ggaggcgttt ggtattgaag tgcgtgttga atttgtgccg 780
gttgagctgg cgtaccgttg cgacgaaatt ttcatgtgca cctgggcggg tggcatcatg 840
ccgattacca ccctggatgg catgccggtt aacggtggcc agatcggtcc gattaccaag 900
aaaatctggg acggctactg ggcgatgcac tacgatgcgg cgtatagctt tgagattgac 960
tataacgaac gtaactaa 978
<210> 5
<211> 325
<212> PRT
<213> Artificial Sequence
<400> 5
Met Ala Ser Met Asp Lys Val Phe Ala Gly Tyr Ala Ala Arg Gln Ala
1 5 10 15
Ile Leu Glu Ser Thr Glu Thr Thr Asn Pro Phe Ala Lys Gly Ile Ala
20 25 30
Trp Val Glu Gly Glu Leu Val Pro Leu Ala Glu Ala Arg Ile Pro Leu
35 40 45
Leu Asp Gln Gly Phe Met His Ser Asp Leu Thr Tyr Asp Val Pro Ser
50 55 60
Val Trp Asp Gly Arg Phe Phe Arg Leu Asp Asp His Ile Thr Arg Leu
65 70 75 80
Glu Ala Ser Cys Thr Lys Leu Arg Leu Arg Leu Pro Leu Pro Arg Asp
85 90 95
Gln Val Lys Gln Ile Leu Val Glu Met Val Ala Lys Ser Gly Ile Arg
100 105 110
Asp Ala Phe Val Glu Leu Ile Val Thr Arg Gly Leu Lys Gly Val Arg
115 120 125
Gly Thr Arg Pro Glu Asp Ile Val Asn Asn Leu Tyr Met Phe Val Gln
130 135 140
Pro Tyr Val Trp Val Met Glu Pro Asp Met Gln Arg Val Gly Gly Ser
145 150 155 160
Ala Val Val Ala Arg Thr Val Arg Arg Val Pro Pro Gly Ala Ile Asp
165 170 175
Pro Thr Val Lys Tyr Leu Gln Trp Gly Asp Leu Val Arg Gly Met Phe
180 185 190
Glu Ala Ala Asp Arg Gly Ala Thr Tyr Pro Phe Leu Thr Asp Gly Asp
195 200 205
Ala His Leu Thr Glu Gly Ser Leu Phe Asn Ile Val Leu Val Lys Asp
210 215 220
Gly Val Leu Tyr Thr Pro Asp Arg Gly Val Leu Gln Gly Val Thr Arg
225 230 235 240
Lys Ser Val Ile Asn Ala Ala Glu Ala Phe Gly Ile Glu Val Arg Val
245 250 255
Glu Phe Val Pro Val Glu Leu Ala Tyr Arg Cys Asp Glu Ile Phe Met
260 265 270
Cys Thr Trp Ala Gly Gly Ile Met Pro Ile Thr Thr Leu Asp Gly Met
275 280 285
Pro Val Asn Gly Gly Gln Ile Gly Pro Ile Thr Lys Lys Ile Trp Asp
290 295 300
Gly Tyr Trp Ala Met His Tyr Asp Ala Ala Tyr Ser Phe Glu Ile Asp
305 310 315 320
Tyr Asn Glu Arg Asn
325
<210> 6
<211> 978
<212> DNA
<213> Artificial Sequence
<400> 6
atggcgagca tggacaaagt gtttgcgggt tatgcggcgc gtcaggcgat cctggagagc 60
accgaaacca ccaacccgtt tgcgaaaggc attgcgtggg tggaaggtga actggttccg 120
ctggcggagg cgcgtatccc gctgctggac caaggtttca tgcacagcga cctgacctat 180
gatgtgccga gcgtttggga tggccgtttc tttcgtctgg acgatcacat cacccgtctg 240
gaagcgagct gcaccaaact gcgtctgcgt ctgccgctgc cgcgtgacca ggtgaagcaa 300
attctggtgg agatggttgc gaaaagcggt atccgtgatg cgttcgtgga actgattgtt 360
acccgtggtc tgaaaggcgt tcgtggtacc cgtccggagg acatcgtgaa caacctgtac 420
atgtttgttc agccgtatgt gtgggttatg gaaccggata tgcagcgtgt gggtggcagc 480
gcggtggttg cgcgtaccgt gcgtcgtgtt ccgccgggtg cgattgatcc gaccgtgaag 540
tacctgcagt ggggcgacct ggttcgtggc atgtttgagg cggcggatcg tggcgcgacc 600
tacccgtttc tgaccgacgg tgatgcgcac ctgaccgaag gtagcctgtt taacatcgtg 660
ctggttaagg acggtgtgct gtataccccg gatcgtggcg tgctgcaagg tgttacccgt 720
aaaagcgtta tcaacgcggc ggaggcgttt ggtattgaag tgcgtgttga atttgtgccg 780
gttgagctgg cgtaccgttg cgacgaaatt ttcatgtgca cctgggcggg tggcatcatg 840
ccgattacca ccctggatgg catgccggtt aacggtggcc agatcggtcc gattaccaag 900
aaaatctggg acggctactg ggcgatgcac tacgatgcgg cgtatagctt tgagattgac 960
tataacgaac gtaactaa 978
<210> 7
<211> 325
<212> PRT
<213> Artificial Sequence
<400> 7
Met Ala Ser Met Asp Lys Val Phe Ala Gly Tyr Ala Ala Arg Gln Ala
1 5 10 15
Ile Leu Glu Ser Thr Glu Thr Thr Asn Pro Phe Ala Lys Gly Ile Ala
20 25 30
Trp Val Glu Gly Glu Leu Val Pro Leu Ala Glu Ala Arg Ile Pro Leu
35 40 45
Leu Asp Gln Gly Phe Met His Ser Asp Leu Thr Tyr Asp Val Pro Ser
50 55 60
Val Trp Asp Gly Arg Phe Phe Arg Leu Asp Asp His Ile Thr Arg Leu
65 70 75 80
Glu Ala Ser Cys Thr Lys Leu Arg Leu Arg Leu Pro Leu Pro Arg Asp
85 90 95
Gln Val Lys Gln Ile Leu Val Glu Met Val Ala Lys Ser Gly Ile Arg
100 105 110
Asp Ala Phe Val Glu Leu Ile Val Thr Arg Gly Leu Lys Gly Val Arg
115 120 125
Gly Thr Arg Pro Glu Asp Ile Val Asn Asn Leu Tyr Met Phe Val Gln
130 135 140
Pro Tyr Val Trp Val Met Glu Pro Asp Met Gln Arg Val Gly Gly Ser
145 150 155 160
Ala Val Val Ala Arg Thr Val Arg Arg Val Pro Pro Gly Ala Ile Asp
165 170 175
Pro Thr Val Arg Asn Leu Gln Trp Gly Asp Tyr Val Arg Gly Met Phe
180 185 190
Glu Ala Ala Asp Arg Gly Ala Thr Tyr Pro Phe Leu Thr Asp Gly Asp
195 200 205
Ala His Leu Thr Glu Gly Ser Leu Phe Asn Ile Val Leu Val Lys Asp
210 215 220
Gly Val Leu Tyr Thr Pro Asp Arg Gly Val Leu Gln Gly Val Thr Arg
225 230 235 240
Lys Ser Val Ile Asn Ala Ala Glu Ala Phe Gly Ile Glu Val Arg Val
245 250 255
Glu Phe Val Pro Val Glu Leu Ala Tyr Arg Cys Asp Glu Ile Phe Met
260 265 270
Cys Thr Thr Ala Gly Gly Ile Met Pro Ile Thr Thr Leu Asp Gly Met
275 280 285
Pro Val Asn Gly Gly Gln Ile Gly Pro Ile Thr Lys Lys Ile Trp Asp
290 295 300
Gly Tyr Trp Ala Met His Tyr Asp Ala Ala Tyr Ser Phe Glu Ile Asp
305 310 315 320
Tyr Asn Glu Arg Asn
325
<210> 8
<211> 978
<212> DNA
<213> Artificial Sequence
<400> 8
atggcgagca tggacaaagt gtttgcgggt tatgcggcgc gtcaggcgat cctggagagc 60
accgaaacca ccaacccgtt tgcgaaaggc attgcgtggg tggaaggtga actggttccg 120
ctggcggagg cgcgtatccc gctgctggac caaggtttca tgcacagcga cctgacctat 180
gatgtgccga gcgtttggga tggccgtttc tttcgtctgg acgatcacat cacccgtctg 240
gaagcgagct gcaccaaact gcgtctgcgt ctgccgctgc cgcgtgacca ggtgaagcaa 300
attctggtgg agatggttgc gaaaagcggt atccgtgatg cgttcgtgga actgattgtt 360
acccgtggtc tgaaaggcgt tcgtggtacc cgtccggagg acatcgtgaa caacctgtac 420
atgtttgttc agccgtatgt gtgggttatg gaaccggata tgcagcgtgt gggtggcagc 480
gcggtggttg cgcgtaccgt gcgtcgtgtt ccgccgggtg cgattgatcc gaccgtgcgt 540
aacctgcagt ggggcgacta cgttcgtggc atgtttgagg cggcggatcg tggcgcgacc 600
tacccgtttc tgaccgacgg tgatgcgcac ctgaccgaag gtagcctgtt taacatcgtg 660
ctggttaagg acggtgtgct gtataccccg gatcgtggcg tgctgcaagg tgttacccgt 720
aaaagcgtta tcaacgcggc ggaggcgttt ggtattgaag tgcgtgttga atttgtgccg 780
gttgagctgg cgtaccgttg cgacgaaatt ttcatgtgca ccaccgcggg tggcatcatg 840
ccgattacca ccctggatgg catgccggtt aacggtggcc agatcggtcc gattaccaag 900
aaaatctggg acggctactg ggcgatgcac tacgatgcgg cgtatagctt tgagattgac 960
tataacgaac gtaactaa 978
<210> 9
<211> 325
<212> PRT
<213> Artificial Sequence
<400> 9
Met Ala Ser Met Asp Lys Val Phe Ala Gly Tyr Ala Ala Arg Gln Ala
1 5 10 15
Ile Leu Glu Ser Thr Glu Thr Thr Asn Pro Phe Ala Lys Gly Ile Ala
20 25 30
Trp Val Glu Gly Glu Leu Val Pro Leu Ala Glu Ala Arg Ile Pro Leu
35 40 45
Leu Asp Gln Gly Phe Met His Ser Asp Leu Thr Tyr Asp Val Pro Ser
50 55 60
Val Trp Asp Gly Arg Phe Phe Arg Leu Asp Asp His Ile Thr Arg Leu
65 70 75 80
Glu Ala Ser Cys Thr Lys Leu Arg Leu Arg Leu Pro Leu Pro Arg Asp
85 90 95
Gln Val Lys Gln Ile Leu Val Glu Met Val Ala Lys Ser Gly Ile Arg
100 105 110
Asp Ala Phe Val Glu Leu Ile Val Thr Arg Gly Leu Lys Gly Val Arg
115 120 125
Gly Thr Arg Pro Glu Asp Ile Val Asn Asn Leu Tyr Met Phe Val Gln
130 135 140
Pro Tyr Val Trp Val Met Glu Pro Asp Met Gln Arg Val Gly Gly Ser
145 150 155 160
Ala Val Val Ala Arg Thr Val Arg Arg Val Pro Pro Gly Ala Ile Asp
165 170 175
Asn Thr Val Arg Tyr Leu Gln Trp Gly Asp Tyr Val Arg Gly Met Phe
180 185 190
Glu Ala Ala Asp Arg Gly Ala Thr Tyr Pro Phe Leu Thr Asp Gly Asp
195 200 205
Ala His Leu Thr Glu Gly Ser Gly Phe Asn Ile Val Leu Val Lys Asp
210 215 220
Gly Val Leu Tyr Thr Pro Asp Arg Gly Val Leu Gln Gly Val Thr Arg
225 230 235 240
Lys Ser Val Ile Asn Ala Ala Glu Ala Phe Gly Ile Glu Val Arg Val
245 250 255
Glu Phe Val Pro Val Glu Leu Ala Tyr Arg Cys Asp Glu Ile Phe Met
260 265 270
Cys Thr Trp Ala Gly Gly Ile Met Pro Ile Thr Thr Leu Asp Gly Met
275 280 285
Pro Val Asn Gly Gly Gln Ile Gly Pro Ile Thr Lys Lys Ile Trp Asp
290 295 300
Gly Tyr Trp Ala Met His Tyr Asp Ala Ala Tyr Ser Phe Glu Ile Asp
305 310 315 320
Tyr Asn Glu Arg Asn
325
<210> 10
<211> 978
<212> DNA
<213> Artificial Sequence
<400> 10
atggcgagca tggacaaagt gtttgcgggt tatgcggcgc gtcaggcgat cctggagagc 60
accgaaacca ccaacccgtt tgcgaaaggc attgcgtggg tggaaggtga actggttccg 120
ctggcggagg cgcgtatccc gctgctggac caaggtttca tgcacagcga cctgacctat 180
gatgtgccga gcgtttggga tggccgtttc tttcgtctgg acgatcacat cacccgtctg 240
gaagcgagct gcaccaaact gcgtctgcgt ctgccgctgc cgcgtgacca ggtgaagcaa 300
attctggtgg agatggttgc gaaaagcggt atccgtgatg cgttcgtgga actgattgtt 360
acccgtggtc tgaaaggcgt tcgtggtacc cgtccggagg acatcgtgaa caacctgtac 420
atgtttgttc agccgtatgt gtgggttatg gaaccggata tgcagcgtgt gggtggcagc 480
gcggtggttg cgcgtaccgt gcgtcgtgtt ccgccgggtg cgattgataa caccgtgcgt 540
tacctgcagt ggggcgacta cgttcgtggc atgtttgagg cggcggatcg tggcgcgacc 600
tacccgtttc tgaccgacgg tgatgcgcac ctgaccgaag gtagcggctt taacatcgtg 660
ctggttaagg acggtgtgct gtataccccg gatcgtggcg tgctgcaagg tgttacccgt 720
aaaagcgtta tcaacgcggc ggaggcgttt ggtattgaag tgcgtgttga atttgtgccg 780
gttgagctgg cgtaccgttg cgacgaaatt ttcatgtgca cctgggcggg tggcatcatg 840
ccgattacca ccctggatgg catgccggtt aacggtggcc agatcggtcc gattaccaag 900
aaaatctggg acggctactg ggcgatgcac tacgatgcgg cgtatagctt tgagattgac 960
tataacgaac gtaactaa 978
<210> 11
<211> 325
<212> PRT
<213> Artificial Sequence
<400> 11
Met Ala Ser Met Asp Lys Val Phe Ala Gly Tyr Ala Ala Arg Gln Ala
1 5 10 15
Ile Leu Glu Ser Thr Glu Thr Thr Asn Pro Phe Ala Lys Gly Ile Ala
20 25 30
Trp Val Glu Gly Glu Leu Val Pro Leu Ala Glu Ala Arg Ile Pro Leu
35 40 45
Leu Asp Gln Gly Phe Met His Ser Asp Leu Thr Tyr Asp Val Pro Ser
50 55 60
Val Trp Asp Gly Arg Phe Phe Arg Leu Asp Asp His Ile Thr Arg Leu
65 70 75 80
Glu Ala Ser Cys Thr Lys Leu Arg Leu Arg Leu Pro Leu Pro Arg Asp
85 90 95
Gln Val Lys Gln Ile Leu Val Glu Met Val Ala Lys Ser Gly Ile Arg
100 105 110
Asp Ala Phe Val Glu Leu Ile Val Thr Arg Gly Leu Lys Gly Val Arg
115 120 125
Gly Thr Arg Pro Glu Asp Ile Val Asn Asn Leu Tyr Met Phe Val Gln
130 135 140
Pro Tyr Val Trp Val Met Glu Pro Asp Met Gln Arg Val Gly Gly Ser
145 150 155 160
Ala Val Val Ala Arg Thr Val Arg Arg Val Pro Pro Gly Ala Ile Asp
165 170 175
Asn Thr Val Lys Tyr Leu Gln Trp Gly Asp Tyr Val Arg Gly Met Phe
180 185 190
Glu Ala Ala Asp Arg Gly Ala Thr Tyr Pro Phe Leu Thr Asp Gly Asp
195 200 205
Ala His Leu Thr Glu Gly Ser Leu Phe Asn Ile Val Leu Val Lys Asp
210 215 220
Gly Val Leu Tyr Thr Pro Asp Arg Gly Val Leu Gln Gly Val Thr Arg
225 230 235 240
Lys Ser Val Ile Asn Ala Ala Glu Ala Phe Gly Ile Glu Val Arg Val
245 250 255
Glu Phe Val Pro Val Glu Leu Ala Tyr Arg Cys Asp Glu Ile Phe Met
260 265 270
Cys Thr Trp Ala Gly Gly Ile Met Pro Ile Thr Thr Leu Asp Gly Met
275 280 285
Pro Val Asn Gly Gly Gln Ile Gly Pro Ile Thr Lys Lys Ile Trp Asp
290 295 300
Gly Tyr Trp Ala Met His Tyr Asp Ala Ala Tyr Ser Phe Glu Ile Asp
305 310 315 320
Tyr Asn Glu Arg Asn
325
<210> 12
<211> 978
<212> DNA
<213> Artificial Sequence
<400> 12
atggcgagca tggacaaagt gtttgcgggt tatgcggcgc gtcaggcgat cctggagagc 60
accgaaacca ccaacccgtt tgcgaaaggc attgcgtggg tggaaggtga actggttccg 120
ctggcggagg cgcgtatccc gctgctggac caaggtttca tgcacagcga cctgacctat 180
gatgtgccga gcgtttggga tggccgtttc tttcgtctgg acgatcacat cacccgtctg 240
gaagcgagct gcaccaaact gcgtctgcgt ctgccgctgc cgcgtgacca ggtgaagcaa 300
attctggtgg agatggttgc gaaaagcggt atccgtgatg cgttcgtgga actgattgtt 360
acccgtggtc tgaaaggcgt tcgtggtacc cgtccggagg acatcgtgaa caacctgtac 420
atgtttgttc agccgtatgt gtgggttatg gaaccggata tgcagcgtgt gggtggcagc 480
gcggtggttg cgcgtaccgt gcgtcgtgtt ccgccgggtg cgattgataa caccgtgaag 540
tacctgcagt ggggcgacta cgttcgtggc atgtttgagg cggcggatcg tggcgcgacc 600
tacccgtttc tgaccgacgg tgatgcgcac ctgaccgaag gtagcctgtt taacatcgtg 660
ctggttaagg acggtgtgct gtataccccg gatcgtggcg tgctgcaagg tgttacccgt 720
aaaagcgtta tcaacgcggc ggaggcgttt ggtattgaag tgcgtgttga atttgtgccg 780
gttgagctgg cgtaccgttg cgacgaaatt ttcatgtgca cctgggcggg tggcatcatg 840
ccgattacca ccctggatgg catgccggtt aacggtggcc agatcggtcc gattaccaag 900
aaaatctggg acggctactg ggcgatgcac tacgatgcgg cgtatagctt tgagattgac 960
tataacgaac gtaactaa 978
<210> 13
<211> 325
<212> PRT
<213> Artificial Sequence
<400> 13
Met Ala Ser Met Asp Lys Val Phe Ala Gly Tyr Ala Ala Arg Gln Ala
1 5 10 15
Ile Leu Glu Ser Thr Glu Thr Thr Asn Pro Phe Ala Lys Gly Ile Ala
20 25 30
Trp Val Glu Gly Glu Leu Val Pro Leu Ala Glu Ala Arg Ile Pro Leu
35 40 45
Leu Asp Gln Gly Phe Met His Ser Asp Leu Thr Tyr Asp Val Pro Ser
50 55 60
Val Trp Asp Gly Arg Phe Phe Arg Leu Asp Asp His Ile Thr Arg Leu
65 70 75 80
Glu Ala Ser Cys Thr Lys Leu Arg Leu Arg Leu Pro Leu Pro Arg Asp
85 90 95
Gln Val Lys Gln Ile Leu Val Glu Met Val Ala Lys Ser Gly Ile Arg
100 105 110
Asp Ala Phe Val Glu Leu Ile Val Thr Arg Gly Leu Lys Gly Val Arg
115 120 125
Gly Thr Arg Pro Glu Asp Ile Val Asn Asn Leu Tyr Met Phe Val Gln
130 135 140
Pro Tyr Val Trp Val Met Glu Pro Asp Met Gln Arg Val Gly Gly Ser
145 150 155 160
Ala Val Val Ala Arg Thr Val Arg Arg Val Pro Pro Gly Ala Ile Asp
165 170 175
Asn Thr Val Arg Asn Leu Gln Trp Gly Asp Tyr Val Arg Gly Met Phe
180 185 190
Glu Ala Ala Asp Arg Gly Ala Thr Tyr Pro Phe Leu Thr Asp Gly Asp
195 200 205
Ala His Leu Thr Glu Gly Ser Arg Phe Asn Ile Val Leu Val Lys Asp
210 215 220
Gly Val Leu Tyr Thr Pro Asp Arg Gly Val Leu Gln Gly Val Thr Arg
225 230 235 240
Lys Ser Val Ile Asn Ala Ala Glu Ala Phe Gly Ile Glu Val Arg Val
245 250 255
Glu Phe Val Pro Val Glu Leu Ala Tyr Arg Cys Asp Glu Ile Phe Met
260 265 270
Cys Thr Trp Ala Gly Gly Ile Met Pro Ile Thr Thr Leu Asp Gly Met
275 280 285
Pro Val Asn Gly Gly Gln Ile Gly Pro Ile Thr Lys Lys Ile Trp Asp
290 295 300
Gly Tyr Trp Ala Met His Tyr Asp Ala Ala Tyr Ser Phe Glu Ile Asp
305 310 315 320
Tyr Asn Glu Arg Asn
325
<210> 14
<211> 978
<212> DNA
<213> Artificial Sequence
<400> 14
atggcgagca tggacaaagt gtttgcgggt tatgcggcgc gtcaggcgat cctggagagc 60
accgaaacca ccaacccgtt tgcgaaaggc attgcgtggg tggaaggtga actggttccg 120
ctggcggagg cgcgtatccc gctgctggac caaggtttca tgcacagcga cctgacctat 180
gatgtgccga gcgtttggga tggccgtttc tttcgtctgg acgatcacat cacccgtctg 240
gaagcgagct gcaccaaact gcgtctgcgt ctgccgctgc cgcgtgacca ggtgaagcaa 300
attctggtgg agatggttgc gaaaagcggt atccgtgatg cgttcgtgga actgattgtt 360
acccgtggtc tgaaaggcgt tcgtggtacc cgtccggagg acatcgtgaa caacctgtac 420
atgtttgttc agccgtatgt gtgggttatg gaaccggata tgcagcgtgt gggtggcagc 480
gcggtggttg cgcgtaccgt gcgtcgtgtt ccgccgggtg cgattgataa caccgtgcgt 540
aacctgcagt ggggcgacta cgttcgtggc atgtttgagg cggcggatcg tggcgcgacc 600
tacccgtttc tgaccgacgg tgatgcgcac ctgaccgaag gtagccgttt taacatcgtg 660
ctggttaagg acggtgtgct gtataccccg gatcgtggcg tgctgcaagg tgttacccgt 720
aaaagcgtta tcaacgcggc ggaggcgttt ggtattgaag tgcgtgttga atttgtgccg 780
gttgagctgg cgtaccgttg cgacgaaatt ttcatgtgca cctgggcggg tggcatcatg 840
ccgattacca ccctggatgg catgccggtt aacggtggcc agatcggtcc gattaccaag 900
aaaatctggg acggctactg ggcgatgcac tacgatgcgg cgtatagctt tgagattgac 960
tataacgaac gtaactaa 978
<210> 15
<211> 325
<212> PRT
<213> Artificial Sequence
<400> 15
Met Ala Ser Met Asp Lys Val Phe Ala Gly Tyr Ala Ala Arg Gln Ala
1 5 10 15
Ile Leu Glu Ser Thr Glu Thr Thr Asn Pro Phe Ala Lys Gly Ile Ala
20 25 30
Trp Val Glu Gly Glu Leu Val Pro Leu Ala Glu Ala Arg Ile Pro Leu
35 40 45
Leu Asp Gln Gly Phe Met His Ser Asp Leu Thr Tyr Asp Val Pro Ser
50 55 60
Val Trp Asp Gly Arg Phe Phe Arg Leu Asp Asp His Ile Thr Arg Leu
65 70 75 80
Glu Ala Ser Cys Thr Lys Leu Arg Leu Arg Leu Pro Leu Pro Arg Asp
85 90 95
Gln Val Lys Gln Ile Leu Val Glu Met Val Ala Lys Ser Gly Ile Arg
100 105 110
Asp Ala Phe Val Glu Leu Ile Val Thr Arg Gly Leu Lys Gly Val Arg
115 120 125
Gly Thr Arg Pro Glu Asp Ile Val Asn Asn Leu Tyr Met Phe Val Gln
130 135 140
Pro Tyr Val Trp Val Met Glu Pro Asp Met Gln Arg Val Gly Gly Ser
145 150 155 160
Ala Val Val Ala Arg Thr Val Arg Arg Val Pro Pro Gly Ala Ile Asp
165 170 175
Asn Thr Val Arg Tyr Leu Gln Trp Gly Asp Tyr Val Arg Gly Met Phe
180 185 190
Glu Ala Ala Asp Arg Gly Ala Thr Tyr Pro Phe Leu Thr Asp Gly Asp
195 200 205
Ala His Leu Thr Glu Gly Ser Leu Phe Asn Ile Val Leu Val Lys Asp
210 215 220
Gly Val Leu Tyr Thr Pro Asp Arg Gly Val Leu Gln Gly Val Thr Arg
225 230 235 240
Lys Ser Val Ile Asn Ala Ala Glu Ala Phe Gly Ile Glu Val Arg Val
245 250 255
Glu Phe Val Pro Val Glu Leu Ala Tyr Arg Cys Asp Glu Ile Phe Met
260 265 270
Cys Thr Trp Ala Gly Gly Ile Met Pro Ile Thr Thr Leu Asp Gly Met
275 280 285
Pro Val Asn Gly Gly Gln Ile Gly Pro Ile Thr Lys Lys Ile Trp Asp
290 295 300
Gly Tyr Trp Ala Met His Tyr Asp Ala Ala Tyr Ser Phe Glu Ile Asp
305 310 315 320
Tyr Asn Glu Arg Asn
325
<210> 16
<211> 978
<212> DNA
<213> Artificial Sequence
<400> 16
atggcgagca tggacaaagt gtttgcgggt tatgcggcgc gtcaggcgat cctggagagc 60
accgaaacca ccaacccgtt tgcgaaaggc attgcgtggg tggaaggtga actggttccg 120
ctggcggagg cgcgtatccc gctgctggac caaggtttca tgcacagcga cctgacctat 180
gatgtgccga gcgtttggga tggccgtttc tttcgtctgg acgatcacat cacccgtctg 240
gaagcgagct gcaccaaact gcgtctgcgt ctgccgctgc cgcgtgacca ggtgaagcaa 300
attctggtgg agatggttgc gaaaagcggt atccgtgatg cgttcgtgga actgattgtt 360
acccgtggtc tgaaaggcgt tcgtggtacc cgtccggagg acatcgtgaa caacctgtac 420
atgtttgttc agccgtatgt gtgggttatg gaaccggata tgcagcgtgt gggtggcagc 480
gcggtggttg cgcgtaccgt gcgtcgtgtt ccgccgggtg cgattgataa caccgtgcgt 540
tacctgcagt ggggcgacta cgttcgtggc atgtttgagg cggcggatcg tggcgcgacc 600
tacccgtttc tgaccgacgg tgatgcgcac ctgaccgaag gtagcctgtt taacatcgtg 660
ctggttaagg acggtgtgct gtataccccg gatcgtggcg tgctgcaagg tgttacccgt 720
aaaagcgtta tcaacgcggc ggaggcgttt ggtattgaag tgcgtgttga atttgtgccg 780
gttgagctgg cgtaccgttg cgacgaaatt ttcatgtgca cctgggcggg tggcatcatg 840
ccgattacca ccctggatgg catgccggtt aacggtggcc agatcggtcc gattaccaag 900
aaaatctggg acggctactg ggcgatgcac tacgatgcgg cgtatagctt tgagattgac 960
tataacgaac gtaactaa 978
<210> 17
<211> 325
<212> PRT
<213> Artificial Sequence
<400> 17
Met Ala Ser Met Asp Lys Val Phe Ala Gly Tyr Ala Leu Arg Gln Ala
1 5 10 15
Ile Leu Glu Ser Thr Glu Thr Thr Asn Pro Phe Ala Lys Gly Ile Ala
20 25 30
Trp Val Glu Gly Glu Leu Val Pro Leu Ala Glu Ala Arg Ile Pro Leu
35 40 45
Leu Asp Gln Gly Phe Met His Ser Asp Leu Thr Tyr Asp Val Pro Ser
50 55 60
Val Trp Asp Gly Arg Phe Phe Arg Leu Asp Asp His Ile Thr Arg Leu
65 70 75 80
Glu Ala Ser Cys Thr Lys Leu Arg Leu Arg Leu Pro Leu Pro Arg Asp
85 90 95
Gln Val Lys Gln Ile Leu Val Glu Met Val Ala Lys Ser Gly Ile Arg
100 105 110
Asp Ala Phe Val Glu Leu Ile Val Thr Arg Gly Leu Lys Gly Val Arg
115 120 125
Gly Thr Arg Pro Glu Asp Ile Val Asn Asn Leu Tyr Met Phe Val Gln
130 135 140
Pro Tyr Val Trp Val Met Glu Pro Asp Met Gln Arg Val Gly Gly Ser
145 150 155 160
Ala Val Val Ala Arg Thr Val Arg Arg Val Pro Pro Gly Ala Ile Asp
165 170 175
Asn Thr Val Arg Tyr Leu Gln Trp Gly Asp Tyr Val Arg Gly Met Phe
180 185 190
Glu Ala Ala Asp Arg Gly Ala Thr Tyr Pro Phe Leu Thr Asp Gly Asp
195 200 205
Ala His Leu Thr Glu Gly Ser Trp Phe Asn Ile Val Leu Val Lys Asp
210 215 220
Gly Val Leu Tyr Thr Pro Asp Arg Gly Val Leu Gln Gly Val Thr Arg
225 230 235 240
Lys Ser Val Ile Asn Ala Ala Glu Ala Phe Gly Ile Glu Val Arg Val
245 250 255
Glu Phe Val Pro Val Glu Leu Ala Tyr Arg Cys Asp Glu Ile Phe Met
260 265 270
Cys Thr Trp Ala Gly Gly Ile Met Pro Ile Thr Thr Leu Asp Gly Met
275 280 285
Pro Val Asn Gly Gly Gln Ile Gly Pro Ile Thr Lys Arg Ile Trp Asp
290 295 300
Gly Tyr Trp Ala Met His Tyr Asp Ala Ala Tyr Ser Phe Glu Ile Asp
305 310 315 320
Tyr Asn Glu Arg Asn
325
<210> 18
<211> 978
<212> DNA
<213> Artificial Sequence
<400> 18
atggcgagca tggacaaagt gtttgcgggt tatgcgctgc gtcaggcgat cctggagagc 60
accgaaacca ccaacccgtt tgcgaaaggc attgcgtggg tggaaggtga actggttccg 120
ctggcggagg cgcgtatccc gctgctggac caaggtttca tgcacagcga cctgacctat 180
gatgtgccga gcgtttggga tggccgtttc tttcgtctgg acgatcacat cacccgtctg 240
gaagcgagct gcaccaaact gcgtctgcgt ctgccgctgc cgcgtgacca ggtgaagcaa 300
attctggtgg agatggttgc gaaaagcggt atccgtgatg cgttcgtgga actgattgtt 360
acccgtggtc tgaaaggcgt tcgtggtacc cgtccggagg acatcgtgaa caacctgtac 420
atgtttgttc agccgtatgt gtgggttatg gaaccggata tgcagcgtgt gggtggcagc 480
gcggtggttg cgcgtaccgt gcgtcgtgtt ccgccgggtg cgattgataa caccgtgcgt 540
tacctgcagt ggggcgacta cgttcgtggc atgtttgagg cggcggatcg tggcgcgacc 600
tacccgtttc tgaccgacgg tgatgcgcac ctgaccgaag gtagctggtt taacatcgtg 660
ctggttaagg acggtgtgct gtataccccg gatcgtggcg tgctgcaagg tgttacccgt 720
aaaagcgtta tcaacgcggc ggaggcgttt ggtattgaag tgcgtgttga atttgtgccg 780
gttgagctgg cgtaccgttg cgacgaaatt ttcatgtgca cctgggcggg tggcatcatg 840
ccgattacca ccctggatgg catgccggtt aacggtggcc agatcggtcc gattaccaag 900
cgtatctggg acggctactg ggcgatgcac tacgatgcgg cgtatagctt tgagattgac 960
tataacgaac gtaactaa 978
<210> 19
<211> 325
<212> PRT
<213> Artificial Sequence
<400> 19
Met Ala Ser Met Asp Lys Val Phe Ala Gly Tyr Ala Arg Arg Gln Ala
1 5 10 15
Ile Leu Glu Ser Thr Glu Thr Thr Asn Pro Phe Ala Lys Gly Ile Ala
20 25 30
Trp Val Glu Gly Glu Leu Val Pro Leu Ala Glu Ala Arg Ile Pro Leu
35 40 45
Leu Asp Gln Gly Phe Met His Ser Asp Leu Thr Tyr Asp Val Pro Ser
50 55 60
Val Trp Asp Gly Arg Phe Phe Arg Leu Asp Asp His Ile Thr Arg Leu
65 70 75 80
Glu Ala Ser Cys Thr Lys Leu Arg Leu Arg Leu Pro Leu Pro Arg Asp
85 90 95
Gln Val Lys Gln Ile Leu Val Glu Met Val Ala Lys Ser Gly Ile Arg
100 105 110
Asp Ala Phe Val Glu Leu Ile Val Thr Arg Gly Leu Lys Gly Val Arg
115 120 125
Gly Thr Arg Pro Glu Asp Ile Val Asn Asn Leu Tyr Met Phe Val Gln
130 135 140
Pro Tyr Val Trp Val Met Glu Pro Asp Met Gln Arg Val Gly Gly Ser
145 150 155 160
Ala Val Val Ala Arg Thr Val Arg Arg Val Pro Pro Gly Ala Ile Asp
165 170 175
Asn Thr Val Arg Tyr Leu Gln Trp Gly Asp Tyr Val Arg Gly Met Phe
180 185 190
Glu Ala Ala Asp Arg Gly Ala Thr Tyr Pro Phe Leu Thr Asp Gly Asp
195 200 205
Ala His Leu Thr Glu Gly Ser Arg Phe Asn Ile Val Leu Val Lys Asp
210 215 220
Gly Val Leu Tyr Thr Pro Asp Arg Gly Val Leu Gln Gly Val Thr Arg
225 230 235 240
Lys Ser Val Ile Asn Ala Ala Glu Ala Phe Gly Ile Glu Val Arg Val
245 250 255
Glu Phe Val Pro Val Glu Leu Ala Tyr Arg Cys Asp Glu Ile Phe Met
260 265 270
Cys Thr Trp Ala Gly Gly Ile Met Pro Ile Thr Phe Leu Asp Gly Met
275 280 285
Pro Val Asn Gly Gly Gln Ile Gly Pro Ile Thr Lys Phe Ile Trp Asp
290 295 300
Gly Tyr Trp Ala Met His Tyr Asp Ala Ala Tyr Ser Phe Glu Ile Asp
305 310 315 320
Tyr Asn Glu Arg Asn
325
<210> 20
<211> 978
<212> DNA
<213> Artificial Sequence
<400> 20
atggcgagca tggacaaagt gtttgcgggt tatgcgcgtc gtcaggcgat cctggagagc 60
accgaaacca ccaacccgtt tgcgaaaggc attgcgtggg tggaaggtga actggttccg 120
ctggcggagg cgcgtatccc gctgctggac caaggtttca tgcacagcga cctgacctat 180
gatgtgccga gcgtttggga tggccgtttc tttcgtctgg acgatcacat cacccgtctg 240
gaagcgagct gcaccaaact gcgtctgcgt ctgccgctgc cgcgtgacca ggtgaagcaa 300
attctggtgg agatggttgc gaaaagcggt atccgtgatg cgttcgtgga actgattgtt 360
acccgtggtc tgaaaggcgt tcgtggtacc cgtccggagg acatcgtgaa caacctgtac 420
atgtttgttc agccgtatgt gtgggttatg gaaccggata tgcagcgtgt gggtggcagc 480
gcggtggttg cgcgtaccgt gcgtcgtgtt ccgccgggtg cgattgataa caccgtgcgt 540
tacctgcagt ggggcgacta cgttcgtggc atgtttgagg cggcggatcg tggcgcgacc 600
tacccgtttc tgaccgacgg tgatgcgcac ctgaccgaag gtagccgttt taacatcgtg 660
ctggttaagg acggtgtgct gtataccccg gatcgtggcg tgctgcaagg tgttacccgt 720
aaaagcgtta tcaacgcggc ggaggcgttt ggtattgaag tgcgtgttga atttgtgccg 780
gttgagctgg cgtaccgttg cgacgaaatt ttcatgtgca cctgggcggg tggcatcatg 840
ccgattacct ttctggatgg catgccggtt aacggtggcc agatcggtcc gattaccaag 900
tttatctggg acggctactg ggcgatgcac tacgatgcgg cgtatagctt tgagattgac 960
tataacgaac gtaactaa 978
<210> 21
<211> 325
<212> PRT
<213> Artificial Sequence
<400> 21
Met Ala Ser Met Asp Lys Val Phe Ala Gly Tyr Ala Leu Arg Gln Ala
1 5 10 15
Ile Leu Glu Ser Thr Glu Thr Thr Asn Pro Phe Ala Lys Gly Ile Ala
20 25 30
Trp Val Glu Gly Glu Leu Val Pro Leu Ala Glu Ala Arg Ile Pro Leu
35 40 45
Leu Asp Gln Gly Phe Met His Ser Asp Leu Thr Tyr Asp Val Pro Ser
50 55 60
Val Trp Asp Gly Arg Phe Phe Arg Leu Asp Asp His Ile Thr Arg Leu
65 70 75 80
Glu Ala Ser Cys Thr Lys Leu Arg Leu Arg Leu Pro Leu Pro Arg Asp
85 90 95
Gln Val Lys Gln Ile Leu Val Glu Met Val Ala Lys Ser Gly Ile Arg
100 105 110
Asp Ala Phe Val Glu Leu Ile Val Thr Arg Gly Leu Lys Gly Val Arg
115 120 125
Gly Thr Arg Pro Glu Asp Ile Val Asn Asn Leu Tyr Met Phe Val Gln
130 135 140
Pro Tyr Val Trp Val Met Glu Pro Asp Met Gln Arg Val Gly Gly Ser
145 150 155 160
Ala Val Val Ala Arg Thr Val Arg Arg Val Pro Pro Gly Ala Ile Asp
165 170 175
Asn Thr Val Arg Tyr Leu Gln Trp Gly Asp Tyr Val Arg Gly Met Phe
180 185 190
Glu Ala Ala Asp Arg Gly Ala Thr Tyr Pro Phe Leu Thr Asp Gly Asp
195 200 205
Ala His Leu Thr Glu Gly Ser Trp Phe Asn Ile Val Leu Val Lys Asp
210 215 220
Gly Val Leu Tyr Thr Pro Asp Arg Gly Val Leu Gln Gly Val Thr Arg
225 230 235 240
Lys Ser Val Ile Asn Ala Ala Glu Ala Phe Gly Ile Glu Val Arg Val
245 250 255
Glu Phe Val Pro Val Glu Leu Ala Tyr Arg Cys Asp Glu Ile Phe Met
260 265 270
Cys Thr Trp Ala Gly Gly Ile Met Pro Ile Thr Thr Leu Asp Gly Met
275 280 285
Pro Val Asn Gly Gly Gln Ile Gly Pro Ile Thr Lys Arg Ile Trp Asp
290 295 300
Gly Tyr Trp Ala Met His Tyr Phe Ala Ala Tyr Ser Phe Glu Ile Asp
305 310 315 320
Tyr Asn Glu Arg Asn
325
<210> 22
<211> 978
<212> DNA
<213> Artificial Sequence
<400> 22
atggcgagca tggacaaagt gtttgcgggt tatgcgctgc gtcaggcgat cctggagagc 60
accgaaacca ccaacccgtt tgcgaaaggc attgcgtggg tggaaggtga actggttccg 120
ctggcggagg cgcgtatccc gctgctggac caaggtttca tgcacagcga cctgacctat 180
gatgtgccga gcgtttggga tggccgtttc tttcgtctgg acgatcacat cacccgtctg 240
gaagcgagct gcaccaaact gcgtctgcgt ctgccgctgc cgcgtgacca ggtgaagcaa 300
attctggtgg agatggttgc gaaaagcggt atccgtgatg cgttcgtgga actgattgtt 360
acccgtggtc tgaaaggcgt tcgtggtacc cgtccggagg acatcgtgaa caacctgtac 420
atgtttgttc agccgtatgt gtgggttatg gaaccggata tgcagcgtgt gggtggcagc 480
gcggtggttg cgcgtaccgt gcgtcgtgtt ccgccgggtg cgattgataa caccgtgcgt 540
tacctgcagt ggggcgacta cgttcgtggc atgtttgagg cggcggatcg tggcgcgacc 600
tacccgtttc tgaccgacgg tgatgcgcac ctgaccgaag gtagctggtt taacatcgtg 660
ctggttaagg acggtgtgct gtataccccg gatcgtggcg tgctgcaagg tgttacccgt 720
aaaagcgtta tcaacgcggc ggaggcgttt ggtattgaag tgcgtgttga atttgtgccg 780
gttgagctgg cgtaccgttg cgacgaaatt ttcatgtgca cctgggcggg tggcatcatg 840
ccgattacca ccctggatgg catgccggtt aacggtggcc agatcggtcc gattaccaag 900
cgtatctggg acggctactg ggcgatgcac tactttgcgg cgtatagctt tgagattgac 960
tataacgaac gtaactaa 978
<210> 23
<211> 325
<212> PRT
<213> Artificial Sequence
<400> 23
Met Ala Ser Met Asp Lys Val Phe Ala Gly Tyr Ala Arg Arg Gln Ala
1 5 10 15
Ile Leu Glu Ser Thr Glu Thr Thr Asn Pro Phe Ala Lys Gly Ile Ala
20 25 30
Trp Val Glu Gly Glu Leu Val Pro Leu Ala Glu Ala Arg Ile Pro Leu
35 40 45
Leu Asp Gln Gly Phe Met His Ser Asp Leu Thr Tyr Asp Val Pro Ser
50 55 60
Val Trp Asp Gly Arg Phe Phe Arg Leu Asp Asp His Ile Thr Arg Leu
65 70 75 80
Glu Ala Ser Cys Thr Lys Leu Arg Leu Arg Leu Pro Leu Pro Arg Asp
85 90 95
Gln Val Lys Gln Ile Leu Val Glu Met Val Ala Lys Ser Gly Ile Arg
100 105 110
Asp Ala Phe Val Glu Leu Ile Val Thr Arg Gly Leu Lys Gly Val Arg
115 120 125
Gly Thr Arg Pro Glu Asp Ile Val Asn Trp Leu Tyr Met Phe Val Gln
130 135 140
Pro Tyr Val Trp Val Met Glu Pro Asp Met Gln Arg Val Gly Gly Ser
145 150 155 160
Ala Val Val Ala Arg Thr Val Arg Arg Val Pro Pro Gly Ala Ile Asp
165 170 175
Asn Thr Val Arg Tyr Leu Gln Trp Gly Asp Tyr Val Arg Gly Met Phe
180 185 190
Glu Ala Ala Asp Arg Gly Ala Thr Tyr Pro Phe Leu Thr Asp Gly Asp
195 200 205
Ala His Leu Thr Glu Gly Ser Arg Phe Asn Ile Val Leu Val Lys Asp
210 215 220
Gly Val Leu Tyr Thr Pro Asp Arg Gly Val Leu Gln Gly Val Thr Arg
225 230 235 240
Lys Ser Val Ile Asn Ala Ala Glu Ala Phe Gly Ile Glu Val Arg Val
245 250 255
Glu Phe Val Pro Val Glu Leu Ala Tyr Arg Cys Asp Glu Ile Phe Met
260 265 270
Cys Thr Trp Ala Gly Gly Ile Met Pro Ile Thr Trp Leu Asp Gly Met
275 280 285
Pro Val Asn Gly Gly Gln Ile Gly Pro Ile Thr Lys Arg Ile Trp Asp
290 295 300
Gly Tyr Trp Ala Met His Tyr Leu Ala Ala Tyr Ser Phe Glu Ile Asp
305 310 315 320
Tyr Asn Glu Arg Asn
325
<210> 24
<211> 978
<212> DNA
<213> Artificial Sequence
<400> 24
atggcgagca tggacaaagt gtttgcgggt tatgcgcgtc gtcaggcgat cctggagagc 60
accgaaacca ccaacccgtt tgcgaaaggc attgcgtggg tggaaggtga actggttccg 120
ctggcggagg cgcgtatccc gctgctggac caaggtttca tgcacagcga cctgacctat 180
gatgtgccga gcgtttggga tggccgtttc tttcgtctgg acgatcacat cacccgtctg 240
gaagcgagct gcaccaaact gcgtctgcgt ctgccgctgc cgcgtgacca ggtgaagcaa 300
attctggtgg agatggttgc gaaaagcggt atccgtgatg cgttcgtgga actgattgtt 360
acccgtggtc tgaaaggcgt tcgtggtacc cgtccggagg acatcgtgaa ctggctgtac 420
atgtttgttc agccgtatgt gtgggttatg gaaccggata tgcagcgtgt gggtggcagc 480
gcggtggttg cgcgtaccgt gcgtcgtgtt ccgccgggtg cgattgataa caccgtgcgt 540
tacctgcagt ggggcgacta cgttcgtggc atgtttgagg cggcggatcg tggcgcgacc 600
tacccgtttc tgaccgacgg tgatgcgcac ctgaccgaag gtagccgttt taacatcgtg 660
ctggttaagg acggtgtgct gtataccccg gatcgtggcg tgctgcaagg tgttacccgt 720
aaaagcgtta tcaacgcggc ggaggcgttt ggtattgaag tgcgtgttga atttgtgccg 780
gttgagctgg cgtaccgttg cgacgaaatt ttcatgtgca cctgggcggg tggcatcatg 840
ccgattacct ggctggatgg catgccggtt aacggtggcc agatcggtcc gattaccaag 900
cgtatctggg acggctactg ggcgatgcac tacctggcgg cgtatagctt tgagattgac 960
tataacgaac gtaactaa 978
<210> 25
<211> 325
<212> PRT
<213> Artificial Sequence
<400> 25
Met Ala Ser Met Asp Lys Val Phe Ala Gly Tyr Ala Leu Arg Gln Ala
1 5 10 15
Ile Leu Glu Ser Thr Glu Thr Thr Asn Pro Phe Ala Lys Gly Ile Ala
20 25 30
Trp Val Glu Gly Glu Leu Val Pro Leu Ala Glu Ala Arg Ile Pro Leu
35 40 45
Leu Asp Gln Gly Phe Met His Ser Asp Leu Thr Tyr Asp Val Pro Ser
50 55 60
Val Trp Asp Gly Arg Phe Phe Arg Leu Asp Asp His Ile Thr Arg Leu
65 70 75 80
Glu Ala Ser Cys Thr Lys Leu Arg Leu Arg Leu Pro Leu Pro Arg Asp
85 90 95
Gln Val Lys Gln Ile Leu Val Glu Met Val Ala Lys Ser Gly Ile Arg
100 105 110
Asp Ala Phe Val Glu Leu Ile Val Thr Arg Gly Leu Lys Gly Val Arg
115 120 125
Gly Thr Arg Pro Glu Asp Ile Val Asn Arg Leu Tyr Met Phe Val Gln
130 135 140
Pro Tyr Val Trp Val Met Glu Pro Asp Met Gln Arg Val Gly Gly Ser
145 150 155 160
Ala Val Val Ala Arg Thr Val Arg Arg Val Pro Pro Gly Ala Ile Asp
165 170 175
Asn Thr Val Arg Tyr Leu Gln Trp Gly Asp Tyr Val Arg Gly Met Phe
180 185 190
Glu Ala Ala Asp Arg Gly Ala Thr Tyr Pro Phe Leu Thr Asp Gly Asp
195 200 205
Ala His Leu Thr Glu Gly Ser Arg Phe Asn Ile Val Leu Val Lys Asp
210 215 220
Gly Val Leu Tyr Thr Pro Asp Arg Gly Val Leu Gln Gly Val Thr Arg
225 230 235 240
Lys Ser Val Ile Asn Ala Ala Glu Ala Phe Gly Ile Glu Val Arg Val
245 250 255
Glu Phe Val Pro Val Glu Leu Ala Tyr Arg Cys Asp Glu Ile Phe Met
260 265 270
Cys Thr Trp Ala Gly Gly Ile Met Pro Ile Thr Trp Leu Asp Gly Met
275 280 285
Pro Val Asn Gly Gly Gln Ile Gly Pro Ile Thr Lys Arg Ile Trp Asp
290 295 300
Gly Tyr Trp Ala Met His Tyr Leu Ala Ala Tyr Ser Phe Glu Ile Asp
305 310 315 320
Tyr Asn Glu Arg Asn
325
<210> 26
<211> 978
<212> DNA
<213> Artificial Sequence
<400> 26
atggcgagca tggacaaagt gtttgcgggt tatgcgctgc gtcaggcgat cctggagagc 60
accgaaacca ccaacccgtt tgcgaaaggc attgcgtggg tggaaggtga actggttccg 120
ctggcggagg cgcgtatccc gctgctggac caaggtttca tgcacagcga cctgacctat 180
gatgtgccga gcgtttggga tggccgtttc tttcgtctgg acgatcacat cacccgtctg 240
gaagcgagct gcaccaaact gcgtctgcgt ctgccgctgc cgcgtgacca ggtgaagcaa 300
attctggtgg agatggttgc gaaaagcggt atccgtgatg cgttcgtgga actgattgtt 360
acccgtggtc tgaaaggcgt tcgtggtacc cgtccggagg acatcgtgaa ccgtctgtac 420
atgtttgttc agccgtatgt gtgggttatg gaaccggata tgcagcgtgt gggtggcagc 480
gcggtggttg cgcgtaccgt gcgtcgtgtt ccgccgggtg cgattgataa caccgtgcgt 540
tacctgcagt ggggcgacta cgttcgtggc atgtttgagg cggcggatcg tggcgcgacc 600
tacccgtttc tgaccgacgg tgatgcgcac ctgaccgaag gtagccgttt taacatcgtg 660
ctggttaagg acggtgtgct gtataccccg gatcgtggcg tgctgcaagg tgttacccgt 720
aaaagcgtta tcaacgcggc ggaggcgttt ggtattgaag tgcgtgttga atttgtgccg 780
gttgagctgg cgtaccgttg cgacgaaatt ttcatgtgca cctgggcggg tggcatcatg 840
ccgattacct ggctggatgg catgccggtt aacggtggcc agatcggtcc gattaccaag 900
cgtatctggg acggctactg ggcgatgcac tacctggcgg cgtatagctt tgagattgac 960
tataacgaac gtaactaa 978
<210> 27
<211> 325
<212> PRT
<213> Artificial Sequence
<400> 27
Met Ala Ser Met Asp Lys Val Phe Ala Gly Tyr Ala Arg Arg Gln Ala
1 5 10 15
Ile Leu Glu Ser Thr Glu Thr Thr Asn Pro Phe Ala Lys Gly Ile Ala
20 25 30
Trp Val Glu Gly Glu Leu Val Pro Leu Ala Glu Ala Arg Ile Pro Leu
35 40 45
Leu Asp Gln Gly Phe Met His Ser Asp Leu Thr Tyr Asp Val Pro Ser
50 55 60
Val Trp Asp Gly Arg Phe Phe Arg Leu Asp Asp His Ile Thr Arg Leu
65 70 75 80
Glu Ala Ser Cys Thr Lys Leu Arg Leu Arg Leu Pro Leu Pro Arg Asp
85 90 95
Gln Val Lys Gln Ile Leu Val Glu Met Val Ala Lys Ser Gly Ile Arg
100 105 110
Asp Ala Phe Val Glu Leu Ile Val Thr Arg Gly Leu Lys Gly Val Arg
115 120 125
Gly Thr Arg Pro Glu Asp Ile Val Asn Arg Leu Tyr Met Phe Val Gln
130 135 140
Pro Tyr Val Trp Val Met Glu Pro Asp Met Gln Arg Val Gly Gly Ser
145 150 155 160
Ala Val Val Ala Arg Thr Val Arg Arg Val Pro Pro Gly Ala Ile Asp
165 170 175
Asn Thr Val Arg Tyr Leu Gln Trp Gly Asp Tyr Val Arg Gly Met Phe
180 185 190
Glu Ala Ala Asp Arg Gly Ala Thr Tyr Pro Phe Leu Thr Asp Gly Asp
195 200 205
Ala His Leu Thr Glu Gly Ser Trp Phe Asn Ile Val Leu Val Lys Asp
210 215 220
Gly Val Leu Tyr Thr Pro Asp Arg Gly Val Leu Gln Gly Val Thr Arg
225 230 235 240
Lys Ser Val Ile Asn Ala Ala Glu Ala Phe Gly Ile Glu Val Arg Val
245 250 255
Glu Phe Val Pro Val Glu Leu Ala Tyr Arg Cys Asp Glu Ile Phe Met
260 265 270
Cys Thr Trp Ala Gly Gly Ile Met Pro Ile Thr Phe Leu Asp Gly Met
275 280 285
Pro Val Asn Gly Gly Gln Ile Gly Pro Ile Thr Lys Arg Ile Trp Asp
290 295 300
Gly Tyr Trp Ala Met His Tyr Arg Ala Ala Tyr Ser Phe Glu Ile Asp
305 310 315 320
Tyr Asn Glu Arg Asn
325
<210> 28
<211> 978
<212> DNA
<213> Artificial Sequence
<400> 28
atggcgagca tggacaaagt gtttgcgggt tatgcgcgtc gtcaggcgat cctggagagc 60
accgaaacca ccaacccgtt tgcgaaaggc attgcgtggg tggaaggtga actggttccg 120
ctggcggagg cgcgtatccc gctgctggac caaggtttca tgcacagcga cctgacctat 180
gatgtgccga gcgtttggga tggccgtttc tttcgtctgg acgatcacat cacccgtctg 240
gaagcgagct gcaccaaact gcgtctgcgt ctgccgctgc cgcgtgacca ggtgaagcaa 300
attctggtgg agatggttgc gaaaagcggt atccgtgatg cgttcgtgga actgattgtt 360
acccgtggtc tgaaaggcgt tcgtggtacc cgtccggagg acatcgtgaa ccgtctgtac 420
atgtttgttc agccgtatgt gtgggttatg gaaccggata tgcagcgtgt gggtggcagc 480
gcggtggttg cgcgtaccgt gcgtcgtgtt ccgccgggtg cgattgataa caccgtgcgt 540
tacctgcagt ggggcgacta cgttcgtggc atgtttgagg cggcggatcg tggcgcgacc 600
tacccgtttc tgaccgacgg tgatgcgcac ctgaccgaag gtagctggtt taacatcgtg 660
ctggttaagg acggtgtgct gtataccccg gatcgtggcg tgctgcaagg tgttacccgt 720
aaaagcgtta tcaacgcggc ggaggcgttt ggtattgaag tgcgtgttga atttgtgccg 780
gttgagctgg cgtaccgttg cgacgaaatt ttcatgtgca cctgggcggg tggcatcatg 840
ccgattacct ttctggatgg catgccggtt aacggtggcc agatcggtcc gattaccaag 900
cgtatctggg acggctactg ggcgatgcac taccgtgcgg cgtatagctt tgagattgac 960
tataacgaac gtaactaa 978
<210> 29
<211> 325
<212> PRT
<213> Artificial Sequence
<400> 29
Met Ala Ser Met Asp Lys Val Phe Ala Gly Tyr Ala Leu Arg Gln Ala
1 5 10 15
Ile Leu Glu Ser Thr Glu Thr Thr Asn Pro Phe Ala Lys Gly Ile Ala
20 25 30
Trp Val Glu Gly Glu Leu Val Pro Leu Ala Glu Ala Arg Ile Pro Leu
35 40 45
Leu Asp Gln Gly Phe Met His Ser Asp Leu Thr Tyr Asp Val Pro Ser
50 55 60
Val Trp Asp Gly Arg Phe Phe Arg Leu Asp Asp His Ile Thr Arg Leu
65 70 75 80
Glu Ala Ser Cys Thr Lys Leu Arg Leu Arg Leu Pro Leu Pro Arg Asp
85 90 95
Gln Val Lys Gln Ile Leu Val Glu Met Val Ala Lys Ser Gly Ile Arg
100 105 110
Asp Ala Phe Val Glu Leu Ile Val Thr Arg Gly Leu Lys Gly Val Arg
115 120 125
Gly Thr Arg Pro Glu Asp Ile Val Asn Trp Leu Tyr Met Phe Val Gln
130 135 140
Pro Tyr Val Trp Val Met Glu Pro Asp Met Gln Arg Val Gly Gly Ser
145 150 155 160
Ala Val Val Ala Arg Thr Val Arg Arg Val Pro Pro Gly Ala Ile Asp
165 170 175
Asn Thr Val Arg Tyr Leu Gln Trp Gly Asp Tyr Val Arg Gly Met Phe
180 185 190
Glu Ala Ala Asp Arg Gly Ala Thr Tyr Pro Phe Leu Thr Asp Gly Asp
195 200 205
Ala His Leu Thr Glu Gly Ser Trp Phe Asn Ile Val Leu Val Lys Asp
210 215 220
Gly Val Leu Tyr Thr Pro Asp Arg Gly Val Leu Gln Gly Val Thr Arg
225 230 235 240
Lys Ser Val Ile Asn Ala Ala Glu Ala Phe Gly Ile Glu Val Arg Val
245 250 255
Glu Phe Val Pro Val Glu Leu Ala Tyr Arg Cys Asp Glu Ile Phe Met
260 265 270
Cys Thr Trp Ala Gly Gly Ile Met Pro Ile Thr Phe Leu Asp Gly Met
275 280 285
Pro Val Asn Gly Gly Gln Ile Gly Pro Ile Thr Lys Phe Ile Trp Asp
290 295 300
Gly Tyr Trp Ala Met His Tyr Leu Ala Ala Tyr Ser Phe Glu Ile Asp
305 310 315 320
Tyr Asn Glu Arg Asn
325
<210> 30
<211> 978
<212> DNA
<213> Artificial Sequence
<400> 30
atggcgagca tggacaaagt gtttgcgggt tatgcgctgc gtcaggcgat cctggagagc 60
accgaaacca ccaacccgtt tgcgaaaggc attgcgtggg tggaaggtga actggttccg 120
ctggcggagg cgcgtatccc gctgctggac caaggtttca tgcacagcga cctgacctat 180
gatgtgccga gcgtttggga tggccgtttc tttcgtctgg acgatcacat cacccgtctg 240
gaagcgagct gcaccaaact gcgtctgcgt ctgccgctgc cgcgtgacca ggtgaagcaa 300
attctggtgg agatggttgc gaaaagcggt atccgtgatg cgttcgtgga actgattgtt 360
acccgtggtc tgaaaggcgt tcgtggtacc cgtccggagg acatcgtgaa ctggctgtac 420
atgtttgttc agccgtatgt gtgggttatg gaaccggata tgcagcgtgt gggtggcagc 480
gcggtggttg cgcgtaccgt gcgtcgtgtt ccgccgggtg cgattgataa caccgtgcgt 540
tacctgcagt ggggcgacta cgttcgtggc atgtttgagg cggcggatcg tggcgcgacc 600
tacccgtttc tgaccgacgg tgatgcgcac ctgaccgaag gtagctggtt taacatcgtg 660
ctggttaagg acggtgtgct gtataccccg gatcgtggcg tgctgcaagg tgttacccgt 720
aaaagcgtta tcaacgcggc ggaggcgttt ggtattgaag tgcgtgttga atttgtgccg 780
gttgagctgg cgtaccgttg cgacgaaatt ttcatgtgca cctgggcggg tggcatcatg 840
ccgattacct ttctggatgg catgccggtt aacggtggcc agatcggtcc gattaccaag 900
tttatctggg acggctactg ggcgatgcac tacctggcgg cgtatagctt tgagattgac 960
tataacgaac gtaactaa 978
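The DNA entries in the listing are 978 nt long and the protein entries 325 residues long, which is consistent with each DNA entry encoding a protein entry plus one stop codon (978 = 3 × 326). The Python sketch below shows one way to check that pairing by translating listing lines with the standard genetic code. The helper names `clean_listing` and `translate` are illustrative assumptions, not part of the patent, and the two input strings are copied from the first and last lines of SEQ ID NO: 20.

```python
from itertools import product

# Standard genetic code, codons enumerated in t, c, a, g order
# (first base varies slowest, third base fastest).
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean_listing(text: str) -> str:
    """Keep only nucleotide letters, dropping spaces, newlines and position counters."""
    return "".join(ch for ch in text.lower() if ch in BASES)


def translate(dna: str) -> str:
    """Translate in frame from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon
            break
        protein.append(aa)
    return "".join(protein)


# First and last lines of SEQ ID NO: 20 as printed in the listing.
first_line_seq20 = "atggcgagca tggacaaagt gtttgcgggt tatgcgcgtc gtcaggcgat cctggagagc 60"
last_line_seq20 = "tataacgaac gtaactaa 978"

print(translate(clean_listing(first_line_seq20)))  # MASMDKVFAGYARRQAILES
print(translate(clean_listing(last_line_seq20)))   # YNERN
```

The two outputs agree with residues 1 to 20 and 321 to 325 of SEQ ID NO: 19 (Met Ala Ser Met Asp Lys Val Phe Ala Gly Tyr Ala Arg Arg Gln Ala Ile Leu Glu Ser, and Tyr Asn Glu Arg Asn). The last line can be translated on its own only because it starts at nucleotide 961, which is in frame.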
Claims (11)
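1. A transaminase mutant, characterized in that the transaminase mutant comprises an amino acid sequence that has at least 80% identity with SEQ ID NO: 1 and has transaminase activity.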
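2. The transaminase mutant according to claim 1, characterized in that the transaminase mutant has a modification, substitution, deletion or addition of one or more amino acids at positions 13 to 312 of the amino acid sequence shown in SEQ ID NO: 1;
preferably, the transaminase mutant has one or more amino acid mutations at positions 13 to 312 of SEQ ID NO: 1;
preferably, the mutation is selected from any one or more of position 180, position 181, position 187, position 216 or position 275;
preferably, the mutation is selected from any one or more of K180R, N181Y, L187Y, G216L or T275W;
preferably, the mutation is further selected from position 177;
preferably, the mutation is selected from any one or more of P177N, K180R, N181Y, L187Y, G216L, G216R or T275W;
preferably, the mutation is further selected from position 13 and/or position 301;
preferably, the mutation is selected from any one or more of A13L, A13R, P177N, K180R, N181Y, L187Y, G216W, G216R, T275W, K301R or K301F;
preferably, the mutation is further selected from position 284 and/or position 312;
preferably, the mutation is selected from any one or more of A13L, A13R, P177N, K180R, N181Y, L187Y, G216W, G216R, T275W, T284F, K301R, K301F or D312F;
preferably, the mutation is further selected from position 138;
preferably, the mutation is selected from any one or more of A13L, A13R, N138W, N138R, P177N, K180R, N181Y, L187Y, G216L, G216R, G216W, T275W, T284F, T284W, K301R, K301F, D312F, D312L or D312R.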
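3. The transaminase mutant according to claim 1 or 2, characterized in that the transaminase mutant comprises an amino acid sequence obtained from the amino acid sequence shown in SEQ ID NO: 1 by the substitution mutations L187Y, G216L and T275W;
preferably, the transaminase mutant comprises the amino acid sequence shown in SEQ ID NO: 3;
preferably, the transaminase mutant comprises an amino acid sequence obtained from the amino acid sequence shown in SEQ ID NO: 1 by the substitution mutations N181Y, G216L and T275W;
preferably, the transaminase mutant comprises the amino acid sequence shown in SEQ ID NO: 5;
preferably, the transaminase mutant comprises an amino acid sequence obtained from the amino acid sequence shown in SEQ ID NO: 1 by the substitution mutations K180R, L187Y and G216L;
preferably, the transaminase mutant comprises the amino acid sequence shown in SEQ ID NO: 7;
preferably, the transaminase mutant comprises an amino acid sequence obtained from the amino acid sequence shown in SEQ ID NO: 1 by the substitution mutations P177N, K180R, N181Y, L187Y and T275W;
preferably, the transaminase mutant comprises the amino acid sequence shown in SEQ ID NO: 9;
preferably, the transaminase mutant comprises an amino acid sequence obtained from the amino acid sequence shown in SEQ ID NO: 1 by the substitution mutations P177N, N181Y, L187Y, G216L and T275W;
preferably, the transaminase mutant comprises the amino acid sequence shown in SEQ ID NO: 11;
preferably, the transaminase mutant comprises an amino acid sequence obtained from the amino acid sequence shown in SEQ ID NO: 1 by the substitution mutations P177N, K180R, L187Y, G216R and T275W;
preferably, the transaminase mutant comprises the amino acid sequence shown in SEQ ID NO: 13;
preferably, the transaminase mutant comprises an amino acid sequence obtained from the amino acid sequence shown in SEQ ID NO: 1 by the substitution mutations P177N, K180R, N181Y, L187Y, G216L and T275W;
preferably, the transaminase mutant comprises the amino acid sequence shown in SEQ ID NO: 15;
preferably, the transaminase mutant comprises an amino acid sequence obtained from the amino acid sequence shown in SEQ ID NO: 1 by the substitution mutations A13L, P177N, K180R, N181Y, L187Y, G216W, T275W and K301R;
preferably, the transaminase mutant comprises the amino acid sequence shown in SEQ ID NO: 17;
preferably, the transaminase mutant comprises an amino acid sequence obtained from the amino acid sequence shown in SEQ ID NO: 1 by the substitution mutations A13R, P177N, K180R, N181Y, L187Y, G216R, T275W, T284F and K301F;
preferably, the transaminase mutant comprises the amino acid sequence shown in SEQ ID NO: 19;
preferably, the transaminase mutant comprises an amino acid sequence obtained from the amino acid sequence shown in SEQ ID NO: 1 by the substitution mutations A13L, P177N, K180R, N181Y, L187Y, G216W, T275W, K301R and D312F;
preferably, the transaminase mutant comprises the amino acid sequence shown in SEQ ID NO: 21;
preferably, the transaminase mutant comprises an amino acid sequence obtained from the amino acid sequence shown in SEQ ID NO: 1 by the substitution mutations A13R, N138W, P177N, K180R, N181Y, L187Y, G216R, T275W, T284W, K301R and D312L;
preferably, the transaminase mutant comprises the amino acid sequence shown in SEQ ID NO: 23;
preferably, the transaminase mutant comprises an amino acid sequence obtained from the amino acid sequence shown in SEQ ID NO: 1 by the substitution mutations A13L, N138R, P177N, K180R, N181Y, L187Y, G216R, T275W, T284W, K301R and D312L;
preferably, the transaminase mutant comprises the amino acid sequence shown in SEQ ID NO: 25;
preferably, the transaminase mutant comprises an amino acid sequence obtained from the amino acid sequence shown in SEQ ID NO: 1 by the substitution mutations A13R, N138R, P177N, K180R, N181Y, L187Y, G216W, T275W, T284F, K301R and D312R;
preferably, the transaminase mutant comprises the amino acid sequence shown in SEQ ID NO: 27;
preferably, the transaminase mutant comprises an amino acid sequence obtained from the amino acid sequence shown in SEQ ID NO: 1 by the substitution mutations A13L, N138W, P177N, K180R, N181Y, L187Y, G216W, T275W, T284F, K301F and D312L;
preferably, the transaminase mutant comprises the amino acid sequence shown in SEQ ID NO: 29.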
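4. A nucleic acid molecule, characterized in that the nucleic acid molecule comprises a gene encoding the transaminase mutant according to any one of claims 1 to 3;
preferably, the nucleic acid molecule comprises the nucleic acid sequence shown in SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ ID NO: 20, SEQ ID NO: 22, SEQ ID NO: 24, SEQ ID NO: 26, SEQ ID NO: 28 or SEQ ID NO: 30.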
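5. A recombinant expression vector, characterized in that the recombinant expression vector comprises the nucleic acid molecule according to claim 4.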
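6. A recombinant host cell, characterized in that the recombinant host cell expresses the transaminase mutant according to any one of claims 1 to 3;
preferably, the nucleic acid molecule according to claim 4 is integrated into the genome of the recombinant host cell;
preferably, the recombinant host cell contains the recombinant expression vector according to claim 5.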
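7. A method for preparing the transaminase mutant according to any one of claims 1 to 3, characterized in that the method comprises the following steps:
(1) ligating the nucleic acid molecule according to claim 4 into an expression vector to construct a recombinant expression vector;
(2) transforming competent cells with the recombinant expression vector and performing resistance screening to obtain positive clones;
(3) subjecting the positive clones to resistance screening and induction culture, then collecting and disrupting the cells to obtain a crude enzyme solution of the transaminase mutant;
preferably, the method further comprises a step of purifying the crude enzyme solution to obtain the purified transaminase mutant.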
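8. A composition, characterized in that the composition comprises any one, or a combination of at least two, of the wild-type transaminase shown in SEQ ID NO: 1, the transaminase mutant shown in SEQ ID NO: 3, the transaminase mutant shown in SEQ ID NO: 5, the transaminase mutant shown in SEQ ID NO: 7, the transaminase mutant shown in SEQ ID NO: 9, the transaminase mutant shown in SEQ ID NO: 11, the transaminase mutant shown in SEQ ID NO: 13, the transaminase mutant shown in SEQ ID NO: 15, the transaminase mutant shown in SEQ ID NO: 17, the transaminase mutant shown in SEQ ID NO: 19, the transaminase mutant shown in SEQ ID NO: 21, the transaminase mutant shown in SEQ ID NO: 23, the transaminase mutant shown in SEQ ID NO: 25, the transaminase mutant shown in SEQ ID NO: 27 or the transaminase mutant shown in SEQ ID NO: 29;
preferably, the composition comprises any one of an enzyme powder, an enzyme solution or cells.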
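9. A method for preparing (R)-2-(2,5-difluorophenyl)pyrrolidine, characterized in that the method comprises:
mixing 4-chloro-1-(2,5-difluorophenyl)butan-1-one, the composition according to claim 8, pyridoxal phosphate, isopropylamine and/or isopropylamine hydrochloride, a cosolvent and a buffer, and reacting to obtain (R)-2-(2,5-difluorophenyl)pyrrolidine.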
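10. The method according to claim 9, characterized in that the reaction time is 12 to 48 h;
preferably, the reaction temperature is 25 to 50 °C;
preferably, the concentration of the 4-chloro-1-(2,5-difluorophenyl)butan-1-one is 1 to 100 g/L;
preferably, the concentration of the composition is 1 to 100 g/L;
preferably, the concentration of the pyridoxal phosphate is 0.01 to 1 g/L;
preferably, the concentration of the isopropylamine and/or isopropylamine hydrochloride is 1 to 110 g/L;
preferably, the cosolvent comprises any one, or a combination of at least two, of ethanol, acetonitrile or dimethyl sulfoxide, preferably dimethyl sulfoxide;
preferably, the volume concentration of the dimethyl sulfoxide is 10 to 50% (v/v);
preferably, the buffer comprises a phosphate buffer and/or a triethanolamine-hydrochloric acid buffer;
preferably, the pH of the buffer is 7.0 to 9.0.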
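11. Use of the transaminase mutant according to any one of claims 1 to 3, the nucleic acid molecule according to claim 4, the recombinant expression vector according to claim 5, the recombinant host cell according to claim 6 or the composition according to claim 8 in the preparation of (R)-2-(2,5-difluorophenyl)pyrrolidine.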
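The claims name substitutions in the usual one-letter notation (for example A13L: Ala at position 13 replaced by Leu) and set an identity threshold of at least 80% relative to SEQ ID NO: 1. The Python sketch below illustrates how such notation could be applied to a parent sequence and how a simple ungapped identity could be computed. The function names are hypothetical. The 16-residue parent fragment is also an assumption: SEQ ID NO: 1 is not reproduced in this section, so the fragment is inferred from residues 1 to 16 of SEQ ID NO: 21 with position 13 reverted to Ala, as implied by the A13L notation.

```python
import re

# One-letter substitution notation: original residue, 1-based position, new residue.
MUTATION_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(parent: str, mutations: list[str]) -> str:
    """Apply substitutions such as 'A13L' to a parent sequence (1-based positions)."""
    residues = list(parent)
    for mutation in mutations:
        match = MUTATION_PATTERN.match(mutation)
        if match is None:
            raise ValueError(f"unrecognized mutation: {mutation}")
        old, position, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position out of range: {mutation}")
        if residues[position - 1] != old:
            raise ValueError(f"parent residue mismatch at {mutation}")
        residues[position - 1] = new
    return "".join(residues)


def percent_identity(a: str, b: str) -> float:
    """Ungapped identity for two equal-length sequences (no alignment is performed)."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Assumed fragment: residues 1-16 of the parent, inferred as described above.
parent_fragment = "MASMDKVFAGYAARQA"
mutant_fragment = apply_substitutions(parent_fragment, ["A13L"])

print(mutant_fragment)                                     # MASMDKVFAGYALRQA
print(percent_identity(parent_fragment, mutant_fragment))  # 93.75
```

By the same arithmetic, a full-length mutant carrying 11 substitutions over 325 residues would retain 314/325, or about 96.6%, identity with its parent, above the 80% threshold of claim 1. Sequences of different lengths would need a proper alignment before identity is computed, which this sketch does not attempt.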
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
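CN202010838888.1A CN114075557B (zh) | 2020-08-19 | 2020-08-19 | Recombinant transaminase and use thereof in the synthesis of (R)-2-(2,5-difluorophenyl)pyrrolidine |
CN202311499773.4A CN118006577A (zh) | 2020-08-19 | 2020-08-19 | A transaminase and use thereof in the synthesis of (R)-2-(2,5-difluorophenyl)pyrrolidine |
CN202311499567.3A CN117965480A (zh) | 2020-08-19 | 2020-08-19 | A recombinant transaminase and use thereof in the synthesis of (R)-2-(2,5-difluorophenyl)pyrrolidine |
CN202311500027.2A CN118109434A (zh) | 2020-08-19 | 2020-08-19 | Transaminase and use thereof in the synthesis of (R)-2-(2,5-difluorophenyl)pyrrolidine |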
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010838888.1A CN114075557B (zh) | 2020-08-19 | 2020-08-19 | Recombinant transaminase and use thereof in the synthesis of (R)-2-(2,5-difluorophenyl)pyrrolidine |
Related Child Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311499773.4A Division CN118006577A (zh) | 2020-08-19 | 2020-08-19 | A transaminase and use thereof in the synthesis of (R)-2-(2,5-difluorophenyl)pyrrolidine |
CN202311500027.2A Division CN118109434A (zh) | 2020-08-19 | 2020-08-19 | Transaminase and use thereof in the synthesis of (R)-2-(2,5-difluorophenyl)pyrrolidine |
CN202311499567.3A Division CN117965480A (zh) | 2020-08-19 | 2020-08-19 | A recombinant transaminase and use thereof in the synthesis of (R)-2-(2,5-difluorophenyl)pyrrolidine |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114075557A (zh) | 2022-02-22 |
CN114075557B (zh) | 2023-12-22 |
Family
ID=80282896
Family Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010838888.1A Active CN114075557B (zh) | 2020-08-19 | 2020-08-19 | Recombinant transaminase and use thereof in the synthesis of (R)-2-(2,5-difluorophenyl)pyrrolidine |
CN202311500027.2A Pending CN118109434A (zh) | 2020-08-19 | 2020-08-19 | Transaminase and use thereof in the synthesis of (R)-2-(2,5-difluorophenyl)pyrrolidine |
CN202311499773.4A Pending CN118006577A (zh) | 2020-08-19 | 2020-08-19 | A transaminase and use thereof in the synthesis of (R)-2-(2,5-difluorophenyl)pyrrolidine |
CN202311499567.3A Pending CN117965480A (zh) | 2020-08-19 | 2020-08-19 | A recombinant transaminase and use thereof in the synthesis of (R)-2-(2,5-difluorophenyl)pyrrolidine |
Country Status (1)
Country | Link |
---|---|
CN (4) | CN114075557B (zh) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
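CN106754806A (zh) * | 2016-12-20 | 2017-05-31 | 尚科生物医药(上海)有限公司 | An improved transaminase and its use in the preparation of (R)-3-aminobutanol |
CN111363732A (zh) * | 2020-03-12 | 2020-07-03 | 重庆迪维斯生物科技有限公司 | Transaminase mutant derived from Aspergillus terreus NIH2624 and use thereof |
CN111549011A (zh) * | 2020-06-03 | 2020-08-18 | 重庆迪维斯生物科技有限公司 | Transaminase mutant derived from Aspergillus terreus and use thereof |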
Non-Patent Citations (3)
Title |
---|
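Birren, B., et al.: "conserved hypothetical protein [Aspergillus terreus NIH2624]" * |
罗详冲 et al.: "Research progress on larotrectinib in the treatment of patients with NTRK gene fusion-positive cancer" * |
陈素华 et al.: "Larotrectinib, a new drug for the treatment of adult and pediatric patients with NTRK fusion-positive solid tumors" * |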
Also Published As
Publication number | Publication date |
---|---|
CN118109434A (zh) | 2024-05-31 |
CN114075557B (zh) | 2023-12-22 |
CN117965480A (zh) | 2024-05-03 |
CN118006577A (zh) | 2024-05-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10837036B2 (en) | Method for preparing L-aspartic acid with maleic acid by whole-cell biocatalysis | |
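CN108823179B (zh) | Transaminase derived from actinomycetes, mutant, recombinant strain and use thereof | |
CN108690854B (zh) | Method for producing L-glufosinate by a chemo-enzymatic process | |
WO2017119731A1 (ko) | CO hydratase and method for producing formic acid using the same | |
CN113388592B (zh) | 7β-HSDH enzyme mutant, encoding gene and use thereof | |
CN111690585B (zh) | Recombinant Serratia marcescens with rcsB gene deletion and use thereof | |
CN114075557B (zh) | Recombinant transaminase and use thereof in the synthesis of (R)-2-(2,5-difluorophenyl)pyrrolidine | |
WO2022268006A1 (zh) | Imine reductase mutant, co-expressed imine reductase and glucose dehydrogenase enzyme, and use thereof | |
CN110343728B (zh) | Method for synthesizing hexahydropyridazine-3-carboxylic acid by biotransformation | |
CN114507650B (zh) | Leucine dehydrogenase mutant and use thereof in the synthesis of (S)-o-chlorophenylglycine | |
CN115838697A (zh) | Imine reductase mutant and use thereof in the synthesis of a chiral intermediate of larotrectinib | |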
US20110287488A1 (en) | Novel n-acetylglucosamine-2-epimerase and method for producing cmp-neuraminic acid using the same | |
WO2019123166A1 (en) | Nucleotide sequences encoding 3-quinuclidinone reductase and glucose dehydrogenase and soluble expression thereof | |
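CN115404250A (zh) | Method for preparing (S)-nicotine by reduction | |
CN110804602B (zh) | L-aspartate β-decarboxylase mutant and use thereof | |
CN116064447A (zh) | Use of transaminase and mutants thereof in chiral amine synthesis | |
CN108359666B (zh) | nudC gene and use thereof in the preparation of nicotinic acid | |
CN108323173B (zh) | Method for enzymatic synthesis of a chloramphenicol intermediate | |
CN112725322A (zh) | Aspartase mutant, encoding gene and engineered strain | |
CN112280758B (zh) | Steroid 5β-reductase variant and use thereof | |
CN114806999B (zh) | Genetically engineered strain and use thereof in the preparation of dihydrodaidzein | |
CN116536279B (zh) | Genetically engineered strain and use thereof in the preparation of dehydroepiandrosterone | |
CN113897322B (zh) | Engineered strain for 3-methyl-4-nitrobenzoic acid and preparation method thereof | |
WO2021098506A1 (zh) | Steroid 5β-reductase variant and use thereof | |
CN115838680A (zh) | High-yield D-p-hydroxyphenylglycine strain and use thereof |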
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||