WO2016086015A1 - Myoglobin-based catalysts for carbene transfer reactions - Google Patents

Myoglobin-based catalysts for carbene transfer reactions

Info

Publication number
WO2016086015A1
Authority
WO
WIPO (PCT)
Prior art keywords
myoglobin
substituted
catalyst
aryl
aliphatic
Prior art date
Application number
PCT/US2015/062478
Other languages
French (fr)
Inventor
Rudi Fasan
Original Assignee
University Of Rochester
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University Of Rochester filed Critical University Of Rochester
Publication of WO2016086015A1 publication Critical patent/WO2016086015A1/en

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P: FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P13/00: Preparation of nitrogen-containing organic compounds
    • C12P13/001: Amines; Imines
    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07K: PEPTIDES
    • C07K14/00: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/795: Porphyrin- or corrin-ring-containing peptides
    • C07K14/805: Haemoglobins; Myoglobins
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P: FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P11/00: Preparation of sulfur-containing organic compounds
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P: FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P5/00: Preparation of hydrocarbons or halogenated hydrocarbons
    • C12P5/002: Preparation of hydrocarbons or halogenated hydrocarbons cyclic
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P: FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P9/00: Preparation of organic compounds containing a metal or atom other than H, N, C, O, S or halogen

Definitions

  • the present invention relates to engineered variants of myoglobin and their use as biocatalysts for catalyzing carbene transfer reactions.
  • the invention relates to myoglobin-based catalysts capable of catalyzing olefin cyclopropanation reactions, carbene insertion reactions into N-H, S-H, and Si-H bonds, sigmatropic rearrangement reactions, and/or aldehyde olefination reactions.
  • the invention also relates to methods for carrying out these transformations in vitro and in whole cells comprising providing a carbene acceptor substrate, a carbene donor reagent, and a myoglobin-based catalyst in an isolated form or contained within a host cell.
  • Enzymes and other protein-based biocatalysts constitute an attractive alternative to traditional chemical catalysts due to their ability to operate in aqueous media and under very mild reaction conditions such as ambient temperature and pressure (Bornscheuer, Huisman et al. 2012). These properties, combined with the proteinaceous nature of these catalysts, make them particularly relevant toward the design and implementation of sustainable and environmentally friendly procedures for chemical synthesis and manufacturing (Bornscheuer, Huisman et al. 2012). Notably, no naturally occurring enzymes are known to catalyze the aforementioned carbene transfer reactions in biological systems. Recent studies have shown that cytochrome P450 enzymes (e.g., P450BM3, a.k.a. CYP102A1) can react with diazo compounds and promote reactions such as the cyclopropanation of styrene derivatives (Coelho, Housead et al. 2013; Coelho, Wang et al. 2013; Wang, Renata et al. 2014) and carbene N-H insertion in aniline derivatives in vitro and in vivo (Wang, Peck et al. 2014). See also Coelho et al. US Pat. 8,993,262 B2 and Coelho et al. Pat. Appl. WO2014058729.
  • Drawbacks of P450-based catalysts include their large size (5-115 kDa) and limited stability, in particular at elevated temperatures and in the presence of organic cosolvents.
  • these P450-based catalysts are often characterized by modest catalytic efficiency, limited substrate scope, and/or moderate diastereo- and enantio/stereoselectivity (Coelho, Housead et al. 2013; Coelho, Wang et al. 2013; Wang, Peck et al. 2014; Wang, Renata et al. 2014).
  • An engineered myoglobin catalyst having an improved capability, as compared to the myoglobin of SEQ ID NO: 1, to catalyze a carbene transfer reaction, wherein the engineered myoglobin catalyst comprises an amino acid sequence that is at least 60% identical to SEQ ID NO: 1, 112, 113, 114, 115, and 116.
  • the improved capability of the myoglobin catalyst is an improvement in its catalytic activity, regioselectivity,
  • the myoglobin catalyst comprises an amino acid sequence that is at least 90% identical to SEQ ID NO: 1 and comprises an amino acid substitution at a position selected from the group consisting of position X29, X32, X33, X39, X44, X45, X46, X64, X67, X68, X93, X107, and X111 of SEQ ID NO: 1.
  • the amino acid sequence of the myoglobin catalyst comprises at least one of the features selected from the group consisting of: X29 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X32 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X33 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X39 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X43 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S,
  • the myoglobin catalyst is selected from the group consisting of SEQ ID NOS: 2 through 110.
  • the myoglobin catalyst contains a metal-binding cofactor selected from the group consisting of a heme analog, a metalloporphyrin, and a porphyrin analog.
  • the metal-binding cofactor is selected from the group consisting of mesoporphyrin, protoporphyrin, bisglycolporphyrin, corrole, phthalocyanine, phlorin, chlorin, 5-isocorrole, 10-isocorrole, and porphycene.
  • the metal bound by the metal-binding cofactor is selected from the group consisting of iron, manganese, cobalt, ruthenium, rhodium, and osmium.
  • the amino acid residue that coordinates the metal atom at the axial position of the metal-containing cofactor in the myoglobin catalyst is selected from the group consisting of serine, threonine, cysteine, tyrosine, histidine, aspartic acid, glutamic acid, selenocysteine, para-amino-phenylalanine, meta-amino-phenylalanine, para-mercaptomethyl-phenylalanine, meta-mercaptomethyl-phenylalanine, para-(isocyanomethyl)-phenylalanine, meta-(isocyanomethyl)-phenylalanine, 3-pyridyl-alanine, and 3-methyl-histidine.
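  • The sequence-identity thresholds recited above (at least 60% or at least 90% identical to a reference sequence) can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned to equal length, counts gap columns ("-") as mismatches, and divides by the alignment length; identity conventions differ between alignment tools, so this is only one possible convention and not necessarily the one intended in the claims. The function names and the toy sequences are hypothetical and are not taken from SEQ ID NO: 1.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same residue.

    Assumption: inputs are pre-aligned (equal length, '-' marks a gap).
    Gap columns never count as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold_percent: float) -> bool:
    """True if the pair reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent


# Toy 10-residue sequences (hypothetical, for illustration only).
reference = "ACDEFGHIKL"
variant = "ACDEFGHIKV"  # one substitution at the last position

print(percent_identity(reference, variant))                # 90.0
print(meets_identity_threshold(reference, variant, 90.0))  # True
```

In practice the alignment step would be done with a dedicated tool before a check of this kind is applied.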
  • a method for catalyzing a carbene insertion reaction comprising:
  • R1a and R2a are independently selected from H, halo, cyano (-CN), nitro (-NO2), trifluoromethyl (-CF3), optionally substituted C1-18 alkyl, optionally substituted C6-10 aryl, optionally substituted 5- to 10-membered heteroaryl, -C(O)OR1b, -C(O)N(R1b)(R1c), -C(O)R1b, -Si(R1b)(R1c)(R1d), and -SO2(R1b), where each R1b, R1c, and R1d are independently selected from H, optionally substituted C1-18 alkyl, optionally substituted C6-10 aryl, and optionally substituted 6- to 10-membered heteroaryl. (b) providing a myoglobin-based catalyst;
  • R2 is independently selected from optionally substituted C6-15 aryl, optionally substituted 5- to 15-membered heteroaryl, and optionally substituted C1-18 aliphatic
  • R3 is independently selected from H, halo, cyano, optionally substituted C1-18 aliphatic, optionally substituted C6-10 aryl, optionally substituted 5- to 10-membered heteroaryl, -C(O)OR1b, -C(O)N(R1b)(R1c), and -C(O)R1b, where each R1b and R1c are independently selected from H, optionally substituted C1-18 aliphatic, optionally substituted C6-10 aryl, and optionally substituted 5- to 10-membered heteroaryl;
  • R4 and R5 are independently selected from H, halo, cyano, optionally substituted C1-18 aliphatic, optionally substituted C6-10 aryl, and optionally substituted 5- to
  • R6 is independently selected from optionally substituted C1-18 aliphatic, optionally substituted C6-16 aryl, optionally substituted 5- to 10-membered heteroaryl, optionally substituted C4-C16 cyclic aliphatic, and optionally substituted C4-C16 heterocyclic group;
  • R7 is independently selected from H, optionally substituted C1-18 aliphatic, optionally substituted C6-16 aryl, optionally substituted 5- to 10-membered heteroaryl; or where R6 and R7 are connected to form an optionally substituted C4-C16 cyclic aliphatic or heterocyclic group;
  • R8 is selected from optionally substituted C1-18 aliphatic, optionally substituted C6-16 aryl, optionally substituted 5- to 10-membered heteroaryl, optionally substituted C4-C16 cyclic aliphatic, and optionally substituted C4-C16 heterocyclic group;
  • R9 is independently selected from optionally substituted C1-18 aliphatic, optionally substituted C6-16 aryl, optionally substituted 5- to 10-membered heteroaryl, optionally substituted C4-C16 cyclic aliphatic, and optionally substituted C4-C16 heterocyclic group; R10 and R11 are optionally substituted C1-6 aliphatic groups. (d) contacting the diazo-containing carbene precursor and the carbene acceptor substrate with the myoglobin-based catalyst, optionally in the presence of a reducing agent; and
  • R1a, R2a, R2, R3, R4, R5, R6, R7, R8, R9, R10, and R11 are as defined above.
  • the myoglobin comprises an amino acid sequence that is at least 60% identical to SEQ ID NO: 1, 112, 113, 114, 115, and 116.
  • the myoglobin catalyst comprises an amino acid sequence that is at least 90% identical to SEQ ID NO: 1 and comprises an amino acid substitution at a position selected from the group consisting of position X29, X32, X33, X39, X44, X45, X46, X64, X67, X68, X93, X107, and X111 of SEQ ID NO: 1.
  • the amino acid sequence of the myoglobin catalyst comprises at least one of the features selected from the group consisting of: X29 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X32 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X33 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X39 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X43 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S,
  • the myoglobin catalyst is selected from the group consisting of SEQ ID NOS: 2 through 110.
  • the myoglobin catalyst contains a metal-binding cofactor selected from the group consisting of a heme analog, a metalloporphyrin, and a porphyrin analog.
  • the metal-binding cofactor is selected from the group consisting of mesoporphyrin, protoporphyrin, bisglycolporphyrin, corrole,
  • the metal bound by the metal-binding cofactor is selected from the group consisting of iron, manganese, cobalt, ruthenium, rhodium, and osmium.
  • the amino acid residue that coordinates the metal atom at the axial position of the metal-containing cofactor in the myoglobin catalyst is selected from the group consisting of serine, threonine, cysteine, tyrosine, histidine, aspartic acid, glutamic acid, selenocysteine, para-amino-phenylalanine, meta-amino-phenylalanine, para-mercaptomethyl-phenylalanine, meta-mercaptomethyl-phenylalanine, para-(isocyanomethyl)-phenylalanine, meta-(isocyanomethyl)-phenylalanine, 3-pyridyl-alanine, and 3-methyl-histidine.
  • the myoglobin catalyst is tethered to a solid support.
  • the myoglobin catalyst is contained in a host cell.
  • the host cell is selected from the group consisting of Escherichia coli, Saccharomyces cerevisiae, and Pichia pastoris.
  • the carbene insertion product of formula (III) is selected from the group consisting of:
  • Ar is independently selected from optionally substituted C6-15 aryl and optionally substituted 6- to 15-membered heteroaryl;
  • Alk is an optionally substituted C1-18 aliphatic.
  • the diazo-containing carbene precursor and the carbene acceptor substrate are part of the same molecule.
  • a method for catalyzing a sigmatropic rearrangement reaction comprising:
  • R1a and R2a are independently selected from H, halo, cyano (-CN), nitro (-NO2), trifluoromethyl (-CF3), optionally substituted C1-18 alkyl, optionally substituted C6-10 aryl, optionally substituted 5- to 10-membered heteroaryl, -C(O)OR1b, -C(O)N(R1b)(R1c), -C(O)R1b, -Si(R1b)(R1c)(R1d), and -SO2(R1b), where each R1b, R1c, and R1d are independently selected from H, optionally substituted C1-18 alkyl, optionally substituted C6-10 aryl, and optionally substituted 6- to 10-membered heteroaryl.
  • R12 is selected from optionally substituted C1-18 aliphatic, optionally substituted C6-16 aryl, optionally substituted 5- to 10-membered heteroaryl, and optionally substituted C4-C16 heterocyclic group;
  • R13, R14, and R15 are independently selected from H, optionally substituted C1-6 aliphatic groups, optionally substituted C6-16 aryl, or where R13 and R14 are connected to form an optionally substituted C4-C16 cyclic aliphatic or heterocyclic group;
  • R16 is independently selected from optionally substituted C1-18 aliphatic, optionally substituted C6-16 aryl, optionally substituted 5- to 10-membered heteroaryl, and optionally substituted C4-C16 heterocyclic group;
  • R17 is independently selected from optionally substituted C1-6 aliphatic, optionally substituted C6 aryl, optionally substituted 5- to 6-membered heteroaryl; or where R16 and R17 are connected together to form an optionally substituted C4-C16 cyclic aliphatic or heterocyclic group;
  • R18, R19, and R20 are independently selected from H, optionally substituted C1-6 aliphatic groups, optionally substituted C6-16 aryl, or where R18 and R19 are connected together to form an optionally substituted C4-C16 cyclic aliphatic or heterocyclic group;
  • R1a, R2a, R9, R12, R13, R14, R15, R16, R17, R18, R19, and R20 are as defined above.
  • the myoglobin comprises an amino acid sequence that is at least 60% identical to SEQ ID NO: 1, 112, 113, 114, 115, and 116.
  • the myoglobin catalyst comprises an amino acid sequence that is at least 90% identical to SEQ ID NO: 1 and comprises an amino acid substitution at a position selected from the group consisting of position X29, X32, X33, X39, X44, X45, X46, X64, X67, X68, X93, X107, and X111 of SEQ ID NO: 1.
  • the amino acid sequence of the myoglobin catalyst comprises at least one of the features selected from the group consisting of: X29 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X32 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X33 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X39 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X43 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S,
  • the myoglobin catalyst is selected from the group consisting of SEQ ID NOS: 2 through 110.
  • the myoglobin catalyst contains a metal-binding cofactor selected from the group consisting of a heme analog, a metalloporphyrin, and a porphyrin analog.
  • the metal-binding cofactor is selected from the group consisting of mesoporphyrin, protoporphyrin, bisglycolporphyrin, corrole,
  • the metal bound by the metal-binding cofactor is selected from the group consisting of iron, manganese, cobalt, ruthenium, rhodium, and osmium.
  • the amino acid residue that coordinates the metal atom at the axial position of the metal-containing cofactor in the myoglobin catalyst is selected from the group consisting of serine, threonine, cysteine, tyrosine, histidine, aspartic acid, glutamic acid, selenocysteine, para-amino-phenylalanine, meta-amino-phenylalanine, para-mercaptomethyl-phenylalanine, meta-mercaptomethyl-phenylalanine, para-(isocyanomethyl)-phenylalanine, meta-(isocyanomethyl)-phenylalanine, 3-pyridyl-alanine, and 3-methyl-histidine.
  • the myoglobin catalyst is tethered to a solid support.
  • the myoglobin catalyst is contained in a host cell.
  • the host cell is selected from the group consisting of Escherichia coli, Saccharomyces cerevisiae, and Pichia pastoris.
  • the diazo-containing carbene precursor and the carbene acceptor substrate are part of the same molecule.
  • a method for catalyzing an aldehyde olefination reaction comprising:
  • R1a and R2a are independently selected from H, halo, cyano (-CN), nitro (-NO2), trifluoromethyl (-CF3), optionally substituted C1-18 alkyl, optionally substituted C6-10 aryl, optionally substituted 5- to 10-membered heteroaryl, -C(O)OR1b, -C(O)N(R1b)(R1c), -C(O)R1b, -Si(R1b)(R1c)(R1d), and -SO2(R1b), where each R1b, R1c, and R1d are independently selected from H, optionally substituted C1-18 alkyl, optionally substituted C6-10 aryl, and optionally substituted 6- to 10-membered heteroaryl.
  • R21 is selected from optionally substituted C1-18 aliphatic, optionally substituted C6-16 aryl, optionally substituted 5- to 10-membered heteroaryl, and optionally substituted C4-C16 heterocyclic group;
  • nucleophilic reagent selected from the group consisting of triphenylphosphine, triphenylarsine, and triphenylstibine;
  • the myoglobin comprises an amino acid sequence that is at least 60% identical to SEQ ID NO: 1, 112, 113, 114, 115, and 116.
  • the myoglobin catalyst comprises an amino acid sequence that is at least 90% identical to SEQ ID NO: 1 and comprises an amino acid substitution at a position selected from the group consisting of position X29, X32, X33, X39,
  • the amino acid sequence of the myoglobin catalyst comprises at least one of the features selected from the group consisting of: X29 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X32 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X33 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X39 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X43 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S,
  • the myoglobin catalyst is selected from the group consisting of SEQ ID NOS: 2 through 110.
  • the myoglobin catalyst contains a metal-binding cofactor selected from the group consisting of a heme analog, a metalloporphyrin, and a porphyrin analog.
  • the metal-binding cofactor is selected from the group consisting of mesoporphyrin, protoporphyrin, bisglycolporphyrin, corrole,
  • the metal bound by the metal-binding cofactor is selected from the group consisting of iron, manganese, cobalt, ruthenium, rhodium, and osmium.
  • the amino acid residue that coordinates the metal atom at the axial position of the metal-containing cofactor in the myoglobin catalyst is selected from the group consisting of serine, threonine, cysteine, tyrosine, histidine, aspartic acid, glutamic acid, selenocysteine, para-amino-phenylalanine, meta-amino-phenylalanine, para-mercaptomethyl-phenylalanine, meta-mercaptomethyl-phenylalanine, para-(isocyanomethyl)-phenylalanine, meta-(isocyanomethyl)-phenylalanine, 3-pyridyl-alanine, and 3-methyl-histidine.
  • the myoglobin catalyst is tethered to a solid support.
  • the myoglobin catalyst is contained in a host cell.
  • the host cell is selected from the group consisting of Escherichia coli, Saccharomyces cerevisiae, and Pichia pastoris.
  • the diazo-containing carbene precursor and the aldehyde substrate are part of the same molecule.
  • FIG. 1 Crystal structure of sperm whale myoglobin (pdb 1A6K).
  • the heme cofactor, the heme-coordinating proximal histidine, and various amino acid residues lining the active site ('distal pocket') of the hemoprotein are displayed as stick models.
  • FIG. 2 Carbene transfer reactions catalyzed by the myoglobin catalysts provided herein: (a) olefin cyclopropanation; (b) carbene N-H insertion; (c) carbene S-H insertion; (d) carbene Si-H insertion.
  • FIG. 3 Additional reactions catalyzed by the myoglobin catalysts provided herein.
  • FIG. 4 Activity and selectivity of wild-type sperm whale myoglobin and engineered variants thereof toward cyclopropanation of styrene in the presence of ethyl diazoacetate.
  • FIG. 5 Mechanistic model for myoglobin-catalyzed cyclopropanation of styrene with diazo esters.
  • FIGS. 6A-B Plots of initial rates (v 0 ) for Mb(H64V,V68A)-catalyzed
  • FIG. 7 Optimization of styrene: EDA ratio for Mb(H64V,V68A)-catalyzed reactions. Turnover numbers (TON) for the cyclopropane product and carbene dimerization byproduct (diethyl fumarate + diethyl maleate) are plotted against the different styrene : EDA ratios used in the reaction.
  • FIG. 8 Yields and turnover numbers for Mb(H64V,V68A)-catalyzed styrene cyclopropanation in the presence of EDA at varying reagent and catalyst loadings, using a constant styrene : EDA ratio of 1 : 2 and a reaction time of one hour.
  • FIG. 9 Substrate scope for Mb(H64V,V68A)-catalyzed cyclopropanation.
  • FIGS. 10A-B Representative chiral GC chromatograms corresponding to the products 3a, 3b, 3c, and 3d (a) as authentic racemic standards obtained from the reaction with styrene and EDA in the presence of Rh2(OAc)4 as the catalyst, and (b) as produced from the reaction with Mb(H64V,V68A) as the catalyst.
  • FIG. 11 Catalytic activity of hemin, wild-type sperm whale myoglobin (Mb), and the Mb(H64V,V68A) variant toward promoting the carbene N-H insertion reaction in the presence of aniline and EDA.
  • FIG. 12 Yields and total turnover numbers (TTN) for Mb(H64V,V68A)-catalyzed carbene N-H insertion with various aryl amines. Reaction conditions: 10 mM amine, 10 mM EDA, 10 mM Na2S2O4 with (a) 20 µM (0.2 mol%) and (b) 1 µM (0.01 mol%) hemoprotein.
  • FIG. 13 Total turnovers supported by the different Mb variants for formation of N-H insertion products 32b (N-methyl aniline + EDA) and 35 (aniline + iBDA).
  • FIG. 14 Catalytic turnovers (TON) and enantioselectivity exhibited by representative Mb catalysts for formation of carbene N-H insertion products starting from aniline and various carbene donor reagents.
  • FIG. 15 Catalytic turnovers (TON) and enantioselectivity exhibited by representative Mb catalysts for formation of carbene N-H insertion products starting from various alkyl amines and carbene donor reagents.
  • FIG. 16 Catalytic activity of sperm whale myoglobin (Mb) for the carbene S-H insertion reaction with thiophenol and EDA. Reaction conditions: 400 µL-scale reactions, 12 hours, room temperature, anaerobic conditions.
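  • The catalyst loadings quoted in the reaction conditions (for example, 20 µM hemoprotein with 10 mM amine corresponding to 0.2 mol%) follow from the ratio of catalyst to substrate concentration. The sketch below works through that arithmetic and the related upper bound on turnovers, taking a turnover number as moles of product formed per mole of catalyst. The function names are hypothetical, and the yield passed to `turnover_number` is an illustrative input, not a reported result.

```python
def catalyst_loading_mol_percent(catalyst_uM: float, substrate_mM: float) -> float:
    """Catalyst loading in mol% relative to the substrate (1 mM = 1000 µM)."""
    return 100.0 * catalyst_uM / (substrate_mM * 1000.0)


def max_turnovers(catalyst_uM: float, substrate_mM: float) -> float:
    """Upper bound on turnovers, reached only if all substrate is converted."""
    return (substrate_mM * 1000.0) / catalyst_uM


def turnover_number(yield_fraction: float, catalyst_uM: float,
                    substrate_mM: float) -> float:
    """Turnovers for a given fractional yield (0.0 to 1.0) of product."""
    if not 0.0 <= yield_fraction <= 1.0:
        raise ValueError("yield_fraction must lie between 0 and 1")
    return yield_fraction * max_turnovers(catalyst_uM, substrate_mM)


# Conditions (a) and (b) from the legend: 10 mM amine with 20 µM or 1 µM protein.
for protein_uM in (20.0, 1.0):
    loading = catalyst_loading_mol_percent(protein_uM, 10.0)
    ceiling = max_turnovers(protein_uM, 10.0)
    print(f"{protein_uM} µM protein: {loading:.2f} mol%, at most {ceiling:.0f} turnovers")
```

Under these conditions the 0.2 mol% loading caps the turnover number at 500, whereas the 0.01 mol% loading allows up to 10,000, which is why the lower loading is the one suited to measuring total turnovers.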
  • FIGS. 17A-B Mb-catalyzed S-H reaction.
  • A GC chromatogram corresponding to the Mb-catalyzed S-H insertion reaction with thiophenol and EDA. The peaks corresponding to the S-H insertion product, α-(phenylthio)acetate (53), and the internal standard (ISTD) are labelled. Thiophenol elutes at 2.42 min and is completely consumed in the reaction. Trace amounts of diphenyldisulfide (labeled with *) are observed in the reaction mixture.
  • FIG. 18 Total turnover numbers (TTN) supported by various engineered Mb variants for conversion of thiophenol and EDA to 53. Reaction conditions: 2.5 µM Mb variant, 10 mM PhSH, 20 mM EDA, 10 mM Na2S2O4 in KPi buffer (pH 8.0), 16 h. WT: wild-type.
  • FIG. 19 Yields and total turnover numbers (TTN) for Mb(L29A,H64V)-catalyzed carbene S-H insertion with various aryl mercaptans and α-diazo esters. Reaction conditions: 10 mM thiol, 20 mM EDA, 10 mM Na2S2O4 with (a) 20 µM (0.2 mol%) and (b) 2.5 µM (0.025 mol%) hemoprotein, 16 hours. * Buffer added with 20% (v/v) methanol.
  • FIG. 20 Substrate scope and catalytic activity of Mb(L29A,H64V) toward carbene S-H insertion in the presence of different alkyl mercaptans and α-diazo esters. Reaction conditions: 10 mM thiol, 20 mM diazo ester, 20 µM Mb(L29A,H64V) (0.2 mol%), 10 mM Na2S2O4 in oxygen-free phosphate buffer (pH 8.0), 12 hours. * Total turnover numbers (TTN) were measured using 0.025 mol% protein (2.5 µM).
  • FIG. 21 Enantioselectivity of myoglobin (Mb) and variants thereof for the carbene S-H insertion reaction in the presence of ethyl α-diazopropanoate (52e).
  • Reaction conditions: 400 µL-scale reactions, 20 µM protein, 10 mM Na2S2O4, 12 hours, room temperature, anaerobic conditions.
  • Enantiomeric excess (% e.e.) was determined based on chiral gas chromatography using racemic standards for calibration.
  • FIG. 22 Representative chiral GC chromatograms corresponding to product 71 (a) as authentic racemic standard synthesized using Rh2(OAc)4 as the catalyst, (b) as produced from the reaction with Mb(F43V) (Entry 3, FIG. 21), (c) as produced from the reaction with
  • FIG. 23 Myoglobin-catalyzed conversion of allyl(phenyl)sulfane to the [2,3]-sigmatropic rearrangement product 92 in the presence of EDA.
  • the table describes the catalytic activity (TON) and enantioselectivity of different engineered Mb variants. Reaction conditions: 10 mM thiol, 20 mM diazo reagent, 10 µM Mb catalyst, KPi pH 8.0, room temperature, 12 hours.
  • FIGS. 24-25 Representative [2,3] sigmatropic rearrangement reactions involving different sulfane substrates and carbene donor reagents as catalyzed by the myoglobin catalysts provided herein.
  • S.r.c.: standard reaction conditions (10 mM thiol, 20 mM diazo reagent, 10 µM Mb catalyst, KPi pH 8.0, room temperature, 12 hours).
  • FIGS. 26A-D Metallo-substituted Mb variants. Overlay plot of the electronic absorption spectrum of (A) wild-type Mb, (B) the Mn-containing Mb variant, and (C) Co-containing Mb variant, in oxidized (solid line) and reduced form (dotted line). The Q band regions are enlarged in the inserts. (D) Overlay of the electronic absorption spectra of H93S and H93pAmF variants of sperm whale myoglobin.
  • FIG. 27 Relative yield of Mb(H64V,V68A)-catalyzed styrene cyclopropanation with ethyl 2-diazoacetate in whole-cell systems under anaerobic or aerobic conditions and in the presence or absence of glucose. Reaction conditions: 30 mM styrene, 60 mM EDA,
  • FIGS. 28A-B Whole-cell reactions involving E. coli cells expressing
  • % ee_E: Positive and negative values refer to the formation of the trans-(1S,2S) (3a) and trans-(1R,2R) (3b) stereoisomer, respectively.
  • % ee_Z: Positive and negative values refer to the formation of the cis-(1R,2S) (3d) and cis-(1S,2R) (3c) stereoisomer, respectively.
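  • The signed enantiomeric excess convention described above can be sketched as a short calculation from the integrated chiral GC peak areas of the two enantiomers. This is a minimal illustration that assumes both enantiomers give the same detector response, so that peak areas are proportional to amounts. The function name and the peak areas in the example are hypothetical and do not come from the chromatograms in the figures.

```python
def signed_ee_percent(area_positive: float, area_negative: float) -> float:
    """Signed enantiomeric excess in percent.

    area_positive: peak area of the enantiomer assigned a positive sign
                   (e.g., trans-(1S,2S) in the convention above).
    area_negative: peak area of the opposite enantiomer
                   (e.g., trans-(1R,2R)).
    """
    if area_positive < 0 or area_negative < 0:
        raise ValueError("peak areas must be non-negative")
    total = area_positive + area_negative
    if total == 0:
        raise ValueError("at least one peak area must be non-zero")
    return 100.0 * (area_positive - area_negative) / total


# Hypothetical peak areas (arbitrary units), for illustration only.
print(signed_ee_percent(99.5, 0.5))   # 99.0  -> strongly favors the 'positive' enantiomer
print(signed_ee_percent(25.0, 75.0))  # -50.0 -> favors the 'negative' enantiomer
print(signed_ee_percent(50.0, 50.0))  # 0.0   -> racemic
```

A racemic standard, such as one prepared with an achiral catalyst, should return a value close to zero and so serves as a check on peak assignment and integration.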
  • FIG. 30 Catalytic activity of hemin and wild-type sperm whale myoglobin (Mb) in the olefination of benzaldehyde with ethyl α-diazoacetate (EDA). Reaction conditions: 10 mM 111a, 10 mM 112a, 20 µM catalyst, 10 mM Na2S2O4, and 10 mM Y in oxygen-free phosphate buffer (pH 8.0) for 12 hours at room temperature.
  • EDA: ethyl α-diazoacetate
  • FIG. 31 Catalytic activity and selectivity of myoglobin variants in benzaldehyde olefination with EDA. Reaction conditions: same as described in legend of FIG. 30.
  • FIG. 32 Catalytic activity and selectivity of Mb(F43V,V68F) variants in
  • FIG. 33 Substrate scope for Mb(F43V,V68F)-catalyzed aldehyde olefination.
  • Reaction conditions: 10 mM aryl aldehyde, 1 µM Mb(F43V,V68F), 10 mM cyclohexyl α-diazoacetate (112d), 10 mM AsPh3, 10 mM Na2S2O4.
  • Myoglobin is a small (about 150 amino acid residues), oxygen-binding hemoprotein found in the muscle tissue of vertebrates. The physiological role of myoglobin is to bind molecular oxygen (O2) with high affinity, providing a reservoir and source of oxygen to support the aerobic metabolism of muscle tissue.
  • Myoglobin contains a heme group (iron-protoporphyrin IX) which is coordinated at the proximal site via the imidazolyl group of a conserved histidine residue (e.g., His93 in sperm whale myoglobin).
  • a distal histidine residue (e.g., His64 in sperm whale myoglobin) is present on the distal face of the heme ring, playing a role in favoring binding of O2 to the heme iron center.
  • Myoglobin belongs to the globin superfamily of proteins and consists of multiple (typically eight) alpha helical segments connected by loops. In biological systems, myoglobin does not exert any catalytic function.
  • engineered variants of sperm whale myoglobin can provide robust, efficient, and selective biocatalysts for promoting a variety of carbene-mediated reactions of high synthetic utility.
  • engineered variants of sperm whale myoglobin can react with diazo-containing reagents and catalyze a variety of synthetically valuable reactions which include alkene cyclopropanation, carbene insertion into an N-H, S-H, or Si-H bond, the [2,3]-sigmatropic rearrangements of thioether and tertiary amine substrates, and aldehyde olefination.
  • the inventor has discovered methods, involving sperm whale myoglobin and engineered variants thereof, to catalyze a variety of other reactions, including carbene S-H insertion, carbene Si-H insertion, sigmatropic rearrangements of thioether/tertiary amine substrates, and aldehyde olefination, for which no natural or engineered biocatalysts have been reported.
  • the myoglobin-based catalysts provided herein constitute valuable and efficient catalysts for the synthesis of a variety of organic molecules, including cyclopropanes, amines, ethers, thioethers, silanes, and olefins.
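  • Variant names used throughout this document, such as Mb(H64V,V68A), list each substitution as wild-type residue, position, and replacement residue (H64V replaces the histidine at position 64 with valine). The sketch below parses such names and applies the substitutions to a sequence, checking that the stated wild-type residue is actually present at each position. It handles only single-letter substitutions (not labels such as H93pAmF), the function names are hypothetical, and the five-residue sequence in the example is a toy string, not SEQ ID NO: 1.

```python
import re
from typing import List, Tuple

# (wild-type residue, 1-based position, replacement residue)
Substitution = Tuple[str, int, str]


def parse_variant(name: str) -> List[Substitution]:
    """Parse a name like 'Mb(H64V,V68A)' into a list of substitutions."""
    outer = re.fullmatch(r"Mb\(([^()]*)\)", name.strip())
    if outer is None:
        raise ValueError(f"not a variant name of the form Mb(...): {name!r}")
    substitutions = []
    for token in outer.group(1).split(","):
        inner = re.fullmatch(r"([A-Z])(\d+)([A-Z])", token.strip())
        if inner is None:
            raise ValueError(f"unsupported substitution: {token!r}")
        substitutions.append((inner.group(1), int(inner.group(2)), inner.group(3)))
    return substitutions


def apply_substitutions(sequence: str, substitutions: List[Substitution]) -> str:
    """Return a copy of `sequence` with the substitutions applied (1-based positions)."""
    residues = list(sequence)
    for wild_type, position, replacement in substitutions:
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"expected {wild_type} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


print(parse_variant("Mb(H64V,V68A)"))  # [('H', 64, 'V'), ('V', 68, 'A')]

# Toy sequence (hypothetical): H at position 3, V at position 5.
print(apply_substitutions("GLHEV", [("H", 3, "V"), ("V", 5, "A")]))  # GLVEA
```

The wild-type check matters because position numbering depends on the reference sequence; a mismatch usually signals that the numbering is offset from the one the variant name assumes.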
  • FIG. 2 shows carbene transfer reactions catalyzed by the myoglobin catalysts provided herein: (a) olefin cyclopropanation; (b) carbene N-H insertion; (c) carbene S-H insertion; (d) carbene Si-H insertion.
  • FIG. 3 shows additional reactions catalyzed by the myoglobin catalysts provided herein: (a) [2,3] sigmatropic rearrangement of allylic thioethers; (b) [2,3] sigmatropic rearrangement of propargylic thioethers; (c) [2,3] sigmatropic rearrangement of allylic amines; (d) [2,3] sigmatropic rearrangement of propargylic amines; (e) aldehyde olefination.
  • a method for catalyzing an alkene cyclopropanation reaction to produce a product having two new C-C bonds comprising:
  • a method for catalyzing a carbene N-H insertion reaction to produce a product having a new C-N bond comprising: (a) providing an N-H containing substrate, a diazo-containing reagent as carbene precursor, and an engineered myoglobin variant as the catalyst;
  • a method for catalyzing a carbene S-H insertion reaction to produce a product having a new C-S bond comprising:
  • a method for catalyzing a carbene Si-H insertion reaction to produce a product having a new C-Si bond comprising:
  • a method for catalyzing a [2,3] sigmatropic rearrangement reaction to produce a product having a new C-S bond comprising:
  • a method for catalyzing a [2,3] sigmatropic rearrangement reaction to produce a product having a new C-N bond comprising:
  • Cytochrome P450s and engineered variants thereof have been reported to catalyze a carbene N-H insertion reaction with aniline derivatives and EDA (Wang, Peck et al. 2014).
  • these P450-based biocatalysts exhibit only modest catalytic efficiencies ( ⁇ 500 TON) and no enantioselectivity (e.g., in the reaction of aniline with EDP) in these reactions.
  • they have limited substrate scope, exhibiting no reactivity in the presence of alkyl amine substrates such as benzyl amine or morpholine (Wang, Peck et al. 2014).
  • sperm whale myoglobin can catalyze these carbene N— H insertion reactions with much greater efficiency (up to 7,000 TON with aniline and EDA).
  • these myoglobin-derived biocatalysts can react with alkyl amines (e.g., benzyl amine, cyclohexyl amine, morpholine) and are capable of catalyzing carbene N-H insertion reactions in a stereoselective manner (e.g., 50% e.e. with benzyl amine and EDP), thus exhibiting a broader scope and reactivity.
  • sperm whale myoglobin and engineered variants thereof can catalyze a number of other chemical transformations for which no natural or engineered biocatalysts have been reported to date. These transformations include carbene S— H insertion, carbene Si— H insertion reactions, [2,3] sigmatropic rearrangement reactions of thioether and tertiary amine substrates, and aldehyde olefination reactions.
  • these myoglobin-catalyzed reactions can be performed, if desired, in aqueous solvents in the presence of large amounts (e.g., up to 40%) of an organic cosolvent (e.g., acetonitrile, tetrahydrofuran, ethanol, dimethylformamide) and/or elevated temperatures (e.g., up to 60-70°C).
  • these myoglobin-catalyzed reactions can be performed, if desired, in the presence of high substrate loadings (e.g., 0.1-0.3 M substrate and diazo-containing reagents).
  • the possibility to conduct reactions in the presence of high substrate loadings is convenient toward minimizing the volume of reaction and solvent waste associated with it.
  • these myoglobin-catalyzed reactions can be carried out in whole-cell systems, that is, employing cells expressing the myoglobin catalyst instead of purified protein. This capability is important toward eliminating the costs and time associated with the purification of the protein and thus toward optimizing the cost- and time-effectiveness of the biocatalytic process.
  • "aliphatic" or "aliphatic group" as used herein means a straight or branched C1-15 hydrocarbon chain that is completely saturated or that contains one or more units of unsaturation, or a monocyclic C3-8 hydrocarbon, or bicyclic C8-12 hydrocarbon that is completely saturated or that contains one or more units of unsaturation, but which is not aromatic (also referred to herein as "cycloalkyl").
  • suitable aliphatic groups include, but are not limited to, linear or branched alkyl, alkenyl, alkynyl groups or hybrids thereof such as
  • alkyl, alkenyl, or alkynyl group may be linear, branched, or cyclic and may contain up to 15, up to 8, or up to 5 carbon atoms.
  • alkyl groups include methyl, ethyl, propyl, cyclopropyl, butyl, cyclobutyl, pentyl, or cyclopentyl groups.
  • alkenyl groups include propenyl, butenyl, or pentenyl groups.
  • alkynyl groups include propynyl, butynyl, or pentynyl groups.
  • aryl and aryl group refers to an aromatic substituent containing a single aromatic or multiple aromatic rings that are fused together, directly linked, or indirectly linked (such as linked through a methylene or an ethylene moiety).
  • An aryl group may contain from 5 to 24 carbon atoms, 5 to 18 carbon atoms, or 5 to 14 carbon atoms.
  • heteroatom means nitrogen, oxygen, or sulphur, and includes any oxidized forms of nitrogen and sulfur, and the quaternized form of any basic nitrogen.
  • Heteroatom further includes Se, Si, and P.
  • heteroaryl refers to an aryl group in which at least one carbon atom is replaced with a heteroatom.
  • a heteroaryl group is a 5- to 18-membered, a 5- to 14-membered, or a 5- to 10-membered aromatic ring system containing at least one heteroatom selected from the group consisting of oxygen, sulphur, and nitrogen atoms.
  • heteroaryl groups include pyridyl, pyrrolyl, furyl, thienyl, indolyl, isoindolyl, indolizinyl, imidazolyl, pyridonyl, pyrimidyl, pyrazinyl, oxazolyl, thiazolyl, purinyl, quinolinyl, isoquinolinyl, benzofuranyl, and benzoxazolyl groups.
  • a heterocyclic group may be any monocyclic or polycyclic ring system which contains at least one heteroatom and may be unsaturated or partially or fully saturated.
  • heterocyclic thus includes heteroaryl groups as defined above as well as non-aromatic heterocyclic groups.
  • a heterocyclic group is a 3- to 18- membered, a 3- to 14-membered, or a 3- to 10-membered, ring system containing at least one heteroatom selected from the group consisting of oxygen, sulphur, and nitrogen atoms.
  • heterocyclic groups include the specific heteroaryl groups listed above as well as pyranyl, piperidinyl, pyrrolidinyl, dioxanyl, piperazinyl, morpholinyl, thiomorpholinyl, morpholinosulfonyl, tetrahydroisoquinolinyl, and tetrahydrofuranyl groups.
  • a halogen atom may be a fluorine, chlorine, bromine, or iodine atom.
  • substituents refers to a contiguous group of atoms.
  • examples of substituents include, without limitation: alkoxy, aryloxy, alkyl, heteroatom-containing alkyl, alkenyl, heteroatom-containing alkenyl, alkynyl, heteroatom-containing alkynyl, aryl, heteroatom-containing aryl, alkoxy, heteroatom-containing alkoxy, aryloxy, heteroatom-containing aryloxy, halo, hydroxyl (—OH), sulfhydryl (—SH), substituted sulfhydryl, carbonyl (—CO—), thiocarbonyl (—CS—), carboxy (—COOH), amino (—NH2), substituted amino, nitro (—NO2), nitroso (—NO), sulfo (—SO2—OH), cyano (—C≡N), cyanato (—
  • substituents include, without limitation, halogen atoms, hydroxyl (—OH), sulfhydryl (—SH), substituted sulfhydryl, carbonyl (—CO—), carboxy (—COOH), amino (—NH2), nitro (—NO2), sulfo (—SO2—OH), cyano (—C≡N), thiocyanato (—S—C≡N), phosphono (—P(O)OH2), alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, aryl, heteroaryl, heterocyclic, alkylthiol, alkyloxy, alkylamino, arylthiol, aryloxy, or arylamino groups.
  • optionally substituted modifies a series of groups separated by commas (e.g., “optionally substituted A, B, or C”; or “A, B, or C optionally substituted with”), it is intended that each of the groups (e.g., A, B, or C) is optionally substituted.
  • heteroatom-containing aliphatic refers to an aliphatic moiety where at least one carbon atom is replaced with a heteroatom, e.g., oxygen, nitrogen, sulphur, selenium, phosphorus, or silicon, and typically oxygen, nitrogen, or sulphur.
  • alkyl and alkyl group refer to a linear, branched, or cyclic saturated hydrocarbon typically containing 1 to 24 carbon atoms, 1 to 18 carbon atoms or 1 to 12 carbon atoms, such as methyl, ethyl, n-propyl, isopropyl, n-butyl, isobutyl, t-butyl, octyl, decyl and the like.
  • heteroatom-containing alkyl refers to an alkyl moiety where at least one carbon atom is replaced with a heteroatom, e.g., oxygen, nitrogen, sulphur, phosphorus, or silicon, and typically oxygen, nitrogen, or sulphur.
  • alkenyl and alkenyl group refer to a linear, branched, or cyclic hydrocarbon group of 2 to 24 carbon atoms, 2 to 18 carbon atoms, or 2 to 12 carbon atoms, containing at least one double bond, such as ethenyl, n-propenyl, isopropenyl, n-butenyl, isobutenyl, octenyl, decenyl, and the like.
  • heteroatom-containing alkenyl refers to an alkenyl moiety where at least one carbon atom is replaced with a heteroatom.
  • alkynyl and alkynyl group refer to a linear, branched, or cyclic hydrocarbon group of 2 to 24 carbon atoms, 2 to 18 carbon atoms, or 2 to 12 carbon atoms, containing at least one triple bond, such as ethynyl, n-propynyl, and the like.
  • heteroatom-containing alkynyl refers to an alkynyl moiety where at least one carbon atom is replaced with a heteroatom.
  • heteroatom-containing aryl refers to an aryl moiety where at least one carbon atom is replaced with a heteroatom.
  • alkoxy and alkoxy group refer to an aliphatic group or a heteroatom-containing aliphatic group bound through a single, terminal ether linkage.
  • aryloxy and aryloxy group refer to an aryl group or a heteroatom-containing aryl group bound through a single, terminal ether linkage.
  • the term "contact” as used herein with reference to interactions of chemical units indicates that the chemical units are at a distance that allows short range non-covalent interactions (such as Van der Waals forces, hydrogen bonding, hydrophobic interactions, electrostatic interactions, dipole-dipole interactions) to dominate the interaction of the chemical units. For example, when a protein is 'contacted' with a chemical species, the protein is allowed to interact with the chemical species so that a reaction between the protein and the chemical species can occur.
  • polypeptide and “protein” as used herein refers to any chain of two or more amino acids bonded in sequence, regardless of length or post-translational modification. According to their common use in the art, the term “protein” refers to any polypeptide consisting of more than 50 amino acid residues. These definitions are however not intended to be limiting.
  • metal protein refers to a protein that contains one or more metal ions.
  • the metal ion(s) confers the protein with catalytic activity (e.g., iron atom in cytochrome P450 enzymes) or other properties such as that of binding other molecules (e.g., iron atom in myoglobin and hemoglobin).
  • hemoprotein or "heme-containing protein” refers to a protein containing a heme group (iron-protoporphyrin IX).
  • enzyme refers to a protein capable of catalyzing a reaction as part of its native biological function.
  • a cytochrome P450 monoxygenase whose native function is typically that of catalyzing an oxygenation reaction (e.g., hydroxylation)
  • Myoglobin whose native function is that of binding and releasing oxygen, is a hemoprotein but not an enzyme (or heme enzyme).
  • "carbene equivalent" or "carbene precursor" refers to a molecule that can be decomposed in the presence of a transition metal catalyst or a metalloprotein catalyst to a structure that contains at least one divalent carbon with only 6 valence shell electrons and that can be transferred to carbon-carbon double bonds to form cyclopropanes or to carbon—hydrogen or heteroatom—hydrogen bonds to form products with new C—C or C—heteroatom bonds.
  • Diazo-containing reagents can serve as carbene precursor molecules for the carbene transfer reactions encompassed in this disclosure.
  • Non-limiting examples of diazo-containing reagents are α-diazo-esters, α-diazo-amides, α-diazo-ketones, α-cyano-α-diazo-esters, α-nitro-α-diazo-esters, and α-keto-α-diazo-esters.
  • carbene transfer refers to a chemical transformation where a carbene equivalent is added to a carbon-carbon double bond, a carbon-heteroatom double bond or inserted into carbon-hydrogen or heteroatom— hydrogen bond.
  • carbene acceptor substrate refers to any compound that can be made to react with a carbene precursor reagent in a myoglobin-catalyzed reaction according to the methods provided herein, thereby forming a product carrying one or more new C—C, C—N, C—S, and/or C—Si bond(s).
  • Representative examples of carbene acceptor substrate are compounds of formula (II), (IV), (VI), (VIII), (X), (XI), (XIV) or (XV) as defined below.
  • heme refers to iron-protoporphyrin IX.
  • heme analog and "metalloporphyrin” as used herein refer to any metal- containing porphyrin molecule other than iron-protoporphyrin IX.
  • heme analogs include but are not limited to iron-deuteroporphyrin, iron-mesoporphyrin, iron-protoporphyrin, iron-bisglycolporphyrin, etc.
  • These porphyrin molecules may contain metals other than Fe, including but not limited to Mn, Co, Ni, Cu, Rh, Ru, and Os.
  • metal ion e.g., Fe, Mn, Co, Rh, Ru, or Os
  • Examples of porphyrin analogs include but are not limited to corroles,
  • anaerobic, when used in reference to a reaction, culture or growth condition, refers to a condition in which the concentration of oxygen is less than about 25 μM, less than about 5 μM, or less than 1 μM.
  • the term is also intended to include sealed chambers of liquid or solid medium maintained with an atmosphere of less than about 1% oxygen.
  • anaerobic conditions are achieved by sparging a reaction mixture with an inert gas such as nitrogen or argon.
  • heterologous indicates molecules that are expressed in an organism other than the organism from which they originated or are found in nature, independently of the level of expression that can be lower, equal or higher than the level of expression of the molecule in the native microorganism.
  • homolog refers to distinct enzymes or genes of a second family or species which are determined by functional, structural or genomic analyses to be an enzyme or gene of the second family or species which corresponds to the original enzyme or gene of the first family or species. Homologs most often have functional, structural, or genomic similarities. Techniques are known by which homologs of an enzyme or gene can readily be cloned using genetic probes and PCR. Identity of cloned sequences as homolog can be confirmed using functional assays and/or by genomic mapping of the genes.
  • A protein has "homology" or is "homologous" to a second protein if the amino acid sequence encoded by a gene has a similar amino acid sequence to that of the second gene.
  • a protein has homology to a second protein if the two proteins have "similar" amino acid sequences.
  • the term “homologous proteins” is intended to mean that the two proteins have similar amino acid sequences.
  • the homology between two proteins is indicative of its shared ancestry, related by evolution.
  • analogs and “analogous” include nucleic acid or protein sequences or protein structures that are related to one another in function only and are not from common descent or do not share a common ancestral sequence. Analogs may differ in sequence but may share a similar structure, due to convergent evolution.
  • mutant or “variant” as used herein with reference to a molecule such as polynucleotide or polypeptide, indicates that such molecule has been mutated from the molecule as it exists in nature.
  • mutate and “mutation” as used herein indicates any modification of a nucleic acid and/or polypeptide which results in an altered nucleic acid or polypeptide.
  • Mutations include any process or mechanism resulting in a mutant protein, enzyme, polynucleotide, or gene.
  • a mutation can occur in a polynucleotide or gene sequence, by point mutations, deletions, or insertions of single or multiple nucleotide residues.
  • a mutation in a polynucleotide includes mutations arising within a protein-encoding region of a gene as well as mutations in regions outside of a protein-encoding sequence, such as, but not limited to, regulatory or promoter sequences.
  • a mutation in a coding polynucleotide such as a gene can be "silent", i.e., not reflected in an amino acid alteration upon expression, leading to a "sequence-conservative" variant of the gene.
  • a mutation in a polypeptide includes but is not limited to mutation in the polypeptide sequence and mutation resulting in a modified amino acid.
  • Non-limiting examples of a modified amino acid include a glycosylated amino acid, a sulfated amino acid, a prenylated (e.g., farnesylated, geranylgeranylated) amino acid, an acetylated amino acid, an acylated amino acid, a PEGylated amino acid, a biotinylated amino acid, a carboxylated amino acid, a phosphorylated amino acid, and the like.
  • engineer refers to any manipulation of a molecule that results in a detectable change in the molecule, wherein the manipulation includes but is not limited to inserting a polynucleotide and/or polypeptide heterologous to the cell and mutating a polynucleotide and/or polypeptide native to the cell.
  • "myoglobin-based catalyst" or simply "myoglobin catalyst" as used herein refer to any polypeptide which shares at least 60% sequence identity to SEQ ID NO: 1 and exhibits carbene transfer reactivity within the scope of the disclosed compositions and methods.
  • Myoglobin catalysts also comprise engineered variants of sperm whale myoglobin (SEQ ID NO: 1), in which the naturally occurring heme cofactor is substituted for a heme analog, a metalloporphyrin (e.g., Co- or Mn-protoporphyrin IX), or a metalloporphyrin analog.
  • Myoglobin catalysts further comprise polypeptides that share at least 60% sequence identity to SEQ. ID NOS: 112, 113, 114, 115, or 116, or engineered variants thereof.
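The sequence-identity thresholds above (e.g., at least 60% identity to SEQ ID NO: 1) can be illustrated with a minimal sketch. The text does not state which alignment program or identity definition is used, so everything here is an assumption: the two sequences are taken as already aligned (gaps written as "-"), identity is counted over all alignment columns, and the function names and toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over a pre-computed pairwise alignment.

    Assumption: identity = identical residue columns / total alignment
    columns (columns where both sequences carry a gap are ignored).
    Other conventions (e.g., dividing by the shorter sequence) exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Drop columns that are gaps in both sequences
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("alignment contains no residues")
    # A column matches only if both residues are identical and not a gap
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 60.0) -> bool:
    """True if the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy, hypothetical sequences (not taken from the sequence listing)
print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
print(meets_identity_threshold("ACDE-", "ACDEF"))
```

In practice the alignment itself would come from a dedicated tool; the sketch only shows how a threshold such as 60%, 80% or 90% would be applied once an alignment is available.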
  • nucleic acid molecule refers to any chain of two or more nucleotides bonded in sequence.
  • a nucleic acid molecule can be a DNA or a RNA.
  • a common type of vector is a "plasmid", which generally is a self-contained molecule of double-stranded DNA that can readily accept additional (foreign) DNA and which can readily be introduced into a suitable host cell.
  • vectors including plasmid and fungal vectors, have been described for replication and/or expression in a variety of eukaryotic and prokaryotic hosts.
  • Non-limiting examples include pKK plasmids (Clontech), pUC plasmids, pET plasmids (Novagen, Inc., Madison, Wis.), pRSET or pREP plasmids (Invitrogen, San Diego, Calif.), or pMAL plasmids (New England Biolabs, Beverly, Mass.), and many appropriate host cells, using methods disclosed or cited herein or otherwise known to those skilled in the relevant art.
  • the terms “express” and “expression” refer to allowing or causing the information in a gene or DNA sequence to become manifest, for example producing a protein by activating the cellular functions involved in transcription and translation of a corresponding gene or DNA sequence.
  • a DNA sequence is expressed in or by a cell to form an "expression product" such as a protein.
  • the expression product itself, e.g., the resulting protein, may also be said to be "expressed" by the cell.
  • a polynucleotide or polypeptide is expressed recombinantly, for example, when it is expressed or produced in a foreign host cell under the control of a foreign or native promoter, or in a native host cell under the control of a foreign promoter.
  • fused means being connected through one or more covalent bonds.
  • bound means being connected through non-covalent interactions. Examples of non-covalent interactions are van der Waals, hydrogen bond, electrostatic, and hydrophobic interactions.
  • tethered as used herein means being connected through covalent or non-covalent interactions.
  • a polypeptide tethered to a solid support refers to a polypeptide that is connected to a solid support (e.g., surface, resin bead) either via non-covalent interactions or through covalent bonds.
  • Myoglobin catalysts are provided that are capable of promoting carbene transfer reactions with high efficiency and/or selectivity and across a broader range of substrates.
  • Myoglobin catalysts are provided having the capability to catalyze a carbene transfer reaction, wherein the myoglobin catalyst comprises an amino acid sequence having at least 60%, 80% or 90% sequence identity to SEQ ID NOS: 1, 112, 113, 114, 115, or 116.
  • the capability to catalyze a carbene transfer reaction corresponds to the capability of the myoglobin catalyst to react with a diazo-containing reagent and catalyze a carbene addition to an alkene group of an alkene-containing molecule.
  • such capability corresponds to the capability of the myoglobin catalyst to react with a diazo-containing reagent and catalyze a carbene insertion into the N— H bond of an N— H bond containing molecule.
  • such capability corresponds to the capability of the myoglobin catalyst to react with a diazo-containing reagent and catalyze a carbene insertion into the S— H bond of an S— H bond containing molecule.
  • such capability corresponds to the capability of the myoglobin catalyst to react with a diazo-containing reagent and catalyze a carbene insertion into the Si— H bond of a Si— H bond containing molecule. In other embodiments, such capability corresponds to the capability of the myoglobin catalyst to react with a diazo-containing reagent and catalyze [2,3] sigmatropic rearrangement in the presence of a thioether substrate to give a molecule with a new C— S bond.
  • such capability corresponds to the capability of the myoglobin catalyst to react with a diazo-containing reagent and catalyze [2,3] sigmatropic rearrangement in the presence of a tertiary amine substrate to give a molecule with a new C— N bond.
  • Myoglobin catalysts are provided that are capable of catalyzing the aforementioned reactions, and which have an improved property compared with a reference myoglobin such as wild-type sperm whale myoglobin (SEQ ID NO: 1), or when compared to another hemoprotein such as CYP102A1 (P450BM3) from Bacillus megaterium (SEQ ID NO: 111).
  • the polypeptides can be described in reference to the amino acid sequence of a naturally occurring myoglobin or another engineered myoglobin variant.
  • the amino acid residue is determined in the myoglobin polypeptide beginning from the first amino acid after the initial methionine (M) residue (i.e., the first amino acid after the initial methionine M represents residue position 1).
  • the initiating methionine residue may be removed by biological processing machinery such as in a host cell or in vitro translation system, to generate a mature protein lacking the initiating methionine residue.
  • the amino acid residue position at which a particular amino acid or amino acid change is present is sometimes described herein as "Xn", or "position n", where n refers to the residue position.
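The numbering convention described above, in which position 1 is the first amino acid after the initial methionine, can be sketched as a small lookup helper. This is only an illustration: the function name, the `has_initial_met` flag, and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
def residue_at(sequence: str, n: int, has_initial_met: bool = True) -> str:
    """Return the residue at position Xn.

    Position 1 is the first amino acid after the initiating methionine.
    If the sequence string still contains that methionine, Xn sits at
    0-based index n; if the methionine has been removed (mature protein),
    Xn sits at 0-based index n - 1.
    """
    if n < 1:
        raise ValueError("position n must be >= 1")
    if has_initial_met and not sequence.startswith("M"):
        raise ValueError("sequence was declared to start with Met but does not")
    index = n if has_initial_met else n - 1
    if index >= len(sequence):
        raise ValueError("position n lies beyond the end of the sequence")
    return sequence[index]


# Toy sequence for illustration only
print(residue_at("MACDEFGH", 1))                         # with initiating Met
print(residue_at("ACDEFGH", 1, has_initial_met=False))   # mature form
```

Both calls refer to the same residue, which is the point of the convention: position labels do not shift when the initiating methionine is processed away.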
  • the myoglobin catalysts provided herein are characterized by an improved property as compared to the wild-type sperm whale myoglobin (SEQ ID NO: 1) or another reference hemoprotein (e.g., SEQ ID NO: 111). Changes to such properties can include, among others, improvements in catalytic efficiency, number of catalytic turnovers supported by the biocatalyst, regioselectivity, diastereoselectivity, enantioselectivity and/or reduced substrate or product inhibition.
  • the altered properties are based on engineered myoglobin polypeptides having residue differences at specific residue positions as compared to wild-type sperm whale myoglobin (SEQ ID NO: 1)
  • the myoglobin catalyst is an engineered variant of sperm whale myoglobin (SEQ ID NO: 1), the variant comprising an amino acid change at one or more of the following positions of SEQ ID NO: 1: X29, X32, X33, X39, X44, X45, X46, X64, X67, X68, X93, X107, and X111.
  • the myoglobin catalysts can have additionally one or more residue differences at residue positions not specified by an X above as compared to the sequence SEQ ID NO: 1.
  • the differences can be 1-2, 1-5, 1-10, 1-20, 1-30, 1-40, 1-50, 1-75, or 1-90 residue differences at other amino acid residue positions not defined by X above.
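As a rough illustration of how a variant's residue differences could be sorted into those at the positions listed above and those at other positions, the sketch below parses mutation labels. The label format (original residue, position, new residue, e.g. "H64V") is a common shorthand assumed here for illustration; the function names and the example labels are hypothetical inputs, not data from the disclosure.

```python
import re

# Positions listed in the text as X29 ... X111
LISTED_POSITIONS = {29, 32, 33, 39, 44, 45, 46, 64, 67, 68, 93, 107, 111}

# Assumed label format: one-letter residue, position, one-letter residue
_MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(label: str) -> tuple[str, int, str]:
    """Split a label such as 'H64V' into (original, position, replacement)."""
    match = _MUTATION.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognized mutation label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def split_by_listed_positions(labels: list[str]) -> tuple[list[str], list[str]]:
    """Separate mutations at listed Xn positions from those elsewhere."""
    listed, other = [], []
    for label in labels:
        _, position, _ = parse_mutation(label)
        (listed if position in LISTED_POSITIONS else other).append(label)
    return listed, other


# Hypothetical example input
print(split_by_listed_positions(["H64V", "V68A", "K96R"]))
```

The second list corresponds to the "residue differences at other positions" that the text allows in addition to changes at the listed positions.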
  • the myoglobin catalysts having one or more of the improved enzyme properties described herein can comprise an amino acid sequence that is at least 60%, 70%, 80%, 85%, 90%, 95%, 99% or more identical to the sequence SEQ ID NO: 1.
  • the improved myoglobin catalyst can comprise an amino acid sequence that is at least 60%, 70%, 80%, 85%, 90%, 95%, 99% or more identical to a sequence corresponding to SEQ ID NO: 112, 113, 114, 115, or 116.
  • the improved myoglobin catalyst comprises an amino acid sequence corresponding to a sequence selected from the group consisting of SEQ ID NOS: 2 - 110.
  • the improved property of the myoglobin catalyst is with respect to its catalytic activity, regioselectivity, diastereoselectivity, and/or enantioselectivity.
  • the improvement in catalytic activity can be manifested by an increase in the number of catalytic turnovers (TON) supported by the myoglobin catalyst for the carbene transfer reaction, as compared to wild-type sperm whale myoglobin (SEQ ID NO: 1), or other reference sequences (e.g., SEQ ID NO: 111).
  • the myoglobin catalysts are capable of supporting a number of catalytic turnovers (TON) that is at least 1.1-fold, 2-fold, 5-fold, 10-fold, 100-fold, 200-fold, 500-fold, or more higher than the number of catalytic turnovers supported by the polypeptide having sequence SEQ ID NO: 1.
  • the improvement in catalytic activity can be also manifested by an increase in the catalytic efficiency for the carbene transfer reaction, this catalytic efficiency being
  • the myoglobin catalysts exhibit a catalytic efficiency that is at least 1.1-fold, 2-fold, 5-fold, 10-fold, 100-fold, 200-fold, 500-fold, or more higher than the catalytic efficiency of the polypeptide with sequence SEQ ID NO: 1.
  • the myoglobin catalysts having improved catalytic activity toward alkene cyclopropanation, toward carbene Y— H insertion, where Y is S, N, or Si, toward [2,3] sigmatropic rearrangement of a thioether or tertiary amine substrate, and/or toward aldehyde olefination comprise an amino acid sequence selected from the group consisting of SEQ ID NO: 2 - 110.
  • the degree of diastereoselectivity can be conventionally described in terms of diastereomeric excess (d.e.).
  • the improvement in diastereoselectivity exhibited by the myoglobin catalyst is with respect to producing the (E) diastereomer of the cyclopropanation product (i.e., diastereomer in which the configuration of the cyclopropane ring is trans or (E)). In some embodiments, such improvement in
  • the myoglobin catalysts are capable of cyclopropanating an alkene-containing substrate with a (Z)- or (E)-diastereoselectivity (i.e., diastereomeric excess) that is at least 1%, 2%, 5%, 10%, 20%, 30%, 40%, 50%, 75%, 90%, 98%, 99% or more higher than that exhibited by the wild-type parental sequence SEQ ID NO: 1, or the reference sequence SEQ ID NO: 111.
  • the degree of stereoselectivity can be conventionally described in terms of stereomeric excess, that is in terms of enantiomeric excess (e.e.) or diastereomeric excess (d.e.) depending on the nature of the substrate.
  • the improvement in enantioselectivity exhibited by the myoglobin catalyst is with respect to producing the (1S,2S) stereoisomer of the cyclopropanation product. In some embodiments, such improvement in stereoselectivity is with respect to producing the (1R,2R), (1S,2R), or (1R,2S) stereoisomer of the cyclopropanation product.
  • the myoglobin catalysts are capable of cyclopropanating an alkene-containing substrate with a stereoselectivity (i.e., stereomeric excess) that is at least 1%, 2%, 5%, 10%, 20%, 30%, 40%, 50%, 75%, 90%, 98%, 99% or more higher than that exhibited by the wild-type parental sequence SEQ ID NO: 1, or the reference sequence SEQ ID NO: 111.
  • the improvement in enantioselectivity can be manifested by an increase in the enantioselectivity by which a Y— H bond, where Y is S, N, or Si, is functionalized via a carbene insertion reaction by action of the myoglobin catalyst, as compared to the wild-type parental sequence SEQ ID NO: 1.
  • the degree of stereoselectivity can be conventionally described in terms of stereomeric excess, that is in terms of enantiomeric excess (e.e.) or diastereomeric excess (d.e.) depending on the nature of the substrate.
  • the improvement in enantioselectivity exhibited by myoglobin catalyst is with respect to producing the (S) stereoisomer of the carbene insertion product.
  • such improvement in stereoselectivity is with respect to producing (R) stereoisomer of the carbene insertion product.
  • the engineered myoglobin catalysts are capable of catalyzing a carbene Y— H insertion reaction, where Y is S, N, or Si, with a stereoselectivity (i.e., stereomeric excess) that is at least 1%, 2%, 5%, 10%, 20%, 30%, 40%, 50%, 75%, 90%, 98%, 99% or more higher than that exhibited by the wild-type parental sequence SEQ ID NO: 1, or the reference sequence SEQ ID NO: 111.
  • the improvement in enantioselectivity can be manifested by an increase in the enantioselectivity by which a [2,3] sigmatropic rearrangement of thioether or amine substrate is catalyzed by action of the myoglobin catalyst, as compared to the wild-type parental sequence SEQ ID NO: 1.
  • the degree of stereoselectivity can be conventionally described in terms of stereomeric excess, that is in terms of enantiomeric excess (e.e.) or diastereomeric excess (d.e.) depending on the nature of the substrate.
  • the improvement in enantioselectivity exhibited by the myoglobin catalyst is with respect to producing the (S) stereoisomer of the rearrangement product. In some embodiments, such improvement in stereoselectivity is with respect to producing (R) stereoisomer of the rearrangement product.
  • the myoglobin catalysts are capable of catalyzing the [2,3] sigmatropic rearrangement of thioether or amine substrate, with a stereoselectivity (i.e., stereomeric excess) that is at least 1%, 2%, 5%, 10%, 20%, 30%, 40%, 50%, 75%, 90%, 98%, 99% or more higher than that exhibited by the wild-type parental sequence SEQ ID NO: 1, or the reference sequence SEQ ID NO: 111.
  • the myoglobin catalysts having improved catalytic activity toward alkene cyclopropanation, and/or toward carbene Y—H insertion (where Y is S, N, or Si), and/or toward [2,3] sigmatropic rearrangement of a thioether or tertiary amine substrate comprise an amino acid sequence corresponding to SEQ ID NO: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, or 13.
  • the degree of diastereoselectivity can be conventionally described in terms of diastereomeric excess (d.e.).
  • the improvement in diastereoselectivity exhibited by the myoglobin catalyst is with respect to producing the (E) diastereomer of the aldehyde olefination product (i.e., diastereomer in which the configuration of the alkene is trans or (E)). In some embodiments, such improvement in diastereoselectivity is with respect to producing the (Z) diastereomer of the aldehyde olefination product.
  • the myoglobin catalysts are capable of olefinating an aldehyde substrate with a (Z)- or (E)-diastereoselectivity (i.e., diastereomeric excess) that is at least 1%, 2%, 5%, 10%, 20%, 30%, 40%, 50%, 75%, 90%, 98%, 99% or more higher than that exhibited by the wild-type parental sequence SEQ ID NO: 1, or the reference sequence SEQ ID NO: 111.
  • the capability of the myoglobin catalysts to catalyze any of the aforementioned carbene transfer reactions can be established according to methods well known in the art. Most typically, such capability can be established by contacting the substrate with the myoglobin catalyst under suitable reaction conditions in which the myoglobin catalyst is functional (e.g., under reducing and anaerobic conditions), and then determining the formation of the desired product (e.g., cyclopropanation, carbene Y—H insertion, rearrangement, or aldehyde olefination product) by standard analytical methods such as, for example, thin-layer chromatography, HPLC, GC, LC-MS, and/or GC-MS.
  • Such catalytic activity of the myoglobin catalysts can be measured and expressed in terms of number of catalytic turnovers, product formation rate, catalytic efficiency (kcat/KM ratio), and the like.
  • substrate activity can be measured and expressed in terms of turnover numbers (TON) or total turnover numbers (TTN), the latter corresponding to the total number of catalytic turnovers supported by the myoglobin catalyst in the presence of a given carbene acceptor substrate (e.g., styrene or aniline) and carbene donor (e.g., ethyl diazoacetate, ethyl α-diazopropanoate).
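As a hedged illustration of the turnover metric described above, the following minimal sketch computes a turnover number as moles of product formed per mole of catalyst. The function name and the sample amounts are hypothetical and are not taken from the disclosure; the sketch assumes that evaluating the same ratio after the catalyst has stopped turning over gives the TTN.

```python
def turnover_number(product_mol: float, catalyst_mol: float) -> float:
    """Return catalytic turnovers: moles of product per mole of catalyst."""
    if catalyst_mol <= 0:
        raise ValueError("catalyst amount must be positive")
    return product_mol / catalyst_mol


# Hypothetical example: 2.0 umol of product formed by 4.0 nmol of catalyst.
example_ton = turnover_number(product_mol=2.0e-6, catalyst_mol=4.0e-9)
print(f"TON = {example_ton:.0f}")
```

Both amounts must be expressed in the same unit (here, moles) so that the ratio is dimensionless.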
  • the diastereo- and stereoselectivity of the myoglobin catalysts for any of the aforementioned carbene transfer reactions can be measured by determining the relative distribution of stereoisomeric products generated by the reaction using conventional analytical methods such as, for example, (chiral) normal phase liquid chromatography, (chiral) reverse-phase liquid chromatography, or (chiral) gas chromatography.
  • the improved myoglobin catalysts comprise deletions of the myoglobin catalyst provided herein. Accordingly, for each embodiment of the myoglobin catalysts provided herein, the deletions can comprise 1, 2, 5, 10, 30, or more amino acids, as long as the functional activity and/or improved properties of the myoglobin catalyst are maintained.
  • the myoglobin catalysts are fused to a polypeptide that can serve as an affinity tag in order to facilitate the isolation and purification of the myoglobin polypeptide.
  • affinity tags include but are not limited to a polyhistidine affinity tag, a FLAG tag, and a glutathione-S-transferase tag.
  • the myoglobin catalysts can comprise one or more non-natural amino acids in their primary sequence.
  • the non-natural amino acid can be present at one or more of the positions defined by "Xn" above for the purpose of modulating the catalytic or selectivity properties of the myoglobin catalyst.
  • the non-natural amino acid can be introduced in another position of the myoglobin catalyst sequence for the purpose, for example, of linking the myoglobin catalyst to another protein, another biomolecule, or a solid support.
  • Several methods are known in the art for introducing an unnatural amino acid into a polypeptide. These include the use of amber stop codon suppression methods using engineered tRNA/aminoacyl-tRNA synthetase (AARS) pairs such as those derived from Methanocaldococcus sp. and Methanosarcina sp. (Liu and Schultz 2010).
  • natural or engineered frameshift suppressor tRNAs and their cognate aminoacyl-tRNA synthetases can also be used for the same purpose (Rodriguez, Lester et al. 2006; Neumann, Wang et al. 2010).
  • an unnatural amino acid can be incorporated in a polypeptide using chemically (Dedkova, Fahmi et al. 2003) or enzymatically (Bessho, Hodgson et al.
  • non-natural amino acids include but are not limited to, para-amino-phenylalanine, para-acetyl-phenylalanine, meta-acetyl-phenylalanine, para-mercaptomethyl-phenylalanine, 3-pyridyl-alanine, 3-methyl-histidine, para-butyl-1,3-dione-phenylalanine, O-allyl-tyrosine, O-propargyl-tyrosine, para-azido-phenylalanine, para-borono-phenylalanine, para-bromo-phenylalanine, para-iodo-phenylalanine, 3-iodo-tyrosine, para-benzoyl-phenylalanine, ε-N-allyloxycarbonyl-lysine, ε-N-propargyloxycarbonyl-lysine
  • the myoglobin catalyst comprises an amino acid sequence that is at least 90% identical to SEQ ID NO: 1 and comprises at least one of the features selected from the group consisting of:
  • X29 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y;
  • X32 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y;
  • X33 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y;
  • X39 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y;
  • X43 is A, R, N, D,
  • the amino acid residue that coordinates the iron atom at the axial position of the heme cofactor in the myoglobin catalyst is a naturally occurring amino acid selected from the group consisting of serine, threonine, cysteine, tyrosine, histidine, aspartic acid, glutamic acid, and selenocysteine.
  • this non-naturally occurring α-amino acid is para-amino-phenylalanine, meta-amino-phenylalanine, para-mercaptomethyl-phenylalanine, meta-mercaptomethyl-phenylalanine, para-(isocyanomethyl)-phenylalanine, meta-(isocyanomethyl)-phenylalanine, 3-pyridyl-alanine, or 3-methyl-histidine.
  • the heme cofactor in the myoglobin catalyst is substituted for a heme analog, a metalloporphyrin, or a metalloporphyrin analog.
  • the heme cofactor in the myoglobin catalyst is substituted for a heme analog selected from the group consisting of iron-mesoporphyrin, iron-protoporphyrin, or iron-bisglycolporphyrin.
  • the heme cofactor in the myoglobin catalyst is substituted for a Mn-, Co-, Ru-, Rh-, or Os-porphyrin.
  • the heme cofactor in the myoglobin catalyst is substituted for a metalloporphyrin analog selected from the group consisting of corrole, phthalocyanine, phlorin, chlorin, 5-isocorrole, 10-isocorrole, and porphycene derivatives.
  • cofactor-substituted myoglobin catalysts can be prepared according to methods known in the art, which include, for example, removal of the heme cofactor from the myoglobin polypeptide followed by refolding of the apoprotein in the presence of the heme analog, metalloporphyrin, or porphyrin analog (Yonetani and Asakura 1969; Yonetani, Yamamoto et al.
  • these cofactor-substituted myoglobin catalysts can be obtained via recombinant expression of the myoglobin polypeptide in bacterial strains that are capable of taking up the heme analog or another metalloporphyrin from the culture medium (Woodward, Martin et al. 2007; Bordeaux, Singh et al. 2014).
  • the amino acid residue that coordinates the metal atom at the axial position of the protein-bound cofactor (e.g., heme, heme analog, metalloporphyrin, or metalloporphyrin analog) in the myoglobin catalyst is a naturally occurring amino acid selected from the group consisting of serine, threonine, cysteine, tyrosine, histidine, aspartic acid, glutamic acid, and selenocysteine.
  • this non-naturally occurring α-amino acid is para-amino-phenylalanine, meta-amino-phenylalanine, para-mercaptomethyl-phenylalanine, meta-mercaptomethyl-phenylalanine, para-(isocyanomethyl)-phenylalanine, meta-(isocyanomethyl)-phenylalanine, 3-pyridyl-alanine, or 3-methyl-histidine.
  • kits may contain an individual myoglobin catalyst or a plurality of myoglobin catalysts.
  • the myoglobin catalysts contained in the kit may be in lyophilized form, in solution, or tethered to a solid support.
  • the kits can further include reagents for carrying out the myoglobin-catalyzed reactions, substrates for assessing the activity of the myoglobin catalysts, and reagents for detecting the products.
  • the kits can also include instructions for the use of the kits.
  • the myoglobin catalysts described herein can be covalently or non-covalently linked to a solid support for the purpose, for example, of screening the myoglobin catalysts for activity on a range of different substrates or for facilitating the separation of reactants and products from the myoglobin catalyst after the reactions.
  • solid supports include but are not limited to, organic polymers such as polystyrene, polyacrylamide, polyethylene, polypropylene, polyethylene glycol, and the like, and inorganic materials such as glass, silica, controlled pore glass, and metals.
  • the configuration of the solid support can be in the form of beads, spheres, particles, gel, a membrane, or a surface.
  • polynucleotide molecules are provided that encode for the myoglobin polypeptides disclosed herein.
  • the polynucleotides may be linked to one or more regulatory sequences controlling the expression of the myoglobin polypeptide-encoding gene to form a recombinant polynucleotide capable of expressing the polypeptide.
  • codons are selected to fit the host cell in which the polypeptide is being expressed.
  • codons used in bacteria are used to express the polypeptide in a bacterial host.
  • the polynucleotide molecule comprises a nucleotide sequence encoding for a myoglobin polypeptide with an amino acid sequence that is at least 60%, 70%, 80%, 85%, 90%, 95%, 99% or more identical to SEQ ID NO: 1.
  • the polynucleotide molecule comprises a nucleotide sequence encoding for a myoglobin polypeptide with an amino acid sequence that is at least 60%, 70%, 80%, 85%, 90%, 95%, 99% or more identical to SEQ ID NO: 112, 113, 114, 115, or 116.
  • the polynucleotide molecule encoding for the myoglobin polypeptide is comprised in a recombinant expression vector.
  • Suitable recombinant expression vectors include but are not limited to, chromosomal, nonchromosomal and synthetic DNA sequences, e.g., derivatives of SV40; bacterial plasmids; phage DNA; baculovirus; yeast plasmids; vectors derived from combinations of plasmids and phage DNA; and viral DNA such as vaccinia, adenovirus, fowl pox virus, pseudorabies, adeno-associated viruses, retroviruses and many others. Any vector that transduces genetic material into a cell and, if replication is desired, is replicable and viable in the relevant host can be used.
  • expression vectors and expression hosts are known in the art, and many of these are commercially available.
  • a person skilled in the art will be able to select suitable expression vectors for a particular application, e.g., the type of expression host (e.g., in vitro systems, prokaryotic cells such as bacterial cells, and eukaryotic cells such as yeast, insect, or mammalian cells) and the expression conditions selected.
  • an expression host system comprising a polynucleotide molecule encoding for the myoglobin polypeptides disclosed herein.
  • Expression host systems that may be used include any systems that support the transcription, translation, and/or replication of a polynucleotide molecule provided herein.
  • the expression host system is a cell.
  • Host cells for use in expressing the polypeptides encoded by the expression vector disclosed herein are well known in the art and include but are not limited to, bacterial cells (e.g., Escherichia coli, Streptomyces); fungal cells such as yeast cells (e.g., Saccharomyces cerevisiae, Pichia pastoris); insect cells; plant cells; and animal cells.
  • the expression host systems also include lysates of prokaryotic cells (e.g., bacterial cells) and lysates of eukaryotic cells (e.g., yeast, insect, or mammalian cells). These systems also include in vitro transcription/translation systems, many of which are commercially available.
  • the choice of the expression vector and host system depends on the type of application intended for the methods provided herein and a person skilled in the art will be able to select a suitable expression host based on known features and application of the different expression hosts.
  • the engineered myoglobin polypeptides can be prepared via mutagenesis of the polynucleotide encoding for the naturally occurring sperm whale myoglobin (SEQ ID NO: 1) or for an engineered variant thereof.
  • the engineered myoglobin polypeptides can be prepared via mutagenesis of the polynucleotide encoding for the naturally occurring myoglobins corresponding to SEQ ID NO: 112, 113, 114, 115, or 116, or an engineered variant thereof.
  • mutagenesis methods include, but are not limited to, site-directed mutagenesis, site-saturation mutagenesis, random mutagenesis, cassette-mutagenesis, DNA shuffling, homologous recombination, non-homologous recombination, site-directed recombination, and the like.
  • Detailed descriptions of art-known mutagenesis methods can be found, among other sources, in U.S. Pat. No. 5,605,793; U.S. Pat. No. 5,830,721; U.S. Pat. No. 5,834,252; WO 95/22625; WO 96/33207; WO 97/20078; WO 97/35966; WO
  • oligonucleotide primers having a predetermined or randomized sequence can be prepared chemically by solid phase synthesis using commercially available equipment and reagents. Polynucleotide molecules can then be synthesized and amplified using a polymerase chain reaction, digested via endonucleases, ligated together, and cloned into a vector according to standard molecular biology protocols known in the art (e.g., Sambrook et al. Molecular Cloning: A Laboratory Manual (Third Edition), Cold Spring Harbor Press, 2001).
  • Engineered myoglobin polypeptides expressed in a host expression system can be isolated and purified using any one or more of the well-known techniques for protein purification, including, among others, cell lysis via sonication or chemical treatment, filtration, salting-out, and chromatography (e.g., ion-exchange chromatography, gel-filtration chromatography, etc.).
  • the recombinant myoglobin polypeptides obtained from mutagenesis of a parental myoglobin sequence can be screened for identifying engineered myoglobin polypeptides having improved catalytic and/or selectivity properties, such as improvements with respect to their catalytic activity, regioselectivity, diastereoselectivity and/or enantioselectivity for any of the
  • a method for catalyzing an alkene cyclopropanation reaction to produce a product having two new C-C bonds comprising:
  • R1a and R2a are independently selected from H, halo, cyano (-CN), nitro (-NO2), trifluoromethyl (-CF3), optionally substituted C1-18 alkyl, optionally substituted C6-10 aryl, optionally substituted 5- to 10-membered heteroaryl, -
  • R1b, R1c, and R1d are independently selected from H, optionally substituted C1-18 alkyl, optionally substituted C6-10 aryl, and optionally substituted 6- to 10-membered heteroaryl.
  • R2 is independently selected from optionally substituted C6-15 aryl, optionally substituted 5- to 15-membered heteroaryl, and optionally substituted C1-18 aliphatic
  • R3 is independently selected from H, optionally substituted C1-18 aliphatic, optionally substituted C6-10 aryl, optionally substituted 5- to 10-membered heteroaryl, -C(O)OR1b, -C(O)N(R1b)(R1c), and -C(O)R1b, where each R1b and R1c are independently selected from H, optionally substituted C1-18 aliphatic, optionally substituted C6-10 aryl, and optionally substituted 5- to 10-membered heteroaryl;
  • R4 and R5 are independently selected from H, halo, cyano, optionally substituted C1-18 aliphatic, optionally substituted C6-10 aryl, and optionally substituted 5- to 10-membered heteroaryl.
  • R1a, R2a, R2, R3, R4 and R5 are as defined above.
  • a method for catalyzing a carbene N-H insertion reaction to produce a product having a new C-N bond comprising:
  • R1a and R2a are as defined above.
  • R6 is independently selected from optionally substituted C1-18 aliphatic, optionally substituted C6-16 aryl, optionally substituted 5- to 10-membered heteroaryl, optionally substituted C4-C16 cyclic aliphatic, and optionally substituted C4-C16 heterocyclic group;
  • R7 is independently selected from H, optionally substituted C1-18 aliphatic, optionally substituted C6-16 aryl, optionally substituted 5- to 10-membered heteroaryl; or where R6 and R7 are connected to form an optionally substituted C4-C16 cyclic aliphatic or heterocyclic group.
  • R1a, R2a, R6, and R7 are as defined above.
  • a method for catalyzing a carbene S-H insertion reaction to produce a product having a new C-S bond comprising:
  • R1a and R2a are as defined above,
  • R8 is selected from optionally substituted C1-18 aliphatic, optionally substituted C6-16 aryl, optionally substituted 5- to 10-membered heteroaryl, optionally substituted C4-C16 cyclic aliphatic, and optionally substituted C4-C16 heterocyclic group.
  • R1a, R2a, and R8 are as defined above.
  • a method for catalyzing a carbene Si-H insertion reaction to produce a product having a new C-Si bond comprising:
  • R1a and R2a are as defined above.
  • R9 is independently selected from optionally substituted C1-18 aliphatic, optionally substituted C6-16 aryl, optionally substituted 5- to 10-membered heteroaryl, optionally substituted C4-C16 cyclic aliphatic, and optionally substituted C4-C16 heterocyclic group; R10 and R11 are optionally substituted C1-6 aliphatic groups, (c) providing an engineered myoglobin variant as the catalyst; (d) contacting the S-H containing substrate and the diazo-containing reagent with the engineered myoglobin variant under appropriate reaction conditions; and
  • R1a, R2a, R9, R10, and R11 are as defined above.
  • a method for catalyzing a sulfur ylide [2,3] sigmatropic rearrangement to produce a product having a new C-S bond comprising:
  • R1a and R2a are as defined above,
  • R1a, R2a, R12, R13, R14, and R15 are as defined above.
  • a method for catalyzing a nitrogen ylide [2,3] sigmatropic rearrangement to produce a product having a new C-N bond comprising:
  • R1a and R2a are as defined above,
  • R16 is independently selected from optionally substituted C1-18 aliphatic, optionally substituted C6-16 aryl, optionally substituted 5- to 10-membered heteroaryl, and optionally substituted C4-C16 heterocyclic group
  • R17 is independently selected from optionally substituted C1-6 aliphatic, optionally substituted C6 aryl, optionally substituted 5- to 6-membered heteroaryl; or where R16 and R17 are connected together to form an optionally substituted C4-C16 cyclic aliphatic or heterocyclic group
  • R18, R19, and R20 are independently selected from H, optionally substituted C1-6 aliphatic groups, optionally substituted C6-16 aryl, or where R18 and R19 are connected together to form an optionally substituted C4-C16 cyclic aliphatic or heterocyclic group.
  • R1a, R2a, R16, R17, R18, R19, and R20 are as defined above.
  • R1a and R2a are as defined above.
  • nucleophilic reagent selected from the group consisting of triarylphosphine, triarylarsine, and triarylstibine;
  • the diazo-containing carbene precursor in the methods described above is selected from the group consisting of ethyl 2-diazo-acetate, tert-butyl 2-diazo-acetate, ethyl 2-diazo-2-phenylacetate, ethyl 2-diazo-propanoate, tert-butyl 2-diazo-propanoate, ethyl 2-diazo-3,3,3-trifluoropropanoate, ethyl 2-cyano-2-diazoacetate, ethyl 2-diazo-2-nitroacetate, diazomethane, diazo(nitro)methane, 2-diazoacetonitrile,
  • the myoglobin catalyst used in the methods described above comprises a polypeptide with an amino acid sequence that is at least 60%, 70%, 80%, 85%, 90%, 95%, 99% or more identical to a sequence selected from the group consisting of SEQ ID NO: 1 - 110.
  • the methods provided herein include forming reaction mixtures that contain the myoglobin catalyst, the carbene donor reagent (e.g., an alkyl α-diazoester), the carbene acceptor substrate (e.g., alkene-containing substrate for the cyclopropanation reaction), and other additives (e.g., a reducing agent).
  • the myoglobin polypeptides may be added to the reaction mixture in the form of purified proteins, whole cells containing the myoglobin polypeptide, and/or cell extracts and/or lysates of such cells.
  • reactions are conducted under conditions sufficient to catalyze the formation of the desired products.
  • the reaction time and concentration of the myoglobin polypeptide in the reaction mixture can vary widely, in large part depending on the catalytic rate and efficiency of the myoglobin catalyst. Typically, reaction times range from 10 min to 24 hours. For example, the reaction time can be 30 min or 12 hours.
  • the amount of the myoglobin catalyst in the reaction mixture is also variable. Typically, the reaction mixtures contain between 0.001 mol% and 20 mol% myoglobin catalyst with respect to the diazo-containing reagent and/or the carbene acceptor substrate.
  • the reaction mixtures contain between 0.01 mol% and 2 mol% myoglobin catalyst with respect to the diazo-containing reagent and/or the carbene acceptor substrate.
  • the concentration of the diazo-containing reagent and carbene acceptor substrate in the reaction mixtures can also vary. In an embodiment, the concentration of these compounds in the reaction mixture is between 100 μM and 2 M. In another embodiment, the concentration of these compounds in the reaction mixture is between 1 mM and 500 mM.
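The loading and concentration ranges above can be related with a short calculation. The sketch below is illustrative only: the function names and the sample values (10 mM substrate, 0.2 mol% catalyst) are hypothetical choices that merely fall inside the stated ranges. It converts a mol% loading into a catalyst concentration and gives the upper bound on turnovers reached if the reference reagent is fully converted.

```python
def catalyst_concentration_um(substrate_mm: float, loading_mol_percent: float) -> float:
    """Catalyst concentration (micromolar) for a loading given relative to the substrate."""
    if substrate_mm <= 0 or loading_mol_percent <= 0:
        raise ValueError("concentration and loading must be positive")
    # mM -> uM is a factor of 1000; mol% -> fraction is a factor of 1/100.
    return substrate_mm * 1000.0 * loading_mol_percent / 100.0


def max_turnovers(loading_mol_percent: float) -> float:
    """Turnovers per catalyst molecule at complete conversion of the reference reagent."""
    if loading_mol_percent <= 0:
        raise ValueError("loading must be positive")
    return 100.0 / loading_mol_percent


# Hypothetical reaction: 10 mM substrate with 0.2 mol% catalyst.
print(catalyst_concentration_um(10.0, 0.2), "uM catalyst")
print(max_turnovers(0.2), "turnovers at full conversion")
```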
  • the myoglobin-catalyzed reactions are carried out in a buffered aqueous solution.
  • buffering agents that can be used include sodium phosphate, sodium acetate, 2-amino-2-hydroxymethyl-propane-1,3-diol (TRIS), 3-morpholinopropane-1-sulfonic acid (MOPS), 2-[4-(2-hydroxyethyl)piperazin-1-yl]ethanesulfonic acid (HEPES), and 2-(N-morpholino)ethanesulfonic acid (MES).
  • additives can be present in these solutions, which include salts (e.g., NaCl, KCl, CaCl2), detergents (e.g., sodium dodecylsulfate and Triton-X 100), chelators (e.g., 2-({2-[Bis(carboxymethyl)amino]ethyl}(carboxymethyl)amino)acetic acid (EDTA) and ethylene glycol-bis(2-aminoethylether)-N,N,N',N'-tetraacetic acid (EGTA)).
  • organic cosolvents such as, for example, methanol, ethanol, dimethylsulfoxide (DMSO), acetonitrile, dimethylformamide (DMF), and tetrahydrofuran (THF).
  • Buffers, cosolvents, salts, detergents, and chelators can be used at any suitable concentration, which can be readily determined by a person skilled in the art.
  • Cosolvents, in particular, can be included in the reaction mixtures in amounts ranging from about 1% v/v to about 70% v/v, or higher.
  • the myoglobin catalysts provided herein maintain carbene transfer reactivity in the context of the reactions described herein in the presence of a concentration of DMF, THF, acetonitrile, methanol, or ethanol in buffer as high as 50% v/v, or higher.
  • most proteins and heme-containing enzymes (e.g., P450) undergo denaturation under these conditions.
  • the reactions can be conducted at any suitable temperature which is compatible with the catalytic function of the myoglobin polypeptides within the scope of the disclosed compositions and methods. Typically, the reactions are conducted at a temperature ranging from about 2°C to about 70°C. The reactions can be conducted, for example, at about 25 °C or about 50°C. The reactions can be conducted at any suitable pH which is compatible with the catalytic function of the myoglobin polypeptides within the scope of the disclosed compositions and methods. In general, the reactions are conducted at a pH ranging from about 6 to about 10. The reactions can be conducted, for example, at a pH of 6, 7, 8, or 9.
  • the reduced form of the myoglobin catalyst (e.g., ferrous form vs. ferric form for heme-containing myoglobin catalysts)
  • the reactions, in particular in vitro reactions, are conducted in the presence of a reducing agent.
  • the reducing agent is sodium dithionite (Na2S2O4).
  • reducing agents include, but are not limited to, ascorbic acid, enzymatic redox systems comprising a myoglobin reductase enzyme and the cognate reduced nicotinamide adenine dinucleotide cofactor (NADPH or NADH), and non-enzymatic redox systems comprising a reduced nicotinamide adenine dinucleotide cofactor (NADPH or NADH) or an NADH mimic (Paul, Arends et al.
  • the concentration of the ultimate reducing agent in the reaction mixtures can vary, ranging from substoichiometric amounts (e.g., 0.2, 0.5, 0.8 equiv.) to stoichiometric (1 equiv.) and overstoichiometric amounts (e.g., 2, 5, 10, 100 equiv.) with respect to the myoglobin catalyst.
  • the myoglobin catalyst can be reduced or maintained in the reduced ferrous form electrochemically by means of an electrode.
  • the reactions are conducted under anaerobic conditions.
  • Anaerobic conditions can be achieved by conducting the reactions under an inert atmosphere, such as a nitrogen atmosphere or argon atmosphere, and using solvents from which molecular oxygen has been removed via degassing.
  • the myoglobin-catalyzed reactions are allowed to proceed until a substantial amount of the substrate is transformed into the product.
  • Product formation or substrate consumption can be monitored using standard analytical methods such as, for example, thin-layer chromatography, GC, HPLC, or LC-MS.
  • the methods provided herein can be assessed in terms of diastereoselectivity and/or enantioselectivity, that is the extent to which the reaction produces a particular isomer, be it a diastereomer or enantiomer.
  • a perfectly selective reaction produces a single isomer, such that the isomer constitutes 100% of the product.
  • a reaction producing a particular enantiomer constituting 95% of the total product can be said to be 95% enantioselective.
  • a reaction producing a particular diastereomer constituting 40% of the total product, meanwhile, can be said to be 40% diastereoselective.
  • the methods provided herein include reactions that are from about 1% to about 99.9% diastereoselective.
  • the reaction can be about 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, or about 95% diastereoselective.
  • the methods provided herein also include reactions that are from about 1% to about 99.9% enantioselective.
  • the reaction can be about 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, or about 95% enantioselective.
  • some embodiments disclosed herein provide methods wherein the reaction is at least 20% to at least 99% diastereoselective. In some embodiments, the reaction is at least 20% to at least 99% enantioselective.
  • two stereoisomeric products containing a new chiral carbon atom with an (R) or (S) absolute configuration are formed from the reactions described herein.
  • four or three stereoisomeric products containing two chiral carbon atoms each having an (R) or (S) absolute configuration are formed.
  • two "trans" or "E" isomers and "cis" or "Z" isomers can be formed.
  • the two cis isomers are enantiomers with respect to one another, in that the structures are non-superimposable mirror images of each other.
  • the two trans isomers are enantiomers.
  • stereochemical configuration of certain of the products herein will depend on factors including the structures of the particular carbene acceptor substrate and diazo-containing reagent used in the reaction, as well as the nature and identity of the myoglobin catalyst. Accordingly, the distribution of the stereoisomeric products formed in the reactions described herein will also depend on such factors.
  • the distribution of a product mixture can be described in terms of the enantiomeric excess, or "% e.e.".
  • the diastereomeric excess (% d.e.) can be calculated in the same manner.
  • the distribution of the (E) and (Z) isomers can be described in terms of the E : Z (or trans : cis) ratio.
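The formula for these quantities is not reproduced in the text above, so the following sketch uses the conventional definition, excess = 100 × (A − B) / (A + B), as an assumption. The peak areas and isomer labels are hypothetical, for example integrated areas from a chiral GC or HPLC trace. Taking the difference relative to a fixed reference isomer yields a signed value, in line with the negative % e.e. and % d.e. ranges mentioned in this disclosure.

```python
def percent_excess(reference: float, other: float) -> float:
    """Signed percent excess of `reference` over `other` (conventional e.e./d.e. form)."""
    total = reference + other
    if total <= 0:
        raise ValueError("total amount must be positive")
    return 100.0 * (reference - other) / total


# Hypothetical peak areas for the four cyclopropane stereoisomers.
areas = {"trans_1": 90.0, "trans_2": 6.0, "cis_1": 3.0, "cis_2": 1.0}

trans_total = areas["trans_1"] + areas["trans_2"]
cis_total = areas["cis_1"] + areas["cis_2"]

de_trans = percent_excess(trans_total, cis_total)              # d.e. in favor of trans (E)
ee_trans = percent_excess(areas["trans_1"], areas["trans_2"])  # e.e. within the trans pair
e_to_z_ratio = trans_total / cis_total                         # E : Z (trans : cis) ratio

print(f"d.e. = {de_trans:.1f}%, e.e.(trans) = {ee_trans:.1f}%, E:Z = {e_to_z_ratio:.0f}:1")
```

Swapping the two arguments flips the sign, which is how a mixture enriched in the opposite isomer would be reported.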
  • the methods provided herein include reactions that lead to product mixtures exhibiting % e.e. values which range from about 1% to about 99.9%, or from about - 1% to about -99.9%.
  • the methods provided herein also include reactions that lead to product mixtures exhibiting % d.e. values which range from about 1% to about 99.9%, or from about - 1% to about -99.9%.
  • the methods provided herein include reactions that lead to product mixtures exhibiting Z : E ratios ranging from about 1 : 99.9 to about 99.9 : 1.
  • some embodiments provide methods that lead to a product mixture exhibiting at least 20% to at least 99% e.e. In some embodiments, the product mixture exhibits at least 20% to at least 99% d.e. Some embodiments also provide methods that lead to a mixture of cyclopropanation products with a Z : E ratio of at least 90 : 10 to at least 99 : 1. In other embodiments, the mixture of cyclopropanation products exhibits a Z : E ratio of at least 10 : 90 to at least 1 : 99.
  • the reactions can be conducted with intact cells expressing a myoglobin polypeptide provided herein (also referred to as "whole-cell reactions"). These whole-cell reactions can be carried out with any of the host cells used for expression of the myoglobin polypeptide, as described above.
  • the host cells are bacterial cells such as, for example, Escherichia coli cells.
  • the host cells are yeast cells such as, for example, Saccharomyces cerevisiae or Pichia pastoris cells.
  • a suspension of cells expressing a myoglobin polypeptide provided herein can be formed in a suitable medium (e.g., phosphate buffer, M9 medium, Luria-Bertani medium, Terrific Broth medium) supplemented with nutrients such as, for example, mineral micronutrients (e.g., CoCl2, CuSO4, MnCl2), vitamins (e.g., thiamine, riboflavin, pantothenate), cofactor precursors (e.g., delta-aminolevulinic acid, p-aminobenzoic acid), sugars (e.g., glucose), and other energy sources (e.g., glycerol).
  • the medium is also supplemented with a heme analog or a metalloporphyrin other than heme (e.g., Co- or Mn-protoporphyrin IX) with the purpose of incorporating such heme analog or metalloporphyrin in the myoglobin catalyst.
  • Whole-cell reactions using cells expressing a myoglobin polypeptide provided herein can be carried out under aerobic conditions or anaerobic conditions.
  • whole-cell reactions using cells expressing a myoglobin polypeptide provided herein can be carried out without the addition of an exogenous reductant to the cell suspension.
  • the intracellular concentration of oxygen in a typical host cell (e.g., E. coli) is sufficiently low to enable the myoglobin polypeptide provided herein to operate as a carbene transfer catalyst.
  • the yield and rate of the whole-cell reactions can be controlled, at least in part, by varying the cell density of the cell suspension used in these reactions.
  • the cell density can be determined by measuring the absorbance at 600 nm and can be expressed as optical density at 600 nm (OD600). Alternatively, the cell density can be expressed in gram cell dry weight per liter (g cdw L⁻¹).
  • the whole-cell reactions can be conducted using cell suspensions that have an optical density (OD600) ranging from about 0.1 to about 100 or that have a cell density ranging from about 0.02 g cdw L⁻¹ to about 20 g cdw L⁻¹.
  • Other cell densities can be useful, depending on the nature of the host cell, myoglobin catalyst, carbene acceptor substrate and diazo-containing reagent.
  • the concentration of the myoglobin catalyst in the cell suspensions used for the whole-cell reactions can be adjusted by varying the protein expression conditions (e.g., type of growth medium, temperature, concentration of the inducer of expression (e.g., IPTG, arabinose), and expression time) according to procedures well known in the art.
  • the number of catalytic turnovers supported by the myoglobin catalyst in whole-cell systems can be expressed in the form of amount of product (e.g., in mmol) per gram cell dry weight.
  • whole-cell reactions involving the myoglobin catalysts provided herein exhibit turnovers ranging from about 0.1 mmol (g cdw)⁻¹ to about 20 mmol (g cdw)⁻¹.
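A minimal sketch of how the whole-cell turnover figure above could be computed from routine measurements is given here. The conversion factor of 0.2 g cdw L⁻¹ per OD600 unit is an assumption inferred only from the paired ranges quoted above (OD600 of 0.1 to 100 alongside 0.02 to 20 g cdw L⁻¹); real factors depend on the strain and spectrophotometer and would need calibration. Function names and sample values are hypothetical.

```python
# Assumed conversion factor (g cdw per liter per OD600 unit), inferred from the
# paired ranges in the text; calibrate for the actual strain and instrument.
G_CDW_PER_L_PER_OD = 0.2


def cell_density_g_cdw_per_l(od600: float, factor: float = G_CDW_PER_L_PER_OD) -> float:
    """Estimate cell dry weight per liter from an OD600 reading."""
    if od600 <= 0:
        raise ValueError("OD600 must be positive")
    return od600 * factor


def whole_cell_turnover(product_mm: float, od600: float) -> float:
    """Product formed per gram cell dry weight, in mmol (g cdw)^-1.

    product_mm is the product concentration in mM, i.e. mmol per liter.
    """
    return product_mm / cell_density_g_cdw_per_l(od600)


# Hypothetical run: 20 mM product from a cell suspension at OD600 = 40.
print(f"{whole_cell_turnover(20.0, 40.0):.2f} mmol per g cdw")
```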
  • the compounds provided herein may contain one or more chiral centers.
  • the compounds are intended to include racemic mixtures, diastereomers, enantiomers, and mixtures enriched in one or more stereoisomers.
  • when a group of substituents is disclosed herein, all the individual members of that group and all subgroups, including any isomers, enantiomers, and diastereomers, are intended to be included in this disclosure.
  • cyclopropanation reaction involves formation of a heme-bound carbene intermediate upon reaction of EDA with the protein in its reduced, ferrous state.
  • 'End-on' (Wolf, Hamaker et al. 1995; Che, Huang et al. 2001; Li, Huang et al. 2002; Nowlan, Gregg et al. 2003) attack of the styrene molecule on this heme-carbenoid species would then lead to the cyclopropanation product (FIG. 5).
  • Positions 43 or 68 were substituted with amino acids carrying a larger (i.e., Mb(F43W) and Mb(V68F)) or smaller apolar side chain (i.e., Mb(F43V) and Mb(V68A)), in order to affect the catalyst selectivity in the cyclopropanation reaction by varying the steric bulk on either side of the heme (FIGS. 1 and 5).
  • the H64V mutation resulted in a two-fold increase in the turnover numbers, the highest among this set of single mutants, while having marginal effect on diastereo- and enantioselectivity.
  • all the mutations at the level of Phe43 and Val68 dramatically improved the enantioselectivity of the Mb variant as compared to wild-type Mb, resulting in formation of the (1S,2S) stereoisomer (3a) with e.e.(E) values ranging from 44% to 99.9%.
  • the V68 substitutions also resulted in an appreciable increase in both catalytic activity (TON) and E-diastereoselectivity of the catalyst (FIG. 4).
  • the H64V mutation was found to be particularly effective in enhancing Mb-dependent cyclopropanation activity, whereas the mutations at the level of V68 and F43 were beneficial toward tuning its diastereo- and enantioselectivity.
  • a series of Mb double mutants were prepared and tested (FIG. 4).
  • Variant Mb(H64V,V68A) was found to exhibit high activity as well as excellent E-diastereoselectivity (>99.9% de) and (1S,2S)-enantioselectivity (>99.9% ee) (FIG. 10B vs. FIG. 10A), and it was thus selected for further investigations.
  • Mb(H64V,V68A) was found to support TTNs ranging from 7,700 to 14,500 on these substrates.
  • Substrates such as α-methylstyrene (1h) and N-methyl-3-vinyl-indole (1i) could also be converted to the corresponding cyclopropanation products 10a and 11a with high selectivity, although the efficiency of the reaction with the latter (1i) was compromised by the instability of this substrate in water.
  • Mb(H64V,V68A)-catalyzed styrene cyclopropanation with tert-butyl diazoacetate (12) and ethyl diazopropanoate (13) yielded the corresponding (1S,2S) cyclopropane products, i.e., tert-butyl 2-phenylcyclopropane-1-carboxylate (14) and ethyl 1-methyl-2-phenylcyclopropane-1-carboxylate (15), respectively, with good diastereoselectivity (82% d.e. and 74% d.e.
  • Ethyl 2-diazo-2-phenylacetate was also accepted by the Mb variant although cyclopropanation of styrene with this diazo reagent proceeded with low efficiency (TON ⁇ 10).
  • Mb(H64V,V68A) and other engineered Mb variants were found to retain between 50 and 90% of their carbene transfer activity in the presence of up to 30-40% of an organic cosolvent (MeOH, THF, DMF, or CH 3 CN). Similarly, they were found to retain between 50 and 90% of their carbene transfer activity at elevated temperatures up to 60°C. These results further highlight the operational robustness of these biocatalysts for carbene transfer reactions.
  • engineered variants of sperm whale myoglobin can provide highly reactive and selective olefin cyclopropanation catalysts.
  • the engineered Mb variant Mb(H64V,V68A) is capable of catalyzing the
  • Tetramethylsilane (TMS) served as the internal standard (0 ppm) for ¹H NMR and CDCl3 was used as the internal standard (77.0 ppm) for ¹³C NMR.
  • Silica gel chromatography purifications were carried out using AMD Silica Gel 60 230-400 mesh. Preparative thin layer chromatography was performed on TLC plates (Merck).
  • Gas chromatography (GC) analyses were carried out using a Shimadzu GC-2010 gas chromatograph equipped with an FID detector and a Shimadzu SHRXI-5MS column (15 m x 0.25 mm x 0.25 μm film).
  • Enantiomeric excess was determined by chiral gas chromatography (GC) using a Shimadzu GC-2010 gas chromatograph equipped with an FID detector and a Cyclosil-B column (30 m x 0.25 mm x 0.25 μm film).
  • GC chiral gas chromatography
  • Wild-type Mb and the engineered Mb variants were expressed in E. coli BL21(DE3) cells as described previously (Bordeaux, Singh et al. 2014). Briefly, cells were grown in TB medium (ampicillin, 100 mg L⁻¹) at 37 °C (150 rpm) until OD600 reached 0.6. Cells were then induced with 0.25 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) and 0.3 mM δ-aminolevulinic acid (ALA). After induction, cultures were shaken at 150 rpm and 27 °C and harvested after 20 h by centrifugation at 4000 rpm at 4 °C.
  • IPTG 0.25 mM isopropyl β-D-1-thiogalactopyranoside
  • ALA δ-aminolevulinic acid
  • the proteins were purified by Ni-affinity chromatography using the following buffers: loading buffer (50 mM KPi, 800 mM NaCl, pH 7.0), wash buffer 1 (50 mM KPi, 800 mM NaCl, pH 6.2), wash buffer 2 (50 mM KPi, 800 mM NaCl, 250 mM glycine, pH 7.0) and elution buffer (50 mM KPi, 800 mM NaCl, 300 mM L-histidine, pH 7.0).
  • buffer exchange 50 mM KPi, pH 7.0
  • reactions were initiated by addition of 10 μL of styrene (from a 1.2 M stock solution in methanol), followed by the addition of 10 μL of EDA (from a 0.4 M stock solution in methanol) with a syringe, and the reaction mixture was stirred for 18 h at room temperature, under positive argon pressure. Reactions with hemin were carried out using an identical procedure with the exception that the purified Mb was replaced by 80 μL of a hemin solution (100 μM in
  • Rh2(OAc)4-catalyzed cyclopropanation reactions were carried out according to the following general procedure. To a flame-dried round-bottom flask was added olefin (5 equiv.) and Rh2(OAc)4 (2 mol%) in CH2Cl2 (2 mL) under argon. To this solution was added a solution of diazo compound (1 equiv.) in CH2Cl2 (3-5 mL) via slow addition over 30-40 minutes. The resulting mixture was stirred at room temperature for another 30 min to 1 hour.
  • Ethyl 2-(1-methyl-1H-indol-3-yl)cyclopropane-1-carboxylate (11). This product was obtained following the standard Rh-catalyzed cyclopropanation protocol starting from 1-methyl-3-vinyl-1H-indole, which was synthesized according to a published procedure (Waser, Caspar et al. 2006).
  • Mb(H64V,V68A) was also found to exhibit significantly higher N-H insertion reactivity than wild-type Mb (>500 vs. 210 TON, FIG. 11).
  • Mb(H64V,V68A) was found to remain active in the presence of the amine substrate and EDA at a concentration as high as 0.16 M, which corresponds to ~15 g aniline per L (FIG. 11).
  • Mb(H64V,V68A) catalyzes nearly 3,000 turnovers. With 10 mM aniline and a catalyst loading of 0.001 mol%, over 6,000 total turnovers (TTN) were supported by this Mb variant. This value is an order of magnitude higher than that recently reported for engineered P450BM3 variants (Wang, Peck et al. 2014) and ranks among the highest TTNs reported for catalytic N-H insertion reactions with acceptor-only diazo compounds (Aviv and Gross 2006).
  • the Mb(H64V,V68A)-catalyzed reaction is also remarkably fast, proceeding at an initial rate of 740 and 174 turnovers min⁻¹ over the first minute and first 10 min, respectively.
  • a range of substituted anilines (24a-32a) and other arylamines (33a, 34a) were subjected to Mb(H64V,V68A)-catalyzed N-H functionalization in the presence of EDA. As summarized in FIG.
  • the Mb variants were capable of catalyzing the reactions with α-substituted diazo compounds (i.e., ethyl 2-diazopropanoate, tert-butyl 2-diazopropanoate) in an enantioselective manner (15-30% e.e.)
  • α-substituted diazo compounds i.e., ethyl 2-diazopropanoate, tert-butyl 2-diazopropanoate
  • Chiral GC analyses were carried out using a Shimadzu GC-2010 gas chromatograph equipped with an FID detector and a Cyclosil-B column (30 m x 0.25 mm x 0.25 μm film). Separation method: 1 μL injection, injector temp.: 200 °C, detector temp.: 300 °C. Gradient: column temperature set at 140 °C for 3 min, then to 160 °C at 1.8 °C/min, then to 165 °C at 1 °C/min, then to 245 °C at 25 °C/min. Total run time was 28 min.
  • N-H insertion reactions were typically carried out at a 400 μL scale using 20 μM myoglobin, 10 mM aniline, 10 or 5 mM EDA, and 10 mM sodium dithionite.
  • a solution containing sodium dithionite (100 mM stock solution) in potassium phosphate buffer (50 mM, pH 8.0) was degassed by bubbling argon into the mixture for 4 min in a sealed vial.
  • a buffered solution containing myoglobin was carefully degassed in a similar manner in a separate vial. The two solutions were then mixed together via cannula.
  • Reactions were initiated by addition of 10 μL of aniline (from a 0.4 M stock solution in methanol), followed by the addition of 10 μL or 5 μL of EDA (from a 0.4 M stock solution in methanol) with a syringe, and the reaction mixture was stirred for 12 h at room temperature, under positive argon pressure.
  • FIG. 16 Upon optimization of the reaction conditions, nearly quantitative conversion of thiophenol to 3 (68→98%), and correspondingly higher catalytic turnovers (TON: 170→492), could be achieved using a two-fold excess of EDA over the thiol substrate at a catalyst loading of 0.2 mol% (Entry 3, FIGS. 16 and 17A). Notably, comparable yields in this transformation have been obtained using transition metal complexes at 5- to 25-fold higher catalyst loadings
  • this mutation is likely to enhance the catalytic efficiency of Mb by increasing the accessibility of the heme pocket to the diazo ester and thiol reactants.
  • the initial rate for Mb(L29A,H64V)-catalyzed formation of the S-H insertion product 53 was determined to be 35 turnovers per minute.
  • Mb(L29A,H64V) was found to readily functionalize benzyl mercaptan (72), substituted benzyl mercaptan derivatives (73-75), and alkyl mercaptans such as cyclohexanethiol (76) and octane-1-thiol (77), providing conversions in the range of 30-50% and supporting between 930 and 2,550 total turnover numbers (Entries 1-6, FIG. 20).
  • 52c or 52d
  • Mb(F43V,V68A) showed appreciable enantioselectivity in this transformation (21-22% ee, Entries 3 and 5 in FIG. 21; FIG. 22). Since Mb(V68A) exhibited only 6% ee, the beneficial effect in terms of enantioselectivity can be mainly attributed to the substitution at the level of Phe43, which is located in close proximity to the heme cofactor (FIG. 1).
  • results demonstrate the amenability of the Mb catalysts to promote asymmetric carbene S-H insertions, the possibility to tune this property via active site engineering, and the scalability of Mb-catalyzed S-H insertion reactions, further highlighting the utility of these biocatalysts for synthetic applications.
  • Enantiomeric excess for product 71 was determined using the following method: 1 μL injection, injector temp.: 200 °C, detector temp.: 300 °C. Gradient: column temperature set at 80 °C for 3 min, then to 180 °C at 1.00 °C/min, then to 200 °C at 2 °C/min, then to 245 °C at 25 °C/min.
  • Reactions were initiated by addition of 10 μL of thiophenol (from a 0.4 M stock solution in methanol), followed by the addition of 10 μL of EDA (from a 0.2 M stock solution in methanol) with a syringe, and the reaction mixture was stirred for 12 h at room temperature, under positive argon pressure.
  • the preparative-scale reaction was carried out using a solution containing sodium dithionite (100 mM stock solution, 1 mL, 10 mM) in potassium phosphate buffer (50 mM, pH 8.0, 5.87 mL) and 466 μL of MeOH (~5% of reaction volume), which was degassed by bubbling argon into the mixture for 20 min in a sealed vial.
  • the sulfonium ylide likely arises from nucleophilic attack of the sulfane substrate on the heme-bound carbene intermediate generated upon reaction of the diazo compound with the hemoprotein (FIG. 5).
  • the Mb(L29A,H64V)-catalyzed formation of 92 was also found to occur with a certain degree of enantioselectivity (15% e.e., FIG. 23), as determined by chiral GC analysis.
  • Upon screening additional Mb variants it was possible to identify Mb catalysts with improved catalytic efficiency and enantioselectivity for the conversion of 90 to 92 (FIG. 23).
  • the Mb variant Mb(F43V,V68F) was found to have complementary enantioselectivity as compared to
  • EXAMPLE 3 (sulfanes) and EXAMPLE 2 (amines). Authentic standards for the rearrangement products were prepared according to general procedure described below.
  • the carbene transfer reactivity of the myoglobin catalysts is dependent upon the presence of a heme cofactor (iron-protoporphyrin IX) bound to the protein. Accordingly, it was envisioned that varying the nature of this cofactor, e.g., by using an alternative metalloporphyrin cofactor, could provide a means to modulate the carbene transfer reactivity of these catalysts.
  • catalytic properties of metallo-substituted Mb variants, in which the heme cofactor is replaced with a Mn- or Co-protoporphyrin IX, were investigated.
  • Mn- and Co-substituted Mb variants have been previously obtained by reconstitution of apomyoglobin with the corresponding metallo-protoporphyrins IX (Yonetani and Asakura 1969; Yonetani, Yamamoto et al. 1974; Heinecke, Yi et al. 2012). While viable, this approach involves laborious and time-consuming refolding procedures. To overcome this issue, a convenient and practical strategy was implemented for the recombinant expression of metallo-substituted Mb variants by using E. coli cells that express a heterologous, outer-membrane heme transporter (ChuA) (Varnado and Goodwin 2004).
  • ChuA heterologous, outer-membrane heme transporter
  • wild-type sperm whale Mb and the heme transporter ChuA from O157:H7 E. coli were initially expressed in BL21(DE3) cells using a dual plasmid system in which the Mb and ChuA genes are under an IPTG-inducible promoter.
  • Cells were grown in M9 minimal medium supplemented with MnIII(ppIX). Under these conditions, Mn-substituted Mb, Mb(MnIII), could be successfully isolated with a yield of approximately 5 mg / L of culture.
  • a second plasmid encoding for both ChuA and the chaperone complex GroEL/ES was prepared. The latter was expected to increase the fraction of the desired protein in correctly folded and soluble form. Indeed, this system led to a significant increase (2.5-fold) in the yield of Mb(MnIII) (13 vs. 5 mg/L culture).
  • The purified Mn- and Co-containing myoglobin variants, referred to as Mb(Mn) and Mb(Co), were characterized by electronic absorption spectroscopy in both oxidized and reduced form (FIGS. 26A-C). As shown in FIG. 26B, Mb(MnIII) shows a split Soret band with λmax at 375 and 469 nm in phosphate buffer at pH 7.0. Upon addition of dithionite, a single Soret band with λmax at 438 nm becomes apparent, indicating complete reduction of the protein to Mb(MnII).
  • the visible spectrum of Mb(CoIII) shows a prominent absorption band at 422 nm, which shifts to 401 nm under reducing conditions, thus evidencing the formation of the reduced form, Mb(CoII) (FIG. 26C).
  • E. coli BL21(DE3) (or C41(DE3) (Lucigen)) cells were co-transformed with the Mb-encoding plasmid (pET22_MYO) and the ChuA-encoding vectors pHPEX2 or pGroES/EL-ChuA. Cells were grown in M9 minimal media supplemented with micronutrients and the appropriate antibiotics at 37 °C until the OD600 reached 0.6.
  • Example 6 Preparation and carbene transfer reactivity of myoglobin-based catalysts with alternative proximal ligands
  • proximal ligand Mb variants SEQ ID NO: 14 through 27
  • Mb(H64V,V68A) SEQ ID NO: 11
  • residues i.e., Cys, Asp, Glu, Tyr, Ser
  • residues have a side-chain group capable of coordinating the metal ion of the heme group (or other metalloporphyrin/metalloporphyrin analog) in the Mb catalyst, while others (Ala, Gly) do not, leaving the proximal site available for coordination by other species (e.g., solvent).
  • proximal ligand Mb variants were prepared by replacing the His93 residue in wild-type sperm whale myoglobin (SEQ ID NO: 1) and in one of the most promising cyclopropanation catalysts, Mb(H64V,V68A) (SEQ ID NO: 11), with the unnatural amino acids p-amino-phenylalanine (pAmF) (SEQ ID NOS: 28 and 31), 3-pyridyl-alanine (3PyA) (SEQ ID NOS: 29 and 32), and 3-methyl-histidine (3MeH) (SEQ ID NOS: 30 and 33) via amber stop codon suppression (Liu and Schultz 2010).
  • pAmF unnatural amino acid p-amino-phenylalanine
  • 3PyA 3-pyridyl-alanine
  • 3MeH 3-methyl-histidine
  • this gene was then expressed in BL21(DE3) cells containing a second plasmid encoding for an engineered, orthogonal Methanocaldococcus jannaschii tRNA/aminoacyl-tRNA synthetase (AARS) pair capable of suppressing an amber stop codon with p-amino-phenylalanine, 3-pyridyl-alanine, or 3-methyl-histidine.
  • AARS orthogonal Methanocaldococcus jannaschii tRNA/aminoacyl-tRNA synthetase
  • the unnatural amino acid-containing Mb variants were purified by Ni-affinity chromatography as described in EXAMPLE 1. All the variants were able to bind and retain the heme cofactor as indicated by the presence of a Soret band in their UV-VIS electronic absorption spectra (FIG. 26D).
  • His93Asp, His93Glu, His93Tyr, His93Cys, His93Ser, His93Gly, His93(pAmF), His93(3PyA), or His93(3MeH) mutation were tested for their carbene transfer reactivity using representative reactions for olefin cyclopropanation (styrene + EDA), carbene N-H insertion (aniline + EDA; aniline + EDP), and carbene S-H insertion (thiophenol + EDA; thiophenol + EDP).
  • the cells were resuspended in phosphate buffer (50 mM KPi, pH 8.0) to a final OD600 of 40 and the substrate (styrene) and carbene donor (EDA) were added to the cell suspension.
  • phosphate buffer 50 mM KPi, pH 8.0
  • EDA carbene donor
  • Example 8 Gram-scale synthesis of drugs and advanced pharmaceutical intermediates using engineered myoglobin catalysts.
  • Example 9 Myoglobin catalysts with altered selectivity via active site mutagenesis.
  • EXAMPLES 1-4 illustrate how the catalytic activity and/or selectivity of myoglobin-based polypeptides as carbene transfer catalysts can be modulated via mutagenesis of amino acid residues defining the active site of the hemoprotein according to the methods provided herein.
  • a library of Mb variants was prepared starting from Mb(H64V) by mutating one or more additional active site residues in sperm whale myoglobin (FIG. 1) by site-directed mutagenesis.
  • Mb variants carrying double mutations at positions H64/V68, L29/H64, H64/I107, and F43/H64, and carrying triple mutations at positions L29/H64/V68 were prepared and then tested to identify Mb-based cyclopropanation catalysts with altered diastereo- and stereoselectivity as compared to the (1S,2S)-selective Mb(H64V,V68A) catalyst, using a model reaction with styrene and EDA (FIG. 29).
  • Mb(H64V,V68L) and Mb(H64V,V68F) were found to exhibit excellent trans-diastereoselectivity (>99.9% de) and high (1R,2R)-stereoselectivity, thus complementing the scope of Mb(H64V,V68A).
  • Mb catalysts were identified that can produce the cis product 3d with high stereoselectivity (e.g., 95-99% ee) such as, for example, Mb(H64V,V68G).
  • further mutagenesis can be applied to these and/or other Mb variants in order to further optimize the catalytic and selectivity properties of the myoglobin catalysts in the context of olefin cyclopropanation and/or the other carbene transfer reactions described herein.
  • Example 10 Aldehyde olefination reactions catalyzed by myoglobin-based catalysts.
  • hemin reaction is much less chemoselective, yielding larger amounts of the carbene dimerization byproducts, diethyl fumarate and diethyl maleate (TON(3a)/TON(4a): 0.4 vs. 2.8 with Mb, FIG. 30).
  • Mb(F43V,V68F), used in combination with AsPh3, emerged as the most promising catalyst for this reaction, exhibiting 3-fold higher TON compared to wild-type Mb, excellent diastereoselectivity (>99.9% de), and high chemoselectivity toward aldehyde olefination over carbene dimerization.
  • Mb(F43V,V68F) was determined to support over 1,100 catalytic turnovers for the conversion of 111 to E-113a, featuring an initial rate of 320 and 40 turnovers min⁻¹ over the first minute and first 15 minutes, respectively.
  • Paquet 2004 Cao, Li et al. 2007; Lebel and Davi 2008; Lebel and Ladjel 2008).
  • Aldehyde olefination reaction. Typically, reactions were carried out at a 400 μL scale using 20 μM myoglobin, 10 mM benzaldehyde, 10 mM EDA, 10 mM triphenylphosphine (or trialkyl phosphines, AsPh3, SbPh3, BiPh3) and 10 mM sodium dithionite.
  • a solution containing sodium dithionite (100 mM stock solution) in potassium phosphate buffer (50 mM, pH 8.0) was degassed by bubbling argon into the mixture for 4 min in a sealed vial.
  • a buffered solution containing myoglobin was carefully degassed in a similar manner in a separate vial. The two solutions were then mixed together via cannula. Reactions were initiated by addition of 10 μL of benzaldehyde (from a 0.4 M stock solution in DMSO), 10 μL
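
The small-scale reaction set-ups quoted in the excerpts above reduce to simple arithmetic, which may be sketched as follows. This is an illustrative Python sketch and not part of the disclosed methods: the function names are hypothetical, the nominal reaction scale is treated as the final volume, and the factor of 0.2 g cdw per liter per OD600 unit is only inferred from the paired ranges quoted above (OD600 of about 0.1 to 100 versus about 0.02 to 20 g cdw per liter); an actual conversion factor would have to be calibrated for the host strain.

```python
def final_conc_mM(stock_M, added_uL, total_uL):
    """Final concentration (mM) of a reagent added from a stock solution."""
    return stock_M * 1000.0 * added_uL / total_uL


def loading_mol_percent(catalyst_uM, substrate_mM):
    """Catalyst loading relative to the substrate, in mol%."""
    return 100.0 * catalyst_uM / (substrate_mM * 1000.0)


def turnover_number(product_mM, catalyst_uM):
    """Turnovers (TON): moles of product formed per mole of catalyst."""
    return product_mM * 1000.0 / catalyst_uM


def od600_to_g_cdw_per_L(od600, factor=0.2):
    """Cell density estimate; `factor` is an assumed, uncalibrated value."""
    return od600 * factor


# 10 uL of a 0.4 M stock in a 400 uL reaction (nominal final volume).
substrate_mM = final_conc_mM(0.4, 10, 400)

# 20 uM myoglobin relative to that substrate concentration.
loading = loading_mol_percent(20, substrate_mM)

# 98% conversion of the substrate at that loading.
ton = turnover_number(0.98 * substrate_mM, 20)

print(f"substrate: {substrate_mM:.1f} mM")
print(f"catalyst loading: {loading:.2f} mol%")
print(f"TON at 98% conversion: {ton:.0f}")
print(f"OD600 40 (assumed factor): {od600_to_g_cdw_per_L(40):.1f} g cdw/L")
```

Under these assumptions, 20 μM catalyst and 10 mM substrate correspond to the 0.2 mol% loading quoted above, and 98% conversion gives a TON of about 490, in line with the value of 492 reported for the thiophenol reaction.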

Abstract

Methods are provided for carrying out carbene transfer transformations such as olefin cyclopropanation reactions, carbene heteroatom-H insertion reactions (heteroatom = N, S, Si), sigmatropic rearrangement reactions, and aldehyde olefination reactions with high efficiency and selectivity by using a novel class of myoglobin-based biocatalysts. These methods are useful for the synthesis of a variety of organic compounds which contain one or more new carbon-carbon or carbon-heteroatom (N, S, or Si) bond. The methods can be applied for conducting these transformations in vitro (i.e., using the biocatalyst in isolated form) and in vivo (i.e., using the biocatalyst in a whole cell system).

Description

MYOGLOBIN-BASED CATALYSTS FOR CARBENE TRANSFER REACTIONS
[0001] This application claims priority to and the benefit of co-pending U.S. provisional patent application Serial No. 62/084,162, entitled MYOGLOBIN-BASED CATALYSTS FOR CARBENE TRANSFER REACTIONS, filed November 25, 2014, which is incorporated herein by reference in its entirety.
Statement Regarding Federally Sponsored Research or Development
[0002] This invention was made with government support under contract no. GM098628 awarded by the National Institutes of Health. The government has certain rights in the invention.
1. TECHNICAL FIELD
[0003] The present invention relates to engineered variants of myoglobin and their use as biocatalysts for catalyzing carbene transfer reactions. In particular, the invention relates to myoglobin-based catalysts having the capability to catalyze olefin cyclopropanation reactions, carbene insertion reactions into N-H, S-H, and Si-H bonds, sigmatropic rearrangement reactions, and/or aldehyde olefination reactions. The invention also relates to methods for carrying out these transformations in vitro and in whole cells comprising providing a carbene acceptor substrate, a carbene donor reagent, and a myoglobin-based catalyst in an isolated form or contained within a host cell.
2. BACKGROUND
[0004] The chemical synthesis of new chemical entities heavily relies on efficient methods for the construction of new carbon-carbon and carbon-heteroatom bonds (e.g., C-N, C-S, C-Si bonds), which are ubiquitous in synthetic building blocks, advanced pharmaceutical intermediates, drugs, and biologically active natural and man-made compounds. A promising approach for the construction of these bonds is through transition metal-catalyzed insertion of carbenoid species into carbon-carbon double bonds (i.e., alkene cyclopropanation) or heteroatom-hydrogen bonds (e.g., N-H, S-H, Si-H bonds) (Doyle and Forbes 1998; Doyle, McKervey et al. 1998; Lebel, Marcoux et al. 2003; Moody 2007; Pellissier 2008; Zhang and Wang 2008; Gillingham and Fei 2013). These procedures typically involve activation of a diazo-containing reagent or another carbene precursor (e.g., phosphonium and sulfonium ylides) by a transition metal catalyst followed by reaction of the resulting metal-carbenoid species with an alkenyl (C=C), amino (N-H), sulphydryl (S-H), or silyl (Si-H) group in an intermolecular or intramolecular setting. Over the past decades, a large number of synthetic transition metal complexes have been investigated for promoting these types of carbene transfer/insertion reactions, a major group being represented by rhodium-based catalysts (Doyle and Forbes 1998; Doyle, McKervey et al. 1998; Lebel, Marcoux et al. 2003; Moody 2007; Pellissier 2008; Zhang and Wang 2008; Gillingham and Fei 2013). Despite this progress, achieving high catalytic efficiency, high regioselectivity, and/or high enantio/stereoselectivity with these systems has been difficult. Furthermore, many of these catalytic systems make use of expensive and potentially very toxic metals (e.g., rhodium, ruthenium), which limits their attractiveness in the context of the synthesis of pharmaceutical intermediates and drugs. Finally, many of these synthetic catalysts require harsh reaction conditions and exhibit poor chemoselectivity.
[0005] Enzymes and other protein-based biocatalysts constitute an attractive alternative to traditional chemical catalysts due to their ability to operate in aqueous media and under very mild reaction conditions such as ambient temperature and pressure (Bornscheuer, Huisman et al. 2012). These properties combined with the proteinaceous nature of these catalysts make them particularly relevant toward the design and implementation of sustainable and environmentally friendly procedures for chemical synthesis and manufacturing (Bornscheuer, Huisman et al. 2012). Notably, no naturally occurring enzymes are known to catalyze the aforementioned carbene transfer reactions in biological systems. Recent studies have shown that cytochrome P450 enzymes (e.g., P450BM3, a.k.a. CYP102A1) and engineered variants thereof can react with diazo compounds and promote reactions such as the cyclopropanation of styrene derivatives (Coelho, Brustad et al. 2013; Coelho, Wang et al. 2013; Wang, Renata et al. 2014) and carbene N-H insertion in aniline derivatives in vitro and in vivo (Wang, Peck et al. 2014). See also Coelho et al. US Pat. 8,993,262 B2 and Coelho et al. Pat. Appl. WO2014058729. Despite these advances, important drawbacks of these P450-based catalysts include their large size (5-115 kDa) and limited stability, in particular at elevated temperatures and in the presence of organic cosolvents. In addition, these P450-based catalysts are often characterized by modest catalytic efficiency, limited substrate scope, and/or moderate diastereo- and enantio/stereoselectivity (Coelho, Brustad et al. 2013; Coelho, Wang et al. 2013; Wang, Peck et al. 2014; Wang, Renata et al. 2014).
3. SUMMARY
[0006] An engineered myoglobin catalyst is provided having an improved capability, as compared to the myoglobin of SEQ ID NO: 1, to catalyze a carbene transfer reaction, wherein the engineered myoglobin catalyst comprises an amino acid sequence that is at least 60% identical to SEQ ID NO: 1, 112, 113, 114, 115, and 116.
[0007] In an embodiment of the engineered myoglobin catalyst, the improved capability of the myoglobin catalyst is an improvement in its catalytic activity, regioselectivity, diastereoselectivity and/or enantioselectivity.
[0008] In an embodiment of the engineered myoglobin catalyst, the myoglobin catalyst comprises an amino acid sequence that is at least 90% identical to SEQ ID NO: 1 and comprises an amino acid substitution at a position selected from the group consisting of position X29, X32, X33, X39, X44, X45, X46, X64, X67, X68, X93, X107, and X111 of SEQ ID NO: 1.
[0009] In an embodiment of the engineered myoglobin catalyst, the amino acid sequence of the myoglobin catalyst comprises at least one of the features selected from the group consisting of: X29 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X32 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X33 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X39 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X43 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X45 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X46 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X64 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X67 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X68 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X93 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X107 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; and X111 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y, of SEQ ID NO: 1.
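
The variant labels used throughout, e.g., Mb(H64V,V68A), and the substitution positions listed above can be handled programmatically. The following Python sketch is illustrative only: the helper names are hypothetical, the 100-residue toy sequence is a placeholder and is not SEQ ID NO: 1, positions are assumed to be 1-indexed against the sequence as given, and percent identity is computed position by position over equal-length sequences without any alignment, which is a simplification of how sequence identity is normally determined.

```python
import re

# The 21 one-letter residue codes listed for each variable position.
AMINO_ACIDS = set("ARNDCQEGHILKMFPSTUVWY")


def parse_mutations(label):
    """Parse a label such as 'H64V,V68A' into (wild_type, position, new) tuples."""
    mutations = []
    for token in label.split(","):
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", token.strip())
        if match is None:
            raise ValueError(f"unrecognized mutation: {token!r}")
        wild_type, position, new = match.group(1), int(match.group(2)), match.group(3)
        if wild_type not in AMINO_ACIDS or new not in AMINO_ACIDS:
            raise ValueError(f"unknown residue code in {token!r}")
        mutations.append((wild_type, position, new))
    return mutations


def apply_mutations(sequence, label):
    """Return a copy of `sequence` carrying the substitutions named in `label`."""
    residues = list(sequence)
    for wild_type, position, new in parse_mutations(label):
        index = position - 1  # assumed 1-indexed numbering
        if not 0 <= index < len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[index] != wild_type:
            raise ValueError(f"expected {wild_type} at {position}, found {residues[index]}")
        residues[index] = new
    return "".join(residues)


def percent_identity(seq_a, seq_b):
    """Position-by-position identity (%) for two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("this sketch only compares equal-length sequences")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Placeholder sequence (NOT SEQ ID NO: 1): His at position 64, Val at position 68.
TOY_WT = "G" * 63 + "H" + "GGG" + "V" + "G" * 32

variant = apply_mutations(TOY_WT, "H64V,V68A")
print(percent_identity(TOY_WT, variant))
```

For the toy sequence, two substitutions in 100 residues leave the variant 98% identical to its parent, which would satisfy an "at least 90% identical" criterion under this simplified measure.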
[0010] In an embodiment of the engineered myoglobin catalyst, the myoglobin catalyst is selected from the group consisting of SEQ ID NOS: 2 through 110.
[0011] In an embodiment of the engineered myoglobin catalyst, the myoglobin catalyst contains a metal-binding cofactor selected from the group consisting of a heme analog, a metalloporphyrin, and a porphyrin analog.
[0012] In an embodiment of the engineered myoglobin catalyst, the metal-binding cofactor is selected from the group consisting of mesoporphyrin, protoporphyrin, bisglycolporphyrin, corrole, phthalocyanine, phlorin, chlorin, 5-isocorrole, 10-isocorrole, and porphycene.
[0013] In an embodiment of the engineered myoglobin catalyst, the metal bound by the metal-binding cofactor is selected from the group consisting of iron, manganese, cobalt, ruthenium, rhodium, and osmium.
[0014] In an embodiment of the engineered myoglobin catalyst, the amino acid residue that coordinates the metal atom at the axial position of the metal-containing cofactor in the myoglobin catalyst is a natural or non-natural α-amino acid comprising a -SH, -NH2, -OH, =N-, -NC, imidazolyl, or pyridyl group within its side chain.
[0015] In an embodiment of the engineered myoglobin catalyst, the amino acid residue that coordinates the metal atom at the axial position of the metal-containing cofactor in the myoglobin catalyst is selected from the group consisting of serine, threonine, cysteine, tyrosine, histidine, aspartic acid, glutamic acid, selenocysteine, para-amino-phenylalanine, meta-amino-phenylalanine, para-mercaptomethyl-phenylalanine, meta-mercaptomethyl-phenylalanine, para-(isocyanomethyl)-phenylalanine, meta-(isocyanomethyl)-phenylalanine, 3-pyridyl-alanine, and 3-methyl-histidine.
[0016] A method is provided for catalyzing a carbene insertion reaction, the method comprising:
(a) providing a diazo-containing carbene precursor of formula (I)
[Figure: structure of formula (I)]
wherein R1a and R2a are independently selected from H, halo, cyano (-CN), nitro (-NO2), trifluoromethyl (-CF3), optionally substituted C1-18 alkyl, optionally substituted C6-10 aryl, optionally substituted 5- to 10-membered heteroaryl, -C(O)OR1b, -C(O)N(R1b)(R1c), -C(O)R1b, -Si(R1b)(R1c)(R1d), and -SO2(R1b), where each R1b, R1c, and R1d are independently selected from H, optionally substituted C1-18 alkyl, optionally substituted C6-10 aryl, and optionally substituted 6- to 10-membered heteroaryl.
(b) providing a myoglobin-based catalyst;
(c) providing a carbene acceptor substrate of formula (II), (IV), (VI) or (VIII):
Figure imgf000007_0001
where R2 is independently selected from optionally substituted C6-15 aryl, optionally substituted 5- to 15-membered heteroaryl, and optionally substituted C1-18 aliphatic; R3 is independently selected from H, halo, cyano, optionally substituted C1-18 aliphatic, optionally substituted C6-10 aryl, optionally substituted 5- to 10-membered heteroaryl, -C(O)OR1b, -C(O)N(R1b)(R1c), and -C(O)R1b, where each R1b and R1c are independently selected from H, optionally substituted C1-18 aliphatic, optionally substituted C6-10 aryl, and optionally substituted 5- to 10-membered heteroaryl; R4 and R5 are independently selected from H, halo, cyano, optionally substituted C1-18 aliphatic, optionally substituted C6-10 aryl, and optionally substituted 5- to 10-membered heteroaryl;
R6 is independently selected from optionally substituted C1-18 aliphatic, optionally substituted C6-16 aryl, optionally substituted 5- to 10-membered heteroaryl, optionally substituted C4-C16 cyclic aliphatic, and optionally substituted C4-C16 heterocyclic group; R7 is independently selected from H, optionally substituted C1-18 aliphatic, optionally substituted C6-16 aryl, optionally substituted 5- to 10-membered heteroaryl; or where R6 and R7 are connected to form an optionally substituted C4-C16 cyclic aliphatic or heterocyclic group;
R8 is selected from optionally substituted C1-18 aliphatic, optionally substituted C6-16 aryl, optionally substituted 5- to 10-membered heteroaryl, optionally substituted C4-C16 cyclic aliphatic, and optionally substituted C4-C16 heterocyclic group;
R9 is independently selected from optionally substituted C1-18 aliphatic, optionally substituted C6-16 aryl, optionally substituted 5- to 10-membered heteroaryl, optionally substituted C4-C16 cyclic aliphatic, and optionally substituted C4-C16 heterocyclic group; R10 and R11 are optionally substituted C1-6 aliphatic groups.
(d) contacting the diazo-containing carbene precursor and the carbene acceptor substrate with the myoglobin-based catalyst, optionally in the presence of a reducing agent; and
(e) allowing the reaction to proceed for a time sufficient to form a carbene insertion
Figure imgf000008_0001
(III)
where R1a, R2a, R2, R3, R4, R5, R6, R7, R8, R9, R10, and R11 are as defined above.
[0017] In an embodiment of the method, the myoglobin comprises an amino acid sequence that is at least 60% identical to SEQ ID NO: 1, 112, 113, 114, 115, and 116.
[0018] In an embodiment of the method, the myoglobin catalyst comprises an amino acid sequence that is at least 90% identical to SEQ ID NO: 1 and comprises an amino acid substitution at a position selected from the group consisting of position X29, X32, X33, X39, X44, X45, X46, X64, X67, X68, X93, X107, and X111 of SEQ ID NO: 1.
[0019] In an embodiment of the method, the amino acid sequence of the myoglobin catalyst comprises at least one of the features selected from the group consisting of: X29 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X32 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X33 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X39 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X43 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X45 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X46 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X64 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X67 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X68 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X93 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X107 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; and X111 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y, of SEQ ID NO: 1.
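As a rough illustration of the sequence criteria recited above (a minimum percent identity to a reference sequence together with a substitution at one of the listed positions), the following sketch checks both conditions for two pre-aligned sequences of equal length. The reference string is a made-up placeholder and is not SEQ ID NO: 1; the function names, the ungapped identity calculation, and the subset of positions used are assumptions for illustration only. A practical identity determination would normally rely on a sequence alignment.

```python
# Illustrative sketch only: toy sequences, hypothetical helper names.

def percent_identity(reference: str, candidate: str) -> float:
    """Percent of identical residues between two pre-aligned, equal-length sequences."""
    if not reference or len(reference) != len(candidate):
        raise ValueError("sequences must be non-empty and pre-aligned to equal length")
    matches = sum(r == c for r, c in zip(reference, candidate))
    return 100.0 * matches / len(reference)


def substituted_positions(reference: str, candidate: str, positions) -> list:
    """Return the 1-based positions (from `positions`) where the residue differs."""
    return [p for p in positions if reference[p - 1] != candidate[p - 1]]


def apply_substitutions(reference: str, substitutions: dict) -> str:
    """Build a variant from {1-based position: new residue}, e.g. {64: 'V', 68: 'A'}."""
    residues = list(reference)
    for position, new_residue in substitutions.items():
        residues[position - 1] = new_residue
    return "".join(residues)


# Placeholder 70-residue reference (NOT a real myoglobin sequence).
REFERENCE = "ACDEFGHIKL" * 7
# Subset of the listed positions that fits inside the toy reference.
POSITIONS = (29, 32, 33, 39, 45, 46, 64, 67, 68)

variant = apply_substitutions(REFERENCE, {64: "V", 68: "A"})
identity = percent_identity(REFERENCE, variant)
hits = substituted_positions(REFERENCE, variant, POSITIONS)
meets_criteria = identity >= 90.0 and len(hits) > 0
```

Here the toy variant differs from the placeholder reference at two of 70 positions, so it clears a 90% identity threshold while carrying substitutions at listed positions.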
[0020] In an embodiment of the method, the myoglobin catalyst is selected from the group consisting of SEQ ID NOS: 2 through 110.
[0021] In an embodiment of the method, the myoglobin catalyst contains a metal-binding cofactor selected from the group consisting of a heme analog, a metalloporphyrin, and a porphyrin analog.
[0022] In an embodiment of the method, the metal-binding cofactor is selected from the group consisting of mesoporphyrin, protoporphyrin, bisglycolporphyrin, corrole, phthalocyanine, phlorin, chlorin, 5-isocorrole, 10-isocorrole, and porphycene.
[0023] In an embodiment of the method, the metal bound by the metal-binding cofactor is selected from the group consisting of iron, manganese, cobalt, ruthenium, rhodium, and osmium.
[0024] In an embodiment of the method, the amino acid residue that coordinates the metal atom at the axial position of the metal-containing cofactor in the myoglobin catalyst is a natural or non-natural α-amino acid comprising a -SH, -NH2, -OH, =N-, -NC, imidazolyl, or pyridyl group within its side chain.
[0025] In an embodiment of the method, the amino acid residue that coordinates the metal atom at the axial position of the metal-containing cofactor in the myoglobin catalyst is selected from the group consisting of serine, threonine, cysteine, tyrosine, histidine, aspartic acid, glutamic acid, selenocysteine, para-amino-phenylalanine, meta-amino-phenylalanine, para-mercaptomethyl-phenylalanine, meta-mercaptomethyl-phenylalanine, para-(isocyanomethyl)-phenylalanine, meta-(isocyanomethyl)-phenylalanine, 3-pyridyl-alanine, and 3-methyl-histidine.
[0026] In an embodiment of the method, the myoglobin catalyst is tethered to a solid support.
[0027] In an embodiment of the method, the myoglobin catalyst is contained in a host cell.
[0028] In an embodiment of the method, the host cell is selected from the group consisting of Escherichia coli, Saccharomyces cerevisiae, and Pichia pastoris.
[0029] In an embodiment of the method, the carbene insertion product of formula (III) is selected from the group consisting of:
Figure imgf000010_0001
wherein Ar is independently selected from optionally substituted C6-15 aryl and optionally substituted 6- to 15-membered heteroaryl; Alk is an optionally substituted C1-18 aliphatic.
[0030] In an embodiment of the method, the diazo-containing carbene precursor and the carbene acceptor substrate are part of the same molecule.
[0031] A method is provided for catalyzing a sigmatropic rearrangement reaction, the method comprising:
(a) providing a diazo-containing carbene precursor of formula (I)
Figure imgf000011_0001
(I)
wherein R1a and R2a are independently selected from H, halo, cyano (-CN), nitro (-NO2), trifluoromethyl (-CF3), optionally substituted C1-18 alkyl, optionally substituted C6-10 aryl, optionally substituted 5- to 10-membered heteroaryl, -C(O)OR1b, -C(O)N(R1b)(R1c), -C(O)R1b, -Si(R1b)(R1c)(R1d), and -SO2(R1b), where each R1b, R1c, and R1d are independently selected from H, optionally substituted C1-18 alkyl, optionally substituted C6-10 aryl, and optionally substituted 6- to 10-membered heteroaryl.
(b) providing a myoglobin-based catalyst;
(c) providing a carbene acceptor substrate of formula (X), (XI), (XIV), or (XV):
Figure imgf000011_0002
(X) (XI)
Figure imgf000011_0003
(XIV) (XV)
wherein R12 is selected from optionally substituted C1-18 aliphatic, optionally substituted C6-16 aryl, optionally substituted 5- to 10-membered heteroaryl, and optionally substituted C4-C16 heterocyclic group;
R13, R14, and R15 are independently selected from H, optionally substituted C1-6 aliphatic groups, optionally substituted C6-16 aryl, or where R13 and R14 are connected to form an optionally substituted C4-C16 cyclic aliphatic or heterocyclic group;
R16 is independently selected from optionally substituted C1-18 aliphatic, optionally substituted C6-16 aryl, optionally substituted 5- to 10-membered heteroaryl, and optionally substituted C4-C16 heterocyclic group;
R17 is independently selected from optionally substituted C1-6 aliphatic, optionally substituted C6 aryl, optionally substituted 5- to 6-membered heteroaryl; or where R16 and R17 are connected together to form an optionally substituted C4-C16 cyclic aliphatic or heterocyclic group;
R18, R19, and R20 are independently selected from H, optionally substituted C1-6 aliphatic groups, optionally substituted C6-16 aryl, or where R18 and R19 are connected together to form an optionally substituted C4-C16 cyclic aliphatic or heterocyclic group;
(d) contacting the diazo-containing carbene precursor and the carbene acceptor substrate with the myoglobin-based catalyst, optionally in the presence of a reducing agent; and
(e) allowing the reaction to proceed for a time sufficient to form a sigmatropic rearrangement product of formula (XII), (XIII), (XVI), or (XVII) respectively,
Figure imgf000013_0001
Figure imgf000013_0002
(XVI) (XVII)
where R1a, R2a, R9, R12, R13, R14, R15, R16, R17, R18, R19, and R20 are as defined above.
[0032] In an embodiment of the method, the myoglobin comprises an amino acid sequence that is at least 60% identical to SEQ ID NO: 1, 112, 113, 114, 115, and 116.
[0033] In an embodiment of the method, the myoglobin catalyst comprises an amino acid sequence that is at least 90% identical to SEQ ID NO: 1 and comprises an amino acid substitution at a position selected from the group consisting of position X29, X32, X33, X39, X44, X45, X46, X64, X67, X68, X93, X107, and X111 of SEQ ID NO: 1.
[0034] In an embodiment of the method, the amino acid sequence of the myoglobin catalyst comprises at least one of the features selected from the group consisting of: X29 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X32 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X33 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X39 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X43 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X45 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X46 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X64 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X67 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X68 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X93 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X107 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; and X111 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y, of SEQ ID NO: 1.
[0035] In an embodiment of the method, the myoglobin catalyst is selected from the group consisting of SEQ ID NOS: 2 through 110.
[0036] In an embodiment of the method, the myoglobin catalyst contains a metal-binding cofactor selected from the group consisting of a heme analog, a metalloporphyrin, and a porphyrin analog.
[0037] In an embodiment of the method, the metal-binding cofactor is selected from the group consisting of mesoporphyrin, protoporphyrin, bisglycolporphyrin, corrole, phthalocyanine, phlorin, chlorin, 5-isocorrole, 10-isocorrole, and porphycene.
[0038] In an embodiment of the method, the metal bound by the metal-binding cofactor is selected from the group consisting of iron, manganese, cobalt, ruthenium, rhodium, and osmium.
[0039] In an embodiment of the method, the amino acid residue that coordinates the metal atom at the axial position of the metal-containing cofactor in the myoglobin catalyst is a natural or non-natural α-amino acid comprising a -SH, -NH2, -OH, =N-, -NC, imidazolyl, or pyridyl group within its side chain.
[0040] In an embodiment of the method, the amino acid residue that coordinates the metal atom at the axial position of the metal-containing cofactor in the myoglobin catalyst is selected from the group consisting of serine, threonine, cysteine, tyrosine, histidine, aspartic acid, glutamic acid, selenocysteine, para-amino-phenylalanine, meta-amino-phenylalanine, para-mercaptomethyl-phenylalanine, meta-mercaptomethyl-phenylalanine, para-(isocyanomethyl)-phenylalanine, meta-(isocyanomethyl)-phenylalanine, 3-pyridyl-alanine, and 3-methyl-histidine.
[0041] In an embodiment of the method, the myoglobin catalyst is tethered to a solid support.
[0042] In an embodiment of the method, the myoglobin catalyst is contained in a host cell.
[0043] In an embodiment of the method, the host cell is selected from the group consisting of Escherichia coli, Saccharomyces cerevisiae, and Pichia pastoris.
[0044] In an embodiment of the method, the diazo-containing carbene precursor and the carbene acceptor substrate are part of the same molecule.
[0045] A method is provided for catalyzing an aldehyde olefination reaction, the method comprising:
(a) providing a diazo-containing carbene precursor of formula (I)
Figure imgf000015_0001
(I)
wherein R1a and R2a are independently selected from H, halo, cyano (-CN), nitro (-NO2), trifluoromethyl (-CF3), optionally substituted C1-18 alkyl, optionally substituted C6-10 aryl, optionally substituted 5- to 10-membered heteroaryl, -C(O)OR1b, -C(O)N(R1b)(R1c), -C(O)R1b, -Si(R1b)(R1c)(R1d), and -SO2(R1b), where each R1b, R1c, and R1d are independently selected from H, optionally substituted C1-18 alkyl, optionally substituted C6-10 aryl, and optionally substituted 6- to 10-membered heteroaryl.
(b) providing a myoglobin-based catalyst;
(c) providing an aldehyde substrate of formula R21-C(O)-H, wherein R21 is selected from optionally substituted C1-18 aliphatic, optionally substituted C6-16 aryl, optionally substituted 5- to 10-membered heteroaryl, and optionally substituted C4-C16 heterocyclic group;
(d) providing a nucleophilic reagent selected from the group consisting of triphenylphosphine, triphenylarsine, and triphenylstibine;
(e) contacting the diazo-containing carbene precursor, the aldehyde substrate, and the nucleophilic reagent with the myoglobin-based catalyst, optionally in the presence of a reducing agent; and
(f) allowing the reaction to proceed for a time sufficient to form an olefination product of formula (R1a)(R2a)C=CH(R21), where R1a, R2a, and R21 are as defined above.
[0046] In an embodiment of the method, the myoglobin comprises an amino acid sequence that is at least 60% identical to SEQ ID NO: 1, 112, 113, 114, 115, and 116.
[0047] In an embodiment of the method, the myoglobin catalyst comprises an amino acid sequence that is at least 90% identical to SEQ ID NO: 1 and comprises an amino acid substitution at a position selected from the group consisting of position X29, X32, X33, X39, X44, X45, X46, X64, X67, X68, X93, X107, and X111 of SEQ ID NO: 1.
[0048] In an embodiment of the method, the amino acid sequence of the myoglobin catalyst comprises at least one of the features selected from the group consisting of: X29 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X32 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X33 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X39 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X43 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X45 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X46 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X64 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X67 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X68 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X93 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X107 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; and X111 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y, of SEQ ID NO: 1.
[0049] In an embodiment of the method, the myoglobin catalyst is selected from the group consisting of SEQ ID NOS: 2 through 110.
[0050] In an embodiment of the method, the myoglobin catalyst contains a metal-binding cofactor selected from the group consisting of a heme analog, a metalloporphyrin, and a porphyrin analog.
[0051] In an embodiment of the method, the metal-binding cofactor is selected from the group consisting of mesoporphyrin, protoporphyrin, bisglycolporphyrin, corrole, phthalocyanine, phlorin, chlorin, 5-isocorrole, 10-isocorrole, and porphycene.
[0052] In an embodiment of the method, the metal bound by the metal-binding cofactor is selected from the group consisting of iron, manganese, cobalt, ruthenium, rhodium, and osmium.
[0053] In an embodiment of the method, the amino acid residue that coordinates the metal atom at the axial position of the metal-containing cofactor in the myoglobin catalyst is a natural or non-natural α-amino acid comprising a -SH, -NH2, -OH, =N-, -NC, imidazolyl, or pyridyl group within its side chain.
[0054] In an embodiment of the method, the amino acid residue that coordinates the metal atom at the axial position of the metal-containing cofactor in the myoglobin catalyst is selected from the group consisting of serine, threonine, cysteine, tyrosine, histidine, aspartic acid, glutamic acid, selenocysteine, para-amino-phenylalanine, meta-amino-phenylalanine, para-mercaptomethyl-phenylalanine, meta-mercaptomethyl-phenylalanine, para-(isocyanomethyl)-phenylalanine, meta-(isocyanomethyl)-phenylalanine, 3-pyridyl-alanine, and 3-methyl-histidine.
[0055] In an embodiment of the method, the myoglobin catalyst is tethered to a solid support.
[0056] In an embodiment of the method, the myoglobin catalyst is contained in a host cell.
[0057] In an embodiment of the method, the host cell is selected from the group consisting of Escherichia coli, Saccharomyces cerevisiae, and Pichia pastoris.
[0058] In an embodiment of the method, the diazo-containing carbene precursor and the aldehyde substrate are part of the same molecule.
4. BRIEF DESCRIPTION OF THE DRAWINGS
[0059] The present invention is described herein with reference to the accompanying drawings, in which similar reference characters denote similar elements throughout the several views. It is to be understood that in some instances, various aspects of the invention may be shown exaggerated, enlarged, exploded, or incomplete to facilitate an understanding of the invention.
[0060] FIG. 1. Crystal structure of sperm whale myoglobin (pdb 1A6K). The heme cofactor, the heme-coordinating proximal histidine, and various amino acid residues lining the active site ('distal pocket') of the hemoprotein are displayed as stick models.
[0061] FIG. 2. Carbene transfer reactions catalyzed by the myoglobin catalysts provided herein: (a) olefin cyclopropanation; (b) carbene N— H insertion; (c) carbene S— H insertion; (d) carbene Si— H insertion.
[0062] FIG. 3. Additional reactions catalyzed by the myoglobin catalysts provided herein: (a) [2,3] sigmatropic rearrangement of allylic thioethers; (b) [2,3] sigmatropic rearrangement of propargylic thioethers; (c) [2,3] sigmatropic rearrangement of allylic amines; (d) [2,3] sigmatropic rearrangement of propargylic amines; (e) aldehyde olefination.
[0063] FIG. 4. Activity and selectivity of wild-type sperm whale myoglobin and engineered variants thereof toward cyclopropanation of styrene in the presence of ethyl diazoacetate. TON: catalytic turnovers; d.e. = diastereomeric excess; e.e. = enantiomeric excess.
[0064] FIG. 5. Mechanistic model for myoglobin-catalyzed cyclopropanation of styrene with diazo esters.
[0065] FIGS. 6A-B. Plots of initial rates (v0) for Mb(H64V,V68A)-catalyzed cyclopropanation of styrene with EDA in the presence of variable amounts of alkene (a) and diazo reagent (b). Data were fit to the Michaelis-Menten equation in order to obtain KM and kcat values.
[0066] FIG. 7. Optimization of styrene : EDA ratio for Mb(H64V,V68A)-catalyzed reactions. Turnover numbers (TON) for the cyclopropane product and carbene dimerization byproduct (diethyl fumarate + diethyl maleate) are plotted against the different styrene : EDA ratios used in the reaction.
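For reference, the Michaelis-Menten equation named in the description of FIGS. 6A-B has the standard form shown below. The symbols [E]_0 (total catalyst concentration) and [S] (concentration of the varied reagent, alkene or diazo reagent, with the other held fixed) do not appear in the original text and are introduced here as assumptions for illustration.

```latex
v_0 = \frac{k_{\mathrm{cat}}\,[\mathrm{E}]_0\,[\mathrm{S}]}{K_M + [\mathrm{S}]}
```

In this form, v0 approaches kcat[E]_0 at saturating [S], and KM is the concentration of the varied reagent at which v0 reaches half of that limiting rate.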
[0067] FIG. 8. Yields and turnover numbers for Mb(H64V,V68A)-catalyzed styrene cyclopropanation in the presence of EDA at varying reagent and catalyst loadings, using a constant styrene : EDA ratio of 1 : 2 and a reaction time of one hour.
[0068] FIG. 9. Substrate scope for Mb(H64V,V68A)-catalyzed cyclopropanation.
[0069] FIGS. 10A-B. Representative chiral GC chromatograms corresponding to the products 3a, 3b, 3c, and 3d (a) as authentic racemic standards obtained from the reaction with styrene and EDA in the presence of Rh2(OAc)4 as the catalyst, and (b) as produced from the reaction with Mb(H64V,V68A) as the catalyst.
[0070] FIG. 11. Catalytic activity of hemin, wild-type sperm whale myoglobin (Mb), and the Mb(H64V,V68A) variant toward promoting carbene N— H insertion reaction in the presence of aniline and EDA.
[0071] FIG. 12. Yields and total turnover numbers (TTN) for Mb(H64V,V68A)-catalyzed carbene N-H insertion with various aryl amines. Reaction conditions: 10 mM amine, 10 mM EDA, 10 mM Na2S2O4 with (a) 20 μM (0.2 mol%) and (b) 1 μM (0.01 mol%) hemoprotein.
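The relationship between the catalyst concentrations and the mol% loadings quoted in these reaction conditions can be sketched as simple arithmetic. The function names below are hypothetical, and the 50% yield used in the last example is an arbitrary illustrative value, not a result from the figures. The sketch assumes the loading is expressed relative to the limiting substrate.

```python
# Illustrative arithmetic only; helper names are hypothetical.

def catalyst_loading_mol_percent(catalyst_uM: float, substrate_mM: float) -> float:
    """Catalyst loading in mol% relative to the substrate concentration."""
    if substrate_mM <= 0:
        raise ValueError("substrate concentration must be positive")
    # 1 mM = 1000 uM, so convert the substrate to uM before taking the ratio.
    return 100.0 * catalyst_uM / (substrate_mM * 1000.0)


def max_turnovers(catalyst_uM: float, substrate_mM: float) -> float:
    """Upper bound on turnovers if every substrate molecule were converted."""
    if catalyst_uM <= 0:
        raise ValueError("catalyst concentration must be positive")
    return substrate_mM * 1000.0 / catalyst_uM


def turnovers_from_yield(yield_percent: float, catalyst_uM: float, substrate_mM: float) -> float:
    """Turnovers implied by a given product yield (percent of substrate converted)."""
    return (yield_percent / 100.0) * max_turnovers(catalyst_uM, substrate_mM)


# Conditions (a) and (b) from the figure description: 10 mM amine.
loading_a = catalyst_loading_mol_percent(20.0, 10.0)   # 20 uM catalyst
loading_b = catalyst_loading_mol_percent(1.0, 10.0)    # 1 uM catalyst
ceiling_b = max_turnovers(1.0, 10.0)
# Hypothetical 50% yield at 1 uM catalyst, for illustration only.
example_ton = turnovers_from_yield(50.0, 1.0, 10.0)
```

Lowering the catalyst concentration raises the ceiling on achievable turnovers, which is presumably why the lower loading is paired with the total turnover measurements.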
[0072] FIG. 13. Total turnovers supported by the different Mb variants for formation of N— H insertion products 32b (N-methyl aniline + EDA) and 35 (aniline + iBDA).
[0073] FIG. 14. Catalytic turnovers (TON) and enantioselectivity exhibited by representative Mb catalysts for formation of carbene N-H insertion products starting from aniline and various carbene donor reagents. S.r.c. = standard reaction conditions (10 mM aniline, 10 mM diazo reagent, 0.01 mol% Mb catalyst, KPi pH 8.0, room temperature, 12 hours); n.d. = not determined.
[0074] FIG. 15. Catalytic turnovers (TON) and enantioselectivity exhibited by representative Mb catalysts for formation of carbene N-H insertion products starting from various alkyl amines and carbene donor reagents. S.r.c. = standard reaction conditions (10 mM amine, 10 mM diazo reagent, 0.01 mol% Mb catalyst, KPi pH 8.0, room temperature, 12 hours); n.d. = not determined.
[0075] FIG. 16. Catalytic activity of sperm whale myoglobin (Mb) for the carbene S-H insertion reaction with thiophenol and EDA. Reaction conditions: 400 μL-scale reactions, 12 hours, room temperature, anaerobic conditions.
[0076] FIGS. 17A-B. Mb-catalyzed S-H reaction. (A) GC chromatogram corresponding to the Mb-catalyzed S-H insertion reaction with thiophenol and EDA. The peaks corresponding to the S-H insertion product, α-(phenylthio)acetate (53), and the internal standard (ISTD) are labelled. Thiophenol elutes at 2.42 min and is completely consumed in the reaction. Trace amounts of diphenyldisulfide (labeled with *) are observed in the reaction mixture. Reaction conditions: 20 μM Mb (0.2 mol%), 10 mM thiophenol, 20 mM EDA, 10 mM dithionite in oxygen-free phosphate buffer (pH 8.0). (B) Time-dependent analysis of the formation of α-(phenylthio)acetate (53) from thiophenol and EDA in the presence of Mb as the catalyst. Reaction conditions: same as in (A).
[0077] FIG. 18. Total turnover numbers (TTN) supported by various engineered Mb variants for conversion of thiophenol and EDA to 53. Reaction conditions: 2.5 μM Mb variant, 10 mM PhSH, 20 mM EDA, 10 mM Na2S2O4 in KPi buffer (pH 8.0), 16 h. WT: wild-type.
[0078] FIG. 19. Yields and total turnover numbers (TTN) for Mb(L29A,H64V)-catalyzed carbene S-H insertion with various aryl mercaptans and α-diazo esters. Reaction conditions: 10 mM thiol, 20 mM EDA, 10 mM Na2S2O4 with (a) 20 μM (0.2 mol%) and (b) 2.5 μM (0.025 mol%) hemoprotein, 16 hours. * Buffer added with 20% (v/v) methanol.
[0079] FIG. 20. Substrate scope and catalytic activity of Mb(L29A,H64V) toward carbene S-H insertion in the presence of different alkyl mercaptans and α-diazo esters. Reaction conditions: 10 mM thiol, 20 mM diazo ester, 20 μM Mb(L29A,H64V) (0.2 mol%), 10 mM Na2S2O4 in oxygen-free phosphate buffer (pH 8.0), 12 hours. * Total turnover numbers (TTN) were measured using 0.025 mol% protein (2.5 μM).
[0080] FIG. 21. Enantioselectivity of myoglobin (Mb) and variants thereof for the carbene S-H insertion reaction in the presence of ethyl α-diazopropanoate (52e). Reaction conditions: 400 μL-scale reactions, 20 μM protein, 10 mM Na2S2O4, 12 hours, room temperature, anaerobic conditions. Enantiomeric excess (% e.e.) was determined based on chiral gas chromatography using racemic standards for calibration.
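As a minimal sketch of how an enantiomeric excess value can be derived from a chiral GC trace, the function below applies the usual definition of e.e. to the integrated peak areas of the two enantiomers. The function name and the peak areas are hypothetical, and the sketch assumes both enantiomers give the same detector response; it is not presented as the exact procedure used to generate the figures.

```python
# Illustrative sketch only; peak areas are made-up numbers.

def enantiomeric_excess(area_a: float, area_b: float) -> float:
    """Signed % e.e. from the integrated peak areas of two enantiomers.

    Positive values mean enantiomer A is in excess, negative values mean
    enantiomer B is in excess, and 0 corresponds to a racemic mixture.
    """
    total = area_a + area_b
    if area_a < 0 or area_b < 0 or total <= 0:
        raise ValueError("peak areas must be non-negative with a positive sum")
    return 100.0 * (area_a - area_b) / total


ee_enriched = enantiomeric_excess(75.0, 25.0)   # 3:1 mixture of enantiomers
ee_racemic = enantiomeric_excess(50.0, 50.0)    # racemic standard
```

The same expression applied to the summed areas of two diastereomers gives a diastereomeric excess (d.e.).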
[0081] FIG. 22. Representative chiral GC chromatograms corresponding to product 71 (a) as authentic racemic standard synthesized using Rh2(OAc)4 as the catalyst, (b) as produced from the reaction with Mb(F43V) (Entry 3, FIG. 21), (c) as produced from the reaction with Mb(F43V) under optimized conditions (Entry 8, FIG. 21). The two enantiomers of 71 are labeled ent-A and ent-B.
[0082] FIG. 23. Myoglobin-catalyzed conversion of allyl(phenyl)sulfane to the [2,3]-sigmatropic rearrangement product 92 in the presence of EDA. The table describes the catalytic activity (TON) and enantioselectivity of different engineered Mb variants. Reaction conditions: 10 mM thiol, 20 mM diazo reagent, 10 μM Mb catalyst, KPi pH 8.0, room temperature, 12 hours.
[0083] FIGS. 24-25. Representative [2,3] sigmatropic rearrangement reactions involving different sulfane substrates and carbene donor reagents as catalyzed by the myoglobin catalysts provided herein. S.r.c. = standard reaction conditions (10 mM thiol, 20 mM diazo reagent, 10 μM Mb catalyst, KPi pH 8.0, room temperature, 12 hours).
[0084] FIGS. 26A-D. Metallo-substituted Mb variants. Overlay plot of the electronic absorption spectrum of (A) wild-type Mb, (B) the Mn-containing Mb variant, and (C) the Co-containing Mb variant, in oxidized (solid line) and reduced form (dotted line). The Q band regions are enlarged in the inserts. (D) Overlay of the electronic absorption spectra of H93S and H93pAmF variants of sperm whale myoglobin.
[0085] FIG. 27. Relative yield of Mb(H64V,V68A)-catalyzed styrene cyclopropanation with ethyl 2-diazoacetate in whole-cell systems under anaerobic or aerobic conditions and in the presence or absence of glucose. Reaction conditions: 30 mM styrene, 60 mM EDA, Mb(H64V,V68A)-expressing E. coli BL21(DE3) cells at OD600 = 40, 12 hours, room temperature.
[0086] FIGS. 28A-B. Whole-cell reactions involving E. coli cells expressing Mb(H64V,V68A) for the stereoselective synthesis of chiral intermediates for the preparation of representative cyclopropane-containing drugs.
[0087] FIG. 29. Activity and selectivity of representative double and triple active site mutants of sperm whale myoglobin in the cyclopropanation of styrene (1a) with ethyl diazoacetate (2, EDA). Reaction conditions: 20 μM Mb variant, 10 mM styrene, 10 mM EDA, 10 mM dithionite, 16 h. TON: catalytic turnovers; d.e. = diastereomeric excess; e.e. = enantiomeric excess. % ee(E): Positive and negative values refer to the formation of the trans-(1S,2S) (3a) and trans-(1R,2R) (3b) stereoisomer, respectively. % ee(Z): Positive and negative values refer to the formation of the cis-(1R,2S) (3d) and cis-(1S,2R) (3c) stereoisomer, respectively.
[0088] FIG. 30. Catalytic activity of hemin and wild-type sperm whale myoglobin (Mb) in the olefination of benzaldehyde with ethyl α-diazoacetate (EDA). Reaction conditions: 10 mM 111a, 10 mM 112a, 20 μM catalyst, 10 mM Na2S2O4, and 10 mM Y in oxygen-free phosphate buffer (pH 8.0) for 12 hours at room temperature.
[0089] FIG. 31. Catalytic activity and selectivity of myoglobin variants in benzaldehyde olefination with EDA. Reaction conditions: same as described in legend of FIG. 30.
[0090] FIG. 32. Catalytic activity and selectivity of Mb(F43V,V68F) variants in benzaldehyde olefination with different α-diazo esters. Reaction conditions: same as described in legend of FIG. 30 but using 20 μM catalyst (0.2 mol%).
[0091] FIG. 33. Substrate scope for Mb(F43V,V68F)-catalyzed aldehyde olefination. Reaction conditions: 10 mM aryl aldehyde, 1 μM Mb(F43V,V68F), 10 mM cyclohexyl α-diazoacetate (112d), 10 mM AsPh3, 10 mM Na2S2O4.
5. DETAILED DESCRIPTION
[0092] Myoglobin (Mb) is a small (about 150 amino acid residues), oxygen-binding hemoprotein found in the muscle tissue of vertebrates. The physiological role of myoglobin is to bind molecular oxygen (O2) with high affinity, providing a reservoir and source of oxygen to support the aerobic metabolism of muscle tissue. Myoglobin contains a heme group (iron-protoporphyrin IX) which is coordinated at the proximal site via the imidazolyl group of a conserved histidine residue (e.g., His93 in sperm whale myoglobin). A distal histidine residue (e.g., His64 in sperm whale myoglobin) is present on the distal face of the heme ring, playing a role in favoring binding of O2 to the heme iron center. Myoglobin belongs to the globin superfamily of proteins and consists of multiple (typically eight) alpha helical segments connected by loops. In biological systems, myoglobin does not exert any catalytic function.
[0093] As disclosed herein, the inventor has discovered that engineered variants of sperm whale myoglobin can provide robust, efficient, and selective biocatalysts for promoting a variety of carbene-mediated reactions of high synthetic utility. In particular, engineered variants of sperm whale myoglobin can react with diazo-containing reagents and catalyze a variety of synthetically valuable reactions which include alkene cyclopropanation, carbene insertion into a N— H, S— H, or Si— H bond, the [2,3]-sigmatropic rearrangements of thioether and tertiary amine substrates, and aldehyde olefination.
[0094] In previous studies, cytochrome P450s were reported to catalyze the cyclopropanation of styrene in the presence of ethyl diazoacetate (EDA) (Coelho, Brustad et al. 2013) and the alkylation of aniline with EDA or ethyl 2-diazopropanoate (EDP) (Wang, Renata et al. 2014). These P450-based catalysts, however, often exhibited modest catalytic efficiency, limited substrate scope, and/or moderate or no stereoselectivity. In contrast, the inventor has discovered methods to catalyze these reactions with much higher catalytic efficiency and/or regio-, diastereo- and/or enantioselectivity using engineered variants of sperm whale myoglobin. In addition, the inventor has discovered methods, involving sperm whale myoglobin and engineered variants thereof, to catalyze a variety of other reactions, including carbene S-H insertion, carbene Si-H insertion, sigmatropic rearrangements of thioether/tertiary amine substrates, and aldehyde olefination, for which no natural or engineered biocatalysts have been reported. As such, the myoglobin-based catalysts provided herein constitute valuable and efficient catalysts for the synthesis of a variety of organic molecules, including cyclopropanes, amines, ethers, thioethers, silanes, and olefins.
[0095] FIG. 2 shows carbene transfer reactions catalyzed by the myoglobin catalysts provided herein: (a) olefin cyclopropanation; (b) carbene N— H insertion; (c) carbene S— H insertion; (d) carbene Si— H insertion.
[0096] FIG. 3 shows additional reactions catalyzed by the myoglobin catalysts provided herein: (a) [2,3] sigmatropic rearrangement of allylic thioethers; (b) [2,3] sigmatropic rearrangement of propargylic thioethers; (c) [2,3] sigmatropic rearrangement of allylic amines; (d) [2,3] sigmatropic rearrangement of propargylic amines; (e) aldehyde olefination.
[0097] In one embodiment, a method is provided for catalyzing an alkene cyclopropanation reaction to produce a product having two new C— C bonds, the method comprising:
(a) providing an alkene substrate, a diazo-containing reagent as carbene precursor, and an engineered myoglobin variant as the catalyst;
(b) contacting the alkene and the diazo-containing reagent with the engineered myoglobin variant under appropriate reaction conditions and allowing the reaction to proceed for a time sufficient to form the cyclopropanation product.
[0098] In another embodiment, a method is provided for catalyzing a carbene N-H insertion reaction to produce a product having a new C-N bond, the method comprising:
(a) providing an N-H containing substrate, a diazo-containing reagent as carbene precursor, and an engineered myoglobin variant as the catalyst;
(b) contacting the N-H containing substrate and the diazo-containing reagent with the engineered myoglobin variant under appropriate reaction conditions and allowing the reaction to proceed for a time sufficient to form a product having a new C-N bond.
[0099] In another embodiment, a method is provided for catalyzing a carbene S— H insertion reaction to produce a product having a new C— S bond, the method comprising:
(a) providing an S— H containing substrate, a diazo-containing reagent as carbene precursor, and an engineered myoglobin variant as the catalyst;
(b) contacting the S— H containing substrate and the diazo-containing reagent with the engineered myoglobin variant under appropriate reaction conditions and allowing the reaction to proceed for a time sufficient to form a product having a new C— S bond.
[00100] In another embodiment, a method is provided for catalyzing a carbene Si— H insertion reaction to produce a product having a new C— Si bond, the method comprising:
(a) providing an Si— H containing substrate, a diazo-containing reagent as carbene precursor, and an engineered myoglobin variant as the catalyst;
(b) contacting the Si— H containing substrate and the diazo-containing reagent with the engineered myoglobin variant under appropriate reaction conditions and allowing the reaction to proceed for a time sufficient to form a product having a new C— Si bond.
[00101] In another embodiment, a method is provided for catalyzing a [2,3] sigmatropic rearrangement reaction to produce a product having a new C— S bond, the method comprising:
(a) providing a thioether substrate, a diazo-containing reagent as carbene precursor, and an engineered myoglobin variant as the catalyst;
(b) contacting the thioether substrate and the diazo-containing reagent with the engineered myoglobin variant under appropriate reaction conditions and allowing the reaction to proceed for a time sufficient to form a rearrangement product having a new C— S bond.
[00102] In another embodiment, a method is provided for catalyzing a [2,3] sigmatropic rearrangement reaction to produce a product having a new C— N bond, the method comprising:
(a) providing a tertiary amine substrate, a diazo-containing reagent as carbene precursor, and an engineered myoglobin variant as the catalyst;
(b) contacting the tertiary amine substrate and the diazo-containing reagent with the engineered myoglobin variant under appropriate reaction conditions, and allowing the reaction to proceed for a time sufficient to form a rearrangement product having a new C— N bond.
[00103] In another embodiment, a method is provided for catalyzing an aldehyde olefination reaction to produce a product having a new C=C double bond, the method comprising:
(a) providing an aldehyde substrate, a diazo-containing reagent as carbene precursor, and an engineered myoglobin variant as the catalyst;
(b) contacting the aldehyde substrate and the diazo-containing reagent with the engineered myoglobin variant under appropriate reaction conditions, and allowing the reaction to proceed for a time sufficient to form an olefination product having a new C=C double bond.
[00104] Several factors contribute to the synthetic utility and convenience of the myoglobin-based catalysts disclosed in this application as compared to what is known in the art. For example, while cytochrome P450s and engineered variants thereof have been reported to catalyze the cyclopropanation of styrene derivatives with EDA, in vitro reactions with these biocatalysts are characterized by limited turnovers (<500 TON) and moderate diastereoselectivity (up to 86% d.e.) (Coelho, Brustad et al. 2013). In addition, the diastereo- and stereoselectivity of these P450-derived biocatalysts is highly dependent upon the olefin substrate (i.e., styrene derivative) (Coelho, Brustad et al. 2013). These enzymes are also large (50-115 kDa) and have limited stability in the presence of organic cosolvents and at higher temperatures, which further limits their scope and utility for practical applications.
[00105] By contrast, it is demonstrated here that certain engineered variants of sperm whale myoglobin can catalyze olefin cyclopropanation reactions with much greater efficiency (up to 46,800 TON) as well as higher diastereo- and enantioselectivity (>99% d.e. and >99% e.e.). In addition, these engineered myoglobin variants maintain their diastereo- and stereoselectivity across a variety of different olefin substrates (e.g., styrene derivatives), thus exhibiting a broad substrate scope. Finally, these myoglobin-derived biocatalysts are small (17 kDa), monomeric proteins which, as discussed below, can operate under reaction conditions in which most enzymes, including P450s, would denature or be non-functional.
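The figures of merit quoted above (TON, d.e., e.e.) follow standard definitions that the text uses without spelling out. As a minimal sketch, assuming the usual conventions (TON as moles of product per mole of catalyst; excess as the difference between two isomers over their sum), they can be computed as follows. The function names and the numbers are hypothetical and are not taken from the disclosure.

```python
def turnover_number(product_nmol: float, catalyst_nmol: float) -> float:
    """TON: amount of product formed per amount of catalyst (same units)."""
    if catalyst_nmol <= 0:
        raise ValueError("catalyst amount must be positive")
    return product_nmol / catalyst_nmol


def percent_excess(isomer_a: float, isomer_b: float) -> float:
    """Excess of one isomer over the other, in percent.

    Applies to e.e. (two enantiomers) and to d.e. (two diastereomers).
    The inputs can be peak areas or amounts, as long as both use the same units.
    """
    total = isomer_a + isomer_b
    if total <= 0:
        raise ValueError("isomer amounts must sum to a positive value")
    return 100.0 * abs(isomer_a - isomer_b) / total


# Hypothetical values chosen only for illustration.
ton = turnover_number(product_nmol=4000.0, catalyst_nmol=1.0)
ee = percent_excess(isomer_a=99.5, isomer_b=0.5)
print(f"TON = {ton:.0f}, e.e. = {ee:.1f}%")
```

Under these definitions, a ">99% e.e." result means the minor enantiomer makes up less than 0.5% of the product mixture.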
[00106] Cytochrome P450s and engineered variants thereof have been reported to catalyze a carbene N— H insertion reaction with aniline derivatives and EDA (Wang, Peck et al. 2014). However, these P450-based biocatalysts exhibit only modest catalytic efficiencies (< 500 TON) and no enantioselectivity (e.g., in the reaction of aniline with EDP) in these reactions. In addition, they have limited substrate scope, exhibiting no reactivity in the presence of alkyl amine substrates such as benzyl amine or morpholine (Wang, Peck et al. 2014).
[00107] By contrast, it is demonstrated here that certain engineered variants of sperm whale myoglobin can catalyze these carbene N— H insertion reactions with much greater efficiency (up to 7,000 TON with aniline and EDA). In addition, these myoglobin-derived biocatalysts can react with alkyl amines (e.g., benzyl amine, cyclohexyl amine, morpholine) and are capable of catalyzing carbene N-H insertion reactions in a stereoselective manner (e.g., 50% e.e. with benzyl amine and EDP), thus exhibiting a broader scope and reactivity.
[00108] It is further demonstrated that sperm whale myoglobin and engineered variants thereof can catalyze a number of other chemical transformations for which no natural or engineered biocatalysts have been reported to date. These transformations include carbene S— H insertion, carbene Si— H insertion reactions, [2,3] sigmatropic rearrangement reactions of thioether and tertiary amine substrates, and aldehyde olefination reactions.
[00109] Moreover, it is demonstrated that these myoglobin-catalyzed reactions can be performed, if desired, in aqueous solvents in the presence of large amounts (e.g., up to 40%) of an organic cosolvent (e.g., acetonitrile, tetrahydrofuran, ethanol, dimethylformamide) and/or at elevated temperatures (e.g., up to 60-70°C). These reaction conditions can be useful to facilitate dissolution of the reagent(s)/substrate(s) in the reaction medium or to accelerate the course of reaction. It is also demonstrated that these myoglobin-catalyzed reactions can be performed, if desired, in the presence of high substrate loadings (e.g., 0.1-0.3 M substrate and diazo-containing reagents). The possibility to conduct reactions in the presence of high substrate loadings is convenient toward minimizing the volume of reaction and solvent waste associated with it. Finally, it is demonstrated that these myoglobin-catalyzed reactions can be carried out in whole-cell systems, that is, employing cells expressing the myoglobin catalyst instead of purified protein. This capability is important toward eliminating the costs and time associated with the purification of the protein and thus toward optimizing the cost- and time-effectiveness of the biocatalytic process.
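The link between substrate loading and reaction volume noted above is simple arithmetic. The following sketch, with hypothetical function names and a hypothetical 10 mmol batch, shows how the volume shrinks as the loading rises within the 0.1-0.3 M range and how much of that volume a given cosolvent fraction represents. It is an illustration only, not a procedure from the disclosure.

```python
def reaction_volume_ml(substrate_mmol: float, loading_molar: float) -> float:
    """Volume (mL) needed to hold substrate_mmol at the given loading (mol/L)."""
    if loading_molar <= 0:
        raise ValueError("loading must be positive")
    # mmol divided by mol/L gives mL.
    return substrate_mmol / loading_molar


def cosolvent_volume_ml(total_ml: float, cosolvent_fraction: float) -> float:
    """Volume (mL) of organic cosolvent for a given volume fraction (0 to 1)."""
    if not 0.0 <= cosolvent_fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    return total_ml * cosolvent_fraction


# Hypothetical batch: 10 mmol of substrate at three loadings from the stated range.
for loading in (0.1, 0.2, 0.3):
    volume = reaction_volume_ml(10.0, loading)
    print(f"{loading:.1f} M -> {volume:.1f} mL total, "
          f"{cosolvent_volume_ml(volume, 0.40):.1f} mL cosolvent at 40%")
```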
[00110] For clarity of disclosure, and not by way of limitation, the detailed description is divided into the subsections set forth below.
[00111] 5.1 Definitions
[00112] The term "functional group" as used herein refers to a contiguous group of atoms that, together, may undergo a chemical reaction under certain reaction conditions. Examples of functional groups are, among many others, -OH, -NH2, -SH, -(C=O)-, -N3, -C≡CH.
[00113] The term "aliphatic" or "aliphatic group" as used herein means a straight or branched C1-15 hydrocarbon chain that is completely saturated or that contains one or more units of unsaturation, or a monocyclic C3-8 hydrocarbon, or bicyclic C8-12 hydrocarbon that is completely saturated or that contains one or more units of unsaturation, but which is not aromatic (also referred to herein as "cycloalkyl"). For example, suitable aliphatic groups include, but are not limited to, linear or branched alkyl, alkenyl, alkynyl groups or hybrids thereof such as (cycloalkyl)alkyl, (cycloalkenyl)alkyl, or (cycloalkynyl)alkyl. The alkyl, alkenyl, or alkynyl group may be linear, branched, or cyclic and may contain up to 15, up to 8, or up to 5 carbon atoms. In various embodiments, alkyl groups include methyl, ethyl, propyl, cyclopropyl, butyl, cyclobutyl, pentyl, or cyclopentyl groups. In various embodiments, alkenyl groups include propenyl, butenyl, or pentenyl groups. In various embodiments, alkynyl groups include propynyl, butynyl, or pentynyl groups.
[00114] The terms "aryl" and "aryl group" as used herein refer to an aromatic substituent containing a single aromatic ring or multiple aromatic rings that are fused together, directly linked, or indirectly linked (such as linked through a methylene or an ethylene moiety). An aryl group may contain from 5 to 24 carbon atoms, 5 to 18 carbon atoms, or 5 to 14 carbon atoms.
[00115] The term "heteroatom" means nitrogen, oxygen, or sulphur, and includes any oxidized forms of nitrogen and sulphur, and the quaternized form of any basic nitrogen. Heteroatom further includes Se, Si, and P.
[00116] The term "heteroaryl" as used herein refers to an aryl group in which at least one carbon atom is replaced with a heteroatom. For example, in an embodiment, a heteroaryl group is a 5- to 18-membered, a 5- to 14-membered, or a 5- to 10-membered aromatic ring system containing at least one heteroatom selected from the group consisting of oxygen, sulphur, and nitrogen atoms. Examples of heteroaryl groups include pyridyl, pyrrolyl, furyl, thienyl, indolyl, isoindolyl, indolizinyl, imidazolyl, pyridonyl, pyrimidyl, pyrazinyl, oxazolyl, thiazolyl, purinyl, quinolinyl, isoquinolinyl, benzofuranyl, and benzoxazolyl groups.
[00117] A heterocyclic group may be any monocyclic or polycyclic ring system which contains at least one heteroatom and may be unsaturated or partially or fully saturated. The term "heterocyclic" thus includes heteroaryl groups as defined above as well as non-aromatic heterocyclic groups. For example, in an embodiment, a heterocyclic group is a 3- to 18-membered, a 3- to 14-membered, or a 3- to 10-membered ring system containing at least one heteroatom selected from the group consisting of oxygen, sulphur, and nitrogen atoms. Examples of heterocyclic groups include the specific heteroaryl groups listed above as well as pyranyl, piperidinyl, pyrrolidinyl, dioxanyl, piperazinyl, morpholinyl, thiomorpholinyl, morpholinosulfonyl, tetrahydroisoquinolinyl, and tetrahydrofuranyl groups.
[00118] A halogen atom may be a fluorine, chlorine, bromine, or iodine atom.
[00119] The term "substituents" refers to a contiguous group of atoms. Examples of "substituents" include, without limitation: alkoxy, aryloxy, alkyl, heteroatom-containing alkyl, alkenyl, heteroatom-containing alkenyl, alkynyl, heteroatom-containing alkynyl, aryl, heteroatom-containing aryl, alkoxy, heteroatom-containing alkoxy, aryloxy, heteroatom-containing aryloxy, halo, hydroxyl (— OH), sulfhydryl (— SH), substituted sulfhydryl, carbonyl (— CO— ), thiocarbonyl (— CS— ), carboxy (— COOH), amino (— NH2), substituted amino, nitro (— NO2), nitroso (— NO), sulfo (— SO2— OH), cyano (— C≡N), cyanato (— O— C≡N), thiocyanato (— S— C≡N), formyl (— CO— H), thioformyl (— CS— H), phosphono (— P(O)OH2), substituted phosphono, and phospho (— PO2).
[00120] By "optionally substituted", it is intended that in any of the chemical groups listed above (e.g., alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, aryl, heteroaryl, heterocyclic, triazolyl groups), one or more hydrogen atoms are optionally replaced with an atom or chemical group other than hydrogen. Specific examples of such substituents include, without limitation, halogen atoms, hydroxyl (— OH), sulfhydryl (— SH), substituted sulfhydryl, carbonyl (— CO— ), carboxy (— COOH), amino (— NH2), nitro (— NO2), sulfo (— SO2— OH), cyano (— C≡N), thiocyanato (— S— C≡N), phosphono (— P(O)OH2), alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, aryl, heteroaryl, heterocyclic, alkylthiol, alkyloxy, alkylamino, arylthiol, aryloxy, or arylamino groups. Where "optionally substituted" modifies a series of groups separated by commas (e.g., "optionally substituted A, B, or C"; or "A, B, or C optionally substituted with"), it is intended that each of the groups (e.g., A, B, or C) is optionally substituted.
[00121] The term "heteroatom-containing aliphatic" as used herein refers to an aliphatic moiety where at least one carbon atom is replaced with a heteroatom, e.g., oxygen, nitrogen, sulphur, selenium, phosphorus, or silicon, and typically oxygen, nitrogen, or sulphur.
[00122] The terms "alkyl" and "alkyl group" as used herein refer to a linear, branched, or cyclic saturated hydrocarbon typically containing 1 to 24 carbon atoms, 1 to 18 carbon atoms or 1 to 12 carbon atoms, such as methyl, ethyl, n-propyl, isopropyl, n-butyl, isobutyl, t-butyl, octyl, decyl and the like.
[00123] The term "heteroatom-containing alkyl" as used herein refers to an alkyl moiety where at least one carbon atom is replaced with a heteroatom, e.g., oxygen, nitrogen, sulphur, phosphorus, or silicon, and typically oxygen, nitrogen, or sulphur.
[00124] The terms "alkenyl" and "alkenyl group" as used herein refer to a linear, branched, or cyclic hydrocarbon group of 2 to 24 carbon atoms, 2 to 18 carbon atoms, or 2 to 12 carbon atoms, containing at least one double bond, such as ethenyl, n-propenyl, isopropenyl, n-butenyl, isobutenyl, octenyl, decenyl, and the like.
[00125] The term "heteroatom-containing alkenyl" as used herein refers to an alkenyl moiety where at least one carbon atom is replaced with a heteroatom.
[00126] The terms "alkynyl" and "alkynyl group" as used herein refer to a linear, branched, or cyclic hydrocarbon group of 2 to 24 carbon atoms, 2 to 18 carbon atoms, or 2 to 12 carbon atoms, containing at least one triple bond, such as ethynyl, n-propynyl, and the like.
[00127] The term "heteroatom-containing alkynyl" as used herein refers to an alkynyl moiety where at least one carbon atom is replaced with a heteroatom.
[00128] The term "heteroatom-containing aryl" as used herein refers to an aryl moiety where at least one carbon atom is replaced with a heteroatom.
[00129] The terms "alkoxy" and "alkoxy group" as used herein refer to an aliphatic group or a heteroatom-containing aliphatic group bound through a single, terminal ether linkage.
[00130] The terms "aryloxy" and "aryloxy group" as used herein refer to an aryl group or a heteroatom-containing aryl group bound through a single, terminal ether linkage.
[00131] The term "contact" as used herein with reference to interactions of chemical units indicates that the chemical units are at a distance that allows short range non-covalent interactions (such as Van der Waals forces, hydrogen bonding, hydrophobic interactions, electrostatic interactions, dipole-dipole interactions) to dominate the interaction of the chemical units. For example, when a protein is 'contacted' with a chemical species, the protein is allowed to interact with the chemical species so that a reaction between the protein and the chemical species can occur.
[00132] The terms "polypeptide" and "protein" as used herein refer to any chain of two or more amino acids bonded in sequence, regardless of length or post-translational modification. According to their common use in the art, the term "protein" refers to any polypeptide consisting of more than 50 amino acid residues. These definitions are, however, not intended to be limiting.
[00133] The term "metalloprotein" refers to a protein that contains one or more metal ions. Typically, the metal ion(s) confers the protein with catalytic activity (e.g., iron atom in cytochrome P450 enzymes) or other properties such as that of binding other molecules (e.g., iron atom in myoglobin and hemoglobin).
[00134] The term "hemoprotein" or "heme-containing protein" refers to a protein containing a heme group (iron-protoporphyrin IX). The term "enzyme" refers to a protein capable of catalyzing a reaction as part of its native biological function. By way of explanation, a cytochrome P450 monooxygenase, whose native function is typically that of catalyzing an oxygenation reaction (e.g., hydroxylation), is a hemoprotein and an enzyme, and more specifically a heme enzyme. Myoglobin, whose native function is that of binding and releasing oxygen, is a hemoprotein but not an enzyme (or heme enzyme).
[00135] The term "carbene equivalent" or "carbene precursor" refers to a molecule that can be decomposed in the presence of a transition metal catalyst or a metalloprotein catalyst to a structure that contains at least one divalent carbon with only 6 valence shell electrons and that can be transferred to carbon-carbon double bonds to form cyclopropanes or to carbon— hydrogen or heteroatom— hydrogen bonds to form products with new C— C or C— heteroatom bonds.
[00136] The term "diazo-containing reagent" or "diazo-containing compound" refers to an organic molecule that contains a diazo group (=N2). Diazo-containing reagents can serve as carbene precursor molecules for the carbene transfer reactions encompassed in this disclosure. Non-limiting examples of diazo-containing reagents are α-diazo-esters, α-diazo-amides, α-diazo-ketones, α-cyano-α-diazo-esters, α-nitro-α-diazo-esters, and α-keto-α-diazo-esters.
[00137] The term "carbene transfer" refers to a chemical transformation where a carbene equivalent is added to a carbon-carbon double bond, a carbon-heteroatom double bond or inserted into carbon-hydrogen or heteroatom— hydrogen bond.
[00138] The term "carbene acceptor substrate" as used herein refers to any compound that can be made to react with a carbene precursor reagent in a myoglobin-catalyzed reaction according to the methods provided herein, thereby forming a product carrying one or more new C— C, C— N, C— S, and/or C— Si bond(s). Representative examples of carbene acceptor substrates are compounds of formula (II), (IV), (VI), (VIII), (X), (XI), (XIV) or (XV) as defined below.
[00139] The term "heme" refers to iron-protoporphyrin IX.
[00140] The terms "heme analog" and "metalloporphyrin" as used herein refer to any metal-containing porphyrin molecule other than iron-protoporphyrin IX. Examples of heme analogs include but are not limited to iron-deuteroporphyrin, iron-mesoporphyrin, iron-protoporphyrin, iron-bisglycolporphyrin, etc. These porphyrin molecules may contain metals other than Fe, including but not limited to Mn, Co, Ni, Cu, Rh, Ru, and Os.
[00141] The term "metalloporphyrin analog" refers to a pyrrole-containing macrocyclic molecule that can coordinate a metal ion (e.g., Fe, Mn, Co, Rh, Ru, or Os) with a square planar geometry. Examples of porphyrin analogs include but are not limited to corroles, phthalocyanines, phlorins, chlorins, 5-isocorroles, 10-isocorroles, and porphycenes.
[00142] The term "anaerobic", when used in reference to a reaction, culture or growth condition, refers to a condition in which the concentration of oxygen is less than about 25 μM, less than about 5 μM, or less than 1 μM. The term is also intended to include sealed chambers of liquid or solid medium maintained with an atmosphere of less than about 1% oxygen. In an embodiment, anaerobic conditions are achieved by sparging a reaction mixture with an inert gas such as nitrogen or argon.
[00143] The term "heterologous" as used herein with reference to molecules, and in particular enzymes and polynucleotides, indicates molecules that are expressed in an organism other than the organism from which they originated or are found in nature, independently of the level of expression that can be lower, equal or higher than the level of expression of the molecule in the native microorganism.
[00144] The term "homolog," as used herein with respect to an original enzyme or gene of a first family or species, refers to distinct enzymes or genes of a second family or species which are determined by functional, structural or genomic analyses to be an enzyme or gene of the second family or species which corresponds to the original enzyme or gene of the first family or species. Homologs most often have functional, structural, or genomic similarities. Techniques are known by which homologs of an enzyme or gene can readily be cloned using genetic probes and PCR. Identity of cloned sequences as homologs can be confirmed using functional assays and/or by genomic mapping of the genes.
[00145] A protein has "homology" or is "homologous" to a second protein if the amino acid sequence encoded by a gene has a similar amino acid sequence to that of the second gene. Alternatively, a protein has homology to a second protein if the two proteins have "similar" amino acid sequences. Thus, the term "homologous proteins" is intended to mean that the two proteins have similar amino acid sequences. In particular embodiments, the homology between two proteins is indicative of their shared ancestry, related by evolution.
[00146] The terms "analog" and "analogous" include nucleic acid or protein sequences or protein structures that are related to one another in function only and are not from common descent or do not share a common ancestral sequence. Analogs may differ in sequence but may share a similar structure, due to convergent evolution. For example, two enzymes are analogs or analogous if the enzymes catalyze the same reaction of conversion of a substrate to a product and are unrelated in sequence, irrespective of whether the two enzymes are related in structure.
[00147] In general, the term "mutant" or "variant" as used herein with reference to a molecule such as polynucleotide or polypeptide, indicates that such molecule has been mutated from the molecule as it exists in nature. In particular, the terms "mutate" and "mutation" as used herein indicate any modification of a nucleic acid and/or polypeptide which results in an altered nucleic acid or polypeptide. Mutations include any process or mechanism resulting in a mutant protein, enzyme, polynucleotide, or gene. A mutation can occur in a polynucleotide or gene sequence, by point mutations, deletions, or insertions of single or multiple nucleotide residues. A mutation in a polynucleotide includes mutations arising within a protein-encoding region of a gene as well as mutations in regions outside of a protein-encoding sequence, such as, but not limited to, regulatory or promoter sequences. A mutation in a coding polynucleotide such as a gene can be "silent", i.e., not reflected in an amino acid alteration upon expression, leading to a "sequence-conservative" variant of the gene. A mutation in a polypeptide includes but is not limited to mutation in the polypeptide sequence and mutation resulting in a modified amino acid. Non-limiting examples of a modified amino acid include a glycosylated amino acid, a sulfated amino acid, a prenylated (e.g., farnesylated, geranylgeranylated) amino acid, an acetylated amino acid, an acylated amino acid, a PEGylated amino acid, a biotinylated amino acid, a carboxylated amino acid, a phosphorylated amino acid, and the like.
[00148] The term "engineer" or "engineered" refers to any manipulation of a molecule that results in a detectable change in the molecule, wherein the manipulation includes but is not limited to inserting a polynucleotide and/or polypeptide heterologous to the cell and mutating a polynucleotide and/or polypeptide native to the cell.
[00149] The terms "myoglobin-based catalyst" or simply "myoglobin catalyst" as used herein refer to any polypeptide which shares at least 60% sequence identity to SEQ. ID NO: 1 and exhibits carbene transfer reactivity within the scope of the disclosed compositions and methods. Myoglobin catalysts also comprise engineered variants of sperm whale myoglobin (SEQ. ID NO: 1), in which the naturally occurring heme cofactor is substituted for a heme analog, a metalloporphyrin (e.g., Co- or Mn-protoporphyrin IX), or a metalloporphyrin analog. Myoglobin catalysts also comprise engineered variants of sperm whale myoglobin (SEQ. ID NO: 1), in which the amino acid residue (e.g., His93) involved in coordinating the metal atom in the protein-bound cofactor (e.g., heme) is substituted for another naturally occurring amino acid (e.g., A, R, N, D, C, Q, E, G, I, L, K, M, F, P, S, T, U, V, W, or Y) or for an unnatural amino acid (e.g., para-amino-phenylalanine, para-mercaptomethyl-phenylalanine, 3-pyridyl-alanine, 3-methyl-histidine). Myoglobin catalysts further comprise polypeptides that share at least 60% sequence identity to SEQ. ID NOS: 112, 113, 114, 115, or 116, or engineered variants thereof.
[00150] The term "polynucleotide molecule" as used herein refers to any chain of two or more nucleotides bonded in sequence. For example, a nucleic acid molecule can be a DNA or a RNA.
[00151] The terms "vector" and "vector construct" as used herein refer to a vehicle by which a DNA or RNA sequence (e.g., a foreign gene) can be introduced into a host cell, so as to transform the host and promote expression (e.g., transcription and translation) of the introduced sequence. A common type of vector is a "plasmid", which generally is a self-contained molecule of double-stranded DNA that can readily accept additional (foreign) DNA and which can readily be introduced into a suitable host cell. A large number of vectors, including plasmid and fungal vectors, have been described for replication and/or expression in a variety of eukaryotic and prokaryotic hosts. Non-limiting examples include pKK plasmids (Clontech), pUC plasmids, pET plasmids (Novagen, Inc., Madison, Wis.), pRSET or pREP plasmids (Invitrogen, San Diego, Calif.), or pMAL plasmids (New England Biolabs, Beverly, Mass.), and many appropriate host cells, using methods disclosed or cited herein or otherwise known to those skilled in the relevant art. The terms "express" and "expression" refer to allowing or causing the information in a gene or DNA sequence to become manifest, for example producing a protein by activating the cellular functions involved in transcription and translation of a corresponding gene or DNA sequence. A DNA sequence is expressed in or by a cell to form an "expression product" such as a protein. The expression product itself, e.g., the resulting protein, may also be said to be "expressed" by the cell. A polynucleotide or polypeptide is expressed recombinantly, for example, when it is expressed or produced in a foreign host cell under the control of a foreign or native promoter, or in a native host cell under the control of a foreign promoter.
[00152] The term "fused" as used herein means being connected through one or more covalent bonds. The term "bound" as used herein means being connected through non-covalent interactions. Examples of non-covalent interactions are van der Waals, hydrogen bond, electrostatic, and hydrophobic interactions. The term "tethered" as used herein means being connected through covalent or non-covalent interactions. Thus, a "polypeptide tethered to a solid support" refers to a polypeptide that is connected to a solid support (e.g., surface, resin bead) either via non-covalent interactions or through covalent bonds.
[00153] 5.2 Myoglobin catalysts
[00154] Myoglobin catalysts are provided that are capable of promoting carbene transfer reactions with high efficiency and/or selectivity and across a broader range of substrates.
[00155] Myoglobin catalysts are provided having the capability to catalyze a carbene transfer reaction, wherein the myoglobin catalyst comprises an amino acid sequence having at least 60%, 80% or 90% sequence identity to SEQ. ID NOS: 1, 112, 113, 114, 115, or 116.
[00156] In some embodiments, the capability to catalyze a carbene transfer reaction corresponds to the capability of the myoglobin catalyst to react with a diazo-containing reagent and catalyze a carbene addition to an alkene group of an alkene-containing molecule. In other embodiments, such capability corresponds to the capability of the myoglobin catalyst to react with a diazo-containing reagent and catalyze a carbene insertion into the N— H bond of an N— H bond containing molecule. In other embodiments, such capability corresponds to the capability of the myoglobin catalyst to react with a diazo-containing reagent and catalyze a carbene insertion into the S— H bond of an S— H bond containing molecule. In other embodiments, such capability corresponds to the capability of the myoglobin catalyst to react with a diazo-containing reagent and catalyze a carbene insertion into the Si— H bond of a Si— H bond containing molecule. In other embodiments, such capability corresponds to the capability of the myoglobin catalyst to react with a diazo-containing reagent and catalyze [2,3] sigmatropic rearrangement in the presence of a thioether substrate to give a molecule with a new C— S bond. In other embodiments, such capability corresponds to the capability of the myoglobin catalyst to react with a diazo-containing reagent and catalyze [2,3] sigmatropic rearrangement in the presence of a tertiary amine substrate to give a molecule with a new C— N bond. In other embodiments, such capability corresponds to the capability of the myoglobin catalyst to react with a diazo-containing reagent and catalyze an olefination reaction in the presence of an aldehyde substrate to give a molecule with a new C=C double bond.
[00157] Myoglobin catalysts are provided that are capable of catalyzing the aforementioned reactions, and which have an improved property compared with a reference myoglobin such as wild-type sperm whale myoglobin (SEQ ID NO: 1), or when compared to another hemoprotein such as CYP102A1 (P450BM3) from Bacillus megaterium (SEQ ID NO: 111).
[00158] In the characterization of the myoglobin catalysts provided herein, the polypeptides can be described in reference to the amino acid sequence of a naturally occurring myoglobin or another engineered myoglobin variant. As such, the amino acid residue is determined in the myoglobin polypeptide beginning from the first amino acid after the initial methionine (M) residue (i.e., the first amino acid after the initial methionine M represents residue position 1). It will be understood that the initiating methionine residue may be removed by biological processing machinery such as in a host cell or in vitro translation system, to generate a mature protein lacking the initiating methionine residue. The amino acid residue position at which a particular amino acid or amino acid change is present is sometimes described herein as "Xn", or "position n", where n refers to the residue position.
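The numbering convention above (position 1 is the first residue after the initiating methionine) is easy to get wrong by one when working with a sequence string that still carries the leading M. A minimal sketch of the mapping follows. The helper names and the six-residue toy sequence are hypothetical, and the toy sequence is not SEQ ID NO: 1.

```python
def residue_at(sequence_with_met: str, position: int) -> str:
    """Return the residue at 'position n', counted from the residue after the initial Met."""
    if not sequence_with_met.startswith("M"):
        raise ValueError("expected a sequence beginning with the initiating methionine")
    if not 1 <= position <= len(sequence_with_met) - 1:
        raise ValueError("position out of range")
    # String index 0 holds the initiating Met, so position n sits at index n.
    return sequence_with_met[position]


def mature_sequence(sequence_with_met: str) -> str:
    """Sequence after removal of the initiating methionine, if present."""
    if sequence_with_met.startswith("M"):
        return sequence_with_met[1:]
    return sequence_with_met


toy = "MVLSEG"  # hypothetical toy sequence for illustration only
print(residue_at(toy, 1))    # first residue after the initial Met
print(mature_sequence(toy))  # same residues, numbered from 1 in the mature protein
```

With this convention, position n in the Met-containing string and position n in the mature protein refer to the same residue, which is why the string index and the position number coincide in `residue_at`.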
[00159] As described above, the myoglobin catalysts provided herein are characterized by an improved property as compared to the wild-type sperm whale myoglobin (SEQ ID NO: 1) or another reference hemoprotein (e.g., SEQ ID NO: 111). Changes to such properties can include, among others, improvements in catalytic efficiency, number of catalytic turnovers supported by the biocatalyst, regioselectivity, diastereoselectivity, enantioselectivity and/or reduced substrate or product inhibition. In the embodiments herein, the altered properties are based on engineered myoglobin polypeptides having residue differences at specific residue positions as compared to wild-type sperm whale myoglobin (SEQ ID NO: 1).
[00160] In some embodiments, the myoglobin catalyst is an engineered variant of sperm whale myoglobin (SEQ ID NO: 1), the variant comprising an amino acid change at one or more of the following positions of SEQ ID NO: 1: X29, X32, X33, X39, X44, X45, X46, X64, X67, X68, X93, X107, and X111.
[00161] In some embodiments, the myoglobin catalysts can have additionally one or more residue differences at residue positions not specified by an X above as compared to the sequence SEQ ID NO: 1. In some embodiments, the differences can be 1-2, 1-5, 1-10, 1-20, 1-30, 1-40, 1-50, 1-75, or 1-90 residue differences at other amino acid residue positions not defined by X above.
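To illustrate how a variant's residue differences can be sorted into those at the listed X positions and those elsewhere, here is a minimal sketch. It assumes the common one-letter mutation notation (original residue, position, new residue, e.g. "H64V"), which is an assumption of this example rather than a format defined above. The position set is taken from the list above, and the mutation labels are hypothetical.

```python
import re

# Positions X29 ... X111 as listed in the text, numbered relative to SEQ ID NO: 1.
LISTED_POSITIONS = frozenset({29, 32, 33, 39, 44, 45, 46, 64, 67, 68, 93, 107, 111})

# Assumed notation: one-letter original residue, position, one-letter new residue.
_MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(label: str) -> tuple[str, int, str]:
    """Split a label such as 'H64V' into (original, position, replacement)."""
    match = _MUTATION.match(label)
    if match is None:
        raise ValueError(f"unrecognized mutation label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def split_by_listed_positions(labels: list[str]) -> tuple[list[str], list[str]]:
    """Separate mutations at listed X positions from mutations at other positions."""
    listed, other = [], []
    for label in labels:
        _, position, _ = parse_mutation(label)
        (listed if position in LISTED_POSITIONS else other).append(label)
    return listed, other


# Hypothetical variant used only to exercise the helpers.
at_listed, elsewhere = split_by_listed_positions(["H64V", "V68A", "K50R"])
print(at_listed, elsewhere)
```

The length of the second list corresponds to the count of "other" residue differences that the text bounds with ranges such as 1-2 or 1-10.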
[00162] In some embodiments, the myoglobin catalysts having one or more of the improved enzyme properties described herein, can comprise an amino acid sequence that is at least 60%, 70%, 80%, 85%, 90%, 95%, 99% or more identical to the sequence SEQ ID NO: 1.
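As a hedged illustration of the percent-identity thresholds in paragraph [00162], the sketch below scores two sequences that have already been aligned. The function name `percent_identity`, the use of `-` as the gap character, and the choice of denominator (alignment columns in which at least one sequence has a residue) are assumptions; the disclosure does not prescribe a particular identity calculation, and other conventions exist.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Gaps are written as '-'. Columns that are gaps in both sequences are
    ignored; a residue opposite a gap counts as a mismatch (assumption).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # column carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Hypothetical toy sequences: one substitution in ten residues.
print(percent_identity("VLSEGEWQLV", "VLSAGEWQLV"))  # 90.0
```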
[00163] In some embodiments, the improved myoglobin catalyst can comprise an amino acid sequence that is at least 60%, 70%, 80%, 85%, 90%, 95%, 99% or more identical to a sequence corresponding to SEQ ID NO: 112, 113, 114, 115, or 116.
[00164] In some embodiments, the improved myoglobin catalyst comprises an amino acid sequence corresponding to a sequence selected from the group consisting of SEQ ID NOS: 2 - 110.
[00165] In some embodiments, the improved property of the myoglobin catalyst is with respect to its catalytic activity, regioselectivity, diastereoselectivity, and/or enantioselectivity.
[00166] The improvement in catalytic activity can be manifested by an increase in the number of catalytic turnovers (TON) supported by the myoglobin catalyst for the carbene transfer reaction, as compared to wild-type sperm whale myoglobin (SEQ ID NO: 1), or other reference sequences (e.g., SEQ ID NO: 111). In some embodiments, the myoglobin catalysts are capable of supporting a number of catalytic turnovers (TON) that is at least 1.1-fold, 2-fold, 5-fold, 10-fold, 100-fold, 200-fold, 500-fold, or more higher than the number of catalytic turnovers supported by the polypeptide having sequence SEQ ID NO: 1.
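The fold-improvement comparison of paragraph [00166] can be sketched as follows, assuming that TON is taken as moles of product formed per mole of catalyst. The function names and all numerical values are hypothetical and do not represent measured data.

```python
def turnover_number(product_nmol: float, catalyst_nmol: float) -> float:
    """TON as amount of product per amount of catalyst (same units)."""
    if catalyst_nmol <= 0:
        raise ValueError("catalyst amount must be positive")
    return product_nmol / catalyst_nmol


def fold_improvement(ton_variant: float, ton_reference: float) -> float:
    """Ratio of the variant TON to the reference (e.g., wild-type) TON."""
    if ton_reference <= 0:
        raise ValueError("reference TON must be positive")
    return ton_variant / ton_reference


# Hypothetical amounts, for illustration only.
ton_ref = turnover_number(product_nmol=400, catalyst_nmol=1)
ton_var = turnover_number(product_nmol=2000, catalyst_nmol=1)
print(fold_improvement(ton_var, ton_ref))  # 5.0
```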
[00167] The improvement in catalytic activity can also be manifested by an increase in the catalytic efficiency for the carbene transfer reaction, this catalytic efficiency being conventionally defined by the kcat/KM ratio, where kcat is the turnover number and KM is the Michaelis-Menten constant, as compared to wild-type sperm whale myoglobin (SEQ ID NO: 1), or the reference sequence SEQ ID NO: 111. In some embodiments, the myoglobin catalysts exhibit a catalytic efficiency that is at least 1.1-fold, 2-fold, 5-fold, 10-fold, 100-fold, 200-fold, 500-fold, or more higher than the catalytic efficiency of the polypeptide with sequence SEQ ID NO: 1.
[00168] In some embodiments, the myoglobin catalysts having improved catalytic activity toward alkene cyclopropanation, toward carbene Y—H insertion, where Y is S, N, or Si, toward [2,3] sigmatropic rearrangement of a thioether or tertiary amine substrate, and/or toward aldehyde olefination, comprise an amino acid sequence selected from the group consisting of SEQ ID NO: 2 - 110.
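The kcat/KM ratio referred to in paragraph [00167] can be illustrated with a short sketch based on the standard Michaelis-Menten rate law, v = kcat[E][S] / (KM + [S]). The function names and the numerical values are assumptions chosen for illustration; they are not kinetic parameters reported in this disclosure.

```python
def catalytic_efficiency(kcat: float, km: float) -> float:
    """Catalytic efficiency kcat/KM (e.g., kcat in 1/s and KM in M)."""
    if km <= 0:
        raise ValueError("KM must be positive")
    return kcat / km


def michaelis_menten_rate(kcat: float, km: float,
                          enzyme: float, substrate: float) -> float:
    """Initial rate v = kcat * [E] * [S] / (KM + [S])."""
    return kcat * enzyme * substrate / (km + substrate)


# Hypothetical parameters: at [S] = KM the rate is half of kcat * [E].
print(michaelis_menten_rate(kcat=10.0, km=2.0, enzyme=1.0, substrate=2.0))  # 5.0
```

At substrate concentrations well below KM the rate approaches (kcat/KM)[E][S], which is why the ratio is used to compare catalysts.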
[00169] In some embodiments, the improvement in diastereoselectivity can be manifested by an increase in the diastereoselectivity by which a C=C double bond in an alkene-containing substrate is cyclopropanated by action of the myoglobin catalyst in the presence of a carbene precursor reagent as compared to the wild-type parental sequence (SEQ ID NO: 1) or the reference sequence SEQ ID NO: 111. The degree of diastereoselectivity can be conventionally described in terms of diastereomeric excess (d.e.). In some embodiments, the improvement in diastereoselectivity exhibited by the myoglobin catalyst is with respect to producing the (E) diastereomer of the cyclopropanation product (i.e., diastereomer in which the configuration of the cyclopropane ring is trans or (E)). In some embodiments, such improvement in diastereoselectivity is with respect to producing the (Z) diastereomer of the cyclopropanation product. In some embodiments, the myoglobin catalysts are capable of cyclopropanating an alkene-containing substrate with a (Z)- or (E)-diastereoselectivity (i.e., diastereomeric excess) that is at least 1%, 2%, 5%, 10%, 20%, 30%, 40%, 50%, 75%, 90%, 98%, 99% or more higher than that exhibited by the wild-type parental sequence SEQ ID NO: 1, or the reference sequence SEQ ID NO: 111.
[00170] In some embodiments, the improvement in enantioselectivity can be manifested by an increase in the enantioselectivity by which a C=C double bond in an alkene-containing substrate is cyclopropanated by action of the myoglobin catalyst in the presence of a carbene precursor reagent, as compared to the wild-type parental sequence (SEQ ID NO: 1) or the reference sequence SEQ ID NO: 111. The degree of stereoselectivity can be conventionally described in terms of stereomeric excess, that is, in terms of enantiomeric excess (e.e.) or diastereomeric excess (d.e.) depending on the nature of the substrate. In some embodiments, the improvement in enantioselectivity exhibited by the myoglobin catalyst is with respect to producing the (1S,2S) stereoisomer of the cyclopropanation product. In some embodiments, such improvement in stereoselectivity is with respect to producing the (1R,2R), (1S,2R), or (1R,2S) stereoisomer of the cyclopropanation product. In some embodiments, the myoglobin catalysts are capable of cyclopropanating an alkene-containing substrate with a stereoselectivity (i.e., stereomeric excess) that is at least 1%, 2%, 5%, 10%, 20%, 30%, 40%, 50%, 75%, 90%, 98%, 99% or more higher than that exhibited by the wild-type parental sequence SEQ ID NO: 1, or the reference sequence SEQ ID NO: 111.
[00171] In some embodiments, the improvement in enantioselectivity can be manifested by an increase in the enantioselectivity by which a Y—H bond, where Y is S, N, or Si, is functionalized via a carbene insertion reaction by action of the myoglobin catalyst, as compared to the wild-type parental sequence SEQ ID NO: 1. The degree of stereoselectivity can be conventionally described in terms of stereomeric excess, that is, in terms of enantiomeric excess (e.e.) or diastereomeric excess (d.e.) depending on the nature of the substrate. In some embodiments, the improvement in enantioselectivity exhibited by the myoglobin catalyst is with respect to producing the (S) stereoisomer of the carbene insertion product. In some embodiments, such improvement in stereoselectivity is with respect to producing the (R) stereoisomer of the carbene insertion product. In some embodiments, the engineered myoglobin catalysts are capable of catalyzing a carbene Y—H insertion reaction, where Y is S, N, or Si, with a stereoselectivity (i.e., stereomeric excess) that is at least 1%, 2%, 5%, 10%, 20%, 30%, 40%, 50%, 75%, 90%, 98%, 99% or more higher than that exhibited by the wild-type parental sequence SEQ ID NO: 1, or the reference sequence SEQ ID NO: 111.
[00172] In some embodiments, the improvement in enantioselectivity can be manifested by an increase in the enantioselectivity by which a [2,3] sigmatropic rearrangement of a thioether or amine substrate is catalyzed by action of the myoglobin catalyst, as compared to the wild-type parental sequence SEQ ID NO: 1. The degree of stereoselectivity can be conventionally described in terms of stereomeric excess, that is, in terms of enantiomeric excess (e.e.) or diastereomeric excess (d.e.) depending on the nature of the substrate. In some embodiments, the improvement in enantioselectivity exhibited by the myoglobin catalyst is with respect to producing the (S) stereoisomer of the rearrangement product. In some embodiments, such improvement in stereoselectivity is with respect to producing the (R) stereoisomer of the rearrangement product. In some embodiments, the myoglobin catalysts are capable of catalyzing the [2,3] sigmatropic rearrangement of a thioether or amine substrate with a stereoselectivity (i.e., stereomeric excess) that is at least 1%, 2%, 5%, 10%, 20%, 30%, 40%, 50%, 75%, 90%, 98%, 99% or more higher than that exhibited by the wild-type parental sequence SEQ ID NO: 1, or the reference sequence SEQ ID NO: 111.
[00173] In some embodiments, the myoglobin catalysts having improved catalytic activity toward alkene cyclopropanation, and/or toward carbene Y—H insertion, where Y is S, N, or Si, and/or toward [2,3] sigmatropic rearrangement of a thioether or tertiary amine substrate, comprise an amino acid sequence corresponding to SEQ ID NO: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, or 13.
[00174] In some embodiments, the improvement in diastereoselectivity can be manifested by an increase in the diastereoselectivity by which a carbonyl group (C=O) in an aldehyde substrate is converted to an alkene group (C=C) ("olefination") by action of the myoglobin catalyst in the presence of a carbene precursor reagent, as compared to the wild-type parental sequence (SEQ ID NO: 1) or the reference sequence SEQ ID NO: 111. The degree of diastereoselectivity can be conventionally described in terms of diastereomeric excess (d.e.). In some embodiments, the improvement in diastereoselectivity exhibited by the myoglobin catalyst is with respect to producing the (E) diastereomer of the aldehyde olefination product (i.e., diastereomer in which the configuration of the alkene is trans or (E)). In some embodiments, such improvement in diastereoselectivity is with respect to producing the (Z) diastereomer of the aldehyde olefination product. In some embodiments, the myoglobin catalysts are capable of olefinating an aldehyde substrate with a (Z)- or (E)-diastereoselectivity (i.e., diastereomeric excess) that is at least 1%, 2%, 5%, 10%, 20%, 30%, 40%, 50%, 75%, 90%, 98%, 99% or more higher than that exhibited by the wild-type parental sequence SEQ ID NO: 1, or the reference sequence SEQ ID NO: 111.
[00175] The capability of the myoglobin catalysts to catalyze any of the aforementioned carbene transfer reactions can be established according to methods well known in the art. Most typically, such capability can be established by contacting the substrate with the myoglobin catalyst under suitable reaction conditions in which the myoglobin catalyst is functional (e.g., under reducing and anaerobic conditions), and then determining the formation of the desired product (e.g., cyclopropanation, carbene Y—H insertion, rearrangement, or aldehyde olefination product) by standard analytical methods such as, for example, thin-layer chromatography, HPLC, GC, LC-MS, and/or GC-MS. A person skilled in the art will be capable of selecting the most appropriate method in each case.
[00176] Such catalytic activity of the myoglobin catalysts can be measured and expressed in terms of number of catalytic turnovers, product formation rate, catalytic efficiency (kcat/KM ratio), and the like. Most conveniently, such substrate activity can be measured and expressed in terms of turnover numbers (TON) or total turnover numbers (TTN), the latter corresponding to the total number of catalytic turnovers supported by the myoglobin catalyst in the presence of a given carbene acceptor substrate (e.g., styrene or aniline) and carbene donor (e.g., ethyl diazoacetate, ethyl α-diazopropanoate).
[00177] The diastereo- and stereoselectivity of the myoglobin catalysts for any of the aforementioned carbene transfer reactions can be measured by determining the relative distribution of stereoisomeric products generated by the reaction using conventional analytical methods such as, for example, (chiral) normal phase liquid chromatography, (chiral) reverse-phase liquid chromatography, or (chiral) gas chromatography.
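The conversion of a measured product distribution into diastereomeric and enantiomeric excess, as contemplated in paragraph [00177], can be sketched as follows. The sketch assumes four resolved stereoisomer peaks from a chiral chromatogram; the dictionary keys (`E_1`, `E_2`, `Z_1`, `Z_2`), the function names, and the peak areas are hypothetical. A positive d.e. here means the (E) diastereomer is in excess, and the e.e. is reported for the (E) pair.

```python
from typing import Dict, Tuple


def excess(major: float, minor: float) -> float:
    """Percent excess of one component over the other."""
    total = major + minor
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return 100.0 * (major - minor) / total


def selectivity_from_areas(areas: Dict[str, float]) -> Tuple[float, float]:
    """Return (d.e., e.e.) in percent from four stereoisomer peak areas.

    'E_1' and 'E_2' are the two enantiomers of the (E) diastereomer;
    'Z_1' and 'Z_2' are the two enantiomers of the (Z) diastereomer.
    """
    e_total = areas["E_1"] + areas["E_2"]
    z_total = areas["Z_1"] + areas["Z_2"]
    de = excess(e_total, z_total)             # (E) over (Z)
    ee = excess(areas["E_1"], areas["E_2"])   # within the (E) pair
    return de, ee


# Hypothetical peak areas, for illustration only.
print(selectivity_from_areas({"E_1": 72, "E_2": 8, "Z_1": 15, "Z_2": 5}))
# (60.0, 80.0)
```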
[00178] In some embodiments, the improved myoglobin catalysts comprise deletions of the myoglobin catalyst provided herein. Accordingly, for each of the embodiments of the myoglobin catalysts provided herein, the deletions can comprise 1, 2, 5, 10, 30, or more amino acids, as long as the functional activity and/or improved properties of the myoglobin catalyst are maintained.
[00179] In some embodiments, the myoglobin catalysts are fused to a polypeptide that can serve as an affinity tag in order to facilitate the isolation and purification of the myoglobin polypeptide. Examples of affinity tags include but are not limited to a polyhistidine affinity tag, a FLAG tag, and a glutathione-S-transferase tag.
[00180] In some embodiments, the myoglobin catalysts can comprise one or more non-natural amino acids in their primary sequence. The non-natural amino acid can be present at one or more of the positions defined by "Xn" above for the purpose of modulating the catalytic or selectivity properties of the myoglobin catalyst. Alternatively, the non-natural amino acid can be introduced in another position of the myoglobin catalyst sequence for the purpose, for example, of linking the myoglobin catalyst to another protein, another biomolecule, or a solid support. Several methods are known in the art for introducing an unnatural amino acid into a polypeptide. These include the use of the amber stop codon suppression methods using engineered tRNA/aminoacyl-tRNA synthetase (AARS) pairs such as those derived from Methanocaldococcus sp. and Methanosarcina sp. (Liu and Schultz 2010). Alternatively, natural or engineered frameshift suppressor tRNAs and their cognate aminoacyl-tRNA synthetases can also be used for the same purpose (Rodriguez, Lester et al. 2006; Neumann, Wang et al. 2010). Alternatively, an unnatural amino acid can be incorporated in a polypeptide using chemically (Dedkova, Fahmi et al. 2003) or enzymatically (Bessho, Hodgson et al. 2002) aminoacylated tRNA molecules and using a cell-free protein expression system in the presence of the aminoacylated tRNA molecules (Kourouklis, Murakami et al. 2005; Murakami, Ohta et al. 2006). Examples of non-natural amino acids include but are not limited to para-amino-phenylalanine, para-acetyl-phenylalanine, meta-acetyl-phenylalanine, para-mercaptomethyl-phenylalanine, 3-pyridyl-alanine, 3-methyl-histidine, para-butyl-1,3-dione-phenylalanine, O-allyl-tyrosine, O-propargyl-tyrosine, para-azido-phenylalanine, para-borono-phenylalanine, para-bromo-phenylalanine, para-iodo-phenylalanine, 3-iodo-tyrosine, para-benzoyl-phenylalanine, ε-N-allyloxycarbonyl-lysine, ε-N-propargyloxycarbonyl-lysine, ε-N-azidoethyloxycarbonyl-lysine, and ε-N-(o-azido-benzyl)-oxycarbonyl-lysine.
[00181] In some embodiments, the myoglobin catalyst comprises an amino acid sequence that is at least 90% identical to SEQ ID NO: 1 and comprises at least one of the features selected from the group consisting of: X29 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X32 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X33 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X39 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X43 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X45 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X46 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X64 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X67 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X68 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X93 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X107 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; and X111 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y.
[00182] In some embodiments, the amino acid residue that coordinates the iron atom at the axial position of the heme cofactor in the myoglobin catalyst is a naturally occurring amino acid selected from the group consisting of serine, threonine, cysteine, tyrosine, histidine, aspartic acid, glutamic acid, and selenocysteine. In other embodiments, the amino acid residue that coordinates the iron atom at the axial position of the heme cofactor in the myoglobin catalyst is a non-naturally occurring α-amino acid comprising a —SH, —NH2, —OH, =N-, or —NC group, or an imidazolyl or pyridyl group, within its side chain. In specific embodiments, this non-naturally occurring α-amino acid is para-amino-phenylalanine, meta-amino-phenylalanine, para-mercaptomethyl-phenylalanine, meta-mercaptomethyl-phenylalanine, para-(isocyanomethyl)-phenylalanine, meta-(isocyanomethyl)-phenylalanine, 3-pyridyl-alanine, or 3-methyl-histidine.
[00183] In some embodiments, the heme cofactor in the myoglobin catalyst is substituted for a heme analog, a metalloporphyrin, or a metalloporphyrin analog. In some embodiments, the heme cofactor in the myoglobin catalyst is substituted for a heme analog selected from the group consisting of iron-mesoporphyrin, iron-protoporphyrin, or iron-bisglycolporphyrin. In other embodiments, the heme cofactor in the myoglobin catalyst is substituted for a Mn-, Co-, Ru-, Rh-, or Os-porphyrin. In other embodiments, the heme cofactor in the myoglobin catalyst is substituted for a metalloporphyrin analog selected from the group consisting of corrole, phthalocyanine, phlorin, chlorin, 5-isocorrole, 10-isocorrole, and porphycene derivatives. These cofactor-substituted myoglobin catalysts can be prepared according to methods known in the art, which include, for example, removal of the heme cofactor from the myoglobin polypeptide followed by refolding of the apoprotein in the presence of the heme analog, metalloporphyrin, or porphyrin analog (Yonetani and Asakura 1969; Yonetani, Yamamoto et al. 1974; Hayashi, Dejima et al. 2002; Hayashi, Matsuo et al. 2002; Heinecke, Yi et al. 2012). Alternatively, these cofactor-substituted myoglobin catalysts can be obtained via recombinant expression of the myoglobin polypeptide in bacterial strains that are capable of taking up the heme analog or another metalloporphyrin from the culture medium (Woodward, Martin et al. 2007; Bordeaux, Singh et al. 2014).
[00184] In some embodiments, the amino acid residue that coordinates the metal atom at the axial position of the protein-bound cofactor (e.g., heme, heme analog, metalloporphyrin, or metalloporphyrin analog) in the myoglobin catalyst is a naturally occurring amino acid selected from the group consisting of serine, threonine, cysteine, tyrosine, histidine, aspartic acid, glutamic acid, and selenocysteine. In other embodiments, this amino acid residue is a non-naturally occurring α-amino acid comprising a —SH, —NH2, —OH, =N-, or —NC group, or an imidazolyl or pyridyl group, within its side chain. In specific embodiments, this non-naturally occurring α-amino acid is para-amino-phenylalanine, meta-amino-phenylalanine, para-mercaptomethyl-phenylalanine, meta-mercaptomethyl-phenylalanine, para-(isocyanomethyl)-phenylalanine, meta-(isocyanomethyl)-phenylalanine, 3-pyridyl-alanine, or 3-methyl-histidine.
[00185] In some embodiments, the myoglobin catalysts described herein can be provided in the form of a kit. These kits may contain an individual myoglobin catalyst or a plurality of myoglobin catalysts. The myoglobin catalysts contained in the kit may be in lyophilized form, in solution, or tethered to a solid support. The kits can further include reagents for carrying out the myoglobin-catalyzed reactions, substrates for assessing the activity of the myoglobin catalysts, and reagents for detecting the products. The kits can also include instructions for the use of the kits.
[00186] In some embodiments, the myoglobin catalysts described herein can be covalently or non-covalently linked to a solid support for the purpose, for example, of screening the myoglobin catalysts for activity on a range of different substrates or for facilitating the separation of reactants and products from the myoglobin catalyst after the reactions. Examples of solid supports include but are not limited to organic polymers such as polystyrene, polyacrylamide, polyethylene, polypropylene, polyethylene glycol, and the like, and inorganic materials such as glass, silica, controlled pore glass, and metals. The configuration of the solid support can be in the form of beads, spheres, particles, gel, a membrane, or a surface.
[00187] 5.3 Polynucleotides and host cells for expression of the myoglobin catalysts
[00188] In another aspect, polynucleotide molecules are provided that encode for the myoglobin polypeptides disclosed herein. The polynucleotides may be linked to one or more regulatory sequences controlling the expression of the myoglobin polypeptide-encoding gene to form a recombinant polynucleotide capable of expressing the polypeptide.
[00189] Since the correspondence of all the possible three-base codons to the various amino acids is known, providing the amino acid sequence of the myoglobin polypeptide provides also a description of all the polynucleotide molecules encoding for such polypeptide. Thus, a person skilled in the art will be able, given a certain polypeptide sequence, to generate any number of different polynucleotides encoding for the same polypeptide. In an embodiment, the codons are selected to fit the host cell in which the polypeptide is being expressed. For example, in an embodiment, codons used in bacteria are used to express the polypeptide in a bacterial host.
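The degeneracy argument of paragraph [00189] can be made concrete by counting how many distinct coding sequences encode a given peptide. The sketch below assumes the standard genetic code (61 sense codons) and ignores the stop codon; the function name `count_encodings` and the toy peptide are illustrative assumptions.

```python
# Number of synonymous codons per amino acid in the standard genetic code.
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_encodings(peptide: str) -> int:
    """Count the distinct polynucleotides that encode `peptide`."""
    total = 1
    for residue in peptide:
        total *= CODONS_PER_RESIDUE[residue]  # KeyError on unknown letters
    return total


# Hypothetical five-residue peptide: 4 * 6 * 6 * 2 * 4 encodings.
print(count_encodings("VLSEG"))  # 1152
```

Because the count grows multiplicatively with length, codon selection for a given host, as described above, amounts to picking one sequence out of a very large synonymous set.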
[00190] In some embodiments, the polynucleotide molecule comprises a nucleotide sequence encoding for a myoglobin polypeptide with an amino acid sequence that is at least 60%, 70%, 80%, 85%, 90%, 95%, 99% or more identical to SEQ ID NO: 1.
[00191] In some embodiments, the polynucleotide molecule comprises a nucleotide sequence encoding for a myoglobin polypeptide with an amino acid sequence that is at least 60%, 70%, 80%, 85%, 90%, 95%, 99% or more identical to SEQ ID NO: 112, 113, 114, 115, or 116.
[00192] In some embodiments, the polynucleotide molecule encoding for the myoglobin polypeptide is comprised in a recombinant expression vector. Examples of suitable recombinant expression vectors include but are not limited to chromosomal, nonchromosomal and synthetic DNA sequences, e.g., derivatives of SV40; bacterial plasmids; phage DNA; baculovirus; yeast plasmids; vectors derived from combinations of plasmids and phage DNA; and viral DNA such as vaccinia, adenovirus, fowl pox virus, pseudorabies, adeno-associated viruses, retroviruses and many others. Any vector that transduces genetic material into a cell, and, if replication is desired, which is replicable and viable in the relevant host can be used. A large number of expression vectors and expression hosts are known in the art, and many of these are commercially available. A person skilled in the art will be able to select suitable expression vectors for a particular application, e.g., the type of expression host (e.g., in vitro systems, prokaryotic cells such as bacterial cells, and eukaryotic cells such as yeast, insect, or mammalian cells) and the expression conditions selected.
[00193] In another aspect, an expression host system is provided comprising a polynucleotide molecule encoding for the myoglobin polypeptides disclosed herein. Expression host systems that may be used include any systems that support the transcription, translation, and/or replication of a polynucleotide molecule provided herein. In an embodiment, the expression host system is a cell. Host cells for use in expressing the polypeptides encoded by the expression vector disclosed herein are well known in the art and include but are not limited to bacterial cells (e.g., Escherichia coli, Streptomyces); fungal cells such as yeast cells (e.g., Saccharomyces cerevisiae, Pichia pastoris); insect cells; plant cells; and animal cells. The expression host systems also include lysates of prokaryotic cells (e.g., bacterial cells) and lysates of eukaryotic cells (e.g., yeast, insect, or mammalian cells). These systems also include in vitro transcription/translation systems, many of which are commercially available. The choice of the expression vector and host system depends on the type of application intended for the methods provided herein and a person skilled in the art will be able to select a suitable expression host based on known features and application of the different expression hosts.
[00194] 5.4 Methods of preparing the engineered myoglobin polypeptides
[00195] The engineered myoglobin polypeptides can be prepared via mutagenesis of the polynucleotide encoding for the naturally occurring sperm whale myoglobin (SEQ ID NO: 1) or for an engineered variant thereof. Similarly, the engineered myoglobin polypeptides can be prepared via mutagenesis of the polynucleotide encoding for the naturally occurring myoglobins corresponding to SEQ ID NO: 112, 113, 114, 115, or 116, or an engineered variant thereof.
[00196] Many mutagenesis methods are known in the art and these include, but are not limited to, site-directed mutagenesis, site-saturation mutagenesis, random mutagenesis, cassette-mutagenesis, DNA shuffling, homologous recombination, non-homologous recombination, site-directed recombination, and the like. Detailed description of art-known mutagenesis methods can be found, among other sources, in U.S. Pat. No. 5,605,793; U.S. Pat. No. 5,830,721; U.S. Pat. No. 5,834,252; WO 95/22625; WO 96/33207; WO 97/20078; WO 97/35966; WO 98/27230; WO 98/42832; WO 99/29902; WO 98/41653; WO 98/41622; WO 98/42727; WO 00/18906; WO 00/04190; WO 00/42561; WO 00/42560; WO 01/23401; WO 01/64864.
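Site-saturation mutagenesis, one of the methods listed in paragraph [00196], can be sketched computationally as an enumeration of the variants it targets. The sketch below lists the 19 single substitutions possible at one position and, as an assumption not drawn from this disclosure, the 32 codons of an NNK degenerate codon (N = A, C, G, or T; K = G or T), which is one commonly used randomization scheme. The function names and the example residue label are illustrative.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 canonical amino acids


def saturation_variants(wild_type_residue: str, position: int) -> list:
    """Labels (e.g., 'H64V') for all 19 substitutions at one position."""
    return [
        f"{wild_type_residue}{position}{aa}"
        for aa in AMINO_ACIDS
        if aa != wild_type_residue
    ]


def nnk_codons() -> list:
    """All codons matched by the degenerate codon NNK."""
    return ["".join(c) for c in product("ACGT", "ACGT", "GT")]


# Illustrative example at position 64 with histidine as the parent residue.
variants = saturation_variants("H", 64)
print(len(variants), variants[:3])   # 19 ['H64A', 'H64C', 'H64D']
print(len(nnk_codons()))             # 32
```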
[00197] Numerous methods for making nucleic acids encoding for polypeptides having a predetermined or randomized sequence are known to those skilled in the art. For example, oligonucleotide primers having a predetermined or randomized sequence can be prepared chemically by solid phase synthesis using commercially available equipment and reagents. Polynucleotide molecules can then be synthesized and amplified using a polymerase chain reaction, digested via endonucleases, ligated together, and cloned into a vector according to standard molecular biology protocols known in the art (e.g., Sambrook et al. Molecular Cloning: A Laboratory Manual (Third Edition), Cold Spring Harbor Press, 2001). These methods, in combination with the mutagenesis methods mentioned above, can be used to generate polynucleotide molecules that encode for engineered myoglobin polypeptides as well as suitable vectors for the expression of these polypeptides in a host expression system.
[00198] Engineered myoglobin polypeptides expressed in a host expression system, such as, for example, in a host cell, can be isolated and purified using any one or more of the well-known techniques for protein purification, including, among others, cell lysis via sonication or chemical treatment, filtration, salting-out, and chromatography (e.g., ion-exchange chromatography, gel-filtration chromatography, etc.).
[00199] The recombinant myoglobin polypeptides obtained from mutagenesis of a parental myoglobin sequence (e.g., SEQ ID NO: 1, 112, 113, 114, 115, 116, or an engineered variant thereof) can be screened for identifying engineered myoglobin polypeptides having improved catalytic and/or selectivity properties, such as improvements with respect to their catalytic activity, regioselectivity, diastereoselectivity and/or enantioselectivity for any of the aforementioned carbene transfer reactions. The improvement resulting from the introduced amino acid mutation(s) in any one or more of these catalytic and selectivity properties can then be measured according to methods known in the art, as described above. Amino acid substitutions that are found to be beneficial for a given property (e.g., catalytic activity, regioselectivity, diastereoselectivity, enantioselectivity) can be combined in order to obtain myoglobin catalysts with further improved catalytic and/or selectivity properties.
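The combination of beneficial substitutions described in paragraph [00199] can be sketched as an enumeration of multi-site variants. The function name `combine_substitutions` and the substitution labels are hypothetical placeholders, not screening results from this disclosure; combinations that place two substitutions at the same position are skipped because a single polypeptide cannot carry both.

```python
from itertools import combinations


def combine_substitutions(substitutions, size):
    """Return all `size`-membered combinations at distinct positions.

    Each substitution is a label such as 'H64V' (parent residue,
    position, new residue).
    """
    combos = []
    for combo in combinations(substitutions, size):
        positions = [int(label[1:-1]) for label in combo]
        if len(set(positions)) == len(positions):  # one change per position
            combos.append(combo)
    return combos


# Hypothetical single-site hits from a screen.
hits = ["L29A", "H64V", "V68A", "V68F"]
doubles = combine_substitutions(hits, 2)
print(len(doubles))  # 5
print(doubles[0])    # ('L29A', 'H64V')
```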
[00200] 5.5 Reactions catalyzed by myoglobin catalysts
[00201] In one embodiment, a method is provided for catalyzing an alkene cyclopropanation reaction to produce a product having two new C—C bonds, the method comprising:
(a) providing a diazo-containing carbene precursor of formula (I)
Figure imgf000045_0001
(I)
wherein R1a and R2a are independently selected from H, halo, cyano (—CN), nitro (—NO2), trifluoromethyl (—CF3), optionally substituted C1-18 alkyl, optionally substituted C6-10 aryl, optionally substituted 5- to 10-membered heteroaryl, —C(O)OR1b, —C(O)N(R1b)(R1c), —C(O)R1b, —Si(R1b)(R1c)(R1d), and where each R1b, R1c, and R1d are independently selected from H, optionally substituted C1-18 alkyl, optionally substituted C6-10 aryl, and optionally substituted 6- to 10-membered heteroaryl.
(b) providing an alkene-containing substrate of formula (II):
Figure imgf000045_0002
(II)
wherein R2 is independently selected from optionally substituted C6-15 aryl, optionally substituted 5- to 15-membered heteroaryl, and optionally substituted C1-18 aliphatic; R3 is independently selected from H, optionally substituted C1-18 aliphatic, optionally substituted C6-10 aryl, optionally substituted 5- to 10-membered heteroaryl, —C(O)OR1b, —C(O)N(R1b)(R1c), and —C(O)R1b, where each R1b and R1c are independently selected from H, optionally substituted C1-18 aliphatic, optionally substituted C6-10 aryl, and optionally substituted 5- to 10-membered heteroaryl; R4 and R5 are independently selected from H, halo, cyano, optionally substituted C1-18 aliphatic, optionally substituted C6-10 aryl, and optionally substituted 5- to 10-membered heteroaryl.
(c) providing an engineered myoglobin variant as the catalyst;
(d) contacting the alkene-containing substrate and the diazo-containing reagent with the engineered myoglobin variant under appropriate reaction conditions; and
(e) allowing the reaction to proceed for a time sufficient to form a cyclopropanation product of formula (III):
Figure imgf000046_0001
(III)
where R1a, R2a, R2, R3, R4 and R5 are as defined above.
[00202] In another embodiment, a method is provided for catalyzing a carbene N—H insertion reaction to produce a product having a new C—N bond, the method comprising:
(a) providing a diazo-containing carbene precursor of formula (I)
Figure imgf000046_0002
(I)
wherein R1a and R2a are as defined above.
(b) providing an N—H containing substrate of formula (IV):
Figure imgf000046_0003
(IV)
wherein R6 is independently selected from optionally substituted C1-18 aliphatic, optionally substituted C6-16 aryl, optionally substituted 5- to 10-membered heteroaryl, optionally substituted C4-C16 cyclic aliphatic, and optionally substituted C4-C16 heterocyclic group; R7 is independently selected from H, optionally substituted C1-18 aliphatic, optionally substituted C6-16 aryl, optionally substituted 5- to 10-membered heteroaryl; or where R6 and R7 are connected to form an optionally substituted C4-C16 cyclic aliphatic or heterocyclic group.
(c) providing an engineered myoglobin variant as the catalyst;
(d) contacting the N—H containing substrate and the diazo-containing reagent with the engineered myoglobin variant under appropriate reaction conditions; and
(e) allowing the reaction to proceed for a time sufficient to form a product of formula (V):
Figure imgf000047_0001
(V)
where R1a, R2a, R6, and R7 are as defined above.
[00203] In another embodiment, a method is provided for catalyzing a carbene S—H insertion reaction to produce a product having a new C—S bond, the method comprising:
(a) providing a diazo-containing carbene precursor of formula (I)
Figure imgf000047_0002
(I)
wherein R1a and R2a are as defined above.
(b) providing an S—H containing substrate of formula (VI):
R8—S—H
(VI)
wherein R8 is selected from optionally substituted C1-18 aliphatic, optionally substituted C6-16 aryl, optionally substituted 5- to 10-membered heteroaryl, optionally substituted C4-C16 cyclic aliphatic, and optionally substituted C4-C16 heterocyclic group.
(c) providing an engineered myoglobin variant as the catalyst;
(d) contacting the S—H containing substrate and the diazo-containing reagent with the engineered myoglobin variant under appropriate reaction conditions; and
(e) allowing the reaction to proceed for a time sufficient to form a product of formula (VII):
Figure imgf000048_0001
(VII)
where R1a, R2a, and R8 are as defined above.
[00204] In another embodiment, a method is provided for catalyzing a carbene Si—H insertion reaction to produce a product having a new C—Si bond, the method comprising:
(a) providing a diazo-containing carbene precursor of formula (I)
Figure imgf000048_0002
(I)
wherein R1a and R2a are as defined above,
(b) providing an Si-H containing substrate of formula (VIII):
Figure imgf000048_0003
(VIII)
wherein R9 is independently selected from optionally substituted C1-18 aliphatic, optionally substituted C6-16 aryl, optionally substituted 5- to 10-membered heteroaryl, optionally substituted C4-C16 cyclic aliphatic, and optionally substituted C4-C16 heterocyclic group; R10 and R11 are optionally substituted C1-6 aliphatic groups,
(c) providing an engineered myoglobin variant as the catalyst;
(d) contacting the Si-H containing substrate and the diazo-containing reagent with the engineered myoglobin variant under appropriate reaction conditions; and
(e) allowing the reaction to proceed for a time sufficient to form a product of formula (IX):
Figure imgf000049_0001
(IX)
where R1a, R2a, R9, R10, and R11 are as defined above.
[00205] In another embodiment, a method is provided for catalyzing a sulfur ylide [2,3] sigmatropic rearrangement to produce a product having a new C— S bond, the method comprising:
(a) providing a diazo-containing carbene precursor of formula (I)
Figure imgf000049_0002
(I)
wherein R1a and R2a are as defined above,
(b) providing a thioether substrate of formula (X) or (XI):
Figure imgf000049_0003
(X) (XI)
wherein R12 is selected from optionally substituted C1-18 aliphatic, optionally substituted C6-16 aryl, optionally substituted 5- to 10-membered heteroaryl, and optionally substituted C4-C16 heterocyclic group; R13, R14, and R15 are independently selected from H, optionally substituted C1-6 aliphatic groups, and optionally substituted C6-16 aryl, or where R13 and R14 are connected to form an optionally substituted C4-C16 cyclic aliphatic or heterocyclic group.
(c) providing an engineered myoglobin variant as the catalyst;
(d) contacting the thioether substrate and the diazo-containing reagent with the engineered myoglobin variant under appropriate reaction conditions; and
(e) allowing the reaction to proceed for a time sufficient to form a product of formula (XII) from the substrate of formula (X) or a product of formula (XIII) from the substrate of formula (XI):
Figure imgf000050_0001
(XII) (XIII)
where R1a, R2a, R12, R13, R14, and R15 are as defined above.
[00206] In another embodiment, a method is provided for catalyzing a nitrogen ylide [2,3] sigmatropic rearrangement to produce a product having a new C— N bond, the method comprising:
(a) providing a diazo-containing carbene precursor of formula (I)
Figure imgf000050_0002
(I)
wherein R1a and R2a are as defined above,
(b) providing a tertiary amine substrate of formula (XIV) or (XV):
Figure imgf000050_0003
(XIV) (XV)
wherein R16 is independently selected from optionally substituted C1-18 aliphatic, optionally substituted C6-16 aryl, optionally substituted 5- to 10-membered heteroaryl, and optionally substituted C4-C16 heterocyclic group; R17 is independently selected from optionally substituted C1-6 aliphatic, optionally substituted C6 aryl, and optionally substituted 5- to 6-membered heteroaryl; or where R16 and R17 are connected together to form an optionally substituted C4-C16 cyclic aliphatic or heterocyclic group; R18, R19, and R20 are independently selected from H, optionally substituted C1-6 aliphatic groups, and optionally substituted C6-16 aryl, or where R18 and R19 are connected together to form an optionally substituted C4-C16 cyclic aliphatic or heterocyclic group.
(c) providing an engineered myoglobin variant as the catalyst;
(d) contacting the amine substrate and the diazo-containing reagent with the engineered myoglobin variant under appropriate reaction conditions; and
(e) allowing the reaction to proceed for a time sufficient to form a product of formula (XVI) from the substrate of formula (XIV) or a product of formula (XVII) from the substrate of formula (XV):
Figure imgf000051_0001
(XVI) (XVII)
where R1a, R2a, R16, R17, R18, R19, and R20 are as defined above.
[00207] In another embodiment, a method is provided for catalyzing an aldehyde olefination reaction to produce a product having a new C=C double bond, the method comprising:
(a) providing a diazo-containing carbene precursor of formula (I)
Figure imgf000051_0002
(I)
wherein R1a and R2a are as defined above,
(b) providing an aldehyde substrate of formula R21-C(O)-H, wherein R21 is selected from optionally substituted C1-18 aliphatic, optionally substituted C6-16 aryl, optionally substituted 5- to 10-membered heteroaryl, and optionally substituted C4-C16 heterocyclic group;
(c) providing a nucleophilic reagent selected from the group consisting of triarylphosphine, triarylarsine, and triarylstibine;
(d) providing an engineered myoglobin variant as the catalyst;
(e) contacting the diazo-containing carbene precursor, the aldehyde substrate, and the nucleophilic reagent with the myoglobin-based catalyst, optionally in the presence of a reducing agent; and
(f) allowing the reaction to proceed for a time sufficient to form an olefination product of formula (R1a)(R2a)C=CH(R21), where R1a, R2a, and R21 are as defined above.
[00208] In certain embodiments, the diazo-containing carbene precursor in the methods described above is selected from the group consisting of ethyl 2-diazo-acetate, tert-butyl 2-diazo-acetate, ethyl 2-diazo-2-phenylacetate, ethyl 2-diazo-propanoate, tert-butyl 2-diazo-propanoate, ethyl 2-diazo-3,3,3-trifluoropropanoate, ethyl 2-cyano-2-diazoacetate, ethyl 2-diazo-2-nitroacetate, diazomethane, diazo(nitro)methane, 2-diazoacetonitrile, (diazomethyl)trimethylsilane, diethyl 2-diazomalonate, ethyl 2-diazo-3-oxobutanoate, and 2-diazo-1,1,1-trifluoroethane.
[00209] In some embodiments, the myoglobin catalyst used in the methods described above comprises a polypeptide with an amino acid sequence that is at least 60%, 70%, 80%, 85%, 90%, 95%, 99% or more identical to a sequence selected from the group consisting of SEQ ID NO: 1 - 110.
[00210] The methods provided herein include forming reaction mixtures that contain the myoglobin catalyst, the carbene donor reagent (e.g., an alkyl α-diazoester), the carbene acceptor substrate (e.g., alkene-containing substrate for the cyclopropanation reaction), and other additives (e.g., a reducing agent).
[00211] In carrying out the reactions described herein, the myoglobin polypeptides may be added to the reaction mixture in the form of purified proteins, whole cells containing the myoglobin polypeptide, and/or cell extracts and/or lysates of such cells.
[00212] Reactions are conducted under conditions sufficient to catalyze the formation of the desired products. The reaction time and concentration of the myoglobin polypeptide in the reaction mixture can vary widely, in large part depending on the catalytic rate and efficiency of the myoglobin catalyst. Typically, reaction times range from 10 min to 24 hours. For example, the reaction time can be 30 min or 12 hours. The amount of the myoglobin catalyst in the reaction mixture is also variable. Typically, the reaction mixtures contain between 0.001 mol% and 20 mol% myoglobin catalyst with respect to the diazo-containing reagent and/or the carbene acceptor substrate. In an embodiment, the reaction mixtures contain between 0.01 mol% and 2 mol% myoglobin catalyst with respect to the diazo-containing reagent and/or the carbene acceptor substrate. The concentration of the diazo-containing reagent and carbene acceptor substrate in the reaction mixtures can also vary. In an embodiment, the concentration of these compounds in the reaction mixture is between 100 μM and 2 M. In another embodiment, the concentration of these compounds in the reaction mixture is between 1 mM and 500 mM.
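By way of a non-limiting illustration, the relationship between catalyst loading (in mol%) and the molar concentrations recited in paragraph [00212] can be sketched in Python as follows. The function names and the example concentrations are hypothetical; the values are chosen only to fall within the ranges recited above and do not describe any particular reaction of this disclosure.

```python
def loading_mol_percent(catalyst_molar: float, reagent_molar: float) -> float:
    """Return catalyst loading in mol% relative to a reference reagent.

    Both concentrations are in mol/L. The reference reagent is whichever of
    the diazo-containing reagent or the carbene acceptor substrate the
    loading is quoted against (an assumption the user must state).
    """
    if reagent_molar <= 0:
        raise ValueError("reagent concentration must be positive")
    return 100.0 * catalyst_molar / reagent_molar


def catalyst_molar_for_loading(loading_percent: float, reagent_molar: float) -> float:
    """Return the catalyst concentration (mol/L) needed for a given mol% loading."""
    if loading_percent < 0 or reagent_molar < 0:
        raise ValueError("inputs must be non-negative")
    return loading_percent / 100.0 * reagent_molar


# Hypothetical example: 10 mM reference reagent with 1 uM myoglobin catalyst.
example_loading = loading_mol_percent(1e-6, 10e-3)
# Hypothetical example: catalyst concentration for 0.01 mol% at 10 mM reagent.
example_catalyst = catalyst_molar_for_loading(0.01, 10e-3)
```

In this sketch the two hypothetical examples are inverses of each other, which provides a simple consistency check when planning a reaction mixture.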
[00213] Typically, the myoglobin-catalyzed reactions are carried out in a buffered aqueous solution. Non-limiting examples of buffering agents that can be used include sodium phosphate, sodium acetate, 2-amino-2-hydroxymethyl-propane-1,3-diol (TRIS), 3-morpholinopropane-1-sulfonic acid (MOPS), 2-[4-(2-hydroxyethyl)piperazin-1-yl]ethanesulfonic acid (HEPES), and 2-(N-morpholino)ethanesulfonic acid (MES). In addition, other additives can be present in these solutions, which include salts (e.g., NaCl, KCl, CaCl2), detergents (e.g., sodium dodecylsulfate and Triton X-100), chelators (e.g., 2-({2-[bis(carboxymethyl)amino]ethyl}(carboxymethyl)amino)acetic acid (EDTA), ethylene glycol-bis(2-aminoethylether)-N,N,N',N'-tetraacetic acid (EGTA)), and organic cosolvents such as, for example, methanol, ethanol, dimethylsulfoxide (DMSO), acetonitrile, dimethylformamide (DMF), and tetrahydrofuran (THF). Buffers, cosolvents, salts, detergents, and chelators can be used at any suitable concentration, which can be readily determined by a person skilled in the art. Cosolvents, in particular, can be included in the reaction mixtures in amounts ranging from about 1% v/v to about 70% v/v, or higher. Experimentally, it was determined that the myoglobin catalysts provided herein maintain carbene transfer reactivity in the context of the reactions described herein in the presence of a concentration of DMF, THF, acetonitrile, methanol, or ethanol in buffer as high as 50% v/v, or higher. Notably, most proteins and heme-containing enzymes (e.g., P450) undergo denaturation under these conditions.
[00214] The reactions can be conducted at any suitable temperature which is compatible with the catalytic function of the myoglobin polypeptides within the scope of the disclosed compositions and methods. Typically, the reactions are conducted at a temperature ranging from about 2°C to about 70°C. The reactions can be conducted, for example, at about 25 °C or about 50°C. The reactions can be conducted at any suitable pH which is compatible with the catalytic function of the myoglobin polypeptides within the scope of the disclosed compositions and methods. In general, the reactions are conducted at a pH ranging from about 6 to about 10. The reactions can be conducted, for example, at a pH of 6, 7, 8, or 9.
[00215] Experimentally, it was determined that the reduced form of the myoglobin catalyst (e.g., ferrous form vs. ferric form for heme-containing myoglobin catalysts) is generally more active catalytically than the oxidized form. Accordingly, in an embodiment, the reactions, in particular in vitro reactions, are conducted in the presence of a reducing agent. In another embodiment, the reducing agent is sodium dithionite (Na2S2O4). Alternatively, other reducing agents can be used which include, but are not limited to, ascorbic acid, enzymatic redox systems comprising a myoglobin reductase enzyme and the cognate reduced nicotinamide adenine dinucleotide cofactor (NADPH or NADH), and non-enzymatic redox systems comprising a reduced nicotinamide adenine dinucleotide cofactor (NADPH or NADH) or an NADH mimic (Paul, Arends et al. 2014) (e.g., 1-benzyl-1,4-dihydronicotinamide, 1-methyl-1,4-dihydronicotinamide, 1-(1-benzyl-1,4-dihydropyridin-3-yl)ethanone, 1-benzyl-1,4-dihydropyridine-3-carbonitrile) and an electron transfer mediator (e.g., flavin mononucleotide (FMN), riboflavin (vitamin B2), flavin adenine dinucleotide (FAD), methylene blue). The concentration of the ultimate reducing agent in the reaction mixtures can vary, ranging from substoichiometric amounts (e.g., 0.2, 0.5, 0.8 equiv.) to stoichiometric (1 equiv.) and overstoichiometric amounts (e.g., 2, 5, 10, 100 equiv.) with respect to the myoglobin catalyst. Alternatively, the myoglobin catalyst can be reduced or maintained in the reduced ferrous form electrochemically by means of an electrode. Since reduction of the heme (or alternative metalloporphyrin cofactor) in myoglobin is associated with a shift in the Soret band (400-450 nm range) of the protein, the identification of suitable reducing agents and conditions is straightforward for a person skilled in the art.
[00216] Since binding of molecular oxygen to myoglobin may interfere with the carbene transfer reactivity of this hemoprotein, in an embodiment, the reactions are conducted under anaerobic conditions. Anaerobic conditions can be achieved by conducting the reactions under an inert atmosphere, such as a nitrogen atmosphere or argon atmosphere, and using solvents from which molecular oxygen has been removed via degassing.
[00217] Typically, the myoglobin-catalyzed reactions are allowed to proceed until a substantial amount of the substrate is transformed into the product. Product formation (or substrate consumption) can be monitored using standard analytical methods such as, for example, thin-layer chromatography, GC, HPLC, or LC-MS. Experimental parameters such as amount of myoglobin catalyst added to the reaction mixture, temperature, pH, solvent composition, reductant concentration, etc. can be readily optimized by routine experimentation, and a person skilled in the art will be able to identify the most suitable reaction conditions according to the substrate and the myoglobin catalyst utilized in the process.
[00218] Purification of the products of formula (III), (V), (VII), (IX), (XII), (XIII), (XVI), (XVII), and
Figure imgf000055_0001
can be achieved by a variety of techniques known in the art, such as by normal phase liquid chromatography through silica gel; reverse-phase liquid chromatography through bonded silica gel such as octadecylsilica, octylsilica and the like; and recrystallization using pure organic solvents or solvent mixtures.
[00219] The methods provided herein can be assessed in terms of diastereoselectivity and/or enantioselectivity, that is the extent to which the reaction produces a particular isomer, be it a diastereomer or enantiomer. A perfectly selective reaction produces a single isomer, such that the isomer constitutes 100% of the product. As another non-limiting example, a reaction producing a particular enantiomer constituting 95% of the total product can be said to be 95% enantioselective. A reaction producing a particular diastereomer constituting 40% of the total product, meanwhile, can be said to be 40% diastereoselective.
[00220] In general, the methods provided herein include reactions that are from about 1% to about 99.9% diastereoselective. For example, the reaction can be about 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, or about 95% diastereoselective. The methods provided herein also include reactions that are from about 1% to about 99.9% enantioselective. For example, the reaction can be about 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, or about 95% enantioselective. Accordingly, some embodiments disclosed herein provide methods wherein the reaction is at least 20% to at least 99% diastereoselective. In some embodiments, the reaction is at least 20% to at least 99% enantioselective.
[00221] In certain instances, two stereoisomeric products containing a new chiral carbon atom with an (R) or (S) absolute configuration are formed from the reactions described herein. In certain instances, three or four stereoisomeric products containing two chiral carbon atoms, each having an (R) or (S) absolute configuration, are formed. In the case of the cyclopropanation reaction, two "trans" (or "E") isomers and two "cis" (or "Z") isomers can be formed. The two cis isomers are enantiomers with respect to one another, in that the structures are non-superimposable mirror images of each other. Similarly, the two trans isomers are enantiomers. A person skilled in the art will appreciate that the stereochemical configuration of certain of the products herein will depend on factors including the structures of the particular carbene acceptor substrate and diazo-containing reagent used in the reaction, as well as the nature and identity of the myoglobin catalyst. Accordingly, the distribution of the stereoisomeric products formed in the reactions described herein will also depend on such factors.
[00222] The distribution of a product mixture can be described in terms of the enantiomeric excess, or "% e.e.". The enantiomeric excess corresponds to the difference in the mole fractions of two enantiomers in a mixture and can be calculated using the formula: % e.e. (S) = [(% (S) - % (R)) / (% (S) + % (R))] x 100%. The diastereomeric excess (% d.e.) can be calculated in the same manner. In the case of the cyclopropanation products, the distribution of the (E) and (Z) isomers can be described in terms of the E : Z (or trans : cis) ratio.
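As a non-limiting illustration of the formula in paragraph [00222], the following Python sketch computes a percent excess from the amounts of two isomers. The function name and the input values are hypothetical. The same calculation is assumed here to apply to % d.e. when the two inputs are the amounts of the E and Z diastereomers, consistent with the statement that % d.e. is calculated in the same manner.

```python
def percent_excess(first: float, second: float) -> float:
    """Return the percent excess of `first` over `second`.

    For % e.e. (S), pass the amounts of the (S) and (R) enantiomers.
    For % d.e. (E), pass the amounts of the E and Z diastereomers.
    Inputs may be percentages, peak areas, or moles, as long as both
    use the same unit. A negative result means `second` predominates.
    """
    total = first + second
    if total <= 0:
        raise ValueError("the combined amount of the two isomers must be positive")
    return 100.0 * (first - second) / total


# Hypothetical mixture in which the (S) enantiomer is 95% of the product.
ee_s = percent_excess(95.0, 5.0)
# The same mixture described from the point of view of the (R) enantiomer.
ee_r = percent_excess(5.0, 95.0)
# A racemic mixture has no excess of either enantiomer.
ee_racemic = percent_excess(50.0, 50.0)
```

Note that a product mixture in which one enantiomer constitutes 95% of the total corresponds to 90% e.e. under this formula, so the percent selectivity described in paragraph [00219] and the percent excess are related but distinct measures.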
[00223] In general, the methods provided herein include reactions that lead to product mixtures exhibiting % e.e. values which range from about 1% to about 99.9%, or from about -1% to about -99.9%. The methods provided herein also include reactions that lead to product mixtures exhibiting % d.e. values which range from about 1% to about 99.9%, or from about -1% to about -99.9%. In the case of the cyclopropanation reactions, the methods provided herein include reactions that lead to product mixtures exhibiting Z : E ratios ranging from about 1 : 99.9 to about 99.9 : 1. Accordingly, some embodiments provide methods that lead to a product mixture exhibiting at least 20% to at least 99% e.e. In some embodiments, the product mixture exhibits at least 20% to at least 99% d.e. Some embodiments also provide methods that lead to a mixture of cyclopropanation products with a Z : E ratio of at least 90 : 10 to at least 99 : 1. In other embodiments, the mixture of cyclopropanation products exhibits a Z : E ratio of at least 10 : 90 to at least 1 : 99.
[00224] The reactions can be conducted with intact cells expressing a myoglobin polypeptide provided herein (also referred to as "whole-cell reactions"). These whole-cell reactions can be carried out with any of the host cells used for expression of the myoglobin polypeptide, as described above. In some embodiments, the host cells are bacterial cells such as, for example, Escherichia coli cells. In other embodiments, the host cells are yeast cells such as, for example, Saccharomyces cerevisiae or Pichia pastoris cells.
[00225] For the whole-cell reactions, a suspension of cells expressing a myoglobin polypeptide provided herein can be formed in a suitable medium (e.g., phosphate buffer, M9 medium, Luria-Bertani medium, Terrific Broth medium) supplemented with nutrients such as, for example, mineral micronutrients (e.g., CoCl2, CuSO4, MnCl2), vitamins (e.g., thiamine, riboflavin, pantothenate), cofactor precursors (e.g., delta-aminolevulinic acid, p-aminobenzoic acid), sugars (e.g., glucose), and other energy sources (e.g., glycerol). In certain instances, the medium is also supplemented with a heme analog or a metalloporphyrin other than heme (e.g., Co- or Mn-protoporphyrin IX) with the purpose of incorporating such heme analog or metalloporphyrin in the myoglobin catalyst.
[00226] Whole-cell reactions using cells expressing a myoglobin polypeptide provided herein can be carried out under aerobic conditions or anaerobic conditions. In addition, whole-cell reactions using cells expressing a myoglobin polypeptide provided herein can be carried out without the addition of an exogenous reductant to the cell suspension. Indeed, it has been determined that the reducing intracellular environment of a typical host cell (e.g., E. coli) is sufficient to maintain the myoglobin catalyst in the catalytically more active form (e.g., ferrous form for heme-containing myoglobin and engineered variants thereof). In addition, it has been determined that the intracellular concentration of oxygen in a typical host cell (e.g., E. coli) is sufficiently low to enable the myoglobin polypeptide provided herein to operate as a carbene transfer catalyst.
[00227] The yield and rate of the whole-cell reactions can be controlled, at least in part, by varying the cell density of the cell suspension used in these reactions. The cell density can be determined by measuring the absorbance at 600 nm and can be expressed as optical density at 600 nm (OD600). Alternatively, the cell density can be expressed in gram cell dry weight per liter (g cdw L⁻¹). In general, the whole-cell reactions can be conducted using cell suspensions that have an optical density (OD600) ranging from about 0.1 to about 100 or that have a cell density ranging from about 0.02 g cdw L⁻¹ to about 20 g cdw L⁻¹. Other cell densities can be useful, depending on the nature of the host cell, myoglobin catalyst, carbene acceptor substrate and diazo-containing reagent.
[00228] The concentration of the myoglobin catalyst in the cell suspensions used for the whole-cell reactions can be adjusted by varying the protein expression conditions (e.g., type of growth medium, temperature, concentration of the inducer of expression (e.g., IPTG, arabinose), and expression time) according to procedures well known in the art. The number of catalytic turnovers supported by the myoglobin catalyst in whole-cell systems can be expressed in the form of amount of product (e.g., in mmol) per gram cell dry weight. In general, whole-cell reactions involving the myoglobin catalysts provided herein exhibit turnovers ranging from about 0.1 mmol (g cdw)⁻¹ to about 20 mmol (g cdw)⁻¹.
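The whole-cell quantities discussed in paragraphs [00227] and [00228] can be related to one another as in the following non-limiting Python sketch. The conversion factor of 0.2 g cdw L⁻¹ per OD600 unit is an assumption inferred from the paired range endpoints recited above (OD600 of about 0.1 to about 100 alongside about 0.02 to about 20 g cdw L⁻¹); in practice this factor depends on the host cell and the spectrophotometer and would be determined experimentally. The function names and example values are hypothetical.

```python
# Assumed conversion factor (g cdw per liter per OD600 unit), inferred from the
# paired range endpoints in the text; it is strain- and instrument-dependent.
ASSUMED_G_CDW_PER_L_PER_OD600 = 0.2


def od600_to_cell_density(od600: float,
                          factor: float = ASSUMED_G_CDW_PER_L_PER_OD600) -> float:
    """Convert an OD600 reading to cell density in g cdw per liter."""
    if od600 < 0 or factor <= 0:
        raise ValueError("OD600 must be non-negative and factor positive")
    return od600 * factor


def product_per_cell_dry_weight(product_mm: float, cell_density_g_per_l: float) -> float:
    """Return whole-cell turnover in mmol product per g cdw.

    product_mm is the product concentration in mmol/L; dividing by the cell
    density in g cdw/L gives mmol product per gram cell dry weight.
    """
    if cell_density_g_per_l <= 0:
        raise ValueError("cell density must be positive")
    return product_mm / cell_density_g_per_l


# Hypothetical example: a suspension at OD600 = 10 yielding 10 mM product.
density = od600_to_cell_density(10.0)
turnover = product_per_cell_dry_weight(10.0, density)
```

With these hypothetical inputs the suspension corresponds to about 2 g cdw L⁻¹ and about 5 mmol product per g cdw, which lies within the turnover range recited in paragraph [00228].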
[00229] The compounds provided herein may contain one or more chiral centers. Accordingly, the compounds are intended to include racemic mixtures, diastereomers, enantiomers, and mixtures enriched in one or more stereoisomers. When a group of substituents is disclosed herein, all the individual members of that group and all subgroups, including any isomers, enantiomers, and diastereomers, are intended to be included in this disclosure. Additionally, all isotopic forms of the compounds disclosed herein are intended to be included in this disclosure. For example, it is understood that any one or more hydrogens in a molecule disclosed herein can be replaced with deuterium or tritium.
[00230] A person skilled in the art will also appreciate that starting materials, biological materials, reagents, synthetic methods, purification methods, analytical methods, assay methods, and biological methods other than those specifically exemplified can be employed in the practice of the compositions and methods provided herein. All art-known functional equivalents of any such materials and methods are intended to be included in the compositions and methods provided herein.
[00231] Unless defined otherwise herein, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure pertains.
[00232] The following examples are offered by way of illustration and not by way of limitation.
6. EXAMPLES
[00233] 6.1 Example 1: Cyclopropanation reactions catalyzed by myoglobin-based catalysts.
[00234] In initial studies, the ability of sperm whale myoglobin (Mb; SEQ ID NO:1) to catalyze the cyclopropanation of styrene (1a) in the presence of ethyl diazoacetate (EDA; 2) as carbene source was tested. Under reducing and anaerobic conditions, Mb was found to effectively promote this reaction, supporting about 180 turnovers and leading to (E)-ethyl 2-phenylcyclopropanecarboxylate (3a-b) as the major product (86% d.e.) (FIG. 4). Notably, this cyclopropanation activity compares well with that reported for the P450BM3-based variants in vitro (200-360 total turnovers) (Coelho, Brustad et al. 2013) under similar reaction conditions (0.02 mol% protein, 3:1 styrene:EDA ratio). In the absence of reductant (dithionite) or in the presence of air, no cyclopropanation product was observed, indicating that the ferrous form of Mb is the catalytically active species and that O2 is deleterious to this reactivity, likely due to competition with the diazo reagent for binding to the heme. Despite this promising activity, wild-type Mb showed no asymmetric induction in the cyclopropanation reaction, leading to a racemic mixture for both the Z and E product, as observed for free hemin (FIG. 4).
[00235] Based on these initial results, it was hypothesized that the Mb-catalyzed cyclopropanation reaction involves formation of a heme-bound carbene intermediate upon reaction of EDA with the protein in its reduced, ferrous state. 'End-on' (Wolf, Hamaker et al. 1995; Che, Huang et al. 2001; Li, Huang et al. 2002; Nowlan, Gregg et al. 2003) attack of the styrene molecule on this heme-carbenoid species would then lead to the cyclopropanation product (FIG. 5). While the E-selectivity of the Mb-catalyzed reaction clearly indicates that a 'trans' heme-carbene / styrene arrangement (i.e., carbene ester group and styrene phenyl group on opposite sides) may be used, the lack of enantioselectivity suggested that the native Mb scaffold dictates no facial selectivity for styrene approach to the heme-carbene intermediate. With this in mind, it was reasoned that mutation of the amino acid residues lying at the periphery of the porphyrin cofactor could provide a means to improve the diastereo- and enantioselectivity of this catalyst, possibly by imposing only one modality of attack of the styrene molecule on the heme-carbene group.
[00236] Upon inspection of the sperm whale Mb crystal structure (Vojtechovsky, Chu et al. 1999), residues Phe43, His64, and Val68 were selected as targets for mutagenesis due to their close proximity to the distal face of the heme (FIG. 1). Specifically, a Mb variant where the distal histidine (His64) is mutated to Val, Mb(H64V), was considered as this residue 'blocks' access to the distal face of the heme cofactor from the solvent. Positions 43 or 68 were substituted with amino acids carrying a larger (i.e., Mb(F43W) and Mb(V68F)) or smaller apolar side chain (i.e., Mb(F43V) and Mb(V68A)), in order to affect the catalyst selectivity in the cyclopropanation reaction by varying the steric bulk on either side of the heme (FIGS. 1 and 5). Analysis of these Mb variants revealed an important effect of the active site mutations on the activity and/or selectivity of the hemoprotein toward styrene cyclopropanation with EDA (FIG. 4). In particular, the H64V mutation resulted in a two-fold increase in the turnover numbers, the highest among this set of single mutants, while having marginal effect on diastereo- and enantioselectivity. Conversely, all the mutations at the level of Phe43 and Val68 dramatically improved the enantioselectivity of the Mb variant as compared to wild-type Mb, resulting in formation of the (1S,2S) stereoisomer (3a) with e.e. (E) values ranging from 44% to 99.9%. The V68 substitutions also resulted in an appreciable increase in both catalytic activity (TON) and E-diastereoselectivity of the catalyst (FIG. 4). Thus, the H64V mutation was found to be particularly effective in enhancing Mb-dependent cyclopropanation activity, whereas the mutations at the level of V68 and F43 were beneficial toward tuning its diastereo- and enantioselectivity. To combine the beneficial effects of these mutations, a series of Mb double mutants were prepared and tested (FIG. 4). Variant Mb(H64V,V68A) was found to exhibit high activity as well as excellent E-diastereoselectivity (>99.9% d.e.) and (1S,2S)-enantioselectivity (>99.9% e.e.) (FIG. 10B vs. FIG. 10A), and it was thus selected for further investigations.
[00237] Mb(H64V,V68A)-catalyzed cyclopropanation was determined to follow Michaelis-Menten kinetics, with estimated KM values of ~2 mM and ~5 mM for styrene and EDA, respectively (FIGS. 6A-B). In order to optimize this transformation, the impact of the olefin : diazoester ratio on the efficiency of the reaction was examined (FIG. 7). These experiments revealed an increase in TON as the EDA to styrene ratio was raised from 1:5 to 6:1, with significant amounts (37%) of dimerization byproducts (diethyl maleate and fumarate) accumulating only in the presence of a large (6-fold) excess of the diazo compound (FIG. 7). Overall, a two-fold excess of EDA over styrene was established to be optimal for maximizing cyclopropanation turnovers while keeping dimerization to a minimum (<1%). Notably, using this reagent ratio and a catalyst loading of 0.01 mol%, quantitative conversion of the olefin could be achieved in the presence of up to 0.2 M styrene (20 g L⁻¹) within as little as one hour (FIG. 8). Despite the fact that at this reagent concentration the reaction is biphasic, excellent levels of diastereo- and enantioselectivity (99.9% e.e., 99.9% d.e.) were maintained, indicating that the Mb variant is stable under these conditions (release of hemin from the protein would lead to racemization). Quantitative conversion of the olefin in these reactions also suggested that the TON supported by the Mb catalyst are in excess of 10,000. To examine this aspect, reactions at high substrate loading (0.2 M styrene, 0.4 M EDA) were repeated using decreasing amounts of the hemoprotein (20 to 1 μM, FIG. 8). At a catalyst loading of 0.001 mol%, Mb(H64V,V68A) was found to support about 30,000 turnovers after 1 hour and over 46,000 total turnovers (TTN) after overnight incubation with styrene and EDA (FIG. 4). Time-course experiments also revealed that Mb(H64V,V68A)-catalyzed cyclopropanation proceeds very rapidly, with an initial rate of 1,000 turnovers min⁻¹ over the first 10 minutes and an average rate of 500 turnovers min⁻¹ over the first hour of the reaction. Overall, the catalytic efficiency of this engineered Mb rivals that of some of the most active transition metal catalysts reported to date for similar transformations (11-98,000 TTN) (Che, Huang et al. 2001; Davies and Venkataramani 2003; Anding, Ellern et al. 2012) while offering greater diastereo- and stereocontrol (compared to 75-94% d.e. and 83-98% e.e.) (Che, Huang et al. 2001; Davies and Venkataramani 2003; Anding, Ellern et al. 2012). Furthermore, unlike the latter, no slow addition of the diazo reagent was required in the Mb-catalyzed reactions to minimize dimer formation.
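The arithmetic linking catalyst loading, conversion, turnover number, and average rate in paragraph [00237] can be illustrated with the following non-limiting Python sketch. The function names are hypothetical. Two assumptions are made: that the catalyst loading is quoted relative to styrene, which is the limiting reagent when EDA is in two-fold excess, and that a simple single-substrate Michaelis-Menten expression is adequate for estimating fractional saturation (the disclosure reports KM values but does not specify a two-substrate rate law).

```python
def turnover_number(product_molar: float, catalyst_molar: float) -> float:
    """Return TON as moles of product formed per mole of catalyst."""
    if catalyst_molar <= 0:
        raise ValueError("catalyst concentration must be positive")
    return product_molar / catalyst_molar


def conversion_from_ton(ton: float, loading_mol_percent: float) -> float:
    """Return fractional conversion of the reference reagent for a given TON.

    Assumes the loading is quoted relative to the reagent being converted.
    """
    return ton * loading_mol_percent / 100.0


def average_rate_per_min(ton: float, minutes: float) -> float:
    """Return the average rate in turnovers per minute over a time window."""
    if minutes <= 0:
        raise ValueError("time window must be positive")
    return ton / minutes


def fractional_saturation(substrate_mm: float, km_mm: float) -> float:
    """Return v/Vmax = [S] / (KM + [S]) for a single-substrate model (assumption)."""
    if substrate_mm < 0 or km_mm <= 0:
        raise ValueError("substrate must be non-negative and KM positive")
    return substrate_mm / (km_mm + substrate_mm)


# Full conversion of 0.2 M styrene with 20 uM catalyst (0.01 mol% loading).
ton_full = turnover_number(0.2, 20e-6)
# About 30,000 turnovers in the first hour, expressed as an average rate.
rate_first_hour = average_rate_per_min(30_000, 60)
# Fractional conversion implied by 46,000 turnovers at 0.001 mol% loading.
conversion_overnight = conversion_from_ton(46_000, 0.001)
# Estimated saturation at 200 mM styrene with KM of about 2 mM.
saturation_styrene = fractional_saturation(200.0, 2.0)
```

Under these assumptions, quantitative conversion at 0.01 mol% loading corresponds to 10,000 turnovers, and 30,000 turnovers in one hour corresponds to the reported average rate of 500 turnovers min⁻¹.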
[00238] To examine the substrate scope of the Mb variant, a variety of styrene derivatives as well as additional olefin substrates were subjected to Mb(H64V,V68A)-catalyzed cyclopropanation in the presence of EDA. Using a catalyst loading of 0.07 mol%, efficient cyclopropanation of para- (1b-1e), meta- (1f), and ortho-substituted (1g) styrenes could be achieved with yields ranging from 69 to 92% (FIG. 9). Importantly, excellent levels of E-diastereoselectivity and (1S,2S)-enantioselectivity were observed in each case, highlighting the broad scope of the Mb-based catalyst in terms of both activity and selectivity across the variously substituted styrene derivatives. At even lower catalyst loadings (0.001 mol%), Mb(H64V,V68A) was found to support TTNs ranging from 7,700 to 14,500 on these substrates. Substrates such as α-methylstyrene (1h) and N-methyl-3-vinyl-indole (1i) could also be converted to the corresponding cyclopropanation products 10a and 11a with high selectivity, although the efficiency of the reaction with the latter (1i) was compromised by the instability of this substrate in water.
[00239] The scope of the Mb variant with respect to the carbene precursor reagent was also explored. Mb(H64V,V68A)-catalyzed styrene cyclopropanation with tert-butyl diazoacetate (12) and ethyl diazopropanoate (13) yielded the corresponding (1S,2S) cyclopropane products, i.e., tert-butyl 2-phenylcyclopropane-1-carboxylate (14) and ethyl 1-methyl-2-phenylcyclopropane-1-carboxylate (15), respectively, with good diastereoselectivity (82% d.e. and 74% d.e., respectively) and moderate enantioselectivity (58% e.e. and 1% e.e., respectively). Ethyl 2-diazo-2-phenylacetate was also accepted by the Mb variant, although cyclopropanation of styrene with this diazo reagent proceeded with low efficiency (TON < 10).
[00240] Further experiments were carried out to assess the stability of the myoglobin catalysts in organic co-solvents and at elevated temperature. Notably, using the
cyclopropanation of styrene with EDA as a test reaction, Mb(H64V,V68A) and other engineered Mb variants were found to retain between 50 and 90% of their carbene transfer activity in the presence of up to 30-40% of an organic cosolvent (MeOH, THF, DMF, or CH3CN). Similarly, they were found to retain between 50 and 90% of their carbene transfer activity at elevated temperatures up to 60°C. These results further highlight the operational robustness of these biocatalysts for carbene transfer reactions.
[00241] In summary, this work demonstrates that engineered variants of sperm whale myoglobin can provide highly reactive and selective olefin cyclopropanation catalysts. For example, the engineered Mb variant Mb(H64V,V68A) is capable of catalyzing the
cyclopropanation of a variety of olefins with an unprecedented combination of catalytic proficiency (10-46,800 TON) and excellent E-diastereo- and enantioselectivity (>99%). The practical utility of this biocatalyst is further highlighted by its ability to operate at high reagent concentrations (i.e., 0.2-0.4 M) and in the presence of organic cosolvents or at elevated temperatures.
[00242] Experimental Details.
[00243] Reagents and Analytical Methods. All chemicals and reagents were purchased from commercial suppliers (Sigma-Aldrich, ACS Scientific, Acros) and used without further purification, unless otherwise stated. All dry reactions were carried out under argon or nitrogen in oven-dried glassware with magnetic stirring using standard gas-tight syringes, cannulae and septa. 1H and 13C NMR spectra were measured on a Bruker DPX-400 (operating at 400 MHz for 1H and 100 MHz for 13C) or a Bruker DPX-500 (operating at 500 MHz for 1H and 125 MHz for 13C). Tetramethylsilane (TMS) served as the internal standard (0 ppm) for 1H NMR and CDCl3 was used as the internal standard (77.0 ppm) for 13C NMR. Silica gel chromatography purifications were carried out using AMD Silica Gel 60 230-400 mesh. Preparative thin layer chromatography was performed on TLC plates (Merck). Gas chromatography (GC) analyses were carried out using a Shimadzu GC-2010 gas chromatograph equipped with an FID detector and a Shimadzu SHRXI-5MS column (15 m x 0.25 mm x 0.25 μm film). Enantiomeric excess was determined by chiral gas chromatography (GC) using a Shimadzu GC-2010 gas chromatograph equipped with an FID detector and a Cyclosil-B column (30 m x 0.25 mm x 0.25 μm film).
[00244] Cloning. The gene encoding sperm whale myoglobin was cloned into the Nde I/Xho I cassette of plasmid pET22b (Novagen) to give pET22_Mb. The engineered Mb variants were prepared by SOE PCR using appropriate mutagenizing primers, which were ordered from IDT Technologies.
[00245] Protein expression and purification. Wild-type Mb and the engineered Mb variants were expressed in E. coli BL21(DE3) cells as described previously (Bordeaux, Singh et al. 2014). Briefly, cells were grown in TB medium (ampicillin, 100 mg L−1) at 37 °C (150 rpm) until OD600 reached 0.6. Cells were then induced with 0.25 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) and 0.3 mM δ-aminolevulinic acid (ALA). After induction, cultures were shaken at 150 rpm and 27 °C and harvested after 20 h by centrifugation at 4000 rpm at 4 °C. After cell lysis by sonication, the proteins were purified by Ni-affinity chromatography using the following buffers: loading buffer (50 mM KPi, 800 mM NaCl, pH 7.0), wash buffer 1 (50 mM KPi, 800 mM NaCl, pH 6.2), wash buffer 2 (50 mM KPi, 800 mM NaCl, 250 mM glycine, pH 7.0) and elution buffer (50 mM KPi, 800 mM NaCl, 300 mM L-histidine, pH 7.0). After buffer exchange (50 mM KPi, pH 7.0), the enzymes were stored at +4 °C. Myoglobin concentration was determined using an extinction coefficient ε410 = 157 mM−1 cm−1 (Redaelli, Monzani et al. 2002).
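As an illustration of how a myoglobin concentration follows from the stated extinction coefficient, the sketch below applies the Beer-Lambert law, c = A / (ε · l). Only ε410 = 157 mM−1 cm−1 comes from the text; the absorbance reading, the 1 cm path length, and the dilution factor are hypothetical values chosen for the example.

```python
EPSILON_410_PER_MM_PER_CM = 157.0  # extinction coefficient stated in the text


def myoglobin_conc_uM(absorbance_410: float,
                      path_length_cm: float = 1.0,
                      dilution_factor: float = 1.0) -> float:
    """Beer-Lambert estimate of the stock concentration in micromolar.

    The path length and dilution factor are assumptions for illustration;
    the text specifies only the extinction coefficient.
    """
    conc_mM = absorbance_410 / (EPSILON_410_PER_MM_PER_CM * path_length_cm)
    return conc_mM * 1000.0 * dilution_factor


# Hypothetical reading: A410 = 0.785 measured on a 1:10 diluted sample.
stock_uM = myoglobin_conc_uM(0.785, path_length_cm=1.0, dilution_factor=10.0)
print(f"Estimated stock concentration: {stock_uM:.1f} uM")
```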
[00246] Cyclopropanation reactions. Initial reactions (FIG. 4) were carried out at a 400 μL scale using 20 μM myoglobin, 30 mM styrene, 10 mM EDA, and 10 mM sodium dithionite. In a typical procedure, a solution containing sodium dithionite (100 mM stock solution) in potassium phosphate buffer (50 mM, pH 7.0) was degassed by bubbling argon into the mixture for 5 min in a sealed vial. A buffered solution containing myoglobin was carefully degassed in a similar manner in a separate vial. The two solutions were then mixed together via cannulation.
Reactions were initiated by addition of 10 μL of styrene (from a 1.2 M stock solution in methanol), followed by the addition of 10 μL of EDA (from a 0.4 M stock solution in methanol) with a syringe, and the reaction mixture was stirred for 18 h at room temperature under positive argon pressure. Reactions with hemin were carried out using an identical procedure with the exception that the purified Mb was replaced by 80 μL of a hemin solution (100 μM in DMSO:H2O, 1:1). Other reactions were performed according to this general procedure, varying the catalyst loading and the substrate and reagent concentrations as described above.
[00247] Product analysis. The reactions were analyzed by adding 20 μL of internal standard (benzodioxole, 100 mM in methanol) to the reaction mixture, followed by extraction with 400 μL of butyl acetate and analysis by GC-FID. Calibration curves for quantification of the different cyclopropane products were constructed using authentic standards produced synthetically as described below. All measurements were performed at least in duplicate. For each experiment, negative control samples containing either no hemoprotein or no reductant were included. For enantio- and stereoselectivity determination, the samples were analyzed by GC-FID using a chiral column. Racemic cyclopropanes produced synthetically (Rh2(OAc)4 as catalyst) were used as standards. Absolute configuration of the stereoisomeric products was assigned according to reference reactions carried out in the presence of a previously reported chiral Ru(II)-(Pheox) catalyst (Abu-Elfotoh, Phomkeona et al. 2010).
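The selectivity values reported in this example (% d.e. and % e.e.) are derived from chiral GC peak areas. The sketch below shows the standard excess calculation; it assumes that peak areas are proportional to the amount of each stereoisomer (equal detector response), and all peak areas shown are hypothetical, not measured data.

```python
def excess_percent(major_area: float, minor_area: float) -> float:
    """Percent excess of the major species over the minor species.

    Used for e.e. (two enantiomers) or d.e. (E versus Z diastereomers),
    assuming peak area is proportional to amount.
    """
    total = major_area + minor_area
    if total <= 0:
        raise ValueError("peak areas must sum to a positive value")
    return 100.0 * (major_area - minor_area) / total


# Hypothetical chiral GC peak areas for the four stereoisomers.
areas = {"E_major": 985.0, "E_minor": 5.0, "Z_a": 6.0, "Z_b": 4.0}

e_total = areas["E_major"] + areas["E_minor"]
z_total = areas["Z_a"] + areas["Z_b"]

de = excess_percent(e_total, z_total)                     # E over Z
ee = excess_percent(areas["E_major"], areas["E_minor"])   # within the E pair
print(f"d.e. = {de:.1f}%, e.e. = {ee:.1f}%")
```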
[00248] Chemical synthesis of standard racemic cyclopropanation products. To generate authentic racemic standards for the different cyclopropanation products, Rh2(OAc)4-catalyzed cyclopropanation reactions were carried out according to the following general procedure. To a flame-dried round-bottom flask were added olefin (5 equiv.) and Rh2(OAc)4 (2 mol%) in CH2Cl2 (2 mL) under argon. To this solution was added a solution of diazo compound (1 equiv.) in CH2Cl2 (3-5 mL) via slow addition over 30-40 minutes. The resulting mixture was stirred at room temperature for another 30 min to 1 hour. The solvent was removed under vacuum and the crude mixture was purified by flash chromatography using a 9:1 hexanes:diethyl ether mixture. The identity of the cyclopropane products was confirmed by GC-MS and 1H and 13C NMR.
[00249] Ethyl 2-(p-tolyl)cyclopropane-1-carboxylate (4): Following the standard procedure, yield = 82%, GC-MS m/z (% relative intensity): 204(28.4), 158(19.7), 147(21.1), 131(100), 91(28.1), E-isomers: colorless liquid, 1H NMR (CDCl3, 400 MHz): δ 7.11 (d, J = 7.6 Hz, 2H), 7.01 (d, J = 7.6 Hz, 2H), 4.21 (q, J = 7.2 Hz, 2H), 2.53-2.48 (m, 1H), 2.32 (s, 3H), 1.90-1.85 (m, 1H), 1.61-1.56 (m, 1H), 1.31-1.27 (m, 4H) ppm, 13C NMR (CDCl3, 100 MHz): δ 173.5, 137.0, 136.0, 129.1, 126.1, 60.6, 25.9, 24.0, 20.9, 16.9, 14.2 ppm, Z-isomers: colorless liquid, 1H NMR (CDCl3, 400 MHz): δ 7.15 (d, J = 8.0 Hz, 2H), 7.06 (d, J = 7.6 Hz, 2H), 3.91 (q, J = 6.8 Hz, 2H), 2.56 (dd, J = 16.8, 8.8 Hz, 1H), 2.29 (s, 3H), 2.06-2.01 (m, 1H), 1.69 (dd, J = 12.4, 5.6 Hz, 1H), 1.32-1.27 (m, 1H), 1.02 (t, J = 7.2 Hz, 3H) ppm, 13C NMR (CDCl3, 100 MHz): δ 171.0, 136.1, 133.4, 129.1, 128.5, 60.1, 25.1, 21.6, 21.0, 14.0, 11.1 ppm.
[00250] Ethyl 2-(4-methoxyphenyl)cyclopropane-1-carboxylate (5). Following the standard procedure, yield = 84%, GC-MS m/z (% relative intensity): 220(41.8), 191(14.5), 147(100), 91(32.6), E-isomers: white solid, 1H NMR (CDCl3, 400 MHz): δ 7.03 (d, J = 8.4 Hz, 2H), 6.82 (d, J = 8.4 Hz, 2H), 4.18 (q, J = 7.2 Hz, 2H), 3.76 (s, 3H), 2.49-2.45 (m, 1H), 1.83-1.79 (m, 1H), 1.56-1.52 (m, 1H), 1.32-1.21 (m, 4H) ppm, 13C NMR (CDCl3, 100 MHz): δ 173.5, 158.3, 132.1, 127.5, 113.9, 60.6, 55.2, 25.6, 23.8, 16.7, 14.2 ppm, Z-isomers: colorless liquid, 1H NMR (CDCl3, 400 MHz): δ 7.18 (d, J = 8.4 Hz, 2H), 6.80 (d, J = 8.4 Hz, 2H), 3.91 (q, J = 7.2 Hz, 2H), 3.78 (s, 3H), 2.54 (dd, J = 16.8, 8.4 Hz, 1H), 2.05 (dd, J = 14.4, 8.0 Hz, 1H), 1.67-1.61 (m, 1H), 1.31-1.26 (m, 1H), 1.02 (t, J = 7.2 Hz, 3H) ppm, 13C NMR (CDCl3, 100 MHz): δ 171.1, 158.3, 130.3, 128.5, 113.3, 60.1, 55.1, 24.8, 21.7, 14.1, 11.2 ppm.
[00251] Ethyl 2-(4-chlorophenyl)cyclopropane-1-carboxylate (6). Following the standard procedure, yield = 82%, GC-MS m/z (% relative intensity): 224(38.9), 178(20.9), 151(100), 115(98.9), 89(14.9), E-isomers: colorless liquid, 1H NMR (CDCl3, 400 MHz): δ 7.24 (d, J = 7.6 Hz, 2H), 7.02 (d, J = 7.2 Hz, 2H), 4.19 (q, J = 6.8 Hz, 2H), 2.50-2.45 (m, 1H), 1.87-1.83 (m, 1H), 1.61-1.56 (m, 1H), 1.29-1.23 (m, 4H) ppm, 13C NMR (CDCl3, 100 MHz): δ 173.1, 138.6, 132.1, 128.5, 127.5, 60.8, 25.4, 24.1, 16.9, 14.2 ppm.
[00252] Ethyl 2-(4-(trifluoromethyl)phenyl)cyclopropane-1-carboxylate (7). Following the standard procedure, yield = 69%, GC-MS m/z (% relative intensity): 258(58.8), 230(35.3), 203(43.9), 185(100), 165(70), 115(48.8), E-isomers: colorless liquid, 1H NMR (CDCl3, 400 MHz): δ 7.52 (d, J = 7.6 Hz, 2H), 7.19 (d, J = 8.0 Hz, 2H), 4.20 (q, J = 7.2 Hz, 2H), 2.57-2.52 (m, 1H), 1.95-1.91 (m, 1H), 1.67-1.62 (m, 1H), 1.34-1.25 (m, 4H) ppm, 13C NMR (CDCl3, 100 MHz): δ 172.8, 144.3, 126.4, 125.3, 60.9, 25.6, 24.4, 17.2, 14.2 ppm, Z-isomers: colorless liquid, 1H NMR (CDCl3, 400 MHz): δ 7.51 (d, J = 7.6 Hz, 2H), 7.37 (d, J = 8.0 Hz, 2H), 3.90 (q, J = 7.2 Hz, 2H), 2.61 (dd, J = 16.4, 8.4 Hz, 1H), 2.15 (dd, J = 14.8, 8.0 Hz, 1H), 1.74-1.70 (m, 1H), 1.40-1.36 (m, 1H), 0.99 (t, J = 7.2 Hz, 3H) ppm.
[00253] Ethyl 2-(m-tolyl)cyclopropane-1-carboxylate (8). Following the standard procedure, yield = 81%, GC-MS m/z (% relative intensity): 204(20.9), 158(16.1), 147(13.7), 131(100), 115(21.2), E-isomers: colorless liquid, 1H NMR (CDCl3, 400 MHz): δ 7.19-7.16 (m, 1H), 7.03-7.01 (m, 1H), 6.93-6.89 (m, 2H), 4.20 (q, J = 6.8 Hz, 2H), 2.50-2.49 (m, 1H), 2.33 (s, 3H), 1.91-1.88 (m, 1H), 1.60-1.57 (m, 1H), 1.31-1.27 (m, 4H) ppm, 13C NMR (CDCl3, 100 MHz): δ 173.4, 140.0, 138.0, 128.3, 127.2, 127.0, 123.1, 60.6, 26.1, 24.1, 21.3, 16.9, 14.3 ppm, Z-isomers: colorless liquid, 1H NMR (CDCl3, 400 MHz): δ 7.16-7.12 (m, 1H), 7.08-7.04 (m, 2H), 7.01 (d, J = 7.2 Hz, 1H), 3.91 (q, J = 7.2 Hz, 2H), 2.57 (dd, J = 16.8, 8.4 Hz, 1H), 2.31 (s, 3H), 2.08 (dd, J = 14.4, 8.4 Hz, 1H), 1.71 (dd, J = 12.0, 5.6 Hz, 1H), 1.32-1.27 (m, 1H), 1.00 (t, J = 7.2 Hz, 3H) ppm, 13C NMR (CDCl3, 100 MHz): δ 171.0, 137.3, 136.4, 130.1, 127.7, 127.3, 126.3, 60.1, 25.3, 21.7, 21.3, 14.0, 11.0 ppm.
[00254] Ethyl 2-(o-tolyl)cyclopropane-1-carboxylate (9). Following the standard procedure, yield = 81%, GC-MS m/z (% relative intensity): 204(29.13), 158(16.7), 147(16.7), 131(100), 91(28.30), E-isomers: colorless liquid, 1H NMR (CDCl3, 400 MHz): δ 7.16 (m, 3H), 7.01-6.99 (m, 1H), 4.23 (q, J = 7.6 Hz, 2H), 2.55-2.49 (m, 1H), 2.38 (s, 3H), 1.82-1.76 (m, 1H), 1.60-1.55 (m, 1H), 1.31-1.28 (m, 4H) ppm, 13C NMR (CDCl3, 100 MHz): δ 173.8, 138.0, 137.8, 129.8, 126.7, 125.8, 60.6, 24.6, 22.3, 19.5, 15.3, 14.3 ppm, Z-isomers: colorless liquid, 1H NMR (CDCl3, 400 MHz): δ 7.20 (m, 1H), 7.11 (m, 3H), 3.87 (q, J = 7.2 Hz, 2H), 2.47 (dd, J = 16.8, 8.4 Hz, 1H), 2.34 (s, 3H), 2.18 (dd, J = 14.0, 8.4 Hz, 1H), 1.76-1.72 (m, 1H), 1.37-1.32 (m, 1H), 0.94 (t, J = 7.2 Hz, 3H) ppm, 13C NMR (CDCl3, 100 MHz): δ 171.2, 138.1, 134.9, 129.4, 129.1, 126.7, 125.3, 60.0, 24.4, 21.1, 19.3, 13.9, 11.2 ppm.
[00255] Ethyl 2-methyl-2-phenylcyclopropane-1-carboxylate (10): Following the standard procedure, yield = 72%, GC-MS m/z (% relative intensity): 204(4.18), 175(9.7), 159(15.19), 147(13.9), 131(100), 91(41.1), E-isomers: colorless liquid, 1H NMR (CDCl3, 400 MHz): δ 7.30 (m, 4H), 7.22-7.20 (m, 1H), 4.23 (q, J = 6.2 Hz, 2H), 1.99-1.96 (m, 1H), 1.54 (s, 3H), 1.46-1.40 (m, 2H), 1.32-1.29 (m, 3H) ppm, 13C NMR (CDCl3, 100 MHz): δ 172.1, 145.9, 128.4, 127.3, 126.4, 60.4, 30.5, 27.8, 20.7, 19.6, 14.4 ppm, Z-isomers: colorless liquid, 1H NMR (CDCl3, 400 MHz): δ 7.26 (m, 4H), 7.20-7.18 (m, 1H), 3.87-3.78 (m, 2H), 1.91-1.88 (m, 1H), 1.79-1.76 (m, 1H), 1.46 (s, 3H), 1.57-1.13 (m, 1H), 0.95-0.92 (m, 3H) ppm, 13C NMR (CDCl3, 100 MHz): δ 171.2, 141.9, 128.7, 128.1, 126.6, 60.0, 32.0, 28.5, 19.4, 13.9 ppm.
[00256] Ethyl 2-(1-methyl-1H-indol-3-yl)cyclopropane-1-carboxylate (11): This product was obtained following the standard Rh-catalyzed cyclopropanation protocol starting from 1-methyl-3-vinyl-1H-indole, which was synthesized according to a published procedure (Waser, Caspar et al. 2006). Yield = 54%, GC-MS m/z (% relative intensity): 243(62.8), 214(31.7), 170(100), E-isomers: brown semi-solid, 1H NMR (CDCl3, 400 MHz): δ 7.66 (d, J = 7.6 Hz, 1H), 7.29-7.21 (m, 2H), 7.14-7.10 (m, 1H), 6.80 (s, 1H), 4.22 (q, J = 7.2 Hz, 2H), 3.72 (s, 3H), 2.59-2.57 (m, 1H), 1.87-1.84 (m, 1H), 1.55-1.46 (m, 1H), 1.31-1.21 (m, 4H) ppm. Z-isomers: brown semi-solid, 1H NMR (CDCl3, 400 MHz): δ 7.65 (d, J = 7.6 Hz, 1H), 7.25-7.23 (m, 1H), 7.19-7.16 (m, 1H), 7.09-7.05 (m, 1H), 6.89 (s, 1H), 3.89-3.72 (m, 2H), 3.66 (s, 3H), 2.59 (dd, J = 16.4, 8.4 Hz, 1H), 2.14 (dd, J = 14.0, 8.0 Hz, 1H), 1.60-1.56 (m, 1H), 1.38-1.34 (m, 1H), 0.93 (t, J = 7.2 Hz, 3H) ppm.
[00257] tert-Butyl 2-phenylcyclopropane-1-carboxylate (14). Following the standard procedure, yield = 78%, GC-MS m/z (% relative intensity): 218(0.25), 162(67.5), 144(42.7), 117(100), 57(68.3), E-isomers: colorless liquid, 1H NMR (CDCl3, 400 MHz): δ 7.29 (d, J = 7.2 Hz, 2H), 7.21-7.17 (m, 1H), 7.10 (d, J = 7.2 Hz, 2H), 2.47-2.42 (m, 1H), 1.86-1.82 (m, 1H), 1.56-1.52 (m, 1H), 1.48 (s, 9H), 1.26-1.22 (m, 1H) ppm, 13C NMR (CDCl3, 100 MHz): δ 172.5, 140.5, 128.5, 126.3, 126.0, 80.5, 28.4, 28.1, 27.9, 25.7, 25.3, 17.0 ppm. Z-isomers: colorless liquid, 1H NMR (CDCl3, 400 MHz): δ 7.26-7.22 (m, 4H), 7.19-7.17 (m, 1H), 2.55-2.51 (m, 1H), 2.00-1.94 (m, 1H), 1.65-1.61 (m, 1H), 1.25-1.20 (m, 1H), 1.13 (s, 9H) ppm, 13C NMR (CDCl3, 100 MHz): δ 170.1, 136.8, 129.5, 127.8, 126.4, 80.0, 27.7, 25.0, 22.7, 10.5 ppm.
[00258] Ethyl 1-methyl-2-phenylcyclopropane-1-carboxylate (15). Following the standard procedure, yield = 62%, GC-MS m/z (% relative intensity): 204(23.1), 158(23.6), 147(26.2), 131(100), 91(43.6), E-isomers: colorless liquid, 1H NMR (CDCl3, 400 MHz): δ 7.29-7.17 (m, 5H), 4.19 (q, J = 7.2 Hz, 2H), 2.82 (t, J = 8.0 Hz, 1H), 1.70-1.66 (m, 1H), 1.30 (t, J = 7.2 Hz, 3H), 1.17-1.14 (m, 1H), 0.98 (s, 3H) ppm, 13C NMR (CDCl3, 100 MHz): δ 175.6, 136.9, 129.1, 127.8, 126.6, 60.7, 31.6, 25.1, 14.5 ppm.
[00259] 6.2 Example 2: N-H carbene insertion reactions catalyzed by myoglobin-based catalysts
[00260] Upon the inventor's discovery of the remarkable reactivity of Mb-based catalysts toward olefin cyclopropanation in the presence of diazo reagents, the utility and scope of these catalysts in the context of carbene N-H insertion reactions were investigated. Initially, the activity of wild-type sperm whale myoglobin (Mb; SEQ ID NO: 1) toward catalyzing the conversion of aniline (21) to ethyl 2-(phenylamino)acetate (23) in the presence of ethyl diazoacetate (EDA, 22) (FIG. 11) was tested.
[00261] Under anaerobic conditions and in the presence of dithionite as a reductant, formation of the desired product 23 was observed, thus demonstrating that this hemoprotein is able to mediate carbene N-H insertion reactions. Negligible formation of 23 was noted in the absence of reductant or in the presence of oxygen, indicating that ferrous Mb is responsible for the observed reactivity and that molecular oxygen interferes with it, most likely by competing with the diazo reagent for binding to the heme iron. The absence of product formation upon complexation of ferrous Mb with carbon monoxide provided further evidence for the direct involvement of the heme cofactor in catalysis. In previous studies, it was established that the Mb variant Mb(H64V,V68A) possesses greatly enhanced carbene transfer activity in the context of olefin cyclopropanation (Example 1).
[00262] Upon testing, Mb(H64V,V68A) was found to also exhibit significantly higher N-H insertion reactivity than wild-type Mb (>500 vs. 210 TON, FIG. 11).
[00263] Following reaction optimization, it was established that quantitative conversion of aniline to 23 could be obtained at millimolar substrate concentration (0.01 M) using Mb(H64V,V68A) at 0.2 mol% and an equimolar ratio of the amine and diazo reagent (FIG. 11). As a comparison, 10- to 25-fold higher catalyst loadings have been reported in association with similar transformations and yields using transition metal complexes (Morilla, Diaz-Requejo et al. 2002; Liu, Zhu et al. 2007). Relatively high turnover numbers (200 TON) were also obtained in the presence of stoichiometric amounts of dithionite relative to the Mb catalyst (FIG. 11), indicating that an excess of reductant is beneficial but not essential for the transformation. Importantly, Mb(H64V,V68A) was found to remain active in the presence of the amine substrate and EDA at a concentration as high as 0.16 M, which corresponds to ~15 g aniline per L (FIG. 11). These findings are noteworthy considering that aniline is known to coordinate the heme iron in heme-containing enzymes and thus potentially inhibit their function (Rein, Maricic et al. 1976; Locuson, Hutzler et al. 2007).
[00264] Furthermore, no formation of EDA dimerization byproducts (i.e., ethyl fumarate and maleate) or of the double addition product (Ph-N(CH2COOEt)2) was noted in the
Mb(H64V,V68A) reactions, even under the aforementioned high-substrate-loading conditions. This result is in contrast with the mixture of single and double addition products generated from similar reactions in the presence of iron-porphyrins (Aviv and Gross 2006, Aviv and Gross 2008) or free hemin (Wang, Peck et al. 2014) as catalysts.
[00265] At the highest substrate concentration tested (0.16 M), Mb(H64V,V68A) catalyzes nearly 3,000 turnovers. With 10 mM aniline and a catalyst loading of 0.001 mol%, over 6,000 total turnovers (TTN) were supported by this Mb variant. This value is an order of magnitude higher than that recently reported for engineered P450BM3 variants (Wang, Peck et al. 2014) and ranks among the highest TTNs reported for catalytic N-H insertion reactions with acceptor-only diazo compounds (Aviv and Gross 2006). The Mb(H64V,V68A)-catalyzed reaction is also remarkably fast, proceeding at an initial rate of 740 and 174 turnovers min−1 over the first minute and first 10 min, respectively.
[00266] To investigate the substrate scope of the Mb-based catalyst, a range of substituted anilines (24a-32a) and other arylamines (33a, 34a) were subjected to Mb(H64V,V68A)-catalyzed N-H functionalization in the presence of EDA. As summarized in FIG. 12, all of the aniline derivatives, including para- (24a-29a), meta- (30a), and ortho-substituted anilines (31a), could be converted to the desired N-H insertion product in very good to excellent yields (67-99%). In terms of substituents on the aniline rings, both electron-donating (25a) and electron-withdrawing (26b, 27b) groups were well tolerated by the Mb catalyst. Formation of the N-methyl derivative 32b from N-methylaniline in 67% yield indicated that secondary aryl amines are also converted by the Mb(H64V,V68A) variant, although with somewhat lower efficiency than aniline. Substrates 33b and 34a were then tested to explore the scope of the reaction across other aromatic amines. In this case too, high conversions were achieved in the Mb(H64V,V68A)-catalyzed reaction, further supporting the broad substrate scope of this catalyst. Reactions with the amine substrates described above were also carried out in the presence of low catalyst loading (0.01 mol%). Under these conditions, between 1,000 and 6,900 total turnovers were measured, highlighting the reactivity and robustness of the Mb catalyst for N-H carbene insertion across various aromatic amines (FIG. 12).
[00267] Despite the broad substrate scope and high activity of Mb(H64V,V68A), comparatively lower yields were observed with N-methylaniline as the carbene acceptor substrate (FIG. 12). To determine whether improved activity on this substrate could be obtained by using other engineered Mb variants, a panel of previously prepared Mb variants containing 1-2 amino acid mutations within the distal cavity of the protein (FIG. 1) was screened. From the screening, both Mb(H64V) and Mb(H64V,L29A) were found to be considerably more efficient than Mb(H64V,V68A) toward the conversion of N-methylaniline, supporting nearly four-fold higher turnovers on this substrate (3,910 vs. 1,030, FIG. 13, top panel). Using Mb(H64V,L29A) at 0.2 mol%, 32b could thus be obtained in higher yield (75%). To examine the performance of the Mb variants in the presence of a different α-diazoester reagent, reactions with aniline were repeated with tert-butyl diazoacetate (tBDA) as the carbene source. Whereas this reaction is efficiently catalyzed by Mb(H64V,V68A) (3,620 TTN, FIG. 13, bottom panel), even higher TTN values for formation of the product 35 were observed for Mb(H64V) and Mb(H64V,L29A) (5,730 and 7,540 TTN, respectively).
[00268] The L29A substitution was determined to have a distinctive beneficial effect on N-H insertion activity (FIG. 13), whereas the same mutation had little impact on Mb-catalyzed olefin cyclopropanation, even though a common reactive intermediate (i.e., a heme-bound carbenoid) is likely involved in these reactions. As this mutation expands the volume above the heme iron center (Fe-Leu29(Cβ) distance ~8 Å, FIG. 1), it could play a role in better accommodating the amine substrate prior to or after attack by the carbenoid species. Notably, the beneficial effect of this substitution on carbene N-H insertion reactivity appeared to be rather general, as indicated by the high TTNs measured with Mb(L29A) with other aniline derivatives in addition to those considered above (e.g., 25b: 4,990 turnovers; 26b: 6,790 turnovers).
[00269] In addition to the Mb variants listed in FIG. 13, several other variants of sperm whale Mb were found to exhibit high activity toward olefin cyclopropanation reactions as described in Example 1 and toward the carbene N-H insertion reactions described here. These Mb variants include Mb(F43W,I107M), Mb(L29A,I107A), and Mb(F43V,V68F).
[00270] In order to investigate the reactivity of the Mb catalysts in the context of additional carbene donor reagents, various engineered Mb variants were screened for their catalytic N-H insertion activity on aniline in the presence of cyclohexyl 2-diazoacetate, ethyl 2-diazopropanoate, tert-butyl 2-diazopropanoate, or ethyl 2-diazo-2-phenylacetate as the carbene donor. As illustrated by the representative data in FIG. 14, many of the engineered Mb variants exhibited high activity on these substrates and reagents, with this activity ranging from 2- to over 10-fold higher than that of wild-type sperm whale myoglobin (SEQ ID NO: 1). Furthermore, unlike wild-type Mb, the Mb variants were capable of catalyzing the reactions with α-substituted diazo compounds (i.e., ethyl 2-diazopropanoate, tert-butyl 2-diazopropanoate) in an enantioselective manner (15-30% e.e.).
[00271] In order to investigate the reactivity of the Mb catalysts in the context of less reactive N-H containing substrates (i.e., alkyl amines), the panel of engineered Mb variants was screened for catalytic N-H insertion activity on benzylamine, morpholine, and cyclohexylamine (FIG. 15). Notably, all these N-H containing substrates could be efficiently converted to the corresponding carbene N-H insertion products (40, 41, 42, 43, respectively) by the Mb catalysts. In this case too, 5- to 20-fold higher TON were measured for the engineered Mb variants as compared to the wild-type hemoprotein (FIG. 15). Furthermore, the utility of these Mb catalysts for enantioselective carbene N-H insertion reactions was highlighted by the good % e.e. obtained for some of the Mb variants in the synthesis of ethyl 2-(benzylamino)propanoate (41).
[00272] In conclusion, the data presented above provide a first demonstration of the ability of engineered Mb variants to efficiently catalyze carbenoid N-H insertion reactions, including in an enantioselective manner. The excellent chemoselectivity, high numbers of catalytic turnovers, and broad substrate scope across various aryl and alkyl amines make these Mb-based catalysts particularly useful from a synthetic standpoint.
[00273] Experimental Details.
[00274] Analytical Methods. Gas chromatography (GC) analyses were carried out using a Shimadzu GC-2010 gas chromatograph equipped with an FID detector and a Shimadzu SHRXI-5MS column (15 m x 0.25 mm x 0.25 μm film). Separation method: 1 μL injection, injector temp.: 200 °C, detector temp.: 300 °C. Gradient: column temperature set at 60 °C for 1 min, then to 200 °C at 10 °C/min, then to 290 °C at 30 °C/min. Total run time was 19.00 min. Chiral GC analyses were carried out using a Shimadzu GC-2010 gas chromatograph equipped with an FID detector and a Cyclosil-B column (30 m x 0.25 mm x 0.25 μm film). Separation method: 1 μL injection, injector temp.: 200 °C, detector temp.: 300 °C. Gradient: column temperature set at 140 °C for 3 min, then to 160 °C at 1.8 °C/min, then to 165 °C at 1 °C/min, then to 245 °C at 25 °C/min. Total run time was 28 min.
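For readers who wish to reproduce the oven programs, the time consumed by the initial hold and the temperature ramps can be tallied as in the sketch below. The set points and rates are those given above; the helper name `program_minutes` is hypothetical. The computed ramp times (18 min and about 22.3 min) are shorter than the stated total run times (19 and 28 min); the difference presumably reflects a final hold at the last temperature, which is an assumption because no final hold is specified in the text.

```python
def program_minutes(start_temp_c, initial_hold_min, ramps):
    """Sum the initial hold and the time of each linear ramp.

    ramps: list of (target_temp_c, rate_c_per_min) tuples, applied in order.
    Any final hold is NOT included because the text does not specify one.
    """
    total = initial_hold_min
    temp = start_temp_c
    for target, rate in ramps:
        total += abs(target - temp) / rate
        temp = target
    return total


# Achiral method: 60 C (1 min), 10 C/min to 200 C, 30 C/min to 290 C.
achiral = program_minutes(60, 1, [(200, 10), (290, 30)])

# Chiral method: 140 C (3 min), 1.8 C/min to 160 C, 1 C/min to 165 C,
# 25 C/min to 245 C.
chiral = program_minutes(140, 3, [(160, 1.8), (165, 1), (245, 25)])

print(f"achiral ramps: {achiral:.2f} min; implied final hold: {19.0 - achiral:.2f} min")
print(f"chiral ramps: {chiral:.2f} min; implied final hold: {28.0 - chiral:.2f} min")
```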
[00275] Protein expression and purification. Wild-type Mb and the engineered Mb variants were expressed in E. coli BL21(DE3) cells as described in EXAMPLE 1.
[00276] N-H insertion reactions. Reactions were typically carried out at a 400 μL scale using 20 μM myoglobin, 10 mM aniline, 10 or 5 mM EDA, and 10 mM sodium dithionite. In a typical procedure, a solution containing sodium dithionite (100 mM stock solution) in potassium phosphate buffer (50 mM, pH 8.0) was degassed by bubbling argon into the mixture for 4 min in a sealed vial. A buffered solution containing myoglobin was carefully degassed in a similar manner in a separate vial. The two solutions were then mixed together via cannula. Reactions were initiated by addition of 10 μL of aniline (from a 0.4 M stock solution in methanol), followed by the addition of 10 μL or 5 μL of EDA (from a 0.4 M stock solution in methanol) with a syringe, and the reaction mixture was stirred for 12 h at room temperature under positive argon pressure.
[00277] Product analysis. The reactions were analyzed by adding 20 μL of internal standard (benzodioxole, 50 mM in methanol) to the reaction mixture, followed by extraction with 400 μL of dichloromethane; the separated organic layer was analyzed by GC-FID. Calibration curves for quantification of the different N-H insertion products were constructed using authentic standards prepared synthetically using Rh2(OAc)4 catalyst as described below. All measurements were performed at least in duplicate. For each experiment, negative control samples containing either no enzyme or no reductant were included.
[00278] General procedure for chemical synthesis of authentic N-H insertion product standards. To a flame-dried round-bottom flask under argon, equipped with a stir bar, were added amine (1 equiv.) and Rh2(OAc)4 (1 mol%) in toluene (2-3 mL). To this solution was added a solution of diazo compound (1 equiv.) in toluene (1-2 mL) over 30 minutes at 0 °C. The resulting mixture was heated at 80 °C for another 15-18 h. The solvent was removed under vacuum and the crude mixture was purified by flash column chromatography (hexanes/ethyl acetate) to provide N-H insertion products in good to excellent yield. The insertion products were characterized by GC-MS, 1H NMR and 13C NMR techniques.
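The stock volumes and final concentrations in this procedure are related by simple dilution arithmetic, sketched below. The calculation assumes that the stated 400 μL reaction volume is the final volume after all additions; the function names are hypothetical.

```python
def final_conc_mM(stock_mM: float, aliquot_uL: float, total_uL: float) -> float:
    """Final concentration after diluting an aliquot into the total volume."""
    return stock_mM * aliquot_uL / total_uL


def aliquot_uL_needed(stock_mM: float, target_mM: float, total_uL: float) -> float:
    """Stock volume required to reach the target final concentration."""
    return target_mM * total_uL / stock_mM


TOTAL_UL = 400.0  # reaction scale stated in the text (assumed final volume)

aniline_mM = final_conc_mM(400.0, 10.0, TOTAL_UL)         # 0.4 M stock, 10 uL
eda_mM = final_conc_mM(400.0, 5.0, TOTAL_UL)              # 0.4 M stock, 5 uL
dithionite_uL = aliquot_uL_needed(100.0, 10.0, TOTAL_UL)  # 100 mM stock

# Catalyst loading relative to aniline: 20 uM Mb versus 10 mM substrate.
loading_mol_percent = 100.0 * 0.020 / aniline_mM

print(aniline_mM, eda_mM, dithionite_uL, loading_mol_percent)
```

With these inputs the 10 μL aniline aliquot gives 10 mM, the 5 μL EDA aliquot gives 5 mM, and 20 μM myoglobin corresponds to 0.2 mol% relative to aniline, consistent with the loadings quoted elsewhere in this example.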
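A minimal sketch of internal-standard quantification is given below for illustration. It assumes a linear calibration through the origin relating the product/internal-standard peak-area ratio to product concentration; the text states only that calibration curves were constructed from authentic standards, so the fitting model and all numerical data here are hypothetical.

```python
def fit_slope_through_origin(points):
    """Least-squares slope for ratio = slope * concentration.

    points: list of (concentration_mM, area_ratio) calibration pairs.
    """
    sxy = sum(conc * ratio for conc, ratio in points)
    sxx = sum(conc * conc for conc, _ in points)
    return sxy / sxx


# Hypothetical calibration: product/internal-standard area ratios.
calibration = [(2.0, 0.4), (5.0, 1.0), (10.0, 2.0)]
slope = fit_slope_through_origin(calibration)

# Hypothetical sample from a reaction with 10 mM aniline and 20 uM Mb.
sample_ratio = 1.6
product_mM = sample_ratio / slope
yield_percent = 100.0 * product_mM / 10.0
ton = product_mM / 0.020  # mM product per mM catalyst

print(f"product = {product_mM:.2f} mM, yield = {yield_percent:.0f}%, TON = {ton:.0f}")
```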
[00279] Ethyl phenylglycinate (23). Following the standard procedure, pale brown solid, % yield (88), GC-MS m/z (% relative intensity): 179(32.1), 106(100), 77(21.2), 51(5.9); ]H NMR (CDC13, 500 MHz): δ 7.23 (t, J = 6.0 Hz, 2H), 6.79 (t, J = 6.5 Hz, 1H), 6.64 (d, J = 7.0 Hz, 2H), 4.34-4.24 (m, 3H), 3.91 (s, 2H), 1.33 (t, / = 6.5 Hz, 3H) ppm; 13C NMR (CDC13, 125 MHz): δ
171.2, 147.1, 129.3, 118.1, 113.0, 61.3, 45.8, 14.2 ppm.
[00280] Ethyl 7-tolylglycinate (24b). Following the standard procedure, light brown solid, % yield (86), GC-MS m/z (% relative intensity): 193(26.3), 120(100), 91(18.7), 65(5.7); ]H NMR (CDCI3, 400 MHz): δ 7.03 (d, / = 7.6 Hz, 2H), 6.56 (d, / = 8.0 Hz, 2H), 4.27 (q, / = 6.8 Hz, 2H), 3.88 (s, 2H), 2.26 (s, 3H), 1.32 (t, / = 7.2 Hz, 3H) ppm; 13C NMR (CDC13, 100 MHz): δ
171.3, 144.8, 129.8, 127.4, 113.2, 61.2, 46.2, 20.4, 14.2 ppm.
[00281] Ethyl (4-methoxyphenyl)glycinate (25b). Following the standard procedure, colorless solid, % yield (80), GC-MS m/z (% relative intensity): 209(42.0), 136(100), 121(14.6), 108(13.0), 77(10.0); ]H NMR (CDC13, 400 MHz): δ 6.79 (d, / = 8.4 Hz, 2H), 6.58 (d, / = 8.4 Hz, 2H), 4.24 (q, J = 7.2 Hz, 2H), 4.04 (br s, 1H), 3.84 (s, 2H), 3.73 (s, 3H), 1.29 (t, J = 7.2 Hz, 3H) ppm; 13C NMR (CDC13, 100 MHz): δ 171.4, 152.6, 141.3, 114.8, 114.3, 61.2, 55.7, 46.8, 14.2 ppm.
[00282] Ethyl (4-chlorophenyl)glycinate (26b). Following the standard procedure, colorless solid, % yield (82), GC-MS m/z (% relative intensity): 213(22.7), 142(42.5), 140(100), 105(14.1), 77(13.3); ]H NMR (CDC13, 500 MHz): δ 7.13 (d, / = 9.0 Hz, 2H), 6.52 (d, / = 8.5 Hz, 2H), 4.26 (q, / = 6.8 Hz, 2H), 3.85 (s, 2H), 1.30 (t, / = 7.0 Hz, 3H) ppm; 13C NMR (CDC13, 125 MHz): δ 170.8, 145.6, 129.1, 122.8, 114.0, 61.4, 45.8, 14.2 ppm. [00283] Ethyl (4-nitrophenyl)glycinate (27b). Following the standard procedure, yellow solid, % yield (75), GC-MS m/z (% relative intensity): 224(12.6), 151(100), 105(47.4), 76(4.2); ]H NMR (CDC13, 500 MHz): δ 8.12 (d, / = 9.0 Hz, 2H), 6.56 (d, / = 8.5 Hz, 2H), 5.08 (br s, 1H), 4.31 (q, J = 7.5 Hz, 2H), 3.98 (s, 2H), 1.33 (t, 7 = 7.1 Hz, 3H) ppm; 13C NMR (CDC13, 125 MHz): δ 169.7, 151.9, 126.3, 111.5, 61.9, 44.9, 14.1 ppm.
[00284] Ethyl (4-isopropylphenyl)glycinate (28b). Following the standard procedure, light brown oil, % yield (82), GC-MS m/z (% relative intensity): 221(33.7), 206(32.2), 178(12.5), 148(100), 132(30.6); ]H NMR (CDC13, 400 MHz): δ 7.08 (d, J = 7.2 Hz, 2H), 6.58 (d, J = 7.2 Hz, 2H), 4.27-4.22 (m, 3H), 3.89 (s, 2H), 2.85-2.79 (m, 1H), 1.34-1.28 (m, 3H), 1.24-1.21 (m, 6H) ppm; 13C NMR (CDC13, 100 MHz): δ 171.3, 145.1, 138.7, 127.2, 113.1, 61.2, 46.1, 33.2, 24.2, 14.2 ppm.
[00285] Ethyl (4-(fert-butyl)phenyl)glycinate (29b). Following the standard procedure, brown oil, % yield (80), GC-MS m/z (% relative intensity): 235(34.6), 220(87.9), 192(26.3), 162(100), 146(36.1); ]H NMR (CDC13, 400 MHz): δ 7.23 (d, / = 8.4 Hz, 2H), 6.58 (d, / = 8.0 Hz, 2H), 4.26 (q, / = 7.2 Hz, 2H), 3.88 (s, 2H), 1.31-1.27 (m, 12 H) ppm; 13C NMR (CDC13, 100 MHz): δ 171.3, 144.6, 140.9, 126.1, 112.8, 61.3, 46.1, 33.9, 31.5, 31.3, 14.2 ppm.
[00286] Ethyl m-tolylglycinate (30b). Following the standard procedure, colorless solid, % yield (83), GC-MS m/z (% relative intensity): 193(41.4), 120(100), 91(35.1), 65(11.2); ]H NMR (CDCI3, 400 MHz): δ 7.12 (t, / = 7.6 Hz, 1H), 6.61 (d, / = 7.2 Hz, 1H), 6.45-6.43 (m, 2H), 4.28- 4.23 (m, 3H), 3.90 (s, 2H), 2.30 (s, 3H), 1.33 (t, / = 7.2 Hz, 3H) ppm; 13C NMR (CDC13, 100 MHz): δ 171.2, 147.1, 139.1, 129.2, 119.1, 113.8, 110.1, 61.3, 45.9, 21.6, 14.2 ppm.
[00287] Ethyl o-tolylglycinate (31b). Following the standard procedure, light brown oil, % yield (85), GC-MS m/z (% relative intensity): 193(31.2), 120(100), 91(25.1), 65(7.0); ]H NMR (CDCI3, 500 MHz): δ 7.16 (t, / = 7.5 Hz, 1H), 7.11 (d, / = 7.5 Hz, 1H), 6.75 (t, / = 7.5 Hz, 1H), 6.52 (d, J = 8.0 Hz, 1H), 4.30 (q, J = 7.0 Hz, 2H), 4.07 (br s, 1H), 3.96 (s, 2H), 2.24 (s, 3H), 1.35 (t, / = 7.0 Hz, 3H) ppm; 13C NMR (CDC13, 125 MHz): δ 171.3, 145.1, 130.2, 127.1, 122.5,
117.8, 109.9, 61.3, 45.9, 17.3, 14.2 ppm.
[00288] Ethyl jV-methyl-jV-phenylglycinate (32b). Following the standard procedure, brown oil, % yield (79), GC-MS m/z (% relative intensity): 193(27.1), 120(100), 91(18.5), 65(5.75); ]H NMR (CDCI3, 500 MHz): δ 7.26 (t, J = 7.0 Hz, 2H), 6.78-6.70 (m, 3H), 4.21 (q, J = 7.0 Hz, 2H), 4.07 (s, 2H), 3.08 (s, 3H), 1.27 (t, / = 7.0 Hz, 3H) ppm; 13C NMR (CDC13, 125 MHz): δ 171.1,
148.9, 129.2, 117.3, 112.3, 60.8, 54.5, 39.5, 14.2 ppm. [00289] Ethyl benzo[rf][l,3]dioxol-5-ylglycinate (33b). Following the standard procedure, light brown solid, % yield (75), GC-MS m/z (% relative intensity): 223(36.7), 150(100), 120(9.4), 92(12.7), 65(16.9); ]H NMR (CDC13, 400 MHz): δ 6.66 (d, / = 8.0 Hz, 1H), 6.24 (s, 1H), 6.02 (d, / = 8.0 Hz, 1H), 5.85 (s, 2H), 4.21-4.19 (m, 2H), 4.07 (br s, 1H), 3.82 (s, 2H), 1.30-1.26 (m, 3H) ppm; 13C NMR (CDC13, 100 MHz): δ 171.2, 148.4, 142.7, 140.2, 108.6, 104.5, 100.7, 96.3, 61.3, 46.7, 14.2 ppm.
[00290] Ethyl naphthalen-2-ylglycinate (34b). Following the standard procedure, purple solid, % yield (78), GC-MS m/z (% relative intensity): 229(19.7), 156(100), 127(19.7); 1H NMR (CDCl3, 400 MHz): δ 7.71-7.63 (m, 3H), 7.41 (t, J = 7.6 Hz, 1H), 7.25 (t, J = 6.4 Hz, 1H), 6.96 (d, J = 8.4 Hz, 1H), 6.75 (s, 1H), 4.50 (br s, 1H), 4.31 (q, J = 7.2 Hz, 2H), 4.01 (s, 2H), 1.34 (t, J = 7.2 Hz, 3H) ppm; 13C NMR (CDCl3, 100 MHz): δ 171.0, 144.7, 135.0, 129.1, 127.8, 127.7, 126.4, 126.0, 122.3, 117.9, 104.7, 61.4, 45.8, 14.2 ppm.
[00291] tert-Butyl phenylglycinate (35). Following the standard procedure, yellow oil, % yield (89), GC-MS m/z (% relative intensity): 207(13.5), 151(37.3), 106(100), 77(18.7), 57(27.9); 1H NMR (CDCl3, 500 MHz): δ 7.21-7.18 (m, 2H), 6.76 (t, J = 7.5 Hz, 1H), 6.62 (d, J = 8.0 Hz, 2H), 3.80 (s, 2H), 1.50 (s, 9H) ppm; 13C NMR (CDCl3, 125 MHz): δ 170.3, 147.2, 129.2, 118.0, 113.0, 81.9, 45.5, 28.1 ppm.
[00292] 6.3 Example 3: Carbene S-H insertion reactions catalyzed by myoglobin-based catalysts.
[00293] The ability of wild-type sperm whale myoglobin (SEQ ID NO: 1) and engineered variants thereof to catalyze carbene S-H insertion reactions was also investigated. In initial experiments, the activity of wild-type sperm whale myoglobin (Mb) toward catalyzing the insertion of ethyl α-diazoacetate (EDA, 52a) into the S-H bond of thiophenol (51) in aqueous buffer (KPi, pH 8.0) and in the presence of dithionite (Na2S2O4) as a reductant was examined (FIG. 16). Promisingly, this reaction was found to lead to the desired S-H insertion product, ethyl α-(phenylthio)acetate (53), in 68% yield as determined by GC analysis (Entry 1, FIG. 16). Upon optimization of the reaction conditions, nearly quantitative conversion of thiophenol to 53 (68 → 98%), and correspondingly higher catalytic turnovers (TON: 170 → 492), could be achieved using a two-fold excess of EDA over the thiol substrate at a catalyst loading of 0.2 mol% (Entry 3, FIGS. 16 and 17A). Notably, comparable yields in this transformation have been obtained using transition metal complexes at 5- to 25-fold higher catalyst loadings (i.e., 1-5 mol%) (Galardon, LeMaux et al. 1997; Galardon, Roue et al. 1998; Del Zotto, Baratta et al. 1999; Zhang, Ma et al. 2003; Zhang, Zhu et al. 2009; Xu, Zhu et al. 2014). Furthermore, no formation of the dimerization byproducts, ethyl maleate and fumarate, was observed in these myoglobin-catalyzed reactions in spite of the presence of excess EDA and mixing of the reagents in a single addition (FIG. 17A). This result is in contrast with the laborious, slow-addition protocols typically required to avoid this side reaction in the context of transition metal-catalyzed S-H insertion processes (Galardon, LeMaux et al. 1997; Galardon, Roue et al. 1998; Del Zotto, Baratta et al. 1999; Zhang, Ma et al. 2003; Zhang, Zhu et al. 2009).
[00294] Time-course experiments showed that the Mb reaction with thiophenol and EDA was close to completion (94%) after 6 hours, with 68% of the S-H insertion product being formed within the first hour (FIG. 17B). These experiments also indicated that under these conditions the catalytic turnovers of the hemoprotein are limited by the concentration of thiophenol. By lowering the protein concentration, higher TON values were indeed observed (Entries 4-5, FIG. 16), with Mb supporting a maximum of 985 turnovers at a catalyst loading of 0.05 mol%.
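The relationship between yield, catalyst loading, and turnover number used in this paragraph can be sketched as follows. This is an illustrative calculation only, assuming that both the yield and the catalyst loading are expressed relative to the same limiting reagent; the function names are hypothetical and do not appear in the original disclosure.

```python
# Illustrative sketch: TON = (mol product) / (mol catalyst).
# Assumption: yield (%) and catalyst loading (mol%) refer to the same
# limiting reagent, so TON reduces to yield_percent / loading_mol_percent.

def turnover_number(yield_percent: float, loading_mol_percent: float) -> float:
    """Return moles of product formed per mole of catalyst."""
    if loading_mol_percent <= 0:
        raise ValueError("catalyst loading must be positive")
    return yield_percent / loading_mol_percent


def implied_yield_percent(ton: float, loading_mol_percent: float) -> float:
    """Return the yield (%) implied by a given TON at a given loading."""
    if loading_mol_percent <= 0:
        raise ValueError("catalyst loading must be positive")
    return ton * loading_mol_percent


if __name__ == "__main__":
    # 98% yield at 0.2 mol% loading gives roughly 490 turnovers,
    # in line with the 492 TON quoted in the text.
    print(turnover_number(98.0, 0.2))
    # 985 turnovers at 0.05 mol% loading imply a yield near 49%
    # (a derived figure, not one stated in the text).
    print(implied_yield_percent(985.0, 0.05))
```

Under this assumption, lowering the catalyst loading raises the attainable TON ceiling (100% yield at 0.05 mol% would correspond to 2,000 turnovers), which is consistent with the higher TON values observed at lower protein concentration.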
[00295] Investigation of the dependence of Mb activity on the reductant showed a decrease in TON as the dithionite concentration was lowered, suggesting that ferrous Mb is the catalytically active form of the protein. Importantly, significant levels of catalytic activity are still maintained in the presence of stoichiometric amounts of dithionite relative to the protein as compared to using an excess of reductant (220 vs. 492 TON; FIG. 16). Furthermore, appreciable Mb-dependent S-H insertion activity was observed also in the absence of reductant (130 TON; Entry 7, FIG. 16).
[00296] Encouraged by the results obtained with wild-type Mb, attention was turned to identifying engineered Mb variants with enhanced reactivity toward S-H insertion. In previous studies, it was found that mutations at the level of the distal pocket could improve the activity (and selectivity) of this hemoprotein toward carbene transfer reactions (EXAMPLES 1 and 2). Accordingly, a panel of Mb variants carrying one or two active site mutations was screened for improved ability to convert thiophenol into 53 in the presence of EDA, as determined based on total turnover numbers (TTN). A number of Mb variants were found to exhibit greatly increased (> 2-fold) catalytic efficiency in this transformation (FIG. 18). In particular, the single mutant Mb(L29A) and double mutant Mb(L29A,H64V) emerged as the most promising catalysts, yielding 2190 and 2680 TTN, respectively, as compared to the 985 total turnovers supported by wild-type Mb. Interestingly, a similar activity-enhancing effect for the L29A mutation was also noted in the context of Mb-catalyzed carbene N-H insertion (EXAMPLE 2). Also noteworthy is the additive effect of the H64V mutation, whose introduction leads to a comparable increase in TTN (~20%) in both the wild-type Mb and the Mb(L29A) background. As judged based on analysis of the crystal structure of sperm whale myoglobin (FIG. 1), this mutation is likely to enhance the catalytic efficiency of Mb by increasing the accessibility of the heme pocket to the diazo ester and thiol reactants. The initial rate for Mb(L29A,H64V)-catalyzed formation of the S-H insertion product 53 was determined to be 35 turnovers per minute.
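The fold improvements quoted above follow directly from the reported TTN values. The short sketch below recomputes them; the dictionary and function names are illustrative assumptions, and only the three TTN values are taken from the text.

```python
# Illustrative sketch: fold change in total turnover number (TTN)
# relative to a reference catalyst, using the TTN values quoted above.

TTN = {
    "Mb": 985,             # wild-type sperm whale myoglobin
    "Mb(L29A)": 2190,      # single mutant
    "Mb(L29A,H64V)": 2680, # double mutant
}


def fold_change(variant: str, reference: str = "Mb") -> float:
    """Return TTN(variant) / TTN(reference)."""
    return TTN[variant] / TTN[reference]


def percent_increase(variant: str, reference: str) -> float:
    """Return the relative TTN increase of variant over reference, in %."""
    return (fold_change(variant, reference) - 1.0) * 100.0


if __name__ == "__main__":
    for name in ("Mb(L29A)", "Mb(L29A,H64V)"):
        print(name, round(fold_change(name), 2))
    # Effect of adding H64V on top of the L29A background
    print(round(percent_increase("Mb(L29A,H64V)", "Mb(L29A)")))
```

Both variants exceed the > 2-fold threshold, and adding H64V to the L29A background raises TTN by roughly one fifth, matching the approximately 20% increase described in the text.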
[00297] Having identified Mb(L29A,H64V) as the most promising Mb-based catalyst for S-H insertion, further experiments were carried out to explore its scope across different aryl mercaptans (FIG. 19). To this end, variously substituted thiophenols (54-60) were subjected to Mb(L29A,H64V) catalysis (0.02 mol%) in the presence of EDA (52a). Notably, high to quantitative conversions (67-99%) to the desired S-H insertion products (61-65) were obtained starting from the para-substituted thiophenol derivatives 54-58, showing that electron-donating and electron-withdrawing substituents at this position are equally well tolerated by the protein catalyst. Similar results were obtained with the meta- and ortho-substituted thiophenols 59 and 60, respectively, although a certain influence of the ortho substitution on the efficiency of the reaction was also evident (60% vs. 86-96% conversion for 67 vs. 61 and 66).
[00298] To further investigate the reactivity scope of Mb(L29A,H64V), the reactions with thiophenol were then performed in the presence of different types of α-diazo ester reagents, namely tert-butyl (52b), cyclohexyl (52c), and benzyl α-diazoacetate (52d). Importantly, formation of the respective S-H insertion products 68, 69, and 70 in 75-95% yields (FIG. 19) demonstrated the high degree of tolerance of the Mb-derived catalyst toward substitutions at the level of the ester group of the diazo reagent. The successful synthesis of 71 indicated that Mb(L29A,H64V) can also accept the α-substituted ethyl α-diazopropanoate (52e) as a carbene donor. Altogether, the experiments outlined in FIG. 19 demonstrated the broad scope of the Mb(L29A,H64V) catalyst across different aryl thiol substrates and diazo reagents. Furthermore, repeating these reactions under low catalyst loading conditions (0.025 mol%) showed that Mb(L29A,H64V) supports TTN values in excess of 1,300 in each case, yielding over 4,100 and 5,400 TTN in the reactions of thiophenol with 52a and 52c, respectively. These catalytic efficiencies are one to two orders of magnitude higher than those reported for similar S-H insertion reactions with transition metal catalysts (Galardon, LeMaux et al. 1997; Galardon, Roue et al. 1998; Del Zotto, Baratta et al. 1999; Zhang, Ma et al. 2003; Zhang, Zhu et al. 2009; Xu, Zhu et al. 2014).
[00299] In order to assess the scalability of these Mb-catalyzed transformations, the synthesis of ethyl α-(phenylthio)acetate (53) from 51 and 52a was carried out at a larger scale (~11 mg thiophenol, 0.2 mol% Mb(L29A,H64V)). Successful isolation of 13.2 mg of 53 from this reaction (67% isolated yield) thus demonstrated the potential utility of these Mb-mediated reactions for synthetic purposes.
[00300] To determine whether the scope of Mb(L29A,H64V)-mediated S-H insertion could be extended to non-aromatic thiols, tests with benzylic and alkyl mercaptans as the substrates were carried out (FIG. 20). In the presence of EDA, Mb(L29A,H64V) was found to readily functionalize benzyl mercaptan (72), substituted benzyl mercaptan derivatives (73-75), and alkyl mercaptans such as cyclohexanethiol (76) and octane-1-thiol (77), providing conversions in the range of 30-50% and supporting between 930 and 2,550 total turnovers (Entries 1-6, FIG. 20). Given the clear advantage of using 52c (or 52d) toward improving the TTN in the insertion reactions with thiophenol (FIG. 19), the Mb(L29A,H64V)-catalyzed transformations of benzyl mercaptan (72) and cyclohexanethiol (76) were also carried out in the presence of 52d. Higher yields (83-99%) and total turnovers (TON: 3,000-4,600) were obtained in both cases (FIG. 20), further evidencing the good match between the Mb catalyst and these carbene donors.
[00301] The development of catalytic systems for asymmetric carbene S-H insertions has proven remarkably difficult, with only low levels of enantioselectivity being observed in most cases (8-23% ee) (Brunner, Wutz et al. 1990; Galardon, Roue et al. 1998; Zhang, Ma et al. 2003). Interestingly, the reaction of ethyl α-diazopropanoate with thiophenol in the presence of wild-type Mb or Mb(L29A,H64V) showed that neither of these proteins was capable of providing chiral induction in the S-H insertion reaction (Entries 1-2, FIG. 21). In contrast, screening of the panel of Mb active-site variants revealed that both Mb(F43V) and Mb(F43V,V68A) showed appreciable enantioselectivity in this transformation (21-22% ee, Entries 3 and 5 in FIG. 21; FIG. 22). Since Mb(V68A) exhibited only 6% ee, the beneficial effect in terms of enantioselectivity can be mainly attributed to the substitution at the level of Phe43, which is located in close proximity to the heme cofactor (FIG. 1).
[00302] Further improvements in the enantioselectivity of the Mb(F43V)-catalyzed insertion reaction could then be achieved through optimization of the thiol : diazo ester ratio and other reaction parameters such as pH and temperature (Entries 6-8, FIG. 21). Under optimal conditions (51 : 52e in 1 : 1 ratio, pH 7.0, 4 °C), the S-H insertion product 71 was obtained with an enantiomeric excess of 49% (FIG. 22), which corresponds to the highest enantioselectivity ever reported with a single-catalyst system and in the absence of exogenous additives (Zhang, Zhu et al. 2009; Xu, Zhu et al. 2014).
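For reference, enantiomeric excess values such as those quoted above are conventionally derived from the integrated chiral GC peak areas of the two enantiomers. The sketch below shows that standard calculation; the peak areas are hypothetical illustration values, not data from this disclosure, and the function names are assumptions.

```python
# Illustrative sketch: enantiomeric excess (ee) from chiral GC peak areas.
# ee (%) = 100 * (A_major - A_minor) / (A_major + A_minor)
# The peak areas used in the example are hypothetical.

def enantiomeric_excess(area_major: float, area_minor: float) -> float:
    """Return ee in percent from the two enantiomer peak areas."""
    total = area_major + area_minor
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return 100.0 * (area_major - area_minor) / total


def major_fraction(ee_percent: float) -> float:
    """Return the mole fraction of the major enantiomer for a given ee."""
    return (100.0 + ee_percent) / 200.0


if __name__ == "__main__":
    # Hypothetical areas chosen so that the result matches a 49% ee
    print(enantiomeric_excess(74.5, 25.5))
    # A 49% ee corresponds to roughly a 74.5 : 25.5 enantiomer ratio
    print(major_fraction(49.0))
```

Expressed this way, an improvement from about 22% ee to 49% ee corresponds to the major enantiomer increasing from roughly 61% to roughly 74.5% of the product.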
[00303] Taken together, these results demonstrate that engineered Mb variants constitute efficient systems for promoting carbene S-H insertion reactions, providing the first example of a biocatalyst capable of supporting this synthetically valuable transformation. These Mb-based catalysts were found to offer high catalytic activity (1,000-5,400 TON) across a wide range of aryl and alkyl mercaptan substrates as well as across different α-diazo esters as carbene precursors. The results also demonstrate the amenability of the Mb catalysts to promote asymmetric carbene S-H insertions, the possibility to tune this property via active site engineering, and the scalability of Mb-catalyzed S-H insertion reactions, further highlighting the utility of these biocatalysts for synthetic applications.
[00304] Experimental Details.
[00305] Analytical Methods. Gas chromatography (GC) analyses were carried out using a Shimadzu GC-2010 gas chromatograph equipped with an FID detector and a chiral Cyclosil-B column (30 m x 0.25 mm x 0.25 μm film). Method: 1 μL injection, injector temp.: 200 °C, detector temp.: 300 °C. Gradient: column temperature set at 140 °C for 3 min, then to 160 °C at 1.8 °C/min, then to 165 °C at 1 °C/min, then to 245 °C at 25 °C/min. Enantiomeric excess for product 71 was determined using the following method: 1 μL injection, injector temp.: 200 °C, detector temp.: 300 °C. Gradient: column temperature set at 80 °C for 3 min, then to 180 °C at 1.00 °C/min, then to 200 °C at 2 °C/min, then to 245 °C at 25 °C/min.
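As a practical aid, the length of each oven program can be estimated from the hold time and ramp segments listed above. The following sketch assumes no final hold and no equilibration time, since none is stated in the text; the function name and data layout are illustrative.

```python
# Illustrative sketch: estimated duration of a GC oven temperature program.
# Assumption: no final hold or equilibration time (none is given in the text).

def program_minutes(start_temp_c, initial_hold_min, ramps):
    """Sum the initial hold and the duration of each linear ramp.

    ramps is a sequence of (target_temp_c, rate_c_per_min) tuples
    applied in order, starting from start_temp_c.
    """
    total = initial_hold_min
    current = start_temp_c
    for target, rate in ramps:
        if rate <= 0:
            raise ValueError("ramp rate must be positive")
        total += (target - current) / rate
        current = target
    return total


# Standard method: 140 C (3 min), then 160 C at 1.8 C/min,
# then 165 C at 1 C/min, then 245 C at 25 C/min
STANDARD = (140.0, 3.0, [(160.0, 1.8), (165.0, 1.0), (245.0, 25.0)])

# Chiral method for product 71: 80 C (3 min), then 180 C at 1.00 C/min,
# then 200 C at 2 C/min, then 245 C at 25 C/min
CHIRAL = (80.0, 3.0, [(180.0, 1.0), (200.0, 2.0), (245.0, 25.0)])

if __name__ == "__main__":
    print(round(program_minutes(*STANDARD), 1))  # about 22 minutes
    print(round(program_minutes(*CHIRAL), 1))    # about 115 minutes
```

The slow 1 °C/min ramp accounts for most of the chiral method's run time, which is typical of separations that depend on small retention differences between enantiomers.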
[00306] Protein expression and purification. Wild-type Mb and the engineered Mb variants were expressed in E. coli BL21(DE3) cells as described in Example 1.
[00307] S-H insertion reactions. Typically, reactions were carried out at a 400 μL scale using 20 μM myoglobin, 10 mM thiophenol, 5 mM EDA, and 10 mM sodium dithionite. In a typical procedure, a solution containing sodium dithionite (100 mM stock solution) in potassium phosphate buffer (50 mM, pH 8.0) was degassed by bubbling argon into the mixture for 4 min in a sealed vial. A buffered solution containing myoglobin was carefully degassed in a similar manner in a separate vial. The two solutions were then mixed together via cannula. Reactions were initiated by addition of 10 μL of thiophenol (from a 0.4 M stock solution in methanol), followed by the addition of 10 μL of EDA (from a 0.2 M stock solution in methanol) with a syringe, and the reaction mixture was stirred for 12 h at room temperature under positive argon pressure. For the preparative-scale reaction, a solution containing sodium dithionite (100 mM stock solution, 1 mL, 10 mM final concentration) in potassium phosphate buffer (50 mM, pH 8.0, 5.87 mL) and 466 μL of MeOH (~5% of the reaction volume) was degassed by bubbling argon into the mixture for 20 min in a sealed vial. A buffered solution containing Mb(L29A,H64V) (2.63 mL of a 76 μM stock solution; 20 μM final concentration) was carefully degassed in a similar manner in a separate vial. The two solutions were then mixed together via cannula. The reaction was initiated by addition of 10.3 μL of pure thiophenol, followed by the addition of 24 μL of pure EDA with a syringe, and the reaction mixture was stirred for 12 h at room temperature under positive argon pressure. The reaction mixture was extracted with dichloromethane (4 x 10 mL), the organic layer was evaporated under reduced pressure, and the residue was purified by flash column chromatography (10% ethyl acetate in hexanes) to yield product 53 as a colorless liquid (13.2 mg, 67%). The reactions were analyzed by adding 20 μL of internal standard (benzodioxole, 50 mM in methanol) to the reaction mixture, followed by extraction with 400 μL of dichloromethane; the separated organic layer was analyzed by GC-FID as described above. Calibration curves were constructed using authentic standards prepared synthetically using Rh2(OAc)4 as the catalyst as described below. All measurements were performed at least in duplicate. For each experiment, negative control samples containing either no enzyme or no reductant were included.
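The quantities in this procedure can be cross-checked with simple dilution and yield arithmetic. The sketch below is illustrative only: the molar masses (thiophenol, 110.18 g/mol; ethyl 2-(phenylthio)acetate, 196.27 g/mol) and the thiophenol density (1.077 g/mL) are standard literature values assumed here rather than taken from the text, and the function names are hypothetical.

```python
# Illustrative sketch: dilution (C1*V1 = C2*V2) and isolated-yield checks.
# Assumed literature constants (not stated in the text):
THIOPHENOL_MW = 110.18       # g/mol, C6H6S
THIOPHENOL_DENSITY = 1.077   # g/mL (equivalently mg/uL)
PRODUCT_MW = 196.27          # g/mol, ethyl 2-(phenylthio)acetate, C10H12O2S


def final_concentration(stock_conc, added_volume, total_volume):
    """Return the concentration after dilution (same units as stock_conc)."""
    if total_volume <= 0:
        raise ValueError("total volume must be positive")
    return stock_conc * added_volume / total_volume


def isolated_yield_percent(product_mg, thiol_ul):
    """Return isolated yield (%) relative to neat thiophenol added (in uL)."""
    thiol_mmol = thiol_ul * THIOPHENOL_DENSITY / THIOPHENOL_MW
    product_mmol = product_mg / PRODUCT_MW
    return 100.0 * product_mmol / thiol_mmol


if __name__ == "__main__":
    # Analytical scale: 10 uL of 400 mM thiophenol and 10 uL of 200 mM EDA
    # in a 400 uL reaction
    print(final_concentration(400.0, 10.0, 400.0))  # thiophenol, mM
    print(final_concentration(200.0, 10.0, 400.0))  # EDA, mM
    # Preparative scale: 2.63 mL of 76 uM protein in about 10 mL total
    total_ml = 1.0 + 5.87 + 0.466 + 2.63
    print(round(final_concentration(76.0, 2.63, total_ml), 1))  # uM
    # 13.2 mg of product from 10.3 uL of neat thiophenol
    print(round(isolated_yield_percent(13.2, 10.3)))  # percent
```

With these assumed constants, the stated stock volumes reproduce the nominal 10 mM thiophenol, 5 mM EDA, and 20 μM protein concentrations, and the 13.2 mg of isolated product corresponds to the reported yield of about 67%.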
[00308] General procedure for synthesis of authentic standards: To a flame-dried round-bottom flask under argon, equipped with a stir bar, was added thiol (1 equiv.) and Rh2(OAc)4 (5 mol%) in dichloromethane (2-3 mL). To this solution was added a solution of diazo compound (1 equiv.) in dichloromethane (1-2 mL) by slow addition over 30-45 minutes at 0 °C. The resulting mixture was stirred at room temperature for another 2-3 hours. The solvent was removed under vacuum and the crude mixture was purified by flash chromatography (9:1 hexanes/diethyl ether) to obtain the S-H insertion products in good to excellent yield. The identity of the S-H insertion products was determined using GC-MS, 1H NMR, and 13C NMR.
[00309] Ethyl 2-(phenylthio)acetate (53). Following the standard procedure, % yield (86), GC-MS m/z (% relative intensity): 196(57.3), 123(100), 109(12.0), 77(10.6); 1H NMR (CDCl3, 500 MHz): δ 7.32 (d, J = 7.5 Hz, 2H), 7.26-7.21 (m, 3H), 4.18 (q, J = 7.0 Hz, 2H), 3.63 (s, 2H), 1.24 (t, J = 7.0 Hz, 3H) ppm; 13C NMR (CDCl3, 125 MHz): δ 169.7, 135.0, 130.0, 129.0, 126.9, 61.5, 36.7, 14.1 ppm.
[00310] Ethyl 2-(p-tolylthio)acetate (61). Following the standard procedure, % yield (79), GC-MS m/z (% relative intensity): 210(67.0), 137(100), 99(17.9); 1H NMR (CDCl3, 500 MHz): δ 7.34 (d, J = 8.5 Hz, 2H), 7.12 (d, J = 8.0 Hz, 2H), 4.17 (q, J = 7.0 Hz, 2H), 3.57 (s, 2H), 2.32 (s, 3H), 1.23 (t, J = 7.0 Hz, 3H) ppm; 13C NMR (CDCl3, 125 MHz): δ 169.8, 137.3, 130.9, 129.8, 61.4, 37.4, 21.1, 14.1 ppm.
[00311] Ethyl 2-((4-methoxyphenyl)thio)acetate (62). Following the standard procedure, % yield (83), GC-MS m/z (% relative intensity): 226(100), 153(88.0), 139(43.8), 109(18.5); 1H NMR (CDCl3, 500 MHz): δ 7.41 (d, J = 9.0 Hz, 2H), 6.83 (d, J = 9.0 Hz, 2H), 4.14 (q, J = 7.5 Hz, 2H), 3.77 (s, 3H), 3.49 (s, 2H), 1.21 (t, J = 7.5 Hz, 3H) ppm; 13C NMR (CDCl3, 125 MHz): δ 169.9, 159.6, 134.2, 124.9, 114.6, 61.3, 55.3, 38.6, 14.1 ppm.
[00312] Ethyl 2-((4-chlorophenyl)thio)acetate (63). Following the standard procedure, % yield (85), GC-MS m/z (% relative intensity): 230(65.5), 157(100), 143(8.4), 108(9.2); 1H NMR (CDCl3, 500 MHz): δ 7.35 (d, J = 8.5 Hz, 2H), 7.26 (d, J = 8.5 Hz, 2H), 4.17 (q, J = 7.0 Hz, 2H), 3.59 (s, 2H), 1.23 (t, J = 7.0 Hz, 3H) ppm; 13C NMR (CDCl3, 125 MHz): δ 169.4, 133.5, 133.2, 131.5, 129.2, 61.6, 36.8, 14.1 ppm.
[00313] Ethyl 2-((4-bromophenyl)thio)acetate (64). Following the standard procedure, % yield (79), GC-MS m/z (% relative intensity): 274(100), 202(45.7), 201(47.3), 122(74.0), 108(17.4); 1H NMR (CDCl3, 400 MHz): δ 7.39 (d, J = 7.6 Hz, 2H), 7.26 (d, J = 7.2 Hz, 2H), 4.16 (q, J = 7.2 Hz, 2H), 3.58 (s, 2H), 1.22 (t, J = 7.2 Hz, 3H) ppm; 13C NMR (CDCl3, 125 MHz): δ 169.3, 134.2, 132.1, 131.5, 120.9, 61.6, 36.6, 14.1 ppm.
[00314] Ethyl 2-((4-(trifluoromethyl)phenyl)thio)acetate (65). Following the standard procedure, % yield (72), GC-MS m/z (% relative intensity): 264(100), 191(98.1), 171(33.6); 1H NMR (CDCl3, 500 MHz): δ 7.54 (d, J = 8.0 Hz, 2H), 7.45 (d, J = 8.0 Hz, 2H), 4.21 (q, J = 7.0 Hz, 2H), 3.70 (s, 2H), 1.25 (t, J = 7.0 Hz, 3H) ppm; 13C NMR (CDCl3, 125 MHz): δ 169.1, 140.5, 128.7, 128.5, 128.2, 128.1, 127.3, 125.8, 125.7, 125.1, 122.9, 120.7, 61.8, 35.3, 14.0 ppm.
[00315] Ethyl 2-(m-tolylthio)acetate (66). Following the standard procedure, % yield (81), GC-MS m/z (% relative intensity): 210(67.5), 137(100), 91(17.9); 1H NMR (CDCl3, 400 MHz): δ 7.22-7.17 (m, 3H), 7.03 (d, J = 6.4 Hz, 1H), 4.18 (q, J = 6.8 Hz, 2H), 3.61 (s, 2H), 2.31 (s, 3H), 1.23 (t, J = 7.2 Hz, 3H) ppm; 13C NMR (CDCl3, 100 MHz): δ 169.7, 138.8, 134.7, 130.5, 128.8, 127.8, 126.9, 61.5, 36.7, 21.3, 14.1 ppm.
[00316] Ethyl 2-(o-tolylthio)acetate (67). Following the standard procedure, % yield (84), GC-MS m/z (% relative intensity): 210(74.7), 164(35.3), 137(100), 91(31.8); 1H NMR (CDCl3, 400 MHz): δ 7.35 (d, J = 6.4 Hz, 1H), 7.16-7.14 (m, 3H), 4.17 (q, J = 7.2 Hz, 2H), 3.61 (s, 2H), 2.41 (s, 3H), 1.23 (t, J = 7.2 Hz, 3H) ppm; 13C NMR (CDCl3, 100 MHz): δ 169.6, 138.2, 134.1, 130.2, 129.4, 126.8, 126.6, 61.5, 35.9, 20.3, 14.1 ppm.
[00317] tert-Butyl 2-(phenylthio)acetate (68). Following the standard procedure, % yield (86), GC-MS m/z (% relative intensity): 224(16.5), 168(33.1), 123(58.6), 57(100); 1H NMR (CDCl3, 500 MHz): δ 7.41 (d, J = 7.5 Hz, 2H), 7.29-7.27 (m, 2H), 7.22 (d, J = 5.6 Hz, 1H), 3.55 (s, 2H), 1.39 (s, 9H) ppm; 13C NMR (CDCl3, 125 MHz): δ 168.8, 135.3, 129.8, 128.9, 126.7, 81.9, 37.7, 27.9 ppm.
[00318] Cyclohexyl 2-(phenylthio)acetate (69). Following the standard procedure, % yield (78), GC-MS m/z (% relative intensity): 250(55.9), 168(28.7), 123(65.8), 83(100), 55(66.9); 1H NMR (CDCl3, 500 MHz): δ 7.42 (d, J = 6.0 Hz, 2H), 7.30-7.20 (m, 3H), 4.79-4.76 (m, 1H), 3.62 (s, 2H), 1.79-1.23 (m, 10H) ppm; 13C NMR (CDCl3, 125 MHz): δ 169.2, 135.1, 129.9, 128.9, 126.8, 73.9, 36.9, 31.4, 25.3, 23.5 ppm.
[00319] Benzyl 2-(phenylthio)acetate (70). Following the standard procedure, % yield (86), GC-MS m/z (% relative intensity): 258(69.8), 123(61.1), 91(100), 65(11.5); 1H NMR (CDCl3, 400 MHz): δ 7.34-7.21 (m, 10H), 5.14 (s, 2H), 3.68 (s, 2H) ppm; 13C NMR (CDCl3, 100 MHz): δ 169.6, 135.3, 134.8, 130.1, 129.1, 128.6, 128.4, 128.3, 127.0, 67.3, 36.7 ppm.
[00320] Ethyl 2-(phenylthio)propanoate (71). Following the standard procedure, % yield (62), GC-MS m/z (% relative intensity): 210(41.9), 137(100), 109(24.1); 1H NMR (CDCl3, 400 MHz): δ 7.46 (d, J = 6.8 Hz, 2H), 7.31-7.28 (m, 3H), 4.13 (q, J = 7.2 Hz, 2H), 3.80 (q, J = 7.2 Hz, 1H), 1.48 (d, J = 7.2 Hz, 3H), 1.18 (t, J = 7.2 Hz, 3H) ppm; 13C NMR (CDCl3, 100 MHz): δ 172.7, 133.0, 128.9, 127.9, 61.2, 45.2, 17.3, 14.0 ppm.
[00321] Ethyl 2-(benzylthio)acetate (78). Following the standard procedure, % yield (81), GC-MS m/z (% relative intensity): 210(23.4), 123(86.8), 91(100), 65(11.5); 1H NMR (CDCl3, 500 MHz): δ 7.33-7.30 (m, 4H), 7.27-7.24 (m, 1H), 4.20 (q, J = 7.0 Hz, 2H), 3.83 (s, 2H), 3.07 (s, 2H), 1.31 (t, J = 7.0 Hz, 3H) ppm; 13C NMR (CDCl3, 125 MHz): δ 170.4, 137.2, 129.2, 128.5, 127.2, 61.3, 36.3, 32.3, 14.2 ppm.
[00322] Ethyl 2-((4-methylbenzyl)thio)acetate (79). Following the standard procedure, % yield (76), GC-MS m/z (% relative intensity): 224(24.9), 137(90.7), 105(100), 79(11.8); 1H NMR (CDCl3, 400 MHz): δ 7.23 (d, J = 7.2 Hz, 2H), 7.13 (d, J = 7.2 Hz, 2H), 4.20 (q, J = 7.2 Hz, 2H), 3.79 (s, 2H), 3.05 (s, 2H), 2.33 (s, 3H), 1.30 (t, J = 7.2 Hz, 3H) ppm; 13C NMR (CDCl3, 100 MHz): δ 170.4, 136.8, 134.1, 129.2, 61.2, 36.0, 32.2, 21.1, 14.2 ppm.
[00323] Ethyl 2-((4-methoxybenzyl)thio)acetate (80). Following the standard procedure, % yield (79), GC-MS m/z (% relative intensity): 240(9.9), 153(9.5), 121(100), 77(4.8); 1H NMR (CDCl3, 500 MHz): δ 7.26 (d, J = 8.5 Hz, 2H), 6.86 (d, J = 8.5 Hz, 2H), 4.20 (q, J = 7.0 Hz, 2H), 3.79 (br s, 5H), 3.05 (s, 2H), 1.31 (t, J = 7.0 Hz, 3H) ppm; 13C NMR (CDCl3, 125 MHz): δ 170.5, 158.8, 130.3, 129.2, 113.9, 61.3, 55.3, 35.7, 32.2, 14.2 ppm.
[00324] Ethyl 2-((4-chlorobenzyl)thio)acetate (81). Following the standard procedure, % yield (72), GC-MS m/z (% relative intensity): 244(25.8), 157(100), 125(76.9), 89(16.2); 1H NMR (CDCl3, 400 MHz): δ 7.26 (br s, 4H), 4.18 (q, J = 7.2 Hz, 2H), 3.77 (s, 2H), 3.02 (s, 2H), 1.28 (t, J = 7.2 Hz, 3H) ppm; 13C NMR (CDCl3, 100 MHz): δ 170.2, 135.8, 133.0, 130.5, 128.6, 61.3, 35.5, 32.1, 14.1 ppm.
[00325] Ethyl 2-(cyclohexylthio)acetate (82). Following the standard procedure, % yield (72), GC-MS m/z (% relative intensity): 202(25.8), 115(100), 81(81.7), 67(26.8), 55(31.5); 1H NMR (CDCl3, 500 MHz): δ 4.18 (q, J = 7.0 Hz, 2H), 3.22 (s, 2H), 2.79-2.76 (m, 1H), 1.97-1.96 (m, 2H), 1.75 (m, 2H), 1.60-1.58 (m, 1H), 1.33-1.20 (m, 8H) ppm; 13C NMR (CDCl3, 125 MHz): δ 170.9, 61.2, 43.9, 33.1, 32.1, 25.9, 25.7, 14.1 ppm.
[00326] Ethyl 2-(octylthio)acetate (83). Following the standard procedure, % yield (67), GC-MS m/z (% relative intensity): 232(33.5), 159(15.6), 145(100), 88(89.7), 69(80.7), 55(21.5); 1H NMR (CDCl3, 400 MHz): δ 4.13-4.11 (m, 2H), 3.14 (s, 2H), 2.58-2.55 (m, 2H), 1.55-1.52 (m, 2H), 1.32-1.22 (m, 13H), 0.82 (m, 3H) ppm; 13C NMR (CDCl3, 100 MHz): δ 170.5, 61.1, 33.6, 32.6, 31.7, 29.1, 28.9, 28.7, 22.5, 14.1, 13.9 ppm.
[00327] Benzyl 2-(benzylthio)acetate (84). Following the standard procedure, % yield (82), GC-MS m/z (% relative intensity): 272(1.6), 181(83.6), 107(16.8), 91(100), 65(8.2); 1H NMR (CDCl3, 400 MHz): δ 7.39-7.26 (m, 10H), 5.18 (s, 2H), 3.82 (s, 2H), 3.13 (s, 2H) ppm; 13C NMR (CDCl3, 100 MHz): δ 170.2, 137.1, 135.6, 129.2, 128.6, 128.5, 128.4, 128.3, 127.3, 67.0, 36.3, 32.3 ppm.
[00328] Benzyl 2-(cyclohexylthio)acetate (85). Following the standard procedure, % yield (76), GC-MS m/z (% relative intensity): 264(17.3), 173(25.3), 115(61.1), 91(100), 81(58.4), 55(25.8); 1H NMR (CDCl3, 400 MHz): δ 7.35-7.33 (m, 5H), 5.16 (s, 2H), 3.27 (s, 2H), 2.74 (m, 1H), 1.95-1.93 (m, 2H), 1.72 (m, 2H), 1.59 (br s, 1H), 1.29-1.21 (m, 5H) ppm; 13C NMR (CDCl3, 100 MHz): δ 170.6, 135.7, 128.5, 128.3, 66.9, 43.9, 33.1, 32.0, 25.9, 25.7 ppm.
[00329] 6.4 Example 4: [2,3] sigmatropic rearrangement reactions catalyzed by myoglobin-based catalysts.
[00330] The ability of myoglobin-based catalysts to catalyze [2,3] sigmatropic rearrangement reactions was initially investigated using a model allyl sulfide, allyl(phenyl)sulfane (90), as the carbene acceptor substrate, EDA as the carbene donor, and Mb(L29A,H64V) as the catalyst (FIG. 23). This reaction led to the efficient formation of the rearrangement product 92 (44% conversion; TON: 440), whose formation is consistent with the [2,3]-sigmatropic rearrangement of a sulfonium ylide intermediate (91, FIG. 23) generated upon Mb-catalyzed carbene transfer to the sulfide. The sulfonium ylide likely arises from nucleophilic attack of the sulfane substrate on the heme-bound carbene intermediate generated upon reaction of the diazo compound with the hemoprotein (FIG. 5). Notably, the Mb(L29A,H64V)-catalyzed formation of 92 was also found to occur with a certain degree of enantioselectivity (15% e.e., FIG. 23), as determined by chiral GC analysis. Upon screening additional Mb variants, it was possible to identify Mb catalysts with improved catalytic efficiency and enantioselectivity for the conversion of 90 to 92 (FIG. 23). Notably, in addition to higher TON values (900 vs. 440 TON), the Mb variant Mb(F43V,V68F) was found to have complementary enantioselectivity as compared to Mb(L29A,H64V) (FIG. 23).
[00331] To investigate the scope of this novel Mb-catalyzed reaction, a variety of other allylic and propargylic sulfide substrates were also tested in combination with different α-diazo esters (FIGS. 24-25). As illustrated by the representative data of FIGS. 24 and 25, all the tested substrates, including phenyl(alkyl)-, aryl(alkyl)-, benzyl(alkyl)-, and dialkyl-sulfanes, could be converted to the desired rearrangement products (compounds 93-99) with high efficiency and yields (30-99%). In addition, as exemplified by the reaction with allyl(phenyl)sulfane and EDA to give product 92, many of the engineered Mb variants exhibited superior performance in terms of catalytic efficiency and/or enantioselectivity as compared to wild-type sperm whale myoglobin (SEQ ID NO: 1).
[00332] To establish whether a similar Mb-catalyzed sigmatropic rearrangement reaction could be achieved starting from tertiary allylic amines as the carbene acceptors, the model substrate N-allyl-N-methylaniline was reacted with EDA in the presence of sperm whale myoglobin (SEQ ID NO: 1) and Mb(H64V,V68A). In both cases, the desired rearrangement product (ethyl 2-(methyl(phenyl)amino)pent-4-enoate) could be obtained in good yields (50-60%).
[00333] Altogether, these experiments demonstrate the ability of the myoglobin-based catalysts provided herein to catalyze synthetically useful transformations such as the [2,3] sigmatropic rearrangement of thioether and tertiary amine substrates. As noted for the transformations described earlier (EXAMPLES 1-3), the catalytic activity and enantioselectivity of the Mb catalysts in the context of these sigmatropic rearrangement reactions could be tuned via mutagenesis of the active site residues. Importantly, no natural or engineered biocatalyst has been reported to date to be capable of catalyzing these reactions.
[00334] Experimental Details.
[00335] Protein expression and purification. Wild-type Mb and the engineered Mb variants were expressed in E. coli BL21(DE3) cells as described in EXAMPLE 1.
[00336] Reaction conditions and analytical methods. The [2,3] sigmatropic rearrangement reactions were carried out and analyzed according to the same procedures described in EXAMPLE 3 (sulfanes) and EXAMPLE 2 (amines). Authentic standards for the rearrangement products were prepared according to the general procedure described below.
[00337] Synthesis of authentic standard of ethyl 2-(phenylthio)pent-4-enoate (92). To a flame-dried round-bottom flask under argon, equipped with a stir bar, was added allyl(phenyl)sulfane (1 equiv.) and Rh2(OAc)4 (5 mol%) in dichloromethane (2-3 mL). To this solution was added a solution of EDA (1 equiv.) in dichloromethane (1-2 mL) over 30 minutes at 0 °C. The resulting mixture was stirred at room temperature overnight. The solvent was removed under vacuum and the crude mixture was purified by flash chromatography (9:1 hexanes/diethyl ether) to obtain 92 in 80% yield. GC-MS m/z (% relative intensity): 236(61.9), 195(81.5), 163(88.1), 149(98.1), 121(93.7), 109(100); 1H NMR (CDCl3, 500 MHz): δ 7.47 (d, J = 5.5 Hz, 2H), 7.30-7.26 (m, 3H), 5.85-5.76 (m, 1H), 5.15-5.08 (m, 2H), 4.18-4.09 (m, 2H), 3.71-3.67 (m, 1H), 2.66-2.60 (m, 1H), 2.54-2.49 (m, 1H), 1.18-1.14 (m, 3H) ppm; 13C NMR (CDCl3, 125 MHz): δ 171.6, 133.9, 133.1, 128.9, 128.0, 118.0, 61.1, 50.3, 35.8, 14.1 ppm.
[00338] 6.5 Example 5: Preparation and carbene transfer reactivity of cofactor-substituted myoglobin-based catalysts
[00339] The carbene transfer reactivity of the myoglobin catalysts is dependent upon the presence of a heme cofactor (iron-protoporphyrin IX) bound to the protein. Accordingly, it was envisioned that varying the nature of this cofactor, e.g., by using an alternative metalloporphyrin cofactor, could provide a means to modulate the carbene transfer reactivity of these catalysts. To examine this aspect, the catalytic properties of metallo-substituted Mb variants, in which the heme cofactor is replaced with a Mn- or Co-protoporphyrin IX, were investigated.
[00340] Mn- and Co-substituted Mb variants have been previously obtained by reconstitution of apomyoglobin with the corresponding metallo-protoporphyrins IX (Yonetani and Asakura 1969; Yonetani, Yamamoto et al. 1974; Heinecke, Yi et al. 2012). While viable, this approach involves laborious and time-consuming refolding procedures. To overcome this issue, a convenient and practical strategy was implemented for the recombinant expression of metallo-substituted Mb variants by using E. coli cells that express a heterologous, outer-membrane heme transporter (ChuA) (Varnado and Goodwin 2004). Accordingly, wild-type sperm whale Mb and the heme transporter ChuA from O157:H7 E. coli (Varnado and Goodwin 2004) were initially expressed in BL21(DE3) cells using a dual plasmid system in which the Mb and ChuA genes are under an IPTG-inducible promoter. Cells were grown in M9 minimal medium supplemented with MnIII(ppIX). Under these conditions, Mn-substituted Mb (Mb(MnIII)) could be successfully isolated with a yield of approximately 5 mg/L of culture. Upon observation that a good fraction of the expressed Mb accumulated in the form of inclusion bodies, a second plasmid encoding both ChuA and the chaperone complex GroEL/ES was prepared. The latter was expected to increase the fraction of the desired protein in correctly folded and soluble form. Indeed, this system led to a significant increase (2.5-fold) in the yield of Mb(MnIII) (13 vs. 5 mg/L culture). Under further optimized conditions (glycerol as energy source and Mn(ppIX) at 30 μg/mL), the expression yield could be raised to 19 mg/L culture with quantitative incorporation of MnIII(ppIX). Finally, a further 25% increase in the yield of Mb(MnIII) (23 mg/L) was obtained by using an engineered derivative of the BL21(DE3) E. coli strain which favors the expression of toxic proteins (Dumon-Seignovert, Cariot et al. 2004). The same protocol could then be applied for the expression and isolation of Mb(CoIII) in good yield (16 mg/L culture).
[00341] The purified Mn- and Co-containing myoglobin variants, referred to as Mb(Mn) and Mb(Co), were characterized by electronic absorption spectroscopy in both oxidized and reduced form (FIGS. 26A-C). As shown in FIG. 26B, Mb(MnIII) shows a split Soret band with λmax at 375 and 469 nm in phosphate buffer at pH 7.0. Upon addition of dithionite, a single Soret band with λmax at 438 nm becomes apparent, indicating complete reduction of the protein to Mb(MnII). On the other hand, the visible spectrum of Mb(CoIII) shows a prominent absorption band at 422 nm, which shifts to 401 nm under reducing conditions, thus evidencing the formation of the reduced form, Mb(CoII) (FIG. 26C).
[00342] The metallo-substituted Mb(Mn) and Mb(Co) variants were then tested for their ability to catalyze the cyclopropanation of styrene in the presence of EDA as the carbene donor under standard reaction conditions as described in EXAMPLE 1. These experiments showed that both Mb(Mn) and Mb(Co) variants were able to catalyze this reaction with higher efficiency and/or altered selectivity as compared to wild-type sperm whale myoglobin (SEQ ID NO: 1). Altogether, these experiments provide a proof-of-principle demonstration of the functionality of cofactor-modified myoglobin catalysts for carbene transfer reactions.
[00343] Experimental details.
[00344] Construction of ChuA-containing plasmids. A first pACYC-derived plasmid (pHPEX2) (Varnado and Goodwin 2004) containing the E. coli O157:H7 ChuA gene under a lacUV5 promoter was provided by Prof. Douglas Goodwin (Auburn University). A second pACYC-based plasmid (pGroES/EL-ChuA) was constructed by inserting the ChuA gene under a T7 promoter and the E. coli GroES/EL gene under an araBAD promoter. To generate this construct, the ChuA gene was amplified from pHPEX2 using primers T7_ChuA_SacI_for and ChuA_XhoI_rev (Table SI) and cloned into a Sac I / Xho I cassette of the pACYC-derived vector. The GroES-EL chaperone genes were amplified from E. coli genomic DNA using primers GroEL/ES_BglII_for and GroEL/ES_SalI_rev and cloned into the Bgl II / Sal I cassette of the same vector.
[00345] Expression of Co- and Mn-substituted myoglobin. E. coli BL21(DE3) (or C41(DE3) (Lucigen)) cells were co-transformed with the Mb-encoding plasmid (pET22_MYO) and the ChuA-encoding vector pHPEX2 or pGroES/EL-ChuA. Cells were grown in M9 minimal medium supplemented with micronutrients and the appropriate antibiotics at 37 °C until the OD600 reached 0.6. Cell cultures were then induced with 0.25 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) and supplemented with MnIII(ppIX) or CoIII(ppIX) to a final concentration of 6 or 30 μg/mL. Cells containing the pGroES/EL-ChuA vector were also induced with 0.5% arabinose at this point. After induction, cultures were shaken at 150 rpm and 27 °C and harvested after 20 hours by centrifugation at 4000 rpm at 4 °C. Proteins were purified as described above. Extinction coefficients of ε424 = 152.5 mM⁻¹ cm⁻¹ and ε470 = 60 mM⁻¹ cm⁻¹ were used to determine the concentration of Mb(CoIII) and Mb(MnIII), respectively (Heinecke, Yi et al. 2012). Electronic absorption spectra were recorded in phosphate buffer (50 mM, pH 7.0) at 20 °C.
[00346] 6.6 Example 6: Preparation and carbene transfer reactivity of myoglobin-based catalysts with alternative proximal ligands
[00347] In myoglobins, the side-chain imidazolyl group of a conserved histidine residue (His93 in sperm whale myoglobin) coordinates the iron atom of the heme cofactor at the proximal face of the porphyrin ring (FIG. 1). Since the nature of this proximal ligand can affect the electronic properties of the metal center, it was envisioned that replacing the residue at this position with a natural or unnatural amino acid could provide a means to modulate the reactivity of the myoglobin-based catalysts described herein in the context of carbene transfer reactions.
[00348] Accordingly, a series of proximal ligand Mb variants (SEQ ID NO: 14 through 27) were prepared by replacing the His93 residue in wild-type sperm whale myoglobin (SEQ ID NO: 1) and in one of the most promising cyclopropanation catalysts, Mb(H64V,V68A) (SEQ ID NO: 11), with Ala, Asp, Glu, Tyr, Cys, Ser, or Gly. Some of these residues (i.e., Cys, Asp, Glu, Tyr, Ser) have a side-chain group capable of coordinating the metal ion of the heme group (or other metalloporphyrin/metalloporphyrin analog) in the Mb catalyst, while others (Ala, Gly) do not, leaving the proximal site available for coordination by other species (e.g., solvent).
[00349] In addition, three proximal ligand Mb variants were prepared by replacing the His93 residue in wild-type sperm whale myoglobin (SEQ ID NO: 1) and in one of the most promising cyclopropanation catalysts, Mb(H64V,V68A) (SEQ ID NO: 11), with the unnatural amino acids p-amino-phenylalanine (pAmF) (SEQ ID NOS: 28 and 31), 3-pyridyl-alanine (3PyA) (SEQ ID NOS: 29 and 32), and 3-methyl-histidine (3MeH) (SEQ ID NOS: 30 and 33) via amber stop codon suppression (Liu and Schultz 2010). These unnatural amino acids furnish alternative metal-coordinating side-chain groups (i.e., arylamino, pyridyl, and methylimidazolyl groups, respectively) as compared to those provided by the pool of natural amino acids, thus expanding opportunities for modulating the reactivity properties of the Mb catalysts. To generate these Mb variants, a gene encoding the polypeptide sequence of Mb(H64V,V68A) (SEQ ID NO: 11) and containing an amber stop codon (TAG) in place of His93 was cloned under an IPTG-inducible promoter in a pET22 vector (Novagen). Using procedures known in the art (Liu and Schultz 2010; Kolev, Zaengle et al. 2014), this gene was then expressed in BL21(DE3) cells containing a second plasmid encoding an engineered, orthogonal Methanocaldococcus jannaschii tRNA/aminoacyl-tRNA synthetase (AARS) pair capable of suppressing an amber stop codon with p-amino-phenylalanine, 3-pyridyl-alanine, or 3-methyl-histidine. After expression, the unnatural amino acid-containing Mb variants were purified by Ni-affinity chromatography as described in EXAMPLE 1. All the variants were able to bind and retain the heme cofactor, as indicated by the presence of a Soret band in their UV-VIS electronic absorption spectra (FIG. 26D).
[00350] All the aforementioned proximal-ligand Mb variants carrying a His93Ala, His93Asp, His93Glu, His93Tyr, His93Cys, His93Ser, His93Gly, His93(pAmF), His93(3PyA), or His93(3MeH) mutation were tested for their carbene transfer reactivity using representative reactions for olefin cyclopropanation (styrene + EDA), carbene N-H insertion (aniline + EDA; aniline + EDP), and carbene S-H insertion (thiophenol + EDA; thiophenol + EDP). Notably, all of the Mb(H64V,V68A)-derived variants were found to possess detectable activity in one or more of these reactions, with their catalytic activity ranging from 10% to 80% of that of the parent hemoprotein, Mb(H64V,V68A). Altogether, these experiments demonstrate that alternative amino acid residues, including unnatural amino acids, can be introduced in place of the proximal residue His93, thus enabling tuning and optimization of the catalytic and selectivity properties of the Mb catalysts in the context of carbene transfer transformations.
[00351] 6.7 Example 7: Carbene transfer reactions catalyzed by myoglobin-based catalysts in whole-cell systems
[00352] Performing biocatalytic reactions in whole-cell systems eliminates the time and effort associated with purification of the biocatalyst, increasing the time- and cost-effectiveness of the process. To establish whether the Mb-catalyzed transformations described in the previous EXAMPLES could be carried out using whole-cell systems, a model reaction (styrene cyclopropanation with EDA) was investigated using E. coli cells (BL21(DE3)) expressing the Mb variant Mb(H64V,V68A). After protein expression, the cells were resuspended in phosphate buffer (50 mM KPi, pH 8.0) to a final OD600 of 40, and the substrate (styrene) and carbene donor (EDA) were added to the cell suspension. Complete conversion of styrene to the desired cyclopropanation product (3a) could be achieved using the Mb(H64V,V68A)-expressing cells under anaerobic conditions (FIG. 27). This reaction also proceeded with high diastereo- and enantioselectivity (99% d.e., 99% e.e.), indicating that the cyclopropanation transformation is almost exclusively catalyzed by the Mb(H64V,V68A) catalyst inside the cell, as free hemin would lead to lower diastereoselectivity and no enantioselectivity (FIG. 4). Importantly, the successful in vivo conversion of styrene in the absence of exogenous reductant added to the cell suspension indicated that the intracellular environment is sufficiently reducing to maintain the hemoprotein in the catalytically active ferrous form.
[00353] Performing the same reaction using Mb(H64V,V68A)-expressing cells under aerobic conditions led to a somewhat reduced product conversion (~50% relative yield). However, a significant increase in product conversion (70-80% relative yield as compared to the yield obtained under optimal (anaerobic) conditions) was achieved upon addition of glucose to the cell suspension and by simply sealing the reaction vessel (FIG. 27). This result is likely due to the glucose stimulating the consumption of intracellular oxygen via aerobic metabolism and respiratory pathways, thus reducing the inhibitory effect of oxygen on the myoglobin catalyst. Similar results were obtained with a model N-H insertion reaction (aniline and EDA), demonstrating the applicability of this strategy across different carbene transfer transformations. Overall, these results demonstrated the possibility of applying the myoglobin catalysts and performing the myoglobin-catalyzed reactions using whole-cell systems, including under aerobic conditions.
[00354] 6.8 Example 8: Gram-scale synthesis of drugs and advanced pharmaceutical intermediates using engineered myoglobin catalysts.
[00355] To further demonstrate the utility of the myoglobin catalysts described herein for the synthesis of high-value products such as pharmaceuticals and advanced pharmaceutical intermediates, the following proof-of-principle experiments were carried out.
[00356] In order to prepare a key intermediate for the synthesis of tranylcypromine (101, FIG. 28A), an FDA-approved monoamine oxidase (MAO) inhibitor in use as an antidepressant drug, a large-scale whole-cell reaction was performed using BL21(DE3) E. coli cells expressing the Mb(H64V,V68A) variant. The cells were suspended at an OD600 of 40 in phosphate buffer (pH 7.2) under aerobic conditions, and styrene (2.88 g) and EDA (3.15 g) were added. Glucose was added to the medium at a final concentration of 50 mM. The reaction mixture was stirred at room temperature for 20 hours and extracted with ethyl acetate. After flash chromatography purification, the desired product, (1S,2S)-ethyl 2-phenylcyclopropanecarboxylate, could be isolated in excellent yield (4.7 g, 91% yield). The crude mixture was found to be extremely clean, containing less than 2% of unreacted styrene as the only impurity. Chiral GC analysis showed that the cyclopropanation product was produced with nearly absolute diastereo- and enantioselectivity (i.e., 99.9% d.e. and 99.9% e.e.). Using procedures known in the art, (1S,2S)-ethyl 2-phenylcyclopropanecarboxylate could then be converted to (1S,2R)-tranylcypromine (101, FIG. 28A).
[00357] In another example, Pfizer's clinical drug candidate 104 (FIG. 28B) was targeted for synthesis by using a stereoselective myoglobin-catalyzed cyclopropanation process to give access to the key chiral cyclopropane moiety. Accordingly, a whole-cell cyclopropanation reaction was carried out under aerobic conditions using Mb(H64V,V68A)-expressing BL21(DE3) E. coli cells (OD600 = 40), EDA, and the olefin substrate 102 (FIG. 28B). From this reaction, 123 mg of the desired cyclopropanation product, (1S,2S)-ethyl-2-methyl-2-(6-(trifluoromethyl)pyridin-3-yl)cyclopropanecarboxylate (103, FIG. 28B), could be isolated in 69% yield. Chiral GC analysis showed that 103 was produced with excellent diastereo- and enantioselectivity (99.9% d.e., 99.9% e.e.), which further highlighted the broad substrate scope of the Mb(H64V,V68A) catalyst. The enantiopure intermediate 103 could then be converted to the target compound 104 using art-known procedures and protocols.
[00358] Altogether, these experiments demonstrate the practical utility and scalability of the engineered myoglobin catalysts described herein for the stereoselective synthesis of high-value compounds and building blocks.
[00359] 6.9 Example 9: Myoglobin catalysts with altered selectivity via active site mutagenesis.
[00360] The results described in EXAMPLES 1-4 illustrate how the catalytic activity and/or selectivity of myoglobin-based polypeptides as carbene transfer catalysts can be modulated via mutagenesis of amino acid residues defining the active site of the hemoprotein according to the methods provided herein. To further illustrate this point, a library of Mb variants was prepared starting from Mb(H64V) by mutating one or more additional active site residues in sperm whale myoglobin (FIG. 1) by site-directed mutagenesis. For example, Mb variants carrying double mutations at positions H64/V68, L29/H64, H64/I107, and F43/H64, and carrying triple mutations at positions L29/H64/V68, were prepared and then tested to identify Mb-based cyclopropanation catalysts with altered diastereo- and stereoselectivity as compared to the (1S,2S)-selective Mb(H64V,V68A) catalyst, using a model reaction with styrene and EDA (FIG. 29).
[00361] As illustrated by the data provided in FIG. 29, using this strategy a number of promising Mb catalysts could be identified for producing alternative stereoisomers of the cyclopropanation product (i.e., 3b-d). For example, Mb(H64V,V68L) and Mb(H64V,V68F) were found to exhibit excellent trans-diastereoselectivity (>99.9% de) and high (1R,2R)-stereoselectivity, thus complementing the scope of Mb(H64V,V68A). Similarly, other Mb catalysts were identified that can produce the cis product 3d with high stereoselectivity (e.g., 95-99% ee) such as, for example, Mb(H64V,V68G). Using a similar approach, further mutagenesis can be applied to these and/or other Mb variants in order to further optimize the catalytic and selectivity properties of the myoglobin catalysts in the context of olefin cyclopropanation and/or the other carbene transfer reactions described herein.
[00362] These results, together with those described in the previous EXAMPLES, suggest that engineered variants of other naturally occurring myoglobins can be prepared and applied for catalyzing carbene transfer reactions according to the methods described herein. By way of example, the amino acid sequence of sperm whale myoglobin (SEQ ID NO: 1) was used to identify related myoglobin polypeptides such as Xenopus laevis myoglobin (SEQ ID NO: 112), Channichthys rhinoceratus myoglobin (SEQ ID NO: 113), Thunnus alalunga myoglobin (SEQ ID NO: 114), Clonorchis sinensis myoglobin (SEQ ID NO: 115), and Schistosoma mansoni myoglobin (SEQ ID NO: 116). Via protein sequence alignment, the His or Tyr residue corresponding to the distal histidine (His64, FIG. 1) in sperm whale myoglobin (SEQ ID NO: 1) was identified. Since a H64V mutation was found useful to greatly increase the carbene transfer reactivity of sperm whale myoglobin (EXAMPLE 1), a similar mutation was introduced into Xenopus laevis myoglobin (H82V), into Channichthys rhinoceratus myoglobin (H59V), into Thunnus alalunga myoglobin (H59V), into Clonorchis sinensis myoglobin (Y82V), and into Schistosoma mansoni myoglobin (Y82V), resulting in engineered myoglobin catalysts with improved catalytic activity and/or altered selectivity for olefin cyclopropanation (using the cyclopropanation of styrene with EDA as model reaction) as compared to the respective wild-type polypeptides. These results show how the methods described herein can be extended to myoglobins other than sperm whale myoglobin for the purpose of developing novel and improved biocatalysts for the carbene transfer reactions described herein.
[00363] 6.10 Example 10: Aldehyde olefination reactions catalyzed by myoglobin-based catalysts.
[00364] The transition metal-catalyzed transformation of carbonyls in the presence of diazo compounds and tertiary phosphines has provided an attractive strategy for the synthesis of olefins from aldehydes under mild conditions (Ledford and Carreira 1997; Lebel and Paquet 2004; Graban and Lemke 2002; Chen, Huang et al. 2003; Mirafzal, Cheng et al. 2002; Chen, Huang et al. 2003; Aggarwal, Fulton et al. 2003; Cao, Li et al. 2007). While a number of synthetic catalysts have been described for aldehyde olefination, no natural enzyme or artificial biocatalyst is known to promote this synthetically valuable transformation.
[00365] To address this gap, the ability of wild-type sperm whale myoglobin (SEQ ID NO: 1) to promote the conversion of benzaldehyde 111a and ethyl α-diazoacetate (EDA, 112a) to ethyl cinnamate 113a in the presence of triphenylphosphine (PPh3) was initially investigated. Formation of the desired product 113a was observed with good diastereoselectivity (76% de for the E-isomer), albeit with only modest activity (31 turnovers or TON) (FIG. 30, Entry 3). Both reducing (Na2S2O4) and oxygen-free conditions were found to be required for the observed Mb-dependent aldehyde olefination activity, consistent with the idea that the ferrous form of the hemoprotein is involved in the activation of the diazo compound. Additional experiments showed that hemin is also able to promote this transformation, although this reaction proceeds with reduced catalytic efficiency (22 TON) and lower diastereoselectivity (65% de) as compared to that conducted in the presence of Mb (FIG. 30, Entry 1). In addition, the hemin reaction is much less chemoselective, yielding larger amounts of the carbene dimerization byproducts, diethyl fumarate and diethyl maleate (TON(3a)/TON(4a): 0.4 vs. 2.8 with Mb, FIG. 30). In an effort to improve the efficiency and selectivity of the Mb-mediated olefination reaction, a variety of trialkyl phosphines (e.g., PEt3, P(i-Bu)3, P(n-Bu)3) as well as heavier congeners of PPh3 (i.e., AsPh3, SbPh3, and BiPh3) were tested as substitutes for triphenylphosphine (FIG. 30). Interestingly, whereas neither BiPh3 nor any of the trialkyl phosphines led to the desired olefin product, the reaction carried out in the presence of AsPh3 exhibited excellent diastereoselectivity, leading to the formation of trans-113a as the only detectable isomer (>99.9% de).
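The diastereomeric excess values quoted above can be translated into E:Z isomer ratios with a short calculation. The sketch below assumes the conventional definition de = (major - minor) / (major + minor), which the text does not state explicitly; the function name and the condition labels are illustrative only.

```python
def isomer_fractions(de_percent):
    """Return (major, minor) isomer fractions for a diastereomeric excess in percent.

    Assumes de = (major - minor) / (major + minor), with major + minor = 1.
    """
    de = de_percent / 100.0
    major = (1.0 + de) / 2.0
    minor = (1.0 - de) / 2.0
    return major, minor


# de values reported in the text for the benzaldehyde + EDA reaction
reported = [("Mb with PPh3", 76.0), ("hemin with PPh3", 65.0), ("Mb with AsPh3", 99.9)]

for label, de in reported:
    major, minor = isomer_fractions(de)
    print(f"{label}: {de}% de -> {major:.2%} E / {minor:.2%} Z")
```

Under this definition, 76% de corresponds to roughly an 88:12 E:Z mixture, whereas >99.9% de means that less than about 0.05% of the minor isomer is present.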
[00366] The investigations were extended to a panel of Mb variants containing one or two mutations at the level of the protein active site. As described in the previous EXAMPLES, it was found that mutagenesis of these residues had a profound impact on the selectivity and activity of Mb-catalyzed carbene transfer reactions. Accordingly, the Mb active site variants were tested for their relative activity and selectivity in the olefination reaction with benzaldehyde and EDA in the presence of either PPh3 or AsPh3. As summarized in FIG. 31, the active site mutations were found to have a noticeable effect on the catalytic efficiency (TON), diastereoselectivity, and chemoselectivity of the reaction. Among the Mb variants tested, the double mutant Mb(F43V,V68F), used in combination with AsPh3, emerged as the most promising catalyst for this reaction, exhibiting 3-fold higher TON compared to wild-type Mb, excellent diastereoselectivity (>99.9% de), and high chemoselectivity toward aldehyde olefination over carbene dimerization. At a catalyst loading of 0.01 mol%, Mb(F43V,V68F) was determined to support over 1,100 catalytic turnovers for the conversion of 111 to E-113a, featuring an initial rate of 320 and 40 turnovers min⁻¹ over the first minute and first 15 minutes, respectively. Importantly, nearly absolute E-selectivity as well as high chemoselectivity (TON(olefin):TON(dimer) > 4) were maintained under these conditions. Across nearly all Mb variants, the AsPh3-supported reactions consistently furnished higher degrees of diastereoselectivity as compared to those performed in the presence of PPh3 (FIG. 31). The only exception is Mb(H64V,V68A), for which a reversal of this trend was observed (70% vs. 57% de for reaction with PPh3 vs. AsPh3).
[00367] To investigate the scope of Mb(F43V,V68F) as aldehyde olefination catalyst, the reaction with benzaldehyde was carried out in the presence of other α-diazo esters, including tert-butyl (112b), benzyl (112c), and cyclohexyl (112d) α-diazoacetate. Notably, despite their variable alkyl chain, all of the α-diazoacetates (112b-112d) could be readily processed by the biocatalyst to yield the corresponding trans β-aryl-α,β-unsaturated ester products, 113b-113d, with good (79-83% de) to excellent (98-99.9% de) selectivity in the presence of PPh3 and AsPh3, respectively (FIG. 32). In combination with 113c, Mb(F43V,V68F) gave the highest TON value (4,920) and conversion ratio (49%), whereas the use of the Mb(F43V,V68F)/113d/AsPh3 system provided an optimal combination of high catalytic activity (4,200 TON) with excellent stereocontrol (99.9% de). As such, the latter system was maintained for further studies on the scope of this hemoprotein across different aldehydes (vide infra). Under these optimized conditions, the TON values supported by Mb(F43V,V68F) in water and at room temperature are one to two orders of magnitude higher than those previously reported for organometallic catalysts in organic solvents and at elevated temperature in the context of similar transformations (50-300 TON) (Lu, Fang et al. 1989; Herrmann and Wang 1991; Ledford and Carreira 1997; Graban and Lemke 2002; Mirafzal, Cheng et al. 2002; Aggarwal, Fulton et al. 2003; Chen, Huang et al. 2003; Chen, Huang et al. 2003; Santos, Romao et al. 2003; Lebel and Paquet 2004; Cao, Li et al. 2007; Lebel and Davi 2008; Lebel and Ladjel 2008).
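The relationship between turnover number, catalyst loading, and substrate conversion used in the preceding paragraphs can be illustrated with a brief calculation. The sketch below assumes that the 0.01 mol% catalyst loading stated for Mb(F43V,V68F) also applies to the reaction that gave 4,920 TON, and that TON is defined as moles of olefin product per mole of catalyst; the function names are illustrative.

```python
def conversion_percent(ton, loading_mol_percent):
    """Percent of aldehyde converted to olefin.

    TON is mol product per mol catalyst and the loading is mol catalyst
    per 100 mol substrate, so their product is mol product per 100 mol substrate.
    """
    return ton * loading_mol_percent


def max_ton(loading_mol_percent):
    """Upper bound on TON, reached only at complete substrate conversion."""
    return 100.0 / loading_mol_percent


loading = 0.01  # mol%, as stated for Mb(F43V,V68F); assumed to apply to both runs below

for ton in (1100, 4920):
    print(f"{ton} TON at {loading} mol% -> {conversion_percent(ton, loading):.1f}% conversion")

print(f"maximum TON at {loading} mol%: {max_ton(loading):.0f}")
```

Under these assumptions, the reported 4,920 TON is consistent with the stated conversion ratio of 49%.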
[00368] Next, the scope of Mb(F43V,V68F)-catalyzed olefination across different aldehyde substrates was investigated. As summarized in FIG. 33, a variety of monosubstituted benzaldehyde derivatives (115a-123a) could be readily converted to the corresponding cyclohexyl trans-cinnamate esters 115b-123b with very good to excellent diastereoselectivity (99-99.9% de), with the Mb catalyst supporting from 1,110 (123b) to 3,400 turnovers (115b and 117b). The successful conversion of 124a to 124b showed that disubstituted benzaldehydes could also be processed by the Mb(F43V,V68F) catalyst, albeit with lower efficiency (1,140 vs. 3,400 TON) and selectivity (91% vs. 98% de) compared to the monosubstituted counterpart, 115b. Substrates such as 2-naphthaldehyde (125a) and thiophene-2-carbaldehyde (126a) could also be converted to the corresponding trans olefin products 125b and 126b, respectively, with excellent selectivity (99% de), further supporting the broad substrate scope of Mb(F43V,V68F) across structurally different aryl aldehydes. Finally, the successful olefination of phenylacetaldehyde (127a) to give 127b (1,940 TON; 92% de) demonstrated the reactivity of the catalyst also toward alkyl aldehydes.
[00369] Altogether, these results show that the engineered myoglobins described herein can provide efficient and selective biocatalysts for the olefination of aldehydes under mild and neutral conditions. Using Mb(F43V,V68F), for example, a variety of aryl aldehydes and alkyl α-diazoacetates could be converted to the corresponding olefin products with high catalytic efficiency (1,100-4,900 TON) and excellent E-selectivity (94-99.9% de). The Mb-catalyzed aldehyde olefination described herein expands the number of synthetically valuable transformations accessible through biocatalysis.
[00370] Experimental Details.
[00371] Protein expression and purification. Wild-type Mb and the engineered Mb variants were expressed in E. coli BL21(DE3) cells as described in EXAMPLE 1.
[00372] Analytical methods. Gas chromatography (GC) analyses were carried out using a Shimadzu GC-2010 gas chromatograph equipped with an FID detector and a Chiral Cyclosil-B column (30 m x 0.25 mm x 0.25 μm film). Separation method for calculation of TON and TTN values: 1 μL injection, injector temp.: 200 °C, detector temp.: 300 °C. Gradient: column temperature set at 80 °C for 3 min, then to 160 °C at 2.80 °C/min, then to 165 °C at 1 °C/min, then to 245 °C at 25 °C/min.
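The total duration of the GC oven program described above can be estimated from the stated hold time and ramp rates. The sketch below assumes that there is no hold at the final temperature, since none is specified; the function and variable names are illustrative.

```python
def program_minutes(start_temp_c, initial_hold_min, ramps):
    """Total oven program time in minutes.

    ramps is a list of (target_temp_c, rate_c_per_min) steps applied in order.
    """
    total = initial_hold_min
    temp = start_temp_c
    for target, rate in ramps:
        total += (target - temp) / rate  # time needed to reach the next setpoint
        temp = target
    return total


# Gradient from the analytical method: 80 C for 3 min, then three ramps
ramps = [(160.0, 2.80), (165.0, 1.0), (245.0, 25.0)]
total = program_minutes(80.0, 3.0, ramps)
print(f"estimated run time: {total:.1f} min")
```

With these settings, the program takes a little under 40 minutes per injection, most of which is spent on the slow 2.80 °C/min ramp.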
[00373] Aldehyde olefination reaction. Typically, reactions were carried out at a 400 μL scale using 20 μM myoglobin, 10 mM benzaldehyde, 10 mM EDA, 10 mM triphenylphosphine (or trialkyl phosphines, AsPh3, SbPh3, BiPh3) and 10 mM sodium dithionite. In a typical procedure, a solution containing sodium dithionite (100 mM stock solution) in potassium phosphate buffer (50 mM, pH 8.0) was degassed by bubbling argon into the mixture for 4 min in a sealed vial. A buffered solution containing myoglobin was carefully degassed in a similar manner in a separate vial. The two solutions were then mixed together via cannula. Reactions were initiated by addition of 10 μL of benzaldehyde (from a 0.4 M stock solution in DMSO) and 10 μL of triphenylphosphine (from a 0.4 M stock solution in DMSO), followed by the addition of 10 μL of EDA (from a 0.4 M stock solution in DMSO) with a syringe, and the reaction mixture was stirred for 12 h at room temperature under positive argon pressure. The reactions were analyzed by adding 20 μL of internal standard (benzodioxole, 50 mM in methanol) to the reaction mixture, followed by extraction with 400 μL of dichloromethane and analysis by GC-FID. Calibration curves for quantification of the different aldehyde olefination products were constructed using authentic standards prepared synthetically according to art-known procedures. All measurements were performed at least in duplicate. For each experiment, negative control samples containing either no enzyme or no reductant were included.
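The reagent volumes in the procedure above can be checked against the stated final concentrations with a simple dilution calculation. The sketch below assumes that 400 μL is the final reaction volume after all additions; the function names are illustrative.

```python
def final_conc_mM(stock_M, added_uL, total_uL):
    """Final concentration (mM) after diluting added_uL of a stock into total_uL."""
    return stock_M * 1000.0 * added_uL / total_uL


def volume_needed_uL(stock_mM, target_mM, total_uL):
    """Stock volume (uL) required to reach target_mM in total_uL."""
    return target_mM * total_uL / stock_mM


total_uL = 400.0  # assumed final reaction volume

# 10 uL of each 0.4 M DMSO stock (benzaldehyde, phosphine, EDA)
substrate_mM = final_conc_mM(0.4, 10.0, total_uL)

# Volume of the 100 mM dithionite stock needed for 10 mM final
dithionite_uL = volume_needed_uL(100.0, 10.0, total_uL)

# Catalyst loading implied by 20 uM myoglobin relative to the aldehyde
catalyst_uM = 20.0
loading_mol_percent = catalyst_uM / (substrate_mM * 1000.0) * 100.0

print(f"substrate: {substrate_mM:.1f} mM")
print(f"dithionite stock: {dithionite_uL:.0f} uL")
print(f"catalyst loading: {loading_mol_percent:.2f} mol%")
```

Under this assumption, each 10 μL addition gives the stated 10 mM final concentration, and 20 μM myoglobin corresponds to a catalyst loading of 0.2 mol% relative to the aldehyde.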
[00374] Ethyl cinnamate (113a). GC-MS m/z (% relative intensity): 176(29.0), 148(15.9), 131(100), 103(48.3); 1H NMR (CDCl3, 500 MHz): δ 7.70 (d, J = 16 Hz, 1H), 7.51-7.50 (m, 2H), 7.37-7.36 (m, 3H), 6.45 (d, J = 16 Hz, 1H), 4.28 (q, J = 7.0 Hz, 2H), 1.34 (t, J = 7.0 Hz, 3H) ppm; 13C NMR (CDCl3, 125 MHz): δ 166.9, 144.5, 134.5, 130.2, 128.8, 128.0, 118.3, 60.4, 14.3 ppm.
[00375] tert-Butyl cinnamate (113b). GC-MS m/z (% relative intensity): 204(7.9), 147(100), 131(65.2), 103(31.8); 1H NMR (CDCl3, 500 MHz): δ 7.61 (d, J = 16 Hz, 1H), 7.50-7.49 (m, 2H), 7.36-7.35 (m, 3H), 6.39 (d, J = 16 Hz, 1H), 1.54 (s, 9H) ppm; 13C NMR (CDCl3, 125 MHz): δ 166.3, 143.5, 134.7, 129.9, 128.8, 127.9, 120.2, 80.4, 28.2 ppm.
[00376] Cyclohexyl cinnamate (113d). GC-MS m/z (% relative intensity): 230(3.6), 149(37.7), 131(100), 103(38.7); 1H NMR (CDCl3, 400 MHz): δ 7.69 (d, J = 16.0 Hz, 1H), 7.51-7.50 (m, 2H), 7.37-7.35 (m, 3H), 6.45 (d, J = 16 Hz, 1H), 4.92-4.87 (m, 1H), 1.92-1.91 (m, 2H), 1.78-1.75 (m, 2H), 1.58-1.27 (m, 6H) ppm; 13C NMR (CDCl3, 100 MHz): δ 166.1, 144.0, 134.3, 129.8, 128.6, 127.7, 118.7, 72.4, 31.5, 25.2, 23.6 ppm.
[00377] (E)-Cyclohexyl 3-(o-tolyl)acrylate (115b). GC-MS m/z (% relative intensity): 244(5.7), 162(28.1), 145(100), 116(68.4); 1H NMR (CDCl3, 400 MHz): δ 7.98 (d, J = 16 Hz, 1H), 7.55 (d, J = 7.5 Hz, 1H), 7.26-7.17 (m, 3H), 6.37 (d, J = 16 Hz, 1H), 4.92-4.88 (m, 1H), 2.43 (s, 3H), 1.92-1.76 (m, 2H), 1.58-1.55 (m, 2H), 1.54-1.31 (m, 6H) ppm; 13C NMR (CDCl3, 100 MHz): δ 166.5, 141.9, 137.5, 133.5, 130.7, 129.8, 126.4, 126.3, 119.9, 72.6, 31.7, 25.5, 23.8, 19.7 ppm.
[00378] (E)-Cyclohexyl 3-(4-methyl-phenyl)acrylate (116b). GC-MS m/z (% relative intensity): 246(13.0), 164(100), 147(69.4), 120(18.4); 1H NMR (CDCl3, 400 MHz): δ 7.66 (d, J = 15.6 Hz, 1H), 7.42 (d, J = 6.0 Hz, 2H), 7.18 (d, J = 5.6 Hz, 2H), 6.40 (d, J = 15.6 Hz, 1H), 4.89-4.88 (m, 1H), 2.36 (s, 3H), 2.00 (m, 2H), 1.90 (m, 2H), 1.58-1.26 (m, 6H) ppm; 13C NMR (CDCl3, 100 MHz): δ 166.4, 144.0, 140.2, 131.6, 129.3, 127.7, 117.6, 72.4, 31.5, 25.2, 23.6, 21.2 ppm.
[00379] (E)-Cyclohexyl 3-(4-methoxy-phenyl)acrylate (117b). GC-MS m/z (% relative intensity): 260(25.4), 178(100), 161(74.8), 134(42.8); 1H NMR (CDCl3, 400 MHz): δ 7.64 (d, J = 16 Hz, 1H), 7.48 (d, J = 8.4 Hz, 2H), 6.90 (d, J = 8.4 Hz, 2H), 6.32 (d, J = 16 Hz, 1H), 4.90-4.84 (m, 1H), 3.82 (s, 3H), 1.91-1.90 (m, 2H), 1.78-1.75 (m, 2H), 1.58-1.27 (m, 6H) ppm; 13C NMR (CDCl3, 100 MHz): δ 166.5, 161.0, 143.6, 129.4, 127.1, 116.2, 114.0, 72.2, 55.1, 31.5, 25.2, 23.6 ppm.
[00380] (E)-Cyclohexyl 3-(4-fluoro-phenyl)acrylate (118b). GC-MS m/z (% relative intensity): 248(3.1), 166(61.9), 149(100), 121(28.8), 101(30.5); 1H NMR (CDCl3, 400 MHz): δ 7.64 (d, J = 16 Hz, 1H), 7.51-7.48 (m, 2H), 7.08-7.04 (m, 2H), 6.37 (d, J = 16 Hz, 1H), 4.89-4.86 (m, 1H), 1.91-1.89 (m, 2H), 1.76-1.75 (m, 2H), 1.57-1.25 (m, 6H) ppm; 13C NMR (CDCl3, 100 MHz): δ 166.0, 142.7, 130.6, 129.6, 129.5, 118.4, 115.8, 115.6, 72.5, 31.5, 25.2, 23.6 ppm.
[00381] (E)-Cyclohexyl 3-(4-chloro-phenyl)acrylate (119b). GC-MS m/z (% relative intensity): 264(5.7), 182(86.8), 165(100), 137(33.1), 102(42.0); 1H NMR (CDCl3, 400 MHz): δ 7.62 (d, J = 16 Hz, 1H), 7.46 (d, J = 8.4 Hz, 2H), 7.36 (d, J = 7.6 Hz, 2H), 6.42 (d, J = 16 Hz, 1H), 4.90-4.86 (m, 1H), 1.91-1.90 (m, 2H), 1.78-1.75 (m, 2H), 1.56-1.25 (m, 6H) ppm; 13C NMR (CDCl3, 100 MHz): δ 165.9, 142.5, 135.7, 132.8, 129.0, 128.9, 119.2, 72.7, 31.5, 25.2, 23.6 ppm.
[00382] (E)-Cyclohexyl 3-(4-bromo-phenyl)acrylate (120b). GC-MS m/z (% relative intensity): 308(4.1), 226(73.1), 209(59.5), 102(100); 1H NMR (CDCl3, 400 MHz): δ 7.61 (d, J = 16 Hz, 1H), 7.51 (d, J = 8.4 Hz, 2H), 7.39 (d, J = 8.4 Hz, 2H), 6.43 (d, J = 16 Hz, 1H), 4.90-4.85 (m, 1H), 1.91-1.90 (m, 2H), 1.78-1.75 (m, 2H), 1.60-1.25 (m, 6H) ppm; 13C NMR (CDCl3, 100 MHz): δ 165.9, 142.6, 133.2, 131.8, 129.1, 124.1, 119.4, 72.6, 31.5, 25.2, 23.5 ppm.
[00383] (E)-Cyclohexyl 3-(4-(trifluoromethyl)phenyl)acrylate (121b). GC-MS m/z (% relative intensity): 217(73.8), 199(100), 171(30.4), 151(45.7); 1H NMR (CDCl3, 400 MHz): δ 7.69-7.63 (m, 5H), 6.52 (d, J = 16 Hz, 1H), 4.90 (m, 1H), 1.91 (m, 2H), 1.76 (m, 2H), 1.58-1.25 (m, 6H) ppm; 13C NMR (CDCl3, 100 MHz): δ 165.6, 142.1, 137.7, 127.8, 125.5, 121.3, 72.9, 31.5, 25.1, 23.5 ppm.
[00384] (E)-Cyclohexyl 3-(4-nitro-phenyl)acrylate (122b). GC-MS m/z (% relative intensity): 194(85.8), 176(100), 130(37.9), 102(44.1); 1H NMR (CDCl3, 400 MHz): δ 8.24 (d, J = 7.2 Hz, 2H), 7.70-7.65 (m, 3H), 6.57 (d, J = 16 Hz, 1H), 4.91-4.90 (m, 1H), 1.90 (m, 2H), 1.75 (m, 2H), 1.58-1.25 (m, 6H) ppm; 13C NMR (CDCl3, 100 MHz): δ 165.2, 148.2, 141.0, 140.5, 128.3, 123.9, 123.0, 73.1, 31.4, 25.1, 23.5 ppm.
[00385] (E)-Cyclohexyl 3-(4-(dimethylamino)-phenyl)acrylate (123b). GC-MS m/z (% relative intensity): 273(54.9), 191(100), 174(34.2), 147(49.9); 1H NMR (CDCl3, 400 MHz): δ 7.62 (d, J = 15.6 Hz, 1H), 7.42 (d, J = 8.8 Hz, 2H), 6.66 (d, J = 8.4 Hz, 2H), 6.23 (d, J = 15.6 Hz, 1H), 4.89-4.84 (m, 1H), 2.99 (s, 6H), 1.92-1.91 (m, 2H), 1.78-1.76 (m, 2H), 1.58-1.26 (m, 6H) ppm; 13C NMR (CDCl3, 100 MHz): δ 167.1, 151.4, 144.6, 129.4, 122.1, 112.9, 111.5, 71.9, 39.9, 31.6, 25.2, 23.7 ppm.
[00386] (E)-Cyclohexyl 3-(2,5-dimethylphenyl)acrylate (124b). GC-MS m/z (% relative intensity): 258(17.6), 176(52.8), 159(100), 130(94.9), 115(44.6); 1H NMR (CDCl3, 400 MHz): δ 7.96 (d, J = 16 Hz, 1H), 7.35 (s, 1H), 7.05 (s, 2H), 6.37 (d, J = 15.6 Hz, 1H), 4.94-4.87 (m, 1H), 2.37 (s, 3H), 2.30 (s, 3H), 1.93-1.91 (m, 2H), 1.79-1.76 (m, 2H), 1.58-1.27 (m, 6H) ppm; 13C NMR (CDCl3, 100 MHz): δ 166.2, 141.9, 141.8, 135.3, 134.3, 132.9, 130.5, 130.4, 126.6, 119.2, 72.3, 31.5, 25.3, 23.6 ppm.
[00387] (E)-Ethyl 3-(naphthalen-2-yl)acrylate (125b). GC-MS m/z (% relative intensity): 226(73.3), 198(14.7), 181(100), 152(92.5); 1H NMR (CDCl3, 500 MHz): δ 7.85-7.75 (m, 4H), 7.61 (d, J = 8.5 Hz, 1H), 7.49-7.45 (m, 2H), 6.55 (d, J = 16 Hz, 1H), 4.33 (q, J = 7.0 Hz, 2H), 1.38 (t, J = 7.0 Hz, 3H) ppm; 13C NMR (CDCl3, 125 MHz): δ 167.0, 144.6, 134.2, 133.3, 131.9, 129.9, 128.6, 128.5, 127.8, 127.2, 126.7, 123.5, 118.4, 60.5, 14.4 ppm.
[00388] (E)-Ethyl 3-(thiophen-2-yl)acrylate (126b). GC-MS m/z (% relative intensity): 182(35.4), 154(11.5), 137(100), 109(40.1); 1H NMR (CDCl3, 500 MHz): δ 7.76 (d, J = 15.5 Hz, 1H), 7.32 (d, J = 4.5 Hz, 1H), 7.20 (d, J = 3.0 Hz, 1H), 7.01-6.99 (m, 1H), 6.22 (d, J = 15.5 Hz, 1H), 4.23 (q, J = 7.0 Hz, 2H), 1.30 (t, J = 7.5 Hz, 3H) ppm; 13C NMR (CDCl3, 125 MHz): δ 166.7, 139.5, 136.9, 130.8, 128.3, 128.0, 117.0, 60.4, 14.3 ppm.
[00389] (E)-Ethyl 4-phenylbut-2-enoate (127b). GC-MS m/z (% relative intensity): 190(40.8), 145(39.0), 127(18.7), 117(100); 1H NMR (CDCl3, 500 MHz): δ 7.33 (t, J = 7.0 Hz, 2H), 7.26 (t, J = 7.5 Hz, 1H), 7.19 (d, J = 7.5 Hz, 2H), 7.14-7.08 (m, 1H), 5.84 (d, J = 15.5 Hz, 1H), 4.21 (q, J = 7.0 Hz, 2H), 3.53 (d, J = 6.5 Hz, 2H), 1.29 (t, J = 7.0 Hz, 3H) ppm; 13C NMR (CDCl3, 125 MHz): δ 166.4, 147.3, 137.7, 128.8, 128.7, 126.6, 122.4, 60.2, 38.4, 14.3 ppm.
[00390] REFERENCES
Abu-Elfotoh, A. M., K. Phomkeona, et al. (2010). Angew. Chem. Int. Ed. 49(45): 8439-8443.
Aggarwal, V. K., J. R. Fulton, et al. (2003). J. Am. Chem. Soc. 125(20): 6034-6035.
Anding, B. J., A. Ellern, et al. (2012). Organometallics 31(9): 3628-3635.
Aviv, I. and Z. Gross (2006). Synlett (6): 951-953.
Aviv, I. and Z. Gross (2008). Chemistry 14(13): 3995-4005.
Bessho, Y., D. R. Hodgson, et al. (2002). Nat. Biotechnol. 20(7): 723-728.
Bordeaux, M., R. Singh, et al. (2014). Bioorg. Med. Chem. 22(20): 5697-5704.
Bornscheuer, U. T., G. W. Huisman, et al. (2012). Nature 485(7397): 185-194.
Brunner, H., K. Wutz, et al. (1990). Monatsh. Chem. 121: 755-764.
Cao, P., C. Y. Li, et al. (2007). J. Org. Chem. 72(17): 6628-6630.
Che, C. M., J. S. Huang, et al. (2001). J. Am. Chem. Soc. 123(18): 4119-4129.
Chen, Y., L. Huang, et al. (2003). J. Org. Chem. 68(9): 3714-3717.
Chen, Y., L. Y. Huang, et al. (2003). Org. Lett. 5(14): 2493-2496.
Coelho, P. S., E. M. Brustad, et al. (2013). Science 339(6117): 307-310.
Coelho, P. S., Z. J. Wang, et al. (2013). Nat. Chem. Biol. 9(8): 485-U433.
Davies, H. M. L. and C. Venkataramani (2003). Org. Lett. 5(9): 1403-1406.
Dedkova, L. M., N. E. Fahmi, et al. (2003). J. Am. Chem. Soc. 125(22): 6616-6617.
Del Zotto, A., W. Baratta, et al. (1999). J. Chem. Soc. Perkin Trans. (21): 3079-3081.
Doyle, M. P. and D. C. Forbes (1998). Chem. Rev. 98(2): 911-936.
Doyle, M. P., M. A. McKervey, et al. (1998). Modern catalytic methods for organic synthesis with diazo compounds: from cyclopropanes to ylides. New York, Wiley-CH.
Dumon-Seignovert, L., G. Cariot, et al. (2004). Protein Expr. Purif. 37(1): 203-206.
Galardon, E., P. LeMaux, et al. (1997). J. Chem. Soc. Perkin Trans. (17): 2455-2456.
Galardon, E., S. Roue, et al. (1998). Tetrahedron Lett. 39(16): 2333-2334.
Gillingham, D. and N. Fei (2013). Chem. Soc. Rev. 42(12): 4918-4931.
Graban, E. and F. R. Lemke (2002). Organometallics 21(18): 3823-3826.
Hayashi, T., H. Dejima, et al. (2002). J. Am. Chem. Soc. 124(38): 11226-11227.
Hayashi, T., T. Matsuo, et al. (2002). J. Inorg. Biochem. 91(1): 94-100.
Heinecke, J. L., J. Yi, et al. (2012). J. Inorg. Biochem. 107(1): 47-53.
Herrmann, W. A. and M. Wang (1991). Angew. Chem. Int. Ed. 30(12): 1641-1643.
Kolev, J. N., J. M. Zaengle, et al. (2014). ChemBioChem 15(7): 1001-1010.
Kourouklis, D., H. Murakami, et al. (2005). Methods 36(3): 239-244.
Lebel, H. and M. Davi (2008). Adv. Synth. Catal. 350(14-15): 2352-2358.
Lebel, H. and C. Ladjel (2008). Organometallics 27(11): 2676-2678.
Lebel, H. and V. Paquet (2004). J. Am. Chem. Soc. 126(1): 320-328.
Lebel, H., J. F. Marcoux, et al. (2003). Chem. Rev. 103(4): 977-1050.
Ledford, B. E. and E. M. Carreira (1997). Tetrahedron Lett. 38(47): 8125-8128.
Li, Y., J. S. Huang, et al. (2002). J. Am. Chem. Soc. 124(44): 13185-13193.
Liu, B., S. F. Zhu, et al. (2007). J. Am. Chem. Soc. 129(18): 5834-5835.
Liu, C. C. and P. G. Schultz (2010). Annu. Rev. Biochem. 79: 413-444.
Locuson, C. W., J. M. Hutzler, et al. (2007). Drug Metab. Dispos. 35(4): 614-622.
Lu, X. Y., H. Fang, et al. (1989). J. Organomet. Chem. 373(1): 77-84.
Mirafzal, G. A., G. L. Cheng, et al. (2002). J. Am. Chem. Soc. 124(2): 176-177.
Moody, C. J. (2007). Angew. Chem. Int. Ed. 46(48): 9148-9150.
Morilla, M. E., M. M. Diaz-Requejo, et al. (2002). Chem. Commun. (24): 2998-2999.
Murakami, H., A. Ohta, et al. (2006). Nat. Methods 3(5): 357-359.
Neumann, H., K. Wang, et al. (2010). Nature 464(7287): 441-444.
Nowlan, D. T., T. M. Gregg, et al. (2003). J. Am. Chem. Soc. 125(51): 15902-15911.
Paul, C. E., I. W. C. E. Arends, et al. (2014). ACS Catal. 4(3): 788-797.
Pellissier, H. (2008). Tetrahedron 64(30-31): 7041-7095.
Redaelli, C., E. Monzani, et al. (2002). ChemBioChem 3(2-3): 226-233.
Rein, H., A. Maricic, et al. (1976). Biochim. Biophys. Acta 446(1): 325-330.
Rodriguez, E. A., H. A. Lester, et al. (2006). Proc. Natl. Acad. Sci. U.S.A. 103(23): 8650-8655.
Santos, A. M., C. C. Romao, et al. (2003). J. Am. Chem. Soc. 125(9): 2414-2415.
Varnado, C. L. and D. C. Goodwin (2004). Protein Expr. Purif. 35(1): 76-83.
Vojtechovsky, J., K. Chu, et al. (1999). Biophys. J. 77(4): 2153-2174.
Xu, B., S. F. Zhu, et al. (2014). Chem. Sci. 5(4): 1442-1448.
Wang, Z. J., N. E. Peck, et al. (2014). Chem. Sci. 5(2): 598-601.
Wang, Z. J., H. Renata, et al. (2014). Angew. Chem. Int. Ed. 53(26): 6810-6813.
Waser, J., B. Caspar, et al. (2006). J. Am. Chem. Soc. 128(35): 11693-11712.
Wolf, J. R., C. G. Hamaker, et al. (1995). J. Am. Chem. Soc. 117(36): 9194-9199.
Woodward, J. J., N. I. Martin, et al. (2007). Nat. Methods 4(1): 43-45.
Yonetani, T. and T. Asakura (1969). J. Biol. Chem. 244(17): 4580-4588.
Yonetani, T., H. Yamamoto, et al. (1974). J. Biol. Chem. 249(3): 682-690.
Zhang, Z. and J. Wang (2008). Tetrahedron 64: 6577-6605.
Zhang, X., M. Ma, et al. (2003). Arkivoc (ii): 84-91.
Zhang, Y. Z., S. F. Zhu, et al. (2009). Chem. Commun. (36): 5362-5364.
[00391] It will be appreciated that variants of the above-disclosed and other features and functions or alternatives thereof, may be combined into many other different systems or applications. Various presently unforeseen or unanticipated alternatives, modifications, variations, or improvements therein may be subsequently made by those skilled in the art which are also intended to be encompassed by the following claims.
[00392] While embodiments of the present disclosure have been particularly shown and described with reference to certain examples and features, it will be understood by one skilled in the art that various changes in detail may be effected therein without departing from the spirit and scope of the present disclosure as defined by claims that can be supported by the written description and drawings. Further, where exemplary embodiments are described with reference to a certain number of elements it will be understood that the exemplary embodiments can be practiced utilizing either less than or more than the certain number of elements.
[00393] All references cited herein are incorporated herein by reference in their entirety and for all purposes to the same extent as if each individual publication, patent or patent application was specifically and individually indicated to be incorporated by reference in its entirety for all purposes.
[00394] The citation of any publication is for its disclosure prior to the filing date and should not be construed as an admission that the present invention is not entitled to antedate such publication by virtue of prior invention.

Claims

What is claimed is:
1. An engineered myoglobin catalyst having an improved capability, as compared to the myoglobin of SEQ ID NO: 1, to catalyze a carbene transfer reaction, wherein the engineered myoglobin catalyst comprises an amino acid sequence that is at least 60% identical to SEQ ID NO: 1, 112, 113, 114, 115, or 116.
2. The engineered myoglobin catalyst of claim 1, wherein the improved capability of the myoglobin catalyst is an improvement in its catalytic activity, regioselectivity, diastereoselectivity and/or enantioselectivity.
3. The engineered myoglobin catalyst of claim 1, wherein the myoglobin catalyst comprises an amino acid sequence that is at least 90% identical to SEQ ID NO: 1 and comprises an amino acid substitution at position X29, X32, X33, X39, X44, X45, X46, X64, X67, X68, X93, X107, or X111 of SEQ ID NO: 1.
4. The engineered myoglobin catalyst of claim 3, wherein the amino acid sequence of the myoglobin catalyst comprises at least one feature, wherein the feature is: X29 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X32 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X33 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X39 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X43 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X45 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X46 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X64 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X67 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X68 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X93 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X107 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; and/or X111 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y, of SEQ ID NO: 1.
5. The engineered myoglobin catalyst of claim 3, wherein the myoglobin catalyst is selected from the group consisting of SEQ ID NOS: 2 through 110.
6. The engineered myoglobin catalyst of claim 1, wherein the myoglobin catalyst comprises a metal-binding cofactor, wherein the metal-binding cofactor is a heme analog, a metalloporphyrin, or a porphyrin analog.
7. The engineered myoglobin catalyst of claim 6, wherein the metal-binding cofactor is a mesoporphyrin, protoporphyrin, bisglycolporphyrin, corrole, phthalocyanine, phlorin, chlorin, 5-isocorrole, 10-isocorrole, or porphycene.
8. The engineered myoglobin catalyst of claim 6, wherein the metal bound by the metal-binding cofactor is iron, manganese, cobalt, ruthenium, rhodium, or osmium.
9. The engineered myoglobin catalyst of claim 1, wherein the amino acid residue that coordinates the metal atom at the axial position of the metal-containing cofactor in the myoglobin catalyst is a natural or non-natural α-amino acid comprising a -SH, -NH2, -OH, =N-, -NC, imidazolyl, or pyridyl group within its side chain.
10. The engineered myoglobin catalyst of claim 9, wherein the amino acid residue that coordinates the metal atom at the axial position of the metal-containing cofactor in the myoglobin catalyst is serine, threonine, cysteine, tyrosine, histidine, aspartic acid, glutamic acid, selenocysteine, para-amino-phenylalanine, meta-amino-phenylalanine, para-mercaptomethyl-phenylalanine, meta-mercaptomethyl-phenylalanine, para-(isocyanomethyl)-phenylalanine, meta-(isocyanomethyl)-phenylalanine, 3-pyridyl-alanine, or 3-methyl-histidine.
11. A method for catalyzing a carbene insertion reaction, the method comprising:
(a) providing a diazo-containing carbene precursor of formula (I)
Figure imgf000102_0001
(I) wherein,
R1a and R2a are independently selected from the group consisting of H, halo, cyano (-CN), nitro (-NO2), trifluoromethyl (-CF3), C1-18 alkyl, substituted C1-18 alkyl, C6-10 aryl, substituted C6-10 aryl, 5- to 10-membered heteroaryl, substituted 5- to 10-membered heteroaryl, -C(O)OR1b, -C(O)N(R1b)(R1c), -C(O)R1b, -Si(R1b)(R1c)(R1d), and -SO2(R1b),
where each R1b, R1c and R1d are independently selected from the group consisting of H, C1-18 alkyl, substituted C1-18 alkyl, C6-10 aryl, substituted C6-10 aryl, 6- to 10-membered heteroaryl and substituted 6- to 10-membered heteroaryl group;
(b) providing a myoglobin-based catalyst;
(c) providing a carbene acceptor substrate of formula (II), (IV), (VI) or (VIII):
Figure imgf000103_0001
(II) (IV) (VIII)
where
R2 is C6-15 aryl, substituted C6-15 aryl, 5- to 15-membered heteroaryl, substituted 5- to 15-membered heteroaryl, C1-18 aliphatic or substituted C1-18 aliphatic group;
R3 is H, halo, cyano, C1-18 aliphatic, substituted C1-18 aliphatic, C6-10 aryl, substituted C6-10 aryl, 5- to 10-membered heteroaryl, substituted 5- to 10-membered heteroaryl, -C(O)OR1b, -C(O)N(R1b)(R1c), and -C(O)R1b,
where each of R1b and R1c are independently selected from the group consisting of H, C1-18 aliphatic, substituted C1-18 aliphatic, C6-10 aryl, substituted C6-10 aryl, 5- to 10-membered heteroaryl, and substituted 5- to 10-membered heteroaryl group;
each of R4 and R5 are independently selected from the group consisting of H, halo, cyano, C1-18 aliphatic, substituted C1-18 aliphatic, C6-10 aryl, substituted C6-10 aryl, 5- to 10-membered heteroaryl, and substituted 5- to 10-membered heteroaryl group;
R6 is C1-18 aliphatic, substituted C1-18 aliphatic, C6-16 aryl, substituted C6-16 aryl, 5- to 10-membered heteroaryl, substituted 5- to 10-membered heteroaryl, C4-C16 cyclic aliphatic, substituted C4-C16 cyclic aliphatic, C4-C16 heterocyclic, or substituted C4-C16 heterocyclic group;
R7 is H, C1-18 aliphatic, substituted C1-18 aliphatic, C6-16 aryl, substituted C6-16 aryl, 5- to 10-membered heteroaryl, or substituted 5- to 10-membered heteroaryl; or where R6 and R7 are connected to form a C4-C16 cyclic aliphatic or heterocyclic group or a substituted C4-C16 cyclic aliphatic or heterocyclic group;
R8 is C1-18 aliphatic, substituted C1-18 aliphatic, C6-16 aryl, substituted C6-16 aryl, 5- to 10-membered heteroaryl, substituted 5- to 10-membered heteroaryl, C4-C16 cyclic aliphatic, substituted C4-C16 cyclic aliphatic, C4-C16 heterocyclic, or substituted C4-C16 heterocyclic group;
R9 is C1-18 aliphatic, substituted C1-18 aliphatic, C6-16 aryl, substituted C6-16 aryl, 5- to 10-membered heteroaryl, substituted 5- to 10-membered heteroaryl, C4-C16 cyclic aliphatic, substituted C4-C16 cyclic aliphatic, C4-C16 heterocyclic, or substituted C4-C16 heterocyclic group;
R10 and R11 are C1-6 aliphatic groups or substituted C1-6 aliphatic groups;
(d) contacting the diazo-containing carbene precursor and the carbene acceptor substrate with the myoglobin-based catalyst; and
(e) allowing the reaction to proceed for a time sufficient to form a carbene insertion
Figure imgf000104_0001
(III)
where R1a, R2a, R2, R3, R4, R5, R6, R7, R8, R9, R10, and R11 are defined as recited in (a) and (c) above.
12. The method of claim 11, wherein the myoglobin comprises an amino acid sequence that is at least 60% identical to SEQ ID NO: 1, 112, 113, 114, 115, or 116.
13. The method of claim 12, wherein the myoglobin catalyst comprises an amino acid sequence that is at least 90% identical to SEQ ID NO: 1 and comprises an amino acid substitution at position X29, X32, X33, X39, X44, X45, X46, X64, X67, X68, X93, X107, or X111 of SEQ ID NO: 1.
14. The method of claim 13, wherein the amino acid sequence of the myoglobin catalyst comprises at least one feature, wherein the feature is: X29 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X32 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X33 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X39 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X43 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X45 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X46 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X64 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X67 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X68 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X93 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X107 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; and/or X111 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y, of SEQ ID NO: 1.
15. The method of claim 13, wherein the myoglobin catalyst is selected from the group consisting of SEQ ID NOS: 2 through 110.
16. The method of claim 12, wherein the myoglobin catalyst comprises a metal-binding cofactor, wherein the metal-binding cofactor is a heme analog, a metalloporphyrin, or a porphyrin analog.
17. The method of claim 16, wherein the metal-binding cofactor is a mesoporphyrin, protoporphyrin, bisglycolporphyrin, corrole, phthalocyanine, phlorin, chlorin, 5-isocorrole, 10-isocorrole, or porphycene.
18. The method of claim 16, wherein the metal bound by the metal-binding cofactor is iron, manganese, cobalt, ruthenium, rhodium, or osmium.
19. The method of claim 12, wherein the amino acid residue that coordinates the metal atom at the axial position of the metal-containing cofactor in the myoglobin catalyst is a natural or non-natural α-amino acid comprising a -SH, -NH2, -OH, =N-, -NC, imidazolyl, or pyridyl group within its side chain.
20. The method of claim 19, wherein the amino acid residue that coordinates the metal atom at the axial position of the metal-containing cofactor in the myoglobin catalyst is serine, threonine, cysteine, tyrosine, histidine, aspartic acid, glutamic acid, selenocysteine, para-amino-phenylalanine, meta-amino-phenylalanine, para-mercaptomethyl-phenylalanine, meta-mercaptomethyl-phenylalanine, para-(isocyanomethyl)-phenylalanine, meta-(isocyanomethyl)-phenylalanine, 3-pyridyl-alanine, or 3-methyl-histidine.
21. The method of claim 11, wherein the myoglobin catalyst is tethered to a solid support.
22. The method of claim 11, wherein the myoglobin catalyst is contained in a host cell.
23. The method of claim 22, wherein the host cell is Escherichia coli, Saccharomyces cerevisiae, or Pichia pastoris.
24. The method of claim 11, wherein the carbene insertion product of formula (III) is selected from the group consisting of:
Figure imgf000107_0001
Figure imgf000107_0002
wherein Ar is C6-15 aryl, substituted C6-15 aryl, 6 to 15 membered heteroaryl, or substituted 6 to 15 membered heteroaryl; and
Alk is C1-18 aliphatic or substituted C1-18 aliphatic.
25. The method of claim 11, wherein the diazo-containing carbene precursor and the carbene acceptor substrate are part of the same molecule.
26. A method for catalyzing a sigmatropic rearrangement reaction, the method comprising:
(a) providing a diazo-containing carbene precursor of formula (I)
Figure imgf000108_0001
(I)
wherein, R1a and R2a are independently selected from H, halo, cyano (-CN), nitro (-NO2), trifluoromethyl (-CF3), C1-18 alkyl, substituted C1-18 alkyl, C6-10 aryl, substituted C6-10 aryl, 5- to 10-membered heteroaryl, substituted 5- to 10-membered heteroaryl, -C(O)OR1b, -C(O)N(R1b)(R1c), -C(O)R1b, -Si(R1b)(R1c)(R1d), and -SO2(R1b), where each of R1b, R1c, and R1d are independently selected from H, C1-18 alkyl, substituted C1-18 alkyl, C6-10 aryl, substituted C6-10 aryl, 6- to 10-membered heteroaryl and substituted 6- to 10-membered heteroaryl;
(b) providing a myoglobin-based catalyst;
(c) providing a carbene acceptor substrate of formula (X), (XI), (XIV) or (XV):
Figure imgf000109_0001
Figure imgf000109_0002
(XI) (XV)
wherein R12 is selected from C1-18 aliphatic, substituted C1-18 aliphatic, C6-16 aryl, substituted C6-16 aryl, 5- to 10-membered heteroaryl, substituted 5- to 10-membered heteroaryl, C4-C16 heterocyclic, and substituted C4-C16 heterocyclic group;
R13, R14, and R15 are independently selected from H, C1-6 aliphatic groups, substituted C1-6 aliphatic groups, C6-16 aryl, and substituted C6-16 aryl, or where R13 and R14 are connected to form a C4-C16 cyclic aliphatic or heterocyclic group or a substituted C4-C16 cyclic aliphatic or heterocyclic group;
R16 is C1-18 aliphatic, substituted C1-18 aliphatic, C6-16 aryl, substituted C6-16 aryl, 5- to 10-membered heteroaryl, substituted 5- to 10-membered heteroaryl, C4-C16 heterocyclic, and substituted C4-C16 heterocyclic group;
R17 is C1-6 aliphatic, substituted C1-6 aliphatic, C6 aryl, substituted C6 aryl, 5- to 6-membered heteroaryl, or substituted 5- to 6-membered heteroaryl; or where R16 and R17 are connected together to form a C4-C16 cyclic aliphatic or heterocyclic group or a substituted C4-C16 cyclic aliphatic or heterocyclic group;
R18, R19, and R20 are independently selected from H, C1-6 aliphatic groups, substituted C1-6 aliphatic groups, C6-16 aryl, and substituted C6-16 aryl, or where R18 and R19 are connected together to form a C4-C16 cyclic aliphatic or heterocyclic group or a substituted C4-C16 cyclic aliphatic or heterocyclic group;
(d) contacting the diazo-containing carbene precursor and the carbene acceptor substrate with the myoglobin-based catalyst; and
(e) allowing the reaction to proceed for a time sufficient to form a sigmatropic rearrangement product of formula (XII), (XIII), (XVI), or (XVII) respectively,
Figure imgf000110_0001
(XII) (XIII)
Figure imgf000110_0002
where R1a, R2a, R9, R12, R13, R14, R15, R16, R17, R18, R19, and R20 are defined as recited in (a) and (c) above.
27. The method of claim 26, wherein the myoglobin comprises an amino acid sequence that is at least 60% identical to SEQ ID NO: 1, 112, 113, 114, 115, or 116.
28. The method of claim 27, wherein the myoglobin catalyst comprises an amino acid sequence that is at least 90% identical to SEQ ID NO: 1 and comprises an amino acid substitution at position X29, X32, X33, X39, X44, X45, X46, X64, X67, X68, X93, X107, or X111 of SEQ ID NO: 1.
29. The method of claim 28, wherein the amino acid sequence of the myoglobin catalyst comprises at least one feature, wherein the feature is: X29 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X32 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X33 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X39 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X43 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X45 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X46 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X64 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X67 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X68 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X93 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X107 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; and/or X111 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y, of SEQ ID NO: 1.
30. The method of claim 28, wherein the myoglobin catalyst is selected from the group consisting of SEQ ID NOS: 2 through 110.
31. The method of claim 27, wherein the myoglobin catalyst contains a metal-binding cofactor, wherein the metal-binding cofactor is a heme analog, a metalloporphyrin, or a porphyrin analog.
32. The method of claim 31, wherein the metal-binding cofactor is mesoporphyrin, protoporphyrin, bisglycolporphyrin, corrole, phthalocyanine, phlorin, chlorin, 5-isocorrole, 10-isocorrole, or porphycene.
33. The method of claim 31, wherein the metal bound by the metal-binding cofactor is iron, manganese, cobalt, ruthenium, rhodium, or osmium.
34. The method of claim 27, wherein the amino acid residue that coordinates the metal atom at the axial position of the metal-containing cofactor in the myoglobin catalyst is a natural or non-natural α-amino acid comprising a -SH, -NH2, -OH, =N-, -NC, imidazolyl, or pyridyl group within its side chain.
35. The method of claim 34, wherein the amino acid residue that coordinates the metal atom at the axial position of the metal-containing cofactor in the myoglobin catalyst is serine, threonine, cysteine, tyrosine, histidine, aspartic acid, glutamic acid, selenocysteine, para-amino-phenylalanine, meta-amino-phenylalanine, para-mercaptomethyl-phenylalanine, meta-mercaptomethyl-phenylalanine, para-(isocyanomethyl)-phenylalanine, meta-(isocyanomethyl)-phenylalanine, 3-pyridyl-alanine, or 3-methyl-histidine.
36. The method of claim 26, wherein the myoglobin catalyst is tethered to a solid support.
37. The method of claim 26, wherein the myoglobin catalyst is contained in a host cell.
38. The method of claim 37, wherein the host cell is Escherichia coli, Saccharomyces cerevisiae, or Pichia pastoris.
39. The method of claim 26, wherein the diazo-containing carbene precursor and the carbene acceptor substrate are part of the same molecule.
40. A method for catalyzing an aldehyde olefination reaction, the method comprising:
(a) providing a diazo-containing carbene precursor of formula (I)
Figure imgf000112_0001
(I)
wherein, R1a and R2a are independently selected from H, halo, cyano (-CN), nitro (-NO2), trifluoromethyl (-CF3), C1-18 alkyl, substituted C1-18 alkyl, C6-10 aryl, substituted C6-10 aryl, 5- to 10-membered heteroaryl, substituted 5- to 10-membered heteroaryl, -C(O)OR1b, -C(O)N(R1b)(R1c), -C(O)R1b, -Si(R1b)(R1c)(R1d), and -SO2(R1b), where each of R1b, R1c, and R1d are independently selected from H, C1-18 alkyl, substituted C1-18 alkyl, C6-10 aryl, substituted C6-10 aryl, 6- to 10-membered heteroaryl and substituted 6- to 10-membered heteroaryl;
(b) providing a myoglobin-based catalyst;
(c) providing an aldehyde substrate of formula R21-C(O)-H, wherein R21 is selected from C1-18 aliphatic, substituted C1-18 aliphatic, C6-16 aryl, substituted C6-16 aryl, 5- to 10-membered heteroaryl, substituted 5- to 10-membered heteroaryl, C4-C16 heterocyclic group and substituted C4-C16 heterocyclic group;
(d) providing a nucleophilic reagent, wherein the nucleophilic reagent is triphenylphosphine, triphenylarsine, or triphenylstibine;
(e) contacting the diazo-containing carbene precursor, the aldehyde substrate, and the nucleophilic reagent with the myoglobin-based catalyst; and
(f) allowing the reaction to proceed for a time sufficient to form an olefination product of formula
Figure imgf000113_0001
where R1a, R2a, and R21 are defined as recited in (a) and (c) above.
41. The method of claim 40, wherein the myoglobin comprises an amino acid sequence that is at least 60% identical to SEQ ID NO: 1, 112, 113, 114, 115, or 116.
42. The method of claim 41, wherein the myoglobin catalyst comprises an amino acid sequence that is at least 90% identical to SEQ ID NO: 1 and comprises an amino acid substitution at position X29, X32, X33, X39, X44, X45, X46, X64, X67, X68, X93, X107, or X111 of SEQ ID NO: 1.
43. The method of claim 42, wherein the amino acid sequence of the myoglobin catalyst comprises at least one feature, wherein the feature is: X29 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X32 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X33 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X39 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X43 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X45 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X46 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X64 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X67 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X68 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X93 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; X107 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y; or X111 is A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, U, V, W, or Y, of SEQ ID NO: 1.
44. The method of claim 42, wherein the myoglobin catalyst is selected from the group consisting of SEQ ID NOS: 2 through 110.
45. The method of claim 41, wherein the myoglobin catalyst comprises a metal-binding cofactor, wherein the metal-binding cofactor is a heme analog, a metalloporphyrin, or a porphyrin analog.
46. The method of claim 45, wherein the metal-binding cofactor is mesoporphyrin, protoporphyrin, bisglycolporphyrin, corrole, phthalocyanine, phlorin, chlorin, 5-isocorrole, 10-isocorrole, or porphycene.
47. The method of claim 45, wherein the metal bound by the metal-binding cofactor is iron, manganese, cobalt, ruthenium, rhodium, or osmium.
48. The method of claim 41, wherein the amino acid residue that coordinates the metal atom at the axial position of the metal-containing cofactor in the myoglobin catalyst is a natural or non-natural α-amino acid comprising a -SH, -NH2, -OH, =N-, -NC, imidazolyl, or pyridyl group within its side chain.
49. The method of claim 48, wherein the amino acid residue that coordinates the metal atom at the axial position of the metal-containing cofactor in the myoglobin catalyst is serine, threonine, cysteine, tyrosine, histidine, aspartic acid, glutamic acid, selenocysteine, para-amino-phenylalanine, meta-amino-phenylalanine, para-mercaptomethyl-phenylalanine, meta-mercaptomethyl-phenylalanine, para-(isocyanomethyl)-phenylalanine, meta-(isocyanomethyl)-phenylalanine, 3-pyridyl-alanine, or 3-methyl-histidine.
50. The method of claim 40, wherein the myoglobin catalyst is tethered to a solid support.
51. The method of claim 40, wherein the myoglobin catalyst is contained in a host cell.
52. The method of claim 51, wherein the host cell is Escherichia coli, Saccharomyces cerevisiae, or Pichia pastoris.
53. The method of claim 40, wherein the diazo-containing carbene precursor and the aldehyde substrate are part of the same molecule.
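The sequence-identity thresholds recited in the claims (at least 60% or at least 90% identical to a reference sequence) and the recited substitution positions can be illustrated with a short calculation. The Python sketch below is only one simple way to compute these quantities and is not taken from the disclosure. It assumes that the two sequences are already aligned and of equal length (no gaps), counts positions from 1, and uses short arbitrary peptide strings rather than SEQ ID NO: 1, which is not reproduced here. The function names are illustrative.

```python
def percent_identity(reference: str, variant: str) -> float:
    """Percent of aligned positions with identical residues (ungapped)."""
    if len(reference) != len(variant) or not reference:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(reference, variant) if a == b)
    return 100.0 * matches / len(reference)


def substituted_positions(reference: str, variant: str) -> list:
    """1-based positions at which the variant differs from the reference."""
    return [
        i
        for i, (a, b) in enumerate(zip(reference, variant), start=1)
        if a != b
    ]


# Arbitrary 10-residue strings used purely for illustration (hypothetical).
reference = "ACDEFGHIKL"
variant = "ACWEFGHIKY"

identity = percent_identity(reference, variant)
positions = substituted_positions(reference, variant)

print(f"identity: {identity:.1f}%")            # identity: 80.0%
print(f"substituted positions: {positions}")   # substituted positions: [3, 10]
print("meets 60% threshold:", identity >= 60.0)
print("meets 90% threshold:", identity >= 90.0)
```

For sequences of different lengths, or where insertions and deletions are present, an alignment step would be needed before counting identical positions; the definition of percent identity used in the specification controls.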

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201462084162P 2014-11-25 2014-11-25
US62/084,162 2014-11-25

Publications (1)

Publication Number Publication Date
WO2016086015A1 true WO2016086015A1 (en) 2016-06-02

Family

ID=56074986

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2015/062478 WO2016086015A1 (en) 2014-11-25 2015-11-24 Myoglobin-based catalysts for carbene transfer reactions

Country Status (1)

Country Link
WO (1) WO2016086015A1 (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110130550A1 (en) * 2008-03-11 2011-06-02 Osaka University Protein monomer, protein polymer obtained from said monomer, and device that contains them
WO2011111253A1 (en) * 2010-03-11 2011-09-15 国立大学法人大阪大学 Complex of plurality of metal nanoparticles and apoprotein derived from heme protein dimer
US20140242647A1 (en) * 2012-10-09 2014-08-28 California Institute Of Technology In vivo and in vitro olefin cyclopropanation catalyzed by heme enzymes

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
BORDEAUX ET AL.: "Highly diastereo- and enantioselective olefin cyclopropanation via engineered myoglobin-based catalysts", ANGEW CHEM INT ED ENGL., vol. 54, no. 6, 2 February 2015 (2015-02-02), pages 1744 - 1748 *
COELHO ET AL.: "Olefin Cyclopropanation via Carbene Transfer Catalyzed by Engineered Cytochrome P450 Enzymes", SCIENCE, vol. 339, no. 6117, 18 January 2013 (2013-01-18), pages 307 - 310, XP055252970, DOI: 10.1126/science.1231434 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9732080B2 (en) 2006-11-03 2017-08-15 Vertex Pharmaceuticals Incorporated Azaindole derivatives as CFTR modulators
US10081621B2 (en) 2010-03-25 2018-09-25 Vertex Pharmaceuticals Incorporated Solid forms of (R)-1(2,2-difluorobenzo[D][1,3]dioxol-5-yl)-N-(1-(2,3-dihydroxypropyl)-6-fluoro-2-(1-hydroxy-2-methylpropan-2-yl)-1H-indol-5-yl)cyclopropanecarboxamide
US10071979B2 (en) 2010-04-22 2018-09-11 Vertex Pharmaceuticals Incorporated Process of producing cycloalkylcarboxamido-indole compounds
US10208322B2 (en) 2012-10-09 2019-02-19 California Institute Of Technology In vivo and in vitro olefin cyclopropanation catalyzed by heme enzymes
US11008596B2 (en) 2012-10-09 2021-05-18 California Institute Of Technology Cytochrome P450 BM3 enzyme variants for preparation of cyclopropanes
US10206877B2 (en) 2014-04-15 2019-02-19 Vertex Pharmaceuticals Incorporated Pharmaceutical compositions for the treatment of cystic fibrosis transmembrane conductance regulator mediated diseases
WO2016191612A3 (en) * 2015-05-26 2017-01-19 California Institute Of Technology Hemoprotein catalysts for improved enantioselective enzymatic synthesis of ticagrelor
US11518768B2 (en) 2015-10-14 2022-12-06 The Regents Of The University Of California Artificial metalloenzymes containing noble metal-porphyrins
CN114479110A (en) * 2022-02-11 2022-05-13 云南大学 Covalent organic framework with triphenylantimony as framework and preparation method and application thereof
CN114479110B (en) * 2022-02-11 2023-02-28 云南大学 Covalent organic framework with triphenylantimony as framework as well as preparation method and application thereof

Similar Documents

Publication Publication Date Title
WO2016086015A1 (en) Myoglobin-based catalysts for carbene transfer reactions
Natoli et al. Noble-metal substitution in hemoproteins: an emerging strategy for abiological catalysis
Moore et al. Chemoselective cyclopropanation over carbene Y–H insertion catalyzed by an engineered carbene transferase
Patil et al. Oxidoreductase-catalyzed synthesis of chiral amines
Farwell et al. Enantioselective imidation of sulfides via enzyme-catalyzed intermolecular nitrogen-atom transfer
Wolf et al. Engineering of RuMb: Toward a green catalyst for carbene insertion reactions
Steck et al. Enantioselective synthesis of chiral amines via biocatalytic carbene N–H insertion
Brenna et al. Opposite Enantioselectivity in the Bioreduction of (Z)‐β‐Aryl‐β‐cyanoacrylates Mediated by the Tryptophan 116 Mutants of Old Yellow Enzyme 1: Synthetic Approach to (R)‐ and (S)‐β‐Aryl‐γ‐lactams
JP2015534464A (en) In vivo and in vitro olefin cyclopropanation catalyzed by heme enzymes
Gober et al. Non-natural carbenoid and nitrenoid insertion reactions catalyzed by heme proteins
US20150267232A1 (en) In vivo and in vitro carbene insertion and nitrene transfer reactions catalyzed by heme enzymes
US10927355B2 (en) Method for producing an organosilicon product
WO2016191612A2 (en) Hemoprotein catalysts for improved enantioselective enzymatic synthesis of ticagrelor
Crotti et al. Stereoselectivity switch in the reduction of α-alkyl-β-arylenones by structure-guided designed variants of the ene reductase OYE1
US11518768B2 (en) Artificial metalloenzymes containing noble metal-porphyrins
Zhang et al. Biocatalytic aromaticity-breaking epoxidation of naphthalene and nucleophilic ring-opening reactions
Ebrecht et al. Natural alternative heme-environments allow efficient peroxygenase activity by cytochrome P450 monooxygenases
US11525123B2 (en) Diverse carbene transferase enzyme catalysts derived from a P450 enzyme
Athavale et al. Engineering Enzymes for New‐to‐Nature Carbene Chemistry
US20160222423A1 (en) Enzyme-catalyzed enantioselective aziridination of olefins
Kagan Asymmetric oxidation of sulfides
Zhang New-To-Nature Selective C–H Alkylation Using Engineered Carbene Transferases
CN114836486B (en) Method for synthesizing chiral beta-amino alcohol by enzyme catalysis
Rumo Artificial metalloenzymes based on copper heteroscorpionate complexes for C–H functionalization catalysis
Chen Expanding the Catalytic Repertoire of Hemeproteins as Carbene Transferases to Access Diverse Molecular Structures

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application (Ref document number: 15864009; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 15864009; Country of ref document: EP; Kind code of ref document: A1)